summaryrefslogtreecommitdiffstats
path: root/openmp/runtime/src/kmp_runtime.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [OpenMP] NFC: Fix trivial typos in commentsKazuaki Ishizaki2020-01-071-2/+2
| | | | | | | | | | | | Reviewers: jdoerfert, Jim Reviewed By: Jim Subscribers: Jim, mgorny, guansong, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D72285
* [OpenMP] NFC: Fix trivial typos in commentsKelvin Li2020-01-031-1/+1
| | | | | | Submitted by: kiszk Differential Revision: https://reviews.llvm.org/D72171
* [OpenMP] Remove -Wl,-fini=__kmp_internal_end_finiAaron Puchert2019-11-191-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The termination function duplicated the functionality of the __attribute((destructor))-annotated function __kmp_internal_end_fini, and we have no indication that this doesn't work. The function might cause issues with link-time optimization turned on: until very recently, none of the usual linkers was reporting functions named in -Wl,-fini as used to the LTO plugin, so it might be dropped. If the function is dropped, -Wl,-fini=__kmp_internal_end_fini doesn't do what we want: with ld.bfd and lld it drops the FINI attribute from .dynamic and with gold we get FINI = 0x0, which leads to a crash on cleanup. This can be reproduced by building with -DLLVM_ENABLE_PROJECTS="clang;openmp" \ -DLLVM_ENABLE_LTO=Thin \ -DLLVM_USE_LINKER=gold The issue in lld has been fixed in f95273f75aa, but gold remains without fix so far. Fixes PR43927. Reviewers: JonChesterfield, jdoerfert, AndreyChurbanov Reviewed By: AndreyChurbanov Differential Revision: https://reviews.llvm.org/D69927
* [OpenMP] Enable thread affinity on FreeBSDDavid Carlier2019-10-081-4/+4
| | | | | | | | | | Reviewers: chandlerc, jlpeyton, jdoerfert, dim Reviewed-By: dim Differential Revision: https://reviews.llvm.org/D68580 llvm-svn: 374118
* Force honoring nthreads-var and thread-limit-var inside teams construct on hostJonathan Peyton2019-08-201-4/+17
| | | | | | | | | | | | | This patch fixes https://bugs.llvm.org/show_bug.cgi?id=42906, via adding adjustment of number of threads on enter to the teams construct on host according to user settings. This allows to pass checks and avoid assertions at time of team of threads creation. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D66351 llvm-svn: 369430
* [OpenMP] Remove 'unnecessary parentheses'Jonas Hahnfeld2019-08-151-2/+2
| | | | | | | | | | The variables in kmp_lock.cpp are really arrays of function pointers that return void or int, not pointers to functions that return void* or int*. The other changes are only cosmetic. Differential Revision: https://reviews.llvm.org/D65870 llvm-svn: 369002
* Add OMPT support for teams constructHansang Bae2019-08-031-43/+117
| | | | | | | | This change adds OMPT support for events from teams construct. Differential Revision: https://reviews.llvm.org/D64025 llvm-svn: 367746
* [OpenMP] RISCV64 portJonas Hahnfeld2019-07-251-1/+2
| | | | | | | | | | | | | | This is a port of libomp for the RISC-V 64-bit Linux target. We have tested this port on a HiFive Unleashed development board using a downstream LLVM that has support for the missing bits in upstream. As of now, all tests are passing, including OMPT. Patch by Ferran Pallarès! Differential Revision: https://reviews.llvm.org/D59880 llvm-svn: 367021
* [OMPT] Cleanup reset of exit_frame pointerJonas Hahnfeld2019-07-221-20/+20
| | | | | | | | | | | | | | | | | | This is done at call-site and does not need to be handled in __kmp_invoke_microtask. It was already absent from the x86 and x86_64 assembly, this patch removes it from the generic implementation in z_Linux_util.cpp and adds documentation for AArch64 and PPC64 that it's actually not needed. I can't test on these architectures, so I don't want to change the code just because it looks right :) While at it, rename some variables for consistency and add a check in test/ompt/parallel/normal.c that the pointer was reset before entering the barrier. Differential Revision: https://reviews.llvm.org/D64442 llvm-svn: 366721
* [OpenMP] Remove OMP spec versioningJonathan Peyton2019-07-121-278/+55
| | | | | | | | | | Remove all older OMP spec versioning from the runtime and build system. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D64534 llvm-svn: 365963
* Fixed memory use-after-free problem.Andrey Churbanov2019-06-261-2/+19
| | | | | | | | | | | Bug reported in https://bugs.llvm.org/show_bug.cgi?id=42269. Freeing of the contention group (CG) stucture by master thread looks wrong, because workers can leave the CG later on. Intead the freeing is now done by the last thread leaving the CG. Differential Revision: https://reviews.llvm.org/D63599 llvm-svn: 364456
* Fixed third issue reported in https://bugs.llvm.org/show_bug.cgi?id=41584.Andrey Churbanov2019-05-221-5/+0
| | | | | | | | Removed wrong debug assertion. Differential Revision: https://reviews.llvm.org/D62251 llvm-svn: 361408
* [OMPT] Handling of the events of initial-task-begin and initial-task-endJoachim Protze2019-05-201-5/+12
| | | | | | | | | | | OpenMP 5.0 says that the callback for the events initial-task-begin and initial-task-end has to be ompt_callback_implicit_task. Patch by Tim Cramer Differential Revision: https://reviews.llvm.org/D58776 llvm-svn: 361157
* Fixed https://bugs.llvm.org/show_bug.cgi?id=41584.Andrey Churbanov2019-05-151-2/+0
| | | | | | | | | Removed unconditional and unsafe decrement of counter of active threads in pool at shutdown time. Differential Revision: https://reviews.llvm.org/D61944 llvm-svn: 360784
* Enable OpenMP build for 32-bit FreeBSDDimitry Andric2019-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: To be able to successfully build OpenMP on 32-bit FreeBSD, such as FreeBSD/i386, I first had to provide a few wrappers (see D60916), and then add `KMP_OS_FREEBSD` to the list of defines checked for 32-bit architectures in `kmp_runtime.cpp`. I have successfully built libomp.so and ran a bunch of test programs on FreeBSD/i386 with this. See also https://svnweb.freebsd.org/changeset/base/345283 Reviewers: emaste, jlpeyton, Hahnfeld Reviewed By: jlpeyton Subscribers: krytarowski, guansong, jdoerfert, openmp-commits, llvm-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D60917 llvm-svn: 359716
* [OpenMP] Add OpenMP 5.0 nonmonotonic codeJonathan Peyton2019-04-301-4/+21
| | | | | | | | | | | | This patch adds: * New omp_sched_monotonic flag to omp_sched_t which is handled within the runtime * Parsing of monotonic/nonmonotonic in OMP_SCHEDULE * Tests for the monotonic flag and envirable parsing * Logic to force monotonic when hierarchical scheduling is used Differential Revision: https://reviews.llvm.org/D60979 llvm-svn: 359601
* Fixed memory leak reported in Bugzilla:Andrey Churbanov2019-04-171-2/+9
| | | | | | | | | | https://bugs.llvm.org/show_bug.cgi?id=41494 Freed th_cg_roots structure at exit from uber thread. Differential Revision: https://reviews.llvm.org/D60729 llvm-svn: 358572
* [OpenMP] Clean up load balancing dynamic modeJonathan Peyton2019-04-081-9/+27
| | | | | | | | | | | | | | | | This patch cleans up the bookkeeping code for the load balancing dynamic mode. When a thread is moved to or from the thread pool, the th_active_in_pool flag and the __kmp_thread_pool_active_nth global counter are both updated. This removes the need for the corrective code in the main wait loop. Another global counter, __kmp_thread_pool_nth, was removed completely, as it was only used for debugging, but was not under KMP_DEBUG. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D59508 llvm-svn: 357927
* [OpenMP][Stats] Fix stats gathering for distribute and team clauseJonathan Peyton2019-04-031-10/+38
| | | | | | | | | | | The distribute clause needs an explicit push of a timer. The teams clause needs a timer added and also, similarly to parallel, exchanged with the serial timer when encountered so that serial regions are counted properly. Differential Revision: https://reviews.llvm.org/D59801 llvm-svn: 357621
* [OpenMP] Fix pause check with version infoJonathan Peyton2019-03-251-9/+6
| | | | | | | | | | Add 5.0 guard to pause code for now. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D59428 llvm-svn: 356933
* [OpenMP] Remove deprecated taskqJonathan Peyton2019-03-151-6/+0
| | | | | | | | | | Remove very old, unused, and deprecated taskq code. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D58989 llvm-svn: 356288
* [OpenMP][OMPT] Distinguish different barrier kindsJonathan Peyton2019-02-281-2/+4
| | | | | | | | | | | | | This change makes the runtime decide the intended use of each barrier invocation, for the OMPT synchronization tool callbacks. The OpenMP 5.0 specification defines four possible barrier kinds -- implicit, explicit, implementation, and just normal barrier. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D58247 llvm-svn: 355140
* [OpenMP 5.0] Deprecate nest-var and associated featuresJonathan Peyton2019-02-281-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Nest-var, OMP_NESTED, omp_set_nested()., and omp_get_nested() have been deprecated in the 5.0 spec. Initial nesting info is now derived from OMP_MAX_ACTIVE_LEVELS, OMP_NUM_THREADS, and OMP_PROC_BIND. This patch deprecates the internal ICV that corresponds to nest-var, and replaces it with the max-active-levels-var ICV to determine nesting. The change still allows for use of OMP_NESTED (according to 5.0 changes), omp_get_nested, and omp_set_nested, which have had deprecation messages added to them. The change allows certain settings of OMP_NUM_THREADS, OMP_PROC_BIND, and OMP_MAX_ACTIVE_LEVELS to turn on nesting, but OMP_NESTED=0 will still force nesting to be off. The runtime now prints informative messages about deprecation of OMP_NESTED, omp_set_nested(), and omp_get_nested(), when those environment variables or routines are used. It also prints deprecated message in output for KMP_SETTINGS and OMP_DISPLAY_ENV for OMP_NESTED. This patch also fixes OMP_DISPLAY_ENV output for OMP_TARGET_OFFLOAD. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D58408 llvm-svn: 355138
* [OpenMP] Make use of sched_yield optional in runtimeJonathan Peyton2019-02-281-24/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch cleans up the yielding code and makes it optional. An environment variable, KMP_USE_YIELD, was added. Yielding is still on by default (KMP_USE_YIELD=1), but can be turned off completely (KMP_USE_YIELD=0), or turned on only when oversubscription is detected (KMP_USE_YIELD=2). Note that oversubscription cannot always be detected by the runtime (for example, when the runtime is initialized and the process forks, oversubscription cannot be detected currently over multiple instances of the runtime). Because yielding can be controlled by user now, the library mode settings (from KMP_LIBRARY) for throughput and turnaround have been adjusted by altering blocktime, unless that was also explicitly set. In the original code, there were a number of places where a double yield might have been done under oversubscription. This version checks oversubscription and if that's not going to yield, then it does the spin check. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D58148 llvm-svn: 355120
* [OpenMP] Fix thread_limits to work properly for teams constructJonathan Peyton2019-02-111-23/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The thread-limit-var and omp_get_thread_limit API was not perfectly handled for teams construct. Now, when modified by thread_limit clause, omp_get_thread_limit reports the correct value. In addition, the value is restored when leaving the teams construct to what it was in the encountering context. This is done partly by creating the notion of a Contention Group root (CG root) that keeps track of the thread at the root of each separate CG, the thread-limit-var associated with the CG, and associated counter of active threads within the contention group. thread-limits are passed from master to worker threads via an entry in the ICV data structure. When a "contention group switch" occurs, a new CG root record is made and passed from master to worker. A thread could potentially have several CG root records if it encounters multiple nested teams constructs (but at the moment the spec doesn't allow for nested teams, so the most one could have currently is 2). The master of the teams masters gets the thread-limit clause value stored to its local ICV structure, and the other teams masters copy it from the master. The thread-limit is set from that ICV copy and restored to the ICV copy when entering and leaving the teams construct. This change also fixes a bug when the top-level teams construct team gets reused, and OMP_DYNAMIC was true, which can cause the expected size of this team to be smaller than what was actually allocated. The fix updates the size of the team after its threads were reserved. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D56804 llvm-svn: 353747
* [OMPT] Make sure that OMPT is enabled when accessing internals of the runtimeJoachim Protze2019-02-041-0/+1
| | | | | | | | | | | | | | | | | The three switch fallthrough generate a warning with -Wimplicit-fallthrough. Two are documented as fallthrough, one is not, but I think the intention is to also fallthrough in kmp_tasking.cpp. Not sure whether kmp.h is the best place to define the macro. Reviewers: jlpeyton, AndreyChurbanov, Hahnfeld Reviewed By: jlpeyton Tags: #openmp Differential Revision: https://reviews.llvm.org/D56397 llvm-svn: 353052
* Update more file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | | to reflect the new license. These used slightly different spellings that defeated my regular expressions. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351648
* [OpenMP] Add omp_pause_resource* APIJonathan Peyton2019-01-161-3/+99
| | | | | | | | | | | | Add omp_pause_resource and omp_pause_resource_all API and enum, plus stub for internal implementation. Implemented callable helper function to do local pause, and added basic functionality for hard and soft pause. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D55078 llvm-svn: 351372
* [OMPT] Second chunk of final OMPT 5.0 interface updatesJoachim Protze2019-01-151-10/+10
| | | | | | | | | | | | | | | | | | | | | The omp-tools.h file is generated from the OpenMP spec to ensure that the interface is implemented as specified. The other changes are necessary to update the interface implementation to the final version as published in 5.0. The omp-tools.h header was previously called ompt.h, currently a copy under this name is installed for legacy tools. Patch partially perpared by @sconvent Reviewers: AndreyChurbanov, hbae, Hahnfeld Reviewed By: hbae Tags: #openmp, #ompt Differential Revision: https://reviews.llvm.org/D55579 llvm-svn: 351197
* [OMPT] First chunk of final OMPT 5.0 interface updatesJoachim Protze2018-12-181-38/+38
| | | | | | | | | | | | | This patch updates the implementation of the ompt_frame_t, ompt_wait_id_t and ompt_state_t. The final version of the OpenMP 5.0 spec added the "t" for these types. Furthermore the structure for ompt_frame_t changed and allows to specify that the reenter frame belongs to the runtime. Patch partially prepared by Simon Convent Reviewers: hbae llvm-svn: 349458
* [OpenMP] Implement OpenMP 5.0 affinity format functionalityJonathan Peyton2018-12-131-2/+407
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the affinity format functionality introduced in OpenMP 5.0. This patch adds: Two new environment variables: OMP_DISPLAY_AFFINITY=TRUE|FALSE OMP_AFFINITY_FORMAT=<string> and Four new API: 1) omp_set_affinity_format() 2) omp_get_affinity_format() 3) omp_display_affinity() 4) omp_capture_affinity() The affinity format functionality has two ICV's associated with it: affinity-display-var (bool) and affinity-format-var (string). The affinity-display-var enables/disables the functionality through the envirable OMP_DISPLAY_AFFINITY. The affinity-format-var is a formatted string with the special field types beginning with a '%' character similar to printf For example, the affinity-format-var could be: "OMP: host:%H pid:%P OStid:%i num_threads:%N thread_num:%n affinity:{%A}" The affinity-format-var is displayed by every thread implicitly at the beginning of a parallel region when any thread's affinity has changed (including a brand new thread being spawned), or explicitly using the omp_display_affinity() API. The omp_capture_affinity() function can capture the affinity-format-var in a char buffer. And omp_set|get_affinity_format() allow the user to set|get the affinity-format-var explicitly at runtime. omp_capture_affinity() and omp_get_affinity_format() both return the number of characters needed to hold the entire string it tried to make (not including NULL character). If not enough buffer space is available, both these functions truncate their output. Differential Revision: https://reviews.llvm.org/D55148 llvm-svn: 349089
* [OpenMP] Fix a few build issuesJonathan Peyton2018-12-101-2/+2
| | | | | | | | | | | | | | | | | | | | | Fix two build issues: 1) Recent commit 348756 accidentally included Unix clang compilers to use immintrin.h when only clang-cl should be using it leading to the following error: openmp-llvm/runtime/src/kmp_lock.cpp:2035:25: error: always_ inline function '_xbegin' requires target feature 'rtm', but would be inlined into function '__kmp_test_adaptive_lock_only' that is compiled without support for 'rtm' kmp_uint32 status = _xbegin(); This patch changes the guard to use immintrin.h to only use clang-cl instead of all clang 2) gcc-8 gives a warning about multiline comment in kmp_runtime.cpp: This patch just changes it to a two line comment openmp-llvm/runtime/src/kmp_runtime.cpp:7697:8: warning: multi-line comment [-Wcomment] #endif // KMP_OS_LINUX || KMP_OS_DRAGONFLY || KMP_OS_FREEBSD || KMP_OS_NETBSD \ llvm-svn: 348783
* Add OpenBSD support to OpenMPKamil Rytarowski2018-12-091-2/+2
| | | | | | | | | | | | | | | | Summary: This patch permits OpenMP to build and work (with both gcc and clang) on OpenBSD. It mostly follows what was done for FreeBSD and NetBSD, except OpenBSD does not have pthread_getattr_np support, so it follows OS X in that one instance. Reviewers: #openmp, krytarowski Reviewed By: krytarowski Subscribers: guansong, jfb, emaste, mgorny, krytarowski, #openmp Tags: #openmp Differential Revision: https://reviews.llvm.org/D34280 llvm-svn: 348726
* Add DragonFlyBSD support to OpenMPKamil Rytarowski2018-12-091-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Additions mostly follow FreeBSD and NetBSD and are not intrusive. There is similar patch for OpenBSD: https://reviews.llvm.org/D34280 The -lm was being omitted due to -Wl,--as-needed in cmake rule, similar patch is in freebsd-ports/devel/llvm-devel port. Simple OpenMP programs compile and work as expected: $ clang-devel ~/omp_hello.c -fopenmp -I/usr/local/llvm-devel/include $ LD_LIBRARY_PATH=/usr/local/llvm-devel/lib OMP_NUM_THREADS=100 ./a.out The assertion in LLVMgold.so when -fopenmp was used together with -flto in 20170524 snapshot is no longer triggered on current svn-trunk and works fine as in llvm-4.0 with our local patches. Reviewers: #openmp, krytarowski Reviewed By: krytarowski Subscribers: dexonsmith, jfb, krytarowski, guansong, gregrodgers, emaste, mgorny, mehdi_amini Differential Revision: https://reviews.llvm.org/D35129 llvm-svn: 348725
* [OpenMP] Fixed possible array out of bound accessJonathan Peyton2018-11-281-0/+1
| | | | | | | | | | | | | There is low probability that array th_hot_teams can be accessed out of bound (when many nested levels are requested to keep hot teams via KMP_HOT_TEAMS_MAX_LEVEL). The patch adds the check of index that fixes the problem. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D54950 llvm-svn: 347800
* Fix for bugzilla https://bugs.llvm.org/show_bug.cgi?id=39137.Andrey Churbanov2018-11-141-0/+2
| | | | | | | | Do not write to internal structure if it keeps same value. Differential Revision: https://reviews.llvm.org/D54305 llvm-svn: 346862
* Add Hurd support.Andrey Churbanov2018-11-071-2/+2
| | | | | | | | Patch by samuel.thibault@ens-lyon.org Differential Revision: https://reviews.llvm.org/D54079 llvm-svn: 346310
* [OpenMP] Convert KMP_DYNAMIC_LIB to a 0 or 1 guard everywhereJonathan Peyton2018-10-051-4/+4
| | | | llvm-svn: 343869
* [OpenMP][OMPT] Fix unsafe initialization of ompt_data_t objectsJonathan Peyton2018-10-041-6/+4
| | | | | | | | | | | | Initializing an ompt_data_t object using the pointer union member is potentially unsafe in 32-bit programs. This change fixes the issue by using the constant, ompt_data_none. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D52046 llvm-svn: 343785
* [OMPT] Update types according to TR7Joachim Protze2018-09-101-2/+3
| | | | | | | | | | | | | | | | | | Some types and callback signatures have changed from TR6 to TR7. Major changes (only adding signatures and stubs): (-remove idle callback) done by D48362 -add reduction and dispatch callback -add get_task_memory and finalize_tool runtime entry points -ompt_invoker_t becomes ompt_parallel_flag_t -more types of sync_regions Patch provided by Simon Convent Reviewers: hbae, protze.joachim Differential Revision: https://reviews.llvm.org/D50774 llvm-svn: 341834
* [OpenMP] Initial implementation of OMP 5.0 Memory Management routinesJonathan Peyton2018-09-071-3/+28
| | | | | | | | | | | | | | | | | | | | | | | Implemented omp_alloc, omp_free, omp_{set,get}_default_allocator entries, and OMP_ALLOCATOR environment variable. Added support for HBW memory on Linux if libmemkind.so library is accessible (dynamic library only, no support for static libraries). Only used stable API (hbwmalloc) of the memkind library though we may consider using experimental API in future. The ICV def-allocator-var is implemented per implicit task similar to place-partition-var. In the absence of a requested allocator, the uses the default allocator. Predefined allocators (the only ones currently available) are made similar for C and Fortran, - pointers (long integers) with values 1 to 8. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D51232 llvm-svn: 341687
* [OpenMP] Add check for hot_teams arrayJonathan Peyton2018-08-241-1/+2
| | | | | | | | | | If hot teams are not being used, this code could seg fault without the added check, and does so when composability is used in conjunction with nesting. The fix prevents the segfault. Differential Revision: https://reviews.llvm.org/D50649 llvm-svn: 340629
* [OpenMP] Cleanup codeJonathan Peyton2018-08-091-6/+8
| | | | | | | | | | | | | | | | This patch cleans up unused functions, variables, sign compare issues, and addresses some -Warning flags which are now enabled including -Wcast-qual. Not all the warning flags in LibompHandleFlags.cmake are enabled, but some are with this patch. Some __kmp_gtid_from_* macros in kmp.h are switched to static inline functions which allows us to remove the awkward definition of KMP_DEBUG_ASSERT() and KMP_ASSERT() macros which used the comma operator. This had to be done for the innumerable -Wunused-value warnings related to KMP_DEBUG_ASSERT() Differential Revision: https://reviews.llvm.org/D49105 llvm-svn: 339393
* [OpenMP][Stats] Cleanup stats gathering codeJonathan Peyton2018-07-301-23/+7
| | | | | | | | | | | | | | | | | | | | | | 1) Remove unnecessary data from list node structure 2) Remove timerPair in favor of pushing/popping explicitTimers. This way, nested timers will work properly. 3) Fix #pragma omp critical timers 4) Add histogram capability 5) Add KMP_STATS_FILE formatting capability 6) Have time partitioned into serial & parallel by introducing partitionedTimers::exchange(). This also counts the number of serial regions in the executable. 7) Fix up the timers around OMP loops so that scheduling overhead and work are both counted correctly. 8) Fix up the iterations statistics so they count the number of iterations the thread receives at each loop scheduling event 9) Change timers so there is only one RDTSC read per event change 10) Fix up the outdated comments for the timers Differential Revision: https://reviews.llvm.org/D49699 llvm-svn: 338276
* Block library shutdown until unreaped threads finish spin-waitingJonathan Peyton2018-07-191-0/+15
| | | | | | | | | | | | | | | | This change fixes possibly invalid access to the internal data structure during library shutdown. In a heavily oversubscribed situation, the library shutdown sequence can reach the point where resources are deallocated while there still exist threads in their final spinning loop. The added loop in __kmp_internal_end() checks if there are such busy-waiting threads and blocks the shutdown sequence if that is the case. Two versions of kmp_wait_template() are now used to minimize performance impact. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D49452 llvm-svn: 337486
* [OpenMP] Fix a few formatting issuesJonathan Peyton2018-07-091-1/+2
| | | | llvm-svn: 336575
* [OpenMP] Introduce hierarchical schedulingJonathan Peyton2018-07-091-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the logic implementing hierarchical scheduling. First and foremost, hierarchical scheduling is off by default To enable, use -DLIBOMP_USE_HIER_SCHED=On during CMake's configure stage. This work is based off if the IWOMP paper: "Workstealing and Nested Parallelism in SMP Systems" Hierarchical scheduling is the layering of OpenMP schedules for different layers of the memory hierarchy. One can have multiple layers between the threads and the global iterations space. The threads will go up the hierarchy to grab iterations, using possibly a different schedule & chunk for each layer. [ Global iteration space (0-999) ] (use static) [ L1 | L1 | L1 | L1 ] (use dynamic,1) [ T0 T1 | T2 T3 | T4 T5 | T6 T7 ] In the example shown above, there are 8 threads and 4 L1 caches begin targeted. If the topology indicates that there are two threads per core, then two consecutive threads will share the data of one L1 cache unit. This example would have the iteration space (0-999) split statically across the four L1 caches (so the first L1 would get (0-249), the second would get (250-499), etc). Then the threads will use a dynamic,1 schedule to grab iterations from the L1 cache units. There are currently four supported layers: L1, L2, L3, NUMA OMP_SCHEDULE can now read a hierarchical schedule with this syntax: OMP_SCHEDULE='EXPERIMENTAL LAYER,SCHED[,CHUNK][:LAYER,SCHED[,CHUNK]...]:SCHED,CHUNK And OMP_SCHEDULE can still read the normal SCHED,CHUNK syntax from before I've kept most of the hierarchical scheduling logic inside kmp_dispatch_hier.h to try to keep it separate from the rest of the code. Differential Revision: https://reviews.llvm.org/D47962 llvm-svn: 336571
* [OpenMP] Use C++11 Atomics - barrier, tasking, and lock codeJonathan Peyton2018-07-091-10/+16
| | | | | | | | | | | | | | | | | These are preliminary changes that attempt to use C++11 Atomics in the runtime. We are expecting better portability with this change across architectures/OSes. Here is the summary of the changes. Most variables that need synchronization operation were converted to generic atomic variables (std::atomic<T>). Variables that are updated with combined CAS are packed into a single atomic variable, and partial read/write is done through unpacking/packing Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D47903 llvm-svn: 336563
* [OMPT] Rename ompt_frame_t to omp_frame_tJoachim Protze2018-05-281-1/+1
| | | | | | | | Rename ompt_frame_t to omp_frame_t, as defined in the spec. Differential Revision: https://reviews.llvm.org/D43568 llvm-svn: 333367
* [OMPT] Fix thread_num for implicit_task_end callbacks in nested parallel regionsJoachim Protze2018-05-071-4/+13
| | | | | | | | | | | | | | implicit_task_end callbacks in nested parallel regions did not always give the correct thread_num, since the inner parallel region may have already been finalized. Now, the thread_num is stored at the beginning of the implicit task and retrieved at the end, whenever necessary. A testcase was added as well. Differential Revision: https://reviews.llvm.org/D46260 llvm-svn: 331632
OpenPOWER on IntegriCloud