path: root/openmp/runtime/src/kmp.h
Commit message | Author | Age | Files | Lines
...
* [OpenMP] Use C++11 Atomics - barrier, tasking, and lock code | Jonathan Peyton | 2018-07-09 | 1 | -21/+25
  These are preliminary changes that attempt to use C++11 Atomics in the runtime. We are expecting better portability with this change across architectures/OSes. Here is the summary of the changes. Most variables that need synchronization operations were converted to generic atomic variables (std::atomic<T>). Variables that are updated with a combined CAS are packed into a single atomic variable, and partial reads/writes are done through unpacking/packing.
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D47903
  llvm-svn: 336563
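The packing approach described in this commit message can be illustrated with a minimal sketch. It is not the runtime's actual code: the `SketchPair` type, the field layout, and the helper names are all hypothetical, chosen only to show two 32-bit fields sharing one `std::atomic<std::uint64_t>` so that a single CAS updates them together while a partial write goes through unpack/modify/pack.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical pair of 32-bit fields that must be updated together.
struct SketchPair {
  std::uint32_t lo;
  std::uint32_t hi;
};

// Pack both fields into one 64-bit word (hi in the upper half).
inline std::uint64_t sketch_pack(SketchPair p) {
  return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}

// Recover the two fields from a packed word.
inline SketchPair sketch_unpack(std::uint64_t v) {
  return SketchPair{static_cast<std::uint32_t>(v),
                    static_cast<std::uint32_t>(v >> 32)};
}

// Partial write: replace only `lo`, keeping whatever `hi` currently holds.
// The CAS loop retries if another thread changed the word in between.
inline void sketch_store_lo(std::atomic<std::uint64_t> &word,
                            std::uint32_t new_lo) {
  std::uint64_t observed = word.load(std::memory_order_relaxed);
  SketchPair next;
  do {
    next = sketch_unpack(observed);
    next.lo = new_lo;
  } while (!word.compare_exchange_weak(observed, sketch_pack(next),
                                       std::memory_order_acq_rel,
                                       std::memory_order_relaxed));
}

// Small self-check showing the intended behavior.
inline void sketch_pack_example() {
  std::atomic<std::uint64_t> word{sketch_pack(SketchPair{1, 2})};
  sketch_store_lo(word, 7);
  SketchPair p = sketch_unpack(word.load());
  assert(p.lo == 7 && p.hi == 2);
}
```

A partial read is simply `sketch_unpack(word.load()).hi`; the memory orders used here are a conservative guess, not taken from the patch.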
* [OpenMP] Fix affinity API for KMP_AFFINITY=none|compact|scatter | Jonathan Peyton | 2018-04-18 | 1 | -0/+5
  Currently, the affinity API reports garbage for the initial place list and any thread's place lists when using KMP_AFFINITY=none|compact|scatter. This patch does two things:
  * For KMP_AFFINITY=none, it creates a one-entry table for the places, so that the initial place list is just a single place with all the proc ids in it. We also set the initial place of any thread to 0 instead of KMP_PLACE_ALL so that the thread reports that single place (place 0) instead of garbage (-1) when using the affinity API.
  * When non-OMP_PROC_BIND affinity is used (including KMP_AFFINITY=compact|scatter), a thread's place list is populated correctly. We assume that each thread is assigned to a single place. This is implemented in two of the affinity API functions.
  Differential Revision: https://reviews.llvm.org/D45527
  llvm-svn: 330283
* Introduce GOMP_taskloop API | Jonathan Peyton | 2018-04-18 | 1 | -0/+8
  This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our version symbols. Being a wrapper around __kmpc_taskloop, the function creates a task with the loop bounds properly nested in the shareds so that the GOMP task thunk will work properly. Also, the firstprivate copy constructors are properly handled using the __kmp_gomp_task_dup() auxiliary function. Currently, only linear spawning of tasks is supported for the GOMP_taskloop interface.
  Differential Revision: https://reviews.llvm.org/D45327
  llvm-svn: 330282
* Read OMP_TARGET_OFFLOAD and provide API to access ICV | Jonathan Peyton | 2018-03-20 | 1 | -0/+12
  Added settings code to read the OMP_TARGET_OFFLOAD environment variable. Added the target-offload-var ICV as __kmp_target_offload, set via OMP_TARGET_OFFLOAD if available, otherwise defaulting to DEFAULT. Valid values for the ICV are specified as enum values {0,1,2} for disabled, default, and mandatory. An internal API access function, __kmpc_get_target_offload, is provided.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D44577
  llvm-svn: 328046
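As a rough illustration of the mapping described above (three enum values, with DEFAULT used when the variable is absent), here is a minimal sketch. The enum and function names are hypothetical, and the case-insensitive matching and the fallback for unrecognized strings are assumptions rather than documented behavior of the runtime.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical mirror of the ICV values {0,1,2} named in the commit message.
enum class SketchTargetOffload { Disabled = 0, Default = 1, Mandatory = 2 };

// Map the raw environment string (or nullptr when unset) to an ICV value.
// Assumption: matching is case-insensitive and unknown strings fall back
// to Default.
inline SketchTargetOffload sketch_parse_target_offload(const char *value) {
  if (value == nullptr)
    return SketchTargetOffload::Default;
  std::string s(value);
  for (char &c : s)
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
  if (s == "DISABLED")
    return SketchTargetOffload::Disabled;
  if (s == "MANDATORY")
    return SketchTargetOffload::Mandatory;
  return SketchTargetOffload::Default;
}

// Small self-check showing the intended behavior.
inline void sketch_target_offload_example() {
  assert(sketch_parse_target_offload(nullptr) == SketchTargetOffload::Default);
  assert(static_cast<int>(sketch_parse_target_offload("mandatory")) == 2);
}
```

A caller would typically pass `std::getenv("OMP_TARGET_OFFLOAD")` (from `<cstdlib>`) as the argument.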
* Improve OpenMP threadprivate implementation. | Andrey Churbanov | 2018-03-05 | 1 | -0/+4
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D41914
  llvm-svn: 326733
* Improve stability of the runtime in parent/child processes | Jonathan Peyton | 2018-01-10 | 1 | -0/+2
  This change improves stability of the runtime when the application forks child processes. Acquiring/releasing __kmp_initz_lock and __kmp_forkjoin_lock in the atfork handlers ensures that the actual fork does not occur while those two locks are held, and __kmp_itt_reset() reverts the ITT global state to its initial state, which also initializes the mutex stored in the global state. Some missing initialization code was also inserted in the child's atfork handler.
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D41462
  llvm-svn: 322202
* Fix trademarks found by scanner | Jonathan Peyton | 2018-01-04 | 1 | -2/+2
  llvm-svn: 321827
* Trivial enum fix | Jonathan Peyton | 2017-12-06 | 1 | -4/+4
  This change is a trivial fix for enums that removes specification of "last" or "upper" values, or other boundary values. This simplifies the code in places, and results in never needing to update the "upper" values again.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D40804
  llvm-svn: 319957
* Extension of HWLOC topology discovery with NUMA nodes and tiles | Andrey Churbanov | 2017-11-30 | 1 | -0/+4
  Patch by Olga Malysheva
  Differential Revision: https://reviews.llvm.org/D40309
  llvm-svn: 319422
* Make kmp_r_sched_t into a union | Jonathan Peyton | 2017-11-29 | 1 | -4/+7
  This change makes the kmp_r_sched_t type a union for simpler comparisons and assignments.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D40374
  llvm-svn: 319379
* Exclude untied tasks from checking of task scheduling constraint (TSC). | Andrey Churbanov | 2017-11-16 | 1 | -5/+7
  This can improve performance of tests with untied tasks.
  Differential Revision: https://reviews.llvm.org/D39613
  llvm-svn: 318388
* Remove const from variables with dynamic memory | Jonas Hahnfeld | 2017-11-09 | 1 | -5/+1
  Allocated memory is typically not 'const' if it needs to be freed. This patch removes around 50 wrong const attributes, modifies the corresponding functions and finally gets rid of some const_casts. These have especially been strange for __kmp_str_fname_free() that added a 'const' to call __kmp_str_free() which removed it again.
  Two minor cleanups that I performed in this process:
  * __kmp_tool_libraries now lives in kmp_settings.cpp as it is used nowhere else.
  * __kmp_msg_empty was removed as it was never used and Clang now complained that it was assigned a string literal that is 'const char *'.
  Differential Revision: https://reviews.llvm.org/D39755
  llvm-svn: 317797
* Cleanup version symbol macros and attributes/declspecs | Jonathan Peyton | 2017-11-07 | 1 | -7/+1
  1) Get rid of xaliasify, xexpand and xversionify for KMP_EXPAND_NAME and KMP_VERSION_SYMBOL. KMP_VERSION_SYMBOL is a combination of xaliasify and xversionify.
  2) Put all attribute and __declspec definitions in kmp_os.h
  Differential Revision: https://reviews.llvm.org/D39516
  llvm-svn: 317636
* Update implementation of OMPT to the specification OpenMP 5.0 Preview 1 (TR4). | Joachim Protze | 2017-11-01 | 1 | -5/+10
  The code is tested to work with the latest Clang, GNU, and Intel compilers. The implementation is optimized for low overhead when no tool is attached, shifting the cost to execution with a tool attached. This patch does not implement OMPT for libomptarget.
  Patch by Simon Convent and Joachim Protze
  Differential Revision: https://reviews.llvm.org/D38185
  llvm-svn: 317085
* Apply formatting changes | Jonathan Peyton | 2017-10-20 | 1 | -2/+0
  .clang-format's comments are removed and a (hopefully) final set of formatting changes are applied.
  Differential Revision: https://reviews.llvm.org/D38837
  Differential Revision: https://reviews.llvm.org/D38920
  llvm-svn: 316227
* Remove unnecessary semicolons | Jonathan Peyton | 2017-09-27 | 1 | -1/+1
  Removes semicolons after if {} blocks, function definitions, etc. I was able to apply the large OMPT patch cleanly on top of this one with no conflicts.
  llvm-svn: 314340
* Remove unused t_single_lock | Jonathan Peyton | 2017-09-26 | 1 | -1/+1
  Add padding inside the team structure to keep the same structure size.
  llvm-svn: 314242
* Read blocktime value set by kmp_set_blocktime() before reading from KMP_BLOCKTIME | Jonathan Peyton | 2017-09-05 | 1 | -2/+6
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D37403
  llvm-svn: 312539
* Minor code cleanup of Klocwork issues | Jonathan Peyton | 2017-09-05 | 1 | -1/+1
  Minor code cleanup of Klocwork issues. Fatal messages are given the noreturn attribute. Define and use KMP_NORETURN to work for multiple C++ versions.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D37275
  llvm-svn: 312538
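A macro of this kind usually selects the spelling of "does not return" based on the language version and compiler. The sketch below shows one plausible shape; the macro name `SKETCH_NORETURN`, the exact conditions, and the helper functions are assumptions for illustration and are not copied from kmp_os.h.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Pick a "noreturn" spelling that the current compiler understands.
#if __cplusplus >= 201103L
#define SKETCH_NORETURN [[noreturn]]
#elif defined(_MSC_VER)
#define SKETCH_NORETURN __declspec(noreturn)
#else
#define SKETCH_NORETURN __attribute__((noreturn))
#endif

// Hypothetical fatal-message routine: prints and terminates, never returns.
SKETCH_NORETURN inline void sketch_fatal(const char *msg) {
  std::fprintf(stderr, "fatal: %s\n", msg);
  std::abort();
}

// Because sketch_fatal is marked noreturn, static analyzers and compilers
// accept that control never falls off the end of this function.
inline int sketch_checked_div(int a, int b) {
  if (b != 0)
    return a / b;
  sketch_fatal("division by zero");
}

// Small self-check exercising only the non-fatal path.
inline void sketch_noreturn_example() {
  assert(sketch_checked_div(6, 3) == 2);
}
```

Marking the fatal path this way is what lets a tool such as Klocwork stop reporting "missing return" or "use after fatal error" paths in callers.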
* Remove BUILD_TV | Jonathan Peyton | 2017-08-17 | 1 | -31/+0
  Cleanup code to remove BUILD_TV and unused code bracketed by it.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D36011
  llvm-svn: 311114
* Add new envirable KMP_TEAMS_THREAD_LIMIT | Jonathan Peyton | 2017-08-02 | 1 | -0/+1
  This change adds a new environment variable, KMP_TEAMS_THREAD_LIMIT, which is used to set a new global variable, __kmp_teams_max_nth, which is checked when determining the size and quantity of teams that will be created in the teams construct. Specifically, it is a limit on the total number of threads in a given teams construct. It differentiates the limits for the teams construct from the limits for regular parallel regions (KMP_DEVICE_THREAD_LIMIT/__kmp_max_nth and OMP_THREAD_LIMIT/__kmp_cg_max_nth). When each individual team is formed, it is still subject to those limits. After the clauses to the teams construct are parsed and calculated, we check to make sure we are within this limit, and if not, reduce num_threads per team and/or number of teams, accordingly. The default value is set to the number of available processors on the system.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D36009
  llvm-svn: 309874
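The commit message does not spell out how the reduction is performed, so the following is only one plausible sketch of "reduce num_threads per team and/or number of teams": first cap the number of teams at the limit, then shrink the per-team thread count until the product fits. The struct, the function name, and the order of the two steps are assumptions.

```cpp
#include <cassert>

// Hypothetical result type: how many teams, and how many threads in each.
struct SketchTeamsShape {
  int num_teams;
  int threads_per_team;
};

// Clamp a requested teams shape so that the total thread count stays within
// `limit` (standing in for the value set by KMP_TEAMS_THREAD_LIMIT).
inline SketchTeamsShape sketch_clamp_teams(int num_teams, int threads_per_team,
                                           int limit) {
  assert(num_teams >= 1 && threads_per_team >= 1 && limit >= 1);
  // Step 1 (assumed): never create more teams than the limit allows.
  if (num_teams > limit)
    num_teams = limit;
  // Step 2 (assumed): shrink each team until the total fits.
  if (static_cast<long long>(num_teams) * threads_per_team > limit)
    threads_per_team = limit / num_teams;
  return SketchTeamsShape{num_teams, threads_per_team};
}

// Small self-check showing the intended behavior.
inline void sketch_clamp_teams_example() {
  SketchTeamsShape s = sketch_clamp_teams(4, 8, 16);
  assert(s.num_teams == 4 && s.threads_per_team == 4);
}
```

Since `num_teams` is at most `limit` after the first step, the integer division in the second step always leaves at least one thread per team.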
* Fix implementation of OMP_THREAD_LIMIT | Jonathan Peyton | 2017-07-27 | 1 | -2/+5
  This change fixes the implementation of OMP_THREAD_LIMIT. Previously the implementation was not restricted to a contention group (but it should be, according to the spec), and this is fixed here. A field is added to the root thread to store a counter of the threads in the contention group. An extra check is added when reserving threads for a parallel region that checks this variable and compares it to threadlimit-var, which is implemented as a new global variable, __kmp_cg_max_nth. Associated settings changes were also made, along with cleanup of comments that referred to OMP_THREAD_LIMIT but should refer to the new KMP_DEVICE_THREAD_LIMIT (added in an earlier patch).
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D35912
  llvm-svn: 309319
* Cleanup: __kmp_env_* variables | Jonathan Peyton | 2017-07-25 | 1 | -7/+1
  Removed unused __kmp_env_* variables. Also clangified other people's code.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D35808
  llvm-svn: 309000
* OpenMP RTL cleanup: two PAUSEs per spin loop iteration replaced with single one | Andrey Churbanov | 2017-07-19 | 1 | -3/+7
  Differential Revision: https://reviews.llvm.org/D35490
  llvm-svn: 308423
* Add recursive task scheduling strategy to taskloop implementation | Jonathan Peyton | 2017-07-18 | 1 | -2/+4
  Summary: The taskloop implementation is extended with recursive task scheduling. The environment variable KMP_TASKLOOP_MIN_TASKS is added as a manual threshold for the user to switch from recursive to linear task scheduling.
  Details:
  * The calculations for the loop parameters are moved from __kmp_taskloop_linear to the upper level.
  * The initial calculation is done in __kmpc_taskloop; further range splitting is done in __kmp_taskloop_recur.
  * Added a threshold to switch from recursive to linear task scheduling.
  * One half of the split range is scheduled as an internal task, which just moves the sub-range parameters to the stealing thread that continues recursive scheduling (if the number of tasks is still large enough); the other half is processed recursively.
  * The internal task duplication routine is fixed to assign the parent task, which was not needed when all tasks were scheduled by the same thread but is needed now.
  Patch by Andrey Churbanov
  Differential Revision: https://reviews.llvm.org/D35273
  llvm-svn: 308338
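The recursive-then-linear splitting described in the details above can be sketched as follows. This is a single-threaded model only: chunks are collected into a vector instead of being spawned as tasks, both halves are processed in place rather than one being handed to a stealing thread, and all names (`SketchChunk`, `sketch_spawn_linear`, `sketch_spawn_recursive`, `min_tasks`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A chunk of iterations as a half-open range [first, second).
using SketchChunk = std::pair<long, long>;

// Linear strategy: emit `num_tasks` contiguous chunks, spreading any
// remainder over the first chunks.
inline void sketch_spawn_linear(long lb, long ub, long num_tasks,
                                std::vector<SketchChunk> &out) {
  long total = ub - lb;
  long grain = total / num_tasks;
  long extra = total % num_tasks;
  for (long i = 0; i < num_tasks; ++i) {
    long size = grain + (i < extra ? 1 : 0);
    out.emplace_back(lb, lb + size);
    lb += size;
  }
}

// Recursive strategy: halve the task count until it drops to `min_tasks`
// (standing in for the KMP_TASKLOOP_MIN_TASKS threshold), then go linear.
inline void sketch_spawn_recursive(long lb, long ub, long num_tasks,
                                   long min_tasks,
                                   std::vector<SketchChunk> &out) {
  assert(num_tasks >= 1 && ub - lb >= num_tasks);
  if (num_tasks <= min_tasks || num_tasks < 2) {
    sketch_spawn_linear(lb, ub, num_tasks, out);
    return;
  }
  long left_tasks = num_tasks / 2;
  // Split iterations in proportion to tasks so every task keeps >= 1 iteration.
  long mid = lb + (ub - lb) * left_tasks / num_tasks;
  // In the runtime, one half would become an internal task that another
  // thread can steal; here both halves are simply processed in order.
  sketch_spawn_recursive(lb, mid, left_tasks, min_tasks, out);
  sketch_spawn_recursive(mid, ub, num_tasks - left_tasks, min_tasks, out);
}

// Small self-check: 8 tasks over 100 iterations, switching to linear at 2.
inline void sketch_taskloop_example() {
  std::vector<SketchChunk> out;
  sketch_spawn_recursive(0, 100, 8, 2, out);
  assert(out.size() == 8);
  assert(out.front().first == 0 && out.back().second == 100);
}
```

The recursion depth grows with the logarithm of the task count, which is why handing one half to a stealing thread lets task creation itself proceed in parallel.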
* OpenMP RTL cleanup: eliminated warnings with -Wcast-qual. | Andrey Churbanov | 2017-07-03 | 1 | -14/+12
  Changes are: replaced C-style casts with const_cast and reinterpret_cast; type of several counters changed to signed; type of parameters of 32-bit and 64-bit AND and OR intrinsics changed to unsigned; changed files formatted using clang-format version 3.8.1.
  Differential Revision: https://reviews.llvm.org/D34759
  llvm-svn: 307020
* Replace platform macro with KMP_MIC_SUPPORTED | Jonathan Peyton | 2017-06-13 | 1 | -2/+2
  Differential Revision: https://reviews.llvm.org/D34119
  llvm-svn: 305307
* Reset initial affinity in children processes | Jonathan Peyton | 2017-06-13 | 1 | -0/+3
  If OpenMP is initialized before fork()-ing occurs and affinity is set to something like compact, then the master thread will be pinned to a single HW thread/core after initialization. If the master (or any other thread) then forks N processes, all N processes will then be pinned to that same single HW thread/core. To reset the affinity for the new child process, the atfork handler for the child process can call kmp_set_thread_affinity_mask_initial() to reset its affinity to the initial affinity of the application before it re-initializes libomp. The parent process will not be affected and still keeps its affinity setting.
  Differential Revision: https://reviews.llvm.org/D34118
  llvm-svn: 305306
* OpenMP 4.5: implemented support of schedule(simd:guided) and schedule(simd:runtime) - library part. | Andrey Churbanov | 2017-06-05 | 1 | -1/+3
  Compiler generation should use newly introduced scheduling kinds kmp_sch_guided_simd = 46, kmp_sch_runtime_simd = 47, as parameters to __kmpc_dispatch_init_* entries.
  Differential Revision: https://reviews.llvm.org/D31602
  llvm-svn: 304724
* Clang-format and whitespace cleanup of source code | Jonathan Peyton | 2017-05-12 | 1 | -2166/+2403
  This patch contains the clang-format and cleanup of the entire code base. Some of clang-format's changes made the code look worse in places. A best effort was made to resolve the bulk of these problems, but many remain. Most of the problems were mangled line breaks and tabbing of comments.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D32659
  llvm-svn: 302929
* Fix Hwloc API Incompatibility | Jonathan Peyton | 2017-04-25 | 1 | -1/+7
  Older Hwloc libraries (< 1.10.0) offer neither the HWLOC_OBJ_NUMANODE nor the HWLOC_OBJ_PACKAGE type. Instead, they are named HWLOC_OBJ_NODE and HWLOC_OBJ_SOCKET. This patch just defines the newer names based on the older names when using an older Hwloc.
  Differential Revision: https://reviews.llvm.org/D32496
  llvm-svn: 301349
* KMP_HW_SUBSET extended with NUMA support when HWLOC enabled | Andrey Churbanov | 2017-04-13 | 1 | -8/+13
  Differential Revision: https://reviews.llvm.org/D31600
  llvm-svn: 300220
* Test check-in, comment changed | Olga Malysheva | 2017-04-04 | 1 | -1/+2
  llvm-svn: 299428
* Minor improvement of KMP_YIELD_NOW() macro. | Jonathan Peyton | 2017-03-20 | 1 | -4/+8
  This change slightly improves the performance of the KMP_YIELD_NOW() macro by using the _rdtsc() intrinsic function where possible.
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D31008
  llvm-svn: 298314
* [OpenMP] Missing virtual destructor in KMPAffinity | George Rokos | 2017-02-22 | 1 | -0/+2
  Added virtual destructor in a class containing virtual functions.
  Differential Revision: https://reviews.llvm.org/D30271
  llvm-svn: 295896
* Run-time library part of OpenMP 5.0 task reduction implementation. | Andrey Churbanov | 2017-02-16 | 1 | -1/+11
  Added test kmp_task_reduction_nest.cpp which has an example of possible compiler codegen.
  Differential Revision: https://reviews.llvm.org/D29600
  llvm-svn: 295343
* Enable yield cycle on Linux | Jonathan Peyton | 2017-02-15 | 1 | -1/+5
  This change allows the runtime to turn __kmp_yield() on/off repeatedly on Linux. This feature was removed when the monitor thread was disabled, but there are applications that perform better with this feature on.
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D29227
  llvm-svn: 295203
* Fix a race in shutdown when tasking is used. | Andrey Churbanov | 2017-02-06 | 1 | -0/+5
  Patch by Terry Wilmarth.
  Differential Revision: https://reviews.llvm.org/D28377
  llvm-svn: 294214
* Fix performance issue incurred by removing monitor thread. | Jonathan Peyton | 2017-01-27 | 1 | -1/+17
  When the monitor thread is used, most threads in the team directly go to sleep if the copy of bt_intervals/bt_set is not available in the cache, and this happens at least once per thread in the wait function, making the overall performance slightly better. This change tries to mimic this behavior by using the bt_intervals cache, which simply keeps the blocktime interval in terms of the platform-dependent ticks or nanoseconds.
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D28906
  llvm-svn: 293312
* Follow up to r289732: Update comments in source files to reference .cpp files | Jonathan Peyton | 2016-12-14 | 1 | -1/+1
  Patch by Hansang Bae
  llvm-svn: 289739
* Introduce dynamic affinity dispatch capabilities | Jonathan Peyton | 2016-11-14 | 1 | -272/+96
  This set of changes enables the affinity interface (either the preexisting native operating system interface or HWLOC) to be dynamically set at runtime initialization. The point of this change is that we were seeing performance degradations when using HWLOC. This allows the user to use the old affinity mechanisms, which on large machines (>64 cores) make a large difference in initialization time. These changes mostly move affinity code under a small class hierarchy:
    KMPAffinity
      class Mask {}
    KMPNativeAffinity : public KMPAffinity
      class Mask : public KMPAffinity::Mask
    KMPHwlocAffinity
      class Mask : public KMPAffinity::Mask
  Since all interface functions (for both affinity and the mask implementation) are virtual, the implementation can be chosen at runtime initialization.
  Differential Revision: https://reviews.llvm.org/D26356
  llvm-svn: 286890
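The shape of such a hierarchy, with a nested mask class per backend and the backend chosen once at startup, might look like the sketch below. The class names are deliberately different from the runtime's, the member functions are invented for illustration, and the second backend is a `std::set`-based stand-in rather than anything that calls hwloc.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <string>

// Hypothetical base interface; every entry point is virtual.
class SketchAffinity {
public:
  class Mask {
  public:
    virtual ~Mask() = default;
    virtual void set(int cpu) = 0;
    virtual bool is_set(int cpu) const = 0;
  };
  virtual ~SketchAffinity() = default;
  virtual std::unique_ptr<Mask> allocate_mask() const = 0;
  virtual std::string name() const = 0;
};

// Backend 1: fixed-size bitmask, standing in for the native OS interface.
class SketchNativeAffinity : public SketchAffinity {
public:
  class Mask : public SketchAffinity::Mask {
    std::bitset<64> bits_;
  public:
    void set(int cpu) override { bits_.set(static_cast<std::size_t>(cpu)); }
    bool is_set(int cpu) const override {
      return bits_.test(static_cast<std::size_t>(cpu));
    }
  };
  std::unique_ptr<SketchAffinity::Mask> allocate_mask() const override {
    return std::unique_ptr<SketchAffinity::Mask>(new Mask());
  }
  std::string name() const override { return "native"; }
};

// Backend 2: unbounded set, standing in for a library-backed implementation.
class SketchSetAffinity : public SketchAffinity {
public:
  class Mask : public SketchAffinity::Mask {
    std::set<int> cpus_;
  public:
    void set(int cpu) override { cpus_.insert(cpu); }
    bool is_set(int cpu) const override { return cpus_.count(cpu) != 0; }
  };
  std::unique_ptr<SketchAffinity::Mask> allocate_mask() const override {
    return std::unique_ptr<SketchAffinity::Mask>(new Mask());
  }
  std::string name() const override { return "set-backed"; }
};

// Chosen once at initialization; callers only ever see the base interface.
inline std::unique_ptr<SketchAffinity> sketch_pick_affinity(bool use_set) {
  if (use_set)
    return std::unique_ptr<SketchAffinity>(new SketchSetAffinity());
  return std::unique_ptr<SketchAffinity>(new SketchNativeAffinity());
}

// Small self-check showing the intended behavior.
inline void sketch_affinity_example() {
  std::unique_ptr<SketchAffinity> api = sketch_pick_affinity(false);
  std::unique_ptr<SketchAffinity::Mask> mask = api->allocate_mask();
  mask->set(3);
  assert(mask->is_set(3) && !mask->is_set(4));
}
```

Both base classes declare virtual destructors, so deleting a backend or a mask through a base pointer is well defined.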
* Fixed a memory leak related to task dependencies. | Andrey Churbanov | 2016-10-27 | 1 | -6/+14
  Differential Revision: http://reviews.llvm.org/D25504
  Patch by Alex Duran.
  llvm-svn: 285283
* Fix OpenMP 4.0 library build | Jonathan Peyton | 2016-10-18 | 1 | -0/+5
  Patch by Andrey Churbanov
  Differential Revision: https://reviews.llvm.org/D25505
  llvm-svn: 284499
* Code cleanup for the runtime without monitor thread | Jonathan Peyton | 2016-10-07 | 1 | -4/+17
  This change removes/disables unnecessary code when the monitor thread is not used.
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D25102
  llvm-svn: 283577
* Enable omp_get_schedule() to return static steal type. | Jonathan Peyton | 2016-10-07 | 1 | -2/+4
  As the code is now, calling omp_get_schedule() when OMP_SCHEDULE=static_steal will trigger an assertion failure.
  llvm-svn: 283576
* Disable monitor thread creation by default. | Jonathan Peyton | 2016-09-27 | 1 | -1/+14
  This change set disables creation of the monitor thread by default. The global counter maintained by the monitor thread was replaced by logic that uses system time directly, and cyclic yielding on the Linux target was also removed since there was no clear benefit of using it. Setting the KMP_USE_MONITOR variable (=1) enables creation of the monitor thread again if it is really necessary for some reason.
  Differential Revision: https://reviews.llvm.org/D24739
  llvm-svn: 282507
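Replacing a monitor-maintained counter with direct reads of the system clock can be pictured with the sketch below: the waiting thread computes its own deadline and spins until either the flag is set or the blocktime has elapsed. The function name, the use of `std::chrono::steady_clock`, and the boolean flag are illustrative assumptions; the runtime's own wait code is considerably more involved.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>

// Spin on `flag` for at most `blocktime`. Returns true if the flag was seen
// set in time, false if the caller should now go to sleep instead.
inline bool sketch_spin_until(const std::atomic<bool> &flag,
                              std::chrono::milliseconds blocktime) {
  // The deadline comes straight from the clock; no helper thread is needed
  // to advance a shared counter.
  const std::chrono::steady_clock::time_point deadline =
      std::chrono::steady_clock::now() + blocktime;
  while (!flag.load(std::memory_order_acquire)) {
    if (std::chrono::steady_clock::now() >= deadline)
      return false;
  }
  return true;
}

// Small self-check: a flag that is already set is observed immediately.
inline void sketch_spin_example() {
  std::atomic<bool> ready{true};
  assert(sketch_spin_until(ready, std::chrono::milliseconds(0)));
}
```

Reading the clock on every iteration has a cost of its own, which is one reason a real implementation may cache the blocktime in platform ticks and check it less often.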
* Fix bitmask upper bounds check | Jonathan Peyton | 2016-09-12 | 1 | -0/+1
  Rather than checking KMP_CPU_SETSIZE, which doesn't exist when using Hwloc, we use the get_max_proc() function which can vary based on the operating system. For example on Windows with multiple processor groups, it might be the case that the highest bit possible in the bitmask is not equal to the number of hardware threads on the machine but something higher than that.
  Differential Revision: https://reviews.llvm.org/D24206
  llvm-svn: 281245
* [OPENMP] Implementation of omp_get_default_device and omp_set_default_device | George Rokos | 2016-09-09 | 1 | -0/+6
  Implementation of missing OpenMP 4.0 API functions omp_get_default_device and omp_set_default_device. Also, added support for the environment variable OMP_DEFAULT_DEVICE.
  Differential Revision: https://reviews.llvm.org/D23587
  llvm-svn: 281065
* Disable KMP_CANCEL_THREADS on Android | Pirama Arumuga Nainar | 2016-08-03 | 1 | -0/+6
  Summary: Android does not have pthread_cancel. Disable KMP_CANCEL_THREADS if __ANDROID__ is defined.
  Subscribers: tberghammer, srhines, openmp-commits, danalbert
  Differential Revision: https://reviews.llvm.org/D23029
  llvm-svn: 277618
* http://reviews.llvm.org/D22134: Implementation of OpenMP 4.5 nonmonotonic schedule modifier | Andrey Churbanov | 2016-07-11 | 1 | -2/+5
  llvm-svn: 275052