path: root/openmp/runtime/src/kmp_global.cpp
Commit message | Author | Age | Files | Lines
* [OpenMP] Change initialization of __kmp_global | Jonas Hahnfeld | 2019-09-04 | 1 | -1/+1
  There's no need to initialize variables with static storage duration because they're implicitly initialized to zero. See https://en.cppreference.com/w/c/language/initialization#Implicit_initialization
  I think that's already relied upon because the supplied 0 only sets 'kmp_time_global_t g_time;' in 'struct kmp_base_global'. The other fields are not set in the code, but implicitly initialized by the compiler.
  Differential Revision: https://reviews.llvm.org/D66292
  llvm-svn: 370943
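The zero-initialization argument above can be checked with a small sketch. The type and member names here (`time_sketch`, `global_sketch`, `other_a`, `other_b`) are hypothetical stand-ins, not the runtime's real definitions; only the language rule itself is taken from the commit message.

```cpp
#include <cassert>

// Hypothetical stand-ins for the runtime's types; not the real definitions.
struct time_sketch {
  long long value;
};
struct global_sketch {
  time_sketch g_time; // first member, the only one an explicit 0 would name
  int other_a;        // illustrative extra members
  int other_b;
};

// Static storage duration and no initializer: zero-initialized at startup.
static global_sketch without_initializer;

// An explicit initializer that names only the first member; the remaining
// members are still zeroed.
static global_sketch with_initializer = {{0}};

bool all_zero(const global_sketch &g) {
  return g.g_time.value == 0 && g.other_a == 0 && g.other_b == 0;
}

// Usage: both objects end up in the same all-zero state.
void example_zero_init() {
  assert(all_zero(without_initializer));
  assert(all_zero(with_initializer));
}
```

Both objects end up identical, which is why dropping the explicit initializer does not change behavior.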
* [OpenMP] Remove OMP spec versioning | Jonathan Peyton | 2019-07-12 | 1 | -15/+0
  Remove all older OMP spec versioning from the runtime and build system.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D64534
  llvm-svn: 365963
* Create a runtime option to disable task throttling. | Andrey Churbanov | 2019-07-02 | 1 | -0/+1
  Patch by viroulep (Philippe Virouleau)
  Differential Revision: https://reviews.llvm.org/D63196
  llvm-svn: 364934
* [OpenMP] Implement 5.0 memory management | Jonathan Peyton | 2019-04-08 | 1 | -11/+31
  * Replace HBWMALLOC API with more general MEMKIND API, new functions and variables added.
  * Have libmemkind.so loaded when accessible.
  * Redirect memspaces to default one except for high bandwidth which is processed separately.
  * Ignore some allocator traits e.g., sync_hint, access, pinned, while others are processed normally e.g., alignment, pool_size, fallback, fb_data, partition.
  * Add tests for memory management
  Patch by Andrey Churbanov
  Differential Revision: https://reviews.llvm.org/D59783
  llvm-svn: 357929
* [OpenMP] Clean up load balancing dynamic mode | Jonathan Peyton | 2019-04-08 | 1 | -1/+0
  This patch cleans up the bookkeeping code for the load balancing dynamic mode. When a thread is moved to or from the thread pool, the th_active_in_pool flag and the __kmp_thread_pool_active_nth global counter are both updated. This removes the need for the corrective code in the main wait loop. Another global counter, __kmp_thread_pool_nth, was removed completely, as it was only used for debugging, but was not under KMP_DEBUG.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D59508
  llvm-svn: 357927
* [OpenMP 5.0] Deprecate nest-var and associated features | Jonathan Peyton | 2019-02-28 | 1 | -3/+2
  Nest-var, OMP_NESTED, omp_set_nested(), and omp_get_nested() have been deprecated in the 5.0 spec. Initial nesting info is now derived from OMP_MAX_ACTIVE_LEVELS, OMP_NUM_THREADS, and OMP_PROC_BIND. This patch deprecates the internal ICV that corresponds to nest-var, and replaces it with the max-active-levels-var ICV to determine nesting. The change still allows for use of OMP_NESTED (according to 5.0 changes), omp_get_nested, and omp_set_nested, which have had deprecation messages added to them. The change allows certain settings of OMP_NUM_THREADS, OMP_PROC_BIND, and OMP_MAX_ACTIVE_LEVELS to turn on nesting, but OMP_NESTED=0 will still force nesting to be off.
  The runtime now prints informative messages about deprecation of OMP_NESTED, omp_set_nested(), and omp_get_nested(), when that environment variable or those routines are used. It also prints a deprecation message in output for KMP_SETTINGS and OMP_DISPLAY_ENV for OMP_NESTED.
  This patch also fixes OMP_DISPLAY_ENV output for OMP_TARGET_OFFLOAD.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D58408
  llvm-svn: 355138
* [OpenMP] Make use of sched_yield optional in runtime | Jonathan Peyton | 2019-02-28 | 1 | -18/+8
  This patch cleans up the yielding code and makes it optional. An environment variable, KMP_USE_YIELD, was added. Yielding is still on by default (KMP_USE_YIELD=1), but can be turned off completely (KMP_USE_YIELD=0), or turned on only when oversubscription is detected (KMP_USE_YIELD=2). Note that oversubscription cannot always be detected by the runtime (for example, when the runtime is initialized and the process forks, oversubscription cannot be detected currently over multiple instances of the runtime).
  Because yielding can be controlled by user now, the library mode settings (from KMP_LIBRARY) for throughput and turnaround have been adjusted by altering blocktime, unless that was also explicitly set.
  In the original code, there were a number of places where a double yield might have been done under oversubscription. This version checks oversubscription and if that's not going to yield, then it does the spin check.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D58148
  llvm-svn: 355120
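The three KMP_USE_YIELD settings described above amount to a small decision rule. The following is a minimal sketch of that rule only. The names `yield_mode` and `should_yield` are hypothetical, and treating "more runnable threads than available processors" as oversubscription is an assumption, not the runtime's actual detection logic.

```cpp
#include <cassert>

// Hypothetical names; the numeric values mirror the KMP_USE_YIELD settings.
enum class yield_mode { never = 0, always = 1, when_oversubscribed = 2 };

// Returns true when a waiting thread should give up the CPU.
// Assumption: oversubscription means more runnable threads than processors.
bool should_yield(yield_mode mode, int runnable_threads, int available_procs) {
  switch (mode) {
  case yield_mode::never:
    return false;
  case yield_mode::always:
    return true;
  case yield_mode::when_oversubscribed:
    return runnable_threads > available_procs;
  }
  return false;
}

// Usage: mode 2 yields only once the thread count exceeds the processor count.
void example_yield() {
  assert(!should_yield(yield_mode::never, 16, 8));
  assert(should_yield(yield_mode::always, 4, 8));
  assert(!should_yield(yield_mode::when_oversubscribed, 8, 8));
  assert(should_yield(yield_mode::when_oversubscribed, 9, 8));
}
```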
* Update more file headers across all of the LLVM projects in the monorepo to reflect the new license. | Chandler Carruth | 2019-01-19 | 1 | -4/+3
  These used slightly different spellings that defeated my regular expressions.
  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
  llvm-svn: 351648
* [OpenMP] Add omp_pause_resource* API | Jonathan Peyton | 2019-01-16 | 1 | -1/+5
  Add omp_pause_resource and omp_pause_resource_all API and enum, plus stub for internal implementation. Implemented callable helper function to do local pause, and added basic functionality for hard and soft pause.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D55078
  llvm-svn: 351372
* [OpenMP] Fix LIBOMP_USE_DEBUGGER=ON build (PR38612) | Roman Lebedev | 2019-01-13 | 1 | -2/+2
  Summary: Two things:
  1. Those two variables had the wrong signedness, which was resulting in a "sign mismatch in comparison" warning.
  2. The whole `kmp_debugger.cpp` wasn't being built, or rather, it was being built as if `USE_DEBUGGER` was off, thus nothing provided the definition of `__kmp_omp_debug_struct_info` and `__kmp_debugging`. Makes sense, because `USE_DEBUGGER` is set in `kmp_config.h`, which is not included explicitly. It is included by `kmp.h`, but that one is only included inside of the `#if USE_DEBUGGER` block. I *think* this is the only source file with this issue; everything else seems to `#include` either `kmp.h` or `kmp_config.h`. The alternative solution would be to add `add_compile_options(-include kmp_config.h)` in CMake.
  I did verify that `__kmp_omp_debug_struct_info` becomes available with this patch.
  Fixes PR38612 (https://bugs.llvm.org/show_bug.cgi?id=38612).
  Reviewers: AndreyChurbanov, jlpeyton, Hahnfeld
  Reviewed By: jlpeyton
  Subscribers: guansong, jfb, openmp-commits
  Tags: #openmp
  Differential Revision: https://reviews.llvm.org/D55783
  llvm-svn: 351019
* [OpenMP] Implement OpenMP 5.0 affinity format functionality | Jonathan Peyton | 2018-12-13 | 1 | -0/+5
  This patch adds the affinity format functionality introduced in OpenMP 5.0.
  This patch adds two new environment variables:
    OMP_DISPLAY_AFFINITY=TRUE|FALSE
    OMP_AFFINITY_FORMAT=<string>
  and four new API routines:
    1) omp_set_affinity_format()
    2) omp_get_affinity_format()
    3) omp_display_affinity()
    4) omp_capture_affinity()
  The affinity format functionality has two ICVs associated with it: affinity-display-var (bool) and affinity-format-var (string). The affinity-display-var enables/disables the functionality through the environment variable OMP_DISPLAY_AFFINITY. The affinity-format-var is a formatted string with special field types beginning with a '%' character, similar to printf. For example, the affinity-format-var could be:
    "OMP: host:%H pid:%P OStid:%i num_threads:%N thread_num:%n affinity:{%A}"
  The affinity-format-var is displayed by every thread implicitly at the beginning of a parallel region when any thread's affinity has changed (including a brand new thread being spawned), or explicitly using the omp_display_affinity() API. The omp_capture_affinity() function can capture the affinity-format-var in a char buffer. And omp_set|get_affinity_format() allow the user to set|get the affinity-format-var explicitly at runtime. omp_capture_affinity() and omp_get_affinity_format() both return the number of characters needed to hold the entire string they tried to make (not including the terminating null character). If not enough buffer space is available, both these functions truncate their output.
  Differential Revision: https://reviews.llvm.org/D55148
  llvm-svn: 349089
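The return-value contract described above (report the full length needed, truncate what is written) resembles snprintf. The sketch below models only that contract. `capture_sketch` is a hypothetical helper, not an OpenMP routine, it takes an already expanded string instead of a format, and null-terminating the truncated output is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper modelling the "return needed length, truncate output"
// contract. It does not expand any '%' fields; `expanded` is assumed to be
// the already formatted affinity string.
std::size_t capture_sketch(char *buffer, std::size_t size,
                           const std::string &expanded) {
  if (buffer != nullptr && size > 0) {
    // Copy at most size - 1 characters, then terminate (assumption).
    std::size_t n = expanded.size() < size - 1 ? expanded.size() : size - 1;
    std::memcpy(buffer, expanded.data(), n);
    buffer[n] = '\0';
  }
  // Characters needed for the whole string, excluding the terminator.
  return expanded.size();
}

// Usage: a return value >= the buffer size signals truncation.
void example_capture() {
  char small[8];
  std::size_t needed = capture_sketch(small, sizeof(small), "thread_num:3");
  assert(needed == 12);
  assert(needed >= sizeof(small));
  assert(std::strcmp(small, "thread_") == 0);
}
```

A caller can use the returned length to allocate a large enough buffer and call again.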
* [OpenMP] Initial implementation of OMP 5.0 Memory Management routines | Jonathan Peyton | 2018-09-07 | 1 | -0/+15
  Implemented omp_alloc, omp_free, omp_{set,get}_default_allocator entries, and OMP_ALLOCATOR environment variable.
  Added support for HBW memory on Linux if libmemkind.so library is accessible (dynamic library only, no support for static libraries). Only used stable API (hbwmalloc) of the memkind library, though we may consider using experimental API in future.
  The ICV def-allocator-var is implemented per implicit task, similar to place-partition-var. In the absence of a requested allocator, the runtime uses the default allocator.
  Predefined allocators (the only ones currently available) are made similar for C and Fortran: pointers (long integers) with values 1 to 8.
  Patch by Andrey Churbanov
  Differential Revision: https://reviews.llvm.org/D51232
  llvm-svn: 341687
* [OpenMP] Fix build errors when building with KMP_DEBUG_ADAPTIVE_LOCKS=1 | Jonathan Peyton | 2018-07-30 | 1 | -1/+1
  This change fixes build errors when building a runtime with adaptive lock stats enabled. Most of the errors were due to the recent changes in the runtime, but it seems that we have not tried to build this debug runtime on Windows for a long time.
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D49823
  llvm-svn: 338277
* [OpenMP] Introduce hierarchical scheduling | Jonathan Peyton | 2018-07-09 | 1 | -0/+9
  This patch introduces the logic implementing hierarchical scheduling. First and foremost, hierarchical scheduling is off by default. To enable, use -DLIBOMP_USE_HIER_SCHED=On during CMake's configure stage. This work is based off of the IWOMP paper: "Workstealing and Nested Parallelism in SMP Systems"
  Hierarchical scheduling is the layering of OpenMP schedules for different layers of the memory hierarchy. One can have multiple layers between the threads and the global iteration space. The threads will go up the hierarchy to grab iterations, using possibly a different schedule & chunk for each layer.
    [ Global iteration space (0-999) ]   (use static)
    [ L1    | L1    | L1    | L1    ]    (use dynamic,1)
    [ T0 T1 | T2 T3 | T4 T5 | T6 T7 ]
  In the example shown above, there are 8 threads and 4 L1 caches being targeted. If the topology indicates that there are two threads per core, then two consecutive threads will share the data of one L1 cache unit. This example would have the iteration space (0-999) split statically across the four L1 caches (so the first L1 would get (0-249), the second would get (250-499), etc). Then the threads will use a dynamic,1 schedule to grab iterations from the L1 cache units. There are currently four supported layers: L1, L2, L3, NUMA
  OMP_SCHEDULE can now read a hierarchical schedule with this syntax:
    OMP_SCHEDULE='EXPERIMENTAL LAYER,SCHED[,CHUNK][:LAYER,SCHED[,CHUNK]...]:SCHED,CHUNK'
  And OMP_SCHEDULE can still read the normal SCHED,CHUNK syntax from before.
  I've kept most of the hierarchical scheduling logic inside kmp_dispatch_hier.h to try to keep it separate from the rest of the code.
  Differential Revision: https://reviews.llvm.org/D47962
  llvm-svn: 336571
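The static split in the example above (0-999 across four L1 units) can be reproduced with a short calculation. This is a sketch of a plain block partition, not the code in kmp_dispatch_hier.h. `static_block` is a hypothetical name, and handing leftover iterations to the lowest-numbered units is an assumption that does not matter for the evenly divisible case in the commit message.

```cpp
#include <cassert>
#include <utility>

// Returns the inclusive bounds [first, last] owned by unit `index` when the
// inclusive range [lower, upper] is split statically across `units` units.
// Assumption: any remainder goes to the lowest-numbered units, one each.
std::pair<long, long> static_block(long lower, long upper, int units,
                                   int index) {
  long total = upper - lower + 1;
  long base = total / units;
  long extra = total % units;
  long start = lower + index * base + (index < extra ? index : extra);
  long size = base + (index < extra ? 1 : 0);
  return {start, start + size - 1};
}

// Usage: the four L1 units from the commit message.
void example_static_block() {
  assert(static_block(0, 999, 4, 0) == std::make_pair(0L, 249L));
  assert(static_block(0, 999, 4, 1) == std::make_pair(250L, 499L));
  assert(static_block(0, 999, 4, 3) == std::make_pair(750L, 999L));
}
```

Within each block, the two threads sharing that L1 unit would then take iterations one at a time under the dynamic,1 layer.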
* [OpenMP] Use C++11 Atomics - barrier, tasking, and lock code | Jonathan Peyton | 2018-07-09 | 1 | -21/+21
  These are preliminary changes that attempt to use C++11 Atomics in the runtime. We are expecting better portability with this change across architectures/OSes. Here is the summary of the changes.
  Most variables that need synchronization operation were converted to generic atomic variables (std::atomic<T>). Variables that are updated with combined CAS are packed into a single atomic variable, and partial read/write is done through unpacking/packing.
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D47903
  llvm-svn: 336563
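The packing technique mentioned above can be illustrated with two 32-bit fields stored in one 64-bit atomic. This is a generic sketch of the idea, not the runtime's data layout. The names `pair_sketch`, `pack`, `unpack`, `update_both`, and `read_tail` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Two fields that must change together (hypothetical layout).
struct pair_sketch {
  std::uint32_t head;
  std::uint32_t tail;
};

inline std::uint64_t pack(pair_sketch p) {
  return (std::uint64_t(p.head) << 32) | p.tail;
}

inline pair_sketch unpack(std::uint64_t v) {
  return {std::uint32_t(v >> 32), std::uint32_t(v & 0xffffffffu)};
}

// Combined CAS: both fields are replaced in one atomic step, or neither is.
bool update_both(std::atomic<std::uint64_t> &packed, pair_sketch expected,
                 pair_sketch desired) {
  std::uint64_t e = pack(expected);
  return packed.compare_exchange_strong(e, pack(desired));
}

// Partial read: load the whole word, then unpack the field of interest.
std::uint32_t read_tail(const std::atomic<std::uint64_t> &packed) {
  return unpack(packed.load(std::memory_order_acquire)).tail;
}

// Usage: the second update fails because the expected pair is stale.
void example_packed_cas() {
  std::atomic<std::uint64_t> state{pack({1, 2})};
  assert(update_both(state, {1, 2}, {3, 4}));
  assert(!update_both(state, {1, 2}, {5, 6}));
  assert(read_tail(state) == 4);
}
```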
* Read OMP_TARGET_OFFLOAD and provide API to access ICV | Jonathan Peyton | 2018-03-20 | 1 | -0/+3
  Added settings code to read OMP_TARGET_OFFLOAD environment variable. Added target-offload-var ICV as __kmp_target_offload, set via OMP_TARGET_OFFLOAD, if available, otherwise defaulting to DEFAULT. Valid values for the ICV are specified as enum values {0,1,2} for disabled, default, and mandatory. An internal API access function __kmpc_get_target_offload is provided.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D44577
  llvm-svn: 328046
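The mapping from the environment string to the enum values {0,1,2} could look roughly like the sketch below. The enum and function names are hypothetical, and the exact-match, case-sensitive comparison and the fallback to the default for unrecognized strings are assumptions, not a description of the runtime's settings parser.

```cpp
#include <cassert>
#include <cstring>

// Values follow the commit message: 0 disabled, 1 default, 2 mandatory.
enum target_offload_sketch {
  offload_disabled = 0,
  offload_default = 1,
  offload_mandatory = 2
};

// Hypothetical parser. Assumptions: exact upper-case spellings, and anything
// unset or unrecognized falls back to the default.
target_offload_sketch parse_target_offload(const char *value) {
  if (value == nullptr)
    return offload_default;
  if (std::strcmp(value, "DISABLED") == 0)
    return offload_disabled;
  if (std::strcmp(value, "MANDATORY") == 0)
    return offload_mandatory;
  return offload_default;
}

// Usage: an unset variable yields the default ICV value.
void example_target_offload() {
  assert(parse_target_offload(nullptr) == offload_default);
  assert(parse_target_offload("DISABLED") == offload_disabled);
  assert(parse_target_offload("MANDATORY") == offload_mandatory);
}
```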
* Use hyperbarrier by default on all architectures | Jonas Hahnfeld | 2017-12-08 | 1 | -15/+6
  All architectures except x86_64 used the linear barrier implementation by default which doesn't give good performance for a larger number of threads.
  Improvements for PARALLEL overhead (EPCC) with this patch on a Power8 system (2 sockets x 10 cores x 8 threads, OMP_PLACES=cores):
     20 threads:  4.55us -> 3.49us
     40 threads:  8.84us -> 4.06us
     80 threads: 19.18us -> 4.74us
    160 threads: 54.22us -> 6.73us
  Differential Revision: https://reviews.llvm.org/D40358
  llvm-svn: 320152
* Extension of HWLOC topology discovery with NUMA nodes and tiles | Andrey Churbanov | 2017-11-30 | 1 | -0/+2
  Patch by Olga Malysheva
  Differential Revision: https://reviews.llvm.org/D40309
  llvm-svn: 319422
* Exclude untied tasks from checking of task scheduling constraint (TSC). | Andrey Churbanov | 2017-11-16 | 1 | -2/+1
  This can improve performance of tests with untied tasks.
  Differential Revision: https://reviews.llvm.org/D39613
  llvm-svn: 318388
* Remove const from variables with dynamic memory | Jonas Hahnfeld | 2017-11-09 | 1 | -5/+1
  Allocated memory is typically not 'const' if it needs to be freed. This patch removes around 50 wrong const attributes, modifies the corresponding functions and finally gets rid of some const_casts. These have especially been strange for __kmp_str_fname_free() that added a 'const' to call __kmp_str_free() which removed it again.
  Two minor cleanups that I performed in this process:
  * __kmp_tool_libraries now lives in kmp_settings.cpp as it is used nowhere else.
  * __kmp_msg_empty was removed as it was never used and Clang now complained that it was assigned a string literal that is 'const char *'.
  Differential Revision: https://reviews.llvm.org/D39755
  llvm-svn: 317797
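The point about 'const' and freeable memory can be shown with a minimal sketch. `duplicate_string` and `free_string` are hypothetical helpers written for illustration. They are not the runtime's __kmp_str_* functions.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// The owner receives a non-const pointer, so it can later free the memory
// without a const_cast.
char *duplicate_string(const char *source) {
  std::size_t length = std::strlen(source) + 1;
  char *copy = static_cast<char *>(std::malloc(length));
  if (copy != nullptr)
    std::memcpy(copy, source, length);
  return copy;
}

// Frees the string and clears the caller's pointer. Taking `char **` rather
// than `const char **` keeps ownership visible in the signature.
void free_string(char **owned) {
  std::free(*owned);
  *owned = nullptr;
}

// Usage: allocate, read, free. No casts are needed anywhere.
void example_owned_string() {
  char *name = duplicate_string("libomp");
  assert(name != nullptr);
  assert(std::strcmp(name, "libomp") == 0);
  free_string(&name);
  assert(name == nullptr);
}
```

Read-only access still works through an implicit conversion to `const char *`, so only the owning variable needs to be non-const.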
* Cleanup version symbol macros and attributes/declspecs | Jonathan Peyton | 2017-11-07 | 1 | -7/+2
  1) Get rid of xaliasify, xexpand and xversionify for KMP_EXPAND_NAME and KMP_VERSION_SYMBOL. KMP_VERSION_SYMBOL is a combination of xaliasify and xversionify.
  2) Put all attribute and __declspec definitions in kmp_os.h
  Differential Revision: https://reviews.llvm.org/D39516
  llvm-svn: 317636
* Update implementation of OMPT to the specification OpenMP 5.0 Preview 1 (TR4). | Joachim Protze | 2017-11-01 | 1 | -0/+4
  The code is tested to work with latest clang, GNU and Intel compiler. The implementation is optimized for low overhead when no tool is attached, shifting the cost to execution with tool attached.
  This patch does not implement OMPT for libomptarget.
  Patch by Simon Convent and Joachim Protze
  Differential Revision: https://reviews.llvm.org/D38185
  llvm-svn: 317085
* Apply formatting changes | Jonathan Peyton | 2017-10-20 | 1 | -3/+0
  .clang-format's comments are removed and a (hopefully) final set of formatting changes are applied.
  Differential Revision: https://reviews.llvm.org/D38837
  Differential Revision: https://reviews.llvm.org/D38920
  llvm-svn: 316227
* Remove BUILD_TV | Jonathan Peyton | 2017-08-17 | 1 | -7/+0
  Cleanup code to remove BUILD_TV and unused code bracketed by it.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D36011
  llvm-svn: 311114
* Add new envirable KMP_TEAMS_THREAD_LIMIT | Jonathan Peyton | 2017-08-02 | 1 | -0/+1
  This change adds a new environment variable, KMP_TEAMS_THREAD_LIMIT, which is used to set a new global variable, __kmp_teams_max_nth, which is checked when determining the size and quantity of teams that will be created in the teams construct. Specifically, it is a limit on the total number of threads in a given teams construct. It differentiates the limits for the teams construct from the limits for regular parallel regions (KMP_DEVICE_THREAD_LIMIT/__kmp_max_nth and OMP_THREAD_LIMIT/__kmp_cg_max_nth). When each individual team is formed, it is still subject to those limits. After the clauses to the teams construct are parsed and calculated, we check to make sure we are within this limit, and if not, reduce num_threads per team and/or number of teams, accordingly. The default value is set to the number of available processors on the system.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D36009
  llvm-svn: 309874
* Fix implementation of OMP_THREAD_LIMIT | Jonathan Peyton | 2017-07-27 | 1 | -0/+1
  This change fixes the implementation of OMP_THREAD_LIMIT. The implementation of this previously was not restricted to a contention group (but it should be, according to the spec), and this is fixed here. A field is added to the root thread to store a counter of the threads in the contention group. An extra check is added when reserving threads for a parallel region that checks this variable and compares it to threadlimit-var, which is implemented as a new global variable, __kmp_cg_max_nth. Associated settings changes were also made, along with a cleanup of comments that referred to OMP_THREAD_LIMIT but should refer to the new KMP_DEVICE_THREAD_LIMIT (added in an earlier patch).
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D35912
  llvm-svn: 309319
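The extra check described above compares a per-contention-group thread counter against the limit before granting threads. A heavily simplified sketch follows. `grantable_threads` and its parameters are hypothetical, and the real reservation code accounts for details (such as threads already in the team) that are ignored here.

```cpp
#include <cassert>

// Returns how many additional threads may be granted to a parallel region.
// Assumptions: `cg_thread_count` is the number of threads currently counted
// in the contention group, `cg_max_nth` plays the role of threadlimit-var,
// and the request is simply clipped to the remaining headroom.
int grantable_threads(int requested_new, int cg_thread_count,
                      int cg_max_nth) {
  int headroom = cg_max_nth - cg_thread_count;
  if (headroom < 0)
    headroom = 0;
  return requested_new < headroom ? requested_new : headroom;
}

// Usage: with 4 of 10 threads in use, a request for 8 is clipped to 6.
void example_thread_limit() {
  assert(grantable_threads(8, 4, 10) == 6);
  assert(grantable_threads(2, 4, 10) == 2);
  assert(grantable_threads(3, 12, 10) == 0);
}
```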
* Cleanup: __kmp_env_* variables | Jonathan Peyton | 2017-07-25 | 1 | -5/+0
  Removed unused __kmp_env_* variables. Also clangified other people's code.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D35808
  llvm-svn: 309000
* Add recursive task scheduling strategy to taskloop implementation | Jonathan Peyton | 2017-07-18 | 1 | -0/+1
  Summary: Taskloop implementation is extended by using recursive task scheduling. Environment variable KMP_TASKLOOP_MIN_TASKS added as a manual threshold for the user to switch from recursive to linear task scheduling.
  Details:
  * The calculations for the loop parameters are moved from __kmp_taskloop_linear to the upper level.
  * Initial calculation is done in __kmpc_taskloop; further range splitting is done in __kmp_taskloop_recur.
  * Added threshold to switch from recursive to linear task scheduling.
  * One half of the split range is scheduled as an internal task which just moves sub-range parameters to the stealing thread that continues recursive scheduling (if the number of tasks is still large enough); the other half is processed recursively.
  * Internal task duplication routine fixed to assign the parent task, which was not needed when all tasks were scheduled by the same thread, but is needed now.
  Patch by Andrey Churbanov
  Differential Revision: https://reviews.llvm.org/D35273
  llvm-svn: 308338
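The recursive splitting with a threshold described above can be sketched as a divide-and-conquer over a range of tasks. This is a single-threaded illustration under stated assumptions: `task_range` and `split_recursive` are hypothetical names, the threshold test (`num_tasks > min_tasks`) is an assumed reading of KMP_TASKLOOP_MIN_TASKS, and where the runtime hands one half to another thread as an internal task, this sketch simply recurses on both halves.

```cpp
#include <cassert>
#include <vector>

// A contiguous run of taskloop tasks (hypothetical representation).
struct task_range {
  long first_task;
  long num_tasks;
};

// Halve the range while it holds more than `min_tasks` tasks. Ranges at or
// below the threshold are recorded as leaves, standing in for the point
// where linear scheduling would take over.
void split_recursive(task_range range, long min_tasks,
                     std::vector<task_range> &leaves) {
  if (range.num_tasks > min_tasks && range.num_tasks > 1) {
    long half = range.num_tasks / 2;
    split_recursive({range.first_task, half}, min_tasks, leaves);
    split_recursive({range.first_task + half, range.num_tasks - half},
                    min_tasks, leaves);
  } else {
    leaves.push_back(range);
  }
}

// Usage: 16 tasks with a threshold of 4 end up as four leaves of 4 tasks.
void example_taskloop_split() {
  std::vector<task_range> leaves;
  split_recursive({0, 16}, 4, leaves);
  assert(leaves.size() == 4);
  assert(leaves[1].first_task == 4 && leaves[1].num_tasks == 4);
  assert(leaves[3].first_task == 12 && leaves[3].num_tasks == 4);
}
```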
* Replace platform macro with KMP_MIC_SUPPORTED | Jonathan Peyton | 2017-06-13 | 1 | -1/+1
  Differential Revision: https://reviews.llvm.org/D34119
  llvm-svn: 305307
* Clang-format and whitespace cleanup of source code | Jonathan Peyton | 2017-05-12 | 1 | -253/+284
  This patch contains the clang-format and cleanup of the entire code base. Some of clang-format's changes made the code look worse in places. A best effort was made to resolve the bulk of these problems, but many remain. Most of the problems were mangled line breaks and tabbing of comments.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D32659
  llvm-svn: 302929
* KMP_HW_SUBSET extended with NUMA support when HWLOC enabled | Andrey Churbanov | 2017-04-13 | 1 | -5/+7
  Differential Revision: https://reviews.llvm.org/D31600
  llvm-svn: 300220
* Enable yield cycle on Linux | Jonathan Peyton | 2017-02-15 | 1 | -1/+1
  This change allows the runtime to turn __kmp_yield() on/off repeatedly on Linux. This feature was removed when disabling monitor thread, but there are applications that perform better with this feature on.
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D29227
  llvm-svn: 295203
* Follow up to r289732: Update comments in source files to reference .cpp files | Jonathan Peyton | 2016-12-14 | 1 | -1/+1
  Patch by Hansang Bae
  llvm-svn: 289739
* Change source files from .c to .cpp | Jonathan Peyton | 2016-12-14 | 1 | -0/+497
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D26688
  llvm-svn: 289732