summaryrefslogtreecommitdiffstats
path: root/openmp/runtime/src/kmp.h
Commit message (Collapse)AuthorAgeFilesLines
* [OpenMP] NFC: Fix trivial typos in commentsKazuaki Ishizaki2020-01-071-5/+5
| | | | | | | | | | | | Reviewers: jdoerfert, Jim Reviewed By: Jim Subscribers: Jim, mgorny, guansong, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D72285
* [OpenMP] NFC: Fix trivial typos in commentsKelvin Li2020-01-031-1/+1
| | | | | | Submitted by: kiszk Differential Revision: https://reviews.llvm.org/D72171
* [OpenMP] Remove -Wl,-fini=__kmp_internal_end_finiAaron Puchert2019-11-191-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The termination function duplicated the functionality of the __attribute((destructor))-annotated function __kmp_internal_end_fini, and we have no indication that this doesn't work. The function might cause issues with link-time optimization turned on: until very recently, none of the usual linkers was reporting functions named in -Wl,-fini as used to the LTO plugin, so it might be dropped. If the function is dropped, -Wl,-fini=__kmp_internal_end_fini doesn't do what we want: with ld.bfd and lld it drops the FINI attribute from .dynamic and with gold we get FINI = 0x0, which leads to a crash on cleanup. This can be reproduced by building with -DLLVM_ENABLE_PROJECTS="clang;openmp" \ -DLLVM_ENABLE_LTO=Thin \ -DLLVM_USE_LINKER=gold The issue in lld has been fixed in f95273f75aa, but gold remains without fix so far. Fixes PR43927. Reviewers: JonChesterfield, jdoerfert, AndreyChurbanov Reviewed By: AndreyChurbanov Differential Revision: https://reviews.llvm.org/D69927
* [OpenMP] Enable thread affinity on FreeBSDDavid Carlier2019-10-081-1/+1
| | | | | | | | | | Reviewers: chandlerc, jlpeyton, jdoerfert, dim Reviewed-By: dim Differential Revision: https://reviews.llvm.org/D68580 llvm-svn: 374118
* Enable tasks dependencies hashmaps resizing.Andrey Churbanov2019-09-251-2/+1
| | | | | | | | Patch by viroulep (Philippe Virouleau) Differential Revision: https://reviews.llvm.org/D67447 llvm-svn: 372879
* [OpenMP] Remove OMP spec versioningJonathan Peyton2019-07-121-174/+9
| | | | | | | | | | Remove all older OMP spec versioning from the runtime and build system. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D64534 llvm-svn: 365963
* NFC: fixed typo #ifdef --> #if to allow macro set to 0 work correctlyAndrey Churbanov2019-07-101-1/+1
| | | | llvm-svn: 365642
* Create a runtime option to disable task throttling.Andrey Churbanov2019-07-021-0/+1
| | | | | | | | Patch by viroulep (Philippe Virouleau) Differential Revision: https://reviews.llvm.org/D63196 llvm-svn: 364934
* New implementation of OpenMP 5.0 detached tasks.Andrey Churbanov2019-06-191-1/+28
| | | | | | | | Patch by Alex Duran Differential Revision: https://reviews.llvm.org/D62485 llvm-svn: 363799
* [OpenMP] Add task alloc functionGheorghe-Teodor Bercea2019-06-141-0/+6
| | | | | | | | | | | | | | | | Summary: Add the target task allocation function to the interface. Reviewers: ABataev, AlexEichenberger, caomhin, jlpeyton, AndreyChurbanov, RaviNarayanaswamy, hbae Reviewed By: AlexEichenberger, hbae Subscribers: hbae, RaviNarayanaswamy, cfe-commits, Hahnfeld, guansong, jdoerfert, openmp-commits Tags: #openmp, #clang Differential Revision: https://reviews.llvm.org/D63010 llvm-svn: 363449
* Added propagation of not big initial stack size of master thread to workers.Andrey Churbanov2019-06-051-0/+1
| | | | | | | | Currently implemented only for non-Windows 64-bit platforms. Differential Revision: https://reviews.llvm.org/D62488 llvm-svn: 362618
* Fixed build warning with -DLIBOMP_USE_HWLOC=1Andrey Churbanov2019-06-031-0/+6
| | | | | | | | | | Made type of depth of hwloc object to correapond with change from unsigned in hwloc 1,x to int in hwloc 2.x. This eliminates the warning on signed-unsigned comparison. Differential Revision: https://reviews.llvm.org/D62332 llvm-svn: 362401
* Fixed second issue reported in https://bugs.llvm.org/show_bug.cgi?id=41584.Andrey Churbanov2019-05-161-2/+2
| | | | | | | | | | | | | Added synchronization for possible concurrent initialization of mutexes by multiple threads. The need of synchronization caused by commit r357927 which added the use of mutexes at threads movement to/from common pool (earlier the mutexes were used only at suspend/resume). Patch by Johnny Peyton. Differential Revision: https://reviews.llvm.org/D61995 llvm-svn: 360919
* Add non-SSE wrapper for __kmp_{load,store}_mxcsrDimitry Andric2019-05-061-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: To be able to successfully build OpenMP on FreeBSD/i386, which still uses i486 as its default processor, I had to provide wrappers for the `__kmp_load_mxcsr` and `__kmp_store_mxcsr` functions. If the compiler signals that SSE is not available, loading and storing mxcsr does not make sense anway, so in that case the inline functions are empty. This gives the minimum amount of code churn. See also https://svnweb.freebsd.org/changeset/base/345283 Reviewers: emaste, jlpeyton, Hahnfeld Reviewed By: jlpeyton Subscribers: hfinkel, krytarowski, jdoerfert, openmp-commits, llvm-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D60916 llvm-svn: 360062
* [OpenMP] Implement task modifier for reduction clauseJonathan Peyton2019-05-011-0/+12
| | | | | | | | | | | | | | | | | Implemented task modifier in two versions - one without taking into account omp_orig variable (the omp_orig still can be processed by compiler without help of the library, but each reduction object will need separate initializer with global access to omp_orig), another with omp_orig variable included into interface (single initializer can be used for multiple reduction objects of the same type). Second version can be used when the omp_orig is not globally accessible, or to optimize code in case of multiple reduction objects of the same type. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D60976 llvm-svn: 359710
* [OpenMP] Add OpenMP 5.0 nonmonotonic codeJonathan Peyton2019-04-301-1/+43
| | | | | | | | | | | | This patch adds: * New omp_sched_monotonic flag to omp_sched_t which is handled within the runtime * Parsing of monotonic/nonmonotonic in OMP_SCHEDULE * Tests for the monotonic flag and envirable parsing * Logic to force monotonic when hierarchical scheduling is used Differential Revision: https://reviews.llvm.org/D60979 llvm-svn: 359601
* [OpenMP] Exchange code in asm file for inline assemblyJonathan Peyton2019-04-151-33/+69
| | | | | | | | | | This change replaces some of the assembly functions in z_Linux_asm.S for inline asm in kmp.h. This allows better interaction with compiler tools and sanitizers. Differential Revision: https://reviews.llvm.org/D60423 llvm-svn: 358438
* [OpenMP] Implement 5.0 memory managementJonathan Peyton2019-04-081-19/+90
| | | | | | | | | | | | | | | | | | * Replace HBWMALLOC API with more general MEMKIND API, new functions and variables added. * Have libmemkind.so loaded when accessible. * Redirect memspaces to default one except for high bandwidth which is processed separately. * Ignore some allocator traits e.g., sync_hint, access, pinned, while others are processed normally e.g., alignment, pool_size, fallback, fb_data, partition. * Add tests for memory management Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D59783 llvm-svn: 357929
* [OpenMP] Clean up load balancing dynamic modeJonathan Peyton2019-04-081-1/+1
| | | | | | | | | | | | | | | | This patch cleans up the bookkeeping code for the load balancing dynamic mode. When a thread is moved to or from the thread pool, the th_active_in_pool flag and the __kmp_thread_pool_active_nth global counter are both updated. This removes the need for the corrective code in the main wait loop. Another global counter, __kmp_thread_pool_nth, was removed completely, as it was only used for debugging, but was not under KMP_DEBUG. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D59508 llvm-svn: 357927
* [OpenMP] Remove deprecated taskqJonathan Peyton2019-03-151-190/+1
| | | | | | | | | | Remove very old, unused, and deprecated taskq code. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D58989 llvm-svn: 356288
* [OpenMP 5.0] Deprecate nest-var and associated featuresJonathan Peyton2019-02-281-15/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | Nest-var, OMP_NESTED, omp_set_nested()., and omp_get_nested() have been deprecated in the 5.0 spec. Initial nesting info is now derived from OMP_MAX_ACTIVE_LEVELS, OMP_NUM_THREADS, and OMP_PROC_BIND. This patch deprecates the internal ICV that corresponds to nest-var, and replaces it with the max-active-levels-var ICV to determine nesting. The change still allows for use of OMP_NESTED (according to 5.0 changes), omp_get_nested, and omp_set_nested, which have had deprecation messages added to them. The change allows certain settings of OMP_NUM_THREADS, OMP_PROC_BIND, and OMP_MAX_ACTIVE_LEVELS to turn on nesting, but OMP_NESTED=0 will still force nesting to be off. The runtime now prints informative messages about deprecation of OMP_NESTED, omp_set_nested(), and omp_get_nested(), when those environment variables or routines are used. It also prints deprecated message in output for KMP_SETTINGS and OMP_DISPLAY_ENV for OMP_NESTED. This patch also fixes OMP_DISPLAY_ENV output for OMP_TARGET_OFFLOAD. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D58408 llvm-svn: 355138
* [OpenMP] Make use of sched_yield optional in runtimeJonathan Peyton2019-02-281-44/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch cleans up the yielding code and makes it optional. An environment variable, KMP_USE_YIELD, was added. Yielding is still on by default (KMP_USE_YIELD=1), but can be turned off completely (KMP_USE_YIELD=0), or turned on only when oversubscription is detected (KMP_USE_YIELD=2). Note that oversubscription cannot always be detected by the runtime (for example, when the runtime is initialized and the process forks, oversubscription cannot be detected currently over multiple instances of the runtime). Because yielding can be controlled by user now, the library mode settings (from KMP_LIBRARY) for throughput and turnaround have been adjusted by altering blocktime, unless that was also explicitly set. In the original code, there were a number of places where a double yield might have been done under oversubscription. This version checks oversubscription and if that's not going to yield, then it does the spin check. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D58148 llvm-svn: 355120
* [OpenMP] Adding GOMP compatible cancellationJonathan Peyton2019-02-191-0/+1
| | | | | | | | | | | Remove fatal error messages from the cancellation API for GOMP Add __kmp_barrier_gomp_cancel() to implement cancellation of parallel regions. This new function uses the linear barrier algorithm with a cancellable nonsleepable wait loop. Differential Revision: https://reviews.llvm.org/D57969 llvm-svn: 354367
* [OpenMP] Fix thread_limits to work properly for teams constructJonathan Peyton2019-02-111-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The thread-limit-var and omp_get_thread_limit API was not perfectly handled for teams construct. Now, when modified by thread_limit clause, omp_get_thread_limit reports the correct value. In addition, the value is restored when leaving the teams construct to what it was in the encountering context. This is done partly by creating the notion of a Contention Group root (CG root) that keeps track of the thread at the root of each separate CG, the thread-limit-var associated with the CG, and associated counter of active threads within the contention group. thread-limits are passed from master to worker threads via an entry in the ICV data structure. When a "contention group switch" occurs, a new CG root record is made and passed from master to worker. A thread could potentially have several CG root records if it encounters multiple nested teams constructs (but at the moment the spec doesn't allow for nested teams, so the most one could have currently is 2). The master of the teams masters gets the thread-limit clause value stored to its local ICV structure, and the other teams masters copy it from the master. The thread-limit is set from that ICV copy and restored to the ICV copy when entering and leaving the teams construct. This change also fixes a bug when the top-level teams construct team gets reused, and OMP_DYNAMIC was true, which can cause the expected size of this team to be smaller than what was actually allocated. The fix updates the size of the team after its threads were reserved. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D56804 llvm-svn: 353747
* Update more file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | | to reflect the new license. These used slightly different spellings that defeated my regular expressions. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351648
* [OpenMP] Add omp_pause_resource* APIJonathan Peyton2019-01-161-0/+31
| | | | | | | | | | | | Add omp_pause_resource and omp_pause_resource_all API and enum, plus stub for internal implementation. Implemented callable helper function to do local pause, and added basic functionality for hard and soft pause. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D55078 llvm-svn: 351372
* [OpenMP] Fix LIBOMP_USE_DEBUGGER=ON build (PR38612)Roman Lebedev2019-01-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Two things: 1. Those two variables had the wrong sigdness, which was resulting in "sign mismatch in comparison" warning. 2. The whole `kmp_debugger.cpp` wasn't being built, or rather, it was being built as-if `USE_DEBUGGER` was off, thus, nothing provided the definition of `__kmp_omp_debug_struct_info`, `__kmp_debugging`. Makes sense, because `USE_DEBUGGER` is set in `kmp_config.h`, which is not included explicitly. It is included by `kmp.h`, but that one is only included inside of the `#if USE_DEBUGGER` block.. I *think* this is the only source file with this issue, everything else seem to `#include` either `kmp.h` or `kmp_config.h`. The alternative solution would be to add `add_compile_options(-include kmp_config.h)` in CMake. I did verify that `__kmp_omp_debug_struct_info` becomes available with this patch. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=38612 | PR38612 ]]. Reviewers: AndreyChurbanov, jlpeyton, Hahnfeld Reviewed By: jlpeyton Subscribers: guansong, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D55783 llvm-svn: 351019
* [OpenMP] Add omp_get_device_num() and update several other device API functionsJonathan Peyton2019-01-031-0/+7
| | | | | | | | | | | | | | | | | | | | Add omp_get_device_num() function for 5.0 which returns the number of the device the current thread is running on. Currently, we are leaving it to the compiler to handle this properly if it is called inside target. Also, did some cleanup and updating of duplicate device API functions (in both libomp and libomptarget) to make them into weak functions that check for the symbol from libomptarget, and will call the version in libomptarget if it is present. If any additional device API functions are implemented also in libomptarget in the future, we should add the dlsym calls to the host functions. Also, if the omp_target_* functions are to be implemented for the host (this has been requested), they should attempt to call the libomptarget versions as well. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D55578 llvm-svn: 350352
* [OpenMP] Implement OpenMP 5.0 affinity format functionalityJonathan Peyton2018-12-131-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the affinity format functionality introduced in OpenMP 5.0. This patch adds: Two new environment variables: OMP_DISPLAY_AFFINITY=TRUE|FALSE OMP_AFFINITY_FORMAT=<string> and Four new API: 1) omp_set_affinity_format() 2) omp_get_affinity_format() 3) omp_display_affinity() 4) omp_capture_affinity() The affinity format functionality has two ICV's associated with it: affinity-display-var (bool) and affinity-format-var (string). The affinity-display-var enables/disables the functionality through the envirable OMP_DISPLAY_AFFINITY. The affinity-format-var is a formatted string with the special field types beginning with a '%' character similar to printf For example, the affinity-format-var could be: "OMP: host:%H pid:%P OStid:%i num_threads:%N thread_num:%n affinity:{%A}" The affinity-format-var is displayed by every thread implicitly at the beginning of a parallel region when any thread's affinity has changed (including a brand new thread being spawned), or explicitly using the omp_display_affinity() API. The omp_capture_affinity() function can capture the affinity-format-var in a char buffer. And omp_set|get_affinity_format() allow the user to set|get the affinity-format-var explicitly at runtime. omp_capture_affinity() and omp_get_affinity_format() both return the number of characters needed to hold the entire string it tried to make (not including NULL character). If not enough buffer space is available, both these functions truncate their output. Differential Revision: https://reviews.llvm.org/D55148 llvm-svn: 349089
* Support clang compiling under windows-gnu and windows-msvcAndrey Churbanov2018-12-101-1/+5
| | | | | | | | Patch by Peiyuan Song <squallatf@gmail.com> Differential Revision: https://reviews.llvm.org/D53422 llvm-svn: 348756
* Add OpenBSD support to OpenMPKamil Rytarowski2018-12-091-0/+4
| | | | | | | | | | | | | | | | Summary: This patch permits OpenMP to build and work (with both gcc and clang) on OpenBSD. It mostly follows what was done for FreeBSD and NetBSD, except OpenBSD does not have pthread_getattr_np support, so it follows OS X in that one instance. Reviewers: #openmp, krytarowski Reviewed By: krytarowski Subscribers: guansong, jfb, emaste, mgorny, krytarowski, #openmp Tags: #openmp Differential Revision: https://reviews.llvm.org/D34280 llvm-svn: 348726
* Add DragonFlyBSD support to OpenMPKamil Rytarowski2018-12-091-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Additions mostly follow FreeBSD and NetBSD and are not intrusive. There is similar patch for OpenBSD: https://reviews.llvm.org/D34280 The -lm was being omitted due to -Wl,--as-needed in cmake rule, similar patch is in freebsd-ports/devel/llvm-devel port. Simple OpenMP programs compile and work as expected: $ clang-devel ~/omp_hello.c -fopenmp -I/usr/local/llvm-devel/include $ LD_LIBRARY_PATH=/usr/local/llvm-devel/lib OMP_NUM_THREADS=100 ./a.out The assertion in LLVMgold.so when -fopenmp was used together with -flto in 20170524 snapshot is no longer triggered on current svn-trunk and works fine as in llvm-4.0 with our local patches. Reviewers: #openmp, krytarowski Reviewed By: krytarowski Subscribers: dexonsmith, jfb, krytarowski, guansong, gregrodgers, emaste, mgorny, mehdi_amini Differential Revision: https://reviews.llvm.org/D35129 llvm-svn: 348725
* Revert r347799: Add omp_get_device_num() and update other device APIJonathan Peyton2018-11-291-7/+0
| | | | | | | There is a conflict between libomptarget and libomp concerning some of the standard OpenMP device API which needs further intestigation. llvm-svn: 347932
* [OpenMP] Add stubs for Task affinity APIJonathan Peyton2018-11-291-0/+15
| | | | | | | | | | | | | This patch adds __kmpc_omp_reg_task_with_affinity to register affinity information for tasks. For now, the affinity information is not used, and the function always succeeds. This also adds the kmp_task_affinity_info_t structure to store the task affinity information. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D55026 llvm-svn: 347907
* [OpenMP] Add omp_get_device_num() and update several other device API functionsJonathan Peyton2018-11-281-0/+7
| | | | | | | | | | | | | Add omp_get_device_num() function for 5.0 which returns the number of the device the current thread is running on. Also, did some cleanup and updating of device API functions to make them into weak functions that should be replaced with libomptarget functions when libomptarget is present. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D54342 llvm-svn: 347799
* Add Hurd support.Andrey Churbanov2018-11-071-0/+4
| | | | | | | | Patch by samuel.thibault@ens-lyon.org Differential Revision: https://reviews.llvm.org/D54079 llvm-svn: 346310
* Implementation of OpenMP 5.0 mutexinoutset task dependency type.Andrey Churbanov2018-11-071-6/+14
| | | | | | Differential Revision: https://reviews.llvm.org/D53380 llvm-svn: 346307
* [OpenMP] Add missing __kmpc_critical_with_hint to dllexportsJonathan Peyton2018-09-261-4/+2
| | | | | | | | | This patch puts the __kmpc_critical_with_hint function in dllexports and also replaces some OMP_45_ENABLED to OMP_50_ENABLED Differential Revision: https://reviews.llvm.org/D52380 llvm-svn: 343143
* [OpenMP] Fix balanced affinity so thread's private affinity mask is updatedJonathan Peyton2018-09-261-1/+1
| | | | | | | | | | | Balanced affinity only updated the thread's affinity with the operating system. This change also has the thread's private mask reflect that change as well so that any API that probes the thread's affinity mask will report the correct mask value. Differential Revision: https://reviews.llvm.org/D52379 llvm-svn: 343142
* [OpenMP] Fix performance issue from 376.kdtreeJonathan Peyton2018-09-261-3/+0
| | | | | | | | | | | | | | This change improves the performance of 376.kdtree by giving the compiler an opportunity to do inlining and other optimizations for the call path, __kmpc_omp_task_complete_if0()->__kmp_task_finish(), which is one of the hot paths in the program; some functions in kmp_taskdeps.cpp were moved to the new header file, kmp_taskdeps.h to achieve this. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D51889 llvm-svn: 343138
* [OpenMP] Change hint parameter type for critical to uint32_tJonathan Peyton2018-09-071-1/+5
| | | | | | | | | | | Add atomic hint flags to the enum. The hint parameter type was changed to uint32_t in __kmpc_critical_with_hint() Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D51235 llvm-svn: 341694
* [OpenMP] Synchronization hint constants added to headersJonathan Peyton2018-09-071-25/+33
| | | | | | | | | | | | | | | | ident flags reserved for atomic hints. This patch adds omp_sync_hint_t to omp.h and omp_sync_hint_kind to omp_lib.h. For better maintainability the list of macros for ident flags was replaced with a enum. The new KMP_IDENT_ATOMIC_HINT_MASK was added to the enum to support possible future atomic hints. Also fix omp_lib.h.var to be under 72 chars again after 5.0 OpenMP Memory commit Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D51233 llvm-svn: 341693
* [OpenMP] Initial implementation of OMP 5.0 Memory Management routinesJonathan Peyton2018-09-071-1/+31
| | | | | | | | | | | | | | | | | | | | | | | Implemented omp_alloc, omp_free, omp_{set,get}_default_allocator entries, and OMP_ALLOCATOR environment variable. Added support for HBW memory on Linux if libmemkind.so library is accessible (dynamic library only, no support for static libraries). Only used stable API (hbwmalloc) of the memkind library though we may consider using experimental API in future. The ICV def-allocator-var is implemented per implicit task similar to place-partition-var. In the absence of a requested allocator, the uses the default allocator. Predefined allocators (the only ones currently available) are made similar for C and Fortran, - pointers (long integers) with values 1 to 8. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D51232 llvm-svn: 341687
* [OpenMP] Cleanup codeJonathan Peyton2018-08-091-38/+49
| | | | | | | | | | | | | | | | This patch cleans up unused functions, variables, sign compare issues, and addresses some -Warning flags which are now enabled including -Wcast-qual. Not all the warning flags in LibompHandleFlags.cmake are enabled, but some are with this patch. Some __kmp_gtid_from_* macros in kmp.h are switched to static inline functions which allows us to remove the awkward definition of KMP_DEBUG_ASSERT() and KMP_ASSERT() macros which used the comma operator. This had to be done for the innumerable -Wunused-value warnings related to KMP_DEBUG_ASSERT() Differential Revision: https://reviews.llvm.org/D49105 llvm-svn: 339393
* [OpenMP] Implement GOMP doacross compatibilityJonathan Peyton2018-07-301-3/+4
| | | | | | | | | | | | | | | | | | | | | | This change introduces GOMP doacross compatibility. There are 12 new interface functions 6 for long type and 6 for unsigned long long type: GOMP_doacross_post, GOMP_doacross_wait, GOMP_loop_doacross_[schedule]_start where schedule can be static, dynamic, guided, or runtime. These functions just translate the parameters if necessary and send them to the corresponding kmp function. E.g., GOMP_doacross_post() -> __kmpc_doacross_post() For the GOMP_doacross_post function, there is template specialization to account for when long is a four byte vs an eight byte type. If it is a four byte type, then a temporary array has to be created to convert the four byte integers into eight byte integers and then sending that into __kmpc_doacross_post(). Because GOMP_doacross_wait uses varargs, it always needs a temporary array and does not need template specialization. Differential Revision: https://reviews.llvm.org/D49857 llvm-svn: 338280
* [OpenMP] Fix build errors when building with KMP_DEBUG_ADAPTIVE_LOCKS=1Jonathan Peyton2018-07-301-1/+1
| | | | | | | | | | | | | This change fixes build errors when building a runtime with adaptive lock stats enabled. Most of the errors were due to the recent changes in the runtime, but it seems that we have not tried to build this debug runtime on Windows for a long time. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D49823 llvm-svn: 338277
* PR30734: Remove __kmp_ft_page_allocate()Jonas Hahnfeld2018-07-261-9/+0
| | | | | | | | | | | | | This function was not enabled by default and not exported when manually tweaking the build flags. Additionally it was hard to use since there is no corresponding __kmp_ft_page_free(). The code itself is questionable because the returned memory address is padded by an extra pointer which stores the unpadded start of the allocated region (this would need to be freed). Differential Revision: https://reviews.llvm.org/D49802 llvm-svn: 338052
* Block library shutdown until unreaped threads finish spin-waitingJonathan Peyton2018-07-191-0/+3
| | | | | | | | | | | | | | | | This change fixes possibly invalid access to the internal data structure during library shutdown. In a heavily oversubscribed situation, the library shutdown sequence can reach the point where resources are deallocated while there still exist threads in their final spinning loop. The added loop in __kmp_internal_end() checks if there are such busy-waiting threads and blocks the shutdown sequence if that is the case. Two versions of kmp_wait_template() are now used to minimize performance impact. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D49452 llvm-svn: 337486
* [OpenMP] Introduce hierarchical schedulingJonathan Peyton2018-07-091-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the logic implementing hierarchical scheduling. First and foremost, hierarchical scheduling is off by default To enable, use -DLIBOMP_USE_HIER_SCHED=On during CMake's configure stage. This work is based off if the IWOMP paper: "Workstealing and Nested Parallelism in SMP Systems" Hierarchical scheduling is the layering of OpenMP schedules for different layers of the memory hierarchy. One can have multiple layers between the threads and the global iterations space. The threads will go up the hierarchy to grab iterations, using possibly a different schedule & chunk for each layer. [ Global iteration space (0-999) ] (use static) [ L1 | L1 | L1 | L1 ] (use dynamic,1) [ T0 T1 | T2 T3 | T4 T5 | T6 T7 ] In the example shown above, there are 8 threads and 4 L1 caches begin targeted. If the topology indicates that there are two threads per core, then two consecutive threads will share the data of one L1 cache unit. This example would have the iteration space (0-999) split statically across the four L1 caches (so the first L1 would get (0-249), the second would get (250-499), etc). Then the threads will use a dynamic,1 schedule to grab iterations from the L1 cache units. There are currently four supported layers: L1, L2, L3, NUMA OMP_SCHEDULE can now read a hierarchical schedule with this syntax: OMP_SCHEDULE='EXPERIMENTAL LAYER,SCHED[,CHUNK][:LAYER,SCHED[,CHUNK]...]:SCHED,CHUNK And OMP_SCHEDULE can still read the normal SCHED,CHUNK syntax from before I've kept most of the hierarchical scheduling logic inside kmp_dispatch_hier.h to try to keep it separate from the rest of the code. Differential Revision: https://reviews.llvm.org/D47962 llvm-svn: 336571
* [OpenMP] Restructure loop code for hierarchical schedulingJonathan Peyton2018-07-091-2/+10
| | | | | | | | | | | | | | | | | | | | | | This patch reorganizes the loop scheduling code in order to allow hierarchical scheduling to use it more effectively. In particular, the goal of this patch is to separate the algorithmic parts of the scheduling from the thread logistics code. Moves declarations & structures to kmp_dispatch.h for easier access in other files. Extracts the algorithmic part of __kmp_dispatch_init() and __kmp_dispatch_next() into __kmp_dispatch_init_algorithm() and __kmp_dispatch_next_algorithm(). The thread bookkeeping logic is still kept in __kmp_dispatch_init() and __kmp_dispatch_next(). This is done because the hierarchical scheduler needs to access the scheduling logic without the bookkeeping logic. To prepare for new pointer in dispatch_private_info_t, a new flags variable is created which stores the ordered and nomerge flags instead of them being in two separate variables. This will keep the dispatch_private_info_t structure the same size. Differential Revision: https://reviews.llvm.org/D47961 llvm-svn: 336568
OpenPOWER on IntegriCloud