path: root/openmp/runtime/src
Commit message | Author | Age | Files | Lines
* [OMPT] Handle null pointer in set_callback to improve performance | Joachim Protze | 2017-12-21 | 1 | -2/+5
  We use the bitmap ompt_enabled throughout the runtime to avoid loading the vector of callback functions when testing whether specific code should be executed. Before invoking an event callback function, the pointer is tested for NULL. This revision resets the corresponding bit in ompt_enabled to 0 if NULL is passed in set_callback.
  Differential Revision: https://reviews.llvm.org/D41171
  llvm-svn: 321264
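The bitmap idea described above can be sketched as follows. This is only an illustration under stated assumptions: every name here (`sketch_event_t`, `sketch_enabled`, `sketch_set_callback`, `sketch_dispatch`) is hypothetical and not taken from the runtime, and the real OMPT data structures are considerably larger.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical event identifiers; the real runtime defines many more.
enum sketch_event_t { EVENT_THREAD_BEGIN = 0, EVENT_PARALLEL_BEGIN = 1, EVENT_COUNT = 2 };

using sketch_callback_t = void (*)();

// One bit per event, plus the table of registered callbacks (both hypothetical).
static std::uint32_t sketch_enabled = 0;
static sketch_callback_t sketch_callbacks[EVENT_COUNT] = {nullptr, nullptr};

// Example callback that only counts how often it was invoked.
static int sketch_hits = 0;
inline void sketch_count_hit() { ++sketch_hits; }

// Register a callback, or unregister it by passing nullptr.
// The bit is cleared for nullptr so the hot path never sees a null pointer.
inline void sketch_set_callback(sketch_event_t ev, sketch_callback_t cb) {
  assert(ev < EVENT_COUNT);
  sketch_callbacks[ev] = cb;
  if (cb)
    sketch_enabled |= (1u << ev);
  else
    sketch_enabled &= ~(1u << ev);
}

// Hot path: test the bit first and load the callback pointer only if it is set.
inline void sketch_dispatch(sketch_event_t ev) {
  if (sketch_enabled & (1u << ev))
    sketch_callbacks[ev]();
}
```

With this layout, code that has no tool attached pays only for one bit test per event site.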
* [AArch64] fix an issue with older /proc/cpuinfo layout | Paul Osmialowski | 2017-12-13 | 1 | -0/+8
  There are two /proc/cpuinfo layouts in use for AArch64: old and new. The old one has all 'processor : n' lines in one section, hence checking for duplications does not make sense.
  Differential Revision: https://reviews.llvm.org/D41000
  llvm-svn: 320593
* Use hyperbarrier by default on all architectures | Jonas Hahnfeld | 2017-12-08 | 1 | -15/+6
  All architectures except x86_64 used the linear barrier implementation by default, which doesn't give good performance for a larger number of threads.
  Improvements for PARALLEL overhead (EPCC) with this patch on a Power8 system (2 sockets x 10 cores x 8 threads, OMP_PLACES=cores):
    20 threads: 4.55us -> 3.49us
    40 threads: 8.84us -> 4.06us
    80 threads: 19.18us -> 4.74us
    160 threads: 54.22us -> 6.73us
  Differential Revision: https://reviews.llvm.org/D40358
  llvm-svn: 320152
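To put the figures quoted above into perspective, the relative improvement can be computed as before/after. The sketch below only restates the numbers from the commit message; the type and function names are hypothetical. The ratio grows from roughly 1.3x at 20 threads to roughly 8x at 160 threads.

```cpp
#include <cassert>

// One measurement pair from the commit message (times in microseconds).
struct barrier_sample {
  int threads;
  double before_us;
  double after_us;
};

// Figures copied from the commit message (EPCC PARALLEL overhead on Power8).
static const barrier_sample kSamples[] = {
    {20, 4.55, 3.49}, {40, 8.84, 4.06}, {80, 19.18, 4.74}, {160, 54.22, 6.73}};

// Speedup factor: how many times lower the overhead is after the patch.
inline double speedup(const barrier_sample &s) {
  assert(s.after_us > 0.0);
  return s.before_us / s.after_us;
}
```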
* Fix thread affinity on non-x86 Linux | Jonas Hahnfeld | 2017-12-08 | 2 | -5/+2
  To make thread affinity work according to the OpenMP spec, the runtime needs information about the hardware topology. On Linux the default way is to parse /proc/cpuinfo, which contains this information for x86 machines but (at least) not for AArch64 and Power architectures. Fortunately, there is a different code path which is able to get that data from sysfs. The needed patch landed in 2006 for Linux 2.6.16, which is safe to assume nowadays (even RHEL 5 had a kernel version derived from 2.6.18, and we are now at RHEL 7!).
  Differential Revision: https://reviews.llvm.org/D40357
  llvm-svn: 320151
* Add missing memory barrier for queuing locks | Jonas Hahnfeld | 2017-12-08 | 1 | -0/+1
  Otherwise I see hangs in the omp_single_copyprivate test when compiling in release mode. With the debug assertions, I get a failure `head > 0 && tail > 0`.
  Differential Revision: https://reviews.llvm.org/D40722
  llvm-svn: 320150
* [OpenMP] Add entry for Intel Compiler 18 | Jonathan Peyton | 2017-12-06 | 1 | -0/+2
  Patch by Simon Convent
  Differential Revision: https://reviews.llvm.org/D40386
  llvm-svn: 319961
* Eliminate double printing of verbose affinity settings | Jonathan Peyton | 2017-12-06 | 1 | -1/+3
  Removes the redundant extra verbose output of binding to the full mask when affinity=balanced, OMP_PLACES=<any>, or OMP_PROC_BIND=<any> is set.
  Differential Revision: https://reviews.llvm.org/D40624
  llvm-svn: 319960
* Trivial enum fix | Jonathan Peyton | 2017-12-06 | 1 | -4/+4
  This change is a trivial fix for enums that removes specification of "last" or "upper" values, or other boundary values. This simplifies the code in places, and results in never needing to update the "upper" values again.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D40804
  llvm-svn: 319957
* Fix PR30890: Reduction across teams hangs | Jonas Hahnfeld | 2017-12-05 | 1 | -23/+70
  __kmpc_reduce_nowait() correctly swapped the teams for reductions in a teams construct. Apply the same logic to __kmpc_reduce() and __kmpc_reduce_end().
  Differential Revision: https://reviews.llvm.org/D40753
  llvm-svn: 319788
* Extension of HWLOC topology discovery with NUMA nodes and tiles | Andrey Churbanov | 2017-11-30 | 5 | -179/+385
  Patch by Olga Malysheva
  Differential Revision: https://reviews.llvm.org/D40309
  llvm-svn: 319422
* Make kmp_r_sched_t into a union | Jonathan Peyton | 2017-11-29 | 2 | -30/+24
  This change makes the kmp_r_sched_t type into a union for simpler comparisons and assignments.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D40374
  llvm-svn: 319379
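Why a union simplifies comparisons and assignments can be sketched like this. The layout below is an assumption made for illustration (two 32-bit fields overlaid by one 64-bit value); the type `sketch_r_sched_t` and its members are hypothetical and may not match the real kmp_r_sched_t definition.

```cpp
#include <cstdint>

// Hypothetical stand-in for a schedule descriptor: two fields that can
// also be viewed as a single 64-bit value.
union sketch_r_sched_t {
  struct {
    std::int32_t kind;   // schedule kind (static, dynamic, ...)
    std::int32_t chunk;  // chunk size
  } fields;
  std::int64_t combined;  // both fields viewed as one value
};

static_assert(sizeof(sketch_r_sched_t) == sizeof(std::int64_t),
              "the two fields are expected to fill the 64-bit view exactly");

inline sketch_r_sched_t make_sched(std::int32_t kind, std::int32_t chunk) {
  sketch_r_sched_t s;
  s.combined = 0;
  s.fields.kind = kind;
  s.fields.chunk = chunk;
  return s;
}

// One 64-bit comparison instead of comparing each field separately.
// Note: reading `combined` after writing `fields` relies on union type
// punning, which GCC, Clang and MSVC support but ISO C++ does not guarantee.
inline bool same_sched(const sketch_r_sched_t &a, const sketch_r_sched_t &b) {
  return a.combined == b.combined;
}
```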
* Fix aligned memory allocation in the stub library | Jonathan Peyton | 2017-11-29 | 1 | -9/+34
  kmp_aligned_malloc() always returned NULL on Windows (stub library only), which may cause Fortran applications to crash. With this change all memory allocation functions were fixed to use _aligned_{m,re,rec}alloc() to allocate/reallocate memory. To deallocate that memory, _aligned_free() is used in kmp_free().
  Patch by Olga Malysheva
  Differential Revision: https://reviews.llvm.org/D40296
  llvm-svn: 319375
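The pairing rule behind this fix (memory from an aligned allocator must be released with the matching aligned free) can be sketched as below. This is not the stub library's code: the wrapper names are hypothetical, and the non-Windows branch uses POSIX posix_memalign() purely so the sketch is portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>
#endif

// Allocate `size` bytes aligned to `alignment`, which must be a power of two
// (and, for posix_memalign, a multiple of sizeof(void *)).
inline void *sketch_aligned_malloc(std::size_t size, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
#if defined(_WIN32)
  return _aligned_malloc(size, alignment);
#else
  void *p = nullptr;
  return posix_memalign(&p, alignment, size) == 0 ? p : nullptr;
#endif
}

// Release memory obtained from sketch_aligned_malloc with the matching call.
inline void sketch_aligned_free(void *p) {
#if defined(_WIN32)
  _aligned_free(p);
#else
  free(p);
#endif
}

// Helper to check the alignment of a returned pointer.
inline bool sketch_is_aligned(const void *p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```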
* Warning is emitted when tiles are requested but cannot be used | Jonathan Peyton | 2017-11-29 | 2 | -1/+15
  Added two warnings:
  1) Before building the topology map, check if tiles are requested but the topo method is not hwloc;
  2) After building the topology map, check if tiles are requested but not detected by the library.
  Patch by Olga Malysheva
  Differential Revision: https://reviews.llvm.org/D40340
  llvm-svn: 319374
* Fix types of Fortran array elements | Jonathan Peyton | 2017-11-29 | 6 | -12/+12
  Fortran array elements are made default integer in the OMP_GET_PLACE_PROC_IDS and OMP_GET_PARTITION_PLACE_NUMS subroutines; otherwise calls to them produce incorrect results.
  Patch by Olga Malysheva
  Differential Revision: https://reviews.llvm.org/D40356
  llvm-svn: 319372
* [CMake] Refactor common settings and flags | Jonas Hahnfeld | 2017-11-29 | 1 | -5/+5
  These are needed by both libraries, so we can do that in a common namespace and unify configuration parameters. Also make sure that the user isn't requesting libomptarget if the library cannot be built on the system. Issue an error in that case.
  Differential Revision: https://reviews.llvm.org/D40081
  llvm-svn: 319342
* [CMake] Disallow direct configuration | Jonas Hahnfeld | 2017-11-29 | 1 | -1/+1
  As a first step, this allows us to generalize the detection of standalone builds and make it fully compatible when building in llvm/runtimes/, which automatically sets OPENMP_STANDALONE_BUILD.
  Differential Revision: https://reviews.llvm.org/D40080
  llvm-svn: 319341
* Fix for OMP doacross implementation on Power | Jonas Hahnfeld | 2017-11-22 | 1 | -1/+8
  Power has a weak consistency model, so we need memory barriers to make writes (both from runtime and from user code) available for all threads.
  Differential Revision: https://reviews.llvm.org/D40175
  llvm-svn: 318848
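The ordering problem on weakly ordered hardware can be illustrated with a small publish/consume sketch. It deliberately uses C++11 atomics with release/acquire ordering instead of the runtime's own barrier primitives, and all names (`sketch_doacross_slot`, `sketch_post`, `sketch_wait`) are hypothetical.

```cpp
#include <atomic>

// Hypothetical slot shared between a producing and a consuming iteration.
struct sketch_doacross_slot {
  int payload = 0;                  // data written by the producing iteration
  std::atomic<bool> posted{false};  // flag signalling that payload is ready
};

// Producer: write the data first, then publish the flag with release
// ordering so the payload write cannot be reordered after the flag store.
inline void sketch_post(sketch_doacross_slot &s, int value) {
  s.payload = value;
  s.posted.store(true, std::memory_order_release);
}

// Consumer: spin until the flag is seen with acquire ordering; after that
// the payload written before the release store is guaranteed to be visible.
inline int sketch_wait(sketch_doacross_slot &s) {
  while (!s.posted.load(std::memory_order_acquire)) {
    // busy-wait; a real runtime would yield or back off here
  }
  return s.payload;
}
```

On x86 the plain stores would often appear to work without these orderings; on Power the consumer may otherwise observe the flag before the payload.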
* Fixed OMP doacross implementation on 32-bit platforms. | Andrey Churbanov | 2017-11-20 | 1 | -8/+15
  Differential Revision: https://reviews.llvm.org/D40171
  llvm-svn: 318658
* Exclude untied tasks from checking of task scheduling constraint (TSC). | Andrey Churbanov | 2017-11-16 | 3 | -101/+164
  This can improve performance of tests with untied tasks.
  Differential Revision: https://reviews.llvm.org/D39613
  llvm-svn: 318388
* [OMPT] Provide initialization for Mac OS X | Jonas Hahnfeld | 2017-11-11 | 1 | -33/+53
  Traditionally, the library had a weak symbol for ompt_start_tool() that served as fallback and disabled OMPT if called. Tools could provide their own version and replace the default implementation to register callbacks and lookup functions. This mechanism has worked reasonably well on Linux systems where this interface was initially developed.

  On Darwin / Mac OS X the situation is a bit more complicated and the weak symbol doesn't work out-of-the-box. In my tests, the library with the tool needed to link against the OpenMP runtime to make the process work. This would effectively mean that a tool needed to choose a runtime library whereas one design goal of the interface was to allow tools that are agnostic of the runtime.

  The solution is to use dlsym() with the argument RTLD_DEFAULT so that static implementations of ompt_start_tool() are found in the main executable. This works because the linker on Mac OS X includes all symbols of an executable in the global symbol table by default. To use the same code path on Linux, the application would need to be built with -Wl,--export-dynamic. To avoid this restriction, we continue to use weak symbols on Linux systems as before.

  Finally this patch extends the existing test to cover all possible ways of initializing the tool as described by the standard. It also fixes ompt_finalize() to not call omp_get_thread_num() when the library is shut down, which resulted in hangs on Darwin.

  The changes have been tested on Linux to make sure that it passes the current tests as well as the newly extended one.
  Differential Revision: https://reviews.llvm.org/D39801
  llvm-svn: 317980
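The lookup described above can be sketched with the POSIX dlsym() call and the RTLD_DEFAULT pseudo-handle. The wrapper name `sketch_find_global_symbol` is hypothetical, and the sketch only shows the lookup itself, not how the runtime calls the tool afterwards.

```cpp
#include <cassert>
#include <dlfcn.h>

// Search the global symbol table (main executable and loaded libraries)
// for `name`. Returns nullptr if no such symbol is visible.
inline void *sketch_find_global_symbol(const char *name) {
  assert(name != nullptr);
  return dlsym(RTLD_DEFAULT, name);
}
```

A caller would pass "ompt_start_tool" and, on a non-null result, cast the pointer to the entry point's function type before calling it. On Linux the symbol is only found in the main executable if it was linked with -Wl,--export-dynamic, as the commit message notes; older glibc versions may also need -ldl at link time.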
* [OMPT] Fix assertion for OpenMP code generated with outdated compilers | Joachim Protze | 2017-11-10 | 3 | -7/+13
  For up-to-date compilers, this assertion is reasonable, but it breaks compatibility with the typical compiler installed on most systems. This patch changes the default value to what we had when there was no compiler support. A warning about the outdated compiler is printed during runtime, when this point is reached.
  Differential Revision: https://reviews.llvm.org/D39890
  llvm-svn: 317928
* Add const to some variables to avoid const_casts | Jonas Hahnfeld | 2017-11-09 | 2 | -7/+6
  In these places the const attribute seems correct and doesn't need any other change, so let's do it.
  Differential Revision: https://reviews.llvm.org/D39756
  llvm-svn: 317798
* Remove const from variables with dynamic memory | Jonas Hahnfeld | 2017-11-09 | 13 | -83/+70
  Allocated memory is typically not 'const' if it needs to be freed. This patch removes around 50 wrong const attributes, modifies the corresponding functions and finally gets rid of some const_casts. These have especially been strange for __kmp_str_fname_free() that added a 'const' to call __kmp_str_free() which removed it again.
  Two minor cleanups that I performed in this process:
  * __kmp_tool_libraries now lives in kmp_settings.cpp as it is used nowhere else.
  * __kmp_msg_empty was removed as it was never used and Clang now complained that it was assigned a string literal that is 'const char *'.
  Differential Revision: https://reviews.llvm.org/D39755
  llvm-svn: 317797
* Cleanup version symbol macros and attributes/declspecs | Jonathan Peyton | 2017-11-07 | 10 | -523/+371
  1) Get rid of xaliasify, xexpand and xversionify for KMP_EXPAND_NAME and KMP_VERSION_SYMBOL. KMP_VERSION_SYMBOL is a combination of xaliasify and xversionify.
  2) Put all attribute and __declspec definitions in kmp_os.h
  Differential Revision: https://reviews.llvm.org/D39516
  llvm-svn: 317636
* [OMPT] Improve cast that was lost on commit, NFC. | Jonas Hahnfeld | 2017-11-06 | 1 | -2/+2
  llvm-svn: 317480
* Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6) | Joachim Protze | 2017-11-05 | 5 | -259/+167
  The TR6 document is expected to be publicly released around November 15. This patch does not implement OMPT for libomptarget.
  Patch by Simon Convent and Joachim Protze
  Differential Revision: https://reviews.llvm.org/D39182
  llvm-svn: 317436
* Rename fields of ompt_frame_t | Joachim Protze | 2017-11-05 | 8 | -102/+94
  This is part of the renaming of data types from OpenMP TR4 to TR6.
  Patch by Simon Convent
  Differential Revision: https://reviews.llvm.org/D39326
  llvm-svn: 317435
* Revert "Rename fields of ompt_frame_t" | Jonas Hahnfeld | 2017-11-03 | 8 | -94/+102
  This reverts commit r317338 which discarded some recent commits.
  llvm-svn: 317347
* Revert "Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6)" | Jonas Hahnfeld | 2017-11-03 | 15 | -218/+314
  This reverts commit r317339 which discarded some recent commits.
  llvm-svn: 317346
* Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6) | Joachim Protze | 2017-11-03 | 15 | -314/+218
  The TR6 document is expected to be publicly released around November 15. This patch does not implement OMPT for libomptarget.
  Patch by Simon Convent and Joachim Protze
  Differential Revision: https://reviews.llvm.org/D39182
  llvm-svn: 317339
* Rename fields of ompt_frame_t | Joachim Protze | 2017-11-03 | 8 | -102/+94
  This is part of the renaming of data types from OpenMP TR4 to TR6.
  Patch by Simon Convent
  Differential Revision: https://reviews.llvm.org/D39326
  llvm-svn: 317338
* [OpenMP] Fix race condition in omp_init_lock | Jonathan Peyton | 2017-11-01 | 1 | -3/+4
  This is a partial fix for bug 34050.
  This prevents callers of omp_set_lock (which does not hold __kmp_global_lock) from ever seeing an uninitialized version of __kmp_i_lock_table.table.
  It does not solve a use-after-free race condition if omp_set_lock obtains a pointer to __kmp_i_lock_table.table before it is updated and then attempts to dereference afterwards. That race is far less likely and can be handled in a separate patch.
  The unit test usually segfaults on the current trunk revision. It passes with the patch.
  Patch by Adam Azarchs
  Differential Revision: https://reviews.llvm.org/D39439
  llvm-svn: 317115
* Update implementation of OMPT to the specification OpenMP 5.0 Preview 1 (TR4). | Joachim Protze | 2017-11-01 | 31 | -1420/+3619
  The code is tested to work with the latest clang, GNU and Intel compilers. The implementation is optimized for low overhead when no tool is attached, shifting the cost to execution with a tool attached.
  This patch does not implement OMPT for libomptarget.
  Patch by Simon Convent and Joachim Protze
  Differential Revision: https://reviews.llvm.org/D38185
  llvm-svn: 317085
* Fix fatal error message displaying | Jonathan Peyton | 2017-10-25 | 1 | -6/+15
  Replacing the call to __kmp_msg(kmp_ms_fatal,...) with __kmp_fatal(...) caused an issue where an incomplete message is displayed when an error message is followed by another message (e.g. by a hint message). This is because __kmp_fatal() passes an incomplete list of arguments to __kmp_msg().
  Patch by Olga Malysheva
  Differential Revision: https://reviews.llvm.org/D39248
  llvm-svn: 316623
* Disable threadprivate data cleanup if runtime is terminating | Jonathan Peyton | 2017-10-24 | 1 | -0/+7
  The problem is due to the runtime's threadprivate cleanup code which tries to access data that was already destroyed by one of the root threads. __kmp_init_gtid is used as a checker here since it is set to false before actual resource cleanup is done in __kmp_cleanup().
  Patch by Hansang Bae
  llvm-svn: 316452
* Restrict OMPT to OpenMP version 5.0 and remove old header files | Jonathan Peyton | 2017-10-20 | 3 | -1518/+0
  Patch by Simon Convent
  Differential Revision: https://reviews.llvm.org/D38876
  llvm-svn: 316234
* Apply formatting changes | Jonathan Peyton | 2017-10-20 | 66 | -182/+42
  .clang-format's comments are removed and a (hopefully) final set of formatting changes are applied.
  Differential Revision: https://reviews.llvm.org/D38837
  Differential Revision: https://reviews.llvm.org/D38920
  llvm-svn: 316227
* KMP_HW_SUBSET vs KMP_PLACE_THREADS rival envirables fix | Jonathan Peyton | 2017-10-06 | 1 | -5/+22
  If both KMP_HW_SUBSET and KMP_PLACE_THREADS are set and KMP_PLACE_THREADS gets parsed first, then the current environment variable parser rejects both and neither gets used. This patch uses the rivals mechanism that is used for other environment variable groups (e.g., KMP_STACKSIZE, GOMP_STACKSIZE, OMP_STACKSIZE). If both are set, then it tells the user that it is ignoring KMP_PLACE_THREADS in favor of KMP_HW_SUBSET. The message about deprecating KMP_PLACE_THREADS when it is set is still printed regardless.
  Differential Revision: https://reviews.llvm.org/D38292
  llvm-svn: 315091
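The precedence rule described above can be sketched as a small pure function. This is an assumption-laden illustration, not the runtime's rivals mechanism: the struct and function names are hypothetical, and a null pointer stands for "variable not set", as getenv() would return.

```cpp
// Outcome of resolving the two rival variables (hypothetical type).
struct sketch_choice {
  const char *value;        // value that takes effect, or nullptr if none
  bool ignored_rival;       // KMP_PLACE_THREADS was set but ignored
  bool deprecation_notice;  // KMP_PLACE_THREADS was set at all
};

// KMP_HW_SUBSET wins whenever it is set; KMP_PLACE_THREADS is used only as
// a fallback and always triggers the deprecation notice when present.
inline sketch_choice sketch_pick_hw_subset(const char *hw_subset,
                                           const char *place_threads) {
  sketch_choice c;
  c.deprecation_notice = (place_threads != nullptr);
  c.ignored_rival = (hw_subset != nullptr && place_threads != nullptr);
  c.value = hw_subset ? hw_subset : place_threads;
  return c;
}
```

Because the result does not depend on which variable is examined first, the parse-order problem described in the commit message cannot occur in this formulation.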
* Remove unnecessary semicolons | Jonathan Peyton | 2017-09-27 | 27 | -509/+498
  Removes semicolons after if {} blocks, function definitions, etc. I was able to apply the large OMPT patch cleanly on top of this one with no conflicts.
  llvm-svn: 314340
* Allow printing of KMP_TOPOLOGY_METHOD when KMP_SETTINGS=true | Jonathan Peyton | 2017-09-26 | 1 | -2/+0
  llvm-svn: 314243
* Remove unused t_single_lock | Jonathan Peyton | 2017-09-26 | 2 | -2/+1
  Add padding inside team structure to keep same structure size.
  llvm-svn: 314242
* Read blocktime value set by kmp_set_blocktime() before reading from KMP_BLOCKTIME | Jonathan Peyton | 2017-09-05 | 2 | -5/+9
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D37403
  llvm-svn: 312539
* Minor code cleanup of Klocwork issues | Jonathan Peyton | 2017-09-05 | 14 | -149/+126
  Minor code cleanup of Klocwork issues. Fatal messages are given the noreturn attribute. Define and use KMP_NORETURN to work for multiple C++ versions.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D37275
  llvm-svn: 312538
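A macro that selects a noreturn spelling depending on the compiler and language version might look like the sketch below. The macro name `SKETCH_NORETURN` and both functions are hypothetical, chosen so as not to suggest this is the actual KMP_NORETURN definition.

```cpp
#include <cstdio>
#include <cstdlib>

// Pick a noreturn spelling that the current compiler understands.
#if defined(__cplusplus) && __cplusplus >= 201103L
#define SKETCH_NORETURN [[noreturn]]
#elif defined(_MSC_VER)
#define SKETCH_NORETURN __declspec(noreturn)
#else
#define SKETCH_NORETURN __attribute__((noreturn))
#endif

// Hypothetical fatal-error routine: prints a message and never returns.
SKETCH_NORETURN inline void sketch_fatal(const char *msg) {
  std::fprintf(stderr, "fatal: %s\n", msg);
  std::abort();
}

// Because sketch_fatal is marked noreturn, the compiler and static analysers
// know that control cannot fall off the end of this function.
inline int sketch_parse_digit(char c) {
  if (c >= '0' && c <= '9')
    return c - '0';
  sketch_fatal("not a digit");
}
```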
* Use va_copy instead of __va_copy to fix building libomp against musl libc | Jonathan Peyton | 2017-08-19 | 1 | -1/+1
  Fixes https://bugs.llvm.org/show_bug.cgi?id=34040
  Patch by Peter Levine
  Differential Revision: https://reviews.llvm.org/D36343
  llvm-svn: 311269
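For context, a typical use of the standard va_copy macro is to walk the same argument list twice, for example once to measure and once to format. The sketch below is generic C++ and not code from the runtime; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Format into a std::string, consuming the argument list twice.
inline std::string sketch_vformat(const char *fmt, va_list args) {
  assert(fmt != nullptr);
  va_list copy;
  va_copy(copy, args);  // standard C99/C++11 macro, unlike __va_copy
  int len = std::vsnprintf(nullptr, 0, fmt, copy);  // first pass: measure
  va_end(copy);
  if (len < 0)
    return std::string();
  std::vector<char> buf(static_cast<std::size_t>(len) + 1);
  std::vsnprintf(buf.data(), buf.size(), fmt, args);  // second pass: format
  return std::string(buf.data(), static_cast<std::size_t>(len));
}

// Variadic front end for sketch_vformat.
inline std::string sketch_format(const char *fmt, ...) {
  va_list args;
  va_start(args, fmt);
  std::string out = sketch_vformat(fmt, args);
  va_end(args);
  return out;
}
```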
* Remove BUILD_TV | Jonathan Peyton | 2017-08-17 | 5 | -81/+0
  Cleanup code to remove BUILD_TV and unused code bracketed by it.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D36011
  llvm-svn: 311114
* OMP_PROC_BIND: better spread | Paul Osmialowski | 2017-08-10 | 1 | -42/+108
  This change improves the way threads are spread across cores when OMP_PROC_BIND=spread is set and no unusual affinity masks are in use.
  Differential Revision: https://reviews.llvm.org/D36510
  llvm-svn: 310670
* Move lock acquire/release functions in task deque cleanup code | Jonathan Peyton | 2017-08-02 | 1 | -3/+2
  The original locations can be reached without initializing the lock variable (td_deque_lock), so it is potentially unsafe. It is guaranteed that the lock is initialized if the deque (td_deque) is not NULL, and lock functions can be safely called.
  Patch by Hansang Bae
  Differential Revision: https://reviews.llvm.org/D36017
  llvm-svn: 309875
* Add new envirable KMP_TEAMS_THREAD_LIMIT | Jonathan Peyton | 2017-08-02 | 5 | -10/+30
  This change adds a new environment variable, KMP_TEAMS_THREAD_LIMIT, which is used to set a new global variable, __kmp_teams_max_nth, which is checked when determining the size and quantity of teams that will be created in the teams construct. Specifically, it is a limit on the total number of threads in a given teams construct. It differentiates the limits for the teams construct from the limits for regular parallel regions (KMP_DEVICE_THREAD_LIMIT/__kmp_max_nth and OMP_THREAD_LIMIT/__kmp_cg_max_nth). When each individual team is formed, it is still subject to those limits. After the clauses to the teams construct are parsed and calculated, we check to make sure we are within this limit, and if not, reduce num_threads per team and/or number of teams, accordingly. The default value is set to the number of available processors on the system.
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D36009
  llvm-svn: 309874
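The limit check described above can be sketched as a clamp on teams times threads per team. The commit message does not say in which order the two quantities are reduced, so the policy below (cap the number of teams first, then shrink threads per team) is purely an assumption, and all names are hypothetical.

```cpp
#include <cassert>

// Requested shape of a teams construct (hypothetical type).
struct sketch_teams {
  int num_teams;
  int threads_per_team;
};

// Reduce the request so that num_teams * threads_per_team <= teams_max_nth.
// Assumed policy: every team keeps at least one thread, so the number of
// teams is capped first and threads per team are reduced afterwards.
inline sketch_teams sketch_apply_teams_limit(sketch_teams req, int teams_max_nth) {
  assert(teams_max_nth >= 1);
  if (req.num_teams < 1) req.num_teams = 1;
  if (req.threads_per_team < 1) req.threads_per_team = 1;
  if (req.num_teams > teams_max_nth)
    req.num_teams = teams_max_nth;
  long long total = static_cast<long long>(req.num_teams) * req.threads_per_team;
  if (total > teams_max_nth)
    req.threads_per_team = teams_max_nth / req.num_teams;
  return req;
}
```

For example, with a limit of 16 a request for 4 teams of 8 threads becomes 4 teams of 4 threads, while a request that already fits is returned unchanged.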
* Fix comments and build messages concerning TSX | Jonathan Peyton | 2017-07-28 | 1 | -1/+2
  llvm-svn: 309418
* Fix implementation of OMP_THREAD_LIMIT | Jonathan Peyton | 2017-07-27 | 6 | -20/+71
  This change fixes the implementation of OMP_THREAD_LIMIT. Previously the implementation was not restricted to a contention group (but it should be, according to the spec), and this is fixed here. A field is added to the root thread to store a counter of the threads in the contention group. An extra check is added when reserving threads for a parallel region that checks this variable and compares it to threadlimit-var, which is implemented as a new global variable, __kmp_cg_max_nth. Associated settings changes were also made, along with a cleanup of comments that referred to OMP_THREAD_LIMIT but should refer to the new KMP_DEVICE_THREAD_LIMIT (added in an earlier patch).
  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D35912
  llvm-svn: 309319