path: root/openmp/runtime/src/kmp_runtime.c
Commit message Author Age Files Lines
* Change source files from .c to .cppJonathan Peyton2016-12-141-7683/+0
| | | | | | | | Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D26688 llvm-svn: 289732
* Support of mips & mips64 for openmprtlSylvestre Ledru2016-12-081-2/+2
| | | | | | | | | | | | | | Summary: Implemented by Dejan Latinovic. See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=790735 for more information. Reviewers: AndreyChurbanov, jlpeyton Subscribers: openmp-commits, mgorny Differential Revision: https://reviews.llvm.org/D26576 llvm-svn: 289032
* Fix for D25504 - segfault because of double free()-ing in shutdown code.Jonathan Peyton2016-11-211-1/+2
| | | | | | | | | | Paul Osmialowski pointed out a double free bug in shutdown code. This patch moves the freeing of the implicit task to above the freeing of all fast memory to prevent the double-free issue. Differential Revision: https://reviews.llvm.org/D26860 llvm-svn: 287551
* Update stats-gathering codeJonathan Peyton2016-11-141-10/+12
| | | | | | | | | | | | | Have developer timers use partitioning scheme which also required that some redundant developer timers be removed in favor of the already existing normal timers. Move per thread stats initialization to just after global thread id assignment which is as early as possible. Also put all global stats initialization code in __kmp_stats_init() and all global stats destruction code in __kmp_stats_fini(). Differential Revision: https://reviews.llvm.org/D26361 llvm-svn: 286892
* Introduce dynamic affinity dispatch capabilitiesJonathan Peyton2016-11-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | This set of changes enables the affinity interface (Either the preexisting native operating system or HWLOC) to be dynamically set at runtime initialization. The point of this change is that we were seeing performance degradations when using HWLOC. This allows the user to use the old affinity mechanisms which on large machines (>64 cores) makes a large difference in initialization time. These changes mostly move affinity code under a small class hierarchy: KMPAffinity class Mask {} KMPNativeAffinity : public KMPAffinity class Mask : public KMPAffinity::Mask KMPHwlocAffinity class Mask : public KMPAffinity::Mask Since all interface functions (for both affinity and the mask implementation) are virtual, the implementation can be chosen at runtime initialization. Differential Revision: https://reviews.llvm.org/D26356 llvm-svn: 286890
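The hierarchy described in this commit can be sketched roughly as below. This is a simplified illustration, not the runtime's actual code: the class names come from the commit message, but the methods (`set`, `is_set`, `allocate_mask`, `name`), the mask storage, and the `pick_affinity_api` factory are assumptions made for the example, and no real HWLOC calls are used.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Abstract interface; every operation is virtual so the backend can be
// selected once, at runtime initialization.
class KMPAffinity {
public:
  class Mask {
  public:
    virtual ~Mask() = default;
    virtual void set(int cpu) = 0;          // hypothetical method
    virtual bool is_set(int cpu) const = 0; // hypothetical method
  };
  virtual ~KMPAffinity() = default;
  virtual std::unique_ptr<Mask> allocate_mask() const = 0;
  virtual std::string name() const = 0;
};

// Stand-in for the native OS mechanism: a fixed 64-bit mask (assumption).
class KMPNativeAffinity : public KMPAffinity {
public:
  class Mask : public KMPAffinity::Mask {
    unsigned long long bits_ = 0;
  public:
    void set(int cpu) override {
      assert(cpu >= 0 && cpu < 64);
      bits_ |= 1ULL << cpu;
    }
    bool is_set(int cpu) const override {
      return cpu >= 0 && cpu < 64 && ((bits_ >> cpu) & 1ULL) != 0;
    }
  };
  std::unique_ptr<KMPAffinity::Mask> allocate_mask() const override {
    return std::unique_ptr<KMPAffinity::Mask>(new Mask());
  }
  std::string name() const override { return "native"; }
};

// Stand-in for the HWLOC backend: a growable bitmap (assumption; the real
// backend would wrap an hwloc bitmap).
class KMPHwlocAffinity : public KMPAffinity {
public:
  class Mask : public KMPAffinity::Mask {
    std::vector<bool> bits_;
  public:
    void set(int cpu) override {
      assert(cpu >= 0);
      if (static_cast<std::size_t>(cpu) >= bits_.size())
        bits_.resize(static_cast<std::size_t>(cpu) + 1, false);
      bits_[static_cast<std::size_t>(cpu)] = true;
    }
    bool is_set(int cpu) const override {
      return cpu >= 0 && static_cast<std::size_t>(cpu) < bits_.size() &&
             bits_[static_cast<std::size_t>(cpu)];
    }
  };
  std::unique_ptr<KMPAffinity::Mask> allocate_mask() const override {
    return std::unique_ptr<KMPAffinity::Mask>(new Mask());
  }
  std::string name() const override { return "hwloc"; }
};

// Hypothetical factory: the choice is made once, when the runtime starts.
std::unique_ptr<KMPAffinity> pick_affinity_api(bool use_hwloc) {
  if (use_hwloc)
    return std::unique_ptr<KMPAffinity>(new KMPHwlocAffinity());
  return std::unique_ptr<KMPAffinity>(new KMPNativeAffinity());
}
```

Callers only ever hold `KMPAffinity` and `KMPAffinity::Mask` pointers, so switching backends needs no change at the call sites.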
* [OpenMP] Enable ThreadSanitizer to check OpenMP programsJonas Hahnfeld2016-11-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | This patch allows ThreadSanitizer (Tsan) to verify OpenMP programs, meaning that Tsan will report no false positives when verifying an OpenMP program. This patch introduces annotations within the OpenMP runtime module to provide information about thread synchronization to the Tsan runtime. In order to enable Tsan support when building the runtime, you must enable the TSAN_SUPPORT option with the following CMake option: -DLIBOMP_TSAN_SUPPORT=TRUE The annotations will be enabled in the main shared library (same mechanism as OMPT). Patch by Simone Atzeni and Joachim Protze! Differential Revision: https://reviews.llvm.org/D13072 llvm-svn: 286115
* Fixed a memory leak related to task dependencies.Andrey Churbanov2016-10-271-0/+3
| | | | | | | | Differential Revision: http://reviews.llvm.org/D25504 Patch by Alex Duran. llvm-svn: 285283
* Use getpagesize() instead of PAGE_SIZE macro when KMP_OS_LINUX is trueJonathan Peyton2016-10-261-5/+7
| | | | | | | | Patch by Victor Campos Differential Revision: https://reviews.llvm.org/D26001 llvm-svn: 285243
* [OpenMP] Fix issue with directives used in a macro.Samuel Antao2016-10-201-8/+9
| | | | | | | | | | | | | | | | | | | Summary: If directives are used in a macro, clang complains with: ``` src/projects/openmp/runtime/src/kmp_runtime.c:7486:2: error: embedding a directive within macro arguments has undefined behavior [-Werror,-Wembedded-directive] #if KMP_USE_MONITOR ``` This patch fixes two occurrences of the issue in `kmp_runtime.cpp`. Reviewers: tlwilmar, jlpeyton, AndreyChurbanov, Hahnfeld Subscribers: Hahnfeld, openmp-commits Differential Revision: https://reviews.llvm.org/D25823 llvm-svn: 284728
* Code cleanup for the runtime without monitor threadJonathan Peyton2016-10-071-3/+18
| | | | | | | | | | This change removes/disables unnecessary code when monitor thread is not used. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D25102 llvm-svn: 283577
* Enable omp_get_schedule() to return static steal type.Jonathan Peyton2016-10-071-2/+2
| | | | | | | As the code is now, calling omp_get_schedule() when OMP_SCHEDULE=static_steal will cause an assert. llvm-svn: 283576
* Fix incorrect OpenMP version in Fortran module.Jonathan Peyton2016-09-301-1/+3
| | | | | | | | | | | | | Add check for "45" version to use "201511" string for OpenMP 4.5, otherwise "200505" is used in Fortran module. Also, fix kmp_openmp_version variable (used for the debugger, e.g.) and kmp_version_omp_api that is used in KMP_VERSION=1 output. Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D24761 llvm-svn: 282868
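The version check described here amounts to a small lookup, sketched below. The function name is hypothetical. The "45" and fallback values come from the commit message; the "40" entry (201307) is added from the OpenMP 4.0 specification's `_OPENMP` value and is not mentioned in the commit.

```cpp
#include <cassert>

// Hypothetical helper: map a build-time version number such as 45 to the
// yyyymm date code reported for that OpenMP version.
int openmp_date_code(int omp_version) {
  assert(omp_version > 0);
  switch (omp_version) {
  case 45:
    return 201511; // OpenMP 4.5 (from the commit message)
  case 40:
    return 201307; // OpenMP 4.0 (assumption added for illustration)
  default:
    return 200505; // fallback named in the commit message (OpenMP 2.5)
  }
}
```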
* Disable monitor thread creation by default.Jonathan Peyton2016-09-271-2/+9
| | | | | | | | | | | | | This change set disables creation of the monitor thread by default. The global counter maintained by the monitor thread was replaced by logic that uses system time directly, and cyclic yielding on Linux target was also removed since there was no clear benefit of using it. Turning on KMP_USE_MONITOR variable (=1) enables creation of monitor thread again if it is really necessary for some reasons. Differential Revision: https://reviews.llvm.org/D24739 llvm-svn: 282507
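Replacing a counter maintained by a separate thread with direct reads of system time could look roughly like this. It is a minimal sketch under assumptions: the `BlocktimeDeadline` type and its methods are invented for the example, and `std::chrono::steady_clock` stands in for whatever time source the runtime actually uses.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: a waiting thread records a deadline once and then
// compares against the clock itself, so no monitor thread has to tick a
// shared counter on its behalf.
struct BlocktimeDeadline {
  Clock::time_point deadline;

  explicit BlocktimeDeadline(std::chrono::milliseconds blocktime,
                             Clock::time_point now = Clock::now())
      : deadline(now + blocktime) {
    assert(blocktime.count() >= 0);
  }

  // True once the thread has spun for the whole blocktime and should
  // go to sleep instead of continuing to spin.
  bool expired(Clock::time_point now = Clock::now()) const {
    return now >= deadline;
  }
};
```

Passing `now` explicitly keeps the check deterministic in tests; production callers would rely on the default argument.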
* [OMPT] fix task frame information for gomp interfaceJonas Hahnfeld2016-09-141-1/+2
| | | | | | | | | | | Previous differentials D23305-D23310 changed task frame information management only for the kmp interface, but not for the whole gomp interface. This broke some testcases when building with gcc. This patch fixes the broken task frame information for the gomp interface. Patch by Joachim Protze! Differential Revision: https://reviews.llvm.org/D24502 llvm-svn: 281468
* [OMPT] Reset task exit frame when execution is finishedJonas Hahnfeld2016-09-141-0/+6
| | | | | | | | | | | | | | | | The exit address is set when execution of a task is started and should be reset as soon as the execution is finished. Especially for the asm implementation of __kmp_invoke_microtask, resetting in this call would be painful, so reset just after the invocation. The testcase shows the effect of this patch: Before, the implicit barriers at the end of an implicit task would see an exit address for the implicit task. This barrier is a task scheduling point. Thus, any explicit task scheduled there would see an exit, but no reenter address for the implicit task. Patch by Joachim Protze! Differential Revision: https://reviews.llvm.org/D23307 llvm-svn: 281465
* [OMPT] Align implementation of reenter frame address to latest (frozen) version of OMPT specJonas Hahnfeld2016-09-141-7/+7
| | | | | | | | | | | | | | | | | | The latest OMPT spec changed the semantics of a task's reenter frame to be the application frame that will be entered when the runtime frame drops. Before, it was the last frame in the runtime. This doesn't work for some gcc execution paths or even clang generated code for : Since there is no runtime frame between the executed task and the encountering task. The test case compares exit and reenter addresses against addresses captured in application code. Patch by Joachim Protze! Differential Revision: https://reviews.llvm.org/D23305 llvm-svn: 281464
* [OPENMP] Implementation of omp_get_default_device and omp_set_default_deviceGeorge Rokos2016-09-091-0/+1
| | | | | | | | | Implementation of missing OpenMP 4.0 API functions omp_get_default_device and omp_set_default_device. Also, added support for the environment variable OMP_DEFAULT_DEVICE. Differential Revision: https://reviews.llvm.org/D23587 llvm-svn: 281065
* Use 'critical' reduction method when 'atomic' is not available but requested.Jonathan Peyton2016-09-021-7/+14
| | | | | | | | | | | | | In case atomic reduction method is not available (the compiler can't generate it) the assertion failure occurred if KMP_FORCE_REDUCTION=atomic was specified. This change replaces the assertion with a warning and sets the reduction method to the default one - 'critical'. Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D23990 llvm-svn: 280519
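The behavior change (warn and fall back instead of asserting) can be illustrated with a short sketch. The enum, function name, and warning text are hypothetical; only the fallback from 'atomic' to 'critical' is taken from the commit message.

```cpp
#include <cassert>
#include <cstdio>

enum class Reduction { Critical, Atomic, Tree };

// Hypothetical selection routine: `forced` is what the user requested
// (e.g. via KMP_FORCE_REDUCTION), `atomic_available` says whether the
// compiler generated an atomic variant for this reduction.
Reduction choose_reduction(Reduction forced, bool atomic_available) {
  if (forced == Reduction::Atomic && !atomic_available) {
    // Previously an assertion failure; now a warning plus a safe default.
    std::fprintf(stderr,
                 "warning: atomic reduction unavailable, using critical\n");
    return Reduction::Critical;
  }
  return forced;
}
```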
* Fixes for hierarchical barrier (possible hang if team size changed).Andrey Churbanov2016-08-111-0/+14
| | | | | | Differential Revision: http://reviews.llvm.org/D23175 llvm-svn: 278332
* D22137: Memory leak fixed by adding missed cleanup of single level array of hot teams infoAndrey Churbanov2016-07-081-2/+4
| | | | | | llvm-svn: 274851
* __kmp_partition_places: Update assertion for new parameter update_master_onlyJonas Hahnfeld2016-07-041-2/+2
| | | | | | | | | | | | If update_master_only is set the place list is not completely traversed and therefore this assertion failed. Make it only trigger if update_master_only is false. (was introduced by D20539) Differential Revision: http://reviews.llvm.org/D21925 llvm-svn: 274482
* Fix checks on schedule structJonathan Peyton2016-07-011-19/+13
| | | | | | | | | | | | This change fixes an error in comparing the existing schedule on the team to the new schedule, in the chunk field. Also added additional checks and used KMP_CHECK_UPDATE where appropriate. Patch by Terry Wilmarth. Differential Revision: http://reviews.llvm.org/D21897 llvm-svn: 274371
* Improve performance of #pragma omp singleJonathan Peyton2016-07-011-2/+4
| | | | | | | | | | | | EPCC Performance of single is considerably worse than plain barrier. Adding a read-only check to the code before the atomic compare-and-store helps considerably. Patch by Terry Wilmarth. Differential Revision: http://reviews.llvm.org/D21893 llvm-svn: 274369
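The "read-only check before the atomic compare-and-store" pattern can be sketched as follows. This is a simplified model under assumptions: the `Team` and `Thread` structs and the `enter_single` function are invented for the example and only loosely mirror how a runtime might track which thread executes a single construct.

```cpp
#include <atomic>
#include <cassert>

struct Team {
  std::atomic<int> construct{0}; // id of the last single construct claimed
};

struct Thread {
  int this_construct = 0; // per-thread count of single constructs seen
};

// Returns true for the one thread that should execute the single region.
bool enter_single(Team &team, Thread &th) {
  int old_id = th.this_construct;
  int new_id = ++th.this_construct;

  // Cheap read-only check: if another thread already claimed this
  // construct, skip the compare-and-swap entirely. A plain load does not
  // pull the cache line into exclusive state the way a failed CAS would.
  if (team.construct.load(std::memory_order_relaxed) != old_id)
    return false;

  // Only threads that still see the old value compete with the CAS.
  return team.construct.compare_exchange_strong(
      old_id, new_id, std::memory_order_acquire, std::memory_order_relaxed);
}
```

With many threads arriving at the same construct, most of them take the early return, which is where the saving over an unconditional compare-and-store comes from.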
* Bug fix for hang when tasks used in nested parallelJonathan Peyton2016-06-211-3/+3
| | | | | | | | | | | | Bug fix for hang when omp task and nested parallelism used together. Still some problem remains with task state saving/restoring, but user's case works fine now. All tasking unit tests passed as well. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21558 llvm-svn: 273297
* Addition of debugger comments and whitespaceJonathan Peyton2016-06-211-1/+0
| | | | | | | | | | | | The removal of legacy code to support long-deprecated debugger support library resulted in some whitespace changes. Comments from that legacy code were made public as they may be useful for other debuggers. Patch by Olga Malysheva. Differential Revision: http://reviews.llvm.org/D21391 llvm-svn: 273282
* Bug fix: crash if teams executed on hostJonathan Peyton2016-06-161-0/+1
| | | | | | | | | | | | | | Added argv array check/allocation for parallel directly nested inside the teams construct, as new coming Fortran codegen passes parameters directly into kmpc_fork_call missing same parameters in kmpc_fork_teams (earlier codegen passed to parallel the subset of parameter passed to teams, and thus no check/allocation needed). Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21336 llvm-svn: 272935
* Renaming change: 41 -> 45 and 4.1 -> 4.5Jonathan Peyton2016-06-141-6/+6
| | | | | | | | OpenMP 4.1 is now OpenMP 4.5. Any mention of 41 or 4.1 is replaced with 45 or 4.5. Also, if the CMake option LIBOMP_OMP_VERSION is 41, CMake warns that 41 is deprecated and to use 45 instead. llvm-svn: 272687
* Affinity mask processing improvementsJonathan Peyton2016-06-131-1/+1
| | | | | | | | | | | | Remove static specifier from var fullMask and remove kmp_get_fullMask() routine. When iterating through procs in a mask, always check if proc is in fullMask (this check was missing in a few places). Patch by Brian Bliss. Differential Revision: http://reviews.llvm.org/D21300 llvm-svn: 272589
* Offer API for setting number of loop dispatch buffersJonathan Peyton2016-05-311-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem is the lack of dispatch buffers when thousands of loops with nowait, about 10 iterations each, are executed by hundreds of threads. We only have 7 built-in dispatch buffers, but there is a need for dozens or hundreds of buffers. The problem can be fixed by setting KMP_MAX_DISP_BUF to a bigger value. In order to give users the same possibility I changed the build-time control into a run-time one, adding an API just in case. This change adds an environment variable KMP_DISP_NUM_BUFFERS and a new API function kmp_set_disp_num_buffers(int num_buffers). The KMP_DISP_NUM_BUFFERS environment variable works only before serial initialization, because during the serial initialization we already allocate buffers for the hot team, so it is too late to change the number of buffers later (or we need to reallocate buffers for all teams which sounds too complicated). The kmp_set_defaults() routine does not work for this environment variable, because it calls serial initialization before reading the parameter string. So a new routine, kmp_set_disp_num_buffers(), is created so that it can set our internal global variable before the library initialization. If both the environment variable and the API are used, the environment variable wins. Differential Revision: http://reviews.llvm.org/D20697 llvm-svn: 271318
* Fix for OMP_PROC_BIND=spread strategyJonathan Peyton2016-05-261-4/+14
| | | | | | | | | | | | | | | | The OMP_PROC_BIND=spread strategy fails to assign the master thread the correct place partition after the first parallel region. Other threads in the hot team will remember their place_partition, but the master's place partition is restored to what it was before entering the parallel region. So when the hot team is used for subsequent parallel regions, the master has lost this info. This fix calls __kmp_partition_places to update only the master thread's place partition in the spread case when there are no other changes to the hot team. Patch by Terry Wilmarth Differential Revision: http://reviews.llvm.org/D20539 llvm-svn: 270890
* Make LIBOMP_USE_ITT_NOTIFY a setting that can be enabled or disabledJonathan Peyton2016-05-261-1/+3
| | | | | | | | | | | | | On Blue Gene/Q, having LIBOMP_USE_ITT_NOTIFY support compiled into a statically-linked binary causes a failure at runtime because dlopen fails. This patch changes LIBOMP_USE_ITT_NOTIFY to a cacheable configuration setting that can be disabled. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D20517 llvm-svn: 270884
* Fork performance improvementsJonathan Peyton2016-05-231-26/+38
| | | | | | | | | | | Most of this is modifications to check for differences before updating data fields in team struct. There is also some rearrangement of the team struct. Patch by Diego Caballero Differential Revision: http://reviews.llvm.org/D20487 llvm-svn: 270468
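Checking for a difference before writing a field can be sketched as below. The helper and struct names are hypothetical and this is not the runtime's KMP_CHECK_UPDATE macro itself, only the same idea expressed as a function: a store is skipped when the field already holds the value, so cache lines shared between threads are not dirtied needlessly.

```cpp
#include <cassert>

// Write `value` into `field` only if it differs; report whether a store
// actually happened.
template <typename T> bool check_update(T &field, const T &value) {
  if (field == value)
    return false;
  field = value;
  return true;
}

// Hypothetical subset of a team descriptor.
struct TeamInfo {
  int nproc = 0;
  int chunk = 0;
};

// Refresh the descriptor for a new parallel region and return how many
// fields were really written.
int refresh_team(TeamInfo &team, int nproc, int chunk) {
  assert(nproc > 0);
  int writes = 0;
  writes += check_update(team.nproc, nproc) ? 1 : 0;
  writes += check_update(team.chunk, chunk) ? 1 : 0;
  return writes;
}
```

When the same team is reused with unchanged parameters, which is the common case for a hot team, the second call performs no stores at all.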
* Remove trailing whitespace in src/ directoryJonathan Peyton2016-05-201-3/+3
| | | | | | This patch doesn't affect D19878's context. So D19878 still cleanly applies. llvm-svn: 270252
* Fix team reuse with foreign threadsJonathan Peyton2016-05-121-0/+2
| | | | | | | | | | | | | | After hot teams were enabled by default, the library started using levels kept in the team structure. The levels are broken in case foreign thread exits and puts its team into the pool which is then re-used by another foreign thread. The broken behavior observed is when printing the levels for each new team, one gets 1, 2, 1, 2, 1, 2, etc. This makes the library believe that every other team is nested which is incorrect. What is wanted is for the levels to be 1, 1, 1, etc. Differential Revision: http://reviews.llvm.org/D19980 llvm-svn: 269363
* [STATS] Use partitioned timer schemeJonathan Peyton2016-05-051-7/+16
| | | | | | | | | | | | | | | | | | | | | | | | This change replaces the current timers with ones that partition time properly. The current timers are nested, so that if a new timer, B, starts when the current timer, A, is already timing, A's time will include B's. To eliminate this problem, the partitioned timers are designed to stop the current timer (A), let the new timer run (B), and when the new timer is finished, restart the previously running timer (A). With this partitioning of time, a thread's timers all sum up to the OMP_worker_thread_life time and can now easily show the percentage of time a thread is spending in different parts of the runtime or user code. There is also a new state variable associated with each thread which tells where it is executing a task. This corresponds with the timers: OMP_task_*, e.g., if time is spent in OMP_task_taskwait, then that thread executed tasks inside a #pragma omp taskwait construct. The changes are mostly changing the macros to use the new PARTITIONED_* macros, the new partitionedTimers class and its methods, and new state logic. Differential Revision: http://reviews.llvm.org/D19229 llvm-svn: 268640
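The stop/run/restart scheme described above can be sketched with a small timer stack. The class below is an illustration only: its name, its methods, and the use of caller-supplied integer ticks instead of a real clock are assumptions, not the stats code's actual interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

class PartitionedTimers {
  struct Running {
    std::string name;
    long start; // tick at which this timer was last (re)started
  };
  std::vector<Running> stack_;
  std::map<std::string, long> totals_;

public:
  // Start timer `name`; the timer currently running (if any) is paused
  // and credited with the time it has accumulated so far.
  void push(const std::string &name, long now) {
    if (!stack_.empty())
      totals_[stack_.back().name] += now - stack_.back().start;
    stack_.push_back({name, now});
  }

  // Stop the running timer and resume the one underneath it.
  void pop(long now) {
    assert(!stack_.empty());
    totals_[stack_.back().name] += now - stack_.back().start;
    stack_.pop_back();
    if (!stack_.empty())
      stack_.back().start = now; // restart the previously running timer
  }

  long total(const std::string &name) const {
    std::map<std::string, long>::const_iterator it = totals_.find(name);
    return it == totals_.end() ? 0 : it->second;
  }
};
```

Because exactly one timer accumulates at any instant, the per-timer totals add up to the elapsed time, which is the property the commit relies on for its percentage breakdown.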
* [ITTNOTIFY] Remove serialized parallel regions from frame notificationJonathan Peyton2016-04-191-24/+0
| | | | llvm-svn: 266760
* Fix for pthread_setspecific (TLS and shutdown) problemJonathan Peyton2016-04-181-1/+1
| | | | | | | | | | | | | | | Some codes that use TLS fail intermittently because one thread tries to write TLS values after the TLS key has been destroyed by another thread. This happens when one thread executes library shutdown (and destroys TLS keys), while another thread starts to execute the TLS key destructor routine. Before this change, the kmp_init_runtime flag was checked before calling pthread_* TLS functions, but this flag is set to FALSE later than the destruction of the TLS keys, which leads to failure. The fix is to check kmp_init_gtid instead, as this flag is unset *before* the destruction of TLS keys. Differential Revision: http://reviews.llvm.org/D19022 llvm-svn: 266674
* Remove dead KMP_USE_POOLED_ALLOC codeJonathan Peyton2016-03-291-78/+6
| | | | llvm-svn: 264776
* Add new OpenMP 4.5 doacross loop nest featureJonathan Peyton2016-03-021-4/+19
| | | | | | | | | | | | | | | | | | | | | | From the standard: A doacross loop nest is a loop nest that has cross-iteration dependence. An iteration is dependent on one or more lexicographically earlier iterations. The ordered clause parameter on a loop directive identifies the loop(s) associated with the doacross loop nest. The init/fini routines allocate/free doacross buffer(s) for each loop for each thread. The wait routine waits for a flag designated by the dependence vector. The post routine sets the flag designated by the current iteration vector. We use a similar technique of shared buffer indices that covers up to 7 nowait loops executed simultaneously by different threads (number 7 has no real meaning, just a heuristic value). Also, the sizes of structures are kept intact by reducing dummy arrays. This needs to be put into the OpenMP runtime library in order for the compiler team to develop the compiler side of the implementation. Differential Revision: http://reviews.llvm.org/D17399 llvm-svn: 262532
* Add new OpenMP 4.5 affinity APIJonathan Peyton2016-02-251-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | This change introduces the new OpenMP 4.5 affinity api surrounding OpenMP Places. There are six new entry points: Typically called in serial region: * omp_get_num_places - returns the number of places available to the execution environment in the place list. * omp_get_place_num_procs - returns the number of processors available to the execution environment in the specified place. * omp_get_place_proc_ids - returns the numerical identifiers of the processors available to the execution environment in the specified place. Typically called inside parallel region: * omp_get_place_num - returns the place number of the place to which the encountering thread is bound. * omp_get_partition_num_places - returns the number of places in the place partition of the innermost implicit task. * omp_get_partition_place_nums - returns the list of place numbers corresponding to the places in the place-var ICV of the innermost implicit task. Differential Revision: http://reviews.llvm.org/D17417 llvm-svn: 261915
* Proxy task fix: task_state stack push condition on forkJonathan Peyton2016-02-091-1/+2
| | | | | | | The problem is that the master's thread state was not saved before entering a parallel region so it does not remember tasks when it returns. llvm-svn: 260306
* [OMPT] Avoid SEGV when a worker thread needs its parallel id behind the barrierJonas Hahnfeld2016-01-281-1/+4
| | | | | | | | | | When the code behind the barrier is executed, the master thread may have already resumed execution. That's why we cannot safely assume that *pteam is not yet freed. This has been introduced by r258866. llvm-svn: 259037
* Formatting fixesJonathan Peyton2016-01-271-1/+1
| | | | | | | Removing extraneous { } bracket sections. Unindenting blocks of code as a result. Also removing empty #ifdef KMP_STUB llvm-svn: 258986
* Fixing comments.Jonathan Peyton2016-01-271-3/+2
| | | | | | Removing references to non-existent functions, fixing typos. llvm-svn: 258985
* Removing extra empty linesJonathan Peyton2016-01-271-15/+0
| | | | llvm-svn: 258984
* [OMPT]: Fix the order of implicit_task_end_eventsJonathan Peyton2016-01-261-15/+31
| | | | | | | | | | | | | | | For implicit barriers in simple parallel for loops, the order of the OMPT events was wrong. The barrier_{begin,end} events came after the implicit_task_end event for the implicit barrier at the end of the parallel region. This is wrong because the implicit task executes the barrier before ending. This patch fixes the order of the event: It will now be triggered just before __kmp_pop_current_task_from_thread() is called. Patch by Tim Cramer Differential Revision: http://reviews.llvm.org/D16347 llvm-svn: 258866
* Remove double negative in if() logic.Jonathan Peyton2016-01-111-2/+2
| | | | | | Change (__kmp_mic_type != non_mic) to (__kmp_mic_type == mic2) llvm-svn: 257380
* [STATS] Fix stats lock problem to be compatible with new hinted lock codeJonathan Peyton2015-12-171-0/+3
| | | | llvm-svn: 255901
* Fix honoring of OMP_THREAD_LIMIT in the teams constructJonathan Peyton2015-11-301-7/+36
| | | | | | | | | | | | Fix for crash in the teams construct in case user sets OMP_THREAD_LIMIT to a number less than the number of processors. Now the number of threads will be silently reduced if the user didn't specify teams parameters or with a warning if the user specified teams parameters conflicting with OMP_THREAD_LIMIT. Differential Revision: http://reviews.llvm.org/D14732 llvm-svn: 254322
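The clamping rule described here (reduce silently when the user gave no teams parameters, warn when they did) could be sketched like this. The struct, function, parameter names, and warning text are hypothetical, and the arithmetic is a plausible reading of the commit message rather than the runtime's exact logic.

```cpp
#include <cassert>
#include <cstdio>

struct TeamsShape {
  int num_teams;
  int threads_per_team;
  bool warned; // true if the user's explicit request had to be reduced
};

// num_threads == 0 means "not specified by the user".
TeamsShape fit_teams(int num_teams, int num_threads, int avail_procs,
                     int thread_limit) {
  assert(avail_procs > 0 && thread_limit > 0);
  if (num_teams <= 0)
    num_teams = 1;

  bool user_specified = num_threads > 0;
  if (!user_specified)
    num_threads = avail_procs / num_teams; // default: spread the processors

  bool warned = false;
  if (num_teams * num_threads > thread_limit) {
    num_threads = thread_limit / num_teams;
    if (user_specified) {
      std::fprintf(stderr,
                   "warning: teams thread count reduced to fit the limit\n");
      warned = true;
    }
  }
  if (num_threads < 1)
    num_threads = 1; // always keep at least one thread per team
  return {num_teams, num_threads, warned};
}
```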
* Add missing KMP_NESTED_HOT_TEAMS guardsJonathan Peyton2015-11-161-0/+6
| | | | llvm-svn: 253264