summaryrefslogtreecommitdiffstats
path: root/openmp/runtime/src
Commit message (Collapse)AuthorAgeFilesLines
...
* [OpenMP] Change hint parameter type for critical to uint32_tJonathan Peyton2018-09-072-2/+6
| | | | | | | | | | | Add atomic hint flags to the enum. The hint parameter type was changed to uint32_t in __kmpc_critical_with_hint() Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D51235 llvm-svn: 341694
* [OpenMP] Synchronization hint constants added to headersJonathan Peyton2018-09-074-56/+99
| | | | | | | | | | | | | | | | ident flags reserved for atomic hints. This patch adds omp_sync_hint_t to omp.h and omp_sync_hint_kind to omp_lib.h. For better maintainability the list of macros for ident flags was replaced with a enum. The new KMP_IDENT_ATOMIC_HINT_MASK was added to the enum to support possible future atomic hints. Also fix omp_lib.h.var to be under 72 chars again after 5.0 OpenMP Memory commit Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D51233 llvm-svn: 341693
* [OpenMP] Initial implementation of OMP 5.0 Memory Management routinesJonathan Peyton2018-09-0717-11/+560
| | | | | | | | | | | | | | | | | | | | | | | Implemented omp_alloc, omp_free, omp_{set,get}_default_allocator entries, and OMP_ALLOCATOR environment variable. Added support for HBW memory on Linux if libmemkind.so library is accessible (dynamic library only, no support for static libraries). Only used stable API (hbwmalloc) of the memkind library though we may consider using experimental API in future. The ICV def-allocator-var is implemented per implicit task similar to place-partition-var. In the absence of a requested allocator, the uses the default allocator. Predefined allocators (the only ones currently available) are made similar for C and Fortran, - pointers (long integers) with values 1 to 8. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D51232 llvm-svn: 341687
* Fix for https://bugs.llvm.org/show_bug.cgi?id=38839:Andrey Churbanov2018-09-071-57/+92
| | | | | | | | Changed style of declarations to be less than 72 char each. Differential Revision: https://reviews.llvm.org/D51694 llvm-svn: 341653
* [OpenMP][Fix] Conditional compilation leaves variables unusedGheorghe-Teodor Bercea2018-08-271-0/+2
| | | | | | | | | | | | | | Summary: Prevent variables from being left unused by conditional compilation. Reviewers: ABataev, grokos, Hahnfeld, caomhin, protze.joachim Reviewed By: Hahnfeld Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D51303 llvm-svn: 340771
* [OpenMP][Fix] Ensure comparison between unsigned values.Gheorghe-Teodor Bercea2018-08-271-1/+1
| | | | | | | | | | | | | | Summary: Ensure the values being compared are both unsigned. Reviewers: ABataev, Hahnfeld, caomhin, grokos, AndreyChurbanov Reviewed By: AndreyChurbanov Subscribers: AndreyChurbanov, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D51301 llvm-svn: 340745
* [OpenMP] Remove deprecated/obsolete MIC attributes from headersJonathan Peyton2018-08-241-1/+3
| | | | llvm-svn: 340656
* [OpenMP] Fixed affinity verbose double printing for balanced type.Jonathan Peyton2018-08-241-1/+2
| | | | llvm-svn: 340647
* [OpenMP] Fix tasking bug for decreasing hot team nthreadsJonathan Peyton2018-08-241-1/+1
| | | | | | | | | | | | | | | | | | The __kmp_execute_tasks_template() function reads the task_team and current_task from the thread structure. There appears to be a pathological timing where the number of threads in the hot team decreases and so a thread is put in the pool via __kmp_free_thread(). It could be the case that: 1) A thread reads th_task_team into task_team local variables and is then interrupted by the OS 2) Master frees the thread and sets current task and task team to NULL 3) The thread reads current_task as NULL When this happens, current_task is dereferenced and a segfault occurs. This patch just checks for current_task to not be NULL as well. Differential Revision: https://reviews.llvm.org/D50651 llvm-svn: 340632
* [OpenMP] Add check for hot_teams arrayJonathan Peyton2018-08-241-1/+2
| | | | | | | | | | If hot teams are not being used, this code could seg fault without the added check, and does so when composability is used in conjunction with nesting. The fix prevents the segfault. Differential Revision: https://reviews.llvm.org/D50649 llvm-svn: 340629
* [OpenMP] Fix incorrect barrier imbalance reporting in ITTNOTIFYJonathan Peyton2018-08-242-22/+26
| | | | | | | | | Exclude nested explicit tasks from timing, only outer level explicit task counted and its time added to barrier arrive time for the thread. Differential Revision: https://reviews.llvm.org/D50584 llvm-svn: 340628
* [OMPT] Remove OMPT idle callbackJoachim Protze2018-08-153-24/+0
| | | | | | | | | | | | | The idle callback was removed from the spec as of TR7. This removes it from the implementation. Patch provided by Simon Convent Reviewers: hbae, protze.joachim Differential Revision: https://reviews.llvm.org/D48362 llvm-svn: 339771
* [OMPT] Make omp_control_tool() compliant when called from Fortran programsJonathan Peyton2018-08-132-2/+4
| | | | | | | | | | | | | | This change fixes an incorrect behavior of the omp_control_tool function when called from Fortran applications. A tool callback function for this event is supposed to get NULL for the third argument according to the specification, but the current implementation just passes a garbage value. A possible fix is to use the OPTIONAL attribute for the third argument. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D50565 llvm-svn: 339585
* [OpenMP] Cleanup codeJonathan Peyton2018-08-0922-407/+223
| | | | | | | | | | | | | | | | This patch cleans up unused functions, variables, sign compare issues, and addresses some -Warning flags which are now enabled including -Wcast-qual. Not all the warning flags in LibompHandleFlags.cmake are enabled, but some are with this patch. Some __kmp_gtid_from_* macros in kmp.h are switched to static inline functions which allows us to remove the awkward definition of KMP_DEBUG_ASSERT() and KMP_ASSERT() macros which used the comma operator. This had to be done for the innumerable -Wunused-value warnings related to KMP_DEBUG_ASSERT() Differential Revision: https://reviews.llvm.org/D49105 llvm-svn: 339393
* [OpenMP] Fix tasking + parallel bugJonathan Peyton2018-07-301-0/+3
| | | | | | | | | | | | From the bug report, the runtime needs to initialize the nproc variables (inside middle init) for each root when the task is encountered, otherwise, a segfault can occur. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36720 Differential Revision: https://reviews.llvm.org/D49996 llvm-svn: 338313
* [OpenMP] Fix new task creationGheorghe-Teodor Bercea2018-07-301-1/+1
| | | | | | | | | | | | | | | | Summary: When OMPT is not supported the __kmp_omp_task() function is passed the parameters in the wrong order. This is a fix related to patch D47709. Reviewers: Hahnfeld, sconvent, caomhin, jlpeyton Reviewed By: Hahnfeld Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D50001 llvm-svn: 338295
* [OpenMP] Add GOMP version symbols for OMP_4.5 APIJonathan Peyton2018-07-302-7/+18
| | | | | | | | This patch adds the appropriate version symbols to the relevant API functions Differential Revision: https://reviews.llvm.org/D49859 llvm-svn: 338281
* [OpenMP] Implement GOMP doacross compatibilityJonathan Peyton2018-07-305-62/+424
| | | | | | | | | | | | | | | | | | | | | | This change introduces GOMP doacross compatibility. There are 12 new interface functions 6 for long type and 6 for unsigned long long type: GOMP_doacross_post, GOMP_doacross_wait, GOMP_loop_doacross_[schedule]_start where schedule can be static, dynamic, guided, or runtime. These functions just translate the parameters if necessary and send them to the corresponding kmp function. E.g., GOMP_doacross_post() -> __kmpc_doacross_post() For the GOMP_doacross_post function, there is template specialization to account for when long is a four byte vs an eight byte type. If it is a four byte type, then a temporary array has to be created to convert the four byte integers into eight byte integers and then sending that into __kmpc_doacross_post(). Because GOMP_doacross_wait uses varargs, it always needs a temporary array and does not need template specialization. Differential Revision: https://reviews.llvm.org/D49857 llvm-svn: 338280
* [OpenMP] Fix build errors when building with KMP_DEBUG_ADAPTIVE_LOCKS=1Jonathan Peyton2018-07-304-14/+15
| | | | | | | | | | | | | This change fixes build errors when building a runtime with adaptive lock stats enabled. Most of the errors were due to the recent changes in the runtime, but it seems that we have not tried to build this debug runtime on Windows for a long time. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D49823 llvm-svn: 338277
* [OpenMP][Stats] Cleanup stats gathering codeJonathan Peyton2018-07-308-303/+609
| | | | | | | | | | | | | | | | | | | | | | 1) Remove unnecessary data from list node structure 2) Remove timerPair in favor of pushing/popping explicitTimers. This way, nested timers will work properly. 3) Fix #pragma omp critical timers 4) Add histogram capability 5) Add KMP_STATS_FILE formatting capability 6) Have time partitioned into serial & parallel by introducing partitionedTimers::exchange(). This also counts the number of serial regions in the executable. 7) Fix up the timers around OMP loops so that scheduling overhead and work are both counted correctly. 8) Fix up the iterations statistics so they count the number of iterations the thread receives at each loop scheduling event 9) Change timers so there is only one RDTSC read per event change 10) Fix up the outdated comments for the timers Differential Revision: https://reviews.llvm.org/D49699 llvm-svn: 338276
* [OMPT] Fix OMPT callbacks for the taskloop construct and add testcaseJoachim Protze2018-07-271-61/+170
| | | | | | | | | | | | | | | | | Fix the order of callbacks related to the taskloop construct. Add the iteration_count to work callbacks (according to the spec). Use kmpc_omp_task() instead of kmp_omp_task() to include OMPT callbacks. Add a testcase. Patch by Simon Convent Reviewed by: protze.joachim, hbae Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D47709 llvm-svn: 338146
* [OMPT] Adapt OMPT callbacks for tasks to handle untied tasks correctlyJoachim Protze2018-07-271-50/+62
| | | | | | | | | | | | | | | | | | | | | | The ompt/tasks/task_types.c testcase did not test untied tasks properly. Now, frame addresses are tested and two scheduling points are added at which the task can switch to another thread. Due to scheduling effects, the frame address could be NULL. This needed a restructure of the way OMPT callbacks are called. __ompt_task_finish() now as an extra parameter, whether a task is completed. Its invocation has been moved into __kmp_task_finish(). Thus, the order of the writes to the frame addresses is not subject to scheduling effects anymore. Patch by Simon Convent Reviewed by: protze.joachim, hbae Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D49181 llvm-svn: 338145
* PR30734: Remove __kmp_ft_page_allocate()Jonas Hahnfeld2018-07-262-36/+0
| | | | | | | | | | | | | This function was not enabled by default and not exported when manually tweaking the build flags. Additionally it was hard to use since there is no corresponding __kmp_ft_page_free(). The code itself is questionable because the returned memory address is padded by an extra pointer which stores the unpadded start of the allocated region (this would need to be freed). Differential Revision: https://reviews.llvm.org/D49802 llvm-svn: 338052
* Block library shutdown until unreaped threads finish spin-waitingJonathan Peyton2018-07-194-11/+61
| | | | | | | | | | | | | | | | This change fixes possibly invalid access to the internal data structure during library shutdown. In a heavily oversubscribed situation, the library shutdown sequence can reach the point where resources are deallocated while there still exist threads in their final spinning loop. The added loop in __kmp_internal_end() checks if there are such busy-waiting threads and blocks the shutdown sequence if that is the case. Two versions of kmp_wait_template() are now used to minimize performance impact. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D49452 llvm-svn: 337486
* Fix const cast problem introduced in r336563Jonathan Peyton2018-07-091-3/+3
| | | | | | 336563 eliminated CCAST() macros caused build failures llvm-svn: 336586
* [OpenMP] Fix a few formatting issuesJonathan Peyton2018-07-093-3/+5
| | | | llvm-svn: 336575
* [OpenMP] Introduce hierarchical schedulingJonathan Peyton2018-07-0910-56/+1518
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the logic implementing hierarchical scheduling. First and foremost, hierarchical scheduling is off by default To enable, use -DLIBOMP_USE_HIER_SCHED=On during CMake's configure stage. This work is based off if the IWOMP paper: "Workstealing and Nested Parallelism in SMP Systems" Hierarchical scheduling is the layering of OpenMP schedules for different layers of the memory hierarchy. One can have multiple layers between the threads and the global iterations space. The threads will go up the hierarchy to grab iterations, using possibly a different schedule & chunk for each layer. [ Global iteration space (0-999) ] (use static) [ L1 | L1 | L1 | L1 ] (use dynamic,1) [ T0 T1 | T2 T3 | T4 T5 | T6 T7 ] In the example shown above, there are 8 threads and 4 L1 caches begin targeted. If the topology indicates that there are two threads per core, then two consecutive threads will share the data of one L1 cache unit. This example would have the iteration space (0-999) split statically across the four L1 caches (so the first L1 would get (0-249), the second would get (250-499), etc). Then the threads will use a dynamic,1 schedule to grab iterations from the L1 cache units. There are currently four supported layers: L1, L2, L3, NUMA OMP_SCHEDULE can now read a hierarchical schedule with this syntax: OMP_SCHEDULE='EXPERIMENTAL LAYER,SCHED[,CHUNK][:LAYER,SCHED[,CHUNK]...]:SCHED,CHUNK And OMP_SCHEDULE can still read the normal SCHED,CHUNK syntax from before I've kept most of the hierarchical scheduling logic inside kmp_dispatch_hier.h to try to keep it separate from the rest of the code. Differential Revision: https://reviews.llvm.org/D47962 llvm-svn: 336571
* [OpenMP] Restructure loop code for hierarchical schedulingJonathan Peyton2018-07-093-1329/+1434
| | | | | | | | | | | | | | | | | | | | | | This patch reorganizes the loop scheduling code in order to allow hierarchical scheduling to use it more effectively. In particular, the goal of this patch is to separate the algorithmic parts of the scheduling from the thread logistics code. Moves declarations & structures to kmp_dispatch.h for easier access in other files. Extracts the algorithmic part of __kmp_dispatch_init() and __kmp_dispatch_next() into __kmp_dispatch_init_algorithm() and __kmp_dispatch_next_algorithm(). The thread bookkeeping logic is still kept in __kmp_dispatch_init() and __kmp_dispatch_next(). This is done because the hierarchical scheduler needs to access the scheduling logic without the bookkeeping logic. To prepare for new pointer in dispatch_private_info_t, a new flags variable is created which stores the ordered and nomerge flags instead of them being in two separate variables. This will keep the dispatch_private_info_t structure the same size. Differential Revision: https://reviews.llvm.org/D47961 llvm-svn: 336568
* [OpenMP] Use C++11 Atomics - barrier, tasking, and lock codeJonathan Peyton2018-07-0917-280/+433
| | | | | | | | | | | | | | | | | These are preliminary changes that attempt to use C++11 Atomics in the runtime. We are expecting better portability with this change across architectures/OSes. Here is the summary of the changes. Most variables that need synchronization operation were converted to generic atomic variables (std::atomic<T>). Variables that are updated with combined CAS are packed into a single atomic variable, and partial read/write is done through unpacking/packing Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D47903 llvm-svn: 336563
* [OMPT] Provide the right thread_num for ancestor levelsJoachim Protze2018-07-021-3/+14
| | | | | | | | | The current implementation always provides the thread-num for the current parallel region. This patch fixes the behavior for ancestor levels >0. Differential Revision: https://reviews.llvm.org/D46533 llvm-svn: 336085
* minor: fixed typo in debug printAndrey Churbanov2018-06-201-1/+1
| | | | llvm-svn: 335138
* [OpenMP] Fix formatting issues in kmp_stats.hJonathan Peyton2018-06-081-26/+39
| | | | llvm-svn: 334335
* [OMPT] Rename ompt_wait_id to omp_wait_idJoachim Protze2018-05-286-69/+69
| | | | | | | | Rename ompt_wait_id to omp_wait_id, as defined in the spec. Differential Revision: https://reviews.llvm.org/D46530 llvm-svn: 333368
* [OMPT] Rename ompt_frame_t to omp_frame_tJoachim Protze2018-05-289-32/+32
| | | | | | | | Rename ompt_frame_t to omp_frame_t, as defined in the spec. Differential Revision: https://reviews.llvm.org/D43568 llvm-svn: 333367
* [CMake] Unify install path for librariesJonas Hahnfeld2018-05-251-5/+5
| | | | | | | | | | Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands. This also fixes installation of libomptarget-nvptx that previously didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX. Differential Revision: https://reviews.llvm.org/D47130 llvm-svn: 333284
* [OMPT] Fix thread_num for implicit_task_end callbacks in nested parallel regionsJoachim Protze2018-05-074-5/+16
| | | | | | | | | | | | | | implicit_task_end callbacks in nested parallel regions did not always give the correct thread_num, since the inner parallel region may have already been finalized. Now, the thread_num is stored at the beginning of the implicit task and retrieved at the end, whenever necessary. A testcase was added as well. Differential Revision: https://reviews.llvm.org/D46260 llvm-svn: 331632
* [OpenMP] Compilation error fix on const char*Heejin Ahn2018-04-181-1/+1
| | | | | | | | | | | | | | | Summary: This line (https://github.com/llvm-mirror/openmp/blob/0ed912c7a798f5c4f65f8bb6b492e07fab7f4cea/runtime/src/kmp_gsupport.cpp#L1459) added in D45327 (rL330282) causes a compilation failure. Reviewers: jlpeyton Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D45786 llvm-svn: 330299
* [OpenMP] Fix affinity API for KMP_AFFINITY=none|compact|scatterJonathan Peyton2018-04-184-2/+29
| | | | | | | | | | | | | | | | | | | | | Currently, the affinity API reports garbage for the initial place list and any thread's place lists when using KMP_AFFINITY=none|compact|scatter. This patch does two things: for KMP_AFFINITY=none, Creates a one entry table for the places, this way, the initial place list is just a single place with all the proc ids in it. We also set the initial place of any thread to 0 instead of KMP_PLACE_ALL so that the thread reports that single place (place 0) instead of garbage (-1) when using the affinity API. When non-OMP_PROC_BIND affinity is used (including KMP_AFFINITY=compact|scatter), a thread's place list is populated correctly. We assume that each thread is assigned to a single place. This is implemented in two of the affinity API functions Differential Revision: https://reviews.llvm.org/D45527 llvm-svn: 330283
* Introduce GOMP_taskloop APIJonathan Peyton2018-04-187-18/+297
| | | | | | | | | | | | | | | This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our version symbols. Being a wrapper around __kmpc_taskloop, the function creates a task with the loop bounds properly nested in the shareds so that the GOMP task thunk will work properly. Also, the firstprivate copy constructors are properly handled using the __kmp_gomp_task_dup() auxiliary function. Currently, only linear spawning of tasks is supported for the GOMP_taskloop interface. Differential Revision: https://reviews.llvm.org/D45327 llvm-svn: 330282
* Set the license header for all OMPT filesJoachim Protze2018-04-126-3/+73
| | | | llvm-svn: 329928
* Minor cleanup in __kmp_atfork_child()Jonathan Peyton2018-03-301-1/+0
| | | | | | | | | | | | This change removes the unnecessary lock operation on __kmp_initz_lock inside the __kmp_atfork_child() function for Linux; the lock variable is initialized in the same function later. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D44949 llvm-svn: 328900
* Move blocktime_str variable right before its first useJonathan Peyton2018-03-261-3/+3
| | | | llvm-svn: 328575
* Fixed __kmpc_get_target_offload() to call library initialization.Andrey Churbanov2018-03-221-1/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D44793 llvm-svn: 328228
* Read OMP_TARGET_OFFLOAD and provide API to access ICVJonathan Peyton2018-03-205-0/+64
| | | | | | | | | | | | | | Added settings code to read OMP_TARGET_OFFLOAD environment variable. Added target-offload-var ICV as __kmp_target_offload, set via OMP_TARGET_OFFLOAD, if available, otherwise defaulting to DEFAULT. Valid values for the ICV are specified as enum values {0,1,2} for disabled, default, and mandatory. An internal API access function __kmpc_get_target_offload is provided. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D44577 llvm-svn: 328046
* Fix for Fix for https://bugs.llvm.org/show_bug.cgi?id=36705.Andrey Churbanov2018-03-191-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D44637 llvm-svn: 327875
* Improve OpenMP threadprivate implementation.Andrey Churbanov2018-03-053-111/+182
| | | | | | | | Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D41914 llvm-svn: 326733
* Fixed build of the OpenMP stubs library.Andrey Churbanov2018-03-051-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D44019 llvm-svn: 326728
* [OMPT] Fix ompt_get_task_info() and add tests for itJoachim Protze2018-02-281-0/+3
| | | | | | | | | | | | | The thread_num parameter of ompt_get_task_info() was not being used previously, but need to be set. The print_task_type() function (form the task-types.c testcase) was merged into the print_ids() function (in callback.h). Testing of ompt_get_task_info() was added to the task-types.c testcase. It was not tested extensively previously. Differential Revision: https://reviews.llvm.org/D42472 llvm-svn: 326338
* [OMPT] Fix parallel_data in implicit barrier-endJonas Hahnfeld2018-02-232-35/+30
| | | | | | | | | This is required to be NULL for implicit barriers at the end of a parallel region. Noticed in review of D43191. Differential Revision: https://reviews.llvm.org/D43308 llvm-svn: 325922
* [OMPT] Omissionin in OMPT FormattingJoachim Protze2018-02-175-7/+9
| | | | | | | | Applying clang-format to the /runtime/src/ folder Differential Revision: https://reviews.llvm.org/D42169 llvm-svn: 325424
OpenPOWER on IntegriCloud