summaryrefslogtreecommitdiffstats
path: root/openmp/runtime
Commit message (Collapse)AuthorAgeFilesLines
...
* [OpenMP] Restructure loop code for hierarchical schedulingJonathan Peyton2018-07-093-1329/+1434
| | | | | | | | | | | | | | | | | | | | | | This patch reorganizes the loop scheduling code in order to allow hierarchical scheduling to use it more effectively. In particular, the goal of this patch is to separate the algorithmic parts of the scheduling from the thread logistics code. Moves declarations & structures to kmp_dispatch.h for easier access in other files. Extracts the algorithmic part of __kmp_dispatch_init() and __kmp_dispatch_next() into __kmp_dispatch_init_algorithm() and __kmp_dispatch_next_algorithm(). The thread bookkeeping logic is still kept in __kmp_dispatch_init() and __kmp_dispatch_next(). This is done because the hierarchical scheduler needs to access the scheduling logic without the bookkeeping logic. To prepare for new pointer in dispatch_private_info_t, a new flags variable is created which stores the ordered and nomerge flags instead of them being in two separate variables. This will keep the dispatch_private_info_t structure the same size. Differential Revision: https://reviews.llvm.org/D47961 llvm-svn: 336568
* [OpenMP] Use C++11 Atomics - barrier, tasking, and lock codeJonathan Peyton2018-07-0917-280/+433
| | | | | | | | | | | | | | | | | These are preliminary changes that attempt to use C++11 Atomics in the runtime. We are expecting better portability with this change across architectures/OSes. Here is the summary of the changes. Most variables that need synchronization operation were converted to generic atomic variables (std::atomic<T>). Variables that are updated with combined CAS are packed into a single atomic variable, and partial read/write is done through unpacking/packing Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D47903 llvm-svn: 336563
* Define the __STDC_FORMAT_MACROS to avoid test failure on some platforms.Kelvin Li2018-07-061-0/+5
| | | | | | | | | ompt/misc/api_calls_from_other_thread.cpp ompt/misc/interoperability.cpp Differential Revision: https://reviews.llvm.org/D48984 llvm-svn: 336438
* Dropped non-supoorted "--no-as-needed" flag from OMPT tests for macOSJoachim Protze2018-07-053-4/+8
| | | | | | | | | | | | | | | | | The flag "--no-as-needed" is not recognized by the linker on macOS making the following tests fail: ompt/loadtool/tool_available/tool_available.c ompt/loadtool/tool_not_available/tool_not_available.c This patch removes this flag for macOS and adds it only for Linux and Windows. I tested it on Ubuntu 16.04 and macOS HighSierra, with Clang/LLVM 6.0.1 and OpenMP trunk. This solution was also discussed in the OpenMP-dev mailing list. Patch provided by Simone Atzeni Differential Revision: https://reviews.llvm.org/D48888 llvm-svn: 336327
* [OMPT] Add synchronization to threads_nested.c testcaseJoachim Protze2018-07-051-2/+4
| | | | | | | | | | | The testcase potentially fails when a thread is reused. The added synchronization makes sure this does not happen. Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D48932 llvm-svn: 336326
* [OMPT] Use alloca() to force availability of frame pointerJoachim Protze2018-07-021-0/+4
| | | | | | | | | | | | | When compiling with icc, there is a problem with reenter frame addresses in parallel_begin callbacks in the interoperability.c testcase. (The address is not available. thus NULL) Using alloca() forces availability of the frame pointer. Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D48282 llvm-svn: 336088
* [OMPT] Add tests for runtime entry points from non-OpenMP threadsJoachim Protze2018-07-021-12/+53
| | | | | | | | | | | Several runtime entry points have not been tested from non-OpenMP threads. This adds tests to an existing testcase. While at it, the testcase was reformatted Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D48124 llvm-svn: 336087
* [OMPT] Add testcases for thread_begin and thread_end callbacksJoachim Protze2018-07-022-0/+72
| | | | | | | | | | | Especially the thread_end callback has not been tested before. This adds a testcase for nested and non-nested threads. Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D47824 llvm-svn: 336086
* [OMPT] Provide the right thread_num for ancestor levelsJoachim Protze2018-07-022-3/+371
| | | | | | | | | The current implementation always provides the thread-num for the current parallel region. This patch fixes the behavior for ancestor levels >0. Differential Revision: https://reviews.llvm.org/D46533 llvm-svn: 336085
* minor: fixed typo in debug printAndrey Churbanov2018-06-201-1/+1
| | | | llvm-svn: 335138
* [OpenMP] Fix formatting issues in kmp_stats.hJonathan Peyton2018-06-081-26/+39
| | | | llvm-svn: 334335
* [OMPT] Rename ompt_wait_id to omp_wait_idJoachim Protze2018-05-287-75/+75
| | | | | | | | Rename ompt_wait_id to omp_wait_id, as defined in the spec. Differential Revision: https://reviews.llvm.org/D46530 llvm-svn: 333368
* [OMPT] Rename ompt_frame_t to omp_frame_tJoachim Protze2018-05-2810-36/+36
| | | | | | | | Rename ompt_frame_t to omp_frame_t, as defined in the spec. Differential Revision: https://reviews.llvm.org/D43568 llvm-svn: 333367
* [OMPT] Fix test parallel/not_enough_threads.cJonas Hahnfeld2018-05-271-46/+60
| | | | | | | | | | | | | | | Upcoming changes to FileCheck will modify CHECK-DAG to not match overlapping regions of the input. This test was found to be affected because it expects to find four threads to invoke events of type ompt_event_implicit_task_begin. It turns out this is wrong because OMP_THREAD_LIMIT is set to 2, so there are only two threads. The rest of the test got it right so it went unnoticed until now. (Rewrite test and apply clang-format to it as discussed in the past.) Differential Revision: https://reviews.llvm.org/D47119 llvm-svn: 333361
* [CMake] Unify install path for librariesJonas Hahnfeld2018-05-251-5/+5
| | | | | | | | | | Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands. This also fixes installation of libomptarget-nvptx that previously didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX. Differential Revision: https://reviews.llvm.org/D47130 llvm-svn: 333284
* [OMPT] Fix thread_num for implicit_task_end callbacks in nested parallel regionsJoachim Protze2018-05-074-5/+16
| | | | | | | | | | | | | | implicit_task_end callbacks in nested parallel regions did not always give the correct thread_num, since the inner parallel region may have already been finalized. Now, the thread_num is stored at the beginning of the implicit task and retrieved at the end, whenever necessary. A testcase was added as well. Differential Revision: https://reviews.llvm.org/D46260 llvm-svn: 331632
* [OMPT] Add api_calls_misc.c testcase and rename api_calls.c testcaseJoachim Protze2018-05-073-0/+76
| | | | | | | | | | | | | | | | The api_calls_misc.c testcase tests the following api calls: ompt_get_callback() ompt_get_state() ompt_enumerate_states() ompt_enumerate_mutex_impls() These have not been tested previously. The api_calls.c testcase has been renamed to api_calls_places.c because it only tests api calls that are related to places. Differential Revision: https://reviews.llvm.org/D42523 llvm-svn: 331631
* [OpenMP][OMPT] Fix api_calls_from_other_thread.cppJonathan Peyton2018-04-301-3/+3
| | | | | | | Removed environment setting in RUN: line that was being ignored anyways. Changed a few specific checks to "any number" llvm-svn: 331212
* [OpenMP] Compilation error fix on const char*Heejin Ahn2018-04-181-1/+1
| | | | | | | | | | | | | | | Summary: This line (https://github.com/llvm-mirror/openmp/blob/0ed912c7a798f5c4f65f8bb6b492e07fab7f4cea/runtime/src/kmp_gsupport.cpp#L1459) added in D45327 (rL330282) causes a compilation failure. Reviewers: jlpeyton Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D45786 llvm-svn: 330299
* [OpenMP] Fix affinity API for KMP_AFFINITY=none|compact|scatterJonathan Peyton2018-04-184-2/+29
| | | | | | | | | | | | | | | | | | | | | Currently, the affinity API reports garbage for the initial place list and any thread's place lists when using KMP_AFFINITY=none|compact|scatter. This patch does two things: for KMP_AFFINITY=none, Creates a one entry table for the places, this way, the initial place list is just a single place with all the proc ids in it. We also set the initial place of any thread to 0 instead of KMP_PLACE_ALL so that the thread reports that single place (place 0) instead of garbage (-1) when using the affinity API. When non-OMP_PROC_BIND affinity is used (including KMP_AFFINITY=compact|scatter), a thread's place list is populated correctly. We assume that each thread is assigned to a single place. This is implemented in two of the affinity API functions Differential Revision: https://reviews.llvm.org/D45527 llvm-svn: 330283
* Introduce GOMP_taskloop APIJonathan Peyton2018-04-189-22/+297
| | | | | | | | | | | | | | | This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our version symbols. Being a wrapper around __kmpc_taskloop, the function creates a task with the loop bounds properly nested in the shareds so that the GOMP task thunk will work properly. Also, the firstprivate copy constructors are properly handled using the __kmp_gomp_task_dup() auxiliary function. Currently, only linear spawning of tasks is supported for the GOMP_taskloop interface. Differential Revision: https://reviews.llvm.org/D45327 llvm-svn: 330282
* Set the license header for all OMPT filesJoachim Protze2018-04-126-3/+73
| | | | llvm-svn: 329928
* Minor cleanup in __kmp_atfork_child()Jonathan Peyton2018-03-301-1/+0
| | | | | | | | | | | | This change removes the unnecessary lock operation on __kmp_initz_lock inside the __kmp_atfork_child() function for Linux; the lock variable is initialized in the same function later. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D44949 llvm-svn: 328900
* Move blocktime_str variable right before its first useJonathan Peyton2018-03-261-3/+3
| | | | llvm-svn: 328575
* Add summarizeStats.py to tools directoryJonathan Peyton2018-03-261-0/+323
| | | | | | | | | | | | | | | | | | | | | | | | The summarizeStats.py script processes raw data provided by the instrumented (stats-gathering) OpenMP* runtime library. It provides: 1) A radar chart which plots counters as frequency (per GigaTick) of use within the program. The frequencies are plotted as log10, however values less than one are kept as it is and represented in red color. This was done to help visualize the differences better. 2) Pie charts separating total time as compute and non-compute. The compute and non-compute times have their own pie charts showing the constructs that contributed to them. The percentages listed are with respect to the total time. 3) '.csv' file with percentage of time spent within the different constructs. The script can be used as: $ python $PATH_TO_SCRIPT/summarizeStats.py instrumented1.csv instrumented2.csv Patch by Taru Doodi Differential Revision: https://reviews.llvm.org/D41838 llvm-svn: 328568
* Fixed __kmpc_get_target_offload() to call library initialization.Andrey Churbanov2018-03-221-1/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D44793 llvm-svn: 328228
* Read OMP_TARGET_OFFLOAD and provide API to access ICVJonathan Peyton2018-03-205-0/+64
| | | | | | | | | | | | | | Added settings code to read OMP_TARGET_OFFLOAD environment variable. Added target-offload-var ICV as __kmp_target_offload, set via OMP_TARGET_OFFLOAD, if available, otherwise defaulting to DEFAULT. Valid values for the ICV are specified as enum values {0,1,2} for disabled, default, and mandatory. An internal API access function __kmpc_get_target_offload is provided. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D44577 llvm-svn: 328046
* Fix for Fix for https://bugs.llvm.org/show_bug.cgi?id=36705.Andrey Churbanov2018-03-191-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D44637 llvm-svn: 327875
* Improve OpenMP threadprivate implementation.Andrey Churbanov2018-03-053-111/+182
| | | | | | | | Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D41914 llvm-svn: 326733
* Fixed build of the OpenMP stubs library.Andrey Churbanov2018-03-051-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D44019 llvm-svn: 326728
* [OMPT] Fix interoperability test with GCCJonas Hahnfeld2018-03-011-2/+14
| | | | | | | | | | | | | | | | We have to ensure that the runtime is initialized _before_ waiting for the two started threads to guarantee that the master threads post their ompt_event_thread_begin before the worker threads. This is not guaranteed in the parallel region where one worker thread could start before the other master thread has invoked the callback. The problem did not happen with Clang becauses the generated code calls __kmpc_global_thread_num() and cashes its result for functions that contain OpenMP pragmas. Differential Revision: https://reviews.llvm.org/D43882 llvm-svn: 326435
* [OMPT] Fix task-type test with GCCJoachim Protze2018-03-011-0/+3
| | | | | | | | | | This is similar to D43882. The runtime needs to be initialized before calling print_ids(0) http://lab.llvm.org:8011/builders/openmp-gcc-x86_64-linux-debian/builds/60 Differential Revision: https://reviews.llvm.org/D43897 llvm-svn: 326428
* [OMPT] Fix ompt_get_task_info() and add tests for itJoachim Protze2018-02-283-88/+182
| | | | | | | | | | | | | The thread_num parameter of ompt_get_task_info() was not being used previously, but need to be set. The print_task_type() function (form the task-types.c testcase) was merged into the print_ids() function (in callback.h). Testing of ompt_get_task_info() was added to the task-types.c testcase. It was not tested extensively previously. Differential Revision: https://reviews.llvm.org/D42472 llvm-svn: 326338
* [OMPT] Fix inconsistent testcasesJoachim Protze2018-02-282-31/+31
| | | | | | | | | | | The main change of this patch is to insert {{.*}} in current_address=[[RETURN_ADDRESS_END]]. This is needed to match any of the alternatively printed addresses. Additionally, clang-format is applied to the two tests. Differential Revision: https://reviews.llvm.org/D43115 llvm-svn: 326312
* [OMPT] Fix parallel_data in implicit barrier-endJonas Hahnfeld2018-02-234-98/+135
| | | | | | | | | This is required to be NULL for implicit barriers at the end of a parallel region. Noticed in review of D43191. Differential Revision: https://reviews.llvm.org/D43308 llvm-svn: 325922
* [OMPT] Fix test tasks/serialized.c with optimizationJonas Hahnfeld2018-02-232-54/+114
| | | | | | | | | | | | | | The compiler inlines the user code in the task. Check for that case at runtime by comparing the frame addresses and print the expected exit address. Also showcase how I think the OMPT tests could be reformatted to match LLVM's code style. In my opinion it would be great to that kind of change to all tests that need to be touched for whatever reason... Differential Revision: https://reviews.llvm.org/D43191 llvm-svn: 325921
* [OMPT] Omissionin in OMPT FormattingJoachim Protze2018-02-175-7/+9
| | | | | | | | Applying clang-format to the /runtime/src/ folder Differential Revision: https://reviews.llvm.org/D42169 llvm-svn: 325424
* [OMPT] Add interoperability testcaseJoachim Protze2018-02-171-0/+99
| | | | | | | | Test whether OMPT-callbacks for two threads that initiate a parallel region are correct. Differential Revision: https://reviews.llvm.org/D41942 llvm-svn: 325423
* [OMPT] Update api_calls testcaseJoachim Protze2018-02-172-34/+50
| | | | | | | | | | Only use ompt_ functions when testing OMPT in api_calls testcase. Add size parameter to print_list. Fix small bug in implementation of ompt_get_partition_place_nums(): return correct length. Differential Revision: https://reviews.llvm.org/D42162 llvm-svn: 325422
* [OMPT][test] Correct warning about added wrapper functionsJonas Hahnfeld2018-02-141-2/+4
| | | | | | | | | This affects all outlined functions, not just tasks! Only show warning when using Clang 5.0 or later. Differential Revision: https://reviews.llvm.org/D43190 llvm-svn: 325131
* [OMPT] Add tool_available_search testcaseJoachim Protze2018-02-081-0/+104
| | | | | | | | | | | | | | | | | | Tests the search for tools as defined in the spec. The OMP_TOOL_LIBRARIES environment variable contains paths to the following files(in that order) -to a nonexisting file -to a shared library that does not have a ompt_start_tool function -to a shared library that has an ompt_start_tool implementation returning NULL -to a shared library that has an ompt_start_tool implementation returning a pointer to a valid instance of ompt_start_tool_result_t The expected result is that the last tool gets active and can print in the thread-begin callback. Differential Revision: https://reviews.llvm.org/D42166 llvm-svn: 324588
* [OMPT] Add tool_not_available testcaseJoachim Protze2018-02-082-0/+69
| | | | | | | | | | | | Add a testcase that checks wheter the runtime can handle an ompt_start_tool method that returns NULL indicating that no tool shall be loaded. All tool_available testcases need a separate folder to avoid file conflicts for the generated tools. Differential Revision: https://reviews.llvm.org/D41904 llvm-svn: 324587
* [OMPT] Fix tool initialization returning 0Joachim Protze2018-02-061-0/+6
| | | | | | | | | If tool initialization returns 0, OMPT should not be active. The current implementation provided some callback invocations in this case. Differential Revision: https://reviews.llvm.org/D42709 llvm-svn: 324320
* [OMPT] Use fuzzy return addresses in lock testcasesJonas Hahnfeld2018-01-264-71/+71
| | | | | | | | | | | Use fuzzy return addresses in lock testcases so that these testcases can also be run using the Intel Compiler. Patch by Simon Convent! Differential Revision: https://reviews.llvm.org/D41896 llvm-svn: 323529
* Partial revert of [OMPT] Rename ompt_mutex_impl_t to kmp_mutex_implJoachim Protze2018-01-171-3/+3
| | | | | | The previous commit did not revert all replaced ompt_mutex_impl_unknown. llvm-svn: 322631
* [OMPT] Add Workaround for Intel Compiler BugJoachim Protze2018-01-172-1/+2
| | | | | | | | | | | | | | | | | | | | | | Add Workaround for Intel Compiler Bug with Case#: 03138964 A critical region within a nested task causes a segfault in icc 14-18: int main() { #pragma omp parallel num_threads(2) #pragma omp master #pragma omp task #pragma omp task #pragma omp critical printf("test\n"); } When the critical region is in a separate function, the segault does not occur. So we add noinline to make sure that the function call stays there. Differential Revision: https://reviews.llvm.org/D41182 llvm-svn: 322622
* [OMPT] Rename ompt_mutex_impl_t to kmp_mutex_implJoachim Protze2018-01-174-36/+36
| | | | | | | | | | The defintion is not part of the spec and thus should not have the prefix "ompt_" but rather a prefix that indicates that this is implementation specific. Differential Revision: https://reviews.llvm.org/D41166 llvm-svn: 322621
* [OMPT] Return appropiate values for ompt runtime entry points for non-OpenMP ↵Joachim Protze2018-01-173-8/+75
| | | | | | | | | | | threads When the current thread is not an (initialized) OpenMP thread, the runtime entry points return values that correspond to "not available" or similar Differential Revision: https://reviews.llvm.org/D41167 llvm-svn: 322620
* Fixed libomp static build broken by the commit rL322202.Andrey Churbanov2018-01-111-0/+2
| | | | | | | | Patch by simone <simone@cs.utah.edu>. Differential Revision: https://reviews.llvm.org/D41945 llvm-svn: 322282
* Force HWLOC topology method for NUMA-specific topologyJonathan Peyton2018-01-101-0/+9
| | | | | | | | | | | | If user requested affinity with granularity=tile we need to either use HWLOC or ignore the request. The change allows user to not specify KMP_TOPOLOGY_METHOD=hwloc and choose it automatically instead. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D40905 llvm-svn: 322205
OpenPOWER on IntegriCloud