path: root/openmp/runtime/src/kmp_sched.cpp
Commit message, Author, Age, Files, Lines
* [OpenMP] Remove OMP spec versioning
  Jonathan Peyton, 2019-07-12, 1 file, -14/+2

  Remove all older OMP spec versioning from the runtime and build system.

  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D64534
  llvm-svn: 365963
* [OpenMP][Stats] Fix stats gathering for distribute and team clause
  Jonathan Peyton, 2019-04-03, 1 file, -20/+36

  The distribute clause needs an explicit push of a timer. The teams clause needs a timer added and also, similarly to parallel, exchanged with the serial timer when encountered so that serial regions are counted properly.

  Differential Revision: https://reviews.llvm.org/D59801
  llvm-svn: 357621
* Update more file headers across all of the LLVM projects in the monorepo to reflect the new license.
  Chandler Carruth, 2019-01-19, 1 file, -4/+3

  These used slightly different spellings that defeated my regular expressions.

  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.

  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.

  llvm-svn: 351648
* [OMPT] Update types according to TR7
  Joachim Protze, 2018-09-10, 1 file, -1/+1

  Some types and callback signatures have changed from TR6 to TR7.

  Major changes (only adding signatures and stubs):
  (-remove idle callback) done by D48362
  -add reduction and dispatch callback
  -add get_task_memory and finalize_tool runtime entry points
  -ompt_invoker_t becomes ompt_parallel_flag_t
  -more types of sync_regions

  Patch provided by Simon Convent
  Reviewers: hbae, protze.joachim
  Differential Revision: https://reviews.llvm.org/D50774
  llvm-svn: 341834
* [OpenMP] Cleanup code
  Jonathan Peyton, 2018-08-09, 1 file, -2/+2

  This patch cleans up unused functions, variables, sign compare issues, and addresses some -Warning flags which are now enabled including -Wcast-qual. Not all the warning flags in LibompHandleFlags.cmake are enabled, but some are with this patch.

  Some __kmp_gtid_from_* macros in kmp.h are switched to static inline functions which allows us to remove the awkward definition of KMP_DEBUG_ASSERT() and KMP_ASSERT() macros which used the comma operator. This had to be done for the innumerable -Wunused-value warnings related to KMP_DEBUG_ASSERT()

  Differential Revision: https://reviews.llvm.org/D49105
  llvm-svn: 339393
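  The macro-to-function change described in this commit can be pictured roughly as follows. This is a minimal sketch only: `ThreadSketch`, `gtid_from_thread_sketch`, and the macro shown in the comment are hypothetical stand-ins, not the definitions in kmp.h.

```cpp
#include <cassert>

// Hypothetical stand-in for a runtime thread descriptor.
struct ThreadSketch {
  int gtid;
};

// A macro form would need the assertion to be usable inside an expression,
// for example via the comma operator (illustrative, not the real macro):
//   #define GTID_FROM_THREAD(thr) (CHECK((thr) != nullptr), (thr)->gtid)
// When the check compiles away, the leftover operand can trigger
// -Wunused-value. A static inline function lets the assertion be an
// ordinary statement instead.
static inline int gtid_from_thread_sketch(const ThreadSketch *thr) {
  assert(thr != nullptr); // plain statement, no comma operator required
  return thr->gtid;
}
```

  With a function, the assertion macro no longer has to be an expression, which is the simplification the commit message points at.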
* [OpenMP][Stats] Cleanup stats gathering code
  Jonathan Peyton, 2018-07-30, 1 file, -4/+23

  1) Remove unnecessary data from list node structure
  2) Remove timerPair in favor of pushing/popping explicitTimers. This way, nested timers will work properly.
  3) Fix #pragma omp critical timers
  4) Add histogram capability
  5) Add KMP_STATS_FILE formatting capability
  6) Have time partitioned into serial & parallel by introducing partitionedTimers::exchange(). This also counts the number of serial regions in the executable.
  7) Fix up the timers around OMP loops so that scheduling overhead and work are both counted correctly.
  8) Fix up the iterations statistics so they count the number of iterations the thread receives at each loop scheduling event
  9) Change timers so there is only one RDTSC read per event change
  10) Fix up the outdated comments for the timers

  Differential Revision: https://reviews.llvm.org/D49699
  llvm-svn: 338276
* Introduce GOMP_taskloop API
  Jonathan Peyton, 2018-04-18, 1 file, -0/+1

  This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our version symbols. Being a wrapper around __kmpc_taskloop, the function creates a task with the loop bounds properly nested in the shareds so that the GOMP task thunk will work properly. Also, the firstprivate copy constructors are properly handled using the __kmp_gomp_task_dup() auxiliary function.

  Currently, only linear spawning of tasks is supported for the GOMP_taskloop interface.

  Differential Revision: https://reviews.llvm.org/D45327
  llvm-svn: 330282
* [OMPT] Fix assertion for OpenMP code generated with outdated compilers
  Joachim Protze, 2017-11-10, 1 file, -4/+8

  For up-to-date compilers, this assertion is reasonable, but it breaks compatibility with the typical compiler installed on most systems. This patch changes the default value to what we had when there was no compiler support. A warning about the outdated compiler is printed during runtime, when this point is reached.

  Differential Revision: https://reviews.llvm.org/D39890
  llvm-svn: 317928
* Remove const from variables with dynamic memory
  Jonas Hahnfeld, 2017-11-09, 1 file, -9/+9

  Allocated memory is typically not 'const' if it needs to be freed. This patch removes around 50 wrong const attributes, modifies the corresponding functions and finally gets rid of some const_casts. These have especially been strange for __kmp_str_fname_free() that added a 'const' to call __kmp_str_free() which removed it again.

  Two minor cleanups that I performed in this process:
  * __kmp_tool_libraries now lives in kmp_settings.cpp as it is used nowhere else.
  * __kmp_msg_empty was removed as it was never used and Clang now complained that it was assigned a string literal that is 'const char *'.

  Differential Revision: https://reviews.llvm.org/D39755
  llvm-svn: 317797
* Update implementation of OMPT to the specification OpenMP 5.0 Preview 1 (TR4).
  Joachim Protze, 2017-11-01, 1 file, -24/+68

  The code is tested to work with latest clang, GNU and Intel compiler. The implementation is optimized for low overhead when no tool is attached shifting the cost to execution with tool attached.

  This patch does not implement OMPT for libomptarget.

  Patch by Simon Convent and Joachim Protze
  Differential Revision: https://reviews.llvm.org/D38185
  llvm-svn: 317085
* Apply formatting changes
  Jonathan Peyton, 2017-10-20, 1 file, -2/+0

  .clang-format's comments are removed and a (hopefully) final set of formatting changes are applied.

  Differential Revision: https://reviews.llvm.org/D38837
  Differential Revision: https://reviews.llvm.org/D38920
  llvm-svn: 316227
* remove deprecated register storage class specifier
  Ed Maste, 2017-07-07, 1 file, -28/+28

  While importing libomp into the FreeBSD base system we encountered Clang warnings that "'register' storage class specifier is deprecated and incompatible with C++1z [-Wdeprecated-register]".

  Differential Revision: https://reviews.llvm.org/D35124
  llvm-svn: 307441
* Clang-format and whitespace cleanup of source code
  Jonathan Peyton, 2017-05-12, 1 file, -745/+732

  This patch contains the clang-format and cleanup of the entire code base. Some of clang-format's changes made the code look worse in places. A best effort was made to resolve the bulk of these problems, but many remain. Most of the problems were mangling line-breaks and tabbing of comments.

  Patch by Terry Wilmarth
  Differential Revision: https://reviews.llvm.org/D32659
  llvm-svn: 302929
* Stride in distribute parallel for loops with no chunk size.
  Andrey Churbanov, 2017-03-21, 1 file, -0/+1

  Patch by George Rokos.
  Differential Revision: https://reviews.llvm.org/D24486
  llvm-svn: 298362
* Cleanup: put i_maxmin members and ___kmp_size_type into traits_t
  Jonathan Peyton, 2017-01-27, 1 file, -35/+8

  Put the duplicated i_maxmin into traits_t by adding new members max_value and min_value. Put ___kmp_size_type into traits_t by adding member type_size.

  Differential Revision: https://reviews.llvm.org/D28847
  llvm-svn: 293316
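  A traits template that carries these three members might look like the sketch below. Only the member names max_value, min_value, and type_size come from the commit message; `traits_sketch` and its use of std::numeric_limits are assumptions for illustration, and the runtime's traits_t is defined differently (as per-type specializations).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Illustrative only: one place to look up the limits and size of an
// integer type, instead of separate helper templates for each.
template <typename T> struct traits_sketch {
  static constexpr T max_value = std::numeric_limits<T>::max();
  static constexpr T min_value = std::numeric_limits<T>::min();
  static constexpr std::size_t type_size = sizeof(T);
};

static_assert(traits_sketch<std::int32_t>::type_size == 4,
              "a 32-bit integer occupies four bytes");
```

  Code that previously consulted a separate max/min helper and a separate size helper can then read all three values from the same traits type.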
* Follow up to r289732: Update comments in source files to reference .cpp files
  Jonathan Peyton, 2016-12-14, 1 file, -1/+1

  Patch by Hansang Bae
  llvm-svn: 289739
* Renaming change: 41 -> 45 and 4.1 -> 4.5
  Jonathan Peyton, 2016-06-14, 1 file, -1/+1

  OpenMP 4.1 is now OpenMP 4.5. Any mention of 41 or 4.1 is replaced with 45 or 4.5. Also, if the CMake option LIBOMP_OMP_VERSION is 41, CMake warns that 41 is deprecated and to use 45 instead.

  llvm-svn: 272687
* Addition of OpenMP 4.5 feature: schedule(simd:static)
  Jonathan Peyton, 2016-05-31, 1 file, -0/+23

  This patch implements the new kmp_sch_static_balanced_chunked schedule kind that the compiler will generate when it encounters schedule(simd: static). It just adds the new constant and the new switch case in __kmp_for_static_init.

  Patch by Alex Duran.
  Differential Revision: http://reviews.llvm.org/D20699
  llvm-svn: 271320
* Remove trailing whitespace in src/ directory
  Jonathan Peyton, 2016-05-20, 1 file, -2/+2

  This patch doesn't affect D19878's context. So D19878 still cleanly applies.

  llvm-svn: 270252
* [STATS] Use partitioned timer scheme
  Jonathan Peyton, 2016-05-05, 1 file, -1/+1

  This change replaces the current timers with ones that partition time properly. The current timers are nested, so that if a new timer, B, starts when the current timer, A, is already timing, A's time will include B's. To eliminate this problem, the partitioned timers are designed to stop the current timer (A), let the new timer run (B), and when the new timer is finished, restart the previously running timer (A). With this partitioning of time, a thread's timers all sum up to the OMP_worker_thread_life time and can now easily show the percentage of time a thread is spending in different parts of the runtime or user code.

  There is also a new state variable associated with each thread which tells where it is executing a task. This corresponds with the timers: OMP_task_*, e.g., if time is spent in OMP_task_taskwait, then that thread executed tasks inside a #pragma omp taskwait construct.

  The changes are mostly changing the MACROs to use the new PARTITIONED_* macros, the new partitionedTimers class and its methods, and new state logic.

  Differential Revision: http://reviews.llvm.org/D19229
  llvm-svn: 268640
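  The stop-A, run-B, restart-A behaviour described in this commit can be sketched with a small timer stack. Everything here is hypothetical (`PartitionedTimersSketch`, its methods, and the explicit timestamps passed in by the caller); it is not the runtime's partitionedTimers class, which reads a hardware counter instead.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch: only the timer on top of the stack accumulates time,
// so the per-timer totals partition the elapsed interval without overlap.
class PartitionedTimersSketch {
public:
  // Start timing `name`; the previously running timer is paused.
  void push(const std::string &name, std::uint64_t now) {
    charge_running(now);
    stack_.push_back(name);
  }
  // Stop the running timer; the timer beneath it resumes.
  void pop(std::uint64_t now) {
    if (stack_.empty())
      return;
    charge_running(now);
    stack_.pop_back();
  }
  std::uint64_t total(const std::string &name) const {
    auto it = totals_.find(name);
    return it == totals_.end() ? 0 : it->second;
  }

private:
  // Credit the time since the last event to the timer on top of the stack.
  void charge_running(std::uint64_t now) {
    if (!stack_.empty())
      totals_[stack_.back()] += now - last_;
    last_ = now;
  }
  std::vector<std::string> stack_;
  std::map<std::string, std::uint64_t> totals_;
  std::uint64_t last_ = 0;
};
```

  If timer A starts at tick 0, B runs from tick 10 to 25, and A stops at tick 30, A is charged 15 ticks and B is charged 15 ticks, so the totals add up to the full 30 ticks rather than double counting B inside A.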
* Fix trip count calculation for parallel loops in runtime
  Jonathan Peyton, 2016-04-18, 1 file, -8/+16

  The trip count calculation was incorrect for loops with large bounds. For example, for(int i=-2,000,000,000; i < 2,000,000,000; i+=50000000), the trip count calculation had overflow (trying to calculate 2,000,000,000 + 2,000,000,000 with signed integers) and wasn't giving the right value. This patch fixes this error in the runtime by using unsigned integers instead.

  There is still a bug in the clang compiler component because it warns that there is overflow in the test case file when there isn't. This error isn't there for the Intel Compiler. So for now, the test case is designated as XFAIL.

  Differential Revision: http://reviews.llvm.org/D19078
  llvm-svn: 266677
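  The overflow and its unsigned fix can be reproduced in a few lines. The helper below is a hypothetical illustration (`trip_count_sketch` is not a runtime function, and it assumes a positive increment and an exclusive upper bound as in the loop quoted above); it only shows why doing the subtraction in unsigned arithmetic gives the right count.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of iterations of
//   for (i = lower; i < upper; i += incr)   with incr > 0.
// Computing upper - lower in signed 32-bit arithmetic can overflow when the
// bounds are far apart; unsigned subtraction wraps modulo 2^32 and yields
// the true distance whenever lower < upper.
inline std::uint32_t trip_count_sketch(std::int32_t lower, std::int32_t upper,
                                       std::int32_t incr) {
  assert(incr > 0);
  if (lower >= upper)
    return 0;
  std::uint32_t span =
      static_cast<std::uint32_t>(upper) - static_cast<std::uint32_t>(lower);
  std::uint32_t step = static_cast<std::uint32_t>(incr);
  // Round up without adding to span, so the rounding itself cannot overflow.
  return span / step + (span % step != 0 ? 1u : 0u);
}
```

  For the loop in the commit message the distance between the bounds is 4,000,000,000, which does not fit in a signed 32-bit integer but does fit in an unsigned one, giving 80 iterations.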
* Add new OpenMP 4.5 schedule clause modifiers (monotonic/non-monotonic) feature
  Jonathan Peyton, 2016-02-25, 1 file, -0/+3

  The monotonic/non-monotonic flags are sent to the runtime via the sched_type by setting the 30th (non-monotonic) or 29th (monotonic) bit in the sched_type. Macros are added to probe if monotonic or non-monotonic is specified (SCHEDULE_HAS_[NON]MONOTONIC & SCHEDULE_HAS_NO_MODIFIERS) and also to get the base sched_type (SCHEDULE_WITHOUT_MODIFIERS).

  Currently, nothing is done with the modifiers.

  Also, this patch adds some comments on the use of the enumerations in at least one place where it is subtle.

  Differential Revision: http://reviews.llvm.org/D17406
  llvm-svn: 261906
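  Packing modifier flags into the high bits of the schedule value can be sketched as below. The function and constant names are hypothetical, and reading "the Nth bit" as `1u << N` is an assumption; the runtime exposes this through the SCHEDULE_* macros named in the commit message, whose definitions are not shown in this log.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit positions, taken from the wording of the commit message.
constexpr std::uint32_t kModifierMonotonic = 1u << 29;
constexpr std::uint32_t kModifierNonmonotonic = 1u << 30;
constexpr std::uint32_t kModifierMask =
    kModifierMonotonic | kModifierNonmonotonic;

// Probes comparable in spirit to SCHEDULE_HAS_MONOTONIC and friends.
constexpr bool has_monotonic(std::uint32_t sched) {
  return (sched & kModifierMonotonic) != 0;
}
constexpr bool has_nonmonotonic(std::uint32_t sched) {
  return (sched & kModifierNonmonotonic) != 0;
}
constexpr bool has_no_modifiers(std::uint32_t sched) {
  return (sched & kModifierMask) == 0;
}
// Strip the modifier bits to recover the base schedule kind.
constexpr std::uint32_t without_modifiers(std::uint32_t sched) {
  return sched & ~kModifierMask;
}
```

  Because the modifiers occupy bits well above any base schedule value, the base kind can be recovered with a single mask, and code that ignores the modifiers only needs to mask before its switch statement.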
* [OMPT] Reduce overhead of OMPT
  Jonathan Peyton, 2015-10-09, 1 file, -2/+8

  * Avoid computing state needed only by OMPT unless the ompt_enabled flag is set.
  * Properly handle a corner case in OMPT where team == NULL.

  Patch by John Mellor-Crummey
  Differential Revision: http://reviews.llvm.org/D13502
  llvm-svn: 249857
* [OMPT] Simplify control variable logic for OMPT
  Jonathan Peyton, 2015-09-21, 1 file, -4/+4

  Prior to this change, OMPT had a status flag ompt_status, which could take several values. This was due to an earlier OMPT design that had several levels of enablement (ready, disabled, tracking state, tracking callbacks). The current OMPT design has OMPT support either on or off. This revision replaces ompt_status with a boolean flag ompt_enabled, which simplifies the runtime logic for OMPT.

  Patch by John Mellor-Crummey
  Differential Revision: http://reviews.llvm.org/D12999
  llvm-svn: 248189
* Fix the OpenMP 3.0 build
  Jonathan Peyton, 2015-09-21, 1 file, -2/+2

  This change adds guards to the code in places where they are missing to enable the OpenMP 3.0 build.

  Patch by Diego Caballero and Johnny Peyton
  Mailing List: http://lists.llvm.org/pipermail/openmp-dev/2015-September/000935.html
  llvm-svn: 248178
* Tidy statistics collection
  Jonathan Peyton, 2015-08-11, 1 file, -1/+6

  This removes some statistics counters and timers which were not used, adds new counters and timers for some language features that were not monitored previously and separates the counters and timers into those which are of interest for investigating user code and those which are only of interest to the developer of the runtime itself. The runtime developer statistics are now only collected if the additional #define KMP_DEVELOPER_STATS is set.

  Additional user statistics which are now collected include:
  * Count of nested parallelism (omp parallel inside a parallel region)
  * Count of omp distribute occurrences
  * Count of omp teams occurrences
  * Counts of task related statistics (taskyield, task execution, task cancellation, task steal)
  * Values passed to omp_set_num_threads
  * Time spent in omp single and omp master

  None of this affects code compiled without stats gathering enabled, which is the normal library build mode.

  This also fixes the CMake build by linking to the standard c++ library when building the stats library as it is a requirement. The normal library does not have this requirement and its link phase is left alone.

  Differential Revision: http://reviews.llvm.org/D11759
  llvm-svn: 244677
* Fix doxygen comments
  Jonathan Peyton, 2015-05-22, 1 file, -6/+3

  These fixes make doxygen happy.

  llvm-svn: 238061
* D9302.partial2: cleanup of ittnotify checks, that eliminates redundant notifications in case of nested regions.
  Andrey Churbanov, 2015-05-06, 1 file, -4/+10

  llvm-svn: 236631
* These are the actual changes in the runtime to issue OMPT-related functions. All of them are surrounded by #if OMPT_SUPPORT and can be disabled (which is the default).
  Andrey Churbanov, 2015-04-29, 1 file, -0/+45

  llvm-svn: 236122
* Comments only: removing the Revision and Date svn variables from the top of all the source files.
  Andrey Churbanov, 2015-01-27, 1 file, -2/+0

  llvm-svn: 227207
* I apologise in advance for the size of this check-in.
  Jim Cownie, 2014-10-07, 1 file, -23/+540

  At Intel we do understand that this is not friendly, and are working to change our internal code-development to make it easier to make development features available more frequently and in finer (more functional) chunks. Unfortunately we haven't got that in place yet, and unpicking this into multiple separate check-ins would be non-trivial, so please bear with me on this one. We should be better in the future.

  Apologies over, what do we have here?

  GCC 4.9 compatibility
  --------------------
  * We have implemented the new entrypoints used by code compiled by GCC 4.9 to implement the same functionality in gcc 4.8. Therefore code compiled with gcc 4.9 that used to work will continue to do so. However, there are some other new entrypoints (associated with task cancellation) which are not implemented. Therefore user code compiled by gcc 4.9 that uses these new features will not link against the LLVM runtime. (It remains unclear how to handle those entrypoints, since the GCC interface has potentially unpleasant performance implications for join barriers even when cancellation is not used)

  --- new parallel entry points ---
  new entry points that aren't OpenMP 4.0 related
  These are implemented fully :-
      GOMP_parallel_loop_dynamic()
      GOMP_parallel_loop_guided()
      GOMP_parallel_loop_runtime()
      GOMP_parallel_loop_static()
      GOMP_parallel_sections()
      GOMP_parallel()

  --- cancellation entry points ---
  Currently, these only give a runtime error if OMP_CANCELLATION is true because our plain barriers don't check for cancellation while waiting
      GOMP_barrier_cancel()
      GOMP_cancel()
      GOMP_cancellation_point()
      GOMP_loop_end_cancel()
      GOMP_sections_end_cancel()

  --- taskgroup entry points ---
  These are implemented fully.
      GOMP_taskgroup_start()
      GOMP_taskgroup_end()

  --- target entry points ---
  These are empty (as they are in libgomp)
      GOMP_target()
      GOMP_target_data()
      GOMP_target_end_data()
      GOMP_target_update()
      GOMP_teams()

  Improvements in Barriers and Fork/Join
  --------------------------------------
  * Barrier and fork/join code is now in its own file (which makes it easier to understand and modify).
  * Wait/release code is now templated and in its own file; suspend/resume code is also templated
  * There's a new, hierarchical, barrier, which exploits the cache-hierarchy of the Intel(r) Xeon Phi(tm) coprocessor to improve fork/join and barrier performance.

  ***BEWARE*** the new source files have *not* been added to the legacy CMake build system. If you want to use that, fixes will be required.

  Statistics Collection Code
  --------------------------
  * New code has been added to collect application statistics (if this is enabled at library compile time; by default it is not). The statistics code itself is generally useful, the lightweight timing code uses the X86 rdtsc instruction, so will require changes for other architectures.
    The intent of this code is not for users to tune their codes but rather
    1) For timing code-paths inside the runtime
    2) For gathering general properties of OpenMP codes to focus attention on which OpenMP features are most used.

  Nested Hot Teams
  ----------------
  * The runtime now maintains more state to reduce the overhead of creating and destroying inner parallel teams. This improves the performance of code that repeatedly uses nested parallelism with the same resource allocation. Set the new KMP_HOT_TEAMS_MAX_LEVEL environment variable to a depth to enable this (and, of course, OMP_NESTED=true to enable nested parallelism at all).

  Improved Intel(r) VTune(tm) Amplifier support
  ---------------------------------------------
  * The runtime provides additional information to VTune via the itt_notify interface to allow it to display better OpenMP specific analyses of load-imbalance.
  Support for OpenMP Composite Statements
  ---------------------------------------
  * Implement new entrypoints required by some of the OpenMP 4.1 composite statements.

  Improved ifdefs
  ---------------
  * More separation of concepts ("Does this platform do X?") from platforms ("Are we compiling for platform Y?"), which should simplify future porting.

  ScaleMP* contribution
  ---------------------
  Stack padding to improve the performance in their environment where cross-node coherency is managed at the page level.

  Redesign of wait and release code
  ---------------------------------
  The code is simplified and performance improved.

  Bug Fixes
  ---------
  * Fixes for Windows multiple processor groups.
  * Fix Fortran module build on Linux: offload attribute added.
  * Fix entry names for distribute-parallel-loop construct to be consistent with the compiler codegen.
  * Fix an inconsistent error message for KMP_PLACE_THREADS environment variable.

  llvm-svn: 219214
* First attempt to import OpenMP runtime
  Jim Cownie, 2013-09-27, 1 file, -0/+366

  llvm-svn: 191506