path: root/openmp
Commit message | Author | Age | Files | Lines
...
* [OMPT] Add tool_not_available testcase | Joachim Protze | 2018-02-08 | 2 | -0/+69
    Add a testcase that checks whether the runtime can handle an ompt_start_tool method that returns NULL, indicating that no tool shall be loaded.
    All tool_available testcases need a separate folder to avoid file conflicts for the generated tools.
    Differential Revision: https://reviews.llvm.org/D41904
    llvm-svn: 324587
* [OpenMP][libomptarget] Add data sharing support in libomptarget | Gheorghe-Teodor Bercea | 2018-02-07 | 6 | -1/+65
    Summary: This patch extends the libomptarget functionality in patch D14254 with support for the data sharing scheme for supporting implicitly shared variables. The runtime therefore maintains a list of references to shared variables.
    Reviewers: carlo.bertolli, ABataev, Hahnfeld, grokos, caomhin, hfinkel
    Reviewed By: Hahnfeld, grokos
    Subscribers: guansong, llvm-commits, openmp-commits
    Differential Revision: https://reviews.llvm.org/D41485
    llvm-svn: 324495
* [OMPT] Fix tool initialization returning 0 | Joachim Protze | 2018-02-06 | 1 | -0/+6
    If tool initialization returns 0, OMPT should not be active. The current implementation provided some callback invocations in this case.
    Differential Revision: https://reviews.llvm.org/D42709
    llvm-svn: 324320
* [OpenMP-RT] Fix debug string for NVPTX runtime library | Carlo Bertolli | 2018-02-01 | 1 | -1/+1
    https://reviews.llvm.org/D42757
    The method ThreadsInTeam is used to determine the number of threads to be used in a parallel region under SPMD mode (see line 127 of supporti.h in libomptarget/deviceRTLs/nvptx/src/). This patch fixes the corresponding debug print upon initialization of the kernel in SPMD mode.
    llvm-svn: 323978
* [libomptarget] Check for library with CUDA Driver API | Jonas Hahnfeld | 2018-01-30 | 2 | -40/+67
    That's what we really need to link the CUDA plugin against, not the CUDA runtime API in CUDA_LIBRARIES! While the latter comes with the CUDA SDK, the Driver API is installed with the kernel driver and there is at most one per system. As fallback we can use the stubs library distributed with the CUDA SDK for linking.
    Differential Revision: https://reviews.llvm.org/D42643
    llvm-svn: 323787
* [libomptarget] Only use CUDA Driver API | Jonas Hahnfeld | 2018-01-30 | 1 | -11/+9
    Use equivalents for the last calls to the Runtime API. Remove a stray assert in an error path, found during review; we should only return OFFLOAD_FAIL.
    Differential Revision: https://reviews.llvm.org/D42686
    llvm-svn: 323786
* [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs. | George Rokos | 2018-01-29 | 27 | -1/+5897
    This patch implements the device runtime library whose interface is used in the code generation for OpenMP offloading devices. Currently there is a single device RTL, written in CUDA and meant for CUDA-enabled GPUs. The interface is a variation of the kmpc interface that includes some extra calls to do thread and storage management that only make sense for a GPU target.
    Differential revision: https://reviews.llvm.org/D14254
    llvm-svn: 323649
* [OMPT] Use fuzzy return addresses in lock testcases | Jonas Hahnfeld | 2018-01-26 | 4 | -71/+71
    Use fuzzy return addresses in lock testcases so that these testcases can also be run using the Intel Compiler.
    Patch by Simon Convent!
    Differential Revision: https://reviews.llvm.org/D41896
    llvm-svn: 323529
* Fix name of 'macOS' and add asterisks to brands, NFC. | Jonas Hahnfeld | 2018-01-23 | 1 | -1/+1
    llvm-svn: 323180
* Sprinkle a few <cstdlib> includes, for libomptarget sources using | Dimitry Andric | 2018-01-18 | 3 | -0/+3
    malloc, free, alloca and getenv. NFCI.
    llvm-svn: 322869
* Add missing headers for Debug builds | Jonas Hahnfeld | 2018-01-18 | 2 | -0/+2
    llvm-svn: 322830
* Partial revert of [OMPT] Rename ompt_mutex_impl_t to kmp_mutex_impl | Joachim Protze | 2018-01-17 | 1 | -3/+3
    The previous commit did not revert all replaced ompt_mutex_impl_unknown.
    llvm-svn: 322631
* [OMPT] Add Workaround for Intel Compiler Bug | Joachim Protze | 2018-01-17 | 2 | -1/+2
    Add Workaround for Intel Compiler Bug with Case#: 03138964
    A critical region within a nested task causes a segfault in icc 14-18:

        int main()
        {
          #pragma omp parallel num_threads(2)
          #pragma omp master
          #pragma omp task
          #pragma omp task
          #pragma omp critical
          printf("test\n");
        }

    When the critical region is in a separate function, the segfault does not occur. So we add noinline to make sure that the function call stays there.
    Differential Revision: https://reviews.llvm.org/D41182
    llvm-svn: 322622
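The workaround described in the Intel Compiler Bug commit can be pictured with a minimal sketch. The helper names `print_in_critical` and `run_nested_tasks` are hypothetical and do not come from the patch; the `__attribute__((noinline))` spelling assumes a GCC-compatible compiler. The critical region is moved into its own function, and the attribute keeps the call from being inlined back into the nested task.

```c
#include <stdio.h>

/* Hypothetical helper: the critical region lives in a separate function.
   noinline (GCC/Clang/ICC attribute syntax) keeps the call in place. */
static __attribute__((noinline)) int print_in_critical(const char *msg)
{
  int written = 0;
#pragma omp critical
  written = printf("%s\n", msg);
  return written;
}

/* Same nesting as the reproducer: parallel -> master -> task -> task.
   Returns the number of characters printed by the innermost task. */
static int run_nested_tasks(void)
{
  int total = 0;
#pragma omp parallel num_threads(2)
#pragma omp master
#pragma omp task shared(total)
#pragma omp task shared(total)
  total = print_in_critical("test");
  return total;
}
```

Without OpenMP enabled the pragmas are ignored and the code runs serially, so the sketch can be compiled either way.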
* [OMPT] Rename ompt_mutex_impl_t to kmp_mutex_impl | Joachim Protze | 2018-01-17 | 4 | -36/+36
    The definition is not part of the spec and thus should not have the prefix "ompt_" but rather a prefix that indicates that this is implementation specific.
    Differential Revision: https://reviews.llvm.org/D41166
    llvm-svn: 322621
* [OMPT] Return appropriate values for ompt runtime entry points for non-OpenMP threads | Joachim Protze | 2018-01-17 | 3 | -8/+75
    When the current thread is not an (initialized) OpenMP thread, the runtime entry points return values that correspond to "not available" or similar.
    Differential Revision: https://reviews.llvm.org/D41167
    llvm-svn: 322620
* Fixed libomp static build broken by the commit rL322202. | Andrey Churbanov | 2018-01-11 | 1 | -0/+2
    Patch by simone <simone@cs.utah.edu>.
    Differential Revision: https://reviews.llvm.org/D41945
    llvm-svn: 322282
* Force HWLOC topology method for NUMA-specific topology | Jonathan Peyton | 2018-01-10 | 1 | -0/+9
    If the user requested affinity with granularity=tile, we need to either use HWLOC or ignore the request. This change lets the user omit KMP_TOPOLOGY_METHOD=hwloc; the runtime then chooses it automatically.
    Patch by Andrey Churbanov
    Differential Revision: https://reviews.llvm.org/D40905
    llvm-svn: 322205
* Simplify __kmp_expand_threads | Jonathan Peyton | 2018-01-10 | 1 | -37/+12
    This change simplifies __kmp_expand_threads to take a single argument. Previously, it allowed two arguments and had logic to decide on different potential expansion sizes. However, no calls to __kmp_expand_threads in the runtime make use of this extra logic. Thus the extra argument and logic is removed here.
    Patch by Terry Wilmarth
    Differential Revision: https://reviews.llvm.org/D41836
    llvm-svn: 322204
* Minor code cleanup | Jonathan Peyton | 2018-01-10 | 3 | -4/+13
    Patch by Terry Wilmarth
    Differential Revision: https://reviews.llvm.org/D41831
    llvm-svn: 322203
* Improve stability of the runtime in parent/child processes | Jonathan Peyton | 2018-01-10 | 6 | -4/+32
    This change improves stability of the runtime when the application forks child processes. Acquiring/releasing __kmp_initz_lock and __kmp_forkjoin_lock in the atfork handlers ensures that the actual fork does not occur while those two locks are held, and __kmp_itt_reset() reverts the itt's global state to the initial state, which also initializes the mutex stored in the global state. Some missing initialization code was also inserted in the child's atfork handler.
    Patch by Hansang Bae
    Differential Revision: https://reviews.llvm.org/D41462
    llvm-svn: 322202
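The atfork pattern described in the parent/child stability commit can be sketched with plain POSIX calls. This is not the runtime's code: `init_lock` and `forkjoin_lock` are stand-ins for `__kmp_initz_lock` and `__kmp_forkjoin_lock`, and all function names are hypothetical. The prepare handler takes both locks, so a fork can never happen while another thread holds them, and both the parent and child handlers release them again.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-ins for the two runtime locks named in the commit message. */
static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t forkjoin_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the parent just before fork(): take both locks. */
static void atfork_prepare(void)
{
  pthread_mutex_lock(&init_lock);
  pthread_mutex_lock(&forkjoin_lock);
}

/* Runs in the parent after fork(): release in reverse order. */
static void atfork_parent(void)
{
  pthread_mutex_unlock(&forkjoin_lock);
  pthread_mutex_unlock(&init_lock);
}

/* Runs in the child after fork(): release the inherited locks.
   A real runtime would also reset other global state here. */
static void atfork_child(void)
{
  pthread_mutex_unlock(&forkjoin_lock);
  pthread_mutex_unlock(&init_lock);
}

static int install_atfork_handlers(void)
{
  return pthread_atfork(atfork_prepare, atfork_parent, atfork_child);
}

/* Forks and returns 0 if the child could acquire both locks. */
static int fork_and_check(void)
{
  int status = 0;
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    int ok = pthread_mutex_trylock(&init_lock) == 0 &&
             pthread_mutex_trylock(&forkjoin_lock) == 0;
    _exit(ok ? 0 : 1);
  }
  if (waitpid(pid, &status, 0) != pid)
    return -1;
  return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Calling `install_atfork_handlers()` once at startup and then `fork_and_check()` exercises all three handlers.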
* Missed to add new test case in previous commit | Joachim Protze | 2018-01-10 | 1 | -0/+150
    llvm-svn: 322179
* [OMPT] Fix ompt_task_data handling in implicit barriers | Joachim Protze | 2018-01-10 | 2 | -1/+2
    Changes to task_data in barrier-begin were not visible at barrier-end.
    Differential Revision: https://reviews.llvm.org/D41176
    llvm-svn: 322178
* [OMPT] Fix cast and printf of wait_id in lock test | Jonas Hahnfeld | 2018-01-10 | 1 | -1/+1
    This didn't work on 32 bit platforms.
    Differential Revision: https://reviews.llvm.org/D41853
    llvm-svn: 322160
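The wait_id commit message does not show the actual fix. As a general illustration only, a pointer-derived 64-bit id can be printed portably by casting through `uintptr_t` and using the `<inttypes.h>` format macros, which behave the same on 32-bit and 64-bit platforms. The helper name `format_wait_id` is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: render a lock address as a 64-bit wait id.
   The cast goes through uintptr_t so it is valid for any pointer width,
   and PRIu64 always matches the uint64_t argument. */
static int format_wait_id(char *buf, size_t len, const void *lock)
{
  uint64_t id = (uint64_t)(uintptr_t)lock;
  return snprintf(buf, len, "wait_id=%" PRIu64, id);
}
```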
* Fix type mismatch in omp_control_tool() implementation that makes it run incorrectly on 32-bit machines. | Paul Osmialowski | 2018-01-09 | 1 | -1/+1
    Differential Revision: https://reviews.llvm.org/D41854
    llvm-svn: 322068
* Correct types of pointers to doacross_num_done | Jonas Hahnfeld | 2018-01-07 | 1 | -3/+3
    This field is defined as kmp_int32, so we should use neither pointers to kmp_int64 nor 64 bit atomic instructions. (Found while testing on a Raspberry Pi, 32 bit ARM)
    Differential Revision: https://reviews.llvm.org/D41656
    llvm-svn: 321964
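The doacross_num_done type issue can be illustrated with a small sketch. It uses C11 atomics rather than the runtime's own atomic macros, and the struct and function names are hypothetical. The counter is a 32-bit field, so the atomic update must be 32 bits wide; a 64-bit access through a wrongly typed pointer would also touch the neighbouring field.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: a 32-bit counter followed by another field. */
typedef struct {
  _Atomic int32_t num_done; /* stand-in for doacross_num_done (kmp_int32) */
  int32_t neighbour;        /* would be clobbered by a 64-bit access */
} shared_info_t;

/* Atomically increments the 32-bit counter and returns the new value. */
static int32_t mark_done(shared_info_t *sh)
{
  return atomic_fetch_add(&sh->num_done, 1) + 1;
}
```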
* Fix some comments and formatting in kmp_dispatch.cpp | Jonathan Peyton | 2018-01-04 | 1 | -8/+9
    llvm-svn: 321831
* Fix trademarks found by scanner | Jonathan Peyton | 2018-01-04 | 6 | -28/+28
    llvm-svn: 321827
* [OMPT] Build runtime with OMPT support by default | Joachim Protze | 2018-01-02 | 3 | -7/+23
    This patch enables OMPT by default if version 50 or later is built and the config says that OMPT will be supported.
    Differential Revision: https://reviews.llvm.org/D41508
    llvm-svn: 321675
* Unify build documentation and convert to reStructuredText | Jonas Hahnfeld | 2017-12-27 | 9 | -383/+316
    We now have several options that apply for both libraries and they shouldn't be documented in multiple files. When already merging the two Build_With_CMake.txt documents, convert them to reStructuredText which is used for all of LLVM's documentation.
    Differential Revision: https://reviews.llvm.org/D40920
    llvm-svn: 321481
* [OMPT] Set and reset frame address when creating a task with dependences | Joachim Protze | 2017-12-24 | 2 | -13/+31
    As for normal task creation, the task frame addresses need to be stored for the encountering task.
    Differential Revision: https://reviews.llvm.org/D41165
    llvm-svn: 321421
* [OMPT] Add missing initialization in nested_lwt.c test case | Paul Osmialowski | 2017-12-22 | 1 | -1/+1
    Without this initialization this test case tends to fail.
    Differential Revision: https://reviews.llvm.org/D41542
    llvm-svn: 321379
* [OMPT] Fix failing test cases for gcc on Ubuntu | Joachim Protze | 2017-12-22 | 2 | -2/+3
    The compiler warns that _BSD_SOURCE is deprecated and _DEFAULT_SOURCE should be used instead. We keep _BSD_SOURCE for older compilers that don't know about _DEFAULT_SOURCE.
    The linker drops the tool when linking, since there is no visible need for the library. So we need to tell the linker that the tool should be linked anyway.
    Differential Revision: https://reviews.llvm.org/D41499
    llvm-svn: 321362
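The feature-test macro part of the gcc-on-Ubuntu fix can be sketched as follows, assuming glibc. Both macros are defined before any system header: newer glibc versions see `_DEFAULT_SOURCE` and stay quiet, while older ones still honour `_BSD_SOURCE`. The helper `nap_microseconds` is hypothetical and only exists to use a declaration (`usleep`) that these macros expose.

```c
/* Define both macros before the first system header is included. */
#ifndef _BSD_SOURCE
#define _BSD_SOURCE
#endif
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif

#include <unistd.h>

/* Hypothetical helper using a BSD-derived call exposed by the macros. */
static int nap_microseconds(unsigned int us)
{
  return usleep(us);
}
```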
* Remove unused positional argument for printf | Joachim Protze | 2017-12-22 | 2 | -2/+4
    The format string for hints only prints the second argument (string) and drops the first argument (hint id). Depending on how you read the POSIX text for printf, this could be valid. But for practical reasons, i.e., unpacking the va_list passed to printf based on the formatting information, it makes sense to fix the implementation and not pass the id for hint.
    Failing testcases were:
        misc_bugs/teams-reduction.c
        ompt/parallel/not_enough_threads.c
    Differential Revision: https://reviews.llvm.org/D41504
    llvm-svn: 321361
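A small sketch of the positional-argument issue in the printf commit, assuming a POSIX printf that supports `%n$` conversions. The message text and the helper name `format_hint` are hypothetical. With numbered conversions, every argument passed should be referenced by the format string; once the unused hint id is no longer passed, the string becomes argument 1 and the format references everything it receives.

```c
#include <stdio.h>

/* Hypothetical helper: only the hint text is passed, so "%1$s"
   references every argument in the list and nothing is skipped. */
static int format_hint(char *buf, size_t len, const char *text)
{
  return snprintf(buf, len, "OMP: Hint %1$s", text);
}
```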
* Add missing test case from D41171 commit | Joachim Protze | 2017-12-21 | 1 | -0/+29
    llvm-svn: 321270
* [OMPT] Add missing ompt_get_num_procs function | Joachim Protze | 2017-12-21 | 4 | -0/+21
    This function is defined in OpenMP-TR6 section 4.1.5.1.6.
    The function was not implemented yet. Since ompt-functions can only be called after the runtime was initialized and has loaded a tool, it can assume the runtime to be initialized. This is in contrast to omp_get_num_procs, which needs to check whether the runtime is initialized.
    Differential Revision: https://reviews.llvm.org/D40949
    llvm-svn: 321269
* [OMPT] Fix return address handling in a few GOMP interface methods | Joachim Protze | 2017-12-21 | 2 | -8/+14
    This revision fixes failing testcases with parallel for loops and the gomp interface. The return address needs to be stored at entry to runtime. The storage is cleared on usage, so we need to update the storage before again calling internal functions that will trigger event callbacks.
    Differential Revision: https://reviews.llvm.org/D41181
    llvm-svn: 321265
* [OMPT] Handle null pointer in set_callback to improve performance | Joachim Protze | 2017-12-21 | 1 | -2/+5
    We use the bitmap ompt_enabled throughout the runtime, to avoid loading the vector of callback functions when testing if specific code should be executed. Before invoking an event callback function, the pointer is tested for NULL. This revision resets the corresponding bit in ompt_enabled to 0 if NULL is passed in set_callback.
    Differential Revision: https://reviews.llvm.org/D41171
    llvm-svn: 321264
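The bitmap idea in the set_callback commit can be sketched in a few lines. This is an illustration under assumed names, not the runtime's data structures: `enabled_bits` stands in for `ompt_enabled`, and the event ids, `callback_t`, and all functions are hypothetical. Registering NULL clears the bit, so the hot path only needs the cheap bit test before it calls through the table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event ids and callback type. */
enum { EVENT_THREAD_BEGIN, EVENT_PARALLEL_BEGIN, EVENT_COUNT };
typedef void (*callback_t)(void);

static uint32_t enabled_bits;             /* stand-in for ompt_enabled */
static callback_t callbacks[EVENT_COUNT]; /* vector of callback functions */
static int calls_made;

/* Example callback that counts its invocations. */
static void counting_callback(void) { calls_made++; }

/* Stores the callback and keeps the bitmap in sync: a NULL pointer
   clears the bit, a non-NULL pointer sets it. Returns 1 on success. */
static int set_callback(int event, callback_t cb)
{
  if (event < 0 || event >= EVENT_COUNT)
    return 0;
  callbacks[event] = cb;
  if (cb)
    enabled_bits |= (1u << event);
  else
    enabled_bits &= ~(1u << event);
  return 1;
}

/* Hot path: one bit test decides whether the table is touched at all. */
static void fire_event(int event)
{
  if (enabled_bits & (1u << event))
    callbacks[event]();
}
```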
* [OMPT] Use frames at different level when using clang version 5 or higher with debug flag | Joachim Protze | 2017-12-21 | 7 | -18/+35
    Clang 5 or higher adds an intermediate function call in certain cases when compiling with debug flag. This revision updates the testcases to work correctly.
    Differential Revision: https://reviews.llvm.org/D40595
    llvm-svn: 321263
* [OMPT] Add annotations to testcases that are expected to fail when using certain compilers | Joachim Protze | 2017-12-21 | 26 | -7/+32
    Reasons for expected failures are mainly bugs when using labels in OpenMP regions or missing support of some OpenMP features. For some worksharing clauses, support to distinguish the kind of workshare was added just recently.
    If an issue was fixed in a minor release version of a compiler, we flag the test as unsupported for this compiler version to avoid false positives. Same for fixes that were backported to older compiler versions.
    Differential Revision: https://reviews.llvm.org/D40384
    llvm-svn: 321262
* [AArch64] add required arch specific code for running OMPT test cases | Paul Osmialowski | 2017-12-21 | 1 | -0/+6
    Differential Revision: https://reviews.llvm.org/D41482
    llvm-svn: 321258
* Fix more inconsistent line endings. NFC. | Dimitry Andric | 2017-12-18 | 1 | -323/+323
    llvm-svn: 321016
* [AArch64] fix an issue with older /proc/cpuinfo layout | Paul Osmialowski | 2017-12-13 | 1 | -0/+8
    There are two /proc/cpuinfo layouts in use for AArch64: old and new. The old one has all 'processor : n' lines in one section, hence checking for duplications does not make sense.
    Differential Revision: https://reviews.llvm.org/D41000
    llvm-svn: 320593
* [CMake] Remove legacy LIBOMP_LIT_ARGS | Jonas Hahnfeld | 2017-12-08 | 1 | -4/+0
    The bots have been updated, this option isn't needed anymore.
    llvm-svn: 320153
* Use hyperbarrier by default on all architectures | Jonas Hahnfeld | 2017-12-08 | 1 | -15/+6
    All architectures except x86_64 used the linear barrier implementation by default which doesn't give good performance for a larger number of threads.
    Improvements for PARALLEL overhead (EPCC) with this patch on a Power8 system (2 sockets x 10 cores x 8 threads, OMP_PLACES=cores):
        20 threads: 4.55us -> 3.49us
        40 threads: 8.84us -> 4.06us
        80 threads: 19.18us -> 4.74us
        160 threads: 54.22us -> 6.73us
    Differential Revision: https://reviews.llvm.org/D40358
    llvm-svn: 320152
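To make the EPCC numbers in the hyperbarrier commit easier to compare, the speedup per thread count can be computed as the ratio of the overhead before and after the patch. The sketch below only encodes the four measurements quoted in the commit message; the type and function names are hypothetical.

```c
#include <stddef.h>

/* One row of the PARALLEL overhead measurements from the commit message. */
typedef struct {
  int threads;
  double before_us; /* overhead before the patch, in microseconds */
  double after_us;  /* overhead after the patch, in microseconds */
} epcc_row_t;

static const epcc_row_t epcc_rows[] = {
  {20, 4.55, 3.49},
  {40, 8.84, 4.06},
  {80, 19.18, 4.74},
  {160, 54.22, 6.73},
};

/* Speedup factor for one row: old overhead divided by new overhead. */
static double speedup(const epcc_row_t *row)
{
  return row->before_us / row->after_us;
}
```

The ratios come out at roughly 1.3x for 20 threads, 2.2x for 40, 4.0x for 80, and 8.1x for 160, so the benefit grows with the thread count.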
* Fix thread affinity on non-x86 Linux | Jonas Hahnfeld | 2017-12-08 | 2 | -5/+2
    To make thread affinity work according to the OpenMP spec, the runtime needs information about the hardware topology. On Linux the default way is to parse /proc/cpuinfo which contains this information for x86 machines but (at least) not for AArch64 and Power architectures.
    Fortunately, there is a different code path which is able to get that data from sysfs. The needed patch has landed in 2006 for Linux 2.6.16 which is safe to assume nowadays (even RHEL 5 had a kernel version derived from 2.6.18, and we are now at RHEL 7!).
    Differential Revision: https://reviews.llvm.org/D40357
    llvm-svn: 320151
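The commit message for the non-x86 affinity fix does not say which sysfs files the runtime reads. As an assumption-labelled sketch, Linux exposes per-CPU topology values such as `core_id` and `physical_package_id` under `/sys/devices/system/cpu/cpuN/topology/`, and a reader for them can look like this. The helper name `read_topology_value` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads one integer from the Linux sysfs topology
   directory of a CPU, e.g. name = "core_id" or "physical_package_id".
   Returns -1 if the file is missing or does not contain an integer. */
static int read_topology_value(int cpu, const char *name)
{
  char path[128];
  int value = -1;
  FILE *f;

  snprintf(path, sizeof path,
           "/sys/devices/system/cpu/cpu%d/topology/%s", cpu, name);
  f = fopen(path, "r");
  if (!f)
    return -1;
  if (fscanf(f, "%d", &value) != 1)
    value = -1;
  fclose(f);
  return value;
}
```

For example, `read_topology_value(0, "core_id")` yields the core id of CPU 0 on systems that provide the file.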
* Add missing memory barrier for queuing locks | Jonas Hahnfeld | 2017-12-08 | 2 | -1/+2
    Otherwise I see hangs in the omp_single_copyprivate test when compiling in release mode. With the debug assertions, I get a failure `head > 0 && tail > 0`.
    Differential Revision: https://reviews.llvm.org/D40722
    llvm-svn: 320150
* [libomptarget] Split implementation of interface functions | Jonas Hahnfeld | 2017-12-06 | 4 | -484/+520
    This last of four patches adds a new file for the interface functions that Clang uses during code generation. The only changes apart from simply moving the current code are renaming the function CheckDeviceAndCtors() and using the correct type for 64bit device ids.
    Differential Revision: https://reviews.llvm.org/D40801
    llvm-svn: 319972
* [libomptarget] Split implementation of API functions | Jonas Hahnfeld | 2017-12-06 | 5 | -297/+316
    This third patch moves the implementation of the user-facing OpenMP API functions into its own file. For now, the code is only moved, no cleanups applied yet.
    Differential Revision: https://reviews.llvm.org/D40800
    llvm-svn: 319971
* [libomptarget] Split device functionality | Jonas Hahnfeld | 2017-12-06 | 4 | -321/+339
    This is the second patch to split the current monolithic implementation into separate files. Note that this change doesn't cleanup the code yet.
    Differential Revision: https://reviews.llvm.org/D40799
    llvm-svn: 319970
* [libomptarget] Split RTL plugin functionality | Jonas Hahnfeld | 2017-12-06 | 6 | -629/+742
    This is the first of four patches to split the target agnostic library into multiple (smaller) files. It only moves the code to separate implementation files and does no cleanup (yet) except removing unneeded headers.
    Differential Revision: https://reviews.llvm.org/D40798
    llvm-svn: 319969