path: root/openmp/libomptarget/deviceRTLs
Commit message | Author | Age | Files | Lines
...
* [OpenMP][libomptarget] Refactor SPMD and runtime requirement checking | Gheorghe-Teodor Bercea | 2018-11-27 | 9 | -171/+262
  Summary: Refactor the checking for SPMD mode and whether the runtime is initialized or not. This uses constant flags which enables the runtime to optimize out unused sections of code that depend on these flags.
  Reviewers: ABataev, caomhin
  Reviewed By: ABataev
  Subscribers: guansong, jfb, openmp-commits
  Differential Revision: https://reviews.llvm.org/D54960
  llvm-svn: 347698
* [OPENMP][NVPTX]Improved lock/critical constructs. | Alexey Bataev | 2018-11-20 | 3 | -24/+11
  Summary: Improved support for critical constructs + omp_..._lock... constructs.
  Reviewers: gtbercea, kkwli0, caomhin
  Subscribers: guansong, jfb, openmp-commits
  Differential Revision: https://reviews.llvm.org/D54766
  llvm-svn: 347342
* [OPENMP][NVPTX]Fixed/improved support for globalization in team contexts. | Alexey Bataev | 2018-11-02 | 6 | -71/+102
  Summary: The current globalization scheme works correctly only in SPMD + lightweight runtime mode and does not work with the full runtime. The patch improves support for the globalization scheme and reduces global memory consumption in lightweight runtime mode. It also adds runtime functions for working with statically allocated global memory, which improves performance and memory consumption. This global memory must be allocated by the compiler.
  Reviewers: grokos, kkwli0, gtbercea, caomhin
  Subscribers: guansong, jfb, openmp-commits
  Differential Revision: https://reviews.llvm.org/D53943
  llvm-svn: 345976
* [OpenMP][libomptarget] Add runtime function for pushing coalesced global records | Gheorghe-Teodor Bercea | 2018-11-01 | 5 | -35/+45
  Summary: In the case of coalesced global records, we need to push the exact data size passed in. This patch fixes this by outlining the common functionality of the previous push function and by adding a separate entry point for coalesced pushes. The pop function remains unchanged.
  Reviewers: ABataev, grokos, caomhin
  Reviewed By: ABataev, grokos
  Subscribers: jholewinski, cfe-commits, Hahnfeld, guansong, jfb, openmp-commits
  Differential Revision: https://reviews.llvm.org/D53141
  llvm-svn: 345867
* [libomptarget-nvptx] Enable asserts in bclib | Jonas Hahnfeld | 2018-10-01 | 1 | -1/+1
  If the user requested LIBOMPTARGET_NVPTX_DEBUG, include asserts in the bitcode library. Everything else will have very unpleasant effects because asserts will appear when falling back to the static library libomptarget-nvptx.a.
  Differential Revision: https://reviews.llvm.org/D52701
  llvm-svn: 343477
* [libomptarget-nvptx] reduction: Determine if runtime uninitialized | Jonas Hahnfeld | 2018-10-01 | 1 | -8/+10
  Pass in the correct value of isRuntimeUninitialized() which solves parallel reductions as reported on the mailing list. For reference: r333285 did the same for loop scheduling.
  Differential Revision: https://reviews.llvm.org/D52725
  llvm-svn: 343476
* [libomptarget-nvptx] Align data sharing stack | Jonas Hahnfeld | 2018-09-30 | 2 | -0/+62
  NVPTX requires addresses of pointer locations to be 8-byte aligned or there will be an exception during runtime. This could happen without this patch as shown in the added test: getId() requires 4 bytes of stack and putValueInParallel() uses 16 bytes to store the addresses of the captured variables.
  Differential Revision: https://reviews.llvm.org/D52655
  llvm-svn: 343402
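The alignment requirement described in this commit can be illustrated with a small rounding helper. This is a minimal host-side sketch, not the code from the patch: the names `alignUp` and `kStackAlignment` are hypothetical, and only the 8-byte figure comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constant: the 8-byte alignment the commit message says
// NVPTX requires for pointer locations.
constexpr std::size_t kStackAlignment = 8;

// Round `size` up to the next multiple of `alignment`.
// `alignment` must be a non-zero power of two.
inline std::size_t alignUp(std::size_t size,
                           std::size_t alignment = kStackAlignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (size + alignment - 1) & ~(alignment - 1);
}
```

Under this scheme a 4-byte request, like the one attributed to getId(), would occupy 8 bytes, so a following 16-byte frame would start at an 8-byte-aligned offset.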
* [libomptarget-nvptx] Fix ancestor_thread_num and team_size (non-SPMD) | Jonas Hahnfeld | 2018-09-30 | 2 | -11/+85
  According to OpenMP 4.5, p250:12-14: If the requested nest level is outside the range of 0 and the nest level of the current thread, as returned by the omp_get_level routine, the routine returns -1. The SPMD code path will need a similar fix.
  Differential Revision: https://reviews.llvm.org/D51787
  llvm-svn: 343401
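The range rule quoted from the standard can be sketched on the host as follows. This is an illustration only: `ancestorThreadNum` and the per-level vector are hypothetical and do not reflect how the device runtime stores its state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: threadNums[i] is the thread number of the ancestor at
// nest level i, so the current nest level is threadNums.size() - 1.
// Returns -1 when `level` is outside [0, current level], as the quoted
// passage of OpenMP 4.5 requires.
inline int ancestorThreadNum(const std::vector<int>& threadNums, int level) {
  assert(!threadNums.empty());
  const int currentLevel = static_cast<int>(threadNums.size()) - 1;
  if (level < 0 || level > currentLevel)
    return -1;
  return threadNums[static_cast<std::size_t>(level)];
}
```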
* [libomptarget-nvptx] Add tests for nested parallelism | Jonas Hahnfeld | 2018-09-29 | 2 | -0/+141
  Clang trunk will serialize nested parallel regions. Check that this is correctly reflected in various API methods.
  Differential Revision: https://reviews.llvm.org/D51786
  llvm-svn: 343382
* [libomptarget-nvptx] Ignore calls to dynamic API | Jonas Hahnfeld | 2018-09-29 | 3 | -37/+44
  There is no support and according to the OpenMP 4.5, p238:7-9: For implementations that do not support dynamic adjustment of the number of threads this routine has no effect: the value of dyn-var remains false. Add a test that cancellation and nested parallelism aren't supported either.
  Differential Revision: https://reviews.llvm.org/D51785
  llvm-svn: 343381
* [libomptarget-nvptx] Fix number of threads in parallel | Jonas Hahnfeld | 2018-09-29 | 3 | -84/+147
  If there is no num_threads() clause we must consider the nthreads-var ICV. Its value is set by omp_set_num_threads() and can be queried using omp_get_max_threads(). The rewritten code now closely resembles the algorithm given in the OpenMP standard.
  Differential Revision: https://reviews.llvm.org/D51783
  llvm-svn: 343380
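The fallback from the num_threads() clause to the nthreads-var ICV can be sketched as below. This is a deliberately simplified illustration, not the algorithm from the standard or the patch: the function name, the convention that 0 means "no clause", and the single `threadLimit` cap are all assumptions.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper. `numThreadsClause` is 0 when the parallel construct
// has no num_threads() clause; `nthreadsVar` models the nthreads-var ICV;
// `threadLimit` is an assumed upper bound on the team size.
inline int resolveNumThreads(int numThreadsClause, int nthreadsVar,
                             int threadLimit) {
  assert(numThreadsClause >= 0 && nthreadsVar > 0 && threadLimit > 0);
  const int requested = numThreadsClause > 0 ? numThreadsClause : nthreadsVar;
  return std::min(requested, threadLimit);
}
```

The algorithm in the OpenMP standard also takes dyn-var and nesting into account; those inputs are omitted here to keep the sketch short.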
* [libomptarget-nvptx] Add testing infrastructure | Jonas Hahnfeld | 2018-09-28 | 5 | -0/+191
  This patch also introduces testing for libomptarget-nvptx which has been missing until now. I propose to add tests for all bugs that are fixed in the future. The target check-libomptarget-nvptx is not run by default because
  - we can't determine if there is a GPU plugged into the system.
  - it will require the latest Clang compiler. Keeping compatibility with older releases would prevent testing newer code generation developed in trunk.
  Differential Revision: https://reviews.llvm.org/D51687
  llvm-svn: 343324
* [OpenMP][libomptarget] Set the frame pointer then test empty slot condition | Gheorghe-Teodor Bercea | 2018-09-25 | 1 | -3/+3
  Summary: NFC - just fixing a bug: the empty slot test was before the re-setting of the Stack pointer.
  Reviewers: ABataev, caomhin, Hahnfeld
  Reviewed By: ABataev
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D52122
  llvm-svn: 343006
* [OpenMP][libomptarget] Simplify warp master selection for data sharing | Gheorghe-Teodor Bercea | 2018-09-25 | 1 | -2/+2
  Summary: There is currently no supported situation where the warp master is not the first thread in the warp. This also avoids the device execution from hanging on Volta GPUs when ballot_sync is called by a number of threads that is less than the size of a warp.
  Reviewers: ABataev, caomhin, grokos
  Reviewed By: grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D50188
  llvm-svn: 342972
* [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime. | Alexey Bataev | 2018-09-21 | 6 | -1/+74
  Summary: We need the support for per-team shared variables to support codegen for lastprivates/reductions. Patch adds this support by using shared memory if the total size of the reductions/lastprivates is <= 128 bytes, then a pre-allocated buffer in global memory if the size is <= 4K bytes, or malloc/free otherwise.
  Reviewers: gtbercea, kkwli0, grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D51875
  llvm-svn: 342737
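The three-tier choice described in this commit can be sketched as a size-based selection. This is a host-side illustration under stated assumptions: `TeamStorage`, `TeamStorageLimits` and `chooseTeamStorage` are hypothetical names, and only the 128-byte and 4K-byte thresholds come from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical labels for the three storage tiers named in the commit.
enum class TeamStorage { SharedMemory, GlobalBuffer, Malloc };

// Thresholds taken from the commit message; the struct itself is assumed.
struct TeamStorageLimits {
  std::size_t sharedBytes = 128;
  std::size_t globalBufferBytes = 4 * 1024;
};

// Pick the cheapest tier whose limit covers `totalBytes`.
inline TeamStorage chooseTeamStorage(
    std::size_t totalBytes,
    const TeamStorageLimits& limits = TeamStorageLimits{}) {
  assert(limits.sharedBytes <= limits.globalBufferBytes);
  if (totalBytes <= limits.sharedBytes)
    return TeamStorage::SharedMemory;
  if (totalBytes <= limits.globalBufferBytes)
    return TeamStorage::GlobalBuffer;
  return TeamStorage::Malloc;
}
```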
* [libomptarget-nvptx] Remove last mentions of __kmpc_print_* | Jonas Hahnfeld | 2018-09-08 | 1 | -12/+0
  Their implementation was removed during review, delete their prototype declarations.
  llvm-svn: 341748
* [libomptarget][NVPTX] Drop dead code and data structures, NFCI. | Jonas Hahnfeld | 2018-09-04 | 8 | -189/+2
  * cg and HasCancel in WorkDescr were never read and can be removed.
  * This eliminates the last use of priv in ThreadPrivateContext.
  * CounterGroup is unused afterwards.
  * Remove duplicate external declares in omptarget-nvptx.cu that are already in the header omptarget-nvptx.h.
  Differential Revision: https://reviews.llvm.org/D51622
  llvm-svn: 341370
* [libomptarget][NVPTX] Fix __kmpc_spmd_kernel_deinit | Jonas Hahnfeld | 2018-09-03 | 1 | -1/+1
  If the runtime is uninitialized the master thread must Enqueue the state object, and ALL threads must return immediately. Found post-commit of https://reviews.llvm.org/D51222.
  llvm-svn: 341328
* [OPENMP][NVPTX] Replace assert() by ASSERT0() macro, NFC. | Alexey Bataev | 2018-08-29 | 9 | -64/+71
  Required to fix the buildbots.
  llvm-svn: 340956
* [OPENMP][NVPTX] Lightweight runtime support for SPMD mode. | Alexey Bataev | 2018-08-29 | 11 | -45/+263
  Summary: Implemented simple and lightweight runtime support for SPMD mode-based constructs. It adds support for L2 sequential parallelism without full runtime support. Also, patch fixes some use cases for uninitialized|lightweight runtime.
  Reviewers: grokos, kkwli0, Hahnfeld, gtbercea
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D51222
  llvm-svn: 340944
* [OPENMP, NVPTX] Fixed synchronization construct + code cleanup. | Alexey Bataev | 2018-07-23 | 4 | -54/+25
  Summary:
  1. Fixed internal problem in `__kmpc_barrier` function: SPMD mode synchronization function should be called only in L1 parallel level.
  2. Removed some extra code for synchronization inside of the code, used `__kmpc_barrier` instead.
  3. Some code cleanup.
  Reviewers: gtbercea, grokos
  Subscribers: openmp-commits
  Differential Revision: https://reviews.llvm.org/D49564
  llvm-svn: 337691
* [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode | Gheorghe-Teodor Bercea | 2018-07-13 | 3 | -105/+72
  Summary: This patch fixes the data sharing infrastructure to work for the SPMD and non-SPMD cases.
  Reviewers: ABataev, grokos, carlo.bertolli, caomhin
  Reviewed By: ABataev, grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D49204
  llvm-svn: 337013
* [OPENMP, NVPTX] Fix loop boundaries calculation for dynamic loops. | Alexey Bataev | 2018-07-12 | 4 | -16/+50
  Summary: Patch fixes the following problems.
  1. Removes unused functions from omptarget_nvptx_ThreadPrivateContext class + simplified data members.
  2. Fixed calculation of loop boundaries for dynamic loops with static scheduling.
  3. Introduced saving/restoring of the dynamic loop boundaries to support several nested parallel dynamic loops.
  Reviewers: grokos
  Subscribers: guansong, kkwli0, openmp-commits
  Differential Revision: https://reviews.llvm.org/D49241
  llvm-svn: 336915
* [OPENMP, NVPTX] Sync threads before start ordered loops. | Alexey Bataev | 2018-06-29 | 1 | -1/+6
  Summary: Threads must be synchronized before starting ordered construct.
  Reviewers: grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D48732
  llvm-svn: 335987
* [OPENMP, NVPTX] Fixes for NVPTX RTL | Alexey Bataev | 2018-06-25 | 3 | -32/+36
  Summary: Patch fixes several problems in the implementation of NVPTX RTL.
  1. Detection of the last iteration for loops with static scheduling, no chunks.
  2. Fixes reductions for the serialized parallel constructs.
  3. Fixes handling of the barriers.
  Reviewers: grokos
  Reviewed By: grokos
  Subscribers: Hahnfeld, guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D48480
  llvm-svn: 335469
* [OpenMP] [CUDA] Expose teamid to the debug path | Guansong Zhang | 2018-06-19 | 1 | -1/+1
  Summary: Small bug fix for the debug build. A previous fix caused trouble for the debug build.
  Reviewers: grokos
  Reviewed By: grokos
  Subscribers: openmp-commits
  Tags: #openmp
  Differential Revision: https://reviews.llvm.org/D48286
  llvm-svn: 335046
* [libomptarget-nvptx] loop: Determine if runtime uninitialized | Jonas Hahnfeld | 2018-05-25 | 1 | -38/+42
  The generic entry points for static loop scheduling previously hardcoded that the runtime was initialized. This can be wrong if the compiler analyzes that the runtime is not needed and calls the init functions accordingly. This didn't affect clang-ykt because they have entry points for different combinations of SPMD x Runtime not needed. I didn't do measurements yet but with inlining we might get away with always calling the generic interface and letting compiler and runtime figure out the rest. In any case, a correct runtime is always better than having functions that may only be called if previous calls passed in a specific set of arguments!
  Differential Revision: https://reviews.llvm.org/D47131
  llvm-svn: 333285
* [CMake] Unify install path for libraries | Jonas Hahnfeld | 2018-05-25 | 1 | -3/+3
  Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands. This also fixes installation of libomptarget-nvptx that previously didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX.
  Differential Revision: https://reviews.llvm.org/D47130
  llvm-svn: 333284
* [CUDA]Fix dynamic|guided scheduling. | George Rokos | 2018-05-24 | 1 | -57/+50
  The existing implementation of the dynamic scheduling breaks the contract introduced by the original openmp runtime and, thus, is incorrect. Patch fixes it and introduces correct dynamic scheduling model. Thanks to Alexey Bataev for submitting this patch.
  Differential Revision: https://reviews.llvm.org/D47333
  llvm-svn: 333225
* [libomptarget-nvptx] Test bitcode compiler flags and enable by default | Jonas Hahnfeld | 2018-05-16 | 1 | -103/+68
  Move all logic related to selecting the bitcode compiler and linker into a new file and dynamically test required compiler flags. This also adds -fcuda-rdc for Clang trunk as previously attempted in D44992 which fixes the build. As a result this change also enables building the library by default if all prerequisites are met.
  Differential Revision: https://reviews.llvm.org/D46901
  llvm-svn: 332494
* [OpenMP][libomptarget] Add function for checking SPMD mode | Gheorghe-Teodor Bercea | 2018-05-15 | 2 | -0/+8
  Summary: Add function to the NVPTX libomptarget library that will return true if the current target region is being executed in SPMD mode.
  Reviewers: ABataev, grokos, carlo.bertolli, caomhin
  Reviewed By: grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D46840
  llvm-svn: 332360
* [OpenMP] Use LIBOMPTARGET_DEVICE_RTL_DEBUG env var to control debug messages on the device side | Guansong Zhang | 2018-05-04 | 3 | -2/+21
  Summary: Enable the device side debug messages at compile time, use env var to control at runtime. To achieve this, an environment data block is passed to the device lib when it is loaded. By default, the messages are off; to enable them, a user needs to set LIBOMPDEVICE_DEBUG=1.
  Reviewers: grokos
  Reviewed By: grokos
  Subscribers: openmp-commits
  Tags: #openmp
  Differential Revision: https://reviews.llvm.org/D46210
  llvm-svn: 331550
* [OpenMP] Remove compilation warning when using clang to compile bc files. | Guansong Zhang | 2018-04-26 | 5 | -14/+16
  Summary: Minor printf format correction. NVCC ignores those. Clang will warn about them if debug is enabled.
  Reviewers: grokos
  Reviewed By: grokos
  Subscribers: openmp-commits
  Tags: #openmp
  Differential Revision: https://reviews.llvm.org/D45528
  llvm-svn: 330944
* [OpenMP] Make bc file compilation sensitive to LIBOMPTARGET_NVPTX_DEBUG flag | Guansong Zhang | 2018-04-20 | 1 | -1/+7
  Summary: The LIBOMPTARGET_NVPTX_DEBUG flag is inconsistent between using nvcc to generate the .a file and clang to generate the .bc file. Sync the two settings so we can get debug messages from the bc file path as well.
  Reviewers: grokos
  Subscribers: Hahnfeld, openmp-commits, mgorny
  Tags: #openmp
  Differential Revision: https://reviews.llvm.org/D45530
  llvm-svn: 330477
* [OpenMP] Remove extra warning when we build | Guansong Zhang | 2018-04-10 | 1 | -1/+1
  Summary: This one line change is to remove this warning message "warning: integer conversion resulted in a change of sign"
  Reviewers: grokos
  Reviewed By: grokos
  Subscribers: openmp-commits
  Tags: #openmp
  Differential Revision: https://reviews.llvm.org/D45415
  llvm-svn: 329713
* Revert "[OpenMP] enable bc file compilation using the latest clang" | Guansong Zhang | 2018-04-09 | 1 | -1/+0
  This reverts commit 6849e31c36d712d97433bca9af39b7a09c8c1207.
  llvm-svn: 329576
* [OpenMP] enable bc file compilation using the latest clang | Guansong Zhang | 2018-04-03 | 1 | -0/+1
  Summary: adding cuda-rdc flag to allow extern global data
  Reviewers: grokos
  Reviewed By: grokos
  Subscribers: gregrodgers, mgorny, openmp-commits
  Tags: #openmp
  Differential Revision: https://reviews.llvm.org/D44992
  llvm-svn: 329072
* [OpenMP][libomptarget] Initialize global memory stack only once. | Gheorghe-Teodor Bercea | 2018-03-21 | 2 | -3/+15
  Summary: The global stack initialization function may be called multiple times. The initialization of the shared memory slots should only happen when the function is called for the first time for a given warp master thread.
  Reviewers: grokos, carlo.bertolli, ABataev, caomhin
  Reviewed By: grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D44754
  llvm-svn: 328148
* [OpenMP][libomptarget] Fix master warp check | Gheorghe-Teodor Bercea | 2018-03-21 | 3 | -13/+24
  Summary: The check for the master warp must take into consideration the actual number of warps: the master warp is equal to the last active warp, not necessarily WARPSIZE - 1.
  Reviewers: grokos, carlo.bertolli, ABataev, caomhin
  Reviewed By: grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D44537
  llvm-svn: 328146
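The idea that the master warp is the last active warp can be sketched with plain integer arithmetic. This is a host-side illustration, not the device code from the patch: `masterWarpId` and `isInMasterWarp` are hypothetical names, and the default warp size of 32 is an assumption.

```cpp
#include <cassert>

// Index of the last warp that contains at least one active thread, given
// the number of threads in the block.
inline unsigned masterWarpId(unsigned numThreads, unsigned warpSize = 32) {
  assert(numThreads > 0 && warpSize > 0);
  return (numThreads - 1) / warpSize;
}

// True when `threadId` belongs to the last active warp.
inline bool isInMasterWarp(unsigned threadId, unsigned numThreads,
                           unsigned warpSize = 32) {
  assert(threadId < numThreads);
  return threadId / warpSize == masterWarpId(numThreads, warpSize);
}
```

With 96 threads, for instance, warp 2 is the master warp in this sketch, whereas a fixed constant derived from the warp size would point at a warp that does not exist in that launch.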
* [OpenMP][libomptarget] Enable globalization for workers | Gheorghe-Teodor Bercea | 2018-03-21 | 1 | -33/+35
  Summary: This patch allows workers to have a global memory stack managed by the runtime. This patch is needed for completeness and consistency with the globalization policy: if a worker-side variable escapes the current context it then needs to be globalized. Until now, only the master thread was allowed to have such a stack. These global values can now potentially be shared amongst workers if the semantics of the OpenMP program require it.
  Reviewers: ABataev, grokos, carlo.bertolli, caomhin
  Reviewed By: grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D44487
  llvm-svn: 328144
* [OpenMP][libomptarget] Enable usage of shared memory slots | Gheorghe-Teodor Bercea | 2018-03-15 | 1 | -15/+1
  Summary: Allow the runtime to use the existing shared memory statically allocated slots. When a variable is globalized, the underlying memory can be either shared or global memory (both have block-wide visibility). In this case, we allow the storage to use a limited amount of shared memory that has been statically allocated already. Only if shared memory doesn't prove to be enough do we then invoke malloc() to create a new global memory slot.
  Reviewers: ABataev, carlo.bertolli, grokos, caomhin
  Reviewed By: grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D44486
  llvm-svn: 327639
* [OpenMP][libomptarget] Enable multiple frames per global memory slot | Gheorghe-Teodor Bercea | 2018-03-15 | 3 | -47/+121
  Summary: To save on calls to malloc, this patch enables the re-use of pre-allocated global memory slots.
  Reviewers: ABataev, grokos, carlo.bertolli, caomhin
  Reviewed By: grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D44470
  llvm-svn: 327637
* [libomptarget][nvptx] Bug fix: Correctly identify the warp master active thread. | George Rokos | 2018-03-14 | 1 | -1/+2
  llvm-svn: 327556
* [OpenMP][libomptarget] Add global memory data sharing support for master-worker sharing. | Gheorghe-Teodor Bercea | 2018-03-13 | 5 | -0/+222
  Summary: This patch adds support for the sharing of variables from the master thread of a team to the worker threads of the team. The runtime uses a stack structure implemented as a doubly-linked list of slots with each slot having the exact same size as the size requested. This implementation leverages existing data structures. The runtime functions are added as separate functions to avoid interfering with the current interface. Limitations to be addressed in future patches:
  - This current patch only employs global memory. In a future patch we will enable the use of shared memory as an optimization.
  - Allow the allocation of several requested sizes in the same slot.
  Reviewers: ABataev, grokos, caomhin, carlo.bertolli
  Reviewed By: grokos
  Subscribers: Hahnfeld, guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D44260
  llvm-svn: 327440
* [OpenMP][libomptarget] Fix union. | Gheorghe-Teodor Bercea | 2018-03-08 | 2 | -45/+41
  Summary: To make the two parts of the union have the same size, the size of vect needs to be increased by 16 bits.
  Reviewers: grokos, carlo.bertolli, caomhin, ABataev
  Reviewed By: grokos, ABataev
  Subscribers: fedor.sergeev, guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D44254
  llvm-svn: 327040
* [OpenMP] Remove implicit data sharing using device shared memory from libomptarget | Gheorghe-Teodor Bercea | 2018-03-07 | 6 | -65/+1
  Summary: This patch reverts the changes to libomptarget that were coupled with the changes to Clang code gen for data sharing using shared memory. A similar patch exists for Clang: D43625. Shared memory is meant to be used as an optimization on top of a more general scheme. So far we didn't have a global memory implementation ready so shared memory was a solution which applied to the current level of OpenMP complexity supported by trunk on GPU devices (due to the missing NVPTX backend patch this functionality has never been exercised). Now that we have a global memory solution this patch is "in the way" and needs to be removed (for now). This patch (or an equivalent version of it) will be put out for review once the global memory scheme is in place.
  Reviewers: ABataev, grokos, carlo.bertolli, caomhin
  Reviewed By: grokos
  Subscribers: Hahnfeld, guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D43626
  llvm-svn: 326950
* [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining | Gheorghe-Teodor Bercea | 2018-02-12 | 1 | -39/+49
  Summary: Different NVIDIA GPUs support different compute capabilities. To enable the inlining of runtime functions and the best performance on different generations of NVIDIA GPUs, a bc library for each compute capability needs to be compiled. The same compiler build will then be usable in conjunction with multiple generations of NVIDIA GPUs. To differentiate between versions of the same bc lib, the output file name will contain the compute capability ID.
  Depends on D14254
  Reviewers: Hahnfeld, hfinkel, carlo.bertolli, caomhin, ABataev, grokos
  Reviewed By: Hahnfeld, grokos
  Subscribers: guansong, mgorny, openmp-commits
  Differential Revision: https://reviews.llvm.org/D41724
  llvm-svn: 324904
* [OpenMP][libomptarget] Add data sharing support in libomptarget | Gheorghe-Teodor Bercea | 2018-02-07 | 6 | -1/+65
  Summary: This patch extends the libomptarget functionality in patch D14254 with support for the data sharing scheme for supporting implicitly shared variables. The runtime therefore maintains a list of references to shared variables.
  Reviewers: carlo.bertolli, ABataev, Hahnfeld, grokos, caomhin, hfinkel
  Reviewed By: Hahnfeld, grokos
  Subscribers: guansong, llvm-commits, openmp-commits
  Differential Revision: https://reviews.llvm.org/D41485
  llvm-svn: 324495
* [OpenMP-RT] Fix debug string for NVPTX runtime library | Carlo Bertolli | 2018-02-01 | 1 | -1/+1
  https://reviews.llvm.org/D42757
  The method ThreadsInTeam is used to determine the number of threads to be used in a parallel region under SPMD mode (see line 127 of supporti.h in libomptarget/deviceRTLs/nvptx/src/). This patch fixes the corresponding debug print upon initialization of the kernel in SPMD mode.
  llvm-svn: 323978
* [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs. | George Rokos | 2018-01-29 | 25 | -0/+5861
  This patch implements the device runtime library whose interface is used in the code generation for OpenMP offloading devices. Currently there is a single device RTL, written in CUDA and meant for CUDA-enabled GPUs. The interface is a variation of the kmpc interface that includes some extra calls to do thread and storage management that only make sense for a GPU target.
  Differential revision: https://reviews.llvm.org/D14254
  llvm-svn: 323649