summaryrefslogtreecommitdiffstats
path: root/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h
Commit message (Collapse)AuthorAgeFilesLines
* [libomptarget][nfc] Change unintentional target_impl prefix to kmpc_implJon Chesterfield2019-12-301-3/+3
|
* [libomptarget][nfc] Provide target_impl malloc/freeJon Chesterfield2019-12-191-0/+6
| | | | | | | | | | | | | | | | | Summary: [libomptarget][nfc] Provide target_impl malloc/free Sufficient to build support.cu for amdgcn Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71685
* [libomptarget][nvptx] Fix build, second symbol reorderingJonChesterfield2019-12-191-9/+9
|
* [libomptarget][nvptx] Fix build, symbol ordering in target_impl.hJon Chesterfield2019-12-191-9/+9
|
* [libomptarget][nfc] Extract function from data_sharing, move to commonJonChesterfield2019-12-181-0/+9
| | | | | | | | | | | | | | | | | | Summary: [libomptarget][nfc] Extract function from data_sharing, move to common Finding the first active thread in the warp is different on nvptx and amdgcn, mostly due to warp size and the desire for efficiency. Reviewers: ABataev, jdoerfert, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71643
* [libomptarget][nfc] Move omp locks under target_implJonChesterfield2019-12-171-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [libomptarget][nfc] Move omp locks under target_impl These are likely to be target specific, even down to the lock_t which is correspondingly moved out of interface.h. The alternative is to include interface.h in target_impl which substantiatially increases the scope of those symbols. The current nvptx implementation deadlocks on amdgcn. The preferred implementation for that arch is still under discussion - this change leaves declarations in target_impl. The functions could be inline for nvptx. I'd prefer to keep the internals hidden in the target_impl translation unit, but will add the (possibly renamed) macros to target_impl.h if preferred. Reviewers: ABataev, jdoerfert, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71574
* [libomptarget][nfc] Move timer functions behind target_implJon Chesterfield2019-12-171-0/+11
| | | | | | | | | | | | | | Summary: [libomptarget][nfc] Move timer functions behind target_impl Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: jvesely, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71584
* [libomptarget][nfc] Wrap cuda min() in target_implJon Chesterfield2019-12-171-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | Summary: [libomptarget][nfc] Wrap cuda min() in target_impl nvptx forwards to cuda min, amdgcn implements directly. Sufficient to build parallel.cu for amdgcn, added to CMakeLists. All call sites are homogenous except one that passes a uint32_t and an int32_t. This could be smoothed over by taking two type parameters and some care over the return type, but overall I think the inline <uint32_t> calling attention to what was an implicit sign conversion is cleaner. Reviewers: ABataev, jdoerfert Reviewed By: jdoerfert Subscribers: jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71580
* Revert "Revert "[libomptarget] Move resource id functions into target ↵JonChesterfield2019-12-161-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | specific code, implement for amdgcn"" Summary: This reverts commit dd8a7fcdd73dd63529b81bf9f72c7529dfe99ec3. Alexey reports undefined symbols for the new inline functions defined in target_impl.h This does not reproduce for me for nvptx, or amdgcn, under release or debug builds. I believe the patch is fine, based on: - the semantics of an inline function in C++ (the cuda INLINE functions end up as linkonce_odr in IR), which are only legal to drop if they have no uses - the code generated from a debug build of clang 9 does not show these undef symbols - the tests pass - the code is trivial To progress from here I either need: - A tie break - someone to play the role of CI in determining whether the patch works - Alexey to provide sufficient information about his build for me to reproduce the failure - Alexey to debug why the symbols are disappearing for him and report back Reviewers: ABataev, jdoerfert, grokos Subscribers: jvesely, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71502
* Revert "[libomptarget] Move resource id functions into target specific code, ↵Alexey Bataev2019-12-131-6/+0
| | | | | | | implement for amdgcn" This reverts commit dbb3fec8adfc4ac3fbf31f51f294427dbabbebb2 since it breaks the NVPTX tests.
* [libomptarget] Move resource id functions into target specific code, ↵Jon Chesterfield2019-12-121-0/+6
| | | | | | | | | | | | | | | | implement for amdgcn Summary: [libomptarget] Move resource id functions into target specific code, implement for amdgcn Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71382
* [libomptarget][nfc] Move cuda threadfence functions behind kmpc_implJonChesterfield2019-12-061-0/+4
| | | | | | | | | | | | | | | | | Summary: [libomptarget][nfc] Move cuda threadfence functions behind kmpc_impl Part of building code under common/ without requiring a cuda compiler Reviewers: ABataev, jdoerfert, grokos Reviewed By: ABataev Subscribers: jvesely, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71102
* [libomptarget][nfc] Introduce SHARED, ALIGN macrosJon Chesterfield2019-12-051-0/+2
| | | | | | | | | | | | | | | | Summary: [libomptarget][nfc] Introduce SHARED, ALIGN macros Move remaining cuda attributes behind such macros Reviewers: ABataev, jdoerfert, grokos Reviewed By: ABataev Subscribers: openmp-commits, jvesely Tags: #openmp Differential Revision: https://reviews.llvm.org/D71076
* [libomptarget] Move supporti.h to support.cuJonChesterfield2019-11-131-2/+3
| | | | | | | | | | | | | | | | | Summary: [libomptarget] Move supporti.h to support.cu Reimplementation of D69652, without the unity build and refactors. Will need a clean build of libomptarget as the cmakelists changed. Reviewers: ABataev, jdoerfert Reviewed By: jdoerfert Subscribers: mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D70131
* [libomptarget] Revert all improvements to supportJon Chesterfield2019-11-061-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [libomptarget] Revert all improvements to support The change to unity build for nvcc has broken the build for some developers. This patch reverts to a known-working state. There has been some confusion over exactly how the build broke. I think we have reached a common understanding that the disappearing symbols are from the bitcode library built by clang. The static archive built by nvcc may show the same problem. Some of the confusion arose from building the deviceRTL twice and using one or the other library based on various environmental factors. I'm pretty sure the problem is clang expanding `__forceinline__` into both `__inline__` and `attribute(("always_inline"))`. The `__inline__` attribute resolves to linkonce_odr which is not safe for exporting symbols from translation units. "always_inline" is the desired semantic for small functions defined in one translation unit that are intended to be inlined at link time. "inline" is not. This therefore reintroduces the dependency hazard of supporti.h and some code duplication, and blocks progress separating deviceRTL into reusable components. See also D69857, D69859 for attempts at a fix instead of a revert. Reviewers: ABataev, jdoerfert, grokos, ikitayama, tianshilei1992 Reviewed By: ABataev Subscribers: mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69885
* [nfc][libomptarget] Reorganise support headerJonChesterfield2019-10-311-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [nfc][libomptarget] Reorganise support header All functions defined in support implementation are now declared in support.h Reordered functions in support implementation to match the sequence in support.h Added include guards to support.h Added #include interface to support.h to provide kmp_Ident declaration Move supporti.h to support.cu and s/INLINE/EXTERN/g Add remaining includes to support.cu A minor side effect is to change the name mangling of the support functions to extern "C". If this matters another macro along the lines of INLINE/EXTERN can be added - perhaps DEVICE as that's the obvious implementation. Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69652
* [nfc][libomptarget] Move named_sync() into target_implJon Chesterfield2019-10-301-0/+7
| | | | | | | | | | | | | | Summary: [nfc][libomptarget] Move named_sync() into target_impl Reviewers: ABataev, jdoerfert, grokos Reviewed By: ABataev Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69487
* [nfc][libomptarget] Move smid() into target_implJon Chesterfield2019-10-301-0/+6
| | | | | | | | | | | | | | Summary: [nfc][libomptarget] Move smid() into target_impl Reviewers: ABataev, jdoerfert, grokos Reviewed By: ABataev Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69485
* [LIBOMPTARGET]Fix build, NFC.Alexey Bataev2019-10-281-1/+1
| | | | | Need to include nvptx_interface.h in target_impl.h, otherwise the build is failed because of missing __kmpc_impl_lanemask_t type.
* [nfc][libomptarget] Inline option into target_implJon Chesterfield2019-10-271-1/+36
| | | | | | | | | | | | | | | | | Summary: [nfc][libomptarget] Inline option into target_impl Subset of D69423. The macros that were in option.h are all target dependent. Inlining the header simplifies the dependency graph when looking to move code into a common subdir. Reviewers: ABataev, jdoerfert, grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69472
* [NFC][libomptarget] move remaining device specific code out of omptarget-nvptx.hJon Chesterfield2019-10-251-0/+15
| | | | | | | | | | | | | | | | | | Summary: [NFC][libomptarget] move remaining device specific code out of omptarget-nvptx.h Strictly there is one remaining difference wrt amdgcn - parallelLevel is volatile qualified on amdgcn and not on nvptx. Determining whether this is correct - and how to represent the different semantics of 'volatile' under various conditions - is beyond the scope of this code motion patch. Reviewers: ABataev, jdoerfert, grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69424
* [libomptarget][nfc] Make interface.h target independentJon Chesterfield2019-10-151-1/+0
| | | | | | | | | | | | | | | | | | | | Summary: [libomptarget][nfc] Make interface.h target independent Move interface.h under a top level include directory. Remove #includes to avoid the interface depending on the implementation. Reviewers: ABataev, jdoerfert, grokos, ronlieb, RaviNarayanaswamy Reviewed By: jdoerfert Subscribers: mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D68615 llvm-svn: 374919
* Use named constant to indicate all lanes, to handle 32 and 64 wide architecturesJon Chesterfield2019-10-041-0/+2
| | | | | | | | | | | | | | | | Summary: Use named constant to indicate all lanes, to handle 32 and 64 wide architectures Reviewers: ABataev, jdoerfert, grokos, ronlieb Reviewed By: grokos Subscribers: ronlieb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D68369 llvm-svn: 373793
* [libomptarget] Refactor activemask macro to inline functionJon Chesterfield2019-09-031-0/+10
| | | | | | | | | | | | | | | | | | Summary: [libomptarget] Refactor activemask macro to inline function See also abandoned D66846, split into this diff and others. Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers Reviewed By: jdoerfert, ABataev Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D66851 llvm-svn: 370781
* Use target_impl functions to replace more inline asmJon Chesterfield2019-08-281-5/+11
| | | | | | | | | | | | | | | | | Summary: Use target_impl functions to replace more inline asm Follow on from D65836. Removes remaining asm shuffles and lanemask accessors Also changes the types of target_impl bitwise functions to unsigned. Reviewers: jdoerfert, ABataev, grokos, Hahnfeld, gregrodgers, ronlieb, hfinkel Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D66809 llvm-svn: 370216
* [libomptarget] Refactor syncthreads macro to inline functionJon Chesterfield2019-08-281-0/+9
| | | | | | | | | | | | | | | | | | Summary: [libomptarget] Refactor syncthreads macro to inline function See also abandoned D66846, split into this diff and others. Rev 2 of D66855 Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D66861 llvm-svn: 370210
* [libomptarget] Refactor syncwarp macro to inline functionJon Chesterfield2019-08-281-1/+7
| | | | | | | | | | | | | | | | Summary: [libomptarget] Refactor syncwarp macro to inline function See also abandoned D66846, split into this diff and others. Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D66857 llvm-svn: 370149
* Fix build break due to close brace lost in mergeJon Chesterfield2019-08-281-0/+1
| | | | llvm-svn: 370148
* [libomptarget] Refactor shfl_down_sync macro to inline functionJon Chesterfield2019-08-281-0/+10
| | | | | | | | | | | | | | | | Summary: [libomptarget] Refactor shfl_down_sync macro to inline function See also abandoned D66846, split into this diff and others. Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D66853 llvm-svn: 370146
* [libomptarget] Refactor shfl_sync macro to inline functionJon Chesterfield2019-08-281-0/+14
| | | | | | | | | | | | | | | | Summary: [libomptarget] Refactor shfl_sync macro to inline function See also abandoned D66846, split into this diff and others. Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D66852 llvm-svn: 370144
* [OPENMP][NVPTX]Add __kmpc_syncwarp(int32_t) function.Alexey Bataev2019-08-261-0/+2
| | | | | | | | | | | | | | | | | | | Summary: Added function void __kmpc_syncwarp(int32_t) to expose it to the compiler. It is required to fix the problem with the critical regions in Cuda9.0+. We cannot use barrier in the critical region, but still need to reconverge the threads in the warp after. This function allows to do this. Reviewers: grokos, jdoerfert Subscribers: guansong, openmp-commits, kkwli0, caomhin Tags: #openmp Differential Revision: https://reviews.llvm.org/D66672 llvm-svn: 369933
* Factor architecture dependent code out of loop.cuJon Chesterfield2019-08-131-0/+41
Summary: [libomptarget] Factor architecture dependent code out of loop.cu Related to the patch series starting D64217. Added subscribers to said series as reviewers. This effort is smaller in scope. This patch factors out just enough architecture dependent code from loop.cu to allow the same source to be used with amdgcn, given a different target_impl.h. Testing is that the same bitcode (modulo variable names) is generated for libomptarget before and after the refactor, for nvptx and the out of tree amdgcn. Reviewers: jdoerfert, ABataev, bollu, jfb, tra, grokos, Hahnfeld, guansong, xtian, gregrodgers, ronlieb, hfinkel, gtbercea, guraypp, arpith-jacob Reviewed By: jdoerfert, ABataev Subscribers: dexonsmith, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D65836 llvm-svn: 368751
OpenPOWER on IntegriCloud