author Arpith Chacko Jacob <acjacob@us.ibm.com> 2017-02-16 16:20:16 +0000
committer Arpith Chacko Jacob <acjacob@us.ibm.com> 2017-02-16 16:20:16 +0000
commit 101e8fb1f3b722f6f545cc0755c0186584aa5ab4
tree d7a52b3d5c14e603e9777d2d91d43d1dc0337dde /llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
parent 505ac8dc412aaedb8f53c7066e5d931e4f6dd073
[OpenMP] Parallel reduction on the NVPTX device.
This patch implements codegen for the reduction clause on any parallel construct for elementary data types. An efficient implementation requires hierarchical reduction within a warp and a threadblock. It is complicated by the fact that variables declared on the stack of a CUDA thread cannot be shared with other threads.

The patch creates a struct to hold reduction variables and a number of helper functions. The OpenMP runtime on the GPU implements reduction algorithms that use these helper functions to perform reductions within a team. Variables are shared between CUDA threads using shuffle intrinsics.

An implementation of reductions on the NVPTX device is substantially different from that on CPUs. However, this patch is written so that there are minimal changes to the rest of OpenMP codegen.

The implemented design allows the compiler and runtime to be decoupled, i.e., the runtime does not need to know of the reduction operation(s), the type of the reduction variable(s), or the number of reductions. The design also allows reuse of host codegen, with appropriate specialization for the NVPTX device.

While the patch does introduce a number of abstractions, the expected use case calls for inlining of the GPU OpenMP runtime. After inlining and optimizations in LLVM, these abstractions are unwound and performance of OpenMP reductions is comparable to CUDA-canonical code.

Patch by Tian Jin in collaboration with Arpith Jacob

Reviewers: ABataev

Differential Revision: https://reviews.llvm.org/D29758

llvm-svn: 295333
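
The decoupling described in the commit message can be illustrated with a small host-side C++ sketch. This is not the code from the patch or the actual runtime interface: `ReduceData`, `ShuffleReduceFn`, `warpReduce`, `shuffleAndReduce`, and `reduceExample` are hypothetical names, and an array of per-lane structs stands in for the registers that a real warp would exchange through shuffle intrinsics. It only shows how a generic tree reduction can run without knowing the reduction operations, the variable types, or how many reductions there are.

```cpp
#include <algorithm>
#include <array>

// Assumption: 32 lanes per warp, as on current NVIDIA GPUs.
constexpr int kWarpSize = 32;

// Hypothetical struct holding one thread's reduction variables
// (here: a sum reduction and a max reduction).
struct ReduceData {
  int sum;
  float maxv;
};

// Hypothetical helper signature. The "compiler" emits a function of this
// shape; the "runtime" calls it through a type-erased pointer.
using ShuffleReduceFn = void (*)(void *lanes, int lane, int offset);

// "Runtime" side: a generic tree reduction over one warp. It never sees
// ReduceData, the operators, or the number of reduction variables.
void warpReduce(void *lanes, ShuffleReduceFn fn) {
  for (int offset = kWarpSize / 2; offset > 0; offset /= 2)
    for (int lane = 0; lane < offset; ++lane)
      fn(lanes, lane, offset);
}

// "Compiler" side: fetch the data of lane + offset (a plain array read here,
// standing in for a shuffle-down) and combine it into the local copy.
void shuffleAndReduce(void *lanes, int lane, int offset) {
  auto *data = static_cast<ReduceData *>(lanes);
  const ReduceData remote = data[lane + offset];
  data[lane].sum += remote.sum;
  data[lane].maxv = std::max(data[lane].maxv, remote.maxv);
}

// Example driver: lane i contributes sum = i + 1 and maxv = i % 7.
// After the reduction, lane 0 holds the combined result.
ReduceData reduceExample() {
  std::array<ReduceData, kWarpSize> lanes;
  for (int i = 0; i < kWarpSize; ++i)
    lanes[i] = {i + 1, static_cast<float>(i % 7)};
  warpReduce(lanes.data(), shuffleAndReduce);
  return lanes[0];
}
```

Calling `reduceExample()` leaves the warp-wide sum and maximum in lane 0. Because `warpReduce` only takes a `void *` and a function pointer, adding a third reduction variable would change `ReduceData` and `shuffleAndReduce` but not the runtime loop, which is the property the commit message attributes to its design.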
Diffstat (limited to 'llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp')
0 files changed, 0 insertions, 0 deletions