summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen/CodeGenFunction.h
diff options
context:
space:
mode:
authorArpith Chacko Jacob <acjacob@us.ibm.com>2017-02-16 16:20:16 +0000
committerArpith Chacko Jacob <acjacob@us.ibm.com>2017-02-16 16:20:16 +0000
commit101e8fb1f3b722f6f545cc0755c0186584aa5ab4 (patch)
treed7a52b3d5c14e603e9777d2d91d43d1dc0337dde /clang/lib/CodeGen/CodeGenFunction.h
parent505ac8dc412aaedb8f53c7066e5d931e4f6dd073 (diff)
downloadbcm5719-llvm-101e8fb1f3b722f6f545cc0755c0186584aa5ab4.tar.gz
bcm5719-llvm-101e8fb1f3b722f6f545cc0755c0186584aa5ab4.zip
[OpenMP] Parallel reduction on the NVPTX device.
This patch implements codegen for the reduction clause on any parallel construct for elementary data types. An efficient implementation requires hierarchical reduction within a warp and a threadblock. It is complicated by the fact that variables declared in the stack of a CUDA thread cannot be shared with other threads. The patch creates a struct to hold reduction variables and a number of helper functions. The OpenMP runtime on the GPU implements reduction algorithms that uses these helper functions to perform reductions within a team. Variables are shared between CUDA threads using shuffle intrinsics. An implementation of reductions on the NVPTX device is substantially different to that of CPUs. However, this patch is written so that there are minimal changes to the rest of OpenMP codegen. The implemented design allows the compiler and runtime to be decoupled, i.e., the runtime does not need to know of the reduction operation(s), the type of the reduction variable(s), or the number of reductions. The design also allows reuse of host codegen, with appropriate specialization for the NVPTX device. While the patch does introduce a number of abstractions, the expected use case calls for inlining of the GPU OpenMP runtime. After inlining and optimizations in LLVM, these abstractions are unwound and performance of OpenMP reductions is comparable to CUDA-canonical code. Patch by Tian Jin in collaboration with Arpith Jacob Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29758 llvm-svn: 295333
Diffstat (limited to 'clang/lib/CodeGen/CodeGenFunction.h')
-rw-r--r--clang/lib/CodeGen/CodeGenFunction.h4
1 files changed, 3 insertions, 1 deletions
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 098767e23e7..b830df73ca7 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -2638,7 +2638,9 @@ public:
/// the end of the directive.
///
/// \param D Directive that has at least one 'reduction' directives.
- void EmitOMPReductionClauseFinal(const OMPExecutableDirective &D);
+ /// \param ReductionKind The kind of reduction to perform.
+ void EmitOMPReductionClauseFinal(const OMPExecutableDirective &D,
+ const OpenMPDirectiveKind ReductionKind);
/// \brief Emit initial code for linear variables. Creates private copies
/// and initializes them with the values according to OpenMP standard.
///
OpenPOWER on IntegriCloud