author | Arpith Chacko Jacob <acjacob@us.ibm.com> | 2017-02-16 14:03:36 +0000 |
---|---|---|
committer | Arpith Chacko Jacob <acjacob@us.ibm.com> | 2017-02-16 14:03:36 +0000 |
commit | 8e170fc8573f2091b1443484bcee5a76083cf710 (patch) | |
tree | bd4cddcbd7fa78266f5d8a49fe84110151c4c789 /clang/lib/CodeGen/CGOpenMPRuntime.h | |
parent | 3e81c2675e8d4abdfbfe179866dcf03cd5e51398 (diff) | |
[OpenMP] Parallel reduction on the NVPTX device.
This patch implements codegen for the reduction clause on
any parallel construct for elementary data types. An efficient
implementation requires hierarchical reduction within a
warp and a threadblock. It is complicated by the fact that
variables declared on the stack of a CUDA thread cannot be
shared with other threads.
The patch creates a struct to hold reduction variables and
a number of helper functions. The OpenMP runtime on the GPU
implements reduction algorithms that use these helper
functions to perform reductions within a team. Variables are
shared between CUDA threads using shuffle intrinsics.
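
The warp-level step described above can be modeled on the host. The following is a minimal sketch, not the code from this patch: `Warp`, `shuffleDown`, and `warpReduceSum` are hypothetical names, and a plain array stands in for the per-thread registers that a real shuffle intrinsic would exchange. It assumes a 32-lane warp and an integer sum.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical host-side model of one 32-lane warp: element i stands in for
// a value held privately by CUDA thread (lane) i.
constexpr std::size_t kWarpSize = 32;
using Warp = std::array<int, kWarpSize>;

// Models a shuffle-down: lane i reads the value held by lane i + offset.
// Lanes whose source lane would fall outside the warp keep their own value.
inline Warp shuffleDown(const Warp &lanes, std::size_t offset) {
  assert(offset > 0 && offset < kWarpSize);
  Warp out = lanes;
  for (std::size_t i = 0; i + offset < kWarpSize; ++i)
    out[i] = lanes[i + offset];
  return out;
}

// Tree reduction: the offset halves each step (16, 8, 4, 2, 1), so after
// five steps lane 0 holds the sum of all 32 lanes. Upper lanes end up with
// partial values that are simply ignored.
inline int warpReduceSum(Warp lanes) {
  for (std::size_t offset = kWarpSize / 2; offset > 0; offset /= 2) {
    const Warp remote = shuffleDown(lanes, offset);
    for (std::size_t i = 0; i < kWarpSize; ++i)
      lanes[i] += remote[i];
  }
  return lanes[0];
}
```

No lane ever reads another lane's stack memory here; values move only through the modeled shuffle, which is the constraint the commit message points out.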
An implementation of reductions on the NVPTX device is
substantially different from that on CPUs. However, this patch
is written so that there are minimal changes to the rest of
OpenMP codegen.
The implemented design allows the compiler and runtime to be
decoupled, i.e., the runtime does not need to know of the
reduction operation(s), the type of the reduction variable(s),
or the number of reductions. The design also allows reuse of
host codegen, with appropriate specialization for the NVPTX
device.
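
One way to picture this decoupling is a runtime routine that sees only opaque pointers and a callback. This is a hedged sketch under assumptions, not the interface of the GPU OpenMP runtime: `ReduceList`, `combineSumAndMax`, and `teamReduce` are hypothetical names, and the struct holds values directly for brevity, whereas the patch describes an array type containing pointers to the reduction variables.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical struct the compiler might generate for two reduction
// variables: reduction(+ : sum) and reduction(max : maxValue).
struct ReduceList {
  int sum;
  double maxValue;
};

// Type-erased combiner signature: the only thing the "runtime" knows.
using ReduceFn = void (*)(void *lhs, void *rhs);

// "Compiler-generated" helper: knows the types, the operations, and how
// many reductions there are, and folds rhs into lhs.
inline void combineSumAndMax(void *lhs, void *rhs) {
  auto *l = static_cast<ReduceList *>(lhs);
  const auto *r = static_cast<const ReduceList *>(rhs);
  l->sum += r->sum;
  l->maxValue = std::max(l->maxValue, r->maxValue);
}

// "Runtime" routine: pairwise tree combine over opaque per-thread lists.
// It never inspects the data; the result ends up in lists[0].
inline void teamReduce(const std::vector<void *> &lists, ReduceFn combine) {
  assert(!lists.empty() && combine != nullptr);
  for (std::size_t stride = 1; stride < lists.size(); stride *= 2)
    for (std::size_t i = 0; i + stride < lists.size(); i += 2 * stride)
      combine(lists[i], lists[i + stride]);
}
```

Adding a third reduction variable or changing an operator would touch only `ReduceList` and `combineSumAndMax` in this sketch; `teamReduce` stays unchanged.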
While the patch does introduce a number of abstractions, the
expected use case calls for inlining of the GPU OpenMP runtime.
After inlining and optimizations in LLVM, these abstractions
are unwound and performance of OpenMP reductions is comparable
to CUDA-canonical code.
Patch by Tian Jin in collaboration with Arpith Jacob
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29758
llvm-svn: 295319
Diffstat (limited to 'clang/lib/CodeGen/CGOpenMPRuntime.h')
-rw-r--r-- | clang/lib/CodeGen/CGOpenMPRuntime.h | 36 |
1 files changed, 33 insertions, 3 deletions
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h
index ee8c4da6ed7..dc62f2b6d55 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -893,6 +893,32 @@ public:
                                     OpenMPDirectiveKind InnermostKind,
                                     const RegionCodeGenTy &CodeGen,
                                     bool HasCancel = false);
+
+  /// Emits reduction function.
+  /// \param ArgsType Array type containing pointers to reduction variables.
+  /// \param Privates List of private copies for original reduction arguments.
+  /// \param LHSExprs List of LHS in \a ReductionOps reduction operations.
+  /// \param RHSExprs List of RHS in \a ReductionOps reduction operations.
+  /// \param ReductionOps List of reduction operations in form 'LHS binop RHS'
+  /// or 'operator binop(LHS, RHS)'.
+  llvm::Value *emitReductionFunction(CodeGenModule &CGM, llvm::Type *ArgsType,
+                                     ArrayRef<const Expr *> Privates,
+                                     ArrayRef<const Expr *> LHSExprs,
+                                     ArrayRef<const Expr *> RHSExprs,
+                                     ArrayRef<const Expr *> ReductionOps);
+
+  /// Emits single reduction combiner
+  void emitSingleReductionCombiner(CodeGenFunction &CGF,
+                                   const Expr *ReductionOp,
+                                   const Expr *PrivateRef,
+                                   const DeclRefExpr *LHS,
+                                   const DeclRefExpr *RHS);
+
+  struct ReductionOptionsTy {
+    bool WithNowait;
+    bool SimpleReduction;
+    OpenMPDirectiveKind ReductionKind;
+  };
   /// \brief Emit a code for reduction clause. Next code should be emitted for
   /// reduction:
   /// \code
@@ -929,14 +955,18 @@ public:
   /// \param RHSExprs List of RHS in \a ReductionOps reduction operations.
   /// \param ReductionOps List of reduction operations in form 'LHS binop RHS'
   /// or 'operator binop(LHS, RHS)'.
-  /// \param WithNowait true if parent directive has also nowait clause, false
-  /// otherwise.
+  /// \param Options List of options for reduction codegen:
+  ///     WithNowait true if parent directive has also nowait clause, false
+  ///     otherwise.
+  ///     SimpleReduction Emit reduction operation only. Used for omp simd
+  ///     directive on the host.
+  ///     ReductionKind The kind of reduction to perform.
   virtual void emitReduction(CodeGenFunction &CGF, SourceLocation Loc,
                              ArrayRef<const Expr *> Privates,
                              ArrayRef<const Expr *> LHSExprs,
                              ArrayRef<const Expr *> RHSExprs,
                              ArrayRef<const Expr *> ReductionOps,
-                             bool WithNowait, bool SimpleReduction);
+                             ReductionOptionsTy Options);
 
   /// \brief Emit code for 'taskwait' directive.
   virtual void emitTaskwaitCall(CodeGenFunction &CGF, SourceLocation Loc);
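
The signature change from two adjacent `bool` parameters to `ReductionOptionsTy` can be illustrated in isolation. This is a sketch under assumptions: the enum below is a two-value stand-in for clang's `OpenMPDirectiveKind`, and `needsRuntimeCall` is a hypothetical consumer that is not part of the patch. Only the struct layout is taken from the diff.

```cpp
#include <cassert>

// Stand-in for clang's OpenMPDirectiveKind with two illustrative values;
// the real enum lives in clang and has many more enumerators.
enum OpenMPDirectiveKind { OMPD_parallel, OMPD_simd };

// Same fields, in the same order, as the struct added by the patch.
struct ReductionOptionsTy {
  bool WithNowait;
  bool SimpleReduction;
  OpenMPDirectiveKind ReductionKind;
};

// Hypothetical consumer (assumption): a simple reduction emits only the
// reduction operation, so this sketch treats it as needing no runtime call.
inline bool needsRuntimeCall(const ReductionOptionsTy &Options) {
  return !Options.SimpleReduction;
}
```

A call site can then write `{/*WithNowait=*/true, /*SimpleReduction=*/false, OMPD_parallel}`, which is harder to get wrong than two positional `bool` arguments and leaves room for further options without another signature change.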