| author | Justin Lebar <jlebar@google.com> | 2016-02-24 21:55:11 +0000 |
|---|---|---|
| committer | Justin Lebar <jlebar@google.com> | 2016-02-24 21:55:11 +0000 |
| commit | ddd97faeecce050402ae5b068ee070132254f5f9 (patch) | |
| tree | 4f288415b1bfe6498a3d13543ef923fcd389665a /clang/lib | |
| parent | 1ef9b592846ccfad78c8d904428ba2500c169dc5 (diff) | |
| download | bcm5719-llvm-ddd97faeecce050402ae5b068ee070132254f5f9.tar.gz bcm5719-llvm-ddd97faeecce050402ae5b068ee070132254f5f9.zip | |
[CUDA] Mark all CUDA device-side function defs, decls, and calls as convergent.
Summary:
This is important for e.g. the following case:
```cpp
void sync() { __syncthreads(); }
void foo() {
  do_something();
  sync();
  do_something_else();
}
```
Without this change, if the optimizer does not inline sync() (which it
won't because __syncthreads is also marked as noduplicate, for now
anyway), it is free to perform optimizations on sync() that it would not
be able to perform on __syncthreads(), because sync() is not marked as
convergent.
Similarly, we need a notion of convergent calls, since in the case when
we can't statically determine a call's target(s), we need to know
whether it's safe to perform optimizations around the call.
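The hazard can be made concrete with a sketch (illustrative only, not part of the patch; `bar` and the predicate `p` are hypothetical names):

```cuda
__device__ void sync() { __syncthreads(); }

__device__ void bar(bool p) {
  sync();          // every thread in the block reaches this barrier together
  if (p) {
    // divergent work
  }
}

// Without `convergent` on sync(), a transform that is legal for ordinary
// calls could sink the call into the branch:
//
//   if (p) { sync(); ... }
//
// That version deadlocks whenever threads in a block disagree on p, because
// only some of them ever reach the barrier. Marking sync() convergent
// forbids introducing such new control dependences around the call.
```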
This change is conservative; the optimizer will remove these attrs where
it can, see r260318, r260319.
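With the patch applied, device-side function definitions and calls carry the attribute in the emitted IR, roughly along these lines (a hand-written sketch; the mangled name and attribute group number are illustrative):

```llvm
; Device-side sync() lowered for NVPTX; #0 now includes `convergent`
; in addition to the pre-existing `noduplicate` on the barrier.
define void @_Z4syncv() #0 {
entry:
  call void @llvm.nvvm.barrier0()
  ret void
}

declare void @llvm.nvvm.barrier0()

attributes #0 = { convergent }
```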
Reviewers: majnemer
Subscribers: cfe-commits, jhen, echristo, tra
Differential Revision: http://reviews.llvm.org/D17056
llvm-svn: 261779
Diffstat (limited to 'clang/lib')
| -rw-r--r-- | clang/lib/CodeGen/CGCall.cpp | 8 |
1 file changed, 8 insertions, 0 deletions
```diff
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 4e9ea3a129e..932b53e44dc 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1595,6 +1595,14 @@ void CodeGenModule::ConstructAttributeList(
     }
   }
 
+  if (getLangOpts().CUDA && getLangOpts().CUDAIsDevice) {
+    // Conservatively, mark all functions and calls in CUDA as convergent
+    // (meaning, they may call an intrinsically convergent op, such as
+    // __syncthreads(), and so can't have certain optimizations applied around
+    // them). LLVM will remove this attribute where it safely can.
+    FuncAttrs.addAttribute(llvm::Attribute::Convergent);
+  }
+
   ClangToLLVMArgMapping IRFunctionArgs(getContext(), FI);
   QualType RetTy = FI.getReturnType();
```

