path: root/clang/lib/CodeGen/CGCXXABI.cpp
author: Justin Lebar <jlebar@google.com> 2016-05-19 22:49:13 +0000
committer: Justin Lebar <jlebar@google.com> 2016-05-19 22:49:13 +0000
commit: 2e4ecfdebe8fa73ab4ed6f738307339ee9586418 (patch)
tree: a8eb6c205f01da79d919bb3f573cb92f11100a40 /clang/lib/CodeGen/CGCXXABI.cpp
parent: b926bdac4c18e0f31d827dec482f207856e88e1e (diff)
[CUDA] Implement __ldg using intrinsics.

Summary:
Previously it was implemented as inline asm in the CUDA headers. This change allows us to use the [addr+imm] addressing mode when executing ld.global.nc instructions. This translates into a 1.3x speedup on some benchmarks that call this instruction from within an unrolled loop.

Reviewers: tra, rsmith

Subscribers: jhen, cfe-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D19990

llvm-svn: 270150
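
For illustration only, here is a minimal sketch of the kind of code the summary describes: `__ldg` called from within an unrolled loop. The kernel name `sumChunks`, the constant `kElems`, and the host driver are hypothetical and not taken from this commit or its benchmarks. After unrolling, the loads read from a common base pointer at fixed offsets, which is presumably the pattern that benefits from the [addr+imm] addressing mode.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical example: each thread sums kElems consecutive floats.
constexpr int kElems = 4;

__global__ void sumChunks(const float* in, float* out, int numChunks) {
  int chunk = blockIdx.x * blockDim.x + threadIdx.x;
  if (chunk >= numChunks) return;

  const float* p = in + chunk * kElems;  // common base address for this thread
  float acc = 0.0f;
#pragma unroll
  for (int k = 0; k < kElems; ++k) {
    // __ldg reads through the read-only data cache (ld.global.nc).
    // It requires compute capability 3.5 or newer.
    acc += __ldg(p + k);
  }
  out[chunk] = acc;
}

int main() {
  const int numChunks = 256;
  const int n = numChunks * kElems;

  float* in = nullptr;
  float* out = nullptr;
  if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess ||
      cudaMallocManaged(&out, numChunks * sizeof(float)) != cudaSuccess) {
    fprintf(stderr, "allocation failed\n");
    return 1;
  }
  for (int i = 0; i < n; ++i) in[i] = 1.0f;

  const int threads = 128;
  const int blocks = (numChunks + threads - 1) / threads;
  sumChunks<<<blocks, threads>>>(in, out, numChunks);
  if (cudaDeviceSynchronize() != cudaSuccess) {
    fprintf(stderr, "kernel failed\n");
    return 1;
  }

  printf("out[0] = %f\n", out[0]);
  cudaFree(in);
  cudaFree(out);
  return 0;
}
```

Compiling for sm_35 or newer and inspecting the generated PTX is one way to check whether the loads share a base register with immediate offsets; the exact output depends on the compiler version and flags.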
Diffstat (limited to 'clang/lib/CodeGen/CGCXXABI.cpp')
0 files changed, 0 insertions, 0 deletions