path: root/llvm/lib/Target/ARM/ARMFrameLowering.cpp
author: Jingyue Wu <jingyue@google.com>, 2015-07-20 21:28:54 +0000
committer: Jingyue Wu <jingyue@google.com>, 2015-07-20 21:28:54 +0000
commit: 48a9bdc6aa0385f0773445b9561e12456983c8da (patch)
tree: 9c563c02bf738ddda792e71f0a44d149bfa11ecd /llvm/lib/Target/ARM/ARMFrameLowering.cpp
parent: b68a16c47c8eab2411b0b0257d4c147e8d94a559 (diff)
[NVPTX] make load on global readonly memory to use ldg
Summary:
As described in [1], nvcc may use ld.global.nc to load memory when __restrict__ is used and the compiler can detect that the read-only data cache is safe to use. This patch checks whether ldg is safe to use and, where it is, uses it in place of ld.global.

This change improves performance by 18~29% on the affected kernels (ratt*_kernel and rwdot*_kernel) in the S3D benchmark of shoc [2].

Patched by Xuetian Weng.

[1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache
[2] https://github.com/vetter/shoc

Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll

Reviewers: jholewinski, jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11314

llvm-svn: 242713
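The safety check the summary mentions can be pictured with a small toy model. This is only a sketch under stated assumptions, not the code from the patch: the `LoadInfo` struct, its fields, and the functions `canUseLdg` and `globalLoadMnemonic` are hypothetical names invented for illustration. The conditions are one plausible reading of "read-only data cache is safe to use": the pointer comes from a kernel parameter that is both read-only and non-aliasing, and the target supports the instruction.

```cpp
#include <string>

// Hypothetical, simplified description of a load from global memory.
// None of these names come from LLVM or from the patch itself.
struct LoadInfo {
    bool fromKernelArg;  // pointer traces back to a kernel parameter
    bool argIsReadOnly;  // the kernel never writes through that parameter
    bool argIsNoAlias;   // parameter is marked __restrict__ (noalias)
};

// Assumed rule: the non-coherent load is only considered safe when the
// target supports it and the memory cannot change while the kernel runs.
bool canUseLdg(const LoadInfo &load, bool targetHasLdg) {
    if (!targetHasLdg)
        return false;
    return load.fromKernelArg && load.argIsReadOnly && load.argIsNoAlias;
}

// Pick the PTX mnemonic for a load already known to be in the global space.
std::string globalLoadMnemonic(const LoadInfo &load, bool targetHasLdg) {
    return canUseLdg(load, targetHasLdg) ? "ld.global.nc" : "ld.global";
}
```

In this model, dropping any one of the three properties, or running on a target without the instruction, falls back to the ordinary ld.global, so the rewrite never changes behavior for loads whose memory might be written during the kernel.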
Diffstat (limited to 'llvm/lib/Target/ARM/ARMFrameLowering.cpp')
0 files changed, 0 insertions, 0 deletions