author | Justin Holewinski <jholewinski@nvidia.com> | 2014-07-16 19:45:35 +0000 |
---|---|---|
committer | Justin Holewinski <jholewinski@nvidia.com> | 2014-07-16 19:45:35 +0000 |
commit | ac451066f48820a0be4bccba0a64b7c2e2dc0c35 (patch) | |
tree | 9f5cd54b26dbc5609bb35425a60f23ca5fb76bcb /clang/lib/Parse/ParseDecl.cpp | |
parent | a2e5deb86dd5279a806876a31332666f338aba55 (diff) | |
[NVPTX] Honor alignment on vector loads/stores
We were not considering the stated alignment on vector loads/stores,
leading us to generate vector instructions even when we do not have
sufficient alignment.
Now, for IR like:
  %1 = load <4 x float>, <4 x float>* %ptr, align 4
we will generate correct, conservative PTX like:
  ld.f32 ... [%ptr]
  ld.f32 ... [%ptr+4]
  ld.f32 ... [%ptr+8]
  ld.f32 ... [%ptr+12]
Or if we have an alignment of 8 (for example), we can
generate code like:
  ld.v2.f32 ... [%ptr]
  ld.v2.f32 ... [%ptr+8]
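The splitting rule illustrated above can be sketched as a small standalone C++ program. This is not the NVPTX backend code from this commit. `LoadPiece`, `splitVectorLoad`, and `renderF32` are hypothetical names introduced only for illustration. The sketch assumes a power-of-two alignment and that the widest vector access whose byte size does not exceed the stated alignment is the one to use.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper type (not part of LLVM): one emitted load instruction.
struct LoadPiece {
  unsigned width;   // elements per instruction: 1 (scalar), 2, or 4
  uint64_t offset;  // byte offset from the base pointer
};

// Sketch: pick the widest vector width (4, then 2, else scalar) whose total
// byte size is covered by the stated alignment, then tile the vector with it.
// Assumes alignBytes is a power of two.
std::vector<LoadPiece> splitVectorLoad(unsigned numElts, unsigned eltBytes,
                                       uint64_t alignBytes) {
  assert(numElts > 0 && eltBytes > 0);
  assert(alignBytes != 0 && (alignBytes & (alignBytes - 1)) == 0);

  unsigned width = 1;  // conservative default: one scalar load per element
  for (unsigned candidate : {4u, 2u}) {
    if (candidate <= numElts && numElts % candidate == 0 &&
        alignBytes >= uint64_t(candidate) * eltBytes) {
      width = candidate;
      break;
    }
  }

  std::vector<LoadPiece> pieces;
  for (unsigned i = 0; i < numElts; i += width)
    pieces.push_back({width, uint64_t(i) * eltBytes});
  return pieces;
}

// Render a piece in the abbreviated PTX style used in the commit message,
// assuming f32 elements.
std::string renderF32(const LoadPiece &p) {
  std::string s = "ld.";
  if (p.width > 1)
    s += "v" + std::to_string(p.width) + ".";
  s += "f32 ... [%ptr";
  if (p.offset != 0)
    s += "+" + std::to_string(p.offset);
  s += "]";
  return s;
}
```

Under these assumptions, `splitVectorLoad(4, 4, 4)` yields four scalar pieces and `splitVectorLoad(4, 4, 8)` yields two `v2` pieces, matching the two cases in the message. With an alignment of 16 the sketch produces a single `v4` piece, a case the message does not show explicitly.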
llvm-svn: 213186