| author | Michael Zolotukhin <mzolotukhin@apple.com> | 2015-09-30 21:05:43 +0000 |
|---|---|---|
| committer | Michael Zolotukhin <mzolotukhin@apple.com> | 2015-09-30 21:05:43 +0000 |
| commit | fc783e91e0c0696ec5b3a990a7ac91bd751e370d (patch) | |
| tree | 3b24127aa0f733436e672638b5d9814499269f29 /llvm/test/Transforms | |
| parent | 757908e545e720a13c5391ce2eb399c4026859e2 (diff) | |
[SLP] Don't vectorize loads of non-packed types (like i1, i2).
Summary:
Given an array of i2 elements, 4 consecutive scalar loads will be lowered to
i8-sized loads and thus will access 4 consecutive bytes in memory. If we
vectorize these loads into a single <4 x i2> load, it'll access only 1 byte in
memory. Hence, we should prohibit vectorization in such cases.
PS: Initial patch was proposed by Arnold.
Reviewers: aschwaighofer, nadav, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13277
llvm-svn: 248943
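
The size mismatch described in the summary can be checked with a little arithmetic. The following is a minimal sketch, not the code from this patch: the helper names (`allocSizeInBytes`, `isPacked`, `scalarFootprint`, `vectorFootprint`) are hypothetical and are not LLVM APIs. It assumes that each scalar `iN` element occupies its bit width rounded up to whole bytes in memory, and that a `<count x iN>` vector load touches only the bit-packed total rounded up to whole bytes.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these are LLVM APIs.

// Bytes occupied in memory by one scalar iN element:
// the bit width rounded up to a whole number of bytes (assumption).
constexpr std::uint64_t allocSizeInBytes(std::uint64_t bits) {
    return (bits + 7) / 8;
}

// A type counts as "packed" here when its bit width exactly fills
// its in-memory size, e.g. i8 or i32, but not i1 or i2.
constexpr bool isPacked(std::uint64_t bits) {
    return bits == allocSizeInBytes(bits) * 8;
}

// Bytes spanned by `count` consecutive scalar loads of iN.
constexpr std::uint64_t scalarFootprint(std::uint64_t bits, std::uint64_t count) {
    return count * allocSizeInBytes(bits);
}

// Bytes touched by one <count x iN> vector load, assuming the
// vector's elements are bit-packed (assumption).
constexpr std::uint64_t vectorFootprint(std::uint64_t bits, std::uint64_t count) {
    return allocSizeInBytes(bits * count);
}

// The i2 case from the summary: four scalar loads span 4 bytes,
// while a single <4 x i2> load would touch only 1 byte.
static_assert(scalarFootprint(2, 4) == 4, "four i2 scalars span 4 bytes");
static_assert(vectorFootprint(2, 4) == 1, "<4 x i2> touches 1 byte");
static_assert(!isPacked(2), "i2 is padded in memory");

// For a packed type such as i32 the two footprints agree.
static_assert(scalarFootprint(32, 4) == vectorFootprint(32, 4), "i32 is safe");
static_assert(isPacked(32), "i32 fills its in-memory size");
```

Under these assumptions, vectorization preserves the accessed memory only when `isPacked` holds for the element type, which is the kind of condition the commit title describes for types like i1 and i2.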
Diffstat (limited to 'llvm/test/Transforms')
| -rw-r--r-- | llvm/test/Transforms/SLPVectorizer/X86/bad_types.ll | 26 |
1 files changed, 26 insertions, 0 deletions
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/bad_types.ll b/llvm/test/Transforms/SLPVectorizer/X86/bad_types.ll
index 2d8f3832ee2..98c29068bb9 100644
--- a/llvm/test/Transforms/SLPVectorizer/X86/bad_types.ll
+++ b/llvm/test/Transforms/SLPVectorizer/X86/bad_types.ll
@@ -47,4 +47,30 @@ exit:
   ret void
 }

+define i8 @test3(i8 *%addr) {
+; Check that we do not vectorize types that are padded to a bigger ones.
+;
+; CHECK-LABEL: @test3
+; CHECK-NOT: <4 x i2>
+; CHECK: ret i8
+entry:
+  %a = bitcast i8* %addr to i2*
+  %a0 = getelementptr inbounds i2, i2* %a, i64 0
+  %a1 = getelementptr inbounds i2, i2* %a, i64 1
+  %a2 = getelementptr inbounds i2, i2* %a, i64 2
+  %a3 = getelementptr inbounds i2, i2* %a, i64 3
+  %l0 = load i2, i2* %a0, align 1
+  %l1 = load i2, i2* %a1, align 1
+  %l2 = load i2, i2* %a2, align 1
+  %l3 = load i2, i2* %a3, align 1
+  br label %bb1
+bb1: ; preds = %entry
+  %p0 = phi i2 [ %l0, %entry ]
+  %p1 = phi i2 [ %l1, %entry ]
+  %p2 = phi i2 [ %l2, %entry ]
+  %p3 = phi i2 [ %l3, %entry ]
+  %r = zext i2 %p2 to i8
+  ret i8 %r
+}
+
 declare void @f(i64, i64)

