| author | Bill Schmidt <wschmidt@linux.vnet.ibm.com> | 2015-07-29 14:31:57 +0000 |
|---|---|---|
| committer | Bill Schmidt <wschmidt@linux.vnet.ibm.com> | 2015-07-29 14:31:57 +0000 |
| commit | 42ddd71120e445748a8d992e12297560590b3ca4 (patch) | |
| tree | c717d9c23359ab5809316b934ccad037c08e608f /llvm/test | |
| parent | 085da7ecae9f00338f95bd60f38be5f3de58733a (diff) | |
[PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask
Given certain shuffle-vector masks, LLVM emits splat instructions
which splat the wrong bytes from the source register. The issue is
that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp
does not ensure that the splat pattern found is requesting bytes that
are aligned on an EltSize boundary. This patch detects this situation
as not a valid splat mask, resulting in a permute being generated
instead of a splat.
Patch and test case by Tyler Kenney, cleaned up a bit by me.
This is a simple bug fix that would be good to incorporate into 3.7.
llvm-svn: 243519
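
To make the alignment requirement concrete, here is a minimal sketch of a splat-mask check that rejects masks whose first index does not fall on an EltSize boundary. This is not the code from PPCISelLowering.cpp: the helper name `isAlignedSplatMask`, the `std::array` representation of the mask, and the decision to ignore undef lanes and second-operand indices are all assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper for illustration only; not the LLVM implementation.
// mask:    16 byte indices into a single source vector (undef lanes are
//          not modeled in this sketch).
// eltSize: width in bytes of the element being splatted (1, 2, or 4).
bool isAlignedSplatMask(const std::array<int, 16> &mask, std::size_t eltSize) {
  if (eltSize == 0 || 16 % eltSize != 0)
    return false;

  const int base = mask[0];
  if (base < 0 || base >= 16)
    return false;

  // The check the commit message describes: the requested bytes must
  // start on an element boundary, otherwise they straddle two elements.
  if (static_cast<std::size_t>(base) % eltSize != 0)
    return false;

  // The first element must consist of consecutive bytes starting at base.
  for (std::size_t i = 1; i < eltSize; ++i)
    if (mask[i] != base + static_cast<int>(i))
      return false;

  // Every later element must repeat the first one.
  for (std::size_t i = eltSize; i < 16; ++i)
    if (mask[i] != mask[i % eltSize])
      return false;

  return true;
}
```

With the mask from the test case, `<2, 3, 4, 5, 2, 3, 4, 5, ...>` and an element size of 4, the sketch returns false because byte 2 is not a multiple of 4, so a permute would be expected rather than a word splat. The same repeating pattern starting at byte 4 passes the check.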
Diffstat (limited to 'llvm/test')
| -rw-r--r-- | llvm/test/CodeGen/PowerPC/pr24216.ll | 14 |
1 files changed, 14 insertions, 0 deletions
diff --git a/llvm/test/CodeGen/PowerPC/pr24216.ll b/llvm/test/CodeGen/PowerPC/pr24216.ll
new file mode 100644
index 00000000000..4ab41985f5b
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/pr24216.ll
@@ -0,0 +1,14 @@
+; RUN: llc -mcpu=pwr8 -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s
+
+; Test case adapted from PR24216.
+
+define void @foo(<16 x i8>* nocapture readonly %in, <16 x i8>* nocapture %out) {
+entry:
+  %0 = load <16 x i8>, <16 x i8>* %in, align 16
+  %1 = shufflevector <16 x i8> %0, <16 x i8> undef, <16 x i32> <i32 2, i32 3, i32 4, i32 5, i32 2, i32 3, i32 4, i32 5, i32 2, i32 3, i32 4, i32 5, i32 2, i32 3, i32 4, i32 5>
+  store <16 x i8> %1, <16 x i8>* %out, align 16
+  ret void
+}
+
+; CHECK: vperm
+; CHECK-NOT: vspltw

