| author | Chad Rosier <mcrosier@apple.com> | 2012-04-09 20:32:02 +0000 |
|---|---|---|
| committer | Chad Rosier <mcrosier@apple.com> | 2012-04-09 20:32:02 +0000 |
| commit | e0e38f61a5467224bcff4dac5128f95d7369a99e (patch) | |
| tree | fa2d94e3274e3927260928813e1c7dbed4fcbd3a /llvm/test/CodeGen/ARM/opt-shuff-tstore.ll | |
| parent | 0cd70866041cb69b52e51ac0a486e16aee3ae84a (diff) | |
When performing a truncating store, it's possible to rearrange the data
in-register, such that we can use a single vector store rather than a
series of scalar stores.
For func_4_8 the generated code
vldr d16, LCPI0_0
vmov d17, r0, r1
vadd.i16 d16, d17, d16
vmov.u16 r0, d16[3]
strb r0, [r2, #3]
vmov.u16 r0, d16[2]
strb r0, [r2, #2]
vmov.u16 r0, d16[1]
strb r0, [r2, #1]
vmov.u16 r0, d16[0]
strb r0, [r2]
bx lr
becomes
vldr d16, LCPI0_0
vmov d17, r0, r1
vadd.i16 d16, d17, d16
vuzp.8 d16, d17
vst1.32 {d16[0]}, [r2, :32]
bx lr
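
The equivalence of the two sequences above can be checked with a small model. This is only an illustrative sketch, not the DAG combine itself: it assumes little-endian lanes, treats d16 as four 16-bit lanes (the promoted form of the <4 x i8> value), and models vuzp.8 as a byte de-interleave. The function names and the bytearray standing in for memory are hypothetical.

```python
import struct


def scalar_truncating_store(lanes, mem, base):
    """Model of the old code: one vmov.u16 + strb per 16-bit lane."""
    for i, lane in enumerate(lanes):
        mem[base + i] = lane & 0xFF  # strb keeps only the low byte


def vuzp8(dd, dm):
    """Model of vuzp.8 Dd, Dm on two 8-byte registers.

    Dd receives the even-indexed bytes of Dd followed by those of Dm;
    Dm receives the odd-indexed bytes.
    """
    even = dd[0::2] + dm[0::2]
    odd = dd[1::2] + dm[1::2]
    return even, odd


def shuffled_vector_store(lanes, mem, base):
    """Model of the new code: vuzp.8 followed by vst1.32 {d16[0]}."""
    d16 = struct.pack("<4H", *lanes)  # four little-endian 16-bit lanes
    d17 = bytes(8)                    # contents do not affect the low word of d16
    d16, _ = vuzp8(d16, d17)
    mem[base:base + 4] = d16[:4]      # store only the low 32 bits


if __name__ == "__main__":
    lanes = [0x0101, 0x12FE, 0x00FF, 0xAB04]
    a, b = bytearray(8), bytearray(8)
    scalar_truncating_store(lanes, a, 2)
    shuffled_vector_store(lanes, b, 2)
    print(a == b)
```

On a little-endian layout the low byte of each 16-bit lane sits at an even byte index, which is why the de-interleave gathers exactly the truncated values into the low 32 bits of d16.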
I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll,
but I couldn't think of a way to judiciously apply this combine.
This
ldrh r0, [r0, #4]
strh r0, [r1]
becomes
vldr d16, [r0]
vmov.u16 r0, d16[2]
vmov.32 d16[0], r0
vuzp.16 d16, d17
vst1.32 {d16[0]}, [r1, :32]
PR11158
rdar://10703339
llvm-svn: 154340
Diffstat (limited to 'llvm/test/CodeGen/ARM/opt-shuff-tstore.ll')
| -rw-r--r-- | llvm/test/CodeGen/ARM/opt-shuff-tstore.ll | 19 |
1 files changed, 19 insertions, 0 deletions
diff --git a/llvm/test/CodeGen/ARM/opt-shuff-tstore.ll b/llvm/test/CodeGen/ARM/opt-shuff-tstore.ll
new file mode 100644
index 00000000000..b4da5524289
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/opt-shuff-tstore.ll
@@ -0,0 +1,19 @@
+; RUN: llc -mcpu=cortex-a9 -mtriple=arm-linux-unknown -promote-elements -mattr=+neon < %s | FileCheck %s
+
+; CHECK: func_4_8
+; CHECK: vst1.32
+; CHECK-NEXT: bx lr
+define void @func_4_8(<4 x i8> %param, <4 x i8>* %p) {
+  %r = add <4 x i8> %param, <i8 1, i8 2, i8 3, i8 4>
+  store <4 x i8> %r, <4 x i8>* %p
+  ret void
+}
+
+; CHECK: func_2_16
+; CHECK: vst1.32
+; CHECK-NEXT: bx lr
+define void @func_2_16(<2 x i16> %param, <2 x i16>* %p) {
+  %r = add <2 x i16> %param, <i16 1, i16 2>
+  store <2 x i16> %r, <2 x i16>* %p
+  ret void
+}

