| author | Simon Pilgrim <llvm-dev@redking.me.uk> | 2017-04-03 21:06:51 +0000 |
|---|---|---|
| committer | Simon Pilgrim <llvm-dev@redking.me.uk> | 2017-04-03 21:06:51 +0000 |
| commit | af33757b5dec5f99bc78f724a2eb2cd822c14b73 (patch) | |
| tree | b79baa363e9ca2afbdb8470ff7b522eba00312f8 /llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll | |
| parent | 3b392bb8d85d3cd6cf265e940884394d5f25d641 (diff) | |
[X86][SSE] Lower BUILD_VECTOR with repeated elts as BUILD_VECTOR + VECTOR_SHUFFLE
It can be costly to transfer values from the GPRs to the XMM registers, and doing so can prevent load merging.
This patch splits vXi16/vXi32/vXi64 BUILD_VECTORs that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements, followed by a unary shuffle to duplicate the values.
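The splitting step above can be sketched in Python. This is an illustrative model of the idea, not the actual LLVM implementation: each distinct operand is inserted into the build vector once, and a unary shuffle mask records which lane to duplicate each element from (the function name `split_build_vector` is made up for this sketch).

```python
def split_build_vector(elts):
    """Model the transform: a BUILD_VECTOR whose operands repeat is split
    into (unique insertions, shuffle mask), so each scalar crosses from
    the GPRs to the vector unit only once."""
    first_lane = {}  # operand -> lane of its first occurrence
    build = []       # operands, each inserted exactly once
    mask = []        # unary shuffle mask that duplicates the values
    for e in elts:
        if e not in first_lane:
            first_lane[e] = len(build)
            build.append(e)
        mask.append(first_lane[e])
    return build, mask
```

For example, the v4i32 vector `[a, b, a, b]` (the two i32 halves of an i64 splat on 32-bit x86) becomes two insertions `[a, b]` plus the shuffle mask `[0, 1, 0, 1]` — exactly the movd/movd/punpckldq/pshufd sequence in the test diff below.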
There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch.
Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this.
Differential Revision: https://reviews.llvm.org/D31373
llvm-svn: 299387
Diffstat (limited to 'llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll')
| -rw-r--r-- | llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll | 3 |
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll b/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll
index 3071155172e..030ad7683f0 100644
--- a/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll
+++ b/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll
@@ -2425,10 +2425,9 @@ define <2 x i64> @test_mm_set1_epi64x(i64 %a0) nounwind {
 ; X32-LABEL: test_mm_set1_epi64x:
 ; X32:       # BB#0:
 ; X32-NEXT:    movd {{.*#+}} xmm0 = mem[0],zero,zero,zero
-; X32-NEXT:    pshufd {{.*#+}} xmm0 = xmm0[0,0,1,1]
 ; X32-NEXT:    movd {{.*#+}} xmm1 = mem[0],zero,zero,zero
-; X32-NEXT:    pshufd {{.*#+}} xmm1 = xmm1[0,0,1,1]
 ; X32-NEXT:    punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
+; X32-NEXT:    pshufd {{.*#+}} xmm0 = xmm0[0,1,0,1]
 ; X32-NEXT:    retl
 ;
 ; X64-LABEL: test_mm_set1_epi64x:

