| author | Chandler Carruth <chandlerc@gmail.com> | 2014-10-01 23:14:28 +0000 |
|---|---|---|
| committer | Chandler Carruth <chandlerc@gmail.com> | 2014-10-01 23:14:28 +0000 |
| commit | 8a16802d46ed62444a24b7f367db992a783a142e (patch) | |
| tree | 4a68fe570c4dce1db7f479b0215073abe61c0ba8 /llvm/test/CodeGen | |
| parent | 650cd8a38079ba1bd4e75ee5e4b623306f0e8407 (diff) | |
[x86] Improve and correct how the new vector shuffle lowering was
matching and lowering 64-bit insertions.
The first problem was that we weren't looking through bitcasts to
discover that we *could* lower as insertions. Once fixed, we in turn
weren't looking through bitcasts to discover that we could fold a load
into the lowering. Once fixed, we weren't forming a SCALAR_TO_VECTOR
node around the inserted element and instead were passing a scalar to
a DAG node that expected a vector. It turns out there are some patterns
that will "lower" this into the correct asm, but the rest of the X86
backend is very unhappy with such antics.
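To make the first two points concrete, here is a minimal sketch of the kind of IR involved (illustrative only, with a made-up function name; it is not taken from the patch or the test file): a 64-bit scalar is inserted into the low lane of a <2 x i64>, but the vector operand reaches the shuffle through a bitcast, so the insertion is only recognizable once that bitcast is looked through.

```llvm
; Illustrative sketch: %a should become the low element of the result, and the
; high element should come from %b, which is only a <2 x i64> after looking
; through the bitcast.
define <2 x i64> @sketch_insert_reg_lo_through_bitcast(i64 %a, <2 x double> %b) {
  %v = insertelement <2 x i64> undef, i64 %a, i32 0
  %bc = bitcast <2 x double> %b to <2 x i64>
  ; Lane 0 is taken from %v (the inserted scalar), lane 1 from %bc.
  %shuffle = shufflevector <2 x i64> %v, <2 x i64> %bc, <2 x i32> <i32 0, i32 3>
  ret <2 x i64> %shuffle
}
```

For the register case, the updated checks in the test diff below expect this to lower to a movd of the incoming scalar followed by a movsd that merges it into the low lane, rather than a shufpd plus an extra movapd.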
This should fix a few more edge case regressions I've spotted going
through the regression test suite to enable the new vector shuffle
lowering.
llvm-svn: 218839
Diffstat (limited to 'llvm/test/CodeGen')
| -rw-r--r-- | llvm/test/CodeGen/X86/vector-shuffle-128-v2.ll | 21 |
1 file changed, 6 insertions, 15 deletions
```diff
diff --git a/llvm/test/CodeGen/X86/vector-shuffle-128-v2.ll b/llvm/test/CodeGen/X86/vector-shuffle-128-v2.ll
index 15e98822246..c0bdc06138a 100644
--- a/llvm/test/CodeGen/X86/vector-shuffle-128-v2.ll
+++ b/llvm/test/CodeGen/X86/vector-shuffle-128-v2.ll
@@ -718,22 +718,19 @@ define <2 x i64> @insert_reg_lo_v2i64(i64 %a, <2 x i64> %b) {
 ; SSE2-LABEL: insert_reg_lo_v2i64:
 ; SSE2: # BB#0:
 ; SSE2-NEXT: movd %rdi, %xmm1
-; SSE2-NEXT: shufpd {{.*#+}} xmm1 = xmm1[0],xmm0[1]
-; SSE2-NEXT: movapd %xmm1, %xmm0
+; SSE2-NEXT: movsd %xmm1, %xmm0
 ; SSE2-NEXT: retq
 ;
 ; SSE3-LABEL: insert_reg_lo_v2i64:
 ; SSE3: # BB#0:
 ; SSE3-NEXT: movd %rdi, %xmm1
-; SSE3-NEXT: shufpd {{.*#+}} xmm1 = xmm1[0],xmm0[1]
-; SSE3-NEXT: movapd %xmm1, %xmm0
+; SSE3-NEXT: movsd %xmm1, %xmm0
 ; SSE3-NEXT: retq
 ;
 ; SSSE3-LABEL: insert_reg_lo_v2i64:
 ; SSSE3: # BB#0:
 ; SSSE3-NEXT: movd %rdi, %xmm1
-; SSSE3-NEXT: shufpd {{.*#+}} xmm1 = xmm1[0],xmm0[1]
-; SSSE3-NEXT: movapd %xmm1, %xmm0
+; SSSE3-NEXT: movsd %xmm1, %xmm0
 ; SSSE3-NEXT: retq
 ;
 ; SSE41-LABEL: insert_reg_lo_v2i64:
@@ -762,23 +759,17 @@ define <2 x i64> @insert_reg_lo_v2i64(i64 %a, <2 x i64> %b) {
 define <2 x i64> @insert_mem_lo_v2i64(i64* %ptr, <2 x i64> %b) {
 ; SSE2-LABEL: insert_mem_lo_v2i64:
 ; SSE2: # BB#0:
-; SSE2-NEXT: movq (%rdi), %xmm1
-; SSE2-NEXT: shufpd {{.*#+}} xmm1 = xmm1[0],xmm0[1]
-; SSE2-NEXT: movapd %xmm1, %xmm0
+; SSE2-NEXT: movlpd (%rdi), %xmm0
 ; SSE2-NEXT: retq
 ;
 ; SSE3-LABEL: insert_mem_lo_v2i64:
 ; SSE3: # BB#0:
-; SSE3-NEXT: movq (%rdi), %xmm1
-; SSE3-NEXT: shufpd {{.*#+}} xmm1 = xmm1[0],xmm0[1]
-; SSE3-NEXT: movapd %xmm1, %xmm0
+; SSE3-NEXT: movlpd (%rdi), %xmm0
 ; SSE3-NEXT: retq
 ;
 ; SSSE3-LABEL: insert_mem_lo_v2i64:
 ; SSSE3: # BB#0:
-; SSSE3-NEXT: movq (%rdi), %xmm1
-; SSSE3-NEXT: shufpd {{.*#+}} xmm1 = xmm1[0],xmm0[1]
-; SSSE3-NEXT: movapd %xmm1, %xmm0
+; SSSE3-NEXT: movlpd (%rdi), %xmm0
 ; SSSE3-NEXT: retq
 ;
 ; SSE41-LABEL: insert_mem_lo_v2i64:
```
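For the memory variant exercised by insert_mem_lo_v2i64 above, the IR looks roughly like the following sketch (again illustrative, not copied from the test file; it keeps the typed-pointer and load syntax this era of the test suite uses). The load-folding part of the fix is what lets the whole pattern collapse into the single movlpd the new checks expect, instead of a separate movq plus a shuffle.

```llvm
; Illustrative sketch of the load-folding case: the low 64-bit element comes
; from memory, so the lowering should fold the load into the insertion.
define <2 x i64> @sketch_insert_mem_lo_through_bitcast(i64* %ptr, <2 x double> %b) {
  %a = load i64* %ptr
  %v = insertelement <2 x i64> undef, i64 %a, i32 0
  %bc = bitcast <2 x double> %b to <2 x i64>
  %shuffle = shufflevector <2 x i64> %v, <2 x i64> %bc, <2 x i32> <i32 0, i32 3>
  ret <2 x i64> %shuffle
}
```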

