| author | Simon Pilgrim <llvm-dev@redking.me.uk> | 2018-10-12 12:10:34 +0000 |
|---|---|---|
| committer | Simon Pilgrim <llvm-dev@redking.me.uk> | 2018-10-12 12:10:34 +0000 |
| commit | 29279f29c80b6ac35c187772c4337983c62a4815 (patch) | |
| tree | 5478d09dd472f5c7297ee2d672b8f4b80796b09b /llvm/test/CodeGen | |
| parent | f5617dce1ffb17a295790229e07be1172f0dcd7b (diff) | |
[X86][SSE] Add extract_subvector(PSHUFB) -> PSHUFB(extract_subvector()) combine
Fixes PR32160 by reducing the size of PSHUFB if we only use one of the lanes.
This approach can probably be generalized to handle any target shuffle (and any subvector index) but we have no test coverage at the moment.
llvm-svn: 344336
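The combine relies on 256-bit VPSHUFB shuffling each 128-bit lane independently, so the low half of the wide result depends only on the low half of the source and the low half of the mask. The following is a minimal scalar sketch of that identity, not the code from this commit; the names `pshufb128`, `pshufb256`, `extractLow` and `narrowingHolds` are hypothetical and exist only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lane = std::array<std::uint8_t, 16>;   // one 128-bit lane, as bytes
using Vec256 = std::array<std::uint8_t, 32>; // two 128-bit lanes

// Scalar model of 128-bit PSHUFB: a mask byte with bit 7 set yields zero,
// otherwise its low 4 bits select a byte from the source lane.
Lane pshufb128(const Lane &src, const Lane &mask) {
  Lane out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
  return out;
}

// Scalar model of 256-bit VPSHUFB: the same operation applied to each
// 128-bit lane separately; bytes never cross the lane boundary.
Vec256 pshufb256(const Vec256 &src, const Vec256 &mask) {
  Vec256 out{};
  for (std::size_t lane = 0; lane < 2; ++lane) {
    for (std::size_t i = 0; i < 16; ++i) {
      const std::uint8_t m = mask[lane * 16 + i];
      out[lane * 16 + i] = (m & 0x80) ? 0 : src[lane * 16 + (m & 0x0F)];
    }
  }
  return out;
}

// Model of extract_subvector at index 0: keep the low 128 bits.
Lane extractLow(const Vec256 &v) {
  Lane out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = v[i];
  return out;
}

// extract_subvector(PSHUFB(src, mask)) == PSHUFB(extract(src), extract(mask))
bool narrowingHolds(const Vec256 &src, const Vec256 &mask) {
  return extractLow(pshufb256(src, mask)) ==
         pshufb128(extractLow(src), extractLow(mask));
}
```

In this model `narrowingHolds` is true for every source and mask, which is why the shuffle can be shrunk to 128 bits when only the low lane is used.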
Diffstat (limited to 'llvm/test/CodeGen')
| -rw-r--r-- | llvm/test/CodeGen/X86/vector-trunc.ll | 6 |
1 files changed, 2 insertions, 4 deletions
diff --git a/llvm/test/CodeGen/X86/vector-trunc.ll b/llvm/test/CodeGen/X86/vector-trunc.ll
index 0d00f8af5a8..db3692f318f 100644
--- a/llvm/test/CodeGen/X86/vector-trunc.ll
+++ b/llvm/test/CodeGen/X86/vector-trunc.ll
@@ -1922,16 +1922,14 @@ define <8 x i16> @PR32160(<8 x i32> %x) {
 ;
 ; AVX2-SLOW-LABEL: PR32160:
 ; AVX2-SLOW: # %bb.0:
-; AVX2-SLOW-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15,16,17,20,21,24,25,28,29,24,25,28,29,28,29,30,31]
-; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[2,2,2,3,4,5,6,7]
+; AVX2-SLOW-NEXT: vpunpckhwd {{.*#+}} xmm0 = xmm0[4,4,5,5,6,6,7,7]
 ; AVX2-SLOW-NEXT: vpbroadcastd %xmm0, %xmm0
 ; AVX2-SLOW-NEXT: vzeroupper
 ; AVX2-SLOW-NEXT: retq
 ;
 ; AVX2-FAST-LABEL: PR32160:
 ; AVX2-FAST: # %bb.0:
-; AVX2-FAST-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15,16,17,20,21,24,25,28,29,24,25,28,29,28,29,30,31]
-; AVX2-FAST-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[4,5,4,5,4,5,4,5,4,5,4,5,4,5,4,5]
+; AVX2-FAST-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[8,9,8,9,8,9,8,9,8,9,8,9,8,9,8,9]
 ; AVX2-FAST-NEXT: vzeroupper
 ; AVX2-FAST-NEXT: retq
 ;
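The new AVX2-FAST check line is consistent with the two byte shuffles being merged once the first one is 128 bits wide: composing the low-lane mask of the old ymm `vpshufb` with the old xmm `vpshufb` mask gives exactly the new mask. The sketch below checks that arithmetic; `composePshufb` is a hypothetical helper written for this illustration, not an LLVM function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Mask = std::array<std::uint8_t, 16>;

// Compose two 128-bit PSHUFB masks so that shuffling by the result equals
// shuffling by `first` and then by `second`. A mask byte with bit 7 set
// means "write zero" and is propagated through the composition.
Mask composePshufb(const Mask &first, const Mask &second) {
  Mask out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = (second[i] & 0x80) ? 0x80 : first[second[i] & 0x0F];
  return out;
}

// Low-lane half of the removed ymm vpshufb mask from the diff.
Mask oldWideLowLaneMask() {
  return Mask{0, 1, 4, 5, 8, 9, 12, 13, 8, 9, 12, 13, 12, 13, 14, 15};
}

// Mask of the removed xmm vpshufb from the diff.
Mask oldSplatMask() {
  return Mask{4, 5, 4, 5, 4, 5, 4, 5, 4, 5, 4, 5, 4, 5, 4, 5};
}

// Mask of the single xmm vpshufb that replaces both.
Mask newMask() {
  return Mask{8, 9, 8, 9, 8, 9, 8, 9, 8, 9, 8, 9, 8, 9, 8, 9};
}
```

Here `composePshufb(oldWideLowLaneMask(), oldSplatMask())` equals `newMask()`: the second shuffle reads bytes 4 and 5 of the first result, and the first shuffle placed source bytes 8 and 9 there.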

