| author | Simon Pilgrim <llvm-dev@redking.me.uk> | 2016-03-02 11:43:05 +0000 |
|---|---|---|
| committer | Simon Pilgrim <llvm-dev@redking.me.uk> | 2016-03-02 11:43:05 +0000 |
| commit | c02b72627a57f7de893ccf675d01e09a53a24b92 (patch) | |
| tree | 30f867c7591921809f509cfd8072084a4545be55 /llvm/test/CodeGen/X86/avx2-vbroadcast.ll | |
| parent | f2fbabe9c1955fc7e67d1e90f73aebcf54c49949 (diff) | |
[X86][SSE] Lower 128-bit MOVDDUP with existing VBROADCAST mechanisms
We have a number of useful lowering strategies for VBROADCAST instructions (broadcasting either from memory or from element 0 of a register) that the 128-bit form of the MOVDDUP instruction can also make use of.
This patch tweaks lowerVectorShuffleAsBroadcast so that it can broadcast v2f64 arguments using MOVDDUP as well.
This requires a slight change to the lowerVectorShuffleAsBroadcast mechanism, as the existing MOVDDUP lowering uses isShuffleEquivalent, which can match binary shuffles that can still be lowered to (unary) broadcasts.
Differential Revision: http://reviews.llvm.org/D17680
llvm-svn: 262478
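
As an illustration of the shuffle pattern this lowering targets, here is a minimal, self-contained sketch (not taken from the patch; the function name, RUN line, and expected assembly below are illustrative assumptions rather than verified FileCheck output):

```llvm
; Hypothetical standalone test sketch; the RUN line and expectations are assumptions.
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx2

define <2 x double> @splat_2f64_elt0(<2 x double>* %ptr) nounwind {
entry:
  %ld = load <2 x double>, <2 x double>* %ptr
  ; Unary splat: broadcast element 0 of the loaded vector into both lanes.
  ; With this patch, lowerVectorShuffleAsBroadcast should be able to emit a
  ; single 128-bit MOVDDUP (something like "vmovddup (%rdi), %xmm0"),
  ; rather than a separate load followed by an in-register shuffle.
  %splat = shufflevector <2 x double> %ld, <2 x double> undef, <2 x i32> zeroinitializer
  ret <2 x double> %splat
}
```

The test diff below shows the same effect for a splat of element 1, where the previous vmovaps + vmovhlps pair collapses into a single vmovddup from memory.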
Diffstat (limited to 'llvm/test/CodeGen/X86/avx2-vbroadcast.ll')
| mode | path | lines changed |
|---|---|---|
| -rw-r--r-- | llvm/test/CodeGen/X86/avx2-vbroadcast.ll | 6 |
1 file changed, 2 insertions, 4 deletions
```diff
diff --git a/llvm/test/CodeGen/X86/avx2-vbroadcast.ll b/llvm/test/CodeGen/X86/avx2-vbroadcast.ll
index c8c39763402..00adf86bd76 100644
--- a/llvm/test/CodeGen/X86/avx2-vbroadcast.ll
+++ b/llvm/test/CodeGen/X86/avx2-vbroadcast.ll
@@ -494,14 +494,12 @@ define <2 x double> @load_splat_2f64_2f64_1111(<2 x double>* %ptr) nounwind uwta
 ; X32-LABEL: load_splat_2f64_2f64_1111:
 ; X32: ## BB#0: ## %entry
 ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X32-NEXT: vmovaps (%eax), %xmm0
-; X32-NEXT: vmovhlps {{.*#+}} xmm0 = xmm0[1,1]
+; X32-NEXT: vmovddup {{.*#+}} xmm0 = mem[0,0]
 ; X32-NEXT: retl
 ;
 ; X64-LABEL: load_splat_2f64_2f64_1111:
 ; X64: ## BB#0: ## %entry
-; X64-NEXT: vmovaps (%rdi), %xmm0
-; X64-NEXT: vmovhlps {{.*#+}} xmm0 = xmm0[1,1]
+; X64-NEXT: vmovddup {{.*#+}} xmm0 = mem[0,0]
 ; X64-NEXT: retq
 entry:
   %ld = load <2 x double>, <2 x double>* %ptr
```

