author | Simon Pilgrim <llvm-dev@redking.me.uk> | 2016-03-02 11:43:05 +0000 |
---|---|---|
committer | Simon Pilgrim <llvm-dev@redking.me.uk> | 2016-03-02 11:43:05 +0000 |
commit | c02b72627a57f7de893ccf675d01e09a53a24b92 (patch) | |
tree | 30f867c7591921809f509cfd8072084a4545be55 /llvm/test/CodeGen/X86/extractelement-load.ll | |
parent | f2fbabe9c1955fc7e67d1e90f73aebcf54c49949 (diff) | |
[X86][SSE] Lower 128-bit MOVDDUP with existing VBROADCAST mechanisms
We have a number of useful lowering strategies for VBROADCAST instructions (both from memory and register element 0) which the 128-bit form of the MOVDDUP instruction can make use of.
This patch tweaks lowerVectorShuffleAsBroadcast so that it can also broadcast v2f64 arguments using MOVDDUP.
This requires a slight tweak to the lowerVectorShuffleAsBroadcast mechanism, because the existing MOVDDUP lowering uses isShuffleEquivalent, which can match binary shuffles that can be lowered to (unary) broadcasts.
Differential Revision: http://reviews.llvm.org/D17680
llvm-svn: 262478
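To illustrate the unary versus binary distinction the commit message mentions, here is a minimal standalone sketch; it is not the code from this patch and does not use LLVM's APIs. It assumes the usual two-input shuffle mask convention for a 2-element vector: indices 0..1 select from the first operand, 2..3 from the second, and -1 marks an undefined lane. The names `BroadcastMatch`, `matchV2Broadcast` and `movddup` are hypothetical and exist only for this example.

```cpp
#include <array>

// Hypothetical result type for this sketch (not an LLVM type).
struct BroadcastMatch {
  bool matched; // true if every defined lane reads the same source element
  int operand;  // 0 = first shuffle input, 1 = second shuffle input
  int element;  // element index within that input
};

// Classify a 2-lane shuffle mask as a broadcast of one source element.
// Assumed convention: 0..1 = first operand, 2..3 = second operand, -1 = undef.
BroadcastMatch matchV2Broadcast(const std::array<int, 2> &mask) {
  int src = -1;
  for (int m : mask) {
    if (m < 0)
      continue; // an undefined lane is compatible with any source
    if (m > 3)
      return {false, 0, 0}; // out of range for two 2-element inputs
    if (src >= 0 && src != m)
      return {false, 0, 0}; // lanes read different elements: not a broadcast
    src = m;
  }
  if (src < 0)
    return {false, 0, 0}; // all lanes undefined: nothing to broadcast
  return {true, src / 2, src % 2};
}

// Reference semantics of the 128-bit MOVDDUP: the low double is duplicated
// into both lanes, matching the mem[0,0] comment in the test output.
std::array<double, 2> movddup(const std::array<double, 2> &v) {
  return {v[0], v[0]};
}
```

Under these assumptions, a mask of {0, 0} is a unary broadcast of the first operand's element 0, which is the case MOVDDUP covers directly. A mask of {2, 2} is also a broadcast, but of the second operand, so a lowering that accepts it has to pick the correct input before emitting a unary instruction.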
Diffstat (limited to 'llvm/test/CodeGen/X86/extractelement-load.ll')
-rw-r--r-- | llvm/test/CodeGen/X86/extractelement-load.ll | 8 |
1 files changed, 4 insertions, 4 deletions
diff --git a/llvm/test/CodeGen/X86/extractelement-load.ll b/llvm/test/CodeGen/X86/extractelement-load.ll
index fca8465ba56..5855303e127 100644
--- a/llvm/test/CodeGen/X86/extractelement-load.ll
+++ b/llvm/test/CodeGen/X86/extractelement-load.ll
@@ -63,13 +63,13 @@ define void @t3() {
 ;
 ; X64-SSSE3-LABEL: t3:
 ; X64-SSSE3: # BB#0: # %bb
-; X64-SSSE3-NEXT: movupd (%rax), %xmm0
-; X64-SSSE3-NEXT: movhpd %xmm0, (%rax)
+; X64-SSSE3-NEXT: movddup {{.*#+}} xmm0 = mem[0,0]
+; X64-SSSE3-NEXT: movlpd %xmm0, (%rax)
 ;
 ; X64-AVX-LABEL: t3:
 ; X64-AVX: # BB#0: # %bb
-; X64-AVX-NEXT: vmovupd (%rax), %xmm0
-; X64-AVX-NEXT: vmovhpd %xmm0, (%rax)
+; X64-AVX-NEXT: vmovddup {{.*#+}} xmm0 = mem[0,0]
+; X64-AVX-NEXT: vmovlpd %xmm0, (%rax)
 bb:
 %tmp13 = load <2 x double>, <2 x double>* undef, align 1
 %.sroa.3.24.vec.extract = extractelement <2 x double> %tmp13, i32 1