path: root/llvm/test/CodeGen/X86/vec_set-3.ll
author: Chandler Carruth <chandlerc@gmail.com> 2014-10-03 21:38:49 +0000
committer: Chandler Carruth <chandlerc@gmail.com> 2014-10-03 21:38:49 +0000
commit: 0adda1e4d47a836275fb0baabf98c402759f385f
tree: 3ecf8726f61e3db0c845edfa0005d0031afbfbb3 /llvm/test/CodeGen/X86/vec_set-3.ll
parent: 0aca4b1aa0c4e912e1c01ce07e00f5647d1fc30e
[x86] Adjust the patterns for lowering X86vzmovl nodes which don't
perform a load to use blendps rather than movss when it is available.

For non-loads, blendps is *much* faster. It can execute on two ports in
Sandy Bridge and Ivy Bridge, and *three* ports on Haswell.

This fixes one of the "regressions" from aggressively taking the
"insertion" path in the new vector shuffle lowering.

This does highlight one problem with blendps -- it isn't commuted as
heavily as it should be. That's future work though.

llvm-svn: 219022
Diffstat (limited to 'llvm/test/CodeGen/X86/vec_set-3.ll')
-rw-r--r-- llvm/test/CodeGen/X86/vec_set-3.ll 2
1 file changed, 1 insertion, 1 deletion
diff --git a/llvm/test/CodeGen/X86/vec_set-3.ll b/llvm/test/CodeGen/X86/vec_set-3.ll
index 043cf96a671..b38b8bfb81f 100644
--- a/llvm/test/CodeGen/X86/vec_set-3.ll
+++ b/llvm/test/CodeGen/X86/vec_set-3.ll
@@ -39,7 +39,7 @@ entry:
define <4 x float> @test3(<4 x float> %A) {
; CHECK-LABEL: test3:
; CHECK: xorps %[[X1:xmm[0-9]+]], %[[X1]]
-; CHECK-NEXT: movss %xmm0, %[[X1]]
+; CHECK-NEXT: blendps $1, %xmm0, %[[X1]]
; CHECK-NEXT: pshufd {{.*#+}} xmm0 = [[X1]][1,0,1,1]
; CHECK-NEXT: retl
;