[x86] Teach the x86 DAG combiner to form UNPCKLPS and UNPCKHPS instructions from the relevant shuffle patterns.

This is the last tweak I'm aware of to generate essentially perfect v4f32 and v2f64 shuffles with the new vector shuffle lowering up through SSE4.1. I'm sure I've missed some, and it'd be nice to check since v4f32 is amenable to exhaustive exploration, but these are all of the tricks I'm aware of.

With AVX there is a new trick to use the VPERMILPS instruction; that's coming up in a subsequent patch.

llvm-svn: 217761
author: Chandler Carruth <chandlerc@gmail.com> 2014-09-15 11:26:25 +0000
committer: Chandler Carruth <chandlerc@gmail.com> 2014-09-15 11:26:25 +0000
commit: 12d4a70cbd31cc071fa9f1e64352b341c2e02fc9 (patch)
tree: 99ac3af2eb0616cfe17fbebff6234c9ee69ab94b /llvm/lib
parent: 0ffb0939316df3f37cb921626be53caa26ee8dd3 (diff)
download: bcm5719-llvm-12d4a70cbd31cc071fa9f1e64352b341c2e02fc9.tar.gz
bcm5719-llvm-12d4a70cbd31cc071fa9f1e64352b341c2e02fc9.zip
1 files changed, 14 insertions, 0 deletions
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 04f1fafa2e7..da3ec8b35eb 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -19413,6 +19413,20 @@ static bool combineX86ShuffleChain(SDValue Op, SDValue Root, ArrayRef<int> Mask,
                     /*AddTo*/ true);
       return true;
     }
+    if (Mask.equals(0, 0, 1, 1) || Mask.equals(2, 2, 3, 3)) {
+      bool Lo = Mask.equals(0, 0, 1, 1);
+      unsigned Shuffle = Lo ? X86ISD::UNPCKL : X86ISD::UNPCKH;
+      MVT ShuffleVT = MVT::v4f32;
+      if (Depth == 1 && Root->getOpcode() == Shuffle)
+        return false; // Nothing to do!
+      Op = DAG.getNode(ISD::BITCAST, DL, ShuffleVT, Input);
+      DCI.AddToWorklist(Op.getNode());
+      Op = DAG.getNode(Shuffle, DL, ShuffleVT, Op, Op);
+      DCI.AddToWorklist(Op.getNode());
+      DCI.CombineTo(Root.getNode(), DAG.getNode(ISD::BITCAST, DL, RootVT, Op),
+                    /*AddTo*/ true);
+      return true;
+    }
   }
 
   // We always canonicalize the 8 x i16 and 16 x i8 shuffles into their UNPCK
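
The patch relies on the fact that a single-input v4f32 shuffle with mask (0, 0, 1, 1) or (2, 2, 3, 3) is what UNPCKLPS or UNPCKHPS produces when the same register is passed as both operands, which is why the new code builds the node as `DAG.getNode(Shuffle, DL, ShuffleVT, Op, Op)`. The following is only a scalar sketch of that equivalence, not LLVM code: the names `V4`, `unpckl`, `unpckh`, and `shuffle` are hypothetical helpers introduced for illustration.

```cpp
#include <array>
#include <cassert>

// Hypothetical 4-lane float vector standing in for a v4f32 value.
using V4 = std::array<float, 4>;

// Scalar model of UNPCKLPS: interleave the low two lanes of a and b.
V4 unpckl(const V4 &a, const V4 &b) { return {a[0], b[0], a[1], b[1]}; }

// Scalar model of UNPCKHPS: interleave the high two lanes of a and b.
V4 unpckh(const V4 &a, const V4 &b) { return {a[2], b[2], a[3], b[3]}; }

// Single-input shuffle: lane i of the result is v[mask[i]].
V4 shuffle(const V4 &v, const std::array<int, 4> &mask) {
  V4 out{};
  for (int i = 0; i < 4; ++i)
    out[i] = v[mask[i]];
  return out;
}
```

Under this model, `unpckl(v, v)` equals `shuffle(v, {0, 0, 1, 1})` and `unpckh(v, v)` equals `shuffle(v, {2, 2, 3, 3})` for any `v`, which matches the two masks tested by `Mask.equals` in the hunk above.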