[PPC64LE] Enable missing lxvdsx optimization, and related swap optimization

When adding little-endian vector support for PowerPC last year, I inadvertently disabled an optimization that recognizes a load-splat idiom and generates the lxvdsx instruction. This patch moves the offending logic so lxvdsx is once again generated. This pattern is frequently generated by the vectorizer for scalar loads of an effective constant.

Previously the lxvdsx instruction was wrongly listed as lane-sensitive for the VSX swap optimization (since both doublewords are identical, swaps are safe). This patch fixes this as well, so that vectorized code using lxvdsx can now have swaps removed from the computation.

There is an existing test (@test50) in test/CodeGen/PowerPC/vsx.ll that checks for the missing optimization. However, vsx.ll was only being tested for POWER7 with big-endian code generation. I've added a little-endian RUN statement and expected LE code generation for all the tests in vsx.ll to give us a bit better VSX coverage, including what's needed for this patch.

llvm-svn: 241183
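
The two claims in the message can be checked in isolation. The following is a minimal standalone sketch, not code from LLVM: `Mask`, `V2`, `adjustMaskForLE`, `isSplatOfElementZero`, `splat`, `swapDoublewords`, and `same` are hypothetical names introduced here for illustration. It assumes only what the diff shows: the little-endian adjustment reverses and inverts the two mask elements, and the load-splat check requires both mask elements to be 0.

```cpp
#include <cassert>

// Hypothetical stand-in for the two-element mask DM[0], DM[1] in Select().
struct Mask { unsigned DM0; unsigned DM1; };

// Mirrors the little-endian adjustment in the patch:
// reverse the two mask elements and invert each one.
constexpr Mask adjustMaskForLE(Mask M) {
  return Mask{1 - M.DM1, 1 - M.DM0};
}

// Mirrors only the mask part of the load-splat check (DM[0] == 0 && DM[1] == 0).
constexpr bool isSplatOfElementZero(Mask M) {
  return M.DM0 == 0 && M.DM1 == 0;
}

// Check-then-adjust (order after this patch): the idiom is recognized.
static_assert(isSplatOfElementZero(Mask{0, 0}), "splat mask matches");
// Adjust-then-check (order before this patch): {0,0} becomes {1,1}, no match.
static_assert(!isSplatOfElementZero(adjustMaskForLE(Mask{0, 0})),
              "adjusted mask no longer matches");

// Hypothetical model of a two-doubleword vector register.
struct V2 { long long d0; long long d1; };

constexpr V2 splat(long long x) { return V2{x, x}; }
constexpr V2 swapDoublewords(V2 v) { return V2{v.d1, v.d0}; }
constexpr bool same(V2 a, V2 b) { return a.d0 == b.d0 && a.d1 == b.d1; }

// A splat is unchanged by a doubleword swap, which is why a splatting
// load does not need to be treated as lane-sensitive.
static_assert(same(swapDoublewords(splat(42)), splat(42)),
              "swap of a splat is the same splat");
```

The first pair of assertions suggests why the position of the little-endian block matters: once the mask has been adjusted, an all-zero mask reads as all ones, so a check placed after the adjustment cannot match it on little-endian targets.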
author: Bill Schmidt <wschmidt@linux.vnet.ibm.com> 2015-07-01 19:40:07 +0000
committer: Bill Schmidt <wschmidt@linux.vnet.ibm.com> 2015-07-01 19:40:07 +0000
commit: ae94f11d55bdcea4a5dfaa32c52d2ba2c0bb7613 (patch)
tree: 68eaf4cd9418788038479278825a06165e5b6df6 /llvm/lib
parent: 5327b8900122dc753d570ab37c156c2fbcef46d8 (diff)
download: bcm5719-llvm-ae94f11d55bdcea4a5dfaa32c52d2ba2c0bb7613.tar.gz
bcm5719-llvm-ae94f11d55bdcea4a5dfaa32c52d2ba2c0bb7613.zip
2 files changed, 11 insertions, 13 deletions
diff --git a/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp b/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
index de7761c7db9..c85c2610d2f 100644
--- a/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
@@ -2773,18 +2773,6 @@ SDNode *PPCDAGToDAGISel::Select(SDNode *N) {
         else
           DM[i] = 1;
 
-      // For little endian, we must swap the input operands and adjust
-      // the mask elements (reverse and invert them).
-      if (PPCSubTarget->isLittleEndian()) {
-        std::swap(Op1, Op2);
-        unsigned tmp = DM[0];
-        DM[0] = 1 - DM[1];
-        DM[1] = 1 - tmp;
-      }
-
-      SDValue DMV = CurDAG->getTargetConstant(DM[1] | (DM[0] << 1), dl,
-                                              MVT::i32);
-
       if (Op1 == Op2 && DM[0] == 0 && DM[1] == 0 &&
           Op1.getOpcode() == ISD::SCALAR_TO_VECTOR &&
           isa<LoadSDNode>(Op1.getOperand(0))) {
@@ -2800,6 +2788,17 @@ SDNode *PPCDAGToDAGISel::Select(SDNode *N) {
         }
       }
 
+      // For little endian, we must swap the input operands and adjust
+      // the mask elements (reverse and invert them).
+      if (PPCSubTarget->isLittleEndian()) {
+        std::swap(Op1, Op2);
+        unsigned tmp = DM[0];
+        DM[0] = 1 - DM[1];
+        DM[1] = 1 - tmp;
+      }
+
+      SDValue DMV = CurDAG->getTargetConstant(DM[1] | (DM[0] << 1), dl,
+                                              MVT::i32);
       SDValue Ops[] = { Op1, Op2, DMV };
       return CurDAG->SelectNodeTo(N, PPC::XXPERMDI, N->getValueType(0), Ops);
     }
diff --git a/llvm/lib/Target/PowerPC/PPCVSXSwapRemoval.cpp b/llvm/lib/Target/PowerPC/PPCVSXSwapRemoval.cpp
index e238669145a..6f75ff1dbf4 100644
--- a/llvm/lib/Target/PowerPC/PPCVSXSwapRemoval.cpp
+++ b/llvm/lib/Target/PowerPC/PPCVSXSwapRemoval.cpp
@@ -349,7 +349,6 @@ bool PPCVSXSwapRemoval::gatherVectorInstructions() {
       case PPC::LVSL:
       case PPC::LVSR:
       case PPC::LVXL:
-      case PPC::LXVDSX:
       case PPC::STVEBX:
       case PPC::STVEHX:
       case PPC::STVEWX: