[LegalizeTypes][AArch64][X86] Make type legalization of vector (S/U)ADD/SUB/MULO follow getSetCCResultType for the overflow bits. Make UnrollVectorOverflowOp properly convert from scalar boolean contents to vector boolean contents

Summary:
When promoting the overflow vector for these ops we should use the target's desired setcc result type. This way a v8i32 result type will use a v8i32 overflow vector instead of a v8i16 overflow vector. A v8i16 overflow vector will cause LegalizeDAG/LegalizeVectorOps to have to use v8i32 and truncate to v8i16 in its expansion. By doing this in type legalization instead, we get the truncate into the DAG earlier and give DAG combine more of a chance to optimize it.

We also have to fix unrolling to use the scalar setcc result type for the scalarized operation, and convert it to the required vector element type after the scalar operation. We have to observe the vector boolean contents when doing this conversion. The previous code was just taking the scalar result and putting it in the vector. But for X86 and AArch64 that would have only put the boolean value in bit 0 of the element and left all other bits in the element 0. We need to ensure all bits in the element are the same. I'm using a select with constants here because that's what setcc unrolling in LegalizeVectorOps used.

Reviewers: spatel, RKSimon, nikic
Reviewed By: nikic
Subscribers: javed.absar, kristof.beyls, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58567
llvm-svn: 354753
author: Craig Topper <craig.topper@intel.com> 2019-02-24 19:23:36 +0000
committer: Craig Topper <craig.topper@intel.com> 2019-02-24 19:23:36 +0000
commit: be3348573ec101bb9b5e088d47fd3713ca78f088 (patch)
tree: 09a786aa3b3f5b1ce7a75cc49133111315044252 /llvm/lib/CodeGen
parent: 103799c06028c93efff0163c42d2d53ad1857fd3 (diff)
download: bcm5719-llvm-be3348573ec101bb9b5e088d47fd3713ca78f088.tar.gz
bcm5719-llvm-be3348573ec101bb9b5e088d47fd3713ca78f088.zip
2 files changed, 17 insertions, 7 deletions
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
index cfcab55ce4e..911f76ad45d 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
@@ -582,23 +582,27 @@ SDValue DAGTypeLegalizer::PromoteIntRes_MGATHER(MaskedGatherSDNode *N) {
 
 /// Promote the overflow flag of an overflowing arithmetic node.
 SDValue DAGTypeLegalizer::PromoteIntRes_Overflow(SDNode *N) {
-  // Simply change the return type of the boolean result.
+  // Change the return type of the boolean result while obeying
+  // getSetCCResultType.
   EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(1));
-  EVT ValueVTs[] = { N->getValueType(0), NVT };
+  EVT VT = N->getValueType(0);
+  EVT SVT = getSetCCResultType(VT);
   SDValue Ops[3] = { N->getOperand(0), N->getOperand(1) };
   unsigned NumOps = N->getNumOperands();
   assert(NumOps <= 3 && "Too many operands");
   if (NumOps == 3)
     Ops[2] = N->getOperand(2);
 
-  SDValue Res = DAG.getNode(N->getOpcode(), SDLoc(N),
-                            DAG.getVTList(ValueVTs), makeArrayRef(Ops, NumOps));
+  SDLoc dl(N);
+  SDValue Res = DAG.getNode(N->getOpcode(), dl, DAG.getVTList(VT, SVT),
+                            makeArrayRef(Ops, NumOps));
 
   // Modified the sum result - switch anything that used the old sum to use
   // the new one.
   ReplaceValueWith(SDValue(N, 0), Res);
 
-  return SDValue(Res.getNode(), 1);
+  // Convert to the expected type.
+  return DAG.getBoolExtOrTrunc(Res.getValue(1), dl, NVT, VT);
 }
 
 SDValue DAGTypeLegalizer::PromoteIntRes_ADDSUBSAT(SDNode *N) {
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 514e6f59cb5..b9be29be61d 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -8984,13 +8984,19 @@ std::pair<SDValue, SDValue> SelectionDAG::UnrollVectorOverflowOp(
   ExtractVectorElements(N->getOperand(0), LHSScalars, 0, NE);
   ExtractVectorElements(N->getOperand(1), RHSScalars, 0, NE);
 
-  SDVTList VTs = getVTList(ResEltVT, OvEltVT);
+  EVT SVT = TLI->getSetCCResultType(getDataLayout(), *getContext(), ResEltVT);
+  SDVTList VTs = getVTList(ResEltVT, SVT);
   SmallVector<SDValue, 8> ResScalars;
   SmallVector<SDValue, 8> OvScalars;
   for (unsigned i = 0; i < NE; ++i) {
     SDValue Res = getNode(Opcode, dl, VTs, LHSScalars[i], RHSScalars[i]);
+    SDValue Ov =
+        getSelect(dl, OvEltVT, Res.getValue(1),
+                  getBoolConstant(true, dl, OvEltVT, ResVT),
+                  getConstant(0, dl, OvEltVT));
+
     ResScalars.push_back(Res);
-    OvScalars.push_back(SDValue(Res.getNode(), 1));
+    OvScalars.push_back(Ov);
   }
 
   ResScalars.append(ResNE - NE, getUNDEF(ResEltVT));
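
The effect of the select in the UnrollVectorOverflowOp hunk can be illustrated outside of LLVM. The following is a minimal standalone sketch, not the SelectionDAG code: the enum BoolContents and the helpers widenOverflowBit and unrollUAddO are hypothetical names invented for this illustration, and 32-bit lanes with an unsigned add are assumed. It shows each scalar overflow bit being widened to the lane value the chosen vector boolean contents expects (all ones or 1 for true, 0 for false) instead of being copied into bit 0 only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a target's vector boolean contents.
enum class BoolContents { ZeroOrOne, ZeroOrNegativeOne };

// Analogue of the select in the patch: choose the "true" constant that the
// boolean contents requires, or 0 when the scalar overflow bit is clear.
inline std::uint32_t widenOverflowBit(bool Ov, BoolContents BC) {
  const std::uint32_t TrueVal =
      (BC == BoolContents::ZeroOrNegativeOne) ? 0xFFFFFFFFu : 1u;
  return Ov ? TrueVal : 0u;
}

// Unroll an unsigned add-with-overflow lane by lane. Returns the per-lane
// sums and the per-lane overflow values in vector boolean form.
inline std::pair<std::vector<std::uint32_t>, std::vector<std::uint32_t>>
unrollUAddO(const std::vector<std::uint32_t> &LHS,
            const std::vector<std::uint32_t> &RHS, BoolContents BC) {
  std::vector<std::uint32_t> Res, Ov;
  for (std::size_t i = 0; i < LHS.size() && i < RHS.size(); ++i) {
    const std::uint32_t Sum = LHS[i] + RHS[i]; // wraps modulo 2^32
    const bool Carry = Sum < LHS[i];           // scalar overflow bit
    Res.push_back(Sum);
    Ov.push_back(widenOverflowBit(Carry, BC));
  }
  return {Res, Ov};
}
```

With ZeroOrNegativeOne contents, a lane that overflows yields 0xFFFFFFFF rather than 0x00000001, which is the property the commit message asks for ("all bits in the element are the same").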
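
The PromoteIntRes_Overflow hunk ends by converting the setcc-typed overflow value to the promoted type with getBoolExtOrTrunc. The sketch below models that step on a single lane under stated assumptions: it assumes the conversion is a plain truncate when narrowing and an extension chosen by the boolean contents when widening (sign extension for all-ones booleans, zero extension for 0/1 booleans). The enum BoolContents and the helpers truncBool and extBool are hypothetical names for this illustration, and the 32-bit and 16-bit lane widths stand in for the v8i32 and v8i16 case from the summary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a target's boolean contents.
enum class BoolContents { ZeroOrOne, ZeroOrNegativeOne };

// Narrowing a boolean lane from 32 to 16 bits keeps the low bits. Both
// 0xFFFFFFFF and 0x00000001 remain valid "true" values after truncation.
inline std::uint16_t truncBool(std::uint32_t V) {
  return static_cast<std::uint16_t>(V & 0xFFFFu);
}

// Widening a boolean lane from 16 to 32 bits must respect the contents:
// sign-extend for all-ones booleans, zero-extend for 0/1 booleans.
inline std::uint32_t extBool(std::uint16_t V, BoolContents BC) {
  const std::uint32_t Wide = V;
  if (BC == BoolContents::ZeroOrNegativeOne && (Wide & 0x8000u))
    return Wide | 0xFFFF0000u; // replicate the sign bit into the top half
  return Wide;
}
```

In the v8i32 example from the summary only the narrowing direction occurs, which is why the truncate can appear in the DAG during type legalization.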