[IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835)

Summary: Currently, `rewriteLoopExitValues()`'s logic is roughly as following: > Loop over each incoming value in each PHI node. > Query whether the SCEV for that incoming value is high-cost. > Expand the SCEV. > Perform sanity check (`isValidRewrite()`, D51582) > Record the info > Afterwards, see if we can drop the loop given replacements. > Maybe perform replacements. The problem is that we interleave SCEV cost checking and expansion. This is A Problem, because `isHighCostExpansion()` takes special care to not bill for the expansions that were already expanded, and we can reuse. While it makes sense in general - if we know that we will expand some SCEV, all the other SCEV's costs should account for that, which might cause some of them to become non-high-cost too, and cause chain reaction. But that isn't what we are doing here. We expand *all* SCEV's, unconditionally. So every next SCEV's cost will be affected by the already-performed expansions for previous SCEV's. Even if we are not planning on keeping some of the expansions we performed. Worse yet, this current "bonus" depends on the exact PHI node incoming value processing order. This is completely wrong. As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have a PHI node with two(!) identical high-cost incoming values for the same basic blocks, we would decide first time around that it is high-cost, expand it, and immediately decide that it is not high-cost because we have an expansion that we could reuse (because we expanded it right before, temporarily), and replace the second incoming value but not the first one; thus resulting in a broken PHI. What we instead should do for now, is not perform any expansions until after we've queried all the costs. Later, in particular after `isValidRewrite()` is an assertion (D51582) we could improve upon that, but in a more coherent fashion. See [[ https://bugs.llvm.org/show_bug.cgi?id=45835 | PR45835 ]] Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma Reviewed By: dmajor, mkazantsev Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor Tags: #llvm Differential Revision: https://reviews.llvm.org/D79787 (cherry picked from commit b2df96123198deadad74634c978e84912314da26)
author: Tom Stellard <tstellar@redhat.com> 2020-05-27 19:16:30 +0000
committer: Tom Stellard <tstellar@redhat.com> 2020-06-16 20:16:13 -0700
commit: 4d0626a822be3cfe1809d9b3ba976e5bda0b02e4 (patch)
tree: 48597ff772a2369fdd1d310dba4994f617d3147c /llvm/lib
parent: bf89c5aeb8915d488fa1c790e1b237b62a49c01f (diff)
download: bcm5719-llvm-4d0626a822be3cfe1809d9b3ba976e5bda0b02e4.tar.gz
bcm5719-llvm-4d0626a822be3cfe1809d9b3ba976e5bda0b02e4.zip
1 files changed, 60 insertions, 34 deletions
diff --git a/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp b/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
index d8d7acae5c9..81163e9fcfa 100644
--- a/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
+++ b/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
@@ -527,19 +527,19 @@ namespace {
 // Collect information about PHI nodes which can be transformed in
 // rewriteLoopExitValues.
 struct RewritePhi {
-  PHINode *PN;
-
-  // Ith incoming value.
-  unsigned Ith;
-
-  // Exit value after expansion.
-  Value *Val;
-
-  // High Cost when expansion.
-  bool HighCost;
-
-  RewritePhi(PHINode *P, unsigned I, Value *V, bool H)
-      : PN(P), Ith(I), Val(V), HighCost(H) {}
+  PHINode *PN;               // For which PHI node is this replacement?
+  unsigned Ith;              // For which incoming value?
+  const SCEV *ExpansionSCEV; // The SCEV of the incoming value we are rewriting.
+  Instruction *ExpansionPoint; // Where we'd like to expand that SCEV?
+  bool HighCost;               // Is this expansion a high-cost?
+
+  Value *Expansion = nullptr;
+  bool ValidRewrite = false;
+
+  RewritePhi(PHINode *P, unsigned I, const SCEV *Val, Instruction *ExpansionPt,
+             bool H)
+      : PN(P), Ith(I), ExpansionSCEV(Val), ExpansionPoint(ExpansionPt),
+        HighCost(H) {}
 };
 
 } // end anonymous namespace
@@ -671,41 +671,65 @@ bool IndVarSimplify::rewriteLoopExitValues(Loop *L, SCEVExpander &Rewriter) {
             hasHardUserWithinLoop(L, Inst))
           continue;
 
+        // Check if expansions of this SCEV would count as being high cost.
         bool HighCost = Rewriter.isHighCostExpansion(ExitValue, L, Inst);
-        Value *ExitVal = Rewriter.expandCodeFor(ExitValue, PN->getType(), Inst);
-
-        LLVM_DEBUG(dbgs() << "INDVARS: RLEV: AfterLoopVal = " << *ExitVal
-                          << '\n'
-                          << "  LoopVal = " << *Inst << "\n");
-
-        if (!isValidRewrite(Inst, ExitVal)) {
-          DeadInsts.push_back(ExitVal);
-          continue;
-        }
 
-#ifndef NDEBUG
-        // If we reuse an instruction from a loop which is neither L nor one of
-        // its containing loops, we end up breaking LCSSA form for this loop by
-        // creating a new use of its instruction.
-        if (auto *ExitInsn = dyn_cast<Instruction>(ExitVal))
-          if (auto *EVL = LI->getLoopFor(ExitInsn->getParent()))
-            if (EVL != L)
-              assert(EVL->contains(L) && "LCSSA breach detected!");
-#endif
+        // Note that we must not perform expansions until after
+        // we query *all* the costs, because if we perform temporary expansion
+        // inbetween, one that we might not intend to keep, said expansion
+        // *may* affect cost calculation of the the next SCEV's we'll query,
+        // and next SCEV may errneously get smaller cost.
 
         // Collect all the candidate PHINodes to be rewritten.
-        RewritePhiSet.emplace_back(PN, i, ExitVal, HighCost);
+        RewritePhiSet.emplace_back(PN, i, ExitValue, Inst, HighCost);
       }
     }
   }
 
+  // Now that we've done preliminary filtering and billed all the SCEV's,
+  // we can perform the last sanity check - the expansion must be valid.
+  for (RewritePhi &Phi : RewritePhiSet) {
+    Phi.Expansion = Rewriter.expandCodeFor(Phi.ExpansionSCEV, Phi.PN->getType(),
+                                           Phi.ExpansionPoint);
+
+    LLVM_DEBUG(dbgs() << "rewriteLoopExitValues: AfterLoopVal = "
+                      << *(Phi.Expansion) << '\n'
+                      << "  LoopVal = " << *(Phi.ExpansionPoint) << "\n");
+
+    // FIXME: isValidRewrite() is a hack. it should be an assert, eventually.
+    Phi.ValidRewrite = isValidRewrite(Phi.ExpansionPoint, Phi.Expansion);
+    if (!Phi.ValidRewrite) {
+      DeadInsts.push_back(Phi.Expansion);
+      continue;
+    }
+
+#ifndef NDEBUG
+    // If we reuse an instruction from a loop which is neither L nor one of
+    // its containing loops, we end up breaking LCSSA form for this loop by
+    // creating a new use of its instruction.
+    if (auto *ExitInsn = dyn_cast<Instruction>(Phi.Expansion))
+      if (auto *EVL = LI->getLoopFor(ExitInsn->getParent()))
+        if (EVL != L)
+          assert(EVL->contains(L) && "LCSSA breach detected!");
+#endif
+  }
+
+  // TODO: after isValidRewrite() is an assertion, evaluate whether
+  // it is beneficial to change how we calculate high-cost:
+  // if we have SCEV 'A' which we know we will expand, should we calculate
+  // the cost of other SCEV's after expanding SCEV 'A',
+  // thus potentially giving cost bonus to those other SCEV's?
+
   bool LoopCanBeDel = canLoopBeDeleted(L, RewritePhiSet);
 
   bool Changed = false;
   // Transformation.
   for (const RewritePhi &Phi : RewritePhiSet) {
+    if (!Phi.ValidRewrite)
+      continue;
+
     PHINode *PN = Phi.PN;
-    Value *ExitVal = Phi.Val;
+    Value *ExitVal = Phi.Expansion;
 
     // Only do the rewrite when the ExitValue can be expanded cheaply.
     // If LoopCanBeDel is true, rewrite exit value aggressively.
@@ -844,6 +868,8 @@ bool IndVarSimplify::canLoopBeDeleted(
     // phase later. Skip it in the loop invariant check below.
     bool found = false;
     for (const RewritePhi &Phi : RewritePhiSet) {
+      if (!Phi.ValidRewrite)
+        continue;
       unsigned i = Phi.Ith;
       if (Phi.PN == P && (Phi.PN)->getIncomingValue(i) == Incoming) {
         found = true;
author	Tom Stellard <tstellar@redhat.com>	2020-05-27 19:16:30 +0000
committer	Tom Stellard <tstellar@redhat.com>	2020-06-16 20:16:13 -0700
commit	4d0626a822be3cfe1809d9b3ba976e5bda0b02e4 (patch)
tree	48597ff772a2369fdd1d310dba4994f617d3147c /llvm/lib
parent	bf89c5aeb8915d488fa1c790e1b237b62a49c01f (diff)
download	bcm5719-llvm-4d0626a822be3cfe1809d9b3ba976e5bda0b02e4.tar.gz bcm5719-llvm-4d0626a822be3cfe1809d9b3ba976e5bda0b02e4.zip