diff options
| author | Jingyue Wu <jingyue@google.com> | 2015-04-16 18:42:31 +0000 |
|---|---|---|
| committer | Jingyue Wu <jingyue@google.com> | 2015-04-16 18:42:31 +0000 |
| commit | 771dfe91cf6294bd14361a15ea6bbcc7b32626a3 (patch) | |
| tree | 4440917a579b2bb2d075d8b870231ecb013e9c31 /llvm/lib/Transforms | |
| parent | 7a60b6db76f5d2be3f42eb5d6b94aa758a7d78de (diff) | |
| download | bcm5719-llvm-771dfe91cf6294bd14361a15ea6bbcc7b32626a3.tar.gz bcm5719-llvm-771dfe91cf6294bd14361a15ea6bbcc7b32626a3.zip | |
[NaryReassociate] speeds up candidate searching
Summary:
This fixes a left-over efficiency issue in D8950.
As Andrew and Daniel suggested, we can store the candidates in a stack
and pop the top element when it does not dominate the current
instruction. This reduces the worst-case time complexity to O(n).
Test Plan: a new test in nary-add.ll that exercises this optimization.
Reviewers: broune, dberlin, meheff, atrick
Reviewed By: atrick
Subscribers: llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D9055
llvm-svn: 235129
Diffstat (limited to 'llvm/lib/Transforms')
| -rw-r--r-- | llvm/lib/Transforms/Scalar/NaryReassociate.cpp | 24 |
1 files changed, 15 insertions, 9 deletions
diff --git a/llvm/lib/Transforms/Scalar/NaryReassociate.cpp b/llvm/lib/Transforms/Scalar/NaryReassociate.cpp index ed7b24fa604..d6bc925955b 100644 --- a/llvm/lib/Transforms/Scalar/NaryReassociate.cpp +++ b/llvm/lib/Transforms/Scalar/NaryReassociate.cpp @@ -105,7 +105,9 @@ private: ScalarEvolution *SE; // A lookup table quickly telling which instructions compute the given SCEV. // Note that there can be multiple instructions at different locations - // computing to the same SCEV. For example, + // computing to the same SCEV, so we map a SCEV to an instruction list. For + // example, + // // if (p1) // foo(a + b); // if (p2) @@ -190,17 +192,21 @@ Instruction *NaryReassociate::tryReassociatedAdd(const SCEV *LHSExpr, return nullptr; auto &LHSCandidates = Pos->second; - unsigned NumIterations = 0; - // Search at most 10 items to avoid running quadratically. - static const unsigned MaxNumIterations = 10; - for (auto LHS = LHSCandidates.rbegin(); - LHS != LHSCandidates.rend() && NumIterations < MaxNumIterations; - ++LHS, ++NumIterations) { - if (DT->dominates(*LHS, I)) { - Instruction *NewI = BinaryOperator::CreateAdd(*LHS, RHS, "", I); + // Look for the closest dominator LHS of I that computes LHSExpr, and replace + // I with LHS + RHS. + // + // Because we traverse the dominator tree in the pre-order, a + // candidate that doesn't dominate the current instruction won't dominate any + // future instruction either. Therefore, we pop it out of the stack. This + // optimization makes the algorithm O(n). + while (!LHSCandidates.empty()) { + Instruction *LHS = LHSCandidates.back(); + if (DT->dominates(LHS, I)) { + Instruction *NewI = BinaryOperator::CreateAdd(LHS, RHS, "", I); NewI->takeName(I); return NewI; } + LHSCandidates.pop_back(); } return nullptr; } |

