summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis/ScalarEvolution.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [SCEV][NFC] Share value cache between SCEVs in GroupByComplexityMax Kazantsev2017-12-061-22/+26
| | | | | | | | | | | | | Current implementation of `compareSCEVComplexity` is being unreasonable with `SCEVUnknown`s: every time it sees one, it creates a new value cache and tries to prove equality of two values using it. This cache reallocates and gets lost from SCEV to SCEV. This patch changes this behavior: now we create one cache for all values and share it between SCEVs. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D40597 llvm-svn: 319880
* [SCEV] Use a "Discovered" set instead of a "Visited" set; NFCSanjoy Das2017-12-041-4/+3
| | | | | | Suggested by Max Kazantsev in https://reviews.llvm.org/D39361 llvm-svn: 319679
* [SCEV] A different fix for PR33494Sanjoy Das2017-12-041-29/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I don't think rL309080 is the right fix for PR33494 -- caching ExitLimit only hides the problem[0]. The real issue is that because of how we forget SCEV expressions ScalarEvolution::getBackedgeTakenInfo, in the test case for PR33494 computing the backedge for any loop invalidates the trip count for every other loop. This effectively makes the SCEV cache useless. I've instead made the SCEV expression invalidation in ScalarEvolution::getBackedgeTakenInfo less aggressive to fix this issue. [0]: One way to think about this is that rL309080 essentially augmented the backedge-taken-count cache with another equivalent exit-limit cache. The bug went away because we were explicitly not clearing the exit-limit cache in getBackedgeTakenInfo. But instead of doing all of that, we can just avoid clearing the backedge-taken-count cache. Reviewers: mkazantsev, mzolotukhin Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D39361 llvm-svn: 319678
* Mark all library options as hidden.Zachary Turner2017-12-011-4/+4
| | | | | | | | | | | | | | | | | These command line options are not intended for public use, and often don't even make sense in the context of a particular tool anyway. About 90% of them are already hidden, but when people add new options they forget to hide them, so if you were to make a brand new tool today, link against one of LLVM's libraries, and run tool -help you would get a bunch of junk that doesn't make sense for the tool you're writing. This patch hides these options. The real solution is to not have libraries defining command line options, but that's a much larger effort and not something I'm prepared to take on. Differential Revision: https://reviews.llvm.org/D40674 llvm-svn: 319505
* [SCEV][NFC] Remove condition that can never happen due to check few lines aboveMax Kazantsev2017-11-291-2/+0
| | | | llvm-svn: 319293
* [SCEV][NFC] More efficient caching in CompareValueComplexityMax Kazantsev2017-11-281-4/+4
| | | | | | | | | | | | | | Currently, we use a set of pairs to cache responces like `CompareValueComplexity(X, Y) == 0`. If we had proved that `CompareValueComplexity(S1, S2) == 0` and `CompareValueComplexity(S2, S3) == 0`, this cache does not allow us to prove that `CompareValueComplexity(S1, S3)` is also `0`. This patch replaces this set with `EquivalenceClasses` that merges Values into equivalence sets so that any two values from the same set are equal from point of `CompareValueComplexity`. This, in particular, allows us to prove the fact from example above. Differential Revision: https://reviews.llvm.org/D40429 llvm-svn: 319153
* [SCEV][NFC] More efficient caching in CompareSCEVComplexityMax Kazantsev2017-11-281-8/+9
| | | | | | | | | | | | | Currently, we use a set of pairs to cache responces like `CompareSCEVComplexity(X, Y) == 0`. If we had proved that `CompareSCEVComplexity(S1, S2) == 0` and `CompareSCEVComplexity(S2, S3) == 0`, this cache does not allow us to prove that `CompareSCEVComplexity(S1, S3)` is also `0`. This patch replaces this set with `EquivalenceClasses` any two values from the same set are equal from point of `CompareSCEVComplexity`. This, in particular, allows us to prove the fact from example above. Differential Revision: https://reviews.llvm.org/D40428 llvm-svn: 319149
* [SCEV] Adding a check on outgoing branches of a terminator instr for ↵Jatin Bhateja2017-11-261-10/+13
| | | | | | | | | | | | | | | | | | | | | | SCEVBackedgeConditionFolder, NFC. Summary: For a given loop, getLoopLatch returns a non-null value when a loop has only one latch block. In the modified context adding an assertion to check that both the outgoing branches of a terminator instruction (of latch) does not target same header. + few minor code reorganization. Reviewers: jbhateja Reviewed By: jbhateja Subscribers: sanjoy Differential Revision: https://reviews.llvm.org/D40460 llvm-svn: 318997
* [SCEV] NFC : Removing unnecessary check on outgoing branches of a branch instr.Jatin Bhateja2017-11-261-2/+1
| | | | | | | | | | | | | | | | | Summary: For a given loop, getLoopLatch returns a non-null value when a loop has only one latch block. In the modified context a check on both the outgoing branches of a terminator instruction (of latch) to same header is redundant. Reviewers: jbhateja Reviewed By: jbhateja Subscribers: sanjoy Differential Revision: https://reviews.llvm.org/D40460 llvm-svn: 318991
* [SCEV] Strengthen variance condition in calculateLoopDispositionMax Kazantsev2017-11-221-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Given loops `L1` and `L2` with AddRecs `AR1` and `AR2` varying in them respectively. When identifying loop disposition of `AR2` w.r.t. `L1`, we only say that it is varying if `L1` contains `L2`. But there is also a possible situation where `L1` and `L2` are consecutive sibling loops within the parent loop. In this case, `AR2` is also varying w.r.t. `L1`, but we don't correctly identify it. It can lead, for exaple, to attempt of incorrect folding. Consider: AR1 = {a,+,b}<L1> AR2 = {c,+,d}<L2> EXAR2 = sext(AR1) MUL = mul AR1, EXAR2 If we incorrectly assume that `EXAR2` is invariant w.r.t. `L1`, we can end up trying to construct something like: `{a * {c,+,d}<L2>,+,b * {c,+,d}<L2>}<L1>`, which is incorrect because `AR2` is not available on entrance of `L1`. Both situations "`L1` contains `L2`" and "`L1` preceeds sibling loop `L2`" can be handled with one check: "header of `L1` dominates header of `L2`". This patch replaces the old insufficient check with this one. Differential Revision: https://reviews.llvm.org/D39453 llvm-svn: 318819
* [SCEV] simplify loop. NFC.Javed Absar2017-11-161-2/+2
| | | | | | Change loop to range-based llvm-svn: 318401
* Fix clang -Wsometimes-uninitialized warning in SCEV codeReid Kleckner2017-11-131-1/+1
| | | | | | | I don't believe this was a problem in practice, as it's likely that the boolean wasn't checked unless the backend condition was non-null. llvm-svn: 318073
* [SCEV] Handling for ICmp occuring in the evolution chain.Jatin Bhateja2017-11-131-7/+87
| | | | | | | | | | | | | | | | | | | | Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: sanjoy, junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 318050
* [SCEV] Fix an assertion failure in the max backedge taken countSanjoy Das2017-10-251-11/+10
| | | | | | | | | | | | | | | | | Max backedge taken count is always expected to be a constant; and this is usually true by construction -- it is a SCEV expression with constant inputs. However, if the max backedge expression ends up being computed to be a udiv with a constant zero denominator[0], SCEV does not fold the result to a constant since there is no constant it can fold it to (SCEV has no representation for "infinity" or "undef"). However, in computeMaxBECountForLT we already know the denominator is positive, and thus at least 1; and we can use this fact to avoid dividing by zero. [0]: We can end up with a constant zero denominator if the signed range of the stride is more precise than the unsigned range. llvm-svn: 316615
* Add a comment to clarify a future changeSanjoy Das2017-10-251-2/+3
| | | | llvm-svn: 316614
* Revert "[ScalarEvolution] Handling for ICmp occuring in the evolution chain."Sanjoy Das2017-10-181-42/+3
| | | | | | | This reverts commit r316054. There was some confusion over the review process: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html llvm-svn: 316129
* [ScalarEvolution] Handling for ICmp occuring in the evolution chain.Jatin Bhateja2017-10-181-3/+42
| | | | | | | | | | | | | | | | | | | | | | Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. Currently scope of evaluation is limited to SCEV computation for PHI nodes. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 316054
* Revert "[SCEV] Maintain and use a loop->loop invalidation dependency"Sanjoy Das2017-10-171-49/+31
| | | | | | | | | This reverts commit r315713. It causes PR34968. I think I know what the problem is, but I don't think I'll have time to fix it this week. llvm-svn: 315962
* [SCEV] Rename getMaxBECount and update comments. NFCAnna Thomas2017-10-161-8/+8
| | | | | | Post commit review comments at D38825. llvm-svn: 315920
* Reverting r315590; it did not include changes for llvm-tblgen, which is ↵Aaron Ballman2017-10-151-1/+1
| | | | | | | | causing link errors for several people. Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1 llvm-svn: 315854
* [SCEV] Maintain and use a loop->loop invalidation dependencySanjoy Das2017-10-131-31/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change uses the loop use list added in the previous change to remember the loops that appear in the trip count expressions of other loops; and uses it in forgetLoop. This lets us not scan every loop in the function on a forgetLoop call. With this change we no longer invalidate clear out backedge taken counts on forgetValue. I think this is fine -- the contract is that SCEV users must call forgetLoop(L) if their change to the IR could have changed the trip count of L; solely calling forgetValue on a value feeding into the backedge condition of L is not enough. Moreover, I don't think we can strengthen forgetValue to be sufficient for invalidating trip counts without significantly re-architecting SCEV. For instance, if we have the loop: I = *Ptr; E = I + 10; do { // ... } while (++I != E); then the backedge taken count of the loop is 9, and it has no reference to either I or E, i.e. there is no way in SCEV today to re-discover the dependency of the loop's trip count on E or I. So a SCEV client cannot change E to (say) "I + 20", call forgetValue(E) and expect the loop's trip count to be updated. Reviewers: atrick, sunfish, mkazantsev Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38435 llvm-svn: 315713
* [SCEV] Teach SCEV to find maxBECount when loop endbound is variantAnna Thomas2017-10-131-34/+56
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch teaches SCEV to calculate the maxBECount when the end bound of the loop can vary. Note that we cannot calculate the exactBECount. This will only be done when both conditions are satisfied: 1. the loop termination condition is strictly LT. 2. the IV is proven to not overflow. This provides more information to users of SCEV and can be used to improve identification of finite loops. Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick Reviewed by: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38825 llvm-svn: 315683
* [SCEV] Maintain loop use lists, and use them in forgetLoopSanjoy Das2017-10-131-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently we do not correctly invalidate memoized results for add recurrences that were created directly (i.e. they were not created from a `Value`). This change fixes this by keeping loop use lists and using the loop use lists to determine which SCEV expressions to invalidate. Here are some statistics on the number of uses of in the use lists of all loops on a clang bootstrap (config: release, no asserts): Count: 731310 Min: 1 Mean: 8.555150 50th %time: 4 95th %tile: 25 99th %tile: 53 Max: 433 Reviewers: atrick, sunfish, mkazantsev Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38434 llvm-svn: 315672
* [dump] Remove NDEBUG from test to enable dump methods [NFC]Don Hinton2017-10-121-1/+1
| | | | | | | | | | | | | | | Summary: Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP. Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods. Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so it'll be picked up by public headers. Differential Revision: https://reviews.llvm.org/D38406 llvm-svn: 315590
* [SCEV] Properly handle the case of a non-constant start with a zero accum in ↵Daniel Neilson2017-10-111-2/+1
| | | | | | | | | | | | | | | | | | | | ScalarEvolution::createAddRecFromPHIWithCastsImpl Summary: This patch fixes an error in the patch to ScalarEvolution::createAddRecFromPHIWithCastsImpl made in D37265. In that patch we handle the cases where the either the start or accum values can be zero after truncation. But, we assume that the start value must be a constant if the accum is zero. This is clearly an erroneous assumption. This change removes that assumption. Reviewers: sanjoy, dorit, mkazantsev Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38814 llvm-svn: 315491
* Remove trailing whitespaces.Michael Liao2017-09-251-41/+41
| | | | llvm-svn: 314115
* [SCEV] Generalize folding of trunc(x)+n*trunc(y) into folding ↵Daniel Neilson2017-09-221-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | | m*trunc(x)+n*trunc(y) Summary: A SCEV such as: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> can be folded into, simply, {%v2,+,0}. However, the current code in ::getAddExpr() will not try to apply the simplification m*trunc(x)+n*trunc(y) -> trunc(trunc(m)*x+trunc(n)*y) because it only keys off having a non-multiplied trunc as the first term in the simplification. This patch generalizes this code to try to do a more generic fold of these trunc expressions. Reviewers: sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37888 llvm-svn: 313988
* [ScalarEvolution] Refactor forgetLoop() to improve performanceMarcello Maggioni2017-09-111-40/+45
| | | | | | | | | | | | | | | forgetLoop() has pretty bad performance because it goes over the same instructions over and over again in particular when nested loop are involved. The refactoring changes the function to a not-recursive function and reusing the allocation for data-structures and the Visited set. NFCI Differential Revision: https://reviews.llvm.org/D37659 llvm-svn: 312920
* [SCEV] Ensure ScalarEvolution::createAddRecFromPHIWithCastsImpl properly ↵Daniel Neilson2017-09-051-15/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | handles out of range truncations of the start and accum values Summary: When constructing the predicate P1 in ScalarEvolution::createAddRecFromPHIWithCastsImpl() it is possible for the PHISCEV from which the predicate is constructed to be a SCEVConstant instead of a SCEVAddRec. If this happens, then the cast<SCEVAddRec>(PHISCEV) in the code will assert. Such a PHISCEV is possible if either the start value or the accumulator value is a constant value that not equal to its truncated value, and if the truncated value is zero. This patch adds tests that demonstrate the cast<> assertion, and fixes this problem by checking whether the PHISCEV is a constant before constructing the P1 predicate; if it is, then P1 is equivalent to one of P2 or P3. Additionally, if we know that the start value or accumulator value are constants then we check whether the P2 and/or P3 predicates are known false at compile time; if either is, then we bail out of constructing the AddRec. Reviewers: sanjoy, mkazantsev, silviu.baranga Reviewed By: mkazantsev Subscribers: mkazantsev, llvm-commits Differential Revision: https://reviews.llvm.org/D37265 llvm-svn: 312568
* [SCEV] Add URem support to SCEVAlexandre Isoard2017-09-011-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort using that relation: %r --> (-%b * (%t /u %b)) + %t We implement two special cases: - if %b is 1, the result is always 0 - if %b is a power-of-two, we produce a zext/trunc based expression instead That is, the following code: %r = urem i32 %t, 65536 Produces: %r --> (zext i16 (trunc i32 %a to i16) to i32) Note that while this helps get a tighter bound on the range analysis and the known-bits analysis, this exposes some normalization shortcoming of SCEVs: %div = udim i32 %a, 65536 %mul = mul i32 %div, 65536 %rem = urem i32 %a, 65536 %add = add i32 %mul, %rem Will usually not be reduced. llvm-svn: 312329
* [Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-08-181-82/+105
| | | | | | other minor fixes (NFC). llvm-svn: 311212
* [SCEV] Preserve NSW information for sext(subtract).Amara Emerson2017-08-041-3/+29
| | | | | | | | | | Pushes the sext onto the operands of a Sub if NSW is present. Also adds support for propagating the nowrap flags of the llvm.ssub.with.overflow intrinsic during analysis. Differential Revision: https://reviews.llvm.org/D35256 llvm-svn: 310117
* [SCEV] Re-enable "Cache results of computeExitLimit"Max Kazantsev2017-08-031-2/+37
| | | | | | | | | | The patch rL309080 was reverted because it did not clean up the cache on "forgetValue" method call. This patch re-enables this change, adds the missing check and introduces two new unit tests that make sure that the cache is cleaned properly. Differential Revision: https://reviews.llvm.org/D36087 llvm-svn: 309925
* [SCEV/IndVars] Always compute loop exiting values if the backedge count is 0Sanjoy Das2017-08-011-0/+19
| | | | | | | | | If SCEV can prove that the backedge taken count for a loop is zero, it does not need to "understand" a recursive PHI to compute its exiting value. This should fix PR33885. llvm-svn: 309758
* [SCEV] Change an early exit to an assert; NFCSanjoy Das2017-07-291-2/+2
| | | | llvm-svn: 309480
* [SCEV] Do not visit nodes twice in containsConstantSomewhereMax Kazantsev2017-07-281-13/+20
| | | | | | | | | | This patch reworks the function that searches constants in Add and Mul SCEV expression chains so that now it does not visit a node more than once, and also renames this function for better correspondence between its implementation and semantics. Differential Revision: https://reviews.llvm.org/D35931 llvm-svn: 309367
* Revert "[SCEV] Cache results of computeExitLimit"Sanjoy Das2017-07-281-21/+0
| | | | | | | | | This reverts commit r309080. The patch needs to clear out the ScalarEvolution::ExitLimits cache in forgetMemoizedResults. I've replied on the commit thread for the patch with more details. llvm-svn: 309357
* [SCEV] Cache results of computeExitLimitMax Kazantsev2017-07-261-0/+21
| | | | | | | | | This patch adds a cache for computeExitLimit to save compilation time. A lot of examples of tests that take extensive time to compile are attached to the bug 33494. Differential Revision: https://reviews.llvm.org/D35827 llvm-svn: 309080
* [SCEV] Remove unnecessary call to forgetMemoizedResultsSanjoy Das2017-07-261-3/+0
| | | | | | | | | | | | | `SCEVUnknown::allUsesReplacedWith` does not need to call `forgetMemoizedResults` since RAUW does a value-equivalent replacement by assumption. If this assumption was false then the later setValPtr(New) call would be incorrect too. This is a non-trivial performance optimization for functions with a large number of loops since `forgetMemoizedResults` walks all loop backedge taken counts to see if any of them use the SCEVUnknown being RAUWed. However, this improvement is difficult to demonstrate without checking in an excessively large IR file. llvm-svn: 309072
* [SCEV] Limit max size of AddRecExpr during evolvingMax Kazantsev2017-07-231-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | When SCEV calculates product of two SCEVAddRecs from the same loop, it tries to combine them into one big AddRecExpr. If the sizes of the initial SCEVs were `S1` and `S2`, the size of their product is `S1 + S2 - 1`, and every operand of the resulting SCEV is combined from operands of initial SCEV and has much higher complexity than they have. As result, if we try to calculate something like: %x1 = {a,+,b} %x2 = mul i32 %x1, %x1 %x3 = mul i32 %x2, %x1 %x4 = mul i32 %x3, %x2 ... The size of such SCEVs grows as `2^N`, and the arguments become more and more complex as we go forth. This leads to long compilation and huge memory consumption. This patch sets a limit after which we don't try to combine two `SCEVAddRecExpr`s into one. By default, max allowed size of the resulting AddRecExpr is set to 16. Differential Revision: https://reviews.llvm.org/D35664 llvm-svn: 308847
* PSCEV] Create AddRec for Phis in cases of possible integer overflow,Dorit Nuzman2017-07-181-13/+372
| | | | | | | | | | | | | using runtime checks Extend the SCEVPredicateRewriter to work a bit harder when it encounters an UnknownSCEV for a Phi node; Try to build an AddRecurrence also for Phi nodes whose update chain involves casts that can be ignored under the proper runtime overflow test. This is one step towards addressing PR30654. Differential revision: http://reviews.llvm.org/D30041 llvm-svn: 308299
* [Constants] Replace calls to ConstantInt::equalsInt(0)/equalsInt(1) with ↵Craig Topper2017-07-061-3/+3
| | | | | | isZero and isOne. NFCI llvm-svn: 307293
* [Constants] If we already have a ConstantInt*, prefer to use ↵Craig Topper2017-07-061-7/+7
| | | | | | | | isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. llvm-svn: 307292
* [SCEV] Use depth limit instead of local cache for SExt and ZExtMax Kazantsev2017-06-301-124/+119
| | | | | | | | | | | | | | | | In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This approach only helps if we request the same SCEV multiple times throughout recursion. But in the bug PR33431 we see a case where we request different values all the time, so caching does not help and the size of the cache grows enormously. In this patch we remove the local cache for this methods and add the recursion depth limit instead, as we do for arithmetics. This gives us a guarantee that the invocation sequence is limited and reasonably short. Differential Revision: https://reviews.llvm.org/D34273 llvm-svn: 306785
* Reverting r306695 while investigating failing test case.Alexandre Isoard2017-06-291-26/+0
| | | | | | | Failing test case: Transforms/LoopVectorize.iv_outside_user.ll llvm-svn: 306723
* ScalarEvolution: Add URem supportAlexandre Isoard2017-06-291-0/+26
| | | | | | | | | | | | | | | | | | | | In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to: %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort this way. Note: While SRem and SDiv are also related this way, SCEV does not provides SDiv yet. llvm-svn: 306695
* [SCEV] Avoid copying ConstantRange just to get the min/max valueCraig Topper2017-06-241-67/+62
| | | | | | | | | | | | | | | | | | | | | Summary: This patch changes getRange to getRangeRef and returns a reference to the ConstantRange object stored inside the DenseMap caches. We then take advantage of that to add new helper methods that can return min/max value of a signed or unsigned ConstantRange using that reference without first copying the ConstantRange. getRangeRef calls itself recursively and I believe the reference return is fine for those calls. I've left getSignedRange and getUnsignedRange returning a ConstantRange object so they will make a copy now. This is to ensure safety since the reference will be invalidated if the DenseMap changes. I'm sure there are still more places that can take advantage of the reference and I'll submit future patches as I find them. Reviewers: sanjoy, davide Reviewed By: sanjoy Subscribers: zzheng, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D32978 llvm-svn: 306229
* [SCEV] Make MulOpsInlineThreshold lower to avoid excessive compilation timeMax Kazantsev2017-06-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | MulOpsInlineThreshold option of SCEV is defaulted to 1000, which is inadequately high. When constructing SCEVs of expressions like: x1 = a * a x2 = x1 * x1 x3 = x2 * x2 ... We actually have huge SCEVs with max allowed amount of operands inlined. Such expressions are easy to get from unrolling of loops looking like x = a for (i = 0; i < n; i++) x = x * x Or more tricky cases where big powers are involved. If some non-linear analysis tries to work with a SCEV that has 1000 operands, it may lead to excessively long compilation. The attached test does not pass within 1 minute with default threshold. This patch decreases its default value to 32, which looks much more reasonable if we use analyzes with complexity O(N^2) or O(N^3) working with SCEV. Differential Revision: https://reviews.llvm.org/D34397 llvm-svn: 305882
* [SCEV][NFC] Fix a misleading description of AddOpsInlineThresholdMax Kazantsev2017-06-201-1/+1
| | | | | | | | | The description of this option was copy-pasted from another one and does not correspond to reality. Differential Revision: https://reviews.llvm.org/D34390 llvm-svn: 305782
* [ScalarEvolution] Apply Depth limit to getMulExprMax Kazantsev2017-06-151-53/+76
| | | | | | | | | | | | | | | | | | | | | This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As result of this, a chain getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463
OpenPOWER on IntegriCloud