summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Utils
Commit message (Collapse)AuthorAgeFilesLines
* Fix LCSSA increased compile timeRong Xu2016-08-101-7/+7
| | | | | | | | | | | | | | | | | | | | | We are seeing r276077 drastically increasing compiler time for our larger benchmarks in PGO profile generation build (both clang based and IR based mode) -- it can be 20x slower than without the patch (like from 30 secs to 780 secs) The increased time are all in pass LCSSA. The problematic code is about PostProcessPHIs after use-rewrite. Note that the InsertedPhis from ssa_updater is accumulating (never been cleared). Since the inserted PHIs are added to the candidate for each rewrite, The earlier ones will be repeatedly added. Later when adding the new PHIs to the work-list, we don't check the duplication either. This can result in extremely long work-list that containing tons of duplicated PHIs. This patch fixes the issue by hoisting the code out of the loop. Differential Revision: http://reviews.llvm.org/D23344 llvm-svn: 278250
* [SimplifyLibCalls] Restore the old behaviour, emit a libcall.Davide Italiano2016-08-101-3/+5
| | | | | | | | Hal pointed out that the semantic of our intrinsic and the libc call are slightly different. Add a comment while I'm here to explain why we can't emit an intrinsic. Thanks Hal! llvm-svn: 278200
* [LoopSimplify] Rebuild LCSSA for the inner loop after separating nested loops.Michael Zolotukhin2016-08-091-32/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This hopefully fixes PR28825. The problem now was that a value from the original loop was used in a subloop, which became a sibling after separation. While a subloop doesn't need an lcssa phi node, a sibling does, and that's where we broke LCSSA. The most natural way to fix this now is to simply call formLCSSA on the original loop: it'll do what we've been doing before plus it'll cover situations described above. I think we don't need to run formLCSSARecursively here, and we have an assert to verify this (I've tried testing it on LLVM testsuite + SPECs). I'd be happy to be corrected here though. I also changed a run line in the test from '-lcssa -loop-unroll' to '-lcssa -loop-simplify -indvars', because it exercises LCSSA preservation to the same extent, but also makes less unrelated transformation on the CFG, which makes it easier to verify. Reviewers: chandlerc, sanjoy, silvas Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23288 llvm-svn: 278173
* Consistently use FunctionAnalysisManagerSean Silva2016-08-096-6/+6
| | | | | | | | | | | Besides a general consistently benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278077
* [LoopUnroll] Simplify loops created by unrolling.Michael Zolotukhin2016-08-081-0/+19
| | | | | | | | | | | | | | Summary: Currently loop-unrolling doesn't preserve loop-simplified form. This patch fixes it by resimplifying affected loops. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23148 llvm-svn: 278038
* [MemorySSA] Fix windows build breakage caused by r278028Geoff Berry2016-08-081-4/+4
| | | | | r278028: [MemorySSA] Ensure address stability of MemorySSA object. llvm-svn: 278035
* [MemorySSA] Ensure address stability of MemorySSA object.Geoff Berry2016-08-081-17/+13
| | | | | | | | | | | | | | | | | Summary: Ensure that the MemorySSA object never changes address when using the new pass manager since the walkers contained by MemorySSA cache pointers to it at construction time. This is achieved by wrapping the MemorySSAAnalysis result in a unique_ptr. Also add some asserts that check for this bug. Reviewers: george.burgess.iv, dberlin Subscribers: mcrosier, hfinkel, chandlerc, silvas, llvm-commits Differential Revision: https://reviews.llvm.org/D23171 llvm-svn: 278028
* Add some comments linking back to PR28400.Sean Silva2016-08-081-0/+2
| | | | | | Thanks to Mehdi for the suggestion! llvm-svn: 277984
* [PM] More workaround for PR28400Sean Silva2016-08-081-0/+2
| | | | llvm-svn: 277982
* [MSSA] Fix PR28880 by fixing use optimizer's lower bound tracking behavior.Daniel Berlin2016-08-081-4/+16
| | | | | | | | | | | | | | | | Summary: In the use optimizer, we need to keep of whether the lower bound still dominates us or else we may decide a lower bound is still valid when it is not due to intervening pushes/pops. Fixes PR28880 (and probably a bunch of other things). Reviewers: george.burgess.iv Subscribers: MatzeB, llvm-commits, sebpop Differential Revision: https://reviews.llvm.org/D23237 llvm-svn: 277978
* [JumpThreading] Fix handling of aliasing metadata.Eli Friedman2016-08-081-0/+11
| | | | | | | | | | | | | | | | | | Summary: The correctness fix here is that when we CSE a load with another load, we need to combine the metadata on the two loads. This matches the behavior of other passes, like instcombine and GVN. There's also a minor optimization improvement here: for load PRE, the aliasing metadata on the inserted load should be the same as the metadata on the original load. Not sure why the old code was throwing it away. Issue found by inspection. Differential Revision: http://reviews.llvm.org/D21460 llvm-svn: 277977
* [SimplifyLibCalls] Emit sqrt intrinsic instead of a libcall.Davide Italiano2016-08-081-2/+3
| | | | llvm-svn: 277972
* [SLC] Emit an intrinsic instead of a libcall for pow.Davide Italiano2016-08-071-9/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D22104 llvm-svn: 277963
* Revert "Revert "[LoopSimplify] Fix updating LCSSA after separating nested ↵Michael Zolotukhin2016-08-071-0/+15
| | | | | | | | | loops."" This reverts commit r277901. Reaaply the commit as it looks like it has nothing to do with the bots failures. llvm-svn: 277946
* Move helpers into anonymous namespaces. NFC.Benjamin Kramer2016-08-062-0/+4
| | | | llvm-svn: 277916
* Revert "[LoopSimplify] Fix updating LCSSA after separating nested loops."Michael Zolotukhin2016-08-061-15/+0
| | | | | | | This reverts commit r277877. Try to appease clang-x64-ninja-win7 buildbot. llvm-svn: 277901
* [MSSA] Use depth first iterator instead of custom version.Daniel Berlin2016-08-051-19/+3
| | | | | | | | | | | | | | | | | Summary: Originally the plan was to use the custom worklist to do some block popping, and because we don't actually need a visited set. The custom one we have here is slightly broken, and it's not worth fixing vs using depth_first_iterator since we aren't going to go the route we originally were. Fixes PR28874 Reviewers: george.burgess.iv Subscribers: llvm-commits, gberry Differential Revision: https://reviews.llvm.org/D23187 llvm-svn: 277880
* [LoopSimplify] Fix updating LCSSA after separating nested loops.Michael Zolotukhin2016-08-051-0/+15
| | | | | | | | | This fixes PR28825. The problem was that we only checked if a value from a created inner loop is used in the outer loop, and fixed LCSSA for them. But we missed to fixup LCSSA for values used in exits of the outer loop. llvm-svn: 277877
* [MSSA] Match assert vs llvm_unreachable style in verification functions.Daniel Berlin2016-08-051-11/+12
| | | | llvm-svn: 277873
* Rewrite domination verifier to handle local domination as well.Daniel Berlin2016-08-051-38/+19
| | | | | | | | | | | | | | Summary: Rewrite domination verifier to handle local domination as well. This catches a bug Geoff Berry noticed. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23184 llvm-svn: 277872
* [FlattenCFG] Simplify + remove unused variable. NFCI.Davide Italiano2016-08-051-7/+2
| | | | llvm-svn: 277864
* Do not assign new discriminator for all intrinsics.Dehao Chen2016-08-051-2/+2
| | | | | | | | | | | | Summary: We do not care about intrinsic calls when assigning discriminators. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23212 llvm-svn: 277843
* [SimplifyCFG] Make range reduction code deterministic.Benjamin Kramer2016-08-051-2/+3
| | | | | | | | | | | This generated IR based on the order of evaluation, which is different between GCC and Clang. With that in mind you get bootstrap miscompares if you compare a Clang built with GCC-built Clang vs. Clang built with Clang-built Clang. Diagnosing that made my head hurt. This also reverts commit r277337, which "fixed" the test case. llvm-svn: 277820
* Forgot the dyn_cast_or_null intended for r277691.David Majnemer2016-08-041-1/+1
| | | | llvm-svn: 277693
* Reinstate "[CloneFunction] Don't remove side effecting calls"David Majnemer2016-08-041-2/+33
| | | | | | | This reinstates r277611 + r277614 and reverts r277642. A cast_or_null should have been a dyn_cast_or_null. llvm-svn: 277691
* [MSSA] Fix a bug in MemorySSA's move ctor.George Burgess IV2016-08-031-0/+2
| | | | | | | Not a correctness issue, but it would be nice if we didn't have to recompute our block numbering (worst-case) every time we move MSSA. llvm-svn: 277652
* Revert "[CloneFunction] Don't remove side effecting calls"Reid Kleckner2016-08-031-33/+2
| | | | | | | | | This reverts commit r277611 and the followup r277614. Bootstrap builds and chromium builds are crashing during inlining after this change. llvm-svn: 277642
* [MSSA] clang-format. NFC.George Burgess IV2016-08-031-8/+4
| | | | | | | Didn't want to fold this in with r277640, since it touches bits that aren't entirely related to r277640. llvm-svn: 277641
* [MSSA] Add special handling for invariant/constant loads.George Burgess IV2016-08-031-0/+23
| | | | | | | This is a follow-up to r277637. It teaches MemorySSA that invariant loads (and loads of provably constant memory) are always liveOnEntry. llvm-svn: 277640
* [MSSA] Add logic for special handling of atomics/volatiles.George Burgess IV2016-08-031-0/+58
| | | | | | | | | | | | | | | This patch makes MemorySSA recognize atomic/volatile loads, and makes MSSA treat said loads specially. This allows us to be a bit more aggressive in some cases. Administrative note: Revision was LGTM'ed by reames in person. Additionally, this doesn't include the `invariant.load` recognition in the differential revision, because I feel it's better to commit that separately. Will commit soon. Differential Revision: https://reviews.llvm.org/D16875 llvm-svn: 277637
* [CloneFunction] Don't crash if the value map doesn't hold somethingDavid Majnemer2016-08-031-1/+1
| | | | | | | | | It is possible for the value map to not have an entry for some value that has already been removed. I don't have a testcase, this is fall-out from a buildbot. llvm-svn: 277614
* [CloneFunction] Don't remove side effecting callsDavid Majnemer2016-08-031-2/+33
| | | | | | | | | | | We were able to figure out that the result of a call is some constant. While propagating that fact, we added the constant to the value map. This is problematic because it results in us losing the call site when processing the value map. This fixes PR28802. llvm-svn: 277611
* [MSSA] Fix a caching bug.George Burgess IV2016-08-031-8/+8
| | | | | | | | | | | | | | | This fixes a bug where we'd sometimes cache overly-conservative results with our walker. This bug was made more obvious by r277480, which makes our cache far more spotty than it was. Test case is llvm-unit, because we're likely going to use CachingWalker only for def optimization in the future. The bug stems from that there was a place where the walker assumed that `DefNode.Last` was a valid target to cache to when failing to optimize phis. This is sometimes incorrect if we have a cache hit. The fix is to use the thing we *can* assume is a valid target to cache to. :) llvm-svn: 277559
* Support for lifetime begin/end markers in the MemorySSA use optimizerDaniel Berlin2016-08-031-1/+38
| | | | | | | | | | | | Summary: Depends on D23072 Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23076 llvm-svn: 277553
* Imported statistics types changesPiotr Padlewski2016-08-021-23/+25
| | | | | | | | | | Reviewers: tejohnson, eraman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22980 llvm-svn: 277534
* Move to having a single real instructionClobbersQueryDaniel Berlin2016-08-021-88/+94
| | | | | | | | | | | | Summary: We really want to move towards MemoryLocOrCall (or fix AA) everywhere, but for now, this lets us have a single instructionClobbersQuery. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23072 llvm-svn: 277530
* [LoopUnroll] Switch the default value of -unroll-runtime-epilog back to its ↵Michael Zolotukhin2016-08-021-1/+1
| | | | | | | | | | original value. As agreed in post-commit review of r265388, I'm switching the flag to its original value until the 90% runtime performance regression on SingleSource/Benchmarks/Stanford/Bubblesort is addressed. llvm-svn: 277524
* Fixes for post-commit review comments on r277480Daniel Berlin2016-08-021-12/+10
| | | | llvm-svn: 277510
* [LoopUnroll] Ensure we create prolog loops in simplified form.Michael Zolotukhin2016-08-021-0/+12
| | | | llvm-svn: 277502
* MSVC 2013 does not implement C++11 unions properly, so remove the anoymous ↵Daniel Berlin2016-08-021-2/+2
| | | | | | | | union for now, and leave a FIXME. llvm-svn: 277485
* Rewrite the use optimizer to be less memory intensive and 50% faster.Daniel Berlin2016-08-021-31/+311
| | | | | | | | | | | | | | | | | | | | | | | | Fixes PR28670 Summary: Rewrite the use optimizer to be less memory intensive and 50% faster. Fixes PR28670 The new use optimizer works like a standard SSA renaming pass, storing all possible versions a MemorySSA use could get in a stack, and just tracking indexes into the stack. This uses much less memory than caching N^2 alias query results. It's also a lot faster. The current version defers phi node walking to the normal walker. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23032 llvm-svn: 277480
* Minor code cleanups. NFC.Junmo Park2016-08-021-2/+2
| | | | llvm-svn: 277415
* CodeExtractor : Add ability to preserve profile data.Sean Silva2016-08-021-16/+110
| | | | | | | | | | | Added ability to estimate the entry count of the extracted function and the branch probabilities of the exit branches. Patch by River Riddle! Differential Revision: https://reviews.llvm.org/D22744 llvm-svn: 277411
* [SimplifyCFG] Fix nasty RAUW bug from r277325James Molloy2016-08-011-3/+4
| | | | | | | | | | | | | | | Using RAUW was wrong here; if we have a switch transform such as: 18 -> 6 then 6 -> 0 If we use RAUW, while performing the second transform the *transformed* 6 from the first will be also replaced, so we end up with: 18 -> 0 6 -> 0 Found by clang stage2 bootstrap; testcase added. llvm-svn: 277332
* [SimplifyCFG] Range reduce switchesJames Molloy2016-08-011-0/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a switch is sparse and all the cases (once sorted) are in arithmetic progression, we can extract the common factor out of the switch and create a dense switch. For example: switch (i) { case 5: ... case 9: ... case 13: ... case 17: ... } can become: if ( (i - 5) % 4 ) goto default; switch ((i - 5) / 4) { case 0: ... case 1: ... case 2: ... case 3: ... } or even better: switch ( ROTR(i - 5, 2) { case 0: ... case 1: ... case 2: ... case 3: ... } The division and remainder operations could be costly so we only do this if the factor is a power of two, and emit a right-rotate instead of a divide/remainder sequence. Dense switches can be lowered significantly better than sparse switches and can even be transformed into lookup tables. llvm-svn: 277325
* Revert r277313 and r277314.Sean Silva2016-08-011-110/+16
| | | | | | | | | | | | | | | They seem to trigger an LSan failure: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/15140/steps/check-llvm%20asan/logs/stdio Revert "Add the tests for r277313" This reverts commit r277314. Revert "CodeExtractor : Add ability to preserve profile data." This reverts commit r277313. llvm-svn: 277317
* Fix - CodeExtractor : Inherit Target Dependent Attributes from the parent ↵Sean Silva2016-08-011-1/+16
| | | | | | | | | | | | | | | | | | | function. When extracting a set of blocks make sure to inherit all of the target dependent attributes to make sure that the function will be valid for lowering. One example is the "target-features" attribute for x86, if the extracted region has functionality that relies on a specific feature it will fail to be lowered. This also allows for extracted functions to be valid for inlining, at least back into the parent function, as the target attributes are tested when inlining for compatibility. Patch by River Riddle! Differential Revision: https://reviews.llvm.org/D22713 llvm-svn: 277315
* CodeExtractor : Add ability to preserve profile data.Sean Silva2016-08-011-16/+110
| | | | | | | | | | | Added ability to estimate the entry count of the extracted function and the branch probabilities of the exit branches. Patch by River Riddle! Differential Revision: https://reviews.llvm.org/D22744 llvm-svn: 277313
* Fix the MemorySSA updating API to enable people to create memory accesses ↵Daniel Berlin2016-07-311-4/+6
| | | | | | before removing old ones llvm-svn: 277309
* [LoopUnroll] Include hotness of region in opt remarkAdam Nemet2016-07-292-12/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter is added to the common function analysis passes that loop passes depend on. The BFI and indirectly BPI used in this pass is computed lazily so no overhead should be observed unless -pass-remarks-with-hotness is used. This is how the patch affects the O3 pipeline: Dominator Tree Construction Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Rotate Loops Loop Invariant Code Motion Unswitch loops Simplify the CFG Dominator Tree Construction Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Combine redundant instructions Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Induction Variable Simplification Recognize loop idioms Delete dead loops Unroll loops ... llvm-svn: 277203
OpenPOWER on IntegriCloud