path: root/llvm/lib/Transforms/Utils
Commit message (Author, Date, Files, Lines)
...
* [SimplifyCFG] Fix bootstrap failure after r280220 (James Molloy, 2016-08-31, 1 file, -5/+21)
  We check that a sinking candidate is used by only one PHI node during our
  legality checks. However, for instructions that are used by other sinking
  candidates our heuristic is less conservative. This can result in a
  candidate actually being illegal when we come to sink it, because of how we
  sunk a predecessor. Re-do the used-by-only-one-PHI checks during sinking to
  ensure we don't crash.
  llvm-svn: 280228
* [SimplifyCFG] Add a workaround to fix PR30188 (James Molloy, 2016-08-31, 1 file, -0/+10)
  We're sinking stores, which is a good thing, but in the process creating
  selects for the store address operand, which SROA/Mem2Reg can't look
  through. This caused serious regressions. The real fix is in SROA, which
  I'll be looking into.
  llvm-svn: 280219
* [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases (James Molloy, 2016-08-31, 1 file, -6/+20)
  A very important case is not handled here: multiple arcs to a single block
  with a PHI. Consider:

    a:
      %1 = icmp %b, 1
      br %1, label %c, label %e
    c:
      %2 = icmp %b, 2
      br %2, label %d, label %e
    d:
      br %e
    e:
      phi [0, %a], [1, %c], [2, %d]

  FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't
  know how to deal with two arcs to a common destination with different PHI
  values. The answer is obvious - just split all conflicting arcs.
  llvm-svn: 280218
* [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches (James Molloy, 2016-08-31, 1 file, -27/+89)
  This was a real restriction in the original version of SinkIfThenCodeToEnd.
  Now that it's been rewritten, the restriction can be lifted. As part of
  this, we handle a very common and useful case where one of the incoming
  branches is actually conditional. Consider:

    if (a)
      x(1);
    else if (b)
      x(2);

  This produces the following CFG:

          [if]
         /    \
    [x(1)]    [if]
       |       | \
       |       |  \
       |   [x(2)]  |
        \      |  /
         [ end ]

  [end] has two unconditional predecessor arcs and one conditional. The
  conditional refers to the implicit empty 'else' arc. This same pattern can
  also be caused by an empty default block in a switch.

  We can't sink the call to x() down to end because no call to x() happens on
  the third incoming arc (assume that x() has side effects for the sake of
  argument; if something is safe to speculate we could indeed sink it anyway,
  but that cannot happen in the general case and causes many extra selects).

  We are now able to detect this case and split off the unconditional arcs to
  a common successor:

          [if]
         /    \
    [x(1)]    [if]
       |       | \
       |       |  \
       |   [x(2)]  |
        \     /    |
     [sink.split]  |
           \      /
           [ end ]

  Now we can sink the call to x() into %sink.split. This can cause
  significant code simplification in many testcases.
  llvm-svn: 280217
* [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd (James Molloy, 2016-08-31, 1 file, -86/+127)
  r279460 rewrote this function to be able to handle more than two incoming
  edges and took pains to ensure this didn't regress anything.

  This time we change the logic for determining if an instruction should be
  sunk. Previously we used a single-pass greedy algorithm - sink instructions
  until one requires more than one PHI node or we run out of instructions to
  sink. This had the problem that sinking instructions with non-identical but
  trivially equivalent operands needed extra logic in order to be sunk
  aggressively. For example:

    %a = load i32* %b
    %d = load i32* %b
    %c = gep i32* %a, i32 0
    %e = gep i32* %d, i32 1

  Sinking %c and %e would naively require two PHI merges as %a != %d. But the
  loads are obviously equivalent (and maybe can't be hoisted because there is
  no common predecessor). This is why we implemented the fairly complex
  function areValuesTriviallySame(), to look through trivial differences like
  this. However, it's just not clever enough.

  Instead, throw areValuesTriviallySame away, use pointer equality to check
  equivalence of operands, and switch to a two-stage algorithm. In the "scan"
  stage, we look at every sinkable instruction in isolation, from the end of
  the block to the front. If it's sinkable, we keep track of all operands
  that required PHI merging. In the "sink" stage, we iteratively sink the
  last non-terminator in the source blocks. But when calculating how many
  PHIs are actually required to be inserted (to work out if we should stop or
  not) we remove any values that have already been sunk from the set of PHI
  merges required, which allows us to be more aggressive.

  This turns an algorithm with potentially recursive lookahead (looking
  through GEPs, casts, loads and any other instruction potentially not CSE'd)
  into two linear scans.
  llvm-svn: 280216
* [SimplifyCFG] Tail-merge calls with side effects (James Molloy, 2016-08-31, 1 file, -7/+3)
  This was deliberately disabled during my rewrite of SinkIfThenToEnd to keep
  behaviour at least vaguely consistent with the previous version and keep it
  as close to NFC as I could. There's no real reason not to merge
  side-effecting calls though, so let's do it! A small fixup along the way
  ensures we don't create indirect calls.

  Should fix PR28964.
  llvm-svn: 280215
* [SimplifyCFG] Properly CSE metadata in SinkThenElseCodeToEnd (James Molloy, 2016-08-30, 1 file, -0/+5)
  This was missing, meaning the metadata on sunk instructions was potentially
  bogus and could cause miscompiles.
  llvm-svn: 280072
* ASan: remove variable only used in assertions build (Tim Northover, 2016-08-29, 1 file, -2/+1)
  llvm-svn: 279990
* [asan] Separate calculation of ShadowBytes from calculating ASanStackFrameLayout (Vitaly Buka, 2016-08-29, 1 file, -29/+51)
  Summary: No functional changes, just refactoring to make D23947 simpler.

  Reviewers: eugenis
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D23954
  llvm-svn: 279982
* [SimplifyCFG] Hoisting invalidates metadata (David Majnemer, 2016-08-29, 1 file, -2/+8)
  We forgot to remove optimization metadata when performing hoisting during
  FoldTwoEntryPHINode.

  This fixes PR29163.
  llvm-svn: 279980
* [LoopUnroll] Use OptimizationRemarkEmitter directly not via the analysis pass (Adam Nemet, 2016-08-26, 1 file, -4/+0)
  We can't mark ORE (a function pass) preserved as required by the loop
  passes, because that is how we ensure that the required passes like LazyBFI
  are all available any time ORE is used. See the new comments in the patch.

  Instead we use it directly, just like the inliner does in D22694.

  As expected there is some additional overhead after removing the caching
  provided by analysis passes. The worst case I measured was
  LNT/CINT2006_ref/401.bzip2, which regresses by 12%. As before, this only
  affects -Rpass-with-hotness and not default compilation.
  llvm-svn: 279829
* [UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded (Wei Mi, 2016-08-25, 1 file, -5/+6)
  This applies when unrolling a runtime-iteration loop. In
  llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner
  loop inside a loop nest, the scalar evolution needs to be dropped for its
  parent loop, which is done by ScalarEvolution::forgetLoop. However, we can
  postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so that
  TripCountSC expansion can still reuse existing values.

  Differential Revision: https://reviews.llvm.org/D23572
  llvm-svn: 279748
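  To make the ordering concrete, here is a minimal sketch, assuming stand-in
  names (TripCountSC, PreHeaderBR, ParentLoop) and the 2016-era SCEVExpander
  API; it is not the actual UnrollRuntimeLoopRemainder code:

    #include "llvm/Analysis/LoopInfo.h"
    #include "llvm/Analysis/ScalarEvolution.h"
    #include "llvm/Analysis/ScalarEvolutionExpander.h"
    #include "llvm/IR/DataLayout.h"

    using namespace llvm;

    static Value *expandThenForget(ScalarEvolution &SE, const SCEV *TripCountSC,
                                   Type *CountTy, Instruction *PreHeaderBR,
                                   Loop *ParentLoop, const DataLayout &DL) {
      // Expand first, while SE still has cached values for the parent loop
      // that the expander can reuse.
      SCEVExpander Expander(SE, DL, "loop-unroll");
      Value *TripCount = Expander.expandCodeFor(TripCountSC, CountTy, PreHeaderBR);
      // Only afterwards drop the parent loop's cached scalar-evolution info.
      if (ParentLoop)
        SE.forgetLoop(ParentLoop);
      return TripCount;
    }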
* [MemorySSA] Remove unused field. NFC. (George Burgess IV, 2016-08-22, 1 file, -6/+1)
  Given that we're not currently using blocker info, and it is unclear
  whether we will end up using it, don't waste 8 (or 4) bytes of memory per
  path node.
  llvm-svn: 279493
* MSSA: Factor out phi node placement (Daniel Berlin, 2016-08-22, 1 file, -17/+22)
  llvm-svn: 279462
* MSSA: Only rename accesses whose defining access is nullptr (Daniel Berlin, 2016-08-22, 1 file, -14/+6)
  llvm-svn: 279461
* [SimplifyCFG] Rewrite SinkThenElseCodeToEnd (James Molloy, 2016-08-22, 1 file, -150/+236)
  [Recommitting now that an unrelated assertion in SROA is sorted out]

  The new version has several advantages:
    1) IMSHO it's more readable and neater
    2) It handles loads and stores properly
    3) It can handle any number of incoming blocks rather than just two.
       I'll be taking advantage of this in a followup patch.

  With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 3, i32 4
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

  When this works for switches it'll be even more powerful.

  Round 4. This time we should handle all instructions correctly, and not
  replace any operands that need to be constant with variables. This was
  really hard to determine safely, so the helper function should be put into
  the Instruction API. I'll do that as a followup.
  llvm-svn: 279460
* Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"James Molloy2016-08-221-236/+150
| | | | | | This reverts commit r279443. It caused buildbot failures. llvm-svn: 279447
* [SimplifyCFG] Rewrite SinkThenElseCodeToEnd (James Molloy, 2016-08-22, 1 file, -150/+236)
  The new version has several advantages:
    1) IMSHO it's more readable and neater
    2) It handles loads and stores properly
    3) It can handle any number of incoming blocks rather than just two.
       I'll be taking advantage of this in a followup patch.

  With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 3, i32 4
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

  When this works for switches it'll be even more powerful.

  Round 4. This time we should handle all instructions correctly, and not
  replace any operands that need to be constant with variables. This was
  really hard to determine safely, so the helper function should be put into
  the Instruction API. I'll do that as a followup.
  llvm-svn: 279443
* [asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout (Vitaly Buka, 2016-08-20, 1 file, -3/+10)
  Summary: We are going to combine poisoning of red zones and scope
  poisoning.

  PR27453

  Reviewers: kcc, eugenis
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D23623
  llvm-svn: 279373
* Revert "[asan] Add support of lifetime poisoning into ↵Vitaly Buka2016-08-191-10/+3
| | | | | | | | | | ComputeASanStackFrameLayout" This reverts commit r279020. Speculative revert in hope to fix asan test on arm. llvm-svn: 279332
* Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"Reid Kleckner2016-08-191-210/+150
| | | | | | | This reverts commit r279229. It breaks intrinsic function calls in diamonds. llvm-svn: 279313
* [CloneFunction] Don't remove unrelated nodes from the CGSCC (David Majnemer, 2016-08-19, 1 file, -0/+6)
  CGSCC uses a WeakVH to track call sites. RAUWing a call within a function
  can result in that WeakVH getting confused about whether or not the call
  site is still around.
  llvm-svn: 279268
* [SimplifyCFG] Rewrite SinkThenElseCodeToEnd (James Molloy, 2016-08-19, 1 file, -150/+210)
  The new version has several advantages:
    1) IMSHO it's more readable and neater
    2) It handles loads and stores properly
    3) It can handle any number of incoming blocks rather than just two.
       I'll be taking advantage of this in a followup patch.

  With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 3, i32 4
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

  When this works for switches it'll be even more powerful.
  llvm-svn: 279229
* [asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout (Vitaly Buka, 2016-08-18, 1 file, -3/+10)
  Summary: We are going to combine poisoning of red zones and scope
  poisoning.

  PR27453

  Reviewers: kcc, eugenis
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D23623
  llvm-svn: 279020
* [PM] Port the always inliner to the new pass manager (Chandler Carruth, 2016-08-17, 1 file, -1/+1)
  This ports the always inliner in a much more minimal and boring form than
  the old pass manager's version.

  This pass does the very minimal amount of work necessary to inline
  functions declared as always-inline. It doesn't support a wide array of
  things that the legacy pass manager did support, but is also ... about 20
  lines of code. So it has that going for it.

  Notably, things this doesn't support:
    - Array alloca merging
      - To support the above, bottom-up inlining with careful history
        tracking and call graph updates
    - DCE of the functions that become dead after this inlining.
    - Inlining through call instructions with the always_inline attribute.
      Instead, it focuses on inlining functions with that attribute.

  The first I've omitted because I'm hoping to just turn it off for the
  primary pass manager. If that doesn't pan out, I can add it here but it
  will be reasonably expensive to do so.

  The second should really be handled by running global-dce after the
  inliner. I don't want to re-implement the non-trivial logic necessary to do
  comdat-correct DCE of functions. This means the -O0 pipeline will have to
  be at least 'always-inline,global-dce', but that seems reasonable to me. If
  others are seriously worried about this I'd like to hear about it and
  understand why. Again, this is all solvable by factoring that logic into a
  utility and calling it here, but I'd like to wait to do that until there is
  a clear reason why the existing pass-based factoring won't work.

  The final point is a serious one. I can fairly easily add support for this,
  but it seems both costly and a confusing construct for the use case of the
  always inliner running at -O0. This attribute can of course still impact
  the normal inliner easily (although I find that a questionable re-use of
  the same attribute). I've started a discussion to sort out what semantics
  we want here and based on that can figure out if it makes sense to have
  this complexity at O0 or not.

  One other advantage of this design is that it should be quite a bit faster,
  due to checking whether the function is a viable candidate for inlining
  exactly once per function instead of doing it for each call site.

  Anyways, hopefully a reasonable starting point for this pass.

  Differential Revision: https://reviews.llvm.org/D23299
  llvm-svn: 278896
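  For flavor, a hedged sketch of that per-function shape, written against the
  2016-era CallSite/InlineFunction APIs; this is a simplification, not the
  actual AlwaysInlinerPass source:

    #include "llvm/ADT/SmallVector.h"
    #include "llvm/IR/CallSite.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Transforms/Utils/Cloning.h"

    using namespace llvm;

    static bool inlineAlwaysInlineFunctions(Module &M) {
      bool Changed = false;
      for (Function &F : M) {
        // Viability is checked once per function, not once per call site.
        if (F.isDeclaration() || !F.hasFnAttribute(Attribute::AlwaysInline))
          continue;
        // Collect the calls first; inlining mutates F's use list.
        SmallVector<CallSite, 16> Calls;
        for (User *U : F.users())
          if (auto CS = CallSite(U))
            if (CS.getCalledFunction() == &F)
              Calls.push_back(CS);
        InlineFunctionInfo IFI;
        for (CallSite CS : Calls)
          Changed |= InlineFunction(CS, IFI);
      }
      return Changed;
    }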
* SimplifyCFG: Avoid dereferencing end() (Duncan P. N. Exon Smith, 2016-08-16, 1 file, -1/+4)
  When comparing a User* to a BasicBlock::iterator in
  passingValueIsAlwaysUndefined, don't dereference the iterator in case it is
  end().
  llvm-svn: 278872
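  The hazard is easy to reproduce; a tiny sketch with assumed names, not the
  actual SimplifyCFG code:

    #include "llvm/IR/BasicBlock.h"

    using namespace llvm;

    // Compare a User* against the instruction an iterator points at.
    // 'U == &*I' alone would dereference end() when I == BB.end().
    static bool userIsAt(User *U, BasicBlock::iterator I, BasicBlock &BB) {
      return I != BB.end() && U == &*I;
    }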
* Preserve the assumption cache more often (David Majnemer, 2016-08-16, 1 file, -12/+22)
  We were clearing it out in LoopUnswitch and InlineFunction instead of
  attempting to preserve it.
  llvm-svn: 278860
* [LoopUnroll] Don't clear out the AssumptionCache on each loop (David Majnemer, 2016-08-16, 1 file, -6/+8)
  Clearing out the AssumptionCache can cause us to rescan the entire function
  for assumes. If there are many loops, then we are scanning over the entire
  function many times.

  Instead of clearing out the AssumptionCache, register all cloned assumes.
  llvm-svn: 278854
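  A hedged sketch of the replacement pattern, assuming the cloning code keeps
  a list of duplicated llvm.assume calls (the 'NewAssumes' name is made up):

    #include "llvm/ADT/ArrayRef.h"
    #include "llvm/Analysis/AssumptionCache.h"
    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    static void registerClonedAssumptions(AssumptionCache &AC,
                                          ArrayRef<CallInst *> NewAssumes) {
      // AC.clear() would force a full rescan of the function on the next
      // query; registering just the cloned assumes avoids that.
      for (CallInst *CI : NewAssumes)
        AC.registerAssumption(CI);
    }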
* Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"Reid Kleckner2016-08-151-207/+150
| | | | | | | | | This reverts commit r278660. It causes downstream assertion failure in InstCombine on shuffle instructions. Comes up in __mm_swizzle_epi32. llvm-svn: 278672
* [SimplifyCFG] Rewrite SinkThenElseCodeToEnd (James Molloy, 2016-08-15, 1 file, -150/+207)
  The new version has several advantages:
    1) IMSHO it's more readable and neater
    2) It handles loads and stores properly
    3) It can handle any number of incoming blocks rather than just two.
       I'll be taking advantage of this in a followup patch.

  With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 3, i32 4
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

  When this works for switches it'll be even more powerful.
  llvm-svn: 278660
* [Inliner] Don't treat inalloca allocas as static (Reid Kleckner, 2016-08-12, 1 file, -3/+10)
  They aren't static, and moving them to the entry block across something
  else will only result in tears. Root cause of http://crbug.com/636558.
  llvm-svn: 278571
* Fixed typo. (David L Kreitzer, 2016-08-12, 1 file, -1/+1)
  llvm-svn: 278565
* [PM] Port LowerInvoke to the new pass manager (Michael Kuperstein, 2016-08-12, 2 files, -18/+33)
  llvm-svn: 278531
* [PM] Port NameAnonFunction pass to new pass manager (Teresa Johnson, 2016-08-12, 2 files, -8/+20)
  Summary: Port the NameAnonFunction pass and add a test.

  Depends on D23439.

  Reviewers: mehdi_amini
  Subscribers: llvm-commits, mehdi_amini
  Differential Revision: https://reviews.llvm.org/D23440
  llvm-svn: 278509
* Use the range variant of remove_if instead of unpacking begin/end (David Majnemer, 2016-08-12, 1 file, -4/+4)
  No functionality change is intended.
  llvm-svn: 278475
* Use the range variant of find/find_if instead of unpacking begin/end (David Majnemer, 2016-08-12, 2 files, -4/+3)
  If the result of the find is only used to compare against end(), just use
  is_contained instead.

  No functionality change is intended.
  llvm-svn: 278469
* Use the range variant of find instead of unpacking begin/end (David Majnemer, 2016-08-11, 3 files, -3/+3)
  If the result of the find is only used to compare against end(), just use
  is_contained instead.

  No functionality change is intended.
  llvm-svn: 278433
* [MSSA] Use is_contained (Daniel Berlin, 2016-08-11, 1 file, -1/+1)
  llvm-svn: 278418
* Use range algorithms instead of unpacking begin/end (David Majnemer, 2016-08-11, 6 files, -7/+8)
  No functionality change is intended.
  llvm-svn: 278417
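  The five entries above apply the same mechanical change; a small
  illustration of the range helpers from llvm/ADT/STLExtras.h on a toy
  container (not code from the commits themselves):

    #include "llvm/ADT/STLExtras.h"
    #include "llvm/ADT/SmallVector.h"

    using namespace llvm;

    static void rangeVariants(SmallVectorImpl<int> &Vals) {
      // Was: std::find(Vals.begin(), Vals.end(), 42) != Vals.end()
      bool HasAnswer = is_contained(Vals, 42);
      (void)HasAnswer;
      // Was: std::find_if(Vals.begin(), Vals.end(), Pred)
      auto FirstPos = find_if(Vals, [](int V) { return V > 0; });
      (void)FirstPos;
      // Was: Vals.erase(std::remove_if(Vals.begin(), Vals.end(), Pred),
      //                 Vals.end())
      Vals.erase(remove_if(Vals, [](int V) { return V < 0; }), Vals.end());
    }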
* Fix LCSSA increased compile time (Rong Xu, 2016-08-10, 1 file, -7/+7)
  We are seeing r276077 drastically increase compile time for our larger
  benchmarks in PGO profile generation builds (both clang-based and IR-based
  mode) -- it can be 20x slower than without the patch (like from 30 secs to
  780 secs).

  The increased time is all in the LCSSA pass. The problematic code is the
  PostProcessPHIs handling after use-rewrite. Note that InsertedPhis from the
  ssa_updater accumulates (it is never cleared). Since the inserted PHIs are
  added to the candidate set for each rewrite, the earlier ones are
  repeatedly added. Later, when adding the new PHIs to the work-list, we
  don't check for duplicates either. This can result in an extremely long
  work-list containing tons of duplicated PHIs.

  This patch fixes the issue by hoisting the code out of the loop.

  Differential Revision: http://reviews.llvm.org/D23344
  llvm-svn: 278250
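  A generic sketch of the deduplication half of the fix (placeholder names,
  not the actual LCSSA.cpp code): track what has already been queued so
  repeated rewrites cannot grow the work-list quadratically.

    #include "llvm/ADT/ArrayRef.h"
    #include "llvm/ADT/SmallPtrSet.h"
    #include "llvm/ADT/SmallVector.h"
    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    static void enqueueUnique(ArrayRef<PHINode *> InsertedPHIs,
                              SmallVectorImpl<PHINode *> &Worklist,
                              SmallPtrSetImpl<PHINode *> &Queued) {
      // Skip PHIs we have already queued on an earlier rewrite.
      for (PHINode *PN : InsertedPHIs)
        if (Queued.insert(PN).second)
          Worklist.push_back(PN);
    }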
* [SimplifyLibCalls] Restore the old behaviour, emit a libcall. (Davide Italiano, 2016-08-10, 1 file, -3/+5)
  Hal pointed out that the semantics of our intrinsic and the libc call are
  slightly different. Add a comment while I'm here to explain why we can't
  emit an intrinsic. Thanks Hal!
  llvm-svn: 278200
* [LoopSimplify] Rebuild LCSSA for the inner loop after separating nested loops. (Michael Zolotukhin, 2016-08-09, 1 file, -32/+4)
  Summary: This hopefully fixes PR28825. The problem was that a value from
  the original loop was used in a subloop, which became a sibling after
  separation. While a subloop doesn't need an LCSSA phi node, a sibling does,
  and that's where we broke LCSSA. The most natural way to fix this now is to
  simply call formLCSSA on the original loop: it'll do what we've been doing
  before, plus it'll cover situations like the one described above.

  I think we don't need to run formLCSSARecursively here, and we have an
  assert to verify this (I've tried testing it on the LLVM test suite +
  SPECs). I'd be happy to be corrected here though.

  I also changed a run line in the test from '-lcssa -loop-unroll' to
  '-lcssa -loop-simplify -indvars', because it exercises LCSSA preservation
  to the same extent, but makes fewer unrelated transformations on the CFG,
  which makes it easier to verify.

  Reviewers: chandlerc, sanjoy, silvas
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D23288
  llvm-svn: 278173
* Consistently use FunctionAnalysisManager (Sean Silva, 2016-08-09, 6 files, -6/+6)
  Besides a general consistency benefit, the extra layer of indirection
  allows the mechanical part of https://reviews.llvm.org/D23256 that requires
  touching every transformation and analysis to be factored out cleanly.

  Thanks to David for the suggestion.
  llvm-svn: 278077
* [LoopUnroll] Simplify loops created by unrolling. (Michael Zolotukhin, 2016-08-08, 1 file, -0/+19)
  Summary: Currently loop unrolling doesn't preserve loop-simplified form.
  This patch fixes that by resimplifying the affected loops.

  Reviewers: chandlerc, sanjoy, hfinkel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D23148
  llvm-svn: 278038
* [MemorySSA] Fix windows build breakage caused by r278028 (Geoff Berry, 2016-08-08, 1 file, -4/+4)
  r278028: [MemorySSA] Ensure address stability of MemorySSA object.
  llvm-svn: 278035
* [MemorySSA] Ensure address stability of MemorySSA object. (Geoff Berry, 2016-08-08, 1 file, -17/+13)
  Summary: Ensure that the MemorySSA object never changes address when using
  the new pass manager, since the walkers contained by MemorySSA cache
  pointers to it at construction time. This is achieved by wrapping the
  MemorySSAAnalysis result in a unique_ptr. Also add some asserts that check
  for this bug.

  Reviewers: george.burgess.iv, dberlin
  Subscribers: mcrosier, hfinkel, chandlerc, silvas, llvm-commits
  Differential Revision: https://reviews.llvm.org/D23171
  llvm-svn: 278028
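  The address-stability trick generalizes; a library-agnostic sketch with
  assumed names, not the actual MemorySSAAnalysis code:

    #include <memory>
    #include <utility>

    struct MemorySSA {}; // stand-in for llvm::MemorySSA

    // The analysis result owns MemorySSA behind a unique_ptr. Moving the
    // Result (as a pass manager may do when storing it) moves only the
    // pointer, so the MemorySSA object never changes address and any cached
    // back-pointers into it stay valid.
    class Result {
      std::unique_ptr<MemorySSA> MSSA;

    public:
      explicit Result(std::unique_ptr<MemorySSA> M) : MSSA(std::move(M)) {}
      Result(Result &&) = default;
      Result &operator=(Result &&) = default;
      MemorySSA &getMSSA() { return *MSSA; }
    };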
* Add some comments linking back to PR28400. (Sean Silva, 2016-08-08, 1 file, -0/+2)
  Thanks to Mehdi for the suggestion!
  llvm-svn: 277984
* [PM] More workaround for PR28400 (Sean Silva, 2016-08-08, 1 file, -0/+2)
  llvm-svn: 277982
* [MSSA] Fix PR28880 by fixing the use optimizer's lower-bound tracking behavior. (Daniel Berlin, 2016-08-08, 1 file, -4/+16)
  Summary: In the use optimizer, we need to keep track of whether the lower
  bound still dominates us, or else we may decide a lower bound is still
  valid when it is not, due to intervening pushes/pops.

  Fixes PR28880 (and probably a bunch of other things).

  Reviewers: george.burgess.iv
  Subscribers: MatzeB, llvm-commits, sebpop
  Differential Revision: https://reviews.llvm.org/D23237
  llvm-svn: 277978
* [JumpThreading] Fix handling of aliasing metadata. (Eli Friedman, 2016-08-08, 1 file, -0/+11)
  Summary: The correctness fix here is that when we CSE a load with another
  load, we need to combine the metadata on the two loads. This matches the
  behavior of other passes, like instcombine and GVN.

  There's also a minor optimization improvement here: for load PRE, the
  aliasing metadata on the inserted load should be the same as the metadata
  on the original load. Not sure why the old code was throwing it away.

  Issue found by inspection.

  Differential Revision: http://reviews.llvm.org/D21460
  llvm-svn: 277977
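  As a hedged sketch, the CSE rule could look like this with the
  combineMetadata helper from Transforms/Utils/Local.h; the set of metadata
  kinds shown is illustrative, not the exact list from the patch:

    #include "llvm/IR/Instructions.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/Transforms/Utils/Local.h"

    using namespace llvm;

    static void cseLoads(LoadInst *Kept, LoadInst *Removed) {
      // Merge/intersect metadata so the surviving load doesn't claim
      // guarantees (tbaa, range, ...) that only one of the loads had.
      unsigned KnownIDs[] = {
          LLVMContext::MD_tbaa,    LLVMContext::MD_alias_scope,
          LLVMContext::MD_noalias, LLVMContext::MD_range,
          LLVMContext::MD_invariant_load};
      combineMetadata(Kept, Removed, KnownIDs);
      Removed->replaceAllUsesWith(Kept);
      Removed->eraseFromParent();
    }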