summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [SimpleLoopUnswitch] Early check exit for trivial unswitch with MemorySSA.Alina Sbirlea2019-01-281-0/+4
| | | | | | | | | | | | | | | Summary: If MemorySSA is avaiable, we can skip checking all instructions if block has any Defs. (volatile loads are also Defs). We still need to check all instructions for "canThrow", even if no Defs are found. Reviewers: chandlerc Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D57129 llvm-svn: 352393
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [SimpleLoopUnswitch] Increment stats counter for unswitching switch instructionZaara Syeda2019-01-151-1/+4
| | | | | | | | | | | Increment statistics counter NumSwitches at unswitchNontrivialInvariants() for unswitching a non-trivial switch instruction. This is to fix a bug that it increments NumBranches even for the case of switch instruction. There is no functional change in this patch. Differential Revision: https://reviews.llvm.org/D56408 llvm-svn: 351193
* [SimpleLoopUnswitch] Remove debug dump.Alina Sbirlea2018-12-041-3/+1
| | | | llvm-svn: 348267
* Update MemorySSA in SimpleLoopUnswitch.Alina Sbirlea2018-12-041-77/+235
| | | | | | | | | | | Summary: Teach SimpleLoopUnswitch to preserve MemorySSA. Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D47022 llvm-svn: 348263
* [SimpleLoopUnswitch] adding cost multiplier to cap exponential unswitch withFedor Sergeev2018-11-161-2/+116
| | | | | | | | | | | | | | | | | | | | | We need to control exponential behavior of loop-unswitch so we do not get run-away compilation. Suggested solution is to introduce a multiplier for an unswitch cost that makes cost prohibitive as soon as there are too many candidates and too many sibling loops (meaning we have already started duplicating loops by unswitching). It does solve the currently known problem with compile-time degradation (PR 39544). Tests are built on top of a recently implemented CHECK-COUNT-<num> FileCheck directives. Reviewed By: chandlerc, mkazantsev Differential Revision: https://reviews.llvm.org/D54223 llvm-svn: 347097
* [SimpleLoopUnswitch] partial unswitch needs to be careful when replacing ↵Fedor Sergeev2018-11-071-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | invariants with constants When partial unswitch operates on multiple conditions at once, .e.g: if (Cond1 || Cond2 || NonInv) ... it should infer (and replace) values for individual conditions only on one side of unswitch and not another. More precisely only these derivations hold true: (Cond1 || Cond2) == false => Cond1 == Cond2 == false (Cond1 && Cond2) == true => Cond1 == Cond2 == true By the way we organize unswitching it means only replacing on "continue" blocks and never on "unswitched" ones. Since trivial unswitch does not have "unswitched" blocks it does not have this problem. Fixes PR 39568. Reviewers: chandlerc, asbirlea Differential Revision: https://reviews.llvm.org/D54211 llvm-svn: 346350
* Fix -Wdocumentation warning. NFCI.Simon Pilgrim2018-10-271-4/+4
| | | | llvm-svn: 345454
* [SimpleLoopUnswitch] Unswitch by experimental.guard intrinsicsMax Kazantsev2018-10-261-2/+107
| | | | | | | | | | | | | | | | | | This patch adds support of `llvm.experimental.guard` intrinsics to non-trivial simple loop unswitching. These intrinsics represent implicit control flow which has pretty much the same semantics as usual conditional branches. The algorithm of dealing with them is following: - Consider guards as unswitching candidates; - If a guard is considered the best candidate, turn it into a branch; - Apply normal unswitching algorithm on this branch. The patch has no compile time effect on code that does not contain any guards. Differential Revision: https://reviews.llvm.org/D53744 Reviewed By: chandlerc llvm-svn: 345387
* [SimpleLoopUnswitch] Make all checks before actual non-trivial unswitchMax Kazantsev2018-10-261-18/+20
| | | | | | | | | | | We should be able to make all relevant checks before we actually start the non-trivial unswitching, so that we could guarantee that once we have started the transform, it will always succeed. Reviewed By: chandlerc Differential Revision: https://reviews.llvm.org/D53747 llvm-svn: 345375
* [TI removal] Switch simple loop unswitch to `Instruction`.Chandler Carruth2018-10-181-5/+5
| | | | llvm-svn: 344719
* [TI removal] Make variables declared as `TerminatorInst` and initializedChandler Carruth2018-10-151-1/+1
| | | | | | | | | | | | | by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502
* llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)Fangrui Song2018-09-271-5/+4
| | | | | | | | | | | | Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163
* [SimpleLoopUnswitch] remove a chain of dead blocks at onceFedor Sergeev2018-09-041-19/+19
| | | | | | | | | | | | | | | | | | | | | | Recent change to deleteDeadBlocksFromLoop was not enough to fix all the problems related to dead blocks after nontrivial unswitching of switches. We need to delete all the dead blocks that were created during unswitching, otherwise we will keep having problems with phi's or dead blocks. This change removes all the dead blocks that are reachable from the loop, not trying to track whether these blocks are newly created by unswitching or not. While not completely correct, we are unlikely to get loose but reachable dead blocks that do not belong to our loop nest. It does fix all the failures currently known, in particular PR38778. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D51519 llvm-svn: 341398
* [SimpleLoopUnswitch] After unswitch delete dead blocks in parent loopsFedor Sergeev2018-08-291-2/+10
| | | | | | | | | | | | | | | | | | | | Summary: Assert from PR38737 happens on the dead block inside the parent loop after unswitching nontrivial switch in the inner loop. deleteDeadBlocksFromLoop now takes extra care to detect/remove dead blocks in all the parent loops in addition to the blocks from original loop being unswitched. Reviewers: asbirlea, chandlerc Reviewed By: asbirlea Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51415 llvm-svn: 340955
* [SimpleLoopUnswitch] Form dedicated exits after trivial unswitches.Alina Sbirlea2018-08-281-5/+8
| | | | | | | | | | | | | | Summary: Form dedicated exits after trivial unswitches. Fixes PR38737, PR38283. Reviewers: chandlerc, fedor.sergeev Subscribers: sanjoy, jlebar, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D51375 llvm-svn: 340871
* [SimpleLoopUnswitch] Fix DT updates for trivial branch unswitching.Alina Sbirlea2018-07-281-2/+4
| | | | | | | | | | | | | | | Summary: Fixing 2 issues with the DT update in trivial branch switching, though I don't have a case where DT update fails. 1. After splitting ParentBB->UnswitchedBB edge, new edges become: ParentBB->LoopExitBB->UnswitchedBB, so remove ParentBB->LoopExitBB edge. 2. AFAIU, for multiple CFG changes, DT should be updated using batch updates, vs consecutive addEdge and removeEdge calls. Reviewers: chandlerc, kuhar Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D49925 llvm-svn: 338180
* [PM/Unswitch] Fix unused variable in r336646.Chandler Carruth2018-07-101-0/+1
| | | | llvm-svn: 336647
* [PM/Unswitch] Fix a collection of closely related issues with trivialChandler Carruth2018-07-101-17/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | switch unswitching. The core problem was that the way we handled unswitching trivial exit edges through the default successor of a switch. For some reason I thought the right way to do this was to add a block containing unreachable and point the default successor at this block. In retrospect, this has an amazing number of problems. The first issue is the one that this pass has always worked around -- we have to *detect* such edges and avoid unswitching them again. This seemed pretty easy really. You juts look for an edge to a block containing unreachable. However, this pattern is woefully unsound. So many things can break it. The amazing thing is that I found a test case where *simple-loop-unswitch itself* breaks this! When we do a *non-trivial* unswitch of a switch we will end up splitting this exit edge. The result will be a default successor that is an exit and terminates in ... a perfectly normal branch. So the first test case that I started trying to fix is added to the nontrivial test cases. This is a ridiculous example that did just amazing things previously. With just unswitch, it would create 10+ copies of this stuff stamped out. But if you combine it *just right* with a bunch of other passes (like simplify-cfg, loop rotate, and some LICM) you can get it to do this infinitely. Or at least, I never got it to finish. =[ This, in turn, uncovered another related issue. When we are manipulating these switches after doing a trivial unswitch we never correctly updated PHI nodes to reflect our edits. As soon as I started changing how these edges were managed, it became obvious there were more issues that I couldn't realistically leave unaddressed, so I wrote more test cases around PHI updates here and ensured all of that works now. And this, in turn, required some adjustment to how we collect and manage the exit successor when it is the default successor. That showed a clear bug where we failed to include it in our search for the outer-most loop reached by an unswitched exit edge. This was actually already tested and the test case didn't work. I (wrongly) thought that was due to SCEV failing to analyze the switch. In fact, it was just a simple bug in the code that skipped the default successor. While changing this, I handled it correctly and have updated the test to reflect that we now get precise SCEV analysis of trip counts for the outer loop in one of these cases. llvm-svn: 336646
* [PM/Unswitch] Fix a nasty bug in the new PM's unswitch introduced inChandler Carruth2018-07-091-26/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r335553 with the non-trivial unswitching of switches. The code correctly updated most aspects of the CFG and analyses, but missed some crucial aspects: 1) When multiple cases have the same successor, we unswitch that a single time and replace the switch with a direct branch. The CFG here is correct, but the target of this direct branch may have had a PHI node with multiple entries in it. 2) When we still have to clone a successor of the switch into an unswitched copy of the loop, we'll delete potentially multiple edges entering this successor, not just one. 3) We also have to delete multiple edges entering the successors in the original loop when they have to be retained. 4) When the "retained successor" *also* occurs as a case successor, we just assert failed everywhere. This doesn't happen very easily because its always valid to simply drop the case -- the retained successor for switches is always the default successor. However, it is likely possible through some contrivance of different loop passes, unrolling, and simplifying for this to occur in practice and certainly there is nothing "invalid" about the IR so this pass needs to handle it. 5) In the case of #4, we also will replace these multiple edges with a direct branch much like in #1 and need to collapse the entries in any PHI nodes to a single enrty. All of this stems from the delightful fact that the same successor can show up in multiple parts of the switch terminator, and each of these are considered a distinct edge for the purpose of PHI nodes (and iterating the successors and predecessors) but not for unswitching itself, the dominator tree, or many other things. For the record, I intensely dislike this "feature" of the IR in large part because of the complexity it causes in passes like this. We already have a ton of logic building sets and handling duplicates, and we just had to add a bunch more. I've added a complex test case that covers all five of the above failure modes. I've also added a variation on it where #4 and #5 occur in loop exit, adding fun where we have an LCSSA PHI node with "multiple entries" despite have dedicated exits. There were no additional issues found by this, but it seems a useful corner case to cover with testing. One thing that working on all of this code has made painfully clear for me as well is how amazingly inefficient our PHI node representation is (in terms of the in-memory data structures and the APIs used to update them). This code has truly marvelous complexity bounds because every time we remove an entry from a PHI node we do a linear scan to find it and then a linear update to the data structure to remove it. We could in theory batch all of the PHI node updates into a single linear walk of the operands making this much more efficient, but the APIs fight hard against this and the fact that we have to handle duplicates in the peculiar manner we do (removing all but one in some cases) makes even implementing that very tedious and annoying. Anyways, none of this is new here or specific to loop unswitching. All code in LLVM that updates PHI node operands suffers from these problems. llvm-svn: 336536
* [PM/LoopUnswitch] Fix PR37889, producing the correct loop nest structureChandler Carruth2018-07-071-2/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | after trivial unswitching. This PR illustrates that a fundamental analysis update was not performed with the new loop unswitch. This update is also somewhat fundamental to the core idea of the new loop unswitch -- we actually *update* the CFG based on the unswitching. In order to do that, we need to update the loop nest in addition to the domtree. For some reason, when writing trivial unswitching, I thought that the loop nest structure cannot be changed by the transformation. But the PR helps illustrate that it clearly can. I've expanded this to a number of different test cases that try to cover the different cases of this. When we unswitch, we move an exit edge of a loop out of the loop. If this exit edge changes which loop reached by an exit is the innermost loop, it changes the parent of the loop. Essentially, this transformation may hoist the inner loop up the nest. I've added the simple logic to handle this reliably in the trivial unswitching case. This just requires updating LoopInfo and rebuilding LCSSA on the impacted loops. In the trivial case, we don't even need to handle dedicated exits because we're only hoisting the one loop and we just split its preheader. I've also ported all of these tests to non-trivial unswitching and verified that the logic already there correctly handles the loop nest updates necessary. Differential Revision: https://reviews.llvm.org/D48851 llvm-svn: 336477
* [PM/LoopUnswitch] Fix PR37651 by correctly invalidating SCEV whenChandler Carruth2018-07-031-21/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unswitching loops. Original patch trying to address this was sent in D47624, but that didn't quite handle things correctly. There are two key principles used to select whether and how to invalidate SCEV-cached information about loops: 1) We must invalidate any info SCEV has cached before unswitching as we may change (or destroy) the loop structure by the act of unswitching, and make it hard to recover everything we want to invalidate within SCEV. 2) We need to invalidate all of the loops whose CFGs are mutated by the unswitching. Notably, this isn't the *entire* loop nest, this is every loop contained by the outermost loop reached by an exit block relevant to the unswitch. And we need to do this even when doing trivial unswitching. I've added more focused tests that directly check that SCEV starts off with imprecise information and after unswitching (and simplifying instructions) re-querying SCEV will produce precise information. These tests also specifically work to check that an *outer* loop's information becomes precise. However, the testing here is still a bit imperfect. Crafting test cases that reliably fail to be analyzed by SCEV before unswitching and succeed afterward proved ... very, very hard. It took me several hours and careful work to build these, and I'm not optimistic about necessarily coming up with more to cover more elaborate possibilities. Fortunately, the code pattern we are testing here in the pass is really straightforward and reliable. Thanks to Max Kazantsev for the initial work on this as well as the review, and to Hal Finkel for helping me talk through approaches to test this stuff even if it didn't come to much. Differential Revision: https://reviews.llvm.org/D47624 llvm-svn: 336183
* [PM/LoopUnswitch] Teach the new unswitch to handle nontrivialChandler Carruth2018-06-251-156/+201
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unswitching of switches. This works much like trivial unswitching of switches in that it reliably moves the switch out of the loop. Here we potentially clone the entire loop into each successor of the switch and re-point the cases at these clones. Due to the complexity of actually doing nontrivial unswitching, this patch doesn't create a dedicated routine for handling switches -- it would duplicate far too much code. Instead, it generalizes the existing routine to handle both branches and switches as it largely reduces to looping in a few places instead of doing something once. This actually improves the results in some cases with branches due to being much more careful about how dead regions of code are managed. With branches, because exactly one clone is created and there are exactly two edges considered, somewhat sloppy handling of the dead regions of code was sufficient in most cases. But with switches, there are much more complicated patterns of dead code and so I've had to move to a more robust model generally. We still do as much pruning of the dead code early as possible because that allows us to avoid even cloning the code. This also surfaced another problem with nontrivial unswitching before which is that we weren't as precise in reconstructing loops as we could have been. This seems to have been mostly harmless, but resulted in pointless LCSSA PHI nodes and other unnecessary cruft. With switches, we have to get this *right*, and everything benefits from it. While the testing may seem a bit light here because we only have two real cases with actual switches, they do a surprisingly good job of exercising numerous edge cases. Also, because we share the logic with branches, most of the changes in this patch are reasonably well covered by existing tests. The new unswitch now has all of the same fundamental power as the old one with the exception of the single unsound case of *partial* switch unswitching -- that really is just loop specialization and not unswitching at all. It doesn't fit into the canonicalization model in any way. We can add a loop specialization pass that runs late based on profile data if important test cases ever come up here. Differential Revision: https://reviews.llvm.org/D47683 llvm-svn: 335553
* [PM/LoopUnswitch] Add partial non-trivial unswitching for invariantChandler Carruth2018-06-211-92/+237
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | conditions feeding a chain of `and`s or `or`s for a branch. Much like with full non-trivial unswitching, we rely on the pass manager to handle iterating until all of the profitable unswitches have been done. This is to allow other more profitable unswitches to fire on any of the cloned, simpler versions of the loop if viable. Threading the partial unswiching through the non-trivial unswitching logic motivated some minor refactorings. If those are too disruptive to make it reasonable to review this patch, I can separate them out, but it'll be somewhat timeconsuming so I wanted to send it for initial review as-is. Feel free to tell me whether it warrants pulling apart. I've tried to re-use (and factor out) logic form the partial trivial unswitching, but not as much could be shared as I had haped. Still, this wasn't as bad as I naively expected. Some basic testing is added, but I probably need more. Suggestions for things you'd like to see tested more than welcome. One thing I'd like to do is add some testing that when we schedule this with loop-instsimplify it effectively cleans up the cruft created. Last but not least, this uncovered a bug that has been in loop cloning the entire time for non-trivial unswitching. Specifically, we didn't correctly add the outer-most cloned loop to the list of cloned loops. This meant that LCSSA wouldn't be updated for it hypothetically, and more significantly that we would never visit it in the loop pass manager. I noticed this while checking loop-instsimplify by hand. I'll try to separate this bugfix out into its own patch with a more focused test. But it is just one line, so shouldn't significantly confuse the review here. After this patch, the only missing "feature" in this unswitch I'm aware of us non-trivial unswitching of switches. I'll try implementing *full* non-trivial unswitching of switches (which is at least a sound thing to implement), but *partial* non-trivial unswitching of switches is something I don't see any sound and principled way to implement. I also have no interesting test cases for the latter, so I'm not really worried. The rest of the things that need to be ported are bug-fixes and more narrow / targeted support for specific issues. Differential Revision: https://reviews.llvm.org/D47522 llvm-svn: 335203
* [PM/LoopUnswitch] Support partial trivial unswitching.Chandler Carruth2018-06-201-49/+171
| | | | | | | | | | | | | | | | | | | | | | | | | The idea of partial unswitching is to take a *part* of a branch's condition that is loop invariant and just unswitching that part. This primarily makes sense with i1 conditions of branches as opposed to switches. When dealing with i1 conditions, we can easily extract loop invariant inputs to a a branch and unswitch them to test them entirely outside the loop. As part of this, we now create much more significant cruft in the loop body, so this relies on adding cleanup passes to the loop pipeline and revisiting unswitched loops to do that cleanup before continuing to process them. This already appears to be more powerful at unswitching than the old loop unswitch pass, and so I'd appreciate pretty careful review in case I'm just missing some correctness checks. The `LIV-loop-condition` test case is not unswitched by the old unswitch pass, but is with this pass. Thanks to Sanjoy and Fedor for the review! Differential Revision: https://reviews.llvm.org/D46706 llvm-svn: 335156
* [NFC] fix trivial typos in commentsHiroshi Inoue2018-06-141-2/+2
| | | | llvm-svn: 334687
* [PM/LoopUnswitch] Fix how the cloned loops are handled when updating analyses.Chandler Carruth2018-06-021-44/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I noticed this issue because we didn't put the primary cloned loop into the `NonChildClonedLoops` vector and so never iterated on it. Once I fixed that, it made it clear why I had to do a really complicated and unnecesasry dance when updating the loops to remain in canonical form -- I was unwittingly working around the fact that the primary cloned loop wasn't in the expected list of cloned loops. Doh! Now that we include it in this vector, we don't need to return it and we can consolidate the update logic as we correctly have a single place where it can be handled. I've just added a test for the iteration order aspect as every time I changed the update logic partially or incorrectly here, an existing test failed and caught it so that seems well covered (which is also evidenced by the extensive working around of this missing update). Reviewers: asbirlea, sanjoy Subscribers: mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47647 llvm-svn: 333811
* [PM/LoopUnswitch] When using the new SimpleLoopUnswitch pass, scheduleChandler Carruth2018-05-301-25/+29
| | | | | | | | | | | | | | | | | | | | | | loop-cleanup passes at the beginning of the loop pass pipeline, and re-enqueue loops after even trivial unswitching. This will allow us to much more consistently avoid simplifying code while doing trivial unswitching. I've also added a test case that specifically shows effective iteration using this technique. I've unconditionally updated the new PM as that is always using the SimpleLoopUnswitch pass, and I've made the pipeline changes for the old PM conditional on using this new unswitch pass. I added a bunch of comments to the loop pass pipeline in the old PM to make it more clear what is going on when reviewing. Hopefully this will unblock doing *partial* unswitching instead of just full unswitching. Differential Revision: https://reviews.llvm.org/D47408 llvm-svn: 333493
* Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-141-17/+20
| | | | | | | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
* [PM/LoopUnswitch] Avoid pointlessly creating an exit block set.Chandler Carruth2018-05-101-19/+4
| | | | | | | This code can just test whether blocks are *in* the loop, which we already have a dedicated set tracking in the loop itself. llvm-svn: 332004
* [PM/LoopUnswitch] Remove the last manual domtree update code from loopChandler Carruth2018-05-011-170/+18
| | | | | | | | | unswitch and replace it with the amazingly simple update API code. This addresses piles of FIXMEs around the update logic here and makes everything substantially simpler. llvm-svn: 331247
* [PM/LoopUnswitch] Add back a successor set that was removed based onChandler Carruth2018-05-011-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | code review. It turns out this *is* necessary, and I read the comment on the API correctly the first time. ;] The `applyUpdates` routine requires that updates are "balanced". This is in order to cleanly handle cycles like inserting, removing, nad then re-inserting the same edge. This precludes inserting the same edge multiple times in a row as handling that would cause the insertion logic to become *ordered* instead of *unordered* (which is what the API provides). It happens that in this specific case nothing (other than an assert and contract violation) goes wrong because we're never inserting and removing the same edge. The implementation *happens* to do the right thing to eliminate redundant insertions in that case. But the requirement is there and there is an assert to catch it. Somehow, after the code review I never did another asserts-clang build testing loop-unswich for a long time. As a consequence, I didn't notice this despite a bunch of testing going on, but it shows up immediately with an asserts build of clang itself. llvm-svn: 331246
* [PM/LoopUnswitch] Begin teaching SimpleLoopUnswitch to use the newChandler Carruth2018-04-251-70/+78
| | | | | | | | | | | | | | | | | | | | | | | update API for dominators rather than doing manual, hacky updates. This is just the first step, but in some ways the most important as it moves the non-trivial unswitching to update the domtree rather than fully recalculating it each time. Subsequent patches should remove the custom update logic used by the trivial unswitch and replace it with uses of the update API. This also fixes a number of bugs I was seeing when testing non-trivial unswitch due to it querying the quasi-correct dominator tree. Now the tree is 100% correct and safe to query. That said, there are still more bugs I can see with non-trivial unswitch just running over the test suite, so more bugfix patches are needed as well. Thanks to both Sanjoy and Fedor for reviews and testing! Differential Revision: https://reviews.llvm.org/D45943 llvm-svn: 330787
* [PM/LoopUnswitch] Fix a bug in the loop block set formation of the newChandler Carruth2018-04-241-4/+10
| | | | | | | | | | | | | | | | | | loop unswitch. This code incorrectly added the header to the loop block set early. As a consequence we would incorrectly conclude that a nested loop body had already been visited when the header of the outer loop was the preheader of the nested loop. In retrospect, adding the header eagerly doesn't really make sense. It seems nicer to let the cycle be formed naturally. This will catch crazy bugs in the CFG reconstruction where we can't correctly form the cycle earlier rather than later, and makes the rest of the logic just fall out. I've also added various asserts that make these issues *much* easier to debug. llvm-svn: 330707
* [PM/LoopUnswitch] Remove another over-aggressive assert.Chandler Carruth2018-04-241-4/+1
| | | | | | | | | | This code path can very clearly be called in a context where we have baselined all the cloned blocks to a particular loop and are trying to handle nested subloops. There is no harm in this, so just relax the assert. I've added a test case that will make sure we actually exercise this code path. llvm-svn: 330680
* [PM/LoopUnswitch] Remove a buggy assert in the new loop unswitch.Chandler Carruth2018-04-231-6/+5
| | | | | | | | The condition this was asserting doesn't actually hold. I've added comments to explain why, removed the assert, and added a fun test case reduced out of 403.gcc. llvm-svn: 330564
* [PM/LoopUnswitch] Fix comment typo. NFC.Chandler Carruth2018-04-231-1/+1
| | | | llvm-svn: 330560
* [PM/LoopUnswitch] Detect irreducible control flow within loops and skip ↵Chandler Carruth2018-04-191-0/+13
| | | | | | | | | | | | | | | | | | | | unswitching non-trivial edges. Summary: This fixes the bug pointed out in review with non-trivial unswitching. This also provides a basis that should make it pretty easy to finish fleshing out a routine to scan an entire function body for irreducible control flow, but this patch remains minimal for disabling loop unswitch. Reviewers: sanjoy, fedor.sergeev Subscribers: mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45754 llvm-svn: 330357
* [Transforms] Change std::sort to llvm::sort in response to r327219Mandeep Singh Grang2018-04-131-5/+6
| | | | | | | | | | | | | | | | | | | | | | Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: kcc, pcc, danielcdh, jmolloy, sanjoy, dberlin, ruiu Reviewed By: ruiu Subscribers: ruiu, llvm-commits Differential Revision: https://reviews.llvm.org/D45142 llvm-svn: 330059
* [Dominators] Remove verifyDomTree and add some verifying for Post Dom TreesDavid Green2018-02-281-7/+4
| | | | | | | | | | | | Removes verifyDomTree, using assert(verify()) everywhere instead, and changes verify a little to always run IsSameAsFreshTree first in order to print good output when we find errors. Also adds verifyAnalysis for PostDomTrees, which will allow checking of PostDomTrees it the same way we check DomTrees and MachineDomTrees. Differential Revision: https://reviews.llvm.org/D41298 llvm-svn: 326315
* Use phi ranges to simplify code. No functionality change intended.Benjamin Kramer2017-12-301-22/+12
| | | | llvm-svn: 321585
* Make helpers static. NFC.Benjamin Kramer2017-11-241-4/+4
| | | | llvm-svn: 318953
* [PM/Unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching,Chandler Carruth2017-11-171-71/+1444
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | making it no longer even remotely simple. The pass will now be more of a "full loop unswitching" pass rather than anything substantively simpler than any other approach. I plan to rename it accordingly once the dust settles. The key ideas of the new loop unswitcher are carried over for non-trivial unswitching: 1) Fully unswitch a branch or switch instruction from inside of a loop to outside of it. 2) Update the CFG and IR. This avoids needing to "remember" the unswitched branches as well as avoiding excessively cloning and reliance on complex parts of simplify-cfg to cleanup the cfg. 3) Update the analyses (where we can) rather than just blowing them away or relying on something else updating them. Sadly, #3 is somewhat compromised here as the dominator tree updates were too complex for me to want to reason about. I will need to make another attempt to do this now that we have a nice dynamic update API for dominators. However, we do adhere to #3 w.r.t. LoopInfo. This approach also adds an important principls specific to non-trivial unswitching: not *all* of the loop will be duplicated when unswitching. This fact allows us to compute the cost in terms of how much *duplicate* code is inserted rather than just on raw size. Unswitching conditions which essentialy partition loops will work regardless of the total loop size. Some remaining issues that I will be addressing in subsequent commits: - Handling unstructured control flow. - Unswitching 'switch' cases instead of just branches. - Moving to the dynamic update API for dominators. Some high-level, interesting limitationsV that folks might want to push on as follow-ups but that I don't have any immediate plans around: - We could be much more clever about not cloning things that will be deleted. In fact, we should be able to delete *nothing* and do a minimal number of clones. - There are many more interesting selection criteria for which branch to unswitch that we might want to look at. One that I'm interested in particularly are a set of conditions which all exit the loop and which can be merged into a single unswitched test of them. Differential revision: https://reviews.llvm.org/D34200 llvm-svn: 318549
* Sort the remaining #include lines in include/... and lib/....Chandler Carruth2017-06-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
* [PM/Unswitch] Fix a bug in the domtree update logic for the new unswitchChandler Carruth2017-05-251-14/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | pass. The original logic only considered direct successors of the hoisted domtree nodes, but that isn't really enough. If there are other basic blocks that are completely within the subtree, their successors could just as easily be impacted by the hoisting. The more I think about it, the more I think the correct update here is to hoist every block on the dominance frontier which has an idom in the chain we hoist across. However, this is subtle enough that I'd definitely appreciate some more eyes on it. Sadly, if this is the correct algorithm, it requires computing a (highly localized) dominance frontier. I've done this in the simplest (IE, least code) way I could come up with, but that may be too naive. Suggestions welcome here, dominance update algorithms are not an area I've studied much, so I don't have strong opinions. In good news, with this patch, turning on simple unswitch passes the LLVM test suite for me with asserts enabled. Differential Revision: https://reviews.llvm.org/D32740 llvm-svn: 303843
* [ADT] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).Eugene Zelenko2017-05-161-9/+28
| | | | llvm-svn: 303221
* [PM/Unswitch] Teach the new simple loop unswitch to handle loopChandler Carruth2017-05-121-23/+138
| | | | | | | | | | | | | | | | | | | | | | | | invariant PHI inputs and to rewrite PHI nodes during the actual unswitching. The checking is quite easy, but rewriting the PHI nodes is somewhat surprisingly challenging. This should handle both branches and switches. I think this is now a full featured trivial unswitcher, and more full featured than the trivial cases in the old pass while still being (IMO) somewhat simpler in how it works. Next up is to verify its correctness in more widespread testing, and then to add non-trivial unswitching. Thanks to Davide and Sanjoy for the excellent review. There is one remaining question that I may address in a follow-up patch (see the review thread for details) but it isn't related to the functionality specifically. Differential Revision: https://reviews.llvm.org/D32699 llvm-svn: 302867
* [PM/LoopUnswitch] Introduce a new, simpler loop unswitch pass.Chandler Carruth2017-04-271-0/+626
Currently, this pass only focuses on *trivial* loop unswitching. At that reduced problem it remains significantly better than the current loop unswitch: - Old pass is worse than cubic complexity. New pass is (I think) linear. - New pass is much simpler in its design by focusing on full unswitching. (See below for details on this). - New pass doesn't carry state for thresholds between pass iterations. - New pass doesn't carry state for correctness (both miscompile and infloop) between pass iterations. - New pass produces substantially better code after unswitching. - New pass can handle more trivial unswitch cases. - New pass doesn't recompute the dominator tree for the entire function and instead incrementally updates it. I've ported all of the trivial unswitching test cases from the old pass to the new one to make sure that major functionality isn't lost in the process. For several of the test cases I've worked to improve the precision and rigor of the CHECKs, but for many I've just updated them to handle the new IR produced. My initial motivation was the fact that the old pass carried state in very unreliable ways between pass iterations, and these mechansims were incompatible with the new pass manager. However, I discovered many more improvements to make along the way. This pass makes two very significant assumptions that enable most of these improvements: 1) Focus on *full* unswitching -- that is, completely removing whatever control flow construct is being unswitched from the loop. In the case of trivial unswitching, this means removing the trivial (exiting) edge. In non-trivial unswitching, this means removing the branch or switch itself. This is in opposition to *partial* unswitching where some part of the unswitched control flow remains in the loop. Partial unswitching only really applies to switches and to folded branches. These are very similar to full unrolling and partial unrolling. The full form is an effective canonicalization, the partial form needs a complex cost model, cannot be iterated, isn't canonicalizing, and should be a separate pass that runs very late (much like unrolling). 2) Leverage LLVM's Loop machinery to the fullest. The original unswitch dates from a time when a great deal of LLVM's loop infrastructure was missing, ineffective, and/or unreliable. As a consequence, a lot of complexity was added which we no longer need. With these two overarching principles, I think we can build a fast and effective unswitcher that fits in well in the new PM and in the canonicalization pipeline. Some of the remaining functionality around partial unswitching may not be relevant today (not many test cases or benchmarks I can find) but if they are I'd like to add support for them as a separate layer that runs very late in the pipeline. Purely to make reviewing and introducing this code more manageable, I've split this into first a trivial-unswitch-only pass and in the next patch I'll add support for full non-trivial unswitching against a *fixed* threshold, exactly like full unrolling. I even plan to re-use the unrolling thresholds, as these are incredibly similar cost tradeoffs: we're cloning a loop body in order to end up with simplified control flow. We should only do that when the total growth is reasonably small. One of the biggest changes with this pass compared to the previous one is that previously, each individual trivial exiting edge from a switch was unswitched separately as a branch. Now, we unswitch the entire switch at once, with cases going to the various destinations. This lets us unswitch multiple exiting edges in a single operation and also avoids numerous extremely bad behaviors, where we would introduce 1000s of branches to test for thousands of possible values, all of which would take the exact same exit path bypassing the loop. Now we will use a switch with 1000s of cases that can be efficiently lowered into a jumptable. This avoids relying on somehow forming a switch out of the branches or getting horrible code if that fails for any reason. Another significant change is that this pass actively updates the CFG based on unswitching. For trivial unswitching, this is actually very easy because of the definition of loop simplified form. Doing this makes the code coming out of loop unswitch dramatically more friendly. We still should run loop-simplifycfg (at the least) after this to clean up, but it will have to do a lot less work. Finally, this pass makes much fewer attempts to simplify instructions based on the unswitch. Something like loop-instsimplify, instcombine, or GVN can be used to do increasingly powerful simplifications based on the now dominating predicate. The old simplifications are things that something like loop-instsimplify should get today or a very, very basic loop-instcombine could get. Keeping that logic separate is a big simplifying technique. Most of the code in this pass that isn't in the old one has to do with achieving specific goals: - Updating the dominator tree as we go - Unswitching all cases in a switch in a single step. I think it is still shorter than just the trivial unswitching code in the old pass despite having this functionality. Differential Revision: https://reviews.llvm.org/D32409 llvm-svn: 301576
OpenPOWER on IntegriCloud