summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/TailDuplicator.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-111-13/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283934
* Revert "Codegen: Tail-duplicate during placement."Daniel Jasper2016-10-111-50/+13
| | | | | | | | | This reverts commit r283842. test/CodeGen/X86/tail-dup-repeat.ll causes and llc crash with our internal testing. I'll share a link with you. llvm-svn: 283857
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-111-13/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283842
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-081-46/+13
| | | | | | This reverts commit 71c312652c10f1855b28d06697c08d47e7a243e4. llvm-svn: 283647
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-071-13/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283619
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-051-46/+13
| | | | | | | | | | This reverts commit 062ace9764953e9769142c1099281a345f9b6bdc. Issue with loop info and block removal revealed by polly. I have a fix for this issue already in another patch, I'll re-roll this together with that fix, and a test case. llvm-svn: 283292
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-041-13/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283274
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-041-46/+13
| | | | | | | | This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2. Causing crashes on aarch64 build. llvm-svn: 283172
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-041-13/+46
| | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. llvm-svn: 283164
* Finish renaming remaining analyzeBranch functionsMatt Arsenault2016-09-141-2/+2
| | | | llvm-svn: 281535
* Make analyzeBranch family of instruction names consistentMatt Arsenault2016-09-141-1/+1
| | | | | | | analyzeBranch was renamed to use lowercase first, rename the related set to match. llvm-svn: 281506
* TailDuplication: Extract Indirect-Branch block limit as option. NFCKyle Butt2016-08-301-3/+9
| | | | | | | | The existing code hard-coded a limit of 20 instructions for duplication when a block ended with an indirect branch. Extract this as an option. No functional change intended. llvm-svn: 280125
* TailDuplication: Record blocks that received the duplicated block. NFC.Kyle Butt2016-08-261-2/+10
| | | | | | | This will allow tail duplication during layout to handle the cfg changes more cleanly. llvm-svn: 279858
* TailDuplication: Don't pass MMI separately from MF. NFCKyle Butt2016-08-251-2/+1
| | | | | | | MMI must match the function passed, and MF has a handle on MMI. Use that instead of accepting it as separate argument. No Functional Change. llvm-svn: 279701
* TailDuplication: Save MF and reduce number of parameters. NFCKyle Butt2016-08-251-22/+21
| | | | | | | | Save the function in the class, and then don't pass it around. This reduces the number of parameters and makes calls to member functions simpler. No Functional Change. llvm-svn: 279700
* TailDuplicator: Fix crash after r278974Matthias Braun2016-08-181-1/+1
| | | | | | | | Some inputs would after r278974 without this fix (see http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_build/2733/console for an example) llvm-svn: 279022
* Tail Duplication: Accept explicit threshold for duplicating.Kyle Butt2016-08-171-4/+8
| | | | | | | | This will allow tail duplication and tail merging during layout to have a shared threshold to make sure that they don't overlap. No observable change intended. llvm-svn: 278981
* TailDuplicator: Use optForSize instead of hasFnAttribute.Kyle Butt2016-08-171-1/+1
| | | | | | | This will cause minsize functions to have the same threshold as optsize functions, but otherwise should have no effects. llvm-svn: 278980
* Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough.Kyle Butt2016-08-161-0/+10
| | | | | | | | | | | | If AnalyzeBranch can't analyze a block and it is possible to fallthrough, then duplicating the block doesn't make sense, as only one block can be the layout predecessor for the un-analyzable fallthrough. Submitted wit a test case, but NOTE: the test case doesn't currently fail. However, the test case fails with D20505 and would have saved me some time debugging. llvm-svn: 278866
* TailDuplicator: Use range loopsMatt Arsenault2016-08-161-42/+23
| | | | llvm-svn: 278847
* Revert "Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough."Diana Picus2016-08-141-10/+0
| | | | | | | | | This reverts commit r278288. r278287 broke the clang-cmake-thumbv7-a15-full-sh bot. Revert this so we can get to r278287. llvm-svn: 278620
* Use the range variant of find instead of unpacking begin/endDavid Majnemer2016-08-111-1/+1
| | | | | | | | | If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278433
* Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough.Kyle Butt2016-08-101-0/+10
| | | | | | | | | | | | If AnalyzeBranch can't analyze a block and it is possible to fallthrough, then duplicating the block doesn't make sense, as only one block can be the layout predecessor for the un-analyzable fallthrough. Submitted wit a test case, but NOTE: the test case doesn't currently fail. However, the test case fails with D20505 and would have saved me some time debugging. llvm-svn: 278288
* Codegen: Tail Duplication: Only duplicate into layout pred if it is a CFG Pred.Kyle Butt2016-07-201-0/+2
| | | | | | | | | Add a check that the layout predecessor of a block is an actual CFG predecssor of the block as well. No current code fails this check, but upcoming patches can trigger this, and it makes sense to separate it out. llvm-svn: 276066
* Codegen: Factor out canTailDuplicateKyle Butt2016-07-191-9/+19
| | | | | | | | canTailDuplicate accepts two blocks and returns true if the first can be duplicated into the second successfully. Use this function to encapsulate the heuristic. llvm-svn: 276062
* Rename AnalyzeBranch* to analyzeBranch*.Jacques Pienaar2016-07-151-5/+5
| | | | | | | | | | | | Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect. Reviewers: tstellarAMD, mcrosier Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai Differential Revision: https://reviews.llvm.org/D22409 llvm-svn: 275564
* TailDuplicator: Remove live-in updating logicMatthias Braun2016-07-061-18/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This logic was introduced in r157663 and does not make any sense to me. The motivating example in rdar://11538365 looks like this: This is the tail: BB#16: derived from LLVM BB %if.end68 Live Ins: %R0 %R4 %R5 Predecessors according to CFG: BB#15 BB#5 tBLXi pred:14, pred:%noreg, <ga:@CFRelease>, %R0<kill>, <regmask>, %LR<imp-def,dead>, %SP<imp-use>, %SP<imp-def> t2B <BB#20>, pred:14, pred:%noreg Successors according to CFG: BB#20 This is the predBB: BB#5: Live Ins: %R5 Predecessors according to CFG: BB#4 %R4<def> = t2MOVi 0, pred:14, pred:%noreg, opt:%noreg t2B <BB#16>, pred:14, pred:%noreg Successors according to CFG: BB#16 However this is invalid machine code to begin with, if %R0 is live-in to BB#16 then it must be live-in to BB#5 as well if BB#5 does not define it. We should not need logic to retroactively fix broken machine code and in fact the example from r157663 passes cleanly with the code removed and I do not see any (newly) failing tests with the machine verifier enabled. Differential Revision: http://reviews.llvm.org/D22031 llvm-svn: 274655
* CodeGen: Use MachineInstr& in TargetInstrInfo, NFCDuncan P. N. Exon Smith2016-06-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr*` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189
* [TailDuplication] Split up NumInstrDups statistic.Geoff Berry2016-06-141-3/+6
| | | | | | | | | | | | Summary: Split NumInstrDups statistic into separate added/removed counts to avoid negative stat being printed as unsigned. Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D21335 llvm-svn: 272700
* Pass DebugLoc and SDLoc by const ref.Benjamin Kramer2016-06-121-1/+1
| | | | | | | | This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
* [Tail duplication] Handle source registers with subregistersKrzysztof Parzyszek2016-04-261-34/+80
| | | | | | | | | | | | | | When a block is tail-duplicated, the PHI nodes from that block are replaced with appropriate COPY instructions. When those PHI nodes contained use operands with subregisters, the subregisters were dropped from the COPY instructions, resulting in incorrect code. Keep track of the subregister information and use this information when remapping instructions from the duplicated block. Differential Revision: http://reviews.llvm.org/D19337 llvm-svn: 267583
* CodeGen: Fix a use-after-free in TailDuplicationJustin Bogner2016-04-111-2/+0
| | | | | | | | | | The call to processPHI already erased MI from its parent, so MI isn't even valid here, making the getParent() call a use-after-free in addition to being redundant. Found by ASan with the ArrayRecycler changes in llvm.org/pr26808. llvm-svn: 266008
* Codegen: Factor tail duplication into a utility class. NFCKyle Butt2016-04-081-0/+903
This is in preparation for tail duplication during block placement. See D18226. This needs to be a utility class for 2 reasons. No passes may run after block placement, and also, tail-duplication affects subsequent layout decisions, so it must be interleaved with placement, and can't be separated out into its own pass. The original pass is still useful, and now runs by delegating to the utility class. llvm-svn: 265842
OpenPOWER on IntegriCloud