summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/MachineBlockPlacement.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [CodeGen]: BlockPlacement: Skip extraneous logging.Kyle Butt2017-02-041-3/+3
| | | | | | | Move a check for blocks that are not candidates for tail duplication up before the logging. Reduces logging noise. No non-logging changes intended. llvm-svn: 294086
* [CodeGen]: BlockPlacement: Apply const liberally. NFCKyle Butt2017-02-041-94/+103
| | | | | | | | Anything that needs to be passed to AnalyzeBranch unfortunately can't be const, or more would be const. Added const_iterator to BlockChain to allow BlockChain to be const when we don't expect to change it. llvm-svn: 294085
* [PGO] internal option cleanupsXinliang David Li2017-02-021-0/+6
| | | | | | | | | | 1. Added comments for options 2. Added missing option cl::desc field 3. Uniified function filter option for graph viewing. Now PGO count/raw-counts share the same filter option: -view-bfi-func-name=. llvm-svn: 293938
* [PGO] make graph view internal options available for all buildsXinliang David Li2017-02-021-4/+0
| | | | | | Differential Revision: https://reviews.llvm.org/D29259 llvm-svn: 293921
* CodeGen: Allow small copyable blocks to "break" the CFG.Kyle Butt2017-01-311-35/+327
| | | | | | | | | | | When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability the other direction. For small blocks that can be duplicated we now skip that requirement as well, subject to some simple frequency calculations. Differential Revision: https://reviews.llvm.org/D28583 llvm-svn: 293716
* Add support to dump dot graph block layout after MBPXinliang David Li2017-01-291-0/+14
| | | | | | Differential Revision: https://reviews.llvm.org/D29141 llvm-svn: 293408
* Revert "CodeGen: Allow small copyable blocks to "break" the CFG."Kyle Butt2017-01-111-52/+7
| | | | | | | | | This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded. This needs a simple probability check because there are some cases where it is not profitable. llvm-svn: 291695
* CodeGen: Allow small copyable blocks to "break" the CFG.Kyle Butt2017-01-101-7/+52
| | | | | | | | | | | When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability the other direction. For small blocks that can be duplicated we now skip that requirement as well. Differential revision: https://reviews.llvm.org/D27742 llvm-svn: 291609
* Trying to fix NDEBUG build after r289764Hal Finkel2016-12-151-0/+2
| | | | llvm-svn: 289766
* [MachineBlockPlacement] Don't make blocks "uneditable"Sanjoy Das2016-12-151-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes an issue with MachineBlockPlacement due to a badly timed call to `analyzeBranch` with `AllowModify` set to true. The timeline is as follows: 1. `MachineBlockPlacement::maybeTailDuplicateBlock` calls `TailDup.shouldTailDuplicate` on its argument, which in turn calls `analyzeBranch` with `AllowModify` set to true. 2. This `analyzeBranch` call edits the terminator sequence of the block based on the physical layout of the machine function, turning an unanalyzable non-fallthrough block to a unanalyzable fallthrough block. Normally MBP bails out of rearranging such blocks, but this block was unanalyzable non-fallthrough (and thus rearrangeable) the first time MBP looked at it, and so it goes ahead and decides where it should be placed in the function. 3. When placing this block MBP fails to analyze and thus update the block in keeping with the new physical layout. Concretely, before (1) we have something like: ``` LBL0: < unknown terminator op that may branch to LBL1 > jmp LBL1 LBL1: ... A LBL2: ... B ``` In (2), analyze branch simplifies this to ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump removed LBL1: ... A LBL2: ... B ``` In (3), MachineBlockPlacement goes ahead with its plan of putting LBL2 after the first block since that is profitable. ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump LBL2: ... B LBL1: ... A ``` and the program now has incorrect behavior (we no longer fall-through from `LBL0` to `LBL1`) because MBP can no longer edit LBL0. There are several possible solutions, but I went with removing the teeth off of the `analyzeBranch` calls in TailDuplicator. That makes thinking about the result of these calls easier, and breaks nothing in the lit test suite. I've also added some bookkeeping to the MachineBlockPlacement pass and used that to write an assert that would have caught this. Reviewers: chandlerc, gberry, MatzeB, iteratee Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27783 llvm-svn: 289764
* Make block placement deterministicRong Xu2016-11-161-3/+3
| | | | | | | | | | | | We fail to produce bit-to-bit matching stage2 and stage3 compiler in PGO bootstrap build. The reason is because LoopBlockSet is of SmallPtrSet type whose iterating order depends on the pointer value. This patch fixes this issue by changing to use SmallSetVector. Differential Revision: http://reviews.llvm.org/D26634 llvm-svn: 287148
* Move the initialization of PreferredLoopExit into runOnMachineFunction to be ↵Eric Christopher2016-11-011-1/+5
| | | | | | near the other function specific initializations. llvm-svn: 285758
* Fix uninitialized access in MachineBlockPlacement.Sam McCall2016-11-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: Currently PreferredLoopExit is set only in buildLoopChains, which is never called if there are no MachineLoops. MSan is currently broken by this: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/145/steps/check-llvm%20msan/logs/stdio This is a naive fix to get things green again. iteratee: you may have a better fix. This change will also mean PreferredLoopExit will not carry over if buildCFGChains() is called a second time in runOnMachineFunction, this appears to be the right thing. Reviewers: bkramer, iteratee, echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26069 llvm-svn: 285757
* CodeGen: Handle missed case of block removal during BlockPlacement.Kyle Butt2016-10-271-4/+10
| | | | | | | | | There is a use after free bug in the existing code. Loop layout selects a preferred exit block, and then lays out the loop. If this block is removed during layout, it needs to be invalidated to prevent a use after free. llvm-svn: 285348
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-111-27/+278
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283934
* Revert "Codegen: Tail-duplicate during placement."Daniel Jasper2016-10-111-279/+27
| | | | | | | | | This reverts commit r283842. test/CodeGen/X86/tail-dup-repeat.ll causes and llc crash with our internal testing. I'll share a link with you. llvm-svn: 283857
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-111-27/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283842
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-081-279/+27
| | | | | | This reverts commit 71c312652c10f1855b28d06697c08d47e7a243e4. llvm-svn: 283647
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-071-27/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283619
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-051-279/+27
| | | | | | | | | | This reverts commit 062ace9764953e9769142c1099281a345f9b6bdc. Issue with loop info and block removal revealed by polly. I have a fix for this issue already in another patch, I'll re-roll this together with that fix, and a test case. llvm-svn: 283292
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-041-27/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283274
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-041-274/+27
| | | | | | | | This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2. Causing crashes on aarch64 build. llvm-svn: 283172
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-041-27/+274
| | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. llvm-svn: 283164
* Finish renaming remaining analyzeBranch functionsMatt Arsenault2016-09-141-2/+2
| | | | llvm-svn: 281535
* Make analyzeBranch family of instruction names consistentMatt Arsenault2016-09-141-1/+1
| | | | | | | analyzeBranch was renamed to use lowercase first, rename the related set to match. llvm-svn: 281506
* Branch Folding: Accept explicit threshold for tail merge size.Kyle Butt2016-08-181-1/+3
| | | | | | | | This is prep work for allowing the threshold to be different during layout, and to enforce a single threshold between merging and duplicating during layout. No observable change intended. llvm-svn: 279117
* [MBP] do not reorder and move up loop latch blockSjoerd Meijer2016-08-161-0/+10
| | | | | | | | | | Do not reorder and move up a loop latch block before a loop header when optimising for size because this will generate an extra unconditional branch. Differential Revision: https://reviews.llvm.org/D22521 llvm-svn: 278840
* Use the range variant of remove_if instead of unpacking begin/endDavid Majnemer2016-08-121-4/+4
| | | | | | No functionality change is intended. llvm-svn: 278475
* Use the range variant of find instead of unpacking begin/endDavid Majnemer2016-08-111-3/+2
| | | | | | | | | If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278433
* Codegen: MachineBlockPlacement Improve probability layout.Kyle Butt2016-07-291-15/+45
| | | | | | | | | | | | | | | | | | | | | | | | | The following pattern was being layed out poorly: A / \ B C / \ / \ D E ? (Doesn't matter) Where A->B is far more likely than A->C, and prob(B->D) = prob(B->E) The current algorithm gives: A,B,C,E (D goes on worklist) It does this even if C has a frequency count of 0. This patch adjusts the layout calculation so that if freq(B->E) >> freq(C->E) then we go ahead and layout E rather than C. Fallthrough half the time is better than fallthrough never, or fallthrough very rarely. The resulting layout is: A,B,E, (C and D are in a worklist) llvm-svn: 277187
* [MBP] Added some more debug messages and some clean ups /NFCSjoerd Meijer2016-07-271-11/+31
| | | | | | Differential Revision: https://reviews.llvm.org/D22669 llvm-svn: 276849
* [MBP] Clean up of the comments, and a first attempt to better describe a partSjoerd Meijer2016-07-151-28/+49
| | | | | | | | of the algorithm. Differential Revision: https://reviews.llvm.org/D22364 llvm-svn: 275595
* Rename AnalyzeBranch* to analyzeBranch*.Jacques Pienaar2016-07-151-6/+6
| | | | | | | | | | | | Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect. Reviewers: tstellarAMD, mcrosier Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai Differential Revision: https://reviews.llvm.org/D22409 llvm-svn: 275564
* [MBP] method interface cleanupXinliang David Li2016-07-011-25/+20
| | | | | | | Make worklist and ehworklist member of the class so that they don't need to be passed around. llvm-svn: 274333
* Codegen: [MBP] Add messages to asserts. NFCKyle Butt2016-06-281-3/+4
| | | | llvm-svn: 274075
* [MBP] show function name in debug dumpXinliang David Li2016-06-241-0/+1
| | | | llvm-svn: 273744
* Codegen: [MBP] Add assert strings. NFCKyle Butt2016-06-171-2/+2
| | | | llvm-svn: 273067
* [MBP] add comments and bug fixXinliang David Li2016-06-151-3/+13
| | | | | | | | | | Document the new parameter and threshod computation model. Also fix a bug when the threshold parameter is set to be different from the default. llvm-svn: 272749
* Set machine block placement hot prob threshold for both static and runtime ↵Dehao Chen2016-06-141-8/+16
| | | | | | | | | | | | | | profile. Summary: With runtime profile, we have more confidence in branch probability, thus during basic block layout, we set a lower hot prob threshold so that blocks can be layouted optimally. Reviewers: djasper, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20991 llvm-svn: 272729
* [MBP] Interface cleanups /NFCXinliang David Li2016-06-131-59/+61
| | | | | | | | | | | | Save machine function pointer so that the reference does not need to be passed around. This also gives other methods access to machine function for information such as entry count etc. llvm-svn: 272594
* [MBP] Code cleanup #3 /NFCXinliang David Li2016-06-131-43/+137
| | | | | | | | | | | | | | | This is third patch to clean up the code. Included in this patch: 1. Further unclutter trace/chain formation main routine; 2. Isolate the logic to compute global cost/conflict detection into its own method; 3. Heavily document the selection algorithm; 4. Added helper hook to allow PGO specific logic to be added in the future. llvm-svn: 272582
* [MBP] Code cleanup /NFCXinliang David Li2016-06-121-43/+73
| | | | | | | | | | This is second patch to clean up the code. In this patch, the logic to determine block outlinining is refactored and more comments are added. llvm-svn: 272514
* [MBP] Code cleanup /NFCXinliang David Li2016-06-111-30/+59
| | | | | | | | | | | | This is one of the patches to clean up the code so that it is in a better form to make future enhancements easier. In htis patch, the logic to collect viable successors are extrated as a helper to unclutter the caller which gets very large recenty. Also cleaned up BP adjustment code. llvm-svn: 272482
* Reapply "[MBP] Reduce code size by running tail merging in MBP.""Haicheng Wu2016-06-091-3/+36
| | | | | | | | | | | | | | | | This reapplies commit r271930, r271915, r271923. They hit a bug in Thumb which is fixed in r272258 now. The original message: The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. This patch calls Tail Merging after MBP and calls MBP again if Tail Merging merges anything. llvm-svn: 272267
* Revive http://reviews.llvm.org/D12778 to handle forward-hot-prob and ↵Dehao Chen2016-06-081-3/+10
| | | | | | | | | | | | | | | | | | | | | | | backward-hot-prob consistently. Summary: Consider the following diamond CFG: A / \ B C \/ D Suppose A->B and A->C have probabilities 81% and 19%. In block-placement, A->B is called a hot edge and the final placement should be ABDC. However, the current implementation outputs ABCD. This is because when choosing the next block of B, it checks if Freq(C->D) > Freq(B->D) * 20%, which is true (if Freq(A) = 100, then Freq(B->D) = 81, Freq(C->D) = 19, and 19 > 81*20%=16.2). Actually, we should use 25% instead of 20% as the probability here, so that we have 19 < 81*25%=20.25, and the desired ABDC layout will be generated. Reviewers: djasper, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20989 llvm-svn: 272203
* Revert "[MBP] Reduce code size by running tail merging in MBP."Haicheng Wu2016-06-071-36/+3
| | | | | | | This reverts commit r271930, r271915, r271923. They break a thumb selfhosting bot. llvm-svn: 272017
* [MBP] Reduce code size by running tail merging in MBP.Haicheng Wu2016-06-061-3/+36
| | | | | | | | | | | | | The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. This patch calls Tail Merging after MBP and calls MBP again if Tail Merging merges anything. Differential Revision: http://reviews.llvm.org/D20276 llvm-svn: 271925
* Replace hard coded probability threshold with parameter /NFCXinliang David Li2016-06-031-1/+3
| | | | llvm-svn: 271751
* [MBP] Factor out the optimizations on branch conditions and unanalyzable ↵Haicheng Wu2016-05-241-44/+49
| | | | | | | | | | | | | | | | | | | | | branches. NFCI. The benefits of this patch are -- We call AnalyzeBranch() to optimize unanalyzable branches, but the result of AnalyzeBranch() is not used. Now the result is useful. -- Before the layout of all the MBBs is set, the result of AnalyzeBranch() is not correct and needs to be fixed before using it to optimize the branch conditions. Now this optimization is called after the layout, the code used to fix the result of AnalyzeBranch() is not needed. -- The branch condition of the last block is not optimized before. Now it is optimized. Differential Revision: http://reviews.llvm.org/D20177 llvm-svn: 270623
* [MBP] Remove a redundant skipFunction(). NFC.Haicheng Wu2016-05-181-3/+0
| | | | | | | | skipFunction() is called twice. Differential Revision: http://reviews.llvm.org/D20377 llvm-svn: 269994
OpenPOWER on IntegriCloud