summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/MachineBlockPlacement.cpp
Commit message (Collapse)AuthorAgeFilesLines
* CodeGen: BlockPlacement: Reduce TriangleChainCount to 2Kyle Butt2017-03-161-1/+1
| | | | | | | | | This produces a 1% speedup on an important internal Google benchmark (protocol buffers), with no other regressions in google or in the llvm test-suite. Only 5 targets in the entire llvm test-suite are affected, and on those 5 targets the size increase is 0.027% llvm-svn: 297925
* CodeGen: BlockPlacement: Precompute layout for chains of triangles.Kyle Butt2017-03-031-0/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | For chains of triangles with small join blocks that can be tail duplicated, a simple calculation of probabilities is insufficient. Tail duplication can be profitable in 3 different ways for these cases: 1) The post-dominators marked 50% are actually taken 56% (This shrinks with longer chains) 2) The chains are statically correlated. Branch probabilities have a very U-shaped distribution. [http://nrs.harvard.edu/urn-3:HUL.InstRepos:24015805] If the branches in a chain are likely to be from the same side of the distribution as their predecessor, but are independent at runtime, this transformation is profitable. (Because the cost of being wrong is a small fixed cost, unlike the standard triangle layout where the cost of being wrong scales with the # of triangles.) 3) The chains are dynamically correlated. If the probability that a previous branch was taken positively influences whether the next branch will be taken We believe that 2 and 3 are common enough to justify the small margin in 1. The code pre-scans a function's CFG to identify this pattern and marks the edges so that the standard layout algorithm can use the computed results. llvm-svn: 296845
* CodeGen: MachineBlockPlacement: Remove the unused outlining heuristic.Kyle Butt2017-03-021-98/+1
| | | | | | | Outlining optional branches isn't a good heuristic, and it's never been on by default. Remove it to clean things up. llvm-svn: 296818
* CodeGen: MachineBlockPlacement: Rename member to more general name. NFC.Kyle Butt2017-02-231-13/+11
| | | | | | | | | Rename ComputedTrellisEdges to ComputedEdges to allow for other methods of pre-computing edges. Differential Revision: https://reviews.llvm.org/D30308 llvm-svn: 296018
* Codegen: Make chains from trellis-shaped CFGsKyle Butt2017-02-151-17/+293
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lay out trellis-shaped CFGs optimally. A trellis of the shape below: A B |\ /| | \ / | | X | | / \ | |/ \| C D would be laid out A; B->C ; D by the current layout algorithm. Now we identify trellises and lay them out either A->C; B->D or A->D; B->C. This scales with an increasing number of predecessors. A trellis is a a group of 2 or more predecessor blocks that all have the same successors. because of this we can tail duplicate to extend existing trellises. As an example consider the following CFG: B D F H / \ / \ / \ / \ A---C---E---G---Ret Where A,C,E,G are all small (Currently 2 instructions). The CFG preserving layout is then A,B,C,D,E,F,G,H,Ret. The current code will copy C into B, E into D and G into F and yield the layout A,C,B(C),E,D(E),F(G),G,H,ret define void @straight_test(i32 %tag) { entry: br label %test1 test1: ; A %tagbit1 = and i32 %tag, 1 %tagbit1eq0 = icmp eq i32 %tagbit1, 0 br i1 %tagbit1eq0, label %test2, label %optional1 optional1: ; B call void @a() br label %test2 test2: ; C %tagbit2 = and i32 %tag, 2 %tagbit2eq0 = icmp eq i32 %tagbit2, 0 br i1 %tagbit2eq0, label %test3, label %optional2 optional2: ; D call void @b() br label %test3 test3: ; E %tagbit3 = and i32 %tag, 4 %tagbit3eq0 = icmp eq i32 %tagbit3, 0 br i1 %tagbit3eq0, label %test4, label %optional3 optional3: ; F call void @c() br label %test4 test4: ; G %tagbit4 = and i32 %tag, 8 %tagbit4eq0 = icmp eq i32 %tagbit4, 0 br i1 %tagbit4eq0, label %exit, label %optional4 optional4: ; H call void @d() br label %exit exit: ret void } here is the layout after D27742: straight_test: # @straight_test ; ... Prologue elided ; BB#0: # %entry ; A (merged with test1) ; ... More prologue elided mr 30, 3 andi. 3, 30, 1 bc 12, 1, .LBB0_2 ; BB#1: # %test2 ; C rlwinm. 3, 30, 0, 30, 30 beq 0, .LBB0_3 b .LBB0_4 .LBB0_2: # %optional1 ; B (copy of C) bl a nop rlwinm. 3, 30, 0, 30, 30 bne 0, .LBB0_4 .LBB0_3: # %test3 ; E rlwinm. 3, 30, 0, 29, 29 beq 0, .LBB0_5 b .LBB0_6 .LBB0_4: # %optional2 ; D (copy of E) bl b nop rlwinm. 3, 30, 0, 29, 29 bne 0, .LBB0_6 .LBB0_5: # %test4 ; G rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 b .LBB0_7 .LBB0_6: # %optional3 ; F (copy of G) bl c nop rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 .LBB0_7: # %optional4 ; H bl d nop .LBB0_8: # %exit ; Ret ld 30, 96(1) # 8-byte Folded Reload addi 1, 1, 112 ld 0, 16(1) mtlr 0 blr The tail-duplication has produced some benefit, but it has also produced a trellis which is not laid out optimally. With this patch, we improve the layouts of such trellises, and decrease the cost calculation for tail-duplication accordingly. This patch produces the layout A,C,E,G,B,D,F,H,Ret. This layout does have back edges, which is a negative, but it has a bigger compensating positive, which is that it handles the case where there are long strings of skipped blocks much better than the original layout. Both layouts handle runs of executed blocks equally well. Branch prediction also improves if there is any correlation between subsequent optional blocks. Here is the resulting concrete layout: straight_test: # @straight_test ; BB#0: # %entry ; A (merged with test1) mr 30, 3 andi. 3, 30, 1 bc 12, 1, .LBB0_4 ; BB#1: # %test2 ; C rlwinm. 3, 30, 0, 30, 30 bne 0, .LBB0_5 .LBB0_2: # %test3 ; E rlwinm. 3, 30, 0, 29, 29 bne 0, .LBB0_6 .LBB0_3: # %test4 ; G rlwinm. 3, 30, 0, 28, 28 bne 0, .LBB0_7 b .LBB0_8 .LBB0_4: # %optional1 ; B (Copy of C) bl a nop rlwinm. 3, 30, 0, 30, 30 beq 0, .LBB0_2 .LBB0_5: # %optional2 ; D (Copy of E) bl b nop rlwinm. 3, 30, 0, 29, 29 beq 0, .LBB0_3 .LBB0_6: # %optional3 ; F (Copy of G) bl c nop rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 .LBB0_7: # %optional4 ; H bl d nop .LBB0_8: # %exit Differential Revision: https://reviews.llvm.org/D28522 llvm-svn: 295223
* include function name in dot filenameXinliang David Li2017-02-151-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D29975 llvm-svn: 295220
* [CodeGen]: BlockPlacement: Skip extraneous logging.Kyle Butt2017-02-041-3/+3
| | | | | | | Move a check for blocks that are not candidates for tail duplication up before the logging. Reduces logging noise. No non-logging changes intended. llvm-svn: 294086
* [CodeGen]: BlockPlacement: Apply const liberally. NFCKyle Butt2017-02-041-94/+103
| | | | | | | | Anything that needs to be passed to AnalyzeBranch unfortunately can't be const, or more would be const. Added const_iterator to BlockChain to allow BlockChain to be const when we don't expect to change it. llvm-svn: 294085
* [PGO] internal option cleanupsXinliang David Li2017-02-021-0/+6
| | | | | | | | | | 1. Added comments for options 2. Added missing option cl::desc field 3. Uniified function filter option for graph viewing. Now PGO count/raw-counts share the same filter option: -view-bfi-func-name=. llvm-svn: 293938
* [PGO] make graph view internal options available for all buildsXinliang David Li2017-02-021-4/+0
| | | | | | Differential Revision: https://reviews.llvm.org/D29259 llvm-svn: 293921
* CodeGen: Allow small copyable blocks to "break" the CFG.Kyle Butt2017-01-311-35/+327
| | | | | | | | | | | When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability the other direction. For small blocks that can be duplicated we now skip that requirement as well, subject to some simple frequency calculations. Differential Revision: https://reviews.llvm.org/D28583 llvm-svn: 293716
* Add support to dump dot graph block layout after MBPXinliang David Li2017-01-291-0/+14
| | | | | | Differential Revision: https://reviews.llvm.org/D29141 llvm-svn: 293408
* Revert "CodeGen: Allow small copyable blocks to "break" the CFG."Kyle Butt2017-01-111-52/+7
| | | | | | | | | This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded. This needs a simple probability check because there are some cases where it is not profitable. llvm-svn: 291695
* CodeGen: Allow small copyable blocks to "break" the CFG.Kyle Butt2017-01-101-7/+52
| | | | | | | | | | | When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability the other direction. For small blocks that can be duplicated we now skip that requirement as well. Differential revision: https://reviews.llvm.org/D27742 llvm-svn: 291609
* Trying to fix NDEBUG build after r289764Hal Finkel2016-12-151-0/+2
| | | | llvm-svn: 289766
* [MachineBlockPlacement] Don't make blocks "uneditable"Sanjoy Das2016-12-151-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes an issue with MachineBlockPlacement due to a badly timed call to `analyzeBranch` with `AllowModify` set to true. The timeline is as follows: 1. `MachineBlockPlacement::maybeTailDuplicateBlock` calls `TailDup.shouldTailDuplicate` on its argument, which in turn calls `analyzeBranch` with `AllowModify` set to true. 2. This `analyzeBranch` call edits the terminator sequence of the block based on the physical layout of the machine function, turning an unanalyzable non-fallthrough block to a unanalyzable fallthrough block. Normally MBP bails out of rearranging such blocks, but this block was unanalyzable non-fallthrough (and thus rearrangeable) the first time MBP looked at it, and so it goes ahead and decides where it should be placed in the function. 3. When placing this block MBP fails to analyze and thus update the block in keeping with the new physical layout. Concretely, before (1) we have something like: ``` LBL0: < unknown terminator op that may branch to LBL1 > jmp LBL1 LBL1: ... A LBL2: ... B ``` In (2), analyze branch simplifies this to ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump removed LBL1: ... A LBL2: ... B ``` In (3), MachineBlockPlacement goes ahead with its plan of putting LBL2 after the first block since that is profitable. ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump LBL2: ... B LBL1: ... A ``` and the program now has incorrect behavior (we no longer fall-through from `LBL0` to `LBL1`) because MBP can no longer edit LBL0. There are several possible solutions, but I went with removing the teeth off of the `analyzeBranch` calls in TailDuplicator. That makes thinking about the result of these calls easier, and breaks nothing in the lit test suite. I've also added some bookkeeping to the MachineBlockPlacement pass and used that to write an assert that would have caught this. Reviewers: chandlerc, gberry, MatzeB, iteratee Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27783 llvm-svn: 289764
* Make block placement deterministicRong Xu2016-11-161-3/+3
| | | | | | | | | | | | We fail to produce bit-to-bit matching stage2 and stage3 compiler in PGO bootstrap build. The reason is because LoopBlockSet is of SmallPtrSet type whose iterating order depends on the pointer value. This patch fixes this issue by changing to use SmallSetVector. Differential Revision: http://reviews.llvm.org/D26634 llvm-svn: 287148
* Move the initialization of PreferredLoopExit into runOnMachineFunction to be ↵Eric Christopher2016-11-011-1/+5
| | | | | | near the other function specific initializations. llvm-svn: 285758
* Fix uninitialized access in MachineBlockPlacement.Sam McCall2016-11-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: Currently PreferredLoopExit is set only in buildLoopChains, which is never called if there are no MachineLoops. MSan is currently broken by this: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/145/steps/check-llvm%20msan/logs/stdio This is a naive fix to get things green again. iteratee: you may have a better fix. This change will also mean PreferredLoopExit will not carry over if buildCFGChains() is called a second time in runOnMachineFunction, this appears to be the right thing. Reviewers: bkramer, iteratee, echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26069 llvm-svn: 285757
* CodeGen: Handle missed case of block removal during BlockPlacement.Kyle Butt2016-10-271-4/+10
| | | | | | | | | There is a use after free bug in the existing code. Loop layout selects a preferred exit block, and then lays out the loop. If this block is removed during layout, it needs to be invalidated to prevent a use after free. llvm-svn: 285348
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-111-27/+278
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283934
* Revert "Codegen: Tail-duplicate during placement."Daniel Jasper2016-10-111-279/+27
| | | | | | | | | This reverts commit r283842. test/CodeGen/X86/tail-dup-repeat.ll causes and llc crash with our internal testing. I'll share a link with you. llvm-svn: 283857
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-111-27/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283842
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-081-279/+27
| | | | | | This reverts commit 71c312652c10f1855b28d06697c08d47e7a243e4. llvm-svn: 283647
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-071-27/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283619
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-051-279/+27
| | | | | | | | | | This reverts commit 062ace9764953e9769142c1099281a345f9b6bdc. Issue with loop info and block removal revealed by polly. I have a fix for this issue already in another patch, I'll re-roll this together with that fix, and a test case. llvm-svn: 283292
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-041-27/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283274
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-041-274/+27
| | | | | | | | This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2. Causing crashes on aarch64 build. llvm-svn: 283172
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-041-27/+274
| | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. llvm-svn: 283164
* Finish renaming remaining analyzeBranch functionsMatt Arsenault2016-09-141-2/+2
| | | | llvm-svn: 281535
* Make analyzeBranch family of instruction names consistentMatt Arsenault2016-09-141-1/+1
| | | | | | | analyzeBranch was renamed to use lowercase first, rename the related set to match. llvm-svn: 281506
* Branch Folding: Accept explicit threshold for tail merge size.Kyle Butt2016-08-181-1/+3
| | | | | | | | This is prep work for allowing the threshold to be different during layout, and to enforce a single threshold between merging and duplicating during layout. No observable change intended. llvm-svn: 279117
* [MBP] do not reorder and move up loop latch blockSjoerd Meijer2016-08-161-0/+10
| | | | | | | | | | Do not reorder and move up a loop latch block before a loop header when optimising for size because this will generate an extra unconditional branch. Differential Revision: https://reviews.llvm.org/D22521 llvm-svn: 278840
* Use the range variant of remove_if instead of unpacking begin/endDavid Majnemer2016-08-121-4/+4
| | | | | | No functionality change is intended. llvm-svn: 278475
* Use the range variant of find instead of unpacking begin/endDavid Majnemer2016-08-111-3/+2
| | | | | | | | | If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278433
* Codegen: MachineBlockPlacement Improve probability layout.Kyle Butt2016-07-291-15/+45
| | | | | | | | | | | | | | | | | | | | | | | | | The following pattern was being layed out poorly: A / \ B C / \ / \ D E ? (Doesn't matter) Where A->B is far more likely than A->C, and prob(B->D) = prob(B->E) The current algorithm gives: A,B,C,E (D goes on worklist) It does this even if C has a frequency count of 0. This patch adjusts the layout calculation so that if freq(B->E) >> freq(C->E) then we go ahead and layout E rather than C. Fallthrough half the time is better than fallthrough never, or fallthrough very rarely. The resulting layout is: A,B,E, (C and D are in a worklist) llvm-svn: 277187
* [MBP] Added some more debug messages and some clean ups /NFCSjoerd Meijer2016-07-271-11/+31
| | | | | | Differential Revision: https://reviews.llvm.org/D22669 llvm-svn: 276849
* [MBP] Clean up of the comments, and a first attempt to better describe a partSjoerd Meijer2016-07-151-28/+49
| | | | | | | | of the algorithm. Differential Revision: https://reviews.llvm.org/D22364 llvm-svn: 275595
* Rename AnalyzeBranch* to analyzeBranch*.Jacques Pienaar2016-07-151-6/+6
| | | | | | | | | | | | Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect. Reviewers: tstellarAMD, mcrosier Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai Differential Revision: https://reviews.llvm.org/D22409 llvm-svn: 275564
* [MBP] method interface cleanupXinliang David Li2016-07-011-25/+20
| | | | | | | Make worklist and ehworklist member of the class so that they don't need to be passed around. llvm-svn: 274333
* Codegen: [MBP] Add messages to asserts. NFCKyle Butt2016-06-281-3/+4
| | | | llvm-svn: 274075
* [MBP] show function name in debug dumpXinliang David Li2016-06-241-0/+1
| | | | llvm-svn: 273744
* Codegen: [MBP] Add assert strings. NFCKyle Butt2016-06-171-2/+2
| | | | llvm-svn: 273067
* [MBP] add comments and bug fixXinliang David Li2016-06-151-3/+13
| | | | | | | | | | Document the new parameter and threshod computation model. Also fix a bug when the threshold parameter is set to be different from the default. llvm-svn: 272749
* Set machine block placement hot prob threshold for both static and runtime ↵Dehao Chen2016-06-141-8/+16
| | | | | | | | | | | | | | profile. Summary: With runtime profile, we have more confidence in branch probability, thus during basic block layout, we set a lower hot prob threshold so that blocks can be layouted optimally. Reviewers: djasper, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20991 llvm-svn: 272729
* [MBP] Interface cleanups /NFCXinliang David Li2016-06-131-59/+61
| | | | | | | | | | | | Save machine function pointer so that the reference does not need to be passed around. This also gives other methods access to machine function for information such as entry count etc. llvm-svn: 272594
* [MBP] Code cleanup #3 /NFCXinliang David Li2016-06-131-43/+137
| | | | | | | | | | | | | | | This is third patch to clean up the code. Included in this patch: 1. Further unclutter trace/chain formation main routine; 2. Isolate the logic to compute global cost/conflict detection into its own method; 3. Heavily document the selection algorithm; 4. Added helper hook to allow PGO specific logic to be added in the future. llvm-svn: 272582
* [MBP] Code cleanup /NFCXinliang David Li2016-06-121-43/+73
| | | | | | | | | | This is second patch to clean up the code. In this patch, the logic to determine block outlinining is refactored and more comments are added. llvm-svn: 272514
* [MBP] Code cleanup /NFCXinliang David Li2016-06-111-30/+59
| | | | | | | | | | | | This is one of the patches to clean up the code so that it is in a better form to make future enhancements easier. In htis patch, the logic to collect viable successors are extrated as a helper to unclutter the caller which gets very large recenty. Also cleaned up BP adjustment code. llvm-svn: 272482
* Reapply "[MBP] Reduce code size by running tail merging in MBP.""Haicheng Wu2016-06-091-3/+36
| | | | | | | | | | | | | | | | This reapplies commit r271930, r271915, r271923. They hit a bug in Thumb which is fixed in r272258 now. The original message: The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. This patch calls Tail Merging after MBP and calls MBP again if Tail Merging merges anything. llvm-svn: 272267
OpenPOWER on IntegriCloud