summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/MachineBlockPlacement.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* Do not select EhPad BB in MachineBlockPlacement when there is regular BB to ↵Amaury Sechet2016-04-071-12/+66
| | | | | | | | | | | | | | | | | schedule Summary: EHPad BB are not entered the classic way and therefor do not need to be placed after their predecessors. This patch make sure EHPad BB are not chosen amongst successors to form chains, and are selected as last resort when selecting the best candidate. EHPad are scheduled in reverse probability order in order to have them flow into each others naturally. Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17625 llvm-svn: 265726
* [BlockPlacement] Remove an unnecessary continueAmaury Sechet2016-04-071-1/+0
| | | | | | NFC. llvm-svn: 265643
* [MBP] Remove an unused function parameterAmaury Sechet2016-04-071-5/+3
| | | | | | NFC. llvm-svn: 265642
* Revert "[BlockPlacement] Remove an unnecessary continue" and "[MBP] Remove ↵Amaury Sechet2016-04-071-3/+6
| | | | | | an unused function parameter" llvm-svn: 265638
* [MBP] Remove an unused function parameterHaicheng Wu2016-04-061-5/+3
| | | | | | NFC. llvm-svn: 265596
* [BlockPlacement] Remove an unnecessary continueHaicheng Wu2016-04-051-1/+0
| | | | | | NFC. llvm-svn: 265407
* Factor out MachineBlockPlacement::fillWorkLists. NFCAmaury Sechet2016-03-141-36/+39
| | | | | | | | | | | | Summary: There are places in MachineBlockPlacement where a worklist is filled in pretty much identical way. The code is duplicated. This refactor it so that the same code is used in both scenarii. Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18077 llvm-svn: 263495
* Minor code cleanup. NFC.Junmo Park2016-03-111-1/+1
| | | | llvm-svn: 263196
* [MBP] Renaming a confusing variable and add clarifying commentsPhilip Reames2016-03-031-19/+24
| | | | | | Was discussed as part of http://reviews.llvm.org/D17830 llvm-svn: 262571
* [MBP] Avoid placing random blocks between loop preheader and headerPhilip Reames2016-03-031-1/+2
| | | | | | | | | | If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code. The particular reason for the rejection of the loop chain was that we were scanning predecessor of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact the backedge precessor was already part of the existing loop chain (oops!. Differential Revision: http://reviews.llvm.org/D17830 llvm-svn: 262547
* [MBP] Remove overly verbose debug outputPhilip Reames2016-03-021-5/+2
| | | | llvm-svn: 262531
* [MBP] Adjust debug output to be more focused and approachablePhilip Reames2016-03-021-18/+9
| | | | llvm-svn: 262522
* Partially revert "Add command line options to force function/loop alignments."Chad Rosier2016-01-211-10/+0
| | | | | | This partially reverts r256571 in favor of the solution in r258409. llvm-svn: 258421
* [BlockPlacement] Add option to align all non-fall-through blocks.Geoff Berry2016-01-211-0/+16
| | | | | | | | | | | | Summary: This option is being added for testing purposes. Reviewers: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16410 llvm-svn: 258409
* Add command line options to force function/loop alignments.Chad Rosier2015-12-291-0/+10
| | | | | | | These are being added for testing purposes. http://reviews.llvm.org/D15648 llvm-svn: 256571
* Replace all weight-based interfaces in MBB with probability-based ↵Cong Hou2015-12-011-41/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | interfaces, and update all uses of old interfaces. (This is the second attempt to submit this patch. The first caused two assertion failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687) The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254377
* Revert r254348: "Replace all weight-based interfaces in MBB with ↵Hans Wennborg2015-12-011-30/+41
| | | | | | | | | | probability-based interfaces, and update all uses of old interfaces." and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction." Asserts were firing in Chromium builds. See PR25687. llvm-svn: 254366
* Fix a bug in MachineBlockPlacement that may cause assertion failure during ↵Cong Hou2015-12-011-3/+7
| | | | | | | | BranchProbability construction. The root cause is the rounding behavior in BranchProbability construction. We may consider to use truncation instead in the future. llvm-svn: 254356
* Replace all weight-based interfaces in MBB with probability-based ↵Cong Hou2015-12-011-41/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | interfaces, and update all uses of old interfaces. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254348
* Improving edge probabilities computation when choosing the best successor in ↵Cong Hou2015-11-181-13/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | machine block placement. When looking for the best successor from the outer loop for a block belonging to an inner loop, the edge probability computation can be improved so that edges in the inner loop are ignored. For example, suppose we are building chains for the non-loop part of the following code, and looking for B1's best successor. Assume the true body is very hot, then B3 should be the best candidate. However, because of the existence of the back edge from B1 to B0, the probability from B1 to B3 can be very small, preventing B3 to be its successor. In this patch, when computing the probability of the edge from B1 to B3, the weight on the back edge B1->B0 is ignored, so that B1->B3 will have 100% probability. if (...) do { B0; ... // some branches B1; } while(...); else B2; B3; Differential revision: http://reviews.llvm.org/D10825 llvm-svn: 253414
* In MachineBlockPlacement, filter cold blocks off the loop chain when profile ↵Cong Hou2015-11-021-2/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | data is available. In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawback that cold blocks in the loop may be inserted on a hot function path, hence increasing branch cost and also reducing icache locality. Consider a simple example shown below: A | B⇆C | D When B->C is quite cold, the best BB-layout should be A,B,D,C. But the current implementation produces A,C,B,D. This patch filters those cold blocks off from the loop chain by comparing the ratio: LoopBBFreq / LoopFreq to 20%: if it is less than 20%, we don't include this BB to the loop chain. Here LoopFreq is the frequency of the loop when we reduce the loop into a single node. In general we have more cold blocks when the loop has few iterations. And vice versa. Differential revision: http://reviews.llvm.org/D11662 llvm-svn: 251833
* Enhance loop rotation with existence of profile data in ↵Cong Hou2015-10-191-3/+184
| | | | | | | | | | | | | | | | MachineBlockPlacement pass. Currently, in MachineBlockPlacement pass the loop is rotated to let the best exit to be the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation: 1. The possibly missed fall through edge (if it exists) from BB out of the loop to the loop header. 2. The possibly missed fall through edges (if they exist) from the loop exits to BB out of the loop. 3. The missed fall through edge (if it exists) from the last BB to the first BB in the loop chain. Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies. Differential revision: http://reviews.llvm.org/D10717 llvm-svn: 250754
* CodeGen: Remove implicit iterator conversions from MBB.cppDuncan P. N. Exon Smith2015-10-091-8/+8
| | | | | | | | | | Remove implicit ilist iterator conversions from MachineBasicBlock.cpp. I've also added an overload of `splice()` that takes a pointer, since it's a natural API. This is similar to the overloads I added for `remove()` and `erase()` in r249867. llvm-svn: 249883
* Fix a spelling error in the description of a statistic. NFCCraig Topper2015-09-161-1/+1
| | | | llvm-svn: 247771
* [WinEH] Add some support for code generating catchpadReid Kleckner2015-08-271-1/+1
| | | | | | | We can now run 32-bit programs with empty catch bodies. The next step is to change PEI so that we get funclet prologues and epilogues. llvm-svn: 246235
* Revert r244154 which causes some build failure. See ↵Cong Hou2015-08-061-4/+6
| | | | | | https://llvm.org/bugs/show_bug.cgi?id=24377. llvm-svn: 244239
* Record whether the weights on out-edges from a MBB are normalized.Cong Hou2015-08-051-6/+4
| | | | | | | | | | | | 1. Create a utility function normalizeEdgeWeights() in MachineBranchProbabilityInfo that normalizes a list of edge weights so that the sum of then can fit in uint32_t. 2. Provide an interface in MachineBasicBlock to normalize its successors' weights. 3. Add a flag in MachineBasicBlock that tracks whether its successors' weights are normalized. 4. Provide an overload of getSumForBlock that accepts a non-const pointer to a MBB so that it can force normalizing this MBB's successors' weights. 5. Update several uses of getSumForBlock() by eliminating the once needed weight scale. Differential Revision: http://reviews.llvm.org/D11442 llvm-svn: 244154
* wrap OptSize and MinSize attributes for easier and consistent access (NFCI)Sanjay Patel2015-08-041-0/+1
| | | | | | | | | | | | | | | | | Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994
* Test commit.Cong Hou2015-07-151-1/+0
| | | | | | This is a test commit (one blank line deleted). llvm-svn: 242308
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-3/+3
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-3/+3
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* [MBP] Spell the conditions the same way through out this if statement.Chandler Carruth2015-04-151-1/+1
| | | | | | NFC. llvm-svn: 235009
* [MBP] Sink a comment into the if block to which it pertains. This makesChandler Carruth2015-04-151-1/+1
| | | | | | the content of the comment make much more sense. llvm-svn: 235007
* [MBP] Fix a really misleading typo in a comment.Chandler Carruth2015-04-151-1/+1
| | | | llvm-svn: 235006
* Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.Benjamin Kramer2015-03-231-0/+1
| | | | llvm-svn: 232998
* [MBP] Don't outline short optional branchesDaniel Jasper2015-03-201-2/+25
| | | | | | | | | | | | | | | | | | With the option -outline-optional-branches, LLVM will place optional branches out of line (more details on r231230). With this patch, this is not done for short optional branches. A short optional branch is a branch containing a single block with an instruction count below a certain threshold (defaulting to 3). Still everything is guarded under -outline-optional-branches). Outlining a short branch can't significantly improve code locality. It can however decrease performance because of the additional jmp and in cases where the optional branch is hot. This fixes a compile time regression I have observed in a benchmark. Review: http://reviews.llvm.org/D8108 llvm-svn: 232802
* [MBP] Use range based for-loops throughout this code. Several hadChandler Carruth2015-03-051-140/+108
| | | | | | | already been added and the inconsistency made choosing names and changing code more annoying. Plus, wow are they better for this code! llvm-svn: 231347
* [MBP] NFC, run clang-format over this code and tweak things to make theChandler Carruth2015-03-051-71/+62
| | | | | | | | | | | result reasonable. This code predated clang-format and so there was a reasonable amount of crufty formatting that had accumulated. This should ensure that neither myself nor others end up with formatting-only changes sneaking into other fixes. llvm-svn: 231341
* [MBP] This is no longer 'block-placement2'. ;] The old variants are longChandler Carruth2015-03-051-3/+3
| | | | | | gone, update this code to reflect that. llvm-svn: 231340
* [MBP] Revert r231238 which attempted to fix a nasty bug where MBP isChandler Carruth2015-03-051-26/+0
| | | | | | | | | | | | | just arbitrarily interleaving unrelated control flows once they get moved "out-of-line" (both outside of natural CFG ordering and with diamonds that cannot be fully laid out by chaining fallthrough edges). This easy solution doesn't work in practice, and it isn't just a small bug. It looks like a very different strategy will be required. I'm working on that now, and it'll again go behind some flag so that everyone can experiment and make sure it is working well for them. llvm-svn: 231332
* [MBP] Fix a really horrible bug in MachineBlockPlacement, but behindChandler Carruth2015-03-041-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a flag for now. First off, thanks to Daniel Jasper for really pointing out the issue here. It's been here forever (at least, I think it was there when I first wrote this code) without getting really noticed or fixed. The key problem is what happens when two reasonably common patterns happen at the same time: we outline multiple cold regions of code, and those regions in turn have diamonds or other CFGs for which we can't just topologically lay them out. Consider some C code that looks like: if (a1()) { if (b1()) c1(); else d1(); f1(); } if (a2()) { if (b2()) c2(); else d2(); f2(); } done(); Now consider the case where a1() and a2() are unlikely to be true. In that case, we might lay out the first part of the function like: a1, a2, done; And then we will be out of successors in which to build the chain. We go to find the best block to continue the chain with, which is perfectly reasonable here, and find "b1" let's say. Laying out successors gets us to: a1, a2, done; b1, c1; At this point, we will refuse to lay out the successor to c1 (f1) because there are still un-placed predecessors of f1 and we want to try to preserve the CFG structure. So we go get the next best block, d1. ... wait for it ... Except that the next best block *isn't* d1. It is b2! d1 is waaay down inside these conditionals. It is much less important than b2. Except that this is exactly what we didn't want. If we keep going we get the entire set of the rest of the CFG *interleaved*!!! a1, a2, done; b1, c1; b2, c2; d1, f1; d2, f2; So we clearly need a better strategy here. =] My current favorite strategy is to actually try to place the block whose predecessor is closest. This very simply ensures that we unwind these kinds of CFGs the way that is natural and fitting, and should minimize the number of cache lines instructions are spread across. It also happens to be *dead simple*. It's like the datastructure was specifically set up for this use case or something. We only push blocks onto the work list when the last predecessor for them is placed into the chain. So the back of the worklist *is* the nearest next block. Unfortunately, a change like this is going to cause *soooo* many benchmarks to swing wildly. So for now I'm adding this under a flag so that we and others can validate that this is fixing the problems described, that it seems possible to enable, and hopefully that it fixes more of our problems long term. llvm-svn: 231238
* Add a flag to experiment with outlining optional branches.Daniel Jasper2015-03-041-2/+46
| | | | | | | | | | | | | In a CFG with the edges A->B->C and A->C, B is an optional branch. LLVM's default behavior is to lay the blocks out naturally, i.e. A, B, C, in order to improve code locality and fallthroughs. However, if a function contains many of those optional branches only a few of which are taken, this leads to a lot of unnecessary icache misses. Moving B out of line can work around this. Review: http://reviews.llvm.org/D7719 llvm-svn: 231230
* NFC: Use range-based for loops and more consistent naming.Daniel Jasper2015-02-181-19/+15
| | | | | | | | | No functional changes intended. (I plan on doing some modifications to this function and would like to have as few unrelated changes as possible in the patch) llvm-svn: 229649
* Remove experimental options to control machine block placement.Daniel Jasper2015-02-181-35/+20
| | | | | | | This reverts r226034. Benchmarking with those flags has not revealed anything interesting. llvm-svn: 229648
* CodeGen: Canonicalize access to function attributes, NFCDuncan P. N. Exon Smith2015-02-141-2/+1
| | | | | | | | | | | | | | | | | Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) Also, add `Function::getFnStackAlignment()`, and canonicalize: getAttributes().getStackAlignment(AttributeSet::FunctionIndex) => getFnStackAlignment() llvm-svn: 229208
* [MBP] Add flags to disable the BadCFGConflict check in MachineBlockPlacement.Chandler Carruth2015-01-141-20/+35
| | | | | | | | | | | | | | | | | | | | | | | | Some benchmarks have shown that this could lead to a potential performance benefit, and so adding some flags to try to help measure the difference. A possible explanation. In diamond-shaped CFGs (A followed by either B or C both followed by D), putting B and C both in between A and D leads to the code being less dense than it could be. Always either B or C have to be skipped increasing the chance of cache misses etc. Moving either B or C to after D might be beneficial on average. In the long run, but we should probably do a better job of analyzing the basic block and branch probabilities to move the correct one of B or C to after D. But even if we don't use this in the long run, it is a good baseline for benchmarking. Original patch authored by Daniel Jasper with test tweaks and a second flag added by me. Differential Revision: http://reviews.llvm.org/D6969 llvm-svn: 226034
* [PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preferenceHal Finkel2015-01-031-3/+4
| | | | | | | | | | | | | | | | | The existing code provided for specifying a global loop alignment preference. However, the preferred loop alignment might depend on the loop itself. For recent POWER cores, loops between 5 and 8 instructions should have 32-byte alignment (while the others are better with 16-byte alignment) so that the entire loop will fit in one i-cache line. To support this, getPrefLoopAlignment has been made virtual, and can be provided with an optional MachineLoop* so the target can inspect the loop before answering the query. The default behavior, as before, is to return the value set with setPrefLoopAlignment. MachineBlockPlacement now queries the target for each loop instead of only once per function. There should be no functional change for other targets. llvm-svn: 225117
* Update SetVector to rely on the underlying set's insert to return a ↵David Blaikie2014-11-191-2/+2
| | | | | | | | | | | | | pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This lead to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334
* Have MachineFunction cache a pointer to the subtarget to make lookupsEric Christopher2014-08-051-2/+2
| | | | | | | | | | | shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838
* Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher2014-08-041-2/+3
| | | | | | information and update all callers. No functional change. llvm-svn: 214781
OpenPOWER on IntegriCloud