path: root/llvm/lib/Transforms/IPO/HotColdSplitting.cpp
Commit message | Author | Age | Files | Lines
* [ProfileSummary] Standardize methods and fix comment | Vedant Kumar | 2018-11-19 | 1 | -2/+2
  Every analysis pass has a get method that returns a reference to the result of the analysis, for example, BlockFrequencyInfo &BlockFrequencyInfoWrapperPass::getBFI(). I believe that ProfileSummaryInfo::getPSI() is the only exception, as it was returning a pointer.
  Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock, respectively. Most code uses BB in argument and variable names, while method names usually refer to basic blocks as Blocks rather than BB, for example Function::getEntryBlock and Loop::getExitBlock.
  I also fixed one of the comments.
  Patch by Rodrigo Caetano Rocha!
  Differential Revision: https://reviews.llvm.org/D54669
  llvm-svn: 347182
* [HotColdSplitting] Use TTI to inform outlining threshold | Vedant Kumar | 2018-11-04 | 1 | -18/+26
  Using TargetTransformInfo allows the splitting pass to factor in the code size cost of instructions as it decides whether or not outlining is profitable.
  This did not regress the overall amount of outlining seen on the handful of internal frameworks I tested.
  Thanks to Jun Bum Lim for suggesting this!
  Differential Revision: https://reviews.llvm.org/D53835
  llvm-svn: 346108
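The idea of gating outlining on a summed code size cost can be sketched as follows. This is a minimal, self-contained model, not the pass's actual code: `SketchInst`, its `codeSizeCost` field, `isProfitableToOutline`, and the "at least the threshold" comparison are all assumptions made for illustration. In LLVM the per-instruction cost would come from TargetTransformInfo rather than a stored field.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an IR instruction. The cost field models what a
// target cost query might report; it is not an LLVM type.
struct SketchInst {
  int codeSizeCost;
};

// Returns true once the summed size cost of the region reaches the threshold.
// Both the function name and the >= comparison are assumptions of this sketch.
bool isProfitableToOutline(const std::vector<SketchInst> &region,
                           int threshold) {
  assert(threshold >= 0 && "threshold must be non-negative");
  int cost = 0;
  for (const SketchInst &inst : region) {
    cost += inst.codeSizeCost;
    if (cost >= threshold)
      return true; // No need to keep summing once the region is big enough.
  }
  return false;
}
```

With a threshold of 3, a region of two unit-cost instructions is rejected, while a single instruction whose cost is 4 is accepted. Weighting by cost rather than counting instructions is what lets one expensive instruction justify outlining.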
* [HotColdSplitting] Allow outlining single-block cold regions | Vedant Kumar | 2018-10-29 | 1 | -3/+20
  It can be profitable to outline single-block cold regions because they may be large.
  Allow outlining single-block regions if they have over some threshold of non-debug, non-terminator instructions. I chose 3 as the threshold after experimenting with several internal frameworks.
  In practice, reducing the threshold further did not give much improvement, whereas increasing it resulted in substantial regressions.
  Differential Revision: https://reviews.llvm.org/D53824
  llvm-svn: 345524
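The single-block rule above can be illustrated with a small model. This is a sketch under stated assumptions: `InstKind` and `isWorthOutliningSingleBlock` are hypothetical names, a block is reduced to a list of instruction kinds, and "over some threshold" is read as strictly greater than 3, which the commit message does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical classification of instructions in a block.
enum class InstKind { Ordinary, Debug, Terminator };

// Counts instructions that are neither debug nor terminator instructions and
// compares the count against the threshold. Strictly-greater is an assumption.
bool isWorthOutliningSingleBlock(const std::vector<InstKind> &block,
                                 std::size_t threshold = 3) {
  // A well-formed block ends in a terminator.
  assert(!block.empty() && block.back() == InstKind::Terminator);
  std::size_t count = 0;
  for (InstKind kind : block)
    if (kind == InstKind::Ordinary)
      ++count;
  return count > threshold;
}
```

Under this reading, a block with three ordinary instructions plus a terminator is left alone, and a block with four is outlined. Debug instructions never count toward the total, so a build with debug info makes the same decision as one without.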
* [HotColdSplitting] Identify larger cold regions using domtree queries | Vedant Kumar | 2018-10-24 | 1 | -185/+168
  The current splitting algorithm works in three stages:
    1) Identify cold blocks, then
    2) Use forward/backward propagation to mark hot blocks, then
    3) Grow a SESE region of blocks *outside* of the set of hot blocks and start outlining.
  While testing this pass on Apple internal frameworks I noticed that some kinds of control flow (e.g. loops) are never outlined, even though they unconditionally lead to / follow cold blocks. I noticed two other issues related to how cold regions are identified:
    - An inconsistency can arise in the internal state of the hotness propagation stage, as a block may end up in both the ColdBlocks set and the HotBlocks set. Further inconsistencies can arise as these sets do not match what's in ProfileSummaryInfo.
    - It isn't necessary to limit outlining to single-exit regions.
  This patch teaches the splitting algorithm to identify maximal cold regions and outline them. A maximal cold region is defined as the set of blocks post-dominated by a cold sink block, or dominated by that sink block. This approach can successfully outline loops in the cold path. As a side benefit, it maintains less internal state than the current approach.
  Due to a limitation in CodeExtractor, blocks within the maximal cold region which aren't dominated by a single entry point (a so-called "max ancestor") are filtered out.
  Results:
    - X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to 47KB pre-patch, or a ~3x improvement. Did not see a performance impact across two runs.
    - AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB of TEXT were outlined. Ditto re: performance impact.
    - Outlining results improve marginally in the internal frameworks I tested.
  Follow-ups:
    - Outline more than once per function, outline large single basic blocks, & try to remove unconditional branches in outlined functions.
  Differential Revision: https://reviews.llvm.org/D53627
  llvm-svn: 345209
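The definition of a maximal cold region given in this commit message can be tried out on a toy CFG. The sketch below is not how the pass is implemented: it answers dominance and post-dominance queries by brute-force reachability instead of using LLVM's dominator trees, it assumes every block is reachable from the entry and can reach an exit, and it omits the "max ancestor" filtering. `Graph`, `reachableAvoiding`, `dominates`, `postDominates`, and `maximalColdRegion` are hypothetical names.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Successor lists indexed by block number; a block with no successors is an exit.
using Graph = std::vector<std::vector<int>>;

// Blocks reachable from `start` along paths that never enter `removed`.
std::set<int> reachableAvoiding(const Graph &g, int start, int removed) {
  std::set<int> seen;
  if (start == removed)
    return seen;
  std::vector<int> work{start};
  seen.insert(start);
  while (!work.empty()) {
    int block = work.back();
    work.pop_back();
    for (int succ : g[block])
      if (succ != removed && seen.insert(succ).second)
        work.push_back(succ);
  }
  return seen;
}

// a dominates b if b cannot be reached from the entry once a is removed.
bool dominates(const Graph &g, int entry, int a, int b) {
  return a == b || reachableAvoiding(g, entry, a).count(b) == 0;
}

// a post-dominates b if no exit can be reached from b once a is removed.
bool postDominates(const Graph &g, int a, int b) {
  if (a == b)
    return true;
  for (int block : reachableAvoiding(g, b, a))
    if (g[block].empty())
      return false;
  return true;
}

// Blocks post-dominated by the cold sink, or dominated by it.
std::set<int> maximalColdRegion(const Graph &g, int entry, int sink) {
  assert(sink >= 0 && sink < static_cast<int>(g.size()));
  std::set<int> region;
  for (int b = 0; b < static_cast<int>(g.size()); ++b)
    if (postDominates(g, sink, b) || dominates(g, entry, sink, b))
      region.insert(b);
  return region;
}
```

For the graph `{{1, 2}, {}, {3}, {2, 4}, {}}` with entry 0 and cold sink 4, blocks 2 and 3 form a loop that can only leave through block 4, so the region is blocks 2, 3, and 4. This is the "loops in the cold path" case that the earlier SESE-growing approach missed.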
* [hot-cold-split] Name split functions with ".cold" suffix | Teresa Johnson | 2018-10-24 | 1 | -7/+9
  Summary: The current default of appending "_"+entry block label to the new extracted cold function breaks demangling. Change the delimiter from "_" to "." to enable demangling. Because the header block label will be empty for release-compiled code, use "extracted" after the "." when the label is empty.
  Additionally, add a mechanism for the client to pass in an alternate suffix applied after the ".", and have the hot cold split pass use "cold."+Count, where the Count is currently 1 but can be used to uniquely number multiple cold functions split out from the same function with D53588.
  Reviewers: sebpop, hiraditya
  Subscribers: llvm-commits, erik.pilkington
  Differential Revision: https://reviews.llvm.org/D53534
  llvm-svn: 345178
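The naming scheme described in this summary can be written down as a short string-building sketch. The function names `extractedName` and `coldSplitName` are hypothetical and do not correspond to the CodeExtractor API; only the "." delimiter, the "extracted" fallback, and the "cold."+Count suffix come from the commit message.

```cpp
#include <cassert>
#include <string>

// Builds the name of an extracted function. A non-empty client suffix wins;
// otherwise the entry block label is used, or "extracted" if the label is empty.
std::string extractedName(const std::string &parent,
                          const std::string &entryLabel,
                          const std::string &suffix = "") {
  assert(!parent.empty() && "parent function must have a name");
  if (!suffix.empty())
    return parent + "." + suffix;
  return parent + "." +
         (entryLabel.empty() ? std::string("extracted") : entryLabel);
}

// The hot/cold splitting pass passes "cold." followed by a count.
std::string coldSplitName(const std::string &parent, unsigned count = 1) {
  return extractedName(parent, "", "cold." + std::to_string(count));
}
```

For a parent named `_Z3foov`, `coldSplitName` yields `_Z3foov.cold.1`. The commit message states that switching the delimiter from "_" to "." is what lets such names demangle.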
* [HotColdSplitting] Attach MinSize to outlined code | Vedant Kumar | 2018-10-23 | 1 | -0/+7
  Outlined code is cold by assumption, so it makes sense to optimize it for minimal code size rather than performance.
  After r344869 moved the splitting pass to the end of the IR pipeline, this does not result in much of a code size reduction. This is probably because a comparatively small number of backend transforms make use of the MinSize hint.
  Running LNT on x86_64, I see that 33/1020 binaries shrink for a total of 919 bytes of TEXT reduction. I didn't measure a significant performance impact.
  Differential Revision: https://reviews.llvm.org/D53518
  llvm-svn: 345072
* [hot-cold-split] Add opt remark on success | Teresa Johnson | 2018-10-22 | 1 | -0/+8
  Summary: Emit optimization remark on successful hot cold split.
  Reviewers: sebpop, hiraditya
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D53512
  llvm-svn: 344938
* Change a TerminatorInst* to an Instruction* in HotColdSplitting.cpp. | Lang Hames | 2018-10-15 | 1 | -1/+1
  r344558 added an assignment to a TerminatorInst* from BasicBlock::getTerminatorInst(), but BasicBlock::getTerminatorInst() has returned an Instruction* rather than a TerminatorInst* since r344504, so this fails to compile.
  Changing the variable to an Instruction* should get the bots building again.
  llvm-svn: 344566
* [hot-cold-split] fix static analysis of cold regions | Sebastian Pop | 2018-10-15 | 1 | -7/+41
  Make blockEndsInUnreachable match the function of the same name in CodeGen/BranchFolding.cpp. I have also added a note to make sure this function is not modified unless the back-end version is modified as well.
  An early return before outlining has been added to avoid outlining the full function body when the first block in the function is marked cold.
  The static analysis of cold code has been amended to avoid marking the whole function as cold by back-propagation, because the back-propagation would mark blocks with return statements as cold.
  The patch adds debug statements to help discover these problems.
  Differential Revision: https://reviews.llvm.org/D52904
  llvm-svn: 344558
* [TI removal] Make variables declared as `TerminatorInst` and initialized by `getTerminator()` calls instead be declared as `Instruction`. | Chandler Carruth | 2018-10-15 | 1 | -1/+1
  This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked).
  llvm-svn: 344502
* Improve static analysis of cold basic blocks | Aditya Kumar | 2018-10-03 | 1 | -1/+14
  Differential Revision: https://reviews.llvm.org/D52704
  Reviewers: sebpop, tejohnson, brzycki, SirishP
  Reviewed By: sebpop
  llvm-svn: 343663
* Add support for new pass manager | Aditya Kumar | 2018-10-03 | 1 | -0/+33
  Modified the test cases to use both pass managers. Use a single command-line flag for both pass managers.
  Differential Revision: https://reviews.llvm.org/D52708
  Reviewers: sebpop, tejohnson, brzycki, SirishP
  Reviewed By: tejohnson, brzycki
  llvm-svn: 343662
* HotColdSplit: fix invalid SSA due to outlining | Sebastian Pop | 2018-09-14 | 1 | -15/+16
  The test used to fail with an invalid phi node: the two predecessors were outlined and the SSA representation was left invalid. The patch adds the exit block to the cold region.
  llvm-svn: 342277
* HotColdSplit: fix isSingleEntrySingleExit | Sebastian Pop | 2018-09-14 | 1 | -10/+6
  Remove duplicate entries from isSingleEntrySingleExit: the Entry block is already added by the loop over the dominance frontier.
  Remove the heuristic from isOutlineCandidate that a region is too small when it contains only one basic block. With this change we now grow regions starting from a block and we continue adding to the ValidColdRegion. Check the heuristic just before code generation.
  llvm-svn: 342276
* HotColdSplit: add back propagation to extend cold regions | Sebastian Pop | 2018-09-14 | 1 | -18/+64
  Also fix a problem in forward propagation: const TerminatorInst *TI = It->getTerminator(); was set outside the while loop that iterates over It.
  llvm-svn: 342275
* HotColdSplitting: check that target supports cold calling convention | Sebastian Pop | 2018-09-10 | 1 | -4/+13
  Before tagging a function with coldcc, make sure the target supports the cold calling convention. Without this patch, the HotColdSplitting pass fails on aarch64 with:
    fatal error: error in backend: Unsupported calling convention.
  llvm-svn: 341838
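The guard described here amounts to asking the target before changing the calling convention. The sketch below models that with plain structs; `TargetModel`, `OutlinedFunc`, `SketchCallConv`, and `tagOutlinedFunction` are hypothetical and stand in for the target query and function objects the real pass uses.

```cpp
#include <cassert>

// Hypothetical calling conventions; Cold models LLVM's coldcc.
enum class SketchCallConv { C, Cold };

// Hypothetical target description: does the backend implement coldcc?
struct TargetModel {
  bool supportsColdCC;
};

// Hypothetical outlined function, starting with the default convention.
struct OutlinedFunc {
  SketchCallConv cc = SketchCallConv::C;
};

// Switches to the cold convention only when the target supports it, and
// reports whether the function was tagged.
bool tagOutlinedFunction(OutlinedFunc &func, const TargetModel &target) {
  if (!target.supportsColdCC)
    return false; // Keep the default convention so the backend can lower it.
  func.cc = SketchCallConv::Cold;
  return true;
}
```

On a target that does not support the convention, the outlined function keeps its default convention, which is the outcome that avoids the "Unsupported calling convention" backend error quoted above.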
* Hot cold splitting pass | Aditya Kumar | 2018-09-07 | 1 | -0/+370
  Find cold blocks based on profile information (or optionally with static analysis). Forward propagate profile information to all cold-blocks. Outline a cold region. Set calling conv and prof hint for the callsite of the outlined function.
  Worked in collaboration with: Sebastian Pop <s.pop@samsung.com>
  Differential Revision: https://reviews.llvm.org/D50658
  llvm-svn: 341669