author Vedant Kumar <vsk@apple.com> 2019-01-24 18:55:49 +0000
committer Vedant Kumar <vsk@apple.com> 2019-01-24 18:55:49 +0000
commit ef1ebed1c6851d8f32d8b4400f120be6a6af7ec1
tree cf524287efea8147f515a83521a25ff9b89b408a /llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
parent fa361206822acd0b9b4d16149f7015184c35139f
[HotColdSplit] Move splitting earlier in the pipeline

Performing splitting early has several advantages:

  - Inhibiting inlining of cold code early improves code size. Compared
    to scheduling splitting at the end of the pipeline, this cuts code
    size growth in half within the iOS shared cache (0.69% to 0.34%).

  - Inhibiting inlining of cold code improves compile time. There's no
    need to inline split cold functions, or to inline as much *within*
    those split functions as they are marked `minsize`.

  - During LTO, extra work is only done in the pre-link step. Less code
    must be inlined during cross-module inlining.

An additional motivation here is that the most common cold regions
identified by the static/conservative splitting heuristic can (a) be
found before inlining and (b) do not grow after inlining. E.g.
__assert_fail, os_log_error.

The disadvantages are:

  - Some opportunities for splitting out cold code may be missed. This
    gap can potentially be narrowed by adding a worklist algorithm to
    the splitting pass.

  - Some opportunities to reduce code size may be lost (e.g. store
    sinking, when one side of the CFG diamond is split). This does not
    outweigh the code size benefits of splitting earlier.

On net, splitting early in the pipeline has substantial code size
benefits, and no major effects on memory locality or performance. We
measured memory locality using ktrace data, and consistently found that
10% fewer pages were needed to capture 95% of text page faults in key
iOS benchmarks. We measured performance on frequency-stabilized iOS
devices using LNT+externals.

This reverses course on the decision made to schedule splitting late in
r344869 (D53437).

Differential Revision: https://reviews.llvm.org/D57082

llvm-svn: 352080

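The scheduling change described above can be pictured with a small toy model. This is only a sketch and does not use the LLVM API: `PipelineConfig`, `buildToyPipeline`, `positionOf`, and the pass-name strings are hypothetical stand-ins chosen for illustration, and the pass list is heavily abbreviated. It mirrors just the gating condition from this patch (splitting is enabled and the pipeline is not the ThinLTO post-link) and the fact that splitting is now scheduled ahead of inlining.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for the two PassManagerBuilder flags this patch reads.
struct PipelineConfig {
  bool EnableHotColdSplit = false;
  bool PerformThinLTO = false; // true only for the ThinLTO post-link pipeline
};

// Toy model of the module pipeline: returns illustrative pass labels in the
// order they would be scheduled. The real pipeline contains many more passes.
std::vector<std::string> buildToyPipeline(const PipelineConfig &Cfg) {
  std::vector<std::string> Passes;

  // Same condition the patch introduces: default or *LTO pre-link pipeline.
  const bool DefaultOrPreLinkPipeline = !Cfg.PerformThinLTO;

  Passes.push_back("deadargelim");

  // New position: split out cold code before any inlining happens.
  if (Cfg.EnableHotColdSplit && DefaultOrPreLinkPipeline)
    Passes.push_back("hotcoldsplit");

  Passes.push_back("instcombine");
  Passes.push_back("simplifycfg");
  Passes.push_back("inline");
  Passes.push_back("div-rem-pairs");
  // Old position of the splitting pass was here, after div-rem-pairs.
  Passes.push_back("simplifycfg");
  return Passes;
}

// Returns the index of the first occurrence of Name, or -1 if it is absent.
int positionOf(const std::vector<std::string> &Passes,
               const std::string &Name) {
  auto It = std::find(Passes.begin(), Passes.end(), Name);
  if (It == Passes.end())
    return -1;
  return static_cast<int>(std::distance(Passes.begin(), It));
}
```

With `EnableHotColdSplit` set and `PerformThinLTO` clear, `positionOf(buildToyPipeline(Cfg), "hotcoldsplit")` is smaller than the position of `"inline"`; with `PerformThinLTO` set, the splitting label does not appear at all, which corresponds to the extra work being done only in the pre-link step.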
Diffstat (limited to 'llvm/lib/Transforms/IPO/PassManagerBuilder.cpp')
-rw-r--r-- llvm/lib/Transforms/IPO/PassManagerBuilder.cpp 14
1 files changed, 10 insertions, 4 deletions
diff --git a/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp b/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
index 455fa0f099f..03d7088eab4 100644
--- a/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
+++ b/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
@@ -420,6 +420,10 @@ void PassManagerBuilder::addFunctionSimplificationPasses(
 void PassManagerBuilder::populateModulePassManager(
     legacy::PassManagerBase &MPM) {
+  // Whether this is a default or *LTO pre-link pipeline. The FullLTO post-link
+  // is handled separately, so just check this is not the ThinLTO post-link.
+  bool DefaultOrPreLinkPipeline = !PerformThinLTO;
+
   if (!PGOSampleUse.empty()) {
     MPM.add(createPruneEHPass());
     // In ThinLTO mode, when flattened profile is used, all the available
@@ -513,6 +517,11 @@ void PassManagerBuilder::populateModulePassManager(
   MPM.add(createDeadArgEliminationPass()); // Dead argument elimination
+  // Split out cold code before inlining. See comment in the new PM
+  // (\ref buildModuleSimplificationPipeline).
+  if (EnableHotColdSplit && DefaultOrPreLinkPipeline)
+    MPM.add(createHotColdSplittingPass());
+
   addInstructionCombiningPass(MPM); // Clean up after IPCP & DAE
   addExtensionsToPM(EP_Peephole, MPM);
   MPM.add(createCFGSimplificationPass()); // Clean up after IPCP & DAE
@@ -522,7 +531,7 @@ void PassManagerBuilder::populateModulePassManager(
   // profile annotation in backend more difficult.
   // PGO instrumentation is added during the compile phase for ThinLTO, do
   // not run it a second time
-  if (!PerformThinLTO && !PrepareForThinLTOUsingPGOSampleProfile)
+  if (DefaultOrPreLinkPipeline && !PrepareForThinLTOUsingPGOSampleProfile)
     addPGOInstrPasses(MPM);
   // We add a module alias analysis pass here. In part due to bugs in the
@@ -737,9 +746,6 @@ void PassManagerBuilder::populateModulePassManager(
   // flattening of blocks.
   MPM.add(createDivRemPairsPass());
-  if (EnableHotColdSplit)
-    MPM.add(createHotColdSplittingPass());
-
   // LoopSink (and other loop passes since the last simplifyCFG) might have
   // resulted in single-entry-single-exit or empty blocks. Clean up the CFG.
   MPM.add(createCFGSimplificationPass());