From db3f9774eefc662cbcf976b51f459c80d2664d82 Mon Sep 17 00:00:00 2001 From: Vedant Kumar Date: Fri, 25 Jan 2019 18:30:37 +0000 Subject: [HotColdSplit] Introduce a cost model to control splitting behavior The main goal of the model is to avoid *increasing* function size, as that would eradicate any memory locality benefits from splitting. This happens when: - There are too many inputs or outputs to the cold region. Argument materialization and reloads of outputs have a cost. - The cold region has too many distinct exit blocks, causing a large switch to be formed in the caller. - The code size cost of the split code is less than the cost of a set-up call. A secondary goal is to prevent excessive overall binary size growth. With the cost model in place, I experimented to find a splitting threshold that works well in practice. To make warm & cold code easily separable for analysis purposes, I moved split functions to a "cold" section. I experimented with thresholds between [0, 4] and set the default to the threshold which minimized geomean __text size. Experiment data from building LNT+externals for X86 (N = 639 programs, all sizes in bytes): | Configuration | __text geom size | __cold geom size | TEXT geom size | | **-Os** | 1736.3 | 0, n=0 | 10961.6 | | -Os, thresh=0 | 1740.53 | 124.482, n=134 | 11014 | | -Os, thresh=1 | 1734.79 | 57.8781, n=90 | 10978.6 | | -Os, thresh=2 | ** 1733.85 ** | 65.6604, n=61 | 10977.6 | | -Os, thresh=3 | 1733.85 | 65.3071, n=61 | 10977.6 | | -Os, thresh=4 | 1735.08 | 67.5156, n=54 | 10965.7 | | **-Oz** | 1554.4 | 0, n=0 | 10153 | | -Oz, thresh=2 | ** 1552.2 ** | 65.633, n=61 | 10176 | | **-O3** | 2563.37 | 0, n=0 | 13105.4 | | -O3, thresh=2 | ** 2559.49 ** | 71.1072, n=61 | 13162.4 | Picking thresh=2 reduces the geomean __text section size by 0.14% at -Os, -Oz, and -O3 and causes ~0.2% growth in the TEXT segment. Note that TEXT size is page-aligned, whereas section sizes are byte-aligned. Experiment data from building LNT+externals for ARM64 (N = 558 programs, all sizes in bytes): | Configuration | __text geom size | __cold geom size | TEXT geom size | | **-Os** | 1763.96 | 0, n=0 | 42934.9 | | -Os, thresh=2 | ** 1760.9 ** | 76.6755, n=61 | 42934.9 | Picking thresh=2 reduces the geomean __text section size by 0.17% at -Os and causes no growth in the TEXT segment. Measurements were done with D57082 (r352080) applied. Differential Revision: https://reviews.llvm.org/D57125 llvm-svn: 352228 --- .../HotColdSplit/apply-penalty-for-inputs.ll | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) create mode 100644 llvm/test/Transforms/HotColdSplit/apply-penalty-for-inputs.ll (limited to 'llvm/test/Transforms/HotColdSplit/apply-penalty-for-inputs.ll') diff --git a/llvm/test/Transforms/HotColdSplit/apply-penalty-for-inputs.ll b/llvm/test/Transforms/HotColdSplit/apply-penalty-for-inputs.ll new file mode 100644 index 00000000000..fffd6f9f5dc --- /dev/null +++ b/llvm/test/Transforms/HotColdSplit/apply-penalty-for-inputs.ll @@ -0,0 +1,19 @@ +; REQUIRES: asserts +; RUN: opt -hotcoldsplit -debug-only=hotcoldsplit -S < %s -o /dev/null 2>&1 | FileCheck %s + +declare void @sink(i32*, i32, i32) cold + +@g = global i32 0 + +define void @foo(i32 %arg) { + %local = load i32, i32* @g + br i1 undef, label %cold, label %exit + +cold: + ; CHECK: Applying penalty for: 2 inputs + call void @sink(i32* @g, i32 %arg, i32 %local) + ret void + +exit: + ret void +} -- cgit v1.2.3