author | Sanjay Patel <spatel@rotateright.com> | 2018-02-05 23:43:05 +0000 |
---|---|---|
committer | Sanjay Patel <spatel@rotateright.com> | 2018-02-05 23:43:05 +0000 |
commit | d7c702b45191ea1cab867a257b8b6b1455b9259f (patch) | |
tree | a21b47a8eda4c0b76d0658feef0aa737d1d9458c /llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp | |
parent | 40ddcb8133f4acaafabef2406345fdb8c796214e (diff) | |
[LoopStrengthReduce, x86] don't add cost for a cmp that will be macro-fused (PR35681)
In the motivating case from PR35681 (https://bugs.llvm.org/show_bug.cgi?id=35681),
which is represented by the macro-fuse-cmp test, the loop shrinks from 37 to 31
bytes because we eliminate the big base address offsets.
SPEC2017 on Ryzen shows no significant perf difference.
Differential Revision: https://reviews.llvm.org/D42607
llvm-svn: 324289
Diffstat (limited to 'llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp')
-rw-r--r-- | llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp | 5 |
1 files changed, 3 insertions, 2 deletions
diff --git a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
index 4b8e2286ed9..b7d9a258913 100644
--- a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
@@ -1343,14 +1343,15 @@ void Cost::RateFormula(const TargetTransformInfo &TTI,
 
   // If ICmpZero formula ends with not 0, it could not be replaced by
   // just add or sub. We'll need to compare final result of AddRec.
-  // That means we'll need an additional instruction.
+  // That means we'll need an additional instruction. But if the target can
+  // macro-fuse a compare with a branch, don't count this extra instruction.
   // For -10 + {0, +, 1}:
   // i = i + 1;
   // cmp i, 10
   //
   // For {-10, +, 1}:
   // i = i + 1;
-  if (LU.Kind == LSRUse::ICmpZero && !F.hasZeroEnd())
+  if (LU.Kind == LSRUse::ICmpZero && !F.hasZeroEnd() && !TTI.canMacroFuseCmp())
     C.Insns++;
   // Each new AddRec adds 1 instruction to calculation.
   C.Insns += (C.AddRecCost - PrevAddRecCost);
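The effect of the changed condition can be sketched outside of LLVM as a small truth table. This is only an illustrative model, not the code in LoopStrengthReduce.cpp: `FormulaModel`, `TargetModel`, and `extraCmpInsns` are hypothetical names, and each field stands in for one of the three predicates the patched `if` consults.

```cpp
#include <cassert>

// Hypothetical stand-ins for the state read by the patched condition.
// These are NOT LLVM types; each field models one predicate from the diff.
struct FormulaModel {
  bool IsICmpZeroUse; // models LU.Kind == LSRUse::ICmpZero
  bool HasZeroEnd;    // models F.hasZeroEnd()
};

struct TargetModel {
  bool CanMacroFuseCmp; // models TTI.canMacroFuseCmp()
};

// Number of extra instructions charged for the trailing compare.
// Mirrors the patched condition: charge 1 only when the use is ICmpZero,
// the formula does not end at zero, and the target cannot macro-fuse.
constexpr unsigned extraCmpInsns(const FormulaModel &F, const TargetModel &T) {
  return (F.IsICmpZeroUse && !F.HasZeroEnd && !T.CanMacroFuseCmp) ? 1u : 0u;
}

// -10 + {0, +, 1}: the loop needs "i = i + 1; cmp i, 10".
constexpr FormulaModel NonZeroEnd{true, false};
// {-10, +, 1}: the loop needs only "i = i + 1"; no separate compare.
constexpr FormulaModel ZeroEnd{true, true};

// Without macro-fusion the extra compare is still counted.
static_assert(extraCmpInsns(NonZeroEnd, TargetModel{false}) == 1, "");
// With macro-fusion the compare is no longer counted against the formula.
static_assert(extraCmpInsns(NonZeroEnd, TargetModel{true}) == 0, "");
// A formula that ends at zero never paid the extra cost, fused or not.
static_assert(extraCmpInsns(ZeroEnd, TargetModel{false}) == 0, "");
static_assert(extraCmpInsns(ZeroEnd, TargetModel{true}) == 0, "");
```

Under this model the two formulas tie on the compare cost when the target can macro-fuse, so the rest of the cost comparison decides between them. That is consistent with the commit message, where the chosen formula avoids the large base address offsets.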