diff options
author | Tobias Grosser <tobias@grosser.es> | 2016-07-19 16:39:17 +0000 |
---|---|---|
committer | Tobias Grosser <tobias@grosser.es> | 2016-07-19 16:39:17 +0000 |
commit | 1c382622791d28b552d5e7329319eff566b4a75d (patch) | |
tree | da2cd6221d8a80f1ee3c0d1ec1a51884166586e6 /llvm/lib/Transforms | |
parent | 03006fd3c43d81106d9b29a4b602e18836434614 (diff) | |
download | bcm5719-llvm-1c382622791d28b552d5e7329319eff566b4a75d.tar.gz bcm5719-llvm-1c382622791d28b552d5e7329319eff566b4a75d.zip |
[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))
Summary:
Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one:
> Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp.
Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check:
`if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && ...`
This check seems to sort out more cases than necessary because:
- the reverse transformation is obviously done for `or` instructions only
- and also not every `zext icmp` pair is necessarily the result of this reverse transformation
Therefore we now remove this check and replace it by a more finegrained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop. That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`).
As an example, consider the following IR snippet
```
%1 = icmp sgt i64 %a, %b
%2 = zext i1 %1 to i8
%3 = icmp slt i64 %a, %c
%4 = zext i1 %3 to i8
%5 = and i8 %2, %4
```
which would now be transformed to
```
%1 = icmp sgt i64 %a, %b
%2 = icmp slt i64 %a, %c
%3 = and i1 %1, %2
%4 = zext i1 %3 to i8
```
This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. Like shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code.
Reviewers: grosser, vtjnash, majnemer
Subscribers: majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D22511
Contributed-by: Matthias Reisinger
llvm-svn: 275989
Diffstat (limited to 'llvm/lib/Transforms')
-rw-r--r-- | llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp | 10 |
1 files changed, 8 insertions, 2 deletions
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp index 4cdf154764b..529dc9db049 100644 --- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp +++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp @@ -1212,6 +1212,13 @@ bool InstCombiner::shouldOptimizeCast(CastInst *CI) { isa<CmpInst>(CastSrc) && CI->getDestTy()->isVectorTy()) return false; + // Don't optimize the cast if it is a (zext icmp) that can already be + // eliminated. + if (auto *ZExt = dyn_cast<ZExtInst>(CI)) + if (auto *ICmp = dyn_cast<ICmpInst>(CastSrc)) + if (transformZExtICmp(ICmp, *ZExt, false)) + return false; + return true; } @@ -1260,8 +1267,7 @@ Instruction *InstCombiner::foldCastedBitwiseLogic(BinaryOperator &I) { Value *Cast1Src = Cast1->getOperand(0); // fold logic(cast(A), cast(B)) -> cast(logic(A, B)) - if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && - shouldOptimizeCast(Cast0) && shouldOptimizeCast(Cast1)) { + if (shouldOptimizeCast(Cast0) && shouldOptimizeCast(Cast1)) { Value *NewOp = Builder->CreateBinOp(LogicOpc, Cast0Src, Cast1Src, I.getName()); return CastInst::Create(CastOpcode, NewOp, DestTy); |