| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
| |
Just because we can constant fold the result of an instruction does not
imply that we can delete the instruction. It may have side effects.
This fixes PR28655.
llvm-svn: 276389
|
| |
|
|
|
|
|
| |
Almost all of these folds require changes to allow vector types.
Splitting up the logic should make that easier to do incrementally.
llvm-svn: 276360
|
| |
|
|
|
|
| |
Also, rename some of them for consistency and to follow current conventions.
llvm-svn: 276312
|
| |
|
|
|
|
|
| |
Making smaller pieces out of some of these ~1000 line functions should make
it easier to incrementally upgrade them to handle vector types.
llvm-svn: 276304
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The benefits of this change include:
1. Remove DeMorgan-matching code that was added specifically to work-around
the missing transform in http://reviews.llvm.org/rL248634.
2. Makes the DeMorgan transform work for vectors too.
3. Fix PR28476: https://llvm.org/bugs/show_bug.cgi?id=28476
Extending this transform to other casts and other associative operators may
be useful too. See https://reviews.llvm.org/D22421 for a prerequisite for
doing that though.
Differential Revision: https://reviews.llvm.org/D22271
llvm-svn: 276221
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D22602
llvm-svn: 276209
|
| |
|
|
|
|
|
|
| |
As noted in https://reviews.llvm.org/D22537 , we can use this functionality in
visitSelectInstWithICmp() and InstSimplify, but currently we have duplicated
code.
llvm-svn: 276140
|
| |
|
|
|
|
|
|
| |
Makes InstCombine infloop when compiling v8.
This reverts commit r275989 and r276105.
llvm-svn: 276106
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the source type
The pattern may look more obviously like a sext if written as:
define i32 @g(i16 %x) {
%zext = zext i16 %x to i32
%xor = xor i32 %zext, 32768
%add = add i32 %xor, -32768
ret i32 %add
}
We already have that fold in visitAdd().
Differential Revision: https://reviews.llvm.org/D22477
llvm-svn: 276035
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one:
> Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp.
Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check:
`if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && ...`
This check seems to sort out more cases than necessary because:
- the reverse transformation is obviously done for `or` instructions only
- and also not every `zext icmp` pair is necessarily the result of this reverse transformation
Therefore we now remove this check and replace it by a more finegrained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop. That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`).
As an example, consider the following IR snippet
```
%1 = icmp sgt i64 %a, %b
%2 = zext i1 %1 to i8
%3 = icmp slt i64 %a, %c
%4 = zext i1 %3 to i8
%5 = and i8 %2, %4
```
which would now be transformed to
```
%1 = icmp sgt i64 %a, %b
%2 = icmp slt i64 %a, %c
%3 = and i1 %1, %2
%4 = zext i1 %3 to i8
```
This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. Like shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code.
Reviewers: grosser, vtjnash, majnemer
Subscribers: majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D22511
Contributed-by: Matthias Reisinger
llvm-svn: 275989
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch cleans up parts of InstCombine to raise its compliance with the LLVM coding standards and to increase its readability. The changes and according rationale are summarized in the following:
- Rename `ShouldOptimizeCast()` to `shouldOptimizeCast()` since functions should start with a lower case letter.
- Move `shouldOptimizeCast()` from InstCombineCasts.cpp to InstCombineAndOrXor.cpp since it's only used there.
- Simplify interface of `shouldOptimizeCast()`.
- Minor code style adaptions in `shouldOptimizeCast()`.
- Remove the documentation on the function definition of `shouldOptimizeCast()` since it just repeats the documentation on its declaration. Also enhance the documentation on its declaration with more information describing its intended use and make it doxygen-compliant.
- Change a comment in `foldCastedBitwiseLogic()` from `fold (logic (cast A), (cast B)) -> (cast (logic A, B))` to `fold logic(cast(A), cast(B)) -> cast(logic(A, B))` since the surrounding comments use this format.
- Remove comment `Only do this if the casts both really cause code to be generated.` in `foldCastedBitwiseLogic()` since it just repeats parts of the documentation of `shouldOptimizeCast()` and does not help to improve readability.
- Simplify the interface of `isEliminableCastPair()`.
- Removed the documentation on the function definition of `isEliminableCastPair()` which only contained obvious statements about its implementation. Instead added more general doxygen-compliant documentation to its declaration.
- Renamed parameter `DoXform` of `transformZExtIcmp()` to `DoTransform` to make its intention clearer.
- Moved documentation of `transformZExtIcmp()` from its definition to its declaration and made it doxygen-compliant.
Reviewers: vtjnash, grosser
Subscribers: majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D22449
Contributed-by: Matthias Reisinger
llvm-svn: 275964
|
| |
|
|
| |
llvm-svn: 275691
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This is a partial implementation of a general fold for associative+commutative operators:
(op (cast (op X, C2)), C1) --> (cast (op X, op (C1, C2)))
(op (cast (op X, C2)), C1) --> (op (cast X), op (C1, C2))
There are 7 associative operators and 13 cast types, so this could potentially go a lot further.
Differential Revision: https://reviews.llvm.org/D22421
llvm-svn: 275684
|
| |
|
|
| |
llvm-svn: 275470
|
| |
|
|
|
|
|
|
| |
We were able to fold masked loads with an all-ones mask to a normal
load. However, we couldn't turn a masked load with a mask with mixed
ones and undefs into a normal load.
llvm-svn: 275380
|
| |
|
|
|
|
|
| |
This transform doesn't require any new instructions, it can safely live
in InstSimplify.
llvm-svn: 275344
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In D21740, we discussed trying to make this a more general matcher. However, I didn't see a clean
way to handle the regular m_Not cases and these non-splat vector patterns, so I've opted for the
direct approach here. If there are other potential uses of areInverseVectorBitmasks(), we could
move that helper function to a higher level.
There is an open question as to which is of these forms should be considered the canonical IR:
%sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b
%shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3>
Differential Revision: http://reviews.llvm.org/D22114
llvm-svn: 275289
|
| |
|
|
|
|
|
| |
This reverts commit r274853.
Caused failure in ppcBE build
llvm-svn: 274943
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This isn't a sure thing (are 2 extra bitcasts less expensive than a logic op?),
but we'll try to err on the conservative side by going with the case that has
less IR instructions.
Note: This question came up in http://reviews.llvm.org/D22114 , but this part is
independent of that patch proposal, so I'm making this small change ahead of that
one.
See also:
http://reviews.llvm.org/rL274926
llvm-svn: 274932
|
| |
|
|
|
|
| |
eliminate any ops
llvm-svn: 274926
|
| |
|
|
| |
llvm-svn: 274891
|
| |
|
|
| |
llvm-svn: 274883
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We can fold truncs whose operand feeds from a load, if the trunc value
is available through a prior load/store.
This change is from: http://reviews.llvm.org/D21246, which folded the
trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW
call, when the load type didnt match the prior load/store type.
Differential Revision: http://reviews.llvm.org/D21791
llvm-svn: 274853
|
| |
|
|
| |
llvm-svn: 274760
|
| |
|
|
|
|
|
|
|
| |
By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we
allow these transforms for splat vectors.
Differential Revision: http://reviews.llvm.org/D21899
llvm-svn: 274696
|
| |
|
|
|
|
|
|
| |
Follow-up from r274465: we don't need to capture the value in these cases,
so just match the constant that we're looking for. m_One/m_Zero work with
vector splats as well as scalars.
llvm-svn: 274670
|
| |
|
|
| |
llvm-svn: 274465
|
| |
|
|
| |
llvm-svn: 274463
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.
This also removes the dependence of functionattrs on TLI altogether.
llvm-svn: 274455
|
| |
|
|
| |
llvm-svn: 274238
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://llvm.org/bugs/show_bug.cgi?id=24766#c2
This removes a hack that was added for the benefit of x86 codegen.
It prevented shrinking the switch condition even to smaller legal (DataLayout) types.
We have a safety mechanism in CGP after:
http://reviews.llvm.org/rL251857
...so we're free to use the optimal (smallest) IR type now.
Differential Revision: http://reviews.llvm.org/D12965
llvm-svn: 274233
|
| |
|
|
|
|
| |
instruction
llvm-svn: 274229
|
| |
|
|
|
|
|
|
| |
If the incoming types are i1, then we don't have to pattern match any sext ops.
Differential Revision: http://reviews.llvm.org/D21740
llvm-svn: 274228
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Take advantage of FCmpInst::Predicate's bit pattern and handle (fcmp *, x, y) | (fcmp *, x, y) and (fcmp *, x, y) & (fcmp *, x, y) more consistently. Also fold more FCmpInst::FCMP_FALSE and FCmpInst::FCMP_TRUE to constants.
Currently InstCombine wrongly folds (fcmp ogt, x, y) | (fcmp ord, x, y) to (fcmp ogt, x, y); this patch also fixes that.
Reviewers: spatel
Subscribers: llvm-commits, iteratee, echristo
Differential Revision: http://reviews.llvm.org/D21775
llvm-svn: 274156
|
| |
|
|
|
|
|
|
| |
instructions
This removes some noise for D21775's test changes.
llvm-svn: 274155
|
| |
|
|
|
|
|
|
|
|
|
|
| |
both address and result of load instructions"
Revert "[InstCombine] Combine A->B->A BitCast"
as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847
This reverts commits r270135 and r263734.
llvm-svn: 274094
|
| |
|
|
| |
llvm-svn: 273974
|
| |
|
|
|
|
|
|
|
|
|
| |
is small enough (PR28153)
This should fix PR28153:
https://llvm.org/bugs/show_bug.cgi?id=28153
Differential Revision: http://reviews.llvm.org/D21769
llvm-svn: 273951
|
| |
|
|
|
|
|
| |
There's at least one more fold to do here:
https://llvm.org/bugs/show_bug.cgi?id=28153
llvm-svn: 273904
|
| |
|
|
|
|
| |
The APInt matcher works with splat vectors, so we get this fold for vectors too.
llvm-svn: 273897
|
| |
|
|
|
|
| |
Only minor manual fixes. No functionality change intended.
llvm-svn: 273808
|
| |
|
|
| |
llvm-svn: 273715
|
| |
|
|
|
|
|
|
|
| |
one code path (NFCI)
Tests to verify that the commuted variants are all exercised were added with:
http://reviews.llvm.org/rL273702
llvm-svn: 273706
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r273608.
Broke building code with sanitizers, where apparently these kinds of
loads, casts, and truncations are common:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/24502
http://crbug.com/623099
llvm-svn: 273703
|
| |
|
|
|
|
|
|
|
|
| |
one place; NFCI
By putting all the possible commutations together, we simplify the code.
Note that this is NFCI, but I'm adding tests that actually exercise each
commutation pattern because we don't have this anywhere else.
llvm-svn: 273702
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This instcombine rule folds away trunc operations that have value available from a prior load or store.
This kind of code can be generated as a result of GVN widening the load or from source code as well.
Reviewers: reames, majnemer, sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21246
llvm-svn: 273608
|
| |
|
|
|
|
| |
Found by gcc 6.
llvm-svn: 273402
|
| |
|
|
| |
llvm-svn: 273244
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
false logic (PR27869)
By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question
raised by PR27869:
https://llvm.org/bugs/show_bug.cgi?id=27869
...where InstCombine turns an icmp+zext into a shift causing us to miss the fold.
Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp.
Differential Revision: http://reviews.llvm.org/D21512
llvm-svn: 273200
|
| |
|
|
|
|
|
| |
Specific instances of intrinsic calls may want to be convergent, such
as certain register reads but the intrinsic declaration is not.
llvm-svn: 273188
|