(including those with vector splat shift amount)
Differential Revision: https://reviews.llvm.org/D36784
llvm-svn: 311050
+ C1 support splat vectors
This also uses decomposeBitTestICmp to decode the compare.
Differential Revision: https://reviews.llvm.org/D36781
llvm-svn: 311044
vector shifts with splat shift amount
We were only allowing ConstantInt before. This patch allows a splat of ConstantInt too.
Differential Revision: https://reviews.llvm.org/D36763
llvm-svn: 310970
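As a sketch (constants hypothetical), the shift amounts covered now include splat vectors, not just scalars:
%r = lshr i32 %x, 3                                      ; scalar ConstantInt (already handled)
%rv = lshr <4 x i32> %xv, <i32 3, i32 3, i32 3, i32 3>   ; splat of the same constant (new)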
Differential Revision: https://reviews.llvm.org/D36743
llvm-svn: 310949
Narrow ops are better for bit-tracking, and in the case of vectors,
may enable better codegen.
As the trunc test shows, this can allow follow-on simplifications.
There's a block of code in visitTrunc that deals with shifted ops
with FIXME comments. It may be possible to remove some of that now,
but I want to make sure there are no problems with this step first.
http://rise4fun.com/Alive/Y3a
Name: hoist_ashr_ahead_of_sext_1
%s = sext i8 %x to i32
%r = ashr i32 %s, 3 ; shift value is less than the source bit width
=>
%a = ashr i8 %x, 3
%r = sext i8 %a to i32

Name: hoist_ashr_ahead_of_sext_2
%s = sext i8 %x to i32
%r = ashr i32 %s, 8 ; shift value is greater than or equal to the source bit width
=>
%a = ashr i8 %x, 7 ; so clamp this shift value
%r = sext i8 %a to i32

Name: junc_the_trunc
%a = sext i16 %v to i32
%s = ashr i32 %a, 18
%t = trunc i32 %s to i16
=>
%t = ashr i16 %v, 15
llvm-svn: 310942
These haven't done anything since debug info intrinsics stopped
appearing in Value use lists in 2014.
llvm-svn: 310892
decomposeBitTestICmp and use it in InstSimplify"
This recommits r310869, with the moved files and no extra changes.
Original commit message:
This addresses a FIXME in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.
I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0, since InstSimplify doesn't need that; InstCombine creates a zero constant itself.
I also had to make decomposeBitTest support vectors since InstSimplify needs that.
As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.
Differential Revision: https://reviews.llvm.org/D36593
llvm-svn: 310889
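As a sketch of the kind of decomposition involved (constants hypothetical), an unsigned compare against a power of two is really a bit test:
%c = icmp ult i32 %x, 32
=>
%m = and i32 %x, -32      ; mask = ~(32 - 1)
%c = icmp eq i32 %m, 0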
decomposeBitTestICmp and use it in InstSimplify"
I failed to add the two files that moved, and then added an extra change I didn't mean to while trying to fix that. Reverting everything.
llvm-svn: 310873
use it in InstSimplify
This addresses a FIXME in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.
I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0, since InstSimplify doesn't need that; InstCombine creates a zero constant itself.
I also had to make decomposeBitTest support vectors since InstSimplify needs that.
As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.
Differential Revision: https://reviews.llvm.org/D36593
llvm-svn: 310869
Summary:
These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and already had the pieces in variables. We also had earlier nearby code that checked for ConstantInts, so just inline the remaining parts into that earlier code.
The next step is to use m_APInt instead of ConstantInt.
Reviewers: spatel, efriedma, davide, majnemer
Reviewed By: spatel
Subscribers: zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D36439
llvm-svn: 310806
This also corrects the description to match what was actually implemented. The old comment said X^(C1|C2), but it implemented X^((C1|C2)&~(C1&C2)). I believe ((C1|C2)&~(C1&C2)) is equivalent to (C1^C2): a bit is set in (C1|C2)&~(C1&C2) exactly when it is set in C1 or C2 but not both, which is xor.
Differential Revision: https://reviews.llvm.org/D36505
llvm-svn: 310658
inverse vectors of i1 constants.
We used to try to truncate the constant vector to vXi1, but if it's already i1 this would fail. Instead we now use IRBuilder::getZExtOrTrunc, which should check the type and only create a trunc if needed. I believe this should trigger constant folding in the IRBuilder and ultimately do the same thing, just with the additional type check.
llvm-svn: 310639
instructions are visited for debug
Sometimes it would be nice to stop InstCombine midway through its combining to see the current IR. By using a debug counter we can place an upper limit on how many instructions to process.
This will also allow skipping the first X combines, but that has the potential to change later combines since earlier canonicalizations might have been skipped.
Differential Revision: https://reviews.llvm.org/D36553
llvm-svn: 310638
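A hypothetical invocation; the counter name and flag syntax here are assumptions based on the generic DebugCounter interface, not taken from this commit:
opt -instcombine -debug-counter=instcombine-visit-skip=10,instcombine-visit-count=20 -S input.ll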
(PR34046)
I couldn't find any smaller folds to help the cases in:
https://bugs.llvm.org/show_bug.cgi?id=34046
after:
rL310141
The truncated rotate-by-variable patterns elude all of the existing transforms because
of multiple uses and knowledge about demanded bits and knownbits that doesn't exist
without the whole pattern. So we need an unfortunately large pattern match. But by
simplifying this pattern in IR, the backend is already able to generate
rolb/rolw/rorb/rorw for x86 using its existing rotate matching logic (although
there is a likely extraneous 'and' of the rotate amount).
Note that rotate-by-constant doesn't have this problem - smaller folds should already
produce the narrow IR ops.
Differential Revision: https://reviews.llvm.org/D36395
llvm-svn: 310509
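A sketch of the truncated rotate-by-variable shape in question (widths and names hypothetical):
%zx  = zext i8 %x to i32
%amt = and i32 %n, 7
%shl = shl i32 %zx, %amt
%sub = sub i32 8, %amt
%shr = lshr i32 %zx, %sub
%or  = or i32 %shl, %shr
%r   = trunc i32 %or to i8   ; an i8 rotate-left computed in i32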
llvm-svn: 310446
We already support pulling through an add with constant RHS. We can do the same for subtract.
Differential Revision: https://reviews.llvm.org/D36443
llvm-svn: 310407
the code.
We no longer need the explicit operand count check or the later dynamic cast.
llvm-svn: 310339
NFC.
llvm-svn: 310285
vector splats
Note that the original code I deleted incorrectly listed this as (X | C1) & C2 --> (X & C2^(C1&C2)) | C1, which is only valid if C1 is a subset of C2. It relied on SimplifyDemandedBits to remove any extra bits from C1 before we got to that code.
My new implementation avoids relying on that behavior so that it can be naively verified with Alive.
Differential Revision: https://reviews.llvm.org/D36384
llvm-svn: 310272
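One naively verifiable rendering of the fold (whether this is the exact committed form is an assumption):
%o = or i32 %x, C1
%r = and i32 %o, C2
=>
%a = and i32 %x, C2
%r = or i32 %a, C1 & C2   ; (X | C1) & C2 == (X & C2) | (C1 & C2)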
Summary: This is all handled by SimplifyDemandedBits.
Reviewers: spatel, davide
Reviewed By: davide
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D36382
llvm-svn: 310234
llvm-svn: 310233
C) ^ signmask -> (X + C + signmask)' for vector splats.
llvm-svn: 310232
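Concretely, in i8 (constants hypothetical): xor with the sign mask flips the top bit, which is the same as adding the sign mask modulo 2^8:
%a = add i8 %x, 5
%r = xor i8 %a, -128   ; -128 is the i8 sign mask
=>
%r = add i8 %x, -123   ; 5 + (-128) wraps to -123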
vectors.
llvm-svn: 310195
llvm-svn: 310186
shifts to handle vector splats too.
llvm-svn: 310185
Unfortunately, it looks like there are some other missed optimizations in the generated code for some of these cases. I'll try to look at some of those next.
llvm-svn: 310184
In addition to moving the shift transforms over, we may want to
detect too-wide rotate patterns here (PR34046).
llvm-svn: 310181
type to the 'select' type, do it after shifting right instead of just bailing.
Previously we were always trying to emit the zext or truncate before any shift. This meant that if the 'and' mask was larger than the size of the truncate, we would skip the transformation.
Now we shift the result of the 'and' right first, leaving the bit within the range of the truncate.
This matches what we are doing in foldSelectICmpAndOr for the same problem.
llvm-svn: 310159
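A sketch with hypothetical constants, where the tested bit (bit 20) lies outside the i16 result type, so the old emit-the-trunc-first approach would have bailed:
%and = and i32 %x, 1048576   ; 1 << 20
%cmp = icmp eq i32 %and, 0
%sel = select i1 %cmp, i16 0, i16 16
=>
%shr = lshr i32 %and, 16     ; move bit 20 down to bit 4 first
%sel = trunc i32 %shr to i16 ; now yields 0 or 16 directly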
Name: narrow_sub
%sub = sub i32 C1, %x
%r = trunc i32 %sub to i8
=>
%xn = trunc i32 %x to i8
%narrowC = trunc i32 C1 to i8
%r = sub i8 %narrowC, %xn

Name: narrow_add
%add = add i32 %x, C1
%r = trunc i32 %add to i8
=>
%xn = trunc i32 %x to i8
%narrowC = trunc i32 C1 to i8
%r = add i8 %xn, %narrowC

Name: narrow_mul
%mul = mul i32 %x, C1
%r = trunc i32 %mul to i8
=>
%xn = trunc i32 %x to i8
%narrowC = trunc i32 C1 to i8
%r = mul i8 %xn, %narrowC
http://rise4fun.com/Alive/QpS
This doesn't solve PR34046 (failure to recognize rotate):
https://bugs.llvm.org/show_bug.cgi?id=34046
...but it reduces an extra complication in the description examples
to a form that we can more easily match.
llvm-svn: 310141
Avoids unused variable warnings in Release builds. No functional change.
llvm-svn: 310064
inline the remaining code to match visitOr
Summary:
The (not (sext)) case is really (xor (sext), -1), which should have been simplified to (sext (xor, 1)) before we got here, so we shouldn't need to handle it.
With that taken care of, we only need two cases, so we don't need the swap anymore. This puts us in sync with the equivalent code in visitOr, so inline this to match.
Reviewers: spatel, eli.friedman, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36240
llvm-svn: 310063
llvm-svn: 310062
Name: narrow_shift
Pre: C1 < 8
%zx = zext i8 %x to i32
%l = lshr i32 %zx, C1
=>
%narrowC = trunc i32 C1 to i8
%ns = lshr i8 %x, %narrowC
%l = zext i8 %ns to i32
http://rise4fun.com/Alive/jIV
This isn't directly applicable to PR34046 as written, but we
need to have more narrowing folds like this to be sure that
rotate patterns are recognized.
llvm-svn: 310060
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.
This case is a little different though. With integers, once the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for floats, because for them
only the full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist, and
they do care about NaNs. On the other hand, some backends (e.g. X86)
have only FMIN/FMAX nodes that do not care about NaNs, and the former
NAN/NUM nodes are illegal, so selection does not happen. I therefore
decided to do this kind of transformation in IR (InstCombiner)
instead of complicating the logic in the backend.
Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper
Reviewed By: efriedma
Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits
Patch by Andrei Elovikov <andrei.elovikov@intel.com>
Differential Revision: https://reviews.llvm.org/D33186
llvm-svn: 310054
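A sketch of a float clamp that can now be recognized as min/max under FMF (names and constants hypothetical):
%c1 = fcmp fast olt float %x, 0.0
%lo = select i1 %c1, float 0.0, float %x   ; max(%x, 0.0)
%c2 = fcmp fast ogt float %lo, 1.0
%r  = select i1 %c2, float 1.0, float %lo  ; min(%lo, 1.0)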
foldSelectInstWithICmp. NFCI
llvm-svn: 310025
We're calling an overload of getOpcode that already returns Instruction::CastOps.
llvm-svn: 310024
llvm-svn: 309887
(xor(sext(cmp)), -1) to ext(!cmp).
As far as I can tell, this should be handled by foldCastedBitwiseLogic, which is called later in visitXor.
Differential Revision: https://reviews.llvm.org/D36214
llvm-svn: 309882
This adds support for sext in foldLogicCastConstant. This is a prerequisite for D36214.
Differential Revision: https://reviews.llvm.org/D36234
llvm-svn: 309880
IMHO this is a bit more readable.
llvm-svn: 309739
assert
Summary:
As far as I can tell, the earlier call to getLimitedValue guarantees that ShiftAmt is saturated to BitWidth-1, preventing it from ever being equal to or greater than BitWidth.
At one point in the past, the getLimitedValue call was only passed BitWidth, not BitWidth - 1. That would have allowed the equality case to get here. In fact, this check was initially added as just BitWidth == ShiftAmt, but was changed shortly after to include >, which should never have been possible.
Reviewers: spatel, majnemer, davide
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36123
llvm-svn: 309690
llvm-svn: 309627
This intrinsic clears the upper bits starting at a specified index. If the index is a constant, we can do some simplifications.
This could be in InstSimplify, but we don't handle any target-specific intrinsics there today.
Differential Revision: https://reviews.llvm.org/D36069
llvm-svn: 309604
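For example (constant chosen arbitrarily), a constant index turns BZHI into a plain mask:
%r = call i32 @llvm.x86.bmi.bzhi.32(i32 %x, i32 8)
=>
%r = and i32 %x, 255   ; keep bits [7:0], clear bits [31:8]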
This patch adds simplification support for the BEXTR/BEXTRI intrinsics to match gcc. This only supports cases that fold to 0 or can be fully constant folded. Theoretically we could support converting to AND if the shift part is unused, or to only a shift if the mask doesn't modify any bits after an equivalent shl. gcc doesn't do these transformations either.
I put this in InstCombine, but it could be done in InstSimplify. It would be the first target-specific intrinsic in InstSimplify.
Differential Revision: https://reviews.llvm.org/D36063
llvm-svn: 309603
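For example, assuming BEXTR's control encoding (start in bits 7:0, length in bits 15:8), a zero length field means no bits are extracted:
%r = call i32 @llvm.x86.bmi.bextr.32(i32 %x, i32 5)   ; start = 5, length = 0
=>
; folds to the constant 0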
have other uses and one non-constant index
Summary:
Pointer difference simplifications currently happen only if input GEPs don't have other uses or their indexes are all constants, to avoid duplicating indexing arithmetic.
This patch enables the case where the input GEPs have exactly one non-constant index, since there is then no duplicated arithmetic or code-size increase even if the input GEPs have other uses.
For example, this patch allows "(&A[42][i]-&A[42][0])" --> "i", which didn't happen previously, if the input GEP(s) have other uses.
Reviewers: sanjoy, bkramer
Reviewed By: sanjoy
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D35499
llvm-svn: 309304
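A sketch of the quoted C example in IR (types and layout hypothetical):
@A = external global [64 x [16 x i32]]

define i64 @diff(i64 %i) {
  %p1 = getelementptr inbounds [64 x [16 x i32]], [64 x [16 x i32]]* @A, i64 0, i64 42, i64 %i
  %p2 = getelementptr inbounds [64 x [16 x i32]], [64 x [16 x i32]]* @A, i64 0, i64 42, i64 0
  %i1 = ptrtoint i32* %p1 to i64
  %i2 = ptrtoint i32* %p2 to i64
  %d  = sub i64 %i1, %i2
  %r  = sdiv exact i64 %d, 4   ; byte difference / sizeof(i32) --> folds to %i
  ret i64 %r
}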
llvm-svn: 309192
Summary:
This changes SimplifyLibCalls to use the new OptimizationRemarkEmitter
API.
In fact, as SimplifyLibCalls is (as far as I can tell) only ever
called via InstCombine, the OptimizationRemarkEmitter is added there
and then passed through to SimplifyLibCalls later.
I have avoided changing any remark text.
This closes PR33787
Patch by Sam Elliott!
Reviewers: anemet, davide
Reviewed By: anemet
Subscribers: davide, mehdi_amini, eraman, fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D35608
llvm-svn: 309158
with undef.
This is a workaround for the bug described in PR31652 and
http://lists.llvm.org/pipermail/llvm-dev/2017-July/115497.html. The temporary
solution is to add a function EqualityPropUnSafe. In EqualityPropUnSafe, for
some simple patterns we can know that the equality comparison may contain
undef, so we regard such comparisons as unsafe and will not do loop-unswitching
for them. We also need to disable the select simplification when one of the
select operands is undef and its result feeds into an equality comparison.
The patch cannot fully eliminate the safety issue caused by the bug, but it
suppresses the issue to some extent.
Differential Revision: https://reviews.llvm.org/D35811
llvm-svn: 309059
Summary: Currently, when GVN creates a load and when InstCombine creates a new store for an unreachable load, the DebugLoc info gets lost.
Reviewers: dberlin, davide, aprantl
Reviewed By: aprantl
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D34639
llvm-svn: 308404
Differential Revision: https://reviews.llvm.org/D35376
llvm-svn: 308144