llvm-svn: 368978
https://rise4fun.com/Alive/THl
llvm-svn: 368886
Reviewers: jdoerfert, reames
Reviewed By: jdoerfert
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66161
llvm-svn: 368884
Summary: I think this is a better solution than annotating callsites in IC/SLC.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66217
llvm-svn: 368875
These are affected by a WIP patch.
llvm-svn: 368807
Summary:
Should be fine for memcpy, strcpy, strncpy.
Reviewers: jdoerfert, efriedma
Reviewed By: jdoerfert
Subscribers: uenoku, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66135
llvm-svn: 368724
Use isGuaranteedToTransferExecutionToSuccessor() instead of
isSafeToSpeculativelyExecute() when seeing whether we can propagate
the information in an assume backwards in isValidAssumeForContext().
isGuaranteedToTransferExecutionToSuccessor() is more general - it also
allows arbitrary loads/stores - and is also the condition we want: if
our assume is guaranteed to execute, its condition not holding would be UB.
Original patch by arielb1.
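A minimal sketch (hypothetical IR, not from the patch) of what this enables - a plain load between the context instruction and the assume no longer blocks the backwards propagation:
```
define i32 @example(i32* %p, i32 %x) {
  %v = add i32 %x, 1          ; context instruction: %cmp may be assumed here
  %load = load i32, i32* %p   ; not speculatable in general, but guaranteed
                              ; to transfer execution to the assume
  %cmp = icmp sgt i32 %x, 0
  call void @llvm.assume(i1 %cmp)
  ret i32 %v
}
declare void @llvm.assume(i1)
```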
Differential Revision: https://reviews.llvm.org/D37215
llvm-svn: 368723
llvm-svn: 368722
llvm-svn: 368715
Summary:
Given a pattern like:
```
%old_cmp1 = icmp slt i32 %x, C2
%old_replacement = select i1 %old_cmp1, i32 %target_low, i32 %target_high
%old_x_offseted = add i32 %x, C1
%old_cmp0 = icmp ult i32 %old_x_offseted, C0
%r = select i1 %old_cmp0, i32 %x, i32 %old_replacement
```
it can be rewritten as the more canonical pattern:
```
%new_cmp1 = icmp slt i32 %x, -C1
%new_cmp2 = icmp sge i32 %x, C0-C1
%new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x
%r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low
```
Iff `-C1 s<= C2 s<= C0-C1`
The `ULT` predicate can also be `UGE`, or `UGT` iff `C0 != -1` (+invert result).
Likewise, the `SLT` predicate can also be `SGE`, or `SGT` iff `C2 != INT_MAX` (+invert result).
If `C1 == 0`, then all 3 instructions must be one-use; else at most either `%old_cmp1` or `%old_x_offseted` can have extra uses.
NOTE: if we could reuse `%old_cmp1` as one of the comparisons we'll have to build, this could be less limiting.
So there are two icmps, each with 3 predicate variants, giving 9 fold variants:
| | ULT | UGE | UGT |
| SLT | https://rise4fun.com/Alive/yIJ | https://rise4fun.com/Alive/5BfN | https://rise4fun.com/Alive/INH |
| SGE | https://rise4fun.com/Alive/hd8 | https://rise4fun.com/Alive/Abk | https://rise4fun.com/Alive/PlzS |
| SGT | https://rise4fun.com/Alive/VYG | https://rise4fun.com/Alive/oMY | https://rise4fun.com/Alive/KrzC |
{F9730206}
This fold was brought up in https://reviews.llvm.org/D65148#1603922 by @dmgreen, and is needed to unblock that patch.
This patch requires D65530.
Reviewers: spatel, nikic, xbolva00, dmgreen
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits, dmgreen
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65765
llvm-svn: 368687
all users are freely invertible
Summary:
This is rather unconventional...
As the comment there says, we don't have many folds for xor-of-icmps;
we try to turn them into an and-of-icmps, for which we have plenty of folds.
But if the ICmp we need to invert is not single-use, we give up.
As discussed in https://reviews.llvm.org/D65148#1603922,
we may have a non-canonical CLAMP pattern, with bit math and a
select-of-threshold that we'll potentially clamp.
As can be seen in `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`,
out of all 8 variations of the pattern, only two are **not** canonicalized into
the variant with and+icmp instead of bit math.
The reason is that the ICmp we need to invert is not single-use, so we give up.
We indeed can't perform this fold at will: the general rule is that
we should not increase the instruction count in InstCombine.
But we wouldn't end up increasing the instruction count if we can adapt every other
user to the inverted value. That way the `not` we create **will** get folded,
and in the end the instruction count does not increase.
For that, of course, we need to look at the users of a Value,
which is again rather unconventional for InstCombine :S
Thus I'm proposing to be a little bit more insistent in `foldXorOfICmps()`.
The alternatives would be to not create that `not`, but add duplicate code to
manually invert all users; or to add some even less general combine to handle
some more specific pattern[s].
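A hedged illustration of the underlying xor-of-icmps -> and-of-icmps rewrite (values chosen for exposition, not taken from the patch):
```
%a = icmp sgt i32 %x, 0
%b = icmp sgt i32 %x, 5    ; %b implies %a, so the xor is a range check
%r = xor i1 %a, %b         ; true iff 0 < %x <= 5
; inverting %b gives an and-of-icmps:
;   %nb = icmp sle i32 %x, 5
;   %r  = and i1 %a, %nb
; if %b has extra uses, each of them must be freely invertible so that
; the `not` of %b folds away and the instruction count does not grow.
```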
Reviewers: spatel, nikic, RKSimon, craig.topper
Reviewed By: spatel
Subscribers: hiraditya, jdoerfert, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65530
llvm-svn: 368685
Summary:
```
int mm(char *a, char *b) {
    return memcmp(a, b, 16);
}
```
Currently:
```
define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 {
entry:
  %call = tail call i32 @memcmp(i8* %a, i8* %b, i64 16)
  ret i32 %call
}
```
After patch:
```
define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 {
entry:
  %call = tail call i32 @memcmp(i8* dereferenceable(16) %a, i8* dereferenceable(16) %b, i64 16)
  ret i32 %call
}
```
Reviewers: jdoerfert, efriedma
Reviewed By: jdoerfert
Subscribers: javed.absar, spatel, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66079
llvm-svn: 368657
We can't handle the 'uge' case because we can never encounter it:
there needs to be an extra use on that compare or else it will be
canonicalized away, but because of that extra use we can't handle it.
The 'sge' case we can have.
llvm-svn: 368656
llvm-svn: 368583
Summary:
x / fabs(x) -> copysign(1.0, x)
fabs(x) / x -> copysign(1.0, x)
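A sketch in IR (the exact fast-math-flag requirements are as specified in the review, not spelled out here):
```
define float @div_fabs(float %x) {
  %fabs = call float @llvm.fabs.f32(float %x)
  %div = fdiv reassoc nnan float %x, %fabs
  ret float %div
  ; -> %div = call float @llvm.copysign.f32(float 1.0, float %x)
}
declare float @llvm.fabs.f32(float)
```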
Reviewers: spatel, foad, RKSimon, efriedma
Reviewed By: spatel
Subscribers: lebedev.ri, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65898
llvm-svn: 368570
constantexpr pitfall (PR42962)
Instead of matching a value and then blindly casting to BinaryOperator
just to get the opcode, match an instruction and do no cast.
Fixes https://bugs.llvm.org/show_bug.cgi?id=42962
llvm-svn: 368554
truncated shl (PR42399)
trunc-of-shl:
https://rise4fun.com/Alive/zGx
https://rise4fun.com/Alive/sl0L
I.e. no extra legality check needed.
https://bugs.llvm.org/show_bug.cgi?id=42399
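A sketch of the shape being handled (constants chosen for exposition):
```
define i32 @trunc_shl(i64 %x) {
  %s0 = shl i64 %x, 3
  %t  = trunc i64 %s0 to i32
  %s1 = shl i32 %t, 2
  ret i32 %s1
  ; reassociates to:
  ;   %s = shl i64 %x, 5
  ;   %t = trunc i64 %s to i32
}
```
This is sound because truncation keeps only the low bits and shl only moves bits upward, so the trunc and the shl commute.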
llvm-svn: 368520
when shifting constant
If one of the values being shifted is a constant, then, since the new
shift amount is known-constant, the new shift will end up being
constant-folded, so we don't need the one-use restriction in that case.
llvm-svn: 368519
restriction
That one-use restriction is not needed for correctness: we have already
ensured that one of the shifts will go away, so we know we won't increase
the instruction count.
llvm-svn: 368518
shift of const
llvm-svn: 368517
llvm-svn: 368447
llvm-svn: 368360
llvm-svn: 368201
llvm-svn: 368200
llvm-svn: 368176
Summary:
In SimplifySelectsFeedingBinaryOp, propagate fast math flags from the
outer op into both arms of the new select, to take advantage of
simplifications that require fast math flags.
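A rough sketch of the effect (hypothetical values):
```
; before: the select feeds the fadd
%sel = select i1 %c, float %x, float 0.0
%r   = fadd fast float %sel, %y
; after: the outer op is pushed into both arms, carrying its 'fast'
; flags, which lets the false arm simplify (fadd fast 0.0, %y -> %y):
;   %t = fadd fast float %x, %y
;   %r = select i1 %c, float %t, float %y
```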
Reviewers: mcberg2017, majnemer, spatel, arsenm, xbolva00
Subscribers: wdng, javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65658
llvm-svn: 368175
This was initially committed in r368059 but got reverted in r368084
because there was a faulty logic in how the shift amounts type mismatch
was being handled (it simply wasn't).
I've added an explicit bailout before we call SimplifyAddInst() - I don't
think it's designed in general to handle differently-typed values, even
though the actual problem only comes from ConstantExprs.
I have also changed the common-type deduction to not just blindly look
past zext, but to do so in a way that makes the types match in the end.
Differential Revision: https://reviews.llvm.org/D65380
llvm-svn: 368141
This reverts r368059 (git commit 0f957109761913c563922f1afd4ceb29ef21bbd0)
This caused Clang to assert while self-hosting and compiling
SystemZInstrInfo.cpp. Reduction is running.
llvm-svn: 368084
Summary:
Currently `reassociateShiftAmtsOfTwoSameDirectionShifts()` only handles
two shifts one after another. If the shifts are `shl`, we can still
easily perform the fold, with no extra legality checks:
https://rise4fun.com/Alive/OQbM
If we have a right-shift, however, we won't be able to make it
any simpler than it already is.
After this, the only thing missing here is constant-folding (when `NewShAmt >= bitwidth(X)`):
* If it's a logical shift, then constant-fold to `0` (not `undef`)
* If it's an `ashr`, then a splat of the original sign bit
https://rise4fun.com/Alive/E1K
https://rise4fun.com/Alive/i0V
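For instance, with constant shift amounts (a simplified sketch):
```
define i32 @two_shl(i32 %x) {
  %s0 = shl i32 %x, 3
  %s1 = shl i32 %s0, 2
  ret i32 %s1    ; reassociates to: shl i32 %x, 5
}
```
And per the note above, if the combined amount were >= 32, a logical shift should constant-fold to `0`, an `ashr` to a splat of the sign bit.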
Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65380
llvm-svn: 368059
Baseline coverage for D65658.
llvm-svn: 368028
As discussed in https://reviews.llvm.org/D65148#1607019
The canonical fold is: https://rise4fun.com/Alive/FKe
llvm-svn: 367897
This appears to slightly help patterns similar to what's
shown in PR42874:
https://bugs.llvm.org/show_bug.cgi?id=42874
...but not in the way requested.
That fix will require some later IR and/or backend pass to
decompose multiply/shifts into something more optimal per
target. Those transforms already exist in some basic forms,
but probably need enhancing to catch more cases.
https://rise4fun.com/Alive/Qzv2
llvm-svn: 367891
llvm-svn: 367883
As the test shows, we can end up with more instructions than
we started with if we don't include the extra-use check.
llvm-svn: 367880
llvm-svn: 367876
llvm-svn: 367825
As discussed in PR42696:
https://bugs.llvm.org/show_bug.cgi?id=42696
...but won't help that case yet.
We have an odd situation where a select operand equivalence fold was
implemented in InstSimplify when it could have been done more generally
in InstCombine if we allow dropping of {nsw,nuw,exact} from a binop operand.
Here's an example:
https://rise4fun.com/Alive/Xplr
```
%cmp = icmp eq i32 %x, 2147483647
%add = add nsw i32 %x, 1
%sel = select i1 %cmp, i32 -2147483648, i32 %add
=>
%sel = add i32 %x, 1
```
I've left the InstSimplify code in place for now, but my guess is that we'd
prefer to remove that as a follow-up to save on code duplication and
compile-time.
Differential Revision: https://reviews.llvm.org/D65576
llvm-svn: 367695
More coverage for the proposal in D65576.
llvm-svn: 367579
More coverage for the proposal in D65576.
llvm-svn: 367577
Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it
easier to implement the transforms (and possibly other fneg transforms) in
one place, because we can always start the pattern match from fneg (either
the legacy binop or the new unop).
There's a secondary practical benefit seen in PR21914 and PR42681:
https://bugs.llvm.org/show_bug.cgi?id=21914
https://bugs.llvm.org/show_bug.cgi?id=42681
...hoisting fneg rather than sinking seems to play nicer with LICM in IR
(although this change may expose analysis holes in the other direction).
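A sketch of the reversed canonicalization, as I read it (hypothetical values):
```
; before: fneg sunk onto an operand
%n = fneg float %x
%r = fmul float %n, %y
; after: fneg hoisted above the fmul
;   %t = fmul float %x, %y
;   %r = fneg float %t
```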
1. The instcombine test changes show the expected neutral IR diffs from
reversing the order.
2. The reassociation tests show that we were missing an optimization
opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says
that all of these transforms are allowed (regardless of binop/unop
fneg version) because:
"For all other operations [besides copy/abs/negate/copysign], this
standard does not specify the sign bit of a NaN result."
In all of these transforms, we always have some other binop
(fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a
potential intermediate NaN operand.
(If that interpretation is wrong, then we must already have a bug in
the existing transforms?)
3. The clang tests shouldn't exist as-is, but that's effectively a
revert of rL367149 (the test broke with an extension of the
pre-existing fneg canonicalization in rL367146).
Differential Revision: https://reviews.llvm.org/D65399
llvm-svn: 367447
Currently InstCombiner::foldXorOfICmps() bails out if the
ICmp it wants to invert has extra uses. As can be seen
in the tests in the previous commit, this is unfortunate:
it is the single pattern that is left non-canonicalized.
We could analyze whether we can also invert all the uses of said ICmp
at the same time, and thus not bail out there.
I'm not seeing any nicer alternative.
llvm-svn: 367439
As discussed in https://reviews.llvm.org/D65148#1603922,
these would all need to be canonicalized to the traditional clamp pattern.
llvm-svn: 367438
power-of-two
Summary:
I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling.
While we did happen to handle this fold in most cases,
the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is a power of two),
or we first turn `x s% -y` into `x u% y`; that handles most of the cases.
But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`,
and thus we end up being stuck with `(x s% INT_MIN) == 0`.
There is no such restriction for the more general fold:
https://rise4fun.com/Alive/IIeS
To be noted, the fold does not enforce that `y` is a constant,
so it may indeed increase instruction count.
This is consistent with what `x u% y`->`x & (y-1)` already does.
I think it makes sense: it's at most one (simple) extra instruction,
while the `rem`ainder is really much less simple (and likely **very** costly).
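Concretely, for the `INT_MIN` case mentioned above (a sketch):
```
define i1 @srem_intmin_eq_zero(i32 %x) {
  %rem = srem i32 %x, -2147483648   ; x s% INT_MIN
  %cmp = icmp eq i32 %rem, 0
  ret i1 %cmp
  ; folds to:
  ;   %and = and i32 %x, 2147483647   ; INT_MIN has magnitude 2^31
  ;   %cmp = icmp eq i32 %and, 0
}
```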
Reviewers: spatel, RKSimon, nikic, xbolva00, craig.topper
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65046
llvm-svn: 367322
llvm-svn: 367233
The backend already does this via isNegatibleForFree(),
but we may want to alter the fneg IR canonicalizations
that currently exist, so we need to try harder to fold
fneg in IR to avoid regressions.
llvm-svn: 367227
llvm-svn: 367222
shift-amount-reassociation-with-truncation-shl.ll
llvm-svn: 367196
The backend already does this via isNegatibleForFree(),
but we may want to alter the fneg IR canonicalizations
that currently exist, so we need to try harder to fold
fneg in IR to avoid regressions.
llvm-svn: 367194
https://rise4fun.com/Alive/OQbM
Not so simple for lshr/ashr, so those maybe later.
https://bugs.llvm.org/show_bug.cgi?id=42391
llvm-svn: 367189
llvm-svn: 367156