| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
This is based on existing CodeGen test files for x86 and AArch64.
The corresponding potential transform is shown in:
rL370617
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
That fold keeps growing and growing :(
I think this may be one of the last pieces for it.
Since D67677/D67725, the fold knows the general form
of the pattern - where some masking is needed:
https://rise4fun.com/Alive/F5R
https://rise4fun.com/Alive/gslRa
But there is one more huge piece missing - if you are extracting some bits,
it is not impossible that the origin is wider than the extraction,
i.e. there may be a truncation. And we don't deal with that yet.
But we can, and the generalization remains fully identical:
https://rise4fun.com/Alive/Uar
https://rise4fun.com/Alive/5SW
After a preparatory cleanup i think the diff looks rather clean.
One missing piece is that in some patterns (especially pat. b),
`-1` only needs to be `-1` in the final type, but that is for later.
https://bugs.llvm.org/show_bug.cgi?id=42563
Reviewers: spatel, nikic
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69125
|
|
|
|
| |
Avoid subsequent test noise from improved CHECK-LABEL matching.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Same as D60846 but with a fix for the problem encountered there which
was a missing context adjustment in the handling of PHI nodes.
The test that caused D60846 to be reverted was added in e15ab8f277c7.
Reviewers: nikic, nlopes, mkazantsev, spatel, dlrobertson, uabelho, hakzsam
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69571
|
|
|
|
| |
This is in preparation of D69571.
|
|
|
|
| |
In all cases, we currently unintentionally drop the FMF on the new select.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This adds some patterns to transform uadd.with.overflow to uadd.sat
(and usub.with.overflow to usub.sat). The patterns select
UINTMAX (or 0 for subs) when the operation overflowed, and the plain result otherwise.
Signed patterns are a little more involved (they can wrap in two
directions), but can be added here in a followup patch too.
Differential Revision: https://reviews.llvm.org/D69245
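The equivalence behind this fold can be sketched in plain C. This is only an illustrative model, not the InstCombine code: the helper names (`uadd_with_overflow`, `add_select_pattern`, `uadd_sat32`, and so on) are hypothetical, and 32-bit operands are an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical model of llvm.uadd.with.overflow: wrapped sum plus overflow bit. */
int uadd_with_overflow(uint32_t a, uint32_t b, uint32_t *sum) {
    *sum = a + b;      /* unsigned addition wraps modulo 2^32 */
    return *sum < a;   /* it wrapped iff the result is smaller than an operand */
}

/* Hypothetical model of llvm.usub.with.overflow: wrapped difference plus borrow bit. */
int usub_with_overflow(uint32_t a, uint32_t b, uint32_t *diff) {
    *diff = a - b;     /* unsigned subtraction wraps modulo 2^32 */
    return a < b;
}

/* Source pattern: select the maximum value if the add overflowed. */
uint32_t add_select_pattern(uint32_t a, uint32_t b) {
    uint32_t sum;
    return uadd_with_overflow(a, b, &sum) ? UINT32_MAX : sum;
}

/* Source pattern: select 0 if the sub overflowed. */
uint32_t sub_select_pattern(uint32_t a, uint32_t b) {
    uint32_t diff;
    return usub_with_overflow(a, b, &diff) ? 0u : diff;
}

/* Reference saturating operations, computed without wrapping. */
uint32_t uadd_sat32(uint32_t a, uint32_t b) {
    uint64_t wide = (uint64_t)a + b;
    return wide > UINT32_MAX ? UINT32_MAX : (uint32_t)wide;
}

uint32_t usub_sat32(uint32_t a, uint32_t b) {
    return a > b ? a - b : 0u;
}
```

For any inputs, `add_select_pattern` should agree with `uadd_sat32` and `sub_select_pattern` with `usub_sat32`, which is why the select can be replaced by the saturating intrinsic.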
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In the following C code, the branch is not removed by clang at O3.
```
int f1(char* p) {
int i1 = __builtin_strlen(p);
if (!p)
return -1;
return i1;
}
```
The issue is that the call to strlen is sunk to the following block by instcombine. In its new place the call to strlen doesn't dominate the use in the icmp anymore so value tracking can't see that p cannot be null.
This patch resolves the issue by inserting an assumption at the place of the call before sinking a call when that call can be used to prove an argument to be nonnull.
This resolves this issue at O3.
Reviewers: majnemer, xbolva00, fhahn, jdoerfert, spatel, efriedma
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69477
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
elements
This is a fix for:
https://bugs.llvm.org/show_bug.cgi?id=43730
...and as shown there, we have existing test cases that show potential miscompiles.
We could just bail out for vector constants that contain any undef elements, or we can do as shown here:
allow the transform, but replace the undefs with a safe value.
For most of the tests shown, this results in a full splat constant (no undefs) which is probably a win
for further IR analysis because we conservatively don't match undefs in most cases. Codegen can probably
recover these kinds of undef lanes via demanded elements analysis if that's profitable.
Differential Revision: https://reviews.llvm.org/D69519
|
|
|
|
|
|
| |
types; NFC
Increase coverage for D69519.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Getelementptr has vector type if any of its operands are vectors
(the scalar operands being implicitly broadcast to all vector elements).
Extractelement applied to a vector getelementptr can be folded by
applying the extractelement in turn to all of the vector operands.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69379
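A rough C model of why this fold is sound, assuming a 4-lane vector of `int *` pointers represented as a plain array. All names here (`LANES`, `vector_gep`, `extract_of_vector_gep`, `scalar_gep_of_extracts`) are hypothetical and only mimic the lane-wise semantics described above.

```c
#include <stddef.h>

#define LANES 4

/* Lane-wise "vector getelementptr": out[i] = bases[i] + idx[i]. */
void vector_gep(int *const bases[LANES], const ptrdiff_t idx[LANES],
                int *out[LANES]) {
    for (int i = 0; i < LANES; ++i)
        out[i] = bases[i] + idx[i];
}

/* Unfolded form: build the whole vector GEP, then extract one lane. */
int *extract_of_vector_gep(int *const bases[LANES],
                           const ptrdiff_t idx[LANES], int lane) {
    int *tmp[LANES];
    vector_gep(bases, idx, tmp);
    return tmp[lane];
}

/* Folded form: extract each operand's lane first, then do a scalar GEP. */
int *scalar_gep_of_extracts(int *const bases[LANES],
                            const ptrdiff_t idx[LANES], int lane) {
    return bases[lane] + idx[lane];
}

/* Returns 1 if both forms agree on every lane of a small example. */
int extract_fold_agrees(void) {
    static int storage[16];
    int *const bases[LANES] = { storage, storage + 4, storage + 8, storage + 12 };
    const ptrdiff_t idx[LANES] = { 0, 1, 2, 3 };
    for (int lane = 0; lane < LANES; ++lane)
        if (extract_of_vector_gep(bases, idx, lane) !=
            scalar_gep_of_extracts(bases, idx, lane))
            return 0;
    return 1;
}
```

A scalar operand that is implicitly broadcast corresponds to an `idx` (or `bases`) array whose lanes are all equal, so extracting from it simply yields the scalar.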
|
|
|
|
|
|
|
| |
This is an extra fold for a canonical form of uadd_sat, as shown in
D68651. It essentially selects uadd.sat from an add and a select.
Differential Revision: https://reviews.llvm.org/D69244
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds an instcombine matcher for code that attempts to perform signed
saturating arithmetic by casting to a higher type. Unsigned cases are already
matched; this adds extra matches for the more complex signed cases, which
involve matching the min(max(add a b)) nodes with proper extends to ensure
Differential Revision: https://reviews.llvm.org/D68651
llvm-svn: 375505
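The kind of source code this matcher targets can be sketched as follows. This is a minimal illustration assuming 8-bit operands widened to 16 bits; the function names are hypothetical and the exhaustive check merely confirms that the widen/clamp idiom equals signed saturating addition.

```c
#include <stdint.h>

/* Idiom: widen, add, clamp with min/max, truncate. */
int8_t sadd_sat_via_widening(int8_t a, int8_t b) {
    int16_t wide = (int16_t)((int16_t)a + (int16_t)b); /* cannot overflow in 16 bits */
    if (wide > INT8_MAX) wide = INT8_MAX;              /* min(wide, 127) */
    if (wide < INT8_MIN) wide = INT8_MIN;              /* max(wide, -128) */
    return (int8_t)wide;
}

/* Reference: saturating add computed with explicit range checks. */
int8_t sadd_sat_reference(int8_t a, int8_t b) {
    if (b > 0 && a > INT8_MAX - b) return INT8_MAX;
    if (b < 0 && a < INT8_MIN - b) return INT8_MIN;
    return (int8_t)(a + b);
}

/* Returns 1 if both forms agree for every pair of 8-bit inputs. */
int sadd_sat_forms_agree(void) {
    for (int a = INT8_MIN; a <= INT8_MAX; ++a)
        for (int b = INT8_MIN; b <= INT8_MAX; ++b)
            if (sadd_sat_via_widening((int8_t)a, (int8_t)b) !=
                sadd_sat_reference((int8_t)a, (int8_t)b))
                return 0;
    return 1;
}
```

Unlike the unsigned case, the signed result can saturate in two directions, which is why both a min and a max clamp are needed.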
|
|
|
|
| |
llvm-svn: 375503
|
|
|
|
| |
llvm-svn: 375418
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Allow for ignoring the check for a single use in SimplifyDemandedVectorElts
to be able to simplify operands if DemandedElts is known to contain
the union of elements used by all users.
It is a responsibility of a caller of SimplifyDemandedVectorElts to
supply correct DemandedElts.
Simplify a series of extractelement instructions if only a subset of
elements is used.
Reviewers: reames, arsenm, majnemer, nhaehnle
Reviewed By: nhaehnle
Subscribers: wdng, jvesely, nhaehnle, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67345
llvm-svn: 375395
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In this pattern, all the "magic" bits that we'd `add` are all
high sign bits, and in the value we'd be adding to they are all unset,
not unexpectedly, so we can have an `or` there:
https://rise4fun.com/Alive/ups
It is possible that `haveNoCommonBitsSet()` should be taught about this
pattern so that we never have an `add` variant, but the reasoning would
need to be recursive (because of that `select`), so i'm not really sure
that would be worth it just yet.
llvm-svn: 375378
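The arithmetic fact this relies on, that `add` and `or` agree whenever the operands share no set bits, can be checked exhaustively on 8-bit values. This is a standalone sketch with a hypothetical function name; it does not model `haveNoCommonBitsSet()` itself.

```c
#include <stdint.h>

/* Returns 1 if a + b == a | b for every 8-bit pair with no common set bits. */
int add_equals_or_when_disjoint(void) {
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            if ((a & b) != 0)
                continue;                   /* only disjoint bit patterns qualify */
            uint8_t sum = (uint8_t)(a + b); /* no carries can occur here */
            uint8_t merged = (uint8_t)(a | b);
            if (sum != merged)
                return 0;
        }
    }
    return 1;
}
```

With disjoint operands no bit position ever produces a carry, so the addition degenerates into a bitwise merge.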
|
|
|
|
|
|
|
|
|
|
|
| |
can be 'or' pattern.
In this pattern, all the "magic" bits that we'd add are all
high sign bits, and in the value we'd be adding to they are all unset,
not unexpectedly, so we can have an `or` there:
https://rise4fun.com/Alive/ups
llvm-svn: 375377
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds folds for comparing uadd.sat/usub.sat with zero:
* uadd.sat(a, b) == 0 => a == 0 && b == 0 => (a | b) == 0
* usub.sat(a, b) == 0 => a <= b
And inverted forms for !=.
Differential Revision: https://reviews.llvm.org/D69224
llvm-svn: 375374
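Both folds listed above can be verified exhaustively on 8-bit values. The following is a small sketch with hypothetical helper names (`uadd_sat8`, `usub_sat8`); the 8-bit width is an assumption chosen to keep the check cheap.

```c
#include <stdint.h>

/* Reference 8-bit unsigned saturating add. */
uint8_t uadd_sat8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* Reference 8-bit unsigned saturating sub. */
uint8_t usub_sat8(uint8_t a, uint8_t b) {
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Returns 1 if both comparison folds hold for all 8-bit inputs. */
int sat_zero_folds_hold(void) {
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            uint8_t x = (uint8_t)a, y = (uint8_t)b;
            /* uadd.sat(a, b) == 0  <=>  (a | b) == 0 */
            if ((uadd_sat8(x, y) == 0) != ((x | y) == 0))
                return 0;
            /* usub.sat(a, b) == 0  <=>  a <= b */
            if ((usub_sat8(x, y) == 0) != (x <= y))
                return 0;
        }
    }
    return 1;
}
```

The `!=` forms follow by negating both sides of each equivalence.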
|
|
|
|
| |
llvm-svn: 375372
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This problem consists of several parts:
* Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`.
This is trivial, and easy to do, we have a fold for it.
* Shift amount reassociation - if we have two identical shifts,
and we can simplify-add their shift amounts together,
then we likely can just perform them as a single shift.
But this is finicky, has one-use restrictions,
and shift opcodes must be identical.
But there is a super-pattern where both of these work together
to produce a sign bit test from two shifts + comparison.
We do indeed already handle this in most cases.
But since we get that fold transitively, it has one-use restrictions.
And what's worse, in this case the right-shifts aren't required to be
identical, and we can't handle that transitively:
If the total shift amount is bitwidth-1, only a sign bit will remain
in the output value. But if we look at this from the perspective of
two shifts, we can't fold - we can't possibly know what bit pattern
we'd produce via two shifts, it will be *some* kind of a mask
produced from the original sign bit, but we just can't tell its shape:
https://rise4fun.com/Alive/cM0 https://rise4fun.com/Alive/9IN
But it will *only* contain sign bit and zeros.
So from the perspective of sign bit test, we're good:
https://rise4fun.com/Alive/FRz https://rise4fun.com/Alive/qBU
Superb!
So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a
pseudo-analysis mode that will ignore extra uses, and will only check
whether a) those are two right shifts and b) they end up with a bitwidth(x)-1
shift amount, and return either the original value that we are sign-checking,
or null.
This does not have any functionality change for
the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()`.
All that being said, as discussed in the review, this yet again
increases usage of instsimplify in instcombine as utility.
Some day that may need to be reevaluated.
https://bugs.llvm.org/show_bug.cgi?id=43595
Reviewers: spatel, efriedma, vsk
Reviewed By: spatel
Subscribers: xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68930
llvm-svn: 375371
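The key claim, that two right shifts of any kind totalling bitwidth-1 leave only copies of the sign bit and zeros, can be checked exhaustively on an 8-bit model. This is a sketch under that assumption; `lshr8`, `ashr8` and the check function are hypothetical helpers, with the arithmetic shift written out by hand to avoid relying on implementation-defined behaviour.

```c
#include <stdint.h>

/* Logical shift right on 8 bits. */
uint8_t lshr8(uint8_t x, unsigned n) {
    return (uint8_t)(x >> n);
}

/* Arithmetic shift right on 8 bits: vacated high bits copy the sign bit. */
uint8_t ashr8(uint8_t x, unsigned n) {
    uint8_t fill = (x & 0x80u) ? (uint8_t)(0xFFu << (8u - n)) : 0;
    return (uint8_t)((x >> n) | fill);
}

typedef uint8_t (*shift_fn)(uint8_t, unsigned);

/* Returns 1 if, for every x, every split a + b == 7 and every mix of
   lshr/ashr, the doubly shifted value is nonzero exactly when x is negative. */
int two_shifts_test_sign_bit(void) {
    const shift_fn kinds[2] = { lshr8, ashr8 };
    for (unsigned x = 0; x < 256; ++x) {
        int is_negative = (x & 0x80u) != 0;
        for (unsigned a = 0; a <= 7; ++a) {
            unsigned b = 7 - a;
            for (int i = 0; i < 2; ++i) {
                for (int j = 0; j < 2; ++j) {
                    uint8_t r = kinds[j](kinds[i]((uint8_t)x, a), b);
                    if ((r != 0) != is_negative)
                        return 0;
                }
            }
        }
    }
    return 1;
}
```

The exact mask produced differs between the shift combinations, which is why the two shifts cannot be merged in general, yet comparing the result against zero always yields the sign test.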
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add restrictions in canEvaluateShuffled to prevent us from, for example,
transforming
%0 = insertelement <2 x i16> undef, i16 %a, i32 0
%1 = srem <2 x i16> %0, <i16 2, i16 1>
%2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0>
into
%1 = insertelement <2 x i16> undef, i16 %a, i32 1
%2 = srem <2 x i16> %1, <i16 undef, i16 2>
as having an undef denominator makes the srem undefined (for all
vector elements).
Fixes: https://bugs.llvm.org/show_bug.cgi?id=43689
Reviewers: spatel, lebedev.ri
Reviewed By: spatel, lebedev.ri
Subscribers: lebedev.ri, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69038
llvm-svn: 375208
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
canEvaluateShuffled
Adding the reproducer from https://bugs.llvm.org/show_bug.cgi?id=43689,
showing that instcombine is doing a bad transform. It transforms
%0 = insertelement <2 x i16> undef, i16 %a, i32 0
%1 = srem <2 x i16> %0, <i16 2, i16 1>
%2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0>
into
%1 = insertelement <2 x i16> undef, i16 %a, i32 1
%2 = srem <2 x i16> %1, <i16 undef, i16 2>
The undef denominator makes the whole srem undefined.
llvm-svn: 375207
|
|
|
|
|
|
|
|
| |
shift-of-trunc" (PR42563)
https://bugs.llvm.org/show_bug.cgi?id=42563
llvm-svn: 375135
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR42699)
Summary:
Currently when computing a GEP offset using the function EmitGEPOffset
for the following instruction
getelementptr inbounds i32, i32* %p, i64 %offs
we get
mul nuw i64 %offs, 4
Unfortunately we cannot assume that unsigned wrapping won't happen
here because %offs is allowed to be negative.
Making such assumptions can lead to miscompilations: see the new test
test24_neg_offs in InstCombine/icmp.ll. Without the patch InstCombine
would generate the following comparison:
icmp eq i64 %offs, 4611686018427387902; 0x3ffffffffffffffe
Whereas the correct value to compare with is -2.
This patch replaces the NUW flag with NSW in the multiplication
instructions generated by EmitGEPOffset and adjusts the test suite.
https://bugs.llvm.org/show_bug.cgi?id=42699
Reviewers: chandlerc, craig.topper, ostannard, lebedev.ri, spatel, efriedma, nlopes, aqjune
Reviewed By: lebedev.ri
Subscribers: reames, lebedev.ri, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68342
llvm-svn: 375089
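The two constants quoted above can be reproduced with a short calculation. This sketch assumes the comparison targets a byte offset of -8 with an element size of 4, which is consistent with the quoted values but is not stated in the message; the function names are hypothetical.

```c
#include <stdint.h>

/* Solving `offs * scale == byte_offset` under a no-unsigned-wrap assumption:
   the division is performed on the unsigned bit pattern. */
uint64_t solve_assuming_nuw(int64_t byte_offset, uint64_t scale) {
    return (uint64_t)byte_offset / scale;
}

/* Solving the same equation under a no-signed-wrap assumption:
   the division is performed on the signed value. */
int64_t solve_assuming_nsw(int64_t byte_offset, int64_t scale) {
    return byte_offset / scale;
}
```

With these inputs, `solve_assuming_nuw(-8, 4)` yields `0x3ffffffffffffffe` while `solve_assuming_nsw(-8, 4)` yields `-2`, matching the wrong and the correct comparison operands respectively.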
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is something of a workaround to avoid a crash later on in type
legalizer (WidenVectorResult()).
Also added some f16 tests, including a non-working v3f16 case with
a FIXME.
Reviewers: arsenm, tpr, nhaehnle
Reviewed By: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68865
llvm-svn: 374993
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The 1st attempt at rL374828 inserted the code
at the wrong position (outside of the constant-shift-amount
block). Trying again with an additional test to verify
const-ness.
For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)
https://rise4fun.com/Alive/IZ9
Fixes PR42257.
Based on original patch by @zvi (Zvi Rackover)
Differential Revision: https://reviews.llvm.org/D63382
llvm-svn: 374886
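The fold itself is easy to sanity-check in C. A minimal sketch, assuming a 32-bit result type and shift amounts in the range 0..31; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Source form: zero-extend the boolean, then shift left. */
uint32_t shl_of_zext(bool x, unsigned sh_amt) {
    return (uint32_t)x << sh_amt;
}

/* Folded form: select between the pre-shifted constant and zero. */
uint32_t select_of_shifted_const(bool x, unsigned sh_amt) {
    return x ? (UINT32_C(1) << sh_amt) : 0u;
}

/* Returns 1 if both forms agree for every in-range shift amount. */
int shl_zext_fold_agrees(void) {
    for (unsigned sh = 0; sh < 32; ++sh) {
        if (shl_of_zext(false, sh) != select_of_shifted_const(false, sh))
            return 0;
        if (shl_of_zext(true, sh) != select_of_shifted_const(true, sh))
            return 0;
    }
    return 1;
}
```

Because the shift amount is constant, `1 << ShAmt` folds to a constant and the select needs no arithmetic at run time.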
|
|
|
|
|
|
| |
This reverts r374828 (git commit 1f40f15d54aac06421448b6de131231d2d78bc75) due to bot breakage
llvm-svn: 374851
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)
https://rise4fun.com/Alive/IZ9
Fixes PR42257.
Based on original patch by @zvi (Zvi Rackover)
Differential Revision: https://reviews.llvm.org/D63382
llvm-svn: 374828
|
|
|
|
|
|
| |
A transform proposal for the shift form is in D63382.
llvm-svn: 374818
|
|
|
|
|
|
|
|
| |
Reapply r374240 with fix for Ocaml test, namely Bindings/OCaml/core.ml.
Differential Revision: https://reviews.llvm.org/D61675
llvm-svn: 374782
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
non-default address space
Follow-up to D68244 to account for a corner case discussed in:
https://bugs.llvm.org/show_bug.cgi?id=43501
Add one more restriction: if the pointer is deref-or-null and in a non-default
(non-zero) address space, we can't assume inbounds.
Differential Revision: https://reviews.llvm.org/D68706
llvm-svn: 374728
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While that pattern is indirectly handled via
reassociateShiftAmtsOfTwoSameDirectionShifts(),
that incurs a one-use restriction on truncation,
which is pointless since we know that we'll produce a single instruction.
Additionally, *if* we are only looking for sign bit,
we don't need shifts to be identical,
which isn't the case in general,
and is the blocker for me in bug in question:
https://bugs.llvm.org/show_bug.cgi?id=43595
llvm-svn: 374726
|
|
|
|
|
|
| |
Also, refactor check in `LibCallSimplifier::optimizeLog()`.
llvm-svn: 374453
|
|
|
|
|
|
|
| |
This reverts commit r374240. It broke OCaml tests:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19014
llvm-svn: 374354
|
|
|
|
|
|
|
|
| |
Also update Clang to call Builder.CreateFNeg(...) for UnaryMinus.
Differential Revision: https://reviews.llvm.org/D61675
llvm-svn: 374240
|
|
|
|
| |
llvm-svn: 374190
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
high-bit-extract-with-signext (PR42389)
This can come up in Bit Stream abstractions.
The pattern looks big/scary, but it can't be simplified any further.
It only is so simple because a number of my preparatory folds had
happened already (shift amount reassociation / shift amount
reassociation in bit test, sign bit test detection).
Highlights:
* There are two main flavors: https://rise4fun.com/Alive/zWi
The difference is add vs. sub, and left-shift of -1 vs. 1
* Since we only change the shift opcode,
we can preserve the exact-ness: https://rise4fun.com/Alive/4u4
* There can be truncation after high-bit-extraction:
https://rise4fun.com/Alive/slHc1 (the main pattern i'm after!)
Which means that we need to ignore zext of shift amounts and of NBits.
* The sign-extending magic can be extended itself (in add pattern
via sext, in sub pattern via zext. not the other way around!)
https://rise4fun.com/Alive/NhG
(or those sext/zext can be sunk into `select`!)
Which again means we should pay attention when matching NBits.
* We can have both truncation of extraction and widening of magic:
https://rise4fun.com/Alive/XTw
In other words, i don't believe we need to have any checks on
bitwidths of any of these constructs.
This is worsened in general by the fact that we may have `sext` instead
of `zext` for shift amounts, and we don't yet canonicalize to `zext`,
although we should. I have not done anything about that here.
Also, we really should have something to weed out `sub` like these,
by folding them into `add` variant.
https://bugs.llvm.org/show_bug.cgi?id=42389
llvm-svn: 373964
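A simplified reading of the two main flavors can be modelled on 8-bit values. This sketch assumes the pattern is a logical high-bit extraction followed by adding `-1 << NBits` (or subtracting `1 << NBits`) when the input is negative, and that the fold replaces it with an arithmetic shift. That is one interpretation of the description above; truncation and extension are ignored, and all names are hypothetical.

```c
#include <stdint.h>

/* Arithmetic shift right on 8 bits, written out explicitly. */
uint8_t ashr_u8(uint8_t x, unsigned n) {
    uint8_t fill = (x & 0x80u) ? (uint8_t)(0xFFu << (8u - n)) : 0;
    return (uint8_t)((x >> n) | fill);
}

/* Returns 1 if both flavors equal ashr for all x and NBits in 1..7. */
int high_bit_extract_signext_agrees(void) {
    for (unsigned x = 0; x < 256; ++x) {
        int is_negative = (x & 0x80u) != 0;
        for (unsigned nbits = 1; nbits <= 7; ++nbits) {
            uint8_t high = (uint8_t)(x >> (8u - nbits)); /* lshr x, 8 - NBits */
            uint8_t magic_add = is_negative ? (uint8_t)(0xFFu << nbits) : 0; /* -1 << NBits */
            uint8_t magic_sub = is_negative ? (uint8_t)(1u << nbits) : 0;    /*  1 << NBits */
            uint8_t expected = ashr_u8((uint8_t)x, 8u - nbits);
            if ((uint8_t)(high + magic_add) != expected)
                return 0;
            if ((uint8_t)(high - magic_sub) != expected)
                return 0;
        }
    }
    return 1;
}
```

In this model only the shift opcode changes between the two sides, which fits the remark that exact-ness can be preserved.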
|
|
|
|
|
|
|
|
| |
pattern (PR42389)
https://bugs.llvm.org/show_bug.cgi?id=42389
llvm-svn: 373963
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
amounts
Summary:
When we do `ConstantExpr::getZExt()`, that "extends" `undef` to `0`,
which means that for patterns a/b we'd assume that we must not produce
any bits for that channel, while in reality we simply didn't care
about that channel - i.e. we don't need to mask it.
Reviewers: spatel
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68239
llvm-svn: 373960
|
|
|
|
|
|
|
| |
Extends rL373230 and solves the motivating bug (although in a narrow way):
https://bugs.llvm.org/show_bug.cgi?id=43497
llvm-svn: 373851
|
|
|
|
| |
llvm-svn: 373848
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR43501)
https://bugs.llvm.org/show_bug.cgi?id=43501
We can't declare a GEP 'inbounds' in general. But we may salvage that information if
we have known dereferenceable bytes on the source pointer.
Differential Revision: https://reviews.llvm.org/D68244
llvm-svn: 373847
|
|
|
|
|
|
|
|
|
|
|
| |
'icmp sge/slt %x, 0'
We do indeed already get it right in some cases, but only transitively,
with one-use restrictions. Since we only need to produce a single
comparison, it makes sense to match the pattern directly:
https://rise4fun.com/Alive/kPg
llvm-svn: 373802
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR43564, PR42391)
Initially (D65380) i believed that if we have rightshift-trunc-rightshift,
we can't do any folding. But as it usually happens, i was wrong.
https://rise4fun.com/Alive/GEw
https://rise4fun.com/Alive/gN2O
In https://bugs.llvm.org/show_bug.cgi?id=43564 we happen to have
this very sequence, of two right shifts separated by trunc.
And it "just" so happens that we apparently can fold the pattern
if the total shift amount is either 0, or it's equal to the bitwidth
of the innermost widest shift - i.e. if we are left with only the
original sign bit. Which is exactly what is wanted there.
llvm-svn: 373801
|
|
|
|
| |
llvm-svn: 373800
|
|
|
|
| |
llvm-svn: 373799
|
|
|
|
|
|
|
|
| |
trunc) (PR43564)
https://rise4fun.com/Alive/x5IS
llvm-svn: 373798
|