| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D65815
llvm-svn: 368171
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was initially committed in r368059 but got reverted in r368084
because there was a faulty logic in how the shift amounts type mismatch
was being handled (it simply wasn't).
I've added an explicit bailout before we SimplifyAddInst() - i don't think
it's designed in general to handle differently-typed values, even though
the actual problem only comes from ConstantExpr's.
I have also changed the common type deduction, to not just blindly
look past zext, but try to do that so that in the end types match.
Differential Revision: https://reviews.llvm.org/D65380
llvm-svn: 368141
|
|
|
|
|
|
|
|
| |
Not all Darwin targets support _bcmp in all circumstances.
Differential Revision: https://reviews.llvm.org/D65834
llvm-svn: 368113
|
|
|
|
|
|
|
|
|
| |
This reverts commit 3de33245d2c992c9e0af60372043540b60f3a810.
This commit broke the MSan buildbots. See
https://reviews.llvm.org/rL367901 for more information.
llvm-svn: 368107
|
|
|
|
|
|
|
|
|
| |
This reverts r368059 (git commit 0f957109761913c563922f1afd4ceb29ef21bbd0)
This caused Clang to assert while self-hosting and compiling
SystemZInstrInfo.cpp. Reduction is running.
llvm-svn: 368084
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently `reassociateShiftAmtsOfTwoSameDirectionShifts()` only handles
two shifts one after another. If the shifts are `shl`, we still can
easily perform the fold, with no extra legality checks:
https://rise4fun.com/Alive/OQbM
If we have right-shift however, we won't be able to make it
any simpler than it already is.
After this the only thing missing here is constant-folding: (`NewShAmt >= bitwidth(X)`)
* If it's a logical shift, then constant-fold to `0` (not `undef`)
* If it's a `ashr`, then a splat of original signbit
https://rise4fun.com/Alive/E1K
https://rise4fun.com/Alive/i0V
Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65380
llvm-svn: 368059
|
|
|
|
| |
llvm-svn: 368056
|
|
|
|
|
|
| |
Baseline coverage for D65658.
llvm-svn: 368028
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: hsaito, fhahn, rengolin
Reviewed By: rengolin
Patch by psamolysov, thanks!
Differential Revision: https://reviews.llvm.org/D62997
llvm-svn: 367980
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A function is "no-return" if we never reach a return instruction, either
because there are none or the ones that exist are dead.
Test have been adjusted:
- either noreturn was added, or
- noreturn was avoided by modifying the code.
The new noreturn_{sync,async} test make sure we do handle invoke
instructions with a noreturn (and potentially nowunwind) callee
correctly, even in the presence of potential asynchronous exceptions.
llvm-svn: 367948
|
|
|
|
|
|
|
|
| |
When we remove instructions cached references could still be live. This
patch avoids removing invoke instructions that are replaced by calls and
instead keeps them around but in a dead block.
llvm-svn: 367933
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes our defualt legalization behavior for 16, 32, and
64 bit vectors with i8/i16/i32/i64 scalar types from promotion to
widening. For example, v8i8 will now be widened to v16i8 instead of
promoted to v8i16. This keeps the elements widths the same and pads
with undef elements. We believe this is a better legalization strategy.
But it carries some issues due to the fragmented vector ISA. For
example, i8 shifts and multiplies get widened and then later have
to be promoted/split into vXi16 vectors.
This has the potential to cause regressions so we wanted to get
it in early in the 10.0 cycle so we have plenty of time to
address them.
Next steps will be to merge tests that explicitly test the command
line option. And then we can remove the option and its associated
code.
llvm-svn: 367901
|
|
|
|
|
|
|
|
| |
As discussed in https://reviews.llvm.org/D65148#1607019
The canonical fold is: https://rise4fun.com/Alive/FKe
llvm-svn: 367897
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This appears to slightly help patterns similar to what's
shown in PR42874:
https://bugs.llvm.org/show_bug.cgi?id=42874
...but not in the way requested.
That fix will require some later IR and/or backend pass to
decompose multiply/shifts into something more optimal per
target. Those transforms already exist in some basic forms,
but probably need enhancing to catch more cases.
https://rise4fun.com/Alive/Qzv2
llvm-svn: 367891
|
|
|
|
| |
llvm-svn: 367883
|
|
|
|
|
|
|
| |
As the test shows, we can end up with more instructions than
we started with if we don't include the extra-use check.
llvm-svn: 367880
|
|
|
|
| |
llvm-svn: 367876
|
|
|
|
| |
llvm-svn: 367825
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This contains various fixes:
- Explicitly determine and return the next noreturn instruction.
- If an invoke calls a noreturn function which is not nounwind we
keep the unwind destination live. This also means we require an
invoke. Though we can still add the unreachable to the normal
destination block.
- Check if the return instructions are dead after we look for calls
to avoid triggering an optimistic fixpoint in the presence of
assumed liveness information.
- Make the interface work with "const" pointers.
- Some simplifications
While additional tests are included, full coverage is achieved only with
D59978.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65701
llvm-svn: 367791
|
|
|
|
|
|
|
|
|
|
| |
For consistency with normal instructions and clarity when reading IR,
it's best to print the %0, %1, ... names of function arguments in
definitions.
Also modifies the parser to accept IR in that form for obvious reasons.
llvm-svn: 367755
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
idiom in computeKnownBits.
computeKnownBits will indicate the sign bit of abs is 0 if the
the RHS operand returned by matchSelectPattern has the nsw flag set.
For abs idioms like (X >= 0) ? X : -X, the RHS returns -X. But
we can also match ((X-Y) >= 0 ? X-Y : Y-X as abs. In this case
RHS will be the Y-X operand. According to Alive, the sign bit for
this is only 0 if both the X-Y and Y-X operands have the nsw flag.
But we're only checking the Y-X operand.
llvm-svn: 367747
|
|
|
|
|
|
|
|
|
|
| |
scalar bit tests for the branches for expandload/compressstore.
Same as what was done for gather/scatter/load/store in r367489.
Expandload/compressstore were delayed due to lack of constant
masking handling that has since been fixed.
llvm-svn: 367738
|
|
|
|
|
|
|
|
|
|
|
|
| |
Modifying other AbstractAttributes to use Liveness AA and skip dead instructions.
Reviewers: jdoerfert, uenoku
Subscribers: hiraditya, llvm-commits
Differential revision: https://reviews.llvm.org/D65243
llvm-svn: 367725
|
|
|
|
|
|
|
|
|
|
| |
compressstore scalarization
This adds support for generating all the loads or stores for a constant mask into a single basic block with no conditionals.
Differential Revision: https://reviews.llvm.org/D65613
llvm-svn: 367715
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed in PR42696:
https://bugs.llvm.org/show_bug.cgi?id=42696
...but won't help that case yet.
We have an odd situation where a select operand equivalence fold was
implemented in InstSimplify when it could have been done more generally
in InstCombine if we allow dropping of {nsw,nuw,exact} from a binop operand.
Here's an example:
https://rise4fun.com/Alive/Xplr
%cmp = icmp eq i32 %x, 2147483647
%add = add nsw i32 %x, 1
%sel = select i1 %cmp, i32 -2147483648, i32 %add
=>
%sel = add i32 %x, 1
I've left the InstSimplify code in place for now, but my guess is that we'd
prefer to remove that as a follow-up to save on code duplication and
compile-time.
Differential Revision: https://reviews.llvm.org/D65576
llvm-svn: 367695
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds an ability to disable profile based peeling
causing the peeling of all iterations and as a result prohibits
further unroll/peeling attempts on that loop.
The motivation to get an ability to separate peeling usage in
pipeline where in the first part we peel only separate iterations if needed
and later in pipeline we apply the full peeling which will prohibit further peeling.
Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64983
llvm-svn: 367668
|
|
|
|
| |
llvm-svn: 367666
|
|
|
|
|
|
|
|
| |
subdirectory.
ps4-buildslave1 reported a failure. The test has x86 triple.
llvm-svn: 367659
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: hsaito, Ayal, fhahn, anna, mkazantsev
Reviewed By: hsaito
Patch by evrevnov, thanks!
Differential Revision: https://reviews.llvm.org/D63981
llvm-svn: 367654
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
not used.
Current peeling cost model can decide to peel off not all iterations
but only some of them to eliminate conditions on phi. At the same time
if any peeling happens the door for further unroll/peel optimizations on that
loop closes because the part of the code thinks that if peeling happened
it is profile based peeling and all iterations are peeled off.
To resolve this inconsistency the patch provides the flag which states whether
the full peeling basing on profile is enabled or not and peeling cost model
is able to modify this field like it does not PeelCount.
In a separate patch I will introduce an option to allow/disallow peeling basing
on profile.
To avoid infinite loop peeling the patch tracks the total number of peeled iteration
through llvm.loop.peeled.count loop metadata.
Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64972
llvm-svn: 367647
|
|
|
|
|
|
|
|
|
| |
Added code to truncate or shrink offsets so that we can continue
base pointer search if size has changed along the way.
Differential Revision: https://reviews.llvm.org/D65612
llvm-svn: 367646
|
|
|
|
| |
llvm-svn: 367634
|
|
|
|
|
|
|
|
|
|
|
| |
The previous change to fix crash in the vectorizer introduced
performance regressions. The condition to preserve pointer
address space during the search is too tight, we only need to
match the size.
Differential Revision: https://reviews.llvm.org/D65600
llvm-svn: 367624
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
DominatorTree is invalid after SimplifyCFG because of a missed `Changed = true` when simplifying a branch condition and removing an edge.
Resolves PR42272.
Reviewers: zhizhouy, manojgupta
Subscribers: jlebar, sanjoy.google, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65490
llvm-svn: 367596
|
|
|
|
|
|
|
|
|
|
|
| |
This allows folding of the scalar epilogue loop (the tail) into the main
vectorised loop body when the loop is annotated with a "vector predicate"
metadata hint. To fold the tail, instructions need to be predicated (masked),
enabling/disabling lanes for the remainder iterations.
Differential Revision: https://reviews.llvm.org/D65197
llvm-svn: 367592
|
|
|
|
|
|
| |
More coverage for the proposal in D65576.
llvm-svn: 367579
|
|
|
|
|
|
| |
More coverage for the proposal in D65576.
llvm-svn: 367577
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
scalar bit tests for the branches.
X86 at least is able to use movmsk or kmov to move the mask to the scalar
domain. Then we can just use test instructions to test individual bits.
This is more efficient than extracting each mask element
individually.
I special cased v1i1 to use the previous behavior. This avoids
poor type legalization of bitcast of v1i1 to i1.
I've skipped expandload/compressstore as I think we need to
handle constant masks for those better first.
Many tests end up with duplicate test instructions due to tail
duplication in the branch folding pass. But the same thing
happens when constructing similar code in C. So its not unique
to the scalarization.
Not sure if this lowering code will also be good for other targets,
but we're only testing X86 today.
Differential Revision: https://reviews.llvm.org/D65319
llvm-svn: 367489
|
|
|
|
|
|
|
|
|
|
| |
(prep work)
This is a prepatory patch for future work on support exit value rewriting in loops with a mixture of computable and non-computable exit counts. The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes.
The test differences are caused by cases wherewhere getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop.
llvm-svn: 367485
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Update condition to remove addition that may cause an overflow.
Resolves PR42814.
Reviewers: sanjoy, RKSimon
Subscribers: jlebar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65417
llvm-svn: 367461
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it
easier to implement the transforms (and possibly other fneg transforms) in
1 place because we can always start the pattern match from fneg (either the
legacy binop or the new unop).
There's a secondary practical benefit seen in PR21914 and PR42681:
https://bugs.llvm.org/show_bug.cgi?id=21914
https://bugs.llvm.org/show_bug.cgi?id=42681
...hoisting fneg rather than sinking seems to play nicer with LICM in IR
(although this change may expose analysis holes in the other direction).
1. The instcombine test changes show the expected neutral IR diffs from
reversing the order.
2. The reassociation tests show that we were missing an optimization
opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says
that all of these transforms are allowed (regardless of binop/unop
fneg version) because:
"For all other operations [besides copy/abs/negate/copysign], this
standard does not specify the sign bit of a NaN result."
In all of these transforms, we always have some other binop
(fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a
potential intermediate NaN operand.
(If that interpretation is wrong, then we must already have a bug in
the existing transforms?)
3. The clang tests shouldn't exist as-is, but that's effectively a
revert of rL367149 (the test broke with an extension of the
pre-existing fneg canonicalization in rL367146).
Differential Revision: https://reviews.llvm.org/D65399
llvm-svn: 367447
|
|
|
|
|
|
|
|
|
| |
When vectorizer strips pointers it can eventually end up with
pointers of two different sizes, then SCEV will crash.
Differential Revision: https://reviews.llvm.org/D65480
llvm-svn: 367443
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently InstCombiner::foldXorOfICmps() bailouts if the
ICMP it wants to invert has extra uses. As it can be seen
in the tests in previous commit, this is super unfortunate,
this is the single pattern that is left non-canonicalized.
We could analyze if we can also invert all the uses if said ICMP
at the same time, thus not bailing out there.
I'm not seeing any nicer alternative.
llvm-svn: 367439
|
|
|
|
|
|
|
| |
As disscussed in https://reviews.llvm.org/D65148#1603922
these would all need to be canonicalized to traditional clamp pattern.
llvm-svn: 367438
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have some code marks instructions with struct operands as overdefined,
but if the instruction is a call to a function with tracked arguments,
this breaks the assumption that the lattice values of all call sites
are not overdefined and will be replaced by a constant.
This also re-adds the assertion from D65222, with additionally skipping
non-callsite uses. This patch should address the cases reported in which
the assertion fired.
Fixes PR42738.
Reviewers: efriedma, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D65439
llvm-svn: 367430
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair
is unsupported by target, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine since neither of those has access to TTI.
So it makes most sense to teach `-div-rem-pairs` about it.
If we matched rem in expanded form, we know we will be able to place div-rem pair
next to each other so we won't regress the situation.
Also, we shouldn't decompose rem if we matched already-decomposed form.
This is surprisingly straight-forward otherwise.
The original patch was committed in rL367288 but was reverted in rL367289
because it exposed pre-existing RAUW issues in internal data structures
of the pass; those now have been addressed in a previous patch.
https://bugs.llvm.org/show_bug.cgi?id=42673
Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner
Reviewed By: bogner
Subscribers: bogner, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65298
llvm-svn: 367419
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
`DivRemPairs` internally creates two maps:
* {sign, divident, divisor} -> div instruction
* {sign, divident, divisor} -> rem instruction
Then it iterates over rem map, and looks if there is an entry
in div map with the same key. Then depending on some internal logic
it may RAUW rem instruction with something else.
But if that rem instruction is an input to other div/rem,
then it was used as a key in these maps, so the old value (used in key)
is now dandling, because RAUW didn't update those maps.
And we can't even RAUW map keys in general, there's `ValueMap`,
but we don't have a single `Value` as key...
The bug was discovered via D65298, and the test there exists.
Now, i'm not sure how to expose this issue in trunk.
The bug is clearly there if i change the map keys to be `AssertingVH`/`PoisoningVH`,
but i guess this didn't miscompiled anything thus far?
I really don't think this is benin without that patch.
The fix is actually rather straight-forward - instead of trying to somehow
shoe-horn `ValueMap` here (doesn't fit, key isn't just `Value`), or writing a new
`ValueMap` with key being a struct of `Value`s, we can just have an intermediate
data structure - a vector, each entry containing matching `Div, Rem` pair,
and pre-filling it before doing any modifications.
This way we won't need to query map after doing RAUW, so no bug is possible.
Reviewers: spatel, bogner, RKSimon, craig.topper
Reviewed By: spatel
Subscribers: hiraditya, hans, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65451
llvm-svn: 367417
|
|
|
|
| |
llvm-svn: 367415
|
|
|
|
|
|
|
| |
This fixes some pipeline tests.
This reverts commit d0b6f42936bfb6d56d325c732ae79400c9c6016a.
llvm-svn: 367401
|
|
|
|
|
|
| |
This reverts r367332 (git commit 2d7227ec3ac91f36fc32b1c21e72e2f1f5d030ad)
llvm-svn: 367335
|