| Commit message | Author | Age | Files | Lines |
... | |
| |
This patch sinks add/mul(shufflevector(insertelement())) into the basic block in which they are used so that they can then be selected together.
This is useful for various MVE instructions, such as vmla and others that take R registers.
Loop tests have been added to the vmla test file to make sure vmlas are generated in loops.
Differential revision: https://reviews.llvm.org/D66295
llvm-svn: 371218
| |
llvm-svn: 371158
| |
llvm-svn: 371146
| |
Properly check if NewAAInfo conflicts with AAInfo.
Record in the local variable and in the alias set that a change occurred when a conflict is found.
Resolves PR42969.
llvm-svn: 371139
| |
Summary:
Here we try to avoid issues with the "explicit branch" that SimplifyBranchOnICmpChain
creates, which can branch on undef. MSan by design reports branches on uninitialized
memory and undefs, so we get a false report here.
In general, MSan does not like it when we convert
```
// If at least one of them is true, MSan is OK even if the other is undef
if (a || b)
  return;
```
into
```
// If 'a' is undef, MSan will complain even if 'b' is true
if (a)
  return;
if (b)
  return;
```
Example
Before optimization we had something like this:
```
while (true) {
  bool maybe_undef = doStuff();
  while (true) {
    char c = getChar();
    if (c != 10 && c != 13)
      continue;
    break;
  }
  // we know that c == 10 || c == 13 if we get here,
  // so msan knows that the branch is not affected by maybe_undef
  if (maybe_undef || c == 10 || c == 13)
    continue;
  return;
}
```
SimplifyBranchOnICmpChain will convert that into
```
while (true) {
  bool maybe_undef = doStuff();
  while (true) {
    char c = getChar();
    if (c != 10 && c != 13)
      continue;
    break;
  }
  // however msan will complain here:
  if (maybe_undef)
    continue;
  // we know that c == 10 || c == 13, so either way we will get continue
  switch (c) {
    case 10: continue;
    case 13: continue;
  }
  return;
}
```
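The difference between the two forms can be modeled in a few lines. This is a minimal sketch, not MSan's implementation: `Bit`, `shadow_or`, `branch`, `fused`, and `split` are hypothetical names, and the sketch assumes the `a || b` condition has already been lowered to a single `or` whose result counts as defined whenever one operand is a defined `true`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bit:
    value: bool     # concrete value (meaningless when poisoned)
    poisoned: bool  # True if the bit is uninitialized / undef


def shadow_or(a: Bit, b: Bit) -> Bit:
    """Model of `or`: a defined `true` operand makes the result defined."""
    poisoned = (
        (a.poisoned and b.poisoned)
        or (a.poisoned and not b.value)
        or (b.poisoned and not a.value)
    )
    return Bit(a.value or b.value, poisoned)


def branch(cond: Bit, reports: list, where: str) -> bool:
    """Branching on a poisoned condition is what gets reported."""
    if cond.poisoned:
        reports.append(where)
    return cond.value


def fused(a: Bit, b: Bit) -> list:
    """One branch on the combined condition."""
    reports = []
    branch(shadow_or(a, b), reports, "if (a || b)")
    return reports


def split(a: Bit, b: Bit) -> list:
    """Two separate branches, as produced by the transform."""
    reports = []
    if not branch(a, reports, "if (a)"):
        branch(b, reports, "if (b)")
    return reports
```

With `a` poisoned and `b` a defined `true`, `fused` reports nothing while `split` reports the branch on `a`, which is the false positive described above.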
Reviewers: eugenis, efriedma
Reviewed By: eugenis, efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67205
llvm-svn: 371138
| |
patterns have full test coverage
llvm-svn: 371108
| |
overflow' check
A follow-up for r329011.
This may be changed to produce @llvm.sub.with.overflow in a later patch,
but for now just make things more consistent overall.
A few observations stem from this:
* There does not seem to be a similar one-instruction fold for uadd-overflow
* I'm not sure we'll want to canonicalize `B u> A` as `usub.with.overflow`,
so since the `icmp` here no longer refers to `sub`,
reconstructing `usub.with.overflow` will be problematic,
and will likely require a standalone pass (similar to DivRemPairs).
https://rise4fun.com/Alive/Zqs
Name: (A - B) u> A --> B u> A
%t0 = sub i8 %A, %B
%r = icmp ugt i8 %t0, %A
=>
%r = icmp ugt i8 %B, %A
Name: (A - B) u<= A --> B u<= A
%t0 = sub i8 %A, %B
%r = icmp ule i8 %t0, %A
=>
%r = icmp ule i8 %B, %A
Name: C u< (C - D) --> C u< D
%t0 = sub i8 %C, %D
%r = icmp ult i8 %C, %t0
=>
%r = icmp ult i8 %C, %D
Name: C u>= (C - D) --> C u>= D
%t0 = sub i8 %C, %D
%r = icmp uge i8 %C, %t0
=>
%r = icmp uge i8 %C, %D
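The four folds above can also be sanity-checked by brute force. The following is a sketch that models `i8` with Python integers masked to 8 bits; `check_usub_folds` is a hypothetical helper name, and the check complements the Alive proofs rather than replacing them.

```python
BITS = 8
MASK = (1 << BITS) - 1


def check_usub_folds():
    """Exhaustively compare each i8 pattern with its folded form."""
    checked = 0
    for a in range(MASK + 1):
        for b in range(MASK + 1):
            t0 = (a - b) & MASK            # %t0 = sub i8 %A, %B (wrapping)
            assert (t0 > a) == (b > a)     # (A - B) u>  A  -->  B u>  A
            assert (t0 <= a) == (b <= a)   # (A - B) u<= A  -->  B u<= A
            assert (a < t0) == (a < b)     # C u<  (C - D)  -->  C u<  D
            assert (a >= t0) == (a >= b)   # C u>= (C - D)  -->  C u>= D
            checked += 1
    return checked
```

Calling `check_usub_folds()` raises `AssertionError` on the first counterexample and otherwise returns the number of operand pairs it covered.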
llvm-svn: 371101
| |
overflow' check
A follow-up for r342004.
This will be changed to produce @llvm.add.with.overflow in a later patch,
but for now just make things more consistent overall.
https://rise4fun.com/Alive/qxE
Name: (Op1 + X) u< Op1 --> ~Op1 u< X
%t0 = add i8 %Op1, %X
%r = icmp ult i8 %t0, %Op1
=>
%n = xor i8 %Op1, -1
%r = icmp ult i8 %n, %X
Name: (Op1 + X) u>= Op1 --> ~Op1 u>= X
%t0 = add i8 %Op1, %X
%r = icmp uge i8 %t0, %Op1
=>
%n = xor i8 %Op1, -1
%r = icmp uge i8 %n, %X
;-------------------------------------------------------------------------------
Name: Op0 u> (Op0 + X) --> X u> ~Op0
%t0 = add i8 %Op0, %X
%r = icmp ugt i8 %Op0, %t0
=>
%n = xor i8 %Op0, -1
%r = icmp ugt i8 %X, %n
Name: Op0 u<= (Op0 + X) --> X u<= ~Op0
%t0 = add i8 %Op0, %X
%r = icmp ule i8 %Op0, %t0
=>
%n = xor i8 %Op0, -1
%r = icmp ule i8 %X, %n
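As a quick cross-check of the four folds, the sketch below enumerates every `i8` operand pair. `not8`, `FOLDS`, and `failing_folds` are hypothetical names used only for illustration.

```python
MASK = 0xFF  # model i8 as a Python int masked to 8 bits


def not8(v):
    """xor i8 %v, -1"""
    return v ^ MASK


def add8(a, b):
    """add i8 %a, %b (wrapping)"""
    return (a + b) & MASK


# name -> (pattern before the fold, pattern after the fold)
FOLDS = {
    "(Op1 + X) u< Op1": (lambda op, x: add8(op, x) < op, lambda op, x: not8(op) < x),
    "(Op1 + X) u>= Op1": (lambda op, x: add8(op, x) >= op, lambda op, x: not8(op) >= x),
    "Op0 u> (Op0 + X)": (lambda op, x: op > add8(op, x), lambda op, x: x > not8(op)),
    "Op0 u<= (Op0 + X)": (lambda op, x: op <= add8(op, x), lambda op, x: x <= not8(op)),
}


def failing_folds():
    """Return the names of folds that disagree for some operand pair."""
    return [
        name
        for name, (before, after) in FOLDS.items()
        if any(
            before(op, x) != after(op, x)
            for op in range(MASK + 1)
            for x in range(MASK + 1)
        )
    ]
```

An empty list from `failing_folds()` means all four rewrites agree on every 8-bit input.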
llvm-svn: 371100
| |
----------------------------------------
Name: unsigned sub, overflow, v0
%sub = sub i8 %x, %y
%ov = icmp ugt i8 %sub, %x
=>
%agg = usub_overflow i8 %x, %y
%sub = extractvalue {i8, i1} %agg, 0
%ov = extractvalue {i8, i1} %agg, 1
Done: 1
Optimization is correct!
----------------------------------------
Name: unsigned sub, no overflow, v0
%sub = sub i8 %x, %y
%ov = icmp ule i8 %sub, %x
=>
%agg = usub_overflow i8 %x, %y
%sub = extractvalue {i8, i1} %agg, 0
%not.ov = extractvalue {i8, i1} %agg, 1
%ov = xor %not.ov, -1
Done: 1
Optimization is correct!
llvm-svn: 371099
| |
----------------------------------------
Name: unsigned add, overflow, v0
%add = add i8 %x, %y
%ov = icmp ult i8 %add, %x
=>
%agg = uadd_overflow i8 %x, %y
%add = extractvalue {i8, i1} %agg, 0
%ov = extractvalue {i8, i1} %agg, 1
Done: 1
Optimization is correct!
----------------------------------------
Name: unsigned add, overflow, v1
%add = add i8 %x, %y
%ov = icmp ult i8 %add, %y
=>
%agg = uadd_overflow i8 %x, %y
%add = extractvalue {i8, i1} %agg, 0
%ov = extractvalue {i8, i1} %agg, 1
Done: 1
Optimization is correct!
----------------------------------------
Name: unsigned add, no overflow, v0
%add = add i8 %x, %y
%ov = icmp uge i8 %add, %x
=>
%agg = uadd_overflow i8 %x, %y
%add = extractvalue {i8, i1} %agg, 0
%not.ov = extractvalue {i8, i1} %agg, 1
%ov = xor %not.ov, -1
Done: 1
Optimization is correct!
----------------------------------------
Name: unsigned add, no overflow, v1
%add = add i8 %x, %y
%ov = icmp uge i8 %add, %y
=>
%agg = uadd_overflow i8 %x, %y
%add = extractvalue {i8, i1} %agg, 0
%not.ov = extractvalue {i8, i1} %agg, 1
%ov = xor %not.ov, -1
Done: 1
Optimization is correct!
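The equivalence behind these proofs can be illustrated with a small model. This is a sketch only: `uadd_with_overflow` is a hypothetical Python stand-in for the `uadd_overflow` operation used in the proofs, and `idioms_match` checks the `v0` (compare against `%x`) and `v1` (compare against `%y`) variants exhaustively for 8-bit values.

```python
def uadd_with_overflow(x, y, bits=8):
    """Return (wrapped sum, overflow flag) for an unsigned add."""
    mask = (1 << bits) - 1
    wide = x + y
    return wide & mask, wide > mask


def idioms_match(bits=8):
    """Check that the icmp idioms agree with the overflow flag."""
    for x in range(1 << bits):
        for y in range(1 << bits):
            add, ov = uadd_with_overflow(x, y, bits)
            # "overflow" idioms: add u< x (v0) and add u< y (v1)
            if (add < x) != ov or (add < y) != ov:
                return False
            # "no overflow" idioms: add u>= x (v0) and add u>= y (v1)
            if (add >= x) == ov or (add >= y) == ov:
                return False
    return True
```

For example, `uadd_with_overflow(200, 100)` wraps to 44 and sets the overflow flag, and both `44 < 200` and `44 < 100` hold.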
llvm-svn: 371098
| |
If we have:
bb5:
  br i1 %arg3, label %bb6, label %bb7
bb6:
  %tmp = getelementptr inbounds i32, i32* %arg1, i64 2
  store i32 3, i32* %tmp, align 4
  br label %bb9
bb7:
  %tmp8 = getelementptr inbounds i32, i32* %arg1, i64 2
  store i32 3, i32* %tmp8, align 4
  br label %bb9
bb9: ; preds = %bb4, %bb6, %bb7
  ...
We can't sink the stores directly into bb9, because bb9 has another predecessor (%bb4).
This patch creates a new BB that is a successor of %bb6 and %bb7
and sinks the stores into that block.
SplitFooterBB is the pass parameter that controls
this behavior.
Change-Id: I7fdf50a772b84633e4b1b860e905bf7e3e29940f
Differential: https://reviews.llvm.org/D66234
llvm-svn: 371089
| |
Summary:
Avoid visiting an instruction more than once by using a map.
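The general technique can be sketched as a memoized traversal. The pass and its data structures are not named in this excerpt, so everything here is an assumption: `visit`, `diamond_chain`, and `run` are hypothetical, and the "instructions" are plain strings with a dictionary of operands.

```python
def visit(node, operands, cache, counter):
    """Compute a per-node result, evaluating each node at most once."""
    if node in cache:          # already visited: reuse the stored result
        return cache[node]
    counter[0] += 1            # count real (non-cached) visits
    depth = 1 + max(
        (visit(op, operands, cache, counter) for op in operands.get(node, ())),
        default=0,
    )
    cache[node] = depth
    return depth


def diamond_chain(n):
    """Build n stacked diamonds: n_i -> {a_i, b_i} -> n_(i+1)."""
    operands = {}
    for i in range(n):
        operands[f"n{i}"] = (f"a{i}", f"b{i}")
        operands[f"a{i}"] = (f"n{i + 1}",)
        operands[f"b{i}"] = (f"n{i + 1}",)
    return operands


def run(n=10):
    """Return (result for the root, number of nodes actually visited)."""
    counter = [0]
    depth = visit("n0", diamond_chain(n), {}, counter)
    return depth, counter[0]
```

On a chain of ten diamonds the map keeps the work linear in the number of nodes; without it, the number of visits doubles with each diamond.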
This is similar to https://reviews.llvm.org/rL361416.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67198
llvm-svn: 371086
| |
The library currently uses ptrtoint and directly checks the queue ptr
for this, which counts as a pointer capture.
llvm-svn: 371009
| |
Summary: Liveness needs to mark edges, not blocks as dead.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67191
llvm-svn: 370975
| |
Add more test cases simplifying `log()`.
llvm-svn: 370966
| |
Summary:
```
Name: sub(xor(x, y), or(x, y)) -> neg(and(x, y))
%or = or i32 %y, %x
%xor = xor i32 %x, %y
%sub = sub i32 %xor, %or
=>
%sub1 = and i32 %x, %y
%sub = sub i32 0, %sub1
Optimization: sub(xor(x, y), or(x, y)) -> neg(and(x, y))
Done: 1
Optimization is correct!
```
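A brute-force check of the same identity, as a sketch: `fold_holds` and `check` are hypothetical names, 8-bit values are checked exhaustively, and 32-bit values are only sampled with a fixed seed.

```python
import random


def fold_holds(x, y, bits):
    """Compare sub(xor(x, y), or(x, y)) with neg(and(x, y)) at a given width."""
    mask = (1 << bits) - 1
    before = ((x ^ y) - (x | y)) & mask   # %sub = sub %xor, %or
    after = (0 - (x & y)) & mask          # %sub = sub 0, %sub1
    return before == after


def check(samples=10_000, seed=0):
    """Exhaustive for i8, randomly sampled for i32."""
    rng = random.Random(seed)
    exhaustive = all(fold_holds(x, y, 8) for x in range(256) for y in range(256))
    sampled = all(
        fold_holds(rng.getrandbits(32), rng.getrandbits(32), 32)
        for _ in range(samples)
    )
    return exhaustive and sampled
```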
https://rise4fun.com/Alive/8OI
Reviewers: lebedev.ri
Reviewed By: lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67188
llvm-svn: 370945
| |
llvm-svn: 370941
| |
llvm-svn: 370939
| |
Summary:
```
Name: sub(and(x, y), or(x, y)) -> neg(xor(x, y))
%or = or i32 %y, %x
%and = and i32 %x, %y
%sub = sub i32 %and, %or
=>
%sub1 = xor i32 %x, %y
%sub = sub i32 0, %sub1
Optimization: sub(and(x, y), or(x, y)) -> neg(xor(x, y))
Done: 1
Optimization is correct!
```
https://rise4fun.com/Alive/VI6
Found by @lebedev.ri, who is also the author of the proof.
Reviewers: lebedev.ri, spatel
Reviewed By: lebedev.ri
Subscribers: llvm-commits, lebedev.ri
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67155
llvm-svn: 370934
| |
Summary:
Instead of building attributes for internal functions which we do not
update as long as we assume they are dead, we now do not create
attributes until we assume the internal function to be live. This
improves the number of required iterations, as well as the number of
required updates, in real code. On our tests, the results are mixed.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66914
llvm-svn: 370924
| |
Summary:
We create attributes on-demand so we need to check the white list
on-demand. This also unifies the location at which we create,
initialize, and eventually invalidate new abstract attributes.
The tests show mixed results, a few more call site attributes are
determined which can cause more iterations.
Reviewers: uenoku, sstefan1
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66913
llvm-svn: 370922
| |
Summary:
Before, we tried to rule out non-exact definitions early, but that led to
on-demand attributes being created for them anyway. As a consequence we needed
to look at the definition in the initialize of each attribute again.
This patch centralizes this lookup and tightens the condition under
which we give up on non-exact definitions.
Reviewers: uenoku, sstefan1
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67115
llvm-svn: 370917
| |
This would crash:
https://bugs.llvm.org/show_bug.cgi?id=43218
llvm-svn: 370911
| |
The SROA pass processes debug info incorrectly if applied twice.
Specifically, after SROA runs the first time, instcombine converts dbg.declare
intrinsics into dbg.value. Inlining creates new opportunities for SROA,
so it is called again. This time it does not correctly handle the previously
inserted dbg.value intrinsics.
Differential Revision: https://reviews.llvm.org/D64595
llvm-svn: 370906
| |
llvm-svn: 370901
| |
llvm-svn: 370890
| |
llvm-svn: 370888
| |
llvm-svn: 370886
| |
llvm-svn: 370885
| |
Summary:
```
Name: sub or and to xor
%or = or i32 %y, %x
%and = and i32 %x, %y
%sub = sub i32 %or, %and
=>
%sub = xor i32 %x, %y
Optimization: sub or and to xor
Done: 1
Optimization is correct!
```
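One way to see why this fold holds: `xor` keeps the bits where `x` and `y` differ, `and` keeps the bits they share, and these two sets never overlap, so `or` is their carry-free sum. The sketch below checks each step for 8-bit values; `decompose` and `check_or_minus_and` are hypothetical names, and the narrower width is an assumption made to keep the check exhaustive.

```python
def decompose(x, y):
    """Split x | y into the bits that differ and the bits in common."""
    return x ^ y, x & y


def check_or_minus_and(bits=8):
    """Verify or(x, y) - and(x, y) == xor(x, y) via the decomposition."""
    mask = (1 << bits) - 1
    for x in range(mask + 1):
        for y in range(mask + 1):
            differ, common = decompose(x, y)
            assert differ & common == 0                     # disjoint bit sets
            assert (x | y) == differ + common               # so the sum has no carries
            assert ((x | y) - (x & y)) & mask == differ     # the fold itself
    return True
```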
https://rise4fun.com/Alive/eJu
Reviewers: spatel, lebedev.ri
Reviewed By: lebedev.ri
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67153
llvm-svn: 370883
| |
llvm-svn: 370881
| |
llvm-svn: 370878
| |
Reviewers: uenoku, sstefan1
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66833
llvm-svn: 370818
| |
Add the no-capture argument attribute deduction to the Attributor
fixpoint framework.
The new string attribute "no-capture-maybe-returned" is introduced to
allow deduction of no-capture through functions that "capture" an
argument but only by "returning" it. It is only used by the Attributor
for testing.
Differential Revision: https://reviews.llvm.org/D59922
llvm-svn: 370817
| |
This extends the existing logic for propagating constant expressions in a manner analogous to what we do across basic blocks. The core point is that we choose some order of operands and canonicalize uses towards that one.
The heuristic used is inspired by the one used across blocks; in a follow-up change, I plan to common them so that the cross-block version uses the slightly stronger ordering herein.
As noted by the TODOs in the code, there's a good amount of room for improving the existing code and making it more powerful. Some follow-up work is planned.
Differential Revision: https://reviews.llvm.org/D66977
llvm-svn: 370791
| |
https://bugs.llvm.org/show_bug.cgi?id=43206 was filed,
claiming that there is a miscompilation.
Reverting until I investigate.
This reverts commit r370454
llvm-svn: 370788
| |
llvm-svn: 370784
| |
Summary:
Fold-tail currently supports reduction last-vector-value live-outs,
but has yet to support last-scalar-value live-outs, including
non-header phis. As it relies on AllowedExit in order to detect
them and bail out, we need to add the non-header PHI nodes to
AllowedExit; otherwise we end up with miscompiles.
Solves https://bugs.llvm.org/show_bug.cgi?id=43166
Reviewers: fhahn, Ayal
Reviewed By: fhahn, Ayal
Subscribers: anna, hiraditya, rkruppe, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67074
llvm-svn: 370721
| |
Summary: Precommit test case showing miscompile from PR43166.
Reviewers: fhahn, Ayal
Reviewed By: fhahn
Subscribers: rkruppe, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67072
llvm-svn: 370720
| |
bitcast <N x i8> (shuf X, undef, <N, N-1,...0>) to i{N*8} --> bswap (bitcast X to i{N*8})
In PR43146:
https://bugs.llvm.org/show_bug.cgi?id=43146
...we have a more complicated case where SLP is making a mess of bswap. This patch won't
do anything for that currently, but we need to improve bswap recognition in instcombine,
SLP, and/or a standalone pass to avoid that problem.
This is limited using the data-layout so we don't try to do this transform with actual
vector types. The backend does not appear to have folds to convert in either direction,
so we don't want to mess up something that is actually better lowered as a shuffle.
On x86, we're trading something like this:
vmovd %edi, %xmm0
vpshufb LCPI0_0(%rip), %xmm0, %xmm0 ## xmm0 = xmm0[3,2,1,0,u,u,u,u,u,u,u,u,u,u,u,u]
vmovd %xmm0, %eax
For:
movl %edi, %eax
bswapl %eax
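The equivalence itself can be illustrated with a short model. This is a sketch under stated assumptions: vector lanes are modeled as a Python list of byte values, the bitcast is taken to be little-endian, and `bitcast_to_int`, `reverse_shuffle`, `bswap`, and `fold_holds` are hypothetical helper names.

```python
def bitcast_to_int(lanes):
    """bitcast <N x i8> to i{N*8}, assuming little-endian lane order."""
    return int.from_bytes(bytes(lanes), "little")


def reverse_shuffle(lanes):
    """shufflevector with a lane-reversing mask."""
    return lanes[::-1]


def bswap(value, num_bytes):
    """Byte-swap an integer that is num_bytes wide."""
    return int.from_bytes(value.to_bytes(num_bytes, "little"), "big")


def fold_holds(lanes):
    """Shuffle-then-bitcast must equal bitcast-then-bswap."""
    before = bitcast_to_int(reverse_shuffle(lanes))
    after = bswap(bitcast_to_int(lanes), len(lanes))
    return before == after
```

For the lanes `[0x11, 0x22, 0x33, 0x44]`, both sides evaluate to `0x11223344`.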
Differential Revision: https://reviews.llvm.org/D66965
llvm-svn: 370659
| |
ops (PR43188)
As we have already established/fixed in
https://bugs.llvm.org/show_bug.cgi?id=42209
https://reviews.llvm.org/D63065
https://reviews.llvm.org/rL363522
the InstSimplify handling for @llvm.with.overflow ops with undefs
is correct. Therefore if ConstantFolding produces different results,
then it is wrong.
This duplication of code hints at the need for some refactoring,
but for now address the brokenness of ConstantFolding by
copying the known-good handling from rL363522.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43188
llvm-svn: 370608
| |
Summary:
The back-end currently expands mempcpy, but the middle-end should work with memcpy instead of mempcpy to enable more memcpy optimizations.
The GCC backend emits mempcpy, so the LLVM backend could form it too, if we know the mempcpy libcall is better than memcpy + n.
https://godbolt.org/z/dOCG96
Reviewers: efriedma, spatel, craig.topper, RKSimon, jdoerfert
Reviewed By: efriedma
Subscribers: hjl.tools, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65737
llvm-svn: 370593
| |
These tests are based on D19867.
llvm-svn: 370574
| |
Use a { iN undef, i1 false } struct as the base, and only insert
the first operand, instead of using { iN undef, i1 undef } as the
base and inserting both. This is the same as what we do in InstCombine.
Differential Revision: https://reviews.llvm.org/D67034
llvm-svn: 370573
| |
cold versus function being newly added.
This is the second half of https://reviews.llvm.org/D66374.
The profile symbol list is the collection of function symbols showing up in
the binary which generates the current profile. It is used to distinguish
a function that is cold from one that is newly added. The profile symbol list
is only added for profiles in the ExtBinary format.
During profile use compilation, when profile-sample-accurate is enabled,
a function without profile will be regarded as cold only when it is
contained in that list.
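The decision described above can be sketched as a small classifier. The names `classify`, `profile`, and `symbol_list`, and the three result labels, are hypothetical and chosen only for illustration; this is not the actual implementation.

```python
def classify(func, profile, symbol_list, sample_accurate=True):
    """Decide how to treat a function during profile-use compilation.

    profile: names of functions that have samples in the profile.
    symbol_list: names of functions present in the profiled binary.
    """
    if func in profile:
        return "profiled"
    # No samples: the function is cold only if the profiled binary had it.
    if sample_accurate and func in symbol_list:
        return "cold"
    # Otherwise it may be newly added, so nothing is assumed about it.
    return "unknown"
```

For example, with `profile = {"hot_fn"}` and `symbol_list = {"hot_fn", "old_fn"}`, `old_fn` is classified as cold while a function named `new_fn` stays unknown.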
Differential Revision: https://reviews.llvm.org/D66766
llvm-svn: 370563
| |
This is an updated version of https://reviews.llvm.org/D66909 to fix PR42605.
Basically, the current phi translation translates an old value number to a new
value number for a call instruction based on the literal equality of the call
expression, without verifying that there is no clobber in between. This is incorrect.
To get a fine-grained check, use MemoryDependence analysis to do the job. However,
this is still not ideal. Given a call instruction,
`MemoryDependenceResults::getCallDependencyFrom` returns identical call
instructions without a clobber in between, using a MemDepResult whose DepType is
`Def`. However, identical is too strict here and we want it to be relaxed a
little to consider phi-translation: the callee is the same, but param operands can be
different. That means changing the semantics of `MemDepResult::Def` and I don't
know the potential impact.
So currently the patch is still conservative and only handles
MemDepResult::NonFuncLocal, which means the current call has no function-local
clobber. If there is a clobber, even if the clobber doesn't stand in between the
current call and the call with the new value, we won't do phi-translation.
Differential Revision: https://reviews.llvm.org/D67013
llvm-svn: 370547
| |
Summary:
Instead of recomputing information for call sites we now use the
function information directly. This is always valid and once we have
call site specific information we can improve here.
This patch also bootstraps attributes that are created on-demand through
an initial update call. Information that is known will then directly be
available in the new attribute without causing an iteration delay.
The tests show how this improves the iteration count.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66781
llvm-svn: 370480
| |
Summary:
Any pointer could have load/store users, not only floating ones, so we
move the manifest logic for alignment into the AAAlignImpl class.
Reviewers: uenoku, sstefan1
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66922
llvm-svn: 370479
| |
Summary: Add missing tbuffer load intrinsics in SimplifyDemandedVectorElts.
Reviewers: arsenm, nhaehnle
Reviewed By: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66926
llvm-svn: 370475
| |
Summary: This patch adds an appropriate `initialize` method for `AANoAliasCallSiteArgument`.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66927
llvm-svn: 370456
|