for buildbot failures.
llvm-svn: 355461
This uses the infrastructure added in rL353152 to sink zexts and sexts to
sub/add users, to enable vsubl/vaddl generation when NEON is available.
See https://bugs.llvm.org/show_bug.cgi?id=40025.
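As an illustrative sketch (hypothetical function, assuming a NEON-capable target), the pattern this enables looks like:
  define <8 x i16> @vsubl_s8(<8 x i8> %a, <8 x i8> %b) {
    ; with the sexts sunk into the same block as the sub,
    ; ISel can match all three instructions to a single vsubl.s8
    %sa = sext <8 x i8> %a to <8 x i16>
    %sb = sext <8 x i8> %b to <8 x i16>
    %r = sub <8 x i16> %sa, %sb
    ret <8 x i16> %r
  }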
Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma
Reviewed By: samparker
Differential Revision: https://reviews.llvm.org/D58063
llvm-svn: 355460
Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead.
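For example (a minimal sketch in the typed-pointer syntax of the time; %p is hypothetical):
  ; whether lowered inline by the backend or via an __atomic_* libcall,
  ; the pointer is treated as if it were in address space 0
  %v = load atomic i32, i32 addrspace(1)* %p seq_cst, align 4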
Differential Revision: https://reviews.llvm.org/D58760
llvm-svn: 355453
x86-64 is an invalid architecture in triples. Changing it to the correct
triple (x86_64) changes some tests, because SLP is not deemed profitable
any more.
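For illustration, the change amounts to the arch component of the triple (hypothetical RUN lines):
  ; invalid arch - parsed as an unknown target, so no target info is available:
  ; RUN: opt -mtriple=x86-64-unknown-linux -slp-vectorizer -S %s
  ; valid:
  ; RUN: opt -mtriple=x86_64-unknown-linux -slp-vectorizer -S %s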
Reviewers: ABataev, RKSimon, spatel
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D58931
llvm-svn: 355420
A SCEV is not low-cost just because you can divide it by a power of 2. We also need to
check what we are dividing, to make sure it too does not require a high-cost expansion. This helps
avoid expanding the exit value of certain loops, keeping the code from bloating.
The change in no-iv-rewrite.ll is reverting back to what it was testing before rL194116,
and looks a lot like the other tests in replace-loop-exit-folds.ll.
Differential Revision: https://reviews.llvm.org/D58435
llvm-svn: 355393
Add some tests for various loops of the form:
  while (S >= 32) {
    S -= 32;
    something();
  }
  return S;
llvm-svn: 355389
llvm-svn: 355349
The test is reduced from an example in the post-commit thread for:
rL354746
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190304/632396.html
While we must avoid dying here, the real question should be:
Why is non-canonical and/or degenerate code making it to CGP when
using the new pass manager?
llvm-svn: 355345
I'm not too familiar with this pass, so there might be a better
solution, but this appears to fix the degenerate cases:
PR40930
PR40931
PR40932
PR40934
...without affecting any real-world code.
As we've seen in several other passes, when we have unreachable blocks,
they can contain semi-bogus IR and/or cause unexpected conditions. We
would not typically expect these patterns to make it this far, but we
have to guard against them anyway.
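A sketch of the kind of semi-bogus IR that is valid only because the block is unreachable (hypothetical example):
  define i32 @f() {
  entry:
    ret i32 0
  dead:                  ; no predecessors, so dominance holds vacuously
    %x = add i32 %x, 1   ; an instruction may even use its own result here
    br label %dead
  }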
llvm-svn: 355337
Baseline tests for D58881, which fixes part of PR38146.
Patch by Dan Robertson.
llvm-svn: 355328
Fixes PR40838.
llvm-svn: 355301
llvm-svn: 355293
computeKnownBitsFromAssume()
There are no tests for this case, and I'm not sure how it could ever work,
so I'm just removing this option from the matcher. This should fix PR40940:
https://bugs.llvm.org/show_bug.cgi?id=40940
llvm-svn: 355292
operand ordering. NFC
llvm-svn: 355291
llvm-svn: 355277
Follow-up to rL355221.
This isn't specifically called for within PR14613,
but we'll get there eventually if it's not already
requested in some other bug report.
https://rise4fun.com/Alive/5b0
Name: smax
Pre: WillNotOverflowSignedSub(C1,C0)
%a = add nsw i8 %x, C0
%cond = icmp sgt i8 %a, C1
%r = select i1 %cond, i8 %a, i8 C1
=>
%c2 = icmp sgt i8 %x, C1-C0
%u2 = select i1 %c2, i8 %x, i8 C1-C0
%r = add nsw i8 %u2, C0

Name: smin
Pre: WillNotOverflowSignedSub(C1,C0)
%a = add nsw i32 %x, C0
%cond = icmp slt i32 %a, C1
%r = select i1 %cond, i32 %a, i32 C1
=>
%c2 = icmp slt i32 %x, C1-C0
%u2 = select i1 %c2, i32 %x, i32 C1-C0
%r = add nsw i32 %u2, C0
llvm-svn: 355272
llvm-svn: 355271
llvm-svn: 355265
In some cases, MaxBECount can be less precise than ExactBECount for AND
and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are
undef, but the MaxBECounts are different, so we hit the assertion below. This
patch uses the same solution the AND case already uses.
Assertion failed:
((isa<SCEVCouldNotCompute>(ExactNotTaken) || !isa<SCEVCouldNotCompute>(MaxNotTaken))
&& "Exact is not allowed to be less precise than Max"), function ExitLimit
This patch also consolidates test cases for both AND and OR in a single
test case.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245
Reviewers: sanjoy, efriedma, mkazantsev
Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D58853
llvm-svn: 355259
IntrArgMemOnly implies that only memory pointed to by pointer-typed arguments will be accessed. But these intrinsics allow you to pass null as the pointer argument and put the full address into the index argument, which other passes won't be able to understand.
A colleague found that ISPC was creating gathers like this, and dead store elimination then removed some stores because it didn't understand what the gather was doing, since the pointer argument was null.
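A sketch of the problematic pattern (assuming the x86 gather intrinsic signature of the time):
  ; with a null base and the full addresses in the index vector,
  ; IntrArgMemOnly makes it look as if the gather reads nothing that
  ; other memory operations could alias
  %g = call <4 x i32> @llvm.x86.avx2.gather.d.d(<4 x i32> undef, i8* null, <4 x i32> %full.addrs, <4 x i32> %mask, i8 1)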
Differential Revision: https://reviews.llvm.org/D58805
llvm-svn: 355228
This demonstrates dead store elimination removing a store that may alias a gather that uses null as its base.
llvm-svn: 355227
I'm assuming that the NaN propagation logic in InstructionSimplify's handling of fadd and fsub is correct, and applying the same to atomicrmw.
Differential Revision: https://reviews.llvm.org/D58836
llvm-svn: 355222
In the motivating cases from PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
...moving the add enables us to narrow the
min/max, which eliminates the zext/trunc, which
enables significantly better vectorization.
But that bug is still not completely fixed.
https://rise4fun.com/Alive/5KQ
Name: umax
Pre: C1 u>= C0
%a = add nuw i8 %x, C0
%cond = icmp ugt i8 %a, C1
%r = select i1 %cond, i8 %a, i8 C1
=>
%c2 = icmp ugt i8 %x, C1-C0
%u2 = select i1 %c2, i8 %x, i8 C1-C0
%r = add nuw i8 %u2, C0

Name: umin
Pre: C1 u>= C0
%a = add nuw i32 %x, C0
%cond = icmp ult i32 %a, C1
%r = select i1 %cond, i32 %a, i32 C1
=>
%c2 = icmp ult i32 %x, C1-C0
%u2 = select i1 %c2, i32 %x, i32 C1-C0
%r = add nuw i32 %u2, C0
llvm-svn: 355221
llvm-svn: 355220
This patch fixes an issue where we would compute an unnecessarily small alignment during scalar promotion when no store is guaranteed to execute, but we've proven load speculation safety. Since speculating a load requires proving the existing alignment is valid at the new location (see Loads.cpp), we can use the alignment fact from the load.
For non-atomics, this is a performance problem. For atomics, this is a correctness issue, though an *incredibly* rare one to see in practice. For atomics, we might not be able to lower an improperly aligned load or store (i.e. i32 align 1). If such an instruction makes it all the way to codegen, we *may* fail to codegen the operation, or we may simply generate a slow call to a library function.
The part that makes this super hard to see in practice is that the memory location actually *is* well aligned, and instcombine knows that. So, to see a failure, you have to a) hit the bug in LICM, b) somehow hit a depth limit in InstCombine/ValueTracking to avoid fixing the alignment, and c) then have generated an instruction which fails codegen rather than simply emitting a slow libcall. All around, pretty hard to hit.
Differential Revision: https://reviews.llvm.org/D58809
llvm-svn: 355217
llvm-svn: 355215
atomicrmws
llvm-svn: 355212
An idempotent atomicrmw is one that does not change memory in the process of execution. We have already added handling for the various integer operations; this patch extends the same handling to floating point operations which were recently added to IR.
Note: At the moment, we canonicalize idempotent fsub to fadd when ordering requirements prevent us from using a load. As discussed in the review, I will be replacing this with canonicalizing both floating point ops to integer ops in the near future.
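For example (a sketch in the typed-pointer syntax; %p is hypothetical): since x + (-0.0) == x for every float, including NaNs, this rmw never changes memory:
  ; idempotent: can become an atomic load when ordering allows it
  %old = atomicrmw fadd float* %p, float -0.000000e+00 seq_cst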
Differential Revision: https://reviews.llvm.org/D58251
llvm-svn: 355210
Fixing this should solve the biggest part of the vector problems seen in:
https://bugs.llvm.org/show_bug.cgi?id=14613
llvm-svn: 355206
dangling elements in ConstIntInfoVec for new PM
Summary:
ConstIntInfoVec contains elements extracted from the previous function.
In the new PM, releaseMemory() is not called, and the dangling elements can
cause a segfault in findConstantInsertionPoint.
Rename releaseMemory() to cleanup() to convey the idea that it is
mandatory, and call cleanup() in ConstantHoistingPass::runImpl to fix
this.
Reviewers: ormris, zzheng, dmgreen, wmi
Reviewed By: ormris, wmi
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58589
llvm-svn: 355174
This is part of a transform that may be done in the backend:
D13757
...but it should always be beneficial to fold this sooner in IR
for all targets.
https://rise4fun.com/Alive/vaiW
Name: sext add nsw
%add = add nsw i8 %i, C0
%ext = sext i8 %add to i32
%r = add i32 %ext, C1
=>
%s = sext i8 %i to i32
%r = add i32 %s, sext(C0)+C1

Name: zext add nuw
%add = add nuw i8 %i, C0
%ext = zext i8 %add to i16
%r = add i16 %ext, C1
=>
%s = zext i8 %i to i16
%r = add i16 %s, zext(C0)+C1
llvm-svn: 355118
This should have been part of r355110, but my brain isn't quite awake yet, despite the coffee. Per the original submit comment... Doing scalar promotion w/o being able to prove the alignment of the hoisted load or sunk store is a bug. Update tests to actually show the alignment so that the impact of the patch which fixes this can be seen.
llvm-svn: 355111
Doing scalar promotion w/o being able to prove the alignment of the hoisted load or sunk store is a bug. Update tests to actually show the alignment so that the impact of the patch which fixes this can be seen.
llvm-svn: 355110
Second part of D58593.
Compute precise overflow conditions based on all known bits, rather
than just the sign bits. Unsigned a - b overflows iff a < b, and we
can determine whether this always/never happens based on the minimal
and maximal values achievable for a and b subject to the known bits
constraint.
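A worked illustration (hypothetical values, not from the patch):
  %a = or i8 %x, 128    ; known bits give a in [128, 255]
  %b = and i8 %y, 127   ; known bits give b in [0, 127]
  %s = sub i8 %a, %b    ; min(a) = 128 > 127 = max(b), so a < b is
                        ; impossible and the sub can never wrap unsigned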
llvm-svn: 355109
Part of D58593.
Compute precise overflow conditions based on all known bits, rather
than just the sign bits. Unsigned a + b overflows iff a > ~b, and we
can determine whether this always/never happens based on the minimal
and maximal values achievable for a and ~b subject to the known bits
constraint.
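Similarly, a sketch for the add case:
  %a = and i8 %x, 15    ; a in [0, 15]
  %b = and i8 %y, 15    ; ~b in [240, 255], so a > ~b can never hold
  %s = add i8 %a, %b    ; the add therefore never wraps unsigned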
llvm-svn: 355072
Function" and the dependent patch "Refine ArgPromotion metadata handling" as they're causing segfaults in argument promotion.
This reverts commits r354032 and r353537.
llvm-svn: 355060
llvm-svn: 355020
Baseline for D58593.
llvm-svn: 354996
llvm-svn: 354993
Splitting can make sanitizer errors harder to understand, as the
trapping instruction may not be in the function where the bug was
detected.
rdar://48142697
llvm-svn: 354931
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2019-February/130491.html
We can't remove the compare+select in the general case because
we are treating funnel shift like a standard instruction (as
opposed to a special instruction like select/phi).
That means that if one of the operands of the funnel shift is
poison, the result is poison regardless of whether we know that
the operand is actually unused based on the instruction's
particular semantics.
The motivating case for this transform is the more specific
rotate op (rather than funnel shift), and we are preserving the
fold for that case because there is no chance of introducing
extra poison when there is no anonymous extra operand to the
funnel shift.
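A sketch of the distinction:
  ; general funnel shift: %y is an extra operand, so if %y is poison the
  ; result is poison even when the shift amount selects no bits from it
  %f = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %sh)
  ; rotate: both value operands are %x, so there is no anonymous extra
  ; operand that could introduce poison
  %r = call i8 @llvm.fshl.i8(i8 %x, i8 %x, i8 %sh)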
llvm-svn: 354905
Rotate is a special-case of funnel shift that has different
poison constraints than the general case. That's not visible
yet in the existing tests, but it needs to be corrected.
llvm-svn: 354894
Not sure how it happened, but rL354886 was a duplicate of rL354881
that was not updated with rL354887.
llvm-svn: 354889
Yet another pattern variation suggested by:
https://bugs.llvm.org/show_bug.cgi?id=14613
There are 8 more potential commuted patterns here on top of the
8 that were already handled (rL354221, rL354276, rL354393).
We have the obvious commute of the 'add' + commute of the cmp
predicate/operands (ugt/ult) + commute of the select operands:
Name: base
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ult i32 %x, %y
%r = select i1 %c, i32 -1, i32 %a
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a

Name: ugt
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ugt i32 %y, %x
%r = select i1 %c, i32 -1, i32 %a
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a

Name: commute select
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ult i32 %y, %x
%r = select i1 %c, i32 %a, i32 -1
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a

Name: ugt + commute select
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ugt i32 %x, %y
%r = select i1 %c, i32 %a, i32 -1
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a
https://rise4fun.com/Alive/den
llvm-svn: 354887
llvm-svn: 354886
llvm-svn: 354881
EarlyCSE with MemorySSA was able to use this to merge multiple calls
with no intervening store.
llvm-svn: 354814
This requires a couple of tweaks to existing vectorization functions, as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix it's the third argument.
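For example (sketch): the scale operand stays scalar when the data operands are widened:
  ; scalar form - the third operand is the fixed-point scale
  %s = call i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 2)
  ; vector form - data operands widen, the scale remains a scalar i32
  %v = call <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %va, <4 x i32> %vb, i32 2)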
Differential Revision: https://reviews.llvm.org/D58616
llvm-svn: 354790
Baseline tests - fixed-point mul intrinsics aren't flagged as vectorizable yet
llvm-svn: 354783
The icmps are the same as the overflow result of the intrinsic.
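For instance (a sketch for the unsigned-add case):
  %aggr = call { i8, i1 } @llvm.uadd.with.overflow.i8(i8 %a, i8 %b)
  %sum = extractvalue { i8, i1 } %aggr, 0
  %ov = extractvalue { i8, i1 } %aggr, 1
  %cmp = icmp ult i8 %sum, %a   ; an unsigned add wrapped iff sum < a,
                                ; so %cmp is the same value as %ov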
llvm-svn: 354760