| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
| |
The HardwareLoops pass finds exit blocks with a scevable exit count.
If the target specifies to update the loop counter in a register,
through a phi, we need to ensure that the exit block is a latch so
that we can insert the phi with the correct value for the incoming
edge.
Differential Revision: https://reviews.llvm.org/D63336
llvm-svn: 363556
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Avoid that loop vectorizer creates loads/stores of vectors
with "irregular" types when interleaving. An example of
an irregular type is x86_fp80 that is 80 bits, but that
may have an allocation size that is 96 bits. So an array
of x86_fp80 is not bitcast compatible with a vector
of the same type.
Not sure if interleavedAccessCanBeWidened is the best
place for this check, but it solves the problem seen
in the added test case. And it is the same kind of check
that already exists in memoryInstructionCanBeWidened.
Reviewers: fhahn, Ayal, craig.topper
Reviewed By: fhahn
Subscribers: hiraditya, rkruppe, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63386
llvm-svn: 363547
|
| |
|
|
|
|
|
|
|
|
| |
Second functional change following on from rL362687. Pass the
NoWrapFlags from the MulExpr to InsertBinop when we're generating a
shl or mul.
Differential Revision: https://reviews.llvm.org/D61934
llvm-svn: 363540
|
| |
|
|
| |
llvm-svn: 363538
|
| |
|
|
|
|
|
|
|
|
|
| |
Also sink function calls without used results (PR41259)"
Third time's the charm.
This was reverted in r363220 due to being suspected of an internal benchmark
regression and a test failure, none of which turned out to be caused by this.
llvm-svn: 363529
|
| |
|
|
|
|
|
|
|
|
|
| |
SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata
if unreachable switch cases are removed. This patch fixes this bug by making use
of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122).
A new test is created.
Differential Revision: https://reviews.llvm.org/D62186
llvm-svn: 363527
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix folds of addo and subo with an undef operand to be:
`@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`,
as per LLVM undef rules.
Same for commuted variants.
Based on the original version of the patch by @nikic.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 | PR42209 ]]
Differential Revision: https://reviews.llvm.org/D63065
llvm-svn: 363522
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is based on the example/discussion in PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428
Proper vector shift instructions don't appear until AVX2, so we may generate several
extra instructions within a loop trying to compensate for that. It's difficult to
recover from that shift expansion later than this, so use the existing TLI hook and
splat analysis to enable better codegen.
This extends CGP functionality introduced with:
rL201655
Differential Revision: https://reviews.llvm.org/D63233
llvm-svn: 363511
|
| |
|
|
|
|
|
|
|
|
|
| |
If we can detect that saturating math that depends on an IV cannot
overflow, replace it with simple math. This is similar to the CVP
optimization from D62703, just based on a different underlying
analysis (SCEV vs LVI) that catches different cases.
Differential Revision: https://reviews.llvm.org/D62792
llvm-svn: 363489
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shift" (nfc).
Summary:
For icmp pred (and (sh X, Y), C), 0
When C is signbit, expect to fold (X << Y) & signbit ==/!= 0 into (X << Y) >=/< 0,
rather than (X & (signbit >> Y)) != 0.
When C+1 is power of 2, expect to fold (X << Y) & ~C ==/!= 0 into (X << Y) </>= C+1,
rather than (X & (~C >> Y)) == 0.
For icmp pred (and X, (sh signbit, Y)), 0
Expect to fold (X & (signbit l>> Y)) ==/!= 0 into (X << Y) >=/< 0
Expect to fold (X & (signbit << Y)) ==/!= 0 into (X l>> Y) >=/< 0
Reviewers: lebedev.ri, efriedma, spatel, craig.topper
Reviewed By: lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63025
llvm-svn: 363479
|
| |
|
|
|
|
|
|
|
|
|
|
| |
with 'objc_arc_inert'
Those calls are no-ops, so they can be safely deleted.
rdar://problem/49839633
Differential Revision: https://reviews.llvm.org/D62433
llvm-svn: 363468
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a circular dependency between SROA and InferAddressSpaces
today that requires running both multiple times in order to be able to
eliminate all simple allocas and addrspacecasts. InferAddressSpaces
can't remove addrspacecasts when written to memory, and SROA helps
move pointers out of memory.
This should avoid inserting new commuting addrspacecasts with GEPs,
since there are unresolved questions about pointer wrapping between
different address spaces.
For now, don't replace volatile operations that don't match the alloca
addrspace, as it would change the address space of the access. It may
be still OK to insert an addrspacecast from the new alloca, but be
more conservative for now.
llvm-svn: 363462
|
| |
|
|
| |
llvm-svn: 363460
|
| |
|
|
|
|
|
|
|
| |
Reverting because it breaks a green dragon build:
http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208
This reverts r363289 (git commit eb88badff96dacef8fce3f003dec34c2ef6900bf)
llvm-svn: 363427
|
| |
|
|
| |
llvm-svn: 363409
|
| |
|
|
|
|
|
|
| |
I'm not 100% sure about this, since I'm worried about IR transforms
that might end up introducing divergence downstream once replaced with
a constant, but I haven't come up with an example yet.
llvm-svn: 363406
|
| |
|
|
|
|
|
|
|
|
|
|
| |
InsertBinop now accepts NoWrapFlags, so pass them through when
expanding a simple add expression.
This is the first re-commit of the functional changes from rL362687,
which was previously reverted.
Differential Revision: https://reviews.llvm.org/D61934
llvm-svn: 363364
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D63301
llvm-svn: 363339
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Also add baseline tests to show effect of later patches.
There were a couple of regressions here that were never caught,
but my patch set that this is a preparation to will fix them.
This is the third attempt to land this patch.
Differential Revision: https://reviews.llvm.org/D61150
llvm-svn: 363319
|
| |
|
|
| |
llvm-svn: 363291
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.
The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program.
As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.
(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)
Differential Revision: https://reviews.llvm.org/D62939
llvm-svn: 363289
|
| |
|
|
| |
llvm-svn: 363286
|
| |
|
|
| |
llvm-svn: 363285
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The logic in EarlyCSE that looks through 'not' operations in the
predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is
equivalent to `select (cmp sgt X, Y), Y, X`. Without this change,
however, only the latter is recognized as a form of `smin X, Y`, so the
two expressions receive different hash codes. This leads to missed
optimization opportunities when the quadratic probing for the two hashes
doesn't happen to collide, and assertion failures when probing doesn't
collide on insertion but does collide on a subsequent table grow
operation.
This change inverts the order of some of the pattern matching, checking
first for the optional `not` and then for the min/max/abs patterns, so
that e.g. both expressions above are recognized as a form of `smin X, Y`.
It also adds an assertion to isEqual verifying that it implies equal
hash codes; this fires when there's a collision during insertion, not
just grow, and so will make it easier to notice if these functions fall
out of sync again. A new flag --earlycse-debug-hash is added which can
be used when changing the hash function; it forces hash collisions so
that any pair of values inserted which compare as equal but hash
differently will be caught by the isEqual assertion.
Reviewers: spatel, nikic
Reviewed By: spatel, nikic
Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62644
llvm-svn: 363274
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch uses the mechanism from D62995 to strengthen the
definitions of the reduction intrinsics by letting the scalar
result/accumulator type be overloaded from the vector element type.
For example:
; The LLVM LangRef specifies that the scalar result must equal the
; vector element type, but this is not checked/enforced by LLVM.
declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)
This patch changes that into:
declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)
Which has the type-constraint more explicit and causes LLVM to check
the result type with the vector element type.
Reviewers: RKSimon, arsenm, rnk, greened, aemerson
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D62996
llvm-svn: 363240
|
| |
|
|
|
|
|
|
|
| |
TTI should report that it's not profitable to generate a hardware loop
if it, or one of its child loops, has already been converted.
Differential Revision: https://reviews.llvm.org/D63212
llvm-svn: 363234
|
| |
|
|
|
|
|
|
|
| |
This reverts 363226 and 363227, both NFC intended
I swear I fixed the test case that is failing, and ran
the tests, but I will look into it again.
llvm-svn: 363229
|
| |
|
|
|
|
|
|
|
|
|
| |
Also add baseline tests to show effect of later patches.
There were a couple of regressions here that were never caught,
but my patch set that this is a preparation to will fix them.
Differential Revision: https://reviews.llvm.org/D61150
llvm-svn: 363226
|
| |
|
|
|
|
| |
I ran ALL the test suite locally, so I will look into this...
llvm-svn: 363223
|
| |
|
|
|
|
|
|
|
|
| |
see if my changes change anything
Also add baseline tests to show effect of later patches.
Differential Revision: https://reviews.llvm.org/D61150
llvm-svn: 363222
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SinkCommonCodeFromPredecessors ...'
We have observed some failures with internal builds with this revision.
- Performance regressions:
- llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings).
- Benchmarks for Abseil's SwissTable.
- Correctness:
- Failures for particular libicu tests when building the Google AppEngine SDK (for PHP).
hwennborg has already been notified, and is aware of reproducer failures.
llvm-svn: 363220
|
| |
|
|
|
|
| |
common subset, NFC.
llvm-svn: 363218
|
| |
|
|
| |
llvm-svn: 363217
|
| |
|
|
| |
llvm-svn: 363203
|
| |
|
|
| |
llvm-svn: 363193
|
| |
|
|
|
|
|
|
| |
exit rewriting
The issue addressed in r363180 is more broadly relevant. For the moment, we don't actually get any of these cases because we a) restrict SCEV formation due to SCEExpander needing to preserve LCSSA, and b) don't iterate between loops.
llvm-svn: 363192
|
| |
|
|
|
|
|
|
|
|
|
| |
SCEV does not propagate arguments through one-input Phis so as to make it easy for the SCEV expander (and related code) to preserve LCSSA. It's not entirely clear this restriction is neccessary, but for the moment it exists. For this reason, we don't analyze single-entry phi inputs. However it is possible that when an this input leaves the loop through LCSSA Phi, it is a provable constant. Missing that results in an order of optimization issue in loop exit value rewriting where we miss some oppurtunities based on order in which we visit sibling loops.
This patch teaches computeSCEVAtScope about this case. We can generalize it later, but so far we can only replace LCSSA Phis with their constant loop-exiting values. We should probably also add similiar logic directly in the SCEV construction path itself.
Patch by: mkazantsev (with revised commit message by me)
Differential Revision: https://reviews.llvm.org/D58113
llvm-svn: 363180
|
| |
|
|
| |
llvm-svn: 363175
|
| |
|
|
|
|
|
| |
The patch was to fix buildbots, but rL363157 should now be fixing it
in a cleaner way.
llvm-svn: 363174
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was using its own, outdated list of possible captures. This was
at minimum not catching cmpxchg and addrspacecast captures.
One change is now any volatile access is treated as capturing. The
test coverage for this pass is quite inadequate, but this required
removing volatile in the lifetime capture test.
Also fixes some infrastructure issues to allow running just the IR
pass.
Fixes bug 42238.
llvm-svn: 363169
|
| |
|
|
|
|
|
|
|
| |
This changes the standalone pass only. Arguably the utility class
itself should assert there are no convergent calls. However, a target
pass with additional context may still be able to version a loop if
all of the dynamic conditions are sufficiently uniform.
llvm-svn: 363165
|
| |
|
|
| |
llvm-svn: 363163
|
| |
|
|
| |
llvm-svn: 363162
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This case is slightly tricky, because loop distribution should be
allowed in some cases, and not others. As long as runtime dependency
checks don't need to be introduced, this should be OK. This is further
complicated by the fact that LoopDistribute partially ignores if LAA
says that vectorization is safe, and then does its own runtime pointer
legality checks.
Note this pass still does not handle noduplicate correctly, as this
should always be forbidden with it. I'm not going to bother trying to
fix it, as it would require more effort and I think noduplicate should
be removed.
https://reviews.llvm.org/D62607
llvm-svn: 363160
|
| |
|
|
|
|
|
|
|
| |
I broke 2 of these with a patch, but were not covered by existing
tests.
https://reviews.llvm.org/D63035
llvm-svn: 363158
|
| |
|
|
|
|
|
| |
Set Transforms/HardwareLoops/ARM/ tests as unsupported if there isn't
an arm target.
llvm-svn: 363157
|
| |
|
|
|
|
| |
Adding REQUIRES: arm to failing tests
llvm-svn: 363156
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement the backend target hook to drive the HardwareLoops pass.
The low-overhead branch extension for Arm M-class cores is flexible
enough that we don't have to ensure correctness at this point, except
checking that the loop counter variable can be stored in LR - a
32-bit register. For it to be profitable, we want to avoid loops that
contain function calls, or any other instruction that alters the PC.
This implementation uses TargetLoweringInfo, to query type and
operation actions, looks at intrinsic calls and also performs some
manual checks for remainder/division and FP operations.
I think this should be a good base to start and extra details can be
filled out later.
Differential Revision: https://reviews.llvm.org/D62907
llvm-svn: 363149
|
| |
|
|
|
|
|
|
|
| |
through loop even after completion"
This reverts commit 1a0f7a2077b70c9864faa476e15b048686cf1ca7.
See phabricator thread for D60831.
llvm-svn: 363132
|
| |
|
|
|
|
|
|
| |
We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases.
Differential Revision: https://reviews.llvm.org/D63112
llvm-svn: 363108
|