| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
A sequence of additions or multiplications that is known not to wrap, may wrap
if it's order is changed (i.e., reassociated). Therefore when vectorizing
integer sum or product reductions, their no-wrap flags need to be removed.
Fixes PR43828
Patch by Denis Antrushin
Differential Revision: https://reviews.llvm.org/D69563
|
|
|
|
|
|
|
|
| |
The reversion apparently deleted the test/Transforms directory.
Will be re-reverting again.
llvm-svn: 358552
|
|
|
|
|
|
|
|
| |
As it's causing some bot failures (and per request from kbarton).
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.
llvm-svn: 358546
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a fix for PR23997.
The loop vectorizer is not preserving the inbounds property of GEPs that it creates.
This is inhibiting some optimizations. This patch preserves the inbounds property in
the case where a load/store is being fed by an inbounds GEP.
Reviewers: mkuper, javed.absar, hsaito
Reviewed By: hsaito
Subscribers: dcaballe, hsaito, llvm-commits
Differential Revision: https://reviews.llvm.org/D46191
llvm-svn: 331269
|
|
|
|
|
|
|
|
|
|
|
| |
When ignoring a load that participates in an interleaved group, make sure to
move a cast that needs to sink after it.
Testcase derived from reproducer of PR34743.
Differential Revision: https://reviews.llvm.org/D38338
llvm-svn: 314986
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generate a single test to decide if there are enough iterations to jump to the
vectorized loop, or else go to the scalar remainder loop. This test compares the
Scalar Trip Count: if STC < VF * UF go to the scalar loop. If
requiresScalarEpilogue() holds, at-least one iteration must remain scalar; the
rest can be used to form vector iterations. So in this case the test checks
instead if (STC - 1) < VF * UF by comparing STC <= VF * UF, and going to the
scalar loop if so. Otherwise the vector loop is entered for at-least one vector
iteration.
This test covers the case where incrementing the backedge-taken count will
overflow leading to an incorrect trip count of zero. In this (rare) case we will
also avoid the vector loop and jump to the scalar loop.
This patch simplifies the existing tests and effectively removes the basic-block
originally named "min.iters.checked", leaving the single test in block
"vector.ph".
Original observation and initial patch by Evgeny Stupachenko.
Differential Revision: https://reviews.llvm.org/D34150
llvm-svn: 308421
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For uniform instructions, we're only required to generate a scalar value for
the first vector lane of each unroll iteration. Thus, if we have a reverse
interleaved group, computing the member index off the scalar GEP corresponding
to the last vector lane of its pointer operand technically makes the GEP
non-uniform. We should compute the member index off the first scalar GEP
instead.
I've added the updated member index computation to the existing reverse
interleaved group test.
llvm-svn: 280497
|
|
|
|
|
|
|
| |
Reasoning about a select in terms of a min or max allows us to derive a
tigher bound on the result.
llvm-svn: 277914
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The interleaved access analysis currently assumes that the inserted run-time
pointer aliasing checks ensure the absence of dependences that would prevent
its instruction reordering. However, this is not the case.
Issues can arise from how code generation is performed for interleaved groups.
For a load group, all loads in the group are essentially moved to the location
of the first load in program order, and for a store group, all stores in the
group are moved to the location of the last store. For groups having members
involved in a dependence relation with any other instruction in the loop, this
reordering can violate the dependence.
This patch teaches the interleaved access analysis how to avoid breaking such
dependences, and should fix PR27626.
An assumption of the original analysis was that the accesses had been collected
in "program order". The analysis was then simplified by visiting the accesses
bottom-up. However, this ordering was never guaranteed for anything other than
single basic block loops. Thus, this patch also enforces the desired ordering.
Reference: https://llvm.org/bugs/show_bug.cgi?id=27626
Differential Revision: http://reviews.llvm.org/D19984
llvm-svn: 273687
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We previously disallowed interleaved load groups that may cause us to
speculatively access memory out-of-bounds (r261331). We did this by ensuring
each load group had an access corresponding to the first and last member.
Instead of bailing out for these interleaved groups, this patch enables us to
peel off the last vector iteration, ensuring that we execute at least one
iteration of the scalar remainder loop. This solution was proposed in the
review of the previous patch.
Differential Revision: http://reviews.llvm.org/D19487
llvm-svn: 267751
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If we don't have the first and last access of an interleaved load group,
the first and last wide load in the loop can do an out of bounds
access. Even though we discard results from speculative loads,
this can cause problems, since it can technically generate page faults
(or worse).
We now discard interleaved load groups that don't have the first and
load in the group.
Reviewers: hfinkel, rengolin
Subscribers: rengolin, llvm-commits, mzolotukhin, anemet
Differential Revision: http://reviews.llvm.org/D17332
llvm-svn: 261331
|
|
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
a = A[i]; // load of even element
b = A[i+1]; // load of odd element
... // operations on a, b, c, d
A[i] = c; // store of even element
A[i+1] = d; // store of odd element
}
The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
%wide.vec = load <8 x i32>, <8 x i32>* %ptr
%vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
%vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
%interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'.
llvm-svn: 239291
|