| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
performance-unnecessary-value-param.
Contains some manual fixes. No functionality change intended.
llvm-svn: 273047
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is still NFCI, so the list of clients that allow symbolic stride
speculation does not change (yes: LV and LoopVersioningLICM, no: LLE,
LDist). However since the symbolic strides are now managed by LAA
rather than passed by client a new bool parameter is used to enable
symbolic stride speculation.
The existing test Transforms/LoopVectorize/version-mem-access.ll checks
that stride speculation is performed for LV.
The previously added test Transforms/LoopLoadElim/symbolic-stride.ll
ensures that no speculation is performed for LLE.
The next patch will change the functionality and turn on symbolic stride
speculation in all of LAA's clients and remove the bool parameter.
llvm-svn: 272970
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Turns out SymbolicStrides is actually used in canVectorizeWithIfConvert
before it gets set up in canVectorizeMemory.
This works fine as long as SymbolicStrides resides in LV since we just
have an empty map. Based on this the conclusion is made that there are
no symbolic strides which is conservatively correct.
However once SymbolicStrides becomes part of LAI, LAI is nullptr at this
point so we need to differentiate the uninitialized state by returning a
nullptr for SymbolicStrides.
llvm-svn: 272966
|
|
|
|
|
|
| |
Patch written by Reid. I verified it locally with clang.
llvm-svn: 272875
|
|
|
|
|
|
|
|
|
| |
LoopVectorizationLegality holds a constant reference to LAI, so this
will have to be const as well.
Also added missed function comment.
llvm-svn: 272851
|
|
|
|
|
|
| |
This should help moving Strides to LAA later.
llvm-svn: 272796
|
|
|
|
|
|
| |
LoopVectorizationLegality::strides_begin/end are also unused.
llvm-svn: 272781
|
|
|
|
|
|
| |
LoopVectorizationLegality::mustCheckStrides is unused.
llvm-svn: 272780
|
|
|
|
|
|
|
|
|
|
|
|
| |
The error on clang-x86-win2008-selfhost is:
C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(955) : error C2248: 'llvm::slpvectorizer::BoUpSLP::ScheduleData' : cannot access private struct declared in class 'llvm::slpvectorizer::BoUpSLP'
C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(608) : see declaration of 'llvm::slpvectorizer::BoUpSLP::ScheduleData'
C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(337) : see declaration of 'llvm::slpvectorizer::BoUpSLP'
I reproduced this locally with both MSVC 2013 and MSVC 2015.
llvm-svn: 272772
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This wasn't failing for me with clang as the compiler. I think GCC may
disagree with clang about whether a friend declaration introduces a
declaration in the enclosing namespace (or something).
Example error:
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:950:77: error: ‘llvm::raw_ostream& llvm::slpvectorizer::operator<<(llvm::raw_ostream&, const llvm::slpvectorizer::BoUpSLP::ScheduleData&)’ should have been declared inside ‘llvm::slpvectorizer’
const BoUpSLP::ScheduleData &SD) {
^
llvm-svn: 272767
|
|
|
|
|
|
|
|
|
|
|
| |
This uses the "runImpl" approach to share code with the old PM.
Porting to the new PM meant abandoning the anonymous namespace enclosing
most of SLPVectorizer.cpp which is a bit of a bummer (but not a big deal
compared to having to pull the pass class into a header which the new PM
requires since it calls the constructor directly).
llvm-svn: 272766
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r272715 broke libcxx because it did not correctly handle cases where the
last iteration of one IV is the second-to-last iteration of another.
Original commit message:
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.
llvm-svn: 272742
|
|
|
|
| |
llvm-svn: 272730
|
|
|
|
|
|
|
|
|
|
|
| |
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.
Differential Revision: http://reviews.llvm.org/D21048
llvm-svn: 272715
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D21090
llvm-svn: 272294
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we materialized secondary vector IVs from the primary scalar IV,
by offseting the primary to match the correct start value, and then broadcasting
it - inside the loop body. Instead, we can use a real vector IV, like we do for
the primary.
This enables using vector IVs for secondary integer IVs whose type matches the
type of the primary.
Differential Revision: http://reviews.llvm.org/D20932
llvm-svn: 272283
|
|
|
|
| |
llvm-svn: 272243
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes PR27617.
Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64.
The patch includes a simple relaxation of the assert to allow constants being of different types, along with a regression test that will provoke the unrelaxed assert.
Reviewers: nadav, mzolotukhin
Subscribers: JesperAntonsson, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D20685
Patch by Jesper Antonsson!
llvm-svn: 272206
|
|
|
|
|
|
|
|
| |
This is the preparation patch to port the analysis to new PM
Differential Revision: http://reviews.llvm.org/D20560
llvm-svn: 272194
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, whenever we needed a vector IV, we would create it on the fly,
by splatting the scalar IV and adding a step vector. Instead, we can create a
real vector IV. This tends to save a couple of instructions per iteration.
This only changes the behavior for the most basic case - integer primary
IVs with a constant step.
Differential Revision: http://reviews.llvm.org/D20315
llvm-svn: 271410
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes bug https://llvm.org/bugs/show_bug.cgi?id=27897.
When query memory access cost, current SLP always passes in alignment value of 1 (unaligned), so it gets a very high cost of scalar memory access, and wrongly vectorize memory loads in the test case.
It can be fixed by simply giving correct alignment.
llvm-svn: 271333
|
|
|
|
|
|
|
|
|
|
| |
operands.
This fixes PR27879.
Differential Revision: http://reviews.llvm.org/D20659
llvm-svn: 270888
|
|
|
|
| |
llvm-svn: 270760
|
|
|
|
| |
llvm-svn: 270579
|
|
|
|
| |
llvm-svn: 270113
|
|
|
|
|
|
|
|
|
|
| |
In truncateToMinimalBitwidths() we were RAUW'ing an instruction then erasing it. However, that intruction could be cached in the map we're iterating over. The first check is "I->use_empty()" which in most cases would return true, as the (deleted) object was RAUW'd first so would have zero use count. However in some cases the object could have been polluted or written over and this wouldn't be the case. Also it makes valgrind, asan and traditionalists who don't like their compiler to crash sad.
No testcase as there are no externally visible symptoms apart from a crash if the stars align.
Fixes PR26509.
llvm-svn: 269908
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The selection of the vectorization factor currently doesn't consider
interleaved accesses. The vectorization factor is based on the maximum safe
dependence distance computed by LAA. However, for loops with interleaved
groups, we should instead base the vectorization factor on the maximum safe
dependence distance divided by the maximum interleave factor of all the
interleaved groups. Interleaved accesses not in a group will be scalarized.
Differential Revision: http://reviews.llvm.org/D20241
llvm-svn: 269659
|
|
|
|
| |
llvm-svn: 269482
|
|
|
|
|
|
| |
Split FCMP//ICMP/SEL from the basic arithmetic cost functions. They were not sharing any notable code path (just the return) and were repeatedly testing the opcode.
llvm-svn: 269348
|
|
|
|
|
|
|
|
|
|
|
| |
LoopVectorBody was changed from a single pointer to a SmallVector when
store predication was introduced in r200270. Since r247139, store predication
no longer splits the vector loop body in-place, so we can go back to having
a single LoopVectorBody block.
This reverts the no-longer-needed changes from r200270.
llvm-svn: 269321
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow vectorization when the step is a loop-invariant variable.
This is the loop example that is getting vectorized after the patch:
int int_inc;
int bar(int init, int *restrict A, int N) {
int x = init;
for (int i=0;i<N;i++){
A[i] = x;
x += int_inc;
}
return x;
}
"x" is an induction variable with *loop-invariant* step.
But it is not a primary induction. Primary induction variable with non-constant step is not handled yet.
Differential Revision: http://reviews.llvm.org/D19258
llvm-svn: 269023
|
|
|
|
|
|
|
| |
Changing misleading function name was approved in http://reviews.llvm.org/D17268.
Patch by Roman Shirokiy.
llvm-svn: 269021
|
|
|
|
| |
llvm-svn: 268655
|
|
|
|
| |
llvm-svn: 268634
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Some PHIs can have expressions that are not AddRecExprs due to the presence
of sext/zext instructions. In order to prevent the Loop Vectorizer from
bailing out when encountering these PHIs, we now coerce the SCEV
expressions to AddRecExprs using SCEV predicates (when possible).
We only do this when the alternative would be to not vectorize.
Reviewers: mzolotukhin, anemet
Subscribers: mssimpso, sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D17153
llvm-svn: 268633
|
|
|
|
|
|
|
|
| |
This moves the validation of PHI inductions into a
separate method, making it easier to reuse this
logic.
llvm-svn: 268632
|
|
|
|
| |
llvm-svn: 268583
|
|
|
|
|
|
|
| |
SLPVectorizing a call site should result in further propagation of its
bundles.
llvm-svn: 268004
|
|
|
|
|
|
|
| |
Also, do not crash when calculating a cost model for loop-invariant
token values.
llvm-svn: 268003
|
|
|
|
|
|
|
| |
We don't preserve AAResults, because, for one, we don't preserve SCEV-AA.
That fixes PR25281.
llvm-svn: 267980
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to keep loop hints from the original loop on the new vector loop.
Failure to do this meant that, for example:
void foo(int *b) {
#pragma clang loop unroll(disable)
for (int i = 0; i < 16; ++i)
b[i] = 1;
}
this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the
hints that unrolling should be disabled, and then we'd unroll it.
llvm-svn: 267970
|
|
|
|
|
|
|
|
| |
The refactoring portion part was done as r267748.
http://reviews.llvm.org/D14185
llvm-svn: 267899
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We previously disallowed interleaved load groups that may cause us to
speculatively access memory out-of-bounds (r261331). We did this by ensuring
each load group had an access corresponding to the first and last member.
Instead of bailing out for these interleaved groups, this patch enables us to
peel off the last vector iteration, ensuring that we execute at least one
iteration of the scalar remainder loop. This solution was proposed in the
review of the previous patch.
Differential Revision: http://reviews.llvm.org/D19487
llvm-svn: 267751
|
|
|
|
|
|
|
|
|
| |
This is the first of two commits for extending SLP Vectorizer to deal with aggregates.
This commit merely refactors existing logic.
http://reviews.llvm.org/D14185
llvm-svn: 267748
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.
Differential Revision: http://reviews.llvm.org/D18523
llvm-svn: 267725
|
|
|
|
|
|
|
|
| |
Fixed a bug in loop vectorization with conditional store.
Differential Revision: http://reviews.llvm.org/D19532
llvm-svn: 267597
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
marked parallel loops
I really thought we were doing this already, but we were not. Given this input:
void Test(int *res, int *c, int *d, int *p) {
for (int i = 0; i < 16; i++)
res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}
we did not vectorize the loop. Even with "assume_safety" the check that we
don't if-convert conditionally-executed loads (to protect against
data-dependent deferenceability) was not elided.
One subtlety: As implemented, it will still prefer to use a masked-load
instrinsic (given target support) over the speculated load. The choice here
seems architecture specific; the best option depends on how expensive the
masked load is compared to a regular load. Ideally, using the masked load still
reduces unnecessary memory traffic, and so should be preferred. If we'd rather
do it the other way, flipping the order of the checks is easy.
The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also
implies that if conversion is okay.
Differential Revision: http://reviews.llvm.org/D19512
llvm-svn: 267514
|
|
|
|
|
|
|
|
|
|
| |
support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).
Differential Revision: http://reviews.llvm.org/D19172
llvm-svn: 267231
|
|
|
|
|
|
|
|
| |
This reverts commit r267022, due to an ASan failure:
http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549
llvm-svn: 267115
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.
The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.
The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way.
Differential Revision: http://reviews.llvm.org/D19172
llvm-svn: 267022
|