| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change teaches EarlyCSE some basic properties of guard intrinsics:
- Guard intrinsics read all memory, but don't write to any memory
- After a guard has executed, the condition it was guarding on can be
assumed to be true
- Guard intrinsics on a constant `true` are no-ops
Reviewers: reames, hfinkel
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D19578
llvm-svn: 268120
|
|
|
|
| |
llvm-svn: 268116
|
|
|
|
|
|
|
|
| |
elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.
llvm-svn: 268115
|
|
|
|
|
|
|
|
| |
The implemented heuristic has a large body of code which better sits
in its own function for better readability. It also allows adding more
heuristics easier in the future.
llvm-svn: 268107
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19550
llvm-svn: 268104
|
|
|
|
| |
llvm-svn: 268099
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes two somewhat related bugs in MemorySSA's caching
walker. These bugs were found because D19695 brought up the problem
that we'd have defs cached to themselves, which is incorrect.
The bugs this fixes are:
- We would sometimes skip the nearest clobber of a MemoryAccess, because
we would query our cache for a given potential clobber before
checking if the potential clobber is the clobber we're looking for.
The cache entry for the potential clobber would point to the nearest
clobber *of the potential clobber*, so if that was a cache hit, we'd
ignore the potential clobber entirely.
- There are times (sometimes in DFS, sometimes in the getClobbering...
functions) where we would insert cache entries that say a def
clobbers itself.
There's a bit of common code between the fixes for the bugs, so they
aren't split out into multiple commits.
This patch also adds a few unit tests, and refactors existing tests a
bit to reduce the duplication of setup code.
llvm-svn: 268087
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Callee name is not used to identify a callsite now, so do not read it during annotation.
Reviewers: davidxl, dnovillo
Subscribers: dnovillo, danielcdh, llvm-commits
Differential Revision: http://reviews.llvm.org/D19704
llvm-svn: 268069
|
|
|
|
|
|
|
| |
As suggested in http://reviews.llvm.org/D17859 , we should enhance this
to support vectors.
llvm-svn: 268059
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the cmake build to enable them.
Summary:
Historically, we had a switch in the Makefiles for turning on "expensive
checks". This has never been ported to the cmake build, but the
(dead-ish) code is still around.
This will also make it easier to turn it on in buildbots.
Reviewers: chandlerc
Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits
Differential Revision: http://reviews.llvm.org/D19723
llvm-svn: 268050
|
|
|
|
|
|
|
| |
We neglected to transfer operand bundles for some transforms. These
were found via inspection, I'll try to come up with some test cases.
llvm-svn: 268011
|
|
|
|
|
|
|
| |
We neglected to transfer operand bundles for some transforms. These
were found via inspection, I'll try to come up with some test cases.
llvm-svn: 268010
|
|
|
|
|
|
|
| |
We neglected to transfer operand bundles when performing argument
promotion.
llvm-svn: 268008
|
|
|
|
|
|
|
| |
The option -Rpass=loop-distribute now reports the loops that were
distributed.
llvm-svn: 268006
|
|
|
|
|
|
| |
Next patch will add another use for 'Function' inside the class.
llvm-svn: 268005
|
|
|
|
|
|
|
| |
SLPVectorizing a call site should result in further propagation of its
bundles.
llvm-svn: 268004
|
|
|
|
|
|
|
| |
Also, do not crash when calculating a cost model for loop-invariant
token values.
llvm-svn: 268003
|
|
|
|
|
|
|
|
|
| |
We neglected to transfer operand bundles when performing argument
promotion.
This fixes PR27568.
llvm-svn: 267986
|
|
|
|
|
|
|
| |
We don't preserve AAResults, because, for one, we don't preserve SCEV-AA.
That fixes PR25281.
llvm-svn: 267980
|
|
|
|
|
|
|
|
| |
Reviewers: eugenis
Differential Revision: http://reviews.llvm.org/D19706
llvm-svn: 267974
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to keep loop hints from the original loop on the new vector loop.
Failure to do this meant that, for example:
void foo(int *b) {
#pragma clang loop unroll(disable)
for (int i = 0; i < 16; ++i)
b[i] = 1;
}
this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the
hints that unrolling should be disabled, and then we'd unroll it.
llvm-svn: 267970
|
|
|
|
|
|
| |
This handles SSE and SSE2 cmp_* and comiXX_* intrinsics.
llvm-svn: 267966
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I closely followed the precedents set by the vectorizer:
* With -Rpass-missed, the loop is reported with further details pointing
to -Rpass--analysis.
* -Rpass-analysis reports the details why distribution has failed.
* Regardless of -Rpass*, when distribution fails for a loop where
distribution was forced with the pragma, a warning is produced according
to -Wpass-failed. In this case the analysis info is also printed even
without -Rpass-analysis.
llvm-svn: 267952
|
|
|
|
|
|
|
|
|
|
|
| |
The next patch will start using these for -Rpass-analysis so they won't
be internal-only anymore.
Move the 'Skipping; ' prefix that some of the message are using into the
'fail' function. We don't want to include this prefix in
the -Rpass-analysis report.
llvm-svn: 267951
|
|
|
|
|
|
| |
This will form the basis to emit optimization remarks (-Rpass*).
llvm-svn: 267950
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When inlining a call site with llvm.mem.parallel_loop_access metadata, this
metadata needs to be propagated to all cloned memory-accessing instructions.
Otherwise, inlining parts of the loop body will invalidate the annotation.
With this functionality, we now vectorize the following as expected:
void Body(int *res, int *c, int *d, int *p, int i) {
res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}
void Test(int *res, int *c, int *d, int *p, int n) {
int i;
#pragma clang loop vectorize(assume_safety)
for (i = 0; i < 1600; i++) {
Body(res, c, d, p, i);
}
}
llvm-svn: 267949
|
|
|
|
|
|
|
| |
Should not store Twine objects to local variables. This is fixed the test
failures with r267815 in VS2015 X64 build.
llvm-svn: 267908
|
|
|
|
| |
llvm-svn: 267905
|
|
|
|
|
|
|
|
| |
The refactoring portion part was done as r267748.
http://reviews.llvm.org/D14185
llvm-svn: 267899
|
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D18828
Patch by Aditya Kumar!
llvm-svn: 267898
|
|
|
|
|
|
| |
Made in preparation for adding MemorySSA support to EarlyCSE.
llvm-svn: 267893
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: mcrosier
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D19617
llvm-svn: 267890
|
|
|
|
|
|
| |
r267873.
llvm-svn: 267887
|
|
|
|
|
|
|
|
|
|
| |
The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits.
This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero.
Differential Revision: http://reviews.llvm.org/D19614
llvm-svn: 267873
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the transformation that promotes indirect calls to
conditional direct calls when the indirect-call value profile meta-data is
available.
Differential Revision: http://reviews.llvm.org/D17864
llvm-svn: 267815
|
|
|
|
|
|
|
|
| |
There's no existing test for this path, and I don't know how to expose
it in a regression test, but I'm assuming there's some reason this
path exists.
llvm-svn: 267813
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19515
llvm-svn: 267792
|
|
|
|
|
|
|
|
|
| |
"inferattrs" will deduce the attribute, but it will be too late for
many optimizations. Set it ourselves when creating the call.
Differential Revision: http://reviews.llvm.org/D17598
llvm-svn: 267762
|
|
|
|
| |
llvm-svn: 267761
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19471
llvm-svn: 267760
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now the pass is just a tiny wrapper around the util. This lets us reuse
the logic elsewhere (done here for BuildLibCalls) instead of duplicating
it.
The next step is to have something like getOrInsertLibFunc that also
sets the attributes.
Differential Revision: http://reviews.llvm.org/D19470
llvm-svn: 267759
|
|
|
|
|
|
|
|
|
| |
I tried to be as close as possible to the strongest check that
existed before; cleaning these up properly is left for future work.
Differential Revision: http://reviews.llvm.org/D19469
llvm-svn: 267758
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We previously disallowed interleaved load groups that may cause us to
speculatively access memory out-of-bounds (r261331). We did this by ensuring
each load group had an access corresponding to the first and last member.
Instead of bailing out for these interleaved groups, this patch enables us to
peel off the last vector iteration, ensuring that we execute at least one
iteration of the scalar remainder loop. This solution was proposed in the
review of the previous patch.
Differential Revision: http://reviews.llvm.org/D19487
llvm-svn: 267751
|
|
|
|
|
|
|
|
|
| |
This is the first of two commits for extending SLP Vectorizer to deal with aggregates.
This commit merely refactors existing logic.
http://reviews.llvm.org/D14185
llvm-svn: 267748
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.
Differential Revision: http://reviews.llvm.org/D18523
llvm-svn: 267725
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Refine the workaround from r266877 that attempts to prevent
renaming of locals in inline assembly, so that in addition to looking
for a llvm.used local value, that there is at least one inline assembly
call in the module. Otherwise, debug functions added to the llvm.used
can block importing/exporting unnecessarily.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D19573
llvm-svn: 267717
|
|
|
|
|
|
|
|
|
|
| |
This is required to use this function from isSafeToSpeculativelyExecute
Reviewed By: hfinkel
Differential Revision: http://reviews.llvm.org/D16231
llvm-svn: 267692
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
D19403 adds a new pragma for loop distribution. This change adds
support for the corresponding metadata that the pragma is translated to
by the FE.
As part of this I had to rethink the flag -enable-loop-distribute. My
goal was to be backward compatible with the existing behavior:
A1. pass is off by default from the optimization pipeline
unless -enable-loop-distribute is specified
A2. pass is on when invoked directly from opt (e.g. for unit-testing)
The new pragma/metadata overrides these defaults so the new behavior is:
B1. A1 + enable distribution for individual loop with the pragma/metadata
B2. A2 + disable distribution for individual loop with the pragma/metadata
The default value whether the pass is on or off comes from the initiator
of the pass. From the PassManagerBuilder the default is off, from opt
it's on.
I moved -enable-loop-distribute under the pass. If the flag is
specified it overrides the default from above.
Then the pragma/metadata can further modifies this per loop.
As a side-effect, we can now also use -enable-loop-distribute=0 from opt
to emulate the default from the optimization pipeline. So to be precise
this is the new behavior:
C1. pass is off by default from the optimization pipeline
unless -enable-loop-distribute or the pragma/metadata enables it
C2. pass is on when invoked directly from opt
unless -enable-loop-distribute=0 or the pragma/metadata disables it
Reviewers: hfinkel
Subscribers: joker.eph, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D19431
llvm-svn: 267672
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
cloneLoopWithPreheader() does not update LoopInfo for sub-loop of
the original loop being cloned. Add assert to ensure no sub-loops for loop being cloned.
Reviewers: anemet, ashutosh.nema, hfinkel
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D15922
llvm-svn: 267671
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
It is incorrect to compare TripCount (which is BECount + 1)
with extraiters (or Count) to check if we should enter unrolled
loop or not, because TripCount can potentially overflow
(when BECount is max unsigned integer).
While comparing BECount with (Count - 1) is overflow safe and
therefore correct.
Reviewer: hfinkel
Differential Revision: http://reviews.llvm.org/D19256
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 267662
|