| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
loop preheader
Summary:
This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.
Reviewers: sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13974
llvm-svn: 251839
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Support for necessary linkage changes and symbol renaming during
ThinLTO function importing.
Also includes llvm-link support for manually importing functions
and associated llvm-link based tests.
Note that this does not include support for intelligently importing
metadata, which is currently imported duplicate times. That support will
be in the follow-on patch, and currently is ignored by the tests.
Reviewers: dexonsmith, joker.eph, davidxl
Subscribers: tobiasvk, tejohnson, llvm-commits
Differential Revision: http://reviews.llvm.org/D13515
llvm-svn: 251837
|
|
|
|
| |
llvm-svn: 251823
|
|
|
|
|
|
|
| |
I had clang formatted my earlier patches using the wrong style.
Reformatted with the LLVM style.
llvm-svn: 251812
|
|
|
|
|
|
|
|
| |
Reviewed By: hfinkel
Differential Revision: http://reviews.llvm.org/D13953
llvm-svn: 251809
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
SCEV Predicates represent conditions that typically cannot be derived from
static analysis, but can be used to reduce SCEV expressions to forms which are
usable for different optimizers.
ScalarEvolution now has the rewriteUsingPredicate method which can simplify a
SCEV expression using a SCEVPredicateSet. The normal workflow of a pass using
SCEVPredicates would be to hold a SCEVPredicateSet and every time assumptions
need to be made a new SCEV Predicate would be created and added to the set.
Each time after calling getSCEV, the user will call the rewriteUsingPredicate
method.
We add two types of predicates
SCEVPredicateSet - implements a set of predicates
SCEVEqualPredicate - tests for equality between two SCEV expressions
We use the SCEVEqualPredicate to re-implement stride versioning. Every time we
version a stride, we will add a SCEVEqualPredicate to the context.
Instead of adding specific stride checks, LoopVectorize now adds a more
generic SCEV check.
We only need to add support for this in the LoopVectorizer since this is the
only pass that will do stride versioning.
Reviewers: mzolotukhin, anemet, hfinkel, sanjoy
Subscribers: sanjoy, hfinkel, rengolin, jmolloy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13595
llvm-svn: 251800
|
|
|
|
| |
llvm-svn: 251757
|
|
|
|
| |
llvm-svn: 251754
|
|
|
|
|
|
|
| |
The initial coverage checking code for sample records failed to count
records inside inlined profiles. This change fixes the oversight.
llvm-svn: 251752
|
|
|
|
| |
llvm-svn: 251737
|
|
|
|
|
|
|
|
| |
loop over the SCC.
The separate function wasn't really adding much, NFC.
llvm-svn: 251728
|
|
|
|
|
|
|
| |
This is a really straightforward port. Also adds a test for the pass,
since it only seemed to be tested tangentially before.
llvm-svn: 251726
|
|
|
|
| |
llvm-svn: 251725
|
|
|
|
| |
llvm-svn: 251724
|
|
|
|
|
|
|
|
|
|
|
| |
from its pass harness by providing a lambda to query for AA results.
This allows the legacy pass to easily provide a lambda that uses the
special helpers to construct function AA results from a legacy CGSCC
pass. With the new pass manager (the next patch) the lambda just
directly wraps the intuitive query API.
llvm-svn: 251715
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update the discriminator assignment algorithm
* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.
original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }
; i == 3: discriminator 0
; i == 5: discriminator 2
; return 100: discriminator 1
; return 99: discriminator 3
llvm-svn: 251689
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update the discriminator assignment algorithm
* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.
original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }
; i == 3: discriminator 0
; i == 5: discriminator 2
; return 100: discriminator 1
; return 99: discriminator 3
llvm-svn: 251685
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.
original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }
; i == 3: discriminator 0
; i == 5: discriminator 2
; return 100: discriminator 1
; return 99: discriminator 3
llvm-svn: 251680
|
|
|
|
| |
llvm-svn: 251656
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
transformations in FunctionAttrs rather than building a new one each
time.
This isn't trivial because there are different heuristics from different
passes for exactly what set they want. The primary difference is whether
an *overridable* function completely disables the synthesis of
attributes. I've modeled this by directly testing for overridable, and
using the common set that excludes external and opt-none functions.
This does cause some changes by disabling more optimizations in the face
of opt-none. Specifically, we were still optimizing *calls* to opt-none
functions based on their attributes, just not the bodies. It seems
better to be conservative on both fronts given the intended semanticas
here (best effort to not assume or disturb anything). I've not tried to
test this change as it seems complex, brittle, and not important to the
implicit contract of opt-none. Instead, it seems more like a choice that
should be dictated by the simplified implementation and the change to be
acceptable differences within the space of opt-none.
A big benefit here is that these transformations no longer rely on the
legacy pass manager's SCC types, they just work on generic sets of
function pointers. This will make it easy to re-use their logic in the
new pass manager.
I've also made the transforms static functions instead of members where
trivial while I was touching the signatures.
llvm-svn: 251640
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch unify the 39-bit and 42-bit mapping for aarch64 to use only
one instrumentation algorithm. This removes compiler flag
SANITIZER_AARCH64_VMA requirement for MSAN on aarch64.
The mapping to use now is for 39 and 42-bits:
0x00000000000ULL-0x01000000000ULL MappingDesc::INVALID
0x01000000000ULL-0x02000000000ULL MappingDesc::SHADOW
0x02000000000ULL-0x03000000000ULL MappingDesc::ORIGIN
0x03000000000ULL-0x04000000000ULL MappingDesc::SHADOW
0x04000000000ULL-0x05000000000ULL MappingDesc::ORIGIN
0x05000000000ULL-0x06000000000ULL MappingDesc::APP
0x06000000000ULL-0x07000000000ULL MappingDesc::INVALID
0x07000000000ULL-0x08000000000ULL MappingDesc::APP
And only for 42-bits:
0x08000000000ULL-0x09000000000ULL MappingDesc::INVALID
0x09000000000ULL-0x0A000000000ULL MappingDesc::SHADOW
0x0A000000000ULL-0x0B000000000ULL MappingDesc::ORIGIN
0x0B000000000ULL-0x0F000000000ULL MappingDesc::INVALID
0x0F000000000ULL-0x10000000000ULL MappingDesc::APP
0x10000000000ULL-0x11000000000ULL MappingDesc::INVALID
0x11000000000ULL-0x12000000000ULL MappingDesc::APP
0x12000000000ULL-0x17000000000ULL MappingDesc::INVALID
0x17000000000ULL-0x18000000000ULL MappingDesc::SHADOW
0x18000000000ULL-0x19000000000ULL MappingDesc::ORIGIN
0x19000000000ULL-0x20000000000ULL MappingDesc::INVALID
0x20000000000ULL-0x21000000000ULL MappingDesc::APP
0x21000000000ULL-0x26000000000ULL MappingDesc::INVALID
0x26000000000ULL-0x27000000000ULL MappingDesc::SHADOW
0x27000000000ULL-0x28000000000ULL MappingDesc::ORIGIN
0x28000000000ULL-0x29000000000ULL MappingDesc::SHADOW
0x29000000000ULL-0x2A000000000ULL MappingDesc::ORIGIN
0x2A000000000ULL-0x2B000000000ULL MappingDesc::APP
0x2B000000000ULL-0x2C000000000ULL MappingDesc::INVALID
0x2C000000000ULL-0x2D000000000ULL MappingDesc::SHADOW
0x2D000000000ULL-0x2E000000000ULL MappingDesc::ORIGIN
0x2E000000000ULL-0x2F000000000ULL MappingDesc::APP
0x2F000000000ULL-0x39000000000ULL MappingDesc::INVALID
0x39000000000ULL-0x3A000000000ULL MappingDesc::SHADOW
0x3A000000000ULL-0x3B000000000ULL MappingDesc::ORIGIN
0x3B000000000ULL-0x3C000000000ULL MappingDesc::APP
0x3C000000000ULL-0x3D000000000ULL MappingDesc::INVALID
0x3D000000000ULL-0x3E000000000ULL MappingDesc::SHADOW
0x3E000000000ULL-0x3F000000000ULL MappingDesc::ORIGIN
0x3F000000000ULL-0x40000000000ULL MappingDesc::APP
And although complex it provides a better memory utilization that
previous one.
llvm-svn: 251624
|
|
|
|
| |
llvm-svn: 251623
|
|
|
|
| |
llvm-svn: 251617
|
|
|
|
|
|
|
|
|
|
| |
Clang driver now injects -u<hook_var> flag in the linker
command line, in which case user function is not needed
any more.
Differential Revision: http://reviews.llvm.org/D14033
llvm-svn: 251612
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Somewhat shockingly for an analysis pass which is computing constant ranges, LVI did not understand the ranges provided by range metadata.
As part of this change, I included a change to CVP primarily because doing so made it much easier to write small self contained test cases. CVP was previously only handling the non-local operand case, but given that LVI can sometimes figure out information about instructions standalone, I don't see any reason to restrict this. There could possibly be a compile time impact from this, but I suspect it should be minimal. If anyone has an example which substaintially regresses, please let me know. I could restrict the block local handling to ICmps feeding Terminator instructions if needed.
Note that this patch continues a somewhat bad practice in LVI. In many cases, we know facts about values, and separate context sensitive facts about values. LVI makes no effort to distinguish and will frequently cache the same value fact repeatedly for different contexts. I would like to change this, but that's a large enough change that I want it to go in separately with clear documentation of what's changing. Other examples of this include the non-null handling, and arguments.
As a meta comment: the entire motivation of this change was being able to write smaller (aka reasonable sized) test cases for a future patch teaching LVI about select instructions.
Differential Revision: http://reviews.llvm.org/D13543
llvm-svn: 251606
|
|
|
|
|
|
|
|
|
|
|
| |
The most common use case is when eliminating redundant range checks in an example like the following:
c = a[i+1] + a[i];
Note that all the smarts of the transform (the implication engine) is already in ValueTracking and is tested directly through InstructionSimplify.
Differential Revision: http://reviews.llvm.org/D13040
llvm-svn: 251596
|
|
|
|
| |
llvm-svn: 251595
|
|
|
|
|
|
|
|
| |
larger vectorization factor.
To be able to maximize the bandwidth during vectorization, this patch provides a new flag vectorizer-maximize-bandwidth. When it is turned on, the vectorizer will determine the vectorization factor (VF) using the smallest instead of widest type in the loop. To avoid increasing register pressure too much, estimates of the register usage for different VFs are calculated so that we only choose a VF when its register usage doesn't exceed the number of available registers.
llvm-svn: 251592
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the flag -mllvm -sample-profile-check-coverage=N to the
SampleProfile pass. N is the percent of input sample records that the
user expects to apply. If the pass does not use N% (or more) of the
sample records in the input, it emits a warning.
This is useful to detect some forms of stale profiles. If the code has
drifted enough from the original profile, there will be records that do
not match the IR anymore.
This will not detect cases where a sample profile record for line L is
referring to some other instructions that also used to be at line L.
llvm-svn: 251568
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If P branches to Q conditional on C and Q branches to R conditional on
C' and C => C' then the branch conditional on C' can be folded to an
unconditional branch.
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13972
llvm-svn: 251557
|
|
|
|
|
|
|
|
|
|
| |
The pass was keeping around a lot of per-function data (visited blocks,
edges, dominance, etc) that is just taking up memory for no reason. In
fact, from function to function it could potentially confuse the
propagator since some maps are indexed by line offsets which can be
common between functions.
llvm-svn: 251531
|
|
|
|
| |
llvm-svn: 251521
|
|
|
|
|
|
|
|
| |
stride.
The simple fix is to prevent forming memcpy from loops with a negative stride.
llvm-svn: 251518
|
|
|
|
|
|
|
|
| |
I think these were affected by a change way back when to stop printing newlines in Value::dump() by default. This change simply allows the debug output to be readable.
NFC.
llvm-svn: 251517
|
|
|
|
|
|
|
|
| |
stride."
This reverts commit r251512. This is causing LNT/chomp to fail.
llvm-svn: 251513
|
|
|
|
|
|
| |
http://reviews.llvm.org/D14125
llvm-svn: 251512
|
|
|
|
|
|
| |
initial values from loop preheader", because it broke some bots.
llvm-svn: 251498
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
loop preheader
Summary:
This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.
Reviewers: sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13974
llvm-svn: 251492
|
|
|
|
|
|
|
| |
CatchReturnInst has side-effects: it runs a destructor. This destructor
could conceivably run forever/call exit/etc. and should not be removed.
llvm-svn: 251461
|
|
|
|
| |
llvm-svn: 251437
|
|
|
|
|
|
|
| |
It causes miscompilation of llvm/lib/ExecutionEngine/Interpreter/Execution.cpp.
See also PR25324.
llvm-svn: 251436
|
|
|
|
| |
llvm-svn: 251434
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change could be way off-piste, I'm looking for any feedback on whether it's an acceptable approach.
It never seems to be a problem to gobble up as many reduction values as can be found, and then to attempt to reduce the resulting tree. Some of the workloads I'm looking at have been aggressively unrolled by hand, and by selecting reduction widths that are not constrained by a vector register size, it becomes possible to profitably vectorize. My test case shows such an unrolling which SLP was not vectorizing (on neither ARM nor X86) before this patch, but with it does vectorize.
I measure no significant compile time impact of this change when combined with D13949 and D14063. There are also no significant performance regressions on ARM/AArch64 in SPEC or LNT.
The more principled approach I thought of was to generate several candidate tree's and use the cost model to pick the cheapest one. That seemed like quite a big design change (the algorithms seem very much one-shot), and would likely be a costly thing for compile time. This seemed to do the job at very little cost, but I'm worried I've misunderstood something!
Reviewers: nadav, jmolloy
Subscribers: mssimpso, llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D14116
llvm-svn: 251428
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently, when the SLP vectorizer considers whether a phi is part of a reduction, it dismisses phi's whose incoming blocks are not the same as the block containing the phi. For the patterns I'm looking at, extending this rule to allow phis whose incoming block is a containing loop latch allows me to vectorize certain workloads.
There is no significant compile-time impact, and combined with D13949, no performance improvement measured in ARM/AArch64 in any of SPEC2000, SPEC2006 or LNT.
Reviewers: jmolloy, mcrosier, nadav
Subscribers: mssimpso, nadav, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D14063
llvm-svn: 251425
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Certain workloads, in particular sum-of-absdiff loops, can be vectorized using SLP if it can treat select instructions as reduction values.
The test case is a bit awkward. The AArch64 cost model needs some tuning to not be so pessimistic about selects. I've had to tweak the SLP threshold here.
Reviewers: jmolloy, mzolotukhin, spatel, nadav
Subscribers: nadav, mssimpso, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D13949
llvm-svn: 251424
|
|
|
|
|
|
|
|
|
|
| |
When emitting a remark for a conditional branch annotation, the remark
uses the line location information of the conditional branch in the
message. In some cases, that information is unavailable and the
optimization would segfaul. I'm still not sure whether this is a bug or
WAI, but the optimizer should not die because of this.
llvm-svn: 251420
|
|
|
|
| |
llvm-svn: 251383
|
|
|
|
|
|
|
|
|
|
|
| |
No functionality changed here, but the indentation is substantially
reduced and IMO the code is much easier to read. I've also added some
helpful comments.
This is just a clean-up I wrote while studying the code, and that has
been in my backlog for a while.
llvm-svn: 251381
|
|
|
|
| |
llvm-svn: 251344
|
|
|
|
|
|
|
|
|
| |
We should remove noalias along with dereference and dereference_or_null attributes
because statepoint could potentially touch the entire heap including noalias objects.
Differential Revision: http://reviews.llvm.org/D14032
llvm-svn: 251333
|