| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
| |
reschedule them"
This commit reapplies 205018. After 205855 we should correctly vectorize
intrinsics.
llvm-svn: 205965
|
|
|
|
|
|
|
|
|
| |
The vectorizer only knows how to vectorize intrinics by widening all operands by
the same factor.
Patch by Tyler Nowicki!
llvm-svn: 205855
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
reschedule them"
This reverts commit r205018.
Conflicts:
lib/Transforms/Vectorize/SLPVectorizer.cpp
test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
This is breaking libclc build.
llvm-svn: 205260
|
|
|
|
|
|
|
|
|
| |
Extract element instructions that will be removed when vectorzing lower the
cost.
Patch by Arch D. Robison!
llvm-svn: 205020
|
|
|
|
|
|
| |
Patch by Arch D. Robison!
llvm-svn: 205018
|
|
|
|
|
|
|
| |
This reverts commit 86cb795388643710dab34941ddcb5a9470ac39d8.
The problems previously found have been resolved through other CLs.
llvm-svn: 203707
|
|
|
|
| |
llvm-svn: 202924
|
|
|
|
|
|
|
|
|
| |
Vectorize sequential stores of a broadcasted value.
5% on eon.
radar://16124699
llvm-svn: 202067
|
|
|
|
|
|
| |
vectorizeTree()). radar://16064178
llvm-svn: 201501
|
|
|
|
|
|
|
| |
This reverts commit r200576. It broke 32-bit self-host builds by
vectorizing two calls to @llvm.bswap.i64, which we then fail to expand.
llvm-svn: 200602
|
|
|
|
|
|
|
|
|
|
| |
transform accordingly. Based on similar code from Loop vectorization.
Subsequent commits will include vectorization of function calls to
vector intrinsics and form function calls to vector library calls.
Patch by Raul Silvera! (Much delayed due to my not running dcommit)
llvm-svn: 200576
|
|
|
|
| |
llvm-svn: 199016
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were creating external uses for scalar values in MustGather entries that also
had a ScalarToTreeEntry (they also are present in a vectorized tuple). This
meant we would keep a value 'alive' as a scalar and vectorized causing havoc.
This is not necessary because when we create a MustGather vector we explicitly
create external uses entries for the insertelement instructions of the
MustGather vector elements.
Fixes PR18129.
radar://15582184
llvm-svn: 196508
|
|
|
|
|
|
|
|
|
|
| |
clang enables vectorization at optimization levels > 1 and size level < 2. opt
should behave similarily.
Loop vectorization and SLP vectorization can be disabled with the flags
-disable-(loop/slp)-vectorization.
llvm-svn: 196294
|
|
|
|
|
|
|
|
| |
some of these instructions
may be removed and optimized in future iterations. Instead we save a list of basic blocks that we need to CSE.
llvm-svn: 195791
|
|
|
|
|
|
|
|
| |
we generate PHI nodes with multiple entries from the same basic block but
with different values. Enabling CSE on ExtractElement instructions make sure
that all of the RAUWed instructions are the same.
llvm-svn: 195773
|
|
|
|
| |
llvm-svn: 195691
|
|
|
|
|
|
|
|
| |
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.
llvm-svn: 195504
|
|
|
|
|
|
| |
multiple external uses.
llvm-svn: 195406
|
|
|
|
|
|
|
|
|
|
| |
radar://15231682
Reapply r192799,
http://lab.llvm.org:8011/builders/lldb-x86_64-debian-clang/builds/8226
showed that the bot is still broken even with this out.
llvm-svn: 192820
|
|
|
|
|
|
| |
This speculatively reverts commit 192799. It might have broken a linux buildbot.
llvm-svn: 192816
|
|
|
|
|
|
| |
radar://15231682
llvm-svn: 192799
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this patch we relied on the order of phi nodes when we looked for phi
nodes of the same type. This could prevent vectorization of cases where there
was a phi node of a second type in between phi nodes of some type.
This is important for vectorization of an internal graphics kernel. On the test
suite + external on x86_64 (and on a run on armv7s) it showed no impact on
either performance or compile time.
radar://15024459
llvm-svn: 192537
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sort the operands of the other entries in the current vectorization root
according to the first entry's operands opcodes.
%conv0 = uitofp ...
%load0 = load float ...
= fmul %conv0, %load0
= fmul %load0, %conv1
= fmul %load0, %conv2
Make sure that we recursively vectorize <%conv0, %conv1, %conv2> and <%load0,
%load0, %load0>.
This makes it more likely to obtain vectorizable trees. We have to be careful
when we sort that we don't destroy 'good' existing ordering implied by source
order.
radar://15080067
llvm-svn: 191977
|
|
|
|
| |
llvm-svn: 191852
|
|
|
|
|
|
|
|
|
| |
GetUnderlyingObject.
This recursively strips all GEPs like the existing code. It also handles bitcasts and
other operations that do not change the pointer value.
llvm-svn: 191847
|
|
|
|
| |
llvm-svn: 191690
|
|
|
|
| |
llvm-svn: 191689
|
|
|
|
|
|
|
|
| |
Inspired by the object from the SLPVectorizer. This found a minor bug in the
debug loc restoration in the vectorizer where the location of a following
instruction was attached instead of the location from the original instruction.
llvm-svn: 191673
|
|
|
|
|
|
|
|
|
|
| |
We were previously using getFirstInsertionPt to insert PHI
instructions when vectorizing, but getFirstInsertionPt also skips past
landingpads, causing this to generate invalid IR.
We can avoid this issue by using getFirstNonPHI instead.
llvm-svn: 191526
|
|
|
|
|
|
|
| |
Put them under a separate flag for experimentation. They are more likely to
interfere with loop vectorization which happens later in the pass pipeline.
llvm-svn: 191371
|
|
|
|
|
|
| |
Some supplemental information for r191314: We would like to make sure SLP Vectorizer will not try to vectorize tiny trees even with a negative threshold so we set the cost to INT_MAX.
llvm-svn: 191327
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reapply r191108 with a fix for a memory corruption error I introduced. Of
course, we can't reference the scalars that we replace by vectorizing and then
call their eraseFromParent method. I only 'needed' the scalars to get the
DebugLoc. Just store the DebugLoc before actually vectorizing instead. As a nice
side effect, this also simplifies the interface between BoUpSLP and the
HorizontalReduction class to returning a value pointer (the vectorized tree
root).
radar://14607682
llvm-svn: 191123
|
|
|
|
|
|
|
|
|
| |
This reverts commit r191108.
The horizontal.ll test case fails under libgmalloc. Thanks Shuxin for pointing
this out to me.
llvm-svn: 191121
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Match reductions starting at binary operation feeding into a phi. The code
handles trees like
r += v1 + v2 + v3 ...
and
r += v1
r += v2
...
and
r *= v1 + v2 + ...
We currently only handle associative operations (add, fadd fast).
The code can now also handle reductions feeding into stores.
a[i] = v1 + v2 + v3 + ...
The code is currently disabled behind the flag "-slp-vectorize-hor". The cost
model for most architectures is not there yet.
I found one opportunity of a horizontal reduction feeding a phi in TSVC
(LoopRerolling-flt) and there are several opportunities where reductions feed
into stores.
radar://14607682
llvm-svn: 191108
|
|
|
|
|
|
|
|
|
| |
We can't insert an insertelement after an invoke. We would have to split a
critical edge. So when we see a phi node that uses an invoke we just give up.
radar://14990770
llvm-svn: 190871
|
|
|
|
|
|
|
|
| |
Field 2 of DIType (Context), field 9 of DIDerivedType (TypeDerivedFrom),
field 12 of DICompositeType (ContainingType), fields 2, 7, 12 of DISubprogram
(Context, Type, ContainingType).
llvm-svn: 190205
|
|
|
|
|
|
|
|
|
| |
1) If the width of vectorization list candidate is bigger than vector reg width, we will break it down to fit the vector reg.
2) We do not vectorize the width which is not power of two.
The performance result shows it will help some spec benchmarks. mesa improved 6.97% and ammp improved 1.54%.
llvm-svn: 189830
|
|
|
|
|
|
|
|
|
|
|
| |
The builder inserts from before the insert point,
not after, so this would insert before the last
instruction in the bundle instead of after it.
I'm not sure if this can actually be a problem
with any of the current insertions.
llvm-svn: 189285
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DICompositeType will have an identifier field at position 14. For now, the
field is set to null in DIBuilder.
For DICompositeTypes where the template argument field (the 13th field)
was optional, modify DIBuilder to make sure the template argument field is set.
Now DICompositeType has 15 fields.
Update DIBuilder to use NULL instead of "i32 0" for null value of a MDNode.
Update verifier to check that DICompositeType has 15 fields and the last
field is null or a MDString.
Update testing cases to include an extra field for DICompositeType.
The identifier field will be used by type uniquing so a front end can
genearte a DICompositeType with a unique identifer.
llvm-svn: 189282
|
|
|
|
| |
llvm-svn: 189248
|
|
|
|
| |
llvm-svn: 189233
|
|
|
|
|
|
|
|
|
| |
A single metadata will not span multiple lines. This also helps me with
my script to automatic update the testing cases.
A debug info testing case should have a llvm.dbg.cu.
Do not use hard-coded id for debug nodes.
llvm-svn: 189033
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
using GEPs. Previously, it used a number of different heuristics for
analyzing the GEPs. Several of these were conservatively correct, but
failed to fall back to SCEV even when SCEV might have given a reasonable
answer. One was simply incorrect in how it was formulated.
There was good code already to recursively evaluate the constant offsets
in GEPs, look through pointer casts, etc. I gathered this into a form
code like the SLP code can use in a previous commit, which allows all of
this code to become quite simple.
There is some performance (compile time) concern here at first glance as
we're directly attempting to walk both pointers constant GEP chains.
However, a couple of thoughts:
1) The very common cases where there is a dynamic pointer, and a second
pointer at a constant offset (usually a stride) from it, this code
will actually not do any unnecessary work.
2) InstCombine and other passes work very hard to collapse constant
GEPs, so it will be rare that we iterate here for a long time.
That said, if there remain performance problems here, there are some
obvious things that can improve the situation immensely. Doing
a vectorizer-pass-wide memoizer for each individual layer of pointer
values, their base values, and the constant offset is likely to be able
to completely remove redundant work and strictly limit the scaling of
the work to scrape these GEPs. Since this optimization was not done on
the prior version (which would still benefit from it), I've not done it
here. But if folks have benchmarks that slow down it should be straight
forward for them to add.
I've added a test case, but I'm not really confident of the amount of
testing done for different access patterns, strides, and pointer
manipulation.
llvm-svn: 189007
|
|
|
|
|
|
|
|
|
|
|
| |
Update iterator when the SLP vectorizer changes the instructions in the basic
block by restarting the traversal of the basic block.
Patch by Yi Jiang!
Fixes PR 16899.
llvm-svn: 188832
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Instead of setting the suffixes in a bunch of places, just set one master
list in the top-level config. We now only modify the suffix list in a few
suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).
- Aside from removing the need for a bunch of lit.local.cfg files, this enables
4 tests that were inadvertently being skipped (one in
Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and
CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been
XFAILED).
- This commit also fixes a bunch of config files to use config.root instead of
older copy-pasted code.
llvm-svn: 188513
|
|
|
|
|
|
|
| |
Do not generate new vector values for the same entries because we know that the incoming values
from the same block must be identical.
llvm-svn: 188185
|
|
|
|
|
|
|
|
| |
come from different blocks.
Thanks Alexey Samsonov.
llvm-svn: 187663
|
|
|
|
|
|
|
|
| |
info changes.
Thanks Eric.
llvm-svn: 187368
|
|
|
|
| |
llvm-svn: 187363
|