| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"SCEVAddRecExpr operand is not loop-invariant!")
Initially SLP vectorizer replaced all going-to-be-vectorized
instructions with Undef values. It may break ScalarEvaluation and may
cause a crash.
Reworked SLP vectorizer so that it does not replace vectorized
instructions by UndefValue anymore. Instead vectorized instructions are
marked for deletion inside if BoUpSLP class and deleted upon class
destruction.
Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel
Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D29641
llvm-svn: 373166
|
|
|
|
|
|
|
|
| |
(isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!")
This reverts r372626 (git commit 6a278d9073bdc158d31d4f4b15bbe34238f22c18)
llvm-svn: 373019
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"SCEVAddRecExpr operand is not loop-invariant!")
Summary:
Initially SLP vectorizer replaced all going-to-be-vectorized
instructions with Undef values. It may break ScalarEvaluation and may
cause a crash.
Reworked SLP vectorizer so that it does not replace vectorized
instructions by UndefValue anymore. Instead vectorized instructions are
marked for deletion inside if BoUpSLP class and deleted upon class
destruction.
Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel
Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D29641
llvm-svn: 372626
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch uses the mechanism from D62995 to strengthen the
definitions of the reduction intrinsics by letting the scalar
result/accumulator type be overloaded from the vector element type.
For example:
; The LLVM LangRef specifies that the scalar result must equal the
; vector element type, but this is not checked/enforced by LLVM.
declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)
This patch changes that into:
declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)
Which has the type-constraint more explicit and causes LLVM to check
the result type with the vector element type.
Reviewers: RKSimon, arsenm, rnk, greened, aemerson
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D62996
llvm-svn: 363240
|
|
|
|
|
|
|
|
| |
The reversion apparently deleted the test/Transforms directory.
Will be re-reverting again.
llvm-svn: 358552
|
|
|
|
|
|
|
|
| |
As it's causing some bot failures (and per request from kbarton).
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.
llvm-svn: 358546
|
|
|
|
| |
llvm-svn: 350967
|
|
|
|
|
|
|
|
|
|
| |
This patch provides an implementation of getArithmeticReductionCost for
AArch64. We can specialize the cost of add reductions since they are computed
using the 'addv' instruction.
Differential Revision: https://reviews.llvm.org/D44490
llvm-svn: 327702
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added more remarks to SLP pass, in particular "missed" optimization remarks.
Also proposed several tests for new functionality.
Patch by Vladimir Miloserdov!
For reference you may look at: https://reviews.llvm.org/rL302811
Reviewers: anemet, fhahn
Reviewed By: anemet
Subscribers: javed.absar, lattner, petecoup, yakush, llvm-commits
Differential Revision: https://reviews.llvm.org/D38367
llvm-svn: 318307
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The New Pass Manager infrastructure was forgetting to keep around the optimization remark yaml file that the compiler might have been producing. This meant setting the option to '-' for stdout worked, but setting it to a filename didn't give file output (presumably it was deleted because compilation didn't explicitly keep it). This change just ensures that the file is kept if compilation succeeds.
So far I have updated one of the optimization remark output tests to add a version with the new pass manager. It is my intention for this patch to also include changes to all tests that use `-opt-remark-output=` but I wanted to get the code patch ready for review while I was making all those changes.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33951
Reviewers: anemet, chandlerc
Reviewed By: anemet, chandlerc
Subscribers: javed.absar, chandlerc, fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D36906
llvm-svn: 311271
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The approach I followed was to emit the remark after getTreeCost concludes
that SLP is profitable. I initially tried emitting them after the
vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely
remember missing a few cases for example in HorizontalReduction::tryToReduce.
ORE is placed in BoUpSLP so that it's available from everywhere (notably
HorizontalReduction::tryToReduce).
We use the first instruction in the root bundle as the locator for the remark.
In order to get a sense how far the tree is spanning I've include the size of
the tree in the remark. This is not perfect of course but it gives you at
least a rough idea about the tree. Then you can follow up with -view-slp-tree
to really see the actual tree.
llvm-svn: 302811
|
|
|
|
|
|
|
| |
These testcases no longer need to specify -slp-vectorize-hor, since it was
enabled by default in r252733.
llvm-svn: 255783
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change could be way off-piste, I'm looking for any feedback on whether it's an acceptable approach.
It never seems to be a problem to gobble up as many reduction values as can be found, and then to attempt to reduce the resulting tree. Some of the workloads I'm looking at have been aggressively unrolled by hand, and by selecting reduction widths that are not constrained by a vector register size, it becomes possible to profitably vectorize. My test case shows such an unrolling which SLP was not vectorizing (on neither ARM nor X86) before this patch, but with it does vectorize.
I measure no significant compile time impact of this change when combined with D13949 and D14063. There are also no significant performance regressions on ARM/AArch64 in SPEC or LNT.
The more principled approach I thought of was to generate several candidate tree's and use the cost model to pick the cheapest one. That seemed like quite a big design change (the algorithms seem very much one-shot), and would likely be a costly thing for compile time. This seemed to do the job at very little cost, but I'm worried I've misunderstood something!
Reviewers: nadav, jmolloy
Subscribers: mssimpso, llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D14116
llvm-svn: 251428
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently, when the SLP vectorizer considers whether a phi is part of a reduction, it dismisses phi's whose incoming blocks are not the same as the block containing the phi. For the patterns I'm looking at, extending this rule to allow phis whose incoming block is a containing loop latch allows me to vectorize certain workloads.
There is no significant compile-time impact, and combined with D13949, no performance improvement measured in ARM/AArch64 in any of SPEC2000, SPEC2006 or LNT.
Reviewers: jmolloy, mcrosier, nadav
Subscribers: mssimpso, nadav, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D14063
llvm-svn: 251425
|
|
Summary:
Certain workloads, in particular sum-of-absdiff loops, can be vectorized using SLP if it can treat select instructions as reduction values.
The test case is a bit awkward. The AArch64 cost model needs some tuning to not be so pessimistic about selects. I've had to tweak the SLP threshold here.
Reviewers: jmolloy, mzolotukhin, spatel, nadav
Subscribers: nadav, mssimpso, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D13949
llvm-svn: 251424
|