| Commit message | Author | Age | Files | Lines |
| |
This is a follow-up to D70607 where we made any
extract element on SLM more costly than default. But that is
pessimistic for extract from element 0 because that corresponds
to x86 movd/movq instructions. These generally have >1 cycle
latency, but they are probably implemented as single uop
instructions.
Note that no vectorization tests are affected by this change.
Also, no targets besides SLM are affected because those are
falling through to the default cost of 1 anyway. But this will
become visible/important if we add more specializations via cost
tables.
Differential Revision: https://reviews.llvm.org/D71023
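
The special case described above can be sketched roughly as follows. This is a minimal illustration, not the code from D71023: the function name `extractElementCost`, the `IsSLM` flag, and the `SlmTableCost` parameter are hypothetical, and it assumes that element 0 simply gets the default cost of 1 while every other index on SLM uses the table-based cost.

```cpp
#include <cassert>

// Hypothetical sketch of the extract-element costing described above.
// IsSLM and SlmTableCost are illustrative parameters, not LLVM API.
// Assumption: the default cost for an extract is 1.
int extractElementCost(bool IsSLM, unsigned Index, int SlmTableCost) {
  const int DefaultCost = 1;

  // Extracting element 0 maps to movd/movq, so it is not penalized.
  if (Index == 0)
    return DefaultCost;

  // Other indices on SLM use the (more expensive) table-based cost.
  if (IsSLM)
    return SlmTableCost;

  // All other targets fall through to the default cost.
  return DefaultCost;
}
```

With this shape, a non-SLM target returns the same value for every index, which matches the note that only SLM is affected.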
| |
I'm not sure what the effect of this change will be on all of the affected
tests or a larger benchmark, but it fixes the horizontal add/sub problems
noted here:
https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc
The costs are based on reciprocal throughput numbers in Agner's tables for
PEXTR*; these appear to be very slow ops on Silvermont.
This is a small step towards the larger motivation discussed in PR43605:
https://bugs.llvm.org/show_bug.cgi?id=43605
Also, it seems likely that insert/extract is the source of perf regressions on
other CPUs (up to 30%) that were cited as part of the reason to revert D59710,
so maybe we'll extend the table-based approach to other subtargets.
Differential Revision: https://reviews.llvm.org/D70607
| |
ShuffleVectorInst::isExtractSubvectorMask, introduced in
[CostModel] Add SK_ExtractSubvector handling to getInstructionThroughput (PR39368)
erroneously thought that
%340 = shufflevector <4 x float> %339, <4 x float> undef, <3 x i32> <i32 2, i32 3, i32 undef>
is a subvector extract, even though it goes off the end of the parent
vector with the undef index. That then caused an assert in
BasicTTIImplBase::getExtractSubvectorOverhead.
This commit fixes that, by not considering the above a subvector
extract.
Differential Revision: https://reviews.llvm.org/D70005
Change-Id: I87b8b00b24bef19ffc9a1b82ef4eca3b8a246eaf
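
The bounds problem can be illustrated with a small stand-alone check. This is a sketch under stated assumptions, not the LLVM implementation: `extractSubvectorStart` is a hypothetical helper, the mask is a plain `std::vector<int>`, and `-1` stands for an undef index. The point is the final test, which rejects a mask whose full length would run past the end of the source vector even when the overrunning lanes are undef.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical helper, not LLVM's ShuffleVectorInst::isExtractSubvectorMask.
// Mask entries of -1 represent undef indices.
// Returns the start index of the extracted subvector, or std::nullopt if
// the mask does not describe a subvector extract.
std::optional<int> extractSubvectorStart(const std::vector<int> &Mask,
                                         int NumSrcElts) {
  const int NumMaskElts = static_cast<int>(Mask.size());

  // An extract must produce fewer elements than the source holds.
  if (NumMaskElts == 0 || NumMaskElts >= NumSrcElts)
    return std::nullopt;

  int Start = -1;
  for (int I = 0; I < NumMaskElts; ++I) {
    const int M = Mask[I];
    if (M < 0)
      continue; // undef lane, compatible with any start index
    if (M >= NumSrcElts)
      return std::nullopt; // refers to the second shuffle operand
    const int Offset = M - I;
    if (Offset < 0 || (Start >= 0 && Offset != Start))
      return std::nullopt; // lanes are not one contiguous run
    Start = Offset;
  }

  if (Start < 0)
    return std::nullopt; // every lane is undef

  // The whole extracted range, including undef lanes, must fit inside the
  // source vector.
  if (Start + NumMaskElts > NumSrcElts)
    return std::nullopt;

  return Start;
}
```

For the mask from the commit message, `extractSubvectorStart({2, 3, -1}, 4)` yields no value because lanes 2 through 4 do not fit in a 4-element source, while `extractSubvectorStart({2, 3}, 4)` yields 2.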
| |
Now that we're using widening legalization, we need to improve our extract_subvector cost model for these types. This patch begins by modeling these as a subvector extract followed by a permute. I've left FIXMEs in the code for future improvements.
Differential Revision: https://reviews.llvm.org/D65892
llvm-svn: 369022
| |
These tests don't cover many cases where the subvectors don't start on aligned indices, but that can be added later.
llvm-svn: 368839
| |
We don't have full 512-bit test coverage yet - but there's enough to help test D65892
llvm-svn: 368716
| |
smaller element sizes and smaller than 128-bit vectors."
llvm-svn: 368185
| |
element sizes and smaller than 128-bit vectors."
This reverts commit fc33e33776b7a7ce22e539f0ec2e3bfdb09ad361.
This commit depends on the rolled back commit rL367901, and thus needs
to be rolled back.
llvm-svn: 368109
| |
and smaller than 128-bit vectors.
With the switch to widening legalization, we need to do a better
job of costing extractions of fewer than 128 bits.
llvm-svn: 368081
| |
aligned within the source vector
llvm-svn: 346664
| |
Instead of defaulting to a cost of 1, expand to element extract/insert like we do for other shuffles.
This exposes an issue in LoopVectorize which could call SK_ExtractSubvector with a scalar subvector type.
llvm-svn: 346656
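
The element-wise expansion can be sketched as a simple sum. This is an illustration only: `ElementCosts` and `extractSubvectorExpansionCost` are hypothetical names, and the per-element costs are supplied by the caller rather than taken from any real target. It assumes each element of the subvector is extracted from the source and then inserted into the result.

```cpp
#include <cassert>

// Hypothetical per-element costs supplied by the caller (not real target data).
struct ElementCosts {
  int Extract; // cost of extracting one element from the source vector
  int Insert;  // cost of inserting one element into the result vector
};

// Sketch of costing a subvector extract as NumSubElts element
// extract/insert pairs instead of a flat cost of 1.
int extractSubvectorExpansionCost(int NumSubElts, ElementCosts Costs) {
  int Cost = 0;
  for (int I = 0; I < NumSubElts; ++I)
    Cost += Costs.Extract + Costs.Insert;
  return Cost;
}
```

For example, a 4-element subvector with unit extract and insert costs comes out at 8 rather than 1, which is why the flat default tends to underestimate these shuffles.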
| |
start of the source vector
llvm-svn: 346538
| |
(PR39368)
Add ShuffleVectorInst::isExtractSubvectorMask helper to match shuffle masks.
llvm-svn: 346510
|
Just f64/i64 tests initially to demonstrate PR39368
llvm-svn: 344857
|