path: root/llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector.ll
Commit message | Author | Age | Files | Lines
* [x86] add cost model special-case for insert/extract from element 0 | Sanjay Patel | 2019-12-06 | 1 | -2/+2
  This is a follow-up to D70607 where we made any extract element on SLM more costly than default. But that is pessimistic for extract from element 0 because that corresponds to x86 movd/movq instructions. These generally have >1 cycle latency, but they are probably implemented as single uop instructions.

  Note that no vectorization tests are affected by this change. Also, no targets besides SLM are affected because those are falling through to the default cost of 1 anyway. But this will become visible/important if we add more specializations via cost tables.

  Differential Revision: https://reviews.llvm.org/D71023
* [x86] make SLM extract vector element more expensive than default | Sanjay Patel | 2019-11-27 | 1 | -179/+475
  I'm not sure what the effect of this change will be on all of the affected tests or a larger benchmark, but it fixes the horizontal add/sub problems noted here:
  https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc

  The costs are based on reciprocal throughput numbers in Agner's tables for PEXTR*; these appear to be very slow ops on Silvermont.

  This is a small step towards the larger motivation discussed in PR43605:
  https://bugs.llvm.org/show_bug.cgi?id=43605

  Also, it seems likely that insert/extract is the source of perf regressions on other CPUs (up to 30%) that were cited as part of the reason to revert D59710, so maybe we'll extend the table-based approach to other subtargets.

  Differential Revision: https://reviews.llvm.org/D70607
* [CostModel] Fixed isExtractSubvectorMask for undef index off end | Tim Renouf | 2019-11-08 | 1 | -0/+5
  ShuffleVectorInst::isExtractSubvectorMask, introduced in "[CostModel] Add SK_ExtractSubvector handling to getInstructionThroughput (PR39368)", erroneously thought that

    %340 = shufflevector <4 x float> %339, <4 x float> undef, <3 x i32> <i32 2, i32 3, i32 undef>

  is a subvector extract, even though it goes off the end of the parent vector with the undef index. That then caused an assert in BasicTTIImplBase::getExtractSubvectorOverhead.

  This commit fixes that, by not considering the above a subvector extract.

  Differential Revision: https://reviews.llvm.org/D70005
  Change-Id: I87b8b00b24bef19ffc9a1b82ef4eca3b8a246eaf
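The check described in this commit can be pictured with a small sketch. This is a simplified Python model, not LLVM's C++ implementation: the function name `extract_subvector_index`, the use of `None` for an undef mask element, and the restriction to indices taken from the first source vector are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def extract_subvector_index(
    mask: Sequence[Optional[int]], num_src_elts: int
) -> Optional[int]:
    """Return the start index if `mask` extracts a contiguous subvector.

    Hypothetical helper: `None` in `mask` stands for an undef element.
    Returns None when the mask is not a subvector extract.
    """
    # The result must be strictly smaller than the source vector.
    if len(mask) >= num_src_elts:
        return None

    offset: Optional[int] = None
    for pos, idx in enumerate(mask):
        if idx is None:
            continue  # undef elements match any offset
        if not 0 <= idx < num_src_elts:
            return None  # simplification: first source vector only
        current = idx - pos
        if offset is not None and current != offset:
            return None  # defined elements disagree on the start index
        offset = current

    if offset is None or offset < 0:
        return None
    # The whole extracted range, undef lanes included, must fit in the source.
    if offset + len(mask) > num_src_elts:
        return None
    return offset
```

With the mask from the commit message, `extract_subvector_index([2, 3, None], 4)` is rejected because a 3-element extract starting at index 2 would run past the end of a 4-element source, while `[2, 3]` against the same source is accepted with start index 2.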
* [X86] Improve cost model for subvector extraction of less than 128-bit vectors | Craig Topper | 2019-08-15 | 1 | -615/+853
  Now that we're using widening legalization, we need to improve our extract_subvector cost model for these types. This patch begins by modeling these as a subvector extract followed by a permute. I've left FIXMEs in the code for future improvements.

  Differential Revision: https://reviews.llvm.org/D65892
  llvm-svn: 369022
* [X86] Add missing regular 512-bit vXi8 extract subvector cost model tests | Simon Pilgrim | 2019-08-14 | 1 | -73/+421
  These tests don't cover many cases where the subvectors don't start on aligned indices, but that can be added later.

  llvm-svn: 368839
* [X86] Add some vXi8 extract subvector cost model tests | Simon Pilgrim | 2019-08-13 | 1 | -0/+367
  We don't have full 512-bit test coverage yet, but there's enough to help test D65892.

  llvm-svn: 368716
* Recommit r368081 "[X86] Add more extract subvector cost model tests for smaller element sizes and smaller than 128-bit vectors." | Craig Topper | 2019-08-07 | 1 | -7/+488
  llvm-svn: 368185
* Revert "[X86] Add more extract subvector cost model tests for smaller element sizes and smaller than 128-bit vectors." | Mitch Phillips | 2019-08-06 | 1 | -488/+7
  This reverts commit fc33e33776b7a7ce22e539f0ec2e3bfdb09ad361.

  This commit depends on the rolled back commit rL367901, and thus needs to be rolled back.

  llvm-svn: 368109
* [X86] Add more extract subvector cost model tests for smaller element sizes and smaller than 128-bit vectors. | Craig Topper | 2019-08-06 | 1 | -7/+488
  With the switch to widening legalization, we need to do a better job of costing extractions of less than 128 bits.

  llvm-svn: 368081
* [CostModel][X86] SK_ExtractSubvector is cheap if the (legal) subvector is aligned within the source vector | Simon Pilgrim | 2018-11-12 | 1 | -22/+22
  llvm-svn: 346664
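The aligned case from this commit, together with the start-of-vector case from r346538 further down this log, can be read as a short decision rule. The following is a rough Python sketch under stated assumptions, not the X86 cost model itself: the function and parameter names are hypothetical, and the value `1` used for "cheap" is a placeholder, since the commit titles only say "free" and "cheap".

```python
def x86_extract_subvector_cost(
    index: int,
    num_sub_elts: int,
    fallback_cost: int,
    sub_is_legal: bool = True,
    aligned_cost: int = 1,  # assumed placeholder for "cheap"
) -> int:
    """Hypothetical model of the two special cases named in the log."""
    # Subvector at the start of the source vector: treated as free.
    if index == 0:
        return 0
    # Legal subvector starting on a multiple of its own length: cheap.
    if sub_is_legal and index % num_sub_elts == 0:
        return aligned_cost
    # Anything else falls back to a more general estimate.
    return fallback_cost
```

Here `fallback_cost` stands in for whatever the generic model would report for an unaligned or non-legal extraction.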
* [CostModel] Add more realistic SK_ExtractSubvector generic costs. | Simon Pilgrim | 2018-11-12 | 1 | -28/+28
  Instead of defaulting to a cost = 1, expand to element extract/insert like we do for other shuffles.

  This exposes an issue in LoopVectorize which could call SK_ExtractSubvector with a scalar subvector type.

  llvm-svn: 346656
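Expanding to element extract/insert means the generic cost grows with the subvector length instead of staying at 1. A minimal Python sketch of that idea follows; the function name is hypothetical and the uniform per-element costs of 1 are an assumption, since real targets supply their own element costs.

```python
def extract_subvector_overhead(
    num_sub_elts: int,
    extract_cost: int = 1,  # assumed cost of one extractelement
    insert_cost: int = 1,   # assumed cost of one insertelement
) -> int:
    """Cost of building the subvector one element at a time."""
    total = 0
    for _ in range(num_sub_elts):
        # Each result lane is read from the source, then written to the result.
        total += extract_cost + insert_cost
    return total
```

Under these assumed unit costs, a 2-element subvector is costed at 4 rather than the previous flat 1.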
* [CostModel][X86] SK_ExtractSubvector is free if the subvector is at the start of the source vector | Simon Pilgrim | 2018-11-09 | 1 | -32/+80
  llvm-svn: 346538
* [CostModel] Add SK_ExtractSubvector handling to getInstructionThroughput (PR39368) | Simon Pilgrim | 2018-11-09 | 1 | -36/+36
  Add ShuffleVectorInst::isExtractSubvectorMask helper to match shuffle masks.

  llvm-svn: 346510
* [CostModel][X86] Add some initial extract/insert subvector shuffle cost tests | Simon Pilgrim | 2018-10-20 | 1 | -0/+91
  Just f64/i64 tests initially to demonstrate PR39368.

  llvm-svn: 344857