summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/SLPVectorizer/X86/zext.ll
Commit message (Collapse)AuthorAgeFilesLines
* [x86] make SLM extract vector element more expensive than defaultSanjay Patel2019-11-271-330/+624
| | | | | | | | | | | | | | | | | | | I'm not sure what the effect of this change will be on all of the affected tests or a larger benchmark, but it fixes the horizontal add/sub problems noted here: https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc The costs are based on reciprocal throughput numbers in Agner's tables for PEXTR*; these appear to be very slow ops on Silvermont. This is a small step towards the larger motivation discussed in PR43605: https://bugs.llvm.org/show_bug.cgi?id=43605 Also, it seems likely that insert/extract is the source of perf regressions on other CPUs (up to 30%) that were cited as part of the reason to revert D59710, so maybe we'll extend the table-based approach to other subtargets. Differential Revision: https://reviews.llvm.org/D70607
* [X86][CostModel] Adjust the costs of ZERO_EXTEND/SIGN_EXTEND with less than ↵Craig Topper2019-08-141-36/+104
| | | | | | | | | | | | 128-bit inputs Now that we legalize by widening, the element types here won't change. Previously these were modeled as the elements being widened and then the instruction might become an AND or SHL/ASHR pair. But now they'll become something like a ZERO_EXTEND_VECTOR_INREG/SIGN_EXTEND_VECTOR_INREG. For AVX2, when the destination type is legal its clear the cost should be 1 since we have extend instructions that can produce 256 bit vectors from less than 128 bit vectors. I'm a little less sure about AVX1 costs, but I think the ones I changed were definitely too high, but they might still be too high. Differential Revision: https://reviews.llvm.org/D66169 llvm-svn: 368858
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-171-0/+785
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-171-785/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* Temporarily Revert "[X86][SLP] Enable SLP vectorization for 128-bit ↵Eric Christopher2019-02-201-26/+23
| | | | | | | | | | | | | | | | horizontal X86 instructions (add, sub)" As this has broken the lto bootstrap build for 3 days and is showing a significant regression on the Dither_benchmark results (from the LLVM benchmark suite) -- specifically, on the BENCHMARK_FLOYD_DITHER_128, BENCHMARK_FLOYD_DITHER_256, and BENCHMARK_FLOYD_DITHER_512; the others are unchanged. These have regressed by about 28% on Skylake, 34% on Haswell, and over 40% on Sandybridge. This reverts commit r353923. llvm-svn: 354434
* [X86][SLP] Enable SLP vectorization for 128-bit horizontal X86 instructions ↵Anton Afanasyev2019-02-131-23/+26
| | | | | | | | | | | | | (add, sub) Try to use 64-bit SLP vectorization. In addition to horizontal instrs this change triggers optimizations for partial vector operations (for instance, using low halfs of 128-bit registers xmm0 and xmm1 to multiply <2 x float> by <2 x float>). Fixes llvm.org/PR32433 llvm-svn: 353923
* [SLPVectorizer][X86] Add load extend tests (PR36091)Simon Pilgrim2018-02-221-0/+785
llvm-svn: 325772
OpenPOWER on IntegriCloud