path: root/llvm/test/Transforms/SLPVectorizer/X86
Commit message | Author | Age | Files | Lines
...
* [SLPVectorizer] Support alternate opcodes in tryToVectorizeList | Simon Pilgrim | 2018-06-22 | 4 | -503/+261
  Enable tryToVectorizeList to support InstructionsState alternate opcode patterns at a root (build vector etc.) as well as further down the vectorization tree.
  NOTE: This patch reduces some of the debug reporting if there are opcode mismatches - I can try to add it back if it proves a problem. But it could get rather messy trying to provide equivalent verbose debug strings via getSameOpcode etc.
  Differential Revision: https://reviews.llvm.org/D48488
  llvm-svn: 335364
* [SLPVectorizer][X86] Add alternate opcode tests for simple build vector cases | Simon Pilgrim | 2018-06-22 | 2 | -0/+840
  llvm-svn: 335348
* [X86][AVX] Reduce v4f64/v4i64 shuffle costs (PR37882) | Simon Pilgrim | 2018-06-21 | 2 | -60/+20
  These were being overcautious about the costs of one/two op general shuffles - VSHUFPD doesn't have to replicate the same shuffle in both lanes like VSHUFPS does.
  llvm-svn: 335216
* [SLPVectorizer][X86] Add horizontal add/sub tests | Simon Pilgrim | 2018-06-21 | 2 | -0/+1094
  Shows PR37882 perf regression
  llvm-svn: 335215
* [SLPVectorizer] Relax "alternate" opcode vectorisation to work with any SK_Select shuffle pattern | Simon Pilgrim | 2018-06-20 | 1 | -18/+8
  D47985 saw the old SK_Alternate 'alternating' shuffle mask replaced with the SK_Select mask which accepts either input operand for each lane, equivalent to a vector select with a constant condition operand.
  This patch updates SLPVectorizer to make full use of this SK_Select shuffle pattern by removing the 'isOdd()' limitation.
  The AArch64 regression will be fixed by D48172.
  Differential Revision: https://reviews.llvm.org/D48174
  llvm-svn: 335130
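As a rough illustration of the SK_Select pattern described in this commit, the sketch below models a two-input shuffle mask in Python. It assumes the usual shufflevector convention, in which index i selects lane i of the first operand and index i + N selects lane i of the second. The helpers `is_select_mask` and `is_alternating_mask` are hypothetical names for this example, not LLVM APIs, and the even/odd check only approximates the spirit of the old restriction.

```python
def is_select_mask(mask):
    """True if every lane i takes lane i of either input (index i or i + n)."""
    n = len(mask)
    return all(m in (i, i + n) for i, m in enumerate(mask))


def is_alternating_mask(mask):
    """True for the stricter pattern: even lanes from input 0, odd lanes from input 1."""
    n = len(mask)
    return all(m == (i + n if i % 2 else i) for i, m in enumerate(mask))


# Lanes 0 and 1 come from the first input, lanes 2 and 3 from the second.
blend = [0, 1, 6, 7]
# Strict even/odd alternation between the two inputs.
addsub = [0, 5, 2, 7]
# Lane 0 reads lane 1 of the first input, so this is a general permute.
permute = [1, 0, 6, 7]
```

Under these assumptions, `blend` is a valid select mask even though it does not alternate lane by lane, which is the kind of pattern the relaxed check would admit.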
* [SLP][X86] Add AVX2 run to POW2 SDIV Tests | Simon Pilgrim | 2018-06-15 | 1 | -1/+2
  Non-uniform pow2 tests only make sense on targets with fast (low cost) non-uniform shifts
  llvm-svn: 334821
* [SLP][X86] Regenerate POW2 SDIV Tests | Simon Pilgrim | 2018-06-15 | 1 | -9/+91
  Added non-uniform pow2 test as well
  llvm-svn: 334819
* [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label. | Shiva Chen | 2018-05-09 | 1 | -1/+1
  In order to set breakpoints on labels and list source code around labels, we need to collect debug information for labels, i.e., the label name, the function the label belongs to, the line number in the file, and the address where the label is located. To keep this information in LLVM IR and to allow the backend to generate debug information correctly, we create a new kind of metadata for labels, DILabel. The format of DILabel is
      !DILabel(scope: !1, name: "foo", file: !2, line: 3)
  We hope to keep as much debug information as possible even when the code is optimized. So we create a new kind of intrinsic for label metadata, to prevent the metadata from being eliminated along with its basic block. The intrinsic persists as long as we keep it from being optimized out. The format of the intrinsic is
      llvm.dbg.label(metadata !1)
  It has only one argument, the DILabel metadata. The intrinsic immediately follows the label. The backend can get the label metadata through the intrinsic's parameter.
  We also create a DIBuilder API for labels, to be used by frontends. A frontend can use createLabel() to allocate DILabel objects and insertLabel() to insert the llvm.dbg.label intrinsic into LLVM IR.
  Differential Revision: https://reviews.llvm.org/D45024
  Patch by Hsiangkai Wang.
  llvm-svn: 331841
* [X86] Remove unnecessary -mattr to enable avx512bw when the -mcpu already enabled it. NFC | Craig Topper | 2018-04-16 | 1 | -1/+1
  This makes the test similar to the arith-sub.ll and arith-mul.ll tests.
  llvm-svn: 330144
* [SLP] Additional tests for reorder reuse vectorization, NFC. | Alexey Bataev | 2018-04-09 | 1 | -0/+230
  llvm-svn: 329603
* [SLPVectorizer][X86] Regenerate some tests. NFCI | Simon Pilgrim | 2018-04-04 | 5 | -74/+187
  llvm-svn: 329196
* [SLP] Fix PR36481: vectorize reassociated instructions. | Alexey Bataev | 2018-04-03 | 9 | -186/+126
  Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Patch does not support reordering of repeated instructions; this must be handled in a separate patch.
  Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D43776
  llvm-svn: 329085
* [SLP] Added tests for checks of reordering of the repeated instructions, NFC. | Alexey Bataev | 2018-04-03 | 1 | -0/+129
  llvm-svn: 329080
* Revert "[SLP] Fix PR36481: vectorize reassociated instructions." | Benjamin Kramer | 2018-04-03 | 8 | -122/+183
  This reverts commit r328980 and r329046. Makes the vectorizer crash.
  llvm-svn: 329071
* [SLP] Distinguish "demanded and shrinkable" from "demanded and not shrinkable" values when determining the minimum bitwidth | Haicheng Wu | 2018-04-03 | 1 | -8/+13
  We use two approaches for determining the minimum bitwidth.
    * Demanded bits
    * Value tracking
  If demanded bits doesn't result in a narrower type, we then try value tracking. We need this if we want to root SLP trees with the indices of getelementptr instructions since all the bits of the indices are demanded.
  But there is a missing piece though. We need to be able to distinguish "demanded and shrinkable" from "demanded and not shrinkable". For example, the bits of %i in
      %i = sext i32 %e1 to i64
      %gep = getelementptr inbounds i64, i64* %p, i64 %i
  are demanded, but we can shrink %i's type to i32 because it won't change the result of the getelementptr. On the other hand, in
      %tmp15 = sext i32 %tmp14 to i64
      %tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0
  it doesn't make sense to shrink %tmp15 and we can skip the value tracking.
  Ideas are from Matthew Simpson!
  Differential Revision: https://reviews.llvm.org/D44868
  llvm-svn: 329035
* [SLP] Fix PR36481: vectorize reassociated instructions. | Alexey Bataev | 2018-04-02 | 8 | -183/+122
  Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions.
  Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D43776
  llvm-svn: 328980
* [SLPVectorizer] Add tests related to PR30787, NFCI. | Dinar Temirbulatov | 2018-03-29 | 3 | -0/+338
  llvm-svn: 328813
* [SLP] Stop counting cost of gather sequences with multiple uses | Matthew Simpson | 2018-03-23 | 1 | -7/+8
  When building the SLP tree, we look for reuse among the vectorized tree entries. However, each gather sequence is represented by a unique tree entry, even though the sequence may be identical to another one. This means, for example, that a gather sequence with two uses will be counted twice when computing the cost of the tree. We should only count the cost of the definition of a gather sequence rather than its uses. During code generation, the redundant gather sequences are emitted, but we optimize them away with CSE. So it looks like this problem just affects the cost model.
  Differential Revision: https://reviews.llvm.org/D44742
  llvm-svn: 328316
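The cost-model change in this commit can be pictured with a small Python sketch. Everything here is assumed for illustration: tree entries are modelled as plain tuples, the per-entry costs are made-up numbers, and `tree_cost` is a hypothetical helper rather than the vectorizer's code. The point is only that an identical gather sequence contributes to the total once, however many entries refer to it.

```python
def tree_cost(entries, dedupe_gathers=True):
    """Sum entry costs, optionally counting each identical gather sequence once.

    Each entry is (kind, scalars, cost); `scalars` is a tuple of value names.
    """
    seen_gathers = set()
    total = 0
    for kind, scalars, cost in entries:
        if kind == "gather" and dedupe_gathers:
            if scalars in seen_gathers:
                continue  # same sequence already paid for at its definition
            seen_gathers.add(scalars)
        total += cost
    return total


# Made-up costs: negative means the vectorized form is cheaper than scalar.
entries = [
    ("vectorize", ("a0", "a1"), -2),
    ("gather", ("x", "y"), 2),
    ("vectorize", ("b0", "b1"), -2),
    ("gather", ("x", "y"), 2),  # second use of the same gather sequence
]
```

With these illustrative numbers, double-counting the gather makes the tree look like a break-even, while counting it once makes the tree look profitable.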
* [SLP] Additional tests for stores vectorization, NFC. | Alexey Bataev | 2018-03-05 | 1 | -0/+179
  llvm-svn: 326740
* [SLP] Added new tests and updated existing for jumbled load, NFC. | Mohammad Shahid | 2018-02-28 | 4 | -6/+468
  llvm-svn: 326303
* [X86][SSE] Reduce FADD/FSUB/FMUL costs on later targets (PR36280) | Simon Pilgrim | 2018-02-26 | 5 | -146/+109
  Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark.
  Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch.
  Differential Revision: https://reviews.llvm.org/D43733
  llvm-svn: 326133
* [SLP] Added new test + fixed some checks, NFC. | Alexey Bataev | 2018-02-26 | 2 | -13/+175
  llvm-svn: 326117
* [SLPVectorizer][X86] Add load extend tests (PR36091) | Simon Pilgrim | 2018-02-22 | 2 | -0/+1696
  llvm-svn: 325772
* [SLP] Fix test checks, NFC. | Alexey Bataev | 2018-02-21 | 1 | -15/+30
  llvm-svn: 325689
* revert r325515: [TTI CostModel] change default cost of FP ops to 1 (PR36280) | Sanjay Patel | 2018-02-21 | 5 | -113/+146
  There are too many perf regressions resulting from this, so we need to investigate (and add tests for) targets like ARM and AArch64 before trying to reinstate.
  llvm-svn: 325658
* [SLP] Fix tests checks, NFC. | Alexey Bataev | 2018-02-20 | 5 | -74/+249
  llvm-svn: 325605
* [TTI CostModel] change default cost of FP ops to 1 (PR36280) | Sanjay Patel | 2018-02-19 | 5 | -146/+113
  This change was mentioned at least as far back as:
      https://bugs.llvm.org/show_bug.cgi?id=26837#c26
  ...and I found a real program that is harmed by this: Himeno running on AMD Jaguar gets 6% slower with SLP vectorization:
      https://bugs.llvm.org/show_bug.cgi?id=36280
  ...but the change here appears to solve that bug only accidentally.
  The div/rem costs for x86 look very wrong in some cases, but that's already true, so we can fix those in follow-up patches. There's also evidence that more cost model changes are needed to solve SLP problems as shown in D42981, but that's an independent problem (though the solution may be adjusted after this change is made).
  Differential Revision: https://reviews.llvm.org/D43079
  llvm-svn: 325515
* [SLP] Fix the test for the reversed stores, NFC. | Alexey Bataev | 2018-02-15 | 1 | -18/+11
  llvm-svn: 325268
* [SLP] Added test for reversed stores, NFC. | Alexey Bataev | 2018-02-15 | 1 | -1/+64
  llvm-svn: 325265
* [SLP] Allow vectorization of reversed loads. | Alexey Bataev | 2018-02-14 | 1 | -6/+6
  Summary: Reversed loads are handled as gathering. But we can just reshuffle these values. Patch adds support for vectorization of reversed loads.
  Reviewers: RKSimon, spatel, mkuper, hfinkel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D43022
  llvm-svn: 325134
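The idea of treating reversed loads as one wide load plus a reshuffle can be sketched in Python as follows. This is only a model under stated assumptions: memory is a plain list, offsets are element indices from a common base, and `reversed_load_shuffle` and `apply_shuffle` are hypothetical helpers, not part of the SLP vectorizer.

```python
def reversed_load_shuffle(offsets):
    """Return a reverse shuffle mask if `offsets` are consecutive but descending.

    `offsets` are element offsets from a common base pointer, in the order the
    scalar loads are used. Returns None when the pattern does not match.
    """
    n = len(offsets)
    if n < 2:
        return None
    lowest = offsets[-1]
    if list(offsets) != [lowest + n - 1 - i for i in range(n)]:
        return None
    # One wide load starting at `lowest`, then lane i reads loaded lane n-1-i.
    return list(range(n - 1, -1, -1))


def apply_shuffle(vector, mask):
    """Permute `vector` according to a single-input shuffle mask."""
    return [vector[m] for m in mask]


memory = [10, 20, 30, 40, 50]
offsets = [3, 2, 1, 0]          # scalar loads walk the array backwards
mask = reversed_load_shuffle(offsets)
wide = memory[0:4]              # stands in for a single vector load
```

In this model, applying `mask` to `wide` reproduces the values the four scalar loads would have produced, so no per-element gather is needed.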
* [SLP] Take user instructions cost into consideration in insertelement vectorization. | Alexey Bataev | 2018-02-12 | 4 | -49/+45
  Summary: For a better vectorization result we should take into consideration the cost of the user insertelement instructions when we try to vectorize sequences that build the whole vector. I.e. if we have the following scalar code:
  ```
  <Scalar code>
  insertelement <ScalarCode>, ...
  ```
  we should consider the cost of the last `insertelement` instructions as the cost of the scalar code.
  Reviewers: RKSimon, spatel, hfinkel, mkuper
  Subscribers: javed.absar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D42657
  llvm-svn: 324893
* [SLPVectorizer] auto-generate complete checks; NFC | Sanjay Patel | 2018-02-08 | 1 | -18/+52
  llvm-svn: 324616
* [SLPVectorizer] auto-generate complete checks; NFC | Sanjay Patel | 2018-02-08 | 1 | -14/+43
  llvm-svn: 324615
* [SLPVectorizer] move RUN line to top-of-file; NFC | Sanjay Patel | 2018-02-08 | 1 | -1/+1
  I was confused about what we were checking because the RUN line was in the middle of the file.
  llvm-svn: 324614
* [SLPVectorizer] auto-generate complete checks; NFC | Sanjay Patel | 2018-02-08 | 1 | -30/+145
  llvm-svn: 324612
* [SLP] Add a tests for PR36280, NFC. | Alexey Bataev | 2018-02-07 | 1 | -0/+28
  llvm-svn: 324510
* [SLP] Update test checks, NFC. | Alexey Bataev | 2018-02-06 | 3 | -64/+1127
  llvm-svn: 324387
* [SLP] Add extra test for extractelement shuffle, NFC. | Alexey Bataev | 2018-01-30 | 1 | -0/+25
  llvm-svn: 323815
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-29 | 3 | -31/+31
  Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind.
  Reviewers: spatel, RKSimon, mkuper, hfinkel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D38697
  llvm-svn: 323662
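One way to read this commit message is that a tree entry with repeated values can be modelled as a vector of the unique values followed by a single-source permute. The Python sketch below shows that decomposition only; `dedupe_with_mask` is a hypothetical helper, and the actual cost accounting in the patch may differ.

```python
def dedupe_with_mask(scalars):
    """Split a lane list into its unique values plus a single-input shuffle mask."""
    unique = []
    index_of = {}
    mask = []
    for value in scalars:
        if value not in index_of:
            index_of[value] = len(unique)
            unique.append(value)
        mask.append(index_of[value])
    return unique, mask


lanes = ["a", "b", "a", "b"]
unique, mask = dedupe_with_mask(lanes)
# Rebuilding the lanes from the unique values is a one-input permute.
rebuilt = [unique[m] for m in mask]
```

Here only the two unique values need to be placed into a vector; the repeats are expressed by the mask rather than by extra per-lane inserts.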
* [SLP] Add a test with extract for PR32086, NFC. | Alexey Bataev | 2018-01-29 | 1 | -0/+33
  llvm-svn: 323661
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-27 | 3 | -24/+24
  This reverts commit r323530 to fix possible problems in users code.
  llvm-svn: 323581
* [SLP] Test for trunc vectorization, NFC. | Alexey Bataev | 2018-01-26 | 1 | -0/+33
  llvm-svn: 323556
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-26 | 3 | -24/+24
  Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind.
  Reviewers: spatel, RKSimon, mkuper, hfinkel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D38697
  llvm-svn: 323530
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-25 | 3 | -24/+24
  This reverts commit r323441 to fix buildbots.
  llvm-svn: 323447
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-25 | 3 | -24/+24
  Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind.
  Reviewers: spatel, RKSimon, mkuper, hfinkel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D38697
  llvm-svn: 323441
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-25 | 3 | -24/+24
  This reverts commit r323430 to fix buildbots.
  llvm-svn: 323432
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-25 | 3 | -24/+24
  Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind.
  Reviewers: spatel, RKSimon, mkuper, hfinkel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D38697
  llvm-svn: 323430
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-24 | 3 | -24/+24
  This reverts commit r323348 because of the broken buildbots.
  llvm-svn: 323359
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-24 | 3 | -24/+24
  Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind.
  Reviewers: spatel, RKSimon, mkuper, hfinkel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D38697
  llvm-svn: 323348
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-23 | 3 | -24/+24
  This reverts commit r323246 because of the broken buildbots.
  llvm-svn: 323252