path: root/llvm/test/Transforms/SLPVectorizer/X86
Commit message (Author, Age, Files, Lines)
...
* [SLP] Vectorize loads of consecutive memory accesses, accessed in a non-consecutive (jumbled) way. (Mohammad Shahid, 2017-01-28, 3 files, -36/+22)
  The jumbled scalar loads will be sorted while building the tree, and these accesses will be marked to generate a shufflevector with the proper mask after the vectorized load.
  Reviewers: hfinkel, mssimpso, mkuper
  Differential Revision: https://reviews.llvm.org/D26905
  Change-Id: I9c0c8e6f91a00076a7ee1465440a3f6ae092f7ad
  llvm-svn: 293386
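To illustrate the idea of jumbled loads described above, here is a minimal Python sketch. It is not the SLP vectorizer's code: the function names `jumbled_load_mask` and `apply_shuffle` and the list-based model of memory are assumptions made for illustration. It checks that a group of scalar load offsets covers a consecutive range and, if so, derives the shuffle mask that restores the original lane order after one wide load.

```python
def jumbled_load_mask(offsets):
    """Return (base, mask) if the offsets form a consecutive range, else None.

    `offsets` lists the element offset read by each scalar lane, in lane order.
    mask[lane] is the index inside the wide (sorted) load that the lane needs.
    This is a hypothetical model, not LLVM's implementation.
    """
    if not offsets:
        return None
    ordered = sorted(offsets)
    # Consecutive means every neighbouring pair differs by exactly one element.
    if any(b - a != 1 for a, b in zip(ordered, ordered[1:])):
        return None
    base = ordered[0]
    return base, [off - base for off in offsets]


def apply_shuffle(vector, mask):
    """Model of a single-input shufflevector: pick vector[i] for each mask index."""
    return [vector[i] for i in mask]


# Usage: four scalar loads that touch memory[0..3] in jumbled order.
memory = [10, 11, 12, 13]
lanes = [1, 3, 2, 0]
base, mask = jumbled_load_mask(lanes)
wide_load = memory[base:base + len(lanes)]
shuffled = apply_shuffle(wide_load, mask)
```

In this model the shuffled wide load equals the values the scalar loads would have produced, which is the property the mask has to guarantee.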
* [SLP] Add one more reduction operation for extra argument test to make it vectorizable. (Alexey Bataev, 2017-01-26, 1 file, -2/+8)
  llvm-svn: 293162
* [SLP] Fixed test for extra arguments in horizontal reductions. (Alexey Bataev, 2017-01-26, 1 file, -3/+5)
  llvm-svn: 293153
* [SLP] Extra test for functionality with extra args. (Alexey Bataev, 2017-01-25, 1 file, -2/+4)
  llvm-svn: 293076
* [SLP] Improve horizontal vectorization for non-power-of-2 number of instructions. (Alexey Bataev, 2017-01-25, 1 file, -98/+72)
  If the number of instructions in a horizontal reduction list is not a power of 2, then only the last PowerOf2Floor(NumberOfInstructions) elements are actually vectorized and the other instructions remain scalar. The patch tries to vectorize the remaining elements as well.
  Differential Revision: https://reviews.llvm.org/D28959
  llvm-svn: 293042
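The arithmetic behind PowerOf2Floor can be sketched in Python as follows. This is only an illustration under assumptions: `reduction_chunks` and its `min_width` parameter are hypothetical names, and repeatedly peeling off the largest power-of-2 chunk is one plausible way to cover the remaining elements, not necessarily what the patch does.

```python
def power_of_2_floor(n):
    """Largest power of two that is <= n (0 for n <= 0)."""
    return 0 if n <= 0 else 1 << (n.bit_length() - 1)


def reduction_chunks(num_instructions, min_width=2):
    """Split a reduction list into power-of-2 sized chunks.

    Returns (chunk_widths, leftover_scalars). `min_width` is a hypothetical
    lower bound on the width worth vectorizing.
    """
    if min_width < 1:
        raise ValueError("min_width must be at least 1")
    chunks = []
    remaining = num_instructions
    while remaining >= min_width:
        width = power_of_2_floor(remaining)
        chunks.append(width)
        remaining -= width
    return chunks, remaining
```

For 31 reduction elements, a single pass would cover only `power_of_2_floor(31)` = 16 of them, while the chunked sketch also covers widths 8, 4 and 2 and leaves one scalar.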
* [SLP] Additional test for checking that instruction with extra args is not reconstructed. (Alexey Bataev, 2017-01-24, 1 file, -0/+57)
  llvm-svn: 292911
* [SLP] Additional test with extra args in horizontal reductions. (Alexey Bataev, 2017-01-23, 1 file, -0/+63)
  llvm-svn: 292821
* [SLP] Additional test for SLP vectorizer with 31 reduction elements. (Alexey Bataev, 2017-01-23, 1 file, -0/+196)
  llvm-svn: 292783
* [SLP] Initial test for fix of PR31690. (Alexey Bataev, 2017-01-20, 1 file, -0/+203)
  llvm-svn: 292631
* [SLP] A new test for horizontal vectorization for non-power-of-2 instructions. (Alexey Bataev, 2017-01-20, 1 file, -0/+290)
  llvm-svn: 292626
* [SLP] Add a base test for jumbled store (Mohammad Shahid, 2017-01-20, 1 file, -0/+68)
  Change-Id: I905ce08a02c76a6896dcfd9629547417c99adc4a
  llvm-svn: 292581
* [SLP] Add a test for a fix for PR30787. (Alexey Bataev, 2017-01-18, 1 file, -0/+984)
  Add a test for PR30787: Failure to beneficially vectorize 'copyable' elements in integer binary ops.
  llvm-svn: 292416
* [SLP] Remove bogus assert. (Michael Kuperstein, 2017-01-11, 1 file, -0/+30)
  The removed assert seems bogus - it's perfectly legal for the roots of the vectorized subtrees to be equal even if the original scalar values aren't, if the original scalars happen to be equivalent.
  This fixes PR31599.
  Differential Revision: https://reviews.llvm.org/D28539
  llvm-svn: 291692
* Revert r290970 [SLPVectorizer] Regenerate test. (Simon Pilgrim, 2017-01-04, 1 file, -1/+1)
  The check script will use var names before they are declared, which FileCheck doesn't like.
  llvm-svn: 290971
* [SLPVectorizer] Regenerate test. (Simon Pilgrim, 2017-01-04, 1 file, -1/+1)
  Missed var name
  llvm-svn: 290970
* Regenerate test. (Simon Pilgrim, 2017-01-04, 1 file, -5/+10)
  llvm-svn: 290969
* [InstCombine] Canonicalize insert splat sequences into an insert + shuffle (Michael Kuperstein, 2016-12-28, 1 file, -6/+6)
  This adds a combine that canonicalizes a chain of inserts which broadcasts a value into a single insert + a splat shufflevector.
  This fixes PR31286.
  Differential Revision: https://reviews.llvm.org/D27992
  llvm-svn: 290641
* [TEST] Initial commit of tests for minmax horizontal reductions. (Alexey Bataev, 2016-12-15, 1 file, -0/+1725)
  llvm-svn: 289817
* [SLP] Fix sign-extends for type-shrinking (Matthew Simpson, 2016-12-12, 1 file, -0/+72)
  This patch ensures the correct minimum bit width during type-shrinking. Previously when type-shrinking, we always sign-extended values back to their original width. However, if we are going to sign-extend, and the sign bit is unknown, we have to increase the minimum bit width by one bit so the sign-extend will fill the upper bits correctly. If the sign bit is known to be zero, we can perform a zero-extend instead.
  This should fix PR31243.
  Reference: https://llvm.org/bugs/show_bug.cgi?id=31243
  Differential Revision: https://reviews.llvm.org/D27466
  llvm-svn: 289470
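The need for the extra bit can be seen with a small worked example. The Python helpers below (`trunc`, `sext`, `zext`) are hypothetical models of the corresponding IR casts on unsigned bit patterns, written only to illustrate the arithmetic. The value 200 fits in 8 bits when treated as unsigned, but sign-extending it back from 8 bits corrupts the upper bits.

```python
def trunc(value, bits):
    """Keep only the low `bits` bits of value (model of an IR trunc)."""
    return value & ((1 << bits) - 1)


def sext(value, from_bits, to_bits):
    """Sign-extend a from_bits-wide pattern to to_bits (result as a bit pattern)."""
    value = trunc(value, from_bits)
    sign_bit = 1 << (from_bits - 1)
    signed = (value ^ sign_bit) - sign_bit  # interpret as two's complement
    return trunc(signed, to_bits)


def zext(value, from_bits, to_bits):
    """Zero-extend: the upper bits of the wider value are simply zero."""
    return trunc(value, from_bits)


x = 200  # original i32 value; bit 7 is set

wrong = sext(trunc(x, 8), 8, 32)   # 8 bits + sext: bit 7 is read as a sign bit
fixed = sext(trunc(x, 9), 9, 32)   # one extra bit keeps the sign bit clear
via_zext = zext(trunc(x, 8), 8, 32)  # valid only if the sign bit is known zero
```

Here `wrong` is `0xFFFFFFC8`, while both `fixed` and `via_zext` recover 200. With 9 bits, the pattern for -56 (`0x1C8`) and the pattern for 200 (`0x0C8`) stay distinct, which is why the extra bit is enough when the sign is unknown.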
* [Verifier] Add verification for TBAA metadata (Sanjoy Das, 2016-12-11, 1 file, -1/+1)
  Summary: This change adds some verification in the IR verifier around struct path TBAA metadata. Other than some basic sanity checks (e.g. we get constant integers where we expect constant integers), this checks:
  - That by the time a struct access tuple `(base-type, offset)` is "reduced" to a scalar base type, the offset is `0`. For instance, in C++ you can't start from, say `("struct-a", 16)`, and end up with `("int", 4)` -- by the time the base type is `"int"`, the offset better be zero. In particular, a variant of this invariant is needed for `llvm::getMostGenericTBAA` to be correct.
  - That there are no cycles in a struct path.
  - That struct type nodes have their offsets listed in an ascending order.
  - That when generating the struct access path, you eventually reach the access type listed in the tbaa tag node.
  Reviewers: dexonsmith, chandlerc, reames, mehdi_amini, manmanren
  Subscribers: mcrosier, llvm-commits
  Differential Revision: https://reviews.llvm.org/D26438
  llvm-svn: 289402
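Two of the checks listed above (ascending offsets and no cycles) can be sketched on a simplified model. This Python sketch is an assumption-laden illustration, not the verifier's code: the dictionary layout, the name `verify_struct_types`, and the error strings are all hypothetical and do not reflect the actual TBAA metadata encoding.

```python
def verify_struct_types(types):
    """Check a simplified model of struct type nodes.

    `types` maps a struct name to [(member_type_name, offset), ...].
    Names that are not keys of `types` are treated as scalar types.
    Returns a list of error strings (empty if both checks pass).
    """
    errors = []

    # Check 1: member offsets must be listed in ascending order.
    for name, members in types.items():
        offsets = [offset for _, offset in members]
        if offsets != sorted(offsets):
            errors.append(f"{name}: member offsets are not ascending")

    # Check 2: no cycles, found with a depth-first search.
    IN_PROGRESS, DONE = 1, 2
    state = {}

    def reaches_cycle(name):
        if name not in types:  # scalar type: end of the path
            return False
        if state.get(name) == IN_PROGRESS:
            return True
        if state.get(name) == DONE:
            return False
        state[name] = IN_PROGRESS
        found = any(reaches_cycle(member) for member, _ in types[name])
        state[name] = DONE
        return found

    for name in types:
        if reaches_cycle(name):
            errors.append(f"{name}: cycle in struct path")
    return errors
```

A well-formed input such as `{"a": [("int", 0), ("b", 4)], "b": [("int", 0)]}` yields no errors in this model, while two structs that contain each other are reported as a cycle.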
* [SLP] Fix for PR6246: vectorization for scalar ops on vector elements. (Alexey Bataev, 2016-12-08, 2 files, -774/+366)
  When trying to vectorize trees that start at insertelement instructions, the function tryToVectorizeList() uses a vectorization factor calculated as MinVecRegSize/ScalarTypeSize. But sometimes this does not work, because the tree cost for this fixed vectorization factor is too high. The patch tries to improve the situation. It tries different vectorization factors from max(PowerOf2Floor(NumberOfVectorizedValues), MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries to choose the best one.
  Differential Revision: https://reviews.llvm.org/D27215
  llvm-svn: 289043
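The range of vectorization factors named in this commit message can be sketched as follows. This is a hedged Python illustration: `candidate_vfs`, `pick_vf`, and the `tree_cost` callback are hypothetical names, and stepping through the range by halving is an assumption, since the message gives only the two endpoints.

```python
def power_of_2_floor(n):
    """Largest power of two that is <= n (0 for n <= 0)."""
    return 0 if n <= 0 else 1 << (n.bit_length() - 1)


def candidate_vfs(num_values, min_vec_reg_size, scalar_type_size):
    """List candidate vectorization factors from the largest to the smallest.

    Upper bound: max(PowerOf2Floor(num_values), MinVecRegSize/ScalarTypeSize).
    Lower bound: MinVecRegSize/ScalarTypeSize (clamped to at least 1 here).
    Halving between the bounds is an assumption of this sketch.
    """
    min_vf = max(1, min_vec_reg_size // scalar_type_size)
    vf = max(power_of_2_floor(num_values), min_vf)
    vfs = []
    while vf >= min_vf:
        vfs.append(vf)
        vf //= 2
    return vfs


def pick_vf(num_values, min_vec_reg_size, scalar_type_size, tree_cost):
    """Choose the candidate with the lowest cost under a caller-supplied model."""
    return min(candidate_vfs(num_values, min_vec_reg_size, scalar_type_size),
               key=tree_cost)
```

With 8 values of 32 bits and a 128-bit minimum register size, the candidates in this sketch are 8 and 4, so a cost model gets to compare the wider factor against the previously fixed one.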
* [SLPVectorizer][X86] Tests to show missed buildvector sitofp/fptosi vectorizations (Simon Pilgrim, 2016-12-06, 2 files, -0/+100)
  e.g. buildvector(sitofp(i32), sitofp(i32), sitofp(i32), sitofp(i32)) --> sitofp(buildvector(i32, i32, i32, i32))
  llvm-svn: 288807
* Revert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements." (Renato Golin, 2016-12-02, 2 files, -366/+774)
  This reverts commit r288497, as it broke the AArch64 build of Compiler-RT's builtins (twice: once in r288412 and once in r288497).
  We should investigate this offline.
  llvm-svn: 288508
* [SLP] Fix for PR6246: vectorization for scalar ops on vector elements. (Alexey Bataev, 2016-12-02, 2 files, -774/+366)
  When trying to vectorize trees that start at insertelement instructions, the function tryToVectorizeList() uses a vectorization factor calculated as MinVecRegSize/ScalarTypeSize. But sometimes this does not work, because the tree cost for this fixed vectorization factor is too high. The patch tries to improve the situation. It tries different vectorization factors from max(PowerOf2Floor(NumberOfVectorizedValues), MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries to choose the best one.
  Differential Revision: https://reviews.llvm.org/D27215
  llvm-svn: 288497
* [SLPVectorizer][X86] Add tests for vectorization of buildvector of scalar fp-ops (PR6246) (Simon Pilgrim, 2016-12-02, 1 file, -0/+1573)
  llvm-svn: 288492
* Revert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements." (Artem Belevich, 2016-12-01, 1 file, -62/+114)
  This reverts r288412 which causes severe compile-time regression.
  llvm-svn: 288431
* [SLP] Fix for PR6246: vectorization for scalar ops on vector elements. (Alexey Bataev, 2016-12-01, 1 file, -114/+62)
  When trying to vectorize trees that start at insertelement instructions, the function tryToVectorizeList() uses a vectorization factor calculated as MinVecRegSize/ScalarTypeSize. But sometimes this does not work, because the tree cost for this fixed vectorization factor is too high. The patch tries to improve the situation. It tries different vectorization factors from max(PowerOf2Floor(NumberOfVectorizedValues), MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries to choose the best one.
  Differential Revision: https://reviews.llvm.org/D27215
  llvm-svn: 288412
* [SLP] Fixed cost model for horizontal reduction. (Alexey Bataev, 2016-12-01, 1 file, -4/+4)
  Currently, when the cost of scalar operations is evaluated, the vector type is used for scalar operations. The patch fixes this issue and fixes evaluation of the vector operations cost.
  Several tests showed that the vector cost model is too optimistic. It allowed vectorization of 8 or fewer add/fadd operations, though scalar code is faster. Actually, only for 16 or more operations does vector code provide better performance.
  Differential Revision: https://reviews.llvm.org/D26277
  llvm-svn: 288398
* [SLP] Additional tests with the cost of vector operations. (Alexey Bataev, 2016-12-01, 1 file, -1/+19)
  llvm-svn: 288377
* Revert "[SLP] Additional tests with the cost of vector operations." (Alexey Bataev, 2016-12-01, 1 file, -18/+1)
  This reverts commit a61718435fc4118c82f8aa6133fd81f803789c1e.
  llvm-svn: 288371
* [SLP] Additional tests with the cost of vector operations. (Alexey Bataev, 2016-12-01, 1 file, -1/+18)
  llvm-svn: 288369
* [SLP] Add a new test for tree vectorization starting from insertelement instruction. (Alexey Bataev, 2016-11-29, 1 file, -33/+508)
  llvm-svn: 288148
* [SLPVectorizer] Improved support of partial tree vectorization. (Alexey Bataev, 2016-11-29, 1 file, -87/+74)
  Currently the SLP vectorizer tries to vectorize a binary operation and stops immediately after the first unsuccessful attempt. The patch tries to improve the situation by trying to vectorize all binary operations of all children nodes in the binop tree.
  Differential Revision: https://reviews.llvm.org/D25517
  llvm-svn: 288115
* [SLP] Add new and update existing lit test for providing more context to incoming patch for vectorization of jumbled load (Mohammad Shahid, 2016-11-27, 2 files, -4/+103)
  Change-Id: Ifb9091bb0f84c1937c2c8bd2fc345734f250d2f9
  llvm-svn: 287992
* [X86][AVX512] Add support for v2i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets (Simon Pilgrim, 2016-11-24, 1 file, -8/+23)
  Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances
  llvm-svn: 287882
* [SLP] Add more tests for SLP Vectorizer. (Alexey Bataev, 2016-11-23, 1 file, -0/+302)
  llvm-svn: 287801
* [X86][AVX512] Add support for v4i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets (Simon Pilgrim, 2016-11-23, 1 file, -28/+70)
  Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances
  llvm-svn: 287762
* [CostModel][X86] Add missing AVX512DQ v8i64 fptosi/sitofp costs (Simon Pilgrim, 2016-11-23, 1 file, -52/+118)
  llvm-svn: 287760
* [AVX-512] Support FCOPYSIGN for v16f32 and v8f64 (Craig Topper, 2016-11-18, 1 file, -20/+34)
  Summary: This extends FCOPYSIGN support to 512-bit vectors. I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads.
  Reviewers: delena, zvi, RKSimon, spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D26791
  llvm-svn: 287298
* Fixed the lost FastMathFlags for CALL operations in SLPVectorizer. (Vyacheslav Klochkov, 2016-11-16, 2 files, -2/+39)
  Reviewer: Michael Zolotukhin.
  Differential Revision: https://reviews.llvm.org/D26575
  llvm-svn: 287064
* Fixed the lost FastMathFlags for FCmp operations in SLPVectorizer. (Vyacheslav Klochkov, 2016-11-11, 1 file, -0/+52)
  Reviewer: Michael Zolotukhin.
  Differential Revision: https://reviews.llvm.org/D26543
  llvm-svn: 286626
* [VectorLegalizer] Expansion of CTLZ using CTPOP when possible (Simon Pilgrim, 2016-11-08, 1 file, -550/+52)
  This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available.
  This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when it's useful.
  Differential Revision: https://reviews.llvm.org/D25910
  llvm-svn: 286233
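The CTLZ-via-CTPOP trick referenced from "Hacker's Delight" can be shown on a single scalar. This Python sketch is an illustration only: the function name `ctlz_via_ctpop` and the default 32-bit width are assumptions, and the real expansion works on vector nodes rather than Python integers. The idea is to smear the highest set bit into every lower position, then count the zero bits that remain above it.

```python
def ctlz_via_ctpop(x, bits=32):
    """Count leading zeros of a `bits`-wide value using only or/shift/not/popcount."""
    mask = (1 << bits) - 1
    x &= mask
    # Smear the highest set bit rightwards: x |= x >> 1, x >> 2, x >> 4, ...
    shift = 1
    while shift < bits:
        x |= x >> shift
        shift <<= 1
    # Every bit at or below the leading one is now set, so the population
    # count of the complement is exactly the number of leading zeros.
    return bin(~x & mask).count("1")
```

For example, `ctlz_via_ctpop(0x00010000)` gives 15, and an input of zero gives the full width of 32.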
* [SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer. (Alexey Bataev, 2016-10-27, 1 file, -58/+153)
  After a successful horizontal reduction vectorization attempt for a PHI node, the vectorizer tries to update the root binary op by combining the vectorized tree and the ReductionPHI node. But during vectorization this ReductionPHI can be vectorized itself and replaced by the `undef` value, while the instruction itself is marked for deletion. This 'marked for deletion' PHI node can then be used in a new binary operation, causing a "Use still stuck around after Def is destroyed" crash upon PHI node deletion.
  Also the test is fixed to make it perform actual testing.
  Differential Revision: https://reviews.llvm.org/D25671
  llvm-svn: 285286
* [X86][SSE] Add lowering to cvttpd2dq/cvttps2dq for fptosi v2f64/2f32 to 2i32 (Simon Pilgrim, 2016-10-18, 1 file, -24/+6)
  As discussed on PR28461 we currently miss the chance to lower "fptosi <2 x double> %arg to <2 x i32>" to cvttpd2dq due to its use of illegal types.
  This patch adds support for fptosi to 2i32 from both 2f64 and 2f32.
  It also recognises that cvttpd2dq zeroes the upper 64 bits of the xmm result (similar to D23797) - we still don't do this for the cvttpd2dq/cvttps2dq intrinsics - this can be done in a future patch.
  Differential Revision: https://reviews.llvm.org/D23808
  llvm-svn: 284459
* [SLPVectorizer][X86] Add 512-bit sitofp/uitofp tests (Simon Pilgrim, 2016-10-10, 2 files, -16/+1214)
  llvm-svn: 283756
* [SLPVectorizer][X86] Add avx512 sitofp/uitofp tests (Simon Pilgrim, 2016-10-10, 2 files, -56/+147)
  llvm-svn: 283751
* [SLPVectorizer][X86] Fixed alignments of scalar loads in sitofp/uitofp tests (Simon Pilgrim, 2016-10-10, 2 files, -456/+456)
  Fixed copy+paste vector alignment to correct for per-element scalar loads
  Increased to 512-bit data sizes in preparation of avx512 tests
  llvm-svn: 283748
* [SLPVectorizer] Fix for PR25748: reduction vectorization after loop unrolling. (Alexey Bataev, 2016-10-07, 2 files, -14/+62)
  The following code is not vectorized by the SLPVectorizer:
  ```
  int test(unsigned int *p) {
    int sum = 0;
    for (int i = 0; i < 8; i++)
      sum += p[i];
    return sum;
  }
  ```
  During optimization this loop is fully unrolled and SLPVectorizer is unable to vectorize it. Patch tries to fix this problem.
  Differential Revision: https://reviews.llvm.org/D24796
  llvm-svn: 283535
* [SLPVectorizer] Add a test with non-vectorizable IR. (Alexey Bataev, 2016-10-04, 1 file, -0/+290)
  llvm-svn: 283225
* [x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics (PR30433) (Sanjay Patel, 2016-10-03, 1 file, -176/+100)
  This should fix: https://llvm.org/bugs/show_bug.cgi?id=30433
  There are a couple of open questions about the codegen:
  1. Should we let scalar ops be scalars and avoid vector constant loads/splats?
  2. Should we have a pass to combine constants such as the inverted pair that we have here?
  Differential Revision: https://reviews.llvm.org/D25165
  llvm-svn: 283119