path: root/llvm/test/Transforms/SLPVectorizer
Commit message | Author | Age | Files | Lines
...
* Reapply "[SLP] Initialize VectorizedValue when gathering"Matthew Simpson2016-08-201-0/+87
| | | | | | | | | | | The test case included in r279125 exposed existing undefined behavior in the SLP vectorizer that it did not introduce. This patch reapplies the original patch, but modifies the test case to avoid hitting the undefined behavior. This allows us to close PR28330 while keeping the UBSan bot happy. The undefined behavior the original test uncovered will be addressed in a follow-on patch. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 llvm-svn: 279370
* Revert "[SLP] Initialize VectorizedValue when gathering" to fix ubsan bot.Vitaly Buka2016-08-201-95/+0
| | | | | | | | This reverts commit r279125. https://reviews.llvm.org/D23410 llvm-svn: 279363
* [SLP] Initialize VectorizedValue when gatheringMatthew Simpson2016-08-181-0/+95
| | | | We abort building vectorizable trees in some cases (e.g., if the maximum recursion depth is reached, if the region size is too large, etc.). If this happens for a reduction, we can be left with a root entry that needs to be gathered. For these cases, we need to make sure we actually set VectorizedValue to the resulting vector. This patch ensures we properly set VectorizedValue, and it also ensures the insertelement sequence generated for the gathers is inserted at the correct location. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 Differential Revision: https://reviews.llvm.org/D23410 llvm-svn: 279125
* [X86][SSE] Add initial costs for vector CTTZ/CTLZSimon Pilgrim2016-08-042-994/+858
| | | | llvm-svn: 277716
* [SLPVectorizer][X86] Added vXi8/vXi16 sitofp/uitofp testsSimon Pilgrim2016-07-302-38/+484
| | | | | | Dropped useless 2i32-2f32 test llvm-svn: 277281
* [SLPVectorizer][X86] Added SITOFP/UITOFP vectorization testsSimon Pilgrim2016-07-302-0/+528
| | | | llvm-svn: 277275
* [SLPVectorizer] Vectorize reverse-order loads in horizontal reductionsMichael Kuperstein2016-07-221-0/+49
| | | | | | | | | | | | | | | | | | When vectorizing a tree rooted at a store bundle, we currently try to sort the stores before building the tree, so that the stores can be vectorized. For other trees, the order of the root bundle - which determines the order of all other bundles - is arbitrary. That is bad, since if a leaf bundle of consecutive loads happens to appear in the wrong order, we will not vectorize it. This is partially mitigated when the root is a binary operator, by trying to build a "reversed" tree when that's considered profitable. This patch extends the workaround we have for binops to trees rooted in a horizontal reduction. This fixes PR28474. Differential Revision: https://reviews.llvm.org/D22554 llvm-svn: 276477
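The ordering problem described above can be sketched as follows. This is a simplified illustration, not the pass's actual logic: the function names are hypothetical, and a leaf bundle of loads is modelled as a plain list of element offsets.

```python
def is_consecutive(offsets):
    """True if each offset is exactly one element past the previous one."""
    return all(b - a == 1 for a, b in zip(offsets, offsets[1:]))


def pick_load_order(offsets):
    """Return 'forward', 'reversed', or None for a bundle of load offsets.

    Hypothetical helper: a bundle that is only consecutive when read
    backwards is the case the "reversed" tree is meant to rescue.
    """
    if is_consecutive(offsets):
        return "forward"
    if is_consecutive(offsets[::-1]):
        return "reversed"
    return None
```

Under these assumptions, a bundle such as [3, 2, 1, 0] is rejected in its given order but accepted once the root order is reversed.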
* [X86][SSE] Add cost model values for CTPOP of vectorsSimon Pilgrim2016-07-201-35/+144
| | | | This patch adds costs for the vectorized implementations of CTPOP. The default values seriously underestimated these costs and encouraged vectorization on targets where serialized use of POPCNT would be much better. Differential Revision: https://reviews.llvm.org/D22456 llvm-svn: 276104
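Why an underestimated vector cost encourages vectorization can be shown with a small sketch. It assumes a decision rule of the form "vectorize when the vector cost minus the total scalar cost falls below a negated threshold"; the function name, parameters, and all cost numbers are made up for illustration and are not taken from the X86 cost tables.

```python
def should_vectorize(scalar_cost_per_lane, lanes, vector_cost, threshold=0):
    """Hypothetical profitability check for one bundle of operations."""
    # A negative tree cost means the vector form is estimated to be cheaper.
    tree_cost = vector_cost - scalar_cost_per_lane * lanes
    return tree_cost < -threshold
```

With a scalar cost of 1 per lane over 4 lanes, an estimated vector cost of 2 selects vectorization, while a more realistic estimate of 10 keeps the scalar code.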
* [SLPVectorizer][X86] Added sqrt vectorization testsSimon Pilgrim2016-07-181-0/+274
| | | | llvm-svn: 275788
* [SLPVectorizer][X86] Added fma vectorization testsSimon Pilgrim2016-07-081-0/+562
| | | | llvm-svn: 274889
* Vector GEP test: renamed + some commentsElena Demikhovsky2016-07-061-1/+8
| | | | | | Differential revision: http://reviews.llvm.org/D21957 llvm-svn: 274611
* Fixed crash of SLP Vectorizer on KNLElena Demikhovsky2016-06-271-0/+17
| | | | | | | The bug is connected to vector GEPs. https://llvm.org/bugs/show_bug.cgi?id=28313 llvm-svn: 273919
* [SLPVectorizer][X86] Added ceil/floor/nearbyint/rint/trunc vectorization testsSimon Pilgrim2016-06-221-0/+2158
| | | | llvm-svn: 273420
* [X86][SSE] Add cost model for BSWAP of vectorsSimon Pilgrim2016-06-201-134/+68
| | | | The BSWAP of vector types is implemented quite efficiently using vector shuffles on SSE/AVX targets, so the cost model should reflect its typical cost to encourage vectorization. Differential Revision: http://reviews.llvm.org/D21521 llvm-svn: 273217
* [PM] Port SLPVectorizer to the new PMSean Silva2016-06-151-0/+1
| | | | | | | | | | | This uses the "runImpl" approach to share code with the old PM. Porting to the new PM meant abandoning the anonymous namespace enclosing most of SLPVectorizer.cpp which is a bit of a bummer (but not a big deal compared to having to pull the pass class into a header which the new PM requires since it calls the constructor directly). llvm-svn: 272766
* [CostModel][X86][SSE] Updated costs for vector BITREVERSE ops on SSSE3+ targetsSimon Pilgrim2016-06-111-268/+43
| | | | | | To account for the fast PSHUFB implementation now available llvm-svn: 272484
* [SLPVectorizer] Handle GEP with differing constant index typesMichael Zolotukhin2016-06-081-0/+22
| | | | | | | | | | | | | | | | | | | Summary: This fixes PR27617. Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64. The patch includes a simple relaxation of the assert to allow constants being of different types, along with a regression test that will provoke the unrelaxed assert. Reviewers: nadav, mzolotukhin Subscribers: JesperAntonsson, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D20685 Patch by Jesper Antonsson! llvm-svn: 272206
* [Analysis] Enabled BITREVERSE as a vectorizable intrinsicSimon Pilgrim2016-06-041-288/+633
| | | | | | Allows XOP to vectorize BITREVERSE - other targets will follow as their costmodels improve. llvm-svn: 271803
* [SLP] Pass in correct alignment when query memory access costGuozhi Wei2016-05-312-0/+31
| | | | This patch fixes bug https://llvm.org/bugs/show_bug.cgi?id=27897. When querying the memory access cost, SLP currently always passes in an alignment value of 1 (unaligned), so it gets a very high cost for scalar memory accesses and wrongly vectorizes the memory loads in the test case. This can be fixed by simply passing in the correct alignment. llvm-svn: 271333
* [SLPVectorizer][X86] Regenerated SEXT/ZEXT cast vectorization testsSimon Pilgrim2016-05-061-8/+94
| | | | | | Added 256-bit vector test as well llvm-svn: 268811
* [SLPVectorizer][X86] Added BSWAP/BITREVERSE vectorization testsSimon Pilgrim2016-05-062-0/+934
| | | | llvm-svn: 268803
* [SLPVectorizer][X86] Added CTPOP/CTLZ/CTTZ vectorization testsSimon Pilgrim2016-05-063-0/+2847
| | | | llvm-svn: 268800
* [SLPVectorizer] Add operand bundles to vectorized functionsDavid Majnemer2016-04-291-0/+48
| | | | | | | SLPVectorizing a call site should result in further propagation of its bundles. llvm-svn: 268004
* [SLPVectorizer] Extend SLP Vectorizer to deal with aggregates.Arch D. Robison2016-04-281-0/+189
| | | | The refactoring portion was done as r267748. http://reviews.llvm.org/D14185 llvm-svn: 267899
* [TTI] Add hook for vector extract with extensionMatthew Simpson2016-04-271-19/+14
| | | | | | | | | | | | | | | This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725
* [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.Adrian Prantl2016-04-151-3/+2
| | | | Currently each Function points to a DISubprogram and DISubprogram has a scope field. For member functions the scope is a DICompositeType. DIScopes point to the DICompileUnit to facilitate type uniquing. Distinct DISubprograms (with isDefinition: true) are not part of the type hierarchy and cannot be uniqued. This change removes the subprograms list from DICompileUnit and instead adds a pointer to the owning compile unit to distinct DISubprograms. This would make it easy for ThinLTO to strip unneeded DISubprograms and their transitively referenced debug info. Motivation ---------- Materializing DISubprograms is currently the most expensive operation when doing a ThinLTO build of clang. We want the DISubprogram to be stored in a separate Bitcode block (or the same block as the function body) so we can avoid having to expensively deserialize all DISubprograms together with the global metadata. If a function has been inlined into another subprogram we need to store a reference to the block containing the inlined subprogram. Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a Python script that updates LLVM IR testcases to the new format. http://reviews.llvm.org/D19034 <rdar://problem/25256815> llvm-svn: 266446
* [SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnanDavid Majnemer2016-04-061-2/+2
| | | | | | | | | | | | | | To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0 (which allows for better optimization, because there is no need to worry about errno being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt." This means that it's unsafe to replace sqrt with llvm.sqrt unless the call is annotated with nnan. Thanks to Hal Finkel for pointing this out! llvm-svn: 265521
* [SLPVectorizer] Vectorize libcalls of sqrtDavid Majnemer2016-04-061-0/+23
| | | | | | | We didn't realize that we could transform the libcall into a vectorized intrinsic. llvm-svn: 265493
* [SLPVectorizer] Don't insert an extractelement before a catchswitchDavid Majnemer2016-04-011-0/+50
| | | | | | | | | | | | | A catchswitch cannot be preceded by another instruction in the same basic block (other than a PHI node). Instead, insert the extract element right after the materialization of the vectorized value. This isn't optimal but is a reasonable compromise given the constraints of WinEH. This fixes PR27163. llvm-svn: 265157
* testcase gardening: update the emissionKind enum to the new syntax. (NFC)Adrian Prantl2016-04-011-1/+1
| | | | llvm-svn: 265081
* Move the DebugEmissionKind enum from DIBuilder into DICompileUnit.Adrian Prantl2016-03-311-1/+1
| | | | | | | | | | | | | This mostly cosmetic patch moves the DebugEmissionKind enum from DIBuilder into DICompileUnit. DIBuilder is not the right place for this enum to live in — a metadata consumer should not have to include DIBuilder.h. I also added a Verifier check that checks that the emission kind of a DICompileUnit is actually legal. http://reviews.llvm.org/D18612 <rdar://problem/25427165> llvm-svn: 265077
* Fix tests that used CHECK-NEXT-NOT and CHECK-DAG-NOT.Paul Robinson2016-02-261-1/+1
| | | | | | | | FileCheck actually doesn't support combo suffixes. Differential Revision: http://reviews.llvm.org/D17588 llvm-svn: 262054
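A quick way to spot such directives in a test file is a small scan like the one below. It is a hypothetical helper, not part of FileCheck or the LLVM test tooling, and it only looks for the two combined suffixes named in the commit title.

```python
import re

# Matches the unsupported combined suffixes after any check prefix,
# e.g. "CHECK-NEXT-NOT:" or "AVX-DAG-NOT:".
COMBO_SUFFIX = re.compile(r"-(?:NEXT|DAG)-NOT:")


def find_combo_suffixes(text):
    """Return (line_number, stripped_line) pairs that use a combined suffix."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), 1)
        if COMBO_SUFFIX.search(line)
    ]
```

Plain CHECK-NOT, CHECK-NEXT and CHECK-DAG lines are left alone; only lines that chain one of the suffixes with -NOT are reported.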
* Reapply commit r259357 with a fix for PR26629Matthew Simpson2016-02-182-13/+47
| | | | | | | | | | Commit r259357 was reverted because it caused PR26629. We were assuming all roots of a vectorizable tree could be truncated to the same width, which is not the case in general. This commit reapplies the patch along with a fix and a new test case to ensure we don't regress because of this issue again. This should fix PR26629. llvm-svn: 261212
* Revert "Reapply commit r258404 with fix."David Majnemer2016-02-171-18/+13
| | | | | | This reverts commit r259357, it caused PR26629. llvm-svn: 261137
* Add test case missing from r259357 (NFC)Matthew Simpson2016-02-011-0/+26
| | | | llvm-svn: 259385
* Reapply commit r258404 with fix.Matthew Simpson2016-02-011-13/+18
| | | | | | | The previous patch caused PR26364. The fix is to ensure that we don't enter a cycle when iterating over use-def chains. llvm-svn: 259357
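The usual way to avoid re-entering a cycle while walking use-def chains is a visited set, sketched below. This is a generic worklist traversal under stated assumptions, not the code from the patch: values are plain strings, and operands_of is a hypothetical mapping from a value to its operands.

```python
def collect_operands(root, operands_of):
    """Visit every value reachable from root through operand edges, once each."""
    seen = set()
    order = []
    worklist = [root]
    while worklist:
        value = worklist.pop()
        if value in seen:
            # Already handled: this is what stops a phi/add cycle from looping.
            continue
        seen.add(value)
        order.append(value)
        worklist.extend(operands_of.get(value, ()))
    return order
```

For a loop-carried phi whose operand is an add that uses the phi again, the traversal ends after visiting each of the four values once.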
* Revert "Reapply commit r258404 with fix"David Majnemer2016-01-291-18/+13
| | | | | | This reverts commit r258929, it caused PR26364. llvm-svn: 259148
* Reapply commit r258404 with fixMatthew Simpson2016-01-271-13/+18
| | | | This patch is the second attempt to reapply commit r258404. There was a bug in the initial patch and in the subsequent fix (mentioned below). The initial patch caused an assertion because we were computing smaller type sizes for instructions that cannot be demoted. The fix first determines the instructions that will be demoted, and then applies the smaller type size to only those instructions. This should fix PR26239 and PR26307. llvm-svn: 258929
* Revert "Reapply commit r258404 with fix"Matthew Simpson2016-01-261-18/+13
| | | | | | | | This commit exposes a crash in computeKnownBits on the Chromium buildbots. Reverting to investigate. Reference: https://llvm.org/bugs/show_bug.cgi?id=26307 llvm-svn: 258812
* Reapply commit r258404 with fixMatthew Simpson2016-01-251-13/+18
| | | | | | | | | | | We were hitting an assertion because we were computing smaller type sizes for instructions that cannot be demoted. The fix first determines the instructions that will be demoted, and then applies the smaller type size to only those instructions. This should fix PR26239. llvm-svn: 258705
* Revert "[SLP] Truncate expressions to minimum required bit width"Matthew Simpson2016-01-211-18/+13
| | | | | | This reverts commit r258404. llvm-svn: 258408
* [SLP] Truncate expressions to minimum required bit widthMatthew Simpson2016-01-211-13/+18
| | | | | | | | | | | | | | | This change attempts to produce vectorized integer expressions in bit widths that are narrower than their scalar counterparts. The need for demotion arises especially on architectures in which the small integer types (e.g., i8 and i16) are not legal for scalar operations but can still be used in vectors. Like similar work done within the loop vectorizer, we rely on InstCombine to perform the actual type-shrinking. We use the DemandedBits analysis and ComputeNumSignBits from ValueTracking to determine the minimum required bit width of an expression. Differential revision: http://reviews.llvm.org/D15815 llvm-svn: 258404
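The arithmetic behind "minimum required bit width" can be illustrated on known constants. The sketch below is an assumption-laden simplification: the pass reasons conservatively about IR values through analyses, whereas this code simply counts sign bits of concrete integers. Both function names are hypothetical.

```python
def num_sign_bits(value, width):
    """Count the leading bits of a width-bit two's-complement value
    that equal its sign bit (the sign bit itself included)."""
    bits = value & ((1 << width) - 1)
    sign = (bits >> (width - 1)) & 1
    count = 0
    for position in range(width - 1, -1, -1):
        if (bits >> position) & 1 != sign:
            break
        count += 1
    return count


def min_signed_width(values, width):
    """Smallest signed bit width that can hold every value in the bundle."""
    return max(width - num_sign_bits(v, width) + 1 for v in values)
```

For example, the 32-bit values 5, -3 and 100 all fit in 8 signed bits, so under these assumptions a bundle of them could be computed in i8 lanes rather than i32 lanes.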
* Reapply r257800 with fixMatthew Simpson2016-01-152-0/+369
| | | | | | | | | | | | | | | | | | | The fix uniques the bundle of getelementptr indices we are about to vectorize since it's possible for the same index to be used by multiple instructions. The original commit message is below. [SLP] Vectorize the index computations of getelementptr instructions. This patch seeds the SLP vectorizer with getelementptr indices. The primary motivation in doing so is to vectorize gather-like idioms beginning with consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these cases could be vectorized with a top-down phase, seeding the existing bottom-up phase with the index computations avoids the complexity, compile-time, and phase ordering issues associated with a full top-down pass. Only bundles of single-index getelementptrs with non-constant differences are considered for vectorization. llvm-svn: 257918
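Uniquing a bundle while keeping first-occurrence order can be sketched in one line. Whether the actual fix preserves order in exactly this way is an assumption; the helper name and the string stand-ins for index values are hypothetical.

```python
def unique_bundle(indices):
    """Drop repeated index values, keeping the first occurrence of each."""
    # dict preserves insertion order, so this is an order-preserving dedup.
    return list(dict.fromkeys(indices))
```

If two getelementptrs share the same index value, the bundle handed to the vectorizer then contains that index once instead of twice.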
* Revert "[SLP] Vectorize the index computations of getelementptr instructions."Matthew Simpson2016-01-152-369/+0
| | | | | | This reverts commit r257800. llvm-svn: 257888
* Reapply r257105 "[Verifier] Check that debug values have proper size"Keno Fischer2016-01-151-2/+2
| | | | I originally reapplied this in r257550, but had to revert again due to bot breakage. The only change in this version is to allow either the TypeSize or the TypeAllocSize of the variable to be the one represented in debug info (hopefully in the future we can figure out how to encode the difference). Additionally, several bot failures following r257550 were due to optimizer bugs now fixed in r257787 and r257795. r257550 commit message was: ``` The following extra changes were made to test cases: Manually making the variable be the actual type instead of a pointer to avoid pointer-size differences in generic code: LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll LLVM :: DebugInfo/Generic/varargs.ll Delete sizing information from debug info for the same reason (but the presence of the pointer was important to the test case): LLVM :: DebugInfo/Generic/restrict.ll LLVM :: DebugInfo/Generic/tu-composite.ll LLVM :: Linker/type-unique-type-array-a.ll LLVM :: Linker/type-unique-simple2.ll Fixing an incorrect DW_OP_deref LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll Fixing a missing DW_OP_deref LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll Additionally, clang should no longer complain during bootstrap after r257534. The original commit message was: `` Summary: Teach the Verifier to make sure that the storage size given to llvm.dbg.declare or the value size given to llvm.dbg.value agree with what is declared in DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA). Additionally this catches a number of common mistakes, such as passing a pointer when a value was intended or vice versa. 
One complication comes from stack coloring which modifies the original IR when it merges allocas in order to make sure that if AA falls back to the IR it gets the correct result. However, given this new invariant, indiscriminately replacing one alloca by a different (differently sized) one is no longer valid. Fix this by just undefing out any use of the alloca in a dbg.declare in this case. Additionally, I had to fix a number of test cases. Of particular note: - I regenerated dbg-changes-codegen-branch-folding.ll from the given source as it was affected by the bug fixed in r256077 - two-cus-from-same-file.ll was changed to avoid having a variable-typed debug variable as that would depend on the target, even though this test is supposed to be generic - I had to manually declare size/align for reference type. See also the discussion for D14275/r253186. - fpstack-debuginstr-kill.ll required changing `double` to `long double` - most others were just a question of adding OP_deref `` ``` llvm-svn: 257850
* [SLP] Vectorize the index computations of getelementptr instructions.Matthew Simpson2016-01-142-0/+369
| | | | | | | | | | | | | | | This patch seeds the SLP vectorizer with getelementptr indices. The primary motivation in doing so is to vectorize gather-like idioms beginning with consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these cases could be vectorized with a top-down phase, seeding the existing bottom-up phase with the index computations avoids the complexity, compile-time, and phase ordering issues associated with a full top-down pass. Only bundles of single-index getelementptrs with non-constant differences are considered for vectorization. Differential Revision: http://reviews.llvm.org/D14829 llvm-svn: 257800
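The gather-like idiom mentioned above, g[a[0] - b[0]] + g[a[1] - b[1]] + ..., can be written out as a small sketch. It only illustrates the shape of the computation; the function name is hypothetical and nothing here models the vectorizer itself.

```python
def gather_sum(g, a, b):
    """Sum g[a[i] - b[i]] over all lanes i."""
    # Index computations: consecutive loads from a and b feed one
    # subtraction per lane. This is the part a vector subtract could cover.
    indices = [x - y for x, y in zip(a, b)]
    # Gather-like loads from g, one per computed index, followed by the sum.
    return sum(g[i] for i in indices)
```

Seeding with the getelementptr indices corresponds to treating the list of subtractions as the bundle to vectorize, while the loads from g remain a separate step in this sketch.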
* Re-Revert r257105 (Verifier debug info changes)Keno Fischer2016-01-131-2/+2
| | | | | | | While I investigate some new buildbot failures. This was originally reapplied as r257550 and r257558. llvm-svn: 257563
* Reapply r257105 "[Verifier] Check that debug values have proper size"Keno Fischer2016-01-131-2/+2
| | | | The following extra changes were made to test cases: Manually making the variable be the actual type instead of a pointer to avoid pointer-size differences in generic code: LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll LLVM :: DebugInfo/Generic/varargs.ll Delete sizing information from debug info for the same reason (but the presence of the pointer was important to the test case): LLVM :: DebugInfo/Generic/restrict.ll LLVM :: DebugInfo/Generic/tu-composite.ll LLVM :: Linker/type-unique-type-array-a.ll LLVM :: Linker/type-unique-simple2.ll Fixing an incorrect DW_OP_deref LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll Fixing a missing DW_OP_deref LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll Additionally, clang should no longer complain during bootstrap after r257534. The original commit message was: ``` Summary: Teach the Verifier to make sure that the storage size given to llvm.dbg.declare or the value size given to llvm.dbg.value agree with what is declared in DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA). Additionally this catches a number of common mistakes, such as passing a pointer when a value was intended or vice versa. One complication comes from stack coloring which modifies the original IR when it merges allocas in order to make sure that if AA falls back to the IR it gets the correct result. However, given this new invariant, indiscriminately replacing one alloca by a different (differently sized) one is no longer valid. Fix this by just undefing out any use of the alloca in a dbg.declare in this case. Additionally, I had to fix a number of test cases. 
Of particular note: - I regenerated dbg-changes-codegen-branch-folding.ll from the given source as it was affected by the bug fixed in r256077 - two-cus-from-same-file.ll was changed to avoid having a variable-typed debug variable as that would depend on the target, even though this test is supposed to be generic - I had to manually declare size/align for reference type. See also the discussion for D14275/r253186. - fpstack-debuginstr-kill.ll required changing `double` to `long double` - most others were just a question of adding OP_deref ``` llvm-svn: 257550
* Temporarily revert r257105 "[Verifier] Check that debug values have proper size"Keno Fischer2016-01-071-2/+2
| | | | | | | Looks like there's a case where clang generates debug info that triggers the new verifier check. Reverting while investigating. llvm-svn: 257107
* [Verifier] Check that debug values have proper sizeKeno Fischer2016-01-071-2/+2
| | | | Summary: Teach the Verifier to make sure that the storage size given to llvm.dbg.declare or the value size given to llvm.dbg.value agree with what is declared in DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA). Additionally this catches a number of common mistakes, such as passing a pointer when a value was intended or vice versa. One complication comes from stack coloring which modifies the original IR when it merges allocas in order to make sure that if AA falls back to the IR it gets the correct result. However, given this new invariant, indiscriminately replacing one alloca by a different (differently sized) one is no longer valid. Fix this by just undefing out any use of the alloca in a dbg.declare in this case. Additionally, I had to fix a number of test cases. Of particular note: - I regenerated dbg-changes-codegen-branch-folding.ll from the given source as it was affected by the bug fixed in r256077 - two-cus-from-same-file.ll was changed to avoid having a variable-typed debug variable as that would depend on the target, even though this test is supposed to be generic - I had to manually declare size/align for reference type. See also the discussion for D14275/r253186. - fpstack-debuginstr-kill.ll required changing `double` to `long double` - most others were just a question of adding OP_deref Reviewers: aprantl Differential Revision: http://reviews.llvm.org/D14276 llvm-svn: 257105