summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Vectorize
Commit message (Collapse)AuthorAgeFilesLines
* Create masked gather and scatter intrinsics in Loop Vectorizer.Elena Demikhovsky2016-02-171-96/+184
| | | | | | | | | Loop vectorizer now knows to vectorize GEP and create masked gather and scatter intrinsics for random memory access. The feature is enabled on AVX-512 target. Differential Revision: http://reviews.llvm.org/D15690 llvm-svn: 261140
* Revert "Reapply commit r258404 with fix."David Majnemer2016-02-171-231/+11
| | | | | | This reverts commit r259357, it caused PR26629. llvm-svn: 261137
* [LV] Add support for insertelt/extractelt processing during type truncationSilviu Baranga2016-02-151-0/+14
| | | | | | | | | | | | | | | | | | Summary: While shrinking types according to the required bits, we can encounter insert/extract element instructions. This will cause us to reach an llvm_unreachable statement. This change adds support for truncating insert/extract element operations, and adds a regression test. Reviewers: jmolloy Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17078 llvm-svn: 260893
* [SLP] Add debug output for extract cost (NFC)Matthew Simpson2016-02-111-4/+6
| | | | llvm-svn: 260614
* [SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memorySilviu Baranga2016-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sanitizer issue. The PredicatedScalarEvolution's copy constructor wasn't copying the Generation value, and was leaving it un-initialized. Original commit message: [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: (sext ({x,+,y}) -> {sext(x),+,sext(y)} (zext ({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260112
* [SLP] Fix placement of debug statement (NFC)Igor Breger2016-02-081-7/+7
| | | | | | | | By Ayal Zaks (ayal.zaks@intel.com) Differential Revision: http://reviews.llvm.org/D16976 llvm-svn: 260094
* Revert r260086 and r260085. They have broken the memorySilviu Baranga2016-02-081-1/+1
| | | | | | sanitizer bots. llvm-svn: 260087
* [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided ↵Silviu Baranga2016-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | pointer detection Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: (sext ({x,+,y}) -> {sext(x),+,sext(y)} (zext ({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260085
* [SCEV] Try to reuse existing value during SCEV expansionWei Mi2016-02-041-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current SCEV expansion will expand SCEV as a sequence of operations and doesn't utilize the value already existed. This will introduce redundent computation which may not be cleaned up throughly by following optimizations. This patch introduces an ExprValueMap which is a map from SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations. The original commit triggered regressions in Polly tests. The regressions exposed two problems which have been fixed in current version. 1. Polly will generate a new function based on the old one. To generate an instruction for the new function, it builds SCEV for the old instruction, applies some tranformation on the SCEV generated, then expands the transformed SCEV and insert the expanded value into new function. Because SCEV expansion may reuse value cached in ExprValueMap, the value in old function may be inserted into new function, which is wrong. In SCEVExpander::expand, there is a logic to check the cached value to be used should dominate the insertion point. However, for the above case, the check always passes. That is because the insertion point is in a new function, which is unreachable from the old function. However for unreachable node, DominatorTreeBase::dominates thinks it will be dominated by any other node. The fix is to simply add a check that the cached value to be used in expansion should be in the same function as the insertion point instruction. 2. When the SCEV is of scConstant type, expanding it directly is cheaper than reusing a normal value cached. Although in the cached value set in ExprValueMap, there is a Constant type value, but it is not easy to find it out -- the cached Value set is not sorted according to the potential cost. Existing reuse logic in SCEVExpander::expand simply chooses the first legal element from the cached value set. The fix is that when the SCEV is of scConstant type, don't try the reuse logic. simply expand it. Differential Revision: http://reviews.llvm.org/D12090 llvm-svn: 259736
* Minor code cleanups. NFC.Junmo Park2016-02-031-18/+18
| | | | llvm-svn: 259725
* Revert r259662, which caused regressions on polly tests.Wei Mi2016-02-031-16/+4
| | | | llvm-svn: 259675
* [SCEV] Try to reuse existing value during SCEV expansionWei Mi2016-02-031-4/+16
| | | | | | | | | | | | | | | | Current SCEV expansion will expand SCEV as a sequence of operations and doesn't utilize the value already existed. This will introduce redundent computation which may not be cleaned up throughly by following optimizations. This patch introduces an ExprValueMap which is a map from SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations. Differential Revision: http://reviews.llvm.org/D12090 llvm-svn: 259662
* [LV] Rename RdxPHIsToFix to PHIsToFix (NFC)Matthew Simpson2016-02-011-39/+32
| | | | | | | | | | In the future, we will vectorize recurrences other than reductions. This patch renames a few variables and updates their associated comments to enable them to be reused for non-reduction PHI nodes. This change was requested in the review for D16197. llvm-svn: 259364
* Reapply commit r258404 with fix.Matthew Simpson2016-02-011-11/+231
| | | | | | | The previous patch caused PR26364. The fix is to ensure that we don't enter a cycle when iterating over use-def chains. llvm-svn: 259357
* [SLP] Fix printing of debug statement (NFC)Matthew Simpson2016-01-291-1/+1
| | | | llvm-svn: 259212
* Revert "Reapply commit r258404 with fix"David Majnemer2016-01-291-226/+11
| | | | | | This reverts commit r258929, it caused PR26364. llvm-svn: 259148
* Reapply commit r258404 with fixMatthew Simpson2016-01-271-11/+226
| | | | | | | | | | | | | | This patch is the second attempt to reapply commit r258404. There was bug in the initial patch and subsequent fix (mentioned below). The initial patch caused an assertion because we were computing smaller type sizes for instructions that cannot be demoted. The fix first determines the instructions that will be demoted, and then applies the smaller type size to only those instructions. This should fix PR26239 and PR26307. llvm-svn: 258929
* [SLPVectorizer] Swap the checking order of isCommutative and isConsecutiveAccessHaicheng Wu2016-01-271-4/+4
| | | | | | NFC llvm-svn: 258909
* Remove autoconf supportChris Bieneman2016-01-261-15/+0
| | | | | | | | | | | | | | | | Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened." - Obi Wan Kenobi Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16471 llvm-svn: 258861
* Revert "Reapply commit r258404 with fix"Matthew Simpson2016-01-261-226/+11
| | | | | | | | This commit exposes a crash in computeKnownBits on the Chromium buildbots. Reverting to investigate. Reference: https://llvm.org/bugs/show_bug.cgi?id=26307 llvm-svn: 258812
* [LIR] Add support for structs and hand unrolled loopsHaicheng Wu2016-01-261-78/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a recommit of r258620 which causes PR26293. The original message: Now LIR can turn following codes into memset: typedef struct foo { int a; int b; } foo_t; void bar(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; ++i) { f[i].a = 0; f[i].b = 0; } } void test(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; i += 2) { f[i] = 0; f[i+1] = 0; } } llvm-svn: 258777
* Reapply commit r25804 with fixMatthew Simpson2016-01-251-11/+226
| | | | | | | | | | | We were hitting an assertion because we were computing smaller type sizes for instructions that cannot be demoted. The fix first determines the instructions that will be demoted, and then applies the smaller type size to only those instructions. This should fix PR26239. llvm-svn: 258705
* Speculatively revert r258620 as it is the likely culprid of PR26293.Quentin Colombet2016-01-251-11/+78
| | | | llvm-svn: 258703
* [LIR] Add support for structs and hand unrolled loopsHaicheng Wu2016-01-231-78/+11
| | | | | | | | | | | | | | | | | | | | | | | | | Now LIR can turn following codes into memset: typedef struct foo { int a; int b; } foo_t; void bar(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; ++i) { f[i].a = 0; f[i].b = 0; } } void test(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; i += 2) { f[i] = 0; f[i+1] = 0; } } llvm-svn: 258620
* Revert "[SLP] Truncate expressions to minimum required bit width"Matthew Simpson2016-01-211-143/+11
| | | | | | This reverts commit r258404. llvm-svn: 258408
* [SLP] Truncate expressions to minimum required bit widthMatthew Simpson2016-01-211-11/+143
| | | | | | | | | | | | | | | This change attempts to produce vectorized integer expressions in bit widths that are narrower than their scalar counterparts. The need for demotion arises especially on architectures in which the small integer types (e.g., i8 and i16) are not legal for scalar operations but can still be used in vectors. Like similar work done within the loop vectorizer, we rely on InstCombine to perform the actual type-shrinking. We use the DemandedBits analysis and ComputeNumSignBits from ValueTracking to determine the minimum required bit width of an expression. Differential revision: http://reviews.llvm.org/D15815 llvm-svn: 258404
* Reapply r257800 with fixMatthew Simpson2016-01-151-42/+228
| | | | | | | | | | | | | | | | | | | The fix uniques the bundle of getelementptr indices we are about to vectorize since it's possible for the same index to be used by multiple instructions. The original commit message is below. [SLP] Vectorize the index computations of getelementptr instructions. This patch seeds the SLP vectorizer with getelementptr indices. The primary motivation in doing so is to vectorize gather-like idioms beginning with consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these cases could be vectorized with a top-down phase, seeding the existing bottom-up phase with the index computations avoids the complexity, compile-time, and phase ordering issues associated with a full top-down pass. Only bundles of single-index getelementptrs with non-constant differences are considered for vectorization. llvm-svn: 257918
* Revert "[SLP] Vectorize the index computations of getelementptr instructions."Matthew Simpson2016-01-151-217/+41
| | | | | | This reverts commit r257800. llvm-svn: 257888
* [SLP] Vectorize the index computations of getelementptr instructions.Matthew Simpson2016-01-141-41/+217
| | | | | | | | | | | | | | | This patch seeds the SLP vectorizer with getelementptr indices. The primary motivation in doing so is to vectorize gather-like idioms beginning with consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these cases could be vectorized with a top-down phase, seeding the existing bottom-up phase with the index computations avoids the complexity, compile-time, and phase ordering issues associated with a full top-down pass. Only bundles of single-index getelementptrs with non-constant differences are considered for vectorization. Differential Revision: http://reviews.llvm.org/D14829 llvm-svn: 257800
* Remove extra whitespace. NFC.Junmo Park2016-01-131-10/+10
| | | | llvm-svn: 257578
* rangify; NFCISanjay Patel2016-01-121-12/+10
| | | | llvm-svn: 257500
* function names start with a lower case letter ; NFCSanjay Patel2016-01-121-1/+1
| | | | llvm-svn: 257496
* [LV] Avoid creating empty reduction entries (NFC)Matthew Simpson2016-01-061-6/+6
| | | | | | | | | | This patch prevents us from unintentionally creating entries in the reductions map for PHIs that are not actually reductions. This is currently not an issue since we bail out if we encounter PHIs other than inductions or reductions. However the behavior could become problematic as we add support for additional recurrence types. llvm-svn: 256930
* [SCEV] Add and use SCEVConstant::getAPInt; NFCISanjoy Das2015-12-171-2/+2
| | | | llvm-svn: 255921
* [SLPVectorizer] Ensure dominated reduction values.Charlie Turner2015-12-161-7/+19
| | | | | | | | | | | | | | | | When considering incoming values as part of a reduction phi, ensure the incoming value is dominated by said phi. Failing to ensure this property causes miscompiles. Fixes PR25787. Many thanks to Mattias Eriksson for reporting, reducing and analyzing the problem for me. Differential Revision: http://reviews.llvm.org/D15580 llvm-svn: 255792
* [LoopVectorizer] Refine loop vectorizer's register usage calculator by ↵Cong Hou2015-12-151-31/+106
| | | | | | | | | | | | | | | | | | | | | | ignoring specific instructions. (This is the third attempt to check in this patch, and the first two are r255454 and r255460. The once failed test file reg-usage.ll is now moved to test/Transform/LoopVectorize/X86 directory with target datalayout and target triple indicated.) LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set so that their register usage won't be considered any more. Differential revision: http://reviews.llvm.org/D15177 llvm-svn: 255691
* Revert r255460, which still causes test failures on some platforms.Cong Hou2015-12-131-106/+31
| | | | | | Further investigation on the failures is ongoing. llvm-svn: 255463
* [LoopVectorizer] Refine loop vectorizer's register usage calculator by ↵Cong Hou2015-12-131-31/+106
| | | | | | | | | | | | | | | | | | | | ignoring specific instructions. (This is the second attempt to check in this patch: REQUIRES: asserts is added to reg-usage.ll now.) LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set so that their register usage won't be considered any more. Differential revision: http://reviews.llvm.org/D15177 llvm-svn: 255460
* Revert r255454 as it leads to several test failers on buildbots.Cong Hou2015-12-131-106/+31
| | | | llvm-svn: 255456
* [LoopVectorizer] Refine loop vectorizer's register usage calculator by ↵Cong Hou2015-12-131-31/+106
| | | | | | | | | | | | | | | | | ignoring specific instructions. LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set so that their register usage won't be considered any more. Differential revision: http://reviews.llvm.org/D15177 llvm-svn: 255454
* AlignmentFromAssumptions and SLPVectorizer preserves AA and GlobalsAAHal Finkel2015-12-111-0/+3
| | | | | | | | | | | | GlobalsAA's assumptions that passes do not escape globals not previously escaped is not violated by AlignmentFromAssumptions and SLPVectorizer. Marking them as such allows GlobalsAA to be preserved until GVN in the LTO pipeline. http://lists.llvm.org/pipermail/llvm-dev/2015-December/092972.html Patch by Vaivaswatha Nagaraj! llvm-svn: 255348
* Re-commit r255115, with the PredicatedScalarEvolution class moved toSilviu Baranga2015-12-091-85/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform and Analysis modules: [LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere. This also solves a problem involving the result of SCEV expression rewritting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates P1: {a,+,b} has nsw P2: b = 1. Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, and therefore avoiding this issue. The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewritting results. This also has the benefit of reducing the overall number of expression rewrites. Reviewers: mzolotukhin, anemet Subscribers: jmolloy, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14296 llvm-svn: 255122
* Revert r255115 until we figure out how to fix the bot failures.Silviu Baranga2015-12-091-80/+85
| | | | llvm-svn: 255117
* [LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV ↵Silviu Baranga2015-12-091-85/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | expressions Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere. This also solves a problem involving the result of SCEV expression rewritting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates P1: {a,+,b} has nsw P2: b = 1. Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, and therefore avoiding this issue. The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewritting results. This also has the benefit of reducing the overall number of expression rewrites. Reviewers: mzolotukhin, anemet Subscribers: jmolloy, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14296 llvm-svn: 255115
* Fix a typo in LoopVectorize.cpp. NFC.Cong Hou2015-12-051-1/+1
| | | | llvm-svn: 254813
* Fix a typo in LoopVectorize.cpp. NFC.Cong Hou2015-12-021-1/+1
| | | | llvm-svn: 254549
* [LoopVectorize] Use MapVector rather than DenseMap for MinBWs.Charlie Turner2015-11-261-3/+3
| | | | | | | | | | | | | | | | | The order in which instructions are truncated in truncateToMinimalBitwidths effects code generation. Switch to a map with a determinisic order, since the iteration order over a DenseMap is not defined. This code is not hot, so the difference in container performance isn't interesting. Many thanks to David Blaikie for making me aware of MapVector! Fixes PR25490. Differential Revision: http://reviews.llvm.org/D14981 llvm-svn: 254179
* [LV] Add a helper function, isReductionVariable. NFC.Chad Rosier2015-11-191-5/+7
| | | | llvm-svn: 253565
* Fix several long lines (>80) in LoopVectorize.cpp. NFC.Cong Hou2015-11-191-13/+19
| | | | llvm-svn: 253527
* Typo.Chad Rosier2015-11-171-1/+1
| | | | llvm-svn: 253336
OpenPOWER on IntegriCloud