summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Vectorize
Commit message (Collapse)AuthorAgeFilesLines
...
* Apply another batch of fixes from clang-tidy's ↵Benjamin Kramer2016-06-171-7/+7
| | | | | | | | performance-unnecessary-value-param. Contains some manual fixes. No functionality change intended. llvm-svn: 273047
* [LV] Move management of symbolic strides to LAA. NFCIAdam Nemet2016-06-161-37/+9
| | | | | | | | | | | | | | | | | | | This is still NFCI, so the list of clients that allow symbolic stride speculation does not change (yes: LV and LoopVersioningLICM, no: LLE, LDist). However since the symbolic strides are now managed by LAA rather than passed by client a new bool parameter is used to enable symbolic stride speculation. The existing test Transforms/LoopVectorize/version-mem-access.ll checks that stride speculation is performed for LV. The previously added test Transforms/LoopLoadElim/symbolic-stride.ll ensures that no speculation is performed for LLE. The next patch will change the functionality and turn on symbolic stride speculation in all of LAA's clients and remove the bool parameter. llvm-svn: 272970
* [LV] Make getSymbolicStrides return a pointer rather than a reference. NFCAdam Nemet2016-06-161-5/+5
| | | | | | | | | | | | | | | Turns out SymbolicStrides is actually used in canVectorizeWithIfConvert before it gets set up in canVectorizeMemory. This works fine as long as SymbolicStrides resides in LV since we just have an empty map. Based on this the conclusion is made that there are no symbolic strides which is conservatively correct. However once SymbolicStrides becomes part of LAI, LAI is nullptr at this point so we need to differentiate the uninitialized state by returning a nullptr for SymbolicStrides. llvm-svn: 272966
* Attempt to define friend function more portably.Sean Silva2016-06-161-16/+5
| | | | | | Patch written by Reid. I verified it locally with clang. llvm-svn: 272875
* [LV] Make the new getter return a const reference. NFCAdam Nemet2016-06-151-1/+3
| | | | | | | | | LoopVectorizationLegality holds a constant reference to LAI, so this will have to be const as well. Also added missed function comment. llvm-svn: 272851
* [LV] Add getter function for LoopVectorizationLegality::Strides. NFCAdam Nemet2016-06-151-6/+8
| | | | | | This should help moving Strides to LAA later. llvm-svn: 272796
* [LV] Remove more unused functions. NFCAdam Nemet2016-06-151-4/+0
| | | | | | LoopVectorizationLegality::strides_begin/end are also unused. llvm-svn: 272781
* [LV] Remove unused function. NFCAdam Nemet2016-06-151-1/+0
| | | | | | LoopVectorizationLegality::mustCheckStrides is unused. llvm-svn: 272780
* Work around MSVC "friend" semantics.Sean Silva2016-06-151-0/+2
| | | | | | | | | | | | The error on clang-x86-win2008-selfhost is: C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(955) : error C2248: 'llvm::slpvectorizer::BoUpSLP::ScheduleData' : cannot access private struct declared in class 'llvm::slpvectorizer::BoUpSLP' C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(608) : see declaration of 'llvm::slpvectorizer::BoUpSLP::ScheduleData' C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(337) : see declaration of 'llvm::slpvectorizer::BoUpSLP' I reproduced this locally with both MSVC 2013 and MSVC 2015. llvm-svn: 272772
* Speculative buildbot fix.Sean Silva2016-06-151-0/+5
| | | | | | | | | | | | | | This wasn't failing for me with clang as the compiler. I think GCC may disagree with clang about whether a friend declaration introduces a declaration in the enclosing namespace (or something). Example error: /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:950:77: error: ‘llvm::raw_ostream& llvm::slpvectorizer::operator<<(llvm::raw_ostream&, const llvm::slpvectorizer::BoUpSLP::ScheduleData&)’ should have been declared inside ‘llvm::slpvectorizer’ const BoUpSLP::ScheduleData &SD) { ^ llvm-svn: 272767
* [PM] Port SLPVectorizer to the new PMSean Silva2016-06-151-138/+113
| | | | | | | | | | | This uses the "runImpl" approach to share code with the old PM. Porting to the new PM meant abandoning the anonymous namespace enclosing most of SLPVectorizer.cpp which is a bit of a bummer (but not a big deal compared to having to pull the pass class into a header which the new PM requires since it calls the constructor directly). llvm-svn: 272766
* Recommit [LV] Enable vectorization of loops where the IV has an external useMichael Kuperstein2016-06-151-26/+92
| | | | | | | | | | | | | r272715 broke libcxx because it did not correctly handle cases where the last iteration of one IV is the second-to-last iteration of another. Original commit message: Vectorizing loops with "escaping" IVs has been disabled since r190790, due to PR17179. This re-enables it, with support for external use of both "post-increment" (last iteration) and "pre-increment" (second-to-last iteration) IVs. llvm-svn: 272742
* Reverting r272715 since it broke libcxx.Michael Kuperstein2016-06-141-80/+26
| | | | llvm-svn: 272730
* [LV] Enable vectorization of loops where the IV has an external useMichael Kuperstein2016-06-141-26/+80
| | | | | | | | | | | Vectorizing loops with "escaping" IVs has been disabled since r190790, due to PR17179. This re-enables it, with support for external use of both "post-increment" (last iteration) and "pre-increment" (second-to-last iteration) IVs. Differential Revision: http://reviews.llvm.org/D21048 llvm-svn: 272715
* [PM] Port LCSSA to the new PM.Easwaran Raman2016-06-091-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D21090 llvm-svn: 272294
* [LV] Use vector phis for some secondary induction variablesMichael Kuperstein2016-06-091-4/+6
| | | | | | | | | | | | | | Previously, we materialized secondary vector IVs from the primary scalar IV, by offseting the primary to match the correct start value, and then broadcasting it - inside the loop body. Instead, we can use a real vector IV, like we do for the primary. This enables using vector IVs for secondary integer IVs whose type matches the type of the primary. Differential Revision: http://reviews.llvm.org/D20932 llvm-svn: 272283
* Revert r272194 No need for it if loop Analysis Manager is usedXinliang David Li2016-06-091-1/+1
| | | | llvm-svn: 272243
* [SLPVectorizer] Handle GEP with differing constant index typesMichael Zolotukhin2016-06-081-1/+1
| | | | | | | | | | | | | | | | | | | Summary: This fixes PR27617. Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64. The patch includes a simple relaxation of the assert to allow constants being of different types, along with a regression test that will provoke the unrelaxed assert. Reviewers: nadav, mzolotukhin Subscribers: JesperAntonsson, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D20685 Patch by Jesper Antonsson! llvm-svn: 272206
* [PM] Refector LoopAccessInfo analysis code Xinliang David Li2016-06-081-1/+1
| | | | | | | | This is the preparation patch to port the analysis to new PM Differential Revision: http://reviews.llvm.org/D20560 llvm-svn: 272194
* [LV] For some IVs, use vector phis instead of widening in the loop bodyMichael Kuperstein2016-06-011-20/+76
| | | | | | | | | | | | | Previously, whenever we needed a vector IV, we would create it on the fly, by splatting the scalar IV and adding a step vector. Instead, we can create a real vector IV. This tends to save a couple of instructions per iteration. This only changes the behavior for the most basic case - integer primary IVs with a constant step. Differential Revision: http://reviews.llvm.org/D20315 llvm-svn: 271410
* [SLP] Pass in correct alignment when query memory access costGuozhi Wei2016-05-311-4/+8
| | | | | | | | | | This patch fixes bug https://llvm.org/bugs/show_bug.cgi?id=27897. When query memory access cost, current SLP always passes in alignment value of 1 (unaligned), so it gets a very high cost of scalar memory access, and wrongly vectorize memory loads in the test case. It can be fixed by simply giving correct alignment. llvm-svn: 271333
* [BBVectorize] Don't vectorize selects with a scalar condition and vector ↵Michael Kuperstein2016-05-261-1/+8
| | | | | | | | | | operands. This fixes PR27879. Differential Revision: http://reviews.llvm.org/D20659 llvm-svn: 270888
* fix typo; NFCSanjay Patel2016-05-251-1/+1
| | | | llvm-svn: 270760
* fix typos; NFCSanjay Patel2016-05-241-11/+11
| | | | llvm-svn: 270579
* Recommit r255691 since PR26509 has been fixed.Wei Mi2016-05-191-31/+106
| | | | llvm-svn: 270113
* [VectorUtils] Fix nasty use-after-freeJames Molloy2016-05-181-1/+3
| | | | | | | | | | In truncateToMinimalBitwidths() we were RAUW'ing an instruction then erasing it. However, that intruction could be cached in the map we're iterating over. The first check is "I->use_empty()" which in most cases would return true, as the (deleted) object was RAUW'd first so would have zero use count. However in some cases the object could have been polluted or written over and this wouldn't be the case. Also it makes valgrind, asan and traditionalists who don't like their compiler to crash sad. No testcase as there are no externally visible symptoms apart from a crash if the stars align. Fixes PR26509. llvm-svn: 269908
* [LV] Ensure safe VF for loops with interleaved accessesMatthew Simpson2016-05-161-1/+23
| | | | | | | | | | | | | The selection of the vectorization factor currently doesn't consider interleaved accesses. The vectorization factor is based on the maximum safe dependence distance computed by LAA. However, for loops with interleaved groups, we should instead base the vectorization factor on the maximum safe dependence distance divided by the maximum interleave factor of all the interleaved groups. Interleaved accesses not in a group will be scalarized. Differential Revision: http://reviews.llvm.org/D20241 llvm-svn: 269659
* Correct spelling in comment (NFC)Matthew Simpson2016-05-131-1/+1
| | | | llvm-svn: 269482
* Tidied up switch cases. NFCI.Simon Pilgrim2016-05-121-52/+48
| | | | | | Split FCMP//ICMP/SEL from the basic arithmetic cost functions. They were not sharing any notable code path (just the return) and were repeatedly testing the opcode. llvm-svn: 269348
* [LoopVectorizer] LoopVectorBody doesn't need to be a vector. NFC.Michael Kuperstein2016-05-121-40/+22
| | | | | | | | | | | LoopVectorBody was changed from a single pointer to a SmallVector when store predication was introduced in r200270. Since r247139, store predication no longer splits the vector loop body in-place, so we can go back to having a single LoopVectorBody block. This reverts the no-longer-needed changes from r200270. llvm-svn: 269321
* [LoopVectorize] Handling induction variable with non-constant step.Elena Demikhovsky2016-05-101-22/+54
| | | | | | | | | | | | | | | | | | | | | | | Allow vectorization when the step is a loop-invariant variable. This is the loop example that is getting vectorized after the patch: int int_inc; int bar(int init, int *restrict A, int N) { int x = init; for (int i=0;i<N;i++){ A[i] = x; x += int_inc; } return x; } "x" is an induction variable with *loop-invariant* step. But it is not a primary induction. Primary induction variable with non-constant step is not handled yet. Differential Revision: http://reviews.llvm.org/D19258 llvm-svn: 269023
* [LAA] Rename "isStridedPtr" with "getPtrStride". NFC.Denis Zobnin2016-05-101-1/+1
| | | | | | | Changing misleading function name was approved in http://reviews.llvm.org/D17268. Patch by Roman Shirokiy. llvm-svn: 269021
* Remove dead include. NFC.Chad Rosier2016-05-051-1/+0
| | | | llvm-svn: 268655
* Fix unused variable warning after r268632Silviu Baranga2016-05-051-1/+0
| | | | llvm-svn: 268634
* [LV] Identify more induction PHIs by coercing expressions to AddRecExprsSilviu Baranga2016-05-051-7/+15
| | | | | | | | | | | | | | | | | | Summary: Some PHIs can have expressions that are not AddRecExprs due to the presence of sext/zext instructions. In order to prevent the Loop Vectorizer from bailing out when encountering these PHIs, we now coerce the SCEV expressions to AddRecExprs using SCEV predicates (when possible). We only do this when the alternative would be to not vectorize. Reviewers: mzolotukhin, anemet Subscribers: mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17153 llvm-svn: 268633
* [LV] Refactor the validation of PHI inductions. NFCSilviu Baranga2016-05-051-29/+48
| | | | | | | | This moves the validation of PHI inductions into a separate method, making it easier to reuse this logic. llvm-svn: 268632
* clang-format some files in preparation of coming patch reviews.Dehao Chen2016-05-051-522/+487
| | | | llvm-svn: 268583
* [SLPVectorizer] Add operand bundles to vectorized functionsDavid Majnemer2016-04-291-2/+16
| | | | | | | SLPVectorizing a call site should result in further propagation of its bundles. llvm-svn: 268004
* [LoopVectorize] Add operand bundles to vectorized functionsDavid Majnemer2016-04-291-5/+7
| | | | | | | Also, do not crash when calculating a cost model for loop-invariant token values. llvm-svn: 268003
* [PR25281] Remove AAResultsWrapper from preserved analyses of loop vectorizer.Michael Zolotukhin2016-04-291-1/+0
| | | | | | | We don't preserve AAResults, because, for one, we don't preserve SCEV-AA. That fixes PR25281. llvm-svn: 267980
* [LoopVectorize] Keep hints from original loop on the vector loopHal Finkel2016-04-291-0/+5
| | | | | | | | | | | | | | | | We need to keep loop hints from the original loop on the new vector loop. Failure to do this meant that, for example: void foo(int *b) { #pragma clang loop unroll(disable) for (int i = 0; i < 16; ++i) b[i] = 1; } this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the hints that unrolling should be disabled, and then we'd unroll it. llvm-svn: 267970
* [SLPVectorizer] Extend SLP Vectorizer to deal with aggregates.Arch D. Robison2016-04-281-37/+151
| | | | | | | | The refactoring portion part was done as r267748. http://reviews.llvm.org/D14185 llvm-svn: 267899
* [LV] Reallow positive-stride interleaved load groups with gapsMatthew Simpson2016-04-271-9/+47
| | | | | | | | | | | | | | We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751
* [SLPVectorizer] Refactor where MinVecRegSize and MaxVecRegSize live.Arch D. Robison2016-04-271-20/+28
| | | | | | | | | This is the first of two commits for extending SLP Vectorizer to deal with aggregates. This commit merely refactors existing logic. http://reviews.llvm.org/D14185 llvm-svn: 267748
* [TTI] Add hook for vector extract with extensionMatthew Simpson2016-04-271-3/+4
| | | | | | | | | | | | | | | This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725
* Masked Store in Loop Vectorizer - bugfixElena Demikhovsky2016-04-261-13/+9
| | | | | | | | Fixed a bug in loop vectorization with conditional store. Differential Revision: http://reviews.llvm.org/D19532 llvm-svn: 267597
* [LoopVectorize] Don't consider conditional-load dereferenceability for ↵Hal Finkel2016-04-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | marked parallel loops I really thought we were doing this already, but we were not. Given this input: void Test(int *res, int *c, int *d, int *p) { for (int i = 0; i < 16; i++) res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } we did not vectorize the loop. Even with "assume_safety" the check that we don't if-convert conditionally-executed loads (to protect against data-dependent deferenceability) was not elided. One subtlety: As implemented, it will still prefer to use a masked-load instrinsic (given target support) over the speculated load. The choice here seems architecture specific; the best option depends on how expensive the masked load is compared to a regular load. Ideally, using the masked load still reduces unnecessary memory traffic, and so should be preferred. If we'd rather do it the other way, flipping the order of the checks is easy. The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also implies that if conversion is okay. Differential Revision: http://reviews.llvm.org/D19512 llvm-svn: 267514
* Re-commit optimization bisect support (r267022) without new pass manager ↵Andrew Kaylor2016-04-223-2/+5
| | | | | | | | | | support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231
* Revert "Initial implementation of optimization bisect support."Vedant Kumar2016-04-223-5/+2
| | | | | | | | This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115
* Initial implementation of optimization bisect support.Andrew Kaylor2016-04-213-2/+5
| | | | | | | | | | | | This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022
OpenPOWER on IntegriCloud