path: root/llvm/lib/Transforms/Vectorize
Commit message | Author | Age | Files | Lines
...
* [LoopVectorize] Make interleaved-accesses analysis less conservative about possible pointer-wrap-around concerns, in some cases. | Dorit Nuzman | 2016-10-30 | 1 | -1/+59
  Before this patch, collectConstStridedAccesses (part of interleaved-accesses analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when examining all candidate pointers. This is too conservative. Instead, this patch makes collectConstStridedAccesses use an optimistic approach, calling getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the candidate interleave groups have been formed, revisits the pointer-wrapping analysis but only where it matters: namely, in groups that have gaps, and where the gaps are not at the very end of the group (in which case the loop is peeled). This second time getPtrStride is called with [Assume=false, ShouldCheckWrap=true], but this could further be improved to using Assume=true, once we also add the logic to track that we are not going to meet the scev runtime checks threshold.
  Differential Revision: https://reviews.llvm.org/D25276
  llvm-svn: 285517
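The "only where it matters" condition can be pictured with a small C++ sketch. It is a simplified reading of the commit message, not the code from the patch: `GroupSketch` and `needsWrapRecheck` are hypothetical names, and the group is modelled as a plain list of present/missing member slots.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for an interleave group: Present[i] is true when the
// member at index i (0 .. Factor-1) is actually accessed in the loop.
struct GroupSketch {
  std::vector<bool> Present;
};

// Returns true when the group has a gap and the last slot is still occupied,
// i.e. the gap is not at the very end of the group. Per the commit message,
// a gap at the end is handled by peeling the loop, so no re-check is needed.
bool needsWrapRecheck(const GroupSketch &G) {
  assert(!G.Present.empty() && "a group needs at least one slot");
  const bool HasGap =
      std::find(G.Present.begin(), G.Present.end(), false) != G.Present.end();
  const bool GapAtEnd = !G.Present.back();
  return HasGap && !GapAtEnd;
}
```

Under this reading, a group with slots {present, missing, present} would be re-checked, while a full group or one whose only gap is the last slot would not.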
* [SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer. | Alexey Bataev | 2016-10-27 | 1 | -4/+11
  After a successful horizontal reduction vectorization attempt for a PHI node, the vectorizer tries to update the root binary op by combining the vectorized tree and the ReductionPHI node. But during vectorization this ReductionPHI can itself be vectorized and replaced by the `undef` value, while the instruction itself is marked for deletion. This 'marked for deletion' PHI node can then be used in a new binary operation, causing a "Use still stuck around after Def is destroyed" crash upon PHI node deletion.
  The test is also fixed so that it performs actual testing.
  Differential Revision: https://reviews.llvm.org/D25671
  llvm-svn: 285286
* [LV] Sink scalar operands of predicated instructions | Matthew Simpson | 2016-10-25 | 1 | -13/+78
  When we predicate an instruction (div, rem, store) we place the instruction in its own basic block within the vectorized loop. If a predicated instruction has scalar operands, it's possible to recursively sink these scalar expressions into the predicated block so that they might avoid execution. This patch sinks as much scalar computation as possible into predicated blocks. We previously were able to sink such operands only if they were extractelement instructions.
  Differential Revision: https://reviews.llvm.org/D25632
  llvm-svn: 285097
* [LV] Avoid emitting trivially dead instructions | Matthew Simpson | 2016-10-19 | 1 | -0/+45
  Some instructions from the original loop, when vectorized, can become trivially dead. This happens because of the way we structure the new loop. For example, we create new induction variables and induction variable "steps" in the new loop. Thus, when we go to vectorize the original induction variable update, it may no longer be needed due to the instructions we've already created. This patch prevents us from creating these redundant instructions. This reduces code size before simplification and allows greater flexibility in code generation since we have fewer unnecessary instruction uses.
  Differential Revision: https://reviews.llvm.org/D25631
  llvm-svn: 284631
* [LV] Account for predicated stores in instruction costs | Matthew Simpson | 2016-10-13 | 1 | -0/+6
  This patch ensures that we scale the estimated cost of predicated stores by block probability. This is a follow-on patch for r284123.
  llvm-svn: 284126
* [LV] Avoid rounding errors for predicated instruction costs | Matthew Simpson | 2016-10-13 | 1 | -26/+29
  This patch modifies the cost calculation of predicated instructions (div and rem) to avoid the accumulation of rounding errors due to multiple truncating integer divisions. The calculation for predicated stores will be addressed in a follow-on patch since we currently don't scale the cost of predicated stores by block probability.
  Differential Revision: https://reviews.llvm.org/D25333
  llvm-svn: 284123
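The rounding problem is easy to reproduce with plain integer arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the divisor of 2 assumes the 50% block probability that another entry in this log describes as hardcoded. It compares dividing each instruction's cost separately against dividing the summed cost once.

```cpp
#include <vector>

// Assumed for illustration: predicated blocks execute half the time, so
// costs are scaled by dividing by 2.
constexpr unsigned ReciprocalProb = 2;

// Truncates once per instruction, so every odd cost loses a fraction.
unsigned costDividePerInstruction(const std::vector<unsigned> &Costs) {
  unsigned Total = 0;
  for (unsigned C : Costs)
    Total += C / ReciprocalProb;
  return Total;
}

// Truncates only once, after all costs have been accumulated.
unsigned costDivideOnce(const std::vector<unsigned> &Costs) {
  unsigned Total = 0;
  for (unsigned C : Costs)
    Total += C;
  return Total / ReciprocalProb;
}
```

For four instructions of cost 1 each, the per-instruction version yields 0 while the single division yields 2, which is the kind of accumulated error the commit message refers to.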
* [LV] Don't mark multi-use branch conditions uniform | Matthew Simpson | 2016-10-07 | 1 | -3/+6
  Previously, we marked the branch conditions of latch blocks uniform after vectorization if they were instructions contained in the loop. However, if a condition instruction has users other than the branch, it may not remain uniform. This patch ensures the conditions we mark uniform are only used by the branch. This should fix PR30627.
  Reference: https://llvm.org/bugs/show_bug.cgi?id=30627
  llvm-svn: 283563
* [SLPVectorizer] Fix for PR25748: reduction vectorization after loop unrolling. | Alexey Bataev | 2016-10-07 | 1 | -8/+22
  The following code is not vectorized by the SLPVectorizer:
  ```
  int test(unsigned int *p) {
    int sum = 0;
    for (int i = 0; i < 8; i++)
      sum += p[i];
    return sum;
  }
  ```
  During optimization this loop is fully unrolled and the SLPVectorizer is unable to vectorize it. This patch tries to fix this problem.
  Differential Revision: https://reviews.llvm.org/D24796
  llvm-svn: 283535
* [LV] Pass profitability analysis in vectorizer constructor (NFC) | Matthew Simpson | 2016-10-05 | 1 | -23/+28
  The vectorizer already holds a pointer to one cost model artifact in a member variable (i.e., MinBWs). As we add more, it will be easier to communicate these artifacts to the vectorizer if we simply pass a pointer to the cost model instead.
  llvm-svn: 283373
* [LV] Pass legality analysis in vectorizer constructor (NFC) | Matthew Simpson | 2016-10-05 | 1 | -12/+12
  The vectorizer already holds a pointer to the legality analysis in a member variable, so it makes sense that we would pass it in the constructor.
  llvm-svn: 283368
* [LV] Remove obsolete comment (NFC) | Matthew Simpson | 2016-10-05 | 1 | -3/+1
  llvm-svn: 283365
* [LV] Use getScalarizationOverhead in memory instruction costs (NFC) | Matthew Simpson | 2016-10-05 | 1 | -14/+10
  This patch refactors the cost estimation of scalarized loads and stores to reuse getScalarizationOverhead for the cost of the extractelement and insertelement instructions we might create. The existing code accounted for this cost, but it was functionally equivalent to the helper function.
  llvm-svn: 283364
* [LV] Add helper function for predicated block probability (NFC) | Matthew Simpson | 2016-10-05 | 1 | -13/+25
  The cost model has to estimate the probability of executing predicated blocks. However, we currently always assume predicated blocks have a 50% chance of executing (this value is hardcoded in several places throughout the code). Since we always use the same value, this patch adds a helper function for getting this uniform probability. The function simplifies some comments and makes our assumptions more clear. In the future, we may want to extend this with actual block probability information if it's available.
  llvm-svn: 283354
* [LV] Add isScalarWithPredication helper function (NFC) | Matthew Simpson | 2016-10-05 | 1 | -11/+23
  This patch adds a single helper function for checking if an instruction will be scalarized with predication. Such instructions include conditional stores and instructions that may divide by zero. Existing checks have been updated to use the new function.
  llvm-svn: 283350
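A rough picture of what such a predicate might look like, as a self-contained C++ sketch. Nothing here is LLVM's real API: `OpSketch`, `InstSketch` and `isScalarWithPredicationSketch` are hypothetical names, and the classification simply follows the two categories named in the commit message (conditional stores, and instructions that may divide by zero).

```cpp
// Hypothetical, heavily reduced instruction model.
enum class OpSketch { Add, Store, UDiv, SDiv, URem, SRem };

struct InstSketch {
  OpSketch Opcode;
  bool InPredicatedBlock; // true when the instruction executes conditionally
};

// Returns true for conditional stores and for conditional div/rem, which
// may trap on a zero divisor if executed unconditionally for every lane.
bool isScalarWithPredicationSketch(const InstSketch &I) {
  if (!I.InPredicatedBlock)
    return false;
  switch (I.Opcode) {
  case OpSketch::Store:
  case OpSketch::UDiv:
  case OpSketch::SDiv:
  case OpSketch::URem:
  case OpSketch::SRem:
    return true;
  case OpSketch::Add:
    return false;
  }
  return false;
}
```

Centralizing the check in one function, as the commit does, means the answer cannot drift between the places that previously each carried their own copy of the condition.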
* Add new target hooks for LoadStoreVectorizer | Volkan Keles | 2016-10-03 | 1 | -42/+39
  Summary: Added 6 new target hooks for the vectorizer in order to filter types, handle size constraints and decide how to split chains.
  Reviewers: tstellarAMD, arsenm
  Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle
  Differential Revision: https://reviews.llvm.org/D24727
  llvm-svn: 283099
* Use StringRef in Pass/PassManager APIs (NFC) | Mehdi Amini | 2016-10-01 | 1 | -1/+1
  llvm-svn: 283004
* [LV] Build all scalar steps for non-uniform induction variables | Matthew Simpson | 2016-09-30 | 1 | -14/+3
  When building the steps for scalar induction variables, we previously attempted to determine if all the scalar users of the induction variable were uniform. If they were, we would only emit the step corresponding to vector lane zero. This optimization was too aggressive. We generally don't know the entire set of induction variable users that will be scalar. We have isScalarAfterVectorization, but this is only a conservative estimate of the instructions that will be scalarized. Thus, an induction variable may have scalar users that aren't already known to be scalar. To avoid emitting unused steps, we can only check that the induction variable is uniform. This should fix PR30542.
  Reference: https://llvm.org/bugs/show_bug.cgi?id=30542
  llvm-svn: 282863
* [LV] Port the remarks in processLoop to the new streaming API | Adam Nemet | 2016-09-30 | 1 | -22/+39
  This completes LV.
  llvm-svn: 282821
* [LV] Port the last opt remark in Hints to the new streaming interface | Adam Nemet | 2016-09-30 | 1 | -5/+6
  llvm-svn: 282820
* [LAA, LV] Port to new streaming interface for opt remarks. Update LV | Adam Nemet | 2016-09-30 | 1 | -3/+6
  (Recommit after making sure IsVerbose gets properly initialized in DiagnosticInfoOptimizationBase. See previous commit that takes care of this.)
  OptimizationRemarkAnalysis directly takes the role of the report that is generated by LAA. Then we need the magic to be able to turn an LAA remark into an LV remark. This is done via a new OptimizationRemark ctor.
  llvm-svn: 282813
* Revert "[LAA, LV] Port to new streaming interface for opt remarks. Update LV" | Adam Nemet | 2016-09-29 | 1 | -6/+3
  This reverts commit r282758. There are some clang failures I haven't seen.
  llvm-svn: 282759
* [LAA, LV] Port to new streaming interface for opt remarks. Update LV | Adam Nemet | 2016-09-29 | 1 | -3/+6
  OptimizationRemarkAnalysis directly takes the role of the report that is generated by LAA. Then we need the magic to be able to turn an LAA remark into an LV remark. This is done via a new OptimizationRemark ctor.
  llvm-svn: 282758
* [LV] Port OptimizationRemarkAnalysisFPCommute and OptimizationRemarkAnalysisAliasing to new streaming API for opt remarks | Adam Nemet | 2016-09-29 | 1 | -10/+12
  llvm-svn: 282742
* [LV] Convert processLoop to new streaming API for opt remarks | Adam Nemet | 2016-09-29 | 1 | -10/+10
  llvm-svn: 282740
* [LV] Move static createMissedAnalysis from anonymous to global namespace | Adam Nemet | 2016-09-29 | 1 | -26/+26
  This is an attempt to fix a windows bot.
  llvm-svn: 282730
* [LV] Convert CostModel to use the new streaming opt remark API | Adam Nemet | 2016-09-29 | 1 | -21/+20
  Here we can already remove the member function emitAnalysis.
  llvm-svn: 282729
* [LV] Split most of createMissedAnalysis into a static function. NFC | Adam Nemet | 2016-09-29 | 1 | -15/+28
  This will be shared between Legality and CostModel.
  llvm-svn: 282728
* [LV] Convert all but one opt remark in Legality to new streaming interface | Adam Nemet | 2016-09-29 | 1 | -46/+72
  The last remaining one, after which emitAnalysis can be removed, is where we convert the LAA's report to a vectorization report. This requires converting LAA to the new interface first.
  llvm-svn: 282726
* [LV] Convert emitRemark to new opt remark streaming interface | Adam Nemet | 2016-09-29 | 1 | -14/+17
  Also renamed the function to emitRemarkWithHints to better reflect what the function actually does.
  llvm-svn: 282723
* Test commit. NFC. | Volkan Keles | 2016-09-29 | 1 | -1/+1
  llvm-svn: 282717
* Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFC | Adam Nemet | 2016-09-27 | 1 | -1/+1
  With the new streaming interface, these class names need to be typed a lot and they are way too long.
  llvm-svn: 282544
* [LV] Scalarize instructions marked scalar after vectorization | Matthew Simpson | 2016-09-26 | 1 | -0/+9
  This patch ensures that we actually scalarize instructions marked scalar after vectorization. Previously, such instructions may have been vectorized instead.
  Differential Revision: https://reviews.llvm.org/D23889
  llvm-svn: 282418
* [LV] Don't emit unused scalars for uniform instructions | Matthew Simpson | 2016-09-21 | 1 | -14/+58
  If we identify an instruction as uniform after vectorization, we know that we should only use the value corresponding to the first vector lane of each unroll iteration. However, when scalarizing such instructions, we still produce values for the other vector lanes. This patch prevents us from generating the unused scalars.
  Differential Revision: https://reviews.llvm.org/D24275
  llvm-svn: 282087
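The saving can be stated as simple arithmetic. In the sketch below, which is an illustration rather than vectorizer code, `scalarsToEmit` is a hypothetical helper, `VF` stands for the number of vector lanes and `UF` for the number of unroll iterations.

```cpp
// Number of scalar copies emitted when one instruction is scalarized.
// A uniform instruction only needs lane zero of each unroll iteration;
// any other instruction needs one copy per lane per unroll iteration.
unsigned scalarsToEmit(unsigned VF, unsigned UF, bool IsUniform) {
  const unsigned LanesPerPart = IsUniform ? 1 : VF;
  return UF * LanesPerPart;
}
```

With VF = 4 and UF = 2, a uniform instruction would need 2 scalar copies instead of 8 under this model.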
* [LV] Rename "Width" to "Lane" (NFC) | Matthew Simpson | 2016-09-21 | 1 | -6/+6
  llvm-svn: 282083
* [Loop Vectorizer] Consecutive memory access - fixed and simplified | Elena Demikhovsky | 2016-09-18 | 1 | -81/+5
  Amended consecutive memory access detection in Loop Vectorizer. Load/Store were not handled properly without preceding GEP instruction.
  Differential Revision: https://reviews.llvm.org/D20789
  llvm-svn: 281853
* [Loop vectorizer] Simplified GEP cloning. NFC. | Elena Demikhovsky | 2016-09-18 | 1 | -35/+26
  Simplified GEP cloning in vectorizeMemoryInstruction(). Added an assertion that checks consecutive GEP, which should have only one loop-variant operand.
  Differential Revision: https://reviews.llvm.org/D24557
  llvm-svn: 281851
* [LV] Process pointer IVs with PHINodes in collectLoopUniforms | Matthew Simpson | 2016-09-14 | 1 | -4/+22
  This patch moves the processing of pointer induction variables in collectLoopUniforms from the consecutive pointer phase of the analysis to the phi node phase. Previously, if a pointer induction variable was used by both a scalarized non-memory instruction as well as a vectorized memory instruction, we would incorrectly identify the pointer as uniform. Pointer induction variables should be treated the same as other phi nodes. That is, they are uniform if all users of the induction variable and induction variable update are uniform.
  Differential Revision: https://reviews.llvm.org/D24511
  llvm-svn: 281485
* [LV] Clean up uniform induction variable analysis (NFC) | Matthew Simpson | 2016-09-13 | 1 | -23/+31
  llvm-svn: 281368
* LSV: Fix incorrectly increasing alignment | Matt Arsenault | 2016-09-09 | 1 | -18/+16
  If the unaligned access has a dynamic offset, the offset may be odd, which would make the adjusted alignment incorrect to use.
  llvm-svn: 281110
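Why an unknown offset rules out a larger alignment can be shown with address arithmetic alone. The sketch below does not model how the LoadStoreVectorizer computes alignment; `isAligned` and `accessAddress` are hypothetical helpers that only illustrate the hazard.

```cpp
#include <cstdint>

// True when Address is a multiple of Alignment (Alignment must be nonzero).
bool isAligned(std::uint64_t Address, std::uint64_t Alignment) {
  return Address % Alignment == 0;
}

// Address of an access at a run-time offset from a base pointer.
std::uint64_t accessAddress(std::uint64_t Base, std::uint64_t DynamicOffset) {
  return Base + DynamicOffset;
}
```

Even if the base address 0x1000 is 4-byte aligned, an offset of 3 that is only known at run time gives 0x1003, which is not 4-byte aligned, so claiming the higher alignment for that access would be wrong.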
* [LV] Ensure proper handling of multi-use case when collecting uniforms | Matthew Simpson | 2016-09-08 | 1 | -5/+5
  The test case included in r280979 wasn't checking what it was supposed to be checking for the predicated store case. Fixing the test revealed that the multi-use case (when a pointer is used by both vectorized and scalarized memory accesses) wasn't being handled properly. We can't skip over non-consecutive-like pointers since they may have looked consecutive-like with a different memory access.
  llvm-svn: 280992
* [LV] Don't mark pointers used by scalarized memory accesses uniform | Matthew Simpson | 2016-09-08 | 1 | -42/+143
  Previously, all consecutive pointers were marked uniform after vectorization. However, if a consecutive pointer is used by a memory access that is eventually scalarized, the pointer won't remain uniform after all. An example is predicated stores. Even though a predicated store may be consecutive, it will still be scalarized, making its pointer operand non-uniform.
  This patch updates the logic in collectLoopUniforms to consider the cases where a memory access may be scalarized. If a memory access may be scalarized, its pointer operand is not marked uniform. The determination of whether a given memory instruction will be scalarized or not has been moved into a common function that is used by the vectorizer, cost model, and legality analysis.
  Differential Revision: https://reviews.llvm.org/D24271
  llvm-svn: 280979
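For readers unfamiliar with the term, a source loop of the following shape is the kind that produces a consecutive but predicated store. The function is a hypothetical example written for this note, not a test case from the patch, and whether the store is actually scalarized depends on the target and compiler version.

```cpp
// Out[I] is a consecutive address (it advances by one element per iteration),
// but the store only happens when the condition holds, so it is predicated.
void copyPositive(int *Out, const int *In, int N) {
  for (int I = 0; I < N; ++I)
    if (In[I] > 0)
      Out[I] = In[I];
}
```

In such a loop each lane may or may not store, so each lane needs its own address when the store is scalarized, which is why the pointer operand cannot be treated as uniform.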
* IR: Remove Value::intersectOptionalDataWith, replace all calls with calls to Instruction::andIRFlags. | Peter Collingbourne | 2016-09-07 | 1 | -1/+1
  The two functions are functionally equivalent.
  Differential Revision: https://reviews.llvm.org/D22830
  llvm-svn: 280884
* [LSV] Use the original loads' names for the extractelement instructions. | Justin Lebar | 2016-09-07 | 1 | -2/+4
  Summary: LSV replaces multiple adjacent loads with one vectorized load and a bunch of extractelement instructions. This patch makes the extractelement instructions' names match those of the original loads, for (hopefully) improved readability.
  Reviewers: asbirlea, tstellarAMD
  Subscribers: arsenm, mzolotukhin
  Differential Revision: https://reviews.llvm.org/D23748
  llvm-svn: 280818
* ADT: Do not inherit from std::iterator in ilist_iterator | Duncan P. N. Exon Smith | 2016-09-03 | 1 | -1/+1
  Inheriting from std::iterator uses more boiler-plate than manual typedefs. Avoid that in both ilist_iterator and MachineInstrBundleIterator.
  This has the side effect of removing ilist_iterator from certain ADL lookups in namespace std; calls to std::next need to be qualified by "std::" that didn't have to before. The one case of this in-tree was operating on a temporary, so I used the more compact operator++.
  llvm-svn: 280570
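The ADL side effect can be reproduced with a toy iterator. The sketch below is not ilist_iterator: `demo::IntIter` and `secondElement` are hypothetical, and the member type aliases play the role of the "manual typedefs" mentioned in the message.

```cpp
#include <cstddef>
#include <iterator>

namespace demo {
// Toy forward iterator over an int array. It declares its iterator traits
// directly instead of inheriting them from std::iterator.
class IntIter {
  const int *Ptr;

public:
  using iterator_category = std::forward_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = const int *;
  using reference = const int &;

  explicit IntIter(const int *P) : Ptr(P) {}
  reference operator*() const { return *Ptr; }
  IntIter &operator++() { ++Ptr; return *this; }
  IntIter operator++(int) { IntIter Tmp = *this; ++Ptr; return Tmp; }
  bool operator==(const IntIter &O) const { return Ptr == O.Ptr; }
  bool operator!=(const IntIter &O) const { return Ptr != O.Ptr; }
};
} // namespace demo

// IntIter has no base class in namespace std, so an unqualified next(It)
// would not find std::next through argument-dependent lookup. The call
// therefore has to be written with the std:: qualifier.
int secondElement(const int *Data) {
  demo::IntIter It(Data);
  return *std::next(It);
}
```

Had `IntIter` derived from `std::iterator`, namespace std would have been an associated namespace of the type and an unqualified `next(It)` would also have compiled, which is the behaviour the commit message says went away.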
* [SLP] Don't pass a global CL option as an argument. NFC. | Chad Rosier | 2016-09-02 | 1 | -8/+7
  Differential Revision: https://reviews.llvm.org/D24199
  llvm-svn: 280527
* [LV] Ensure reverse interleaved group GEPs remain uniform | Matthew Simpson | 2016-09-02 | 1 | -1/+11
  For uniform instructions, we're only required to generate a scalar value for the first vector lane of each unroll iteration. Thus, if we have a reverse interleaved group, computing the member index off the scalar GEP corresponding to the last vector lane of its pointer operand technically makes the GEP non-uniform. We should compute the member index off the first scalar GEP instead.
  I've added the updated member index computation to the existing reverse interleaved group test.
  llvm-svn: 280497
* [LV] Use ScalarParts for ad-hoc pointer IV scalarization (NFCI) | Matthew Simpson | 2016-09-01 | 1 | -22/+9
  We can now maintain scalar values in VectorLoopValueMap. Thus, we no longer have to create temporary vectors with insertelement instructions when handling pointer induction variables. This case was mistakenly left out of r279649 when refactoring the other scalarization code.
  llvm-svn: 280405
* [LV] Move VectorParts allocation and mapping into PHI widening (NFC) | Matthew Simpson | 2016-09-01 | 1 | -29/+38
  This patch moves the allocation of VectorParts for PHI nodes into the actual PHI widening code. Previously, we allocated these VectorParts in vectorizeBlockInLoop, and passed them by reference to widenPHIInstruction. Upon returning, we would then map the VectorParts in VectorLoopValueMap. This behavior is problematic for the cases where we only want to generate a scalar version of a PHI node. For example, if in the future we only generate a scalar version of an induction variable, we would end up inserting an empty vector entry into the map once we return to vectorizeBlockInLoop. We now no longer need to pass VectorParts to the various PHI widening functions, and we can keep VectorParts allocation as close as possible to the point at which they are actually mapped in VectorLoopValueMap.
  llvm-svn: 280390
* [SLP] Update the debug based on Michael's suggestion. | Chad Rosier | 2016-08-31 | 1 | -2/+3
  Passing the types/opcode check still doesn't guarantee we'll actually vectorize. Therefore, just make it clear we're attempting to vectorize.
  llvm-svn: 280263
* [SLP] Sink debug after checking for matching types/opcode. | Chad Rosier | 2016-08-31 | 1 | -2/+2
  Differential Revision: https://reviews.llvm.org/D24090
  llvm-svn: 280260