summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/LoopVectorize
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86][AVX] Fixed v16i16/v32i8 ADD/SUB costs on AVX1 subtargetsSimon Pilgrim2016-11-141-2/+4
| | | | | | | | Add explicit v16i16/v32i8 ADD/SUB costs, matching the costs of v4i64/v8i32 - they were missing for some reason. This has side effects on the LV max bandwidth tests (AVX1 now prefers 128-bit vectors vs AVX2 which still prefers 256-bit) llvm-svn: 286832
* [LV] Stop saying "use -Rpass-analysis=loop-vectorize"Adam Nemet2016-11-116-11/+11
| | | | | | | | | | | | | | | | | | This is PR28376. Unfortunately given the current structure of optimization diagnostics we lack the capability to tell whether the user has passed -Rpass-analysis=loop-vectorize since this is local to the front-end (BackendConsumer::OptimizationRemarkHandler). So rather than printing this even if the user has already passed -Rpass-analysis, this patch just punts and stops recommending this option. I don't think that getting this right is worth the complexity. Differential Revision: https://reviews.llvm.org/D26563 llvm-svn: 286662
* Update vectorization debug info unittest.Dehao Chen2016-11-091-11/+11
| | | | | | | | | | | | | | Summary: The change will test the change in r286159. The idea behind the change: Make the dbg location different between loop header and preheader/exit. Originally, dbg location 21 exists in 3 BBs: preheader, header, critical edge (exit). Update the debug location of inside the loop header from !21 to !22 so that it will reflect the correct location. Reviewers: probinson Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26428 llvm-svn: 286403
* Second attempt at r285517.Dorit Nuzman2016-10-316-3/+196
| | | | llvm-svn: 285568
* Revert r285517 due to build failures.Dorit Nuzman2016-10-306-196/+3
| | | | llvm-svn: 285518
* [LoopVectorize] Make interleaved-accesses analysis less conservative aboutDorit Nuzman2016-10-306-3/+196
| | | | | | | | | | | | | | | | | | | | | possible pointer-wrap-around concerns, in some cases. Before this patch, collectConstStridedAccesses (part of interleaved-accesses analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when examining all candidate pointers. This is too conservative. Instead, this patch makes collectConstStridedAccesses use an optimistic approach, calling getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the candidate interleave groups have been formed, revisits the pointer-wrapping analysis but only where it matters: namely, in groups that have gaps, and where the gaps are not at the very end of the group (in which case the loop is peeled). This second time getPtrStride is called with [Assume=false, ShouldCheckWrap=true], but this could further be improved to using Assume=true, once we also add the logic to track that we are not going to meet the scev runtime checks threshold. Differential Revision: https://reviews.llvm.org/D25276 llvm-svn: 285517
* [LV] Correct misleading comments in test (NFC)Matthew Simpson2016-10-281-9/+5
| | | | llvm-svn: 285402
* [LV] Sink scalar operands of predicated instructionsMatthew Simpson2016-10-253-16/+16
| | | | | | | | | | | | | When we predicate an instruction (div, rem, store) we place the instruction in its own basic block within the vectorized loop. If a predicated instruction has scalar operands, it's possible to recursively sink these scalar expressions into the predicated block so that they might avoid execution. This patch sinks as much scalar computation as possible into predicated blocks. We previously were able to sink such operands only if they were extractelement instructions. Differential Revision: https://reviews.llvm.org/D25632 llvm-svn: 285097
* [X86] Enable interleaved memory access by defaultMichael Kuperstein2016-10-204-11/+46
| | | | | | | | This lets the loop vectorizer generate interleaved memory accesses on x86. Differential Revision: https://reviews.llvm.org/D25350 llvm-svn: 284779
* [LV] Avoid emitting trivially dead instructionsMatthew Simpson2016-10-191-0/+42
| | | | | | | | | | | | | | | Some instructions from the original loop, when vectorized, can become trivially dead. This happens because of the way we structure the new loop. For example, we create new induction variables and induction variable "steps" in the new loop. Thus, when we go to vectorize the original induction variable update, it may no longer be needed due to the instructions we've already created. This patch prevents us from creating these redundant instructions. This reduces code size before simplification and allows greater flexibility in code generation since we have fewer unnecessary instruction uses. Differential Revision: https://reviews.llvm.org/D25631 llvm-svn: 284631
* [LV] Account for predicated stores in instruction costsMatthew Simpson2016-10-131-1/+39
| | | | | | | This patch ensures that we scale the estimated cost of predicated stores by block probability. This is a follow-on patch for r284123. llvm-svn: 284126
* [LV] Avoid rounding errors for predicated instruction costsMatthew Simpson2016-10-131-0/+53
| | | | | | | | | | | | This patch modifies the cost calculation of predicated instructions (div and rem) to avoid the accumulation of rounding errors due to multiple truncating integer divisions. The calculation for predicated stores will be addressed in a follow-on patch since we currently don't scale the cost of predicated stores by block probability. Differential Revision: https://reviews.llvm.org/D25333 llvm-svn: 284123
* [LV] Don't mark multi-use branch conditions uniformMatthew Simpson2016-10-071-0/+35
| | | | | | | | | | | Previously, we marked the branch conditions of latch blocks uniform after vectorization if they were instructions contained in the loop. However, if a condition instruction has users other than the branch, it may not remain uniform. This patch ensures the conditions we mark uniform are only used by the branch. This should fix PR30627. Reference: https://llvm.org/bugs/show_bug.cgi?id=30627 llvm-svn: 283563
* [LV] Remove triples from target-independent vectorizer tests. NFC.Michael Kuperstein2016-10-0668-68/+0
| | | | | | | | | Vectorizer tests in the target-independent directory should not have a target triple. If a test really needs to query a specific backend, it belongs in the right target subdirectory (which "REQUIRES" the right backend). Otherwise, it should not specify a triple. llvm-svn: 283512
* [LV] Build all scalar steps for non-uniform induction variablesMatthew Simpson2016-09-301-0/+97
| | | | | | | | | | | | | | | | When building the steps for scalar induction variables, we previously attempted to determine if all the scalar users of the induction variable were uniform. If they were, we would only emit the step corresponding to vector lane zero. This optimization was too aggressive. We generally don't know the entire set of induction variable users that will be scalar. We have isScalarAfterVectorization, but this is only a conservative estimate of the instructions that will be scalarized. Thus, an induction variable may have scalar users that aren't already known to be scalar. To avoid emitting unused steps, we can only check that the induction variable is uniform. This should fix PR30542. Reference: https://llvm.org/bugs/show_bug.cgi?id=30542 llvm-svn: 282863
* [LV] Scalarize instructions marked scalar after vectorizationMatthew Simpson2016-09-265-7/+83
| | | | | | | | | This patch ensures that we actually scalarize instructions marked scalar after vectorization. Previously, such instructions may have been vectorized instead. Differential Revision: https://reviews.llvm.org/D23889 llvm-svn: 282418
* [LV] Don't emit unused scalars for uniform instructionsMatthew Simpson2016-09-212-38/+0
| | | | | | | | | | | | If we identify an instruction as uniform after vectorization, we know that we should only use the value corresponding to the first vector lane of each unroll iteration. However, when scalarizing such instructions, we still produce values for the other vector lanes. This patch prevents us from generating the unused scalars. Differential Revision: https://reviews.llvm.org/D24275 llvm-svn: 282087
* [LV] When reporting about a specific instruction without debug location use ↵Adam Nemet2016-09-211-0/+78
| | | | | | | | loop's This can occur for example if some optimization drops the debug location. llvm-svn: 282048
* [Loop Vectorizer] Consecutive memory access - fixed and simplifiedElena Demikhovsky2016-09-181-0/+43
| | | | | | | | | Amended consecutive memory access detection in Loop Vectorizer. Load/Store were not handled properly without preceding GEP instruction. Differential Revision: https://reviews.llvm.org/D20789 llvm-svn: 281853
* [LV] Process pointer IVs with PHINodes in collectLoopUniformsMatthew Simpson2016-09-141-0/+169
| | | | | | | | | | | | | | | This patch moves the processing of pointer induction variables in collectLoopUniforms from the consecutive pointer phase of the analysis to the phi node phase. Previously, if a pointer induction variable was used by both a scalarized non-memory instruction as well as a vectorized memory instruction, we would incorrectly identify the pointer as uniform. Pointer induction variables should be treated the same as other phi nodes. That is, they are uniform if all users of the induction variable and induction variable update are uniform. Differential Revision: https://reviews.llvm.org/D24511 llvm-svn: 281485
* DebugInfo: New metadata representation for global variables.Peter Collingbourne2016-09-131-6/+6
| | | | | | | | | | | | | This patch reverses the edge from DIGlobalVariable to GlobalVariable. This will allow us to more easily preserve debug info metadata when manipulating global variables. Fixes PR30362. A program for upgrading test cases is attached to that bug. Differential Revision: http://reviews.llvm.org/D20147 llvm-svn: 281284
* [LV] Ensure proper handling of multi-use case when collecting uniformsMatthew Simpson2016-09-081-1/+4
| | | | | | | | | | | The test case included in r280979 wasn't checking what it was supposed to be checking for the predicated store case. Fixing the test revealed that the multi-use case (when a pointer is used by both vectorized and scalarized memory accesses) wasn't being handled properly. We can't skip over non-consecutive-like pointers since they may have looked consecutive-like with a different memory access. llvm-svn: 280992
* [LV] Don't mark pointers used by scalarized memory accesses uniformMatthew Simpson2016-09-081-0/+268
| | | | | | | | | | | | | | | | | | Previously, all consecutive pointers were marked uniform after vectorization. However, if a consecutive pointer is used by a memory access that is eventually scalarized, the pointer won't remain uniform after all. An example is predicated stores. Even though a predicated store may be consecutive, it will still be scalarized, making it's pointer operand non-uniform. This patch updates the logic in collectLoopUniforms to consider the cases where a memory access may be scalarized. If a memory access may be scalarized, its pointer operand is not marked uniform. The determination of whether a given memory instruction will be scalarized or not has been moved into a common function that is used by the vectorizer, cost model, and legality analysis. Differential Revision: https://reviews.llvm.org/D24271 llvm-svn: 280979
* [LV] Ensure reverse interleaved group GEPs remain uniformMatthew Simpson2016-09-021-2/+8
| | | | | | | | | | | | | | For uniform instructions, we're only required to generate a scalar value for the first vector lane of each unroll iteration. Thus, if we have a reverse interleaved group, computing the member index off the scalar GEP corresponding to the last vector lane of its pointer operand technically makes the GEP non-uniform. We should compute the member index off the first scalar GEP instead. I've added the updated member index computation to the existing reverse interleaved group test. llvm-svn: 280497
* [LoopVectorizer] Predicate instructions in blocks with several incoming edgesMichael Kuperstein2016-08-302-4/+62
| | | | | | | | | | We don't need to limit predication to blocks that have a single incoming edge, we just need to use the right mask. This fixes PR30172. Differential Revision: https://reviews.llvm.org/D24009 llvm-svn: 280148
* [LV] Move insertelement sequence after scalar definitionsMatthew Simpson2016-08-292-16/+58
| | | | | | | | | | | | | | After r279649 when getting a vector value from VectorLoopValueMap, we create an insertelement sequence on-demand if the value has been scalarized instead of vectorized. We previously inserted this insertelement sequence before the value's first vector user. However, this insert location is problematic if that user is the phi node of a first-order recurrence. With this patch, we move the insertelement sequence after the last scalar instruction we created when scalarizing the value. Thus, the value's vector definition in the new loop will immediately follow its scalar definitions. This should fix PR30183. Reference: https://llvm.org/bugs/show_bug.cgi?id=30183 llvm-svn: 280001
* [Loop Vectorizer] Fixed memory confilict checks.Elena Demikhovsky2016-08-282-11/+11
| | | | | | | | | Fixed a bug in run-time checks for possible memory conflicts inside loop. The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size". Differential revision: https://reviews.llvm.org/D23176 llvm-svn: 279930
* [LV] Unify vector and scalar mapsMatthew Simpson2016-08-244-89/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch unifies the data structures we use for mapping instructions from the original loop to their corresponding instructions in the new loop. Previously, we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap. WidenMap maintained the vector values each instruction from the old loop was represented with, and ScalarIVMap maintained the scalar values each scalarized induction variable was represented with. With this patch, all values created for the new loop are maintained in VectorLoopValueMap. The change allows for several simplifications. Previously, when an instruction was scalarized, we had to insert the scalar values into vectors in order to maintain the mapping in WidenMap. Then, if a user of the scalarized value was also scalar, we had to extract the scalar values from the temporary vector we created. We now aovid these unnecessary scalar-to-vector-to-scalar conversions. If a scalarized value is used by a scalar instruction, the scalar value is used directly. However, if the scalarized value is needed by a vector instruction, we generate the needed insertelement instructions on-demand. A common idiom in several locations in the code (including the scalarization code), is to first get the vector values an instruction from the original loop maps to, and then extract a particular scalar value. This patch adds getScalarValue for this purpose along side getVectorValue as an interface into VectorLoopValueMap. These functions work together to return the requested values if they're available or to produce them if they're not. The mapping has also be made less permissive. Entries can be added to VectorLoopValue map with the new initVector and initScalar functions. getVectorValue has been modified to return a constant reference to the mapped entries. There's no real functional change with this patch; however, in some cases we will generate slightly different code. For example, instead of an insertelement sequence following the definition of an instruction, it will now precede the first use of that instruction. This can be seen in the test case changes. Differential Revision: https://reviews.llvm.org/D23169 llvm-svn: 279649
* [Loop Vectorizer] Support predication of div/remGil Rapaport2016-08-243-27/+267
| | | | | | | | | | div/rem instructions in basic blocks that require predication currently prevent vectorization. This patch extends the existing mechanism for predicating stores to handle other instructions and leverages it to predicate divs and rems. Differential Revision: https://reviews.llvm.org/D22918 llvm-svn: 279620
* [LoopVectorize] Detect loops in the innermost loop before creating ↵Tim Shen2016-08-121-0/+71
| | | | | | | | | | | | | InnerLoopVectorizer InnerLoopVectorizer shouldn't handle a loop with cycles inside the loop body, even if that cycle isn't a natural loop. Fixes PR28541. Differential Revision: https://reviews.llvm.org/D22952 llvm-svn: 278573
* [ValueTracking] Teach computeKnownBits about [su]min/maxDavid Majnemer2016-08-062-7/+7
| | | | | | | Reasoning about a select in terms of a min or max allows us to derive a tigher bound on the result. llvm-svn: 277914
* [LV, X86] Be more optimistic about vectorizing shifts.Michael Kuperstein2016-08-041-0/+23
| | | | | | | | | | | | | | | Shifts with a uniform but non-constant count were considered very expensive to vectorize, because the splat of the uniform count and the shift would tend to appear in different blocks. That made the splat invisible to ISel, and we'd scalarize the shift at codegen time. Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we are able to select the appropriate vector shifts. This updates the cost model to to take this into account by making shifts by a uniform cheap again. Differential Revision: https://reviews.llvm.org/D23049 llvm-svn: 277782
* [LoopVectorize] Change comment for isOutOfScope in collectLoopUniforms, NFCWei Mi2016-08-021-0/+27
| | | | | | | | | Update comment for isOutOfScope and add a testcase for uniform value being used out of scope. Differential Revision: https://reviews.llvm.org/D23073 llvm-svn: 277515
* [LV] Generate both scalar and vector integer induction variablesMatthew Simpson2016-08-022-81/+182
| | | | | | | | | | | | | | | | This patch enables the vectorizer to generate both scalar and vector versions of an integer induction variable for a given loop. Previously, we only generated a scalar induction variable if we knew all its users were going to be scalar. Otherwise, we generated a vector induction variable. In the case of a loop with both scalar and vector users of the induction variable, we would generate the vector induction variable and extract scalar values from it for the scalar users. With this patch, we now generate both versions of the induction variable when there are both scalar and vector users and select which version to use based on whether the user is scalar or vector. Differential Revision: https://reviews.llvm.org/D22869 llvm-svn: 277474
* [LV] Untangle the concepts of uniform and scalarMatthew Simpson2016-08-021-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch refactors the logic in collectLoopUniforms and collectValuesToIgnore, untangling the concepts of "uniform" and "scalar". It adds isScalarAfterVectorization along side isUniformAfterVectorization to distinguish the two. Known scalar values include those that are uniform, getelementptr instructions that won't be vectorized, and induction variables and induction variable update instructions whose users are all known to be scalar. This patch includes the following functional changes: - In collectLoopUniforms, we mark uniform the pointer operands of interleaved accesses. Although non-consecutive, these pointers are treated like consecutive pointers during vectorization. - In collectValuesToIgnore, we insert a value into VecValuesToIgnore if it isScalarAfterVectorization rather than isUniformAfterVectorization. This differs from the previous functionaly in that we now add getelementptr instructions that will not be vectorized into VecValuesToIgnore. This patch also removes the ValuesNotWidened set used for induction variable scalarization since, after the above changes, it is now equivalent to isScalarAfterVectorization. Differential Revision: https://reviews.llvm.org/D22867 llvm-svn: 277460
* [AVX512] Don't use i128 masked gather/scatter/load/store. Do more accurately ↵Igor Breger2016-08-021-0/+76
| | | | | | | | dataWidth check. Differential Revision: http://reviews.llvm.org/D23055 llvm-svn: 277435
* [AVX-512] Fix a test missed in r277327.Craig Topper2016-08-011-1/+1
| | | | llvm-svn: 277330
* Initial support for vectorization using svml (short vector math library).Matt Masten2016-07-291-0/+185
| | | | | | Differential Revision: https://reviews.llvm.org/D19544 llvm-svn: 277166
* Fix the assertion error in collectLoopUniforms caused by empty Worklist ↵Wei Mi2016-07-271-0/+19
| | | | | | | | | | before expanding. Contributed-by: David Callahan Differential Revision: https://reviews.llvm.org/D22886 llvm-svn: 276943
* [Loop Vectorizer] Handling loops FP induction variables.Elena Demikhovsky2016-07-242-0/+304
| | | | | | | | | | | | | | | | Allowed loop vectorization with secondary FP IVs. Like this: float *A; float x = init; for (int i=0; i < N; ++i) { A[i] = x; x -= fp_inc; } The auto-vectorization is possible when the induction binary operator is "fast" or the function has "unsafe" attribute. Differential Revision: https://reviews.llvm.org/D21330 llvm-svn: 276554
* [LV] Move vector int induction update to end of latchMatthew Simpson2016-07-213-14/+15
| | | | | | | | | | | This patch moves the update instruction for vectorized integer induction phi nodes to the end of the latch block. This ensures consistent placement of all induction updates across all the kinds of int inductions we create (scalar, splat vector, or vector phi). Differential Revision: https://reviews.llvm.org/D22416 llvm-svn: 276339
* [OptDiag,LV] Add hotness attribute to applied-optimization remarksAdam Nemet2016-07-211-4/+4
| | | | | | | Test coverage is provided by modifying the function in the FP-math testcase that we are allowed to vectorize. llvm-svn: 276223
* [OptDiag,LV] Add hotness attribute to the derived analysis remarksAdam Nemet2016-07-201-0/+113
| | | | | | | | This includes FPCompute and Aliasing. Testcase is based on no_fpmath.ll. llvm-svn: 276211
* [OptDiag,LV] Add hotness attribute to analysis remarksAdam Nemet2016-07-201-0/+201
| | | | | | | | The earlier change added hotness attribute to missed-optimization remarks. This follows up with the analysis remarks (the ones explaining the reason for the missed optimization). llvm-svn: 276192
* [LV] Add hotness attribute to missed-optimization remarksAdam Nemet2016-07-201-0/+213
| | | | | | | The new OptimizationRemarkEmitter analysis pass is hooked up to both new and old PM passes. llvm-svn: 276080
* Recommit the patch "Use uniforms set to populate VecValuesToIgnore".Wei Mi2016-07-195-17/+96
| | | | | | | | | | | | | | | | | | For instructions in uniform set, they will not have vector versions so add them to VecValuesToIgnore. For induction vars, those only used in uniform instructions or consecutive ptrs instructions have already been added to VecValuesToIgnore above. For those induction vars which are only used in uniform instructions or non-consecutive/non-gather scatter ptr instructions, the related phi and update will also be added into VecValuesToIgnore set. The change will make the vector RegUsages estimation less conservative. Differential Revision: https://reviews.llvm.org/D20474 The recommit fixed the testcase global_alias.ll. llvm-svn: 275936
* Revert rL275912.Wei Mi2016-07-184-93/+14
| | | | llvm-svn: 275915
* Use uniforms set to populate VecValuesToIgnore.Wei Mi2016-07-184-14/+93
| | | | | | | | | | | | | | | | For instructions in uniform set, they will not have vector versions so add them to VecValuesToIgnore. For induction vars, those only used in uniform instructions or consecutive ptrs instructions have already been added to VecValuesToIgnore above. For those induction vars which are only used in uniform instructions or non-consecutive/non-gather scatter ptr instructions, the related phi and update will also be added into VecValuesToIgnore set. The change will make the vector RegUsages estimation less conservative. Differential Revision: https://reviews.llvm.org/D20474 llvm-svn: 275912
* [LV] Allow interleaved accesses in loops with predicated blocksMatthew Simpson2016-07-141-0/+164
| | | | | | | | | | This patch allows the formation of interleaved access groups in loops containing predicated blocks. However, the predicated accesses are prevented from forming groups. Differential Revision: https://reviews.llvm.org/D19694 llvm-svn: 275471
* [LV] Avoid unnecessary IV scalar-to-vector-to-scalar conversionsMatthew Simpson2016-07-142-24/+49
| | | | | | | | | | | | This patch prevents increases in the number of instructions, pre-instcombine, due to induction variable scalarization. An increase in instructions can lead to an increase in the compile-time required to simplify the induction variables. We now maintain a new map for scalarized induction variables to prevent us from converting between the scalar and vector forms. This patch should resolve compile-time regressions seen after r274627. llvm-svn: 275419
OpenPOWER on IntegriCloud