summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis/LoopAccessAnalysis.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Revert r313771 "[SLP] Vectorize jumbled memory loads."Hans Wennborg2017-09-201-71/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This broke the buildbots, e.g. http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391 > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Subscribers: mzolotukhin > > Reviewed By: ayal > > Differential Revision: https://reviews.llvm.org/D36130 > > Review comments updated accordingly > > Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 > > Added a TODO for sortLoadAccesses API > > Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 > > Modified the TODO for sortLoadAccesses API > > Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 > > Review comment update for using OpdNum to insert the mask in respective location > > Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce > > Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase > > Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b llvm-svn: 313781
* [SLP] Vectorize jumbled memory loads.Mohammad Shahid2017-09-201-0/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Subscribers: mzolotukhin Reviewed By: ayal Differential Revision: https://reviews.llvm.org/D36130 Review comments updated accordingly Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 Added a TODO for sortLoadAccesses API Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 Modified the TODO for sortLoadAccesses API Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 Review comment update for using OpdNum to insert the mask in respective location Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b llvm-svn: 313771
* Revert r313736: "[SLP] Vectorize jumbled memory loads."Alexander Kornienko2017-09-201-70/+0
| | | | | | | The revision breaks buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio llvm-svn: 313758
* Revert r313753: "Fix a -Wsign-compare warning in LoopAccessAnalysis.cpp"Alexander Kornienko2017-09-201-2/+1
| | | | llvm-svn: 313757
* Fix a -Wsign-compare warning in LoopAccessAnalysis.cppAlexander Kornienko2017-09-201-1/+2
| | | | llvm-svn: 313753
* [SLP] Vectorize jumbled memory loads.Mohammad Shahid2017-09-201-0/+70
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 Commit after rebase for patch D36130 Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab llvm-svn: 313736
* [LV] Fix maximum legal VF calculationAlon Kom2017-09-141-3/+4
| | | | | | | | | | | | | | | | This patch fixes pr34283, which exposed that the computation of maximum legal width for vectorization was wrong, because it relied on MaxInterleaveFactor to obtain the maximum stride used in the loop, however not all strided accesses in the loop have an interleave-group associated with them. Instead of recording the maximum stride in the loop, which can be over conservative (e.g. if the access with the maximum stride is not involved in the dependence limitation), this patch tracks the actual maximum legal width imposed by accesses that are involved in dependencies. Differential Revision: https://reviews.llvm.org/D37507 llvm-svn: 313237
* [LAA] Allow more run-time alias checks by coercing pointer expressions to ↵Silviu Baranga2017-09-121-27/+96
| | | | | | | | | | | | | | | | | | | | | | AddRecExprs Summary: LAA can only emit run-time alias checks for pointers with affine AddRec SCEV expressions. However, non-AddRecExprs can be now be converted to affine AddRecExprs using SCEV predicates. This change tries to add the minimal set of SCEV predicates in order to enable run-time alias checking. Reviewers: anemet, mzolotukhin, mkuper, sanjoy, hfinkel Reviewed By: hfinkel Subscribers: mssimpso, Ayal, dorit, roman.shirokiy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D17080 llvm-svn: 313012
* [LAA] Correctly return a half-open range in expandBoundsJames Molloy2017-04-051-1/+4
| | | | | | | | | | This is a latent bug that's been hanging around for a while. For a loop-invariant pointer, expandBounds would return the range {Ptr, Ptr}, but this was interpreted as a half-open range, not a closed range. So we ended up planting incorrect bounds checks. Even worse, they were tautological, so we ended up incorrectly executing the optimized loop. llvm-svn: 299526
* [SLP] Revert everything that has to do with memory access sorting.Michael Kuperstein2017-03-101-51/+0
| | | | | | | | | | | | This reverts r293386, r294027, r294029 and r296411. Turns out the SLP tree isn't actually a "tree" and we don't handle accessing the same packet of loads in several different orders well, causing miscompiles. Revert until we can fix this properly. llvm-svn: 297493
* [SLP] Fixed non-deterministic behavior in Loop Vectorizer.Amjad Aboud2017-03-081-9/+11
| | | | | | Differential Revision: https://reviews.llvm.org/D30638 llvm-svn: 297257
* [SLP] Revert r296863 due to miscompiles.Michael Kuperstein2017-03-061-24/+8
| | | | | | Details and reproducer are on the email thread for r296863. llvm-svn: 297103
* [SLP] Fixes the bug due to absence of in order uses of scalars which needs ↵Mohammad Shahid2017-03-031-8/+24
| | | | | | | | | | | | | | | | to be available for VectorizeTree() API.This API uses it for proper mask computation to be used in shufflevector IR. The fix is to compute the mask for out of order memory accesses while building the vectorizable tree instead of actual vectorization of vectorizable tree.It also needs to recompute the proper Lane for external use of vectorizable scalars based on shuffle mask. Reviewers: mkuper Differential Revision: https://reviews.llvm.org/D30159 Change-Id: Ide8773ce0ad3562f3cf4d1a0ad0f487e2f60ce5d llvm-svn: 296863
* Revert r296575 "[SLP] Fixes the bug due to absence of in order uses of ↵Hans Wennborg2017-03-011-23/+8
| | | | | | | | scalars which needs to be available" It caused miscompiles, e.g. in Chromium (PR32109). llvm-svn: 296654
* [SLP] Fixes the bug due to absence of in order uses of scalars which needs ↵Mohammad Shahid2017-03-011-8/+23
| | | | | | | | | | | | | | | to be available for VectorizeTree() API.This API uses it for proper mask computation to be used in shufflevector IR. The fix is to compute the mask for out of order memory accesses while building the vectorizable tree instead of actual vectorization of vectorizable tree. Reviewers: mkuper Differential Revision: https://reviews.llvm.org/D30159 Change-Id: Id1e287f073fa4959713ba545fa4254db5da8b40d llvm-svn: 296575
* [SLP] Load sorting should not try to sort things that aren't loads.Michael Kuperstein2017-02-271-0/+5
| | | | | | | We may get a VL where the first element is a load, but the others aren't. Trying to sort such VLs can only lead to sorrow. llvm-svn: 296411
* [LAA] Remove unused LoopAccessReportAdam Nemet2017-02-231-15/+0
| | | | | | | The need for this removed when I converted everything to use the opt-remark classes directly with the streaming interface. llvm-svn: 296017
* [LAA] Remove unused code (NFC)Matthew Simpson2017-02-171-5/+0
| | | | llvm-svn: 295493
* [LV/LoopAccess] Check statically if an unknown dependence distance can be Dorit Nuzman2017-02-121-6/+78
| | | | | | | | | | | | | | | | | | | | | | | proven larger than the loop-count This fixes PR31098: Try to resolve statically data-dependences whose compile-time-unknown distance can be proven larger than the loop-count, instead of resorting to runtime dependence checking (which are not always possible). For vectorization it is sufficient to prove that the dependence distance is >= VF; But in some cases we can prune unknown dependence distances early, and even before selecting the VF, and without a runtime test, by comparing the distance against the loop iteration count. Since the vectorized code will be executed only if LoopCount >= VF, proving distance >= LoopCount also guarantees that distance >= VF. This check is also equivalent to the Strong SIV Test. Reviewers: mkuper, anemet, sanjoy Differential Revision: https://reviews.llvm.org/D28044 llvm-svn: 294892
* [SLP] Make sortMemAccesses explicitly return an error. NFC.Michael Kuperstein2017-02-031-12/+10
| | | | llvm-svn: 294029
* [SLP] Use SCEV to sort memory accesses.Michael Kuperstein2017-02-031-17/+34
| | | | | | | | | | This generalizes memory access sorting to use differences between SCEVs, instead of relying on constant offsets. That allows us to properly do SLP vectorization of non-sequentially ordered loads within loops bodies. Differential Revision: https://reviews.llvm.org/D29425 llvm-svn: 294027
* [SLP] Vectorize loads of consecutive memory accesses, accessed in ↵Mohammad Shahid2017-01-281-0/+31
| | | | | | | | | | | | | non-consecutive (jumbled) way. The jumbled scalar loads will be sorted while building the tree and these accesses will be marked to generate shufflevector after the vectorized load with proper mask. Reviewers: hfinkel, mssimpso, mkuper Differential Revision: https://reviews.llvm.org/D26905 Change-Id: I9c0c8e6f91a00076a7ee1465440a3f6ae092f7ad llvm-svn: 293386
* [PM] Separate the LoopAnalysisManager from the LoopPassManager and moveChandler Carruth2017-01-111-16/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | the latter to the Transforms library. While the loop PM uses an analysis to form the IR units, the current plan is to have the PM itself establish and enforce both loop simplified form and LCSSA. This would be a layering violation in the analysis library. Fundamentally, the idea behind the loop PM is to *transform* loops in addition to running passes over them, so it really seemed like the most natural place to sink this was into the transforms library. We can't just move *everything* because we also have loop analyses that rely on a subset of the invariants. So this patch splits the the loop infrastructure into the analysis management that has to be part of the analysis library, and the transform-aware pass manager. This also required splitting the loop analyses' printer passes out to the transforms library, which makes sense to me as running these will transform the code into LCSSA in theory. I haven't split the unittest though because testing one component without the other seems nearly intractable. Differential Revision: https://reviews.llvm.org/D28452 llvm-svn: 291662
* [PM] Rewrite the loop pass manager to use a worklist and augmented runChandler Carruth2017-01-111-22/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arguments much like the CGSCC pass manager. This is a major redesign following the pattern establish for the CGSCC layer to support updates to the set of loops during the traversal of the loop nest and to support invalidation of analyses. An additional significant burden in the loop PM is that so many passes require access to a large number of function analyses. Manually ensuring these are cached, available, and preserved has been a long-standing burden in LLVM even with the help of the automatic scheduling in the old pass manager. And it made the new pass manager extremely unweildy. With this design, we can package the common analyses up while in a function pass and make them immediately available to all the loop passes. While in some cases this is unnecessary, I think the simplicity afforded is worth it. This does not (yet) address loop simplified form or LCSSA form, but those are the next things on my radar and I have a clear plan for them. While the patch is very large, most of it is either mechanically updating loop passes to the new API or the new testing for the loop PM. The code for it is reasonably compact. I have not yet updated all of the loop passes to correctly leverage the update mechanisms demonstrated in the unittests. I'll do that in follow-up patches along with improved FileCheck tests for those passes that ensure things work in more realistic scenarios. In many cases, there isn't much we can do with these until the loop simplified form and LCSSA form are in place. Differential Revision: https://reviews.llvm.org/D28292 llvm-svn: 291651
* [LAA] Prevent invalid IR for loop-invariant bound in loop bodyKeno Fischer2016-12-051-7/+13
| | | | | | | | | | | | | | | | | | Summary: If LAA expands a bound that is loop invariant, but not hoisted out of the loop body, it used to use that value anyway, causing a non-domination error, because the memcheck block is of course not dominated by the scalar loop body. Detect this situation and expand the SCEV expression instead. Fixes PR31251 Reviewers: anemet Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D27397 llvm-svn: 288705
* Fix some Clang-tidy and Include What You Use warnings; other minor fixes (NFC).Eugene Zelenko2016-11-301-1/+48
| | | | | | This preparation to remove SetVector.h dependency on SmallSet.h. llvm-svn: 288256
* [PM] Change the static object whose address is used to uniquely identifyChandler Carruth2016-11-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | analyses to have a common type which is enforced rather than using a char object and a `void *` type when used as an identifier. This has a number of advantages. First, it at least helps some of the confusion raised in Justin Lebar's code review of why `void *` was being used everywhere by having a stronger type that connects to documentation about this. However, perhaps more importantly, it addresses a serious issue where the alignment of these pointer-like identifiers was unknown. This made it hard to use them in pointer-like data structures. We were already dodging this in dangerous ways to create the "all analyses" entry. In a subsequent patch I attempted to use these with TinyPtrVector and things fell apart in a very bad way. And it isn't just a compile time or type system issue. Worse than that, the actual alignment of these pointer-like opaque identifiers wasn't guaranteed to be a useful alignment as they were just characters. This change introduces a type to use as the "key" object whose address forms the opaque identifier. This both forces the objects to have proper alignment, and provides type checking that we get it right everywhere. It also makes the types somewhat less mysterious than `void *`. We could go one step further and introduce a truly opaque pointer-like type to return from the `ID()` static function rather than returning `AnalysisKey *`, but that didn't seem to be a clear win so this is just the initial change to get to a reliably typed and aligned object serving is a key for all the analyses. Thanks to Richard Smith and Justin Lebar for helping pick plausible names and avoid making this refactoring many times. =] And thanks to Sean for the super fast review! While here, I've tried to move away from the "PassID" nomenclature entirely as it wasn't really helping and is overloaded with old pass manager constructs. Now we have IDs for analyses, and key objects whose address can be used as IDs. Where possible and clear I've shortened this to just "ID". In a few places I kept "AnalysisID" to make it clear what was being identified. Differential Revision: https://reviews.llvm.org/D27031 llvm-svn: 287783
* [LAA, LV] Port to new streaming interface for opt remarks. Update LVAdam Nemet2016-09-301-24/+38
| | | | | | | | | | | | | | (Recommit after making sure IsVerbose gets properly initialized in DiagnosticInfoOptimizationBase. See previous commit that takes care of this.) OptimizationRemarkAnalysis directly takes the role of the report that is generated by LAA. Then we need the magic to be able to turn an LAA remark into an LV remark. This is done via a new OptimizationRemark ctor. llvm-svn: 282813
* Revert "[LAA, LV] Port to new streaming interface for opt remarks. Update LV"Adam Nemet2016-09-291-38/+24
| | | | | | | | This reverts commit r282758. There are some clang failures I haven't seen. llvm-svn: 282759
* [LAA, LV] Port to new streaming interface for opt remarks. Update LVAdam Nemet2016-09-291-24/+38
| | | | | | | | | | OptimizationRemarkAnalysis directly takes the role of the report that is generated by LAA. Then we need the magic to be able to turn an LAA remark into an LV remark. This is done via a new OptimizationRemark ctor. llvm-svn: 282758
* [LAA] Rename emitAnalysis to recordAnalys. NFCAdam Nemet2016-09-281-23/+20
| | | | | | | | | Ever since LAA was split out into an analysis on its own, this function stopped emitting the report directly. Instead it stores it to be retrieved by the client which can then emit it as its own report (e.g. -Rpass-analysis=loop-vectorize). llvm-svn: 282561
* [LV] When reporting about a specific instruction without debug location use ↵Adam Nemet2016-09-211-1/+4
| | | | | | | | loop's This can occur for example if some optimization drops the debug location. llvm-svn: 282048
* [Loop Vectorizer] Consecutive memory access - fixed and simplifiedElena Demikhovsky2016-09-181-4/+4
| | | | | | | | | Amended consecutive memory access detection in Loop Vectorizer. Load/Store were not handled properly without preceding GEP instruction. Differential Revision: https://reviews.llvm.org/D20789 llvm-svn: 281853
* Fix indent. NFC.Chad Rosier2016-08-311-2/+2
| | | | llvm-svn: 280270
* [Loop Vectorizer] Fixed memory confilict checks.Elena Demikhovsky2016-08-281-3/+29
| | | | | | | | | Fixed a bug in run-time checks for possible memory conflicts inside loop. The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size". Differential revision: https://reviews.llvm.org/D23176 llvm-svn: 279930
* Use the range variant of transform instead of unpacking begin/endDavid Majnemer2016-08-121-4/+3
| | | | | | No functionality change is intended. llvm-svn: 278476
* Consistently use LoopAnalysisManagerSean Silva2016-08-091-2/+2
| | | | | | | | | | | | | | | | | One exception here is LoopInfo which must forward-declare it (because the typedef is in LoopPassManager.h which depends on LoopInfo). Also, some includes for LoopPassManager.h were needed since that file provides the typedef. Besides a general consistently benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278079
* Consistently use FunctionAnalysisManagerSean Silva2016-08-091-1/+1
| | | | | | | | | | | Besides a general consistently benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278077
* [LV, X86] Be more optimistic about vectorizing shifts.Michael Kuperstein2016-08-041-1/+8
| | | | | | | | | | | | | | | Shifts with a uniform but non-constant count were considered very expensive to vectorize, because the splat of the uniform count and the shift would tend to appear in different blocks. That made the splat invisible to ISel, and we'd scalarize the shift at codegen time. Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we are able to select the appropriate vector shifts. This updates the cost model to to take this into account by making shifts by a uniform cheap again. Differential Revision: https://reviews.llvm.org/D23049 llvm-svn: 277782
* [OptDiag,LV] Add hotness attribute to analysis remarksAdam Nemet2016-07-201-7/+8
| | | | | | | | The earlier change added hotness attribute to missed-optimization remarks. This follows up with the analysis remarks (the ones explaining the reason for the missed optimization). llvm-svn: 276192
* [LAA] Don't hold on to DominatorTree in the analysis resultAdam Nemet2016-07-131-4/+5
| | | | llvm-svn: 275335
* [LAA] Don't hold on to TargetLibraryInfo in the analysis resultAdam Nemet2016-07-131-3/+4
| | | | llvm-svn: 275334
* [LAA] Don't hold on to DataLayout in the analysis resultAdam Nemet2016-07-131-11/+8
| | | | | | | In fact, don't even pass this to the ctor since we can get it from the module. llvm-svn: 275326
* [LAA] Don't hold on to LoopInfo in the analysis resultAdam Nemet2016-07-131-3/+3
| | | | llvm-svn: 275325
* [LAA] Don't hold on to AliasAnalysis in the analysis resultAdam Nemet2016-07-131-3/+3
| | | | llvm-svn: 275322
* [LoopAccessAnalysis] Some minor cleanupsDavid Majnemer2016-07-121-20/+16
| | | | | | | | | Use range-base for loops. Use auto when appropriate. No functional change is intended. llvm-svn: 275213
* [PM] name the new PM LAA class LoopAccessAnalysis (LAA) /NFCXinliang David Li2016-07-081-3/+3
| | | | llvm-svn: 274934
* Rename LoopAccessAnalysis to LoopAccessLegacyAnalysis /NFCXinliang David Li2016-07-081-9/+9
| | | | llvm-svn: 274927
* [LoopAccessAnalysis] Fix an integer overflowDavid Majnemer2016-07-071-22/+22
| | | | | | | | | We were inappropriately using 32-bit types to account for quantities that can be far larger. Fixed in PR28443. llvm-svn: 274737
* [PM] Avoid getResult on a higher level in LoopAccessAnalysisSean Silva2016-07-071-7/+15
| | | | | | | Note that require<domtree> and require<loops> aren't needed because they come in implicitly via the loop pass manager. llvm-svn: 274712
OpenPOWER on IntegriCloud