summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/LoopDistribute
Commit message (Collapse)AuthorAgeFilesLines
* Revert "[IRBuilder] Fold consistently for or/and whether constant is LHS or RHS"Petr Hosek2019-07-071-14/+15
| | | | | | | | | | This reverts commit r365260 which broke the following tests: Clang :: CodeGenCXX/cfi-mfcall.cpp Clang :: CodeGenObjC/ubsan-nullability.m LLVM :: Transforms/LoopVectorize/AArch64/pr36032.ll llvm-svn: 365284
* [IRBuilder] Fold consistently for or/and whether constant is LHS or RHSPhilip Reames2019-07-061-15/+14
| | | | | | Without this, we have the unfortunate property that tests are dependent on the order of operads passed the CreateOr and CreateAnd functions. In actual usage, we'd promptly optimize them away, but it made tests slightly more verbose than they should have been. llvm-svn: 365260
* [SCEV][LSR] Prevent using undefined value in binopsEugene Leviant2019-07-031-6/+10
| | | | | | | | | | | On some occasions ReuseOrCreateCast may convert previously expanded value to undefined. That value may be passed by SCEVExpander as an argument to InsertBinop making IV chain undefined. Differential revision: https://reviews.llvm.org/D63928 llvm-svn: 365009
* LoopDistribute/LAA: Respect convergentMatt Arsenault2019-06-125-1/+416
| | | | | | | | | | | | | | | | | | This case is slightly tricky, because loop distribution should be allowed in some cases, and not others. As long as runtime dependency checks don't need to be introduced, this should be OK. This is further complicated by the fact that LoopDistribute partially ignores if LAA says that vectorization is safe, and then does its own runtime pointer legality checks. Note this pass still does not handle noduplicate correctly, as this should always be forbidden with it. I'm not going to bother trying to fix it, as it would require more effort and I think noduplicate should be removed. https://reviews.llvm.org/D62607 llvm-svn: 363160
* LoopDistribute/LAA: Add tests to catch regressionsMatt Arsenault2019-06-121-0/+65
| | | | | | | | | I broke 2 of these with a patch, but were not covered by existing tests. https://reviews.llvm.org/D63035 llvm-svn: 363158
* LoopDistribute: Add testcase where SCEV wants to insert a runtimeMatt Arsenault2019-06-071-0/+145
| | | | | | | | | check. Only the memory based checks were being tested. Prepare for fix in convergent handling. llvm-svn: 362854
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-1716-0/+1323
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-1716-1323/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.Michael Kruse2018-12-123-0/+167
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g. #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944
* [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.Shiva Chen2018-05-092-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to set breakpoints on labels and list source code around labels, we need collect debug information for labels, i.e., label name, the function label belong, line number in the file, and the address label located. In order to keep these information in LLVM IR and to allow backend to generate debug information correctly. We create a new kind of metadata for labels, DILabel. The format of DILabel is !DILabel(scope: !1, name: "foo", file: !2, line: 3) We hope to keep debug information as much as possible even the code is optimized. So, we create a new kind of intrinsic for label metadata to avoid the metadata is eliminated with basic block. The intrinsic will keep existing if we keep it from optimized out. The format of the intrinsic is llvm.dbg.label(metadata !1) It has only one argument, that is the DILabel metadata. The intrinsic will follow the label immediately. Backend could get the label metadata through the intrinsic's parameter. We also create DIBuilder API for labels to be used by Frontend. Frontend could use createLabel() to allocate DILabel objects, and use insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR. Differential Revision: https://reviews.llvm.org/D45024 Patch by Hsiangkai Wang. llvm-svn: 331841
* Move test of lazy BFI with ORE to a generic directoryAdam Nemet2017-01-131-88/+0
| | | | llvm-svn: 291862
* [LDist] Match behavior between invoking via optimization pipeline or opt ↵Adam Nemet2016-12-2113-22/+22
| | | | | | | | | | | | | | | | | | | | | | | | -loop-distribute In r267672, where the loop distribution pragma was introduced, I tried it hard to keep the old behavior for opt: when opt is invoked with -loop-distribute, it should distribute the loop (it's off by default when ran via the optimization pipeline). As MichaelZ has discovered this has the unintended consequence of breaking a very common developer work-flow to reproduce compilations using opt: First you print the pass pipeline of clang with -debug-pass=Arguments and then invoking opt with the returned arguments. clang -debug-pass will include -loop-distribute but the pass is invoked with default=off so nothing happens unless the loop carries the pragma. While through opt (default=on) we will try to distribute all loops. This changes opt's default to off as well to match clang. The tests are modified to explicitly enable the transformation. llvm-svn: 290235
* [LoopVersioning] Require loop-simplify form for loop versioning.Florian Hahn2016-12-192-8/+11
| | | | | | | | | | | | | | | | Summary: Requiring loop-simplify form for loop versioning ensures that the runtime check block always dominates the exit block. This patch closes #30958 (https://llvm.org/bugs/show_bug.cgi?id=30958). Reviewers: silviu.baranga, hfinkel, anemet, ashutosh.nema Subscribers: ashutosh.nema, mzolotukhin, efriedma, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D27469 llvm-svn: 290116
* [BPI] Add new LazyBPI analysisAdam Nemet2016-07-281-4/+10
| | | | | | | | | | | | | | | | | | | | | | Summary: The motivation is the same as in D22141: In order to add the hotness attribute to optimization remarks we need BFI to be available in all passes that emit optimization remarks. BFI depends on BPI so unless we make this lazy as well we would still compute BPI unconditionally. The solution is to use the new LazyBPI pass in LazyBFI and only compute BPI when computation of BFI is requested by the client. I extended the laziness test using a LoopDistribute test to also cover BPI. Reviewers: hfinkel, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22835 llvm-svn: 277083
* [OptDiag,LDist] Convert remaining opt remarks to use the new APIAdam Nemet2016-07-211-0/+6
| | | | llvm-svn: 276340
* [LoopDist] This test does not require ASSERTSAdam Nemet2016-07-181-2/+0
| | | | | | | Only its counterpart, diagnostics-with-hotness-lazy-BFI.ll, which invokes opt with -debug-only=. llvm-svn: 275812
* [LoopDist] Port to new PMAdam Nemet2016-07-182-0/+10
| | | | | | | | | | | | | | | | | | | | | | Summary: The direct motivation for the port is to ensure that the OptRemarkEmitter tests work with the new PM. This remains a function pass because we not only create multiple loops but could also version the original loop. In the test I need to invoke opt with -passes='require<aa>,loop-distribute'. LoopDistribute does not directly depend on AA however LAA does. LAA uses getCachedResult so I *think* we need manually pull in 'aa'. Reviewers: davidxl, silvas Subscribers: sanjoy, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22437 llvm-svn: 275811
* [OptRemark,LDist] RFC: Add hotness attributeAdam Nemet2016-07-152-0/+151
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the first set of changes implementing the RFC from http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334 This is a cross-sectional patch; rather than implementing the hotness attribute for all optimization remarks and all passes in a patch set, it implements it for the 'missed-optimization' remark for Loop Distribution. My goal is to shake out the design issues before scaling it up to other types and passes. Hotness is computed as an integer as the multiplication of the block frequency with the function entry count. It's only printed in opt currently since clang prints the diagnostic fields directly. E.g.: remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300) A new API added is similar to emitOptimizationRemarkMissed. The difference is that it additionally takes a code region that the diagnostic corresponds to. From this, hotness is computed using BFI. The new API is exposed via an analysis pass so that it can be made dependent on LazyBFI. (Thanks to Hal for the analysis pass idea.) This feature can all be enabled by setDiagnosticHotnessRequested in the LLVM context. If this is off, LazyBFI is not calculated (D22141) so there should be no overhead. A new command-line option is added to turn this on in opt. My plan is to switch all user of emitOptimizationRemark* to use this module instead. Reviewers: hfinkel Subscribers: rcox2, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D21771 llvm-svn: 275583
* [LoopDist] Fix typo in diagnosticAdam Nemet2016-07-141-2/+2
| | | | llvm-svn: 275495
* [LoopAccessAnalysis] Fix an integer overflowDavid Majnemer2016-07-071-0/+36
| | | | | | | | | We were inappropriately using 32-bit types to account for quantities that can be far larger. Fixed in PR28443. llvm-svn: 274737
* [LAA] Enable symbolic stride speculation for all LAA clientsAdam Nemet2016-06-171-0/+65
| | | | | | | | | | | | | | | This is a functional change for LLE and LDist. The other clients (LV, LVerLICM) already had this explicitly enabled. The temporary boolean parameter to LAA is removed that allowed turning off speculation of symbolic strides. This makes LAA's caching interface LAA::getInfo only take the loop as the parameter. This makes the interface more friendly to the new Pass Manager. The flag -enable-mem-access-versioning is moved from LV to a LAA which now allows turning off speculation globally. llvm-svn: 273064
* [LoopDist] Add missing RUN line in test from r268006Adam Nemet2016-04-291-0/+2
| | | | llvm-svn: 268007
* [LoopDist] Also emit optimization remark on success (-Rpass=)Adam Nemet2016-04-291-0/+56
| | | | | | | The option -Rpass=loop-distribute now reports the loops that were distributed. llvm-svn: 268006
* [LoopDist] Emit optimization remarks (-Rpass*)Adam Nemet2016-04-281-0/+118
| | | | | | | | | | | | | | | | I closely followed the precedents set by the vectorizer: * With -Rpass-missed, the loop is reported with further details pointing to -Rpass--analysis. * -Rpass-analysis reports the details why distribution has failed. * Regardless of -Rpass*, when distribution fails for a loop where distribution was forced with the pragma, a warning is produced according to -Wpass-failed. In this case the analysis info is also printed even without -Rpass-analysis. llvm-svn: 267952
* [LoopDist] Add llvm.loop.distribute.enable loop metadataAdam Nemet2016-04-271-0/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672
* Allow LLE/LD and the loop versioning infrastructure to use SCEV predicatesSilviu Baranga2015-11-091-1/+1
| | | | | | | | | | | | | | | | | | | Summary: LAA currently generates a set of SCEV predicates that must be checked by users. In the case of Loop Distribute/Loop Load Elimination, no such predicates could have been emitted, since we don't allow stride versioning. However, in the future there could be SCEV predicates that will need to be checked. This change adds support for SCEV predicate versioning in the Loop Distribute, Loop Load Eliminate and the loop versioning infrastructure. Reviewers: anemet Subscribers: mssimpso, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14240 llvm-svn: 252467
* [LAA] Hold bounds via ValueHandles during SCEV expansionAdam Nemet2015-08-211-0/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SCEV expansion can invalidate previously expanded values. For example in SCEVExpander::ReuseOrCreateCast, if we already have the requested cast value but it's not at the desired location, a new cast is inserted and the old cast will be invalidated. Therefore, when expanding the bounds for the pointers, a later entry can invalidate the IR value for an earlier one. The fix is to store a value handle rather than the value itself. The newly added test has a more detailed description of how the bug triggers. This bug can have a negative but potentially highly variable performance impact in Loop Distribution. Because one of the bound values was invalidated and is an undef expression now, InstCombine is free to transform the array overlap check: Start0 <= End1 && Start1 <= End0 into: Start0 <= End1 So depending on the runtime location of the arrays, we would detect a conflict and fall back on the original loop of the versioned loop. Also tested compile time with SPEC2006 LTO bc files. llvm-svn: 245760
* [LoopDist] Add test for missing coverageAdam Nemet2015-08-121-0/+57
| | | | | | | Add a testcase to ensure that if we can't find bounds for a necessary memcheck we don't distribute. llvm-svn: 244703
* [LAA] Merge memchecks for accesses separated by a constant offsetSilviu Baranga2015-07-081-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Often filter-like loops will do memory accesses that are separated by constant offsets. In these cases it is common that we will exceed the threshold for the allowable number of checks. However, it should be possible to merge such checks, sice a check of any interval againt two other intervals separated by a constant offset (a,b), (a+c, b+c) will be equivalent with a check againt (a, b+c), as long as (a,b) and (a+c, b+c) overlap. Assuming the loop will be executed for a sufficient number of iterations, this will be true. If not true, checking against (a, b+c) is still safe (although not equivalent). As long as there are no dependencies between two accesses, we can merge their checks into a single one. We use this technique to construct groups of accesses, and then check the intervals associated with the groups instead of checking the accesses directly. Reviewers: anemet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10386 llvm-svn: 241673
* [LoopDist] Improve variable names and comments in LoopVersioning class, NFCAdam Nemet2015-06-222-7/+7
| | | | | | | | As with the previous patch, the goal is to turn the class into a general loop-versioning class. This patch removes any references to loop distribution. llvm-svn: 240352
* [LoopAccesses] Rearrange printed lines in -analyzeAdam Nemet2015-05-181-1/+0
| | | | | | | "Store to invariant address..." is moved as the last line. This is not the prime result of the analysis. Plus it simplifies some of the tests. llvm-svn: 237573
* New Loop Distribution passAdam Nemet2015-05-146-0/+484
Summary: This implements the initial version as was proposed earlier this year (http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-January/080462.html). Since then Loop Access Analysis was split out from the Loop Vectorizer and was made into a separate analysis pass. Loop Distribution becomes the second user of this analysis. The pass is off by default and can be enabled with -enable-loop-distribution. There is currently no notion of profitability; if there is a loop with dependence cycles, the pass will try to split them off from other memory operations into a separate loop. I decided to remove the control-dependence calculation from this first version. This and the issues with the PDT are actively discussed so it probably makes sense to treat it separately. Right now I just mark all terminator instruction required which keeps identical CFGs for each distributed loop. This seems to be working pretty well for 456.hmmer where even though there is an empty if-then block in the distributed loop initially, it gets completely removed. The pass keeps DominatorTree and LoopInfo updated. I've tested this with -loop-distribute-verify with the testsuite where we distribute ~90 loops. SimplifyLoop is violated in some cases and I have a FIXME covering this. Reviewers: hfinkel, nadav, aschwaighofer Reviewed By: aschwaighofer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8831 llvm-svn: 237358
OpenPOWER on IntegriCloud