summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/LoopUnroll
Commit message (Collapse)AuthorAgeFilesLines
* The patch fixes PR27392.Evgeny Stupachenko2016-04-274-18/+28
| | | | | | | | | | | | | | | | | Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662
* [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.Adrian Prantl2016-04-151-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently each Function points to a DISubprogram and DISubprogram has a scope field. For member functions the scope is a DICompositeType. DIScopes point to the DICompileUnit to facilitate type uniquing. Distinct DISubprograms (with isDefinition: true) are not part of the type hierarchy and cannot be uniqued. This change removes the subprograms list from DICompileUnit and instead adds a pointer to the owning compile unit to distinct DISubprograms. This would make it easy for ThinLTO to strip unneeded DISubprograms and their transitively referenced debug info. Motivation ---------- Materializing DISubprograms is currently the most expensive operation when doing a ThinLTO build of clang. We want the DISubprogram to be stored in a separate Bitcode block (or the same block as the function body) so we can avoid having to expensively deserialize all DISubprograms together with the global metadata. If a function has been inlined into another subprogram we need to store a reference the block containing the inlined subprogram. Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script that updates LLVM IR testcases to the new format. http://reviews.llvm.org/D19034 <rdar://problem/25256815> llvm-svn: 266446
* [LoopUnroll] Fix the way we update DT after complete unrolling.Michael Zolotukhin2016-04-061-0/+53
| | | | | | | | Updating dominators for exit-blocks of the unrolled loops is not enough, as shown in PR27157. The proper way is to update dominators for all dominance-children of original loop blocks. llvm-svn: 265605
* LoopUnroll: only allow non-modulo Partial unrolling when Runtime=trueFiona Glaser2016-04-061-1/+1
| | | | | | Patch by Evgeny Stupachenko <evstupac@gmail.com>. llvm-svn: 265558
* [DebugInfo] Fix tests so that each subprogram belongs to a CU.Davide Italiano2016-04-051-0/+6
| | | | llvm-svn: 265490
* Adds the ability to use an epilog remainder loop during loop unrolling and makesDavid L Kreitzer2016-04-0512-87/+162
| | | | | | | | | | this the default behavior. Patch by Evgeny Stupachenko (evstupac@gmail.com). Differential Revision: http://reviews.llvm.org/D18158 llvm-svn: 265388
* Enable unroll for constant bound loops when TripCount is not modulo of ↵Zia Ansari2016-04-041-0/+29
| | | | | | | | | | unroll factor, reducing it to maximum power-of-2 that satisfies threshold limit. Commit for Evgeny Stupachenko (evstupac@gmail.com) Differential Revision: http://reviews.llvm.org/D18290 llvm-svn: 265337
* Enable non-power-of-2 #pragma unroll counts.David L Kreitzer2016-03-251-0/+37
| | | | | | | | Patch by Evgeny Stupachenko. Differential Revision: http://reviews.llvm.org/D18202 llvm-svn: 264407
* [LoopUnroll] Respect the convergent attribute.Justin Lebar2016-03-141-0/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Specifically, when we perform runtime loop unrolling of a loop that contains a convergent op, we can only unroll k times, where k divides the loop trip multiple. Without this change, we'll happily unroll e.g. the following loop for (int i = 0; i < N; ++i) { if (i == 0) convergent_op(); foo(); } into int i = 0; if (N % 2 == 1) { convergent_op(); foo(); ++i; } for (; i < N - 1; i += 2) { if (i == 0) convergent_op(); foo(); foo(); }. This is unsafe, because we've just added a control-flow dependency to the convergent op in the prelude. In general, runtime unrolling loops that contain convergent ops is safe only if we don't have emit a prelude, which occurs when the unroll count divides the trip multiple. Reviewers: resistor Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17526 llvm-svn: 263509
* [LoopUnroll] Convert some existing tests to unit-tests.Michael Zolotukhin2016-03-122-226/+0
| | | | | | | | | | | | Summary: As we now have unit-tests for UnrollAnalyzer, we can convert some existing tests to this format. It should make the tests more robust. Reviewers: chandlerc, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17904 llvm-svn: 263318
* [LoopUnrolling] Fix a bug introduced in r259869 (PR26688).Michael Zolotukhin2016-02-221-0/+34
| | | | | | | | The issue was that we only required LCSSA rebuilding if the immediate parent-loop had values used outside of it. The fix is to enaable the same logic for all outer loops, not only immediate parent. llvm-svn: 261575
* [SCEVExpander] Make findExistingExpansion smarterJunmo Park2016-02-161-0/+34
| | | | | | | | | | | | | Summary: Extending findExistingExpansion can use existing value in ExprValueMap. This patch gives 0.3~0.5% performance improvements on benchmarks(test-suite, spec2000, spec2006, commercial benchmark) Reviewers: mzolotukhin, sanjoy, zzheng Differential Revision: http://reviews.llvm.org/D15559 llvm-svn: 260938
* AMDGPU: Remove some old intrinsic uses from testsMatt Arsenault2016-02-111-6/+6
| | | | llvm-svn: 260493
* LoopUnroll: Use the optsize threshold for minsize as wellJustin Bogner2016-01-111-1/+2
| | | | | | | | | | | | Currently we're unrolling loops more in minsize than in optsize, which means -Oz will have a larger code size than -Os. That doesn't make any sense. This resolves the FIXME about this in LoopUnrollPass and extends the optsize test to make sure we use the smaller threshold for minsize as well. llvm-svn: 257402
* LoopInfo: Simplify ownership of Loop objectsJustin Bogner2016-01-081-1/+1
| | | | | | | | | | | It's strange that LoopInfo mostly owns the Loop objects, but that it defers deleting them to the loop pass manager. Instead, change the oddly named "updateUnloop" to "markAsRemoved" and have it queue the Loop object for deletion. We can't delete the Loop immediately when we remove it, since we need its pointer identity still, so we'll mark the object as "invalid" so that clients can see what's going on. llvm-svn: 257191
* AMDGPU: Switch barrier intrinsics to using convergentMatt Arsenault2015-12-192-0/+36
| | | | | | | | noduplicate prevents unrolling of small loops that happen to have barriers in them. If a loop has a barrier in it, it is OK to duplicate it for the unroll. llvm-svn: 256075
* Don't recompute LCSSA after loop-unrolling when possible.Michael Zolotukhin2015-11-141-0/+119
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently we always recompute LCSSA for outer loops after unrolling an inner loop. That leads to compile time problem when we have big loop nests, and we can solve it by avoiding unnecessary work. For instance, if w eonly do partial unrolling, we don't break LCSSA, so we don't need to rebuild it. Also, if all exits from the inner loop are inside the enclosing loop, then complete unrolling won't break LCSSA either. I replaced unconditional LCSSA recomputation with conditional recomputation + unconditional assert and added several tests, which were failing when I experimented with it. Soon I plan to follow up with a similar patch for recalculation of dominators tree. Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14526 llvm-svn: 253126
* DI: Reverse direction of subprogram -> function edge.Peter Collingbourne2015-11-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Previously, subprograms contained a metadata reference to the function they described. Because most clients need to get or set a subprogram for a given function rather than the other way around, this created unneeded inefficiency. For example, many passes needed to call the function llvm::makeSubprogramMap() to build a mapping from functions to subprograms, and the IR linker needed to fix up function references in a way that caused quadratic complexity in the IR linking phase of LTO. This change reverses the direction of the edge by storing the subprogram as function-level metadata and removing DISubprogram's function field. Since this is an IR change, a bitcode upgrade has been provided. Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is attached to the PR. Differential Revision: http://reviews.llvm.org/D14265 llvm-svn: 252219
* Revert "[IndVarSimplify] Rewrite loop exit values with their initial values ↵Tobias Grosser2015-11-031-1/+1
| | | | | | | | | | | | | | | | | | | from loop preheader" Commit 251839 triggers miscompiles on some bots: http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723 (The commit is listed in 13722, but due to an existing failure introduced in 13721 and reverted in 13723 the failure is only visible in 13723) To verify r251839 is indeed the only change that triggered the buildbot failures and to ensure the buildbots remain green while investigating I temporarily revert this commit. At the current state it is unclear if this commit introduced some miscompile or if it only exposed code to Polly that is subsequently miscompiled by Polly. llvm-svn: 251901
* [IndVarSimplify] Rewrite loop exit values with their initial values from ↵Chen Li2015-11-021-1/+1
| | | | | | | | | | | | | | | | | loop preheader Summary: This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch. Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13974 llvm-svn: 251839
* Revert r251492 "[IndVarSimplify] Rewrite loop exit values with their Chen Li2015-10-281-1/+1
| | | | | | initial values from loop preheader", because it broke some bots. llvm-svn: 251498
* [IndVarSimplify] Rewrite loop exit values with their initial values from ↵Chen Li2015-10-281-1/+1
| | | | | | | | | | | | | | | | | loop preheader Summary: This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch. Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13974 llvm-svn: 251492
* [Tests] Add one more case to LoopUnroll/pr18861.ll for better coverage.Michael Zolotukhin2015-10-021-0/+31
| | | | llvm-svn: 249174
* [Tests] Give meaningful names to blocks in LoopUnroll/pr18861.ll, add a ↵Michael Zolotukhin2015-10-021-13/+37
| | | | | | description of what's going on. llvm-svn: 249173
* [Tests] Slightly reduce test LoopUnroll/pr18861.ll.Michael Zolotukhin2015-10-021-16/+4
| | | | llvm-svn: 249172
* [Unroll] Do not crash trying to propagate a value to vector load.Michael Zolotukhin2015-09-221-0/+19
| | | | llvm-svn: 248333
* [Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad.Michael Zolotukhin2015-09-221-2/+30
| | | | | | | | Apart from checking that GlobalVariable is a constant, we should check that it's not a weak constant, in which case we can't propagate its value. llvm-svn: 248327
* [Unroll] Fix a bug in UnrolledInstAnalyzer::visitLoad.Michael Zolotukhin2015-09-161-0/+29
| | | | | | | | We only checked that a global is initialized with constants, which is incorrect. We should be checking that GlobalVariable *is* a constant, not just initialized with it. llvm-svn: 247769
* DI: Require subprogram definitions to be distinctDuncan P. N. Exon Smith2015-08-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | As a follow-up to r246098, require `DISubprogram` definitions (`isDefinition: true`) to be 'distinct'. Specifically, add an assembler check, a verifier check, and bitcode upgrading logic to combat testcase bitrot after the `DIBuilder` change. While working on the testcases, I realized that test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore. Its purpose was to check for a corner case in PR22792 where two subprogram definitions match exactly and share the same metadata node. The new verifier check, requiring that subprogram definitions are 'distinct', precludes that possibility. I updated almost all the IR with the following script: git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' | grep -v test/Bitcode | xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/' Likely some variant of would work for out-of-tree testcases. llvm-svn: 246327
* Add new llvm.loop.unroll.enable metadata.Mark Heffernan2015-08-101-0/+66
| | | | | | | | | | | | | This change adds the unroll metadata "llvm.loop.unroll.enable" which directs the optimizer to unroll a loop fully if the trip count is known at compile time, and unroll partially if the trip count is not known at compile time. This differs from "llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not known at compile time. The "llvm.loop.unroll.enable" is intended to be added for loops annotated with "#pragma unroll". llvm-svn: 244466
* [Unroll] Improve the brute force loop unroll estimate by propagatingChandler Carruth2015-08-031-0/+23
| | | | | | | | | | | | | | | | | | | | | | | through PHI nodes across iterations. This patch teaches the new advanced loop unrolling heuristics to propagate constants into the loop from the preheader and around the backedge after simulating each iteration. This lets us brute force solve simple recurrances that aren't modeled effectively by SCEV. It also makes it more clear why we need to process the loop in-order rather than bottom-up which might otherwise make much more sense (for example, for DCE). This came out of an attempt I'm making to develop a principled way to account for dead code in the unroll estimation. When I implemented a forward-propagating version of that it produced incorrect results due to failing to propagate *cost* between loop iterations through the PHI nodes, and it occured to me we really should at least propagate simplifications across those edges, and it is quite easy thanks to the loop being in canonical and LCSSA form. Differential Revision: http://reviews.llvm.org/D11706 llvm-svn: 243900
* [Unroll] Handle SwitchInst properly.Michael Zolotukhin2015-07-291-0/+24
| | | | | | Previously successor selection was simply wrong. llvm-svn: 243545
* [Unroll] Don't crash when simplified branch condition is undef.Michael Zolotukhin2015-07-291-0/+25
| | | | llvm-svn: 243544
* Rename test full-unroll-bad-geps.ll to full-unroll-crashers.ll.Michael Zolotukhin2015-07-291-0/+0
| | | | | | | No reason to limit it only to GEP-related crashes. More tests are to come here. llvm-svn: 243543
* Roll forward r243250Jingyue Wu2015-07-261-3/+6
| | | | | | | | | r243250 appeared to break clang/test/Analysis/dead-store.c on one of the build slaves, but I couldn't reproduce this failure locally. Probably a false positive as I saw this test was broken by r243246 or r243247 too but passed later without people fixing anything. llvm-svn: 243253
* Revert r243250Jingyue Wu2015-07-261-6/+3
| | | | | | breaks tests llvm-svn: 243251
* [TTI/CostModel] improve TTI::getGEPCost and use it in ↵Jingyue Wu2015-07-261-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CostModel::getInstructionCost Summary: This patch updates TargetTransformInfoImplCRTPBase::getGEPCost to consider addressing modes. It now returns TCC_Free when the GEP can be completely folded to an addresing mode. I started this patch as I refactored SLSR. Function isGEPFoldable looks common and is indeed used by some WIP of mine. So I extracted that logic to getGEPCost. Furthermore, I noticed getGEPCost wasn't directly tested anywhere. The best testing bed seems CostModel, but its getInstructionCost method invokes getAddressComputationCost for GEPs which provides very coarse estimation. So this patch also makes getInstructionCost call the updated getGEPCost for GEPs. This change inevitably breaks some tests because the cost model changes, but nothing looks seriously wrong -- if we believe the new cost model is the right way to go, these tests should be updated. This patch is not perfect yet -- the comments in some tests need to be updated. I want to know whether this is a right approach before fixing those details. Reviewers: chandlerc, hfinkel Subscribers: aschwaighofer, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D9819 llvm-svn: 243250
* Handle resolvable branches in complete loop unroll heuristic.Michael Zolotukhin2015-07-241-0/+207
| | | | | | | | | | | | | | Summary: Resolving a branch allows us to ignore blocks that won't be executed, and thus make our estimate more accurate. This patch is intended to be applied after D10205 (though it could be applied independently). Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10206 llvm-svn: 243084
* Tidy-up test case from r242257.Michael Zolotukhin2015-07-151-5/+8
| | | | llvm-svn: 242268
* [LoopUnrolling] Handle cast instructions.Michael Zolotukhin2015-07-151-0/+94
| | | | | | | | | During estimation of unrolling effect we should be able to propagate constants through casts. Differential Revision: http://reviews.llvm.org/D10207 llvm-svn: 242257
* Enable runtime unrolling with unroll pragma metadataMark Heffernan2015-07-131-14/+13
| | | | | | | | | | Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N") and a runtime trip count. Also, do not unroll loops with unroll full metadata if the loop has a runtime loop count. Previously, such loops would be unrolled with a very large threshold (pragma-unroll-threshold) if runtime unrolled happened to be enabled resulting in a very large (and likely unwise) unroll factor. llvm-svn: 242047
* [LoopUnroll] Use undef for phis with no value liveDavid Majnemer2015-07-011-0/+24
| | | | | | | | We would create a phi node with a zero initialized operand instead of undef in the case where no value was originally available. This was problematic for x86_mmx which has no null value. llvm-svn: 241143
* Set proper debug location for branch added in BasicBlock::splitBasicBlock().Alexey Samsonov2015-06-111-12/+33
| | | | | | | | | | | | | | | | This improves debug locations in passes that do a lot of basic block transformations. Important case is LoopUnroll pass, the test for correct debug locations accompanies this change. Test Plan: regression test suite Reviewers: dblaikie, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10367 llvm-svn: 239551
* [LoopUnroll] Fix truncation bug in canUnrollCompletely.Sanjoy Das2015-06-061-0/+58
| | | | | | | | | | | | | | | | | Summary: canUnrollCompletely takes `unsigned` values for `UnrolledCost` and `RolledDynamicCost` but is passed in `uint64_t`s that are silently truncated. Because of this, when `UnrolledSize` is a large integer that has a small remainder with UINT32_MAX, LLVM tries to completely unroll loops with high trip counts. Reviewers: mzolotukhin, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10293 llvm-svn: 239218
* [Unroll] Rework the naming and structure of the new unroll heuristics.Chandler Carruth2015-06-052-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new naming is (to me) much easier to understand. Here is a summary of the new state of the world: - '*Threshold' is the threshold for full unrolling. It is measured against the estimated unrolled cost as computed by getUserCost in TTI (or CodeMetrics, etc). We will exceed this threshold when unrolling loops where unrolling exposes a significant degree of simplification of the logic within the loop. - '*PercentDynamicCostSavedThreshold' is the percentage of the loop's estimated dynamic execution cost which needs to be saved by unrolling to apply a discount to the estimated unrolled cost. - '*DynamicCostSavingsDiscount' is the discount applied to the estimated unrolling cost when the dynamic savings are expected to be high. When actually analyzing the loop, we now produce both an estimated unrolled cost, and an estimated rolled cost. The rolled cost is notably a dynamic estimate based on our analysis of the expected execution of each iteration. While we're still working to build up the infrastructure for making these estimates, to me it is much more clear *how* to make them better when they have reasonably descriptive names. For example, we may want to apply estimated (from heuristics or profiles) dynamic execution weights to the *dynamic* cost estimates. If we start doing that, we would also need to track the static unrolled cost and the dynamic unrolled cost, as only the latter could reasonably be weighted by profile information. This patch is sadly not without functionality change for the new unroll analysis logic. Buried in the heuristic management were several things that surprised me. For example, we never subtracted the optimized instruction count off when comparing against the unroll heursistics! I don't know if this just got lost somewhere along the way or what, but with the new accounting of things, this is much easier to keep track of and we use the post-simplification cost estimate to compare to the thresholds, and use the dynamic cost reduction ratio to select whether we can exceed the baseline threshold. The old values of these flags also don't necessarily make sense. My impression is that none of these thresholds or discounts have been tuned yet, and so they're just arbitrary placehold numbers. As such, I've not bothered to adjust for the fact that this is now a discount and not a tow-tier threshold model. We need to tune all these values once the logic is ready to be enabled. Differential Revision: http://reviews.llvm.org/D9966 llvm-svn: 239164
* [PPC/LoopUnrollRuntime] Don't avoid high-cost trip count computation on the ↵Hal Finkel2015-05-211-0/+27
| | | | | | | | | | | | | PPC/A2 On X86 (and similar OOO cores) unrolling is very limited, and even if the runtime unrolling is otherwise profitable, the expense of a division to compute the trip count could greatly outweigh the benefits. On the A2, we unroll a lot, and the benefits of unrolling are more significant (seeing a 5x or 6x speedup is not uncommon), so we're more able to tolerate the expense, on average, of a division to compute the trip count. llvm-svn: 237947
* Add another InstCombine pass after LoopUnroll.Wei Mi2015-05-141-0/+85
| | | | | | | | This is to cleanup some redundency generated by LoopUnroll pass. Such redundency may not be cleaned up by existing passes after LoopUnroll. Differential Revision: http://reviews.llvm.org/D9777 llvm-svn: 237395
* Reimplement heuristic for estimating complete-unroll optimization effects.Michael Zolotukhin2015-05-122-2/+36
| | | | | | | | | | | | | | | | | | | | Summary: This patch reimplements heuristic that tries to estimate optimization beneftis from complete loop unrolling. In this patch I kept the minimal changes - e.g. I removed code handling branches and folding compares. That's a promising area, but now there are too many questions to discuss before we can enable it. Test Plan: Tests are included in the patch. Reviewers: hfinkel, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8816 llvm-svn: 237156
* [LoopUnrollRuntime] Avoid high-cost trip count computation.Sanjoy Das2015-04-143-1/+31
| | | | | | | | | | | | | | | | | Summary: Runtime unrolling of loops needs to emit an expression to compute the loop's runtime trip-count. Avoid runtime unrolling if this computation will be expensive. Depends on D8993. Reviewers: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8994 llvm-svn: 234846
* [LoopUnrollRuntime] Clean up a predicate.Sanjoy Das2015-04-121-0/+45
| | | | | | | | Clean up a predicate I added in r229731, fix the relevant comment and add a test case. The earlier version is confusing to read and was also buggy (probably not a coincidence) till Alexey fixed it in r233881. llvm-svn: 234701
OpenPOWER on IntegriCloud