path: root/llvm/test/Transforms/LoopUnroll
Commit message | Author | Age | Files | Lines
...
* IR: Make metadata typeless in assembly | Duncan P. N. Exon Smith | 2014-12-15 | 3 | -34/+34
  Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802.

  - Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`.
  - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics.

  So, assembly like this:

      define @foo(i32 %v) {
        call void @llvm.foo(metadata !{i32 %v}, metadata !0)
        call void @llvm.foo(metadata !{i32 7}, metadata !0)
        call void @llvm.foo(metadata !1, metadata !0)
        call void @llvm.foo(metadata !3, metadata !0)
        call void @llvm.foo(metadata !{metadata !3}, metadata !0)
        ret void, !bar !2
      }
      !0 = metadata !{metadata !2}
      !1 = metadata !{i32* @global}
      !2 = metadata !{metadata !3}
      !3 = metadata !{}

  turns into this:

      define @foo(i32 %v) {
        call void @llvm.foo(metadata i32 %v, metadata !0)
        call void @llvm.foo(metadata i32 7, metadata !0)
        call void @llvm.foo(metadata i32* @global, metadata !0)
        call void @llvm.foo(metadata !3, metadata !0)
        call void @llvm.foo(metadata !{!3}, metadata !0)
        ret void, !bar !2
      }
      !0 = !{!2}
      !1 = !{i32* @global}
      !2 = !{!3}
      !3 = !{}

  I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases.

  This is part of PR21532.

  llvm-svn: 224257
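  The rewrite of top-level metadata node definitions described above can be sketched in a few lines of Python. This is only an illustration, not the upgrade script attached to PR21532: the function name `upgrade_metadata_def` and the regular expression are assumptions made for this example, and the sketch deliberately leaves call-intrinsic operands and `CHECK` lines untouched.

```python
import re

# Hypothetical helper, not the script from PR21532.
# Matches a top-level node definition such as `!0 = metadata !{metadata !2}`.
_NODE_DEF = re.compile(r"^(![0-9A-Za-z_.]+ = )metadata (!\{.*\})$")

def upgrade_metadata_def(line):
    """Drop the `metadata` type from a top-level metadata node definition.

    Lines that are not node definitions (for example calls to intrinsics)
    are returned unchanged.
    """
    match = _NODE_DEF.match(line)
    if match is None:
        return line
    head, node = match.groups()
    # Operands that are themselves metadata lose their `metadata` prefix too.
    return head + node.replace("metadata !", "!")

old_defs = [
    "!0 = metadata !{metadata !2}",
    "!1 = metadata !{i32* @global}",
    "!2 = metadata !{metadata !3}",
    "!3 = metadata !{}",
]
for old in old_defs:
    print(upgrade_metadata_def(old))
```

  Restricting the match to definition lines keeps the sketch from touching `metadata` used as a `Value` type in calls, which the commit message says must stay.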
* Fix a trip-count overflow issue in LoopUnroll. | Michael Zolotukhin | 2014-11-20 | 2 | -1/+31
  Currently LoopUnroll generates a prologue loop before the main loop body to execute the first N % UnrollFactor iterations. This loop is also used if the trip count can overflow, which is determined by a runtime check.

  However, we've been mistakenly optimizing this loop into linear code for UnrollFactor = 2, not taking into account that it also serves as a safe version of the loop if its trip count overflows.

  llvm-svn: 222451
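  A small arithmetic illustration of why an overflowing trip count needs a real loop as the safe path. It assumes, for illustration only, an 8-bit counter and a trip count derived as the backedge-taken count plus one; neither detail is stated in the commit message, and the names `WIDTH`, `trip_count` and `unrolled_by_two_plan` are hypothetical.

```python
WIDTH = 8                      # assumed counter width, for illustration only
MASK = (1 << WIDTH) - 1

def trip_count(backedge_taken):
    """Trip count as backedge-taken count + 1, wrapped to WIDTH bits."""
    return (backedge_taken + 1) & MASK

def unrolled_by_two_plan(backedge_taken):
    """Return (prologue_iterations, main_loop_passes) from the wrapped count."""
    tc = trip_count(backedge_taken)
    return tc % 2, tc // 2

# A loop whose backedge is taken 255 times really runs 256 iterations,
# but its 8-bit trip count wraps around to 0.
print(trip_count(255), unrolled_by_two_plan(255))
```

  With the wrapped count, both the prologue and the unrolled main loop are planned for zero iterations, so a prologue flattened into straight-line code cannot stand in for the original loop in this case.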
* [SCEV] Improve Scalar Evolution's use of no {un,}signed wrap flags | Bradley Smith | 2014-10-31 | 1 | -0/+32
  In a case where we have a no {un,}signed wrap flag on the increment, if RHS - Start is constant then we can avoid inserting a max operation between the two, since we can statically determine which is greater.

  This allows us to unroll loops such as:

      void testcase3(int v) {
        for (int i=v; i<=v+1; ++i)
          f(i);
      }

  llvm-svn: 220960
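  The effect of a constant RHS - Start can be checked with a rough model. This is a sketch over unbounded Python integers (so nothing wraps, mirroring the no-wrap assumption); it is not the expression ScalarEvolution actually builds, and all three function names are made up for this example.

```python
def iterations_brute_force(start, rhs):
    """Count iterations of: for (i = start; i <= rhs; ++i), with no wrapping."""
    count, i = 0, start
    while i <= rhs:
        count += 1
        i += 1
    return count

def iterations_with_max(start, rhs):
    """General form: the max guards against rhs + 1 < start (zero iterations)."""
    return max(rhs + 1, start) - start

def iterations_constant_difference(diff):
    """When rhs - start == diff is a known non-negative constant, no max is needed."""
    return diff + 1

# testcase3 has start = v and rhs = v + 1, so rhs - start is the constant 1.
for v in (-3, 0, 7):
    print(v, iterations_brute_force(v, v + 1), iterations_constant_difference(1))
```

  For every `v` the count is 2, which is what makes the loop in `testcase3` a candidate for full unrolling once the max is gone.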
* This patch de-pessimizes the calculation of loop trip counts in | Mark Heffernan | 2014-10-10 | 1 | -9/+6
  ScalarEvolution in the presence of multiple exits. Previously all loop exits had to have identical counts for a loop trip count to be considered computable. This pessimization was implemented by calling getBackedgeTakenCount(L) rather than getExitCount(L, ExitingBlock) inside of ScalarEvolution::getSmallConstantTripCount() (see the FIXME in the comments of that function). The pessimization was added to fix a corner case involving undefined behavior (pr/16130). This patch more precisely handles the undefined behavior case, allowing the pessimization to be removed.

  ControlsExit replaces IsSubExpr to more precisely track the case where undefined behavior is expected to occur. Because undefined behavior is tracked more precisely we can remove MustExit from ExitLimit. MustExit was used to track the case where the limit was computed potentially assuming undefined behavior even if undefined behavior didn't necessarily occur.

  llvm-svn: 219517
* LoopUnroll: Create sub-loops in LoopInfo | Duncan P. N. Exon Smith | 2014-10-07 | 1 | -0/+35
  `LoopUnrollPass` says that it preserves `LoopInfo` -- make it so. In particular, tell `LoopInfo` about copies of inner loops when unrolling the outer loop.

  Conservatively, also tell `ScalarEvolution` to forget about the original versions of these loops, since their inputs may have changed.

  Fixes PR20987.

  llvm-svn: 219241
* Use a loop to simplify the runtime unrolling prologue. | Kevin Qin | 2014-09-29 | 4 | -19/+22
  Runtime unrolling creates a prologue to execute the extra iterations that are left over when the trip count is not divisible by the unroll factor. It generates an if-then-else sequence to jump into a loop body unrolled (factor - 1) times, like

      extraiters = tripcount % loopfactor
      if (extraiters == 0) jump Loop:
      if (extraiters == loopfactor) jump L1
      if (extraiters == loopfactor-1) jump L2
      ...
      L1: LoopBody;
      L2: LoopBody;
      ...
      if tripcount < loopfactor jump End
      Loop:
      ...
      End:

  This means that if the unroll factor is 4, the loop body is copied 7 times: 3 copies in the loop prologue and 4 in the loop.

  This commit uses a loop to execute the extra iterations in the prologue, like

      extraiters = tripcount % loopfactor
      if (extraiters == 0) jump Loop:
      else jump Prol
      Prol: LoopBody;
            extraiters -= 1                 // Omitted if unroll factor is 2.
            if (extraiters != 0) jump Prol: // Omitted if unroll factor is 2.
      if (tripcount < loopfactor) jump End
      Loop:
      ...
      End:

  Then, when the unroll factor is 4, the loop body is copied only 5 times: 1 copy in the prologue loop and 4 in the original loop. If the unroll factor is 2, no new loop is created, just as in the original solution.

  llvm-svn: 218604
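  The prologue-loop scheme can be modelled in Python to check that it visits the same iterations as the original loop. This is a behavioural sketch only, not the code LoopUnroll emits: `run_original`, `run_unrolled`, `copies_if_chain` and `copies_prologue_loop` are hypothetical names, and the inner `for` merely stands in for `factor` textual copies of the loop body.

```python
def run_original(trip_count, body):
    """Reference loop: call body(i) once per iteration."""
    for i in range(trip_count):
        body(i)

def run_unrolled(trip_count, factor, body):
    """Prologue loop runs trip_count % factor iterations, then the main
    loop handles `factor` iterations per pass."""
    extra = trip_count % factor
    i = 0
    while extra != 0:              # prologue loop: a single copy of the body
        body(i)
        i += 1
        extra -= 1
    while i < trip_count:          # main loop: skipped if nothing remains
        for k in range(factor):    # stands in for `factor` copies of the body
            body(i + k)
        i += factor

def copies_if_chain(factor):
    """Body copies with the if-then-else prologue: (factor - 1) + factor."""
    return (factor - 1) + factor

def copies_prologue_loop(factor):
    """Body copies with the prologue loop: 1 + factor."""
    return 1 + factor

seen = []
run_unrolled(10, 4, seen.append)
print(seen, copies_if_chain(4), copies_prologue_loop(4))
```

  For an unroll factor of 4 the two helper functions give 7 and 5 copies, matching the counts quoted in the commit message; for a factor of 2 both give 3.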
* Ignore annotation function calls in cost computation | David Peixotto | 2014-09-26 | 1 | -0/+133
  The annotation instructions are dropped during codegen and have no impact on size. In some cases, the annotations were preventing the unroller from unrolling a loop because the annotation calls were pushing the cost over the unrolling threshold.

  Differential Revision: http://reviews.llvm.org/D5335

  llvm-svn: 218525
* Add functions for finding ephemeral values | Hal Finkel | 2014-09-07 | 1 | -0/+44
  This adds a set of utility functions for collecting 'ephemeral' values. These are LLVM IR values that are used only by @llvm.assume intrinsics (directly or indirectly), and thus will be removed prior to code generation, implying that they should be considered free for certain purposes (like inlining). The inliner's cost analysis, and a few other passes, have been updated to account for ephemeral values using the provided functionality.

  This functionality is important for the usability of @llvm.assume, because it limits the "non-local" side-effects of adding llvm.assume on inlining, loop unrolling, etc. (these are hints, and do not generate code, so they should not directly contribute to estimates of execution cost).

  llvm-svn: 217335
* After unrolling a loop with llvm.loop.unroll.count metadata (unroll factor | Mark Heffernan | 2014-07-24 | 1 | -6/+54
  hint) the loop unroller replaces the llvm.loop.unroll.count metadata with llvm.loop.unroll.disable metadata to prevent any subsequent unrolling passes from unrolling more than the hint indicates. This patch fixes an issue where loop unrolling could be disabled for other loops as well which share the same llvm.loop metadata.

  llvm-svn: 213900
* Do not add unroll disable metadata after unrolling pass for loops with #pragma clang loop unroll(full). | Mark Heffernan | 2014-07-23 | 1 | -14/+46
  llvm-svn: 213789
* In unroll pragma syntax and loop hint metadata, change "enable" forms to a new form using the string "full". | Mark Heffernan | 2014-07-23 | 2 | -41/+11
  llvm-svn: 213772
* Remove unroll pragma metadata after it is used. | Mark Heffernan | 2014-07-18 | 2 | -0/+73
  llvm-svn: 213412
* Rename loop unrolling and loop vectorizer metadata to have a common prefix. | Eli Bendersky | 2014-06-25 | 1 | -4/+4
  [LLVM part]

  These patches rename the loop unrolling and loop vectorizer metadata such that they have a common 'llvm.loop.' prefix. Metadata name changes:

      llvm.vectorizer.* => llvm.loop.vectorizer.*
      llvm.loopunroll.* => llvm.loop.unroll.*

  This was a suggestion from an earlier review (http://reviews.llvm.org/D4090) which added the loop unrolling metadata.

  Patch by Mark Heffernan.

  llvm-svn: 211710
* LoopUnrollRuntime: Check for overflow in the trip count calculation. | Benjamin Kramer | 2014-06-21 | 1 | -0/+6
  Fixes PR19823.

  llvm-svn: 211436
* Teach LoopUnrollPass to respect loop unrolling hints in metadata. | Eli Bendersky | 2014-06-16 | 1 | -0/+285
  [This is resubmitting r210721, which was reverted due to suspected breakage which turned out to be unrelated]. Some extra review comments were addressed.

  See D4090 and D4147 for more details.

  The Clang change that produces this metadata was committed in r210667

  Patch by Mark Heffernan.

  llvm-svn: 211076
* Revert r210721 as it causes breakage in internal builds (and possibly GDB). | Eli Bendersky | 2014-06-12 | 1 | -285/+0
  llvm-svn: 210807
* Teach LoopUnrollPass to respect loop unrolling hints in metadata. | Eli Bendersky | 2014-06-11 | 1 | -0/+285
  See http://reviews.llvm.org/D4090 for more details.

  The Clang change that produces this metadata was committed in r210667

  Patch by Mark Heffernan.

  llvm-svn: 210721
* Reduce verbiage of lit.local.cfg files | Alp Toker | 2014-06-09 | 2 | -4/+2
  We can just split targets_to_build in one place and make it immutable.

  llvm-svn: 210496
* LCSSA should be performed on the outermost affected loop while unrolling loop. | Dinesh Dwivedi | 2014-05-29 | 1 | -0/+43
  During loop-unroll, loop exits from the current loop may end up in a different outer loop. This requires re-forming LCSSA recursively for one level down from the outermost loop where loop exits are landed during unroll. This fixes PR18861.

  Differential Revision: http://reviews.llvm.org/D2976

  llvm-svn: 209796
* Move late partial-unrolling thresholds into the processor definitions | Hal Finkel | 2014-05-08 | 1 | -2/+2
  The old method used by X86TTI to determine partial-unrolling thresholds was messy (because it worked by testing target features), and also would not correctly identify the target CPU if certain target features were disabled. After some discussions on IRC with Chandler et al., it was decided that the processor scheduling models were the right containers for this information (because it is often tied to special uop dispatch-buffer sizes).

  This does represent a small functionality change:

  - For generic x86-64 (which uses the SB model and, thus, will get some unrolling).
  - For AMD cores (because they still currently use the SB scheduling model)
  - For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump the default threshold to 50; we're working on a test case for this).

  Otherwise, nothing has changed for any other targets. The logic, however, has been moved into BasicTTI, so other targets may now also opt-in to this functionality simply by setting LoopMicroOpBufferSize in their processor model definitions.

  llvm-svn: 208289
* LoopUnroll: If we're doing partial unrolling, use the PartialThreshold to limit unrolling. | Benjamin Kramer | 2014-05-04 | 1 | -0/+47
  Otherwise we use the same threshold as for complete unrolling, which is way too high. This made us unroll any loop smaller than 150 instructions by 8 times, but only if someone specified -march=core2 or better, which happens to be the default on darwin.

  llvm-svn: 207940
* Fix vectorization remarks. | Diego Novillo | 2014-04-29 | 1 | -0/+25
  This patch changes the vectorization remarks to also inform when vectorization is possible but not beneficial.

  Added tests to exercise some loop remarks.

  llvm-svn: 207574
* Implement X86TTI::getUnrollingPreferences | Hal Finkel | 2014-04-01 | 2 | -0/+84
  This provides an initial implementation of getUnrollingPreferences for x86. getUnrollingPreferences is used by the generic (concatenation) unroller, which is distinct from the unrolling done by the loop vectorizer.

  Many modern x86 cores have some kind of uop cache and loop-stream detector (LSD) used to efficiently dispatch small loops, and taking full advantage of this requires unrolling small loops (small here means 10s of uops). These caches also have limits on the number of taken branches in the loop, and so we also cap the loop unrolling factor based on the maximum "depth" of the loop. This is currently calculated with a partial DFS traversal (partial because it will stop early if the path length grows too much). This is still an approximation, and one that is both conservative (because it does not account for branches eliminated via block placement) and optimistic (because it is only recording the maximum depth over minimum paths). Nevertheless, because the loops that fit in these uop caches are so small, it is not clear how much the details matter.

  The original set of patches posted for review produced the following test-suite performance results (from the TSVC benchmark) at that time:

      ControlLoops-dbl - 13% speedup
      ControlLoops-flt - 15% speedup
      Reductions-dbl - 7.5% speedup

  llvm-svn: 205348
* Implement TTI getUnrollingPreferences for PowerPC | Hal Finkel | 2013-09-11 | 2 | -0/+52
  The PowerPC A2 core greatly benefits from aggressive concatenation unrolling; use the new getUnrollingPreferences to enable this by default when targeting the PPC A2 core.

  llvm-svn: 190549
* [tests] Cleanup initialization of test suffixes. | Daniel Dunbar | 2013-08-16 | 1 | -1/+0
  - Instead of setting the suffixes in a bunch of places, just set one master list in the top-level config. We now only modify the suffix list in a few suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).
  - Aside from removing the need for a bunch of lit.local.cfg files, this enables 4 tests that were inadvertently being skipped (one in Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been XFAILED).
  - This commit also fixes a bunch of config files to use config.root instead of older copy-pasted code.

  llvm-svn: 188513
* Fixup to r186268 and r186269: don't append -LABEL to CHECK-NOT. No functionality change. | Stephen Lin | 2013-07-14 | 1 | -1/+1
  llvm-svn: 186271
* Update Transforms tests to use CHECK-LABEL for easier debugging. No functionality change. | Stephen Lin | 2013-07-14 | 7 | -20/+20
  This update was done with the following bash script:

      find test/Transforms -name "*.ll" | \
      while read NAME; do
        echo "$NAME"
        if ! grep -q "^; *RUN: *llc" $NAME; then
          TEMP=`mktemp -t temp`
          cp $NAME $TEMP
          sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
          while read FUNC; do
            sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3@$FUNC(/g" $TEMP
          done
          mv $TEMP $NAME
        fi
      done

  llvm-svn: 186268
* Modify two Transforms tests to explicitly check for full function names in some cases, rather than just a common prefix. No functionality change. | Stephen Lin | 2013-07-14 | 1 | -1/+1
  (This is to avoid confusing a scripted mass update of these tests to use CHECK-LABEL)

  llvm-svn: 186267
* Prevent loop-unroll from making assumptions about undefined behavior. | Andrew Trick | 2013-05-31 | 2 | -22/+62
  Fixes rdar:14036816, PR16130.

  There is an opportunity to compute precise trip counts for 'or' expressions and multi-exit loops.
  rdar:14038809: Optimize trip count computation for multi-exit loops.

  To do this we need to record the fact that ExitLimit assumes NSW. When it does not, we can safely assume that the loop trip count is the minimum ExitLimit across all subexpressions and loop exits.

  llvm-svn: 183060
* Revert the test moves from 176733. Use "REQUIRES: asserts" instead. | Jan Wen Voung | 2013-03-12 | 2 | -4/+1
  llvm-svn: 176873
* Disable statistics on Release builds and move tests that depend on -stats. | Jan Wen Voung | 2013-03-08 | 2 | -0/+4
  Summary: Statistics are still available in Release+Asserts (any +Asserts builds), and stats can also be turned on with LLVM_ENABLE_STATS.

  Move some of the FastISel stats that were moved under DEBUG() back out of DEBUG(), since stats are disabled across the board now.

  Many tests depend on grepping "-stats" output. Move those into a orig_dir/Stats/. so that they can be marked as unsupported when building without statistics.

  Differential Revision: http://llvm-reviews.chandlerc.com/D486

  llvm-svn: 176733
* Add a new attribute, 'noduplicate'. If a function contains a noduplicate call, the call cannot be duplicated - Jump threading, loop unrolling, loop unswitching, and loop rotation are inhibited if they would duplicate the call. | James Molloy | 2012-12-20 | 1 | -0/+23
  Similarly inlining of the function is inhibited, if that would duplicate the call (in particular inlining is still allowed when there is only one callsite and the function has internal linkage).

  llvm-svn: 170704
* getSmallConstantTripMultiple should never return zero. | Hal Finkel | 2012-10-24 | 1 | -0/+44
  When the trip count is -1, getSmallConstantTripMultiple could return zero, and this would cause runtime loop unrolling to assert. Instead of returning zero, one is now returned (consistent with the existing overflow cases).

  Fixes PR14167.

  llvm-svn: 166612
* Fix tests that didn't test anything. | Benjamin Kramer | 2012-09-26 | 1 | -1/+1
  llvm-svn: 164686
* Fix 12513: Loop unrolling breaks with indirect branches. | Andrew Trick | 2012-04-10 | 1 | -0/+40
  Take this opportunity to generalize the indirectbr bailout logic for loop transformations. CFG transformations will never get indirectbr right, and there's no point trying.

  llvm-svn: 154386
* Add testcase for r154007, when a function has the optsize attribute, | Hongbin Zheng | 2012-04-04 | 1 | -0/+35
  the loop should be unrolled according to the value of OptSizeUnrollThreshold.

  llvm-svn: 154014
* Remove redundant -enable-iv-rewrite=false flags from test cases. | Andrew Trick | 2012-03-22 | 1 | -1/+1
  llvm-svn: 153255
* Replace all instances of dg.exp file with lit.local.cfg, since all tests are run with LIT now and not Dejagnu. | Eli Bendersky | 2012-02-16 | 2 | -3/+1
  dg.exp is no longer needed.

  Patch reviewed by Daniel Dunbar. It will be followed by additional cleanup patches.

  llvm-svn: 150664
* Add -unroll-runtime for unrolling loops with run-time trip counts. | Andrew Trick | 2011-12-09 | 4 | -0/+214
  Patch by Brendon Cahoon!

  This extends the existing LoopUnroll and LoopUnrollPass. Brendon measured no regressions in the llvm test suite with -unroll-runtime enabled.

  This implementation works by using the existing loop unrolling code to unroll the loop by a power-of-two (default 8). It generates an if-then-else sequence of code prior to the loop to execute the extra iterations before entering the unrolled loop.

  llvm-svn: 146245
* Fix a corner case in updating LoopInfo after fully unrolling an outer loop. | Andrew Trick | 2011-11-18 | 1 | -0/+41
  The loop tree's inclusive block lists are painful and expensive to update. (I have no idea why they're inclusive). The design was supposed to handle this case but the implementation missed it and my unit tests weren't thorough enough.

  Fixes PR11335: loop unroll update.

  llvm-svn: 144970
* Don't try to loop on iterators that are potentially invalidated inside the loop. | Nick Lewycky | 2011-11-12 | 1 | -0/+42
  Fixes PR11361!

  llvm-svn: 144454
* Unit test for r140919, loop unroll heuristics. | Andrew Trick | 2011-10-04 | 1 | -0/+36
  llvm-svn: 141049
* Reapply r139759. Disable IV rewriting by default. See PR10916. | Andrew Trick | 2011-09-15 | 1 | -1/+1
  llvm-svn: 139842
* [indvars] Revert r139579 until 401.bzip -arch i386 miscompilation is fixed. PR10920. | Andrew Trick | 2011-09-13 | 1 | -1/+1
  llvm-svn: 139583
* Disable IV rewriting by default. See PR10916. | Andrew Trick | 2011-09-13 | 1 | -1/+1
  llvm-svn: 139579
* Rename -disable-iv-rewrite to -enable-iv-rewrite=false in preparation for default change. | Andrew Trick | 2011-09-12 | 1 | -1/+1
  llvm-svn: 139517
* Test case update for unroll-scev. | Andrew Trick | 2011-09-02 | 2 | -8/+13
  llvm-svn: 139037
* -unroll-scev flag removal | Andrew Trick | 2011-09-02 | 4 | -4/+4
  llvm-svn: 139010
* ConstantVector returns arbitrary value for the wrong index. | Jakub Staszak | 2011-09-02 | 1 | -0/+29
  This fixes PR10813.

  llvm-svn: 139006
* A slew of unit tests for the recent LoopInfo::updateUnloop feature | Andrew Trick | 2011-08-11 | 1 | -0/+429
  checked in at r137276 and r137341.

  llvm-svn: 137385