summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/LoopUnroll/runtime-loop.ll
Commit message (Collapse)AuthorAgeFilesLines
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-171-0/+285
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-171-285/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* [LoopUnroll] allow customization for new-pass-manager version of LoopUnrollFedor Sergeev2018-10-311-1/+33
| | | | | | | | | | | | | | | | | Unlike its legacy counterpart new pass manager's LoopUnrollPass does not provide any means to select which flavors of unroll to run (runtime, peeling, partial), relying on global defaults. In some cases having ability to run a restricted LoopUnroll that does more than LoopFullUnroll is needed. Introduced LoopUnrollOptions to select optional unroll behaviors. Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing. Reviewers: chandlerc, tejohnson Differential Revision: https://reviews.llvm.org/D53440 llvm-svn: 345723
* [LoopUnroll] NFC. Factor out runtime-loop.ll common test behavior.Fedor Sergeev2018-10-291-19/+24
| | | | | | | Adding COMMON prefix to get common part handled there. Needed to simplify test changes for D53440. llvm-svn: 345538
* [PM] Split LoopUnrollPass and make partial unroller a function passTeresa Johnson2017-08-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | Summary: This is largely NFC*, in preparation for utilizing ProfileSummaryInfo and BranchFrequencyInfo analyses. In this patch I am only doing the splitting for the New PM, but I can do the same for the legacy PM as a follow-on if this looks good. *Not NFC since for partial unrolling we lose the updates done to the loop traversal (adding new sibling and child loops) - according to Chandler this is not very useful for partial unrolling, but it also means that the debugging flag -unroll-revisit-child-loops no longer works for partial unrolling. Reviewers: chandlerc Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36157 llvm-svn: 309886
* [LoopUnrollRuntime] NFC: Refactored safety checks of unrolling multi-exit loopAnna Thomas2017-07-121-0/+68
| | | | | | | | | | | Refactored the code and separated out a function `canSafelyUnrollMultiExitLoop` to reduce redundant checks and make it easier to add profitability heuristics later. Added tests to runtime unrolling to make sure that unrolling for multi-exit loops is not done unless the option -unroll-runtime-multi-exit is true. llvm-svn: 307843
* [PM] Teach LoopUnroll to update the LPM infrastructure as it unrollsChandler Carruth2017-01-251-0/+3
| | | | | | | | | | | | | | | | | | | loops. We do this by reconstructing the newly added loops after the unroll completes to avoid threading pass manager details through all the mess of the unrolling infrastructure. I've enabled some extra assertions in the LPM to try and catch issues here and enabled a bunch of unroller tests to try and make sure this is sane. Currently, I'm manually running loop-simplify when needed. That should go away once it is folded into the LPM infrastructure. Differential Revision: https://reviews.llvm.org/D28848 llvm-svn: 293011
* [LoopUnroll] Switch the default value of -unroll-runtime-epilog back to its ↵Michael Zolotukhin2016-08-021-1/+1
| | | | | | | | | | original value. As agreed in post-commit review of r265388, I'm switching the flag to its original value until the 90% runtime performance regression on SingleSource/Benchmarks/Stanford/Bubblesort is addressed. llvm-svn: 277524
* [LoopUnroll] Ensure we create prolog loops in simplified form.Michael Zolotukhin2016-08-021-1/+1
| | | | llvm-svn: 277502
* The patch fixes PR27392.Evgeny Stupachenko2016-04-271-2/+2
| | | | | | | | | | | | | | | | | Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662
* Adds the ability to use an epilog remainder loop during loop unrolling and makesDavid L Kreitzer2016-04-051-18/+42
| | | | | | | | | | this the default behavior. Patch by Evgeny Stupachenko (evstupac@gmail.com). Differential Revision: http://reviews.llvm.org/D18158 llvm-svn: 265388
* [LoopUnrollRuntime] Avoid high-cost trip count computation.Sanjoy Das2015-04-141-0/+2
| | | | | | | | | | | | | | | | | Summary: Runtime unrolling of loops needs to emit an expression to compute the loop's runtime trip-count. Avoid runtime unrolling if this computation will be expensive. Depends on D8993. Reviewers: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8994 llvm-svn: 234846
* Introduce runtime unrolling disable matadata and use it to mark the scalar ↵Kevin Qin2015-03-091-0/+33
| | | | | | | | | | | loop from vectorization. Runtime unrolling is an expensive optimization which can bring benefit only if the loop is hot and iteration number is relatively large enough. For some loops, we know they are not worth to be runtime unrolled. The scalar loop from vectorization is one of the cases. llvm-svn: 231631
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float*> %x, ... ->getelementptr float, <4 x float*> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. update.py: import fileinput import sys import re ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name *.ll | xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll | xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786
* Partial fix for bug 22589Sanjoy Das2015-02-181-3/+1
| | | | | | | | | | | Don't spend the entire iteration space in the scalar loop prologue if computing the trip count overflows. This change also gets rid of the backedge check in the prologue loop and the extra check for overflowing trip-count. Differential Revision: http://reviews.llvm.org/D7715 llvm-svn: 229731
* IR: Add 'distinct' MDNodes to bitcode and assemblyDuncan P. N. Exon Smith2015-01-081-1/+1
| | | | | | | | | | | | | | | | | | Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly. Currently, no one actually calls `MDNode::getDistinct()`, so these nodes only get created for: - self-references, which are never uniqued, and - nodes whose operands are replaced that hit a uniquing collision. The concept of distinct nodes is still not quite first-class, since distinct-ness doesn't yet survive across `MapMetadata()`. Part of PR22111. llvm-svn: 225474
* IR: Make metadata typeless in assemblyDuncan P. N. Exon Smith2014-12-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802. - Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`. - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics. So, assembly like this: define @foo(i32 %v) { call void @llvm.foo(metadata !{i32 %v}, metadata !0) call void @llvm.foo(metadata !{i32 7}, metadata !0) call void @llvm.foo(metadata !1, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{metadata !3}, metadata !0) ret void, !bar !2 } !0 = metadata !{metadata !2} !1 = metadata !{i32* @global} !2 = metadata !{metadata !3} !3 = metadata !{} turns into this: define @foo(i32 %v) { call void @llvm.foo(metadata i32 %v, metadata !0) call void @llvm.foo(metadata i32 7, metadata !0) call void @llvm.foo(metadata i32* @global, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{!3}, metadata !0) ret void, !bar !2 } !0 = !{!2} !1 = !{i32* @global} !2 = !{!3} !3 = !{} I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases. This is part of PR21532. llvm-svn: 224257
* Use a loop to simplify the runtime unrolling prologue.Kevin Qin2014-09-291-11/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Runtime unrolling will create a prologue to execute the extra iterations which is can't divided by the unroll factor. It generates an if-then-else sequence to jump into a factor -1 times unrolled loop body, like extraiters = tripcount % loopfactor if (extraiters == 0) jump Loop: if (extraiters == loopfactor) jump L1 if (extraiters == loopfactor-1) jump L2 ... L1: LoopBody; L2: LoopBody; ... if tripcount < loopfactor jump End Loop: ... End: It means if the unroll factor is 4, the loop body will be 7 times unrolled, 3 are in loop prologue, and 4 are in the loop. This commit is to use a loop to execute the extra iterations in prologue, like extraiters = tripcount % loopfactor if (extraiters == 0) jump Loop: else jump Prol Prol: LoopBody; extraiters -= 1 // Omitted if unroll factor is 2. if (extraiters != 0) jump Prol: // Omitted if unroll factor is 2. if (tripcount < loopfactor) jump End Loop: ... End: Then when unroll factor is 4, the loop body will be copied by only 5 times, 1 in the prologue loop, 4 in the original loop. And if the unroll factor is 2, new loop won't be created, just as the original solution. llvm-svn: 218604
* LoopUnrollRuntime: Check for overflow in the trip count calculation.Benjamin Kramer2014-06-211-0/+6
| | | | | | Fixes PR19823. llvm-svn: 211436
* Add -unroll-runtime for unrolling loops with run-time trip counts.Andrew Trick2011-12-091-0/+109
Patch by Brendon Cahoon! This extends the existing LoopUnroll and LoopUnrollPass. Brendon measured no regressions in the llvm test suite with -unroll-runtime enabled. This implementation works by using the existing loop unrolling code to unroll the loop by a power-of-two (default 8). It generates an if-then-else sequence of code prior to the loop to execute the extra iterations before entering the unrolled loop. llvm-svn: 146245
OpenPOWER on IntegriCloud