summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/LoopIdiom
Commit message (Collapse)AuthorAgeFilesLines
...
* LoopIdiom: Give globals for memset_pattern16 private linkage.Benjamin Kramer2015-03-031-0/+7
| | | | | | | | There's really no reason to have them have entries in the symbol table anymore. Old versions of ld64 had some bugs in this area but those have been fixed long ago. llvm-svn: 231041
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-273-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-277-36/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float*> %x, ... ->getelementptr float, <4 x float*> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. update.py: import fileinput import sys import re ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name *.ll | xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll | xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786
* IR: Move MDLocation into placeDuncan P. N. Exon Smith2015-01-141-5/+5
| | | | | | | | | | | | | | | | | | | | This commit moves `MDLocation`, finishing off PR21433. There's an accompanying clang commit for frontend testcases. I'll attach the testcase upgrade script I used to PR21433 to help out-of-tree frontends/backends. This changes the schema for `DebugLoc` and `DILocation` from: !{i32 3, i32 7, !7, !8} to: !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8) Note that empty fields (line/column: 0 and inlinedAt: null) don't get printed by the assembly writer. llvm-svn: 226048
* IR: Make metadata typeless in assemblyDuncan P. N. Exon Smith2014-12-151-22/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802. - Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`. - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics. So, assembly like this: define @foo(i32 %v) { call void @llvm.foo(metadata !{i32 %v}, metadata !0) call void @llvm.foo(metadata !{i32 7}, metadata !0) call void @llvm.foo(metadata !1, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{metadata !3}, metadata !0) ret void, !bar !2 } !0 = metadata !{metadata !2} !1 = metadata !{i32* @global} !2 = metadata !{metadata !3} !3 = metadata !{} turns into this: define @foo(i32 %v) { call void @llvm.foo(metadata i32 %v, metadata !0) call void @llvm.foo(metadata i32 7, metadata !0) call void @llvm.foo(metadata i32* @global, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{!3}, metadata !0) ret void, !bar !2 } !0 = !{!2} !1 = !{i32* @global} !2 = !{!3} !3 = !{} I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases. This is part of PR21532. llvm-svn: 224257
* Revert "Revert "DI: Fold constant arguments into a single MDString""Duncan P. N. Exon Smith2014-10-031-12/+12
| | | | | | | | | | | | | | | | | | | | | | This reverts commit r218918, effectively reapplying r218914 after fixing an Ocaml bindings test and an Asan crash. The root cause of the latter was a tightened-up check in `DILexicalBlock::Verify()`, so I'll file a PR to investigate who requires the loose check (and why). Original commit message follows. -- This patch addresses the first stage of PR17891 by folding constant arguments together into a single MDString. Integers are stringified and a `\0` character is used as a separator. Part of PR17891. Note: I've attached my testcases upgrade scripts to the PR. If I've just broken your out-of-tree testcases, they might help. llvm-svn: 219010
* Revert "DI: Fold constant arguments into a single MDString"Duncan P. N. Exon Smith2014-10-021-12/+12
| | | | | | This reverts commit r218914 while I investigate some bots. llvm-svn: 218918
* DI: Fold constant arguments into a single MDStringDuncan P. N. Exon Smith2014-10-021-12/+12
| | | | | | | | | | | | | This patch addresses the first stage of PR17891 by folding constant arguments together into a single MDString. Integers are stringified and a `\0` character is used as a separator. Part of PR17891. Note: I've attached my testcases upgrade scripts to the PR. If I've just broken your out-of-tree testcases, they might help. llvm-svn: 218914
* Move the complex address expression out of DIVariable and into an extraAdrian Prantl2014-10-011-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! Note: I accidentally committed a bogus older version of this patch previously. llvm-svn: 218787
* Revert r218778 while investigating buldbot breakage.Adrian Prantl2014-10-011-5/+5
| | | | | | "Move the complex address expression out of DIVariable and into an extra" llvm-svn: 218782
* Move the complex address expression out of DIVariable and into an extraAdrian Prantl2014-10-011-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! llvm-svn: 218778
* R600: Implement TTI:getPopcntSupportMatt Arsenault2014-07-182-0/+107
| | | | | | | The test is just copied from X86, and I don't know of a better way to test it. llvm-svn: 213351
* Reduce verbiage of lit.local.cfg filesAlp Toker2014-06-091-2/+1
| | | | | | We can just split targets_to_build in one place and make it immutable. llvm-svn: 210496
* Debug Info: update testing cases to specify the debug info version number.Manman Ren2013-11-231-0/+2
| | | | | | | | | | We are going to drop debug info without a version number or with a different version number, to make sure we don't crash when we see bitcode files with different debug info metadata format. Make tests more robust by removing hard-coded metadata numbers in CHECK lines. llvm-svn: 195535
* Teach loop-idiom about address space pointer sizesMatt Arsenault2013-09-111-0/+91
| | | | llvm-svn: 190491
* Debug Info Testing: updated to use NULL instead of "i32 0" in a few fields.Manman Ren2013-09-061-2/+2
| | | | | | | | Field 2 of DIType (Context), field 9 of DIDerivedType (TypeDerivedFrom), field 12 of DICompositeType (ContainingType), fields 2, 7, 12 of DISubprogram (Context, Type, ContainingType). llvm-svn: 190205
* Debug Info: add an identifier field to DICompositeType.Manman Ren2013-08-261-1/+1
| | | | | | | | | | | | | | | | | | DICompositeType will have an identifier field at position 14. For now, the field is set to null in DIBuilder. For DICompositeTypes where the template argument field (the 13th field) was optional, modify DIBuilder to make sure the template argument field is set. Now DICompositeType has 15 fields. Update DIBuilder to use NULL instead of "i32 0" for null value of a MDNode. Update verifier to check that DICompositeType has 15 fields and the last field is null or a MDString. Update testing cases to include an extra field for DICompositeType. The identifier field will be used by type uniquing so a front end can genearte a DICompositeType with a unique identifer. llvm-svn: 189282
* [tests] Cleanup initialization of test suffixes.Daniel Dunbar2013-08-162-3/+0
| | | | | | | | | | | | | | | | | - Instead of setting the suffixes in a bunch of places, just set one master list in the top-level config. We now only modify the suffix list in a few suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py). - Aside from removing the need for a bunch of lit.local.cfg files, this enables 4 tests that were inadvertently being skipped (one in Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been XFAILED). - This commit also fixes a bunch of config files to use config.root instead of older copy-pasted code. llvm-svn: 188513
* Debug Info: enable verifier for testing cases.Manman Ren2013-07-291-1/+1
| | | | llvm-svn: 187375
* Debug Info Verifier: verify SPs in llvm.dbg.sp.Manman Ren2013-07-271-9/+10
| | | | | | | | Also always add DIType, DISubprogram and DIGlobalVariable to the list in DebugInfoFinder without checking them, so we can verify them later on. llvm-svn: 187285
* add -disable-debug-info-verifier to 3 test to fix tests with pipefail.Rafael Espindola2013-07-241-1/+1
| | | | llvm-svn: 187064
* Update Transforms tests to use CHECK-LABEL for easier debugging. No ↵Stephen Lin2013-07-142-17/+17
| | | | | | | | | | | | | | | | | | | | | | functionality change. This update was done with the following bash script: find test/Transforms -name "*.ll" | \ while read NAME; do echo "$NAME" if ! grep -q "^; *RUN: *llc" $NAME; then TEMP=`mktemp -t temp` cp $NAME $TEMP sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \ while read FUNC; do sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3@$FUNC(/g" $TEMP done mv $TEMP $NAME fi done llvm-svn: 186268
* PR14904: Segmentation fault running pass 'Recognize loop idioms'Shuxin Yang2013-01-101-0/+20
| | | | | | | | The root cause is mistakenly taking for granted that "dyn_cast<Instruction>(a-Value)" return a non-NULL instruction. llvm-svn: 172145
* Fix a mistaken commit that included some debugging code.David Tweed2013-01-071-1/+1
| | | | llvm-svn: 171734
* There was a switch fall-through in the parser for textual LLVM that causedDavid Tweed2013-01-071-1/+1
| | | | | | | | bogus comparison operands to default to eq/oeq. Fix that, fix a couple of tests that accidentally passed and test for bogus comparison opeartors explicitly. llvm-svn: 171733
* - Re-enable population count loop idiom recognization Shuxin Yang2012-12-092-0/+126
| | | | | | | - fix a bug which cause sigfault. - add two testing cases which was causing crash llvm-svn: 169687
* Revert the patches adding a popcount loop idiom recognition pass.Chandler Carruth2012-12-082-82/+0
| | | | | | | | | | | | | | There are still bugs in this pass, as well as other issues that are being worked on, but the bugs are crashers that occur pretty easily in the wild. Test cases have been sent to the original commit's review thread. This reverts the commits: r169671: Fix a logic error. r169604: Move the popcnt tests to an X86 subdirectory. r168931: Initial commit adding the pass. llvm-svn: 169683
* The test unconditionally assumes a particular cpu has a backend build in the ↵David Tweed2012-12-072-0/+6
| | | | | | | | | | target. Buildbots for some hosts may choose to build only their own backend in order to maximise testing-turnaround time. Move the test into a prefixed directory so lit's standard "backend specific" suppression can be done. llvm-svn: 169604
* rdar://12100355 (part 1)Shuxin Yang2012-11-291-0/+76
| | | | | | | | | | | | | | This revision attempts to recognize following population-count pattern: while(a) { c++; ... ; a &= a - 1; ... }, where <c> and <a>could be used multiple times in the loop body. TODO: On X8664 and ARM, __buildin_ctpop() are not expanded to a efficent instruction sequence, which need to be improved in the following commits. Reviewed by Nadav, really appreciate! llvm-svn: 168931
* Add a testcase to loop-idiom to cover PR14241 when we start handlingChandler Carruth2012-11-021-0/+33
| | | | | | strided loops again. llvm-svn: 167287
* Revert the switch of loop-idiom to use the new dependence analysis.Chandler Carruth2012-11-025-218/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The new analysis is not yet ready for prime time. It has a *critical* flawed assumption, and some troubling shortages of testing. Until it's been hammered into better shape, let's stick with the working code. This should be easy to revert itself when the analysis is ready. Fixes PR14241, a miscompile of any memcpy-able loop which uses a pointer as the induction mechanism. If you have been seeing miscompiles in this revision range, you really want to test with this backed out. The results of this miscompile are a bit subtle as they can lead to downstream passes concluding things are impossible which are in fact possible. Thanks to David Blaikie for the majority of the reduction of this miscompile. I'll be checking in the test case in a non-revert commit. Revesions reverted here: r167045: LoopIdiom: Fix a serious missed optimization: we only turned top-level loops into memmove. r166877: LoopIdiom: Add checks to avoid turning memmove into an infinite loop. r166875: LoopIdiom: Recognize memmove loops. r166874: LoopIdiom: Replace custom dependence analysis with DependenceAnalysis. llvm-svn: 167286
* LCSSA: Add a workaround for another nasty SCEV cache invalidation issue.Benjamin Kramer2012-10-311-0/+74
| | | | | | | I'm not entirely happy with this solution, but I don't see a smarter way currently. Fixes PR14214. llvm-svn: 167112
* DependenceAnalysis: Don't crash if there is no constant operand.Benjamin Kramer2012-10-311-0/+25
| | | | | | This makes the code match the comments. Resolves a crash in loop idiom (PR14219). llvm-svn: 167110
* LoopIdiom: Fix a serious missed optimization: we only turned top-level loops ↵Benjamin Kramer2012-10-301-0/+42
| | | | | | | | into memmove. Thanks to Preston Briggs for catching this! llvm-svn: 167045
* LoopIdiom: Add checks to avoid turning memmove into an infinite loop.Benjamin Kramer2012-10-271-1/+52
| | | | | | I don't think this is possible with the current implementation but that may change eventually. llvm-svn: 166877
* LoopIdiom: Recognize memmove loops.Benjamin Kramer2012-10-271-0/+22
| | | | | | | | | | | This turns loops like for (unsigned i = 0; i != n; ++i) p[i] = p[i+1]; into memmove, which has a highly optimized implementation in most libcs. This was really easy with the new DependenceAnalysis :) llvm-svn: 166875
* LoopIdiom: Replace custom dependence analysis with DependenceAnalysis.Benjamin Kramer2012-10-272-0/+102
| | | | | | | | | | | Requires a lot less code and complexity on loop-idiom's side and the more precise analysis can catch more cases, like the one I included as a test case. This also fixes the edge-case miscompilation from PR9481. Compile time performance seems to be slightly worse, but this is mostly due to an extra LCSSA run scheduled by the PassManager and should be fixed there. llvm-svn: 166874
* Revert r166390 "LoopIdiom: Replace custom dependence analysis with ↵Benjamin Kramer2012-10-212-102/+0
| | | | | | | | | | | | | LoopDependenceAnalysis." It passes all tests, produces better results than the old code but uses the wrong pass, LoopDependenceAnalysis, which is old and unmaintained. "Why is it still in tree?", you might ask. The answer is obviously: "To confuse developers." Just swapping in the new dependency pass sends the pass manager into an infinte loop, I'll try to figure out why tomorrow. llvm-svn: 166399
* LoopIdiom: Replace custom dependence analysis with LoopDependenceAnalysis.Benjamin Kramer2012-10-212-0/+102
| | | | | | | | | | Requires a lot less code and complexity on loop-idiom's side and the more precise analysis can catch more cases, like the one I included as a test case. This also fixes the edge-case miscompilation from PR9481. I'm not entirely sure that all cases are handled that the old checks handled but LDA will certainly become smarter in the future. llvm-svn: 166390
* LoopIdiom: Give up when the loop is not in canonical form.Benjamin Kramer2012-09-211-0/+34
| | | | | | | | | We rely on it when doing the transforms. This can happen when there is an indirectbr in the loop. Fixes PR13892. llvm-svn: 164383
* Replace all instances of dg.exp file with lit.local.cfg, since all tests are ↵Eli Bendersky2012-02-162-3/+1
| | | | | | | | run with LIT now and now Dejagnu. dg.exp is no longer needed. Patch reviewed by Daniel Dunbar. It will be followed by additional cleanup patches. llvm-svn: 150664
* Stop emitting instructions with the name "tmp" they eat up memory and have ↵Benjamin Kramer2011-09-271-2/+2
| | | | | | | | to be uniqued, without any benefit. If someone prefers %tmp42 to %42, run instnamer. llvm-svn: 140634
* A real testcase for r135286.Chad Rosier2011-07-151-36/+19
| | | | llvm-svn: 135299
* Add testcase for r135286.Chad Rosier2011-07-151-0/+47
| | | | llvm-svn: 135291
* Fix PR9815: I was trying to get out of "generating code and thenChris Lattner2011-05-221-0/+37
| | | | | | | | failing to form a memset, then having to delete it" but my approximation isn't safe for self recurrent loops. Instead of doign a hack, just do it the right way. llvm-svn: 131858
* Preserve line no. info.Devang Patel2011-03-071-0/+49
| | | | | | Radar 9097659 llvm-svn: 127182
* rewrite the memset_pattern pattern generation stuff to accept any 2/4/8/16-byteChris Lattner2011-02-191-2/+27
| | | | | | | | | | | | | | | | | | constant, including globals. This makes us generate much more "pretty" pattern globals as well because it doesn't break it down to an array of bytes all the time. This enables us to handle stores of relocatable globals. This kicks in about 48 times in 254.gap, giving us stuff like this: @.memset_pattern40 = internal constant [2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*] [%struct.TypHeader* (%struct.TypHeader*, %struct .TypHeader*)* @IsFalse, %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)* @IsFalse], align 16 ... call void @memset_pattern16(i8* %scevgep5859, i8* bitcast ([2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*]* @.memset_pattern40 to i8* ), i64 %tmp75) nounwind llvm-svn: 126044
* Stores of null pointers should turn into memset, we weren't recognizingChris Lattner2011-02-191-0/+22
| | | | | | them as splat values. llvm-svn: 126041
* Implement rdar://9009151, transforming strided loop stores ofChris Lattner2011-02-191-0/+27
| | | | | | | | | | | unsplatable values into memset_pattern16 when it is available (recent darwins). This transforms lots of strided loop stores of ints for example, like 5 in vpr: Formed memset: call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25) from store to: {%3,+,4}<%11> at: store i32 3, i32* %scevgep, align 4, !tbaa !4 llvm-svn: 126040
* Teach loop-idiom to turn a loop containing a memset into a larger memsetChris Lattner2011-01-041-0/+33
| | | | | | | | | | | | | | | | when safe. The testcase is basically this nested loop: void foo(char *X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i*100] = 0; } which gets turned into a single memset now. clang -O3 doesn't optimize this yet though due to a phase ordering issue I haven't analyzed yet. llvm-svn: 122806
OpenPOWER on IntegriCloud