path: root/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
Commit message | Author | Age | Files | Lines
* Make MemoryBuiltins aware of TargetLibraryInfo. | Benjamin Kramer | 2012-08-29 | 1 | -11/+13
    This disables malloc-specific optimization when -fno-builtin (or -ffreestanding) is specified. This has been a problem for a long time but became more severe with the recent memory builtin improvements.

    Since the memory builtin functions are used everywhere, this required passing TLI in many places. This means that functions that now have an optional TLI argument, like RecursivelyDeleteTriviallyDeadInstructions, won't remove dead mallocs anymore if the TLI argument is missing. I've updated most passes to do the right thing.

    Fixes PR13694 and probably others.
    llvm-svn: 162841
* Clean whitespaces. | Nadav Rotem | 2012-07-24 | 1 | -1/+1
    llvm-svn: 160668
* Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.h | Chandler Carruth | 2012-06-29 | 1 | -5/+5
    This was always part of the VMCore library out of necessity -- it deals entirely in the IR. The .cpp file in fact was already part of the VMCore library. This is just a mechanical move.

    I've tried to go through and re-apply the coding standard's preferred header sort, but at 40-ish files, I may have gotten some wrong. Please let me know if so.

    I'll be committing the corresponding updates to Clang and Polly, and Duncan has DragonEgg.

    Thanks to Bill and Eric for giving the green light for this bit of cleanup.
    llvm-svn: 159421
* Correct grammar. | Eli Friedman | 2011-09-13 | 1 | -1/+1
    llvm-svn: 139565
* Change a bunch of isVolatile() checks to check for atomic load/store as well. | Eli Friedman | 2011-09-12 | 1 | -2/+2
    No tests; these changes aren't really interesting in the sense that the logic is the same for volatile and atomic.

    I believe this completes all of the changes necessary for the optimizer to handle loads and stores correctly. I'm going to try and come up with some additional testing, though.
    llvm-svn: 139533
* land David Blaikie's patch to de-constify Type, with a few tweaks. | Chris Lattner | 2011-07-18 | 1 | -2/+2
    llvm-svn: 135375
* Disable loop idiom recognition of memset/memcpy if the function being compiled | Chad Rosier | 2011-07-15 | 1 | -0/+5
    is named after a common idiom (i.e., memset/memcpy). Otherwise, we can run into infinite recursion. Ideally, the user should use the correct -fno-builtin flag, but in case they don't we should play nicely.
    rdar://9763412
    llvm-svn: 135286
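The recursion hazard described above can be sketched as follows. This is only an illustration under stated assumptions: `naive_memset` and `check_naive_memset` are hypothetical names chosen so the sketch can coexist with the C library. The problematic case is when a byte-store loop like this one is the body of `memset` itself, because replacing the loop with a call to `memset` would make the function call itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical freestanding-style implementation. If this function were
// named memset, turning its loop into a memset call would make it recurse.
void *naive_memset(void *dst, int value, std::size_t n) {
  unsigned char *p = static_cast<unsigned char *>(dst);
  for (std::size_t i = 0; i != n; ++i)
    p[i] = static_cast<unsigned char>(value);  // strided byte store
  return dst;
}

// Fills a small buffer and checks every byte was written.
void check_naive_memset() {
  unsigned char buf[8] = {0};
  naive_memset(buf, 0xAB, sizeof(buf));
  for (unsigned char b : buf)
    assert(b == 0xAB);
}
```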
* SCEVExpander: give new insts a name that identifies the responsible pass. | Andrew Trick | 2011-06-28 | 1 | -2/+2
    llvm-svn: 133992
* whitespace | Andrew Trick | 2011-06-28 | 1 | -8/+8
    llvm-svn: 133991
* Fix PR9815: I was trying to get out of "generating code and then | Chris Lattner | 2011-05-22 | 1 | -44/+66
    failing to form a memset, then having to delete it" but my approximation isn't safe for self recurrent loops. Instead of doing a hack, just do it the right way.
    llvm-svn: 131858
* preserve line number info. | Devang Patel | 2011-05-04 | 1 | -2/+3
    llvm-svn: 130869
* Added SCEV::NoWrapFlags to manage unsigned, signed, and self wrap | Andrew Trick | 2011-03-14 | 1 | -4/+4
    properties. Added the self-wrap flag for SCEV::AddRecExpr. A slew of temporary FIXMEs indicate the intention of the no-self-wrap flag without changing behavior in this revision.
    llvm-svn: 127590
* whitespace | Andrew Trick | 2011-03-14 | 1 | -66/+66
    llvm-svn: 127589
* Preserve line no. info. | Devang Patel | 2011-03-07 | 1 | -2/+2
    Radar 9097659
    llvm-svn: 127182
* fix a crasher in disabled code (on variable stride loops) | Chris Lattner | 2011-02-21 | 1 | -1/+1
    llvm-svn: 126125
* Add some (disabled code) to print out negative strides. | Chris Lattner | 2011-02-21 | 1 | -3/+15
    llvm-svn: 126102
* rewrite the memset_pattern pattern generation stuff to accept any 2/4/8/16-byte | Chris Lattner | 2011-02-19 | 1 | -32/+12
    constant, including globals. This makes us generate much more "pretty" pattern globals as well because it doesn't break it down to an array of bytes all the time.

    This enables us to handle stores of relocatable globals. This kicks in about 48 times in 254.gap, giving us stuff like this:

        @.memset_pattern40 = internal constant [2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*] [%struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)* @IsFalse, %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)* @IsFalse], align 16
        ...
        call void @memset_pattern16(i8* %scevgep5859, i8* bitcast ([2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*]* @.memset_pattern40 to i8*), i64 %tmp75) nounwind

    llvm-svn: 126044
* Implement rdar://9009151, transforming strided loop stores of | Chris Lattner | 2011-02-19 | 1 | -32/+125
    unsplatable values into memset_pattern16 when it is available (recent darwins). This transforms lots of strided loop stores of ints for example, like 5 in vpr:

        Formed memset: call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25)
          from store to: {%3,+,4}<%11> at: store i32 3, i32* %scevgep, align 4, !tbaa !4

    llvm-svn: 126040
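As a rough illustration of the transformation above: storing the 32-bit value 3 in a strided loop writes the same bytes as repeating a 16-byte pattern made of four such values. The sketch below assumes a portable stand-in, `fill_pattern16`, for Darwin's `memset_pattern16`. The stand-in and the other function names are hypothetical and are not part of LLVM or of any C library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for memset_pattern16: fills len bytes of dst by
// repeating the 16 bytes at pattern16 (a trailing partial copy is allowed).
void fill_pattern16(void *dst, const void *pattern16, std::size_t len) {
  unsigned char *out = static_cast<unsigned char *>(dst);
  const unsigned char *pat = static_cast<const unsigned char *>(pattern16);
  for (std::size_t i = 0; i != len; ++i)
    out[i] = pat[i % 16];
}

// The kind of loop the pass recognizes: 3 is not a single repeated byte,
// so a plain memset cannot express it.
void store_loop(std::int32_t *p, std::size_t n) {
  for (std::size_t i = 0; i != n; ++i)
    p[i] = 3;
}

// Equivalent form using a 16-byte pattern holding four copies of the value.
void store_pattern(std::int32_t *p, std::size_t n) {
  static const std::int32_t pattern[4] = {3, 3, 3, 3};
  fill_pattern16(p, pattern, n * sizeof(std::int32_t));
}

// Runs both forms on identical buffers and checks that they agree.
void check_pattern_equivalence() {
  std::vector<std::int32_t> a(10, -1), b(10, -1);
  store_loop(a.data(), a.size());
  store_pattern(b.data(), b.size());
  assert(a == b);
}
```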
* Make loop-idiom use TargetLibraryInfo to determine whether it is allowed | Chris Lattner | 2011-02-18 | 1 | -1/+18
    to hack on memset, memcpy etc.
    llvm-svn: 125974
* Spelling fix: consequtive -> consecutive. | Duncan Sands | 2011-02-15 | 1 | -1/+1
    llvm-svn: 125563
* Teach loop-idiom to turn a loop containing a memset into a larger memset | Chris Lattner | 2011-01-04 | 1 | -18/+69
    when safe. The testcase is basically this nested loop:

        void foo(char *X) {
          for (int i = 0; i != 100; ++i)
            for (int j = 0; j != 100; ++j)
              X[j+i*100] = 0;
        }

    which gets turned into a single memset now. clang -O3 doesn't optimize this yet though due to a phase ordering issue I haven't analyzed yet.
    llvm-svn: 122806
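A hand-written version of what the nested loop above reduces to, as a sketch rather than the pass's actual output: the inner loop writes 100 contiguous bytes and the outer loop advances by exactly 100 bytes, so together the stores tile one 100*100-byte region. The names `foo_loops`, `foo_memset`, and `check_foo_equivalence` are introduced here for illustration only.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// The nested loop from the commit message, unchanged in behavior.
void foo_loops(char *X) {
  for (int i = 0; i != 100; ++i)
    for (int j = 0; j != 100; ++j)
      X[j + i * 100] = 0;
}

// Equivalent single call covering the same 10000 bytes.
void foo_memset(char *X) { std::memset(X, 0, 100 * 100); }

// Runs both versions on identical buffers and checks that they agree.
void check_foo_equivalence() {
  std::vector<char> a(100 * 100, 'x'), b(100 * 100, 'x');
  foo_loops(a.data());
  foo_memset(b.data());
  assert(a == b);
}
```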
* restructure this a bit. Initialize the WeakVH with "I", the | Chris Lattner | 2011-01-04 | 1 | -11/+14
    instruction *after* the store. The store will always be deleted if the transformation kicks in, so we'd do an N^2 scan of every loop block. Whoops.
    llvm-svn: 122805
* use the very-handy getTruncateOrZeroExtend helper function, and | Chris Lattner | 2011-01-04 | 1 | -14/+6
    stop setting NSW: signed overflow is possible. Thanks to Dan for pointing these out.
    llvm-svn: 122790
* Fix comment. | Owen Anderson | 2011-01-03 | 1 | -1/+1
    llvm-svn: 122788
* reduce redundancy in the hashing code and other misc cleanups. | Chris Lattner | 2011-01-03 | 1 | -1/+1
    llvm-svn: 122720
* add DEBUG and -stats output to earlycse. | Chris Lattner | 2011-01-02 | 1 | -3/+4
    Teach it to CSE the rest of the non-side-effecting instructions.
    llvm-svn: 122716
* fix a miscompilation of tramp3d-v4: when forming a memcpy, we have to make | Chris Lattner | 2011-01-02 | 1 | -12/+23
    sure that the loop we're promoting into a memcpy doesn't mutate the input of the memcpy. Before we were just checking that the dest of the memcpy wasn't mod/ref'd by the loop.
    llvm-svn: 122712
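A small example of why the source check matters. It is an illustration, not the tramp3d-v4 case itself, and the function names are hypothetical. In the loop below each iteration reads an element that the previous iteration just wrote, so the loop smears the first element across the array, whereas a block copy of the original contents shifts the elements by one. `std::memmove` is used for the block copy because the regions overlap and `std::memcpy` would be undefined there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Loop whose stores modify the data that later iterations load.
std::vector<int> smear_loop(std::vector<int> a) {
  for (std::size_t i = 0; i + 1 < a.size(); ++i)
    a[i + 1] = a[i];
  return a;
}

// Block copy of the original contents, shifted up by one element.
std::vector<int> shift_copy(std::vector<int> a) {
  if (a.size() > 1)
    std::memmove(a.data() + 1, a.data(), (a.size() - 1) * sizeof(int));
  return a;
}

// The two are not interchangeable: the results differ.
void check_loop_differs_from_copy() {
  const std::vector<int> in = {1, 2, 3, 4};
  assert((smear_loop(in) == std::vector<int>{1, 1, 1, 1}));
  assert((shift_copy(in) == std::vector<int>{1, 1, 2, 3}));
}
```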
* If a loop iterates exactly once (has backedge count = 0) then don't | Chris Lattner | 2011-01-02 | 1 | -0/+6
    mess with it. We'd rather peel/unroll it than convert all of its stores into memsets.
    llvm-svn: 122711
* enhance loop idiom recognition to scan *all* unconditionally executed | Chris Lattner | 2011-01-02 | 1 | -8/+39
    blocks in a loop, instead of just the header block. This makes it more aggressive, able to handle Duncan's Ada examples.
    llvm-svn: 122704
* add a list of opportunities for future improvement. | Chris Lattner | 2011-01-02 | 1 | -1/+22
    llvm-svn: 122701
* Allow loop-idiom to run on multiple BB loops, but still only scan the loop | Chris Lattner | 2011-01-02 | 1 | -5/+5
    header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for loops" into 2 basic block loops that loop-idiom was ignoring.

    With this fix, we form many *many* more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this:

        for (j=0; j<MAX_history; ++j) {
          history_new[i][j+1] = history[2*i][j];
        }

    Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo.
    llvm-svn: 122685
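The "history" loop above copies one row into another at an offset of one element. A minimal sketch of the loop and its memcpy form follows, assuming the two rows never overlap, which is what makes the rewrite legal. `kMaxHistory`, the element type, and the function names are assumptions made for illustration and are not taken from the benchmark.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t kMaxHistory = 8;  // hypothetical stand-in for MAX_history

// Element-by-element copy in the shape of the benchmark loop.
void copy_row_loop(int *dst, const int *src) {
  for (std::size_t j = 0; j < kMaxHistory; ++j)
    dst[j + 1] = src[j];
}

// Same copy as one call; valid only because dst and src do not overlap.
void copy_row_memcpy(int *dst, const int *src) {
  std::memcpy(dst + 1, src, kMaxHistory * sizeof(int));
}

// Runs both versions on separate buffers and checks that they agree.
void check_copy_equivalence() {
  std::vector<int> src(kMaxHistory);
  for (std::size_t j = 0; j < kMaxHistory; ++j)
    src[j] = static_cast<int>(j) * 10;
  std::vector<int> a(kMaxHistory + 1, -1), b(kMaxHistory + 1, -1);
  copy_row_loop(a.data(), src.data());
  copy_row_memcpy(b.data(), src.data());
  assert(a == b);
  assert(a[0] == -1);  // element 0 is never written by either version
}
```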
* remove debugging code. | Chris Lattner | 2011-01-02 | 1 | -4/+0
    llvm-svn: 122683
* add some -stats output. | Chris Lattner | 2011-01-02 | 1 | -1/+10
    llvm-svn: 122682
* teach loop idiom recognition to form memcpy's from simple loops. | Chris Lattner | 2011-01-02 | 1 | -22/+102
    llvm-svn: 122678
* add a validity check that was missed, fixing a crash on the | Chris Lattner | 2011-01-01 | 1 | -0/+5
    new testcase.
    llvm-svn: 122662
* improve validity check to handle constant-trip-count loops more | Chris Lattner | 2011-01-01 | 1 | -7/+17
    aggressively. In practice, this doesn't help anything though, see the todo.
    llvm-svn: 122660
* implement the "no aliasing accesses in loop" safety check. This pass | Chris Lattner | 2011-01-01 | 1 | -5/+32
    should be correct now.
    llvm-svn: 122659
* simplify this, isBytewiseValue handles the extra check. We still | Chris Lattner | 2010-12-28 | 1 | -5/+2
    check for "multiple of a byte" in size to make it clear that the >> 3 below is safe.
    llvm-svn: 122604
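To make the "bytewise value" idea concrete: a stored value can become a memset only if every byte of it is the same, since memset repeats a single byte. The helper below is a simplified sketch for 32-bit integers. It is not LLVM's isBytewiseValue, and the names `splat_byte` and `check_splat_byte` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Returns the repeated byte if all four bytes of v are equal, otherwise
// nothing. 0x01010101 qualifies (byte 0x01); 3 does not (bytes 03 00 00 00).
std::optional<std::uint8_t> splat_byte(std::uint32_t v) {
  const std::uint8_t low = static_cast<std::uint8_t>(v & 0xFFu);
  const std::uint32_t splat = low * 0x01010101u;  // low byte copied 4 times
  if (splat == v)
    return low;
  return std::nullopt;
}

// Spot checks on values that can and cannot be expressed as a memset.
void check_splat_byte() {
  assert(splat_byte(0x01010101u).has_value());
  assert(splat_byte(0x01010101u).value() == 0x01);
  assert(splat_byte(0u).value() == 0x00);
  assert(!splat_byte(3u).has_value());
}
```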
* Silence gcc warning about an unused variable when doing a release build. | Duncan Sands | 2010-12-28 | 1 | -0/+1
    llvm-svn: 122593
* fix some issues Frits noticed, add AliasAnalysis as a dependency | Chris Lattner | 2010-12-27 | 1 | -7/+17
    llvm-svn: 122585
* have loop-idiom nuke instructions that feed stores that get removed. | Chris Lattner | 2010-12-27 | 1 | -6/+45
    llvm-svn: 122574
* implement enough of the memset inference algorithm to recognize and insert | Chris Lattner | 2010-12-26 | 1 | -11/+78
    memsets. This is still missing one important validity check, but this is enough to compile stuff like this:

        void test0(std::vector<char> &X) {
          for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I)
            *I = 0;
        }

        void test1(std::vector<int> &X) {
          for (long i = 0, e = X.size(); i != e; ++i)
            X[i] = 0x01010101;
        }

    With:

        $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc

    to:

        __Z5test0RSt6vectorIcSaIcEE:    ## @_Z5test0RSt6vectorIcSaIcEE
        ## BB#0:                        ## %entry
                subq    $8, %rsp
                movq    (%rdi), %rax
                movq    8(%rdi), %rsi
                cmpq    %rsi, %rax
                je      LBB0_2
        ## BB#1:                        ## %bb.nph
                subq    %rax, %rsi
                movq    %rax, %rdi
                callq   ___bzero
        LBB0_2:                         ## %for.end
                addq    $8, %rsp
                ret
        ...
        __Z5test1RSt6vectorIiSaIiEE:    ## @_Z5test1RSt6vectorIiSaIiEE
        ## BB#0:                        ## %entry
                subq    $8, %rsp
                movq    (%rdi), %rax
                movq    8(%rdi), %rdx
                subq    %rax, %rdx
                cmpq    $4, %rdx
                jb      LBB1_2
        ## BB#1:                        ## %for.body.preheader
                andq    $-4, %rdx
                movl    $1, %esi
                movq    %rax, %rdi
                callq   _memset
        LBB1_2:                         ## %for.end
                addq    $8, %rsp
                ret

    llvm-svn: 122573
* sketch more of this out. | Chris Lattner | 2010-12-26 | 1 | -20/+64
    llvm-svn: 122567
* actually add the file... | Chris Lattner | 2010-12-26 | 1 | -0/+103
    llvm-svn: 122563