summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/LoopIdiom
Commit message (Collapse)AuthorAgeFilesLines
...
* fix a miscompilation of tramp3d-v4: when forming a memcpy, we have to makeChris Lattner2011-01-021-0/+33
| | | | | | | | sure that the loop we're promoting into a memcpy doesn't mutate the input of the memcpy. Before we were just checking that the dest of the memcpy wasn't mod/ref'd by the loop. llvm-svn: 122712
* If a loop iterates exactly once (has backedge count = 0) then don'tChris Lattner2011-01-021-0/+18
| | | | | | | mess with it. We'd rather peel/unroll it than convert all of its stores into memsets. llvm-svn: 122711
* enhance loop idiom recognition to scan *all* unconditionally executedChris Lattner2011-01-021-0/+23
| | | | | | | blocks in a loop, instead of just the header block. This makes it more aggressive, able to handle Duncan's Ada examples. llvm-svn: 122704
* Allow loop-idiom to run on multiple BB loops, but still only scan the loop Chris Lattner2011-01-021-0/+24
| | | | | | | | | | | | | | | | | | header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for loops" into 2 basic block loops that loop-idiom was ignoring. With this fix, we form many *many* more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this: for (j=0; j<MAX_history; ++j) { history_new[i][j+1] = history[2*i][j]; } Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo. llvm-svn: 122685
* teach loop idiom recognition to form memcpy's from simple loops.Chris Lattner2011-01-021-0/+28
| | | | llvm-svn: 122678
* add a validity check that was missed, fixing a crash on theChris Lattner2011-01-011-0/+23
| | | | | | new testcase. llvm-svn: 122662
* improve validity check to handle constant-trip-count loops moreChris Lattner2011-01-011-1/+27
| | | | | | | aggressively. In practice, this doesn't help anything though, see the todo. llvm-svn: 122660
* implement the "no aliasing accesses in loop" safety check. This passChris Lattner2011-01-011-0/+23
| | | | | | should be correct now. llvm-svn: 122659
* implement enough of the memset inference algorithm to recognize and insert Chris Lattner2010-12-262-0/+47
memsets. This is still missing one important validity check, but this is enough to compile stuff like this: void test0(std::vector<char> &X) { for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I) *I = 0; } void test1(std::vector<int> &X) { for (long i = 0, e = X.size(); i != e; ++i) X[i] = 0x01010101; } With: $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc to: __Z5test0RSt6vectorIcSaIcEE: ## @_Z5test0RSt6vectorIcSaIcEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rsi cmpq %rsi, %rax je LBB0_2 ## BB#1: ## %bb.nph subq %rax, %rsi movq %rax, %rdi callq ___bzero LBB0_2: ## %for.end addq $8, %rsp ret ... __Z5test1RSt6vectorIiSaIiEE: ## @_Z5test1RSt6vectorIiSaIiEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rdx subq %rax, %rdx cmpq $4, %rdx jb LBB1_2 ## BB#1: ## %for.body.preheader andq $-4, %rdx movl $1, %esi movq %rax, %rdi callq _memset LBB1_2: ## %for.end addq $8, %rsp ret llvm-svn: 122573
OpenPOWER on IntegriCloud