Commit message | Author | Age | Files | Lines
...
* A workaround for a bug in cmake 2.8.3 diagnosed on PR 8885. | Oscar Fuentes | 2011-01-02 | 1 | -0/+5
  llvm-svn: 122706
* Also remove functions that use complex constant expressions in terms of | Nick Lewycky | 2011-01-02 | 1 | -5/+18
  another function.
  llvm-svn: 122705
* enhance loop idiom recognition to scan *all* unconditionally executed | Chris Lattner | 2011-01-02 | 2 | -8/+62
  blocks in a loop, instead of just the header block. This makes it more aggressive, able to handle Duncan's Ada examples.
  llvm-svn: 122704
* make inSubLoop much more efficient. | Chris Lattner | 2011-01-02 | 1 | -4/+1
  llvm-svn: 122703
* rip out isExitBlockDominatedByBlockInLoop, calling DomTree::dominates instead. | Chris Lattner | 2011-01-02 | 1 | -37/+4
  isExitBlockDominatedByBlockInLoop is a relic of the days when domtree was *just* a tree and didn't have DFS numbers. Checking DFS numbers is faster and easier than "limiting the search of the tree".
  llvm-svn: 122702
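The DFS-number check mentioned in r122702 can be illustrated with a small sketch. This is not LLVM's DominatorTree; `ToyDomTree` and its members are hypothetical names, and the sketch assumes the tree is already built and numbered once before any query. Each node gets an entry and an exit number from one depth-first walk, and A dominates B exactly when B's interval nests inside A's.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a dominator tree; not LLVM's DominatorTree class.
struct ToyDomTree {
  std::vector<std::vector<int>> children; // children[n]: nodes immediately dominated by n
  std::vector<int> dfsIn, dfsOut;         // entry/exit numbers, -1 until numbered

  explicit ToyDomTree(int numNodes)
      : children(numNodes), dfsIn(numNodes, -1), dfsOut(numNodes, -1) {}

  void addChild(int parent, int child) { children[parent].push_back(child); }

  // Assign entry/exit numbers with a single depth-first walk from the root.
  void numberFrom(int root) {
    int counter = 0;
    number(root, counter);
  }

  // A dominates B iff B lies in A's subtree, i.e. B's interval nests in A's.
  // Two integer comparisons replace a walk up or down the tree.
  bool dominates(int a, int b) const {
    assert(dfsIn[a] >= 0 && dfsIn[b] >= 0 && "call numberFrom() first");
    return dfsIn[a] <= dfsIn[b] && dfsOut[b] <= dfsOut[a];
  }

private:
  void number(int node, int &counter) {
    dfsIn[node] = counter++;
    for (int c : children[node])
      number(c, counter);
    dfsOut[node] = counter++;
  }
};
```

With numbering done up front, each query costs constant time regardless of tree depth, which is the property the commit message relies on.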
* add a list of opportunities for future improvement. | Chris Lattner | 2011-01-02 | 1 | -1/+22
  llvm-svn: 122701
* update a bunch of entries. | Chris Lattner | 2011-01-02 | 2 | -137/+56
  llvm-svn: 122700
* Fix PR8702 by not having LoopSimplify claim to preserve LCSSA form. As | Duncan Sands | 2011-01-02 | 2 | -15/+55
  described in the PR, the pass could break LCSSA form when inserting preheaders. It probably would be easy enough to fix this, but since currently we always go into LCSSA form after running this pass, doing so is not urgent.
  llvm-svn: 122695
* Remove an unused member function. | Cameron Zwarich | 2011-01-02 | 1 | -3/+0
  llvm-svn: 122693
* Propagate to parent scope changes made to CMAKE_CXX_FLAGS. | Oscar Fuentes | 2011-01-02 | 1 | -0/+1
  llvm-svn: 122692
* Fix a typo in a variable name. | Cameron Zwarich | 2011-01-02 | 1 | -3/+3
  llvm-svn: 122691
* Move a load into the only branch where it is used and eliminate a temporary. | Cameron Zwarich | 2011-01-02 | 1 | -3/+1
  llvm-svn: 122690
* Add the explanatory comment from r122680's commit message to the code itself. | Cameron Zwarich | 2011-01-02 | 1 | -0/+10
  llvm-svn: 122689
* Tidy up indentation. | Cameron Zwarich | 2011-01-02 | 1 | -5/+5
  llvm-svn: 122688
* Fix a typo, which should also fix the failure on llvm-x86_64-linux-checks. | Cameron Zwarich | 2011-01-02 | 1 | -1/+1
  llvm-svn: 122687
* Remove obsolete comments. | Francois Pichet | 2011-01-02 | 1 | -7/+0
  llvm-svn: 122686
* Allow loop-idiom to run on multiple BB loops, but still only scan the loop | Chris Lattner | 2011-01-02 | 3 | -13/+29
  header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for loops" into 2 basic block loops that loop-idiom was ignoring.
  With this fix, we form many *many* more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this:
      for (j=0; j<MAX_history; ++j) {
        history_new[i][j+1] = history[2*i][j];
      }
  Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo.
  llvm-svn: 122685
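To make the transformation in r122685 concrete, here is a hand-written sketch of the "history" loop next to the memcpy it is equivalent to. The element type (`int`), the value of `MAX_history`, and the use of `std::vector`/`std::array` are assumptions for illustration; the commit message does not give the benchmark's declarations. The rewrite is only valid because source and destination do not overlap, which is the case here since they live in separate containers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t MAX_history = 8;        // assumed size, not from the benchmark
using Row = std::array<int, MAX_history + 1>; // assumed element type; +1 because the loop writes [j+1]

// The loop as quoted in the commit message, one element per iteration.
void copyWithLoop(std::vector<Row> &history_new, const std::vector<Row> &history,
                  std::size_t i) {
  assert(i < history_new.size() && 2 * i < history.size());
  for (std::size_t j = 0; j < MAX_history; ++j)
    history_new[i][j + 1] = history[2 * i][j];
}

// What the loop amounts to: one contiguous copy of MAX_history elements,
// starting at column 1 of the destination row and column 0 of the source row.
void copyWithMemcpy(std::vector<Row> &history_new, const std::vector<Row> &history,
                    std::size_t i) {
  assert(i < history_new.size() && 2 * i < history.size());
  std::memcpy(&history_new[i][1], &history[2 * i][0], MAX_history * sizeof(int));
}
```

Both functions leave column 0 of the destination row untouched, so a pass doing this rewrite has to get the start offsets and byte count exactly right, not just recognize "a copy loop".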
* Remove the #ifdef'd code for balancing the eval-link data structure. It doesn't | Cameron Zwarich | 2011-01-02 | 1 | -65/+3
  compile, and everyone's tests have shown it to be slower in practice, even for quite large graphs. I also hope to do an optimization that is only correct with the simpler data structure, which would break this even further.
  llvm-svn: 122684
* remove debugging code. | Chris Lattner | 2011-01-02 | 1 | -4/+0
  llvm-svn: 122683
* add some -stats output. | Chris Lattner | 2011-01-02 | 1 | -1/+10
  llvm-svn: 122682
* improve loop rotation to use CodeMetrics to analyze the | Chris Lattner | 2011-01-02 | 2 | -17/+8
  size of a loop header instead of its own code size estimator. This allows it to handle bitcasts etc more precisely.
  llvm-svn: 122681
* Speed up dominator computation some more by optimizing bucket processing. When | Cameron Zwarich | 2011-01-02 | 2 | -14/+24
  naively implemented, the Lengauer-Tarjan algorithm requires a separate bucket for each vertex. However, this is unnecessary, because each vertex is only placed into a single bucket (that of its semidominator), and each vertex's bucket is processed before it is added to any bucket itself.
  Instead of using a bucket per vertex, we use a single array Buckets that has two purposes. Before the vertex V with DFS number i is processed, Buckets[i] stores the index of the first element in V's bucket. After V's bucket is processed, Buckets[i] stores the index of the next element in the bucket to which V now belongs, if any.
  Reading from the buckets can also be optimized. Instead of processing the bucket of V's parent at the end of processing V, we process the bucket of V itself at the beginning of processing V. This means that the case of the root vertex can be simplified somewhat. It also means that we don't need to look up the DFS number of the semidominator of every node in the bucket we are processing, since we know it is the current index being processed.
  This is a 6.5% speedup running -domtree on test-suite + SPEC2000/2006, with larger speedups of around 12% on the larger benchmarks like GCC.
  llvm-svn: 122680
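The single-array bucket scheme described in r122680 can be sketched in isolation. The code below is not the LLVM implementation and does not compute semidominators; it assumes they are handed in as `semi[v]` (a hypothetical input), with vertices identified by DFS numbers 1..N, index 0 reserved as the end-of-list marker, and `semi[v] < v` for every non-root vertex. It shows only how one array serves first as a list head and then as a next link.

```cpp
#include <cassert>
#include <vector>

// For each vertex i, returns the members found in i's bucket when i is processed.
// semi[0] is unused; semi[1] (the root) is ignored.
std::vector<std::vector<unsigned>> drainBuckets(const std::vector<unsigned> &semi) {
  assert(!semi.empty() && "index 0 must exist as the sentinel slot");
  const unsigned N = static_cast<unsigned>(semi.size()) - 1;
  std::vector<unsigned> Buckets(N + 1, 0); // 0 means "empty" / "end of list"
  std::vector<std::vector<unsigned>> drained(N + 1);

  // Vertices are processed in decreasing DFS order.
  for (unsigned i = N; i >= 1; --i) {
    // Before i is processed, Buckets[i] is the head of i's own bucket.
    // Every member j is larger than i and already processed, so Buckets[j]
    // holds its "next" link.
    for (unsigned j = Buckets[i]; j != 0; j = Buckets[j])
      drained[i].push_back(j);

    if (i == 1)
      break; // the root has no semidominator and joins no bucket

    const unsigned s = semi[i];
    assert(s >= 1 && s < i && "semidominator must have a smaller DFS number");
    // After i's bucket is drained, its slot is free: reuse it as the link
    // to the previous head of s's bucket, then make i the new head.
    Buckets[i] = Buckets[s];
    Buckets[s] = i;
  }
  return drained;
}
```

Because a vertex's own bucket is always drained before the vertex is pushed onto its semidominator's bucket, the two roles of `Buckets[i]` never overlap in time, which is why no per-vertex container is needed.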
* Add support for passing variables declared to use a xmm register to asm | Rafael Espindola | 2011-01-02 | 2 | -1/+34
  statements using the "x" constraint.
  llvm-svn: 122679
* teach loop idiom recognition to form memcpy's from simple loops. | Chris Lattner | 2011-01-02 | 2 | -22/+130
  llvm-svn: 122678
* Remove functions from the FnSet when one of their callee's is being merged. This | Nick Lewycky | 2011-01-02 | 1 | -82/+66
  maintains the guarantee that the DenseSet expects two elements it contains to not go from inequal to equal under its nose. As a side-effect, this also lets us switch from iterating to a fixed-point to actually maintaining a work queue of functions to look at again, and we don't add thunks to our work queue so we don't need to detect and ignore them.
  llvm-svn: 122677
* a missed __builtin_object_size case. | Chris Lattner | 2011-01-01 | 1 | -0/+17
  llvm-svn: 122676
* various updates. | Chris Lattner | 2011-01-01 | 1 | -31/+29
  llvm-svn: 122675
* fix a globalopt crash on two Adobe-C++ testcases that the recent | Chris Lattner | 2011-01-01 | 2 | -0/+14
  loop idiom pass exposed.
  llvm-svn: 122674
* Fix darwin bots. | Rafael Espindola | 2011-01-01 | 1 | -1/+1
  llvm-svn: 122672
* Remove empty directories left behind by git-svn users. | Benjamin Kramer | 2011-01-01 | 0 | -0/+0
  llvm-svn: 122671
* Produce a better error message for invalid register names. | Rafael Espindola | 2011-01-01 | 3 | -5/+10
  llvm-svn: 122670
* Fix typo and add comment. | Rafael Espindola | 2011-01-01 | 1 | -5/+8
  llvm-svn: 122669
* More empty directory removal. | Benjamin Kramer | 2011-01-01 | 0 | -0/+0
  llvm-svn: 122668
* Add support for the 'H' modifier. | Rafael Espindola | 2011-01-01 | 2 | -0/+18
  llvm-svn: 122667
* Update the test | Anton Korobeynikov | 2011-01-01 | 1 | -1/+1
  llvm-svn: 122666
* Remove empty directories. | Nick Lewycky | 2011-01-01 | 0 | -0/+0
  llvm-svn: 122665
* turn on memset idiom recognition by default. Though there are still lots of | Chris Lattner | 2011-01-01 | 1 | -0/+1
  limitations, this kicks in dozens of times in the 4 specfp2000 benchmarks, and hundreds of times in the int part. It also kicks in hundreds of times in multisource. This kicks in right before loop deletion, which has the pleasant effect of deleting loops that *just* do a memset.
  llvm-svn: 122664
* Model operand restrictions of mul-like instructions on ARMv5 via | Anton Korobeynikov | 2011-01-01 | 4 | -10/+100
  earlyclobber stuff. This should fix PRs 2313 and 8157. Unfortunately, no testcase, since it'd be dependent on register assignments.
  llvm-svn: 122663
* add a validity check that was missed, fixing a crash on the | Chris Lattner | 2011-01-01 | 2 | -0/+28
  new testcase.
  llvm-svn: 122662
* Revert commit 122654 at the request of Chris, who reckons that instsimplify | Duncan Sands | 2011-01-01 | 3 | -133/+63
  is the wrong hammer for this nail, and is probably right.
  llvm-svn: 122661
* improve validity check to handle constant-trip-count loops more | Chris Lattner | 2011-01-01 | 2 | -8/+44
  aggressively. In practice, this doesn't help anything though, see the todo.
  llvm-svn: 122660
* implement the "no aliasing accesses in loop" safety check. This pass | Chris Lattner | 2011-01-01 | 2 | -5/+55
  should be correct now.
  llvm-svn: 122659
* Fix PR8878. | Rafael Espindola | 2011-01-01 | 2 | -0/+8
  llvm-svn: 122658
* Correct a bunch of mistakes which meant that the example pass didn't | Duncan Sands | 2011-01-01 | 1 | -8/+8
  even compile, let alone work.
  llvm-svn: 122657
* I was unable to get the instructions to work if LLVM was built | Duncan Sands | 2011-01-01 | 1 | -2/+4
  using a separate objects directory.
  llvm-svn: 122656
* Clarify that the loadable module turns up in the top-level directory, | Duncan Sands | 2011-01-01 | 1 | -4/+5
  not locally.
  llvm-svn: 122655
* Fix a README item by having InstructionSimplify do a mild form of value | Duncan Sands | 2011-01-01 | 3 | -63/+133
  numbering, in which it considers (for example) "%a = add i32 %x, %y" and "%b = add i32 %x, %y" to be equal because the operands are equal and the result of the instructions only depends on the values of the operands. This has almost no effect (it removes 4 instructions from gcc-as-one-file), and perhaps slows down compilation: I measured a 0.4% slowdown on the large gcc-as-one-file testcase, but it wasn't statistically significant.
  llvm-svn: 122654
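The idea behind r122654 (treating two instructions as equal when opcode and operands match) can be shown with a toy table-based pass. This is a generic value-numbering sketch, not what InstructionSimplify did in that commit; `ToyInst` and `findEqualInsts` are hypothetical names. It assumes every instruction is pure, so its result depends only on its operands, and it ignores commutativity and types.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, string-based stand-in for an IR instruction.
struct ToyInst {
  std::string result, opcode, lhs, rhs;
};

// Maps each redundant result name to the earlier, equivalent result.
std::map<std::string, std::string> findEqualInsts(const std::vector<ToyInst> &insts) {
  using Key = std::tuple<std::string, std::string, std::string>;
  std::map<Key, std::string> seen;                // (opcode, lhs, rhs) -> first result
  std::map<std::string, std::string> replacement; // duplicate -> canonical

  for (const ToyInst &I : insts) {
    assert(!I.result.empty() && "every toy instruction needs a result name");
    auto inserted = seen.emplace(Key(I.opcode, I.lhs, I.rhs), I.result);
    if (!inserted.second) // same opcode and operands were seen before
      replacement[I.result] = inserted.first->second;
  }
  return replacement;
}
```

For the example in the commit message, `%b = add %x, %y` maps to `%a`, so uses of `%b` could be rewritten to `%a` and the second add dropped.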
* ptx: remove reg-reg addressing mode and st.const | Che-Liang Chiou | 2011-01-01 | 4 | -39/+15
  llvm-svn: 122653
* ptx: add store instruction | Che-Liang Chiou | 2011-01-01 | 5 | -4/+179
  llvm-svn: 122652
* Add a reference to the OCamlLangImpl8. | Erick Tryzelaar | 2011-01-01 | 1 | -1/+1
  llvm-svn: 122651