path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* Clone loop. | Devang Patel | 2007-08-10 | 1 | -5/+25
  llvm-svn: 40998
* Add utility to clone loops. | Devang Patel | 2007-08-10 | 1 | -0/+149
  llvm-svn: 40997
* Remove unnecessary duplication. | Devang Patel | 2007-08-10 | 1 | -14/+1
  llvm-svn: 40979
* Calculate exit and start value of true loop and false loop respectively. | Devang Patel | 2007-08-10 | 1 | -2/+28
  llvm-svn: 40978
* ExitCondition and Induction variable are loop constraints | Devang Patel | 2007-08-10 | 1 | -71/+145
  not split condition constraints.
  llvm-svn: 40977
* when we see an unaligned load from an insufficiently aligned global or | Chris Lattner | 2007-08-09 | 1 | -21/+39
  alloca, increase the alignment of the load, turning it into an aligned load.

  This allows us to compile:

      #include <xmmintrin.h>
      __m128i foo(__m128i x){
        static const unsigned int c_0[4] = { 0, 0, 0, 0 };
        __m128i v_Zero = _mm_loadu_si128((__m128i*)c_0);
        x = _mm_unpacklo_epi8(x, v_Zero);
        return x;
      }

  into:

      _foo:
          punpcklbw _c_0.5944, %xmm0
          ret
          .data
          .lcomm _c_0.5944,16,4 # c_0.5944

  instead of:

      _foo:
          movdqu _c_0.5944, %xmm1
          punpcklbw %xmm1, %xmm0
          ret
          .data
          .lcomm _c_0.5944,16,2 # c_0.5944

  llvm-svn: 40971
* Make NonLocal and None const in the right way. :-) | Owen Anderson | 2007-08-09 | 3 | -8/+8
  llvm-svn: 40961
* Traverse loop blocks' terminators to find split candidates. | Devang Patel | 2007-08-09 | 1 | -48/+106
  llvm-svn: 40960
* Add cost analysis. | Devang Patel | 2007-08-08 | 1 | -11/+41
  llvm-svn: 40952
* Preserve dom info while processing one iteration loop. | Devang Patel | 2007-08-08 | 1 | -0/+17
  llvm-svn: 40947
* Change the None and NonLocal markers in memdep to be const. | Owen Anderson | 2007-08-08 | 3 | -8/+8
  llvm-svn: 40946
* Clear split info. | Devang Patel | 2007-08-08 | 1 | -0/+11
  llvm-svn: 40944
* Handle multiple split conditions. | Devang Patel | 2007-08-08 | 1 | -60/+93
  llvm-svn: 40941
* Global values also don't undead-ify pointers in our dead alloca's set. | Owen Anderson | 2007-08-08 | 1 | -0/+3
  llvm-svn: 40936
* Make handleEndBlock significantly faster with one trivial improvement, | Owen Anderson | 2007-08-08 | 1 | -4/+30
  and one hack to avoid hitting a bad case when the alias analysis is imprecise.
  llvm-svn: 40935
* Small improvement: if a function doesn't access memory, we don't need to scan | Owen Anderson | 2007-08-08 | 1 | -2/+8
  it for potentially undeading pointers.
  llvm-svn: 40933
* Add some comments, remove a dead argument, and simplify some control flow. | Owen Anderson | 2007-08-08 | 1 | -19/+28
  No functionality change.
  llvm-svn: 40932
* A few more small cleanups. | Owen Anderson | 2007-08-08 | 1 | -9/+7
  llvm-svn: 40922
* First round of cleanups from Chris' feedback. | Owen Anderson | 2007-08-08 | 1 | -51/+58
  llvm-svn: 40919
* Embrace patch review feedback. | Devang Patel | 2007-08-08 | 1 | -23/+23
  llvm-svn: 40915
* Fix new compare instruction's signedness. Caught by Chris during review. | Devang Patel | 2007-08-07 | 1 | -5/+7
  llvm-svn: 40912
* Don't insert nearly as many redundant phi nodes. | Owen Anderson | 2007-08-07 | 1 | -2/+25
  llvm-svn: 40909
* Use eraseFromParent(). | Devang Patel | 2007-08-07 | 1 | -4/+2
  llvm-svn: 40903
* Fix comment typo | David Greene | 2007-08-07 | 1 | -1/+1
  llvm-svn: 40898
* Fix GLIBCXX_DEBUG error triggered by incrementing erased iterator. | David Greene | 2007-08-07 | 1 | -4/+4
  llvm-svn: 40897
* Begin loop index split pass. | Devang Patel | 2007-08-07 | 1 | -0/+384
  llvm-svn: 40883
* It's safe to fold not of fcmp. | Nick Lewycky | 2007-08-06 | 1 | -3/+8
  llvm-svn: 40870
* Make this code more efficient. | David Greene | 2007-08-06 | 1 | -5/+5
  llvm-svn: 40861
* remove some dead lines | Chris Lattner | 2007-08-06 | 1 | -2/+0
  llvm-svn: 40859
* Silence some warnings from doxygen about @param argument name not matching the | Reid Spencer | 2007-08-05 | 1 | -3/+3
  actual argument name of the documented function.
  llvm-svn: 40851
* at the end of instcombine, explicitly clear WorklistMap. | Chris Lattner | 2007-08-05 | 1 | -7/+9
  This shrinks it down to something small. On the testcase from PR1432, this
  speeds up instcombine from 0.7959s to 0.5000s, (59%)
  llvm-svn: 40840
* rewrite the code used to construct pruned SSA form with the IDF method. | Chris Lattner | 2007-08-04 | 1 | -82/+114
  In the old way, we computed and inserted phi nodes for the whole IDF of the
  definitions of the alloca, then computed which ones were dead and removed them.

  In the new method, we first compute the region where the value is live, and use
  that information to only insert phi nodes that are live. This eliminates the
  need to compute liveness later, and stops the algorithm from inserting a bunch
  of phis which it then later removes.

  This speeds up the testcase in PR1432 from 2.00s to 0.15s (14x) in a release
  build and 6.84s->0.50s (14x) in a debug build.
  llvm-svn: 40825
* Factor out a whole bunch of code into its own method. | Chris Lattner | 2007-08-04 | 1 | -65/+82
  llvm-svn: 40824
* Use getNumPreds(BB) instead of computing them manually. This is a very small but | Chris Lattner | 2007-08-04 | 1 | -4/+4
  measurable speedup.
  llvm-svn: 40823
* Change the rename pass to be "tail recursive", only adding N-1 successors | Chris Lattner | 2007-08-04 | 1 | -21/+35
  to the worklist, and handling the last one with a 'tail call'. This speeds up
  PR1432 from 2.0578s to 2.0012s (2.8%)
  llvm-svn: 40822
* cache computation of #preds for a BB. This speeds up | Chris Lattner | 2007-08-04 | 1 | -3/+14
  mem2reg from 2.0742->2.0522s on PR1432.
  llvm-svn: 40821
* reserve operand space for phi nodes when we insert them. | Chris Lattner | 2007-08-04 | 1 | -0/+1
  llvm-svn: 40820
* use continue to avoid nesting, no functionality change. | Chris Lattner | 2007-08-04 | 1 | -14/+15
  llvm-svn: 40819
* Promoting allocas with the 'single store' fastpath is | Chris Lattner | 2007-08-04 | 1 | -10/+9
  faster than with the 'local to a block' fastpath. This speeds up PR1432 from
  2.1232 to 2.0686s (2.6%)
  llvm-svn: 40818
* When PromoteLocallyUsedAllocas promoted allocas, it didn't remember | Chris Lattner | 2007-08-04 | 1 | -2/+13
  to increment NumLocalPromoted, and didn't actually delete the dead alloca,
  leading to an extra iteration of mem2reg.
  llvm-svn: 40817
* std::map -> DenseMap | Chris Lattner | 2007-08-04 | 1 | -3/+3
  llvm-svn: 40816
* Clean up comments, fix up some confusing code logic. | Nick Lewycky | 2007-08-04 | 1 | -30/+47
  Predsimplify fails llvm-gcc bootstrap.
  llvm-svn: 40815
* fix a logic bug where we wouldn't promote single store allocas if the | Chris Lattner | 2007-08-04 | 1 | -2/+2
  stored value was a non-instruction value. Doh. This increases the # of single
  store allocas from 8982 to 9026, and speeds up mem2reg on the testcase in
  PR1432 from 2.17 to 2.13s.
  llvm-svn: 40813
* When we do the single-store optimization, delete both the store | Chris Lattner | 2007-08-04 | 1 | -2/+8
  and the alloca so they don't get reprocessed. This speeds up PR1432 from
  2.20s to 2.17s.
  llvm-svn: 40812
* Three improvements: | Chris Lattner | 2007-08-04 | 1 | -6/+16
  1. Check for revisiting a block before checking domination, which is faster.
  2. If the stored value isn't an instruction, we don't have to check for
     domination.
  3. If we have a value used in the same block more than once, make sure to
     remove the block from the UsingBlocks vector. Not doing so forces us to go
     through the slow path for the alloca.

  The combination of these improvements increases the number of allocas on the
  fastpath from 8935 to 8982 on PR1432. This speeds it up from 2.90s to 2.20s
  (31%)
  llvm-svn: 40811
* switch from using a std::set to using a SmallPtrSet. This speeds up the | Chris Lattner | 2007-08-04 | 1 | -3/+3
  testcase in PR1432 from 6.33s to 2.90s (2.22x)
  llvm-svn: 40810
* In mem2reg, when handling the single-store case, make sure to remove | Chris Lattner | 2007-08-04 | 1 | -8/+10
  a using block from the list if we handle it. Not doing this caused us to not
  be able to promote (with the fast path) allocas which have uses (whoops).

  This increases the # allocas hitting this fastpath from 4042 to 8935 on the
  testcase in PR1432, speeding up mem2reg by 2.6x
  llvm-svn: 40809
* This is the patch to provide clean intrinsic function overloading support in | Chandler Carruth | 2007-08-04 | 1 | -2/+2
  LLVM. It cleans up the intrinsic definitions and generally smooths the process
  for more complicated intrinsic writing. It will be used by the upcoming atomic
  intrinsics as well as vector and float intrinsics in the future.

  This also changes the syntax for llvm.bswap, llvm.part.set, llvm.part.select,
  and llvm.ct* intrinsics. They are automatically upgraded by both the LLVM ASM
  reader and the bitcode reader. The test cases have been updated, with special
  tests added to ensure the automatic upgrading is supported.
  llvm-svn: 40807
* split rewriting of single-store allocas into its own | Chris Lattner | 2007-08-04 | 1 | -39/+57
  method.
  llvm-svn: 40806
* refactor some code to shrink PromoteMem2Reg::run a bit | Chris Lattner | 2007-08-04 | 1 | -63/+96
  llvm-svn: 40805