path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* Fix compilation of mesa, which I broke earlier today | Chris Lattner | 2004-03-17 | 1 | -2/+3
  llvm-svn: 12465
* Be more accurate | Chris Lattner | 2004-03-17 | 1 | -4/+15
  llvm-svn: 12464
* Fix bug in previous checkin | Chris Lattner | 2004-03-16 | 1 | -2/+7
  llvm-svn: 12458
* Okay, so there is no reasonable way for tail duplication to update SSA form, | Chris Lattner | 2004-03-16 | 1 | -195/+49
  as it is making effectively arbitrary modifications to the CFG and we don't
  have domset/domfrontier implementations that can handle the dynamic updates.
  Instead of having a bunch of code that doesn't actually work in practice,
  just demote any potentially tricky values to the stack (causing the problem
  to go away entirely). Later invocations of mem2reg will rebuild SSA for us.

  This fixes all of the major performance regressions with tail duplication
  from LLVM 1.1. For example, this loop:

  ---
  int popcount(int x) {
    int result = 0;
    while (x != 0) {
      result = result + (x & 0x1);
      x = x >> 1;
    }
    return result;
  }
  ---

  Used to be compiled into:

  int %popcount(int %X) {
  entry:
          br label %loopentry

  loopentry:              ; preds = %entry, %no_exit
          %x.0 = phi int [ %X, %entry ], [ %tmp.9, %no_exit ]          ; <int> [#uses=3]
          %result.1.0 = phi int [ 0, %entry ], [ %tmp.6, %no_exit ]    ; <int> [#uses=2]
          %tmp.1 = seteq int %x.0, 0              ; <bool> [#uses=1]
          br bool %tmp.1, label %loopexit, label %no_exit

  no_exit:                ; preds = %loopentry
          %tmp.4 = and int %x.0, 1                ; <int> [#uses=1]
          %tmp.6 = add int %tmp.4, %result.1.0    ; <int> [#uses=1]
          %tmp.9 = shr int %x.0, ubyte 1          ; <int> [#uses=1]
          br label %loopentry

  loopexit:               ; preds = %loopentry
          ret int %result.1.0
  }

  And is now compiled into:

  int %popcount(int %X) {
  entry:
          br label %no_exit

  no_exit:                ; preds = %entry, %no_exit
          %x.0.0 = phi int [ %X, %entry ], [ %tmp.9, %no_exit ]        ; <int> [#uses=2]
          %result.1.0.0 = phi int [ 0, %entry ], [ %tmp.6, %no_exit ]  ; <int> [#uses=1]
          %tmp.4 = and int %x.0.0, 1              ; <int> [#uses=1]
          %tmp.6 = add int %tmp.4, %result.1.0.0  ; <int> [#uses=2]
          %tmp.9 = shr int %x.0.0, ubyte 1        ; <int> [#uses=2]
          %tmp.1 = seteq int %tmp.9, 0            ; <bool> [#uses=1]
          br bool %tmp.1, label %loopexit, label %no_exit

  loopexit:               ; preds = %no_exit
          ret int %tmp.6
  }

  llvm-svn: 12457
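  The difference between the two listings in this commit message can be read
  at the source level as moving the loop test from the top of the loop to the
  bottom. The following is only a hand-written C++ sketch of those two shapes,
  not output of the pass. The function names are made up for illustration, and
  it uses `unsigned` instead of the `int` in the commit so that the right
  shift is well defined for every input.

```cpp
#include <cassert>

// Shape of the "used to be compiled into" listing: the exit test sits in
// its own block at the top of the loop and is re-entered on every trip.
unsigned popcount_while(unsigned x) {
  unsigned result = 0;
  while (x != 0) {
    result += x & 1u;
    x >>= 1;
  }
  return result;
}

// Shape of the "is now compiled into" listing: one loop block with the
// exit test at the bottom. Running the body once for x == 0 is harmless
// here because it adds 0 to the result and leaves x at 0.
unsigned popcount_rotated(unsigned x) {
  unsigned result = 0;
  do {
    result += x & 1u;
    x >>= 1;
  } while (x != 0);
  return result;
}
```

  Both functions return the same value for every input; the second form
  simply executes one branch per iteration instead of two.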
* This code was both incredibly complex and incredibly broken. Fix it. | Chris Lattner | 2004-03-16 | 1 | -137/+57
  llvm-svn: 12456
* Punt if we see gigantic PHI nodes. This improves a huge interpreter loop | Chris Lattner | 2004-03-16 | 1 | -0/+6
  testcase from 32.5s in -raise to take .3s
  llvm-svn: 12443
* Do not try to optimize PHI nodes with incredibly high degree. This reduces SCCP | Chris Lattner | 2004-03-16 | 1 | -0/+7
  time from 615s to 1.49s on a large testcase that has a gigantic switch
  statement that all of the blocks in the function go to (an interpreter).
  llvm-svn: 12442
* Do not copy gigantic switch instructions | Chris Lattner | 2004-03-16 | 2 | -2/+19
  llvm-svn: 12441
* Fix a regression from this patch: | Chris Lattner | 2004-03-16 | 1 | -16/+13
  http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040308/013095.html

  Basically, this patch only updated the immediate dominatees of the header
  node to tell them that the preheader also dominated them. In practice, ALL
  dominatees of the header node are also dominated by the preheader.

  This fixes: LoopSimplify/2004-03-15-IncorrectDomUpdate. and PR293
  llvm-svn: 12434
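  The rule stated in this commit message (every block dominated by the loop
  header is also dominated by the new preheader, not only the header's
  immediate dominatees) can be sketched on a plain dominator-set table. This
  is an illustration under assumptions, not the LoopSimplify code: `DomSets`
  and `addPreheader` are hypothetical names, and blocks are identified by
  strings.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using Block = std::string;
// doms[B] is the set of blocks that dominate B, including B itself.
using DomSets = std::map<Block, std::set<Block>>;

// Hypothetical helper: update dominator sets after `preheader` has been
// inserted as the single out-of-loop predecessor of `header`.
void addPreheader(DomSets &doms, const Block &header, const Block &preheader) {
  // The preheader is dominated by everything that dominated the header
  // (except the header itself), plus itself.
  std::set<Block> pre = doms[header];
  pre.erase(header);
  pre.insert(preheader);

  // ALL blocks dominated by the header gain the preheader as a dominator,
  // however deep they sit below the header.
  for (auto &entry : doms)
    if (entry.second.count(header))
      entry.second.insert(preheader);

  doms[preheader] = pre;
}
```

  Updating only the header's immediate dominatees would leave blocks deeper
  in the loop with stale sets, which is the regression described above.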
* Restore old inlining heuristic. As the comment indicates, this is a nasty | Chris Lattner | 2004-03-15 | 1 | -1/+8
  horrible hack.
  llvm-svn: 12423
* Add counters for the number of calls eliminated | Chris Lattner | 2004-03-15 | 1 | -0/+6
  llvm-svn: 12420
* Implement LICM of calls in simple cases. This is sufficient to move around | Chris Lattner | 2004-03-15 | 1 | -1/+31
  sin/cos/strlen calls and stuff. This implements:
    LICM/call_sink_pure_function.ll
    LICM/call_sink_const_function.ll
  llvm-svn: 12415
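  As a source-level illustration of what moving a `strlen` call out of a loop
  means, here is a hand-written before/after pair. It is a sketch, not output
  of the LICM pass, and the function names are invented for the example. The
  move is only valid because the loop never writes to the string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Before: strlen(s) is loop-invariant but is evaluated on every iteration.
std::size_t countSpacesNaive(const char *s) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < std::strlen(s); ++i)
    if (s[i] == ' ')
      ++n;
  return n;
}

// After: the call is evaluated once, outside the loop. The result is
// unchanged because nothing in the loop modifies the string.
std::size_t countSpacesHoisted(const char *s) {
  const std::size_t len = std::strlen(s);
  std::size_t n = 0;
  for (std::size_t i = 0; i < len; ++i)
    if (s[i] == ' ')
      ++n;
  return n;
}
```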
* Mostly cosmetic improvements. Do fix the bug where a global value was considered an input. | Chris Lattner | 2004-03-15 | 1 | -35/+23
  llvm-svn: 12406
* Assert that input blocks meet the invariants we expect | Chris Lattner | 2004-03-15 | 1 | -42/+38
  Simplify the input/output finder. All elements of a basic block are
  instructions. Any used arguments are also inputs. An instruction can only
  be used by another instruction.
  llvm-svn: 12405
* Fix several bugs in the loop extractor. In particular, subloops were never | Chris Lattner | 2004-03-15 | 1 | -8/+48
  extracted, and a function that contained a single top-level loop never had
  the loop extracted, regardless of how much non-loop code there was.
  llvm-svn: 12403
* No correctness fixes here, just minor qoi fixes: | Chris Lattner | 2004-03-14 | 1 | -30/+26
  * Don't insert a branch to the switch instruction after the call, just
    make it a single block.
  * Insert the new alloca instructions in the entry block of the original
    function instead of having them execute dynamically
  * Don't make the default edge of the switch instruction go back to the
    switch. The loop extractor shouldn't create new loops!
  * Give meaningful names to the alloca slots and the reload instructions
  * Some minor code simplifications
  llvm-svn: 12402
* Simplify code a bit, and fix bug CodeExtractor/2004-03-14-NoSwitchSupport.ll | Chris Lattner | 2004-03-14 | 1 | -62/+34
  This also implements two minor improvements:
  * Don't insert live-out stores IN the region, insert them on the code
    path that exits the region
  * If the region is exited to the same block from multiple paths, share
    the switch statement entry, live-out store code, and the basic block.
  llvm-svn: 12401
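  The code extractor notes above (stack slots for live-out values, stores
  placed on the exit path, and a switch in the caller that picks the
  successor) can be pictured at the source level roughly as follows. This is
  a loose sketch under assumptions: `extractedRegion`, `firstNegativeOr`, and
  the `ExitId` values are all hypothetical, and real extracted code is LLVM
  IR rather than C++.

```cpp
#include <cassert>

// Hypothetical exit selectors returned by the extracted region.
enum ExitId : unsigned { ExitFound = 0, ExitNotFound = 1 };

// Stand-in for an extracted loop: the return value tells the caller which
// exit was taken, and the live-out value is written through a pointer.
unsigned extractedRegion(const int *data, int n, int *indexOut) {
  for (int i = 0; i < n; ++i) {
    if (data[i] < 0) {
      *indexOut = i;  // live-out store sits on the exit path only
      return ExitFound;
    }
  }
  return ExitNotFound;
}

// Stand-in for the original function after extraction.
int firstNegativeOr(const int *data, int n, int fallback) {
  int index = 0;  // slot for the live-out, set up once on function entry
  switch (extractedRegion(data, n, &index)) {
  case ExitFound:
    return data[index];  // reload the live-out after the call
  default:
    return fallback;     // the default edge leaves; it never loops back
  }
}
```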
* Simplify the code a bit by making the collection of basic blocks to extract | Chris Lattner | 2004-03-14 | 1 | -57/+39
  a member of the class. While we're at it, turn the collection into a set
  instead of a vector to improve efficiency and make queries simpler.
  llvm-svn: 12400
* Split into two passes. Now there is the general loop extractor, usable on | Chris Lattner | 2004-03-14 | 1 | -6/+24
  the command line, and the single loop extractor, usable by bugpoint
  llvm-svn: 12390
* Passes don't print stuff! | Chris Lattner | 2004-03-14 | 1 | -2/+0
  llvm-svn: 12385
* Do not create empty basic blocks when the lowerswitch pass expects blocks to | Chris Lattner | 2004-03-14 | 1 | -5/+2
  be non-empty! This fixes LowerSwitch/2004-03-13-SwitchIsDefaultCrash.ll
  llvm-svn: 12384
* Minor random cleanups | Chris Lattner | 2004-03-14 | 1 | -9/+7
  llvm-svn: 12382
* FunctionPass's should not define their own 'run' method. | Chris Lattner | 2004-03-14 | 1 | -8/+2
  Require 'simplified' loops, not just raw natural loops. This fixes
  CodeExtractor/2004-03-13-LoopExtractorCrash.ll
  llvm-svn: 12381
* If a block is dead, dominators will not be calculated for it. Because of this | Chris Lattner | 2004-03-14 | 1 | -2/+33
  loop information won't see it, and we could have unreachable blocks
  pointing to the non-header node of blocks in a natural loop. This isn't
  tidy, so have the loopsimplify pass clean it up.
  llvm-svn: 12380
* Verify functions as they are produced if -debug is specified. Reduce | Chris Lattner | 2004-03-14 | 1 | -6/+5
  curly braceage
  llvm-svn: 12378
* Move prototype to IPO.h instead of Scalar.h | Chris Lattner | 2004-03-14 | 1 | -3/+2
  Make sure that the file interface header (IPO.h) is included first
  Remove dead #include
  llvm-svn: 12375
* Indent anon namespace properly, add copyright block | Chris Lattner | 2004-03-14 | 1 | -19/+20
  llvm-svn: 12373
* Move to the IPO library. Utils shouldn't contain passes. | Chris Lattner | 2004-03-14 | 1 | -0/+0
  llvm-svn: 12372
* DemoteRegToStack got moved from DemoteRegToStack.h to Local.h | Chris Lattner | 2004-03-14 | 3 | -6/+6
  llvm-svn: 12368
* Add some debugging output | Chris Lattner | 2004-03-13 | 1 | -1/+8
  Fix InstCombine/2004-03-13-InstCombineInfLoop.ll which caused an infinite
  loop compiling (I think) povray.
  llvm-svn: 12365
* This change makes two big adjustments. | Chris Lattner | 2004-03-13 | 1 | -11/+49
  * Be a lot more accurate about what the effects will be when inlining a
    call to a function when an argument is an alloca.
  * Dramatically reduce the penalty for inlining a call in a large function.
    This heuristic made it almost impossible to inline a function into a
    large function, no matter how small the callee is.
  llvm-svn: 12363
* This little patch speeds up the loop used to update the dominator set analysis. | Chris Lattner | 2004-03-13 | 1 | -17/+18
  On the testcase from GCC PR12440, which has a LOT of loops (1392 of which
  require preheaders to be inserted), this speeds up the loopsimplify pass
  from 1.931s to 0.1875s. The loop in question goes from 1.65s -> 0.0097s,
  which isn't bad. All of these times are a debug build.

  This adds a dependency on DominatorTree analysis that was not there before,
  but we always had dominatortree available anyway, because LICM requires
  both loop simplify and DT, so this doesn't add any extra analysis in
  practice.
  llvm-svn: 12362
* Implement sub.ll:test14 | Chris Lattner | 2004-03-13 | 1 | -8/+29
  llvm-svn: 12355
* Implement InstCombine/sub.ll:test12 & test13 | Chris Lattner | 2004-03-12 | 1 | -0/+36
  llvm-svn: 12353
* Add constant folding wrapper support for select instructions. | Chris Lattner | 2004-03-12 | 1 | -0/+4
  llvm-svn: 12319
* Add sccp support for select instructions | Chris Lattner | 2004-03-12 | 1 | -0/+23
  llvm-svn: 12318
* Add trivial optimizations for select instructions | Chris Lattner | 2004-03-12 | 1 | -0/+15
  llvm-svn: 12317
* Initial support for edge profiling | Chris Lattner | 2004-03-08 | 1 | -0/+94
  llvm-svn: 12225
* Split utility functions out of BlockProfiling.cpp | Chris Lattner | 2004-03-08 | 3 | -85/+137
  llvm-svn: 12224
* finegrainify namespacification | Chris Lattner | 2004-03-08 | 1 | -14/+14
  llvm-svn: 12221
* Implement ArgumentPromotion/aggregate-promote.ll | Chris Lattner | 2004-03-08 | 1 | -25/+145
  This allows pointers to aggregate objects, whose elements are only read, to
  be promoted and passed in by element instead of by reference. This can
  enable a LOT of subsequent optimizations in the caller function.

  It's worth pointing out that this stuff happens a LOT in C++ programs,
  because objects in templates are generally passed around by reference.
  When these templates are instantiated on small aggregate or scalar types,
  however, it is more efficient to pass them in by value than by reference.

  This transformation triggers most on C++ codes (e.g. 334 times on eon), but
  does happen on C codes as well. For example, on mesa it triggers 72 times,
  and on gcc it triggers 35 times. This is amazingly good considering that we
  are using 'basicaa' so far.
  llvm-svn: 12202
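  A source-level picture of what "passed in by element instead of by
  reference" means is given below. It is written by hand as an illustration
  and is not the pass's output; `Point`, `sumByRef`, `sumPromoted`, and the
  two callers are hypothetical names. The rewrite is only safe because the
  callee reads the fields and never writes through the pointer.

```cpp
#include <cassert>

struct Point {
  int x;
  int y;
};

// Before: the callee receives a pointer and only loads the two fields.
static int sumByRef(const Point *p) { return p->x + p->y; }

int callerByRef() {
  Point p = {3, 4};
  return sumByRef(&p);  // p must live in memory because its address escapes
}

// After: each element that was read is passed as its own scalar argument.
static int sumPromoted(int x, int y) { return x + y; }

int callerPromoted() {
  Point p = {3, 4};
  return sumPromoted(p.x, p.y);  // no address is taken, so p can stay in registers
}
```

  Once the address of `p` no longer escapes, later passes in the caller are
  free to treat its fields as ordinary scalar values.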
* Implement: ArgumentPromotion/chained.ll | Chris Lattner | 2004-03-07 | 1 | -0/+5
  llvm-svn: 12200
* Fix another minor bug, exposed by perlbmk | Chris Lattner | 2004-03-07 | 1 | -3/+11
  llvm-svn: 12198
* Since 'load null' is undefined, we can make it do whatever we want. Returning | Chris Lattner | 2004-03-07 | 1 | -0/+6
  a zero value is the most likely way to cause further simplification, so we
  do it.
  llvm-svn: 12197
* Fix a minor bug and turn debug output into, well, debug output. | Chris Lattner | 2004-03-07 | 1 | -2/+2
  llvm-svn: 12195
* New LLVM pass: argument promotion. This version only handles simple scalar | Chris Lattner | 2004-03-07 | 1 | -0/+328
  variables.
  llvm-svn: 12193
* Don't emit things like malloc(16*1). Allocation instructions are fixed arity now. | Chris Lattner | 2004-03-03 | 1 | -1/+1
  llvm-svn: 12086
* Implement ExtractCodeRegion() | Misha Brukman | 2004-03-02 | 1 | -1/+9
  llvm-svn: 12070
* Make a note that this is usually used via bugpoint. | Misha Brukman | 2004-03-02 | 1 | -3/+2
  llvm-svn: 12068
* * Add implementation of ExtractBasicBlock() | Misha Brukman | 2004-03-01 | 1 | -0/+10
  * Add comments to ExtractLoop()
  llvm-svn: 12053