summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* SCEVAddExpr::get() of an empty list is invalid.Chris Lattner2005-08-091-1/+4
| | | | llvm-svn: 22724
* Implement: LoopStrengthReduce/share_ivs.llChris Lattner2005-08-091-53/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two changes: * Only insert one PHI node for each stride. Other values are live in values. This cannot introduce higher register pressure than the previous approach, and can take advantage of reg+reg addressing modes. * Factor common base values out of uses before moving values from the base to the immediate fields. This improves codegen by starting the stride-specific PHI node out at a common place for each IV use. As an example, we used to generate this for a loop in swim: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfd f0, 0(r8) stfd f0, 0(r3) lfd f0, 0(r6) stfd f0, 0(r7) lfd f0, 0(r2) stfd f0, 0(r5) addi r9, r9, 1 addi r2, r2, 8 addi r5, r5, 8 addi r6, r6, 8 addi r7, r7, 8 addi r8, r8, 8 addi r3, r3, 8 cmpw cr0, r9, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 now we emit: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfdx f0, r8, r2 stfdx f0, r9, r2 lfdx f0, r5, r2 stfdx f0, r7, r2 lfdx f0, r3, r2 stfdx f0, r6, r2 addi r10, r10, 1 addi r2, r2, 8 cmpw cr0, r10, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 As another more dramatic example, we used to emit this: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfd f0, 8(r21) lfd f4, 8(r3) lfd f5, 8(r27) lfd f6, 8(r22) lfd f7, 8(r5) lfd f8, 8(r6) lfd f9, 8(r30) lfd f10, 8(r11) lfd f11, 8(r12) fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfd f0, 8(r4) lfd f0, 8(r25) lfd f5, 8(r26) lfd f6, 8(r23) lfd f9, 8(r28) lfd f10, 8(r10) lfd f12, 8(r9) lfd f13, 8(r29) fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfd f0, 8(r24) lfd f0, 8(r8) fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfd f0, 8(r2) addi r20, r20, 1 addi r2, r2, 8 addi r8, r8, 8 addi r10, r10, 8 addi r12, r12, 8 addi r6, r6, 8 addi r29, r29, 8 addi r28, r28, 8 addi r26, r26, 8 addi r25, r25, 8 addi r24, r24, 8 addi r5, r5, 8 addi r23, r23, 8 addi r22, r22, 8 addi r3, r3, 8 addi r9, r9, 8 addi r11, r11, 8 addi r30, r30, 8 addi r27, r27, 8 addi r21, r21, 8 addi r4, r4, 8 cmpw cr0, r20, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 we now emit: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfdx f0, r21, r20 lfdx f4, r3, r20 lfdx f5, r27, r20 lfdx f6, r22, r20 lfdx f7, r5, r20 lfdx f8, r6, r20 lfdx f9, r30, r20 lfdx f10, r11, r20 lfdx f11, r12, r20 fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfdx f0, r4, r20 lfdx f0, r25, r20 lfdx f5, r26, r20 lfdx f6, r23, r20 lfdx f9, r28, r20 lfdx f10, r10, r20 lfdx f12, r9, r20 lfdx f13, r29, r20 fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfdx f0, r24, r20 lfdx f0, r8, r20 fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfdx f0, r2, r20 addi r19, r19, 1 addi r20, r20, 8 cmpw cr0, r19, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 llvm-svn: 22722
* Suck the base value out of the UsersToProcess vector into the BasedUserChris Lattner2005-08-081-38/+38
| | | | | | class to simplify the code. Fuse two loops. llvm-svn: 22721
* Split MoveLoopVariantsToImediateField out from MoveImmediateValues. TheChris Lattner2005-08-081-23/+65
| | | | | | | first is a correctness thing, and the later is an optzn thing. This also is needed to support a future change. llvm-svn: 22720
* Use the new 'moveBefore' method to simplify some code. Really, which isChris Lattner2005-08-083-6/+4
| | | | | | easier to understand? :) llvm-svn: 22706
* Not all constants are legal immediates in load/store instructions.Chris Lattner2005-08-081-1/+7
| | | | llvm-svn: 22704
* Implement LoopStrengthReduce/share_code_in_preheader.ll by having oneChris Lattner2005-08-081-1/+4
| | | | | | rewriter for all code inserted into the preheader, which is never flushed. llvm-svn: 22702
* Implement a simple optimization for the termination condition of the loop.Chris Lattner2005-08-081-6/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The termination condition actually wants to use the post-incremented value of the loop, not a new indvar with an unusual base. On PPC, for example, this allows us to compile LoopStrengthReduce/exit_compare_live_range.ll to: _foo: li r2, 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r2, r2, 1 cmpw cr0, r2, r4 bne .LBB_foo_1 ; no_exit blr instead of: _foo: li r2, 1 ;; IV starts at 1, not 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r5, r2, 1 cmpw cr0, r2, r4 or r2, r5, r5 ;; Reg-reg copy, extra live range bne .LBB_foo_1 ; no_exit blr This implements LoopStrengthReduce/exit_compare_live_range.ll llvm-svn: 22699
* All stats are "Number of ..."Chris Lattner2005-08-071-1/+1
| | | | llvm-svn: 22694
* Add some simple folds that occur in bitfield cases. Fix a minor bug inChris Lattner2005-08-071-0/+32
| | | | | | isHighOnes, where it would consider 0 to have high ones. llvm-svn: 22693
* Fix typoCVS: ↵Chris Lattner2005-08-071-1/+1
| | | | | | ---------------------------------------------------------------------- llvm-svn: 22692
* * Use the new PHINode::hasConstantValue method to simplify some codeChris Lattner2005-08-071-26/+66
| | | | | | | | | | | * Teach this code to move allocas out of the loop when tail call eliminating a call marked 'tail'. This implements TailCallElim/move_alloca_for_tail_call.ll * Do not perform this transformation if a call is marked 'tail' and if there are allocas that we cannot move out of the loop in #2. Doing so would increase the stack usage of the function. This implements fixes PR615 and TailCallElim/dont-tce-tail-marked-call.ll. llvm-svn: 22690
* Make sure to clean CastedPointers after casts are potentially deleted.Chris Lattner2005-08-051-1/+1
| | | | | | This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas llvm-svn: 22673
* now that hasConstantValue defaults to only returning values that dominateChris Lattner2005-08-051-19/+2
| | | | | | the PHI node, this ugly code can vanish. llvm-svn: 22672
* This code can handle non-dominating instructionsChris Lattner2005-08-052-2/+2
| | | | llvm-svn: 22667
* Fix a fixme in CondPropagate.cpp by moving a PhiNode optimization intoNate Begeman2005-08-045-51/+4
| | | | | | | | BasicBlock's removePredecessor routine. This requires shuffling around the definition and implementation of hasContantValue from Utils.h,cpp into Instructions.h,cpp llvm-svn: 22664
* Modify how immediates are removed from base expressions to deal with the factChris Lattner2005-08-041-26/+41
| | | | | | | | | | | | | that the symbolic evaluator is not always able to use subtraction to remove expressions. This makes the code faster, and fixes the last crash on 178.galgel. Finally, add a statistic to see how many phi nodes are inserted. On 178.galgel, we get the follow stats: 2562 loop-reduce - Number of PHIs inserted 3927 loop-reduce - Number of GEPs strength reduced llvm-svn: 22662
* * Refactor some code into a new BasedUser::RewriteInstructionToUseNewBaseChris Lattner2005-08-041-16/+48
| | | | | | | | method. * Fix a crash on 178.galgel, where we would insert expressions before PHI nodes instead of into the PHI node predecessor blocks. llvm-svn: 22657
* Fix a case that caused this to crash on 178.galgelChris Lattner2005-08-041-0/+6
| | | | llvm-svn: 22653
* Teach LSR about loop-variant expressions, such as loops like this:Chris Lattner2005-08-041-68/+93
| | | | | | | | | | | | | | for (i = 0; i < N; ++i) A[i][foo()] = 0; here we still want to strength reduce the A[i] part, even though foo() is l-v. This also simplifies some of the 'CanReduce' logic. This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll llvm-svn: 22652
* Remove some more dead code.Nate Begeman2005-08-041-20/+0
| | | | llvm-svn: 22650
* Refactor this code substantially with the following improvements:Chris Lattner2005-08-041-138/+38
| | | | | | | | | | | 1. We only analyze instructions once, guaranteed 2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with something much simpler. The next step is to handle expressions that are not all indvar+loop-invariant values (e.g. handling indvar+loopvariant). llvm-svn: 22649
* refactor some codeChris Lattner2005-08-041-34/+45
| | | | llvm-svn: 22643
* invert to if's to make the logic simplerChris Lattner2005-08-041-14/+11
| | | | llvm-svn: 22641
* When processing outer loops and we find uses of an IV in inner loops, makeChris Lattner2005-08-041-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sure to handle the use, just don't recurse into it. This permits us to generate this code for a simple nested loop case: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r29, 44(r1) stw r30, 40(r1) mflr r11 stw r11, 56(r1) lis r2, ha16(L_A$non_lazy_ptr) lwz r30, lo16(L_A$non_lazy_ptr)(r2) li r29, 1 .LBB_foo_1: ; no_exit.0 bl L_bar$stub li r2, 1 or r3, r30, r30 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r3) stfd f0, 0(r3) addi r4, r2, 1 addi r3, r3, 8 cmpwi cr0, r2, 100 or r2, r4, r4 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r30, r30, 800 addi r2, r29, 1 cmpwi cr0, r29, 100 or r29, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 40(r1) lwz r29, 44(r1) lwz r1, 0(r1) blr instead of this: _foo: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r28, 44(r1) ;; uses an extra register. stw r29, 40(r1) stw r30, 36(r1) mflr r11 stw r11, 56(r1) li r30, 1 li r29, 0 or r28, r29, r29 .LBB_foo_1: ; no_exit.0 bl L_bar$stub mulli r2, r28, 800 ;; unstrength-reduced multiply lis r3, ha16(L_A$non_lazy_ptr) ;; loop invariant address computation lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 mulli r4, r29, 800 ;; unstrength-reduced multiply addi r3, r3, 8 add r3, r4, r3 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 ;; multiple stride 8 IV's addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r28, r28, 1 ;;; Many IV's with stride 1 addi r29, r29, 1 addi r2, r30, 1 cmpwi cr0, r30, 100 or r30, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 36(r1) lwz r29, 40(r1) lwz r28, 44(r1) lwz r1, 0(r1) blr llvm-svn: 22640
* Teach loop-reduce to see into nested loops, to pull out immediate valuesChris Lattner2005-08-031-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pushed down by SCEV. In a nested loop case, this allows us to emit this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 li r3, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r2) ;; Uses offset of 8 instead of 0 stfd f0, 0(r2) addi r4, r3, 1 addi r2, r2, 8 cmpwi cr0, r3, 100 or r3, r4, r4 bne .LBB_foo_2 ; no_exit.1 instead of this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 addi r3, r3, 8 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 llvm-svn: 22639
* improve debug outputChris Lattner2005-08-031-4/+9
| | | | llvm-svn: 22638
* Move from Stage 0 to Stage 1.Chris Lattner2005-08-031-31/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only emit one PHI node for IV uses with identical bases and strides (after moving foldable immediates to the load/store instruction). This implements LoopStrengthReduce/dont_insert_redundant_ops.ll, allowing us to generate this PPC code for test1: or r30, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r30) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop instead of this code: or r30, r3, r3 or r29, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r29) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 ;; Two iv's with step of 8 addi r29, r29, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop llvm-svn: 22635
* Rename IVUse to IVUsersOfOneStride, use a struct instead of a pair toChris Lattner2005-08-031-25/+41
| | | | | | | unify some parallel vectors and get field names more descriptive than "first" and "second". This isn't lisp afterall :) llvm-svn: 22633
* Fix a nasty dangling pointer issue. The ScalarEvolution pass would keep aChris Lattner2005-08-031-1/+3
| | | | | | | | | map from instruction* to SCEVHandles. When we delete instructions, we have to tell it about it. We would run into nasty cases where new instructions were reallocated at old instruction addresses and get the old map values. Bad bad bad :( llvm-svn: 22632
* The correct fix for PR612, which also fixesChris Lattner2005-08-031-2/+12
| | | | | | Transforms/LowerInvoke/2005-08-03-InvokeWithPHIUse.ll llvm-svn: 22628
* When inserting code, make sure not to insert it before PHI nodes. ThisChris Lattner2005-08-031-1/+3
| | | | | | fixes PR612 and Transforms/LowerInvoke/2005-08-03-InvokeWithPHI.ll llvm-svn: 22626
* Fix Transforms/SimplifyCFG/2005-08-03-PHIFactorCrash.ll, a problem thatChris Lattner2005-08-031-2/+3
| | | | | | occurred while bugpointing another testcase llvm-svn: 22621
* Finally, add the required constraint checks to fix ↵Chris Lattner2005-08-031-2/+29
| | | | | | | | Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll the right way llvm-svn: 22615
* Simplify some code, add the correct pred checksChris Lattner2005-08-031-16/+25
| | | | llvm-svn: 22613
* Refactor code out of PropagatePredecessorsForPHIs, turning it into a pure ↵Chris Lattner2005-08-031-37/+36
| | | | | | function with no side-effects llvm-svn: 22612
* use splice instead of remove/insert to avoid some symtab operationsChris Lattner2005-08-031-2/+2
| | | | llvm-svn: 22611
* move two functions up in the file, use SafeToMergeTerminators to eliminateChris Lattner2005-08-031-61/+45
| | | | | | some duplicated code llvm-svn: 22610
* Rip some code out of the main SimplifyCFG function into a subfunction andChris Lattner2005-08-031-78/+72
| | | | | | call it from the only place it is live. No functionality changes. llvm-svn: 22609
* Disable this patch:Chris Lattner2005-08-021-1/+1
| | | | | | | | | http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050801/027345.html This breaks real programs and only fixes an obscure regression testcase. A real fix is in development. llvm-svn: 22606
* Change a place to use an arbitrary value instead of null, when possibleChris Lattner2005-08-021-3/+3
| | | | llvm-svn: 22605
* Update to use the new MathExtras.h support for log2 computation.Chris Lattner2005-08-021-22/+15
| | | | | | Patch contributed by Jim Laskey! llvm-svn: 22592
* Like the comment says, do not insert cast instructions before phi nodesChris Lattner2005-08-021-0/+4
| | | | llvm-svn: 22586
* This code was very close, but not quite right. It did not take intoChris Lattner2005-08-021-3/+10
| | | | | | | | consideration the case where a reference in an unreachable block could occur. This fixes Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll, something I ran into while bugpoint'ing another pass. llvm-svn: 22584
* add a comment, make a check more lenientChris Lattner2005-08-021-8/+10
| | | | llvm-svn: 22581
* Simplify for loop, clear a per-loop map after processing each loopChris Lattner2005-08-021-1/+2
| | | | llvm-svn: 22580
* Add a commentChris Lattner2005-08-021-0/+10
| | | | | | | Make LSR ignore GEP's that have loop variant base values, as we currently cannot codegen them llvm-svn: 22576
* Fix an iterator invalidation problemChris Lattner2005-08-021-1/+3
| | | | llvm-svn: 22575
* ConstantInt::get only works for arguments < 128.Chris Lattner2005-08-011-2/+6
| | | | | | | | | | SimplifyLibCalls probably has to be audited to make sure it does not make this mistake elsewhere. Also, if this code knows that the type will be unsigned, obviously one arm of this is dead. Reid, can you take a look into this further? llvm-svn: 22566
* Keep tabs and trailing spaces out.Jeff Cohen2005-07-301-23/+23
| | | | llvm-svn: 22565
OpenPOWER on IntegriCloud