summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Fix a FIXME: if we are inserting code for a PHI argument, split the criticalChris Lattner2005-08-121-6/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | edge so that the code is not always executed for both operands. This prevents LSR from inserting code into loops whose exit blocks contain PHI uses of IV expressions (which are outside of loops). On gzip, for example, we turn this ugly code: .LBB_test_1: ; loopentry add r27, r3, r28 lhz r27, 3(r27) add r26, r4, r28 lhz r26, 3(r26) add r25, r30, r28 ;; Only live if exiting the loop add r24, r29, r28 ;; Only live if exiting the loop cmpw cr0, r27, r26 bne .LBB_test_5 ; loopexit into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_2: ; shortcirc_next.0 ... blt .LBB_test_1 into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_t_3: ; shortcirc_next.0 .LBB_test_3: ; shortcirc_next.0 ... blt .LBB_test_1 Next step: get the block out of the loop so that the loop is all fall-throughs again. llvm-svn: 22766
* Change break critical edges to not remove, then insert, PHI node entries.Chris Lattner2005-08-121-2/+2
| | | | | | | Instead, just update the BB in-place. This is both faster, and it prevents split-critical-edges from shuffling the PHI argument list unneccesarily. llvm-svn: 22765
* match gcc's use of tabs, makes diffs easierAndrew Lenharth2005-08-121-17/+17
| | | | llvm-svn: 22764
* .section cleanup, patch from Nicholas RileyAndrew Lenharth2005-08-121-3/+3
| | | | llvm-svn: 22763
* 1. Added the function isOpcWithIntImmediate to simplify testing of operand withJim Laskey2005-08-111-9/+40
| | | | | | | | specified opcode and an integer constant right operand. 2. Modified ISD::SHL, ISD::SRL, ISD::SRA to use rlwinm when applied after a mask. llvm-svn: 22761
* Tidied up the use of dyn_cast<ConstantSDNode> by using isIntImmediate more.Chris Lattner2005-08-111-22/+19
| | | | | | Patch by Jim Laskey. llvm-svn: 22760
* Use a more efficient method of creating integer and float virtual registersChris Lattner2005-08-111-44/+52
| | | | | | | | | | | | | | (avoids an extra level of indirection in MakeReg). defined MakeIntReg using RegMap->createVirtualRegister(PPC32::GPRCRegisterClass) defined MakeFPReg using RegMap->createVirtualRegister(PPC32::FPRCRegisterClass) s/MakeReg(MVT::i32)/MakeIntReg/ s/MakeReg(MVT::f64)/MakeFPReg/ Patch by Jim Laskey! llvm-svn: 22759
* Add a select_cc optimization for recognizing abs(int). This speeds up anNate Begeman2005-08-111-0/+16
| | | | | | integer MPEG encoding loop by a factor of two. llvm-svn: 22758
* Some SELECT_CC cleanups:Nate Begeman2005-08-112-53/+61
| | | | | | | | | | | | 1. move assertions for node creation to getNode() 2. legalize the values returned in ExpandOp immediately 3. Move select_cc optimizations from SELECT's getNode() to SELECT_CC's, allowing them to be cleaned up significantly. This paves the way to pick up additional optimizations on SELECT_CC, such as sum-of-absolute-differences. llvm-svn: 22757
* Make SELECT illegal on PPC32, switch to using SELECT_CC, which more closelyNate Begeman2005-08-101-134/+89
| | | | | | | reflects what the hardware is capable of. This significantly simplifies the CC handling logic throughout the ISel. llvm-svn: 22756
* Add new node, SELECT_CC. This node is for targets that don't nativelyNate Begeman2005-08-102-3/+53
| | | | | | implement SELECT. llvm-svn: 22755
* Changes for PPC32ISelPattern.cppChris Lattner2005-08-101-24/+22
| | | | | | | | | 1. Clean up how SelectIntImmediateExpr handles use counts. 2. "Subtract from" was not clearing hi 16 bits. Patch by Jim Laskey llvm-svn: 22754
* Fix an oversight that may be causing PR617.Chris Lattner2005-08-101-4/+13
| | | | llvm-svn: 22753
* remove some trickiness that broke yacr2 and some other programs last nightChris Lattner2005-08-101-3/+1
| | | | llvm-svn: 22751
* Changed the XOR case to use the isOprNot predicate.Chris Lattner2005-08-101-3/+1
| | | | | | Patch by Jim Laskey! llvm-svn: 22750
* 1. Refactored handling of integer immediate values for add, or, xor and sub.Chris Lattner2005-08-101-60/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | New routine: ISel::SelectIntImmediateExpr 2. Now checking use counts of large constants. If use count is > 2 then drop thru so that the constant gets loaded into a register. Source: int %test1(int %a) { entry: %tmp.1 = add int %a, 123456789 ; <int> [#uses=1] %tmp.2 = or int %tmp.1, 123456789 ; <int> [#uses=1] %tmp.3 = xor int %tmp.2, 123456789 ; <int> [#uses=1] %tmp.4 = sub int %tmp.3, -123456789 ; <int> [#uses=1] ret int %tmp.4 } Did Emit: .machine ppc970 .text .align 2 .globl _test1 _test1: .LBB_test1_0: ; entry addi r2, r3, -13035 addis r2, r2, 1884 ori r2, r2, 52501 oris r2, r2, 1883 xori r2, r2, 52501 xoris r2, r2, 1883 addi r2, r2, 52501 addis r3, r2, 1883 blr Now Emits: .machine ppc970 .text .align 2 .globl _test1 _test1: .LBB_test1_0: ; entry lis r2, 1883 ori r2, r2, 52501 add r3, r3, r2 or r3, r3, r2 xor r3, r3, r2 add r3, r3, r2 blr Patch by Jim Laskey! llvm-svn: 22749
* sorry!! this is temporary; for some reason the nasty constmul code seems toDuraid Madina2005-08-101-3/+4
| | | | | | | be an infinite loop when using g++-4.0.1*, this kills the ia64 nightly tester. A proper fix shall be forthcoming!!! thanks for not killing me. :) llvm-svn: 22748
* Fix a bug compiling: select (i32 < i32), f32, f32Chris Lattner2005-08-101-0/+1
| | | | llvm-svn: 22747
* Make loop-simplify produce better loops by turning PHI nodes like X = phi [X, Y]Chris Lattner2005-08-101-1/+16
| | | | | | | into just Y. This often occurs when it seperates loops that have collapsed loop headers. This implements LoopSimplify/phi-node-simplify.ll llvm-svn: 22746
* Allow indvar simplify to canonicalize ANY affine IV, not just affine IVs withChris Lattner2005-08-101-8/+8
| | | | | | constant stride. This implements Transforms/IndVarsSimplify/variable-stride-ivs.ll llvm-svn: 22744
* Fix an obvious oopsChris Lattner2005-08-101-1/+1
| | | | llvm-svn: 22742
* Teach LSR to strength reduce IVs that have a loop-invariant but non-constant ↵Chris Lattner2005-08-101-24/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | stride. For code like this: void foo(float *a, float *b, int n, int stride_a, int stride_b) { int i; for (i=0; i<n; i++) a[i*stride_a] = b[i*stride_b]; } we now emit: .LBB_foo2_2: ; no_exit lfs f0, 0(r4) stfs f0, 0(r3) addi r7, r7, 1 add r4, r2, r4 add r3, r6, r3 cmpw cr0, r7, r5 blt .LBB_foo2_2 ; no_exit instead of: .LBB_foo_2: ; no_exit mullw r8, r2, r7 ;; multiply! slwi r8, r8, 2 lfsx f0, r4, r8 mullw r8, r2, r6 ;; multiply! slwi r8, r8, 2 stfsx f0, r3, r8 addi r2, r2, 1 cmpw cr0, r2, r5 blt .LBB_foo_2 ; no_exit loops with variable strides occur pretty often. For example, in SPECFP2K there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp, 56 in 168.wupwise, 36 in 172.mgrid. Now we can allow indvars to turn functions written like this: void foo2(float *a, float *b, int n, int stride_a, int stride_b) { int i, ai = 0, bi = 0; for (i=0; i<n; i++) { a[ai] = b[bi]; ai += stride_a; bi += stride_b; } } into code like the above for better analysis. With this patch, they generate identical code. llvm-svn: 22740
* Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.llChris Lattner2005-08-101-7/+14
| | | | | | by being more careful about updating PHI nodes llvm-svn: 22739
* Fix some 80 column violations.Chris Lattner2005-08-091-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once we compute the evolution for a GEP, tell SE about it. This allows users of the GEP to know it, if the users are not direct. This allows us to compile this testcase: void fbSolidFillmmx(int w, unsigned char *d) { while (w >= 64) { *(unsigned long long *) (d + 0) = 0; *(unsigned long long *) (d + 8) = 0; *(unsigned long long *) (d + 16) = 0; *(unsigned long long *) (d + 24) = 0; *(unsigned long long *) (d + 32) = 0; *(unsigned long long *) (d + 40) = 0; *(unsigned long long *) (d + 48) = 0; *(unsigned long long *) (d + 56) = 0; w -= 64; d += 64; } } into: .LBB_fbSolidFillmmx_2: ; no_exit li r2, 0 stw r2, 0(r4) stw r2, 4(r4) stw r2, 8(r4) stw r2, 12(r4) stw r2, 16(r4) stw r2, 20(r4) stw r2, 24(r4) stw r2, 28(r4) stw r2, 32(r4) stw r2, 36(r4) stw r2, 40(r4) stw r2, 44(r4) stw r2, 48(r4) stw r2, 52(r4) stw r2, 56(r4) stw r2, 60(r4) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit instead of: .LBB_fbSolidFillmmx_2: ; no_exit li r11, 0 stw r11, 0(r4) stw r11, 4(r4) stwx r11, r10, r4 add r12, r10, r4 stw r11, 4(r12) stwx r11, r9, r4 add r12, r9, r4 stw r11, 4(r12) stwx r11, r8, r4 add r12, r8, r4 stw r11, 4(r12) stwx r11, r7, r4 add r12, r7, r4 stw r11, 4(r12) stwx r11, r6, r4 add r12, r6, r4 stw r11, 4(r12) stwx r11, r5, r4 add r12, r5, r4 stw r11, 4(r12) stwx r11, r2, r4 add r12, r2, r4 stw r11, 4(r12) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit llvm-svn: 22737
* implement two helper methodsChris Lattner2005-08-091-0/+28
| | | | llvm-svn: 22736
* Fix spelling, fix some broken canonicalizations by my last patchChris Lattner2005-08-091-12/+11
| | | | llvm-svn: 22734
* add a optimization noteChris Lattner2005-08-091-0/+18
| | | | llvm-svn: 22732
* add cc nodes to the AllNodes list so they show up in Graphviz outputChris Lattner2005-08-091-1/+3
| | | | llvm-svn: 22731
* Update the targets to the new SETCC/CondCodeSDNode interfaces.Chris Lattner2005-08-095-426/+407
| | | | llvm-svn: 22729
* Eliminate the SetCCSDNode in favor of a CondCodeSDNode class. This pulls theChris Lattner2005-08-093-175/+185
| | | | | | | | CC out of the SetCC operation, making SETCC a standard ternary operation and CC's a standard DAG leaf. This will make it possible for other node to use CC's as operands in the future... llvm-svn: 22728
* Minor cleanup patch, no functionality changes. Written by Jim Laskey.Chris Lattner2005-08-091-19/+19
| | | | llvm-svn: 22727
* Fix CodeGen/Generic/div-neg-power-2.ll, a regression from last night.Chris Lattner2005-08-091-0/+2
| | | | llvm-svn: 22726
* SCEVAddExpr::get() of an empty list is invalid.Chris Lattner2005-08-091-1/+4
| | | | llvm-svn: 22724
* Implement: LoopStrengthReduce/share_ivs.llChris Lattner2005-08-091-53/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two changes: * Only insert one PHI node for each stride. Other values are live in values. This cannot introduce higher register pressure than the previous approach, and can take advantage of reg+reg addressing modes. * Factor common base values out of uses before moving values from the base to the immediate fields. This improves codegen by starting the stride-specific PHI node out at a common place for each IV use. As an example, we used to generate this for a loop in swim: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfd f0, 0(r8) stfd f0, 0(r3) lfd f0, 0(r6) stfd f0, 0(r7) lfd f0, 0(r2) stfd f0, 0(r5) addi r9, r9, 1 addi r2, r2, 8 addi r5, r5, 8 addi r6, r6, 8 addi r7, r7, 8 addi r8, r8, 8 addi r3, r3, 8 cmpw cr0, r9, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 now we emit: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfdx f0, r8, r2 stfdx f0, r9, r2 lfdx f0, r5, r2 stfdx f0, r7, r2 lfdx f0, r3, r2 stfdx f0, r6, r2 addi r10, r10, 1 addi r2, r2, 8 cmpw cr0, r10, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 As another more dramatic example, we used to emit this: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfd f0, 8(r21) lfd f4, 8(r3) lfd f5, 8(r27) lfd f6, 8(r22) lfd f7, 8(r5) lfd f8, 8(r6) lfd f9, 8(r30) lfd f10, 8(r11) lfd f11, 8(r12) fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfd f0, 8(r4) lfd f0, 8(r25) lfd f5, 8(r26) lfd f6, 8(r23) lfd f9, 8(r28) lfd f10, 8(r10) lfd f12, 8(r9) lfd f13, 8(r29) fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfd f0, 8(r24) lfd f0, 8(r8) fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfd f0, 8(r2) addi r20, r20, 1 addi r2, r2, 8 addi r8, r8, 8 addi r10, r10, 8 addi r12, r12, 8 addi r6, r6, 8 addi r29, r29, 8 addi r28, r28, 8 addi r26, r26, 8 addi r25, r25, 8 addi r24, r24, 8 addi r5, r5, 8 addi r23, r23, 8 addi r22, r22, 8 addi r3, r3, 8 addi r9, r9, 8 addi r11, r11, 8 addi r30, r30, 8 addi r27, r27, 8 addi r21, r21, 8 addi r4, r4, 8 cmpw cr0, r20, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 we now emit: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfdx f0, r21, r20 lfdx f4, r3, r20 lfdx f5, r27, r20 lfdx f6, r22, r20 lfdx f7, r5, r20 lfdx f8, r6, r20 lfdx f9, r30, r20 lfdx f10, r11, r20 lfdx f11, r12, r20 fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfdx f0, r4, r20 lfdx f0, r25, r20 lfdx f5, r26, r20 lfdx f6, r23, r20 lfdx f9, r28, r20 lfdx f10, r10, r20 lfdx f12, r9, r20 lfdx f13, r29, r20 fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfdx f0, r24, r20 lfdx f0, r8, r20 fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfdx f0, r2, r20 addi r19, r19, 1 addi r20, r20, 8 cmpw cr0, r19, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 llvm-svn: 22722
* Suck the base value out of the UsersToProcess vector into the BasedUserChris Lattner2005-08-081-38/+38
| | | | | | class to simplify the code. Fuse two loops. llvm-svn: 22721
* Split MoveLoopVariantsToImediateField out from MoveImmediateValues. TheChris Lattner2005-08-081-23/+65
| | | | | | | first is a correctness thing, and the later is an optzn thing. This also is needed to support a future change. llvm-svn: 22720
* Factor out some common code, and be smarter about when to emit load hi/loNate Begeman2005-08-081-31/+27
| | | | | | code sequences. llvm-svn: 22719
* Allow tools with "consume after" options (like lli) to take more positionalChris Lattner2005-08-081-4/+1
| | | | | | | opts than they take directly. Thanks to John C for pointing this problem out to me! llvm-svn: 22717
* Remove getImmediateForOpcode, which is now dead.Chris Lattner2005-08-081-59/+0
| | | | | | Patch by Jim Laskey. llvm-svn: 22716
* Add new immediate handling support for mul/div.Chris Lattner2005-08-081-21/+30
| | | | | | Patch by Jim Laskey! llvm-svn: 22715
* Add support for OR/XOR/SUB immediates that are handled with the new immediateChris Lattner2005-08-081-37/+55
| | | | | | way. This allows ORI/ORIS pairs, for example. llvm-svn: 22714
* Modify the ISD::AND opcode case to use new immediate constant predicates.Chris Lattner2005-08-081-45/+36
| | | | | | | | | | Includes wider support for rotate and mask cases. Patch by Jim Laskey. I've requested that Jim add new regression tests the newly handled cases. llvm-svn: 22712
* Modify the ISD::ADD opcode case to use new immediate constant predicates.Chris Lattner2005-08-081-11/+15
| | | | | | | | Includes support for 32-bit constants using addi/addis. Patch by Jim Laskey. llvm-svn: 22711
* Modify existing support functions to use new immediate constant predicates.Chris Lattner2005-08-081-10/+9
| | | | | | Patch by Jim Laskey llvm-svn: 22710
* Add support predicates for future immediate constant changes.Chris Lattner2005-08-081-0/+71
| | | | | | Patch by Jim Laskey llvm-svn: 22709
* Move IsRunOfOnes to a more logical place and rename to a proper predicate formChris Lattner2005-08-081-24/+24
| | | | | | | | (lowercase isXXX). Patch by Jim Laskey. llvm-svn: 22708
* Fix JIT encoding of ppc mfocrf instruction; the operands were reversedNate Begeman2005-08-082-6/+20
| | | | llvm-svn: 22707
* Use the new 'moveBefore' method to simplify some code. Really, which isChris Lattner2005-08-083-6/+4
| | | | | | easier to understand? :) llvm-svn: 22706
* Reject command lines that have too many positional arguments passed (e.g.,Chris Lattner2005-08-081-1/+15
| | | | | | | | 'opt x y'). This fixes PR493. Patch contributed by Owen Anderson! llvm-svn: 22705
* Not all constants are legal immediates in load/store instructions.Chris Lattner2005-08-081-1/+7
| | | | llvm-svn: 22704
OpenPOWER on IntegriCloud