path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* Refactor this code a bit and make it more general. This now compiles: | Chris Lattner | 2005-09-18 | 1 | -24/+53
    struct S { unsigned int i : 6, j : 11, k : 15; } b;
    void plus2 (unsigned int x) { b.j += x; }

    To:

    _plus2:
            lis r2, ha16(L_b$non_lazy_ptr)
            lwz r2, lo16(L_b$non_lazy_ptr)(r2)
            lwz r4, 0(r2)
            slwi r3, r3, 6
            add r3, r4, r3
            rlwimi r3, r4, 0, 26, 14
            stw r3, 0(r2)
            blr

    instead of:

    _plus2:
            lis r2, ha16(L_b$non_lazy_ptr)
            lwz r2, lo16(L_b$non_lazy_ptr)(r2)
            lwz r4, 0(r2)
            rlwinm r5, r4, 26, 21, 31
            add r3, r5, r3
            rlwimi r4, r3, 6, 15, 25
            stw r4, 0(r2)
            blr

    by eliminating an 'and'.

    I'm pretty sure this is as small as we can go :)

    llvm-svn: 23386
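The `plus2` change above updates the middle bitfield without first extracting it. A minimal C sketch of why that is safe follows. It assumes the layout implied by the x86 masks in the neighboring commit (`i` in bits 0..5, `j` in bits 6..16, `k` in bits 17..31). The function and macro names are hypothetical and are not taken from the LLVM sources.

```c
#include <stdint.h>

/* Assumed layout (inferred from the masks 131008 and -131009 in the log):
   i = bits 0..5, j = bits 6..16, k = bits 17..31. Names are hypothetical. */
#define J_SHIFT 6u
#define J_MASK  0x0001FFC0u   /* 131008: the 11 bits of j */

/* Original shape: extract j, add x, shift back, mask, merge. */
uint32_t plus2_extract(uint32_t word, uint32_t x) {
    uint32_t j = (word >> J_SHIFT) & 0x7FFu;            /* 0x7FF == 2047 */
    uint32_t updated = ((j + x) << J_SHIFT) & J_MASK;
    return (word & ~J_MASK) | updated;
}

/* Shape after the change: shift x into place, add it to the whole word,
   then keep only the j bits of the sum. The low 6 bits of (x << 6) are
   zero, so no carry can enter the j field from the i field. */
uint32_t plus2_in_place(uint32_t word, uint32_t x) {
    uint32_t sum = word + (x << J_SHIFT);
    return (word & ~J_MASK) | (sum & J_MASK);
}
```

Both functions should return the same word for every input; the second simply skips the extraction of `j` before the add.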
* Compile | Chris Lattner | 2005-09-18 | 1 | -31/+70
    struct S { unsigned int i : 6, j : 11, k : 15; } b;
    void plus2 (unsigned int x) { b.j += x; }

    to:

    plus2:
            mov %EAX, DWORD PTR [b]
            mov %ECX, %EAX
            and %ECX, 131008
            mov %EDX, DWORD PTR [%ESP + 4]
            shl %EDX, 6
            add %EDX, %ECX
            and %EDX, 131008
            and %EAX, -131009
            or %EDX, %EAX
            mov DWORD PTR [b], %EDX
            ret

    instead of:

    plus2:
            mov %EAX, DWORD PTR [b]
            mov %ECX, %EAX
            shr %ECX, 6
            and %ECX, 2047
            add %ECX, DWORD PTR [%ESP + 4]
            shl %ECX, 6
            and %ECX, 131008
            and %EAX, -131009
            or %ECX, %EAX
            mov DWORD PTR [b], %ECX
            ret

    llvm-svn: 23385
* Generalize this transform, using MaskedValueIsZero, allowing us to compile: | Chris Lattner | 2005-09-18 | 1 | -14/+21
    struct S { unsigned int i : 6, j : 11, k : 15; } b;
    void plus3 (unsigned int x) { b.k += x; }

    To:

    plus3:
            mov %EAX, DWORD PTR [%ESP + 4]
            shl %EAX, 17
            add DWORD PTR [b], %EAX
            ret

    instead of:

    plus3:
            mov %EAX, DWORD PTR [%ESP + 4]
            shl %EAX, 17
            mov %ECX, DWORD PTR [b]
            add %EAX, %ECX
            and %EAX, -131072
            and %ECX, 131071
            or %ECX, %EAX
            mov DWORD PTR [b], %ECX
            ret

    llvm-svn: 23384
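The `plus3` case works because the shifted addend has its low 17 bits clear, so the mask-and-merge after the add is a no-op. The C sketch below illustrates the identity only; it is not LLVM's implementation of the transform. It assumes `k` occupies bits 17..31 (as the masks -131072 and 131071 suggest), and all names in it are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: k = bits 17..31 of the 32-bit word. Hypothetical names. */
#define K_SHIFT 17u
#define K_MASK  0xFFFE0000u   /* the same bit pattern as -131072 */

/* Old shape: add, mask the sum to the k bits, merge with the untouched
   low 17 bits of the original word. */
uint32_t plus3_masked(uint32_t word, uint32_t x) {
    uint32_t sum = (x << K_SHIFT) + word;
    return (word & ~K_MASK) | (sum & K_MASK);
}

/* New shape: a single add. (x << 17) is zero in bits 0..16, so the add
   leaves those bits of word unchanged and cannot carry into them. */
uint32_t plus3_direct(uint32_t word, uint32_t x) {
    return word + (x << K_SHIFT);
}
```

A carry out of bit 31 is discarded in both versions, which matches the wraparound of a 15-bit unsigned bitfield.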
* fix typeo | Chris Lattner | 2005-09-18 | 1 | -1/+1
    llvm-svn: 23383
* Remove unintentionally committed code | Chris Lattner | 2005-09-18 | 1 | -3/+0
    llvm-svn: 23382
* implement shift.ll:test25. This compiles: | Chris Lattner | 2005-09-18 | 1 | -3/+53
    struct S { unsigned int i : 6, j : 11, k : 15; } b;
    void plus3 (unsigned int x) { b.k += x; }

    to:

    _plus3:
            lis r2, ha16(L_b$non_lazy_ptr)
            lwz r2, lo16(L_b$non_lazy_ptr)(r2)
            lwz r3, 0(r2)
            rlwinm r4, r3, 0, 0, 14
            add r4, r4, r3
            rlwimi r4, r3, 0, 15, 31
            stw r4, 0(r2)
            blr

    instead of:

    _plus3:
            lis r2, ha16(L_b$non_lazy_ptr)
            lwz r2, lo16(L_b$non_lazy_ptr)(r2)
            lwz r4, 0(r2)
            srwi r5, r4, 17
            add r3, r5, r3
            slwi r3, r3, 17
            rlwimi r3, r4, 0, 15, 31
            stw r3, 0(r2)
            blr

    llvm-svn: 23381
* Implement add.ll:test29. Codegening: | Chris Lattner | 2005-09-18 | 1 | -0/+66
    struct S { unsigned int i : 6, j : 11, k : 15; } b;
    void plus1 (unsigned int x) { b.i += x; }

    as:

    _plus1:
            lis r2, ha16(L_b$non_lazy_ptr)
            lwz r2, lo16(L_b$non_lazy_ptr)(r2)
            lwz r4, 0(r2)
            add r3, r4, r3
            rlwimi r3, r4, 0, 0, 25
            stw r3, 0(r2)
            blr

    instead of:

    _plus1:
            lis r2, ha16(L_b$non_lazy_ptr)
            lwz r2, lo16(L_b$non_lazy_ptr)(r2)
            lwz r4, 0(r2)
            rlwinm r5, r4, 0, 26, 31
            add r3, r5, r3
            rlwimi r3, r4, 0, 0, 25
            stw r3, 0(r2)
            blr

    llvm-svn: 23379
* remove debug output | Chris Lattner | 2005-09-18 | 1 | -1/+0
    llvm-svn: 23377
* Implement or.ll:test21. This teaches instcombine to be able to turn this: | Chris Lattner | 2005-09-18 | 1 | -3/+25
    struct {
      unsigned int bit0:1;
      unsigned int ubyte:31;
    } sdata;

    void foo() {
      sdata.ubyte++;
    }

    into this:

    foo:
            add DWORD PTR [sdata], 2
            ret

    instead of this:

    foo:
            mov %EAX, DWORD PTR [sdata]
            mov %ECX, %EAX
            add %ECX, 2
            and %ECX, -2
            and %EAX, 1
            or %EAX, %ECX
            mov DWORD PTR [sdata], %EAX
            ret

    llvm-svn: 23376
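The `sdata.ubyte++` example reduces to the observation that adding 2 to the containing word never changes bit 0. The following C sketch compares the two shapes, assuming `bit0` is bit 0 and `ubyte` is bits 1..31 (as the masks -2 and 1 suggest). The function names are hypothetical.

```c
#include <stdint.h>

#define BIT0_MASK 0x1u   /* assumed position of the 1-bit field */

/* Old shape: add 2, keep bits 1..31 of the sum, merge bit 0 back in. */
uint32_t inc_ubyte_masked(uint32_t word) {
    uint32_t sum = word + 2u;
    return (sum & ~BIT0_MASK) | (word & BIT0_MASK);
}

/* New shape: a single add. Bit 0 of the constant 2 is clear, so bit 0
   of the sum always equals bit 0 of the original word. */
uint32_t inc_ubyte_direct(uint32_t word) {
    return word + 2u;
}
```

Once the mask and merge are known to be redundant, the load/add/store sequence can become a single read-modify-write `add` on x86, as in the assembly above.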
* Implement hook for ppc | Chris Lattner | 2005-09-17 | 2 | -0/+18
    llvm-svn: 23374
* More DAG combining. Still need the branch instructions, and select_cc | Nate Begeman | 2005-09-16 | 1 | -5/+425
    llvm-svn: 23371
* disable this for now | Chris Lattner | 2005-09-15 | 1 | -0/+2
    llvm-svn: 23366
* Give all operands names | Chris Lattner | 2005-09-14 | 1 | -1/+1
    llvm-svn: 23357
* give all operands names | Chris Lattner | 2005-09-14 | 2 | -12/+14
    llvm-svn: 23356
* Fix some issues exposed by more testing. XORIS had the wrong operands | Chris Lattner | 2005-09-14 | 1 | -5/+5
    specified. The various *imm operands defined by PPC are really all i32,
    even though the actual immediate is restricted to a smaller value in it.

    llvm-svn: 23352
* Fix some bugs noticed by new checking code | Chris Lattner | 2005-09-14 | 1 | -8/+14
    llvm-svn: 23350
* Fix the regression last night compiling povray | Chris Lattner | 2005-09-14 | 1 | -2/+3
    llvm-svn: 23348
* fix a major regression from my patch this afternoon | Chris Lattner | 2005-09-14 | 1 | -0/+1
    llvm-svn: 23347
* we don't need this proto any longer | Chris Lattner | 2005-09-13 | 1 | -1/+0
    llvm-svn: 23342
* move the #include for the generated code into the isel class body so we | Chris Lattner | 2005-09-13 | 1 | -1/+3
    can use/define class methods

    llvm-svn: 23339
* Change the arg lowering code to use copyfromreg from vregs associated | Chris Lattner | 2005-09-13 | 1 | -12/+17
    with incoming arguments instead of the pregs themselves. This fixes the
    scheduler from causing problems by moving a copyfromreg for an argument
    to after a select_cc node (now it can, and bad things won't happen).

    llvm-svn: 23334
* This has been moved to the target-indep code | Chris Lattner | 2005-09-13 | 1 | -22/+0
    llvm-svn: 23333
* This code is no longer needed, it is moved to the target-indep code | Chris Lattner | 2005-09-13 | 2 | -49/+0
    llvm-svn: 23332
* If a function has liveins, and if the target requested that they be plopped | Chris Lattner | 2005-09-13 | 1 | -0/+15
    into particular vregs, emit copies into the entry MBB.

    llvm-svn: 23331
* Majik numbers are bad | Chris Lattner | 2005-09-13 | 1 | -2/+2
    llvm-svn: 23330
* Remove some dead vectors | Chris Lattner | 2005-09-13 | 1 | -4/+0
    llvm-svn: 23329
* Add a simple xform to simplify array accesses with casts in the way. | Chris Lattner | 2005-09-13 | 1 | -2/+62
    This is useful for 178.galgel where resolution of dope vectors (by the
    optimizer) causes the scales to become apparent.

    llvm-svn: 23328
* Fix an issue where LSR would miss rewriting a use of an IV expression by a | Chris Lattner | 2005-09-13 | 1 | -4/+8
    PHI node that is not the original PHI.

    This fixes up a dot-product loop in galgel, speeding it up from 18.47s
    to 16.13s.

    llvm-svn: 23327
* Add a helper function, allowing us to simplify some code a bit, changing | Chris Lattner | 2005-09-13 | 1 | -39/+47
    indentation, no functionality change

    llvm-svn: 23325
* Implement a simple xform to turn code like this: | Chris Lattner | 2005-09-12 | 1 | -0/+66
    if () {
      store A -> P;
    } else {
      store B -> P;
    }

    into a PHI node with one store, in the most trivial case. This
    implements load.ll:test10.

    llvm-svn: 23324
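At the source level, the store-merging transform above corresponds roughly to the rewrite sketched below in C. This is only an analogy for the IR-level change: the conditional expression stands in for the PHI node, and both function names are hypothetical.

```c
/* Before: each branch performs its own store to the same address. */
void store_branchy(int cond, int a, int b, int *p) {
    if (cond) {
        *p = a;
    } else {
        *p = b;
    }
}

/* After: the branches only choose a value (the role the PHI node plays
   in the IR), and a single store follows the merge point. */
void store_merged(int cond, int a, int b, int *p) {
    int value = cond ? a : b;
    *p = value;
}
```

The two forms are interchangeable here because both branches store to the same pointer and do nothing else, which is the "most trivial case" the commit message describes.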
* Another load-peephole optimization: do gcse when two loads are next to | Chris Lattner | 2005-09-12 | 1 | -2/+5
    each other. This implements InstCombine/load.ll:test9

    llvm-svn: 23322
* Implement a trivial form of store->load forwarding where the store and the | Chris Lattner | 2005-09-12 | 1 | -0/+9
    load are exactly consecutive. This is picked up by other passes, but
    this triggers thousands of times in fortran programs that use static
    locals (and is thus a compile-time speedup).

    llvm-svn: 23320
* Fix a regression from last night, which caused this pass to create invalid | Chris Lattner | 2005-09-12 | 1 | -8/+6
    code for IV uses outside of loops that are not dominated by the latch
    block. We should only convert these uses to use the post-inc value if
    they ARE dominated by the latch block.

    Also use a new LoopInfo method to simplify some code.

    This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll

    llvm-svn: 23318
* Add a new getLoopLatch() method. | Chris Lattner | 2005-09-12 | 1 | -1/+25
    llvm-svn: 23315
* _test: | Chris Lattner | 2005-09-12 | 1 | -5/+19
            li r2, 0
    LBB_test_1:     ; no_exit.2
            li r5, 0
            stw r5, 0(r3)
            addi r2, r2, 1
            addi r3, r3, 4
            cmpwi cr0, r2, 701
            blt cr0, LBB_test_1     ; no_exit.2
    LBB_test_2:     ; loopexit.2.loopexit
            addi r2, r2, 1
            stw r2, 0(r4)
            blr
    [zion ~/llvm]$ cat > ~/xx

    Uses of IV's outside of the loop should use the post-incremented version
    of the IV, not the preincremented version. This helps many loops (e.g.
    in sixtrack) which used to generate code like this (this is the code
    from the dont-hoist-simple-loop-constants.ll testcase):

    _test:
            li r2, 0                **** IV starts at 0
    LBB_test_1:     ; no_exit.2
            or r5, r2, r2           **** Copy for loop exit
            li r2, 0
            stw r2, 0(r3)
            addi r3, r3, 4
            addi r2, r5, 1
            addi r6, r5, 2          **** IV+2
            cmpwi cr0, r6, 701
            blt cr0, LBB_test_1     ; no_exit.2
    LBB_test_2:     ; loopexit.2.loopexit
            addi r2, r5, 2          **** IV+2
            stw r2, 0(r4)
            blr

    And now generate code like this:

    _test:
            li r2, 1                *** IV starts at 1
    LBB_test_1:     ; no_exit.2
            li r5, 0
            stw r5, 0(r3)
            addi r2, r2, 1
            addi r3, r3, 4
            cmpwi cr0, r2, 701      *** IV.postinc + 0
            blt cr0, LBB_test_1
    LBB_test_2:     ; loopexit.2.loopexit
            stw r2, 0(r4)           *** IV.postinc + 0
            blr

    llvm-svn: 23313
* implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll. | Chris Lattner | 2005-09-10 | 1 | -1/+1
    We used to emit this code for it:

    _test:
            li r2, 1        ;; Value tying up a register for the whole loop
            li r5, 0
    LBB_test_1:     ; no_exit.2
            or r6, r5, r5
            li r5, 0
            stw r5, 0(r3)
            addi r5, r6, 1
            addi r3, r3, 4
            add r7, r2, r5  ;; should be addi r7, r5, 1
            cmpwi cr0, r7, 701
            blt cr0, LBB_test_1     ; no_exit.2
    LBB_test_2:     ; loopexit.2.loopexit
            addi r2, r6, 2
            stw r2, 0(r4)
            blr

    now we emit this:

    _test:
            li r2, 0
    LBB_test_1:     ; no_exit.2
            or r5, r2, r2
            li r2, 0
            stw r2, 0(r3)
            addi r3, r3, 4
            addi r2, r5, 1
            addi r6, r5, 2  ;; whoa, fold those adds!
            cmpwi cr0, r6, 701
            blt cr0, LBB_test_1     ; no_exit.2
    LBB_test_2:     ; loopexit.2.loopexit
            addi r2, r5, 2
            stw r2, 0(r4)
            blr

    more improvement coming.

    llvm-svn: 23306
* PowerPC cannot truncstore i1 natively | Chris Lattner | 2005-09-10 | 3 | -2/+3
    llvm-svn: 23304
* Allow targets to say they don't support truncstore i1 (which includes a mask | Chris Lattner | 2005-09-10 | 1 | -2/+15
    when storing to an 8-bit memory location), as most don't.

    llvm-svn: 23303
* Add a missing #include, patch courtesy of Baptiste Lepilleur. | Chris Lattner | 2005-09-09 | 1 | -0/+1
    llvm-svn: 23302
* Fix a problem duraid encountered on itanium where this folding: | Chris Lattner | 2005-09-09 | 1 | -2/+6
    select (x < y), 1, 0 -> (x < y)

    was performed incorrectly: the setcc returns i1 but the select returned
    i32. Add the zero extend as needed.

    llvm-svn: 23301
* Fix a crash viewing dags that have target nodes in them | Chris Lattner | 2005-09-09 | 1 | -1/+2
    llvm-svn: 23300
* I forgot that we always spill fp values as 64-bits. Implement spill folding | Chris Lattner | 2005-09-09 | 1 | -3/+10
    for FP as well. This triggers a couple dozen times on 177.mesa (for
    example).

    llvm-svn: 23299
* Fix a problem that Nate noticed, where spill code was not getting coalesced | Chris Lattner | 2005-09-09 | 2 | -0/+32
    with copies, leading to code like this:

            lwz r4, 380(r1)
            or r10, r4, r4          ;; Last use of r4

    By teaching the PPC backend how to fold spills into copies, we now get
    this code:

            lwz r10, 380(r1)

    wow. :)

    This reduces a testcase nate sent me from 1505 instructions to 1484.

    Note that this could handle FP values but doesn't currently, for
    reasons mentioned in the patch

    llvm-svn: 23298
* code cleanup | Chris Lattner | 2005-09-09 | 1 | -2/+3
    llvm-svn: 23297
* Use continue in the use-processing loop to make it clear what the early exits | Chris Lattner | 2005-09-09 | 1 | -115/+123
    are, simplify logic, and cause things to not be nested as deeply. This
    also uses MRI->areAliases instead of an explicit loop.

    No functionality change, just code cleanup.

    llvm-svn: 23296
* Last round of 2-node folds from SD.cpp. Will move on to 3 node ops such | Nate Begeman | 2005-09-09 | 2 | -2/+107
    as setcc and select next.

    llvm-svn: 23295
* remove debugging code *slaps head* | Chris Lattner | 2005-09-09 | 1 | -1/+0
    llvm-svn: 23294
* When spilling a live range that is used multiple times by one instruction, | Chris Lattner | 2005-09-09 | 1 | -9/+26
    only add a reload live range once for the instruction. This is one step
    towards fixing a regalloc pessimization that Nate noticed, but is later
    undone by the spiller (so no code is changed).

    llvm-svn: 23293
* Teach the code generator that rlwimi is commutable if the rotate amount | Chris Lattner | 2005-09-09 | 3 | -1/+38
    is zero. This lets the register allocator elide some copies in some
    cases.

    This implements CodeGen/PowerPC/rlwimi-commute.ll

    llvm-svn: 23292
* Introduce two new concepts: | Chris Lattner | 2005-09-09 | 1 | -11/+75
    1. Add support for defining Pattern's, which can match expressions when
       there is no instruction that directly implements something.
       Instructions usually implicitly define patterns.
    2. Add support for defining SDNodeXForm's, which are node
       transformations. This separates the concept of a node xform out from
       the existing predicate support.

    Using this new stuff, we add a few instruction patterns, one for
    testing, and two for OR/XOR by an arbitrary immediate.

    llvm-svn: 23286