path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* implement add_parts/sub_parts. | Chris Lattner | 2005-01-20 | 1 | -3/+8
    llvm-svn: 19714
* Add missing entry. | Chris Lattner | 2005-01-20 | 1 | -0/+1
    llvm-svn: 19712
* Fix a crash compiling 134.perl. | Chris Lattner | 2005-01-20 | 1 | -21/+41
    llvm-svn: 19711
* Support targets that do not use i8 shift amounts. | Chris Lattner | 2005-01-19 | 1 | -0/+4
    llvm-svn: 19707
* Add two optimizations. The first folds (X+Y)-X -> Y | Chris Lattner | 2005-01-19 | 1 | -2/+89
    The second folds operations into selects, e.g.
    (select C, (X+Y), (Y+Z)) -> (Y+(select C, X, Z))

    This occurs a few times across spec, e.g.

                 select  add/sub
        mesa:        83        0
        povray:       5        2
        gcc           4        2
        parser        0       22
        perlbmk      13       30
        twolf         0        3

    llvm-svn: 19706
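The two folds in r19706 can be tried out on a toy expression tree. This is an illustrative Python sketch, not the LLVM implementation: the tuple encoding and the `fold` helper are assumptions made for this example, and the select fold is only shown for the commutative `add` case.

```python
# Toy expression nodes (hypothetical encoding, not LLVM's Instruction classes):
#   ("add", a, b), ("sub", a, b), ("select", cond, if_true, if_false), or a leaf string.

def fold(expr):
    """Apply the two rewrites from the commit message, bottom-up."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [fold(a) for a in args]

    # (X + Y) - X  ->  Y   (and the mirrored (Y + X) - X  ->  Y)
    if op == "sub" and isinstance(args[0], tuple) and args[0][0] == "add":
        _, x, y = args[0]
        if x == args[1]:
            return y
        if y == args[1]:
            return x

    # select C, (X + Y), (Y + Z)  ->  Y + (select C, X, Z)
    if op == "select":
        cond, t, f = args
        if isinstance(t, tuple) and isinstance(f, tuple) and t[0] == f[0] == "add":
            for ti in (1, 2):
                for fi in (1, 2):
                    if t[ti] == f[fi]:
                        # Pull the shared operand out; select between the other two.
                        return ("add", t[ti], ("select", cond, t[3 - ti], f[3 - fi]))

    return (op, *args)


print(fold(("sub", ("add", "X", "Y"), "X")))
print(fold(("select", "C", ("add", "X", "Y"), ("add", "Y", "Z"))))
```

The select fold trades two adds for one, which is presumably why it is worth doing even though only one arm executes at run time.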
* Add an assertion that would have made more sense to duraid | Chris Lattner | 2005-01-19 | 1 | -1/+3
    llvm-svn: 19704
* Add support for targets that pass args in registers to calls. | Chris Lattner | 2005-01-19 | 1 | -6/+25
    llvm-svn: 19703
* Fold single use token factor nodes into other token factor nodes. | Chris Lattner | 2005-01-19 | 1 | -2/+10
    llvm-svn: 19701
* Realize the individual pieces of an expanded copytoreg/store/load are | Chris Lattner | 2005-01-19 | 1 | -9/+16
    independent of each other.

    llvm-svn: 19700
* Know some identities about tokenfactor nodes. | Chris Lattner | 2005-01-19 | 1 | -0/+11
    llvm-svn: 19699
* Know some simple identities. This improves codegen for (1LL << N). | Chris Lattner | 2005-01-19 | 1 | -0/+13
    llvm-svn: 19698
* Fix a problem where we were literally selecting for INCREASED register | Chris Lattner | 2005-01-19 | 1 | -8/+8
    pressure, not decreased register pressure.

    Fix problem where we accidentally swapped the operands of SHLD, which
    caused fourinarow to fail. This fixes fourinarow.

    llvm-svn: 19697
* Just in case, handle something that is both a use and a def. | Chris Lattner | 2005-01-19 | 1 | -1/+2
    llvm-svn: 19696
* When an instruction moves, make sure to update the VarInfo::Kills list as | Chris Lattner | 2005-01-19 | 1 | -3/+10
    well as all of the other stuff in livevar. This fixes the compiler crash
    on fourinarow last night.

    llvm-svn: 19695
* When commuting these instructions, make sure to actually swap the operands too. | Chris Lattner | 2005-01-19 | 1 | -1/+1
    llvm-svn: 19694
* Fix 'raise' to work with packed types. Patch by Morten Ofstad. | Chris Lattner | 2005-01-19 | 1 | -1/+1
    llvm-svn: 19693
* Implement Regression/CodeGen/X86/rotate.ll: emit rotate instructions (which | Chris Lattner | 2005-01-19 | 1 | -38/+79
    typically cost 1 cycle) instead of shld/shrd instructions (which are
    typically 6 or more cycles). This also saves code space.

    For example, instead of emitting:

    rotr:
            mov %EAX, DWORD PTR [%ESP + 4]
            mov %CL, BYTE PTR [%ESP + 8]
            shrd %EAX, %EAX, %CL
            ret
    rotli:
            mov %EAX, DWORD PTR [%ESP + 4]
            shrd %EAX, %EAX, 27
            ret

    Emit:

    rotr32:
            mov %CL, BYTE PTR [%ESP + 8]
            mov %EAX, DWORD PTR [%ESP + 4]
            ror %EAX, %CL
            ret
    rotli32:
            mov %EAX, DWORD PTR [%ESP + 4]
            ror %EAX, 27
            ret

    We also emit byte rotate instructions which do not have a sh[lr]d
    counterpart at all.

    llvm-svn: 19692
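The rewrite in r19692 rests on the fact that a double shift whose two operands are the same register is a rotate. A minimal Python model, assuming 32-bit values held as unsigned integers and shift counts strictly between 0 and 32 (the helper names are made up for this sketch):

```python
MASK32 = 0xFFFFFFFF

def shrd32(dst, src, n):
    """Model of SHRD for 0 < n < 32: shift dst right, filling the top from src's low bits."""
    return ((dst >> n) | (src << (32 - n))) & MASK32

def ror32(x, n):
    """Rotate right one bit at a time, so the result does not depend on the SHRD formula."""
    for _ in range(n):
        x = (x >> 1) | ((x & 1) << 31)
    return x

# shrd %EAX, %EAX, n computes the same value as ror %EAX, n.
for value in (0x00000001, 0x80000001, 0xDEADBEEF):
    for count in range(1, 32):
        assert shrd32(value, value, count) == ror32(value, count)
```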
* Add rotate instructions. | Chris Lattner | 2005-01-19 | 2 | -0/+75
    llvm-svn: 19690
* Match 16-bit shld/shrd instructions as well, implementing shift-double.llx:test5 | Chris Lattner | 2005-01-19 | 1 | -16/+20
    llvm-svn: 19689
* Improve coverage of the X86 instruction set by adding 16-bit shift doubles. | Chris Lattner | 2005-01-19 | 3 | -3/+45
    llvm-svn: 19687
* Teach the code generator that shrd/shld is commutable if it has an immediate. | Chris Lattner | 2005-01-19 | 3 | -0/+29
    This allows us to generate this:

    foo:
            mov %EAX, DWORD PTR [%ESP + 4]
            mov %EDX, DWORD PTR [%ESP + 8]
            shld %EDX, %EDX, 2
            shl %EAX, 2
            ret

    instead of this:

    foo:
            mov %EAX, DWORD PTR [%ESP + 4]
            mov %ECX, DWORD PTR [%ESP + 8]
            mov %EDX, %EAX
            shrd %EDX, %ECX, 30
            shl %EAX, 2
            ret

    Note the magically transmogrifying immediate.

    llvm-svn: 19686
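The "transmogrifying immediate" in r19686 follows from the definitions of the two double shifts: swapping the operands turns a SHRD by n into a SHLD by 32 - n. A small Python check, assuming 32-bit unsigned values and immediates strictly between 0 and 32 (function names are illustrative only):

```python
MASK32 = 0xFFFFFFFF

def shrd32(dst, src, n):
    """dst shifted right by n, with the vacated high bits taken from src."""
    return ((dst >> n) | (src << (32 - n))) & MASK32

def shld32(dst, src, n):
    """dst shifted left by n, with the vacated low bits taken from src."""
    return ((dst << n) | (src >> (32 - n))) & MASK32

lo, hi = 0x89ABCDEF, 0x01234567

# shrd lo, hi, 30 and shld hi, lo, 2 produce the same 32-bit result.
assert shrd32(lo, hi, 30) == shld32(hi, lo, 2)

# The same holds for every immediate in range.
for n in range(1, 32):
    assert shrd32(lo, hi, n) == shld32(hi, lo, 32 - n)
```

Because the commuted form writes its result into the register that already holds the high word, the extra register copy is no longer needed.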
* Use the TargetInstrInfo::commuteInstruction method to commute instructions | Chris Lattner | 2005-01-19 | 1 | -6/+17
    instead of doing it manually.

    llvm-svn: 19685
* Finegrainify namespacification | Chris Lattner | 2005-01-19 | 1 | -7/+20
    Add default impl of commuteInstruction
    Add notes about ugly V9 code.

    llvm-svn: 19684
* Codegen long >> 2 to this: | Chris Lattner | 2005-01-19 | 1 | -1/+85
    foo:
            mov %EAX, DWORD PTR [%ESP + 4]
            mov %EDX, DWORD PTR [%ESP + 8]
            shrd %EAX, %EDX, 2
            sar %EDX, 2
            ret

    instead of this:

    test1:
            mov %ECX, DWORD PTR [%ESP + 4]
            shr %ECX, 2
            mov %EDX, DWORD PTR [%ESP + 8]
            mov %EAX, %EDX
            shl %EAX, 30
            or %EAX, %ECX
            sar %EDX, 2
            ret

    and long << 2 to this:

    foo:
            mov %EAX, DWORD PTR [%ESP + 4]
            mov %ECX, DWORD PTR [%ESP + 8]
    ***     mov %EDX, %EAX
            shrd %EDX, %ECX, 30
            shl %EAX, 2
            ret

    instead of this:

    foo:
            mov %EAX, DWORD PTR [%ESP + 4]
            mov %ECX, %EAX
            shr %ECX, 30
            mov %EDX, DWORD PTR [%ESP + 8]
            shl %EDX, 2
            or %EDX, %ECX
            shl %EAX, 2
            ret

    The extra copy (marked ***) can be eliminated when I teach the code
    generator that shrd32rri8 is really commutative.

    llvm-svn: 19681
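The `shrd`/`sar` pair in r19681 can be modelled directly. The following Python sketch assumes a 64-bit value split into unsigned 32-bit halves and a constant shift strictly between 0 and 32; `sar32` and `sra64_by_const` are names invented for this example.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def sar32(x, n):
    """Arithmetic right shift of a 32-bit value stored as an unsigned int."""
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> n) & MASK32

def sra64_by_const(lo, hi, n):
    """Shift hi:lo right arithmetically by a constant 0 < n < 32 using 32-bit pieces."""
    new_lo = ((lo >> n) | (hi << (32 - n))) & MASK32  # shrd lo, hi, n
    new_hi = sar32(hi, n)                             # sar hi, n
    return new_lo, new_hi

# Compare against Python's arbitrary-precision arithmetic for a negative value.
value = -12345678901234
bits = value & MASK64
lo, hi = sra64_by_const(bits & MASK32, bits >> 32, 2)
assert (hi << 32) | lo == (value >> 2) & MASK64
```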
* Implement a way of expanding shifts. This applies to targets that offer | Chris Lattner | 2005-01-19 | 1 | -3/+94
    select operations or to shifts that are by a constant. This automatically
    implements (with no special code) all of the special cases for shift by 32,
    shift by < 32 and shift by > 32.

    llvm-svn: 19679
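One way to see how a select covers the "by 32", "< 32" and "> 32" cases at once is to compute both candidate results and pick between them. The Python sketch below is an assumption about the general shape of such an expansion, not a transcription of the code in r19679; `shl64_parts` is a made-up name.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def shl64_parts(lo, hi, amt):
    """Shift hi:lo left by 0 <= amt < 64 using 32-bit results and a final select."""
    small = amt & 31  # shift amount within one 32-bit part

    # Candidate results when amt < 32.
    # Note: Python defines lo >> 32 as 0; a real target needs care when small == 0.
    hi_small = ((hi << small) | (lo >> (32 - small))) & MASK32
    lo_small = (lo << small) & MASK32

    # Candidate results when amt >= 32: the low part moves into the high part.
    hi_big = (lo << small) & MASK32
    lo_big = 0

    # The select: choose a pair based on a single comparison.
    is_big = amt >= 32
    return (lo_big if is_big else lo_small, hi_big if is_big else hi_small)

value = 0x0123456789ABCDEF
for amt in (0, 5, 32, 40, 63):
    lo, hi = shl64_parts(value & MASK32, value >> 32, amt)
    assert (hi << 32) | lo == (value << amt) & MASK64
```

When the shift amount is a constant, the comparison is known at compile time, so only one of the two candidate pairs needs to be emitted.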
* X86 shifts mask the amount. | Chris Lattner | 2005-01-19 | 1 | -0/+1
    llvm-svn: 19678
* Add a hook to find out how the target handles shift amounts that are out of | Chris Lattner | 2005-01-19 | 1 | -0/+1
    range. Either they are undefined (the default), they mask the shift amount
    to the size of the register (X86, Alpha, etc), or they extend the shift
    (PPC). This defaults to undefined, which is conservatively correct.

    llvm-svn: 19677
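The three behaviours named in r19677 can be illustrated with a small model of a 32-bit left shift. The enum and function names below are hypothetical, chosen for this sketch rather than taken from the LLVM headers, and the "extend" case is modelled only for amounts below 64.

```python
from enum import Enum

MASK32 = 0xFFFFFFFF

class OutOfRangeShift(Enum):
    """Hypothetical names for the three behaviours listed in the commit message."""
    UNDEFINED = "undefined"  # the default: nothing may be assumed
    MASK = "mask"            # amount is reduced to the register width (X86, Alpha)
    EXTEND = "extend"        # shift behaves as if on a wider register (PPC)

def shl32(value, amount, behaviour):
    """Model a 32-bit left shift for 0 <= amount < 64 under the given behaviour."""
    if amount >= 32:
        if behaviour is OutOfRangeShift.MASK:
            amount &= 31
        elif behaviour is OutOfRangeShift.EXTEND:
            return 0  # every bit is shifted out of the 32-bit result
        else:
            raise ValueError("out-of-range shift amount is undefined on this target")
    return (value << amount) & MASK32

print(shl32(1, 33, OutOfRangeShift.MASK))    # amount 33 is treated as 1
print(shl32(1, 33, OutOfRangeShift.EXTEND))  # all bits shifted out
```

Treating the behaviour as undefined is the safe default because an optimizer that assumes nothing cannot be wrong on either kind of hardware.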
* Zero is cheaper than sign extend. | Chris Lattner | 2005-01-18 | 1 | -1/+1
    llvm-svn: 19675
* Code to handle FP_EXTEND is dead now. X86 doesn't support any data types to | Chris Lattner | 2005-01-18 | 1 | -4/+1
    FP_EXTEND from!

    llvm-svn: 19674
* Remove more dead code. | Chris Lattner | 2005-01-18 | 1 | -17/+0
    llvm-svn: 19673
* The selection dag code handles the promotions from F32 to F64 for us, so we | Chris Lattner | 2005-01-18 | 1 | -12/+0
    don't need to even think about F32 in the X86 code anymore.

    llvm-svn: 19672
* Fix some fixmes (promoting bools for select and brcond), fix promotion | Chris Lattner | 2005-01-18 | 1 | -8/+43
    of zero and sign extends.

    llvm-svn: 19671
* Keep track of the retval type as well. | Chris Lattner | 2005-01-18 | 1 | -2/+5
    llvm-svn: 19670
* Teach legalize to promote copy(from|to)reg, instead of making the isel pass | Chris Lattner | 2005-01-18 | 2 | -26/+13
    do it. This results in better code on X86 for floats (because if strict
    precision is not required, we can elide some more expensive double -> float
    conversions like the old isel did), and allows other targets to emit
    CopyFromRegs that are not legal for arguments.

    llvm-svn: 19668
* Fix 124.m88ksim. | Chris Lattner | 2005-01-18 | 1 | -0/+3
    llvm-svn: 19667
* Do not emit loads multiple times, potentially in the wrong places. | Chris Lattner | 2005-01-18 | 1 | -2/+2
    llvm-svn: 19661
* Minor changes. | Tanya Lattner | 2005-01-18 | 1 | -12/+22
    llvm-svn: 19660
* Eliminate bad assertions. | Chris Lattner | 2005-01-18 | 1 | -0/+2
    llvm-svn: 19659
* * Eliminate the TokenSet and just use the ExprMap for both tokens and values. | Chris Lattner | 2005-01-18 | 1 | -14/+13
    * Insert some really pedantic assertions that will notice when we emit the
      same loads more than one time, exposing bugs. This turns a miscompilation
      in bzip2 into a compile-fail. yaay.

    llvm-svn: 19658
* Teach legalize to promote SetCC results. | Chris Lattner | 2005-01-18 | 1 | -0/+8
    llvm-svn: 19657
* Allow setcc operations to have nonbool types. | Chris Lattner | 2005-01-18 | 3 | -42/+46
    llvm-svn: 19656
* Rely on the code in MatchAddress to do this work. Otherwise we fail to | Chris Lattner | 2005-01-18 | 1 | -11/+13
    match (X+Y)+(Z << 1), because we match the X+Y first, consuming the index
    register, then there is no place to put the Z.

    llvm-svn: 19652
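The failure described in r19652 is easy to reproduce with a toy matcher for the x86 `base + index*scale` form. The Python sketch below is an assumption about how a backtracking matcher can recover, not the actual MatchAddress code; `AddrMode`, `match_address` and the tuple encoding are invented for illustration.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional

@dataclass(frozen=True)
class AddrMode:
    """Simplified x86 addressing mode: base + index * scale (no displacement)."""
    base: Optional[Any] = None
    index: Optional[Any] = None
    scale: int = 1

def match_address(node, am):
    """Try to fold `node` into `am`; return the extended mode, or None on failure."""
    if isinstance(node, tuple):
        op, a, b = node
        # (shl Z, C) with C in 1..3 becomes index Z with scale 2, 4 or 8.
        if op == "shl" and am.index is None and isinstance(b, int) and 1 <= b <= 3:
            return replace(am, index=a, scale=1 << b)
        # For an add, try to fold both operands, in either order.
        # AddrMode is immutable, so a failed attempt leaves `am` untouched.
        if op == "add":
            for first, second in ((a, b), (b, a)):
                partial = match_address(first, am)
                if partial is not None:
                    full = match_address(second, partial)
                    if full is not None:
                        return full
    # Fallback: treat the whole node as one value and put it in a free slot.
    if am.base is None:
        return replace(am, base=node)
    if am.index is None:
        return replace(am, index=node, scale=1)
    return None

expr = ("add", ("add", "X", "Y"), ("shl", "Z", 1))
print(match_address(expr, AddrMode()))
```

Matching X+Y first uses up both the base and index slots, so the scaled Z has nowhere to go. Retrying with the operands in the other order lets Z take the index slot and leaves X+Y to be computed separately as the base.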
* Fix the completely broken FP constant folds for setcc's. | Chris Lattner | 2005-01-18 | 1 | -4/+4
    llvm-svn: 19651
* Fix a problem where probing for addressing modes caused expressions to be | Chris Lattner | 2005-01-18 | 1 | -33/+110
    emitted too early. In particular, this fixes
    Regression/CodeGen/X86/regpressure.ll:regpressure3.

    This also improves the 2nd basic block in 164.gzip:flush_block, which went
    from

    .LBBflush_block_1:      # loopentry.1.i
            movzx %EAX, WORD PTR [dyn_ltree + 20]
            movzx %ECX, WORD PTR [dyn_ltree + 16]
            mov DWORD PTR [%ESP + 32], %ECX
            movzx %ECX, WORD PTR [dyn_ltree + 12]
            movzx %EDX, WORD PTR [dyn_ltree + 8]
            movzx %EBX, WORD PTR [dyn_ltree + 4]
            mov DWORD PTR [%ESP + 36], %EBX
            movzx %EBX, WORD PTR [dyn_ltree]
            add DWORD PTR [%ESP + 36], %EBX
            add %EDX, DWORD PTR [%ESP + 36]
            add %ECX, %EDX
            add DWORD PTR [%ESP + 32], %ECX
            add %EAX, DWORD PTR [%ESP + 32]
            movzx %ECX, WORD PTR [dyn_ltree + 24]
            add %EAX, %ECX
            mov %ECX, 0
            mov %EDX, %ECX

    to

    .LBBflush_block_1:      # loopentry.1.i
            movzx %EAX, WORD PTR [dyn_ltree]
            movzx %ECX, WORD PTR [dyn_ltree + 4]
            add %ECX, %EAX
            movzx %EAX, WORD PTR [dyn_ltree + 8]
            add %EAX, %ECX
            movzx %ECX, WORD PTR [dyn_ltree + 12]
            add %ECX, %EAX
            movzx %EAX, WORD PTR [dyn_ltree + 16]
            add %EAX, %ECX
            movzx %ECX, WORD PTR [dyn_ltree + 20]
            add %ECX, %EAX
            movzx %EAX, WORD PTR [dyn_ltree + 24]
            add %ECX, %EAX
            mov %EAX, 0
            mov %EDX, %EAX

    ... which results in less spilling in the function.

    This change alone speeds up 164.gzip from 37.23s to 36.24s on apoc. The
    default isel takes 37.31s.

    llvm-svn: 19650
* Fix indentation. | Chris Lattner | 2005-01-17 | 1 | -19/+18
    llvm-svn: 19649
* Don't bother using max here. | Chris Lattner | 2005-01-17 | 1 | -1/+1
    llvm-svn: 19647
* Do not give token factor nodes outrageous weights | Chris Lattner | 2005-01-17 | 1 | -2/+5
    llvm-svn: 19645
* Non-volatile loads can be freely reordered against each other. This fixes | Chris Lattner | 2005-01-17 | 1 | -4/+38
    X86/reg-pressure.ll again, and allows us to do nice things in other cases.
    For example, we now codegen this sort of thing:

    int %loadload(int *%X, int* %Y) {
      %Z = load int* %Y
      %Y = load int* %X      ;; load between %Z and store
      %Q = add int %Z, 1
      store int %Q, int* %Y
      ret int %Y
    }

    Into this:

    loadload:
            mov %EAX, DWORD PTR [%ESP + 4]
            mov %EAX, DWORD PTR [%EAX]
            mov %ECX, DWORD PTR [%ESP + 8]
            inc DWORD PTR [%ECX]
            ret

    where we weren't able to form the 'inc [mem]' before. This also lets the
    instruction selector emit loads in any order it wants to, which can be good
    for register pressure as well.

    llvm-svn: 19644
* Two changes: | Chris Lattner | 2005-01-17 | 1 | -5/+74
    1. Fold [mem] += (1|-1) into inc [mem]/dec [mem] to save some icache space.
    2. Do not let token factor nodes prevent forming '[mem] op= val' folds.

    llvm-svn: 19643
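The first change in r19643 amounts to a special case in instruction choice. A minimal Python sketch of that decision, with a hypothetical `select_mem_add` helper that returns assembly text in the same Intel-style syntax as the listings in this log:

```python
def select_mem_add(addr, delta):
    """Pick an instruction for the 32-bit update `[addr] += delta`."""
    if delta == 1:
        return f"inc DWORD PTR [{addr}]"
    if delta == -1:
        return f"dec DWORD PTR [{addr}]"
    # General case: an add with an immediate operand, which needs extra bytes
    # to encode the constant.
    return f"add DWORD PTR [{addr}], {delta}"

print(select_mem_add("%ECX", 1))
print(select_mem_add("%ECX", -1))
print(select_mem_add("%ECX", 4))
```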
* Don't call SelectionDAG.getRoot() directly, go through a forwarding method. | Chris Lattner | 2005-01-17 | 1 | -21/+30
    llvm-svn: 19642