path: root/llvm/lib/CodeGen
Commit message (Author, Date, Files changed, Lines -/+)
* Lower llvm.sqrt -> fsqrt/sqrt (Chris Lattner, 2005-04-30, 1 file, -1/+17)
    llvm-svn: 21629
* Legalize FSQRT, FSIN, FCOS nodes, patch contributed by Morten Ofstad (Chris Lattner, 2005-04-28, 1 file, -0/+13)
    llvm-svn: 21606
* Add FSQRT, FSIN, FCOS nodes, patch contributed by Morten Ofstad (Chris Lattner, 2005-04-28, 1 file, -1/+4)
    llvm-svn: 21605
* Implement Value* tracking for loads and stores in the selection DAG. (Andrew Lenharth, 2005-04-27, 3 files, -33/+65)
    This enables one to use alias analysis in the backends. (TRUNK)Stores and
    (EXT|ZEXT|SEXT)Loads have an extra SDOperand which is a SrcValueSDNode which
    contains the Value*. Note that if the operation is introduced by the backend,
    it will still have the operand, but the Value* will be null.
    llvm-svn: 21599
* Fold (X > -1) | (Y > -1) --> (X&Y > -1) (Chris Lattner, 2005-04-26, 1 file, -1/+3)
    llvm-svn: 21552
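For illustration only (not part of the commit), the identity behind this fold can be stated in C: the sign bit of X&Y is set exactly when both sign bits are set, so "either value is non-negative" is the same as "X&Y is non-negative". The helper name "check" is hypothetical.

    #include <assert.h>
    /* a minimal sketch of the identity on signed ints */
    void check(int X, int Y) {
        int lhs = (X > -1) | (Y > -1);   /* at least one sign bit is clear */
        int rhs = (X & Y) > -1;          /* sign bit of X&Y is clear       */
        assert(lhs == rhs);
    }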
* implement some more logical compares with constants, so that: (Chris Lattner, 2005-04-25, 1 file, -7/+20)

    int foo1(int x, int y) {
        int t1 = x >= 0;
        int t2 = y >= 0;
        return t1 & t2;
    }
    int foo2(int x, int y) {
        int t1 = x == -1;
        int t2 = y == -1;
        return t1 & t2;
    }

    produces:

    _foo1:
            or r2, r4, r3
            srwi r2, r2, 31
            xori r3, r2, 1
            blr
    _foo2:
            and r2, r4, r3
            addic r2, r2, 1
            li r2, 0
            addze r3, r2
            blr

    instead of:

    _foo1:
            srwi r2, r4, 31
            xori r2, r2, 1
            srwi r3, r3, 31
            xori r3, r3, 1
            and r3, r2, r3
            blr
    _foo2:
            addic r2, r4, 1
            li r2, 0
            addze r2, r2
            addic r3, r3, 1
            li r3, 0
            addze r3, r3
            and r3, r2, r3
            blr

    llvm-svn: 21547
* Codegen x < 0 | y < 0 as (x|y) < 0. This allows us to compile this to: (Chris Lattner, 2005-04-25, 1 file, -1/+4)

    _foo:
            or r2, r4, r3
            srwi r3, r2, 31
            blr

    instead of:

    _foo:
            srwi r2, r4, 31
            srwi r3, r3, 31
            or r3, r2, r3
            blr

    llvm-svn: 21544
* Convert tabs to spaces (Misha Brukman, 2005-04-22, 5 files, -18/+13)
    llvm-svn: 21439
* Remove trailing whitespace (Misha Brukman, 2005-04-21, 25 files, -227/+227)
    llvm-svn: 21420
* Improve and elimination. On PPC, for: (Chris Lattner, 2005-04-21, 1 file, -6/+26)

    bool %test(int %X) {
        %Y = and int %X, 8
        %Z = setne int %Y, 0
        ret bool %Z
    }

    we now generate this:

            rlwinm r2, r3, 0, 28, 28
            srwi r3, r2, 3

    instead of this:

            rlwinm r2, r3, 0, 28, 28
            srwi r2, r2, 3
            rlwinm r3, r2, 0, 31, 31

    I'll leave it to Nate to get it down to one instruction. :)

    llvm-svn: 21391
* Fold (x & 8) != 0 and (x & 8) == 8 into (x & 8) >> 3. (Chris Lattner, 2005-04-21, 1 file, -0/+22)

    This turns this PPC code:

            rlwinm r2, r3, 0, 28, 28
            cmpwi cr7, r2, 8
            mfcr r2
            rlwinm r3, r2, 31, 31, 31

    into this:

            rlwinm r2, r3, 0, 28, 28
            srwi r2, r2, 3
            rlwinm r3, r2, 0, 31, 31

    Next up, nuking the extra and.

    llvm-svn: 21390
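For illustration (not from the commit), the identity in C: with a single-bit mask, either compare reduces to shifting the masked bit down into bit 0. The helper name "check" is hypothetical.

    #include <assert.h>
    void check(int x) {
        assert(((x & 8) != 0) == ((x & 8) >> 3));   /* masked value is 0 or 8 */
        assert(((x & 8) == 8) == ((x & 8) >> 3));   /* so >> 3 yields 0 or 1  */
    }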
* Fold setcc of MVT::i1 operands into logical operations (Chris Lattner, 2005-04-18, 1 file, -0/+39)
    llvm-svn: 21319
* Another minor simplification: handle setcc (zero_extend x), c -> setcc(x, c') (Chris Lattner, 2005-04-18, 1 file, -0/+45)
    llvm-svn: 21318
* Another simple xform (Chris Lattner, 2005-04-18, 1 file, -0/+8)
    llvm-svn: 21317
* Fold: (Chris Lattner, 2005-04-18, 1 file, -0/+11)

    // (X != 0) | (Y != 0) -> (X|Y != 0)
    // (X == 0) & (Y == 0) -> (X|Y == 0)

    Compiling this:

    int %bar(int %a, int %b) {
    entry:
        %tmp.1 = setne int %a, 0
        %tmp.2 = setne int %b, 0
        %tmp.3 = or bool %tmp.1, %tmp.2
        %retval = cast bool %tmp.3 to int
        ret int %retval
    }

    to this:

    _bar:
            or r2, r3, r4
            addic r3, r2, -1
            subfe r3, r3, r2
            blr

    instead of:

    _bar:
            addic r2, r3, -1
            subfe r2, r2, r3
            addic r3, r4, -1
            subfe r3, r3, r4
            or r3, r2, r3
            blr

    llvm-svn: 21316
* Make the AND elimination operation recursive and significantly more powerful, eliminating an and for Nate's testcase: (Chris Lattner, 2005-04-18, 1 file, -26/+57)

    int %bar(int %a, int %b) {
    entry:
        %tmp.1 = setne int %a, 0
        %tmp.2 = setne int %b, 0
        %tmp.3 = or bool %tmp.1, %tmp.2
        %retval = cast bool %tmp.3 to int
        ret int %retval
    }

    generating:

    _bar:
            addic r2, r3, -1
            subfe r2, r2, r3
            addic r3, r4, -1
            subfe r3, r3, r4
            or r3, r2, r3
            blr

    instead of:

    _bar:
            addic r2, r3, -1
            subfe r2, r2, r3
            addic r3, r4, -1
            subfe r3, r3, r4
            or r2, r2, r3
            rlwinm r3, r2, 0, 31, 31
            blr

    llvm-svn: 21315
* Add a couple missing transforms in getSetCC that were triggering assertions in the PPC Pattern ISel (Nate Begeman, 2005-04-14, 1 file, -1/+8)
    llvm-svn: 21297
* Disable the broken fold of shift + sz[ext] for now (Nate Begeman, 2005-04-13, 1 file, -7/+30)
    Move the transform for select (a < 0) ? b : 0 into the dag from ppc isel.
    Enable the dag to fold and (setcc, 1) -> setcc for targets where setcc
    always produces zero or one.
    llvm-svn: 21291
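For illustration (not from the commit), the reasoning behind the and (setcc, 1) -> setcc fold, sketched in C for a target whose comparison results are always 0 or 1:

    int lt(int a, int b) {
        int cmp = (a < b);     /* a setcc-style result: always 0 or 1      */
        return cmp & 1;        /* the mask changes nothing, so it can fold */
    }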
* fix an infinite loop (Chris Lattner, 2005-04-13, 1 file, -1/+1)
    llvm-svn: 21289
* fix some serious miscompiles on ia64, alpha, and ppc (Chris Lattner, 2005-04-13, 1 file, -1/+1)
    llvm-svn: 21288
* avoid work when possible, perhaps fix the problem nate and andrew are seeing with != 0 comparisons vanishing. (Chris Lattner, 2005-04-13, 1 file, -0/+1)
    llvm-svn: 21287
* Implement expansion of unsigned i64 -> FP. (Chris Lattner, 2005-04-13, 1 file, -2/+31)
    Note that this probably only works for little endian targets, but is
    enough to get siod working :)
    llvm-svn: 21280
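For illustration, a C sketch of one common way to expand an unsigned 64-bit integer to floating point; this is a well-known expansion and not necessarily the exact sequence this commit emits. The function name is illustrative only.

    /* convert as signed, then correct if the top bit was set */
    double u64_to_f64(unsigned long long x) {
        double d = (double)(long long)x;     /* signed i64 -> double            */
        if ((long long)x < 0)                /* high bit set: signed value was  */
            d += 18446744073709551616.0;     /* 2^64 too small, so add it back  */
        return d;
    }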
* Make expansion of uint->fp cast assert out instead of infinitely recurse. (Chris Lattner, 2005-04-13, 1 file, -1/+1)
    llvm-svn: 21275
* add back the optimization that Nate added for shl X, (zext_inreg y) (Chris Lattner, 2005-04-13, 1 file, -2/+23)
    llvm-svn: 21273
* Oops, remove these too. (Chris Lattner, 2005-04-13, 1 file, -6/+2)
    llvm-svn: 21272
* Instead of making ZERO_EXTEND_INREG nodes, use the helper method in SelectionDAG to do the job with AND. (Chris Lattner, 2005-04-13, 1 file, -31/+22)
    Don't legalize Z_E_I anymore as it is gone.
    llvm-svn: 21266
* Remove all foldings of ZERO_EXTEND_INREG, moving them to work for AND nodes instead. (Chris Lattner, 2005-04-13, 1 file, -41/+46)
    Overall, this increases the amount of folding we can do.
    llvm-svn: 21265
* Fold shift x, [sz]ext(y) -> shift x, y (Nate Begeman, 2005-04-12, 1 file, -0/+16)
    llvm-svn: 21262
* Fold shift by size larger than type size to undef (Nate Begeman, 2005-04-12, 2 files, -19/+5)
    Make llvm undef values generate ISD::UNDEF nodes.
    llvm-svn: 21261
* promote extload i1 -> extload i8 (Chris Lattner, 2005-04-12, 1 file, -2/+10)
    llvm-svn: 21258
* Remove some redundant checks, add a couple of new ones. This allows us to compile this: (Chris Lattner, 2005-04-12, 1 file, -7/+5)

    int foo (unsigned long a, unsigned long long g) { return a >= g; }

    To:

    foo:
            movl 8(%esp), %eax
            cmpl %eax, 4(%esp)
            setae %al
            cmpl $0, 12(%esp)
            sete %cl
            andb %al, %cl
            movzbl %cl, %eax
            ret

    instead of:

    foo:
            movl 8(%esp), %eax
            cmpl %eax, 4(%esp)
            setae %al
            movzbw %al, %cx
            movl 12(%esp), %edx
            cmpl $0, %edx
            sete %al
            movzbw %al, %ax
            cmpl $0, %edx
            cmove %cx, %ax
            movzbl %al, %eax
            ret

    llvm-svn: 21244
* Emit comparisons against the sign bit better. Codegen this: (Chris Lattner, 2005-04-12, 1 file, -0/+10)

    bool %test1(long %X) {
        %A = setlt long %X, 0
        ret bool %A
    }

    like this:

    test1:
            cmpl $0, 8(%esp)
            setl %al
            movzbl %al, %eax
            ret

    instead of:

    test1:
            movl 8(%esp), %ecx
            cmpl $0, %ecx
            setl %al
            movzbw %al, %ax
            cmpl $0, 4(%esp)
            setb %dl
            movzbw %dl, %dx
            cmpl $0, %ecx
            cmove %dx, %ax
            movzbl %al, %eax
            ret

    llvm-svn: 21243
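For illustration (not from the commit), the identity being exploited: on a 32-bit target, a signed "less than zero" test of a 64-bit value depends only on the high word. A C sketch, assuming two's complement and an arithmetic right shift; the function name is illustrative.

    int is_negative(long long X) {
        int hi = (int)(X >> 32);   /* high 32 bits            */
        return hi < 0;             /* same truth value as X < 0 */
    }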
* Emit long comparison against -1 better. Instead of this (x86): (Chris Lattner, 2005-04-12, 1 file, -0/+10)

    test2:
            movl 8(%esp), %eax
            notl %eax
            movl 4(%esp), %ecx
            notl %ecx
            orl %eax, %ecx
            cmpl $0, %ecx
            sete %al
            movzbl %al, %eax
            ret

    or this (PPC):

    _test2:
            nor r2, r4, r4
            nor r3, r3, r3
            or r2, r2, r3
            cntlzw r2, r2
            srwi r3, r2, 5
            blr

    Emit this:

    test2:
            movl 8(%esp), %eax
            andl 4(%esp), %eax
            cmpl $-1, %eax
            sete %al
            movzbl %al, %eax
            ret

    or this:

    _test2:
    .LBB_test2_0:   ;
            and r2, r4, r3
            cmpwi cr0, r2, -1
            li r3, 1
            li r2, 0
            beq .LBB_test2_2        ;
    .LBB_test2_1:   ;
            or r3, r2, r2
    .LBB_test2_2:   ;
            blr

    it seems like the PPC isel could do better for R32 == -1 case.

    llvm-svn: 21242
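Again for illustration only, the identity behind the improved X == -1 test: AND the two 32-bit halves and compare against all-ones; the result is all-ones exactly when both halves are. The helper name is hypothetical.

    int is_all_ones(long long X) {
        unsigned lo = (unsigned)X;
        unsigned hi = (unsigned)((unsigned long long)X >> 32);
        return (lo & hi) == 0xFFFFFFFFu;   /* true exactly when X == -1 */
    }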
* canonicalize x <u 1 -> x == 0. On this testcase: (Chris Lattner, 2005-04-12, 1 file, -0/+9)

    unsigned long long g;
    unsigned long foo (unsigned long a) { return (a >= g) ? 1 : 0; }

    It changes the ppc code from:

    _foo:
    .LBB_foo_0:     ; entry
            mflr r11
            stw r11, 8(r1)
            bl "L00000$pb"
    "L00000$pb":
            mflr r2
            addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
            lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
            lwz r4, 0(r2)
            lwz r2, 4(r2)
            cmplw cr0, r3, r2
            li r2, 1
            li r3, 0
            bge .LBB_foo_2  ; entry
    .LBB_foo_1:     ; entry
            or r2, r3, r3
    .LBB_foo_2:     ; entry
            cmplwi cr0, r4, 1
            li r3, 1
            li r5, 0
            blt .LBB_foo_4  ; entry
    .LBB_foo_3:     ; entry
            or r3, r5, r5
    .LBB_foo_4:     ; entry
            cmpwi cr0, r4, 0
            beq .LBB_foo_6  ; entry
    .LBB_foo_5:     ; entry
            or r2, r3, r3
    .LBB_foo_6:     ; entry
            rlwinm r3, r2, 0, 31, 31
            lwz r11, 8(r1)
            mtlr r11
            blr

    to:

    _foo:
    .LBB_foo_0:     ; entry
            mflr r11
            stw r11, 8(r1)
            bl "L00000$pb"
    "L00000$pb":
            mflr r2
            addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
            lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
            lwz r4, 0(r2)
            lwz r2, 4(r2)
            cmplw cr0, r3, r2
            li r2, 1
            li r3, 0
            bge .LBB_foo_2  ; entry
    .LBB_foo_1:     ; entry
            or r2, r3, r3
    .LBB_foo_2:     ; entry
            cntlzw r3, r4
            srwi r3, r3, 5
            cmpwi cr0, r4, 0
            beq .LBB_foo_4  ; entry
    .LBB_foo_3:     ; entry
            or r2, r3, r3
    .LBB_foo_4:     ; entry
            rlwinm r3, r2, 0, 31, 31
            lwz r11, 8(r1)
            mtlr r11
            blr

    llvm-svn: 21241
* Teach the dag mechanism that this: (Chris Lattner, 2005-04-11, 1 file, -2/+21)

    long long test2(unsigned A, unsigned B) {
        return ((unsigned long long)A << 32) + B;
    }

    is equivalent to this:

    long long test1(unsigned A, unsigned B) {
        return ((unsigned long long)A << 32) | B;
    }

    Now they are both codegen'd to this on ppc:

    _test2:
            blr

    or this on x86:

    test2:
            movl 4(%esp), %edx
            movl 8(%esp), %eax
            ret

    llvm-svn: 21231
* Fix expansion of shifts by exactly NVT bits on arch's (like X86) that have masking shifts. (Chris Lattner, 2005-04-11, 1 file, -0/+10)

    This fixes the miscompilation of this:

    long long test1(unsigned A, unsigned B) {
        return ((unsigned long long)A << 32) | B;
    }

    into this:

    test1:
            movl 4(%esp), %edx
            movl %edx, %eax
            orl 8(%esp), %eax
            ret

    allowing us to generate this instead:

    test1:
            movl 4(%esp), %edx
            movl 8(%esp), %eax
            ret

    llvm-svn: 21230
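For illustration (not the commit's code), a C sketch of expanding a 64-bit left shift into 32-bit halves; it shows why a shift amount of exactly 32 needs explicit handling on targets whose shift instructions mask the amount, where a 32-bit shift by 32 silently becomes a shift by 0. The function name is illustrative.

    unsigned long long shl64(unsigned long long x, unsigned n) {  /* assumes n < 64 */
        unsigned lo = (unsigned)x, hi = (unsigned)(x >> 32);
        unsigned out_lo, out_hi;
        if (n >= 32) {                /* covers n == 32 without a 32-bit shift by 32 */
            out_hi = lo << (n - 32);
            out_lo = 0;
        } else if (n == 0) {
            out_hi = hi;
            out_lo = lo;
        } else {
            out_hi = (hi << n) | (lo >> (32 - n));
            out_lo = lo << n;
        }
        return ((unsigned long long)out_hi << 32) | out_lo;
    }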
* Fix libcall code to not pass a NULL Chain to LowerCallTo (Nate Begeman, 2005-04-11, 1 file, -5/+30)
    Fix libcall code to not crash or assert looking for an ADJCALLSTACKUP node
    when it is known that there is no ADJCALLSTACKDOWN to match.
    Expand i64 multiply when ISD::MULHU is legal for the target.
    llvm-svn: 21214
* Don't bother sign/zext_inreg'ing the result of an and operation if we know the result doesn't change as a result of the extend. (Chris Lattner, 2005-04-10, 1 file, -0/+19)

    This improves codegen for Alpha on this testcase:

    int %a(ushort* %i) {
        %tmp.1 = load ushort* %i
        %tmp.2 = cast ushort %tmp.1 to int
        %tmp.4 = and int %tmp.2, 1
        ret int %tmp.4
    }

    Generating:

    a:
            ldgp $29, 0($27)
            ldwu $0,0($16)
            and $0,1,$0
            ret $31,($26),1

    instead of:

    a:
            ldgp $29, 0($27)
            ldwu $0,0($16)
            and $0,1,$0
            addl $0,0,$0
            ret $31,($26),1

    btw, alpha really should switch to livein/outs for args :)

    llvm-svn: 21213
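For illustration (not from the commit), the reasoning in C: once the AND has cleared every bit the in-register extension would touch, the extend is a no-op and can be dropped, which is why the redundant extend in the first Alpha listing above disappears. The function name is illustrative.

    unsigned low_bit(unsigned short v) {
        unsigned t = v & 1u;        /* bits 1..31 are already zero after the AND     */
        unsigned e = t & 0xFFFFu;   /* zero-extend-in-register from 16 bits: no-op   */
        return e;                   /* identical to t, so the extend can be removed  */
    }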
* Teach legalize to deal with targets that don't support some SEXTLOAD/ZEXTLOADs (Chris Lattner, 2005-04-10, 1 file, -13/+38)
    llvm-svn: 21212
* don't zextload fp values! (Chris Lattner, 2005-04-10, 1 file, -1/+4)
    llvm-svn: 21209
* Until we have a dag combiner, promote using zextload's instead of extloads. (Chris Lattner, 2005-04-10, 1 file, -1/+2)
    This gives the optimizer a bit of information about the top-part of the value.
    llvm-svn: 21205
* Fold zext_inreg(zextload), likewise for sext's (Chris Lattner, 2005-04-10, 1 file, -0/+6)
    llvm-svn: 21204
* add a simple xform (Chris Lattner, 2005-04-10, 1 file, -0/+6)
    llvm-svn: 21203
* Fix a thinko. If the operand is promoted, pass the promoted value into the new zero extend, not the original operand. (Chris Lattner, 2005-04-10, 2 files, -1/+5)
    This fixes cast bool -> long on ppc.
    Add an unrelated fixme.
    llvm-svn: 21196
* add a little peephole optimization. This allows us to codegen: (Chris Lattner, 2005-04-09, 1 file, -0/+11)

    int a(short i) { return i & 1; }

    as

    _a:
            andi. r3, r3, 1
            blr

    instead of:

    _a:
            rlwinm r2, r3, 0, 16, 31
            andi. r3, r2, 1
            blr

    on ppc. It should also help the other risc targets.

    llvm-svn: 21189
* there is no need to remove this instruction, linscan does it already as it removes noop moves. (Chris Lattner, 2005-04-09, 1 file, -5/+0)
    llvm-svn: 21183
* Adjust live intervals to support a livein set (Chris Lattner, 2005-04-09, 1 file, -2/+44)
    llvm-svn: 21182
* Consider the livein/out set for a function, allowing targets to not have to use ugly imp_def/imp_uses for arguments and return values. (Chris Lattner, 2005-04-09, 1 file, -0/+20)
    llvm-svn: 21180
* recognize some patterns as fabs operations, so that fabs at the source level is deconstructed then reconstructed here. (Chris Lattner, 2005-04-09, 1 file, -0/+21)
    This catches 19 fabs's in 177.mesa, 9 in 168.wupwise, 5 in 171.swim,
    3 in 172.mgrid, and 14 in 173.applu out of specfp2000.
    This allows the X86 code generator to make MUCH better code than before
    for each of these and saves one instr on ppc.
    This depends on the previous CFE patch to expose these correctly.
    llvm-svn: 21171
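For illustration (not the commit's code), the kind of source-level idioms involved: a front end lowers fabs into a compare-and-select like the functions below, which the code generator can now match back to a single fabs (ignoring the sign of negative zero). The names are hypothetical.

    double my_fabs1(double x) { return x >= 0 ? x : -x; }
    double my_fabs2(double x) { return x < 0 ? -x : x; }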
* Emit BRCONDTWOWAY when possible. (Chris Lattner, 2005-04-09, 1 file, -7/+6)
    llvm-svn: 21167