path: root/llvm/lib/CodeGen
Commit message  [Author, Date; Files, Lines changed]
...
* Implement count leading zeros (ctlz), count trailing zeros (cttz), and count
  population (ctpop). [Andrew Lenharth, 2005-05-03; 4 files, -3/+262]
  Generic lowering is implemented; however, only promotion is implemented for
  SelectionDAG at the moment. More coming soon.
  llvm-svn: 21676
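  A C sketch of the classic bit-twiddling expansions that generic lowering of
  these operations typically uses (illustrative only; the exact node sequence
  the commit emits may differ):

      /* Count set bits by summing pairs, then nibbles, then bytes. */
      unsigned ctpop32(unsigned v) {
        v = v - ((v >> 1) & 0x55555555);
        v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
        v = (v + (v >> 4)) & 0x0F0F0F0F;
        return (v * 0x01010101) >> 24;  /* byte sums collect in the top byte */
      }
      /* cttz: count the bits strictly below the lowest set bit. */
      unsigned cttz32(unsigned v) { return ctpop32(~v & (v - 1)); }
      /* ctlz: smear the highest set bit downward, then count the zeros above it. */
      unsigned ctlz32(unsigned v) {
        v |= v >> 1; v |= v >> 2; v |= v >> 4; v |= v >> 8; v |= v >> 16;
        return ctpop32(~v);
      }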
* Do not use deprecated APIs. [Alkis Evlogimenos, 2005-04-30; 1 file, -1/+1]
  llvm-svn: 21639
* Codegen and legalize sin/cos/llvm.sqrt as FSIN/FCOS/FSQRT calls.
  [Chris Lattner, 2005-04-30; 2 files, -3/+53]
  This patch was contributed by Morten Ofstad, with some minor tweaks and bug
  fixes added by me.
  llvm-svn: 21636
* Lower llvm.sqrt -> fsqrt/sqrt. [Chris Lattner, 2005-04-30; 1 file, -1/+17]
  llvm-svn: 21629
* Legalize FSQRT, FSIN, FCOS nodes; patch contributed by Morten Ofstad.
  [Chris Lattner, 2005-04-28; 1 file, -0/+13]
  llvm-svn: 21606
* Add FSQRT, FSIN, FCOS nodes; patch contributed by Morten Ofstad.
  [Chris Lattner, 2005-04-28; 1 file, -1/+4]
  llvm-svn: 21605
* Implement Value* tracking for loads and stores in the selection DAG.
  [Andrew Lenharth, 2005-04-27; 3 files, -33/+65]
  This enables one to use alias analysis in the backends. (TRUNC)Stores and
  (EXT|ZEXT|SEXT)Loads have an extra SDOperand, a SrcValueSDNode, which
  contains the Value*. Note that if the operation is introduced by the
  backend, it will still have the operand, but the Value* will be null.
  llvm-svn: 21599
* Fold (X > -1) | (Y > -1) --> ((X & Y) > -1).
  [Chris Lattner, 2005-04-26; 1 file, -1/+3]
  llvm-svn: 21552
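  Why the fold is sound, as a C identity (my illustration, not from the
  commit): x > -1 tests that the sign bit is clear, and the sign bit of
  x & y is clear exactly when either operand's sign bit is clear.

      int before(int x, int y) { return (x > -1) | (y > -1); }
      int after (int x, int y) { return (x & y) > -1; }  /* same result */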
* implement some more logical compares with constants, so that:
  [Chris Lattner, 2005-04-25; 1 file, -7/+20]

      int foo1(int x, int y) {
        int t1 = x >= 0;
        int t2 = y >= 0;
        return t1 & t2;
      }
      int foo2(int x, int y) {
        int t1 = x == -1;
        int t2 = y == -1;
        return t1 & t2;
      }

  produces:

      _foo1:
              or r2, r4, r3
              srwi r2, r2, 31
              xori r3, r2, 1
              blr
      _foo2:
              and r2, r4, r3
              addic r2, r2, 1
              li r2, 0
              addze r3, r2
              blr

  instead of:

      _foo1:
              srwi r2, r4, 31
              xori r2, r2, 1
              srwi r3, r3, 31
              xori r3, r3, 1
              and r3, r2, r3
              blr
      _foo2:
              addic r2, r4, 1
              li r2, 0
              addze r2, r2
              addic r3, r3, 1
              li r3, 0
              addze r3, r3
              and r3, r2, r3
              blr

  llvm-svn: 21547
* Codegen x < 0 | y < 0 as (x|y) < 0.
  [Chris Lattner, 2005-04-25; 1 file, -1/+4]
  This allows us to compile this to:

      _foo:
              or r2, r4, r3
              srwi r3, r2, 31
              blr

  instead of:

      _foo:
              srwi r2, r4, 31
              srwi r3, r3, 31
              or r3, r2, r3
              blr

  llvm-svn: 21544
* Convert tabs to spaces. [Misha Brukman, 2005-04-22; 5 files, -18/+13]
  llvm-svn: 21439
* Remove trailing whitespace. [Misha Brukman, 2005-04-21; 25 files, -227/+227]
  llvm-svn: 21420
* Improve AND elimination. [Chris Lattner, 2005-04-21; 1 file, -6/+26]
  On PPC, for:

      bool %test(int %X) {
              %Y = and int %X, 8
              %Z = setne int %Y, 0
              ret bool %Z
      }

  we now generate this:

      rlwinm r2, r3, 0, 28, 28
      srwi r3, r2, 3

  instead of this:

      rlwinm r2, r3, 0, 28, 28
      srwi r2, r2, 3
      rlwinm r3, r2, 0, 31, 31

  I'll leave it to Nate to get it down to one instruction. :)
  llvm-svn: 21391
* Fold (x & 8) != 0 and (x & 8) == 8 into (x & 8) >> 3.
  [Chris Lattner, 2005-04-21; 1 file, -0/+22]
  This turns this PPC code:

      rlwinm r2, r3, 0, 28, 28
      cmpwi cr7, r2, 8
      mfcr r2
      rlwinm r3, r2, 31, 31, 31

  into this:

      rlwinm r2, r3, 0, 28, 28
      srwi r2, r2, 3
      rlwinm r3, r2, 0, 31, 31

  Next up, nuking the extra and.
  llvm-svn: 21390
* Fold setcc of MVT::i1 operands into logical operations.
  [Chris Lattner, 2005-04-18; 1 file, -0/+39]
  llvm-svn: 21319
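  For one-bit operands, every comparison collapses to boolean logic. A C
  sketch of the identities involved (the commit's exact mapping may differ),
  valid for a, b in {0, 1}:

      int eq(int a, int b) { return !(a ^ b); }  /* seteq  -> not(xor) */
      int ne(int a, int b) { return a ^ b;    }  /* setne  -> xor      */
      int lt(int a, int b) { return !a & b;   }  /* setult -> ~a & b   */
      int gt(int a, int b) { return a & !b;   }  /* setugt -> a & ~b   */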
* Another minor simplification: handle setcc (zero_extend x), c -> setcc(x, c').
  [Chris Lattner, 2005-04-18; 1 file, -0/+45]
  llvm-svn: 21318
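  The idea in C terms (illustrative): if the constant fits in the narrow
  type, compare the narrow value directly; if it cannot fit, the comparison
  folds to a constant outright.

      int fits (unsigned char x) { return (unsigned)x ==  42u; }  /* -> x == 42  */
      int nofit(unsigned char x) { return (unsigned)x == 300u; }  /* -> always 0 */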
* Another simple xform. [Chris Lattner, 2005-04-18; 1 file, -0/+8]
  llvm-svn: 21317
* Fold: [Chris Lattner, 2005-04-18; 1 file, -0/+11]

      // (X != 0) | (Y != 0) -> (X|Y != 0)
      // (X == 0) & (Y == 0) -> (X|Y == 0)

  Compiling this:

      int %bar(int %a, int %b) {
      entry:
              %tmp.1 = setne int %a, 0
              %tmp.2 = setne int %b, 0
              %tmp.3 = or bool %tmp.1, %tmp.2
              %retval = cast bool %tmp.3 to int
              ret int %retval
      }

  to this:

      _bar:
              or r2, r3, r4
              addic r3, r2, -1
              subfe r3, r3, r2
              blr

  instead of:

      _bar:
              addic r2, r3, -1
              subfe r2, r2, r3
              addic r3, r4, -1
              subfe r3, r3, r4
              or r3, r2, r3
              blr

  llvm-svn: 21316
* Make the AND elimination operation recursive and significantly more powerful,
  eliminating an and for Nate's testcase:
  [Chris Lattner, 2005-04-18; 1 file, -26/+57]

      int %bar(int %a, int %b) {
      entry:
              %tmp.1 = setne int %a, 0
              %tmp.2 = setne int %b, 0
              %tmp.3 = or bool %tmp.1, %tmp.2
              %retval = cast bool %tmp.3 to int
              ret int %retval
      }

  generating:

      _bar:
              addic r2, r3, -1
              subfe r2, r2, r3
              addic r3, r4, -1
              subfe r3, r3, r4
              or r3, r2, r3
              blr

  instead of:

      _bar:
              addic r2, r3, -1
              subfe r2, r2, r3
              addic r3, r4, -1
              subfe r3, r3, r4
              or r2, r2, r3
              rlwinm r3, r2, 0, 31, 31
              blr

  llvm-svn: 21315
* Add a couple missing transforms in getSetCC that were triggering assertions
  in the PPC Pattern ISel. [Nate Begeman, 2005-04-14; 1 file, -1/+8]
  llvm-svn: 21297
* Disable the broken fold of shift + sz[ext] for now.
  [Nate Begeman, 2005-04-13; 1 file, -7/+30]
  Move the transform for select (a < 0) ? b : 0 into the DAG from the PPC isel.
  Enable the DAG to fold and (setcc, 1) -> setcc for targets where setcc
  always produces zero or one.
  llvm-svn: 21291
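  The select transform presumably targets the usual branchless form; a C
  sketch under the assumption of 32-bit int and an arithmetic right shift:

      int select_neg(int a, int b) {
        int mask = a >> 31;  /* all ones when a < 0, else all zeros */
        return mask & b;     /* == (a < 0) ? b : 0 */
      }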
* fix an infinite loop [Chris Lattner, 2005-04-13; 1 file, -1/+1]
  llvm-svn: 21289
* fix some serious miscompiles on ia64, alpha, and ppc
  [Chris Lattner, 2005-04-13; 1 file, -1/+1]
  llvm-svn: 21288
* avoid work when possible; perhaps fix the problem Nate and Andrew are
  seeing with != 0 comparisons vanishing.
  [Chris Lattner, 2005-04-13; 1 file, -0/+1]
  llvm-svn: 21287
* Implement expansion of unsigned i64 -> FP.
  [Chris Lattner, 2005-04-13; 1 file, -2/+31]
  Note that this probably only works for little-endian targets, but it is
  enough to get siod working. :)
  llvm-svn: 21280
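  A common way to expand this cast when only a signed i64 -> f64 conversion
  is available (a sketch, not necessarily the exact sequence this commit
  emits):

      double u64_to_double(unsigned long long x) {
        double d = (double)(long long)x;  /* convert as signed       */
        if ((long long)x < 0)             /* top bit was set         */
          d += 18446744073709551616.0;    /* add 2^64 to compensate  */
        return d;
      }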
* Make expansion of uint->fp cast assert out instead of infinitely recursing.
  [Chris Lattner, 2005-04-13; 1 file, -1/+1]
  llvm-svn: 21275
* add back the optimization that Nate added for shl X, (zext_inreg y)
  [Chris Lattner, 2005-04-13; 1 file, -2/+23]
  llvm-svn: 21273
* Oops, remove these too. [Chris Lattner, 2005-04-13; 1 file, -6/+2]
  llvm-svn: 21272
* Instead of making ZERO_EXTEND_INREG nodes, use the helper method in
  SelectionDAG to do the job with AND.
  [Chris Lattner, 2005-04-13; 1 file, -31/+22]
  Don't legalize Z_E_I anymore, as it is gone.
  llvm-svn: 21266
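  What the AND-based helper boils down to, in C terms (the i8 width is
  chosen for illustration): zero-extend-in-register is just a mask that
  clears everything above the low bits.

      unsigned zext_inreg_from_i8(unsigned x) { return x & 0xFF; }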
* Remove all foldings of ZERO_EXTEND_INREG, moving them to work for AND nodes
  instead. [Chris Lattner, 2005-04-13; 1 file, -41/+46]
  Overall, this increases the amount of folding we can do.
  llvm-svn: 21265
* Fold shift x, [sz]ext(y) -> shift x, y.
  [Nate Begeman, 2005-04-12; 1 file, -0/+16]
  llvm-svn: 21262
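  The fold is safe because extending a shift amount never changes the amount
  itself; both C functions below compute the same thing:

      unsigned before(unsigned x, unsigned char y) { return x << (unsigned)y; }
      unsigned after (unsigned x, unsigned char y) { return x << y; }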
* Fold a shift by an amount larger than the type size to undef.
  [Nate Begeman, 2005-04-12; 2 files, -19/+5]
  Make LLVM undef values generate ISD::UNDEF nodes.
  llvm-svn: 21261
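  The reasoning: a shift of an N-bit value by N or more bits has no defined
  result (in LLVM IR as in C), so there is nothing to preserve and the node
  can simply become ISD::UNDEF.

      unsigned oversized(unsigned x) {
        return x << 37;  /* 37 >= 32 on a 32-bit int: undefined, foldable */
      }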
* promote extload i1 -> extload i8 [Chris Lattner, 2005-04-12; 1 file, -2/+10]
  llvm-svn: 21258
* Remove some redundant checks, add a couple of new ones.
  [Chris Lattner, 2005-04-12; 1 file, -7/+5]
  This allows us to compile this:

      int foo (unsigned long a, unsigned long long g) {
        return a >= g;
      }

  to:

      foo:
              movl 8(%esp), %eax
              cmpl %eax, 4(%esp)
              setae %al
              cmpl $0, 12(%esp)
              sete %cl
              andb %al, %cl
              movzbl %cl, %eax
              ret

  instead of:

      foo:
              movl 8(%esp), %eax
              cmpl %eax, 4(%esp)
              setae %al
              movzbw %al, %cx
              movl 12(%esp), %edx
              cmpl $0, %edx
              sete %al
              movzbw %al, %ax
              cmpl $0, %edx
              cmove %cx, %ax
              movzbl %al, %eax
              ret

  llvm-svn: 21244
* Emit comparisons against the sign bit better.
  [Chris Lattner, 2005-04-12; 1 file, -0/+10]
  Codegen this:

      bool %test1(long %X) {
              %A = setlt long %X, 0
              ret bool %A
      }

  like this:

      test1:
              cmpl $0, 8(%esp)
              setl %al
              movzbl %al, %eax
              ret

  instead of:

      test1:
              movl 8(%esp), %ecx
              cmpl $0, %ecx
              setl %al
              movzbw %al, %ax
              cmpl $0, 4(%esp)
              setb %dl
              movzbw %dl, %dx
              cmpl $0, %ecx
              cmove %dx, %ax
              movzbl %al, %eax
              ret

  llvm-svn: 21243
* Emit long comparison against -1 better.
  [Chris Lattner, 2005-04-12; 1 file, -0/+10]
  Instead of this (x86):

      test2:
              movl 8(%esp), %eax
              notl %eax
              movl 4(%esp), %ecx
              notl %ecx
              orl %eax, %ecx
              cmpl $0, %ecx
              sete %al
              movzbl %al, %eax
              ret

  or this (PPC):

      _test2:
              nor r2, r4, r4
              nor r3, r3, r3
              or r2, r2, r3
              cntlzw r2, r2
              srwi r3, r2, 5
              blr

  emit this:

      test2:
              movl 8(%esp), %eax
              andl 4(%esp), %eax
              cmpl $-1, %eax
              sete %al
              movzbl %al, %eax
              ret

  or this:

      _test2:
      .LBB_test2_0:
              and r2, r4, r3
              cmpwi cr0, r2, -1
              li r3, 1
              li r2, 0
              beq .LBB_test2_2
      .LBB_test2_1:
              or r3, r2, r2
      .LBB_test2_2:
              blr

  It seems like the PPC isel could do better for the R32 == -1 case.
  llvm-svn: 21242
* canonicalize x <u 1 -> x == 0. [Chris Lattner, 2005-04-12; 1 file, -0/+9]
  On this testcase:

      unsigned long long g;
      unsigned long foo (unsigned long a) {
        return (a >= g) ? 1 : 0;
      }

  it changes the PPC code from:

      _foo:
      .LBB_foo_0:     ; entry
              mflr r11
              stw r11, 8(r1)
              bl "L00000$pb"
      "L00000$pb":
              mflr r2
              addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
              lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
              lwz r4, 0(r2)
              lwz r2, 4(r2)
              cmplw cr0, r3, r2
              li r2, 1
              li r3, 0
              bge .LBB_foo_2  ; entry
      .LBB_foo_1:     ; entry
              or r2, r3, r3
      .LBB_foo_2:     ; entry
              cmplwi cr0, r4, 1
              li r3, 1
              li r5, 0
              blt .LBB_foo_4  ; entry
      .LBB_foo_3:     ; entry
              or r3, r5, r5
      .LBB_foo_4:     ; entry
              cmpwi cr0, r4, 0
              beq .LBB_foo_6  ; entry
      .LBB_foo_5:     ; entry
              or r2, r3, r3
      .LBB_foo_6:     ; entry
              rlwinm r3, r2, 0, 31, 31
              lwz r11, 8(r1)
              mtlr r11
              blr

  to:

      _foo:
      .LBB_foo_0:     ; entry
              mflr r11
              stw r11, 8(r1)
              bl "L00000$pb"
      "L00000$pb":
              mflr r2
              addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
              lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
              lwz r4, 0(r2)
              lwz r2, 4(r2)
              cmplw cr0, r3, r2
              li r2, 1
              li r3, 0
              bge .LBB_foo_2  ; entry
      .LBB_foo_1:     ; entry
              or r2, r3, r3
      .LBB_foo_2:     ; entry
              cntlzw r3, r4
              srwi r3, r3, 5
              cmpwi cr0, r4, 0
              beq .LBB_foo_4  ; entry
      .LBB_foo_3:     ; entry
              or r2, r3, r3
      .LBB_foo_4:     ; entry
              rlwinm r3, r2, 0, 31, 31
              lwz r11, 8(r1)
              mtlr r11
              blr

  llvm-svn: 21241
* Teach the dag mechanism that this: [Chris Lattner, 2005-04-11; 1 file, -2/+21]

      long long test2(unsigned A, unsigned B) {
              return ((unsigned long long)A << 32) + B;
      }

  is equivalent to this:

      long long test1(unsigned A, unsigned B) {
              return ((unsigned long long)A << 32) | B;
      }

  Now they are both codegen'd to this on ppc:

      _test2:
              blr

  or this on x86:

      test2:
              movl 4(%esp), %edx
              movl 8(%esp), %eax
              ret

  llvm-svn: 21231
* Fix expansion of shifts by exactly NVT bits on architectures (like X86) that
  have masking shifts. [Chris Lattner, 2005-04-11; 1 file, -0/+10]
  This fixes the miscompilation of this:

      long long test1(unsigned A, unsigned B) {
              return ((unsigned long long)A << 32) | B;
      }

  into this:

      test1:
              movl 4(%esp), %edx
              movl %edx, %eax
              orl 8(%esp), %eax
              ret

  allowing us to generate this instead:

      test1:
              movl 4(%esp), %edx
              movl 8(%esp), %eax
              ret

  llvm-svn: 21230
* Fix libcall code to not pass a NULL Chain to LowerCallTo.
  [Nate Begeman, 2005-04-11; 1 file, -5/+30]
  Fix libcall code to not crash or assert looking for an ADJCALLSTACKUP node
  when it is known that there is no ADJCALLSTACKDOWN to match.
  Expand i64 multiply when ISD::MULHU is legal for the target.
  llvm-svn: 21214
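  How an i64 multiply expansion can work from 32-bit pieces, assuming a
  mulhu32 primitive standing in for ISD::MULHU (a sketch, not the commit's
  exact code):

      #include <stdint.h>

      /* High 32 bits of a 32x32 -> 64 product; the MULHU operation. */
      static uint32_t mulhu32(uint32_t a, uint32_t b) {
        return (uint32_t)(((uint64_t)a * b) >> 32);
      }

      /* Only the low 64 bits of the product are needed, so the
         ahi * bhi term can be dropped entirely. */
      static uint64_t mul64(uint64_t a, uint64_t b) {
        uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
        uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);
        uint32_t lo = alo * blo;
        uint32_t hi = mulhu32(alo, blo) + alo * bhi + ahi * blo;
        return ((uint64_t)hi << 32) | lo;
      }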
* Don't bother sign/zext_inreg'ing the result of an and operation if we know
  the result doesn't change as a result of the extend.
  [Chris Lattner, 2005-04-10; 1 file, -0/+19]
  This improves codegen for Alpha on this testcase:

      int %a(ushort* %i) {
              %tmp.1 = load ushort* %i
              %tmp.2 = cast ushort %tmp.1 to int
              %tmp.4 = and int %tmp.2, 1
              ret int %tmp.4
      }

  Generating:

      a:
              ldgp $29, 0($27)
              ldwu $0,0($16)
              and $0,1,$0
              ret $31,($26),1

  instead of:

      a:
              ldgp $29, 0($27)
              ldwu $0,0($16)
              and $0,1,$0
              addl $0,0,$0
              ret $31,($26),1

  By the way, Alpha really should switch to livein/outs for args. :)
  llvm-svn: 21213
* Teach legalize to deal with targets that don't support some SEXTLOAD/ZEXTLOADs.
  [Chris Lattner, 2005-04-10; 1 file, -13/+38]
  llvm-svn: 21212
* don't zextload fp values! [Chris Lattner, 2005-04-10; 1 file, -1/+4]
  llvm-svn: 21209
* Until we have a dag combiner, promote using zextload's instead of extloads.
  [Chris Lattner, 2005-04-10; 1 file, -1/+2]
  This gives the optimizer a bit of information about the top part of the value.
  llvm-svn: 21205
* Fold zext_inreg(zextload); likewise for sext's.
  [Chris Lattner, 2005-04-10; 1 file, -0/+6]
  llvm-svn: 21204
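  Why the fold is safe, in C terms (illustrative): a zero-extending load
  already clears the high bits, so a following mask is redundant.

      unsigned load_then_mask(const unsigned char *p) {
        unsigned v = *p;  /* zextload i8 -> i32: bits 8..31 already zero */
        return v & 0xFF;  /* redundant mask; folds away to just v        */
      }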
* add a simple xform [Chris Lattner, 2005-04-10; 1 file, -0/+6]
  llvm-svn: 21203
* Fix a thinko. [Chris Lattner, 2005-04-10; 2 files, -1/+5]
  If the operand is promoted, pass the promoted value into the new zero
  extend, not the original operand. This fixes cast bool -> long on ppc.
  Add an unrelated fixme.
  llvm-svn: 21196
* add a little peephole optimization. [Chris Lattner, 2005-04-09; 1 file, -0/+11]
  This allows us to codegen:

      int a(short i) {
        return i & 1;
      }

  as:

      _a:
              andi. r3, r3, 1
              blr

  instead of:

      _a:
              rlwinm r2, r3, 0, 16, 31
              andi. r3, r2, 1
              blr

  on ppc. It should also help the other RISC targets.
  llvm-svn: 21189
* there is no need to remove this instruction; linscan does it already as it
  removes noop moves. [Chris Lattner, 2005-04-09; 1 file, -5/+0]
  llvm-svn: 21183
* Adjust live intervals to support a livein set.
  [Chris Lattner, 2005-04-09; 1 file, -2/+44]
  llvm-svn: 21182