path: root/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Commit message (Author, Date; files changed, -deleted/+added lines)
* Initialize some variables the compiler warns about. (Reid Spencer, 2006-07-25; 1 file, -2/+2)
  llvm-svn: 29277
* If a shuffle is a splat, check if the argument is a build_vector with all
  elements being the same. If so, return the argument. (Evan Cheng, 2006-07-21; 1 file, -8/+90)
  llvm-svn: 29242
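  For intuition, a minimal standalone sketch of the uniformity test this fold
  relies on, with a build_vector modeled as a plain list of constants (toy
  code, not the actual DAGCombiner implementation):

    #include <algorithm>
    #include <vector>

    // Toy model: a build_vector whose elements are all identical is itself
    // a splat, so shuffling it with a splat mask just returns the vector.
    bool isUniformBuildVector(const std::vector<int> &Elts) {
      return !Elts.empty() &&
             std::all_of(Elts.begin(), Elts.end(),
                         [&](int E) { return E == Elts.front(); });
    }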
* If a shuffle is unary, i.e. one of the vector arguments is not needed, turn
  that operand into an undef and adjust the mask accordingly. (Evan Cheng, 2006-07-20; 1 file, -10/+56)
  llvm-svn: 29232
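  A toy model of the mask adjustment described here, assuming the usual
  two-input convention where indices [0, N) select from the first operand and
  [N, 2N) from the second (illustrative only, not the commit's code):

    #include <algorithm>
    #include <vector>

    // If every mask index falls in [N, 2N), only the second operand is
    // live; remap the indices into [0, N) so the caller can replace the
    // first operand with undef.
    bool canonicalizeUnaryShuffle(std::vector<int> &Mask, int N) {
      bool AllFromRHS = std::all_of(Mask.begin(), Mask.end(),
                                    [N](int I) { return I >= N; });
      if (!AllFromRHS)
        return false;
      for (int &I : Mask)
        I -= N;        // now selects from the surviving operand
      return true;     // caller turns the dead operand into undef
    }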
* Wrap lines to 80 columns. (Andrew Lenharth, 2006-07-20; 1 file, -1/+2)
  llvm-svn: 29221
* Reduce number of exported symbols. (Andrew Lenharth, 2006-07-20; 1 file, -1/+1)
  llvm-svn: 29220
* Mark these two classes as hidden, shrinking libllvmgcc.dylib by 25K. (Chris Lattner, 2006-06-28; 1 file, -1/+2)
  llvm-svn: 28970
* Start on my todo list. (Andrew Lenharth, 2006-06-12; 1 file, -4/+4)
  llvm-svn: 28752
* visitVBinOp: Can't fold divide by zero! (Evan Cheng, 2006-05-31; 1 file, -0/+8)
  llvm-svn: 28584
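  The guard added here is the standard one for constant-folding integer
  division: refuse to evaluate rather than divide by zero at compile time.
  A standalone sketch of the idea, not the actual visitVBinOp code:

    #include <cstdint>
    #include <optional>

    // Constant-fold a vector-element signed divide only when it is safe
    // to evaluate; division by zero (and INT_MIN / -1, which overflows)
    // must be left alone.
    std::optional<int32_t> foldSDiv(int32_t LHS, int32_t RHS) {
      if (RHS == 0)
        return std::nullopt;               // can't fold divide by zero
      if (LHS == INT32_MIN && RHS == -1)
        return std::nullopt;               // result would overflow
      return LHS / RHS;
    }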
* Fix a nasty dag combiner bug that caused nondeterministic crashes (MY FAVORITE!): (Chris Lattner, 2006-05-27; 1 file, -4/+15)
  SimplifySelectOps would eliminate a Select, delete it, then return true.
  The clients would see that it did something and return null. The top level
  would see a null return and decide that nothing happened, proceeding to
  process the node in other ways: boom. The fix is simple: clients of
  SimplifySelectOps should return the select node itself.

  In order to catch really obnoxious bugs like this in the future, add an
  assert that nodes are not deleted. We do this by checking for a sentinel
  node type that the SDNode dtor sets when a node is destroyed.
  llvm-svn: 28514
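  A sketch of the contract being fixed, with toy types standing in for
  SDNode (all names hypothetical): a helper that deletes a node must report
  that node back rather than a bare boolean, and a sentinel opcode stamped
  by the destructor lets an assert catch revisits of freed nodes:

    #include <cassert>

    enum Opcode { Select, Add, DeletedSentinel };

    struct Node {
      Opcode Op;
      ~Node() { Op = DeletedSentinel; }  // toy model: dtor stamps a sentinel
    };

    // Returns the node that changed, or nullptr if nothing happened.
    // The pre-fix version returned bool; a deleted Select then looked like
    // "nothing happened" upstream, which went on to use freed memory.
    Node *simplifySelectOps(Node *N) {
      if (N->Op == Select) {
        // ...rewrite N's users, then delete N...
        return N;  // the fix: hand back the (now-dead) select itself
      }
      return nullptr;
    }

    void assertNotDeleted(const Node *N) {
      assert(N->Op != DeletedSentinel && "revisiting a deleted node!");
    }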
* Move this code to a common place. (Andrew Lenharth, 2006-05-16; 1 file, -238/+14)
  llvm-svn: 28329
* Comment out dead variables. (Chris Lattner, 2006-05-12; 1 file, -2/+2)
  llvm-svn: 28252
* Two simplifications for token factor nodes: simplify tf(x,x) -> x, and
  simplify tf(x,y,y,z) -> tf(x,y,z). (Chris Lattner, 2006-05-12; 1 file, -2/+6)
  llvm-svn: 28233
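  Both simplifications amount to deduplicating the operand list while keeping
  first-occurrence order; a single survivor makes the node itself redundant.
  A toy version over plain ints (illustrative, not the commit's code):

    #include <unordered_set>
    #include <vector>

    // Toy model: a token factor is just its operand list; duplicate
    // operands add nothing. Returns the simplified list.
    std::vector<int> simplifyTokenFactor(const std::vector<int> &Ops) {
      std::vector<int> Out;
      std::unordered_set<int> Seen;
      for (int Op : Ops)
        if (Seen.insert(Op).second)  // keep only the first occurrence
          Out.push_back(Op);
      return Out;                    // size()==1 => replace tf(x) with x
    }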
* Debugging info. (Evan Cheng, 2006-05-09; 1 file, -3/+3)
  llvm-svn: 28200
* Make the case I just checked in stronger. (Chris Lattner, 2006-05-08; 1 file, -5/+13)
  Now we compile this:

    short test2(short X, short x) {
      int Y = (short)(X+x);
      return Y >> 1;
    }

  to:

    _test2:
        add r2, r3, r4
        extsh r2, r2
        srawi r3, r2, 1
        blr

  instead of:

    _test2:
        add r2, r3, r4
        extsh r2, r2
        srwi r2, r2, 1
        extsh r3, r2
        blr
  llvm-svn: 28175
* Implement and_sext.ll:test3, generating: (Chris Lattner, 2006-05-08; 1 file, -1/+8)
    _test4:
        srawi r3, r3, 16
        blr

  instead of:

    _test4:
        srwi r2, r3, 16
        extsh r3, r2
        blr

  for:

    short test4(unsigned X) { return (X >> 16); }
  llvm-svn: 28174
* Compile this: (Chris Lattner, 2006-05-08; 1 file, -0/+5)
    short test4(unsigned X) { return (X >> 16); }

  to:

    _test4:
        movl 4(%esp), %eax
        sarl $16, %eax
        ret

  instead of:

    _test4:
        movl $-65536, %eax
        andl 4(%esp), %eax
        sarl $16, %eax
        ret
  llvm-svn: 28171
* Fix PR772. (Nate Begeman, 2006-05-08; 1 file, -9/+9)
  llvm-svn: 28161
* Simplify some code, add a couple of minor missed folds. (Chris Lattner, 2006-05-06; 1 file, -21/+16)
  llvm-svn: 28152
* Remove cases handled elsewhere. (Chris Lattner, 2006-05-06; 1 file, -16/+2)
  llvm-svn: 28150
* Use the new TargetLowering::ComputeNumSignBits method to eliminate
  sign_extend_inreg operations. (Chris Lattner, 2006-05-06; 1 file, -5/+5)
  Though ComputeNumSignBits is still rudimentary, this is enough to compile
  this:

    short test(short X, short x) {
      int Y = X+x;
      return (Y >> 1);
    }
    short test2(short X, short x) {
      int Y = (short)(X+x);
      return Y >> 1;
    }

  into:

    _test:
        add r2, r3, r4
        srawi r3, r2, 1
        blr
    _test2:
        add r2, r3, r4
        extsh r2, r2
        srawi r3, r2, 1
        blr

  instead of:

    _test:
        add r2, r3, r4
        srawi r2, r2, 1
        extsh r3, r2
        blr
    _test2:
        add r2, r3, r4
        extsh r2, r2
        srawi r2, r2, 1
        extsh r3, r2
        blr
  llvm-svn: 28146
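  For intuition, a standalone toy of the underlying test: a sign_extend_inreg
  from B bits is a no-op once the value already carries at least 32 - B + 1
  sign bits. The real method reasons symbolically over DAG nodes; this sketch
  just counts bits in a concrete value:

    #include <cstdint>

    // Number of leading bits that equal the sign bit (sign bit included).
    int computeNumSignBits(int32_t V) {
      int N = 1;
      while (N < 32 && ((V >> (31 - N)) & 1) == ((V >> 31) & 1))
        ++N;
      return N;
    }

    // sign_extend_inreg from FromBits is redundant when the top
    // 32 - FromBits + 1 bits are already copies of the sign bit.
    bool sextInRegIsNoop(int32_t V, int FromBits) {
      return computeNumSignBits(V) >= 32 - FromBits + 1;
    }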
* Fold trunc(any_ext). This gives stuff like: (Chris Lattner, 2006-05-05; 1 file, -1/+2)
    27,28c27
    < movzwl %di, %edi
    < movl %edi, %ebx
    ---
    > movw %di, %bx
  llvm-svn: 28137
* Shrink shifts when possible. (Chris Lattner, 2006-05-05; 1 file, -0/+12)
  llvm-svn: 28136
* Fold (fpext (load x)) -> (extload x). (Chris Lattner, 2006-05-05; 1 file, -0/+14)
  llvm-svn: 28130
* Fold some common code. (Chris Lattner, 2006-05-05; 1 file, -14/+2)
  llvm-svn: 28124
* Implement: (Chris Lattner, 2006-05-05; 1 file, -5/+7)
    // fold (and (sext x), (sext y)) -> (sext (and x, y))
    // fold (or  (sext x), (sext y)) -> (sext (or  x, y))
    // fold (xor (sext x), (sext y)) -> (sext (xor x, y))
    // fold (and (aext x), (aext y)) -> (aext (and x, y))
    // fold (or  (aext x), (aext y)) -> (aext (or  x, y))
    // fold (xor (aext x), (aext y)) -> (aext (xor x, y))
  llvm-svn: 28123
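  These folds are sound because bitwise operations commute with sign
  extension: each output bit depends only on the corresponding input bits.
  A quick standalone check over a few 16-bit values (illustrative, not taken
  from the commit):

    #include <cassert>
    #include <cstdint>

    int main() {
      for (int a = -3; a <= 3; ++a)
        for (int b = -3; b <= 3; ++b) {
          int16_t X = (int16_t)a, Y = (int16_t)b;
          // Widening int16_t to int32_t is exactly a sign extension.
          assert(((int32_t)X & (int32_t)Y) == (int32_t)(int16_t)(X & Y));
          assert(((int32_t)X | (int32_t)Y) == (int32_t)(int16_t)(X | Y));
          assert(((int32_t)X ^ (int32_t)Y) == (int32_t)(int16_t)(X ^ Y));
        }
    }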
* Pull 'and' through and/or/xor. This compiles some bitfield code to: (Chris Lattner, 2006-05-05; 1 file, -4/+6)
    mov EAX, DWORD PTR [ESP + 4]
    mov ECX, DWORD PTR [EAX]
    mov EDX, ECX
    add EDX, EDX
    or EDX, ECX
    and EDX, -2147483648
    and ECX, 2147483647
    or EDX, ECX
    mov DWORD PTR [EAX], EDX
    ret

  instead of:

    sub ESP, 4
    mov DWORD PTR [ESP], ESI
    mov EAX, DWORD PTR [ESP + 8]
    mov ECX, DWORD PTR [EAX]
    mov EDX, ECX
    add EDX, EDX
    mov ESI, ECX
    and ESI, -2147483648
    and EDX, -2147483648
    or EDX, ESI
    and ECX, 2147483647
    or EDX, ECX
    mov DWORD PTR [EAX], EDX
    mov ESI, DWORD PTR [ESP]
    add ESP, 4
    ret
  llvm-svn: 28122
* Implement a variety of simplifications for ANY_EXTEND. (Chris Lattner, 2006-05-05; 1 file, -0/+51)
  llvm-svn: 28121
* Factor some code, add these transformations: (Chris Lattner, 2006-05-05; 1 file, -55/+66)
    // fold (and (trunc x), (trunc y)) -> (trunc (and x, y))
    // fold (or  (trunc x), (trunc y)) -> (trunc (or  x, y))
    // fold (xor (trunc x), (trunc y)) -> (trunc (xor x, y))
  llvm-svn: 28120
* Remove a bogus transformation. (Chris Lattner, 2006-04-28; 1 file, -7/+0)
  This fixes SingleSource/UnitTests/2006-01-23-InitializedBitField.c with
  some changes I have to the new CFE.
  llvm-svn: 28022
* Fix a couple more memory issues. (Chris Lattner, 2006-04-21; 1 file, -4/+4)
  llvm-svn: 27930
* Fix a really subtle and obnoxious memory bug that caused issues with an
  llvm-gcc4 bootstrap. (Chris Lattner, 2006-04-20; 1 file, -11/+11)
  Whenever a node is deleted by the dag combiner, it *must* be returned by
  the visit function, or the dag combiner will not know that the node has
  been processed (and will, e.g., send it to the target dag combine xforms).
  llvm-svn: 27922
* Turn a VAND into a VECTOR_SHUFFLE if applicable. (Evan Cheng, 2006-04-20; 1 file, -1/+64)
  The DAG combiner can turn a VAND V, <-1, 0, -1, -1>, i.e. clearing vector
  elements, into a vector shuffle with a zero vector. It only does so when
  TLI tells it the xform is profitable.
  llvm-svn: 27874
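  A scalar model of why the two forms are equivalent, using plain arrays as
  stand-ins for the DAG nodes (illustrative only): AND with an all-ones or
  all-zeros lane either keeps or clears that lane, which is exactly a shuffle
  selecting between the source and a zero vector.

    #include <array>
    #include <cassert>
    #include <cstdint>

    int main() {
      std::array<int32_t, 4> V = {7, 8, 9, 10};
      std::array<int32_t, 4> Mask = {-1, 0, -1, -1};  // all-ones/all-zeros
      std::array<int32_t, 4> Zero = {0, 0, 0, 0};

      for (int i = 0; i < 4; ++i) {
        int32_t ByAnd = V[i] & Mask[i];
        // Shuffle form: lane i selects from V when the mask lane is all
        // ones, and from the zero vector otherwise.
        int32_t ByShuffle = (Mask[i] == -1) ? V[i] : Zero[i];
        assert(ByAnd == ByShuffle);
      }
    }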
* Canonicalize vvector_shuffle(x,x) -> vvector_shuffle(x,undef) to enable
  patterns to match again :) (Chris Lattner, 2006-04-08; 1 file, -0/+36)
  llvm-svn: 27533
* Codegen shufflevector as VVECTOR_SHUFFLE. (Chris Lattner, 2006-04-08; 1 file, -1/+12)
  llvm-svn: 27529
* Two vector_shuffle fixes: (Evan Cheng, 2006-04-06; 1 file, -3/+6)
  1. If both vector operands of a vector_shuffle are undef, turn it into an undef.
  2. A shuffle mask element can also be an undef.
  llvm-svn: 27472
* Do not create ZEXTLOADs unless we are before legalize or the operation is
  legal. (Chris Lattner, 2006-04-04; 1 file, -1/+2)
  llvm-svn: 27402
* Add a missing check; this fixes UnitTests/Vector/sumarray.c. (Chris Lattner, 2006-04-03; 1 file, -2/+2)
  llvm-svn: 27375
* Add a missing check whose absence broke a bunch of vector tests. (Chris Lattner, 2006-04-03; 1 file, -3/+6)
  llvm-svn: 27374
* Back this out. (Andrew Lenharth, 2006-04-03; 1 file, -25/+0)
  llvm-svn: 27367
* This should be a win on every arch. (Andrew Lenharth, 2006-04-02; 1 file, -1/+26)
  llvm-svn: 27364
* Add a little dag combine to compile this: (Chris Lattner, 2006-04-02; 1 file, -0/+33)
    int %AreSecondAndThirdElementsBothNegative(<4 x float>* %in) {
    entry:
      %tmp1 = load <4 x float>* %in                  ; <<4 x float>> [#uses=1]
      %tmp = tail call int %llvm.ppc.altivec.vcmpgefp.p( int 1,
          <4 x float> < float 0x7FF8000000000000, float 0.000000e+00,
                        float 0.000000e+00, float 0x7FF8000000000000 >,
          <4 x float> %tmp1 )                        ; <int> [#uses=1]
      %tmp = seteq int %tmp, 0                       ; <bool> [#uses=1]
      %tmp3 = cast bool %tmp to int                  ; <int> [#uses=1]
      ret int %tmp3
    }

  into this:

    _AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        lvx v0, 0, r3
        lvx v1, r5, r4
        vcmpgefp. v0, v1, v0
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        mtspr 256, r2
        blr

  instead of this:

    _AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        lvx v0, 0, r3
        lvx v1, r5, r4
        vcmpgefp. v0, v1, v0
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        xori r3, r3, 1
        cntlzw r3, r3
        srwi r3, r3, 5
        mtspr 256, r2
        blr
  llvm-svn: 27356
* Constant fold all of the vector binops. (Chris Lattner, 2006-04-02; 1 file, -0/+49)
  This allows us to compile this:

    "vector unsigned char mergeLowHigh = (vector unsigned char)
      ( 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 );
    vector unsigned char mergeHighLow = vec_xor( mergeLowHigh, vec_splat_u8(8));"

  aka:

    void %test2(<16 x sbyte>* %P) {
      store <16 x sbyte> cast (<4 x int> xor (
          <4 x int> cast (<16 x ubyte> < ubyte 8, ubyte 9, ubyte 10, ubyte 11,
              ubyte 16, ubyte 17, ubyte 18, ubyte 19, ubyte 12, ubyte 13,
              ubyte 14, ubyte 15, ubyte 20, ubyte 21, ubyte 22, ubyte 23 >
              to <4 x int>),
          <4 x int> cast (<16 x sbyte> < sbyte 8, sbyte 8, sbyte 8, sbyte 8,
              sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8,
              sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8 > to <4 x int>))
          to <16 x sbyte>), <16 x sbyte>* %P
      ret void
    }

  into this:

    _test2:
        mfspr r2, 256
        oris r4, r2, 32768
        mtspr 256, r4
        li r4, lo16(LCPI2_0)
        lis r5, ha16(LCPI2_0)
        lvx v0, r5, r4
        stvx v0, 0, r3
        mtspr 256, r2
        blr

  instead of this:

    _test2:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI2_0)
        lis r5, ha16(LCPI2_0)
        vspltisb v0, 8
        lvx v1, r5, r4
        vxor v0, v1, v0
        stvx v0, 0, r3
        mtspr 256, r2
        blr

  ... which occurs here: http://developer.apple.com/hardware/ve/calcspeed.html
  llvm-svn: 27343
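  The mechanism is plain elementwise evaluation: when both operands of a
  vector binop are constant build_vectors, the whole op collapses to one
  constant vector at compile time. A generic toy folder with hypothetical
  names, not the commit's code:

    #include <array>
    #include <cstdint>
    #include <functional>

    // Fold op(A, B) lane by lane; the caller only does this when both
    // operands are known constants, so the result is itself a constant.
    template <typename Op>
    std::array<uint8_t, 16> foldVectorBinOp(const std::array<uint8_t, 16> &A,
                                            const std::array<uint8_t, 16> &B,
                                            Op F) {
      std::array<uint8_t, 16> R{};
      for (size_t i = 0; i < A.size(); ++i)
        R[i] = F(A[i], B[i]);
      return R;
    }

    // e.g. foldVectorBinOp(mergeLowHigh, splat8, std::bit_xor<uint8_t>{})
    // yields the mergeHighLow constant without any runtime vxor.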
* Implement constant folding of bit_convert of arbitrary constant
  vbuild_vector nodes. (Chris Lattner, 2006-04-02; 1 file, -2/+139)
  llvm-svn: 27341
* Delete identity shuffles, implementing
  CodeGen/Generic/vector-identity-shuffle.ll. (Chris Lattner, 2006-03-31; 1 file, -2/+56)
  llvm-svn: 27317
* Remove dead *extloads. (Chris Lattner, 2006-03-31; 1 file, -0/+19)
  This allows us to codegen vector.ll:test_extract_elt to:

    test_extract_elt:
        alloc r3 = ar.pfs,0,1,0,0
        adds r8 = 12, r32
        ;;
        ldfs f8 = [r8]
        mov ar.pfs = r3
        br.ret.sptk.many rp

  instead of:

    test_extract_elt:
        alloc r3 = ar.pfs,0,1,0,0
        adds r8 = 28, r32
        adds r9 = 24, r32
        adds r10 = 20, r32
        adds r11 = 16, r32
        ;;
        ldfs f6 = [r8]
        ;;
        ldfs f6 = [r9]
        adds r8 = 12, r32
        adds r9 = 8, r32
        adds r14 = 4, r32
        ;;
        ldfs f6 = [r10]
        ;;
        ldfs f6 = [r11]
        ldfs f8 = [r8]
        ;;
        ldfs f6 = [r9]
        ;;
        ldfs f6 = [r14]
        ;;
        ldfs f6 = [r32]
        mov ar.pfs = r3
        br.ret.sptk.many rp
  llvm-svn: 27297
* Delete dead loads in the dag. (Chris Lattner, 2006-03-31; 1 file, -0/+5)
  This allows us to compile vector.ll:test_extract_elt2 into:

    _test_extract_elt2:
        lfd f1, 32(r3)
        blr

  instead of:

    _test_extract_elt2:
        lfd f0, 56(r3)
        lfd f0, 48(r3)
        lfd f0, 40(r3)
        lfd f1, 32(r3)
        lfd f0, 24(r3)
        lfd f0, 16(r3)
        lfd f0, 8(r3)
        lfd f0, 0(r3)
        blr
  llvm-svn: 27296
* When building a VVECTOR_SHUFFLE node from extract_element operations, make
  sure to build it as SHUFFLE(X, undef, mask), not SHUFFLE(X, X, mask). (Chris Lattner, 2006-03-28; 1 file, -1/+11)
  The latter is not canonical form and prevents the PPC splat pattern from
  matching. For a particular splat, we go from generating this:

    li r10, lo16(LCPI1_0)
    lis r11, ha16(LCPI1_0)
    lvx v3, r11, r10
    vperm v3, v2, v2, v3

  to generating:

    vspltw v3, v2, 3
  llvm-svn: 27236
* Canonicalize VECTOR_SHUFFLE(X, X, Y) -> VECTOR_SHUFFLE(X, undef, Y'). (Chris Lattner, 2006-03-28; 1 file, -0/+30)
  llvm-svn: 27235
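  A toy of the mask rewrite this canonicalization needs (hypothetical names):
  when both inputs are the same vector, every index in [N, 2N) names the same
  lane as index - N, so indices can be folded into [0, N) and the second
  operand replaced with undef.

    #include <vector>

    // Rewrite the mask of VECTOR_SHUFFLE(X, X, Mask) over width-N vectors:
    // lane i of the second X is identical to lane i of the first, so the
    // high half of the index space is redundant.
    void canonicalizeSelfShuffleMask(std::vector<int> &Mask, int N) {
      for (int &I : Mask)
        if (I >= N)
          I -= N;  // same element, now taken from the first (only) operand
    }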
* Turn a series of extract_elements feeding a build_vector into a
  vector_shuffle node. (Chris Lattner, 2006-03-28; 1 file, -0/+86)
  For this:

    void test(__m128 *res, __m128 *A, __m128 *B) {
      *res = _mm_unpacklo_ps(*A, *B);
    }

  we now produce this code:

    _test:
        movl 8(%esp), %eax
        movaps (%eax), %xmm0
        movl 12(%esp), %eax
        unpcklps (%eax), %xmm0
        movl 4(%esp), %eax
        movaps %xmm0, (%eax)
        ret

  instead of this:

    _test:
        subl $76, %esp
        movl 88(%esp), %eax
        movaps (%eax), %xmm0
        movaps %xmm0, (%esp)
        movaps %xmm0, 32(%esp)
        movss 4(%esp), %xmm0
        movss 32(%esp), %xmm1
        unpcklps %xmm0, %xmm1
        movl 84(%esp), %eax
        movaps (%eax), %xmm0
        movaps %xmm0, 16(%esp)
        movaps %xmm0, 48(%esp)
        movss 20(%esp), %xmm0
        movss 48(%esp), %xmm2
        unpcklps %xmm0, %xmm2
        unpcklps %xmm1, %xmm2
        movl 80(%esp), %eax
        movaps %xmm2, (%eax)
        addl $76, %esp
        ret

  GCC produces this (with -fomit-frame-pointer):

    _test:
        subl $12, %esp
        movl 20(%esp), %eax
        movaps (%eax), %xmm0
        movl 24(%esp), %eax
        unpcklps (%eax), %xmm0
        movl 16(%esp), %eax
        movaps %xmm0, (%eax)
        addl $12, %esp
        ret
  llvm-svn: 27233
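  A sketch of the detection step with toy types in place of the DAG nodes
  (all names hypothetical): collect each element's (source, lane) pair; if at
  most two distinct sources appear, the build_vector is one shuffle mask.

    #include <optional>
    #include <vector>

    // Stand-in for an extract_element(Src, Lane) feeding the build_vector;
    // SrcId is assumed non-negative.
    struct Extract { int SrcId; int Lane; };

    std::optional<std::vector<int>>
    buildShuffleMask(const std::vector<Extract> &Elts, int VecWidth) {
      int Src0 = -1, Src1 = -1;
      std::vector<int> Mask;
      for (const Extract &E : Elts) {
        int Base;
        if (Src0 < 0 || E.SrcId == Src0) { Src0 = E.SrcId; Base = 0; }
        else if (Src1 < 0 || E.SrcId == Src1) { Src1 = E.SrcId; Base = VecWidth; }
        else return std::nullopt;       // more than two sources: give up
        Mask.push_back(Base + E.Lane);  // [0,W) = first src, [W,2W) = second
      }
      return Mask;  // caller emits vector_shuffle(Src0, Src1, Mask)
    }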
* Don't crash on X^X if X is a vector. Instead, produce a vector of zeros. (Chris Lattner, 2006-03-28; 1 file, -2/+10)
  llvm-svn: 27229
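  The fold itself is just the lane-wise identity x ^ x == 0, so the vector
  case can be answered with a constant zero vector instead of falling into
  scalar-only handling. A trivial standalone check of that identity:

    #include <array>
    #include <cassert>
    #include <cstdint>

    int main() {
      // x ^ x == 0 holds independently in every lane, so the whole
      // self-xor folds to the zero vector.
      std::array<uint32_t, 4> X = {1, 22, 333, 4444};
      for (uint32_t Lane : X)
        assert((Lane ^ Lane) == 0u);
    }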