bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Fix a really subtle and obnoxious memory bug that caused issues with an	Chris Lattner	2006-04-20	1	-11/+11
	llvm-gcc4 bootstrap. Whenever a node is deleted by the dag combiner, it must be returned by the visit function, or the dag combiner will not know that the node has been processed (and will, e.g., send it to the target dag combine xforms). llvm-svn: 27922
*	Turn a VAND into a VECTOR_SHUFFLE if applicable.	Evan Cheng	2006-04-20	1	-1/+64
	DAG combiner can turn a VAND V, <-1, 0, -1, -1>, i.e. vector clear elements, into a vector shuffle with a zero vector. It only does so when TLI tells it the xform is profitable. llvm-svn: 27874
*	Canonicalize vvector_shuffle(x,x) -> vvector_shuffle(x,undef) to enable patterns	Chris Lattner	2006-04-08	1	-0/+36
	to match again :) llvm-svn: 27533
*	Codegen shufflevector as VVECTOR_SHUFFLE	Chris Lattner	2006-04-08	1	-1/+12
	llvm-svn: 27529
*	1. If both vector operands of a vector_shuffle are undef, turn it into an undef.	Evan Cheng	2006-04-06	1	-3/+6
	2. A shuffle mask element can also be an undef. llvm-svn: 27472
*	Do not create ZEXTLOAD's unless we are before legalize or the operation is	Chris Lattner	2006-04-04	1	-1/+2
	legal. llvm-svn: 27402
*	Add a missing check, this fixes UnitTests/Vector/sumarray.c	Chris Lattner	2006-04-03	1	-2/+2
	llvm-svn: 27375
*	Add a missing check, which broke a bunch of vector tests.	Chris Lattner	2006-04-03	1	-3/+6
	llvm-svn: 27374
*	back this out	Andrew Lenharth	2006-04-03	1	-25/+0
	llvm-svn: 27367
*	This should be a win of every arch	Andrew Lenharth	2006-04-02	1	-1/+26
	llvm-svn: 27364
*	Add a little dag combine to compile this:	Chris Lattner	2006-04-02	1	-0/+33
	int %AreSecondAndThirdElementsBothNegative(<4 x float>* %in) {
	entry:
	    %tmp1 = load <4 x float>* %in ; <<4 x float>> [#uses=1]
	    %tmp = tail call int %llvm.ppc.altivec.vcmpgefp.p( int 1, <4 x float> < float 0x7FF8000000000000, float 0.000000e+00, float 0.000000e+00, float 0x7FF8000000000000 >, <4 x float> %tmp1 ) ; <int> [#uses=1]
	    %tmp = seteq int %tmp, 0 ; <bool> [#uses=1]
	    %tmp3 = cast bool %tmp to int ; <int> [#uses=1]
	    ret int %tmp3
	}
	into this:
	_AreSecondAndThirdElementsBothNegative:
	    mfspr r2, 256
	    oris r4, r2, 49152
	    mtspr 256, r4
	    li r4, lo16(LCPI1_0)
	    lis r5, ha16(LCPI1_0)
	    lvx v0, 0, r3
	    lvx v1, r5, r4
	    vcmpgefp. v0, v1, v0
	    mfcr r3, 2
	    rlwinm r3, r3, 27, 31, 31
	    mtspr 256, r2
	    blr
	instead of this:
	_AreSecondAndThirdElementsBothNegative:
	    mfspr r2, 256
	    oris r4, r2, 49152
	    mtspr 256, r4
	    li r4, lo16(LCPI1_0)
	    lis r5, ha16(LCPI1_0)
	    lvx v0, 0, r3
	    lvx v1, r5, r4
	    vcmpgefp. v0, v1, v0
	    mfcr r3, 2
	    rlwinm r3, r3, 27, 31, 31
	    xori r3, r3, 1
	    cntlzw r3, r3
	    srwi r3, r3, 5
	    mtspr 256, r2
	    blr
	llvm-svn: 27356
*	Constant fold all of the vector binops. This allows us to compile this:	Chris Lattner	2006-04-02	1	-0/+49
	"vector unsigned char mergeLowHigh = (vector unsigned char) ( 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 );
	vector unsigned char mergeHighLow = vec_xor( mergeLowHigh, vec_splat_u8(8));"
	aka:
	void %test2(<16 x sbyte>* %P) {
	    store <16 x sbyte> cast (<4 x int> xor (<4 x int> cast (<16 x ubyte> < ubyte 8, ubyte 9, ubyte 10, ubyte 11, ubyte 16, ubyte 17, ubyte 18, ubyte 19, ubyte 12, ubyte 13, ubyte 14, ubyte 15, ubyte 20, ubyte 21, ubyte 22, ubyte 23 > to <4 x int>), <4 x int> cast (<16 x sbyte> < sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8 > to <4 x int>)) to <16 x sbyte>), <16 x sbyte> * %P
	    ret void
	}
	into this:
	_test2:
	    mfspr r2, 256
	    oris r4, r2, 32768
	    mtspr 256, r4
	    li r4, lo16(LCPI2_0)
	    lis r5, ha16(LCPI2_0)
	    lvx v0, r5, r4
	    stvx v0, 0, r3
	    mtspr 256, r2
	    blr
	instead of this:
	_test2:
	    mfspr r2, 256
	    oris r4, r2, 49152
	    mtspr 256, r4
	    li r4, lo16(LCPI2_0)
	    lis r5, ha16(LCPI2_0)
	    vspltisb v0, 8
	    lvx v1, r5, r4
	    vxor v0, v1, v0
	    stvx v0, 0, r3
	    mtspr 256, r2
	    blr
	... which occurs here: http://developer.apple.com/hardware/ve/calcspeed.html
	llvm-svn: 27343
*	Implement constant folding of bit_convert of arbitrary constant	Chris Lattner	2006-04-02	1	-2/+139
	vbuild_vector nodes. llvm-svn: 27341
*	Delete identity shuffles, implementing	Chris Lattner	2006-03-31	1	-2/+56
	CodeGen/Generic/vector-identity-shuffle.ll llvm-svn: 27317
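What "identity shuffle" means can be illustrated with a small standalone C++ sketch. It is not the DAGCombiner code: `isIdentityShuffleMask` and `kUndefElt` are hypothetical names, and treating an undef mask element as compatible with the identity is an assumption made here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical marker for an undef mask element (not an LLVM constant).
constexpr int kUndefElt = -1;

// Returns true when result element i is taken from element i of the first
// operand for every i, so the shuffle can be replaced by that operand.
// Assumption: undef mask elements may be read as "element i".
inline bool isIdentityShuffleMask(const std::vector<int> &mask) {
  for (std::size_t i = 0; i < mask.size(); ++i) {
    assert(mask[i] >= kUndefElt && "negative indices other than undef are invalid");
    if (mask[i] != kUndefElt && mask[i] != static_cast<int>(i))
      return false;
  }
  return true;
}
```

A mask such as `{0, 1, 2, 3}` passes this check, while `{1, 0, 2, 3}` does not.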
*	Remove dead *extloads. This allows us to codegen vector.ll:test_extract_elt	Chris Lattner	2006-03-31	1	-0/+19
	to:
	test_extract_elt:
	    alloc r3 = ar.pfs,0,1,0,0
	    adds r8 = 12, r32 ;;
	    ldfs f8 = [r8]
	    mov ar.pfs = r3
	    br.ret.sptk.many rp
	instead of:
	test_extract_elt:
	    alloc r3 = ar.pfs,0,1,0,0
	    adds r8 = 28, r32
	    adds r9 = 24, r32
	    adds r10 = 20, r32
	    adds r11 = 16, r32 ;;
	    ldfs f6 = [r8] ;;
	    ldfs f6 = [r9]
	    adds r8 = 12, r32
	    adds r9 = 8, r32
	    adds r14 = 4, r32 ;;
	    ldfs f6 = [r10] ;;
	    ldfs f6 = [r11]
	    ldfs f8 = [r8] ;;
	    ldfs f6 = [r9] ;;
	    ldfs f6 = [r14] ;;
	    ldfs f6 = [r32]
	    mov ar.pfs = r3
	    br.ret.sptk.many rp
	llvm-svn: 27297
*	Delete dead loads in the dag. This allows us to compile	Chris Lattner	2006-03-31	1	-0/+5
	vector.ll:test_extract_elt2 into:
	_test_extract_elt2:
	    lfd f1, 32(r3)
	    blr
	instead of:
	_test_extract_elt2:
	    lfd f0, 56(r3)
	    lfd f0, 48(r3)
	    lfd f0, 40(r3)
	    lfd f1, 32(r3)
	    lfd f0, 24(r3)
	    lfd f0, 16(r3)
	    lfd f0, 8(r3)
	    lfd f0, 0(r3)
	    blr
	llvm-svn: 27296
*	When building a VVECTOR_SHUFFLE node from extract_element operations, make	Chris Lattner	2006-03-28	1	-1/+11
	sure to build it as SHUFFLE(X, undef, mask), not SHUFFLE(X, X, mask). The latter is not canonical form, and prevents the PPC splat pattern from matching. For a particular splat, we go from generating this:
	    li r10, lo16(LCPI1_0)
	    lis r11, ha16(LCPI1_0)
	    lvx v3, r11, r10
	    vperm v3, v2, v2, v3
	to generating:
	    vspltw v3, v2, 3
	llvm-svn: 27236
*	Canonicalize VECTOR_SHUFFLE(X, X, Y) -> VECTOR_SHUFFLE(X,undef,Y')	Chris Lattner	2006-03-28	1	-0/+30
	llvm-svn: 27235
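The mask rewrite implied by VECTOR_SHUFFLE(X, X, Y) -> VECTOR_SHUFFLE(X,undef,Y') can be sketched in standalone C++. This is an illustration, not the committed code: `canonicalizeSameOperandMask` is a hypothetical helper, and the sketch assumes the usual convention that mask indices 0..N-1 select from the first operand and N..2N-1 from the second.

```cpp
#include <cassert>
#include <vector>

// When both shuffle operands are the same vector X, an index that points into
// the second operand (idx >= numElts) names the same element as idx - numElts
// in the first. Rewriting those indices leaves the second operand unused, so
// it can become undef.
inline std::vector<unsigned> canonicalizeSameOperandMask(std::vector<unsigned> mask,
                                                         unsigned numElts) {
  for (unsigned &idx : mask) {
    assert(idx < 2 * numElts && "mask index out of range");
    if (idx >= numElts)
      idx -= numElts;
  }
  return mask;
}
```

For a four-element vector, the mask `{0, 4, 1, 5}` becomes `{0, 0, 1, 1}`.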
*	Turn a series of extract_element's feeding a build_vector into a	Chris Lattner	2006-03-28	1	-0/+86
	vector_shuffle node. For this:
	void test(__m128 res, __m128 A, __m128 B) {
	    res = _mm_unpacklo_ps(A, B);
	}
	we now produce this code:
	_test:
	    movl 8(%esp), %eax
	    movaps (%eax), %xmm0
	    movl 12(%esp), %eax
	    unpcklps (%eax), %xmm0
	    movl 4(%esp), %eax
	    movaps %xmm0, (%eax)
	    ret
	instead of this:
	_test:
	    subl $76, %esp
	    movl 88(%esp), %eax
	    movaps (%eax), %xmm0
	    movaps %xmm0, (%esp)
	    movaps %xmm0, 32(%esp)
	    movss 4(%esp), %xmm0
	    movss 32(%esp), %xmm1
	    unpcklps %xmm0, %xmm1
	    movl 84(%esp), %eax
	    movaps (%eax), %xmm0
	    movaps %xmm0, 16(%esp)
	    movaps %xmm0, 48(%esp)
	    movss 20(%esp), %xmm0
	    movss 48(%esp), %xmm2
	    unpcklps %xmm0, %xmm2
	    unpcklps %xmm1, %xmm2
	    movl 80(%esp), %eax
	    movaps %xmm2, (%eax)
	    addl $76, %esp
	    ret
	GCC produces this (with -fomit-frame-pointer):
	_test:
	    subl $12, %esp
	    movl 20(%esp), %eax
	    movaps (%eax), %xmm0
	    movl 24(%esp), %eax
	    unpcklps (%eax), %xmm0
	    movl 16(%esp), %eax
	    movaps %xmm0, (%eax)
	    addl $12, %esp
	    ret
	llvm-svn: 27233
*	Don't crash on X^X if X is a vector. Instead, produce a vector of zeros.	Chris Lattner	2006-03-28	1	-2/+10
	llvm-svn: 27229
*	Don't call SimplifyDemandedBits on vectors	Chris Lattner	2006-03-25	1	-1/+2
	llvm-svn: 27128
*	fold insertelement(buildvector) -> buildvector if the inserted element # is	Chris Lattner	2006-03-19	1	-0/+42
	a constant. This implements test_constant_insert in CodeGen/Generic/vector.ll llvm-svn: 26851
*	Remove BRTWOWAY*	Nate Begeman	2006-03-17	1	-68/+0
	Make the PPC backend not dependent on BRTWOWAY_CC and make the branch selector smarter about the code it generates, fixing a case in the readme. llvm-svn: 26814
*	make sure dead token factor nodes are removed by the dag combiner.	Chris Lattner	2006-03-13	1	-0/+1
	llvm-svn: 26731
*	Fold X+Y -> X|Y when safe. This implements:	Chris Lattner	2006-03-13	1	-1/+19
	Regression/CodeGen/PowerPC/and_add.ll a case that occurs with dynamic allocas of constant size. llvm-svn: 26727
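The entry does not spell out what "safe" means. A minimal sketch, assuming it means that no bit position can be set in both operands (so the addition never produces a carry), is shown here in standalone C++. `haveNoCommonBits` and `addAsOr` are hypothetical names, and the sketch checks concrete values, whereas a combiner would have to reason about bits known to be zero at compile time.

```cpp
#include <cassert>
#include <cstdint>

// True when x and y never have a 1 in the same bit position.
inline bool haveNoCommonBits(std::uint32_t x, std::uint32_t y) {
  return (x & y) == 0;
}

// With no common bits there are no carries, so x + y and x | y agree.
inline std::uint32_t addAsOr(std::uint32_t x, std::uint32_t y) {
  assert(haveNoCommonBits(x, y) && "fold is only valid without overlapping bits");
  return x | y;
}
```

An aligned value with a small offset is the typical shape: 0x1230 has its low four bits clear, so adding 0xF and OR-ing 0xF both give 0x123F. With overlapping bits the fold is wrong: 3 + 1 is 4, but 3 | 1 is 3.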
*	add a couple of missing folds	Chris Lattner	2006-03-13	1	-0/+12
	llvm-svn: 26724
*	Reinstate this now that the offending opposite xform has been removed.	Chris Lattner	2006-03-05	1	-0/+7
	llvm-svn: 26548
*	Back out fold (shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2) for now.	Evan Cheng	2006-03-05	1	-7/+0
	It's causing an infinite loop compiling ldecod on x86 / Darwin. llvm-svn: 26544
*	Add some simple copysign folds	Chris Lattner	2006-03-05	1	-7/+59
	llvm-svn: 26543
*	fold (mul (add x, c1), c2) -> (add (mul x, c2), c1*c2)	Chris Lattner	2006-03-04	1	-1/+14
	fold (shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2)
	This allows us to compile CodeGen/PowerPC/addi-reassoc.ll into:
	_test1:
	    slwi r2, r4, 4
	    add r2, r2, r3
	    lwz r3, 36(r2)
	    blr
	_test2:
	    mulli r2, r4, 5
	    add r2, r2, r3
	    lbz r2, 11(r2)
	    extsb r3, r2
	    blr
	instead of:
	_test1:
	    addi r2, r4, 2
	    slwi r2, r2, 4
	    add r2, r3, r2
	    lwz r3, 4(r2)
	    blr
	_test2:
	    addi r2, r4, 2
	    mulli r2, r2, 5
	    add r2, r3, r2
	    lbz r2, 1(r2)
	    extsb r3, r2
	    blr
	llvm-svn: 26535
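Both folds in this entry rely on multiplication and left shift distributing over addition in wrapping 32-bit arithmetic. The standalone C++ sketch below only checks that identity on concrete values; the function names are hypothetical and none of this is the DAGCombiner implementation.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned 32-bit arithmetic wraps modulo 2^32, which keeps both rewrites exact.

// (mul (add x, c1), c2)
inline std::uint32_t mulOfAdd(std::uint32_t x, std::uint32_t c1, std::uint32_t c2) {
  return (x + c1) * c2;
}

// (add (mul x, c2), c1*c2); c1*c2 is a constant the compiler can precompute.
inline std::uint32_t addOfMul(std::uint32_t x, std::uint32_t c1, std::uint32_t c2) {
  return x * c2 + c1 * c2;
}

// (shl (add x, c1), c2)
inline std::uint32_t shlOfAdd(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x + c1) << c2;
}

// (add (shl x, c2), c1<<c2)
inline std::uint32_t addOfShl(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x << c2) + (c1 << c2);
}
```

The constants in the listings follow the same arithmetic: with c1 = 2, the displacement in `_test1` grows from 4 to 4 + (2 << 4) = 36, and in `_test2` from 1 to 1 + 2 * 5 = 11, which lets the `addi` disappear.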
*	Fix CodeGen/Generic/2006-03-01-dagcombineinfloop.ll, an infinite loop	Chris Lattner	2006-03-01	1	-4/+9
	in the dag combiner on 176.gcc on x86. llvm-svn: 26459
*	Fix a typo evan noticed	Chris Lattner	2006-03-01	1	-1/+1
	llvm-svn: 26454
*	Add support for target-specific dag combines	Chris Lattner	2006-03-01	1	-13/+58
	llvm-svn: 26443
*	Add a new AddToWorkList method, start using it	Chris Lattner	2006-03-01	1	-57/+63
	llvm-svn: 26441
*	Pull shifts by a constant through multiplies (a form of reassociation),	Chris Lattner	2006-03-01	1	-0/+27
	implementing Regression/CodeGen/X86/mul-shift-reassoc.ll llvm-svn: 26440
*	Vector ops lowering.	Evan Cheng	2006-03-01	1	-1/+1
	llvm-svn: 26436
*	Compile:	Chris Lattner	2006-02-28	1	-10/+17
	unsigned foo4(unsigned short P) { return P & 255; }
	unsigned foo5(short P) { return P & 255; }
	to:
	_foo4:
	    lbz r3,1(r3)
	    blr
	_foo5:
	    lbz r3,1(r3)
	    blr
	not:
	_foo4:
	    lhz r2, 0(r3)
	    rlwinm r3, r2, 0, 24, 31
	    blr
	_foo5:
	    lhz r2, 0(r3)
	    rlwinm r3, r2, 0, 24, 31
	    blr
	llvm-svn: 26419
*	Fold "and (LOAD P), 255" -> zextload. This allows us to compile:	Chris Lattner	2006-02-28	1	-0/+33
	unsigned foo3(unsigned P) { return P & 255; }
	as:
	_foo3:
	    lbz r3, 3(r3)
	    blr
	instead of:
	_foo3:
	    lwz r2, 0(r3)
	    rlwinm r3, r2, 0, 24, 31
	    blr
	and:
	unsigned short foo2(float a) { return a; }
	as:
	_foo2:
	    fctiwz f0, f1
	    stfd f0, -8(r1)
	    lhz r3, -2(r1)
	    blr
	instead of:
	_foo2:
	    fctiwz f0, f1
	    stfd f0, -8(r1)
	    lwz r2, -4(r1)
	    rlwinm r3, r2, 0, 16, 31
	    blr
	llvm-svn: 26417
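The offset 3 in `lbz r3, 3(r3)` follows from big-endian byte order: the low 8 bits of a 32-bit word live in its last byte. The standalone C++ sketch below models memory as an explicit byte array so it behaves the same on any host; all names are hypothetical and it is only an illustration of why a zero-extending byte load can replace the load plus mask.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Lay out a 32-bit value in big-endian byte order, as PowerPC stores it.
inline std::array<std::uint8_t, 4> storeBigEndian32(std::uint32_t v) {
  std::array<std::uint8_t, 4> bytes = {{
      static_cast<std::uint8_t>(v >> 24), static_cast<std::uint8_t>(v >> 16),
      static_cast<std::uint8_t>(v >> 8), static_cast<std::uint8_t>(v)}};
  return bytes;
}

// Unfolded form: load the whole word, then mask with 255 (lwz + rlwinm).
inline std::uint32_t loadThenMask(const std::array<std::uint8_t, 4> &mem) {
  std::uint32_t word = (std::uint32_t(mem[0]) << 24) | (std::uint32_t(mem[1]) << 16) |
                       (std::uint32_t(mem[2]) << 8) | std::uint32_t(mem[3]);
  return word & 255u;
}

// Folded form: one zero-extending byte load at the given offset (lbz).
inline std::uint32_t zextLoadByte(const std::array<std::uint8_t, 4> &mem,
                                  std::size_t offset) {
  assert(offset < mem.size() && "offset must stay inside the word");
  return mem[offset];
}
```

For the word 0x12345678, `loadThenMask` and `zextLoadByte(mem, 3)` both yield 0x78.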
*	fold (sra (sra x, c1), c2) -> (sra x, c1+c2)	Chris Lattner	2006-02-28	1	-3/+11
	llvm-svn: 26416
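Two arithmetic right shifts in a row behave like one shift by the sum of the amounts. A minimal standalone C++ sketch of that identity follows. The names are hypothetical, and clamping the combined amount to 31 when c1 + c2 reaches the bit width is an assumption of this sketch; the entry does not say how that case is handled.

```cpp
#include <cassert>
#include <cstdint>

// Portable arithmetic shift right: only non-negative values are shifted, so
// the result does not depend on how the compiler shifts negative integers.
inline std::int32_t sra(std::int32_t x, unsigned c) {
  assert(c < 32 && "shift amount must be smaller than the bit width");
  return x < 0 ? ~(~x >> c) : (x >> c);
}

// Folded form of (sra (sra x, c1), c2).
// Assumption: once c1 + c2 >= 32 every bit is a copy of the sign bit, which
// is also what a shift by 31 produces, so the amount is clamped to 31.
inline std::int32_t sraFolded(std::int32_t x, unsigned c1, unsigned c2) {
  assert(c1 < 32 && c2 < 32);
  unsigned sum = c1 + c2;
  if (sum > 31)
    sum = 31;
  return sra(x, sum);
}
```

For example, shifting -1000 right by 3 and then by 4 gives -8, the same as a single shift by 7.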
*	remove some completed notes	Chris Lattner	2006-02-27	1	-4/+0
	llvm-svn: 26390
*	Fix a problem Nate and Duraid reported where simplifying nodes can cause	Chris Lattner	2006-02-20	1	-4/+8
	them to get resurrected, in which case, deleting the undead nodes is unfriendly. llvm-svn: 26291
*	Add checks to make sure we don't create bogus extend nodes, and fix a bug	Nate Begeman	2006-02-18	1	-4/+10
	where we were doing exactly that which was causing failures on x86 and alpha. llvm-svn: 26284
*	Fix a tricky issue in the SimplifyDemandedBits code where CombineTo wasn't	Chris Lattner	2006-02-17	1	-9/+34
	exactly the API we wanted to call into. This fixes the crash on crafty last night. llvm-svn: 26269
*	Clean up DemandedBitsAreZero interface	Nate Begeman	2006-02-17	1	-22/+26
	Make more use of the new mask helpers in valuetypes.h
	Combine (sra (srl x, c1), c1) -> sext_inreg if legal
	llvm-svn: 26263
*	Don't expand sdiv by power of two before legalize, since it will likely	Nate Begeman	2006-02-17	1	-2/+2
	generate illegal nodes. llvm-svn: 26261
*	kill ADD_PARTS & SUB_PARTS and replace them with fancy new ADDC, ADDE, SUBC	Nate Begeman	2006-02-17	1	-46/+0
	and SUBE nodes that actually expose what's going on and allow for significant simplifications in the targets. llvm-svn: 26255
*	Rework the SelectionDAG-based implementations of SimplifyDemandedBits	Nate Begeman	2006-02-16	1	-35/+17
	and ComputeMaskedBits to match the new improved versions in instcombine. Tested against all of multisource/benchmarks on ppc. llvm-svn: 26238
*	Lowering of sdiv X, pow2 was broken, this fixes it. This patch is written	Chris Lattner	2006-02-16	1	-6/+12
	by Nate, I'm just committing it for him. llvm-svn: 26230
*	Should not combine ISD::LOCATIONs until we have scheme to remove from	Jim Laskey	2006-02-15	1	-34/+0
	MachineDebugInfo tables. llvm-svn: 26216
*	Compile this:	Chris Lattner	2006-02-08	1	-0/+26
	    xori r6, r2, 1
	    rlwinm r6, r6, 0, 31, 31
	    cmpwi cr0, r6, 0
	    bne cr0, LBB1_3 ; endif
	to this:
	    rlwinm r6, r2, 0, 31, 31
	    cmpwi cr0, r6, 0
	    beq cr0, LBB1_3 ; endif
	llvm-svn: 26047