bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In	Chris Lattner	2005-10-03	1	-6/+38
	particular, it should realize that phis use their values in the pred block, not the phi block itself. This change turns our em3d loop from this:
	_test:
	    cmpwi cr0, r4, 0
	    bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge
	LBB_test_1: ; entry.loopexit_crit_edge
	    li r2, 0
	    b LBB_test_6 ; loopexit
	LBB_test_2: ; entry.no_exit_crit_edge
	    li r6, 0
	LBB_test_3: ; no_exit
	    or r2, r6, r6
	    lwz r6, 0(r3)
	    cmpw cr0, r6, r5
	    beq cr0, LBB_test_6 ; loopexit
	LBB_test_4: ; endif
	    addi r3, r3, 4
	    addi r6, r2, 1
	    cmpw cr0, r6, r4
	    blt cr0, LBB_test_3 ; no_exit
	LBB_test_5: ; endif.loopexit.loopexit_crit_edge
	    addi r3, r2, 1
	    blr
	LBB_test_6: ; loopexit
	    or r3, r2, r2
	    blr
	into:
	_test:
	    cmpwi cr0, r4, 0
	    bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge
	LBB_test_1: ; entry.loopexit_crit_edge
	    li r2, 0
	    b LBB_test_5 ; loopexit
	LBB_test_2: ; entry.no_exit_crit_edge
	    li r6, 0
	LBB_test_3: ; no_exit
	    lwz r2, 0(r3)
	    cmpw cr0, r2, r5
	    or r2, r6, r6
	    beq cr0, LBB_test_5 ; loopexit
	LBB_test_4: ; endif
	    addi r3, r3, 4
	    addi r6, r6, 1
	    cmpw cr0, r6, r4
	    or r2, r6, r6
	    blt cr0, LBB_test_3 ; no_exit
	LBB_test_5: ; loopexit
	    or r3, r2, r2
	    blr
	Unfortunately, this is actually worse code, because the register coalescer is getting confused somehow. If it were doing its job right, it could turn the code into this:
	_test:
	    cmpwi cr0, r4, 0
	    bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge
	LBB_test_1: ; entry.loopexit_crit_edge
	    li r6, 0
	    b LBB_test_5 ; loopexit
	LBB_test_2: ; entry.no_exit_crit_edge
	    li r6, 0
	LBB_test_3: ; no_exit
	    lwz r2, 0(r3)
	    cmpw cr0, r2, r5
	    beq cr0, LBB_test_5 ; loopexit
	LBB_test_4: ; endif
	    addi r3, r3, 4
	    addi r6, r6, 1
	    cmpw cr0, r6, r4
	    blt cr0, LBB_test_3 ; no_exit
	LBB_test_5: ; loopexit
	    or r3, r6, r6
	    blr
	... which I'll work on next. :)
	llvm-svn: 23604
*	Refactor some code into a function	Chris Lattner	2005-10-03	1	-7/+23
	llvm-svn: 23603
*	This break is bogus and I have no idea why it was there. Basically it prevents	Chris Lattner	2005-10-03	1	-1/+0
	memoizing code when IV's are used by phinodes outside of loops. In a simple example, we were getting this code before (note that r6 and r7 are isomorphic IV's):
	    li r6, 0
	    or r7, r6, r6
	LBB_test_3: ; no_exit
	    lwz r2, 0(r3)
	    cmpw cr0, r2, r5
	    or r2, r7, r7
	    beq cr0, LBB_test_5 ; loopexit
	LBB_test_4: ; endif
	    addi r2, r7, 1
	    addi r7, r7, 1
	    addi r3, r3, 4
	    addi r6, r6, 1
	    cmpw cr0, r6, r4
	    blt cr0, LBB_test_3 ; no_exit
	Now we get:
	    li r6, 0
	LBB_test_3: ; no_exit
	    or r2, r6, r6
	    lwz r6, 0(r3)
	    cmpw cr0, r6, r5
	    beq cr0, LBB_test_6 ; loopexit
	LBB_test_4: ; endif
	    addi r3, r3, 4
	    addi r6, r2, 1
	    cmpw cr0, r6, r4
	    blt cr0, LBB_test_3 ; no_exit
	This was noticed in em3d.
	llvm-svn: 23602
*	when checking if we should move a split edge block outside of a loop,	Chris Lattner	2005-10-03	1	-7/+6
	check the pre-split pred, not the post-split pred. This was causing us to make the wrong decision in some cases, leaving the critical edge block in the loop.
	llvm-svn: 23601
*	Fix VC++ warnings.	Jeff Cohen	2005-10-01	1	-1/+0
	llvm-svn: 23579
*	Insert stores after phi nodes in the normal dest. This fixes	Chris Lattner	2005-09-29	1	-2/+5
	LowerInvoke/2005-08-03-InvokeWithPHI.ll
	llvm-svn: 23525
*	Fold isascii into a simple comparison. This speeds up 197.parser by 7.4%,	Chris Lattner	2005-09-29	1	-0/+26
	bringing the LLC time down to the CBE time.
	llvm-svn: 23521
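	As an illustration of the isascii fold, the following is a minimal sketch in C of what "a simple comparison" could look like. It assumes the folded form is a single unsigned compare against 128; the function names `isascii_reference`, `isascii_folded`, and `check_isascii_fold` are hypothetical and do not come from the commit.

```c
#include <assert.h>

/* Reference behaviour: a value is ASCII when no bit above the low seven is set. */
static int isascii_reference(int c) {
    return (c & ~0x7f) == 0;
}

/* Assumed folded form: one unsigned comparison. Negative inputs convert to
   large unsigned values, so they are rejected by the same compare. */
static int isascii_folded(int c) {
    return (unsigned)c < 128u;
}

/* Hypothetical self-check: both forms agree on a range around the boundary. */
static void check_isascii_fold(void) {
    for (int c = -512; c <= 512; ++c)
        assert(isascii_reference(c) == isascii_folded(c));
}
```

	The point of the sketch is only that the library call can be replaced by straight-line code the optimizer can see through; the exact form emitted by the pass may differ.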
*	remove a bunch of unneeded stuff, or self evident comments	Chris Lattner	2005-09-29	1	-45/+6
	llvm-svn: 23519
*	Implement a couple of memcmp folds from the todo list	Chris Lattner	2005-09-29	1	-3/+121
	llvm-svn: 23517
*	Constant fold llvm.sqrt	Chris Lattner	2005-09-28	1	-1/+9
	llvm-svn: 23487
*	add a note about a way to improve this code further, that I won't be getting	Chris Lattner	2005-09-27	1	-0/+8
	to right now.
	llvm-svn: 23485
*	Fix a regression in my previous patch, fixing GlobalOpt/2005-09-27-Crash.ll	Chris Lattner	2005-09-27	1	-1/+1
	and PR632.
	llvm-svn: 23484
*	Avoid spilling stack slots... to stack slots.	Chris Lattner	2005-09-27	1	-0/+6
	llvm-svn: 23478
*	Completely rewrite 'correct' eh support. This changes how setjmp insertion	Chris Lattner	2005-09-27	1	-140/+301
	is performed so it happens at most once per function that contains an invoke, instead of once per invoke in the function. This patch has the following perks:
	1. It fixes PR631, which complains about slowness.
	2. It fixes PR240, which complains about non-volatile vars being live across setjmp/longjmps.
	3. It improves (but does not fix) the jmpbuf alignment issue on itanium by not forcing the jmpbufs to always be 8-bytes off the alignment of the structure.
	4. It speeds up 253.perlbmk from 338s to 13.70s (a 25x improvement!), making us now about 4% faster than GCC.
	Further improvements are also possible.
	llvm-svn: 23477
*	Make the pass name simpler	Chris Lattner	2005-09-27	1	-1/+1
	llvm-svn: 23476
*	allow demotion to volatile values, add support for invoke	Chris Lattner	2005-09-27	1	-12/+15
	llvm-svn: 23473
*	Add support for external calls that we know how to constant fold. This	Chris Lattner	2005-09-27	1	-11/+20
	implements ctor-list-opt.ll:CTOR8
	llvm-svn: 23465
*	Fix a bug where we would evaluate stores into linkonce objects which could be	Chris Lattner	2005-09-27	1	-1/+6
	potentially replaced at link-time.
	llvm-svn: 23463
*	Implement support for static constructors with calls in them. This is useful	Chris Lattner	2005-09-27	1	-23/+54
	because gccas runs globalopt before inlining. This implements ctor-list-opt.ll:CTOR7
	llvm-svn: 23462
*	Refactor this code a bit, no functionality changes.	Chris Lattner	2005-09-27	1	-22/+40
	llvm-svn: 23460
*	Remove some dead code. ctor evaluation subsumes empty ctor elim	Chris Lattner	2005-09-26	1	-12/+0
	llvm-svn: 23453
*	Add support for alloca, implementing ctor-list-opt.ll:CTOR6	Chris Lattner	2005-09-26	1	-17/+48
	llvm-svn: 23452
*	Add a debug printout, fix a crash on kc++	Chris Lattner	2005-09-26	1	-1/+6
	llvm-svn: 23450
*	Implement loads/stores through GEP's of globals. This implements	Chris Lattner	2005-09-26	1	-6/+98
	ctor-list-opt.ll:CTOR5.
	llvm-svn: 23449
*	Replace TraverseGEPInitializer with ConstantFoldLoadThroughGEPConstantExpr	Chris Lattner	2005-09-26	1	-17/+5
	llvm-svn: 23447
*	Eliminate GetGEPGlobalInitializer in favor of the more powerful	Chris Lattner	2005-09-26	1	-27/+1
	ConstantFoldLoadThroughGEPConstantExpr function in the utils lib.
	llvm-svn: 23446
*	Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils	Chris Lattner	2005-09-26	1	-44/+2
	as ConstantFoldLoadThroughGEPConstantExpr.
	llvm-svn: 23445
*	Move the ConstantFoldLoadThroughGEPConstantExpr function out of the InstCombine	Chris Lattner	2005-09-26	1	-1/+45
	pass.
	llvm-svn: 23444
*	add a comment	Chris Lattner	2005-09-26	1	-0/+3
	llvm-svn: 23442
*	Add support for getelementptr, load, and correctly reject volatile stores.	Chris Lattner	2005-09-26	1	-0/+29
	llvm-svn: 23441
*	Add support for br/brcond/switch and phi	Chris Lattner	2005-09-26	1	-3/+47
	llvm-svn: 23439
*	Add a simple interpreter to this code, allowing us to statically evaluate	Chris Lattner	2005-09-26	1	-4/+110
	global ctors that are simple enough. This implements ctor-list-opt.ll:CTOR2.
	llvm-svn: 23437
*	Factor some code into an InstallGlobalCtors method, add comments. No	Chris Lattner	2005-09-26	1	-35/+52
	functionality change.
	llvm-svn: 23435
*	Make the global opt optimizer work on modules with a null terminator, by	Chris Lattner	2005-09-26	1	-8/+13
	accepting the null even with a non-65535 init prio
	llvm-svn: 23434
*	Factor this code out into a few methods.	Chris Lattner	2005-09-26	1	-33/+190
	Implement the start of global ctor optimization. It is currently smart enough to remove the global ctor for cases like this:
	    struct foo { foo() {} } x;
	... saving a bit of startup time for the program.
	llvm-svn: 23433
*	Fix some logic I broke that caused a regression on	Chris Lattner	2005-09-25	1	-3/+5
	SimplifyLibCalls/2005-05-20-sprintf-crash.ll
	llvm-svn: 23430
*	Move MaskedValueIsZero up.	Chris Lattner	2005-09-24	1	-77/+146
	Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll
	llvm-svn: 23428
*	Simplify this code a bit by relying on recursive simplification. Support	Chris Lattner	2005-09-24	1	-51/+43
	sprintf("%s", P)'s that have uses. s/hasNUses(0)/use_empty()/
	llvm-svn: 23425
*	remove some debugging code	Chris Lattner	2005-09-23	1	-1/+0
	llvm-svn: 23411
*	Fold two consecutive branches that share a common destination between them.	Chris Lattner	2005-09-23	1	-33/+119
	This implements SimplifyCFG/branch-fold.ll, and is useful on ?:/min/max heavy code
	llvm-svn: 23410
*	simplify some logic further	Chris Lattner	2005-09-23	1	-6/+1
	llvm-svn: 23408
*	pull a bunch of logic out of SimplifyCFG into a helper fn	Chris Lattner	2005-09-23	1	-112/+112
	llvm-svn: 23407
*	Start threading across blocks with code in them, so long as the code does	Chris Lattner	2005-09-20	1	-15/+64
	not define a value that is used outside of its block. This catches many more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc. This implements branch-phi-thread.ll:test3.ll
	llvm-svn: 23397
*	Implement merging of blocks with the same condition if the block has multiple	Chris Lattner	2005-09-20	1	-21/+59
	predecessors. This implements branch-phi-thread.ll::test1
	llvm-svn: 23395
*	Reject a case we don't handle yet	Chris Lattner	2005-09-19	1	-1/+3
	llvm-svn: 23393
*	remove debugging code :-/	Chris Lattner	2005-09-19	1	-2/+0
	llvm-svn: 23392
*	Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading	Chris Lattner	2005-09-19	1	-0/+73
	control across branches with determined outcomes. More generality to follow. This triggers a couple thousand times in specint.
	llvm-svn: 23391
*	Refactor this code a bit and make it more general. This now compiles:	Chris Lattner	2005-09-18	1	-24/+53
	    struct S { unsigned int i : 6, j : 11, k : 15; } b;
	    void plus2 (unsigned int x) { b.j += x; }
	To:
	_plus2:
	    lis r2, ha16(L_b$non_lazy_ptr)
	    lwz r2, lo16(L_b$non_lazy_ptr)(r2)
	    lwz r4, 0(r2)
	    slwi r3, r3, 6
	    add r3, r4, r3
	    rlwimi r3, r4, 0, 26, 14
	    stw r3, 0(r2)
	    blr
	instead of:
	_plus2:
	    lis r2, ha16(L_b$non_lazy_ptr)
	    lwz r2, lo16(L_b$non_lazy_ptr)(r2)
	    lwz r4, 0(r2)
	    rlwinm r5, r4, 26, 21, 31
	    add r3, r5, r3
	    rlwimi r4, r3, 6, 15, 25
	    stw r4, 0(r2)
	    blr
	by eliminating an 'and'. I'm pretty sure this is as small as we can go :)
	llvm-svn: 23386
*	Compile	Chris Lattner	2005-09-18	1	-31/+70
	    struct S { unsigned int i : 6, j : 11, k : 15; } b;
	    void plus2 (unsigned int x) { b.j += x; }
	to:
	plus2:
	    mov %EAX, DWORD PTR [b]
	    mov %ECX, %EAX
	    and %ECX, 131008
	    mov %EDX, DWORD PTR [%ESP + 4]
	    shl %EDX, 6
	    add %EDX, %ECX
	    and %EDX, 131008
	    and %EAX, -131009
	    or %EDX, %EAX
	    mov DWORD PTR [b], %EDX
	    ret
	instead of:
	plus2:
	    mov %EAX, DWORD PTR [b]
	    mov %ECX, %EAX
	    shr %ECX, 6
	    and %ECX, 2047
	    add %ECX, DWORD PTR [%ESP + 4]
	    shl %ECX, 6
	    and %ECX, 131008
	    and %EAX, -131009
	    or %ECX, %EAX
	    mov DWORD PTR [b], %ECX
	    ret
	llvm-svn: 23385
*	Generalize this transform, using MaskedValueIsZero, allowing us to compile:	Chris Lattner	2005-09-18	1	-14/+21
	    struct S { unsigned int i : 6, j : 11, k : 15; } b;
	    void plus3 (unsigned int x) { b.k += x; }
	To:
	plus3:
	    mov %EAX, DWORD PTR [%ESP + 4]
	    shl %EAX, 17
	    add DWORD PTR [b], %EAX
	    ret
	instead of:
	plus3:
	    mov %EAX, DWORD PTR [%ESP + 4]
	    shl %EAX, 17
	    mov %ECX, DWORD PTR [b]
	    add %EAX, %ECX
	    and %EAX, -131072
	    and %ECX, 131071
	    or %ECX, %EAX
	    mov DWORD PTR [b], %ECX
	    ret
	llvm-svn: 23384
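	The plus3 result can be checked by hand: a rough sketch in C, modelling the bitfield storage as a plain 32-bit word. It assumes `k` occupies the top 15 bits (bits 17..31), as the masks 131071 and -131072 in the x86 output suggest. The names `plus3_masked`, `plus3_folded`, and `check_plus3_fold` are hypothetical and are not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

#define K_SHIFT  17u        /* assumed bit offset of field k within the word */
#define LOW_MASK 0x1FFFFu   /* 131071: the bits holding fields i and j */

/* Long form, mirroring the "instead of" sequence: add, then re-insert
   the untouched low 17 bits of the original word. */
static uint32_t plus3_masked(uint32_t word, uint32_t x) {
    uint32_t sum = (x << K_SHIFT) + word;
    return (sum & ~LOW_MASK) | (word & LOW_MASK);
}

/* Short form: x << 17 has zero low bits, so the add cannot disturb i or j,
   and any carry out of bit 31 is discarded either way. */
static uint32_t plus3_folded(uint32_t word, uint32_t x) {
    return word + (x << K_SHIFT);
}

/* Hypothetical self-check over a handful of edge-case inputs. */
static void check_plus3_fold(void) {
    const uint32_t words[] = {0u, 1u, 0x1FFFFu, 0x20000u, 0xDEADBEEFu, 0xFFFFFFFFu};
    const uint32_t xs[]    = {0u, 1u, 2u, 0x7FFFu, 0x8000u, 0xFFFFFFFFu};
    for (unsigned i = 0; i < sizeof words / sizeof words[0]; ++i)
        for (unsigned j = 0; j < sizeof xs / sizeof xs[0]; ++j)
            assert(plus3_masked(words[i], xs[j]) == plus3_folded(words[i], xs[j]));
}
```

	Both forms agree because the shifted operand contributes nothing to the low 17 bits, which is presumably the kind of fact MaskedValueIsZero lets the transform establish.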