Commit messages

1) have it fold "br undef", which does occur with
surprising frequency as jump threading iterates.
2) teach jump threading to delete dead blocks. This removes the successor
edges, reducing the in-edges of other blocks, allowing
recursive simplification.
3) Fold things like:
br COND, BBX, BBY
BBX:
br COND, BBZ, BBW
which also happens because jump threading iterates.
llvm-svn: 60470
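
As a source-level illustration of fold 3 (added here for context, not part of the commit; the function names are invented), the inner test below repeats a condition the outer branch has already decided, so the duplicated branch can only go one way:

  #include <cstdio>

  // Toy example of fold 3: the inner test of 'cond' is dominated by an
  // identical outer test, so once the blocks are laid out this way the
  // inner branch (br COND, BBZ, BBW) always takes the same edge and
  // folds to an unconditional branch.
  int example(bool cond, int x) {
    if (cond) {          // br COND, BBX, BBY
      if (cond)          // BBX: br COND, BBZ, BBW -- always true here
        return x + 1;    // BBZ
      return x - 1;      // BBW: unreachable along this path
    }
    return x;            // BBY
  }

  int main() {
    std::printf("%d\n", example(true, 41));  // prints 42
    return 0;
  }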

llvm-svn: 60469

llvm-svn: 60468

unconditionally delete the block. All likely clients will
do the checking anyway.
llvm-svn: 60464
DeleteBlockIfDead method.
llvm-svn: 60463

llvm-svn: 60442

llvm-svn: 60431

straightforward implementation. This does not require any extra
alias analysis queries beyond what we already do for non-local loads.
Some programs really, really like load PRE: it triggers ~1000 times
on SPASS, ~300 times on 255.vortex, and ~1500 times on 403.gcc.
The biggest limitation of the implementation is that it does not split
critical edges. This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.
This implementation should incidentally speed up rejection of
non-local loads, because it avoids creating the repl densemap in cases
where it won't be used for fully redundant loads.
This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile-time performance numbers, and look at
the performance impact. This is pretty close to ready though.
llvm-svn: 60408
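
For context, here is a small source-level sketch (not from the commit; sum_if and its arguments are invented) of the kind of partial redundancy load PRE targets:

  #include <cstdio>

  // The load of *p in the return statement is partially redundant: the
  // value is already loaded when 'cond' is true, but not on the other
  // path. Load PRE inserts a load of *p into the predecessor where it is
  // missing, which makes the final load fully redundant so it can be
  // deleted.
  int sum_if(int *p, bool cond) {
    int v = 0;
    if (cond)
      v = *p;        // *p is loaded on this path only
    return v + *p;   // partially redundant load of *p
  }

  int main() {
    int x = 21;
    std::printf("%d\n", sum_if(&x, true));  // prints 42
    return 0;
  }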

llvm-svn: 60403

llvm-svn: 60402

llvm-svn: 60401

constant. If X is a constant, then this is folded elsewhere.
- Added a note to Target/README.txt to indicate that we'd like to implement
this when we're able.
llvm-svn: 60399

llvm-svn: 60398

- No need to do a swap on a canonicalized pattern.
No functionality change.
llvm-svn: 60397

llvm-svn: 60395

a new value numbering set after splitting a critical edge. This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.
llvm-svn: 60393
figuring out the base of the IV. This produces better
code in the example. (Addresses use (IV) instead of
(BASE,IV) - a significant improvement on low-register
machines like x86).
llvm-svn: 60374

llvm-svn: 60370

llvm-svn: 60369

integer is "minint".
llvm-svn: 60366
don't have overlapping bits.
llvm-svn: 60344

llvm-svn: 60343

llvm-svn: 60341

fiddling with constants unless we have to.
llvm-svn: 60340
instead of throughout it.
llvm-svn: 60339
that it isn't reallocated all the time. This is a tiny speedup for
GVN: 3.90->3.88s
llvm-svn: 60338

llvm-svn: 60337

instead of std::sort. This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.
We should start using array_pod_sort where possible.
llvm-svn: 60335
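
For reference, a minimal use of array_pod_sort from llvm/ADT/STLExtras.h; this is a hedged sketch that assumes a current LLVM tree to build against, not code from this commit:

  #include "llvm/ADT/STLExtras.h"
  #include <cstdio>

  int main() {
    // array_pod_sort is a thin wrapper around qsort for POD element
    // types, so call sites share one out-of-line sort routine instead of
    // each instantiating std::sort; that is where the code-size savings
    // come from.
    int Values[] = {4, 1, 3, 2};
    llvm::array_pod_sort(Values, Values + 4);
    for (int V : Values)
      std::printf("%d ", V);   // prints: 1 2 3 4
    std::printf("\n");
    return 0;
  }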
This is a lot cheaper and conceptually simpler.
llvm-svn: 60332
DeadInsts ivar, just use it directly.
llvm-svn: 60330
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then
notifying.
llvm-svn: 60329
xor in testcase (or is a substring).
llvm-svn: 60328
new instructions it simplifies. Because we're threading jumps on edges
with constants coming in from PHIs, we are inherently exposing a lot more
constants to the new block. Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.
llvm-svn: 60327
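
A source-level sketch of why threading over PHI edges exposes constants (illustrative only; pick and compute are invented names):

  #include <cstdio>

  int compute() { return 7; }   // stands in for an opaque, non-constant value

  // When 'a' is true, x is the constant 1 coming into the PHI at the join.
  // If jump threading clones the tail for that edge, the cloned compare
  // (x == 1) sees a constant operand -- the kind of newly exposed constant
  // the commit above folds eagerly so the cost model stays accurate.
  int pick(bool a) {
    int x;
    if (a)
      x = 1;          // constant incoming value
    else
      x = compute();
    if (x == 1)
      return 10;
    return 20;
  }

  int main() {
    std::printf("%d %d\n", pick(true), pick(false));  // prints 10 20
    return 0;
  }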
instead of using FoldPHIArgBinOpIntoPHI. In addition to being more
obvious, this also fixes a problem where instcombine wouldn't merge two
phis that had different variable indices. This prevented instcombine
from factoring big chunks of code in 403.gcc. For example:
insn_cuid.exit:
- %tmp336 = load i32** @uid_cuid, align 4
- %tmp337 = getelementptr %struct.rtx_def* %insn_addr.0.ph.i, i32 0, i32 3
- %tmp338 = bitcast [1 x %struct.rtunion]* %tmp337 to i32*
- %tmp339 = load i32* %tmp338, align 4
- %tmp340 = getelementptr i32* %tmp336, i32 %tmp339
br label %bb62
bb61:
- %tmp341 = load i32** @uid_cuid, align 4
- %tmp342 = getelementptr %struct.rtx_def* %insn, i32 0, i32 3
- %tmp343 = bitcast [1 x %struct.rtunion]* %tmp342 to i32*
- %tmp344 = load i32* %tmp343, align 4
- %tmp345 = getelementptr i32* %tmp341, i32 %tmp344
br label %bb62
bb62:
- %iftmp.62.0.in = phi i32* [ %tmp345, %bb61 ], [ %tmp340, %insn_cuid.exit ]
+ %insn.pn2 = phi %struct.rtx_def* [ %insn, %bb61 ], [ %insn_addr.0.ph.i, %insn_cuid.exit ]
+ %tmp344.pn.in.in = getelementptr %struct.rtx_def* %insn.pn2, i32 0, i32 3
+ %tmp344.pn.in = bitcast [1 x %struct.rtunion]* %tmp344.pn.in.in to i32*
+ %tmp341.pn = load i32** @uid_cuid
+ %tmp344.pn = load i32* %tmp344.pn.in
+ %iftmp.62.0.in = getelementptr i32* %tmp341.pn, i32 %tmp344.pn
%iftmp.62.0 = load i32* %iftmp.62.0.in
llvm-svn: 60325
important because it is sinking the loads using the GEPs, but
not the GEPs themselves. This triggers 647 times on 403.gcc
and makes the .s file much much nicer. For example before:
je LBB1_87 ## bb78
LBB1_62: ## bb77
leal 84(%esi), %eax
LBB1_63: ## bb79
movl (%eax), %eax
...
LBB1_87: ## bb78
movl $0, 4(%esp)
movl %esi, (%esp)
call L_make_decl_rtl$stub
jmp LBB1_62 ## bb77
after:
jne LBB1_63 ## bb79
LBB1_62: ## bb78
movl $0, 4(%esp)
movl %esi, (%esp)
call L_make_decl_rtl$stub
LBB1_63: ## bb79
movl 84(%esi), %eax
The input code was as follows (the GEPs are now merged, and the PHI
is then eliminated by instcombine):
br i1 %tmp233, label %bb78, label %bb77
bb77:
%tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22
br label %bb79
bb78:
call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind
%tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22
br label %bb79
bb79:
%iftmp.12.0.in = phi %struct.rtx_def** [ %tmp235, %bb78 ], [ %tmp234, %bb77 ]
%iftmp.12.0 = load %struct.rtx_def** %iftmp.12.0.in
llvm-svn: 60322
elimination: when finding dependent loads/stores, treat them as
referring to the same location when alias analysis reports MustAlias,
instead of relying on the pointers being exactly equal. This makes load
elimination more aggressive. For example, on 403.gcc, we had:
< 68 gvn - Number of instructions PRE'd
< 152718 gvn - Number of instructions deleted
< 49699 gvn - Number of loads deleted
< 6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses
now we have:
> 64 gvn - Number of instructions PRE'd
> 153623 gvn - Number of instructions deleted
> 49856 gvn - Number of loads deleted
> 5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses
That's an extra 157 loads deleted and extra 905 other instructions nuked.
This slows down GVN very slightly, from 3.91 to 3.96s.
llvm-svn: 60314
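
A small, invented source-level example of a redundancy that exact pointer equality misses but a MustAlias answer catches:

  #include <cstdio>

  struct Node { int id; int data; };

  // The two loads below read the same field, but through pointer values
  // produced by separate address computations, so their pointer operands
  // are not the identical Value. Exact pointer equality misses the
  // redundancy; a MustAlias answer from alias analysis proves the loads
  // read the same location, letting the second one be eliminated.
  int twice_data(Node *n) {
    int a = n->data;     // first load
    int *p = &n->data;   // a second, separate address computation
    int b = *p;          // redundant once MustAlias is used
    return a + b;
  }

  int main() {
    Node n = {1, 21};
    std::printf("%d\n", twice_data(&n));  // prints 42
    return 0;
  }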
vector instead of a DenseMap. This substantially shrinks the memory usage
(the high-water mark) and makes operations like scanning it faster. This
speeds up memdep slightly: GVN goes from 3.9376s to 3.9118s on 403.gcc.
This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached cases. Here are the stats
for 403.gcc:
6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses
yay for caching :)
llvm-svn: 60313
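
A self-contained sketch of the underlying data-structure trade-off; this is not the MemoryDependenceAnalysis code, and Block/DepResult are simplified stand-ins for the real types:

  #include <algorithm>
  #include <cstdio>
  #include <utility>
  #include <vector>

  // A cached non-local query result is a list of (block, dependency)
  // entries. Keeping them in a vector sorted by block costs far less
  // memory than a hash map (no buckets, no per-entry overhead), scanning
  // the whole list is a linear sweep over contiguous memory, and lookups
  // stay O(log n) via binary search.
  using Block = int;               // stand-in for BasicBlock*
  using DepResult = const char *;  // stand-in for MemDepResult
  using Entry = std::pair<Block, DepResult>;

  DepResult lookup(const std::vector<Entry> &Cache, Block BB) {
    auto It = std::lower_bound(Cache.begin(), Cache.end(),
                               Entry(BB, nullptr),
                               [](const Entry &A, const Entry &B) {
                                 return A.first < B.first;
                               });
    return (It != Cache.end() && It->first == BB) ? It->second : "unknown";
  }

  int main() {
    std::vector<Entry> Cache = {{1, "clobber"}, {3, "def"}, {7, "non-local"}};
    std::printf("%s %s\n", lookup(Cache, 3), lookup(Cache, 5));  // def unknown
    return 0;
  }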
permutations of this pattern.
llvm-svn: 60312
This speeds up GVN from 4.0386s to 3.9376s.
llvm-svn: 60310
remove some FIXMEs. This speeds up GVN very slightly on 403.gcc
(4.06s -> 4.03s).
llvm-svn: 60309
functional change.
llvm-svn: 60307
Note that the FoldOpIntoPhi call is dead because it's impossible for the
first operand of a subtraction to be both a ConstantInt and a PHINode.
llvm-svn: 60306

llvm-svn: 60291

takes care of all permutations of this pattern.
llvm-svn: 60290

llvm-svn: 60289

APInt calls instead.
This fixes PR3144.
llvm-svn: 60288
only show up in code from front-ends besides llvm-gcc, like clang.
llvm-svn: 60287

llvm-svn: 60279

"For signed integers, the determination of overflow of x*y is not so simple. If
x and y have the same sign, then overflow occurs iff xy > 2**31 - 1. If they
have opposite signs, then overflow occurs iff xy < -2**31."
In this case, x == -1.
llvm-svn: 60278
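
The quoted rule can be checked by widening to 64 bits; a minimal illustration for the 32-bit case (not the code touched by this commit), including the x == -1 corner:

  #include <cstdint>
  #include <cstdio>
  #include <limits>

  // Performs the multiplication in 64 bits and compares against the
  // 32-bit range, which covers both cases of the quoted rule. In
  // particular, x == -1 overflows exactly when y == INT32_MIN, because
  // -INT32_MIN is not representable in 32 bits.
  bool smul_overflows(int32_t x, int32_t y) {
    int64_t wide = static_cast<int64_t>(x) * static_cast<int64_t>(y);
    return wide > std::numeric_limits<int32_t>::max() ||
           wide < std::numeric_limits<int32_t>::min();
  }

  int main() {
    int32_t minint = std::numeric_limits<int32_t>::min();
    std::printf("%d %d\n", smul_overflows(-1, minint),    // 1: overflows
                smul_overflows(-1, minint + 1));          // 0: fits
    return 0;
  }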
overflowed on negation. This commit checks to make sure that neither C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.
llvm-svn: 60275
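
On the negation side, the only 32-bit value whose negation overflows is INT32_MIN; a tiny illustrative helper (not the actual transform code):

  #include <cstdint>
  #include <cstdio>
  #include <limits>

  // -v is representable for every 32-bit value except INT32_MIN, whose
  // negation would be 2^31 and does not fit in int32_t.
  bool negation_overflows(int32_t v) {
    return v == std::numeric_limits<int32_t>::min();
  }

  int main() {
    std::printf("%d %d\n", negation_overflows(INT32_MIN),  // 1
                negation_overflows(-1));                   // 0
    return 0;
  }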