bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	LoopRotate: Don't walk the uses of a Constant	David Majnemer	2015-01-27	1	-11/+8
\| \| \| \| \| \| \| \| \| \|	LoopRotate wanted to avoid live range interference by looking at the uses of a Value in the loop latch and seeing if any lied outside of the loop. We would wrongly perform this operation on Constants. This fixes PR22337. llvm-svn: 227171
*	[PM] Remove the Pass argument from all of the critical edge splitting	Chandler Carruth	2015-01-19	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	APIs and replace it and numerous booleans with an option struct. The critical edge splitting API has a really large surface of flags and so it seems worth burning a small option struct / builder. This struct can be constructed with the various preserved analyses and then flags can be flipped in a builder style. The various users are now responsible for directly passing along their analysis information. This should be enough for the critical edge splitting to work cleanly with the new pass manager as well. This API is still pretty crufty and could be cleaned up a lot, but I've focused on this change just threading an option struct rather than a pass through the API. llvm-svn: 226456
*	[PM] Replace another Pass argument with specific analyses that are	Chandler Carruth	2015-01-18	1	-1/+1
\| \| \| \| \| \| \| \| \|	optionally updated by MergeBlockIntoPredecessors. No functionality changed, just refactoring to clear the way for the new pass manager. llvm-svn: 226392
*	[PM] Refactor how the LoopRotation pass access the DominatorTree.	Chandler Carruth	2015-01-18	1	-20/+18
\| \| \| \| \| \| \| \|	Instead of querying the pass every where we need to, do that once and cache a pointer in the pass object. This is both simpler and I'm about to add yet another place where we need to dig out that pointer. llvm-svn: 226391
*	[PM] Split the LoopInfo object apart from the legacy pass, creating	Chandler Carruth	2015-01-17	1	-4/+4
\| \| \| \| \| \| \| \| \| \|	a LoopInfoWrapperPass to wire the object up to the legacy pass manager. This switches all the clients of LoopInfo over and paves the way to port LoopInfo to the new pass manager. No functionality change is intended with this iteration. llvm-svn: 226373
*	[PM] Split the AssumptionTracker immutable pass into two separate APIs:	Chandler Carruth	2015-01-04	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a cache of assumptions for a single function, and an immutable pass that manages those caches. The motivation for this change is two fold. Immutable analyses are really hacks around the current pass manager design and don't exist in the new design. This is usually OK, but it requires that the core logic of an immutable pass be reasonably partitioned off from the pass logic. This change does precisely that. As a consequence it also paves the way for the many utility functions that deal in the assumptions to live in both pass manager worlds by creating an separate non-pass object with its own independent API that they all rely on. Now, the only bits of the system that deal with the actual pass mechanics are those that actually need to deal with the pass mechanics. Once this separation is made, several simplifications become pretty obvious in the assumption cache itself. Rather than using a set and callback value handles, it can just be a vector of weak value handles. The callers can easily skip the handles that are null, and eventually we can wrap all of this up behind a filter iterator. For now, this adds boiler plate to the various passes, but this kind of boiler plate will end up making it possible to port these passes to the new pass manager, and so it will end up factored away pretty reasonably. llvm-svn: 225131
*	Do not simplifyLatch for loops where hoisting increments couldresult in ↵	Yi Jiang	2014-10-29	1	-3/+30
\| \| \| \| \| \|	extra live range interferance llvm-svn: 220872
*	Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)	Hal Finkel	2014-09-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and also a DomTree pointer, and most of the changes are just to pass this new information when it is easily available from InstSimplify, InstCombine, etc. As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use (because we care that the @llvm.assume dominates the use because assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses. The significant changes are all in ValueTracking. Two main changes: First, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure, and I made internal static versions of the relevant functions that take this structure as a parameter. The new code does as you might expect, it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive. Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, the assume(a == b), to recurse on itself. The context and DT params are used to find applicable assumptions. An assumption needs to dominate the context instruction, or come after it deterministically. In this latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption) because otherwise the assumption would prove its feeding comparison trivial and would be removed. This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests). llvm-svn: 217342
*	Add functions for finding ephemeral values	Hal Finkel	2014-09-07	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a set of utility functions for collecting 'ephemeral' values. These are LLVM IR values that are used only by @llvm.assume intrinsics (directly or indirectly), and thus will be removed prior to code generation, implying that they should be considered free for certain purposes (like inlining). The inliner's cost analysis, and a few other passes, have been updated to account for ephemeral values using the provided functionality. This functionality is important for the usability of @llvm.assume, because it limits the "non-local" side-effects of adding llvm.assume on inlining, loop unrolling, etc. (these are hints, and do not generate code, so they should not directly contribute to estimates of execution cost). llvm-svn: 217335
*	Fix typos in comments and doc	JF Bastien	2014-08-05	1	-2/+2
\| \| \| \| \| \|	Committing http://reviews.llvm.org/D4798 for Robin Morisset (morisset@google.com) llvm-svn: 214934
*	Make the LoopRotate pass's maximum header size configurable both ↵	Owen Anderson	2014-05-26	1	-4/+14
\| \| \| \| \| \| \| \| \| \|	programmatically and via the command line, mirroring similar functionality in LoopUnroll. In situations where clients used custom unrolling thresholds, their intent could previously be foiled by LoopRotate having a hardcoded threshold. llvm-svn: 209617
*	[C++] Use 'nullptr'. Transforms edition.	Craig Topper	2014-04-25	1	-3/+3
\| \| \| \|	llvm-svn: 207196
*	[Modules] Fix potential ODR violations by sinking the DEBUG_TYPE	Chandler Carruth	2014-04-22	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	definition below all of the header #include lines, lib/Transforms/... edition. This one is tricky for two reasons. We again have a couple of passes that define something else before the includes as well. I've sunk their name macros with the DEBUG_TYPE. Also, InstCombine contains headers that need DEBUG_TYPE, so now those headers #define and #undef DEBUG_TYPE around their code, leaving them well formed modular headers. Fixing these headers was a large motivation for all of these changes, as "leaky" macros of this form are hard on the modules implementation. llvm-svn: 206844
*	D3348 - [BUG] "Rotate Loop" pass kills "llvm.vectorizer.enable" metadata	Alexey Bataev	2014-04-15	1	-0/+9
\| \| \| \|	llvm-svn: 206266
*	[C++11] Add range based accessors for the Use-Def chain of a Value.	Chandler Carruth	2014-03-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This requires a number of steps. 1) Move value_use_iterator into the Value class as an implementation detail 2) Change it to actually be a Use iterator rather than a User iterator. 3) Add an adaptor which is a User iterator that always looks through the Use to the User. 4) Wrap these in Value::use_iterator and Value::user_iterator typedefs. 5) Add the range adaptors as Value::uses() and Value::users(). 6) Update all of the callers to correctly distinguish between whether they wanted a use_iterator (and to explicitly dig out the User when needed), or a user_iterator which makes the Use itself totally opaque. Because #6 requires churning essentially everything that walked the Use-Def chains, I went ahead and added all of the range adaptors and switched them to range-based loops where appropriate. Also because the renaming requires at least churning every line of code, it didn't make any sense to split these up into multiple commits -- all of which would touch all of the same lies of code. The result is still not quite optimal. The Value::use_iterator is a nice regular iterator, but Value::user_iterator is an iterator over Users rather than over the User objects themselves. As a consequence, it fits a bit awkwardly into the range-based world and it has the weird extra-dereferencing 'operator->' that so many of our iterators have. I think this could be fixed by providing something which transforms a range of T&s into a range of Ts, but that can be separated into another patch, and it isn't yet 100% clear whether this is the right move. However, this change gets us most of the benefit and cleans up a substantial amount of code around Use and User. =] llvm-svn: 203364
*	[C++11] Add 'override' keyword to virtual methods that override their base ↵	Craig Topper	2014-03-05	1	-2/+2
\| \| \| \| \| \|	class. llvm-svn: 202953
*	[Modules] Move CFG.h to the IR library as it defines graph traits over	Chandler Carruth	2014-03-04	1	-1/+1
\| \| \| \| \| \|	IR types. llvm-svn: 202827
*	Disable most IR-level transform passes on functions marked 'optnone'.	Paul Robinson	2014-02-06	1	-0/+3
\| \| \| \| \| \| \| \| \|	Ideally only those transform passes that run at -O0 remain enabled, in reality we get as close as we reasonably can. Passes are responsible for disabling themselves, it's not the job of the pass manager to do it for them. llvm-svn: 200892
*	[LPM] Fix PR18643, another scary place where loop transforms failed to	Chandler Carruth	2014-01-29	1	-3/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	preserve loop simplify of enclosing loops. The problem here starts with LoopRotation which ends up cloning code out of the latch into the new preheader it is buidling. This can create a new edge from the preheader into the exit block of the loop which breaks LoopSimplify form. The code tries to fix this by splitting the critical edge between the latch and the exit block to get a new exit block that only the latch dominates. This sadly isn't sufficient. The exit block may be an exit block for multiple nested loops. When we clone an edge from the latch of the inner loop to the new preheader being built in the outer loop, we create an exiting edge from the outer loop to this exit block. Despite breaking the LoopSimplify form for the inner loop, this is fine for the outer loop. However, when we split the edge from the inner loop to the exit block, we create a new block which is in neither the inner nor outer loop as the new exit block. This is a predecessor to the old exit block, and so the split itself takes the outer loop out of LoopSimplify form. We need to split every edge entering the exit block from inside a loop nested more deeply than the exit block in order to preserve all of the loop simplify constraints. Once we try to do that, a problem with splitting critical edges surfaces. Previously, we tried a very brute force to update LoopSimplify form by re-computing it for all exit blocks. We don't need to do this, and doing this much will sometimes but not always overlap with the LoopRotate bug fix. Instead, the code needs to specifically handle the cases which can start to violate LoopSimplify -- they aren't that common. We need to see if the destination of the split edge was a loop exit block in simplified form for the loop of the source of the edge. For this to be true, all the predecessors need to be in the exact same loop as the source of the edge being split. If the dest block was originally in this form, we have to split all of the deges back into this loop to recover it. The old mechanism of doing this was conservatively correct because at least one of the exiting blocks it rewrote was the DestBB and so the DestBB's predecessors were fixed. But this is a much more targeted way of doing it. Making it targeted is important, because ballooning the set of edges touched prevents LoopRotate from being able to split edges it needs to split to preserve loop simplify in a coherent way -- the critical edge splitting would sometimes find the other edges in need of splitting but not others. Many, many thanks for help from Nick reducing these test cases mightily. And helping lots with the analysis here as this one was quite tricky to track down. llvm-svn: 200393
*	[PM] Split DominatorTree into a concrete analysis result object which	Chandler Carruth	2014-01-13	1	-16/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	can be used by both the new pass manager and the old. This removes it from any of the virtual mess of the pass interfaces and lets it derive cleanly from the DominatorTreeBase<> template. In turn, tons of boilerplate interface can be nuked and it turns into a very straightforward extension of the base DominatorTree interface. The old analysis pass is now a simple wrapper. The names and style of this split should match the split between CallGraph and CallGraphWrapperPass. All of the users of DominatorTree have been updated to match using many of the same tricks as with CallGraph. The goal is that the common type remains the resulting DominatorTree rather than the pass. This will make subsequent work toward the new pass manager significantly easier. Also in numerous places things became cleaner because I switched from re-running the pass (!!! mid way through some other passes run!!!) to directly recomputing the domtree. llvm-svn: 199104
*	[cleanup] Move the Dominators.h and Verifier.h headers into the IR	Chandler Carruth	2014-01-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	directory. These passes are already defined in the IR library, and it doesn't make any sense to have the headers in Analysis. Long term, I think there is going to be a much better way to divide these matters. The dominators code should be fully separated into the abstract graph algorithm and have that put in Support where it becomes obvious that evn Clang's CFGBlock's can use it. Then the verifier can manually construct dominance information from the Support-driven interface while the Analysis library can provide a pass which both caches, reconstructs, and supports a nice update API. But those are very long term, and so I don't want to leave the really confusing structure until that day arrives. llvm-svn: 199082
*	Don't #include heavy Dominators.h file in LoopInfo.h. This change reduces	Jakub Staszak	2013-12-07	1	-0/+1
\| \| \| \| \| \|	overall time of LLVM compilation by ~1%. llvm-svn: 196667
*	Correct word hyphenations	Alp Toker	2013-12-05	1	-1/+1
\| \| \| \| \| \| \|	This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471
*	Rotate multi-exit loops even if the latch was simplified.	Andrew Trick	2013-05-06	1	-14/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Test case by Michele Scandale! Fixes PR10293: Load not hoisted out of loop with multiple exits. There are few regressions with this patch, now tracked by rdar:13817079, and a roughly equal number of improvements. The regressions are almost certainly back luck because LoopRotate has very little idea of whether rotation is profitable. Doing better requires a more comprehensive solution. This checkin is a quick fix that lacks generality (PR10293 has a counter-example). But it trivially fixes the case in PR10293 without interfering with other cases, and it does satify the criteria that LoopRotate is a loop canonicalization pass that should avoid heuristics and special cases. I can think of two approaches that would probably be better in the long run. Ultimately they may both make sense. (1) LoopRotate should check that the current header would make a good loop guard, and that the loop does not already has a sufficient guard. The artifical SimplifiedLoopLatch check would be unnecessary, and the design would be more general and canonical. Two difficulties: - We need a strong guarantee that we won't endlessly rotate, so the analysis would need to be precise in order to avoid the SimplifiedLoopLatch precondition. - Analysis like this are usually based on SCEV, which we don't want to rely on. (2) Rotate on-demand in late loop passes. This could even be done by shoving the loop back on the queue after the optimization that needs it. This could work well when we find LICM opportunities in multi-branch loops. This requires some work, and it doesn't really solve the problem of SCEV wanting a loop guard before the analysis. llvm-svn: 181230
*	Switch CodeMetrics itself over to use TTI to determine if an instruction	Chandler Carruth	2013-01-21	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \|	is free. The whole CodeMetrics API should probably be reworked more, but this is enough to allow deleting the duplicate code there for computing whether an instruction is free. All of the passes using this have been updated to pull in TTI and hand it to the CodeMetrics stuff. Further, a dead CodeMetrics API (analyzeFunction) is nuked for lack of users. llvm-svn: 173036
*	Move all of the header files which are involved in modelling the LLVM IR	Chandler Carruth	2013-01-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
*	Add a new attribute, 'noduplicate'. If a function contains a noduplicate ↵	James Molloy	2012-12-20	1	-1/+7
\| \| \| \| \| \| \| \|	call, the call cannot be duplicated - Jump threading, loop unrolling, loop unswitching, and loop rotation are inhibited if they would duplicate the call. Similarly inlining of the function is inhibited, if that would duplicate the call (in particular inlining is still allowed when there is only one callsite and the function has internal linkage). llvm-svn: 170704
*	Use the new script to sort the includes of every file under lib.	Chandler Carruth	2012-12-03	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
*	LoopRotation: Make the brute force DomTree update more brute force.	Benjamin Kramer	2012-09-02	1	-32/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We update until we hit a fixpoint. This is probably slow but also slightly simplifies the code. It should also fix the occasional invalid domtrees observed when building with expensive checking. I couldn't find a case where this had a measurable slowdown, but if someone finds a pathological case where it does we may have to find a cleverer way of updating dominators here. Thanks to Duncan for the test case. llvm-svn: 163091
*	LoopRotation: Check some invariants of the dominator updating code.	Benjamin Kramer	2012-09-01	1	-0/+3
\| \| \| \|	llvm-svn: 163058
*	LoopRotate: Also rotate loops with multiple exits.	Benjamin Kramer	2012-08-30	1	-13/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old PHI updating code in loop-rotate was replaced with SSAUpdater a while ago, it has no problems with comples PHIs. What had to be fixed is detecting whether a loop was already rotated and updating dominators when multiple exits were present. This change increases overall code size a bit, mostly due to additional loop unrolling opportunities. Passes test-suite and selfhost with -verify-dom-info. Fixes PR7447. Thanks to Andy for the input on the domtree updating code. llvm-svn: 162912
*	Clean whitespaces.	Nadav Rotem	2012-07-24	1	-3/+4
\| \| \| \|	llvm-svn: 160668
*	loop-rotate shouldn't hoist alloca instructions out of a loop. Patch by ↵	Eli Friedman	2012-02-16	1	-1/+2
\| \| \| \| \| \|	Patrik Hägglund, with slightly modified test. Issue reported by Patrik Hägglund on llvmdev. llvm-svn: 150642
*	Add simplifyLoopLatch to LoopRotate pass.	Andrew Trick	2012-02-14	1	-0/+103
\| \| \| \| \| \|	This folds a simple loop tail into a loop latch. It covers the common (in fortran) case of postincrement loops. It's a "free" way to expose this type of loop to downstream loop optimizations that bail out on non-canonical loops (getLoopLatch is a heavily used check). llvm-svn: 150439
*	whitespace	Andrew Trick	2012-02-14	1	-30/+30
\| \| \| \|	llvm-svn: 150438
*	Make better use of the PHINode API.	Jay Foad	2011-06-20	1	-1/+1
\| \| \| \| \| \| \| \|	Change various bits of code to make better use of the existing PHINode API, to insulate them from forthcoming changes in how PHINodes store their operands. llvm-svn: 133434
*	Preserve line number information.	Devang Patel	2011-04-29	1	-1/+2
\| \| \| \|	llvm-svn: 130536
*	fix PR9523, a crash in looprotate on a non-canonical loop made out of ↵	Chris Lattner	2011-04-09	1	-1/+5
\| \| \| \| \| \|	indirectbr. llvm-svn: 129203
*	Do not hoist @llvm.dbg.value. Here, @llvm.dbg.value is "referring" a value ↵	Devang Patel	2011-02-14	1	-1/+2
\| \| \| \| \| \|	that is modified inside loop. llvm-svn: 125529
*	remove a bogus assertion: the latch block of a loop is not	Chris Lattner	2011-01-11	1	-6/+5
\| \| \| \| \| \| \|	neccesarily an uncond branch to the header. This fixes PR8955 (the assertion tripping). llvm-svn: 123219
*	When loop rotation happens, it is very common for the duplicated condbr	Chris Lattner	2011-01-08	1	-21/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to be foldable into an uncond branch. When this happens, we can make a much simpler CFG for the loop, which is important for nested loop cases where we want the outer loop to be aggressively optimized. Handle this case more aggressively. For example, previously on phi-duplicate.ll we would get this: define void @test(i32 %N, double* %G) nounwind ssp { entry: %cmp1 = icmp slt i64 1, 1000 br i1 %cmp1, label %bb.nph, label %for.end bb.nph: ; preds = %entry br label %for.body for.body: ; preds = %bb.nph, %for.cond %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ] %arrayidx = getelementptr inbounds double* %G, i64 %j.02 %tmp3 = load double* %arrayidx %sub = sub i64 %j.02, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.02, 1 br label %for.cond for.cond: ; preds = %for.body %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge for.cond.for.end_crit_edge: ; preds = %for.cond br label %for.end for.end: ; preds = %for.cond.for.end_crit_edge, %entry ret void } Now we get the much nicer: define void @test(i32 %N, double* %G) nounwind ssp { entry: br label %for.body for.body: ; preds = %entry, %for.body %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ] %arrayidx = getelementptr inbounds double* %G, i64 %j.01 %tmp3 = load double* %arrayidx %sub = sub i64 %j.01, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.01, 1 %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.end for.end: ; preds = %for.body ret void } With all of these recent changes, we are now able to compile: void foo(char X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i100] = 0; } into a single memset of 10000 bytes. This series of changes should also be helpful for other nested loop scenarios as well. llvm-svn: 123079
*	split ssa updating code out to its own helper function. Don't bother	Chris Lattner	2011-01-08	1	-74/+78
\| \| \| \| \| \| \|	moving the OrigHeader block anymore: we just merge it away anyway so its code layout doesn't matter. llvm-svn: 123077
*	Implement a TODO: Enhance loopinfo to merge away the unconditional branch	Chris Lattner	2011-01-08	1	-11/+7
\| \| \| \| \| \| \| \| \| \|	that it was leaving in loops after rotation (between the original latch block and the original header. With this change, it is possible for rotated loops to have just a single basic block, which is useful. llvm-svn: 123075
*	inline preserveCanonicalLoopForm now that it is simple.	Chris Lattner	2011-01-08	1	-39/+17
\| \| \| \|	llvm-svn: 123073
*	Three major changes:	Chris Lattner	2011-01-08	1	-115/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1. Rip out LoopRotate's domfrontier updating code. It isn't needed now that LICM doesn't use DF and it is super complex and gross. 2. Make DomTree updating code a lot simpler and faster. The old loop over all the blocks was just to find a block?? 3. Change the code that inserts the new preheader to just use SplitCriticalEdge instead of doing an overcomplex reimplementation of it. No behavior change, except for the name of the inserted preheader. llvm-svn: 123072
*	LoopRotate requires canonical loop form, so it always has preheaders	Chris Lattner	2011-01-08	1	-15/+11
\| \| \| \| \| \| \|	and latch blocks. Reorder entry conditions to make hte pass faster and more logical. llvm-svn: 123069
*	use the LI ivar.	Chris Lattner	2011-01-08	1	-3/+2
\| \| \| \|	llvm-svn: 123068
*	some cleanups: remove dead arguments and eliminate ivars	Chris Lattner	2011-01-08	1	-55/+36
\| \| \| \| \| \|	that are just passed to one function. llvm-svn: 123067
*	fix an issue duncan pointed out, which could cause loop rotate	Chris Lattner	2011-01-08	1	-12/+16
\| \| \| \| \| \|	to violate LCSSA form llvm-svn: 123066
*	Have loop-rotate simplify instructions (yay instsimplify!) as it clones	Chris Lattner	2011-01-08	1	-5/+21
\| \| \| \| \| \| \| \| \| \| \| \|	them into the loop preheader, eliminating silly instructions like "icmp i32 0, 100" in fixed tripcount loops. This also better exposes the bigger problem with loop rotate that I'd like to fix: once this has been folded, the duplicated conditional branch often turns into an uncond branch. Not aggressively handling this is pessimizing later loop optimizations somethin' fierce by making "dominates all exit blocks" checks fail. llvm-svn: 123060