bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[ScalarOpts] Remove dead code.	Benjamin Kramer	2015-10-15	1	-11/+3
\| \| \| \| \| \|	Does not touch debug dumpers. NFC. llvm-svn: 250417
*	[Unroll] Do not crash trying to propagate a value to vector load.	Michael Zolotukhin	2015-09-22	1	-0/+6
\| \| \| \|	llvm-svn: 248333
*	[Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad.	Michael Zolotukhin	2015-09-22	1	-1/+1
\| \| \| \| \| \| \| \|	Apart from checking that GlobalVariable is a constant, we should check that it's not a weak constant, in which case we can't propagate its value. llvm-svn: 248327
*	[Unroll] Fix a bug in UnrolledInstAnalyzer::visitLoad.	Michael Zolotukhin	2015-09-16	1	-1/+1
\| \| \| \| \| \| \| \|	We only checked that a global is initialized with constants, which is incorrect. We should be checking that GlobalVariable is a constant, not just initialized with it. llvm-svn: 247769
*	Add GlobalsAA as preserved to a bunch of transforms	James Molloy	2015-09-10	1	-0/+2
\| \| \| \| \| \|	GlobalsAA must by definition be preserved in function passes, but the passmanager doesn't know that. Make each pass explicitly preserve GlobalsAA. llvm-svn: 247263
*	Make helper functions static. NFC.	Benjamin Kramer	2015-08-20	1	-1/+1
\| \| \| \|	llvm-svn: 245549
*	[PM] Port ScalarEvolution to the new pass manager.	Chandler Carruth	2015-08-17	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can, uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't really called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo *does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle. 
It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193
*	Add new llvm.loop.unroll.enable metadata.	Mark Heffernan	2015-08-10	1	-20/+40
\| \| \| \| \| \| \| \| \| \| \| \| \|	This change adds the unroll metadata "llvm.loop.unroll.enable" which directs the optimizer to unroll a loop fully if the trip count is known at compile time, and unroll partially if the trip count is not known at compile time. This differs from "llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not known at compile time. The "llvm.loop.unroll.enable" is intended to be added for loops annotated with "#pragma unroll". llvm-svn: 244466
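The difference between the two metadata kinds described above can be sketched as a small decision function. This is only an illustration of the commit message's wording, not code from the pass; the enum and function names (`UnrollHint`, `UnrollAction`, `decideUnroll`) are hypothetical.

```cpp
// Hypothetical labels for the loop metadata discussed in the commit message.
enum class UnrollHint { None, Full, Enable };

// Hypothetical labels for what the unroller would attempt.
enum class UnrollAction { Default, FullUnroll, PartialUnroll, NoUnroll };

// Maps a metadata hint and trip-count knowledge to an action, following the
// commit message: "enable" falls back to partial unrolling when the trip
// count is unknown, while "full" does not unroll at all in that case.
UnrollAction decideUnroll(UnrollHint Hint, bool TripCountKnown) {
  switch (Hint) {
  case UnrollHint::Full:
    return TripCountKnown ? UnrollAction::FullUnroll : UnrollAction::NoUnroll;
  case UnrollHint::Enable:
    return TripCountKnown ? UnrollAction::FullUnroll
                          : UnrollAction::PartialUnroll;
  case UnrollHint::None:
    break;
  }
  // No metadata: leave the decision to the normal heuristics.
  return UnrollAction::Default;
}
```

The sketch ignores thresholds and every other input the real heuristics consult.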
*	Fix some comment typos.	Benjamin Kramer	2015-08-08	1	-2/+2
\| \| \| \|	llvm-svn: 244402
*	[Unroll] Switch to using 'int' cost types in preparation for a somewhat	Chandler Carruth	2015-08-05	1	-6/+6
\| \| \| \| \| \|	more involved change to the cost computation pattern. llvm-svn: 244095
*	wrap OptSize and MinSize attributes for easier and consistent access (NFCI)	Sanjay Patel	2015-08-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994
*	[Unroll] Improve the brute force loop unroll estimate by propagating	Chandler Carruth	2015-08-03	1	-4/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	through PHI nodes across iterations. This patch teaches the new advanced loop unrolling heuristics to propagate constants into the loop from the preheader and around the backedge after simulating each iteration. This lets us brute force solve simple recurrences that aren't modeled effectively by SCEV. It also makes it more clear why we need to process the loop in-order rather than bottom-up which might otherwise make much more sense (for example, for DCE). This came out of an attempt I'm making to develop a principled way to account for dead code in the unroll estimation. When I implemented a forward-propagating version of that it produced incorrect results due to failing to propagate cost between loop iterations through the PHI nodes, and it occurred to me we really should at least propagate simplifications across those edges, and it is quite easy thanks to the loop being in canonical and LCSSA form. Differential Revision: http://reviews.llvm.org/D11706 llvm-svn: 243900
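As a rough illustration of the idea, the sketch below seeds a header PHI with its preheader constant and feeds the backedge value into the next simulated iteration. It is a toy model, not the analyzer's code: `StepFn`, `simulatePhiValues` and `tripleIt` are hypothetical names, and the multiplicative recurrence is only an assumed example of something a brute-force walk can resolve.

```cpp
#include <vector>

// Hypothetical model of the value computed along the backedge: the next PHI
// value is Step(previous PHI value).
using StepFn = long (*)(long);

// Returns the constant the PHI holds at the start of each simulated
// iteration. Iteration 0 sees the preheader value; every later iteration sees
// whatever the previous iteration sent around the backedge.
std::vector<long> simulatePhiValues(long PreheaderValue, unsigned TripCount,
                                    StepFn Step) {
  std::vector<long> Values;
  long Current = PreheaderValue;
  for (unsigned Iter = 0; Iter < TripCount; ++Iter) {
    Values.push_back(Current);
    Current = Step(Current); // propagate around the backedge
  }
  return Values;
}

// Example backedge computation: x = x * 3.
long tripleIt(long X) { return X * 3; }
```

Because each iteration depends on the previous one, the iterations have to be walked in order, which matches the commit's remark about in-order processing.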
*	[Unroll] Handle SwitchInst properly.	Michael Zolotukhin	2015-07-29	1	-2/+2
\| \| \| \| \| \|	Previously successor selection was simply wrong. llvm-svn: 243545
*	[Unroll] Don't crash when simplified branch condition is undef.	Michael Zolotukhin	2015-07-29	1	-4/+14
\| \| \| \|	llvm-svn: 243544
*	[Unroll] Add debug dumps to loop-unroll analyzer.	Michael Zolotukhin	2015-07-28	1	-2/+21
\| \| \| \|	llvm-svn: 243471
*	[Unroll] Don't analyze blocks outside the loop.	Michael Zolotukhin	2015-07-28	1	-4/+8
\| \| \| \|	llvm-svn: 243466
*	Handle resolvable branches in complete loop unroll heuristic.	Michael Zolotukhin	2015-07-24	1	-2/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Resolving a branch allows us to ignore blocks that won't be executed, and thus make our estimate more accurate. This patch is intended to be applied after D10205 (though it could be applied independently). Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10206 llvm-svn: 243084
*	[LoopUnrolling] Handle cast instructions.	Michael Zolotukhin	2015-07-15	1	-0/+15
\| \| \| \| \| \| \| \| \|	During estimation of unrolling effect we should be able to propagate constants through casts. Differential Revision: http://reviews.llvm.org/D10207 llvm-svn: 242257
*	Enable runtime unrolling with unroll pragma metadata	Mark Heffernan	2015-07-13	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N") and a runtime trip count. Also, do not unroll loops with unroll full metadata if the loop has a runtime loop count. Previously, such loops would be unrolled with a very large threshold (pragma-unroll-threshold) if runtime unrolled happened to be enabled resulting in a very large (and likely unwise) unroll factor. llvm-svn: 242047
*	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)	Alexander Kornienko	2015-06-23	1	-1/+1
\| \| \| \| \| \|	Apparently, the style needs to be agreed upon first. llvm-svn: 240390
*	Fixed/added namespace ending comments using clang-tidy. NFC	Alexander Kornienko	2015-06-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
*	Update stale comment before analyzeLoopUnrollCost. NFC.	Michael Zolotukhin	2015-06-11	1	-7/+9
\| \| \| \|	llvm-svn: 239565
*	Remove SCEVCache and FindConstantPointers from complete loop unrolling ↵	Michael Zolotukhin	2015-06-08	1	-212/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	heuristic. Summary: Using some SCEV functionality helped to entirely remove SCEVCache class and FindConstantPointers SCEV visitor. Also, this makes the code more universal - I'll take advantage of it in next patches where I start handling additional types of instructions. Test Plan: Tests would be submitted in subsequent patches. Reviewers: atrick, chandlerc Reviewed By: atrick, chandlerc Subscribers: atrick, llvm-commits Differential Revision: http://reviews.llvm.org/D10205 llvm-svn: 239282
*	[LoopUnroll] Fix truncation bug in canUnrollCompletely.	Sanjoy Das	2015-06-06	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: canUnrollCompletely takes `unsigned` values for `UnrolledCost` and `RolledDynamicCost` but is passed in `uint64_t`s that are silently truncated. Because of this, when `UnrolledSize` is a large integer that has a small remainder with UINT32_MAX, LLVM tries to completely unroll loops with high trip counts. Reviewers: mzolotukhin, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10293 llvm-svn: 239218
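The truncation described above is easy to reproduce in isolation. The sketch below is not the LLVM function; the helper names are hypothetical, and it assumes `unsigned` is 32 bits wide, as on the usual LLVM host platforms.

```cpp
#include <cstdint>

// Hypothetical stand-in for the pre-fix signature: the cost parameter is
// `unsigned`, so a uint64_t argument is silently narrowed at the call site.
bool fitsThresholdTruncating(unsigned UnrolledCost, unsigned Threshold) {
  return UnrolledCost <= Threshold;
}

// Same comparison with a 64-bit parameter, so nothing is narrowed.
bool fitsThreshold(uint64_t UnrolledCost, unsigned Threshold) {
  return UnrolledCost <= Threshold;
}

// A cost just above 2^32: with a 32-bit `unsigned` it narrows to 5.
uint64_t exampleHugeCost() { return (uint64_t{1} << 32) + 5; }
```

With a threshold of 150, `fitsThresholdTruncating(exampleHugeCost(), 150)` accepts the loop because it only sees the value 5, while `fitsThreshold` rejects it.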
*	[Unroll] Rework the naming and structure of the new unroll heuristics.	Chandler Carruth	2015-06-05	1	-95/+121
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new naming is (to me) much easier to understand. Here is a summary of the new state of the world: - 'Threshold' is the threshold for full unrolling. It is measured against the estimated unrolled cost as computed by getUserCost in TTI (or CodeMetrics, etc). We will exceed this threshold when unrolling loops where unrolling exposes a significant degree of simplification of the logic within the loop. - 'PercentDynamicCostSavedThreshold' is the percentage of the loop's estimated dynamic execution cost which needs to be saved by unrolling to apply a discount to the estimated unrolled cost. - 'DynamicCostSavingsDiscount' is the discount applied to the estimated unrolling cost when the dynamic savings are expected to be high. When actually analyzing the loop, we now produce both an estimated unrolled cost, and an estimated rolled cost. The rolled cost is notably a dynamic estimate based on our analysis of the expected execution of each iteration. While we're still working to build up the infrastructure for making these estimates, to me it is much more clear *how* to make them better when they have reasonably descriptive names. For example, we may want to apply estimated (from heuristics or profiles) dynamic execution weights to the dynamic cost estimates. If we start doing that, we would also need to track the static unrolled cost and the dynamic unrolled cost, as only the latter could reasonably be weighted by profile information. This patch is sadly not without functionality change for the new unroll analysis logic. Buried in the heuristic management were several things that surprised me. For example, we never subtracted the optimized instruction count off when comparing against the unroll heuristics! 
I don't know if this just got lost somewhere along the way or what, but with the new accounting of things, this is much easier to keep track of and we use the post-simplification cost estimate to compare to the thresholds, and use the dynamic cost reduction ratio to select whether we can exceed the baseline threshold. The old values of these flags also don't necessarily make sense. My impression is that none of these thresholds or discounts have been tuned yet, and so they're just arbitrary placeholder numbers. As such, I've not bothered to adjust for the fact that this is now a discount and not a two-tier threshold model. We need to tune all these values once the logic is ready to be enabled. Differential Revision: http://reviews.llvm.org/D9966 llvm-svn: 239164
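One way to read the three parameters together is the sketch below. It follows the description in the commit message rather than the committed source. The struct and function names are hypothetical, and the numbers in the usage note are arbitrary, as are the real defaults according to the message itself.

```cpp
#include <cstdint>

// Hypothetical bundle of the three knobs named in the commit message.
struct UnrollThresholds {
  uint64_t Threshold;                        // limit for full unrolling
  unsigned PercentDynamicCostSavedThreshold; // savings needed for a discount
  uint64_t DynamicCostSavingsDiscount;       // discount on the unrolled cost
};

// Decides whether a loop may be fully unrolled, given the estimated unrolled
// cost and the estimated dynamic cost of the rolled loop. The sketch assumes
// costs are small enough that multiplying by 100 cannot overflow.
bool canFullyUnroll(const UnrollThresholds &T, uint64_t UnrolledCost,
                    uint64_t RolledDynamicCost) {
  // Cheap enough outright.
  if (UnrolledCost <= T.Threshold)
    return true;
  // Without a meaningful rolled cost there is no savings ratio to compute.
  if (RolledDynamicCost == 0 || RolledDynamicCost < UnrolledCost)
    return false;
  // Percentage of the rolled loop's dynamic cost that unrolling would save.
  uint64_t PercentSaved =
      (RolledDynamicCost - UnrolledCost) * 100 / RolledDynamicCost;
  // Apply the discount, clamping at zero to avoid unsigned wrap-around.
  uint64_t Discounted = UnrolledCost > T.DynamicCostSavingsDiscount
                            ? UnrolledCost - T.DynamicCostSavingsDiscount
                            : 0;
  return PercentSaved >= T.PercentDynamicCostSavedThreshold &&
         Discounted <= T.Threshold;
}
```

With `UnrollThresholds{150, 20, 2000}`, a loop with unrolled cost 1000 and rolled dynamic cost 2000 saves 50% and is accepted after the discount. The same loop with rolled dynamic cost 1100 saves only 9% and is rejected.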
*	[Unroll] Switch from an eagerly populated SCEV cache to one that is	Chandler Carruth	2015-05-25	1	-89/+116
\| \| \| \| \| \| \| \| \| \|	lazily built. Also, make it a much more generic SCEV cache, which today exposes only a reduced GEP model description but could be extended in the future to do other profitable caching of SCEV information. llvm-svn: 238124
*	[Unroll] Separate the logic for testing each iteration of the loop,	Chandler Carruth	2015-05-22	1	-106/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	accumulating estimated cost, and other loop-centric logic from the logic used to analyze instructions in a particular iteration. This makes the visitor very narrow in scope -- all it does is visit instructions, update a map of simplified values, and return whether it is able to optimize away a particular instruction. The two cost metrics are now returned as an optional struct. When the optional is left unengaged, there is no information about the unrolled cost of the loop, when it is engaged the cost metrics are available to run against the thresholds. No functionality changed. llvm-svn: 238033
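The "optional struct" return shape described above might look roughly like this. It uses `std::optional` from C++17 as a stand-in for the optional type used in the LLVM tree of that era, and every name (`ToyUnrollCost`, `analyzeToyLoop`, `worthUnrolling`) and the toy cost model are assumptions for illustration.

```cpp
#include <optional>

// Hypothetical pair of cost metrics produced by the analysis.
struct ToyUnrollCost {
  unsigned UnrolledSize;   // instructions after unrolling
  unsigned OptimizedInsts; // instructions expected to fold away
};

// Toy analysis: returns no value when nothing can be said about the loop
// (no trip count, too many iterations to simulate, inconsistent inputs).
std::optional<ToyUnrollCost> analyzeToyLoop(unsigned TripCount,
                                            unsigned BodySize,
                                            unsigned FoldablePerIter,
                                            unsigned MaxIterations) {
  if (TripCount == 0 || TripCount > MaxIterations ||
      FoldablePerIter > BodySize)
    return std::nullopt;
  return ToyUnrollCost{TripCount * BodySize, TripCount * FoldablePerIter};
}

// Caller side: an unengaged optional means "no information", so the metrics
// are only compared against the threshold when they exist.
bool worthUnrolling(const std::optional<ToyUnrollCost> &Cost,
                    unsigned Threshold) {
  if (!Cost)
    return false;
  return Cost->UnrolledSize - Cost->OptimizedInsts <= Threshold;
}
```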
*	[Unroll] Replace a hand-wavy FIXME with a FIXME that explains the actual	Chandler Carruth	2015-05-22	1	-1/+6
\| \| \| \| \| \| \|	problem instead of suggesting doing something that is trivial to do but incorrect given the current design of the libraries. llvm-svn: 237994
*	[Unroll] Extract the logic for caching SCEV-modeled GEPs with their	Chandler Carruth	2015-05-22	1	-67/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	simplified model for use simulating each iteration into a separate helper function that just returns the cache. Building this cache had nothing to do with the rest of the unroll analysis and so this removes an unnecessary coupling, etc. It should also make it easier to think about the concept of providing fast cached access to basic SCEV models as an orthogonal concept to the overall unroll simulation. I'd really like to see this kind of caching logic folded into SCEV itself, it seems weird for us to provide it at this layer rather than making repeated queries into SCEV fast all on their own. No functionality changed. llvm-svn: 237993
*	[Unroll] Refactor the accumulation of optimized instruction costs into	Chandler Carruth	2015-05-22	1	-9/+10
\| \| \| \| \| \| \| \| \| \| \| \|	a single location. This reduces code duplication a bit and will also pave the way for a better separation between the visitation algorithm and the unroll analysis. No functionality changed. llvm-svn: 237990
*	[Unrolling] Refactor the start and step offsets to simplify overflow	Chandler Carruth	2015-05-12	1	-10/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	checking and make the cache faster and smaller. I had thought that using an APInt here would be useful, but I think I was just wrong. Notably, we don't have to do any fancy overflow checking, we can just bound the values as quite small and do the math in a higher precision integer. I've switched to a signed integer so that UBSan will even point out if we ever have integer overflow. I've added various asserts to try to catch things as well and hoisted the overflow checks so that we just leave the too-large offsets out of the SCEV-GEP cache. This makes the value in the cache quite a bit smaller which is probably worthwhile. No functionality changed here (for trip counts under 1 billion). llvm-svn: 237209
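The "bound the values and do the math in a wider integer" approach can be sketched as follows. The bound of 2^30 and all names here are assumptions chosen so the arithmetic provably fits in `int64_t`; the commit does not state the actual limits.

```cpp
#include <cstdint>

// Hypothetical result type: Valid is false when the inputs were too large to
// be worth modeling, mirroring "leave the too-large offsets out of the cache".
struct OffsetResult {
  bool Valid;
  int64_t Value;
};

// Computes Start + Step * Iteration in signed 64-bit arithmetic. Both inputs
// are bounded by 2^30 in magnitude and Iteration is below 2^32, so the
// product stays below 2^62 and the sum below 2^63: no overflow is possible,
// and a sanitizer would flag it if that reasoning were ever wrong.
OffsetResult offsetAtIteration(int64_t Start, int64_t Step,
                               uint32_t Iteration) {
  constexpr int64_t Bound = int64_t{1} << 30; // assumed limit, not LLVM's
  if (Start < -Bound || Start > Bound || Step < -Bound || Step > Bound)
    return {false, 0};
  return {true, Start + Step * int64_t{Iteration}};
}
```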
*	Reimplement heuristic for estimating complete-unroll optimization effects.	Michael Zolotukhin	2015-05-12	1	-248/+300
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch reimplements the heuristic that tries to estimate optimization benefits from complete loop unrolling. In this patch I kept the minimal changes - e.g. I removed code handling branches and folding compares. That's a promising area, but now there are too many questions to discuss before we can enable it. Test Plan: Tests are included in the patch. Reviewers: hfinkel, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8816 llvm-svn: 237156
*	[LoopUnrollRuntime] Avoid high-cost trip count computation.	Sanjoy Das	2015-04-14	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Runtime unrolling of loops needs to emit an expression to compute the loop's runtime trip-count. Avoid runtime unrolling if this computation will be expensive. Depends on D8993. Reviewers: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8994 llvm-svn: 234846
*	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.	Benjamin Kramer	2015-03-23	1	-2/+2
\| \| \| \|	llvm-svn: 232998
*	Move private classes into anonymous namespaces	Benjamin Kramer	2015-03-23	1	-0/+2
\| \| \| \| \| \|	NFC. llvm-svn: 232944
*	DataLayout is mandatory, update the API to reflect it with references.	Mehdi Amini	2015-03-10	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that. This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation. I turned as many pointers to DataLayout into references as possible; this helped figure out all the places where a nullptr could come up. I had initially a local version of this patch broken into over 30 independent commits, but some later commits were cleaning the API and touching part of the code modified in the previous commits, so it seemed cleaner without the intermediate state. Test Plan: Reviewers: echristo Subscribers: llvm-commits From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231740
*	Introduce runtime unrolling disable metadata and use it to mark the scalar ↵	Kevin Qin	2015-03-09	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \|	loop from vectorization. Runtime unrolling is an expensive optimization which can bring benefit only if the loop is hot and the iteration count is relatively large. For some loops, we know they are not worth runtime unrolling. The scalar loop from vectorization is one of the cases. llvm-svn: 231631
*	Transforms: Canonicalize access to function attributes, NFC	Duncan P. N. Exon Smith	2015-02-14	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) llvm-svn: 229202
*	[unroll] Concede defeat and disable the unroll analyzer for now.	Chandler Carruth	2015-02-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	The issues with the new unroll analyzer are more fundamental than code cleanup, algorithm, or data structure changes. I've sent an email to the original commit thread with details and a proposal for how to redesign things. I'm disabling this for now so that we don't spend time debugging issues with it in its current state. llvm-svn: 229064
*	[unroll] Merge the simplification and DCE estimation methods on the	Chandler Carruth	2015-02-13	1	-20/+16
\| \| \| \| \| \| \| \| \| \| \|	UnrollAnalyzer. Now they share a single worklist and have less implicit state between them. There was no real benefit to separating these two things out. I'm going to subsequently refactor things to share even more code. llvm-svn: 229062
*	[unroll] Remove pointless dyn_cast<>s to Instruction - the users of an	Chandler Carruth	2015-02-13	1	-12/+4
\| \| \| \| \| \|	instruction must by definition be instructions. llvm-svn: 229061
*	[unroll] Don't check the loop set for whether an instruction is	Chandler Carruth	2015-02-13	1	-4/+2
\| \| \| \| \| \| \| \| \|	contained in it each time we try to add it to the worklist, just check this when pulling it off the worklist. That way we do it at most once per instruction with the cost of the worklist set we would need to pay anyways. llvm-svn: 229060
*	[unroll] Change the other worklist in the unroll analyzer to be a set	Chandler Carruth	2015-02-13	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vector. In addition to dramatically reducing the work required for contrived example loops, this also has to correct some serious latent bugs in the cost computation. Previously, we might add an instruction onto the worklist once for every load which it used and was simplified. Then we would visit it many times and accumulate "savings" each time. I mean, fortunately this couldn't matter for things like calls with 100s of operands, but even for binary operators this code seems like it must be double counting the savings. I just noticed this by inspection and due to the runtime problems it can introduce, I don't have any test cases for cases where the cost produced by this routine is unacceptable. llvm-svn: 229059
*	[unroll] Replace a boolean, for loop, condition, and break with	Chandler Carruth	2015-02-13	1	-7/+3
\| \| \| \| \| \| \|	std::all_of and a lambda. Much cleaner, no functionality changed. llvm-svn: 229058
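The refactoring pattern named above, shown on a toy predicate. Both functions are hypothetical illustrations rather than the code that was changed; they return the same answer for every input.

```cpp
#include <algorithm>
#include <vector>

// Before: a flag, a loop, a condition and a break.
bool allUsersDeadLoop(const std::vector<bool> &UserIsDead) {
  bool AllDead = true;
  for (bool Dead : UserIsDead) {
    if (!Dead) {
      AllDead = false;
      break;
    }
  }
  return AllDead;
}

// After: std::all_of with a lambda expresses the same check directly.
bool allUsersDead(const std::vector<bool> &UserIsDead) {
  return std::all_of(UserIsDead.begin(), UserIsDead.end(),
                     [](bool Dead) { return Dead; });
}
```

Both versions return true for an empty list, so the rewrite does not change the no-users case.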
*	[unroll] Directly query for dead instructions.	Chandler Carruth	2015-02-13	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the unroll analyzer, it is checking each user to see if that user will become dead. However, it first checked if that user was missing from the simplified values map, and then if it was also missing from the dead instructions set. We add everything from the simplified values map to the dead instructions set, so the first step is completely subsumed by the second. Moreover, the first step requires inserting something into the simplified value map which isn't what we want at all. This also replaces a dyn_cast with a cast as an instruction cannot be used by a non-instruction. llvm-svn: 229057
*	[unroll] Replace a linear time check for no uses with a constant time	Chandler Carruth	2015-02-13	1	-3/+2
\| \| \| \| \| \| \| \| \| \|	check. Also hoist this into the enqueue process as it is faster even than testing the worklist set, we should just directly filter these out much like we filter out constants and such. llvm-svn: 229056
*	[unroll] Rather than an operand set, use a setvector for the worklist.	Chandler Carruth	2015-02-13	1	-10/+14
\| \| \| \| \| \| \| \| \| \|	We don't just want to handle duplicate operands within an instruction, but also duplicates across operands of different instructions. I should have gone straight to this, but I had convinced myself that it wasn't going to be necessary briefly. I've come to my senses after chatting more with Nick, and am now happier here. llvm-svn: 229054
*	[unroll] Extract the code to enqueue operands for the worklist in the	Chandler Carruth	2015-02-13	1	-10/+11
\| \| \| \| \| \| \|	unroll analysis into a lambda and call it. That's much simpler than duplicating all the code. llvm-svn: 229053
*	[unroll] Use a small set to de-duplicate operands prior to putting them	Chandler Carruth	2015-02-13	1	-2/+12
\| \| \| \| \| \| \|	into the worklist. This avoids allocating lots of worklist memory for them when there are large numbers of repeated operands. llvm-svn: 229052
*	[unroll] Make the unroll cost analysis terminate deterministically and	Chandler Carruth	2015-02-13	1	-23/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	reasonably quickly. I don't have a reduced test case, but for a version of FFMPEG, this makes the loop unroller start finishing at all (after over 15 minutes of running, it hadn't terminated for me, no idea if it was a true infloop or just exponential work). The key thing here is to check the DeadInstructions set when pulling things off the worklist. Without this, we would re-walk the user list of already dead instructions again and again and again. Consider phi nodes with many, many operands and other patterns. The other important aspect of this is that because we would keep re-visiting instructions that were already known dead, we kept adding their cost savings to this! This would cause our cost savings to be insanely inflated from this. While I was here, I also rotated the operand walk out of the worklist loop to make the code easier to read. There is still work to be done to minimize worklist traffic because we don't de-duplicate operands. This means we may add the same instruction onto the worklist 1000s of times if it shows up in 1000s of operands to a PHI node for example. Still, with this patch, the ffmpeg testcase I have finishes quickly and I can't measure the runtime impact of the unroll analysis any more. I'll probably try to do a few more cleanups to this code, but not sure how much cleanup I can justify right now. llvm-svn: 229038
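The key check described above, consulting the dead set when an item comes off the worklist, can be sketched on a toy graph. `ToyInst`, `estimateSavings` and `diamondExample` are hypothetical, and the sketch simplifies the real analysis by treating everything reachable through user edges as dead.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Toy instruction: a cost plus the indices of the instructions that use it.
struct ToyInst {
  int Cost;
  std::vector<std::size_t> Users;
};

// Walks user edges from Root and sums the cost of every instruction reached.
// An instruction already in Dead is skipped when popped, so its users are
// walked once and its cost is counted once, however often it was enqueued.
int estimateSavings(const std::vector<ToyInst> &Insts, std::size_t Root) {
  std::vector<std::size_t> Worklist{Root};
  std::set<std::size_t> Dead;
  int Savings = 0;
  while (!Worklist.empty()) {
    std::size_t I = Worklist.back();
    Worklist.pop_back();
    if (!Dead.insert(I).second)
      continue; // already known dead: do not re-walk or re-count
    Savings += Insts[I].Cost;
    for (std::size_t User : Insts[I].Users)
      Worklist.push_back(User);
  }
  return Savings;
}

// Diamond: 0 is used by 1 and 2, and both of those are used by 3.
std::vector<ToyInst> diamondExample() {
  return {{1, {1, 2}}, {1, {3}}, {1, {3}}, {1, {}}};
}
```

In the diamond, instruction 3 is enqueued twice but counted once, so the total is 4 rather than 5. The worklist itself still holds duplicates, which is the remaining de-duplication work the message mentions.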