bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[LoopAccesses] Create the analysis pass	Adam Nemet	2015-02-19	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a function pass that runs the analysis on demand. The analysis can be initiated by querying the loop access info via LAA::getInfo. It either returns the cached info or runs the analysis. Symbolic stride information continues to reside outside of this analysis pass. We may move it inside later but it's not a priority for me right now. The idea is that Loop Distribution won't support run-time stride checking at least initially. This means that when querying the analysis, symbolic stride information can be provided optionally. Whether stride information is used can invalidate the cache entry and rerun the analysis. Note that if the loop does not have any symbolic stride, the entry should be preserved across Loop Distribution and LV. Since currently the only user of the pass is LV, I just check that the symbolic stride information didn't change when using a cached result. On the LV side, LoopVectorizationLegality requests the info object corresponding to the loop from the analysis pass. A large chunk of the diff is due to LAI becoming a pointer from a reference. A test will be added as part of the -analyze patch. Also tested that with AVX, we generate identical assembly output for the testsuite (including the external testsuite) before and after. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229893
*	LSR: Move set instead of copying. NFC.	Benjamin Kramer	2015-02-19	1	-4/+2
\| \| \| \|	llvm-svn: 229871
*	Revert r229622: "[LoopAccesses] Make VectorizerParams global" and others. ↵	NAKAMURA Takumi	2015-02-18	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	r229622 brought cyclic dependencies between Analysis and Vector.
	r229622: "[LoopAccesses] Make VectorizerParams global"
	r229623: "[LoopAccesses] Stash the report from the analysis rather than emitting it"
	r229624: "[LoopAccesses] Cache the result of canVectorizeMemory"
	r229626: "[LoopAccesses] Create the analysis pass"
	r229628: "[LoopAccesses] Change debug messages from LV to LAA"
	r229630: "[LoopAccesses] Add canAnalyzeLoop"
	r229631: "[LoopAccesses] Add missing const to APIs in VectorizationReport"
	r229632: "[LoopAccesses] Split out LoopAccessReport from VectorizerReport"
	r229633: "[LoopAccesses] Add -analyze support"
	r229634: "[LoopAccesses] Change LAA:getInfo to return a constant reference"
	r229638: "Analysis: fix buildbots"
	llvm-svn: 229650
*	[LoopAccesses] Create the analysis pass	Adam Nemet	2015-02-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a function pass that runs the analysis on demand. The analysis can be initiated by querying the loop access info via LAA::getInfo. It either returns the cached info or runs the analysis. Symbolic stride information continues to reside outside of this analysis pass. We may move it inside later but it's not a priority for me right now. The idea is that Loop Distribution won't support run-time stride checking at least initially. This means that when querying the analysis, symbolic stride information can be provided optionally. Whether stride information is used can invalidate the cache entry and rerun the analysis. Note that if the loop does not have any symbolic stride, the entry should be preserved across Loop Distribution and LV. Since currently the only user of the pass is LV, I just check that the symbolic stride information didn't change when using a cached result. On the LV side, LoopVectorizationLegality requests the info object corresponding to the loop from the analysis pass. A large chunk of the diff is due to LAI becoming a pointer from a reference. A test will be added as part of the -analyze patch. Also tested that with AVX, we generate identical assembly output for the testsuite (including the external testsuite) before and after. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229626
*	[BDCE] Don't forget uses of root instructions seen before the instruction itself	Hal Finkel	2015-02-18	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When visiting the initial list of "root" instructions (those which must always be alive), for those that are integer-valued (such as invokes returning an integer), we mark their bits as (initially) all dead (we might, obviously, find uses of those bits later, but all bits are assumed dead until proven otherwise). Don't do so, however, if we've already seen a use of those bits by another root instruction (such as a store). Fixes a miscompile of the sanitizer unit tests on x86_64. Also, add a debug line for visiting the root instructions, and remove a debug line which tried to print instructions being removed (printing dead instructions is dangerous, and can sometimes crash). llvm-svn: 229618
*	Fixed a bug in store sinking.	Elena Demikhovsky	2015-02-17	1	-4/+6
\| \| \| \| \| \| \| \| \| \|	The problem was in the store-sink barrier check. The store-sink barrier should be checked for ModRef (read-write) mode. http://llvm.org/bugs/show_bug.cgi?id=22613 llvm-svn: 229495
*	[BDCE] Add a bit-tracking DCE pass	Hal Finkel	2015-02-17	3	-0/+414
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the "aggressive DCE" pass), with the added capability to track dead bits of integer valued instructions and remove those instructions when all of the bits are dead. Currently, it does not actually do this all-bits-dead removal, but rather replaces the instruction's uses with a constant zero, and lets instcombine (and the later run of ADCE) do the rest. Because we essentially get a run of ADCE "for free" while tracking the dead bits, we also do what ADCE does and remove actually-dead instructions as well (this includes instructions newly trivially dead because all bits were dead, but not all such instructions can be removed).
	The motivation for this is a case like:
	  int __attribute__((const)) foo(int i);
	  int bar(int x) {
	    x |= (4 & foo(5));
	    x |= (8 & foo(3));
	    x |= (16 & foo(2));
	    x |= (32 & foo(1));
	    x |= (64 & foo(0));
	    x |= (128& foo(4));
	    return x >> 4;
	  }
	As it turns out, if you order the bit-field insertions so that all of the dead ones come last, then instcombine will remove them. However, if you pick some other order (such as the one above), the fact that some of the calls to foo() are useless is not locally obvious, and we don't remove them (without this pass).
	I did a quick compile-time overhead check using sqlite from the test suite (Release+Asserts). BDCE took ~0.4% of the compilation time (making it about twice as expensive as ADCE).
	I've not looked at why yet, but we eliminate instructions due to having all-dead bits in:
	  External/SPEC/CFP2006/447.dealII/447.dealII
	  External/SPEC/CINT2006/400.perlbench/400.perlbench
	  External/SPEC/CINT2006/403.gcc/403.gcc
	  MultiSource/Applications/ClamAV/clamscan
	  MultiSource/Benchmarks/7zip/7zip-benchmark
	llvm-svn: 229462
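As a rough illustration of the dead-bit reasoning in the `bar` example (not the pass's actual implementation), the following standalone C++ sketch checks which OR-ed masks can still affect `x >> 4`. The helper names `liveBitsAfterShr`, `isInsertionDead`, and `deadMasks` are hypothetical, and 32-bit values are assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: bits of a 32-bit value that can still influence the
// result of `x >> Shift`. The low `Shift` bits are shifted out, so they are dead.
inline uint32_t liveBitsAfterShr(unsigned Shift) {
  assert(Shift < 32 && "shift amount must be smaller than the bit width");
  return ~uint32_t{0} << Shift;
}

// Hypothetical helper: an insertion `x |= (Mask & foo(...))` is useless when
// every bit it could set is dead.
inline bool isInsertionDead(uint32_t Mask, uint32_t LiveBits) {
  return (Mask & LiveBits) == 0;
}

// Returns the masks whose insertions only touch dead bits, in input order.
inline std::vector<uint32_t> deadMasks(const std::vector<uint32_t> &Masks,
                                       unsigned Shift) {
  const uint32_t Live = liveBitsAfterShr(Shift);
  std::vector<uint32_t> Dead;
  for (uint32_t Mask : Masks)
    if (isInsertionDead(Mask, Live))
      Dead.push_back(Mask);
  return Dead;
}
```

For the masks used in `bar`, only 4 and 8 lie entirely below bit 4, so under this model only the `foo(5)` and `foo(3)` insertions feed dead bits.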
*	[ADCE] Don't indent inside an anonymous namespace	Hal Finkel	2015-02-16	1	-11/+10
\| \| \| \| \| \| \|	To be consistent with what clang-format does, don't add extra indentation inside an anonymous namespace. NFC. llvm-svn: 229412
*	[LoopReroll] Relax some assumptions a little.	James Molloy	2015-02-16	1	-3/+6
\| \| \| \| \| \| \| \|	We won't find a root with index zero in any loop that we are able to reroll. However, we may find one in a non-rerollable loop, so bail gracefully instead of failing hard. llvm-svn: 229406
*	[LoopReroll] Don't crash on dead code	James Molloy	2015-02-16	1	-0/+2
\| \| \| \| \| \| \| \|	If a PHI has no users, don't crash; bail gracefully. This shouldn't happen often, but we can make no guarantees that previous passes didn't leave dead code around. llvm-svn: 229405
*	Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for ↵	Aaron Ballman	2015-02-15	1	-4/+4
\| \| \| \| \| \|	requiring the macro. NFC; LLVM edition. llvm-svn: 229340
*	[ADCE] Convert another loop to a range-based for	Hal Finkel	2015-02-15	1	-2/+2
\| \| \| \| \| \|	We can use a range-based for for the operands loop too; NFC. llvm-svn: 229319
*	[ADCE] Use inst_range and range-based fors	Hal Finkel	2015-02-15	1	-14/+13
\| \| \| \| \| \|	Convert a few loops to range-based fors; NFC. llvm-svn: 229318
*	[ADCE] Fix formatting of pointer types	Hal Finkel	2015-02-15	1	-2/+2
\| \| \| \| \| \|	We prefer to put the * with the variable, not with the type; NFC. llvm-svn: 229317
*	[ADCE] Fix capitalization of another local variable	Hal Finkel	2015-02-15	1	-2/+2
\| \| \| \| \| \|	Bring another local variable in compliance with our naming conventions, NFC. llvm-svn: 229316
*	[ADCE] Fix capitalization of some local variables	Hal Finkel	2015-02-15	1	-14/+14
\| \| \| \| \| \|	Bring some local variables in compliance with our naming conventions, NFC. llvm-svn: 229315
*	[optnone] Skip pass Constant Hoisting on optnone functions.	Andrea Di Biagio	2015-02-14	1	-0/+3
\| \| \| \| \| \| \|	Added test CodeGen/X86/constant-hoisting-optnone.ll to verify that pass Constant Hoisting is not run on optnone functions. llvm-svn: 229258
*	Transforms: Canonicalize access to function attributes, NFC	Duncan P. N. Exon Smith	2015-02-14	2	-6/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Canonicalize access to function attributes to use the simpler API.
	  getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind)
	  getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind)
	llvm-svn: 229202
*	[PM] Remove the old 'PassManager.h' header file at the top level of	Chandler Carruth	2015-02-13	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LLVM's include tree and the use of using declarations to hide the 'legacy' namespace for the old pass manager. This undoes the primary modules-hostile change I made to keep out-of-tree targets building. I sent an email inquiring about whether this would be reasonable to do at this phase and people seemed fine with it, so making it a reality. This should allow us to start bootstrapping with modules to a certain extent along with making it easier to mix and match headers in general. The updates to any code for users of LLVM are very mechanical. Switch from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h". Qualify the types which now produce compile errors with "legacy::". The most common ones are "PassManager", "PassManagerBase", and "FunctionPassManager". llvm-svn: 229094
*	[unroll] Concede defeat and disable the unroll analyzer for now.	Chandler Carruth	2015-02-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	The issues with the new unroll analyzer are more fundamental than code cleanup, algorithm, or data structure changes. I've sent an email to the original commit thread with details and a proposal for how to redesign things. I'm disabling this for now so that we don't spend time debugging issues with it in its current state. llvm-svn: 229064
*	[unroll] Merge the simplification and DCE estimation methods on the	Chandler Carruth	2015-02-13	1	-20/+16
\| \| \| \| \| \| \| \| \| \| \|	UnrollAnalyzer. Now they share a single worklist and have less implicit state between them. There was no real benefit to separating these two things out. I'm going to subsequently refactor things to share even more code. llvm-svn: 229062
*	[unroll] Remove pointless dyn_cast<>s to Instruction - the users of an	Chandler Carruth	2015-02-13	1	-12/+4
\| \| \| \| \| \|	instruction must by definition be instructions. llvm-svn: 229061
*	[unroll] Don't check the loop set for whether an instruction is	Chandler Carruth	2015-02-13	1	-4/+2
\| \| \| \| \| \| \| \| \|	contained in it each time we try to add it to the worklist, just check this when pulling it off the worklist. That way we do it at most once per instruction with the cost of the worklist set we would need to pay anyways. llvm-svn: 229060
*	[unroll] Change the other worklist in the unroll analyzer to be a set	Chandler Carruth	2015-02-13	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vector. In addition to dramatically reducing the work required for contrived example loops, this also has to correct some serious latent bugs in the cost computation. Previously, we might add an instruction onto the worklist once for every load which it used and was simplified. Then we would visit it many times and accumulate "savings" each time. I mean, fortunately this couldn't matter for things like calls with 100s of operands, but even for binary operators this code seems like it must be double counting the savings. I just noticed this by inspection and due to the runtime problems it can introduce, I don't have any test cases for cases where the cost produced by this routine is unacceptable. llvm-svn: 229059
*	[unroll] Replace a boolean, for loop, condition, and break with	Chandler Carruth	2015-02-13	1	-7/+3
\| \| \| \| \| \| \|	std::all_of and a lambda. Much cleaner, no functionality changed. llvm-svn: 229058
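The shape of this cleanup can be sketched in standalone C++. This is a generic illustration, not the code touched by the commit; the function names and the non-negativity predicate are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Before: a boolean flag, a for loop, a condition, and a break.
inline bool allNonNegativeLoop(const std::vector<int> &Values) {
  bool AllOk = true;
  for (int V : Values) {
    if (V < 0) {
      AllOk = false;
      break;
    }
  }
  return AllOk;
}

// After: the same check expressed with std::all_of and a lambda.
// Both forms return true for an empty range and stop at the first failure.
inline bool allNonNegative(const std::vector<int> &Values) {
  return std::all_of(Values.begin(), Values.end(),
                     [](int V) { return V >= 0; });
}
```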
*	[unroll] Directly query for dead instructions.	Chandler Carruth	2015-02-13	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the unroll analyzer, it is checking each user to see if that user will become dead. However, it first checked if that user was missing from the simplified values map, and then if it was also missing from the dead instructions set. We add everything from the simplified values map to the dead instructions set, so the first step is completely subsumed by the second. Moreover, the first step requires inserting something into the simplified value map which isn't what we want at all. This also replaces a dyn_cast with a cast as an instruction cannot be used by a non-instruction. llvm-svn: 229057
*	[unroll] Replace a linear time check for no uses with a constant time	Chandler Carruth	2015-02-13	1	-3/+2
\| \| \| \| \| \| \| \| \| \|	check. Also hoist this into the enqueue process as it is faster even than testing the worklist set, we should just directly filter these out much like we filter out constants and such. llvm-svn: 229056
*	[unroll] Rather than an operand set, use a setvector for the worklist.	Chandler Carruth	2015-02-13	1	-10/+14
\| \| \| \| \| \| \| \| \| \|	We don't just want to handle duplicate operands within an instruction, but also duplicates across operands of different instructions. I should have gone straight to this, but I had convinced myself that it wasn't going to be necessary briefly. I've come to my senses after chatting more with Nick, and am now happier here. llvm-svn: 229054
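A minimal sketch of a de-duplicating worklist, written against the standard library only. `SetVectorWorklist` is a hypothetical stand-in for the idea described above, not LLVM's own `SetVector`, and it assumes the element type is hashable.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical worklist that keeps insertion order in a vector and uses a
// set to reject elements that are already pending.
template <typename T> class SetVectorWorklist {
  std::vector<T> Order;        // pending items, in insertion order
  std::unordered_set<T> Seen;  // membership test for pending items

public:
  // Returns true if the value was newly added, false if it was a duplicate.
  bool insert(const T &Value) {
    if (!Seen.insert(Value).second)
      return false;
    Order.push_back(Value);
    return true;
  }

  bool empty() const { return Order.empty(); }
  std::size_t size() const { return Order.size(); }

  // Removes and returns the most recently inserted pending item.
  T popBack() {
    assert(!Order.empty() && "cannot pop from an empty worklist");
    T Value = Order.back();
    Order.pop_back();
    Seen.erase(Value);
    return Value;
  }
};
```

With this shape, an operand that appears in many instructions is queued once while it is pending, instead of once per use.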
*	[unroll] Extract the code to enqueue operands for the worklist in the	Chandler Carruth	2015-02-13	1	-10/+11
\| \| \| \| \| \| \|	unroll analysis into a lambda and call it. That's much simpler than duplicating all the code. llvm-svn: 229053
*	[unroll] Use a small set to de-duplicate operands prior to putting them	Chandler Carruth	2015-02-13	1	-2/+12
\| \| \| \| \| \| \|	into the worklist. This avoids allocating lots of worklist memory for them when there are large numbers of repeated operands. llvm-svn: 229052
*	[unroll] Make the unroll cost analysis terminate deterministically and	Chandler Carruth	2015-02-13	1	-23/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	reasonably quickly. I don't have a reduced test case, but for a version of FFMPEG, this makes the loop unroller start finishing at all (after over 15 minutes of running, it hadn't terminated for me, no idea if it was a true infloop or just exponential work). The key thing here is to check the DeadInstructions set when pulling things off the worklist. Without this, we would re-walk the user list of already dead instructions again and again and again. Consider phi nodes with many, many operands and other patterns. The other important aspect of this is that because we would keep re-visiting instructions that were already known dead, we kept adding their cost savings to this! This would cause our cost savings to be insanely inflated from this. While I was here, I also rotated the operand walk out of the worklist loop to make the code easier to read. There is still work to be done to minimize worklist traffic because we don't de-duplicate operands. This means we may add the same instruction onto the worklist 1000s of times if it shows up in 1000s of operands to a PHI node for example. Still, with this patch, the ffmpeg testcase I have finishes quickly and I can't measure the runtime impact of the unroll analysis any more. I'll probably try to do a few more cleanups to this code, but not sure how much cleanup I can justify right now. llvm-svn: 229038
*	[unroll] Make range based for loops a bit more explicit and more	Chandler Carruth	2015-02-13	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	readable. The biggest thing that was causing me problems is recognizing the references vs. pointers here. I also found that for maps naming the loop variable as KeyValue helps make it obvious why you don't actually use it directly. Finally, using 'auto' instead of 'User *' doesn't seem like a good tradeoff. Much like with the other cases, I like to know it's a pointer, and 'User' is just as long and tells the reader a lot more. llvm-svn: 229033
*	[unroll] Avoid the "Insn" abbreviation of Instruction. This is quite	Chandler Carruth	2015-02-13	1	-16/+17
\| \| \| \| \| \| \| \| \|	hard to type and read for me, and is inconsistent with the other abbreviation in the base class "Inst". For most of these (where they are used widely) I prefer just spelling it out as Instruction. I've changed two of the short-lived variables to use "Inst" to match the base class. llvm-svn: 229028
*	[unroll] Tidy up the integer we use to accumulate the number of	Chandler Carruth	2015-02-13	1	-2/+5
\| \| \| \| \| \| \|	instructions optimized. NFC, just separating this out from the functionality changing commit. llvm-svn: 229026
*	[unroll] Don't use a map from pointer to bool. Use a set.	Chandler Carruth	2015-02-13	1	-4/+4
\| \| \| \| \| \| \| \| \|	This is much more efficient. In particular, the query with the user instruction has to insert a false for every missing instruction into the map. This is just a cleanup along the way to fixing the underlying algorithm problems here. llvm-svn: 228994
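The cost of a pointer-to-bool map can be seen in a small standalone sketch. It uses `std::map` and `std::set` as stand-ins (an assumption; the commit does not say which containers are involved), and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>

// Querying through operator[] default-constructs an entry on a miss, so every
// lookup of an unknown key grows the map with a {Key, false} pair.
inline bool isDeadViaMap(std::map<const void *, bool> &DeadMap,
                         const void *Inst) {
  return DeadMap[Inst];
}

// A set only stores the keys that are actually dead; a miss inserts nothing,
// which also lets the query take the container by const reference.
inline bool isDeadViaSet(const std::set<const void *> &DeadSet,
                         const void *Inst) {
  return DeadSet.count(Inst) != 0;
}
```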
*	Prevent division by 0.	Michael Zolotukhin	2015-02-13	1	-1/+1
\| \| \| \| \| \| \| \|	When we try to estimate number of potentially removed instructions in loop unroller, we analyze first N iterations and then scale the computed number by TripCount/N. We should bail out early if N is 0. llvm-svn: 228988
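A minimal sketch of the guarded scaling step, assuming the estimate is a plain integer count. The function and parameter names are hypothetical, and returning 0 for the bail-out case is an assumption rather than what the unroller does.

```cpp
#include <cassert>
#include <cstdint>

// Scales the number of instructions removed in the first
// `AnalyzedIterations` iterations up to the whole loop by
// TripCount / AnalyzedIterations. Bails out early when no iterations were
// analyzed, which would otherwise divide by zero.
inline uint64_t scaleRemovedInstructions(uint64_t RemovedInAnalyzed,
                                         uint64_t TripCount,
                                         uint64_t AnalyzedIterations) {
  if (AnalyzedIterations == 0)
    return 0; // assumption: report no savings when nothing was analyzed
  return RemovedInAnalyzed * TripCount / AnalyzedIterations;
}
```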
*	[unroll] Update the new analysis logic from r228265 to use modern coding	Chandler Carruth	2015-02-13	1	-10/+10
\| \| \| \| \| \| \|	conventions for function names consistently. Some were already using this but not all. llvm-svn: 228987
*	[LoopRerolling] Be more forgiving with instruction order.	James Molloy	2015-02-12	1	-17/+82
\| \| \| \| \| \| \| \| \|	We can't solve the full subgraph isomorphism problem. But we can allow obvious cases, where for example two instructions of different types are out of order. Due to them having different types/opcodes, there is no ambiguity. llvm-svn: 228931
*	Reassociate: cannot negate an INT_MIN value	Mehdi Amini	2015-02-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When trying to canonicalize negative constants out of multiplication expressions, we need to check that the constant is not INT_MIN which cannot be negated. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7286 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 228872
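The underlying constraint can be shown in a few lines of standalone C++: a 32-bit two's-complement integer ranges from -2^31 to 2^31-1, so the negation of the minimum value is not representable. The helpers `canNegate` and `tryNegate` are hypothetical names for this sketch and are not part of Reassociate.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when -V fits in int32_t; only the minimum value has no negation.
inline bool canNegate(int32_t V) {
  return V != std::numeric_limits<int32_t>::min();
}

// Writes -V to Out and returns true when the negation is representable;
// otherwise leaves Out untouched and returns false.
inline bool tryNegate(int32_t V, int32_t &Out) {
  if (!canNegate(V))
    return false;
  Out = -V;
  return true;
}
```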
*	[LoopReroll] Introduce the concept of DAGRootSets.	James Molloy	2015-02-11	1	-202/+376
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A DAGRootSet models an induction variable being used in a rerollable loop. For example:
	  x[i*3+0] = y1
	  x[i*3+1] = y2
	  x[i*3+2] = y3

	  Base instruction -> i*3
	                   +---+----+
	                  /    |     \
	              ST[y1]  +1     +2  <-- Roots
	                       |      |
	                     ST[y2] ST[y3]

	There may be multiple DAGRootSets, for example:
	  x[i*2+0] = ...   (1)
	  x[i*2+1] = ...   (1)
	  x[i*2+4] = ...   (2)
	  x[i*2+5] = ...   (2)
	  x[(i+1234)*2+5678] = ... (3)
	  x[(i+1234)*2+5679] = ... (3)
	This concept is similar to the "Scale" member used previously, but allows multiple independent sets of roots based off the same induction variable. llvm-svn: 228821
*	Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects.	Zachary Turner	2015-02-11	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	This allows IDEs to recognize the entire set of header files for each of the core LLVM projects. Differential Revision: http://reviews.llvm.org/D7526 Reviewed By: Chris Bieneman llvm-svn: 228798
*	EarlyCSE: It isn't safe to CSE across synchronization boundaries	David Majnemer	2015-02-10	1	-0/+3
\| \| \| \| \| \|	This fixes PR22514. llvm-svn: 228760
*	Adjust how we avoid poll insertion inside the poll function (NFC)	Philip Reames	2015-02-10	1	-5/+11
\| \| \| \| \| \| \| \|	I realized that my early fix for this was overly complicated. Rather than scatter checks around in a bunch of places, just exit early when we visit the poll function itself. Thinking about it a bit, the whole inlining mechanism used with gc.safepoint_poll could probably be cleaned up a bit. Originally, poll insertion was fused with gc relocation rewriting. It might be worth going back to see if we can simplify the chain of events now that these two are separated. As one thought, maybe it makes sense to rewrite calls inside the helper function before inlining it to the many callers. This would require us to visit the poll function before any other functions though. llvm-svn: 228634
*	Debug info: When updating debug info during SROA, do not emit debug info	Adrian Prantl	2015-02-09	1	-8/+18
\| \| \| \| \| \| \| \| \| \|	for any padding introduced by SROA. In particular, do not emit debug info for an alloca that represents only the padding introduced by a previous iteration. Fixes PR22495. llvm-svn: 228632
*	Debug info: Use DW_OP_bit_piece instead of DW_OP_piece in the	Adrian Prantl	2015-02-09	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \|	intermediate representation. This
	  - increases consistency by using the same granularity everywhere
	  - allows for pieces < 1 byte
	  - DW_OP_piece didn't actually allow storing an offset.
	Part of PR22495. llvm-svn: 228631
*	[Statepoint] Improve two asserts, fix some style (NFC)	Ramkumar Ramachandra	2015-02-09	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It's important that our users immediately know what gc.safepoint_poll is. Also fix the style of the declaration of CreateGCStatepoint, in preparation for another change that will wrap it. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7517 llvm-svn: 228626
*	PlaceSafepoints: modernize gc.result.* -> gc.result	Ramkumar Ramachandra	2015-02-09	1	-12/+1
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D7516 llvm-svn: 228625
*	Update file comment to clarify points highlighted in review (NFC)	Philip Reames	2015-02-09	1	-31/+30
\| \| \| \|	llvm-svn: 228621
*	Use range for loops in PlaceSafepoints (NFC)	Philip Reames	2015-02-09	1	-8/+4
\| \| \| \|	llvm-svn: 228620
*	Add basic tests for PlaceSafepoints	Philip Reames	2015-02-09	1	-5/+21
\| \| \| \| \| \|	This is just adding really simple tests which should have been part of the original submission. When doing so, I discovered that I'd mistakenly removed required pieces when preparing the patch for upstream submission. I fixed two such bugs in this submission. llvm-svn: 228610