bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[CaptureTracking] Volatile operations capture their memory location	David Majnemer	2016-05-26	1	-11/+36
\| \| \| \| \| \| \| \| \| \|	The memory location that corresponds to a volatile operation is very special. They are observed by the machine in ways which we cannot reason about. Differential Revision: http://reviews.llvm.org/D20555 llvm-svn: 270879
*	MemorySSA: Revert r269678 and r268068; replace with special casing in MemorySSA.	Peter Collingbourne	2016-05-26	1	-9/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	It turns out that too many passes are relying on alias analysis results for control dependencies. Until we fix that by introducing a more accurate modelling of control dependencies, special case assume in MemorySSA instead. Also introduce tests to ensure we don't regress the FunctionAttrs or LICM passes. Differential Revision: http://reviews.llvm.org/D20658 llvm-svn: 270823
*	[LazyValueInfo] Simplify `return after else`. NFCI.	Davide Italiano	2016-05-25	1	-4/+3
\| \| \| \|	llvm-svn: 270779
*	[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries	Michael Kuperstein	2016-05-25	1	-82/+120
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a we have (a) a GEP and (b) a pointer based on an alloca, and the beginning of the object the GEP points would have a negative offset with repsect to the alloca, then the GEP can not alias pointer (b). For example, consider code like: struct { int f0, int f1, ...} foo; ... foo alloca; foo random = bar(alloca); int f0 = &alloca.f0 int f1 = &random->f1; Which is lowered, approximately, to: %alloca = alloca %struct.foo %random = call %struct.foo @random(%struct.foo* %alloca) %f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0 %f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1 Assume %f1 and %f0 alias. Then %f1 would point into the object allocated by %alloca. Since the %f1 GEP is inbounds, that means %random must also point into the same object. But since %f0 points to the beginning of %alloca, the highest %f1 can be is (%alloca + 3). This means %random can not be higher than (%alloca - 1), and so is not inbounds, a contradiction. Differential Revision: http://reviews.llvm.org/D20495 llvm-svn: 270777
*	Look for a loop's starting location in the llvm.loop metadata	Hal Finkel	2016-05-25	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Getting accurate locations for loops is important, because those locations are used by the frontend to generate optimization remarks. Currently, optimization remarks for loops often appear on the wrong line, often the first line of the loop body instead of the loop itself. This is confusing because that line might itself be another loop, or might be somewhere else completely if the body was inlined function call. This happens because of the way we find the loop's starting location. First, we look for a preheader, and if we find one, and its terminator has a debug location, then we use that. Otherwise, we look for a location on an instruction in the loop header. The fallback heuristic is not bad, but will almost always find the beginning of the body, and not the loop statement itself. The preheader location search often fails because there's often not a preheader, and even when there is a preheader, depending on how it was formed, it sometimes carries the location of some preceeding code. I don't see any good theoretical way to fix this problem. On the other hand, this seems like a straightforward solution: Put the debug location in the loop's llvm.loop metadata. A companion Clang patch will cause Clang to insert llvm.loop metadata with appropriate locations when generating debugging information. With these changes, our loop remarks have much more accurate locations. Differential Revision: http://reviews.llvm.org/D19738 llvm-svn: 270771
*	[TLI] Also cover Linux 64 libfunc (stat64, ...) prototype checking.	Ahmed Bougacha	2016-05-25	1	-2/+2
\| \| \| \| \| \|	My script missed those in r270750. llvm-svn: 270763
*	[TLI] Fix NumParams==0 prototype checking typo.	Ahmed Bougacha	2016-05-25	1	-57/+43
\| \| \| \| \| \| \| \| \| \| \| \| \|	There was a typo in r267758. It caused invalid accesses when given something like "void @free(...)", as NumParams == 0, and we then try to look at the 0th parameter. Turns out, most of these were untested; add both attribute and missing-prototype checks for all libc libfuncs. Differential Revision: http://reviews.llvm.org/D20543 llvm-svn: 270750
*	[SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}"	Oleg Ranevskyy	2016-05-25	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Description This makes `WidenIV::widenIVUse` (IndVarSimplify.cpp) fail to widen narrow IV uses in some cases. The latter affects IndVarSimplify which may not eliminate narrow IV's when there actually exists such a possibility, thereby producing ineffective code. When `WidenIV::widenIVUse` gets a NarrowUse such as `{(-2 + %inc.lcssa),+,1}<nsw><%for.body3>`, it first tries to get a wide recurrence for it via the `getWideRecurrence` call. `getWideRecurrence` returns recurrence like this: `{(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>`. Then a wide use operation is generated by `cloneIVUser`. The generated wide use is evaluated to `{(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>`, which is different from the `getWideRecurrence` result. `cloneIVUser` sees the difference and returns nullptr. This patch also fixes the broken LLVM tests by adding missing <nsw> entries introduced by the correction. Minimal reproducer: ``` int foo(int a, int b, int c); int baz(); void bar() { int arr[20]; int i = 0; for (i = 0; i < 4; ++i) arr[i] = baz(); for (; i < 20; ++i) arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]); } ``` Clang command line: ``` clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir ``` Expected result: The ` -mllvm -debug` log shows that all the IV's for the second `for` loop have been eliminated. Reviewers: sanjoy Subscribers: atrick, asl, aemerson, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D20058 llvm-svn: 270695
*	[LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst.	Michael Zolotukhin	2016-05-24	1	-5/+1
\| \| \| \| \| \|	This fixes PR27847. Now for real. llvm-svn: 270629
*	[ValueTracking, InstSimplify] extend isKnownNonZero() to handle vector constants	Sanjay Patel	2016-05-24	1	-1/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	Similar in spirit to D20497 : If all elements of a constant vector are known non-zero, then we can say that the whole vector is known non-zero. It seems like we could extend this to FP scalar/vector too, but isKnownNonZero() says it only works for integers and pointers for now. Differential Revision: http://reviews.llvm.org/D20544 llvm-svn: 270562
*	[LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst.	Michael Zolotukhin	2016-05-24	1	-1/+6
\| \| \| \| \| \|	This fixes PR27847. llvm-svn: 270517
*	fix formatting; NFC	Sanjay Patel	2016-05-23	1	-4/+2
\| \| \| \|	llvm-svn: 270465
*	use 'auto' with 'dyn_cast'; fix formatting; NFC	Sanjay Patel	2016-05-22	1	-10/+8
\| \| \| \|	llvm-svn: 270370
*	[ValueTracking, InstCombine] extend isKnownToBeAPowerOfTwo() to handle ↵	Sanjay Patel	2016-05-22	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	vector splat constants We could try harder to handle non-splat vector constants too, but that seems much rarer to me. Note that the div test isn't resolved because there's a check for isIntegerTy() guarding that transform. Differential Revision: http://reviews.llvm.org/D20497 llvm-svn: 270369
*	Revert r270268 due to unused variable warnings.	Michael Kuperstein	2016-05-20	1	-12/+17
\| \| \| \|	llvm-svn: 270272
*	[BasicAA] Turn DecomposeGEPExpression runtime checks into asserts.	Michael Kuperstein	2016-05-20	1	-17/+12
\| \| \| \| \| \| \| \|	When it has a DataLayout, DecomposeGEPExpression() should return the same object as GetUnderlyingObject(). Per the FIXME, it currently always has a DL, so the runtime check is redundant and can become an assert. llvm-svn: 270268
*	Allow -inline-threshold to override default threshold.	Easwaran Raman	2016-05-19	1	-4/+7
\| \| \| \| \| \| \| \|	Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior. Differential Revision: http://reviews.llvm.org/D20452 llvm-svn: 270153
*	[LAA] Check independence of strided accesses before forward case	Matthew Simpson	2016-05-19	1	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \|	This patch changes the order in which we attempt to prove the independence of strided accesses. We previously did this after we knew the dependence distance was positive. With this change, we check for independence before handling the negative distance case. The patch prevents LAA from reporting forward dependences for independent strided accesses. This change was requested in the review of D19984. llvm-svn: 270072
*	[SCEV] Be more aggressive in proving NUW	Sanjoy Das	2016-05-17	1	-7/+20
\| \| \| \| \| \| \| \| \| \|	... for AddRec's in loops for which SCEV is unable to compute a max tripcount. This is the NUW variant of r269211 and fixes PR27691. (Note: PR27691 is not a correct or stability bug, it was created to track a pending task). llvm-svn: 269790
*	[BasicAA] Update comments based on feedback from hfinkel. NFCI.	Geoff Berry	2016-05-16	1	-1/+4
\| \| \| \| \| \| \|	Original change Hal's comments were based on: http://reviews.llvm.org/D19730 llvm-svn: 269678
*	[LAA] Rename forwarding conflict detection option (NFC)	Matthew Simpson	2016-05-16	1	-6/+6
\| \| \| \| \| \| \|	This patch renames the option enabling the store-to-load forwarding conflict detection optimization. This change was requested in the review of D20241. llvm-svn: 269668
*	[LAA] Comment couldPreventStoreLoadForward. NFC	Adam Nemet	2016-05-16	1	-2/+8
\| \| \| \| \| \| \|	Also s/Cycles/Iters/ in NumCyclesForStoreLoadThroughMemory to make it clear that this is not about clock cycles but loop cycles/iterations. llvm-svn: 269667
*	[LAA] clang-format the function couldPreventStoreLoadForward. NFC	Adam Nemet	2016-05-16	1	-9/+9
\| \| \| \|	llvm-svn: 269666
*	[LAA] Add option to disable conflict detection (NFC)	Matthew Simpson	2016-05-16	1	-2/+9
\| \| \| \|	llvm-svn: 269654
*	[LAA] Include MaxSafeDepDistBytes in the analysis print-out	Adam Nemet	2016-05-13	1	-0/+3
\| \| \| \|	llvm-svn: 269508
*	[LAA] Prepare the code to print more things in the summary. NFC	Adam Nemet	2016-05-13	1	-3/+3
\| \| \| \|	llvm-svn: 269507
*	Revert "Revert "[Unroll] Implement a conservative and monotonically ↵	Michael Zolotukhin	2016-05-13	1	-0/+10
\| \| \| \| \| \| \| \| \| \|	increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."" This reverts commit r269395. Try to reapply with a fix from chapuni. llvm-svn: 269486
*	[scan-build] fix warnings emiited on LLVM Analysis code base	Silviu Baranga	2016-05-13	2	-24/+26
\| \| \| \| \| \| \| \| \| \| \| \|	Fix "Logic error" warnings of the type "Called C++ object pointer is null" reported by Clang Static Analyzer on the following files: lib/Analysis/ScalarEvolution.cpp, lib/Analysis/LoopInfo.cpp. Patch by Apelete Seketeli! llvm-svn: 269424
*	Revert "[Unroll] Implement a conservative and monotonically increasing cost ↵	Michael Zolotukhin	2016-05-13	1	-10/+0
\| \| \| \| \| \| \| \| \| \| \|	tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..." This reverts commit r269388. It caused some bots to fail, I'm reverting it until I investigate the issue. llvm-svn: 269395
*	[Unroll] Implement a conservative and monotonically increasing cost tracking ↵	Michael Zolotukhin	2016-05-13	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... Summary: ...loop after the last iteration. This is really hard to do correctly. The core problem is that we need to model liveness through the induction PHIs from iteration to iteration in order to get the correct results, and we need to correctly de-duplicate the common subgraphs of instructions feeding some subset of the induction PHIs. All of this can be driven either from a side effect at some iteration or from the loop values used after the loop finishes. This patch implements this by storing the forward-propagating analysis of each instruction in a cache to recall whether it was free and whether it has become live and thus counted toward the total unroll cost. Then, at each sink for a value in the loop, we recursively walk back through every value that feeds the sink, including looping back through the iterations as needed, until we have marked the entire input graph as live. Because we cache this, we never visit instructions more than twice -- once when we analyze them and put them into the cache, and once when we count their cost towards the unrolled loop. Also, because the cache is only two bits and because we are dealing with relatively small iteration counts, we can store all of this very densely in memory to avoid this from becoming an excessively slow analysis. The code here is still pretty gross. I would appreciate suggestions about better ways to factor or split this up, I've stared too long at the algorithmic side to really have a good sense of what the design should probably look at. Also, it might seem like we should do all of this bottom-up, but I think that is a red herring. Specifically, the simplification power is much greater working top-down. We can forward propagate very effectively, even across strange and interesting recurrances around the backedge. Because we use data to propagate, this doesn't cause a state space explosion. Doing this level of constant folding, etc, would be very expensive to do bottom-up because it wouldn't be until the last moment that you could collapse everything. The current solution is essentially a top-down simplification with a bottom-up cost accounting which seems to get the best of both worlds. It makes the simplification incremental and powerful while leaving everything dead until we know it is needed. Finally, a core property of this approach is its monotonicity. At all times, the current UnrolledCost is a conservatively low estimate. This ensures that we will never early-exit from the analysis due to exceeding a threshold when if we had continued, the cost would have gone back below the threshold. These kinds of bugs can cause incredibly hard to track down random changes to behavior. We could use a techinque similar (but much simpler) within the inliner as well to avoid considering speculated code in the inline cost. Reviewers: chandlerc Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D11758 llvm-svn: 269388
*	[LoopUnrollAnalyzer] Don't treat gep-instructions with simplified offset as ↵	Michael Zolotukhin	2016-05-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	simplified. Summary: Currently we consider such instructions as simplified, which is incorrect, because if their user isn't simplified, we can't actually simplify them too. This biases our estimates of profitability: for instance the analyzer expects much more gains from unrolling memcpy loops than there actually are. Reviewers: hfinkel, chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17365 llvm-svn: 269387
*	[PM] Port of the DepndenceAnalysis to the new PM.	Chandler Carruth	2016-05-12	2	-228/+173
\| \| \| \| \| \| \| \| \| \| \| \| \|	Ported DA to the new PM by splitting the former DependenceAnalysis Pass into a DependenceInfo result type and DependenceAnalysisWrapperPass type and adding a new PM-style DependenceAnalysis analysis pass returning the DependenceInfo. Patch by Philip Pfaffe, most of the review by Justin. Differential Revision: http://reviews.llvm.org/D18834 llvm-svn: 269370
*	[LAA] Use std::min. NFC	Adam Nemet	2016-05-12	1	-4/+2
\| \| \| \|	llvm-svn: 269356
*	[SCEVExpander] Fix a failed cast<> assertion	Sanjoy Das	2016-05-11	1	-43/+47
\| \| \| \| \| \| \| \| \|	SCEVExpander::replaceCongruentIVs assumes the backedge value of an SCEV-analysable PHI to always be an instruction, when this is not necessarily true. For now address this by bailing out of the optimization if the backedge value of the PHI is a non-Instruction. llvm-svn: 269213
*	[SCEVExpander] Don't break SSA in replaceCongruentIVs	Sanjoy Das	2016-05-11	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \|	`SCEVExpander::replaceCongruentIVs` bypasses `hoistIVInc` if both the original and the isomorphic increments are PHI nodes. Doing this can break SSA if the isomorphic increment is not dominated by the original increment. Get rid of the bypass, and let `hoistIVInc` do the right thing. Fixes PR27232 (compile time crash/hang). llvm-svn: 269212
*	[SCEV] Be more aggressive around proving no-wrap	Sanjoy Das	2016-05-11	1	-4/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	... for AddRec's in loops for which SCEV is unable to compute a max tripcount. This is not a problem for "normal" loops[0] that don't have guards or assumes, but helps in cases where we have guards or assumes in the loop that can be used to constrain incoming values over the backedge. This partially fixes PR27691 (we still don't handle the NUW case). [0]: for "normal" loops, in the cases where we'd be able to prove no-wrap via isKnownPredicate, we'd also be able to compute a max tripcount. llvm-svn: 269211
*	[BasicAA] Compare GEP indices based on value (Fix PR27418)	Vedant Kumar	2016-05-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Equivalent GEP indices with different types are treated as different indices altogether, leading to an incorrect AA result. Fix the issue by comparing indices based on their values. Thanks to Mikael Holmén for reporting the issue! Differential Revision: http://reviews.llvm.org/D19935 llvm-svn: 269197
*	NFC. Introduce Value::isPointerDereferenceable	Artur Pilipenko	2016-05-11	1	-12/+5
\| \| \| \| \| \| \| \| \| \|	Extract a part of isDereferenceableAndAlignedPointer functionality to Value: Reviewed By: hfinkel, sanjoy Differential Revision: http://reviews.llvm.org/D17611 llvm-svn: 269190
*	Revert r269131	Easwaran Raman	2016-05-10	2	-5/+3
\| \| \| \|	llvm-svn: 269138
*	Reapply r266477 and r266488	Easwaran Raman	2016-05-10	2	-3/+5
\| \| \| \|	llvm-svn: 269131
*	[InstSimplify] use computeKnownBits on shift amount operands	Sanjay Patel	2016-05-10	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do simplifications common to all shift instructions based on the amount shifted: 1. If the shift amount is known larger than the bitwidth, the result is undefined. 2. If the valid bits of the shift amount are all known to be 0, it's a shift by zero, so the shift operand is the result. Note that we could generalize the shift-by-zero transform into a shift-by-constant if all of the valid bits in the shift amount are known, but that would have to be done in InstCombine rather than here because it would mean we need to create a new shift instruction. Differential Revision: http://reviews.llvm.org/D19874 llvm-svn: 269114
*	Re-apply r269081 and r269082 with a fix for MSVC.	Peter Collingbourne	2016-05-10	2	-0/+83
\| \| \| \|	llvm-svn: 269094
*	Revert r269081 and r269082 while I try to find the right incantation to fix ↵	Peter Collingbourne	2016-05-10	2	-83/+0
\| \| \| \| \| \|	MSVC build. llvm-svn: 269091
*	WholeProgramDevirt: Move logic for finding devirtualizable call sites to ↵	Peter Collingbourne	2016-05-10	2	-0/+83
\| \| \| \| \| \| \| \| \| \| \| \| \|	Analysis. The plan is to eventually make this logic simpler, however I expect it to be a little tricky for the foreseeable future (at least until we're rid of pointee types), so move it here so that it can be reused to build a summary index for devirtualization. Differential Revision: http://reviews.llvm.org/D20005 llvm-svn: 269081
*	[LAA] Use re-written SCEV expressions when computing distances	Silviu Baranga	2016-05-10	1	-7/+2
\| \| \| \| \| \| \| \| \| \| \| \|	This removes a redundant stride versioning step (we already do it in getPtrStride, so it has no effect) and uses PSE to get the SCEV expressions for the source and destination (this might have changed when getPtrStride was called). I discovered this through code inspection, and couldn't produce a regression test for it. llvm-svn: 269052
*	Revert "[VectorUtils] Query number of sign bits to allow more truncations"	James Molloy	2016-05-10	1	-14/+4
\| \| \| \| \| \| \| \|	This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690. This reverts commit r268921. llvm-svn: 269051
*	[LAA] Rename "isStridedPtr" with "getPtrStride". NFC.	Denis Zobnin	2016-05-10	1	-5/+5
\| \| \| \| \| \| \|	Changing misleading function name was approved in http://reviews.llvm.org/D17268. Patch by Roman Shirokiy. llvm-svn: 269021
*	[ValueTracking] Use guards to prove non-nullness of a value	Sanjoy Das	2016-05-10	1	-9/+11
\| \| \| \| \| \| \| \| \| \|	Reviewers: apilipenko, majnemer, reames Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20044 llvm-svn: 269008
*	[BasicAA] Guard intrinsics don't write to memory	Sanjoy Das	2016-05-10	1	-4/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The idea is very close to what we do for assume intrinsics: we mark the guard intrinsics as writing to arbitrary memory to maintain control dependence, but under the covers we teach AA that they do not mod any particular memory location. Reviewers: chandlerc, hfinkel, gbiv, reames Subscribers: george.burgess.iv, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19575 llvm-svn: 269007
*	[SCEVExpander] Clang format expressions; NFC	Sanjoy Das	2016-05-10	1	-17/+16
\| \| \| \| \| \|	The boolean expressions are somewhat hard to read otherwise. llvm-svn: 268998