bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[LAA] Formatting fix in previous change	Adam Nemet	2016-03-24	1	-2/+1
	llvm-svn: 264244
*	[LAA] Support memchecks involving loop-invariant addresses	Adam Nemet	2016-03-24	1	-17/+31
	We used to only allow SCEVAddRecExpr for pointer expressions in order to be able to compute the bounds. However, this is also trivially possible for loop-invariant addresses (scUnknown), since then the bounds are the address itself. Interestingly, we used to allow this for the special case when the loop-invariant address happens to also be an SCEVAddRecExpr (in an outer loop).
	There are a couple more loops that are vectorized in SPEC after this. My guess is that the main reason we don't see more is that, for example, a loop-invariant load is vectorized into a splat vector with several vector-inserts. This is likely to make the vectorization unprofitable. I.e. we don't notice that a later LICM will move all of this out of the loop so the cost estimate should really be 0.
	llvm-svn: 264243
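The bounds reasoning in the message above can be illustrated with a minimal Python sketch. This is not LAA's code: it assumes a pointer is modeled either as an affine recurrence (start, stride, backedge-taken count) or as a single loop-invariant address, and `access_bounds` and `may_overlap` are hypothetical helper names introduced only for this example.

```python
from typing import Optional, Tuple


def access_bounds(start: int, stride: Optional[int],
                  backedge_taken_count: int) -> Tuple[int, int]:
    """Return inclusive (low, high) bounds of the addresses a pointer takes.

    stride=None models a loop-invariant address, so both bounds are the
    address itself. Otherwise the pointer is modeled as start + stride * i
    for i in 0..backedge_taken_count (an assumption of this sketch).
    """
    if stride is None:
        return (start, start)
    end = start + stride * backedge_taken_count
    # A negative stride walks downwards, so order the two endpoints.
    return (min(start, end), max(start, end))


def may_overlap(a: Tuple[int, int], b: Tuple[int, int],
                access_size: int = 1) -> bool:
    """Conservative check: do the two ranges, each widened by the access
    size, intersect?"""
    a_low, a_high = a
    b_low, b_high = b
    return a_low < b_high + access_size and b_low < a_high + access_size


strided_store = access_bounds(1000, 4, 99)      # addresses 1000..1396
invariant_load = access_bounds(2000, None, 99)  # a single address
print(may_overlap(strided_store, invariant_load, access_size=4))
```

Treating the invariant address as a degenerate range lets the same overlap test serve both kinds of pointer, which is the point the commit message makes.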
*	Add getBlockProfileCount method to BlockFrequencyInfo	Easwaran Raman	2016-03-23	1	-0/+14
	Differential Revision: http://reviews.llvm.org/D18233
	llvm-svn: 264179
*	[SCEV] Change the SCEV Predicates interfaces for conversion to AddRecExpr to ↵	Silviu Baranga	2016-03-23	2	-5/+19
	return SCEVAddRecExpr* instead of SCEV*
	Summary: This changes the conversion functions from SCEV * to SCEVAddRecExpr from ScalarEvolution and PredicatedScalarEvolution to return a SCEVAddRecExpr* instead of a SCEV* (which removes the need for most clients to do a dyn_cast right after calling these functions). We also don't add new predicates if the transformation was not successful. This is not entirely NFC (as it can theoretically remove some predicates from LAA when we have an unknown dependence), but I couldn't find an obvious regression test for it.
	Reviewers: sanjoy
	Subscribers: sanjoy, mzolotukhin, llvm-commits
	Differential Revision: http://reviews.llvm.org/D18368
	llvm-svn: 264161
*	Rename DenseMap::resize() into DenseMap::reserve() (NFC)	Mehdi Amini	2016-03-22	1	-1/+1
	This is more coherent with usual containers.
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 264026
*	Implement constant folding for bitreverse	Matt Arsenault	2016-03-21	1	-0/+3
	llvm-svn: 263945
*	[IndVars] Fix PR26974: make sure replaceCongruentIVs doesn't break LCSSA	Silviu Baranga	2016-03-21	1	-0/+1
	Summary: replaceCongruentIVs can break LCSSA when trying to replace IV increments since it tries to replace all uses of a phi node with another phi node while both of the phi nodes are not necessarily in the processed loop. This will cause an assert in IndVars. To fix this, we add a check to make sure that the replacement maintains LCSSA.
	Reviewers: sanjoy
	Subscribers: mzolotukhin, llvm-commits
	Differential Revision: http://reviews.llvm.org/D18266
	llvm-svn: 263941
*	[LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead	Adam Nemet	2016-03-18	1	-0/+4
	Summary: It can hurt performance to prefetch ahead too much. Be conservative for now and don't prefetch ahead more than 3 iterations on Cyclone.
	Reviewers: hfinkel
	Subscribers: llvm-commits, mzolotukhin
	Differential Revision: http://reviews.llvm.org/D17949
	llvm-svn: 263772
*	[LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses	Adam Nemet	2016-03-18	1	-0/+4
	Summary: And use this TTI for Cyclone. As it was explained in the original RFC (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW prefetcher works up to 2KB strides. I am also adding tests for this and the previous change (D17943):
	* Cyclone prefetching accesses with a large stride
	* Cyclone not prefetching accesses with a small stride
	* Generic Aarch64 subtarget not prefetching either
	Reviewers: hfinkel
	Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin
	Differential Revision: http://reviews.llvm.org/D17945
	llvm-svn: 263771
*	Add Rust's personality function to the list of known personality functions	Bjorn Steinbrink	2016-03-15	1	-0/+1
	Reviewers: majnemer
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D18192
	llvm-svn: 263581
*	Re-add ConstantFoldInstOperands form taking opcode and return type.	Manuel Jacob	2016-03-14	1	-4/+13
	Summary: This form was replaced by a form taking an instruction instead of opcode and return type in r258391. After committing this change (and some dependent follow-up changes) it turned out in the review thread to be controversial. The discussion hasn't come to a conclusion yet. I'm re-adding the old form to fix the API regression and to provide a better base for discussion, possibly on llvm-dev.
	A difference from the original function is that it can't be called with GEPs (similarly to how it was already the case for compares). In order to support opaque pointers in the future, folding GEPs needs to be passed the source element type, which is not possible with the current API.
	Reviewers: dberlin, reames
	Subscribers: dblaikie, eddyb
	Differential Revision: http://reviews.llvm.org/D17901
	llvm-svn: 263501
*	[AliasSetTracker] Do not strip pointer casts when processing MemSetInst	Michael Kuperstein	2016-03-14	1	-2/+2
	This fixes PR26843.
	llvm-svn: 263462
*	ConstantFoldInstruction: avoid wasted calls to ConstantFoldConstantExpression	Fiona Glaser	2016-03-13	1	-5/+5
	Check to see if all operands are constant before calling simplify on them so that we don't perform wasted simplifications.
	llvm-svn: 263374
*	[AA] Make BasicAA just require domtree.	Chandler Carruth	2016-03-11	1	-4/+5
	This doesn't change how many times we construct domtrees in the normal pipeline, and it removes fragility and instability where basic-aa may not be run in time to see domtrees because they happen to be constructed afterward.
	This isn't quite as clean as the change to memdep because there is a mode where basic-aa specifically runs without domtrees -- in the hacking version used by function-attrs with the legacy pass manager.
	llvm-svn: 263234
*	[memdep] Just require domtree for memdep.	Chandler Carruth	2016-03-11	1	-16/+11
	This doesn't cause us to construct dominator trees any more often in the normal pipeline, and removes an entire mode of memdep that needed to be reasoned about and maintained.
	Perhaps more importantly, it removes the ability for the results of memdep to be different because of accidental pass scheduling goofs or the order of evaluation of 'getResult' calls. Essentially, 'getCachedResult', unless across IR-unit boundaries, is extremely dangerous. We need to work much harder to avoid it (or its analog in the old pass manager).
	llvm-svn: 263232
*	[PM] Make the AnalysisManager parameter to run methods a reference.	Chandler Carruth	2016-03-11	17	-53/+53
	This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, many parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear.
	llvm-svn: 263219
*	[PM] Implement the final conclusion as to how the analysis IDs should	Chandler Carruth	2016-03-11	20	-15/+25
	work in the face of the limitations of DLLs and templated static variables.
	This requires passes that use the AnalysisBase mixin to provide a static variable themselves. So as to keep their APIs clean, I've made these private and befriended the CRTP base class (which is the common practice).
	I've added documentation to AnalysisBase for why this is necessary and at what point we can go back to the much simpler system.
	This is clearly a better pattern than the extern template as it caught numerous places where the template magic hadn't been applied and things were "just working" but would eventually have broken mysteriously.
	llvm-svn: 263216
*	[PM/AA] Teach the AAManager how to handle module analyses in addition to	Chandler Carruth	2016-03-11	1	-0/+2
	function analyses, and use it to wire up globals-aa to the new pass manager.
	llvm-svn: 263211
*	[CG] Back out my pointless move ctor and add the explicit template	Chandler Carruth	2016-03-10	1	-0/+3
	instantiation needed for the mingw dll build bot.
	llvm-svn: 263114
*	[CG] Add a new pass manager printer pass for the old call graph and	Chandler Carruth	2016-03-10	1	-0/+6
	actually finish wiring up the old call graph.
	There were bugs in the old call graph that hadn't been caught because it wasn't being tested. It wasn't being tested because it wasn't in the pipeline system and we didn't have a printing pass to run in tests. This fixes all of that.
	As for why I'm still keeping the old call graph alive, it's so that I can port GlobalsAA to the new pass manager without forking it to work with the lazy call graph. That's clearly the right eventual design, but it seems pragmatic to defer that until it's necessary. The old call graph works just fine for GlobalsAA.
	llvm-svn: 263104
*	[CG] Actually hoist up the generic CallGraphPrinter pass from a weird	Chandler Carruth	2016-03-10	2	-0/+27
	location in the opt tool to live alongside the analysis in LLVM's libraries.
	No functionality changed here, but this will allow me to port the printer to the new pass manager as well.
	llvm-svn: 263101
*	[CG] Rename the DOT printing pass to actually reference "DOT".	Chandler Carruth	2016-03-10	2	-8/+8
	There is another pass by the generic name 'CallGraphPrinter' which is actually just a call graph printer tucked away inside the opt tool. I'd like to bring it out and make it follow the same patterns as the rest of the CallGraph code, but doing so would end up conflicting with the name of the DOT printing pass. So this makes the DOT printing pass name be more precise.
	No functionality changed here.
	llvm-svn: 263100
*	[PM] Port memdep to the new pass manager.	Chandler Carruth	2016-03-10	3	-86/+88
	This is a fairly straightforward port to the new pass manager with one exception. It removes a very questionable use of releaseMemory() in the old pass to invalidate its caches between runs on a function. I don't think this is really guaranteed to be safe. I've just used the more direct port to the new PM to address this by nuking the results object each time the pass runs. While this could cause some minor malloc traffic increase, I don't expect the compile time performance hit to be noticeable, and it makes the correctness and other aspects of the pass much easier to reason about. In some cases, it may make things faster by making the sets and maps smaller with better locality. Indeed, the measurements collected by Bruno (thanks!!!) show mostly compile time improvements.
	There is sadly very limited testing at this point as there are only two tests of memdep, and both rely on GVN. I'll be porting GVN next and that will exercise this heavily though.
	Differential Revision: http://reviews.llvm.org/D17962
	llvm-svn: 263082
*	[BasicAA/MDA] Sink aliasing rules for malloc and calloc into BasicAA	Philip Reames	2016-03-09	2	-16/+17
	MemoryDependenceAnalysis had a hard-coded exception to the general aliasing rules for malloc and calloc. The reasoning that applied there is equally valid in BasicAA and clarifies the remaining logic in MDA.
	In principle, this can expose slightly more optimization opportunities, but since essentially all of our aliasing-aware memory optimization passes go through MDA, this will likely be NFC in practice.
	Differential Revision: http://reviews.llvm.org/D15912
	llvm-svn: 263075
*	[ValueTracking] Extract isKnownPositive [NFCI]	Philip Reames	2016-03-09	1	-0/+12
	Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem.
	llvm-svn: 263062
*	[SCEV] Slightly generalize getRangeViaFactoring	Sanjoy Das	2016-03-09	1	-13/+18
	Building on the previous change, this generalizes ScalarEvolution::getRangeViaFactoring to work with {Ext(C?A:B)+k0,+,Ext(C?A:B)+k1} where Ext can be a zero extend, sign extend or truncate operation, and k0 and k1 are constants.
	llvm-svn: 262979
*	[SCEV] Slightly generalize getRangeViaFactoring	Sanjoy Das	2016-03-09	1	-23/+51
	This change generalizes ScalarEvolution::getRangeViaFactoring to work with {Ext(C?A:B),+,Ext(C?A:B)} where Ext can be a zero extend, sign extend or truncate operation.
	llvm-svn: 262978
*	use range-based for loop; NFCI	Sanjay Patel	2016-03-08	1	-13/+12
	llvm-svn: 262956
*	Revert revisions 262636, 262643, 262679, and 262682.	Easwaran Raman	2016-03-08	1	-86/+16
	llvm-svn: 262883
*	[memdep] Switch to range based for loops.	Chandler Carruth	2016-03-07	1	-73/+46
	llvm-svn: 262831
*	[memdep] Switch a function to return true on success instead of false.	Chandler Carruth	2016-03-07	1	-9/+9
	This is much more clear and less surprising IMO. It also makes things more consistent with the increasingly large chunk of LLVM code that assumes true-on-success.
	llvm-svn: 262826
*	[memdep] Cleanup the implementation doxygen comments and remove	Chandler Carruth	2016-03-07	1	-80/+35
	duplicated comments. In several cases these had diverged, making them especially nice to canonicalize. I checked to make sure we weren't losing important information of course.
	llvm-svn: 262825
*	[memdep] Run clang-format over the header before porting it to	Chandler Carruth	2016-03-07	1	-144/+154
	the new pass manager. The port will involve substantial edits here, and would likely introduce bad formatting if formatted in isolation, so just get all the formatting up to snuff.
	I'll also go through and try to freshen the doxygen here as well as modernizing some of the code.
	llvm-svn: 262821
*	[LVI] Fix a bug which prevented use of !range metadata within a query	Philip Reames	2016-03-04	1	-22/+10
	The diff is relatively large since I took a chance to rearrange the code I had to touch in a more obvious way, but the key bit is merely using the !range metadata when we can't analyze the instruction further. The previous !range metadata code was essentially just dead since no binary operator or cast will have !range metadata (per Verifier) and it was otherwise dropped on the floor.
	llvm-svn: 262751
*	Fix a memory leak.	Easwaran Raman	2016-03-04	1	-1/+4
	llvm-svn: 262682
*	[ValueTracking] "constant fold" an experimental hidden option	Philip Reames	2016-03-03	1	-7/+0
	llvm-svn: 262648
*	[ValueTracking] Remove dead code from an old experiment	Philip Reames	2016-03-03	1	-208/+2
	This experiment was originally about trying to use facts implied by dominating conditions to infer more precise known bits. While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default. Given this, it's time to abandon the experiment.
	Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments. Given how easy the patch is to apply, there's no reason to leave the code in tree.
	For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads. In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach.
	llvm-svn: 262646
*	Fix breakage caused by r262636.	Easwaran Raman	2016-03-03	1	-1/+1
	Use LLVM_ATTRIBUTE_UNUSED instead of __attribute__((unused))
	llvm-svn: 262643
*	[SCEV] Prove no-overflow via constant ranges	Sanjoy Das	2016-03-03	1	-0/+41
	Exploit ScalarEvolution::getRange's newly acquired smartness (since r262438) by using that to infer nsw and nuw when possible.
	llvm-svn: 262639
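One way to picture how a constant range can prove a no-wrap flag is the simplified Python sketch below. The commit message only says that getRange is used to infer nsw and nuw; the exact conditions here are an assumption of this example, and `proves_nuw` and `proves_nsw` are hypothetical names. The idea shown: if every value of a recurrence lies in a known half-open range, and adding the constant step to any value in that range stays representable, the increment cannot wrap.

```python
def proves_nuw(value_range, step, bits):
    """True if adding `step` to any unsigned value in [lo, hi) cannot wrap.

    `value_range` is a half-open (lo, hi) pair of unsigned values and
    `step` is a non-negative constant (assumptions of this sketch).
    """
    lo, hi = value_range
    return step >= 0 and (hi - 1) + step < (1 << bits)


def proves_nsw(value_range, step, bits):
    """True if adding `step` to any signed value in [lo, hi) cannot wrap."""
    lo, hi = value_range
    signed_min = -(1 << (bits - 1))
    signed_max = (1 << (bits - 1)) - 1
    # Check both ends so positive and negative steps are handled.
    return signed_min <= lo + step and (hi - 1) + step <= signed_max


# An 8-bit induction variable known to stay in [0, 100) with step 1.
print(proves_nuw((0, 100), 1, 8), proves_nsw((0, 100), 1, 8))
```

A range that reaches the top of the type, such as `(0, 256)` for 8 bits, gives no proof, so the flag is left unset in that case.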
*	[SCEV] Be less eager about demoting zexts to sexts	Sanjoy Das	2016-03-03	1	-4/+5
	After r262438 we can have provably positive NSW SCEV expressions whose zero extensions cannot be simplified (since r262438 makes SCEV better at computing constant ranges). This means demoting sexts of positive add recurrences eagerly can result in an unsimplified zero extension where we could have had a simplified sign extension. This change fixes the issue by teaching SCEV to demote sext of a positive SCEV expression to a zext only if the sext could not be simplified.
	llvm-svn: 262638
*	Infrastructure for PGO enhancements in inliner	Easwaran Raman	2016-03-03	1	-16/+83
	This patch provides the following infrastructure for PGO enhancements in inliner:
	* Enable the use of block level profile information in inliner
	* Incremental update of block frequency information during inlining
	* Update the function entry counts of callees when they get inlined into callers.
	Differential Revision: http://reviews.llvm.org/D16381
	llvm-svn: 262636
*	[AA] Hoist the logic to reformulate various AA queries in terms of other	Chandler Carruth	2016-03-02	8	-65/+158
	parts of the AA interface out of the base class of every single AA result object.
	Because this logic reformulates the query in terms of some other aspect of the API, it would easily cause O(n^2) query patterns in alias analysis. These could in turn be magnified further based on the number of call arguments, and then further based on the number of AA queries made for a particular call. This ended up causing problems for Rust that were actually noticeable enough to get a bug (PR26564) and probably other places as well.
	When originally re-working the AA infrastructure, the desire was to regularize the pattern of refinement without losing any generality. While I think it was successful, that is clearly proving to be too costly. And the cost is needless: we gain no actual improvement for this generality of making a direct query to tbaa actually be able to re-use some other alias analysis's refinement logic for one of the other APIs, or some such. In short, this is entirely wasted work.
	To the extent possible, delegation to other API surfaces should be done at the aggregation layer so that we can avoid re-walking the aggregation. In fact, this significantly simplifies the logic as we no longer need to smuggle the aggregation layer into each alias analysis (or the TargetLibraryInfo into each alias analysis just so we can form argument memory locations!).
	However, we also have some delegation logic inside of BasicAA and some of it even makes sense. When the delegation logic is baking in specific knowledge of aliasing properties of the LLVM IR, as opposed to simply reformulating the query to utilize a different alias analysis interface entry point, it makes a lot of sense to restrict that logic to a different layer such as BasicAA.
	So one aspect of the delegation that was in every AA base class is that when we don't have operand bundles, we re-use function AA results as a fallback for callsite alias results. This relies on the IR properties of calls and functions w.r.t. aliasing, and so seems a better fit to BasicAA. I've lifted the logic up to that point where it seems to be a natural fit. This still does a bit of redundant work (we query function attributes twice, once via the callsite and once via the function AA query) but it is exactly twice here, no more.
	The end result is that all of the delegation logic is hoisted out of the base class and into either the aggregation layer when it is a pure retargeting to a different API surface, or into BasicAA when it relies on the IR's aliasing properties. This should fix the quadratic query pattern reported in PR26564, although I don't have a stand-alone test case to reproduce it.
	It also seems general goodness. Now the numerous AAs that don't need target library info don't carry it around and depend on it. I think I can even rip out the general access to the aggregation layer and only expose that in BasicAA as it is the only place where we re-query in that manner.
	However, this is a non-trivial change to the AA infrastructure so I want to get some additional eyes on this before it lands. Sadly, it can't wait long because we should really cherry pick this into 3.8 if we're going to go this route.
	Differential Revision: http://reviews.llvm.org/D17329
	llvm-svn: 262490
*	[SCEV] Minor naming, braces cleanup; NFC	Sanjoy Das	2016-03-02	1	-5/+4
	llvm-svn: 262459
*	Add a comment with a rationale for the unusual code structure	Sanjoy Das	2016-03-02	1	-0/+3
	llvm-svn: 262454
*	Qualify getRangeForAffineAR with this-> for MSVC	Sanjoy Das	2016-03-02	1	-2/+2
	llvm-svn: 262453
*	Perturb code in an attempt to appease MSVC	Sanjoy Das	2016-03-02	1	-9/+9
	For some reason MSVC seems to think I'm calling getConstant() from a static context. Try to avoid this issue by explicitly specifying 'this->' (though I'm not confident that this will actually work).
	llvm-svn: 262451
*	More code permutation to appease MSVC	Sanjoy Das	2016-03-02	1	-4/+7
	llvm-svn: 262449
*	Remove "auto" to appease the MSVC bots	Sanjoy Das	2016-03-02	1	-2/+2
	llvm-svn: 262448
*	[SCEV] Make getRange smarter around selects	Sanjoy Das	2016-03-02	1	-0/+83
	Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}" by factoring out "C" and computing RangeOf({A,+,P}) union RangeOf({B,+,Q}) instead. The latter can be easier to compute precisely in cases like "{C?0:N,+,C?1:-1}", where N is the backedge taken count of the loop, since in such cases the latter form simplifies to [0,N+1) union [0,N+1).
	llvm-svn: 262438
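The factoring described in the message above can be worked through with a small Python sketch. It is an illustration only, not ScalarEvolution's implementation: it uses unbounded Python integers (so bit-width wrap-around is ignored), assumes constant operands, and the names `addrec_range`, `union` and `range_via_factoring` are hypothetical.

```python
def addrec_range(start, step, backedge_taken_count):
    """Half-open range [lo, hi) of start + step * i, i in 0..backedge_taken_count."""
    end = start + step * backedge_taken_count
    return (min(start, end), max(start, end) + 1)


def union(r1, r2):
    """Smallest half-open interval that contains both r1 and r2."""
    return (min(r1[0], r2[0]), max(r1[1], r2[1]))


def range_via_factoring(a, b, p, q, backedge_taken_count):
    """Range of {C?A:B,+,C?P:Q} for a loop-invariant condition C of unknown value.

    Factoring out C leaves two plain recurrences, {A,+,P} when C is true and
    {B,+,Q} when C is false; the result is the union of their ranges.
    """
    true_range = addrec_range(a, p, backedge_taken_count)
    false_range = addrec_range(b, q, backedge_taken_count)
    return union(true_range, false_range)


# The case from the message: {C?0:N,+,C?1:-1} with N the backedge taken count.
N = 10
print(range_via_factoring(0, N, 1, -1, N))  # both branches cover [0, N+1)
```

With N = 10, the true branch counts up from 0 to 10 and the false branch counts down from 10 to 0, so both ranges are [0, 11) and so is their union.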
*	[SCEV] Extract out a getRangeForAffineAR; NFC	Sanjoy Das	2016-03-02	1	-57/+71
	Pure code-motion change. Will be used later in making getRange more clever.
	llvm-svn: 262437