bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[PM] Port GVN to the new pass manager, wire it up, and teach a couple of	Chandler Carruth	2016-03-11	3	-350/+208
	tests to run GVN in both modes.
	This is mostly the boring refactoring just like SROA and other complex transformation passes. There is some trickiness in that GVN's ValueNumber class requires hand-holding to get to compile cleanly. I'm open to suggestions about a better pattern there, but I tried several before settling on this. I was trying to balance my desire to sink as much implementation detail into the source file as possible without introducing overly many layers of abstraction.
	Much like with SROA, the design of this system is made somewhat more cumbersome by the need to support both pass managers without duplicating the significant state and logic of the pass. The same compromise is struck here.
	I've also left a FIXME in a doxygen comment as the GVN pass seems to have pretty woeful documentation within it. I'd like to submit this with the FIXME and let those more deeply familiar backfill the information here now that we have a nice place in an interface to put that kind of documentation.
	Differential Revision: http://reviews.llvm.org/D18019 llvm-svn: 263208
*	Remove llvm::getDISubprogram in favor of Function::getSubprogram	Pete Cooper	2016-03-11	4	-11/+6
	llvm::getDISubprogram walks the instructions in a function, looking for one in the scope of the current function, so that it can find the !dbg entry for the subprogram itself. Now that !dbg is attached to functions, this should not be necessary. This patch changes all uses to just query the subprogram directly on the function. Ideally this should be NFC, but in reality it's possible that a function has no !dbg (in which case there's likely a bug somewhere in an opt pass), or that none of the instructions had a scope referencing the function, so we used to not find the !dbg on the function but now we will. Reviewed by Duncan Exon Smith. Differential Revision: http://reviews.llvm.org/D18074 llvm-svn: 263184
*	[LLE] Add missed LoopSimplify dependence	Adam Nemet	2016-03-10	1	-0/+3
	The code assumed that we always had a preheader without making the pass dependent on LoopSimplify. Thanks to Mattias Eriksson V for reporting this. llvm-svn: 263173
*	[SROA] Fix PR25873, which Andrea Di Biagio analyzed the daylights out	Chandler Carruth	2016-03-10	1	-3/+7
	of, and I misdiagnosed for months and months.
	Andrea has had a patch for this forever, but I just couldn't see how it was fixing the root cause of the problem. It didn't make sense to me, even though the patch was perfectly good and the analysis of the actual failure event was fantastic.
	Well, I came back to it today because the patch has sat for far too long and needs attention and decided I wouldn't let it go until I really understood what was going on. After quite some time in the debugger, I finally realized that in fact I had just missed an important case with my previous attempt to fix PR22093 in r225149. Not only do we need to handle loads that won't be split, but stores-of-loads that we won't split. We do actually have enough logic in the presplitting to form new slices for split stores... unless we decided not to split them! I'm so sorry that it took me this long to come to the realization that this is the issue. It seems so obvious in hindsight (of course).
	Anyways, the fix becomes much smaller and more focused. The fact that we're left doing integer smashing is related to the FIXME in my original commit: fundamentally, we're not aggressive about pre-splitting for loads and stores to the same alloca. If we want to get aggressive about this, it'll need not only what Andrea had put into the proposed fix, but also a lot more logic to essentially iteratively pre-split the alloca until we can't do any more. As I said in that commit log, it's really unclear that this is the right call. Instead, the integer blending and letting targets lower this to narrower stores seems slightly better. But we definitely shouldn't really go down that path just to fix this bug.
	Again, tons of thanks are owed to Andrea and others at Sony for working on this bug. I really should have seen what was going on here and re-directed them sooner. =////
	llvm-svn: 263121
*	[SROA] Clean up some really weird code, no functionality changed.	Chandler Carruth	2016-03-10	1	-3/+3
	We already have the instruction extracted into 'I', just cast that to a store the way we do for loads. Also, we don't enter the if unless SI is non-null, so don't test it again for null. I'm pretty sure the entire test there can be nuked, but this is just the trivial cleanup. llvm-svn: 263112
*	[SLP] Add -slp-min-reg-size command line option.	Michael Zolotukhin	2016-03-10	1	-9/+21
	MinVecRegSize is currently hardcoded to 128; this patch adds a cl::opt to allow changing it. I tried not to change any existing behavior for the default case. Differential revision: http://reviews.llvm.org/D13278 llvm-svn: 263089
*	[gvn] Fix more indenting and formatting in regions of code that will	Chandler Carruth	2016-03-10	1	-64/+62
	need to be changed for porting to the new pass manager. Also sink the comment on the ValueTable class back to that class instead of it dangling on an anonymous namespace. No functionality changed. llvm-svn: 263084
*	[gvn] Reformat a chunk of the GVN code that is strangely indented prior	Chandler Carruth	2016-03-10	1	-241/+240
	to restructuring it for porting to the new pass manager. No functionality changed. llvm-svn: 263083
*	[PM] Port memdep to the new pass manager.	Chandler Carruth	2016-03-10	5	-26/+27
	This is a fairly straightforward port to the new pass manager with one exception. It removes a very questionable use of releaseMemory() in the old pass to invalidate its caches between runs on a function. I don't think this is really guaranteed to be safe. I've just used the more direct port to the new PM to address this by nuking the results object each time the pass runs. While this could cause some minor malloc traffic increase, I don't expect the compile time performance hit to be noticeable, and it makes the correctness and other aspects of the pass much easier to reason about. In some cases, it may make things faster by making the sets and maps smaller with better locality. Indeed, the measurements collected by Bruno (thanks!!!) show mostly compile time improvements.
	There is sadly very limited testing at this point as there are only two tests of memdep, and both rely on GVN. I'll be porting GVN next and that will exercise this heavily though.
	Differential Revision: http://reviews.llvm.org/D17962 llvm-svn: 263082
*	Fix the build	Philip Reames	2016-03-09	1	-0/+1
	I screwed up rebasing 263072. This change fixes the build and passes all make check. llvm-svn: 263073
*	[LICM] Store promotion when memory is thread local	Philip Reames	2016-03-09	1	-11/+56
	This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provably thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us to perform store promotion even in cases where the store is not guaranteed to execute in the loop.
	Two key assumptions worth drawing out are that this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped.
	In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem.
	Differential Revision: http://reviews.llvm.org/D16783 llvm-svn: 263072
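The store promotion described above can be pictured at source level. The following is only a rough C++ analogy of what happens on IR, not code from the patch; the function names `accumulateInLoop` and `accumulatePromoted` are hypothetical. The promoted form writes to `*sum` even when no iteration would have stored, which is why the transformation needs to know that no other thread can observe the location.

```cpp
#include <cassert>
#include <vector>

// Before promotion: the store to *sum is conditional, so it is not
// guaranteed to execute on any given trip through the loop.
void accumulateInLoop(const std::vector<int> &values, int *sum) {
  for (int v : values)
    if (v > 0)
      *sum += v; // load and store inside the loop
}

// After promotion: one load before the loop, the running value kept in a
// local, and one unconditional store after the loop. The final store runs
// even if the loop body never stored, which is only safe when no other
// thread can observe *sum (the thread-local case the commit describes).
void accumulatePromoted(const std::vector<int> &values, int *sum) {
  int tmp = *sum;
  for (int v : values)
    if (v > 0)
      tmp += v;
  *sum = tmp;
}
```

Both functions leave the same value in `*sum` when run single-threaded; the difference is only in which stores another thread could see.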
*	[ValueTracking] Extract isKnownPositive [NFCI]	Philip Reames	2016-03-09	1	-2/+2
	Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem. llvm-svn: 263062
*	[InstCombine] (icmp sgt smin(PosA, B) 0) -> (icmp sgt B 0)	Philip Reames	2016-03-09	1	-0/+13
	When checking whether an smin is positive, we can move the comparison to one of the inputs if the other is known positive. If the known positive one is the min, then the other can't be negative. If the other is the min, then we compute the min. Differential Revision: http://reviews.llvm.org/D17873 llvm-svn: 263059
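The fold in the subject line can be checked exhaustively on a narrow integer type. This is a minimal C++ sketch rather than code from the patch: it models `smin` with `std::min` on 8-bit signed values, and the helper names `foldHolds` and `foldHoldsForAllI8` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Returns true if (smin(posA, b) > 0) agrees with (b > 0).
// The fold assumes posA is known positive; callers may violate that
// to see why the precondition matters.
bool foldHolds(int8_t posA, int8_t b) {
  bool original = std::min(posA, b) > 0;
  bool folded = b > 0;
  return original == folded;
}

// Checks every i8 pair whose first operand is strictly positive.
bool foldHoldsForAllI8() {
  for (int a = 1; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b)
      if (!foldHolds(static_cast<int8_t>(a), static_cast<int8_t>(b)))
        return false;
  return true;
}
```

Dropping the precondition breaks the equivalence: with a negative first operand and a positive `b`, the min is negative while `b > 0` is true.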
*	[LLE] Add missing check for unit stride	Adam Nemet	2016-03-09	1	-5/+13
	I somehow missed this. The case in GCC (global_alloc) was similar to the new testcase except it had an array of structs rather than a two dimensional array. Fixes PR26885. llvm-svn: 263058
*	InstCombine: Restrict computeKnownBits() on all Values to OptLevel > 2	Matthias Braun	2016-03-09	3	-29/+48
	As part of r251146 InstCombine was extended to call computeKnownBits on every value in the function to determine whether it happens to be constant. This increases typical compile time by 1-3% (5% in irgen+opt time) in my measurements. On the other hand this case did not trigger once in the whole llvm-testsuite. This patch introduces the notion of ExpensiveCombines which are only enabled for OptLevel > 2. I removed the check in InstructionSimplify as that is called from various places where the OptLevel is not known but given the rarity of the situation I think a check in InstCombine is enough. Differential Revision: http://reviews.llvm.org/D16835 llvm-svn: 263047
*	Reland r262337 "calculate builtin_object_size if arg is a removable pointer"	Petar Jovanovic	2016-03-09	1	-8/+25
	Original commit message: calculate builtin_object_size if argument is a removable pointer. This patch fixes calculating the correct value for the builtin_object_size function when the pointer is used only in a builtin_object_size function call and never after that. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D17337 Reland the original change with a small modification (first do a null check and then do the cast) to satisfy ubsan. llvm-svn: 263011
*	[LoopDataPrefetch] Add stats and debug output	Adam Nemet	2016-03-09	1	-0/+9
	llvm-svn: 262998
*	Return StringRef instead of a naked char*; NFC	Sanjoy Das	2016-03-09	1	-2/+2
	llvm-svn: 262989
*	[IRCE] Reflow comments; NFC	Sanjoy Das	2016-03-09	1	-4/+2
	llvm-svn: 262988
*	FunctionIndex is not optional for renameModuleForThinLTO(), make it a reference (NFC)	Mehdi Amini	2016-03-09	2	-3/+3
	From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 262976
*	fix variable name; NFC	Sanjay Patel	2016-03-08	1	-3/+3
	llvm-svn: 262953
*	use range-based loop; NFCI	Sanjay Patel	2016-03-08	1	-3/+2
	llvm-svn: 262952
*	rangify, fix function names; NFCI	Sanjay Patel	2016-03-08	1	-27/+22
	llvm-svn: 262940
*	don't repeat function names in documentation comments; NFC	Sanjay Patel	2016-03-08	1	-4/+4
	llvm-svn: 262937
*	Revert "[InstCombine] Combine A->B->A BitCast"	Junmo Park	2016-03-08	2	-104/+0
	This reverts commit r262670 due to compile failure. llvm-svn: 262916
*	Fix evaluation order. Spotted by Alexander Riccio!	Peter Collingbourne	2016-03-08	1	-1/+1
	llvm-svn: 262907
*	Revert revisions 262636, 262643, 262679, and 262682.	Easwaran Raman	2016-03-08	4	-144/+30
	llvm-svn: 262883
*	[tsan] Add support for pointer typed atomic stores, loads, and cmpxchg	Anna Zaks	2016-03-07	1	-8/+31
	TSan instrumentation functions for atomic stores, loads, and cmpxchg work on integer value types. This patch adds casts before calling TSan instrumentation functions in cases where the value is a pointer. Differential Revision: http://reviews.llvm.org/D17833 llvm-svn: 262876
*	[LoopDataPrefetch] If prefetch distance is not set, skip pass	Adam Nemet	2016-03-07	1	-2/+5
	This lets select sub-targets enable this pass. The patch implements the idea from the recent llvm-dev thread: http://thread.gmane.org/gmane.comp.compilers.llvm.devel/94925 The goal is to enable the LoopDataPrefetch pass for the Cyclone sub-target only within AArch64. Positive and negative tests will be included in an upcoming patch that enables selective prefetching of large-strided accesses on Cyclone. llvm-svn: 262844
*	Revert "Enable LoopLoadElimination by default"	Adam Nemet	2016-03-07	1	-2/+2
	This reverts commit r262250. It causes SPEC2006/gcc to generate a wrong result (166.s) on AArch64 when running with the ref data set. The error happens with "-Ofast -flto -fuse-ld=gold" or "-O3 -fno-strict-aliasing". llvm-svn: 262839
*	[DFSan] Remove an overly aggressive assert reported in PR26068.	Chandler Carruth	2016-03-07	1	-4/+0
	This code has been successfully used to bootstrap libc++ in a no-asserts mode for a very long time, so the code that follows cannot be completely incorrect. I've added a test that shows the current behavior for this kind of code with DFSan. If it is desirable for DFSan to do something special when processing an invoke of a variadic function, it can be added, but we shouldn't keep an assert that we've been ignoring due to release builds anyways. llvm-svn: 262829
*	[PGO] Add a commandline option to control number of the VP annotation metadata.	Rong Xu	2016-03-04	1	-2/+10
	llvm-svn: 262750
*	Fix a use-after-free bug introduced in r262636	Easwaran Raman	2016-03-04	2	-6/+11
	llvm-svn: 262679
*	[InstCombine] Combine A->B->A BitCast	Guozhi Wei	2016-03-03	2	-0/+104
	This patch enhances InstCombine to handle the following case:
	  A -> B bitcast
	  PHI
	  B -> A bitcast
	llvm-svn: 262670
*	[InstCombine] transform bitcasted bitwise logic ops with constants (PR26702)	Sanjay Patel	2016-03-03	1	-7/+28
	Given that we're not actually reducing the instruction count in the included regression tests, I think we would call this a canonicalization step. The motivation comes from the example in PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702
	If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable example of:
	  define <4 x i32> @is_negative(<4 x i32> %x) {
	    %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
	    %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
	    %bc = bitcast <4 x i32> %not to <2 x i64>
	    %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
	    %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
	    ret <4 x i32> %bc2
	  }
	Simplifies to the expected:
	  define <4 x i32> @is_negative(<4 x i32> %x) {
	    %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
	    ret <4 x i32> %lobit
	  }
	Differential Revision: http://reviews.llvm.org/D17583 llvm-svn: 262645
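Why hoisting the xor ahead of the bitcast is sound in this example can be seen with a small model. The sketch below is an illustration under assumptions, not InstCombine code: it models `bitcast` as a byte-for-byte copy between `std::array` types, and the names `V4i32`, `V2i64`, `notVia64`, and `notVia32` are hypothetical. Because the constant is all-ones, the result does not depend on byte order.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

using V4i32 = std::array<uint32_t, 4>;
using V2i64 = std::array<uint64_t, 2>;
static_assert(sizeof(V4i32) == sizeof(V2i64), "bitcast needs equal sizes");

// Models `bitcast <4 x i32> ... to <2 x i64>` as a raw byte copy.
V2i64 bitcastTo2x64(const V4i32 &v) {
  V2i64 r;
  std::memcpy(r.data(), v.data(), sizeof(r));
  return r;
}

// Models `bitcast <2 x i64> ... to <4 x i32>` as a raw byte copy.
V4i32 bitcastTo4x32(const V2i64 &v) {
  V4i32 r;
  std::memcpy(r.data(), v.data(), sizeof(r));
  return r;
}

// xor with all-ones performed on the <2 x i64> view, then cast back.
V4i32 notVia64(const V4i32 &x) {
  V2i64 wide = bitcastTo2x64(x);
  for (uint64_t &lane : wide)
    lane ^= ~uint64_t{0};
  return bitcastTo4x32(wide);
}

// The same xor hoisted ahead of the bitcast, on the <4 x i32> view.
V4i32 notVia32(V4i32 x) {
  for (uint32_t &lane : x)
    lane ^= ~uint32_t{0};
  return x;
}
```

Once both xors operate on the same type, the pair of nots cancels, which is how the example collapses to the single `ashr`.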
*	Infrastructure for PGO enhancements in inliner	Easwaran Raman	2016-03-03	4	-29/+138
	This patch provides the following infrastructure for PGO enhancements in inliner:
	  Enable the use of block level profile information in inliner
	  Incremental update of block frequency information during inlining
	  Update the function entry counts of callees when they get inlined into callers.
	Differential Revision: http://reviews.llvm.org/D16381 llvm-svn: 262636
*	Use LineLocation instead of CallsiteLocation to index callsite profile.	Dehao Chen	2016-03-03	1	-14/+6
	Summary: With a discriminator, LineLocation can uniquely identify a callsite without the need to specify the callee name. Remove the callee function name from the key, and put it in the value (FunctionSamples). Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17827 llvm-svn: 262634
*	[LoopUtils, LV] Fix PR26734	Matthew Simpson	2016-03-03	1	-1/+1
	The vectorization of first-order recurrences (r261346) caused PR26734. When detecting these recurrences, we need to ensure that the previous value is actually defined inside the loop. This patch includes the fix and test case. llvm-svn: 262624
*	Explode store of arrays in instcombine	Amaury Sechet	2016-03-02	1	-1/+33
	Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a series of stores for each element, allowing them to be optimized. Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17828 llvm-svn: 262530
*	Unpack array of all sizes in InstCombine	Amaury Sechet	2016-03-02	1	-5/+38
	Summary: This is another step toward improving fca support. This unpacks a load of an array into a series of loads of the array's elements. Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15890 llvm-svn: 262521
*	Really fix ASAN leak/etc issues with MemorySSA unittests	Daniel Berlin	2016-03-02	1	-3/+2
	llvm-svn: 262519
*	Revert "Fix ASAN detected errors in code and test" (it was not meant to be committed yet)	Daniel Berlin	2016-03-02	1	-2/+3
	This reverts commit 890bbccd600ba1eb050353d06a29650ad0f2eb95. llvm-svn: 262512
*	Fix ASAN detected errors in code and test	Daniel Berlin	2016-03-02	1	-3/+2
	llvm-svn: 262511
*	[AA] Hoist the logic to reformulate various AA queries in terms of other	Chandler Carruth	2016-03-02	3	-3/+3
	parts of the AA interface out of the base class of every single AA result object.
	Because this logic reformulates the query in terms of some other aspect of the API, it would easily cause O(n^2) query patterns in alias analysis. These could in turn be magnified further based on the number of call arguments, and then further based on the number of AA queries made for a particular call. This ended up causing problems for Rust that were actually noticeable enough to get a bug (PR26564) and probably other places as well.
	When originally re-working the AA infrastructure, the desire was to regularize the pattern of refinement without losing any generality. While I think it was successful, that is clearly proving to be too costly. And the cost is needless: we gain no actual improvement for this generality of making a direct query to tbaa actually be able to re-use some other alias analysis's refinement logic for one of the other APIs, or some such. In short, this is entirely wasted work.
	To the extent possible, delegation to other API surfaces should be done at the aggregation layer so that we can avoid re-walking the aggregation. In fact, this significantly simplifies the logic as we no longer need to smuggle the aggregation layer into each alias analysis (or the TargetLibraryInfo into each alias analysis just so we can form argument memory locations!).
	However, we also have some delegation logic inside of BasicAA and some of it even makes sense. When the delegation logic is baking in specific knowledge of aliasing properties of the LLVM IR, as opposed to simply reformulating the query to utilize a different alias analysis interface entry point, it makes a lot of sense to restrict that logic to a different layer such as BasicAA.
	So one aspect of the delegation that was in every AA base class is that when we don't have operand bundles, we re-use function AA results as a fallback for callsite alias results. This relies on the IR properties of calls and functions w.r.t. aliasing, and so seems a better fit to BasicAA. I've lifted the logic up to that point where it seems to be a natural fit. This still does a bit of redundant work (we query function attributes twice, once via the callsite and once via the function AA query) but it is exactly twice here, no more.
	The end result is that all of the delegation logic is hoisted out of the base class and into either the aggregation layer when it is a pure retargeting to a different API surface, or into BasicAA when it relies on the IR's aliasing properties. This should fix the quadratic query pattern reported in PR26564, although I don't have a stand-alone test case to reproduce it. It also seems general goodness. Now the numerous AAs that don't need target library info don't carry it around and depend on it. I think I can even rip out the general access to the aggregation layer and only expose that in BasicAA as it is the only place where we re-query in that manner.
	However, this is a non-trivial change to the AA infrastructure so I want to get some additional eyes on this before it lands. Sadly, it can't wait long because we should really cherry pick this into 3.8 if we're going to go this route.
	Differential Revision: http://reviews.llvm.org/D17329 llvm-svn: 262490
*	Attempt to fix ASAN failure in a MemorySSA test.	George Burgess IV	2016-03-02	1	-4/+4
	llvm-svn: 262452
*	revert r262424 because there's a clang test for AArch64 that checks -O3 asm output that is broken by this change	Sanjay Patel	2016-03-02	1	-17/+5
	llvm-svn: 262440
*	[InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701)	Sanjay Patel	2016-03-01	1	-5/+17
	As noted in the code comment, I don't think we can do the same transform that we do for scalar integer comparisons to vector integer comparisons because it might pessimize the general case. Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT for integer vectors. But we should now recognize all the variants of this construct and produce the optimal code for the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26701 llvm-svn: 262424
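The comparison-to-shift idea in the subject line rests on a scalar identity that is easy to check. The sketch below is an assumption about the shape of the transform, not the patch itself: it treats "isNegative" as a sign-extended `x < 0` and "isPositive" as a sign-extended `x > -1`, and models `ashr x, 31` by replicating the sign bit. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models `ashr i32 x, 31`: every bit becomes a copy of the sign bit.
// Written with unsigned arithmetic so it does not rely on how the
// compiler implements >> for negative signed values.
int32_t ashr31(int32_t x) {
  uint32_t signBit = static_cast<uint32_t>(x) >> 31; // 0 or 1
  return -static_cast<int32_t>(signBit);             // 0 or -1 (all ones)
}

// Sign-extended comparison results: -1 (all ones) for true, 0 for false.
int32_t sextIsNegative(int32_t x) { return x < 0 ? -1 : 0; }
int32_t sextIsPositive(int32_t x) { return x > -1 ? -1 : 0; }

// True when both comparisons match their shift-based forms for x.
bool shiftFormsMatch(int32_t x) {
  return sextIsNegative(x) == ashr31(x) && sextIsPositive(x) == ~ashr31(x);
}
```

Applied per lane, the same identity lets a vector target without a full set of integer compares use a shift (plus a not for the positive case) instead.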
*	Perform InstructionCombiningPass before SampleProfile pass.	Dehao Chen	2016-03-01	2	-21/+4
	Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17742 llvm-svn: 262419
*	Fix an issue where fast math flags were dropped during scalarization.	Owen Anderson	2016-03-01	1	-2/+4
	Most portions of InstCombine properly propagate fast math flags, but apparently the vector scalarization section was overlooked. llvm-svn: 262376
*	Add the beginnings of an update API for preserving MemorySSA	Daniel Berlin	2016-03-01	1	-0/+107
	Summary: This adds the beginning of an update API to preserve MemorySSA. In particular, this patch adds a way to remove memory SSA accesses when instructions are deleted. It also adds relevant unit testing infrastructure for MemorySSA's API. (There is an actual user of this API, I will make that diff dependent on this one. In practice, a ton of opt passes remove memory instructions, so it's hopefully an obviously useful API :P) Reviewers: hfinkel, reames, george.burgess.iv Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17157 llvm-svn: 262362