path: root/llvm/lib/Transforms
Commit message | Author | Date | Files | Lines (-deleted/+added)
* [ValueTracking] Extract isKnownPositive [NFCI] (Philip Reames, 2016-03-09; 1 file, -2/+2)
  Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem.
  llvm-svn: 263062
* [InstCombine] (icmp sgt smin(PosA, B) 0) -> (icmp sgt B 0) (Philip Reames, 2016-03-09; 1 file, -0/+13)
  When checking whether an smin is positive, we can move the comparison to one of its inputs if the other is known positive: if the known-positive input is the min, the other input can't be negative either, and if the other input is the min, then comparing it is comparing the min itself. Either way, (icmp sgt B 0) gives the same answer. A sketch follows below.
  Differential Revision: http://reviews.llvm.org/D17873
  llvm-svn: 263059
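  A minimal IR sketch of the fold (hypothetical value names; smin written in its usual icmp+select form), assuming %a has been proven positive:
    %c   = icmp slt i32 %a, %b
    %min = select i1 %c, i32 %a, i32 %b      ; smin(%a, %b)
    %r   = icmp sgt i32 %min, 0
  becomes:
    %r = icmp sgt i32 %b, 0                  ; valid only because %a > 0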
* [LLE] Add missing check for unit stride (Adam Nemet, 2016-03-09; 1 file, -5/+13)
  I somehow missed this. The case in GCC (global_alloc) was similar to the new testcase except it had an array of structs rather than a two-dimensional array.
  Fixes PR26885.
  llvm-svn: 263058
* InstCombine: Restrict computeKnownBits() on all Values to OptLevel > 2 (Matthias Braun, 2016-03-09; 3 files, -29/+48)
  As part of r251146, InstCombine was extended to call computeKnownBits on every value in the function to determine whether it happens to be constant. This increases typical compile time by 1-3% (5% in irgen+opt time) in my measurements, yet this case did not trigger once in the whole llvm test-suite.
  This patch introduces the notion of ExpensiveCombines, which are only enabled for OptLevel > 2. I removed the check in InstructionSimplify, as that is called from various places where the OptLevel is not known, but given the rarity of the situation I think a check in InstCombine is enough.
  Differential Revision: http://reviews.llvm.org/D16835
  llvm-svn: 263047
* Reland r262337 "calculate builtin_object_size if arg is a removable pointer" (Petar Jovanovic, 2016-03-09; 1 file, -8/+25)
  Original commit message: calculate builtin_object_size if argument is a removable pointer.
  This patch fixes calculating the correct value for the builtin_object_size function when the pointer is used only in a builtin_object_size function call and never after that. Patch by Strahinja Petrovic.
  Differential Revision: http://reviews.llvm.org/D17337
  This relands the original change with a small modification (first do a null check and then do the cast) to satisfy ubsan.
  llvm-svn: 263011
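  A hedged sketch of the kind of case this enables (hypothetical names; @llvm.objectsize is the IR-level form of builtin_object_size): the pointer %gep has no use other than the objectsize call, yet the call can still fold to a constant rather than the unknown-size sentinel:
    %gep = getelementptr inbounds [10 x i8], [10 x i8]* %buf, i64 0, i64 4
    %sz  = call i64 @llvm.objectsize.i64.p0i8(i8* %gep, i1 false)
    ; with the patch, %sz folds to 6 (bytes remaining in a 10-byte %buf)
    ; even though %gep is otherwise removable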
* [LoopDataPrefetch] Add stats and debug output (Adam Nemet, 2016-03-09; 1 file, -0/+9)
  llvm-svn: 262998
* Return StringRef instead of a naked char*; NFC (Sanjoy Das, 2016-03-09; 1 file, -2/+2)
  llvm-svn: 262989
* [IRCE] Reflow comments; NFC (Sanjoy Das, 2016-03-09; 1 file, -4/+2)
  llvm-svn: 262988
* FunctionIndex is not optional for renameModuleForThinLTO(), make it a reference (NFC) (Mehdi Amini, 2016-03-09; 2 files, -3/+3)
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 262976
* fix variable name; NFC (Sanjay Patel, 2016-03-08; 1 file, -3/+3)
  llvm-svn: 262953
* use range-based loop; NFCI (Sanjay Patel, 2016-03-08; 1 file, -3/+2)
  llvm-svn: 262952
* rangify, fix function names; NFCI (Sanjay Patel, 2016-03-08; 1 file, -27/+22)
  llvm-svn: 262940
* don't repeat function names in documentation comments; NFC (Sanjay Patel, 2016-03-08; 1 file, -4/+4)
  llvm-svn: 262937
* Revert "[InstCombine] Combine A->B->A BitCast"Junmo Park2016-03-082-104/+0
| | | | | | This reverts commit r262670 due to compile failure. llvm-svn: 262916
* Fix evaluation order. Spotted by Alexander Riccio! (Peter Collingbourne, 2016-03-08; 1 file, -1/+1)
  llvm-svn: 262907
* Revert revisions 262636, 262643, 262679, and 262682. (Easwaran Raman, 2016-03-08; 4 files, -144/+30)
  llvm-svn: 262883
* [tsan] Add support for pointer typed atomic stores, loads, and cmpxchg (Anna Zaks, 2016-03-07; 1 file, -8/+31)
  TSan instrumentation functions for atomic stores, loads, and cmpxchg work on integer value types. This patch adds casts before calling the TSan instrumentation functions in cases where the value is a pointer.
  Differential Revision: http://reviews.llvm.org/D17833
  llvm-svn: 262876
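  A conceptual IR sketch of the added casts (hypothetical names, not the literal instrumentation sequence): a pointer-typed atomic store such as
    store atomic i8* %v, i8** %p seq_cst, align 8
  is handled by first casting the value and the address to integer types:
    %vi = ptrtoint i8* %v to i64
    %pi = bitcast i8** %p to i64*
    ; the existing i64 TSan hook is then invoked on %pi/%vi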
* [LoopDataPrefetch] If prefetch distance is not set, skip pass (Adam Nemet, 2016-03-07; 1 file, -2/+5)
  This lets select sub-targets enable this pass. The patch implements the idea from the recent llvm-dev thread: http://thread.gmane.org/gmane.comp.compilers.llvm.devel/94925
  The goal is to enable the LoopDataPrefetch pass for the Cyclone sub-target only within AArch64. Positive and negative tests will be included in an upcoming patch that enables selective prefetching of large-strided accesses on Cyclone.
  llvm-svn: 262844
* Revert "Enable LoopLoadElimination by default"Adam Nemet2016-03-071-2/+2
| | | | | | | | | | This reverts commit r262250. It causes SPEC2006/gcc to generate wrong result (166.s) in AArch64 when running with *ref* data set. The error happens with "-Ofast -flto -fuse-ld=gold" or "-O3 -fno-strict-aliasing". llvm-svn: 262839
* [DFSan] Remove an overly aggressive assert reported in PR26068. (Chandler Carruth, 2016-03-07; 1 file, -4/+0)
  This code has been successfully used to bootstrap libc++ in a no-asserts mode for a very long time, so the code that follows cannot be completely incorrect. I've added a test that shows the current behavior for this kind of code with DFSan. If it is desirable for DFSan to do something special when processing an invoke of a variadic function, it can be added, but we shouldn't keep an assert that we've been ignoring in release builds anyway.
  llvm-svn: 262829
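  For reference, a minimal hypothetical example of the construct in question, an invoke of a variadic function (names are illustrative, not from the DFSan test):
    declare void @vararg_fn(i32, ...)
    declare i32 @__gxx_personality_v0(...)
    define void @caller() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      invoke void (i32, ...) @vararg_fn(i32 1, double 2.0)
              to label %cont unwind label %lpad
    cont:
      ret void
    lpad:
      %lp = landingpad { i8*, i32 } cleanup
      resume { i8*, i32 } %lp
    }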
* [PGO] Add a command-line option to control the number of VP annotation metadata entries. (Rong Xu, 2016-03-04; 1 file, -2/+10)
  llvm-svn: 262750
* Fix a use-after-free bug introduced in r262636 (Easwaran Raman, 2016-03-04; 2 files, -6/+11)
  llvm-svn: 262679
* [InstCombine] Combine A->B->A BitCast (Guozhi Wei, 2016-03-03; 2 files, -0/+104)
  This patch enhances InstCombine to handle the following case:
    A -> B bitcast
    PHI
    B -> A bitcast
  llvm-svn: 262670
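  A hedged sketch of the pattern (hypothetical function, with A = float and B = i32): both PHI inputs arrive through A->B bitcasts and the only use is a B->A bitcast, so the whole chain can become a PHI over float:
    define float @f(i1 %c, float %a, float %b) {
    entry:
      %ab = bitcast float %a to i32              ; A -> B
      br i1 %c, label %then, label %merge
    then:
      %bb = bitcast float %b to i32              ; A -> B
      br label %merge
    merge:
      %p = phi i32 [ %ab, %entry ], [ %bb, %then ]
      %r = bitcast i32 %p to float               ; B -> A
      ret float %r
    }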
* [InstCombine] transform bitcasted bitwise logic ops with constants (PR26702) (Sanjay Patel, 2016-03-03; 1 file, -7/+28)
  Given that we're not actually reducing the instruction count in the included regression tests, I think we would call this a canonicalization step. The motivation comes from the example in PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702
  If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable example of:
    define <4 x i32> @is_negative(<4 x i32> %x) {
      %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
      %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
      %bc = bitcast <4 x i32> %not to <2 x i64>
      %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
      %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
      ret <4 x i32> %bc2
    }
  simplifies to the expected:
    define <4 x i32> @is_negative(<4 x i32> %x) {
      %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
      ret <4 x i32> %lobit
    }
  Differential Revision: http://reviews.llvm.org/D17583
  llvm-svn: 262645
* Infrastructure for PGO enhancements in inliner (Easwaran Raman, 2016-03-03; 4 files, -29/+138)
  This patch provides the following infrastructure for PGO enhancements in the inliner:
  - Enable the use of block-level profile information in the inliner.
  - Incrementally update block frequency information during inlining.
  - Update the function entry counts of callees when they get inlined into callers.
  Differential Revision: http://reviews.llvm.org/D16381
  llvm-svn: 262636
* Use LineLocation instead of CallsiteLocation to index callsite profile. (Dehao Chen, 2016-03-03; 1 file, -14/+6)
  Summary: With discriminators, a LineLocation can uniquely identify a callsite without needing the callee name. Remove the callee function name from the key, and put it in the value (FunctionSamples).
  Reviewers: davidxl, dnovillo
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17827
  llvm-svn: 262634
* [LoopUtils, LV] Fix PR26734 (Matthew Simpson, 2016-03-03; 1 file, -1/+1)
  The vectorization of first-order recurrences (r261346) caused PR26734. When detecting these recurrences, we need to ensure that the previous value is actually defined inside the loop. This patch includes the fix and test case.
  llvm-svn: 262624
* Explode store of arrays in instcombine (Amaury Sechet, 2016-03-02; 1 file, -1/+33)
  Summary: This is the last step toward supporting aggregate memory accesses in instcombine. It explodes stores of arrays into a series of stores to each element, allowing them to be optimized (see the sketch below).
  Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17828
  llvm-svn: 262530
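  A minimal sketch of the rewrite (hypothetical names), for a two-element array:
    store [2 x i32] %agg, [2 x i32]* %p
  becomes per-element stores:
    %e0 = extractvalue [2 x i32] %agg, 0
    %p0 = getelementptr inbounds [2 x i32], [2 x i32]* %p, i64 0, i64 0
    store i32 %e0, i32* %p0
    %e1 = extractvalue [2 x i32] %agg, 1
    %p1 = getelementptr inbounds [2 x i32], [2 x i32]* %p, i64 0, i64 1
    store i32 %e1, i32* %p1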
* Unpack array of all sizes in InstCombine (Amaury Sechet, 2016-03-02; 1 file, -5/+38)
  Summary: This is another step toward improving fca (first-class aggregate) support. It unpacks loads of arrays of any size into a series of loads of the array's elements (see the sketch below).
  Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D15890
  llvm-svn: 262521
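  The load-side counterpart, sketched under the same assumptions:
    %v = load [2 x i32], [2 x i32]* %p
  becomes per-element loads rebuilt with insertvalue:
    %p0 = getelementptr inbounds [2 x i32], [2 x i32]* %p, i64 0, i64 0
    %e0 = load i32, i32* %p0
    %t  = insertvalue [2 x i32] undef, i32 %e0, 0
    %p1 = getelementptr inbounds [2 x i32], [2 x i32]* %p, i64 0, i64 1
    %e1 = load i32, i32* %p1
    %v  = insertvalue [2 x i32] %t, i32 %e1, 1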
* Really fix ASAN leak/etc issues with MemorySSA unittests (Daniel Berlin, 2016-03-02; 1 file, -3/+2)
  llvm-svn: 262519
* Revert "Fix ASAN detected errors in code and test" (it was not meant to be ↵Daniel Berlin2016-03-021-2/+3
| | | | | | | | committed yet) This reverts commit 890bbccd600ba1eb050353d06a29650ad0f2eb95. llvm-svn: 262512
* Fix ASAN detected errors in code and test (Daniel Berlin, 2016-03-02; 1 file, -3/+2)
  llvm-svn: 262511
* [AA] Hoist the logic to reformulate various AA queries in terms of other parts of the AA interface out of the base class of every single AA result object (Chandler Carruth, 2016-03-02; 3 files, -3/+3)
  Because this logic reformulates the query in terms of some other aspect of the API, it would easily cause O(n^2) query patterns in alias analysis. These could in turn be magnified further based on the number of call arguments, and then further based on the number of AA queries made for a particular call. This ended up causing problems for Rust that were actually noticeable enough to get a bug (PR26564) and probably other places as well.
  When originally re-working the AA infrastructure, the desire was to regularize the pattern of refinement without losing any generality. While I think it was successful, that is clearly proving to be too costly. And the cost is needless: we gain no actual improvement for this generality of making a direct query to TBAA actually be able to re-use some other alias analysis's refinement logic for one of the other APIs, or some such. In short, this is entirely wasted work.
  To the extent possible, delegation to other API surfaces should be done at the aggregation layer so that we can avoid re-walking the aggregation. In fact, this significantly simplifies the logic, as we no longer need to smuggle the aggregation layer into each alias analysis (or the TargetLibraryInfo into each alias analysis just so we can form argument memory locations!).
  However, we also have some delegation logic inside of BasicAA, and some of it even makes sense. When the delegation logic is baking in specific knowledge of aliasing properties of the LLVM IR, as opposed to simply reformulating the query to utilize a different alias analysis interface entry point, it makes a lot of sense to restrict that logic to a different layer such as BasicAA. So one aspect of the delegation that was in every AA base class is that when we don't have operand bundles, we re-use function AA results as a fallback for callsite alias results. This relies on the IR properties of calls and functions w.r.t. aliasing, and so seems a better fit to BasicAA. I've lifted the logic up to the point where it seems to be a natural fit. This still does a bit of redundant work (we query function attributes twice, once via the callsite and once via the function AA query), but it is *exactly* twice here, no more.
  The end result is that all of the delegation logic is hoisted out of the base class and into either the aggregation layer, when it is a pure retargeting to a different API surface, or into BasicAA, when it relies on the IR's aliasing properties. This should fix the quadratic query pattern reported in PR26564, although I don't have a stand-alone test case to reproduce it. It also seems like general goodness: now the numerous AAs that don't need target library info don't carry it around and depend on it. I think I can even rip out the general access to the aggregation layer and only expose that in BasicAA, as it is the only place where we re-query in that manner.
  However, this is a non-trivial change to the AA infrastructure, so I want to get some additional eyes on it before it lands. Sadly, it can't wait long because we should really cherry-pick this into 3.8 if we're going to go this route.
  Differential Revision: http://reviews.llvm.org/D17329
  llvm-svn: 262490
* Attempt to fix ASAN failure in a MemorySSA test. (George Burgess IV, 2016-03-02; 1 file, -4/+4)
  llvm-svn: 262452
* revert r262424 because there's a *clang test* for AArch64 that checks -O3 asm output that is broken by this change (Sanjay Patel, 2016-03-02; 1 file, -17/+5)
  llvm-svn: 262440
* [InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701) (Sanjay Patel, 2016-03-01; 1 file, -5/+17)
  As noted in the code comment, I don't think we can do the same transform that we do for *scalar* integer comparisons for *vector* integer comparisons, because it might pessimize the general case. Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT for integer vectors. But we should now recognize all the variants of this construct and produce the optimal code for the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26701
  llvm-svn: 262424
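  One plausible instance of the construct (hypothetical names): the compare+sext idiom for 'isNegative'
    %cmp  = icmp slt <4 x i32> %x, zeroinitializer
    %sext = sext <4 x i1> %cmp to <4 x i32>
  can be produced directly as a sign-bit smear:
    %sext = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>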
* Perform InstructionCombiningPass before SampleProfile pass. (Dehao Chen, 2016-03-01; 2 files, -21/+4)
  Summary: The SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls.
  Reviewers: davidxl, dnovillo
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17742
  llvm-svn: 262419
* Fix an issue where fast math flags were dropped during scalarization. (Owen Anderson, 2016-03-01; 1 file, -2/+4)
  Most portions of InstCombine properly propagate fast math flags, but apparently the vector scalarization section was overlooked.
  llvm-svn: 262376
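  A minimal sketch of what propagation means here (hypothetical names): when only one lane of a fast-math vector op is used,
    %f = fadd fast <2 x float> %a, %b
    %e = extractelement <2 x float> %f, i32 0
  the scalarized replacement must keep the flags:
    %a0 = extractelement <2 x float> %a, i32 0
    %b0 = extractelement <2 x float> %b, i32 0
    %e  = fadd fast float %a0, %b0             ; 'fast' must survive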
* Add the beginnings of an update API for preserving MemorySSA (Daniel Berlin, 2016-03-01; 1 file, -0/+107)
  Summary: This adds the beginning of an update API to preserve MemorySSA. In particular, this patch adds a way to remove memory SSA accesses when instructions are deleted. It also adds relevant unit-testing infrastructure for MemorySSA's API. (There is an actual user of this API; I will make that diff dependent on this one. In practice, a ton of opt passes remove memory instructions, so it's hopefully an obviously useful API :P)
  Reviewers: hfinkel, reames, george.burgess.iv
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17157
  llvm-svn: 262362
* Revert "calculate builtin_object_size if argument is a removable pointer"Petar Jovanovic2016-03-011-19/+6
| | | | | | | Revert r262337 as "check-llvm ubsan" step failed on sanitizer-x86_64-linux-fast buildbot. llvm-svn: 262349
* calculate builtin_object_size if argument is a removable pointer (Petar Jovanovic, 2016-03-01; 1 file, -6/+19)
  This patch fixes calculating the correct value for the builtin_object_size function when the pointer is used only in a builtin_object_size function call and never after that. Patch by Strahinja Petrovic.
  Differential Revision: http://reviews.llvm.org/D17337
  llvm-svn: 262337
* [x86, InstCombine] transform more x86 masked loads to LLVM intrinsics (Sanjay Patel, 2016-02-29; 1 file, -1/+7)
  Continuation of: http://reviews.llvm.org/rL262269
  llvm-svn: 262273
* [LLE] Fix a comment (Adam Nemet, 2016-02-29; 1 file, -3/+3)
  llvm-svn: 262270
* [x86, InstCombine] transform x86 AVX masked loads to LLVM intrinsics (Sanjay Patel, 2016-02-29; 1 file, -1/+39)
  The intended effect of this patch in conjunction with:
    http://reviews.llvm.org/rL259392
    http://reviews.llvm.org/rL260145
  is that customers using the AVX intrinsics in C will benefit from combines when the load mask is constant:
    __m128 mload_zeros(float *f) { return _mm_maskload_ps(f, _mm_set1_epi32(0)); }
    __m128 mload_fakeones(float *f) { return _mm_maskload_ps(f, _mm_set1_epi32(1)); }
    __m128 mload_ones(float *f) { return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000)); }
    __m128 mload_oneset(float *f) { return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0)); }
  ...so none of the above will actually generate a masked load for optimized code. This is the masked load counterpart to: http://reviews.llvm.org/rL262064
  llvm-svn: 262269
* [LLE] Fix SingleSource/Benchmarks/Polybench/stencils/jacobi-2d-imper with Polly (Adam Nemet, 2016-02-29; 1 file, -0/+5)
  We can actually have dependences between accesses with different underlying types. Bail in this case. A test will follow shortly.
  llvm-svn: 262267
* Enable LoopLoadElimination by default (Adam Nemet, 2016-02-29; 1 file, -2/+2)
  Summary: I re-benchmarked this and the results are similar to the original results in D13259:
  On ARM64:
    SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -59.27%
    SingleSource/Benchmarks/Polybench/stencils/adi -19.78%
  On x86:
    SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -27.14%
  And of course the original ~20% gain on SPECint_2006/456.hmmer with Loop Distribution.
  In terms of compile time, there is a ~5% increase on both SingleSource/Benchmarks/Misc/oourafft and SingleSource/Benchmarks/Linkpack/linkpack-pc. These are both very tiny loop-intensive programs where SCEV computation dominates compile time.
  The reason that time spent in SCEV increases has to do with the design of the old pass manager: if a transform pass does not preserve an analysis, we *invalidate* the analysis even if there was *no* modification made by the transform pass. This means that currently we don't take advantage of LLE and LV sharing the same analysis (LAA), and unfortunately we recompute LAA *and* SCEV for LLE.
  (There should be a way to work around this limitation in the case of SCEV and LAA, since both compute things on demand and internally cache their results. Thus we could pretend that transform passes preserve these analyses and manually invalidate them upon actual modification. On the other hand, the new pass manager is supposed to solve this, so I am not sure if that is worthwhile.)
  Reviewers: hfinkel, dberlin
  Subscribers: dberlin, reames, mssimpso, aemerson, joker.eph, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16300
  llvm-svn: 262250
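  For context, a hedged sketch of the loop-carried store-to-load forwarding pattern that LoopLoadElimination targets (hypothetical names): the value stored to A[i+1] in one iteration is reloaded by the A[i] load of the next iteration, so the load can be replaced by a PHI carrying the stored value:
    loop:
      %i      = phi i64 [ 0, %entry ], [ %i.next, %loop ]
      %a.i    = getelementptr inbounds i32, i32* %A, i64 %i
      %x      = load i32, i32* %a.i            ; A[i]
      %b.i    = getelementptr inbounds i32, i32* %B, i64 %i
      %y0     = load i32, i32* %b.i
      %y      = add i32 %x, %y0
      %i.next = add nuw nsw i64 %i, 1
      %a.i1   = getelementptr inbounds i32, i32* %A, i64 %i.next
      store i32 %y, i32* %a.i1                 ; A[i+1], reloaded next iteration
      %done   = icmp eq i64 %i.next, %n
      br i1 %done, label %exit, label %loop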
* Minor code cleanup. NFC (Rong Xu, 2016-02-29; 1 file, -16/+15)
  llvm-svn: 262242
* Move discriminator assignment to the right place. (Dehao Chen, 2016-02-29; 1 file, -4/+7)
  Summary: Now discriminator is assigned per-function instead of per-module.
  Reviewers: davidxl, dnovillo
  Subscribers: dblaikie, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17664
  llvm-svn: 262240
* [PGO] Remove redundant counter copies for avail_extern functions. (Xinliang David Li, 2016-02-27; 1 file, -3/+32)
  Differential Revision: http://reviews.llvm.org/D17654
  llvm-svn: 262157
* Revert "[sancov] do not instrument nodes that are full pre-dominators"Renato Golin2016-02-271-22/+11
| | | | | | This reverts commit r262103, as it broke all ARM and AArch64 bots. llvm-svn: 262139