bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Don't remove side effecting instructions due to ConstantFoldInstruction	David Majnemer	2016-07-22	3	-41/+53
\| \| \| \| \| \| \| \| \|	Just because we can constant fold the result of an instruction does not imply that we can delete the instruction. It may have side effects. This fixes PR28655. llvm-svn: 276389
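The distinction this commit draws can be sketched with a toy model. This is a minimal illustration, not LLVM's API: `ToyInst` and `foldAndMaybeErase` are hypothetical names, and the `HasSideEffects` flag only stands in for whatever side-effect query the real pass uses.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical toy model: an "instruction" whose result may be constant
// foldable and which may also have an observable side effect.
struct ToyInst {
  std::optional<int> FoldedValue; // set if the result folds to a constant
  bool HasSideEffects;            // true if executing it is observable
  bool Erased = false;
};

// Returns true if the instruction was deleted. Uses may always be rewritten
// to the constant, but deletion needs the separate side-effect check.
bool foldAndMaybeErase(ToyInst &I, std::vector<int> &ReplacedUses) {
  assert(!I.Erased && "instruction was already erased");
  if (!I.FoldedValue)
    return false;
  ReplacedUses.push_back(*I.FoldedValue); // stand-in for rewriting the uses
  if (I.HasSideEffects)
    return false; // keep it: a foldable result does not make it dead
  I.Erased = true;
  return true;
}
```

In this sketch, a foldable instruction with side effects still has its uses rewritten but stays in place.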
*	[IRCE] Don't misuse CHECK-LABEL; NFC	Sanjoy Das	2016-07-22	5	-30/+31
\| \| \| \|	llvm-svn: 276373
*	[IRCE] Add an option to skip profitability checks	Sanjoy Das	2016-07-22	1	-0/+31
\| \| \| \| \| \| \| \|	If `-irce-skip-profitability-checks` is passed in, IRCE will kick in in all cases where it is legal for it to kick in. This flag is intended to help diagnose and analyse performance issues. llvm-svn: 276372
*	GVH-hoist: only clone GEPs (PR28606)	Sebastian Pop	2016-07-21	2	-2/+52
\| \| \| \| \| \| \| \| \|	Do not clone stored values unless they are GEPs that are special cased to avoid hoisting them without hoisting their associated ld/st. Differential revision: https://reviews.llvm.org/D22652 llvm-svn: 276358
*	[PM] Port NaryReassociate to the new PM	Wei Mi	2016-07-21	5	-0/+5
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D22648 llvm-svn: 276349
*	[InstSimplify] don't crash handling a pointer or aggregate type	Sanjay Patel	2016-07-21	1	-0/+13
\| \| \| \|	llvm-svn: 276345
*	[InstSimplify] recognize trunc + icmp sgt/slt variants of select simplifications (PR28466)	Sanjay Patel	2016-07-21	1	-41/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rL245171 exposed a hole in InstSimplify that manifested in a strange way in PR28466: https://llvm.org/bugs/show_bug.cgi?id=28466 It's possible to use trunc + icmp sgt/slt in place of an and + icmp eq/ne, so we need to recognize that pattern to eliminate selects that are choosing between some value and some bitmasked version of that value. Note that there is significant room for improvement (refactoring) and enhancement (more patterns, possibly in InstCombine rather than here). Differential Revision: https://reviews.llvm.org/D22537 llvm-svn: 276341
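The equivalence between the two forms can be checked on a small case. The sketch below assumes a 32-bit value and bit 7 as the tested bit. The function names are hypothetical, and the cast to `std::int8_t` relies on two's-complement wrapping (guaranteed from C++20, the usual behaviour before that).

```cpp
#include <cstdint>

// and + icmp ne form: test bit 7 with a mask.
constexpr bool bitSetViaMask(std::uint32_t X) { return (X & 0x80u) != 0; }

// trunc + icmp slt form: truncate to 8 bits and test the sign bit.
constexpr bool bitSetViaTrunc(std::uint32_t X) {
  return static_cast<std::int8_t>(static_cast<std::uint8_t>(X)) < 0;
}

// Compares both forms over every 16-bit input; bits above bit 7 must not
// influence either result.
bool formsAgree() {
  for (std::uint32_t X = 0; X < 65536; ++X)
    if (bitSetViaMask(X) != bitSetViaTrunc(X))
      return false;
  return true;
}
```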
*	[OptDiag,LDist] Convert remaining opt remarks to use the new API	Adam Nemet	2016-07-21	1	-0/+6
\| \| \| \|	llvm-svn: 276340
*	[LV] Move vector int induction update to end of latch	Matthew Simpson	2016-07-21	3	-14/+15
\| \| \| \| \| \| \| \| \| \| \|	This patch moves the update instruction for vectorized integer induction phi nodes to the end of the latch block. This ensures consistent placement of all induction updates across all the kinds of int inductions we create (scalar, splat vector, or vector phi). Differential Revision: https://reviews.llvm.org/D22416 llvm-svn: 276339
*	add vector tests and a simpler version of the negative tests	Sanjay Patel	2016-07-21	1	-3/+48
\| \| \| \|	llvm-svn: 276328
*	Revert "Invariant start/end intrinsics overloaded for address space"	Anna Thomas	2016-07-21	3	-22/+10
\| \| \| \| \| \|	This reverts commit r276316. llvm-svn: 276320
*	Invariant start/end intrinsics overloaded for address space	Anna Thomas	2016-07-21	3	-10/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The llvm.invariant.start and llvm.invariant.end intrinsics currently support specifying invariant memory objects only in the default address space. With this change, these intrinsics are overloaded for any address space for memory objects and we can use these llvm invariant intrinsics in non-default address spaces. Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr) This overloaded intrinsic is needed for representing final or invariant memory in managed languages. Reviewers: tstellarAMD, reames, apilipenko Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22519 llvm-svn: 276316
*	[GVNHoist] Preserve optimization hints which agree	David Majnemer	2016-07-21	1	-0/+44
\| \| \| \| \| \| \|	If we have optimization hints which agree with each other along different paths, preserve them. llvm-svn: 276248
*	[GVNHoist] Don't wrongly preserve TBAA	David Majnemer	2016-07-21	1	-0/+29
\| \| \| \| \| \| \|	We hoisted loads/stores without taking TBAA metadata into account, which can cause miscompiles. llvm-svn: 276240
*	[OptDiag,LV] Add hotness attribute to applied-optimization remarks	Adam Nemet	2016-07-21	1	-4/+4
\| \| \| \| \| \| \|	Test coverage is provided by modifying the function in the FP-math testcase that we are allowed to vectorize. llvm-svn: 276223
*	[InstCombine] LogicOpc (zext X), C --> zext (LogicOpc X, C) (PR28476)	Sanjay Patel	2016-07-21	7	-54/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The benefits of this change include: 1. Remove DeMorgan-matching code that was added specifically to work-around the missing transform in http://reviews.llvm.org/rL248634. 2. Makes the DeMorgan transform work for vectors too. 3. Fix PR28476: https://llvm.org/bugs/show_bug.cgi?id=28476 Extending this transform to other casts and other associative operators may be useful too. See https://reviews.llvm.org/D22421 for a prerequisite for doing that though. Differential Revision: https://reviews.llvm.org/D22271 llvm-svn: 276221
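The transform `LogicOpc (zext X), C --> zext (LogicOpc X, C)` can be checked exhaustively on a small case. The sketch below assumes an i8 source type widened to i32 and a constant `C` that fits in the narrow type. That restriction is an assumption made for this illustration, not a statement of the exact condition InstCombine checks. All names are hypothetical.

```cpp
#include <cstdint>
#include <initializer_list>

enum class LogicOp { And, Or, Xor };

// Applies one bitwise logic operation on 32-bit values.
constexpr std::uint32_t applyLogic(LogicOp Op, std::uint32_t A,
                                   std::uint32_t B) {
  switch (Op) {
  case LogicOp::And: return A & B;
  case LogicOp::Or:  return A | B;
  case LogicOp::Xor: return A ^ B;
  }
  return 0; // not reached for valid LogicOp values
}

// Original shape: LogicOpc (zext X), C, computed in the wide type.
constexpr std::uint32_t logicOfZext(LogicOp Op, std::uint8_t X,
                                    std::uint8_t C) {
  return applyLogic(Op, std::uint32_t{X}, std::uint32_t{C});
}

// Transformed shape: zext (LogicOpc X, C), computed in the narrow type.
constexpr std::uint32_t zextOfLogic(LogicOp Op, std::uint8_t X,
                                    std::uint8_t C) {
  return std::uint32_t{static_cast<std::uint8_t>(applyLogic(Op, X, C))};
}

// Compares both shapes for every operation, i8 input, and i8 constant.
bool shapesAgree() {
  for (LogicOp Op : {LogicOp::And, LogicOp::Or, LogicOp::Xor})
    for (unsigned X = 0; X < 256; ++X)
      for (unsigned C = 0; C < 256; ++C) {
        auto NX = static_cast<std::uint8_t>(X);
        auto NC = static_cast<std::uint8_t>(C);
        if (logicOfZext(Op, NX, NC) != zextOfLogic(Op, NX, NC))
          return false;
      }
  return true;
}
```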
*	[OptDiag,LV] Add hotness attribute to the derived analysis remarks	Adam Nemet	2016-07-20	1	-0/+113
\| \| \| \| \| \| \| \|	This includes FPCompute and Aliasing. Testcase is based on no_fpmath.ll. llvm-svn: 276211
*	[InstSimplify][InstCombine] don't crash when folding vector selects of icmp	Sanjay Patel	2016-07-20	2	-0/+34
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D22602 llvm-svn: 276209
*	[NVPTX] Enable the load-store vectorizer on nvptx.	Justin Lebar	2016-07-20	1	-16/+16
\| \| \| \| \| \| \| \| \| \|	Reviewers: tra Subscribers: jholewinski, arsenm, asbirlea Differential Revision: https://reviews.llvm.org/D22592 llvm-svn: 276196
*	[OptDiag,LV] Add hotness attribute to analysis remarks	Adam Nemet	2016-07-20	1	-0/+201
\| \| \| \| \| \| \| \|	The earlier change added hotness attribute to missed-optimization remarks. This follows up with the analysis remarks (the ones explaining the reason for the missed optimization). llvm-svn: 276192
*	[GVNHoist] Don't hoist PHI nodes	David Majnemer	2016-07-20	1	-0/+42
\| \| \| \| \| \| \| \| \|	We hoisted PHIs without respecting their special insertion point in the block, leading to verifier errors. This fixes PR28626. llvm-svn: 276181
*	[SCCP] Zap multiple return values.	Davide Italiano	2016-07-20	2	-2/+26
\| \| \| \| \| \| \| \| \|	We can replace the return values with undef if we replaced all the call uses with a constant/undef. Differential Revision: https://reviews.llvm.org/D22336 llvm-svn: 276174
*	[LSV] Don't move stores across may-load instrs, and loosen restrictions on moving loads.	Justin Lebar	2016-07-20	1	-33/+194
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Previously we wouldn't move loads/stores across instructions that had side-effects, where that was defined as may-write or may-throw. But this is not sufficiently restrictive: Stores can't safely be moved across instructions that may load. This patch also adds a DEBUG check that all instructions in our chain are either loads or stores. Reviewers: asbirlea Subscribers: llvm-commits, jholewinski, arsenm, mzolotukhin Differential Revision: https://reviews.llvm.org/D22547 llvm-svn: 276171
*	[LSV] Vectorize up to side-effecting instructions.	Justin Lebar	2016-07-20	2	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Previously if we had a chain that contained a side-effecting instruction, we wouldn't vectorize it at all. Now we'll vectorize everything that comes before the side-effecting instruction. Reviewers: asbirlea Subscribers: arsenm, jholewinski, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22536 llvm-svn: 276170
*	minimize tests and auto-generate checks	Sanjay Patel	2016-07-20	1	-117/+73
\| \| \| \|	llvm-svn: 276147
*	Revert "[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))"	Benjamin Kramer	2016-07-20	1	-70/+0
\| \| \| \| \| \| \| \|	Makes InstCombine infloop when compiling v8. This reverts commit r275989 and r276105. llvm-svn: 276106
*	[InstCombine] Provide more test cases for cast-folding [NFC]	Tobias Grosser	2016-07-20	1	-3/+35
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In r275989 we enabled the folding of `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))`. Here we add more test cases to assure this folding works for all logical operations `and`/`or`/`xor`. Reviewers: grosser Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22561 Contributed-by: Matthias Reisinger llvm-svn: 276105
*	[X86][SSE] Add cost model values for CTPOP of vectors	Simon Pilgrim	2016-07-20	1	-35/+144
\| \| \| \| \| \| \| \|	This patch adds costs for the vectorized implementations of CTPOP. The default values seriously underestimated the cost of these and encouraged vectorization on targets where serialized use of POPCNT would be much better. Differential Revision: https://reviews.llvm.org/D22456 llvm-svn: 276104
*	Forgot to add a test for r276008.	David Majnemer	2016-07-20	1	-0/+18
\| \| \| \|	llvm-svn: 276082
*	[LV] Add hotness attribute to missed-optimization remarks	Adam Nemet	2016-07-20	1	-0/+213
\| \| \| \| \| \| \|	The new OptimizationRemarkEmitter analysis pass is hooked up to both new and old PM passes. llvm-svn: 276080
*	Revert "Revert r275883 and r275891. They seem to cause PR28608."	Michael Zolotukhin	2016-07-20	3	-0/+198
\| \| \| \| \| \| \|	This reverts commit r276064, and thus reapplies r275891 and r275883 with a fix for PR28608. llvm-svn: 276077
*	[LSV] Don't assume that loads/stores appear in address order in the BB.	Justin Lebar	2016-07-20	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: getVectorizablePrefix previously didn't work properly in the face of aliasing loads/stores. It unwittingly assumed that the loads/stores appeared in the BB in address order. If they didn't, it would do the wrong thing. Reviewers: asbirlea, tstellarAMD Subscribers: arsenm, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22535 llvm-svn: 276072
*	Revert r275883 and r275891. They seem to cause PR28608.	Sean Silva	2016-07-19	2	-163/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Revert "[LoopSimplify] Update LCSSA after separating nested loops." This reverts commit r275891. Revert "[LCSSA] Post-process PHI-nodes created by SSAUpdate when constructing LCSSA form." This reverts commit r275883. llvm-svn: 276064
*	[PM] Port LoopUnroll.	Sean Silva	2016-07-19	1	-0/+1
\| \| \| \| \| \| \| \| \|	We just set PreserveLCSSA to always true since we don't have an analogous method `mustPreserveAnalysisID(LCSSA)`. Also port LoopInfo verifier pass to test LoopUnrollPass. llvm-svn: 276063
*	[LSV] Insert stores at the right point.	Justin Lebar	2016-07-19	1	-2/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Previously, the insertion point for stores was the last instruction in Chain before calling getVectorizablePrefixEndIdx. Thus if getVectorizablePrefixEndIdx didn't return Chain.size(), we still would insert at the last instruction in Chain. This patch changes our internal API a bit in an attempt to make it less prone to this sort of error. As a result, we end up recalculating the Chain's boundary instructions, but I think worrying about the speed hit of this is a premature optimization right now. Reviewers: asbirlea, tstellarAMD Subscribers: mzolotukhin, arsenm, llvm-commits Differential Revision: https://reviews.llvm.org/D22534 llvm-svn: 276056
*	[LSV] Add detail to correct-order.ll test.	Justin Lebar	2016-07-19	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This helps keep us honest -- there were a number of ways we could screw up and still have passed this test. Reviewers: asbirlea Subscribers: llvm-commits, arsenm Differential Revision: https://reviews.llvm.org/D22531 llvm-svn: 276053
*	regenerate checks	Sanjay Patel	2016-07-19	1	-10/+15
\| \| \| \|	llvm-svn: 276042
*	[InstCombine] fold add(zext(xor X, C), C) --> sext X when C is INT_MIN in the source type	Sanjay Patel	2016-07-19	1	-6/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The pattern may look more obviously like a sext if written as:
  define i32 @g(i16 %x) {
    %zext = zext i16 %x to i32
    %xor = xor i32 %zext, 32768
    %add = add i32 %xor, -32768
    ret i32 %add
  }
We already have that fold in visitAdd(). Differential Revision: https://reviews.llvm.org/D22477 llvm-svn: 276035
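Both spellings of the pattern can be compared against a sign extension for i16 to i32. The sketch below models the wrapping i32 arithmetic with unsigned 32-bit values, so `-32768` appears as the bit pattern `0xFFFF8000`. The function names are hypothetical.

```cpp
#include <cstdint>

// Form from the commit title: xor in i16, then zext, then add in i32.
constexpr std::uint32_t viaXorZextAdd(std::uint16_t X) {
  return std::uint32_t{static_cast<std::uint16_t>(X ^ 0x8000u)} + 0xFFFF8000u;
}

// Form from the commit body: zext first, then xor 32768 and add -32768.
constexpr std::uint32_t viaZextXorAdd(std::uint16_t X) {
  return (std::uint32_t{X} ^ 0x8000u) + 0xFFFF8000u;
}

// Reference: sext i16 to i32, written out as an explicit bit pattern.
constexpr std::uint32_t viaSext(std::uint16_t X) {
  return (X & 0x8000u) ? (std::uint32_t{X} | 0xFFFF0000u) : std::uint32_t{X};
}

// Compares all three over every i16 input.
bool allFormsAgree() {
  for (unsigned X = 0; X < 65536; ++X) {
    auto N = static_cast<std::uint16_t>(X);
    if (viaXorZextAdd(N) != viaSext(N) || viaZextXorAdd(N) != viaSext(N))
      return false;
  }
  return true;
}
```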
*	add even more missing tests for simplifySelectBitTest()	Sanjay Patel	2016-07-19	1	-40/+148
\| \| \| \|	llvm-svn: 276024
*	[FunctionAttrs] Correct the safety analysis for inference of 'returned'	David Majnemer	2016-07-19	3	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \|	We skipped over ReturnInsts which didn't return an argument which would lead us to incorrectly conclude that an argument returned by another ReturnInst was 'returned'. This reverts commit r275756. This fixes PR28610. llvm-svn: 276008
*	Add a testcase for r275581	David Majnemer	2016-07-19	1	-0/+28
\| \| \| \|	llvm-svn: 276002
*	add tests related to PR28466	Sanjay Patel	2016-07-19	1	-1/+60
\| \| \| \|	llvm-svn: 275995
*	add missing test for simplifySelectBitTest()	Sanjay Patel	2016-07-19	1	-0/+14
\| \| \| \|	llvm-svn: 275990
*	[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))	Tobias Grosser	2016-07-19	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one:

> Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp.

Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions, which is mirrored in the responsible check:

`if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && ...`

This check seems to sort out more cases than necessary because:
- the reverse transformation is obviously done for `or` instructions only
- and also not every `zext icmp` pair is necessarily the result of this reverse transformation

Therefore we now remove this check and replace it by a more fine-grained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop.

That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`). As an example, consider the following IR snippet

```
%1 = icmp sgt i64 %a, %b
%2 = zext i1 %1 to i8
%3 = icmp slt i64 %a, %c
%4 = zext i1 %3 to i8
%5 = and i8 %2, %4
```

which would now be transformed to

```
%1 = icmp sgt i64 %a, %b
%2 = icmp slt i64 %a, %c
%3 = and i1 %1, %2
%4 = zext i1 %3 to i8
```

This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. As shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code.

Reviewers: grosser, vtjnash, majnemer Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22511 Contributed-by: Matthias Reisinger llvm-svn: 275989
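The two IR shapes in this commit message compute the same value, which can be mirrored in ordinary C++. This is only a sketch of the arithmetic, with `bool` standing in for `i1` and `std::uint8_t` for `i8`. The function names are hypothetical and nothing here models InstCombine itself.

```cpp
#include <cstdint>

// Before the fold: each i1 compare is widened to i8, then combined with an
// i8 `and`.
constexpr std::uint8_t beforeFold(std::int64_t A, std::int64_t B,
                                  std::int64_t C) {
  std::uint8_t Gt = static_cast<std::uint8_t>(A > B); // zext i1 to i8
  std::uint8_t Lt = static_cast<std::uint8_t>(A < C); // zext i1 to i8
  return static_cast<std::uint8_t>(Gt & Lt);          // and i8
}

// After the fold: the compares are combined as i1 first and widened once.
constexpr std::uint8_t afterFold(std::int64_t A, std::int64_t B,
                                 std::int64_t C) {
  bool Both = (A > B) & (A < C);          // and i1
  return static_cast<std::uint8_t>(Both); // single zext i1 to i8
}

// Compares both shapes on a small grid of inputs.
bool foldPreservesValue() {
  for (std::int64_t A = -3; A <= 3; ++A)
    for (std::int64_t B = -3; B <= 3; ++B)
      for (std::int64_t C = -3; C <= 3; ++C)
        if (beforeFold(A, B, C) != afterFold(A, B, C))
          return false;
  return true;
}
```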
*	[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR	Simon Pilgrim	2016-07-19	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems: INF/NAN/out-of-range values are guaranteed to result in a 0x80000000 value, which plays havoc with constant folding, which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. A companion clang patch is at D22105. Differential Revision: https://reviews.llvm.org/D22106 llvm-svn: 275981
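The 0x80000000 behaviour described here can be modelled in software for a single float lane. This is a hedged sketch of the semantics only: `cvttModel` is a hypothetical name, it does not call any intrinsic, and it covers just the float to 32-bit signed case.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Software model of one lane of a truncating float to i32 conversion with
// the x86 convention that unrepresentable inputs yield 0x80000000.
std::int32_t cvttModel(float V) {
  constexpr float TwoPow31 = 2147483648.0f; // exactly representable
  // NaN, +/-Inf, and values outside [-2^31, 2^31) cannot be represented.
  if (std::isnan(V) || V >= TwoPow31 || V < -TwoPow31)
    return std::numeric_limits<std::int32_t>::min(); // bit pattern 0x80000000
  return static_cast<std::int32_t>(V); // in range: truncates toward zero
}
```

A plain C++ cast of an out-of-range float to an integer is undefined behaviour, which is why the model checks the range before casting.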
*	[MemorySSA] Update to the new shiny walker.	George Burgess IV	2016-07-19	2	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch updates MemorySSA's use-optimizing walker to be more accurate and, in some cases, faster. Essentially, this changed our core walking algorithm from a cache-as-you-go DFS to an iteratively expanded DFS, with all of the caching happening at the end. Said expansion happens when we hit a Phi, P; we'll try to do the smallest amount of work possible to see if optimizing above that Phi is legal in the first place. If so, we'll expand the search to see if we can optimize to the next phi, etc. An iteratively expanded DFS lets us potentially quit earlier (because we don't assume that we can optimize above all phis) than our old walker. Additionally, because we don't cache as we go, we can now optimize above loops. As an added bonus, this patch adds a ton of verification (if EXPENSIVE_CHECKS are enabled), so finding bugs is easier. Differential Revision: https://reviews.llvm.org/D21777 llvm-svn: 275940
*	Recommit the patch "Use uniforms set to populate VecValuesToIgnore".	Wei Mi	2016-07-19	5	-17/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instructions in the uniform set will not have vector versions, so add them to VecValuesToIgnore. For induction vars, those only used in uniform instructions or consecutive ptrs instructions have already been added to VecValuesToIgnore above. For those induction vars which are only used in uniform instructions or non-consecutive/non-gather scatter ptr instructions, the related phi and update will also be added into VecValuesToIgnore set. The change will make the vector RegUsages estimation less conservative. Differential Revision: https://reviews.llvm.org/D20474 The recommit fixed the testcase global_alias.ll. llvm-svn: 275936
*	[LoopReroll] Reroll loops with unordered atomic memory accesses	Sanjoy Das	2016-07-19	1	-0/+131
\| \| \| \| \| \| \| \| \| \|	Reviewers: hfinkel, jfb, reames Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22385 llvm-svn: 275932
*	[PM] Convert Loop Strength Reduce pass to new PM	Dehao Chen	2016-07-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Convert Loop Strength Reduce pass to new PM Reviewers: davidxl, silvas Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22468 llvm-svn: 275919
*	[PM] Port FunctionImport Pass to new PM	Teresa Johnson	2016-07-18	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Port FunctionImport Pass to new PM. Reviewers: mehdi_amini, davide Subscribers: davidxl, llvm-commits Differential Revision: https://reviews.llvm.org/D22475 llvm-svn: 275916