bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	SROA: Allow eliminating addrspacecasted allocas	Matt Arsenault	2019-06-14	3	-55/+165
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a circular dependency between SROA and InferAddressSpaces today that requires running both multiple times in order to be able to eliminate all simple allocas and addrspacecasts. InferAddressSpaces can't remove addrspacecasts when written to memory, and SROA helps move pointers out of memory. This should avoid inserting new commuting addrspacecasts with GEPs, since there are unresolved questions about pointer wrapping between different address spaces. For now, don't replace volatile operations that don't match the alloca addrspace, as it would change the address space of the access. It may be still OK to insert an addrspacecast from the new alloca, but be more conservative for now. llvm-svn: 363462
*	SROA: Add baseline test for addrspacecast changes	Matt Arsenault	2019-06-14	1	-0/+348
\| \| \| \|	llvm-svn: 363460
*	Revert Fix a bug w/inbounds invalidation in LFTR	Florian Hahn	2019-06-14	4	-13/+11
\| \| \| \| \| \| \| \| \|	Reverting because it breaks a green dragon build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208 This reverts r363289 (git commit eb88badff96dacef8fce3f003dec34c2ef6900bf) llvm-svn: 363427
*	[CodeGenPrepare] propagate debuginfo when copying a shuffle	Sanjay Patel	2019-06-14	1	-14/+18
\| \| \| \|	llvm-svn: 363409
*	AMDGPU: Fold readlane intrinsics of constants	Matt Arsenault	2019-06-14	1	-0/+56
\| \| \| \| \| \| \| \|	I'm not 100% sure about this, since I'm worried about IR transforms that might end up introducing divergence downstream once replaced with a constant, but I haven't come up with an example yet. llvm-svn: 363406
*	[SCEV] Pass NoWrapFlags when expanding an AddExpr	Sam Parker	2019-06-14	14	-24/+24
\| \| \| \| \| \| \| \| \| \| \| \|	InsertBinop now accepts NoWrapFlags, so pass them through when expanding a simple add expression. This is the first re-commit of the functional changes from rL362687, which was previously reverted. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363364
*	[AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32	Stanislav Mekhanoshin	2019-06-13	1	-138/+136
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D63301 llvm-svn: 363339
*	[SimplifyCFG] NFC, update Switch tests as a baseline.	Shawn Landden	2019-06-13	19	-1422/+2511
\| \| \| \| \| \| \| \| \| \| \| \| \|	Also add baseline tests to show effect of later patches. There were a couple of regressions here that were never caught, but my patch set that this is a preparation to will fix them. This is the third attempt to land this patch. Differential Revision: https://reviews.llvm.org/D61150 llvm-svn: 363319
*	[InstCombine] add test for failed libfunction prototype matching; NFC	Sanjay Patel	2019-06-13	1	-7/+25
\| \| \| \|	llvm-svn: 363291
*	Fix a bug w/inbounds invalidation in LFTR	Philip Reames	2019-06-13	4	-11/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV. The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program. As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well. (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.) Differential Revision: https://reviews.llvm.org/D62939 llvm-svn: 363289
*	[InstCombine] auto-generate complete test checks; NFC	Sanjay Patel	2019-06-13	1	-23/+20
\| \| \| \|	llvm-svn: 363286
*	[NFC] Updated testcase for D54411/rL363284	David Bolvansky	2019-06-13	1	-14/+8
\| \| \| \|	llvm-svn: 363285
*	[EarlyCSE] Ensure equal keys have the same hash value	Joseph Tremoulet	2019-06-13	1	-13/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The logic in EarlyCSE that looks through 'not' operations in the predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is equivalent to `select (cmp sgt X, Y), Y, X`. Without this change, however, only the latter is recognized as a form of `smin X, Y`, so the two expressions receive different hash codes. This leads to missed optimization opportunities when the quadratic probing for the two hashes doesn't happen to collide, and assertion failures when probing doesn't collide on insertion but does collide on a subsequent table grow operation. This change inverts the order of some of the pattern matching, checking first for the optional `not` and then for the min/max/abs patterns, so that e.g. both expressions above are recognized as a form of `smin X, Y`. It also adds an assertion to isEqual verifying that it implies equal hash codes; this fires when there's a collision during insertion, not just grow, and so will make it easier to notice if these functions fall out of sync again. A new flag --earlycse-debug-hash is added which can be used when changing the hash function; it forces hash collisions so that any pair of values inserted which compare as equal but hash differently will be caught by the isEqual assertion. Reviewers: spatel, nikic Reviewed By: spatel, nikic Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62644 llvm-svn: 363274
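The invariant this commit enforces (keys that compare equal must also hash equally) can be illustrated with a small Python model. This is only a sketch under stated assumptions: the tuple encoding of expressions and the helper names `canonicalize`, `key_hash`, and `keys_equal` are hypothetical and do not reflect EarlyCSE's actual data structures or API. It mirrors the ordering the message describes: look through the optional `not` first, then match the `smin` pattern.

```python
# Toy model of select expressions, NOT LLVM's real representation.
# A select is ("select", cond, true_val, false_val); a compare is
# ("sgt", lhs, rhs); a negated condition is ("not", cond).

def canonicalize(expr):
    """Strip an optional 'not' first, then match the smin pattern."""
    kind, cond, tval, fval = expr
    assert kind == "select"
    # Step 1: look through 'not' by swapping the two select arms.
    if cond[0] == "not":
        cond, tval, fval = cond[1], fval, tval
    # Step 2: select (sgt X, Y), Y, X is a form of smin(X, Y).
    if cond[0] == "sgt" and (cond[1], cond[2]) == (fval, tval):
        return ("smin", tuple(sorted((tval, fval))))
    return ("select", cond, tval, fval)

def key_hash(expr):
    # Hash the canonical form so equivalent selects share a hash.
    return hash(canonicalize(expr))

def keys_equal(a, b):
    # Equality is defined on the same canonical form as the hash.
    return canonicalize(a) == canonicalize(b)

negated = ("select", ("not", ("sgt", "X", "Y")), "X", "Y")
plain = ("select", ("sgt", "X", "Y"), "Y", "X")

# Mirrors the assertion the commit adds to isEqual: equality implies equal hashes.
if keys_equal(negated, plain):
    assert key_hash(negated) == key_hash(plain)
```

Because both functions go through the same canonical form, they cannot fall out of sync, which is the property the new assertion checks.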
*	Improve reduction intrinsics by overloading result value.	Sander de Smalen	2019-06-13	5	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch uses the mechanism from D62995 to strengthen the definitions of the reduction intrinsics by letting the scalar result/accumulator type be overloaded from the vector element type. For example: ; The LLVM LangRef specifies that the scalar result must equal the ; vector element type, but this is not checked/enforced by LLVM. declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a) This patch changes that into: declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a) Which has the type-constraint more explicit and causes LLVM to check the result type with the vector element type. Reviewers: RKSimon, arsenm, rnk, greened, aemerson Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D62996 llvm-svn: 363240
*	[ARM][TTI] Scan for existing loop intrinsics	Sam Parker	2019-06-13	1	-0/+68
\| \| \| \| \| \| \| \| \|	TTI should report that it's not profitable to generate a hardware loop if it, or one of its child loops, has already been converted. Differential Revision: https://reviews.llvm.org/D63212 llvm-svn: 363234
*	[SimplifyCFG] reverting preliminary Switch patches again	Shawn Landden	2019-06-13	19	-2582/+1422
\| \| \| \| \| \| \| \| \|	This reverts 363226 and 363227, both NFC intended. I swear I fixed the test case that is failing, and ran the tests, but I will look into it again. llvm-svn: 363229
*	[SimplifyCFG] NFC, update Switch tests to better examine successive patches	Shawn Landden	2019-06-13	19	-1422/+2582
\| \| \| \| \| \| \| \| \| \| \|	Also add baseline tests to show effect of later patches. There were a couple of regressions here that were never caught, but my patch set that this is a preparation to will fix them. Differential Revision: https://reviews.llvm.org/D61150 llvm-svn: 363226
*	[SimplifyCFG] revert the last commit.	Shawn Landden	2019-06-13	16	-2469/+1308
\| \| \| \| \| \|	I ran ALL the test suite locally, so I will look into this... llvm-svn: 363223
*	[SimplifyCFG] NFC, update Switch tests to HEAD so I can	Shawn Landden	2019-06-13	16	-1308/+2469
\| \| \| \| \| \| \| \| \| \|	see if my changes change anything Also add baseline tests to show effect of later patches. Differential Revision: https://reviews.llvm.org/D61150 llvm-svn: 363222
*	Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG ↵	David L. Jones	2019-06-13	1	-44/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SinkCommonCodeFromPredecessors ...' We have observed some failures with internal builds with this revision. - Performance regressions: - llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings). - Benchmarks for Abseil's SwissTable. - Correctness: - Failures for particular libicu tests when building the Google AppEngine SDK (for PHP). hwennborg has already been notified, and is aware of reproducer failures. llvm-svn: 363220
*	[SLP] Update propagate_ir_flags.ll test to check that we do retain the ↵	Dinar Temirbulatov	2019-06-13	1	-0/+36
\| \| \| \| \| \|	common subset, NFC. llvm-svn: 363218
*	[Tests] Highlight impact of multiple exit LFTR (D62625) as requested by reviewer	Philip Reames	2019-06-12	1	-0/+158
\| \| \| \|	llvm-svn: 363217
*	[x86] add tests for vector shifts; NFC	Sanjay Patel	2019-06-12	1	-0/+117
\| \| \| \|	llvm-svn: 363203
*	[Tests] Autogen RLEV test and add tests for a future enhancement	Philip Reames	2019-06-12	1	-57/+171
\| \| \| \|	llvm-svn: 363193
*	[Tests] Add tests to highlight sibling loop optimization order issue for ↵	Philip Reames	2019-06-12	1	-2/+151
\| \| \| \| \| \| \| \|	exit rewriting The issue addressed in r363180 is more broadly relevant. For the moment, we don't actually get any of these cases because we a) restrict SCEV formation due to SCEVExpander needing to preserve LCSSA, and b) don't iterate between loops. llvm-svn: 363192
*	[SCEV] Teach computeSCEVAtScope benefit from one-input Phi. PR39673	Philip Reames	2019-06-12	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \|	SCEV does not propagate arguments through one-input Phis so as to make it easy for the SCEV expander (and related code) to preserve LCSSA. It's not entirely clear this restriction is necessary, but for the moment it exists. For this reason, we don't analyze single-entry phi inputs. However, it is possible that when this input leaves the loop through an LCSSA Phi, it is a provable constant. Missing that results in an order-of-optimization issue in loop exit value rewriting, where we miss some opportunities based on the order in which we visit sibling loops. This patch teaches computeSCEVAtScope about this case. We can generalize it later, but so far we can only replace LCSSA Phis with their constant loop-exiting values. We should probably also add similar logic directly in the SCEV construction path itself. Patch by: mkazantsev (with revised commit message by me) Differential Revision: https://reviews.llvm.org/D58113 llvm-svn: 363180
*	[InstCombine] add tests for fmin/fmax libcalls; NFC	Sanjay Patel	2019-06-12	1	-0/+18
\| \| \| \|	llvm-svn: 363175
*	Revert rL363156.	Sam Parker	2019-06-12	6	-11/+0
\| \| \| \| \| \| \|	The patch was to fix buildbots, but rL363157 should now be fixing it in a cleaner way. llvm-svn: 363174
*	StackProtector: Use PointerMayBeCaptured	Matt Arsenault	2019-06-12	2	-0/+142
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was using its own, outdated list of possible captures. This was at minimum not catching cmpxchg and addrspacecast captures. One change is now any volatile access is treated as capturing. The test coverage for this pass is quite inadequate, but this required removing volatile in the lifetime capture test. Also fixes some infrastructure issues to allow running just the IR pass. Fixes bug 42238. llvm-svn: 363169
*	LoopVersioning: Respect convergent	Matt Arsenault	2019-06-12	1	-0/+40
\| \| \| \| \| \| \| \| \|	This changes the standalone pass only. Arguably the utility class itself should assert there are no convergent calls. However, a target pass with additional context may still be able to version a loop if all of the dynamic conditions are sufficiently uniform. llvm-svn: 363165
*	[InstCombine] add tests for fcmp+select with FMF (minnum/maxnum); NFC	Sanjay Patel	2019-06-12	1	-0/+132
\| \| \| \|	llvm-svn: 363163
*	LoopLoadElim: Respect convergent	Matt Arsenault	2019-06-12	1	-0/+51
\| \| \| \|	llvm-svn: 363162
*	LoopDistribute/LAA: Respect convergent	Matt Arsenault	2019-06-12	5	-1/+416
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This case is slightly tricky, because loop distribution should be allowed in some cases, and not others. As long as runtime dependency checks don't need to be introduced, this should be OK. This is further complicated by the fact that LoopDistribute partially ignores if LAA says that vectorization is safe, and then does its own runtime pointer legality checks. Note this pass still does not handle noduplicate correctly, as this should always be forbidden with it. I'm not going to bother trying to fix it, as it would require more effort and I think noduplicate should be removed. https://reviews.llvm.org/D62607 llvm-svn: 363160
*	LoopDistribute/LAA: Add tests to catch regressions	Matt Arsenault	2019-06-12	3	-0/+118
\| \| \| \| \| \| \| \| \|	I broke 2 of these with a patch, but they were not covered by existing tests. https://reviews.llvm.org/D63035 llvm-svn: 363158
*	[NFC] Add HardwareLoops lit.local.cfg file	Sam Parker	2019-06-12	1	-0/+3
\| \| \| \| \| \| \|	Set Transforms/HardwareLoops/ARM/ tests as unsupported if there isn't an arm target. llvm-svn: 363157
*	Attempt to fix non-Arm buildbots	Sam Parker	2019-06-12	6	-0/+11
\| \| \| \| \| \|	Adding REQUIRES: arm to failing tests llvm-svn: 363156
*	[ARM] Implement TTI::isHardwareLoopProfitable	Sam Parker	2019-06-12	6	-0/+1132
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement the backend target hook to drive the HardwareLoops pass. The low-overhead branch extension for Arm M-class cores is flexible enough that we don't have to ensure correctness at this point, except checking that the loop counter variable can be stored in LR - a 32-bit register. For it to be profitable, we want to avoid loops that contain function calls, or any other instruction that alters the PC. This implementation uses TargetLoweringInfo, to query type and operation actions, looks at intrinsic calls and also performs some manual checks for remainder/division and FP operations. I think this should be a good base to start and extra details can be filled out later. Differential Revision: https://reviews.llvm.org/D62907 llvm-svn: 363149
*	Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step ↵	Orlando Cazalet-Hyams	2019-06-12	11	-368/+56
\| \| \| \| \| \| \| \| \|	through loop even after completion" This reverts commit 1a0f7a2077b70c9864faa476e15b048686cf1ca7. See phabricator thread for D60831. llvm-svn: 363132
*	Generalize icmp matching in IndVars' eliminateTrunc	Philip Reames	2019-06-11	1	-0/+104
\| \| \| \| \| \| \| \|	We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases. Differential Revision: https://reviews.llvm.org/D63112 llvm-svn: 363108
*	[InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ	Cameron McInally	2019-06-11	1	-3/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D62612 llvm-svn: 363082
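Why the `-(X-Y) --> (Y-X)` fold needs the no-signed-zeros (NSZ) flag can be seen with ordinary IEEE-754 doubles: the two forms agree everywhere except in the sign of a zero result. The sketch below uses Python floats as a stand-in for LLVM floating-point values; the helper names are hypothetical and chosen only for illustration.

```python
import math

def neg_of_sub(x, y):
    """The original form: -(X - Y)."""
    return -(x - y)

def swapped_sub(x, y):
    """The folded form: (Y - X)."""
    return y - x

def sign_bit(value):
    """True when the IEEE-754 sign bit is set, which distinguishes -0.0 from 0.0."""
    return math.copysign(1.0, value) < 0

# For X != Y the two forms give the same value.
same_for_nonzero = neg_of_sub(3.0, 1.5) == swapped_sub(3.0, 1.5)

# For X == Y the results compare equal but differ in the sign of zero:
# -(1.0 - 1.0) is -0.0, while (1.0 - 1.0) is +0.0.
zero_signs_differ = sign_bit(neg_of_sub(1.0, 1.0)) != sign_bit(swapped_sub(1.0, 1.0))
```

With NSZ the sign of a zero result is declared irrelevant, so the difference in the `X == Y` case no longer blocks the rewrite.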
*	[InstCombine] Update fptrunc (fneg x)) -> (fneg (fptrunc x) for unary FNeg	Cameron McInally	2019-06-11	3	-17/+10
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D62629 llvm-svn: 363080
*	[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through ↵	Orlando Cazalet-Hyams	2019-06-11	11	-56/+368
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. I have set up a separate review D61933 for a fix which is required for this patch. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse Reviewed By: hfinkel, jmorse Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 363046
*	AtomicExpand: Don't crash on non-0 alloca	Matt Arsenault	2019-06-11	1	-0/+37
\| \| \| \| \| \| \|	This now produces garbage on AMDGPU with a call to a nonexistent, anonymous libcall but won't assert. llvm-svn: 363022
*	AMDGPU: Expand < 32-bit atomics	Matt Arsenault	2019-06-11	3	-45/+422
\| \| \| \| \| \|	Also fix AtomicExpand asserting on atomicrmw fadd/fsub. llvm-svn: 363021
*	[Tests] Adjust LFTR dead-iv tests to bypass undef cases	Philip Reames	2019-06-10	1	-23/+17
\| \| \| \| \| \|	As pointed out by Nikita in review, undef and poison need to be handled separately. Since we're no longer expecting any test improvements - just fixes for miscompiles - update the tests to bypass the existing undef check. llvm-svn: 363002
*	[PGO] Handle cases of non-instrument BBs	Rong Xu	2019-06-10	4	-5/+106
\| \| \| \| \| \| \| \| \| \| \|	As shown in PR41279, some basic blocks (such as catchswitch) cannot be instrumented. This patch filters out these BBs in PGO instrumentation. It also sets the profile count to the fail-to-instrument edge, so that we can propagate the counts in the CFG. Differential Revision: https://reviews.llvm.org/D62700 llvm-svn: 362995
*	[Tests] Split an LFTR dead-iv case	Philip Reames	2019-06-10	1	-2/+33
\| \| \| \| \| \|	There are two interesting sub-cases here. 1) Switching IVs is legal, but only in pre-increment form. and 2) Switching IVs is legal, and so is post-increment form. llvm-svn: 362993
*	[Tests] Add tests for D62939 (miscompiles around dead pointer IVs)	Philip Reames	2019-06-10	1	-0/+228
\| \| \| \| \| \|	Flesh out a collection of tests for switching to a dead IV within LFTR, both for the current miscompile, and for some cases which we should be able to handle via simple reasoning. llvm-svn: 362976
*	[LFTR] Use recomputed BE count	Philip Reames	2019-06-10	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \|	This was discussed as part of D62880. The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening. Since there's only one test in the suite which is impacted by this change, and it's essentially equivalent codegen, that seems to be a reasonable assertion. This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertible. For the nestedIV example from elim-extend.ll, we end up with the following BE counts: BEFORE: (-2 + (-1 * %innercount) + %limit) AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>) Note that before is an i32 type, and the after is an i64. Truncating the i64 produces the i32. llvm-svn: 362975
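The closing claim of this commit message (truncating the i64 AFTER expression yields the i32 BEFORE expression) can be checked numerically. The sketch below models two's-complement arithmetic with plain Python integers. The helpers `wrap`, `be_count_before`, `be_count_after`, and `trunc_to_i32` are hypothetical names introduced for illustration, and `limit` and `innercount` are assumed to be values that fit in an i32.

```python
def wrap(value, bits):
    """Reduce value to a signed two's-complement integer of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value

def be_count_before(limit, innercount):
    # (-2 + (-1 * %innercount) + %limit), evaluated in i32
    return wrap(-2 + (-1 * innercount) + limit, 32)

def be_count_after(limit, innercount):
    # (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64)))
    # The inner add happens in i32; sign extension keeps the signed value.
    sext_limit_minus_1 = wrap(limit - 1, 32)
    return wrap(-1 + sext_limit_minus_1 + (-1 * innercount), 64)

def trunc_to_i32(value):
    """Keep only the low 32 bits, reinterpreted as signed."""
    return wrap(value, 32)

# Sample i32 inputs, including ones where the i32 arithmetic wraps.
samples = [(10, 3), (0, 0), (-2**31, 5), (2**31 - 1, -2**31)]
results = [
    (be_count_before(limit, inner), trunc_to_i32(be_count_after(limit, inner)))
    for limit, inner in samples
]
```

The two counts agree after truncation because truncation preserves addition and multiplication modulo 2^32, and truncating a sign-extended i32 returns the original i32.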
*	[InstCombine] allow unordered preds when canonicalizing to fabs()	Sanjay Patel	2019-06-10	1	-8/+4
\| \| \| \| \| \| \| \| \| \|	We have a known-never-nan value via 'nnan', so an unordered predicate is the same as its ordered sibling. Similar to: rL362937 llvm-svn: 362954
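The reasoning in this commit message rests on how ordered and unordered `fcmp` predicates relate: an unordered predicate is true when either operand is NaN or the comparison holds, while its ordered sibling additionally requires that neither operand is NaN. The sketch below models `olt` and `ult` with Python floats; the function names are hypothetical stand-ins for the IR predicates, not an LLVM API.

```python
import math

def fcmp_olt(a, b):
    """Ordered less-than: neither operand is NaN and a < b."""
    return not (math.isnan(a) or math.isnan(b)) and a < b

def fcmp_ult(a, b):
    """Unordered less-than: either operand is NaN, or a < b."""
    return math.isnan(a) or math.isnan(b) or a < b

# When no operand is NaN (what 'nnan' lets the optimizer assume),
# the two predicates always agree.
non_nan_pairs = [(-1.0, 0.0), (0.0, 0.0), (2.5, 0.0), (-0.0, 0.0)]
agree_without_nan = all(fcmp_olt(a, b) == fcmp_ult(a, b) for a, b in non_nan_pairs)

# With a NaN operand they differ, which is why the flag is needed.
differ_with_nan = fcmp_olt(math.nan, 0.0) != fcmp_ult(math.nan, 0.0)
```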