bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[CodeGen] Check for HardwareLoop Latch ExitBlock	Sam Parker	2019-06-17	2	-0/+122
\| \| \| \| \| \| \| \| \| \| \| \|	The HardwareLoops pass finds exit blocks with a scevable exit count. If the target specifies to update the loop counter in a register, through a phi, we need to ensure that the exit block is a latch so that we can insert the phi with the correct value for the incoming edge. Differential Revision: https://reviews.llvm.org/D63336 llvm-svn: 363556
*	[LV] Deny irregular types in interleavedAccessCanBeWidened	Bjorn Pettersson	2019-06-17	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Prevent the loop vectorizer from creating loads/stores of vectors with "irregular" types when interleaving. An example of an irregular type is x86_fp80, which is 80 bits wide but may have an allocation size of 96 bits. So an array of x86_fp80 is not bitcast compatible with a vector of the same type. Not sure if interleavedAccessCanBeWidened is the best place for this check, but it solves the problem seen in the added test case. And it is the same kind of check that already exists in memoryInstructionCanBeWidened. Reviewers: fhahn, Ayal, craig.topper Reviewed By: fhahn Subscribers: hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63386 llvm-svn: 363547
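The size mismatch described above can be illustrated with a small Python sketch. It is not LLVM code: `TypeLayout`, `is_irregular`, `array_bits`, and `packed_vector_bits` are hypothetical names, the 80/96-bit figures are taken from the commit message, and the vector is modeled as tightly packed, which is an assumption made for illustration.

```python
from typing import NamedTuple


class TypeLayout(NamedTuple):
    name: str
    size_bits: int        # bits of actual data in one value
    alloc_size_bits: int  # bits one element occupies in an array, padding included


def is_irregular(ty: TypeLayout) -> bool:
    """Treat a type as 'irregular' when its size differs from its allocation size."""
    return ty.size_bits != ty.alloc_size_bits


def array_bits(ty: TypeLayout, n: int) -> int:
    """Bits spanned by an n-element array (each element padded to its allocation size)."""
    return n * ty.alloc_size_bits


def packed_vector_bits(ty: TypeLayout, n: int) -> int:
    """Bits in an n-element vector, modeled here as tightly packed (assumption)."""
    return n * ty.size_bits


# Figures from the commit message: 80 data bits, 96-bit allocation size.
X86_FP80 = TypeLayout("x86_fp80", 80, 96)
I32 = TypeLayout("i32", 32, 32)
```

Under this model, four `x86_fp80` elements span 384 bits as an array but only 320 bits as a packed vector, so the two layouts cannot be reinterpreted as one another; for `i32` both sizes agree.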
*	[SCEV] Use NoWrapFlags when expanding a simple mul	Sam Parker	2019-06-17	9	-19/+19
\| \| \| \| \| \| \| \| \| \|	Second functional change following on from rL362687. Pass the NoWrapFlags from the MulExpr to InsertBinop when we're generating a shl or mul. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363540
*	[lit] Delete empty lines at the end of lit.local.cfg NFC	Fangrui Song	2019-06-17	51	-51/+0
\| \| \| \|	llvm-svn: 363538
*	Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: ↵	Hans Wennborg	2019-06-17	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \|	Also sink function calls without used results (PR41259)" Third time's the charm. This was reverted in r363220 because it was suspected of causing an internal benchmark regression and a test failure, neither of which turned out to be caused by this change. llvm-svn: 363529
*	[SimplifyCFG] Fix prof branch_weights MD while removing unreachable switch cases	Yevgeny Rouban	2019-06-17	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \|	SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata if unreachable switch cases are removed. This patch fixes this bug by making use of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122). A new test is created. Differential Revision: https://reviews.llvm.org/D62186 llvm-svn: 363527
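The consistency requirement can be sketched in Python as follows. This is not the `SwitchInstProfUpdateWrapper` API; `remove_case` is a hypothetical helper. It relies only on the layout of `branch_weights` for a switch, where the first weight belongs to the default destination and each following weight belongs to one case.

```python
def remove_case(cases, weights, index):
    """Remove one switch case and its branch weight together.

    cases:   list of case values, in order
    weights: [default_weight, weight_of_case_0, weight_of_case_1, ...]
    index:   position in `cases` of the case being removed
    """
    if len(weights) != len(cases) + 1:
        raise ValueError("branch_weights must have one entry per case plus the default")
    new_cases = cases[:index] + cases[index + 1:]
    # Case i owns weights[i + 1], because weights[0] is the default destination.
    new_weights = weights[:index + 1] + weights[index + 2:]
    return new_cases, new_weights
```

Dropping a case without dropping the matching weight would leave the metadata with more operands than the switch has successors, which is the kind of inconsistency the commit message describes.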
*	[InstSimplify] Fix addo/subo undef folds (PR42209)	Roman Lebedev	2019-06-16	2	-8/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix folds of addo and subo with an undef operand to be: `@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`, as per LLVM undef rules. Same for commuted variants. Based on the original version of the patch by @nikic. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 \| PR42209 ]] Differential Revision: https://reviews.llvm.org/D63065 llvm-svn: 363522
*	[CodeGenPrepare][x86] shift both sides of a vector select when profitable	Sanjay Patel	2019-06-16	1	-18/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is based on the example/discussion in PR37428: https://bugs.llvm.org/show_bug.cgi?id=37428 Proper vector shift instructions don't appear until AVX2, so we may generate several extra instructions within a loop trying to compensate for that. It's difficult to recover from that shift expansion later than this, so use the existing TLI hook and splat analysis to enable better codegen. This extends CGP functionality introduced with: rL201655 Differential Revision: https://reviews.llvm.org/D63233 llvm-svn: 363511
*	[SimplifyIndVar] Simplify non-overflowing saturating add/sub	Nikita Popov	2019-06-15	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \|	If we can detect that saturating math that depends on an IV cannot overflow, replace it with simple math. This is similar to the CVP optimization from D62703, just based on a different underlying analysis (SCEV vs LVI) that catches different cases. Differential Revision: https://reviews.llvm.org/D62792 llvm-svn: 363489
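A minimal Python sketch of the idea, assuming 8-bit unsigned arithmetic and a known upper bound on each operand. The helper names are hypothetical, and the real pass derives the bounds from SCEV rather than taking them as arguments.

```python
def uadd_sat(a, b, bits=8):
    """Unsigned saturating add: clamp to the maximum representable value."""
    return min(a + b, (1 << bits) - 1)


def plain_add(a, b, bits=8):
    """Ordinary wrapping add of the same width."""
    return (a + b) & ((1 << bits) - 1)


def cannot_overflow(a_max, b_max, bits=8):
    """True when no pair of values within the given upper bounds can overflow."""
    return a_max + b_max <= (1 << bits) - 1
```

For an induction variable known to stay in 0..100, `cannot_overflow(100, 1)` holds and `uadd_sat(iv, 1)` equals `plain_add(iv, 1)` for every value in that range, so the saturating form can be replaced. At 255 the two differ (255 versus 0), which is why the range proof is required.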
*	[InstCombine] Add tests to show missing fold opportunity for "icmp and ↵	Huihui Zhang	2019-06-15	6	-0/+1434
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	shift" (nfc). Summary: For icmp pred (and (sh X, Y), C), 0: When C is signbit, expect to fold (X << Y) & signbit ==/!= 0 into (X << Y) >=/< 0, rather than (X & (signbit >> Y)) != 0. When C+1 is power of 2, expect to fold (X << Y) & ~C ==/!= 0 into (X << Y) </>= C+1, rather than (X & (~C >> Y)) == 0. For icmp pred (and X, (sh signbit, Y)), 0: Expect to fold (X & (signbit l>> Y)) ==/!= 0 into (X << Y) >=/< 0. Expect to fold (X & (signbit << Y)) ==/!= 0 into (X l>> Y) >=/< 0. Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63025 llvm-svn: 363479
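The equivalences listed above can be checked by brute force. The following Python sketch does so for 8-bit values; it assumes that `>=/<` against 0 is a signed comparison, that the comparison against C+1 is unsigned, and that shift amounts are smaller than the bit width. All helper names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGNBIT = 1 << (BITS - 1)


def to_signed(v):
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return v - (1 << BITS) if v & SIGNBIT else v


def shl(x, y):
    """Left shift, truncated to BITS bits."""
    return (x << y) & MASK


def lshr(x, y):
    """Logical right shift."""
    return x >> y


def folds_hold():
    """Exhaustively compare both sides of each fold for all 8-bit X and Y < 8."""
    for x in range(1 << BITS):
        for y in range(BITS):
            shifted = shl(x, y)
            nonneg = to_signed(shifted) >= 0
            # (X << Y) & signbit == 0   <=>   (X << Y) s>= 0
            if ((shifted & SIGNBIT) == 0) != nonneg:
                return False
            # (X & (signbit l>> Y)) == 0   <=>   (X << Y) s>= 0
            if ((x & lshr(SIGNBIT, y)) == 0) != nonneg:
                return False
            # (X & (signbit << Y)) == 0   <=>   (X l>> Y) s>= 0
            if ((x & shl(SIGNBIT, y)) == 0) != (to_signed(lshr(x, y)) >= 0):
                return False
            # C + 1 a power of two: (X << Y) & ~C == 0   <=>   (X << Y) u< C + 1
            for k in range(1, BITS):
                c = (1 << k) - 1
                if ((shifted & ~c & MASK) == 0) != (shifted < c + 1):
                    return False
    return True
```

An exhaustive check at one bit width is evidence rather than a proof for all widths, but it helps confirm that the comparison direction and signedness of each fold are written correctly.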
*	[ObjC][ARC] Delete ObjC runtime calls on global variables annotated	Akira Hatanaka	2019-06-14	1	-0/+65
\| \| \| \| \| \| \| \| \| \| \| \|	with 'objc_arc_inert' Those calls are no-ops, so they can be safely deleted. rdar://problem/49839633 Differential Revision: https://reviews.llvm.org/D62433 llvm-svn: 363468
*	SROA: Allow eliminating addrspacecasted allocas	Matt Arsenault	2019-06-14	3	-55/+165
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a circular dependency between SROA and InferAddressSpaces today that requires running both multiple times in order to be able to eliminate all simple allocas and addrspacecasts. InferAddressSpaces can't remove addrspacecasts when written to memory, and SROA helps move pointers out of memory. This should avoid inserting new commuting addrspacecasts with GEPs, since there are unresolved questions about pointer wrapping between different address spaces. For now, don't replace volatile operations that don't match the alloca addrspace, as it would change the address space of the access. It may be still OK to insert an addrspacecast from the new alloca, but be more conservative for now. llvm-svn: 363462
*	SROA: Add baseline test for addrspacecast changes	Matt Arsenault	2019-06-14	1	-0/+348
\| \| \| \|	llvm-svn: 363460
*	Revert Fix a bug w/inbounds invalidation in LFTR	Florian Hahn	2019-06-14	4	-13/+11
\| \| \| \| \| \| \| \| \|	Reverting because it breaks a green dragon build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208 This reverts r363289 (git commit eb88badff96dacef8fce3f003dec34c2ef6900bf) llvm-svn: 363427
*	[CodeGenPrepare] propagate debuginfo when copying a shuffle	Sanjay Patel	2019-06-14	1	-14/+18
\| \| \| \|	llvm-svn: 363409
*	AMDGPU: Fold readlane intrinsics of constants	Matt Arsenault	2019-06-14	1	-0/+56
\| \| \| \| \| \| \| \|	I'm not 100% sure about this, since I'm worried about IR transforms that might end up introducing divergence downstream once replaced with a constant, but I haven't come up with an example yet. llvm-svn: 363406
*	[SCEV] Pass NoWrapFlags when expanding an AddExpr	Sam Parker	2019-06-14	14	-24/+24
\| \| \| \| \| \| \| \| \| \| \| \|	InsertBinop now accepts NoWrapFlags, so pass them through when expanding a simple add expression. This is the first re-commit of the functional changes from rL362687, which was previously reverted. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363364
*	[AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32	Stanislav Mekhanoshin	2019-06-13	1	-138/+136
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D63301 llvm-svn: 363339
*	[SimplifyCFG] NFC, update Switch tests as a baseline.	Shawn Landden	2019-06-13	19	-1422/+2511
\| \| \| \| \| \| \| \| \| \| \| \| \|	Also add baseline tests to show the effect of later patches. There were a couple of regressions here that were never caught, but the patch set that this change prepares for will fix them. This is the third attempt to land this patch. Differential Revision: https://reviews.llvm.org/D61150 llvm-svn: 363319
*	[InstCombine] add test for failed libfunction prototype matching; NFC	Sanjay Patel	2019-06-13	1	-7/+25
\| \| \| \|	llvm-svn: 363291
*	Fix a bug w/inbounds invalidation in LFTR	Philip Reames	2019-06-13	4	-11/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV. The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program. As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison; that's about undef. Unfortunately, the two are different. See Nikita's comment for a fuller explanation; he explains it well. (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize the chance of subtle bugs being introduced due to typos.) Differential Revision: https://reviews.llvm.org/D62939 llvm-svn: 363289
*	[InstCombine] auto-generate complete test checks; NFC	Sanjay Patel	2019-06-13	1	-23/+20
\| \| \| \|	llvm-svn: 363286
*	[NFC] Updated testcase for D54411/rL363284	David Bolvansky	2019-06-13	1	-14/+8
\| \| \| \|	llvm-svn: 363285
*	[EarlyCSE] Ensure equal keys have the same hash value	Joseph Tremoulet	2019-06-13	1	-13/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The logic in EarlyCSE that looks through 'not' operations in the predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is equivalent to `select (cmp sgt X, Y), Y, X`. Without this change, however, only the latter is recognized as a form of `smin X, Y`, so the two expressions receive different hash codes. This leads to missed optimization opportunities when the quadratic probing for the two hashes doesn't happen to collide, and assertion failures when probing doesn't collide on insertion but does collide on a subsequent table grow operation. This change inverts the order of some of the pattern matching, checking first for the optional `not` and then for the min/max/abs patterns, so that e.g. both expressions above are recognized as a form of `smin X, Y`. It also adds an assertion to isEqual verifying that it implies equal hash codes; this fires when there's a collision during insertion, not just grow, and so will make it easier to notice if these functions fall out of sync again. A new flag --earlycse-debug-hash is added which can be used when changing the hash function; it forces hash collisions so that any pair of values inserted which compare as equal but hash differently will be caught by the isEqual assertion. Reviewers: spatel, nikic Reviewed By: spatel, nikic Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62644 llvm-svn: 363274
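The invariant "equal keys must have equal hashes" can be sketched in Python. This is a simplified model, not EarlyCSE's implementation: `Select`, `canonical`, `is_equal`, and `hash_key` are hypothetical names, and only the `sgt`/`smin` pattern from the commit message is modeled. As the message describes, the optional `not` is stripped first and the min pattern is matched afterwards, so both spellings reduce to one canonical key that feeds both the equality test and the hash.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Select:
    pred: str      # comparison predicate, e.g. "sgt"
    negated: bool  # True when the condition is wrapped in a 'not'
    lhs: str       # first operand of the cmp
    rhs: str       # second operand of the cmp
    tval: str      # value selected when the condition is true
    fval: str      # value selected when the condition is false


def canonical(sel):
    """Strip the optional 'not' first, then look for the min pattern."""
    # select (not C), A, B  is the same as  select C, B, A
    t, f = (sel.fval, sel.tval) if sel.negated else (sel.tval, sel.fval)
    # select (cmp sgt X, Y), Y, X  picks the smaller value: smin X, Y
    if sel.pred == "sgt" and (t, f) == (sel.rhs, sel.lhs):
        return ("smin", frozenset((sel.lhs, sel.rhs)))
    return ("select", sel.pred, sel.lhs, sel.rhs, t, f)


def is_equal(a, b):
    return canonical(a) == canonical(b)


def hash_key(sel):
    # Hashing the canonical form guarantees equal keys hash equally.
    return hash(canonical(sel))
```

Because `is_equal` and `hash_key` are both derived from `canonical`, they cannot drift apart, which is the property the added assertion in `isEqual` is described as guarding.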
*	Improve reduction intrinsics by overloading result value.	Sander de Smalen	2019-06-13	5	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch uses the mechanism from D62995 to strengthen the definitions of the reduction intrinsics by letting the scalar result/accumulator type be overloaded from the vector element type. For example, the LLVM LangRef specifies that the scalar result must equal the vector element type, but this is not checked/enforced by LLVM: `declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)`. This patch changes that into: `declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)`, which makes the type constraint more explicit and causes LLVM to check the result type against the vector element type. Reviewers: RKSimon, arsenm, rnk, greened, aemerson Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D62996 llvm-svn: 363240
*	[ARM][TTI] Scan for existing loop intrinsics	Sam Parker	2019-06-13	1	-0/+68
\| \| \| \| \| \| \| \| \|	TTI should report that it's not profitable to generate a hardware loop if it, or one of its child loops, has already been converted. Differential Revision: https://reviews.llvm.org/D63212 llvm-svn: 363234
*	[SimplifyCFG] reverting preliminary Switch patches again	Shawn Landden	2019-06-13	19	-2582/+1422
\| \| \| \| \| \| \| \| \|	This reverts 363226 and 363227, both NFC intended. I swear I fixed the test case that is failing, and ran the tests, but I will look into it again. llvm-svn: 363229
*	[SimplifyCFG] NFC, update Switch tests to better examine successive patches	Shawn Landden	2019-06-13	19	-1422/+2582
\| \| \| \| \| \| \| \| \| \| \|	Also add baseline tests to show the effect of later patches. There were a couple of regressions here that were never caught, but the patch set that this change prepares for will fix them. Differential Revision: https://reviews.llvm.org/D61150 llvm-svn: 363226
*	[SimplifyCFG] revert the last commit.	Shawn Landden	2019-06-13	16	-2469/+1308
\| \| \| \| \| \|	I ran ALL the test suite locally, so I will look into this... llvm-svn: 363223
*	[SimplifyCFG] NFC, update Switch tests to HEAD so I can	Shawn Landden	2019-06-13	16	-1308/+2469
\| \| \| \| \| \| \| \| \| \|	see if my changes change anything. Also add baseline tests to show the effect of later patches. Differential Revision: https://reviews.llvm.org/D61150 llvm-svn: 363222
*	Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG ↵	David L. Jones	2019-06-13	1	-44/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SinkCommonCodeFromPredecessors ...' We have observed some failures with internal builds with this revision. Performance regressions: llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings); benchmarks for Abseil's SwissTable. Correctness: failures for particular libicu tests when building the Google AppEngine SDK (for PHP). hwennborg has already been notified, and is aware of reproducer failures. llvm-svn: 363220
*	[SLP] Update propagate_ir_flags.ll test to check that we do retain the ↵	Dinar Temirbulatov	2019-06-13	1	-0/+36
\| \| \| \| \| \|	common subset, NFC. llvm-svn: 363218
*	[Tests] Highlight impact of multiple exit LFTR (D62625) as requested by reviewer	Philip Reames	2019-06-12	1	-0/+158
\| \| \| \|	llvm-svn: 363217
*	[x86] add tests for vector shifts; NFC	Sanjay Patel	2019-06-12	1	-0/+117
\| \| \| \|	llvm-svn: 363203
*	[Tests] Autogen RLEV test and add tests for a future enhancement	Philip Reames	2019-06-12	1	-57/+171
\| \| \| \|	llvm-svn: 363193
*	[Tests] Add tests to highlight sibling loop optimization order issue for ↵	Philip Reames	2019-06-12	1	-2/+151
\| \| \| \| \| \| \| \|	exit rewriting The issue addressed in r363180 is more broadly relevant. For the moment, we don't actually get any of these cases because we a) restrict SCEV formation due to SCEVExpander needing to preserve LCSSA, and b) don't iterate between loops. llvm-svn: 363192
*	[SCEV] Teach computeSCEVAtScope to benefit from one-input Phi. PR39673	Philip Reames	2019-06-12	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \|	SCEV does not propagate arguments through one-input Phis so as to make it easy for the SCEV expander (and related code) to preserve LCSSA. It's not entirely clear this restriction is necessary, but for the moment it exists. For this reason, we don't analyze single-entry phi inputs. However, it is possible that when this input leaves the loop through an LCSSA Phi, it is a provable constant. Missing that results in an order of optimization issue in loop exit value rewriting where we miss some opportunities based on the order in which we visit sibling loops. This patch teaches computeSCEVAtScope about this case. We can generalize it later, but so far we can only replace LCSSA Phis with their constant loop-exiting values. We should probably also add similar logic directly in the SCEV construction path itself. Patch by: mkazantsev (with revised commit message by me) Differential Revision: https://reviews.llvm.org/D58113 llvm-svn: 363180
*	[InstCombine] add tests for fmin/fmax libcalls; NFC	Sanjay Patel	2019-06-12	1	-0/+18
\| \| \| \|	llvm-svn: 363175
*	Revert rL363156.	Sam Parker	2019-06-12	6	-11/+0
\| \| \| \| \| \| \|	The patch was to fix buildbots, but rL363157 should now be fixing it in a cleaner way. llvm-svn: 363174
*	StackProtector: Use PointerMayBeCaptured	Matt Arsenault	2019-06-12	2	-0/+142
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was using its own, outdated list of possible captures. This was at minimum not catching cmpxchg and addrspacecast captures. One change is that any volatile access is now treated as capturing. The test coverage for this pass is quite inadequate, but this required removing volatile in the lifetime capture test. Also fixes some infrastructure issues to allow running just the IR pass. Fixes bug 42238. llvm-svn: 363169
*	LoopVersioning: Respect convergent	Matt Arsenault	2019-06-12	1	-0/+40
\| \| \| \| \| \| \| \| \|	This changes the standalone pass only. Arguably the utility class itself should assert there are no convergent calls. However, a target pass with additional context may still be able to version a loop if all of the dynamic conditions are sufficiently uniform. llvm-svn: 363165
*	[InstCombine] add tests for fcmp+select with FMF (minnum/maxnum); NFC	Sanjay Patel	2019-06-12	1	-0/+132
\| \| \| \|	llvm-svn: 363163
*	LoopLoadElim: Respect convergent	Matt Arsenault	2019-06-12	1	-0/+51
\| \| \| \|	llvm-svn: 363162
*	LoopDistribute/LAA: Respect convergent	Matt Arsenault	2019-06-12	5	-1/+416
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This case is slightly tricky, because loop distribution should be allowed in some cases, and not others. As long as runtime dependency checks don't need to be introduced, this should be OK. This is further complicated by the fact that LoopDistribute partially ignores if LAA says that vectorization is safe, and then does its own runtime pointer legality checks. Note this pass still does not handle noduplicate correctly, as this should always be forbidden with it. I'm not going to bother trying to fix it, as it would require more effort and I think noduplicate should be removed. https://reviews.llvm.org/D62607 llvm-svn: 363160
*	LoopDistribute/LAA: Add tests to catch regressions	Matt Arsenault	2019-06-12	3	-0/+118
\| \| \| \| \| \| \| \| \|	I broke 2 of these with a patch, but they were not covered by existing tests. https://reviews.llvm.org/D63035 llvm-svn: 363158
*	[NFC] Add HardwareLoops lit.local.cfg file	Sam Parker	2019-06-12	1	-0/+3
\| \| \| \| \| \| \|	Set Transforms/HardwareLoops/ARM/ tests as unsupported if there isn't an arm target. llvm-svn: 363157
*	Attempt to fix non-Arm buildbots	Sam Parker	2019-06-12	6	-0/+11
\| \| \| \| \| \|	Adding REQUIRES: arm to failing tests llvm-svn: 363156
*	[ARM] Implement TTI::isHardwareLoopProfitable	Sam Parker	2019-06-12	6	-0/+1132
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement the backend target hook to drive the HardwareLoops pass. The low-overhead branch extension for Arm M-class cores is flexible enough that we don't have to ensure correctness at this point, except checking that the loop counter variable can be stored in LR - a 32-bit register. For it to be profitable, we want to avoid loops that contain function calls, or any other instruction that alters the PC. This implementation uses TargetLoweringInfo, to query type and operation actions, looks at intrinsic calls and also performs some manual checks for remainder/division and FP operations. I think this should be a good base to start and extra details can be filled out later. Differential Revision: https://reviews.llvm.org/D62907 llvm-svn: 363149
*	Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step ↵	Orlando Cazalet-Hyams	2019-06-12	11	-368/+56
\| \| \| \| \| \| \| \| \|	through loop even after completion" This reverts commit 1a0f7a2077b70c9864faa476e15b048686cf1ca7. See phabricator thread for D60831. llvm-svn: 363132
*	Generalize icmp matching in IndVars' eliminateTrunc	Philip Reames	2019-06-11	1	-0/+104
\| \| \| \| \| \| \| \|	We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases. Differential Revision: https://reviews.llvm.org/D63112 llvm-svn: 363108
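One way to handle a loop-invariant operand on either side is to normalize the comparison by swapping the operands and the predicate. The Python sketch below shows that idea only; it is an assumption about the general technique, not the code in D63112, and `SWAPPED`, `normalize`, and the operand names are hypothetical.

```python
# Predicate to use after swapping the two operands of an integer compare.
SWAPPED = {
    "eq": "eq", "ne": "ne",
    "ult": "ugt", "ugt": "ult", "ule": "uge", "uge": "ule",
    "slt": "sgt", "sgt": "slt", "sle": "sge", "sge": "sle",
}


def normalize(pred, lhs, rhs, is_invariant):
    """Return (pred, varying, invariant) with the invariant operand on the right.

    Returns None when neither operand is loop invariant, in which case the
    compare does not match the pattern at all.
    """
    if is_invariant(rhs):
        return pred, lhs, rhs
    if is_invariant(lhs):
        # A pred B is equivalent to B swapped(pred) A.
        return SWAPPED[pred], rhs, lhs
    return None
```

For example, `normalize("ult", "n", "iv_trunc", lambda v: v == "n")` yields `("ugt", "iv_trunc", "n")`, so a matcher written for an invariant right-hand side also covers the inverse form.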