bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[CGP] add special-cases to form unsigned add with overflow (PR40486)	Sanjay Patel	2019-02-24	1	-8/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There's likely a missed IR canonicalization for at least 1 of these patterns. Otherwise, we wouldn't have needed the pattern-matching enhancement in D57516. Note that -- unlike usubo added with D57789 -- the TLI hook for this transform defaults to 'on'. So if there's any perf fallout from this, targets should look at how they're lowering the uaddo node in SDAG and/or override that hook. The x86 diffs suggest that there's some missing pattern-matching for forming inc/dec. This should fix the remaining known problems in: https://bugs.llvm.org/show_bug.cgi?id=40486 https://bugs.llvm.org/show_bug.cgi?id=31754 llvm-svn: 354746
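For readers unfamiliar with the idiom, here is a minimal sketch of the arithmetic behind unsigned add with overflow. The function names are hypothetical, and the two constant special cases are assumptions chosen for illustration; they are not taken from the patch itself.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Generic idiom: an unsigned add overflowed if the wrapped sum is
// smaller than one of the operands.
constexpr bool uaddOverflows(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(a + b) < a;
}

// Illustrative special case: adding 1 overflows only when a is the
// maximum value.
constexpr bool incOverflows(uint32_t a) {
  return a == std::numeric_limits<uint32_t>::max();
}

// Illustrative special case: adding all-ones (-1) overflows whenever
// a is non-zero.
constexpr bool addAllOnesOverflows(uint32_t a) { return a != 0; }

// Usage: the special-case compares agree with the generic check.
inline void checkSpecialCases(uint32_t a) {
  assert(incOverflows(a) == uaddOverflows(a, 1u));
  assert(addAllOnesOverflows(a) == uaddOverflows(a, 0xFFFFFFFFu));
}
```

In these special cases the compare no longer uses the result of the add, which is one way a def-use based matcher could miss them.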
*	[TwoAddressInstructionPass] After commuting an instruction and before trying ↵	Craig Topper	2019-02-23	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	to look for more commutable operands, resample the number of operands. The new instruction might have fewer operands than the original instruction. If we don't resample, the next loop iteration might read an operand that doesn't exist. X86 can commute blends to movss/movsd, which reduces the operand count from 4 to 3. This happened in the test case that caused r354363 & company to be reverted. A reduced version of that has been committed here. Really this whole checking for more commutable operands is a little fragile. It assumes that the new instruction's operands are in the same order and positions as the original except for the pair that was swapped. I don't know of anything that breaks this assumption today, but I've left a fixme. Fixing this will likely require an interface change. llvm-svn: 354738
*	Recommit r354647 and r354648 "[LegalizeTypes] When promoting the result of ↵	Craig Topper	2019-02-23	1	-3/+7
\| \| \| \| \| \| \| \| \| \|	EXTRACT_SUBVECTOR, also check if the input needs to be promoted. Use that to determine the element type to extract" r354648 was a follow up to fix a regression "[X86] Add a DAG combine for (aext_vector_inreg (aext_vector_inreg X)) -> (aext_vector_inreg X) to fix a regression from my previous commit." These were reverted in r354713 as their context depended on other patches that were reverted for a bug. llvm-svn: 354734
*	[NFC] Fix typos: preceeding -> preceding	Jordan Rupprecht	2019-02-23	2	-5/+5
\| \| \| \|	llvm-svn: 354715
*	Revert r354363 & co "[X86][SSE] Generalize X86ISD::BLENDI support to more ↵	Reid Kleckner	2019-02-23	1	-7/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	value types" r354363 caused https://crbug.com/934963#c1, which has a plain C reduced test case. I also had to revert some dependent changes: - r354648 - r354647 - r354640 - r354511 llvm-svn: 354713
*	[LegalizeTypes] Use PromoteTargetBoolean in PromoteIntOp_ADDSUBCARRY instead ↵	Craig Topper	2019-02-23	1	-13/+1
\| \| \| \| \| \|	of reimplementing it. NFCI llvm-svn: 354710
*	Restore ability for C++ API users to Enable IPRA.	Daniel Sanders	2019-02-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Prior to r310876 one of our out-of-tree targets was enabling IPRA by modifying the TargetOptions::EnableIPRA. This no longer works on current trunk since the useIPRA() hook overrides any values that are set in advance. This patch adjusts the behaviour of the hook so that API users and useIPRA() can both enable it but useIPRA() cannot disable it if the API user already enabled it. Reviewers: arsenm Reviewed By: arsenm Subscribers: wdng, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D38043 llvm-svn: 354692
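The enable-only behaviour described above can be sketched as a logical OR. This is a simplified model: `SketchOptions` and `applyTargetDefault` are hypothetical names, not the real LLVM types or functions.

```cpp
#include <cassert>

// Hypothetical stand-in for the option that an API user may set in advance.
struct SketchOptions {
  bool EnableIPRA = false;
};

// The target hook may turn IPRA on, but it cannot turn off a value
// that the API user already enabled.
inline void applyTargetDefault(SketchOptions &Opts, bool TargetUsesIPRA) {
  Opts.EnableIPRA |= TargetUsesIPRA;
}

// Usage: a user-enabled flag survives a target hook that returns false.
inline void usageExample() {
  SketchOptions Opts;
  Opts.EnableIPRA = true;
  applyTargetDefault(Opts, /*TargetUsesIPRA=*/false);
  assert(Opts.EnableIPRA);
}
```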
*	[CGP] move overflow intrinsic insertion to common location; NFCI	Sanjay Patel	2019-02-22	1	-17/+28
\| \| \| \| \| \| \| \| \| \| \|	We need to enhance the uaddo matching to handle special-cases as seen in PR40486 and PR31754. That means we won't necessarily have a def-use pattern, so we'll need to check dominance to determine where to place the intrinsic (as we already do for usubo). This preliminary patch is just rearranging the code, so the planned follow-up to improve uaddo will be more clear. llvm-svn: 354689
*	MIR: Preserve incoming frame index numbers	Matt Arsenault	2019-02-22	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Don't skip incrementing the frame index number if the object is dead. Instructions can still be referencing the old frame index number, and this doesn't attempt to remap those. The resulting MIR then fails to load because the use instructions use a higher frame index number than recorded list of stack objects. I'm not sure it's possible to craft a testcase with the existing set of passes. It requires selectively marking some stack objects dead in an essentially random order. StackSlotColoring condenses towards the low indexes. This avoids a regression in a future AMDGPU commit when some frame indexes are lowered separately from PEI. llvm-svn: 354688
*	CodeGen: Make RegAllocRegistry a template class	Matt Arsenault	2019-02-22	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \|	Will allow re-using the machinery for independent sets of register allocators. This will allow AMDGPU to use separate command line options for the allocator to use for SGPRs separate from VGPRs. llvm-svn: 354687
*	[MBP] Factor out function hasViableTopFallthrough and enhancement	Guozhi Wei	2019-02-22	1	-9/+36
\| \| \| \| \| \| \| \|	This patch factors out the function hasViableTopFallthrough from rotateLoop and enhances it. The original code checks only whether there is a block that can be placed before the current loop top. This patch also checks whether the loop top is the most probable successor of its predecessor. The attached test case shows its effect. Differential Revision: https://reviews.llvm.org/D58393 llvm-svn: 354682
*	Disable big-endian constant store merges from rL354676.	Nirav Dave	2019-02-22	1	-10/+11
\| \| \| \|	llvm-svn: 354677
*	[DAGCombine] Fold overlapping constant stores	Nirav Dave	2019-02-22	2	-3/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fold a smaller constant store into larger constant stores immediately preceding it. Reviewers: rnk, courbet Subscribers: javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58468 llvm-svn: 354676
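As a rough illustration of the idea, the sketch below merges a small constant into the bytes of a larger constant that was stored just before it, so one store can replace two. It assumes a little-endian byte layout and values of at most 8 bytes; `foldConstantStore` is a hypothetical helper, not the DAGCombiner code.

```cpp
#include <cassert>
#include <cstdint>

// Overwrite `smallBytes` bytes of `big`, starting at byte `offset`,
// with the low bytes of `small` (little-endian layout assumed).
inline uint64_t foldConstantStore(uint64_t big, uint64_t small,
                                  unsigned offset, unsigned smallBytes) {
  assert(smallBytes >= 1 && smallBytes <= 8 && offset + smallBytes <= 8);
  // Mask covering the bytes written by the smaller store.
  uint64_t mask =
      smallBytes == 8 ? ~0ull : ((1ull << (8 * smallBytes)) - 1);
  unsigned shift = 8 * offset;
  // Clear the overwritten bytes, then insert the smaller constant.
  return (big & ~(mask << shift)) | ((small & mask) << shift);
}
```

For example, a 2-byte store of 0xAABB at byte offset 2 on top of an 8-byte store of 0x1122334455667788 yields the single constant 0x11223344AABB7788.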
*	[LegalizeVectorOps] Improve the placement of ANDs in the ExpandLoad path for ↵	Craig Topper	2019-02-22	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \|	non-byte-sized loads. When we need to merge two adjacent loads the AND mask for the low piece was still sized for the full src element size. But we didn't have that many bits. The upper bits are already zero due to the SRL. So we can skip the AND if we're going to combine with the high bits. We do need an AND to clear out any bits from the high part. We were ANDing the high part before combining with the low part, but it looks like ANDing after the OR gets better results. So we can just emit the final AND after the optional concatenation is done. That will handle skipping the AND before the OR and get rid of extra high bits after the OR. llvm-svn: 354655
*	[LegalizeVectorOps] Simplify the non-byte sized load handling ↵	Craig Topper	2019-02-22	1	-11/+8
\| \| \| \| \| \| \| \|	VectorLegalizer::ExpandLoad. NFCI Remove an if that should always be true. Merge the body of another into the only block that could make the if true. llvm-svn: 354654
*	DAG: Add helper for creating shifts with correct type	Matt Arsenault	2019-02-22	2	-1/+8
\| \| \| \|	llvm-svn: 354649
*	[LegalizeTypes] When promoting the result of EXTRACT_SUBVECTOR, also check ↵	Craig Topper	2019-02-22	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	if the input needs to be promoted. Use that to determine the element type to extract. Otherwise we end up creating extract_vector_elts that then each need to have their input promoted. This can lead to truncates needing to be emitted for each of those. But we already emitted any_extends when we legalized the extract_subvector. So now we have pairs of any_extend+trunc that partially cancel. But depending on how DAGCombiner visits them we can get weird results. By promoting the input at the same time we can create only a single any_extend or truncate. There's one regression in the vector-narrow-binop.ll case, but that looks easy to fix with a follow up patch. llvm-svn: 354647
*	[DAGCombiner] prevent infinite looping by truncating 'and' (PR40793)	Sanjay Patel	2019-02-21	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fold can occur during legalization, so it can fight with promotion to the larger type. It apparently takes a special sequence and subtarget to avoid more basic simplifications that would hide the problem. But there's a bigger question raised here: why does distributeTruncateThroughAnd() even exist? It duplicates functionality from a more minimal pattern that we already have. But getting rid of this function requires some preliminary steps. https://bugs.llvm.org/show_bug.cgi?id=40793 llvm-svn: 354594
*	RegBankSelect: Allow targets to introduce control flow for mapping	Matt Arsenault	2019-02-21	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	For AMDGPU, if an operand requires an SGPR but is only available as a VGPR, a loop needs to be introduced to execute the instruction with each unique combination of values across all lanes. The rest of the instructions in the block will be moved to a new block following the loop. Check if the next instruction's parent changed, and update the iterators and insertion block if this happened. Tests will be included in a future patch. llvm-svn: 354591
*	Re-land part of r354244 "[DAGCombiner] Eliminate dead stores to stack."	Clement Courbet	2019-02-21	3	-10/+44
\| \| \| \| \| \|	This part introduces the lifetime node. llvm-svn: 354578
*	Add skipFunction to PostRA machine sinking pass.	Xin Tong	2019-02-21	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add skipFunction to PostRA machine sinking pass. Reviewers: junbuml Subscribers: arsenm, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57847 llvm-svn: 354541
*	[CGP] match a special-case of unsigned subtract overflow	Sanjay Patel	2019-02-20	1	-0/+5
\| \| \| \| \| \| \|	This is the 'sub0' (negate) pattern from PR31754: https://bugs.llvm.org/show_bug.cgi?id=31754 llvm-svn: 354519
*	[DAGCombine] Generalize Dead Store to overlapping stores.	Nirav Dave	2019-02-20	1	-14/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Remove stores that are immediately overwritten by larger stores. Reviewers: courbet, rnk Reviewed By: rnk Subscribers: javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58467 llvm-svn: 354518
*	[SelectionDAG] Teach GetDemandedBits to look at the known zeros of the LHS ↵	Craig Topper	2019-02-20	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \|	when handling ISD::AND If the LHS has known zeros, then the RHS immediate mask might have been simplified to remove those bits. This patch adds a call to computeKnownBits to get the known zeroes to handle that possibility. I left an early out to skip the call if all of the demanded bits are set in the mask. Differential Revision: https://reviews.llvm.org/D58464 llvm-svn: 354514
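The reasoning can be sketched with plain bit masks. This is a simplified model with hypothetical names, not the SelectionDAG implementation: an AND is unnecessary for the demanded bits when every demanded bit is either kept by the mask or already known to be zero in the LHS.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if `lhs & mask` equals `lhs` on all demanded bits, given
// the bits of lhs that are known to be zero.
inline bool andIsRedundant(uint32_t demanded, uint32_t mask,
                           uint32_t lhsKnownZero) {
  // Early out: the mask alone already covers every demanded bit, so
  // the known-bits information is not needed.
  if ((demanded & ~mask) == 0)
    return true;
  // Bits missing from the mask are harmless where lhs is known zero.
  return (demanded & ~(mask | lhsKnownZero)) == 0;
}

// Usage: a mask of 0xFF simplified to 0xF0 because the low 4 bits of
// the LHS are known zero is still recognized as redundant.
inline void usageExample() {
  assert(andIsRedundant(0xFFu, 0xF0u, 0x0Fu));
  assert(!andIsRedundant(0xFFu, 0xF0u, 0x00u));
}
```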
*	[SDAG] Support vector UMULO/SMULO	Nikita Popov	2019-02-20	4	-19/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Second part of https://bugs.llvm.org/show_bug.cgi?id=40442. This adds an extra UnrollVectorOverflowOp() method to SDAG, because the general UnrollOverflowOp() method can't deal with multiple results. Additionally we need to expand UMULO/SMULO during vector op legalization, as it may result in unrolling, which may need additional type legalization. Differential Revision: https://reviews.llvm.org/D57997 llvm-svn: 354513
*	Revert r354498 "[X86] Add test case to show missed opportunity to remove an ↵	Craig Topper	2019-02-20	1	-7/+3
\| \| \| \| \| \| \| \|	explicit AND on the bit position from BT when it has known zeros." I accidentally committed more than just the test. llvm-svn: 354499
*	[X86] Add test case to show missed opportunity to remove an explicit AND on ↵	Craig Topper	2019-02-20	1	-3/+7
\| \| \| \| \| \| \| \| \| \|	the bit position from BT when it has known zeros. If the bit position has known zeros in it, then the AND immediate will likely be optimized to remove bits. This can prevent GetDemandedBits from recognizing that the AND is unnecessary. llvm-svn: 354498
*	GlobalISel: Fix fewerElementsVector for ctlz with different result type	Matt Arsenault	2019-02-20	1	-1/+5
\| \| \| \| \| \|	Also complete the set of related operations. llvm-svn: 354480
*	GlobalISel: Implement moreElementsVector for g_insert results	Matt Arsenault	2019-02-20	1	-0/+8
\| \| \| \|	llvm-svn: 354477
*	Re-land the refactoring part of r354244 "[DAGCombiner] Eliminate dead stores ↵	Clement Courbet	2019-02-20	2	-35/+80
\| \| \| \| \| \| \| \|	to stack." This is an NFC. llvm-svn: 354476
*	[Codegen] Remove dead flags on Physical Defs in machine cse	David Green	2019-02-20	1	-19/+24
\| \| \| \| \| \| \| \| \| \|	We may leave behind incorrect dead flags on instructions that are CSE'd. Make sure we remove the dead flags on physical registers to prevent other incorrect code motion. Differential Revision: https://reviews.llvm.org/D58115 llvm-svn: 354443
*	[RegAllocGreedy] Take last chance recoloring into account in split and assign	Mikael Holmen	2019-02-20	1	-12/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a follow-up to r353988 where tryEvict was extended to take last chance recoloring into account. Now we do the same thing for trySplit and tryAssign. Now we always pass a "FixedRegisters" argument to canEvictInterference and tryEvict so it doesn't need to have a default value anymore. The need for this was found long ago in an out-of-tree target. Unfortunately I don't have a reproducer for an in-tree target. Reviewers: qcolombet, rudkx Reviewed By: qcolombet, rudkx Subscribers: rudkx, MatzeB, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58376 llvm-svn: 354439
*	[NFC] add/modify wrapper function for findRegisterDefOperand().	Chen Zheng	2019-02-20	3	-3/+4
\| \| \| \|	llvm-svn: 354438
*	[WebAssembly] Update MC for bulk memory	Thomas Lively	2019-02-19	1	-3/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Rename MemoryIndex to InitFlags and implement logic for determining data segment layout in ObjectYAML and MC. Also adds a "passive" flag for the .section assembler directive although this cannot be assembled yet because the assembler does not support data sections. Reviewers: sbc100, aardappel, aheejin, dschuff Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57938 llvm-svn: 354397
*	[SDAG] Use shift amount type in MULO promotion; NFC	Nikita Popov	2019-02-19	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	Directly use the correct shift amount type if it is possible, and future-proof the code against vectors. The added test makes sure that bitwidths that do not fit into the shift amount type do not assert. Split out from D57997. llvm-svn: 354359
*	GlobalISel: Implement moreElementsVector for select	Matt Arsenault	2019-02-19	1	-0/+12
\| \| \| \|	llvm-svn: 354354
*	GlobalISel: Implement moreElementsVector for G_EXTRACT source	Matt Arsenault	2019-02-19	1	-0/+7
\| \| \| \|	llvm-svn: 354348
*	GlobalISel: Implement moreElementsVector for bit ops	Matt Arsenault	2019-02-19	1	-0/+40
\| \| \| \|	llvm-svn: 354345
*	GlobalISel: Verify g_insert	Matt Arsenault	2019-02-19	1	-0/+24
\| \| \| \|	llvm-svn: 354342
*	[GlobalISel][AArch64] Legalize + select some llvm.ctlz.* intrinsics	Jessica Paquette	2019-02-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Legalize/select llvm.ctlz.* Add select-ctlz to show that we actually select them. Update arm64-clrsb.ll and arm64-vclz.ll to show that we perform valid transformations in optimized builds, and document where GISel can improve. Differential Revision: https://reviews.llvm.org/D58155 llvm-svn: 354299
*	[CGP] form usub with overflow from sub+icmp	Sanjay Patel	2019-02-18	1	-13/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The motivating x86 cases for forming the intrinsic are shown in PR31754 and PR40487: https://bugs.llvm.org/show_bug.cgi?id=31754 https://bugs.llvm.org/show_bug.cgi?id=40487 ...and those are shown in the IR test file and x86 codegen file. Matching the usubo pattern is harder than uaddo because we have 2 independent values rather than a def-use. This adds a TLI hook that should preserve the existing behavior for uaddo formation, but disables usubo formation by default. Only x86 overrides that setting for now, although other targets will likely benefit by forming usubo too. Differential Revision: https://reviews.llvm.org/D57789 llvm-svn: 354298
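A minimal sketch of the source-level sub+icmp shape follows. The names are hypothetical. Note that the compare uses the original operands rather than the subtraction result, which is why the two values are independent instead of forming a def-use chain.

```cpp
#include <cassert>
#include <cstdint>

// Result pair in the style of an overflow intrinsic: the wrapped
// difference plus a flag saying whether the subtraction borrowed.
struct USubResult {
  uint32_t value;
  bool overflow;
};

// The subtraction and the compare are independent: `a < b` does not
// use `a - b`.
inline USubResult usubWithOverflow(uint32_t a, uint32_t b) {
  USubResult r;
  r.value = a - b;     // wraps modulo 2^32
  r.overflow = a < b;  // borrow occurred
  return r;
}

// Usage: 3 - 5 wraps to 0xFFFFFFFE and reports overflow.
inline void usageExample() {
  USubResult r = usubWithOverflow(3u, 5u);
  assert(r.overflow && r.value == 0xFFFFFFFEu);
}
```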
*	GlobalISel: Implement widenScalar for g_extract scalar results	Matt Arsenault	2019-02-18	1	-6/+48
\| \| \| \|	llvm-svn: 354293
*	GlobalISel: Make buildExtract use DstOp/SrcOp	Matt Arsenault	2019-02-18	1	-12/+15
\| \| \| \|	llvm-svn: 354292
*	GlobalISel: Fix double count of offset for irregular vector breakdowns	Matt Arsenault	2019-02-18	1	-1/+0
\| \| \| \| \| \| \|	Fixes cases with odd vectors that break into multiple requested size pieces. llvm-svn: 354280
*	Revert r354244 "[DAGCombiner] Eliminate dead stores to stack."	Clement Courbet	2019-02-18	5	-209/+45
\| \| \| \| \| \|	Breaks some bots. llvm-svn: 354245
*	[DAGCombiner] Eliminate dead stores to stack.	Clement Courbet	2019-02-18	5	-45/+209
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: A store to an object whose lifetime is about to end can be removed. See PR40550 for motivation. Reviewers: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57541 llvm-svn: 354244
*	[SelectionDAG] Extract [US]MULO expansion into TL method; NFC	Nikita Popov	2019-02-17	2	-148/+124
\| \| \| \| \| \| \| \| \| \| \| \|	In preparation for supporting vector expansion. Add an isPostTypeLegalization flag to makeLibCall(), because this expansion relies on the legalized form using MERGE_VALUES. Drop the corresponding variant of ExpandLibCall, which is no longer used. Differential Revision: https://reviews.llvm.org/D58006 llvm-svn: 354226
*	[X86] Fix LowerAsmOutputForConstraint.	Nirav Dave	2019-02-15	2	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Update Flag when generating cc output. Fixes PR40737. Reviewers: rnk, nickdesaulniers, craig.topper, spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58283 llvm-svn: 354163
*	Fix 80-column limit in SimplifyDemandedBits/SimplifyDemandedVectorElts. NFCI.	Simon Pilgrim	2019-02-15	1	-70/+78
\| \| \| \|	llvm-svn: 354152
*	GlobalISel: Fix inadequate verification of g_build_vector	Matt Arsenault	2019-02-15	1	-6/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	Testing based on the total size of the elements failed to catch a few invalid scenarios, so explicitly check the number of elements/operands and types. This failed to catch situations like <4 x s16> = G_BUILD_VECTOR s32, s32 since the total size added up. This also would fail to catch an implicit conversion between pointers and scalars. llvm-svn: 354139
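The gap between the two checks can be sketched as follows. This is a simplified model with hypothetical types and names, not the MachineVerifier code, and it ignores the pointer-versus-scalar distinction.

```cpp
#include <cassert>
#include <vector>

// Hypothetical vector type: element count and element width in bits.
struct VecTy {
  unsigned numElts;
  unsigned eltBits;
};

// Size-only check: compares the total bit width of the operands with
// the total bit width of the result.
inline bool sizeOnlyCheck(VecTy dst, const std::vector<unsigned> &srcBits) {
  unsigned total = 0;
  for (unsigned bits : srcBits)
    total += bits;
  return total == dst.numElts * dst.eltBits;
}

// Stricter check: the operand count must equal the element count and
// each operand must have the element width.
inline bool strictCheck(VecTy dst, const std::vector<unsigned> &srcBits) {
  if (srcBits.size() != dst.numElts)
    return false;
  for (unsigned bits : srcBits)
    if (bits != dst.eltBits)
      return false;
  return true;
}

// Usage: <4 x s16> built from two s32 operands passes the size-only
// check (64 == 64) but fails the strict one.
inline void usageExample() {
  VecTy v4s16 = {4, 16};
  assert(sizeOnlyCheck(v4s16, {32, 32}));
  assert(!strictCheck(v4s16, {32, 32}));
}
```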