bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	AMDGPU/GlobalISel: Legalize G_FMAD	Matt Arsenault	2019-09-13	2	-0/+817
	Unlike SelectionDAG, treat this as a normally legalizable operation. In SelectionDAG this is supposed to only ever be formed if it's legal, but I've found that to be restricting. For AMDGPU this is contextually legal depending on whether denormal flushing is allowed in the use function. Technically we currently treat the denormal mode as a subtarget feature, so custom lowering could be avoided. However, I consider this to be a defect, and this should be contextually dependent on the controllable rounding mode of the parent function. llvm-svn: 371800
*	AMDGPU/GlobalISel: Select G_CTPOP	Matt Arsenault	2019-09-13	1	-0/+204
	llvm-svn: 371798
*	[Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes mir parsing	Guillaume Chatelet	2019-09-11	2	-2/+2
	Summary: This catches malformed mir files which specify alignment as log2 instead of pow2. See https://reviews.llvm.org/D65945 for reference.
	This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
	Reviewers: courbet
	Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D67433
	llvm-svn: 371608
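As a rough illustration of the log2-versus-pow2 mix-up mentioned above, the Python sketch below validates an alignment given in bytes and converts from the log2 encoding. The function names are hypothetical and are not LLVM's API. A power-of-two check alone cannot flag a log2 value that happens to be a power of two itself (1, 2, 4, ...), so this only shows the kind of malformed value that can be caught.

```python
def is_power_of_two(value: int) -> bool:
    """True for 1, 2, 4, 8, ... (the values a byte alignment may take)."""
    return value > 0 and (value & (value - 1)) == 0


def parse_alignment(value: int) -> int:
    """Accept an alignment in bytes; reject anything that is not a power of two."""
    if not is_power_of_two(value):
        raise ValueError(f"alignment {value} is not a power of two")
    return value


def log2_to_bytes(log2_align: int) -> int:
    """Convert the log2 encoding (e.g. 3) to a byte alignment (e.g. 8)."""
    return 1 << log2_align
```

For example, an 8-byte alignment written in the log2 encoding is 3, which `parse_alignment` rejects, while `log2_to_bytes(3)` yields the intended value.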
*	GlobalISel/TableGen: Handle REG_SEQUENCE patterns	Matt Arsenault	2019-09-10	2	-13/+28
	The scalar f64 patterns don't work yet because they fail on multiple results from the unused implicit def of scc in the result bit operation. llvm-svn: 371542
*	AMDGPU/GlobalISel: Select G_FABS/G_FNEG	Matt Arsenault	2019-09-10	6	-308/+932
	f64 doesn't work yet because tablegen currently doesn't handle REG_SEQUENCE. This does regress some multi-use VALU fneg cases since now the immediate remains in an SGPR, and more moves are used for legalizing the xor. This is a SIFixSGPRCopies deficiency. llvm-svn: 371540
*	AMDGPU/GlobalISel: Select cvt pk intrinsics	Matt Arsenault	2019-09-10	5	-26/+322
	llvm-svn: 371539
*	AMDGPU/GlobalISel: Select llvm.amdgcn.sffbh	Matt Arsenault	2019-09-10	1	-0/+62
	llvm-svn: 371538
*	AMDGPU/GlobalISel: RegBankSelect for G_ZEXTLOAD/G_SEXTLOAD	Matt Arsenault	2019-09-10	2	-0/+195
	llvm-svn: 371536
*	AMDGPU/GlobalISel: Legalize constant 32-bit loads	Matt Arsenault	2019-09-10	1	-0/+64
	Legalize by casting to a 64-bit constant address. This isn't how the DAG implements it, but it should. llvm-svn: 371535
*	AMDGPU/GlobalISel: First pass at attempting to legalize load/stores	Matt Arsenault	2019-09-10	10	-1058/+55549
	There's still a lot more to do, but this handles decomposing due to alignment. I've gotten it to the point where nothing crashes or infinite loops the legalizer. llvm-svn: 371533
*	AMDGPU/GlobalISel: Fix insert point when lowering fminnum/fmaxnum	Matt Arsenault	2019-09-09	2	-206/+206
	llvm-svn: 371471
*	AMDGPU/GlobalISel: Legalize G_BUILD_VECTOR v2s16	Matt Arsenault	2019-09-09	1	-0/+99
	Handle it the same way as G_BUILD_VECTOR_TRUNC. Arguably only G_BUILD_VECTOR_TRUNC should be legal for this, but G_BUILD_VECTOR will probably be more convenient in most cases. llvm-svn: 371440
*	AMDGPU/GlobalISel: Select llvm.amdgcn.class	Matt Arsenault	2019-09-09	2	-0/+271
	Also fixes missing SubtargetPredicate on f16 class instructions. llvm-svn: 371436
*	AMDGPU/GlobalISel: Select fmed3	Matt Arsenault	2019-09-09	2	-0/+266
	llvm-svn: 371435
*	AMDGPU: Use PatFrags to allow selecting custom nodes or intrinsics	Matt Arsenault	2019-09-09	15	-0/+915
	This enables GlobalISel to handle various intrinsics. The custom node pattern will be ignored, and the intrinsic will work. This will also allow SelectionDAG to directly select the intrinsics, but as they are all custom lowered to the nodes, this ends up leaving dead code in the table. Eventually either GlobalISel should add the equivalent of custom nodes, or intrinsics should be directly used. These each have different tradeoffs. There are a few more to handle, but these are the easy ones to handle. Some others fail for other reasons. llvm-svn: 371432
*	AMDGPU: Move MnemonicAlias out of instruction def hierarchy	Matt Arsenault	2019-09-09	3	-55/+54
	Unfortunately MnemonicAlias defines a "Predicates" field just like an instruction or pattern, with a somewhat different interpretation. This ends up overriding the intended Predicates set by PredicateControl on the pseudoinstruction definitions with an empty list. This allowed incorrectly selecting instructions that should have been rejected due to the SubtargetPredicate from patterns on the instruction definition. This does remove the divergent predicate from the 64-bit shift patterns, which were already not used for the 32-bit shift, so I'm not sure what the point was. This also removes a second, redundant copy of the 64-bit divergent patterns. llvm-svn: 371427
*	AMDGPU/GlobalISel: Implement LDS G_GLOBAL_VALUE	Matt Arsenault	2019-09-09	4	-0/+54
	Handle the simple case that lowers to a constant. llvm-svn: 371424
*	AMDGPU/GlobalISel: Legalize G_BUILD_VECTOR_TRUNC	Matt Arsenault	2019-09-09	2	-0/+102
	Treat this as legal on gfx9 since it can use S_PACK_* instructions for this. This isn't used by anything yet. The same will probably apply to 16-bit G_BUILD_VECTOR without the trunc. llvm-svn: 371423
*	AMDGPU/GlobalISel: Select atomic loads	Matt Arsenault	2019-09-09	5	-152/+985
	A new check for an explicitly atomic MMO is needed to avoid incorrectly matching patterns for non-atomic loads. llvm-svn: 371418
*	AMDGPU/GlobalISel: Fix RegBankSelect for unaligned, uniform constant loads	Matt Arsenault	2019-09-09	1	-2/+51
	llvm-svn: 371416
*	AMDGPU/GlobalISel: Fix regbankselect for uniform extloads	Matt Arsenault	2019-09-09	1	-18/+85
	There are no scalar extloads. llvm-svn: 371414
*	AMDGPU: Remove code address space predicates	Matt Arsenault	2019-09-09	2	-4/+342
	Fixes 8-byte, 8-byte aligned LDS loads. 16-byte case still broken due to not being reported as legal. llvm-svn: 371413
*	AMDGPU/GlobalISel: Select G_PTR_MASK	Matt Arsenault	2019-09-09	1	-0/+475
	llvm-svn: 371412
*	AMDGPU/GlobalISel: Fix reg bank for uniform LDS loads	Matt Arsenault	2019-09-09	1	-2/+35
	The pointer is always a VGPR. Also fix hardcoding the pointer size to 64. llvm-svn: 371411
*	AMDGPU/GlobalISel: Use known bits for selection	Matt Arsenault	2019-09-09	2	-0/+85
	llvm-svn: 371409
*	AMDGPU/GlobalISel: Legalize wavefrontsize intrinsic	Matt Arsenault	2019-09-09	1	-0/+18
	llvm-svn: 371407
*	GlobalISel: Support physical register inputs in patterns	Matt Arsenault	2019-09-06	4	-11/+49
	llvm-svn: 371253
*	AMDGPU/GlobalISel: Fix load/store of types in other address spaces	Matt Arsenault	2019-09-06	3	-90/+90
	There should probably be a size-only matcher. llvm-svn: 371155
*	AMDGPU: Add intrinsics for address space identification	Matt Arsenault	2019-09-05	2	-0/+206
	The library currently uses ptrtoint and directly checks the queue ptr for this, which counts as a pointer capture. llvm-svn: 371009
*	AMDGPU/GlobalISel: Restore insert point when getting aperture	Matt Arsenault	2019-09-05	1	-33/+33
	Avoids SSA violations in a future patch. llvm-svn: 371008
*	AMDGPU/GlobalISel: Fix placeholder value used for addrspacecast	Matt Arsenault	2019-09-05	1	-34/+82
	llvm-svn: 371007
*	AMDGPU/GlobalISel: Fix assert on load from constant address	Matt Arsenault	2019-09-05	1	-0/+27
	llvm-svn: 371006
*	AMDGPU/GlobalISel: Select G_BITREVERSE	Matt Arsenault	2019-09-04	2	-0/+84
	llvm-svn: 370980
*	GlobalISel: Add basic legalization for G_BITREVERSE	Matt Arsenault	2019-09-04	1	-0/+157
	llvm-svn: 370979
*	AMDGPU/GlobalISel: Make 16-bit constants legal	Matt Arsenault	2019-09-04	20	-663/+486
	This is mostly for the benefit of patterns which use 16-bit constants. llvm-svn: 370921
*	AMDGPU/GlobalISel: Legalize sin/cos	Matt Arsenault	2019-08-29	2	-0/+1082
	llvm-svn: 370402
*	GlobalISel/TableGen: Handle setcc patterns	Matt Arsenault	2019-08-29	2	-0/+1240
	This is a special case because one node maps to two different G_ instructions, and the operand order is changed. This mostly enables G_FCMP for AMDGPU. G_ICMP is still manually selected for now since it has the SALU and VALU complication to deal with. llvm-svn: 370280
*	AMDGPU/GlobalISel: Fix constraining scalar and/or/xor	Matt Arsenault	2019-08-28	3	-168/+185
	If the result register already had a register class assigned, the sources may not have been properly constrained. llvm-svn: 370150
*	AMDGPU/GlobalISel: Implement addrspacecast for 32-bit constant addrspace	Matt Arsenault	2019-08-28	1	-0/+209
	llvm-svn: 370140
*	[GlobalISel] Fix narrowScalar for shifts to match algorithm from SDAG	Petar Avramovic	2019-08-27	5	-261/+261
	Fix typos. Use Hi and Lo prefixes for Or instead of LHS and RHS to match names of surrounding variables. Differential Revision: https://reviews.llvm.org/D66587 llvm-svn: 370062
*	[GlobalISel] Legalizer: Retry combining illegal artifacts as long as there are new artifacts	Volkan Keles	2019-08-23	25	-536/+946
	Summary: Currently, Legalizer aborts if it’s unable to legalize artifacts. However, it’s possible to combine them after processing the rest of the instructions because the legalization is likely to generate more artifacts that allow ArtifactCombiner to combine them away. Instead, move illegal artifacts to another list called RetryList and wait until all of the instructions in InstList are legalized. After that, check if there are any new artifacts and try to combine them again if that’s the case. If not, abort.
	The idea is similar to D59339, but the approach is a bit different. This patch fixes the issue described above, but the legalizer still may be unable to handle some cases depending on when to legalize artifacts. So, in the long run, we probably need a different legalization strategy that handles this dependency in a better way.
	Reviewers: dsanders, aditya_nandakumar, qcolombet, arsenm, aemerson, paquette
	Reviewed By: dsanders
	Subscribers: jvesely, wdng, nhaehnle, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D65894
	llvm-svn: 369805
*	GlobalISel: Don't create G_UADDE with constant false carry in	Matt Arsenault	2019-08-22	3	-78/+66
	The x86 tests are now broken (in particular add-scalar.ll now hits the DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a custom lowering for this, so that will need to be implemented. llvm-svn: 369673
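To illustrate the title of this commit: an add-with-carry whose carry-in is always false produces the same result and carry-out as a plain overflow-reporting add. The sketch below models that arithmetic for 32-bit operands. The helpers `uaddo` and `uadde` are hypothetical Python stand-ins named after the opcodes; they are not LLVM code.

```python
MASK32 = 0xFFFFFFFF


def uaddo(a: int, b: int) -> tuple:
    """Unsigned 32-bit add; returns (result, carry_out)."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32


def uadde(a: int, b: int, carry_in: int) -> tuple:
    """Unsigned 32-bit add with carry-in; returns (result, carry_out)."""
    total = (a & MASK32) + (b & MASK32) + (carry_in & 1)
    return total & MASK32, total >> 32
```

With `carry_in` fixed at 0, `uadde(a, b, 0)` and `uaddo(a, b)` agree for every pair of inputs, so the carry-in operand adds nothing in that case.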
*	GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources	Matt Arsenault	2019-08-21	24	-349/+714
	This is necessary for handling <3 x s16> on AMDGPU, assuming this should be handled as 2 separate legalization actions. The alternative would be for fewerElementsVector to handle 3->2. llvm-svn: 369547
*	[GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR where we pick up the wrong source index	Aditya Nandakumar	2019-08-14	1	-7/+34
	https://reviews.llvm.org/D66182 llvm-svn: 368781
*	[GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR with scalar sources	Aditya Nandakumar	2019-08-13	1	-0/+20
	https://reviews.llvm.org/D66171 llvm-svn: 368753
*	GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES	Matt Arsenault	2019-08-13	22	-371/+1001
	Odd-sized vectors aren't handled yet. llvm-svn: 368713
*	GlobalISel: Implement lower for G_SHUFFLE_VECTOR	Matt Arsenault	2019-08-13	1	-0/+257
	llvm-svn: 368709
*	[globalisel] Add G_SEXT_INREG	Daniel Sanders	2019-08-09	10	-62/+69
	Summary: Targets often have instructions that can sign-extend certain cases faster than the equivalent shift-left/arithmetic-shift-right. Such cases can be identified by matching a shift-left/shift-right pair but there are some issues with this in the context of combines. For example, suppose you can sign-extend 8-bit up to 32-bit with a target extend instruction.
	  %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
	  %2:_(s32) = G_ASHR %1:_(s32), i32 24
	  %3:_(s32) = G_ASHR %2:_(s32), i32 1
	would reasonably combine to:
	  %1:_(s32) = G_SHL %0:_(s32), i32 24
	  %2:_(s32) = G_ASHR %1:_(s32), i32 25
	which no longer matches the special case. If your shifts and extend are equal cost, this would break even as a pair of shifts but if your shift is more expensive than the extend then it's cheaper as:
	  %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
	  %3:_(s32) = G_ASHR %2:_(s32), i32 1
	It's possible to match the shift-pair in ISel and emit an extend and ashr. However, this is far from the only way to break this shift pair and make it hard to match the extends. Another example is that with the right known-zeros, this:
	  %1:_(s32) = G_SHL %0:_(s32), i32 24
	  %2:_(s32) = G_ASHR %1:_(s32), i32 24
	  %3:_(s32) = G_MUL %2:_(s32), i32 2
	can become:
	  %1:_(s32) = G_SHL %0:_(s32), i32 24
	  %2:_(s32) = G_ASHR %1:_(s32), i32 23
	All upstream targets have been configured to lower it to the current G_SHL,G_ASHR pair but will likely want to make it legal in some cases to handle their faster cases.
	To follow-up: Provide a way to legalize based on the constant. At the moment, I'm thinking that the best way to achieve this is to provide the MI in LegalityQuery but that opens the door to breaking core principles of the legalizer (legality is not context sensitive). That said, it's worth noting that looking at other instructions and acting on that information doesn't violate this principle in itself. It's only a violation if, at the end of legalization, a pass that checks legality without being able to see the context would say an instruction might not be legal. That's a fairly subtle distinction so to give a concrete example, saying %2 in:
	  %1 = G_CONSTANT 16
	  %2 = G_SEXT_INREG %0, %1
	is legal is in violation of that principle if the legality of %2 depends on %1 being constant and/or being 16. However, legalizing to either:
	  %2 = G_SEXT_INREG %0, 16
	or:
	  %1 = G_CONSTANT 16
	  %2:_(s32) = G_SHL %0, %1
	  %3:_(s32) = G_ASHR %2, %1
	depending on whether %1 is constant and 16 does not violate that principle since both outputs are genuinely legal.
	Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm
	Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D61289
	llvm-svn: 368487
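The equivalences this message relies on can be checked with a small model of 32-bit arithmetic. The following is a minimal Python sketch, not LLVM code: the helpers `shl32`, `ashr32` and `sext_inreg32` are hypothetical names that mimic G_SHL, G_ASHR and G_SEXT_INREG on s32 values stored as unsigned integers.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def shl32(x: int, n: int) -> int:
    """32-bit shift left."""
    return (x << n) & MASK32


def ashr32(x: int, n: int) -> int:
    """32-bit arithmetic shift right (replicates the sign bit)."""
    return (to_signed32(x) >> n) & MASK32


def sext_inreg32(x: int, width: int) -> int:
    """Sign-extend the low `width` bits of x to 32 bits."""
    low = x & ((1 << width) - 1)
    sign = 1 << (width - 1)
    return ((low ^ sign) - sign) & MASK32
```

In this model `ashr32(shl32(x, 24), 24)` equals `sext_inreg32(x, 8)`, and following either with `ashr32(..., 1)` equals `ashr32(shl32(x, 24), 25)`, which is the combined form that no longer looks like a sign-extend.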
*	AMDGPU/GlobalISel: Alternative mappings for constants	Matt Arsenault	2019-08-05	1	-0/+34
	Without context we assume SGPR. Allowing VGPR constants theoretically helps avoid a copy. This seems to not actually work now, and the choice isn't based on the use bank. llvm-svn: 367871
*	GlobalISel: Lower scalarizing unmerge of a vector to shifts	Matt Arsenault	2019-08-01	35	-426/+917
	AMDGPU sometimes has legal s16 and <2 x s16> operations, but all registers are really 32-bit. An unmerge destination really should be widened to a 32-bit register. If widening a scalarizing vector with a target size that matches the vector size, bitcast to integer and extract the relevant bits with shifts. I'm not sure if this is the right place for this. This could arguably be part of widenScalar for the result. I also have a growing feeling that we're missing a bitcast legalize action. llvm-svn: 367604
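The "bitcast to integer and extract the relevant bits with shifts" step can be pictured with plain integers. This is a minimal Python sketch under two assumptions that the message does not state: element 0 sits in the low bits of the packed value, and each extracted element is masked to its own width for readability. The function name `unmerge_by_shifts` is hypothetical.

```python
def unmerge_by_shifts(packed: int, elt_bits: int, num_elts: int) -> list:
    """Split an integer holding num_elts packed lanes into its elements.

    Assumes lane 0 occupies the lowest elt_bits bits.
    """
    mask = (1 << elt_bits) - 1
    # Lane i is found by shifting right by i * elt_bits.
    return [(packed >> (i * elt_bits)) & mask for i in range(num_elts)]
```

For a <2 x s16> value viewed as the 32-bit integer `0xBEEF1234`, this yields the two elements `0x1234` and `0xBEEF`.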