bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	AMDGPU/GlobalISel: Legalize G_FFLOOR	Matt Arsenault	2019-09-13	1	-2/+2
	llvm-svn: 371803
*	AMDGPU/GlobalISel: Legalize G_FMAD	Matt Arsenault	2019-09-13	1	-0/+32
	Unlike SelectionDAG, treat this as a normally legalizable operation. In SelectionDAG this is supposed to only ever be formed if it's legal, but I've found that to be restrictive. For AMDGPU this is contextually legal depending on whether denormal flushing is allowed in the use function. Technically we currently treat the denormal mode as a subtarget feature, so custom lowering could be avoided. However, I consider this to be a defect, and this should be contextually dependent on the controllable rounding mode of the parent function. llvm-svn: 371800
*	AMDGPU/GlobalISel: Select G_FABS/G_FNEG	Matt Arsenault	2019-09-10	1	-4/+10
	f64 doesn't work yet because tablegen currently doesn't handle REG_SEQUENCE. This does regress some multi-use VALU fneg cases since now the immediate remains in an SGPR, and more moves are used for legalizing the xor. This is a SIFixSGPRCopies deficiency. llvm-svn: 371540
*	AMDGPU/GlobalISel: RegBankSelect for G_ZEXTLOAD/G_SEXTLOAD	Matt Arsenault	2019-09-10	1	-1/+3
	llvm-svn: 371536
*	AMDGPU/GlobalISel: Legalize constant 32-bit loads	Matt Arsenault	2019-09-10	1	-0/+15
	Legalize by casting to a 64-bit constant address. This isn't how the DAG implements it, but it should. llvm-svn: 371535
*	AMDGPU/GlobalISel: First pass at attempting to legalize load/stores	Matt Arsenault	2019-09-10	1	-69/+256
	There's still a lot more to do, but this handles decomposing due to alignment. I've gotten it to the point where nothing crashes the legalizer or sends it into an infinite loop. llvm-svn: 371533
*	AMDGPU/GlobalISel: Fix insert point when lowering fminnum/fmaxnum	Matt Arsenault	2019-09-09	1	-1/+1
	llvm-svn: 371471
*	AMDGPU/GlobalISel: Rename MIRBuilder to B. NFC	Austin Kerbow	2019-09-09	1	-49/+49
	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67374 llvm-svn: 371467
*	AMDGPU/GlobalISel: Legalize G_BUILD_VECTOR v2s16	Matt Arsenault	2019-09-09	1	-8/+13
	Handle it the same way as G_BUILD_VECTOR_TRUNC. Arguably only G_BUILD_VECTOR_TRUNC should be legal for this, but G_BUILD_VECTOR will probably be more convenient in most cases. llvm-svn: 371440
*	AMDGPU/GlobalISel: Implement LDS G_GLOBAL_VALUE	Matt Arsenault	2019-09-09	1	-0/+42
	Handle the simple case that lowers to a constant. llvm-svn: 371424
*	AMDGPU/GlobalISel: Legalize G_BUILD_VECTOR_TRUNC	Matt Arsenault	2019-09-09	1	-0/+9
	Treat this as legal on gfx9 since it can use S_PACK_* instructions for this. This isn't used by anything yet. The same will probably apply to 16-bit G_BUILD_VECTOR without the trunc. llvm-svn: 371423
*	AMDGPU/GlobalISel: Select G_PTR_MASK	Matt Arsenault	2019-09-09	1	-0/+4
	llvm-svn: 371412
*	AMDGPU/GlobalISel: Legalize wavefrontsize intrinsic	Matt Arsenault	2019-09-09	1	-0/+6
	llvm-svn: 371407
*	AMDGPU: Add intrinsics for address space identification	Matt Arsenault	2019-09-05	1	-0/+16
	The library currently uses ptrtoint and directly checks the queue ptr for this, which counts as a pointer capture. llvm-svn: 371009
*	AMDGPU/GlobalISel: Restore insert point when getting aperture	Matt Arsenault	2019-09-05	1	-0/+6
	Avoids SSA violations in a future patch. llvm-svn: 371008
*	AMDGPU/GlobalISel: Fix placeholder value used for addrspacecast	Matt Arsenault	2019-09-05	1	-4/+6
	llvm-svn: 371007
*	GlobalISel: Add basic legalization for G_BITREVERSE	Matt Arsenault	2019-09-04	1	-1/+1
	llvm-svn: 370979
*	AMDGPU/GlobalISel: Make 16-bit constants legal	Matt Arsenault	2019-09-04	1	-11/+5
	This is mostly for the benefit of patterns which use 16-bit constants. llvm-svn: 370921
*	Fix the build for MSVC builds using M_PI	Reid Kleckner	2019-08-29	1	-0/+7
	llvm-svn: 370405
*	AMDGPU/GlobalISel: Legalize sin/cos	Matt Arsenault	2019-08-29	1	-0/+41
	llvm-svn: 370402
*	AMDGPU/GlobalISel: Implement addrspacecast for 32-bit constant addrspace	Matt Arsenault	2019-08-28	1	-8/+31
	llvm-svn: 370140
*	GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources	Matt Arsenault	2019-08-21	1	-1/+1
	This is necessary for handling <3 x s16> on AMDGPU, assuming this should be handled as 2 separate legalization actions. The alternative would be for fewerElementsVector to handle 3->2. llvm-svn: 369547
*	GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES	Matt Arsenault	2019-08-13	1	-1/+10
	Odd sized vectors aren't handled yet. llvm-svn: 368713
*	GlobalISel: Implement lower for G_SHUFFLE_VECTOR	Matt Arsenault	2019-08-13	1	-0/+3
	llvm-svn: 368709
*	[globalisel] Add G_SEXT_INREG	Daniel Sanders	2019-08-09	1	-0/+2
	Summary: Targets often have instructions that can sign-extend certain cases faster than the equivalent shift-left/arithmetic-shift-right. Such cases can be identified by matching a shift-left/shift-right pair but there are some issues with this in the context of combines. For example, suppose you can sign-extend 8-bit up to 32-bit with a target extend instruction.
	  %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
	  %2:_(s32) = G_ASHR %1:_(s32), i32 24
	  %3:_(s32) = G_ASHR %2:_(s32), i32 1
	would reasonably combine to:
	  %1:_(s32) = G_SHL %0:_(s32), i32 24
	  %2:_(s32) = G_ASHR %1:_(s32), i32 25
	which no longer matches the special case. If your shifts and extend are equal cost, this would break even as a pair of shifts but if your shift is more expensive than the extend then it's cheaper as:
	  %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
	  %3:_(s32) = G_ASHR %2:_(s32), i32 1
	It's possible to match the shift-pair in ISel and emit an extend and ashr. However, this is far from the only way to break this shift pair and make it hard to match the extends. Another example is that with the right known-zeros, this:
	  %1:_(s32) = G_SHL %0:_(s32), i32 24
	  %2:_(s32) = G_ASHR %1:_(s32), i32 24
	  %3:_(s32) = G_MUL %2:_(s32), i32 2
	can become:
	  %1:_(s32) = G_SHL %0:_(s32), i32 24
	  %2:_(s32) = G_ASHR %1:_(s32), i32 23
	All upstream targets have been configured to lower it to the current G_SHL,G_ASHR pair but will likely want to make it legal in some cases to handle their faster cases.
	To follow-up: Provide a way to legalize based on the constant. At the moment, I'm thinking that the best way to achieve this is to provide the MI in LegalityQuery but that opens the door to breaking core principles of the legalizer (legality is not context sensitive).
	That said, it's worth noting that looking at other instructions and acting on that information doesn't violate this principle in itself. It's only a violation if, at the end of legalization, a pass that checks legality without being able to see the context would say an instruction might not be legal. That's a fairly subtle distinction so to give a concrete example, saying %2 in:
	  %1 = G_CONSTANT 16
	  %2 = G_SEXT_INREG %0, %1
	is legal is in violation of that principle if the legality of %2 depends on %1 being constant and/or being 16. However, legalizing to either:
	  %2 = G_SEXT_INREG %0, 16
	or:
	  %1 = G_CONSTANT 16
	  %2:_(s32) = G_SHL %0, %1
	  %3:_(s32) = G_ASHR %2, %1
	depending on whether %1 is constant and 16 does not violate that principle since both outputs are genuinely legal.
	Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61289 llvm-svn: 368487
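	The equivalence this commit message relies on can be checked on plain integers. The following is a minimal sketch, not LLVM code: `sextInRegViaShifts` and `sextInRegDirect` are hypothetical helper names, and the 32-bit width mirrors the s32 examples in the message. It assumes right shifts of negative signed values are arithmetic, which C++20 guarantees and earlier compilers provide in practice.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extend the low `bits` bits of x with the
// shift pair (the G_SHL followed by G_ASHR form). Requires 1 <= bits <= 32.
// The left shift is done on an unsigned value to avoid signed overflow.
inline int32_t sextInRegViaShifts(int32_t x, unsigned bits) {
  const unsigned shift = 32 - bits;
  const uint32_t shl = static_cast<uint32_t>(x) << shift;
  return static_cast<int32_t>(shl) >> shift; // arithmetic shift right
}

// Hypothetical helper: the same result computed directly, which is what a
// single sign-extend-in-register operation produces. Requires 1 <= bits <= 32.
inline int32_t sextInRegDirect(int32_t x, unsigned bits) {
  const uint32_t mask = bits == 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
  const uint32_t sign = 1u << (bits - 1);
  const uint32_t low = static_cast<uint32_t>(x) & mask;
  return static_cast<int32_t>((low ^ sign) - sign); // flip and subtract sign bit
}
```

	With `bits` set to 8, both helpers map 0xFF to -1 and 0x17F to 127. Shifting the direct result right by one more bit gives the same value as the folded shl 24 / ashr 25 pair from the message.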
*	GlobalISel: Lower scalarizing unmerge of a vector to shifts	Matt Arsenault	2019-08-01	1	-0/+1
	AMDGPU sometimes has legal s16 and <2 x s16> operations, but all registers are really 32-bit. An unmerge destination really should be widened to a 32-bit register. If widening a scalarizing vector with a target size that matches the vector size, bitcast to integer and extract the relevant bits with shifts. I'm not sure if this is the right place for this. This could arguably be part of widenScalar for the result. I also have a growing feeling that we're missing a bitcast legalize action. llvm-svn: 367604
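	The bitcast-and-shift idea described above can be sketched on plain integers. This is an illustration only, not the legalizer's code: `widenedElt` is a hypothetical name, the <2 x s16> vector is modeled as an already bitcast 32-bit integer, and element 0 is assumed to sit in the low 16 bits.

```cpp
#include <cstdint>

// Hypothetical helper: extract element `idx` (0 or 1) of a <2 x s16> value
// that has been bitcast to a 32-bit integer. The result is a widened 32-bit
// value; only its low 16 bits are meaningful, and the upper bits are
// whatever the shift leaves behind.
inline uint32_t widenedElt(uint32_t packed, unsigned idx) {
  return packed >> (16u * idx);
}
```

	For a packed value of 0xBEEF1234, the low 16 bits of `widenedElt(packed, 0)` are 0x1234 and those of `widenedElt(packed, 1)` are 0xBEEF.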
*	AMDGPU/GlobalISel: Handle G_ATOMICRMW_FADD	Matt Arsenault	2019-08-01	1	-0/+3
	llvm-svn: 367509
*	GlobalISel: moreElementsVector for G_LOAD/G_STORE	Matt Arsenault	2019-08-01	1	-0/+1
	AMDGPU change and test is a placeholder until a future patch with complete handling. llvm-svn: 367503
*	[AMDGPU/GlobalISel] Add llvm.amdgcn.fdiv.fast legalization.	Austin Kerbow	2019-07-30	1	-0/+38
	Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64966 llvm-svn: 367344
*	AMDGPU/GlobalISel: Legalize GEP for other 32-bit address spaces	Matt Arsenault	2019-07-19	1	-1/+3
	llvm-svn: 366621
*	AMDGPU/GlobalISel: Select flat loads	Matt Arsenault	2019-07-16	1	-0/+3
	Now that the patterns use the new PatFrag address space support, the only blocker to importing most load patterns is the addressing mode complex patterns. llvm-svn: 366237
*	AMDGPU/GlobalISel: Fix test failures in release build	Matt Arsenault	2019-07-16	1	-1/+1
	Apparently the check for legal instructions during instruction select does not happen without an asserts build, so these would successfully select in release, and fail in debug. Make s16 and/or/xor legal. These can just be selected directly to the 32-bit operation, as is already done in SelectionDAG, so just make them legal. llvm-svn: 366210
*	AMDGPU/GlobalISel: Custom legalize G_INSERT_VECTOR_ELT	Matt Arsenault	2019-07-15	1	-1/+31
	llvm-svn: 366116
*	AMDGPU/GlobalISel: Custom legalize G_EXTRACT_VECTOR_ELT	Matt Arsenault	2019-07-15	1	-1/+34
	Turn the constant cases into G_EXTRACTs. llvm-svn: 366115
*	AMDGPU/GlobalISel: Widen vector extracts	Matt Arsenault	2019-07-15	1	-5/+8
	llvm-svn: 366103
*	GlobalISel: Legalization for G_FMINNUM/G_FMAXNUM	Matt Arsenault	2019-07-10	1	-1/+55
	llvm-svn: 365658
*	AMDGPU/GlobalISel: Add support for wide loads >= 256-bits	Tom Stellard	2019-07-10	1	-1/+8
	Summary: This adds support for the most commonly used wide load types: <8xi32>, <16xi32>, <4xi64>, and <8xi64> Reviewers: arsenm Reviewed By: arsenm Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57399 llvm-svn: 365586
*	GlobalISel: Implement lower for G_FCOPYSIGN	Matt Arsenault	2019-07-09	1	-3/+2
	In SelectionDAG AMDGPU treated these as legal, but this was mostly because the bitcasts required for FP types were painful. Theoretically the bitpattern should eventually match to bfi, so don't bother trying to get the patterns to import. llvm-svn: 365583
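	The bit pattern referred to above is the usual mask-and-or form of copysign. A minimal sketch on 32-bit floats follows. `copySignViaBits` is a hypothetical name, IEEE-754 binary32 `float` is assumed, and this is the generic bit-level technique rather than the exact sequence the legalizer emits.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copysign implemented with integer bit operations.
// Takes the magnitude bits of `mag` and the sign bit of `sign`.
// Assumes float is IEEE-754 binary32 (sign bit is bit 31).
inline float copySignViaBits(float mag, float sign) {
  uint32_t magBits, signBits;
  std::memcpy(&magBits, &mag, sizeof(magBits));    // bitcast float -> integer
  std::memcpy(&signBits, &sign, sizeof(signBits));
  const uint32_t signMask = 0x80000000u;
  // Select the sign bit from `sign` and every other bit from `mag`; this
  // per-bit select is the kind of pattern a bitfield-insert instruction matches.
  const uint32_t outBits = (magBits & ~signMask) | (signBits & signMask);
  float out;
  std::memcpy(&out, &outBits, sizeof(out));        // bitcast integer -> float
  return out;
}
```

	For example, `copySignViaBits(1.5f, -2.0f)` yields -1.5f and `copySignViaBits(-3.0f, 0.0f)` yields 3.0f.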
*	AMDGPU/GlobalISel: Fix legality for G_BUILD_VECTOR	Matt Arsenault	2019-07-09	1	-7/+4
	llvm-svn: 365575
*	AMDGPU/GlobalISel: Legalize more concat_vectors	Matt Arsenault	2019-07-09	1	-13/+16
	llvm-svn: 365488
*	AMDGPU/GlobalISel: Make s16 G_ICMP legal	Matt Arsenault	2019-07-09	1	-2/+8
	llvm-svn: 365486
*	AMDGPU/GlobalISel: Select G_MERGE_VALUES	Matt Arsenault	2019-07-09	1	-2/+4
	llvm-svn: 365482
*	AMDGPU/GlobalISel: Handle more input argument intrinsics	Matt Arsenault	2019-07-01	1	-0/+12
	llvm-svn: 364836
*	AMDGPU/GlobalISel: Lower kernarg segment ptr intrinsics	Matt Arsenault	2019-07-01	1	-4/+44
	llvm-svn: 364835
*	AMDGPU/GlobalISel: Legalize workgroup ID intrinsics	Matt Arsenault	2019-07-01	1	-0/+9
	llvm-svn: 364834
*	AMDGPU/GlobalISel: Legalize workitem ID intrinsics	Matt Arsenault	2019-07-01	1	-0/+84
	Tests don't cover the masked input path since non-kernel arguments aren't lowered yet. Test is copied directly from the existing test, with 2 additions. llvm-svn: 364833
*	AMDGPU/GlobalISel: Custom lower control flow intrinsics	Matt Arsenault	2019-07-01	1	-0/+64
	Replace the brcond for the 2 cases that act as branches. For now follow how the current system works, although I think we can eventually get rid of the pseudos. llvm-svn: 364832
*	AMDGPU/GlobalISel: Legalize s16 add/sub/mul	Matt Arsenault	2019-07-01	1	-1/+12
	If this is scalar, promote to s32. Use a new observer class to assign the register bank of newly created registers. llvm-svn: 364827
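	Promoting a scalar s16 operation to s32 is safe because the low 16 bits of a 32-bit add, sub, or mul depend only on the low 16 bits of the inputs. A minimal sketch of that reasoning follows. `addS16ViaS32` and `mulS16ViaS32` are hypothetical names, and the zero-extension used here stands in for the extension the legalizer chooses.

```cpp
#include <cstdint>

// Hypothetical helper: 16-bit add performed as a 32-bit add, then truncated.
inline uint16_t addS16ViaS32(uint16_t a, uint16_t b) {
  const uint32_t wide = static_cast<uint32_t>(a) + static_cast<uint32_t>(b);
  return static_cast<uint16_t>(wide); // truncate back to 16 bits
}

// Hypothetical helper: 16-bit multiply performed as a 32-bit multiply,
// then truncated.
inline uint16_t mulS16ViaS32(uint16_t a, uint16_t b) {
  const uint32_t wide = static_cast<uint32_t>(a) * static_cast<uint32_t>(b);
  return static_cast<uint16_t>(wide); // truncate back to 16 bits
}
```

	Both helpers wrap exactly as a native 16-bit operation would: `addS16ViaS32(0xFFFF, 1)` and `mulS16ViaS32(0x8000, 2)` both give 0.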
*	AMDGPU/GlobalISel: Legalize s16 fcmp	Matt Arsenault	2019-07-01	1	-1/+9
	llvm-svn: 364817
*	AMDGPU/GlobalISel: Make s16 select legal	Matt Arsenault	2019-07-01	1	-2/+2
	This is easy to handle and avoids legalization artifacts which are likely to obscure combines. llvm-svn: 364787