bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	AMDGPU/GlobalISel: RegBankSelect for update.dpp	Matt Arsenault	2019-06-29	1	-0/+1
\| \| \| \|	llvm-svn: 364701
*	AMDGPU/GlobalISel: RegBankSelect for atomic.inc/atomic.dec	Matt Arsenault	2019-06-29	1	-0/+2
\| \| \| \|	llvm-svn: 364699
*	AMDGPU/GlobalISel: RegBankSelect for some DS intrinsics	Matt Arsenault	2019-06-29	1	-1/+17
\| \| \| \|	llvm-svn: 364698
*	AMDGPU/GlobalISel: RegBankSelect for some easy intrinsics	Matt Arsenault	2019-06-29	1	-1/+48
\| \| \| \|	llvm-svn: 364697
*	AMDGPU/GlobalISel: RegBankSelect for icmp/fcmp intrinsics	Matt Arsenault	2019-06-29	1	-0/+12
\| \| \| \|	llvm-svn: 364696
*	AMDGPU/GlobalISel: RegBankSelect for amdgcn.div.fmas	Matt Arsenault	2019-06-29	1	-0/+1
\| \| \| \|	llvm-svn: 364695
*	AMDGPU/GlobalISel: RegBankSelect for some simple leaf intrinsics	Matt Arsenault	2019-06-29	1	-1/+11
\| \| \| \|	llvm-svn: 364694
*	[AMDGPU][MC] Fix 2 for sanitizer failure in 364645	Dmitry Preobrazhensky	2019-06-28	2	-6/+6
\| \| \| \|	llvm-svn: 364656
*	[AMDGPU][MC] Fix for sanitizer failure in 364645	Dmitry Preobrazhensky	2019-06-28	1	-4/+10
\| \| \| \|	llvm-svn: 364651
*	[AMDGPU][MC] Enabled constant expressions as operands of sendmsg	Dmitry Preobrazhensky	2019-06-28	5	-210/+266
\| \| \| \| \| \| \| \| \| \|	See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D62735 llvm-svn: 364645
*	[AMDGPU] Packed thread ids in function call ABI	Stanislav Mekhanoshin	2019-06-28	4	-22/+132
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D63851 llvm-svn: 364619
*	AMDGPU/GlobalISel: Convert to using Register	Matt Arsenault	2019-06-28	4	-44/+44
\| \| \| \|	llvm-svn: 364616
*	AMDGPU: Make fixing i1 copies robust against re-ordering	Nicolai Haehnle	2019-06-27	1	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The new test case led to incorrect code. Change-Id: Ief48b227e97aa662dd3535c9bafb27d4a184efca Reviewers: arsenm, david-salinas Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63871 llvm-svn: 364566
*	[GlobalISel] Accept multiple vregs in lowerFormalArgs	Diana Picus	2019-06-27	2	-10/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches. With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0. AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC. Mips doesn't support aggregates yet, so it's also NFC. x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument. Differential Revision: https://reviews.llvm.org/D63549 llvm-svn: 364510
*	[AMDGPU] Fix +DumpCode to print an entry label for the first function	Jay Foad	2019-06-27	1	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The +DumpCode attribute is a horrible hack in AMDGPU to embed the disassembly of the generated code into the elf file. It is used by LLPC to implement an extension that allows the application to read back the disassembly of the code. It tries to print an entry label at the start of every function, but that didn't work for the first function in the module because DumpCodeInstEmitter wasn't initialised until EmitFunctionBodyStart which is too late. Change-Id: I790d73ddf4f51fd02ab32529380c7cb7c607c4ee Reviewers: arsenm, tpr, kzhuravl Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63712 llvm-svn: 364508
*	AMDGPU: Assert SPAdj is 0	Matt Arsenault	2019-06-26	1	-0/+2
\| \| \| \|	llvm-svn: 364473
*	[AMDGPU] Fix Livereg computation during epilogue insertion	Matt Arsenault	2019-06-26	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	The LivePhysRegs calculated in order to find a scratch register in the epilogue code wrongly uses 'LiveIns'. Instead, it should use the 'Liveout' sets. The liveness computation also considers the operands of the terminator (return) instruction, which is the insertion point for the scratch-exec-copy instruction. Patch by Christudasan Devadasan llvm-svn: 364470
*	[AMDGPU] Fix for branch offset hardware workaround	Ryan Taylor	2019-06-26	7	-24/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes a hardware bug that makes a branch offset of 0x3f unsafe. This replaces the 32-bit branch with offset 0x3f with a 64-bit instruction that includes the same 32-bit branch and the encoding for a s_nop 0 to follow. The relaxer then modifies the offsets accordingly. Change-Id: I10b7aed99d651f8159401b01bb421f105fa6288e Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63494 llvm-svn: 364451
*	AMDGPU: Fix unused variable	Matt Arsenault	2019-06-26	1	-1/+0
\| \| \| \|	llvm-svn: 364426
*	AMDGPU: Check MRI for callee saved regs instead of TRI	Matt Arsenault	2019-06-26	4	-7/+5
\| \| \| \| \| \| \|	This should be the same, but MRI does allow dynamically changing the CSR set, although this is currently not used. llvm-svn: 364425
*	Don't look for the TargetFrameLowering in the implementation	Matt Arsenault	2019-06-25	1	-2/+1
\| \| \| \| \| \|	The same oddity was apparently copy-pasted between multiple targets. llvm-svn: 364349
*	Update phis in AMDGPUUnifyDivergentExitNodes	Diego Novillo	2019-06-25	1	-7/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Original patch https://reviews.llvm.org/D63659 from Steven Perron <stevenperron@google.com> The pass AMDGPUUnifyDivergentExitNodes does not update the phi nodes in the successors of blocks that it splits. This is fixed by calling BasicBlock::splitBasicBlock to split the block instead of doing it manually. This does extra work because a new conditional branch is created in BB which is immediately replaced, but I think the simplicity is worth it. It also helps make the code more future proof in case other things need to be updated. llvm-svn: 364342
*	[AMDGPU] Removed dead SIMachineFunctionInfo::getWorkItemIDVGPR()	Stanislav Mekhanoshin	2019-06-25	2	-20/+0
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D63780 llvm-svn: 364339
*	[AMDGPU] Null checking on TS to avoid crashing in clang tests.	Michael Liao	2019-06-25	1	-1/+2
\| \| \| \| \| \| \|	- `test/Misc/backend-resource-limit-diagnostics.cl` crashes as null streamer is used. llvm-svn: 364318
*	AMDGPU: Select G_SEXT/G_ZEXT/G_ANYEXT	Matt Arsenault	2019-06-25	3	-5/+135
\| \| \| \|	llvm-svn: 364308
*	AMDGPU: Write LDS objects out as global symbols in code generation	Nicolai Haehnle	2019-06-25	9	-14/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The symbols use the processor-specific SHN_AMDGPU_LDS section index introduced with a previous change. The linker is then expected to resolve relocations, which are also emitted. Initially disabled for HSA and PAL environments until they have caught up in terms of linker and runtime loader. Some notes: - The llvm.amdgcn.groupstaticsize intrinsics can no longer be lowered to a constant at compile time, which means some tests can no longer be applied. The current "solution" is a terrible hack, but the intrinsic isn't used by Mesa, so we can keep it for now. - We no longer know the full LDS size per kernel at compile time, which means that we can no longer generate a relevant error message at compile time. It would be possible to add a check for the size of individual variables, but ultimately the linker will have to perform the final check. Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275 Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin Subscribers: qcolombet, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61494 llvm-svn: 364297
*	AMDGPU/MC: Add .amdgpu_lds directive	Nicolai Haehnle	2019-06-25	3	-0/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The directive defines a symbol as a group/local memory (LDS) symbol. LDS symbols behave similarly to common symbols for the purposes of ELF, using the processor-specific SHN_AMDGPU_LDS as section index. It is the linker and/or runtime loader's job to "instantiate" LDS symbols and resolve relocations that reference them. It is not possible to initialize LDS memory (not even zero-initialize as for .bss). We want to be able to link together objects -- starting with relocatable objects, but possibly expanding to shared objects in the future -- that access LDS memory in a flexible way. LDS memory is in an address space that is entirely separate from the address space that contains the program image (code and normal data), so having program segments for it doesn't really make sense. Furthermore, we want to be able to compile multiple kernels in a compilation unit which have disjoint use of LDS memory. In that case, we may want to place LDS symbols differently for different kernels to save memory (LDS memory is very limited and physically private to each kernel invocation), so we can't simply place LDS symbols in a .lds section. Hence this solution where LDS symbols always stay undefined. Change-Id: I08cbc37a7c0c32f53f7b6123aa0afc91dbc1748f Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61493 llvm-svn: 364296
*	AMDGPU/GlobalISel: Fix regbankselect for amdgcn.class	Matt Arsenault	2019-06-25	1	-4/+8
\| \| \| \|	llvm-svn: 364262
*	AMDGPU/GlobalISel: Select G_TRUNC	Matt Arsenault	2019-06-24	4	-24/+115
\| \| \| \|	llvm-svn: 364215
*	AMDGPU/GlobalISel: RegBankSelect for amdgcn.class	Matt Arsenault	2019-06-24	1	-0/+9
\| \| \| \|	llvm-svn: 364214
*	AMDGPU/GlobalISel: Split VALU s64 G_ZEXT/G_SEXT in RegBankSelect	Matt Arsenault	2019-06-24	1	-13/+57
\| \| \| \| \| \| \| \| \| \| \|	Scalar extends to s64 can use S_BFE_{I64\|U64}, but vector extends need to extend to the 32-bit half, and then to 64. I'm not sure what the line should be between what RegBankSelect handles, and what instruction select does, but for now I'm erring on the side of RegBankSelect for future post-RBS combines. llvm-svn: 364212
*	[AMDGPU] Allow any value in unused src0 field in v_nop	Tim Renouf	2019-06-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The LLVM disassembler assumes that the unused src0 operand of v_nop is zero. Other tools can put another value in that field, which is still valid. This commit fixes the LLVM disassembler to recognize such an encoding as v_nop, in the same way as we already do for s_getpc. Differential Revision: https://reviews.llvm.org/D63724 Change-Id: Iaf0363eae26ff92fc4ebc716216476adbff37a6f llvm-svn: 364208
*	AMDGPU/GlobalISel: Fix selecting G_IMPLICIT_DEF for s1	Matt Arsenault	2019-06-24	2	-9/+27
\| \| \| \| \| \|	Try to fail for scc, since I don't think that should ever be produced. llvm-svn: 364199
*	GlobalISel: Remove unsigned variant of SrcOp	Matt Arsenault	2019-06-24	4	-29/+29
\| \| \| \| \| \| \| \| \|	Force using Register. One downside is the generated register enums require explicit conversion. llvm-svn: 364194
*	CodeGen: Introduce a class for registers	Matt Arsenault	2019-06-24	12	-42/+43
\| \| \| \| \| \| \| \| \|	Avoids using a plain unsigned for registers throughout codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191
*	[AMDGPU] Remove unused variable AllSGPRSpilledToVGPRs. NFC	Bjorn Pettersson	2019-06-24	1	-5/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Removing the unused variable AllSGPRSpilledToVGPRs in SIFrameLowering::processFunctionBeforeFrameFinalized to avoid error: variable 'AllSGPRSpilledToVGPRs' set but not used [-Werror=unused-but-set-variable] Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63721 llvm-svn: 364190
*	AMDGPU/GlobalISel: Fix RegBankSelect for s1 sext/zext/anyext	Matt Arsenault	2019-06-24	1	-10/+76
\| \| \| \| \| \| \| \|	This needs different handling if the source is known to be a valid condition or not. Handle turning it into shifts or a select during regbankselect. llvm-svn: 364186
*	AMDGPU: Fold frame index into MUBUF	Matt Arsenault	2019-06-24	2	-10/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This matters for byval uses outside of the entry block, which appear as copies. Previously, the only folding done was during selection, which could not see the underlying frame index. For any uses outside the entry block, the frame index was materialized in the entry block relative to the global scratch wave offset. This may produce worse code in cases where the offset ends up not fitting in the MUBUF offset field. A better heuristic would be helpful for extreme frames. llvm-svn: 364185
*	AMDGPU: Cleanup checking when spills need emergency slots	Matt Arsenault	2019-06-24	1	-7/+6
\| \| \| \| \| \|	Address fixme, which should no longer be a problem since r363757. llvm-svn: 364182
*	AMDGPU: Fix not using s33 for scratch wave offset in kernels	Matt Arsenault	2019-06-21	1	-7/+11
\| \| \| \| \| \|	Fixes missing piece from r363990. llvm-svn: 364099
*	[AMDGPU] hazard recognizer for fp atomic to s_denorm_mode	Stanislav Mekhanoshin	2019-06-21	9	-28/+112
\| \| \| \| \| \| \| \| \|	This requires 3 wait states unless there is a wait or VALU in between. Differential Revision: https://reviews.llvm.org/D63619 llvm-svn: 364074
*	AMDGPU: Always use s33 for global scratch wave offset	Matt Arsenault	2019-06-20	2	-9/+1
\| \| \| \| \| \| \| \| \|	Every called function could possibly need this to calculate the absolute address of stack objects, and this avoids inserting a copy around every call site in the kernel. It's also somewhat cleaner to keep this in a callee saved SGPR. llvm-svn: 363990
*	AMDGPU: Add intrinsics for DS GWS semaphore instructions	Matt Arsenault	2019-06-20	5	-25/+72
\| \| \| \|	llvm-svn: 363983
*	AMDGPU: Insert mem_viol check loop around GWS pre-GFX9	Matt Arsenault	2019-06-20	5	-19/+129
\| \| \| \| \| \| \|	It is necessary to emit this loop around GWS operations in case the wave is preempted pre-GFX9. llvm-svn: 363979
*	AMDGPU: Fix ignoring DisableFramePointerElim in leaf functions	Matt Arsenault	2019-06-20	1	-11/+7
\| \| \| \| \| \| \| \|	The attribute can specify elimination for leaf or non-leaf, so it should always be considered. I copied this bug from AArch64, which probably should also be fixed. llvm-svn: 363949
*	AMDGPU: Treat undef as an inline immediate	Matt Arsenault	2019-06-20	2	-5/+19
\| \| \| \| \| \| \|	This should only matter in vectors with an undef component, since a full undef vector would have been folded out. llvm-svn: 363941
*	[AMDGPU] gfx1010 core wave32 changes	Stanislav Mekhanoshin	2019-06-20	10	-40/+56
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D63204 llvm-svn: 363934
*	AMDGPU: Don't clobber VCC in MUBUF addr64 emulation	Matt Arsenault	2019-06-20	1	-9/+16
\| \| \| \| \| \| \| \| \|	Introducing VCC defs during SIFixSGPRCopies is generally problematic. Avoid it by starting with the VOP3 form with the general condition register. This is the easiest to fix instance, but doesn't solve any specific problems I'm looking at. llvm-svn: 363904
*	AMDGPU: Consolidate some getGeneration checks	Matt Arsenault	2019-06-19	9	-31/+82
\| \| \| \| \| \| \| \|	This is incomplete, and ideally these would all be removed, but it's better to localize them to the subtarget first with comments about what they're for. llvm-svn: 363902
*	AMDGPU: Undo sub x, c canonicalization for v2i16	Matt Arsenault	2019-06-19	3	-26/+87
\| \| \| \| \| \|	Should avoid regression from D62341 llvm-svn: 363899