bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	AMDGPU: Use generic bitreverse intrinsic	Matt Arsenault	2015-12-14	1	-0/+1
\| \| \| \| \| \|	Also fix bug in vector legalization for bitreverse. llvm-svn: 255512
*	AMDGPU/SI: Emit constant arrays in the .text section	Tom Stellard	2015-12-10	1	-13/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 llvm-svn: 255204
*	AMDGPU/SI: Add support for sgpr and vgpr inline assembly constraints	Tom Stellard	2015-12-10	1	-6/+47
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs. Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15342 llvm-svn: 255203
*	AMDGPU: Implement isNoopAddrSpaceCast	Matt Arsenault	2015-12-01	1	-0/+11
\| \| \| \|	llvm-svn: 254468
*	AMDGPU/SI: Remove REGISTER_STORE/REGISTER_LOAD code which is now dead	Tom Stellard	2015-12-01	1	-16/+0
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15050 llvm-svn: 254427
*	AMDGPU: Fix unused function	Matt Arsenault	2015-11-30	1	-5/+0
\| \| \| \|	llvm-svn: 254333
*	AMDGPU: Rework how private buffer passed for HSA	Matt Arsenault	2015-11-30	1	-32/+128
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed in and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the prologue. This also only selectively enables all of the input registers which are really required instead of always enabling them. llvm-svn: 254331
*	AMDGPU: Rename enums to be consistent with HSA code object terminology	Matt Arsenault	2015-11-30	1	-16/+10
\| \| \| \|	llvm-svn: 254330
*	AMDGPU: Remove SIPrepareScratchRegs	Matt Arsenault	2015-11-30	1	-11/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed. The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex. Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions. The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs. llvm-svn: 254329
*	AMDGPU: Use assert zext for workgroup sizes	Matt Arsenault	2015-11-30	1	-10/+21
\| \| \| \|	llvm-svn: 254328
*	AMDGPU: Don't reserve SCRATCH_PTR input register	Matt Arsenault	2015-11-30	1	-12/+4
\| \| \| \| \| \|	This hasn't been doing anything since using relocations was added. llvm-svn: 254304
*	AMDGPU: Add llvm.amdgcn.dispatch.ptr intrinsic	Tom Stellard	2015-11-26	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This returns a pointer to the dispatch packet, which can be used to load information about the kernel dispatch. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D14898 llvm-svn: 254116
*	AMDGPU: Make v2i64/v2f64 legal types.	Matt Arsenault	2015-11-25	1	-1/+43
\| \| \| \| \| \| \|	They can be loaded and stored, so count them as legal. This is mostly to fix a number of common cases for load/store merging. llvm-svn: 254086
*	AMDGPU: Split LDS vector loads	Matt Arsenault	2015-11-24	1	-1/+2
\| \| \| \| \| \|	If properly aligned this could allow using ds_read_b64. llvm-svn: 253975
*	AMDGPU: Split x8 and x16 vector loads instead of scalarize	Matt Arsenault	2015-11-24	1	-1/+5
\| \| \| \| \| \| \| \|	The one regression in the builtin tests is in the read2 test which now (again) has many extra copies, but this should be solved once the pass is replaced with a DAG combine. llvm-svn: 253974
*	AMDGPU: Error on graphics shaders with HSA	Matt Arsenault	2015-11-02	1	-0/+8
\| \| \| \| \| \| \| \|	I've found myself pointlessly debugging problems from running graphics tests with an HSA triple a few times, so stop this from happening again. llvm-svn: 251858
*	AMDGPU/SI: handle undef for llvm.SI.packf16	Marek Olsak	2015-10-29	1	-0/+4
\| \| \| \|	llvm-svn: 251632
*	AMDGPU: Simplify VOP3 operand legalization.	Matt Arsenault	2015-10-21	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was checking for a variety of situations that should never happen. This saves a tiny bit of compile time. We should not be selecting instructions with invalid operands in the first place. Most of the time, for registers, copies are inserted to the correct operand register class. For VOP3, since all operand types are supported and literal constants never are, we just need to verify the constant bus requirements (all immediates should be legal inline ones). The only possibly tricky case to worry about is when legalizing operands in moveToVALU with s_add_i32 and similar instructions. If the original s_add_i32 had a literal constant and we need to replace it with v_add_i32_e64, we would have an unsupported literal operand. However, I don't think we should worry about that because SIFoldOperands should handle folding literal constant operands into the SALU instructions based on the uses. At SIFoldOperands time, the legality and profitability of operand types is a bit different. llvm-svn: 250951
*	AMDGPU: Add MachineInstr overloads for instruction format tests	Matt Arsenault	2015-10-20	1	-1/+1
\| \| \| \|	llvm-svn: 250797
*	AMDGPU/SI: Remove calling convention assertion from LowerFormalArguments()	Tom Stellard	2015-10-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We currently ignore the calling convention, so there is no real reason to assert on the calling convention of functions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13367 llvm-svn: 249468
*	AMDGPU/SI: Don't set DATA_FORMAT if ADD_TID_ENABLE is set	Marek Olsak	2015-09-29	1	-3/+1
\| \| \| \| \| \| \| \| \| \|	to prevent setting a huge stride, because DATA_FORMAT has a different meaning if ADD_TID_ENABLE is set. This is a candidate for stable llvm 3.7. Tested-and-Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 248858
*	AMDGPU: Re-justify workaround and fix worked around problem	Matt Arsenault	2015-09-25	1	-38/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When buffer resource descriptors were built, the upper two components of the descriptor were first composed into a 64-bit register because legalizeOperands assumed all operands had the same register class. Fix that problem, but keep the workaround. I'm not sure anything is actually emitting such a REG_SEQUENCE now. If multiple resource descriptors are set up with different base pointers, this is copied with a single s_mov_b64. We probably should fix this better by recognizing a pair of s_mov_b32 later, but for now delete the dead code. llvm-svn: 248585
*	Reformat comment lines.	NAKAMURA Takumi	2015-09-22	1	-1/+1
\| \| \| \|	llvm-svn: 248262
*	Use makeArrayRef or None to avoid unnecessarily mentioning the ArrayRef type extra times. NFC	Craig Topper	2015-09-21	1	-1/+1
\| \| \| \| \| \|	llvm-svn: 248140
*	propagate fast-math-flags on DAG nodes	Sanjay Patel	2015-09-16	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815
*	SelectionDAG: Support Expand of f16 extloads	Matt Arsenault	2015-09-09	1	-29/+3
\| \| \| \| \| \| \| \| \| \|	Currently this hits an assert that extload should always be supported, which assumes integer extloads. This moves a hack out of SI's argument lowering and is covered by existing tests. llvm-svn: 247113
*	check for fastness before merging in DAGCombiner::MergeConsecutiveStores()	Sanjay Patel	2015-09-03	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use and check the 'IsFast' optional parameter to TLI.allowsMemoryAccess() any time we have a merged access candidate. Without this patch, we were generating unaligned 16-byte (SSE) memops for x86 targets where those accesses are slow. This change was mentioned in: http://reviews.llvm.org/D10662 and http://reviews.llvm.org/D10905 and will help solve PR21711. Differential Revision: http://reviews.llvm.org/D12573 llvm-svn: 246771
*	Fix some comment typos.	Benjamin Kramer	2015-08-08	1	-5/+5
\| \| \| \|	llvm-svn: 244402
*	AMDGPU: Assume SMRD access for constant address space	Matt Arsenault	2015-08-07	1	-40/+75
\| \| \| \| \| \| \|	Since r243294 these are selected to SMRD and moved later if required. llvm-svn: 244354
*	De-constify pointers to Type since they can't be modified. NFC	Craig Topper	2015-08-01	1	-1/+1
\| \| \| \| \| \|	This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842
*	AMDGPU: Fix v16i32 to v16i8 truncstore	Matt Arsenault	2015-07-31	1	-0/+1
\| \| \| \|	llvm-svn: 243731
*	AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops	Tom Stellard	2015-07-20	1	-6/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The MUBUF addr64 bit has been removed on VI, so we must use FLAT instructions when the pointer is stored in VGPRs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11067 llvm-svn: 242673
*	AMDGPU/SI: Use AssertZext node to mask high bit for scratch offsets	Tom Stellard	2015-07-16	1	-2/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We can safely assume that the high bit of scratch offsets will never be set, because this would require at least 128 GB of GPU memory. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11225 llvm-svn: 242433
*	AMDGPU: Fix chains for memory ops dependent on argument loads	Matt Arsenault	2015-07-10	1	-4/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most loads and stores are derived from pointers derived from a kernel argument load inserted during argument lowering. This was just using the EntryToken chain for the argument loads, and any users of these loads were also on the EntryToken chain. Return the chain of the lowered argument load so that dependent loads end up on the correct chain. No test since I'm not aware of any case where this actually broke. llvm-svn: 241960
*	AMDGPU: Use requested chain when lowering arguments	Matt Arsenault	2015-07-10	1	-1/+1
\| \| \| \| \| \| \|	No test since I'm not aware of any case where this will end up being a different chain. llvm-svn: 241954
*	AMDGPU: Add helper function for implicit parameter offsets.	Tom Stellard	2015-07-09	1	-2/+2
\| \| \| \| \| \|	Patch by: Zoltan Gilian llvm-svn: 241861
*	Re-instate the EVT parameter to getScalarShiftAmountTy() for OOT user	Mehdi Amini	2015-07-09	1	-1/+1
\| \| \| \| \| \| \|	A documentation for this function would be nice by the way. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241807
*	Remove getDataLayout() from TargetLowering	Mehdi Amini	2015-07-09	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779
*	Make isLegalAddressingMode() taking DataLayout as an argument	Mehdi Amini	2015-07-09	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11040 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241778
*	Make TargetLowering::getShiftAmountTy() taking DataLayout as an argument	Mehdi Amini	2015-07-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11037 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241776
*	Make TargetLowering::getPointerTy() taking DataLayout as an argument	Mehdi Amini	2015-07-09	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775
*	[TargetLowering] StringRefize asm constraint getters.	Benjamin Kramer	2015-07-05	1	-2/+1
\| \| \| \| \| \| \| \|	There is some functional change here because it changes target code from atoi(3) to StringRef::getAsInteger which has error checking. For valid constraints there should be no difference. llvm-svn: 241411
*	AMDGPU/SI: There are no implicit kernel args in the amdhsa ABI	Tom Stellard	2015-06-26	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10706 llvm-svn: 240830
*	AMDGPU: Use getAsInteger instead of atoi	Matt Arsenault	2015-06-23	1	-3/+5
\| \| \| \|	llvm-svn: 240365
*	R600 -> AMDGPU rename	Tom Stellard	2015-06-13	1	-0/+2241
\| \| \| \|	llvm-svn: 239657
*	Revert "AMDGPU: Add core backend files for R600/SI codegen v6"	Tom Stellard	2012-07-16	1	-195/+0
\| \| \| \| \| \|	This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea. llvm-svn: 160303
*	AMDGPU: Add core backend files for R600/SI codegen v6	Tom Stellard	2012-07-16	1	-0/+195
	llvm-svn: 160270