bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[OpenCL] Add missing tests for getOCLTypeName	Yaxun Liu	2016-08-04	1	-1/+1
	Adding missing tests for OCL type names for half, float, double, char, short, long, and unknown. Patch by Aaron En Ye Shi. Differential Revision: https://reviews.llvm.org/D22964 llvm-svn: 277759
*	MachineFunction: Return reference for getFrameInfo(); NFC	Matthias Braun	2016-07-28	1	-2/+2
	getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. llvm-svn: 277017
*	AMDGPU: Minor AsmPrinter cleanups	Matt Arsenault	2016-07-26	1	-79/+84
	llvm-svn: 276804
*	AMDGPU: Make AMDGPUMachineFunction fields private	Matt Arsenault	2016-07-26	1	-5/+6
	ABIArgOffset is a problem because properly setting the KernArgSize requires that the reserved area before the real kernel arguments be correctly aligned, which requires fixing clover. llvm-svn: 276766
*	AMDGPU: Delete more dead code	Matt Arsenault	2016-07-22	1	-2/+6
	Remove dead code from r600 intrinsic removal. Remove unset members, rename StackSize to be less ambiguous. llvm-svn: 276436
*	AMDGPU: Fix bug causing crash due to invalid opencl version metadata.	Yaxun Liu	2016-07-20	1	-9/+13
	Differential Revision: https://reviews.llvm.org/D22526 llvm-svn: 276119
*	Re-commit [AMDGPU] Add metadata for runtime	Yaxun Liu	2016-07-16	1	-0/+229
	Attempting to fix lit test failure on ppc. llvm-svn: 275676
*	Revert "[AMDGPU] Add metadata for runtime"	Vitaly Buka	2016-07-15	1	-229/+0
	This reverts commit r275566. llvm-svn: 275599
*	[AMDGPU] Add metadata for runtime	Yaxun Liu	2016-07-15	1	-0/+229
	Added emitting metadata to elf for runtime. Runtime requires certain information (metadata) about kernels to be able to execute and query them. Such information is emitted to an elf section as a key-value pair stream. Differential Revision: https://reviews.llvm.org/D21849 llvm-svn: 275566
*	AMDGPU/SI: Emit the number of SGPR and VGPR spills	Marek Olsak	2016-07-13	1	-0/+5
	Summary: v2: don't count SGPRs spilled to scratch twice. I think this is sufficient. It doesn't count private memory usage, which happens often and uses scratch but isn't technically a spill. The private memory usage can be computed by: [scratch_per_thread - vgpr_spills - a random multiple of SGPR spills]. The fact that SGPR spills add very high numbers to the scratch size makes that computation a guessing game, but I don't have a solution to that. Reviewers: tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D22197 llvm-svn: 275288
*	AMDGPU/SI: Add support for R_AMDGPU_GOTPCREL	Tom Stellard	2016-07-13	1	-1/+1
	Reviewers: rafael, ruiu, tony-tye, arsenm, kzhuravl Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21484 llvm-svn: 275268
*	[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header	Konstantin Zhuravlyov	2016-06-25	1	-0/+27
	Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.
	Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
	- offset 0: work group ID x
	- offset 4: work group ID y
	- offset 8: work group ID z
	- offset 16: work item ID x
	- offset 20: work item ID y
	- offset 24: work item ID z
	Set
	- amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
	- amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
	- amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled
	Differential Revision: http://reviews.llvm.org/D20335 llvm-svn: 273769
*	AMDGPU: Cleanup subtarget handling.	Matt Arsenault	2016-06-24	1	-16/+13
	Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
*	Generalize DiagnosticInfoStackSize to support other limits	Matt Arsenault	2016-06-20	1	-3/+11
	Backends may want to report errors on resources other than stack size. llvm-svn: 273177
*	AMDGPU: Use correct method for determining instruction size	Matt Arsenault	2016-06-20	1	-2/+4
	llvm-svn: 273172
*	[AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegs	Konstantin Zhuravlyov	2016-05-24	1	-5/+6
	Differential Revision: http://reviews.llvm.org/D20081 llvm-svn: 270594
*	AMDGPU/SI: Add support for AMD code object version 2.	Tom Stellard	2016-05-05	1	-45/+5
	Summary: Version 2 is now the default. If you want to emit version 1, use the amdgcn--amdhsa-amdcov1 triple. Reviewers: arsenm, kzhuravl Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19283 llvm-svn: 268647
*	AMDGPU: Emit error if too much LDS is used	Matt Arsenault	2016-04-28	1	-0/+5
	llvm-svn: 267922
*	[AMDGPU] Move reserved vgpr count for trap handler usage to SIMachineFunctionInfo + minor commenting changes	Konstantin Zhuravlyov	2016-04-26	1	-3/+3
	Differential Revision: http://reviews.llvm.org/D19537 llvm-svn: 267573
*	[AMDGPU] Reserve VGPRs for trap handler usage if instructed	Konstantin Zhuravlyov	2016-04-26	1	-0/+15
	Differential Revision: http://reviews.llvm.org/D19235 llvm-svn: 267563
*	AMDGPU/SI: SGPR accounting in getSIProgramInfo must ignore exec_lo/hi	Nicolai Haehnle	2016-04-19	1	-0/+2
	Summary: A shader stored the live mask (initial exec mask) in an SGPR which was then spilled during register allocation. The allocator quite reasonably turned the spill into
		v_writelane_b32 %vgpr, exec_lo, N
		v_writelane_b32 %vgpr, exec_hi, N+1
	at the beginning of the shader, confusing the SGPR accounting. No test case, because si-sgpr-spill.ll together with an upcoming patch for WQM handling exhibits the problem. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19199 llvm-svn: 266824
*	AMDGPU: Include LDS size in printed comment	Matt Arsenault	2016-04-14	1	-0/+2
	llvm-svn: 266382
*	[AMDGPU][llvm-mc] Support of Trap Handler registers (TTMP0..11 and TBA/TMA)	Artem Tamazov	2016-04-13	1	-0/+15
	Tests added along with implemented feature. Note that there is a small leftover of unnecessary MI scheduling issue (more info in the review). CodeGen/AMDGPU/salu-to-valu.ll updated to fix the false regression. TODO: Support for TTMP quads, comma-separated syntax in "[]" and more. Differential Revision: http://reviews.llvm.org/D17825 llvm-svn: 266205
*	AMDGPU: Add a shader calling convention	Nicolai Haehnle	2016-04-06	1	-21/+21
	This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
*	[AMDGPU] Emit linkonce and linkonce_odr symbols	Konstantin Zhuravlyov	2016-04-05	1	-0/+2
	Differential Revision: http://reviews.llvm.org/D18726 llvm-svn: 265408
*	Silencing warnings from MSVC 2015 Update 2. All of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC.	Aaron Ballman	2016-03-30	1	-2/+2
	llvm-svn: 264929
*	AMDGPU: Don't use estimated stack size when we know the real stack size	Matt Arsenault	2016-03-01	1	-1/+1
	llvm-svn: 262297
*	AMDGPU: Set element_size in private resource descriptor	Matt Arsenault	2016-02-12	1	-0/+19
	Introduce a subtarget feature for this, and leave the default with the current behavior which assumes up to 16-byte loads/stores can be used. The field also seems to have the ability to be set to 2 bytes, but I'm not sure what that would be used for. llvm-svn: 260651
*	AMDGPU: Set DX10Clamp bit	Matt Arsenault	2016-01-28	1	-3/+2
	llvm-svn: 259088
*	Update to use new name alignTo().	Rui Ueyama	2016-01-14	1	-4/+5
	llvm-svn: 257804
*	AMDGPU/SI: Add new target attribute InitialPSInputAddr	Marek Olsak	2016-01-13	1	-1/+3
	Summary: This allows Mesa to pass initial SPI_PS_INPUT_ADDR to LLVM. The register assigns VGPR locations to PS inputs, while the ENA register determines whether or not they are loaded. Mesa needs to set some inputs as not-movable, so that a pixel shader prolog binary appended at the beginning can assume where some inputs are. v2: Make PSInputAddr private, because there are never enough silly getters and setters for people to read. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16030 llvm-svn: 257591
*	AMDGPU: Emit note directive for HSA even if there are no functions	Tom Stellard	2016-01-12	1	-7/+19
	Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16010 llvm-svn: 257488
*	AMDGPU/SI: Emit global variable sizes when targeting HSA	Tom Stellard	2016-01-08	1	-1/+5
	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15952 llvm-svn: 257173
*	AMDGPU/SI: xnack_mask is always reserved on VI	Nicolai Haehnle	2016-01-07	1	-5/+13
	Summary: Somehow, I first interpreted the docs as saying space for xnack_mask is only reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and went back to actually test what is happening, and it turns out that xnack_mask is always reserved at least on Tonga and Carrizo, in the sense that flat_scr is always fixed below the SGPRs that are used to implement xnack_mask, whether or not they are actually used.
	I confirmed this by writing a shader using inline assembly to tease out the aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so xnack_mask is s[76:77] and vcc is s[78:79]).
	This patch changes both the calculation of the total number of SGPRs and the various register reservations to account for this. It ought to be possible to use the gap left by xnack_mask when the feature isn't used, but this patch doesn't try to do that. (Note that the same applies to vcc.)
	Note that previously, even before my earlier change in r256794, the SGPRs that alias to xnack_mask could end up being used as well when flat_scr was unused and the total number of SGPRs happened to fall on the right alignment (e.g. highest regular SGPR being used s29 and VCC used would lead to number of SGPRs being 32, where s28 and s29 alias with xnack_mask). So if there were some conflict due to such aliasing, we should have noticed that already.
	Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15898 llvm-svn: 257073
*	AMDGPU: add +xnack feature	Nicolai Haehnle	2016-01-04	1	-5/+8
	Summary: Enabling this feature will account for the two SGPRs used by the hardware to store the XNACK_MASK physically. The hardware only requires this reservation when the XNACK feature is explicitly enabled. At some point, HSA will probably want to do that, but it does increase SGPR register pressure, so leave it disabled by default for now (but do add a small test). Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15869 llvm-svn: 256794
*	AMDGPU/SI: Reserve appropriate number of sgprs for flat scratch init.	Tom Stellard	2015-12-17	1	-2/+6
	Reviewers: tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15583 Patch by: Changpeng Fang llvm-svn: 255908
*	AMDGPU/SI: Set the code object work group segment size when targeting HSA	Tom Stellard	2015-12-15	1	-0/+1
	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15493 llvm-svn: 255702
*	AMDGPU/SI: Set the code objects private segment size when targeting HSA.	Tom Stellard	2015-12-15	1	-0/+1
	Summary: I'm not sure how things worked before without this. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15492 llvm-svn: 255692
*	AMDGPU/SI: Emit constant variables in the .hsatext section when targeting HSA	Tom Stellard	2015-12-15	1	-2/+6
	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15426 llvm-svn: 255689
*	AMDGPU/SI: Emit constant arrays in the .text section	Tom Stellard	2015-12-10	1	-10/+0
	Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 llvm-svn: 255204
*	AMDGPU/SI: Emit constant arrays in the .hsrodata_readonly_agent section	Tom Stellard	2015-12-03	1	-1/+1
	Summary: This is done only when targeting HSA. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13807 llvm-svn: 254587
*	AMDGPU/SI: Correctly emit agent global segment variables when targeting HSA	Tom Stellard	2015-12-02	1	-2/+37
	Differential Revision: http://reviews.llvm.org/D14508 llvm-svn: 254540
*	AMDGPU/SI: Don't emit group segment global variables	Tom Stellard	2015-12-02	1	-0/+7
	Summary: Only global or readonly segment variables should appear in object files. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15111 llvm-svn: 254519
*	AMDGPU: Error if too many user SGPRs used	Matt Arsenault	2015-11-30	1	-0/+5
	llvm-svn: 254332
*	AMDGPU: Rework how private buffer passed for HSA	Matt Arsenault	2015-11-30	1	-11/+55
	If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed in and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the prologue. This also only selectively enables all of the input registers which are really required instead of always enabling them. llvm-svn: 254331
*	AMDGPU: Add llvm.amdgcn.dispatch.ptr intrinsic	Tom Stellard	2015-11-26	1	-0/+3
	Summary: This returns a pointer to the dispatch packet, which can be used to load information about the kernel dispatch. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D14898 llvm-svn: 254116
*	AMDGPU: Print more fields in comments	Matt Arsenault	2015-11-11	1	-3/+14
	llvm-svn: 252677
*	AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNEL	Tom Stellard	2015-11-06	1	-0/+13
	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13804 llvm-svn: 252291
*	AMDGPU: Print number user SGPRs	Matt Arsenault	2015-11-05	1	-0/+6
	This doesn't quite match how SC prints it, which doesn't put it in a comment. llvm-svn: 252144
*	AMDGPU: Merge if and switch	Matt Arsenault	2015-10-01	1	-14/+17
	llvm-svn: 249082