bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	AMDGPU : Add trap handler support.	Wei Ding	2017-02-10	1	-0/+6
	Differential Revision: http://reviews.llvm.org/D26010
	llvm-svn: 294692
*	Re-commit AMDGPU/GlobalISel: Add support for simple shaders	Tom Stellard	2017-01-30	1	-0/+1
	Fix build when global-isel is disabled and fix a warning.
	Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
	Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
	Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris
	Differential Revision: https://reviews.llvm.org/D26730
	llvm-svn: 293551
*	Revert "AMDGPU/GlobalISel: Add support for simple shaders"	Tom Stellard	2017-01-30	1	-1/+0
	This reverts commit r293503.
	Revert while I investigate some of the buildbot failures.
	llvm-svn: 293509
*	AMDGPU/GlobalISel: Add support for simple shaders	Tom Stellard	2017-01-30	1	-0/+1
	Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
	Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
	Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris
	Differential Revision: https://reviews.llvm.org/D26730
	llvm-svn: 293503
*	AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands	Matt Arsenault	2017-01-27	1	-9/+2
	Accomplishes what r292982 was supposed to, which ended up only really making the necessary test changes.
	This should be applied to the 4.0 branch.
	Patch by Vedran Miletić <vedran@miletic.net>
	llvm-svn: 293310
*	Enable FeatureFlatForGlobal on Volcanic Islands	Matt Arsenault	2017-01-24	1	-2/+15
	This switches to the workaround that HSA defaults to for the mesa path.
	This should be applied to the 4.0 branch.
	Patch by Vedran Miletić <vedran@miletic.net>
	llvm-svn: 292982
*	AMDGPU: Combine fp16/fp64 subtarget features	Matt Arsenault	2017-01-23	1	-9/+20
	The same control register controls both, and both are set to the same defaults.
	Keep the old names around as aliases.
	llvm-svn: 292837
*	[AMDGPU] Add subtarget features for SDWA/DPP	Sam Kolton	2017-01-20	1	-1/+20
	Reviewers: vpykhtin, artem.tamazov, tstellarAMD
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D28900
	llvm-svn: 292596
*	AMDGPU/SI: Remove XNACK feature from CI	Marek Olsak	2016-12-09	1	-2/+1
	Summary: CI doesn't have XNACK.
	Reviewers: tstellarAMD
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D27175
	llvm-svn: 289263
*	AMDGPU/SI: Don't reserve XNACK when it's disabled	Marek Olsak	2016-12-09	1	-0/+7
	Summary: This frees 2 additional scalar registers.
	These are results from all of my 3 patches combined:
	Polaris:
	Spilled SGPRs: 2231 -> 1517 (-32.00 %)
	Tonga:
	Spilled SGPRs: 3829 -> 2608 (-31.89 %)
	Spilled VGPRs: 100 -> 84 (-16.00 %)
	Tonga even spills SGPRs via VGPRs to scratch. That's a compute shader limited to 64 VGPRs.
	Reviewers: tstellarAMD
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D27151
	llvm-svn: 289262
*	[AMDGPU] Add f16 support (VI+)	Konstantin Zhuravlyov	2016-11-13	1	-0/+6
	Differential Revision: https://reviews.llvm.org/D25975
	llvm-svn: 286753
*	AMDGPU: Add VI i16 support	Tom Stellard	2016-11-10	1	-0/+2
	Patch By: Wei Ding
	Differential Revision: https://reviews.llvm.org/D18049
	llvm-svn: 286464
*	Revert "AMDGPU: Add VI i16 support"	Tom Stellard	2016-11-04	1	-2/+0
	This reverts commit r285939 and r285948.
	These broke some conformance tests.
	llvm-svn: 285995
*	AMDGPU: Add VI i16 support	Tom Stellard	2016-11-03	1	-0/+2
	Patch By: Wei Ding
	Differential Revision: https://reviews.llvm.org/D18049
	llvm-svn: 285939
*	AMDGPU: Whitespace fixes	Matt Arsenault	2016-11-01	1	-9/+9
	llvm-svn: 285659
*	AMDGPU: Use 1/2pi inline imm on VI	Matt Arsenault	2016-10-29	1	-1/+7
	I'm guessing at how it is supposed to be printed
	llvm-svn: 285490
*	AMDGPU: Add definitions for scalar store instructions	Matt Arsenault	2016-10-28	1	-1/+8
	Also add glc bit to the scalar loads since they exist on VI and change the caching behavior.
	This currently has an assembler bug where the glc bit is incorrectly accepted on SI/CI which do not have it.
	llvm-svn: 285463
*	AMDGPU: Refactor processor definition to use ISA version features	Yaxun Liu	2016-10-26	1	-15/+53
	Add missing ISA versions 7.0.2/8.0.4/8.1.0 to the backend.
	Refactor processor definition to use ISA version features.
	Fixed ISA version for stoney.
	Based on Laurent Morichetti's patch.
	Differential Revision: https://reviews.llvm.org/D25919
	llvm-svn: 285210
*	AMDGPU/SI: Don't allow unaligned scratch access	Tom Stellard	2016-10-14	1	-0/+6
	Summary: The hardware doesn't support this.
	Reviewers: arsenm
	Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
	Differential Revision: https://reviews.llvm.org/D25523
	llvm-svn: 284257
*	AMDGPU: Add instruction definitions for VGPR indexing	Matt Arsenault	2016-10-12	1	-3/+15
	VI added a second method of indexing into VGPRs besides using v_movrel*
	llvm-svn: 284027
*	AMDGPU/SI: Update ISA version numbers for Tonga and Polaris10/11.	Changpeng Fang	2016-10-11	1	-0/+1
	Differential Revision: http://reviews.llvm.org/D25454
	Reviewers: tstellarAMD
	llvm-svn: 283893
*	[AMDGPU] Enable changing instprinter's behavior based on the per-function subtarget	Konstantin Zhuravlyov	2016-09-27	1	-0/+5
	This is a prerequisite for coming waitcnt changes
	Differential Revision: https://reviews.llvm.org/D24939
	llvm-svn: 282489
*	[AMDGPU] Assembler: Move disabled SDWA and DPP instruction into Disable asm variant	Sam Kolton	2016-09-12	1	-0/+2
	Summary: This removes disabled instructions from match tables so we will not match them at all.
	Reviewers: tstellarAMD, vpykhtin, artem.tamazov
	Subscribers: wdng, nhaehnle, arsenm
	Differential Revision: https://reviews.llvm.org/D24452
	llvm-svn: 281216
*	[AMDGPU] Assembler: match e32 VOP instructions before e64.	Sam Kolton	2016-09-09	1	-0/+35
	Summary: Split assembler match table in 4 tables with assembler variants:
	- Default - all instructions except VOP3, SDWA and DPP
	- VOP3
	- SDWA
	- DPP
	First match Default table then VOP3, SDWA and DPP.
	Reviewers: tstellarAMD, artem.tamazov, vpykhtin
	Subscribers: arsenm, wdng, nhaehnle, AMDGPU
	Differential Revision: https://reviews.llvm.org/D24252
	llvm-svn: 281023
*	AMDGPU: Add feature for unaligned access	Matt Arsenault	2016-07-01	1	-0/+6
	llvm-svn: 274398
*	AMDGPU: Move subtarget feature checks into passes	Matt Arsenault	2016-06-27	1	-6/+0
	llvm-svn: 273937
*	[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header	Konstantin Zhuravlyov	2016-06-25	1	-0/+7
	Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.
	Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
	- offset 0: work group ID x
	- offset 4: work group ID y
	- offset 8: work group ID z
	- offset 16: work item ID x
	- offset 20: work item ID y
	- offset 24: work item ID z
	Set
	- amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
	- amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
	- amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled
	Differential Revision: http://reviews.llvm.org/D20335
	llvm-svn: 273769
*	AMDGPU: Remove disable-irstructurizer subtarget feature	Matt Arsenault	2016-06-24	1	-6/+0
	The only real reason to use it is for testing, so replace it with a command line option instead of a potentially function dependent feature.
	llvm-svn: 273653
*	[AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegs	Konstantin Zhuravlyov	2016-05-24	1	-4/+4
	Differential Revision: http://reviews.llvm.org/D20081
	llvm-svn: 270594
*	[AMDGPU] Update nop insertion for debugger usage	Konstantin Zhuravlyov	2016-05-13	1	-1/+1
	- Insert one nop for each high level statement instead of two
	- Do not insert nop before prologue
	Differential Revision: http://reviews.llvm.org/D20215
	llvm-svn: 269452
*	[AMDGPU] Reserve VGPRs for trap handler usage if instructed	Konstantin Zhuravlyov	2016-04-26	1	-0/+7
	Differential Revision: http://reviews.llvm.org/D19235
	llvm-svn: 267563
*	[AMDGPU] Add insert nops pass based on subtarget features instead of cl::opt	Konstantin Zhuravlyov	2016-04-18	1	-0/+11
	Also,
	- Skip pass if machine module does not have debug info
	- Minor comment changes
	- Added test
	Differential Revision: http://reviews.llvm.org/D19079
	llvm-svn: 266626
*	AMDGPU: More bits of frame index are known to be zero	Matt Arsenault	2016-02-27	1	-8/+0
	The maximum private allocation for the whole GPU is 4G, so the maximum possible index for a single workitem is the maximum size divided by the smallest granularity for a dispatch.
	This increases the number of known zero high bits, which enables more offset folding.
	The maximum private size per workitem with this is 128M but may be smaller still.
	llvm-svn: 262153
*	AMDGPU: Split vi-insts subtarget feature	Matt Arsenault	2016-02-27	1	-4/+12
	This will be more useful for marking builtins acceptable for which subtargets.
	llvm-svn: 262121
*	AMDGPU: Implement readcyclecounter	Matt Arsenault	2016-02-27	1	-1/+7
	This matches the behavior of the HSAIL clock instruction.
	s_memrealtime is used if the subtarget supports it, and falls back to s_memtime if not.
	Also introduces new intrinsics for each of s_memtime / s_memrealtime.
	llvm-svn: 262119
*	AMDGPU: Set element_size in private resource descriptor	Matt Arsenault	2016-02-12	1	-0/+12
	Introduce a subtarget feature for this, and leave the default with the current behavior which assumes up to 16-byte loads/stores can be used.
	The field also seems to have the ability to be set to 2 bytes, but I'm not sure what that would be used for.
	llvm-svn: 260651
*	AMDGPU: Match some med3 patterns	Matt Arsenault	2016-01-28	1	-0/+6
	llvm-svn: 259089
*	AMDGPU/SI: Stoney has only 16 LDS banks	Marek Olsak	2016-01-27	1	-1/+1
	Summary: This is a candidate for stable, along with all patches that add the "stoney" processor.
	Reviewers: tstellarAMD
	Subscribers: arsenm
	Differential Revision: http://reviews.llvm.org/D16485
	llvm-svn: 258922
*	AMDGPU: Tidy minor td file issues	Matt Arsenault	2016-01-26	1	-164/+209
	Make comments and indentation more consistent.
	Rearrange a few things to be in a more consistent order, such as organizing subtarget features from those describing an actual device property, and those used as options.
	llvm-svn: 258789
*	AMDGPU: Remove Feature64BitPtr	Matt Arsenault	2016-01-23	1	-8/+3
	This is a leftover from AMDIL that doesn't do anything and doesn't belong here.
	llvm-svn: 258606
*	AMDGPU/SI: Pass whether to use the SI scheduler via Target Attribute	Tom Stellard	2016-01-21	1	-0/+5
	Summary: Currently the SI scheduler can be selected via command line option, but it turned out it would be better if it was selectable via a Target Attribute.
	This patch adds "si-scheduler" attribute to the backend.
	Reviewers: tstellarAMD, echristo
	Subscribers: echristo, arsenm
	Differential Revision: http://reviews.llvm.org/D16192
	llvm-svn: 258386
*	AMDGPU: Add subtarget feature for instruction rates	Matt Arsenault	2016-01-18	1	-0/+6
	llvm-svn: 258085
*	AMDGPU/SI: Update ISA version for FIJI	Changpeng Fang	2016-01-13	1	-0/+1
	llvm-svn: 257666
*	AMDGPU: add +xnack feature	Nicolai Haehnle	2016-01-04	1	-0/+5
	Summary: Enabling this feature will account for the two SGPRs used by the hardware to store the XNACK_MASK physically.
	The hardware only requires this reservation when the XNACK feature is explicitly enabled.
	At some point, HSA will probably want to do that, but it does increase SGPR register pressure, so leave it disabled by default for now (but do add a small test).
	Reviewers: arsenm, tstellarAMD
	Subscribers: arsenm, llvm-commits
	Differential Revision: http://reviews.llvm.org/D15869
	llvm-svn: 256794
*	AMDGPU/SI: Fix encoding of flat instructions on VI	Tom Stellard	2015-12-24	1	-0/+5
	Reviewers: arsenm
	Subscribers: arsenm, llvm-commits
	Differential Revision: http://reviews.llvm.org/D15735
	llvm-svn: 256360
*	AMDGPU/SI: Use flat for global load/store when targeting HSA	Changpeng Fang	2015-12-22	1	-0/+5
	Summary: For some reason executing an MUBUF instruction with the addr64 bit set and a zero base pointer in the resource descriptor causes the memory operation to be dropped when the shader is executed using the HSA runtime.
	This kind of MUBUF instruction is commonly used when the pointer is stored in VGPRs. The base pointer field in the resource descriptor is set to zero and the pointer is stored in the vaddr field.
	This patch resolves the issue by only using flat instructions for global memory operations when targeting HSA. This is an overly conservative fix as all other configurations of MUBUF instructions appear to work.
	NOTE: re-commit by fixing a failure in Codegen/AMDGPU/llvm.dbg.value.ll
	Reviewers: tstellarAMD
	Subscribers: arsenm, llvm-commits
	Differential Revision: http://reviews.llvm.org/D15543
	llvm-svn: 256282
*	Revert "AMDGPU/SI: Use flat for global load/store when targeting HSA"	Rafael Espindola	2015-12-22	1	-5/+0
	This reverts commit r256273.
	It broke CodeGen/AMDGPU/llvm.dbg.value.ll
	llvm-svn: 256275
*	AMDGPU/SI: Use flat for global load/store when targeting HSA	Changpeng Fang	2015-12-22	1	-0/+5
	Summary: For some reason executing an MUBUF instruction with the addr64 bit set and a zero base pointer in the resource descriptor causes the memory operation to be dropped when the shader is executed using the HSA runtime.
	This kind of MUBUF instruction is commonly used when the pointer is stored in VGPRs. The base pointer field in the resource descriptor is set to zero and the pointer is stored in the vaddr field.
	This patch resolves the issue by only using flat instructions for global memory operations when targeting HSA. This is an overly conservative fix as all other configurations of MUBUF instructions appear to work.
	Reviewers: tstellarAMD
	Subscribers: arsenm, llvm-commits
	Differential Revision: http://reviews.llvm.org/D15543
	llvm-svn: 256273
*	AMDPGU/SI: Use AssertZext node to mask high bit for scratch offsets	Tom Stellard	2015-07-16	1	-0/+5
	Summary: We can safely assume that the high bit of scratch offsets will never be set, because this would require at least 128 GB of GPU memory.
	Reviewers: arsenm
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D11225
	llvm-svn: 242433
*	AMDGPU/SI: Add debugging subtarget feature for DS offsets	Matt Arsenault	2015-07-06	1	-0/+10
	We don't have a good way to detect most situations where DS offsets are usable on SI, so add an option to force using them even if unsafe for debugging performance problems.
	llvm-svn: 241462