bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	AMDGPU: Make auto waitcnt before barrier a feature	Konstantin Zhuravlyov	2017-06-02	1	-0/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D33793 llvm-svn: 304571
*	AMDGPU: Add new subtarget features for gfx9 flat instructions	Matt Arsenault	2017-05-10	1	-0/+3
\| \| \| \| \| \| \|	Flat instructions gain an immediate offset, and 2 new sets of segment specific flat instructions are added. llvm-svn: 302729
*	[AMDGPU] Generate range metadata for workitem id	Stanislav Mekhanoshin	2017-04-12	1	-0/+60
\| \| \| \| \| \| \| \| \|	If workgroup size is known inform llvm about range returned by local id and local size queries. Differential Revision: https://reviews.llvm.org/D31804 llvm-svn: 300102
*	AMDGPU: Fix crash when disassembling VOP3 mac	Matt Arsenault	2017-04-10	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \|	The unused dummy src2_modifiers is missing, so it crashes when trying to print it. I tried to fully remove src2_modifiers, but there are some irritations in the places where it is converted to mad since it starts to require modifying use lists while iterating over them. llvm-svn: 299861
*	[AMDGPU] Get address space mapping by target triple environment	Yaxun Liu	2017-03-27	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple. The basic idea is to use struct AMDGPUAS to represent address space values. For address space values which are not depend on target triple, use static const members, so that they don't occupy extra memory space and is equivalent to a compile time constant. Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as member to a pass and created at the beginning of the run* function. Differential Revision: https://reviews.llvm.org/D31284 llvm-svn: 298846
*	AMDGPU: Add VOP3P instruction format	Matt Arsenault	2017-02-27	1	-0/+1
\| \| \| \| \| \| \| \|	Add a few non-VOP3P but instructions related to packed. Includes hack with dummy operands for the benefit of the assembler llvm-svn: 296368
*	AMDGPU: Redefine clamp node as clamp 0.0-1.0	Matt Arsenault	2017-02-21	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	Change implementation to use max instead of add. min/max/med3 do not flush denormals regardless of the mode, so it is OK to use it whether or not they are enabled. Also allow using clamp with f16, and use knowledge of dx10_clamp. llvm-svn: 295788
*	AMDGPU: Fix assembler subtarget predicate for gfx9	Matt Arsenault	2017-02-18	1	-0/+1
\| \| \| \| \| \|	This was accepting GFX9 instructions on VI. llvm-svn: 295557
*	AMDGPU: Merge initial gfx9 support	Matt Arsenault	2017-02-18	1	-0/+1
\| \| \| \|	llvm-svn: 295554
*	Revert "[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track"	Alexander Timofeev	2017-02-14	1	-1/+3
\| \| \| \| \| \|	This reverts commit ce06d9cb99298eb844b66e117f5108a06747c907. llvm-svn: 295054
*	AMDGPU : Add trap handler support.	Wei Ding	2017-02-10	1	-1/+2
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D26010 llvm-svn: 294692
*	[AMDGPU] Calculate number of min/max SGPRs/VGPRs for WavesPerEU instead of ↵	Konstantin Zhuravlyov	2017-02-09	1	-1/+1
\| \| \| \| \| \| \| \|	using switch statement Differential Revision: https://reviews.llvm.org/D29741 llvm-svn: 294627
*	[AMDGPU] Add target information that is required by tools to metadata	Konstantin Zhuravlyov	2017-02-08	1	-80/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D28760#fb670e28 llvm-svn: 294449
*	[AMDGPU][NFC] De-tabify	Konstantin Zhuravlyov	2017-02-08	1	-1/+1
\| \| \| \|	llvm-svn: 294445
*	[AMDGPU] Move register related queries to subtarget class	Konstantin Zhuravlyov	2017-02-08	1	-5/+173
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D29318 llvm-svn: 294440
*	[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track	Alexander Timofeev	2017-02-07	1	-3/+1
\| \| \| \| \| \| \| \|	lane masks. Differential revision: https://reviews.llvm.org/D29442 llvm-svn: 294324
*	[AMDGPU] Account workgroup size in LDS occupancy limits	Stanislav Mekhanoshin	2017-02-01	1	-53/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Functions matching LDS use to occupancy return results for a workgroup of 64 workitems. The numbers has to be adjusted for bigger workgroups. For example a workgroup of size 256 already occupies 4 waves just by itself. Given that all numbers of LDS use in the compiler are per workgroup, occupancy shall be multiplied by 4 in this case. Each 64 workitems still limited by the same number, but 4 subrgoups 64 workitems each can afford 4 times more LDS to get the same occupancy. In addition change initializes LDS size in the subtarget to a real value for SI+ targets. This is required since LDS size is a variable in these calculations. Differential Revision: https://reviews.llvm.org/D29423 llvm-svn: 293837
*	AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands	Matt Arsenault	2017-01-27	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \|	Accomplishes what r292982 was supposed to, which ended up only really making the necessary test changes. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 293310
*	AMDGPU add support for spilling to a user sgpr pointed buffers	Tom Stellard	2017-01-25	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1]. Patch By: Dave Airlie Reviewers: nhaehnle, arsenm, tstellarAMD Reviewed By: arsenm Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25428 llvm-svn: 293000
*	Enable FeatureFlatForGlobal on Volcanic Islands	Matt Arsenault	2017-01-24	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
*	AMDGPU: Combine fp16/fp64 subtarget features	Matt Arsenault	2017-01-23	1	-5/+4
\| \| \| \| \| \| \|	The same control register controls both, and are set to the same defaults. Keep the old names around as aliases. llvm-svn: 292837
*	Pacify -Wreorder.	Benjamin Kramer	2017-01-20	1	-1/+1
\| \| \| \|	llvm-svn: 292599
*	[AMDGPU] Add subtarget features for SDWA/DPP	Sam Kolton	2017-01-20	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28900 llvm-svn: 292596
*	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What ↵	Eugene Zelenko	2016-12-12	1	-13/+5
\| \| \| \| \| \|	You Use warnings; other minor fixes (NFC). llvm-svn: 289475
*	[AMDGPU] Scalarization of global uniform loads.	Alexander Timofeev	2016-12-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: LC can currently select scalar load for uniform memory access basing on readonly memory address space only. This restriction originated from the fact that in HW prior to VI vector and scalar caches are not coherent. With MemoryDependenceAnalysis we can check that the memory location corresponding to the memory operand of the LOAD is not clobbered along the all paths from the function entry. Reviewers: rampitec, tstellarAMD, arsenm Subscribers: wdng, arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D26917 llvm-svn: 289076
*	[AMDGPU] Add f16 support (VI+)	Konstantin Zhuravlyov	2016-11-13	1	-0/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753
*	AMDGPU: Use 1/2pi inline imm on VI	Matt Arsenault	2016-10-29	1	-0/+2
\| \| \| \| \| \|	I'm guessing at how it is supposed to be printed llvm-svn: 285490
*	AMDGPU: Diagnose using too many SGPRs	Matt Arsenault	2016-10-28	1	-0/+10
\| \| \| \| \| \|	This is possible when using inline asm. llvm-svn: 285447
*	AMDGPU/SI: Don't allow unaligned scratch access	Tom Stellard	2016-10-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: The hardware doesn't support this. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25523 llvm-svn: 284257
*	AMDGPU: Add instruction definitions for VGPR indexing	Matt Arsenault	2016-10-12	1	-0/+2
\| \| \| \| \| \| \|	VI added a second method of indexing into VGPRs besides using v_movrel* llvm-svn: 284027
*	AMDGPU/SI: Include implicit arguments in kernarg_segment_byte_size	Tom Stellard	2016-09-23	1	-0/+9
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D24835 llvm-svn: 282223
*	[AMDGPU] Wave and register controls	Konstantin Zhuravlyov	2016-09-06	1	-0/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Implemented amdgpu-flat-work-group-size attribute - Implemented amdgpu-num-active-waves-per-eu attribute - Implemented amdgpu-num-sgpr attribute - Implemented amdgpu-num-vgpr attribute - Dynamic LDS constraints are in a separate patch Patch by Tom Stellard and Konstantin Zhuravlyov Differential Revision: https://reviews.llvm.org/D21562 llvm-svn: 280747
*	AMDGPU/SI: Implement a custom MachineSchedStrategy	Tom Stellard	2016-08-29	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: GCNSchedStrategy re-uses most of GenericScheduler, it's just uses a different method to compute the excess and critical register pressure limits. It's not enabled by default, to enable it you need to pass -misched=gcn to llc. Shader DB stats: 32464 shaders in 17874 tests Totals: SGPRS: 1542846 -> 1643125 (6.50 %) VGPRS: 1005595 -> 904653 (-10.04 %) Spilled SGPRs: 29929 -> 27745 (-7.30 %) Spilled VGPRs: 334 -> 352 (5.39 %) Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread Code Size: 36688188 -> 37034900 (0.95 %) bytes LDS: 1913 -> 1913 (0.00 %) blocks Max Waves: 254101 -> 265125 (4.34 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 1338220 -> 1438499 (7.49 %) VGPRS: 886221 -> 785279 (-11.39 %) Spilled SGPRs: 29869 -> 27685 (-7.31 %) Spilled VGPRs: 334 -> 352 (5.39 %) Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread Code Size: 34315716 -> 34662428 (1.01 %) bytes LDS: 1551 -> 1551 (0.00 %) blocks Max Waves: 188127 -> 199151 (5.86 %) Wait states: 0 -> 0 (0.00 %) Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23688 llvm-svn: 279995
*	AMDGPU: Fix crashes on memory functions	Matt Arsenault	2016-08-11	1	-1/+2
\| \| \| \|	llvm-svn: 278369
*	AMDGPU: Delete dead code	Matt Arsenault	2016-07-25	1	-26/+0
\| \| \| \|	llvm-svn: 276675
*	AMDGPU: Add feature for unaligned access	Matt Arsenault	2016-07-01	1	-2/+3
\| \| \| \|	llvm-svn: 274398
*	Target: Remove unused arguments from overrideSchedPolicy, NFC	Duncan P. N. Exon Smith	2016-07-01	1	-2/+0
\| \| \| \| \| \| \| \| \| \|	TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr* arguments (begin and end) that invite implicit conversions from MachineInstrBundleIterator. One option would be to change their type to an iterator, but since they don't seem to have been used since the API was added in 2010, I'm deleting the dead code. llvm-svn: 274304
*	AMDGPU: Fix global isel crashes	Matt Arsenault	2016-06-28	1	-2/+2
\| \| \| \|	llvm-svn: 274039
*	AMDGPU: Fix global isel build	Matt Arsenault	2016-06-28	1	-15/+3
\| \| \| \|	llvm-svn: 273964
*	AMDGPU: Implement per-function subtargets	Matt Arsenault	2016-06-27	1	-11/+1
\| \| \| \|	llvm-svn: 273940
*	AMDGPU: Move subtarget feature checks into passes	Matt Arsenault	2016-06-27	1	-1/+0
\| \| \| \|	llvm-svn: 273937
*	[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in ↵	Konstantin Zhuravlyov	2016-06-25	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the kernel code header Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue. Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format: - offset 0: work group ID x - offset 4: work group ID y - offset 8: work group ID z - offset 16: work item ID x - offset 20: work item ID y - offset 24: work item ID z Set - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled Differential Revision: http://reviews.llvm.org/D20335 llvm-svn: 273769
*	AMDGPU: Remove disable-irstructurizer subtarget feature	Matt Arsenault	2016-06-24	1	-1/+0
\| \| \| \| \| \| \| \|	The only real reason to use it is for testing, so replace it with a command line option instead of a potentially function dependent feature. llvm-svn: 273653
*	AMDGPU: Cleanup subtarget handling.	Matt Arsenault	2016-06-24	1	-100/+111
\| \| \| \| \| \| \| \| \|	Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
*	AMDGPU: Make FrameLowering stack alignment 16	Matt Arsenault	2016-06-22	1	-3/+4
\| \| \| \| \| \| \|	We don't need it to be that high. The natural alignment for a single workitem's stack is 16. llvm-svn: 273448
*	AMDGPU: Fix crashes on unknown processor name	Matt Arsenault	2016-06-02	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the processor name failed to parse for amdgcn, the resulting output would have R600 ISA in it. If the processor name was missing or invalid for R600, the wavefront size would not be set and there would be crashes from missing itinerary data. Fixes crashes in future commit caused by dividing by the unset/0 wavefront size. llvm-svn: 271561
*	AMDGPU/SI: Enable load-store-opt by default.	Changpeng Fang	2016-05-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Summary: Enable load-store-opt by default, and update LIT tests. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D20694 llvm-svn: 270894
*	[AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegs	Konstantin Zhuravlyov	2016-05-24	1	-1/+1
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D20081 llvm-svn: 270594
*	AMDGPU: Fix promote alloca pass creating huge arrays	Matt Arsenault	2016-05-16	1	-0/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was assuming it could use all memory before, which is a bad decision because it restricts occupancy. By default, only try to use enough space that could reduce occupancy to 7, an arbitrarily chosen limit. Based on the exist LDS usage, try to round up to the limit in the current tier instead of further hurting occupancy. This isn't ideal, because it doesn't accurately know how much space is going to be used for alignment padding. llvm-svn: 269708
*	AMDGPU: Change private_element_size to 4	Matt Arsenault	2016-05-11	1	-1/+1
\| \| \| \|	llvm-svn: 269145