bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Re-commit AMDGPU/GlobalISel: Add support for simple shaders	Tom Stellard	2017-01-30	1	-0/+15
	Fix build when global-isel is disabled and fix a warning. Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP. Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D26730 llvm-svn: 293551
*	Revert "AMDGPU/GlobalISel: Add support for simple shaders"	Tom Stellard	2017-01-30	1	-15/+0
	This reverts commit r293503. Revert while I investigate some of the buildbot failures. llvm-svn: 293509
*	AMDGPU/GlobalISel: Add support for simple shaders	Tom Stellard	2017-01-30	1	-0/+15
	Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP. Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D26730 llvm-svn: 293503
*	AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands	Matt Arsenault	2017-01-27	1	-1/+0
	Accomplishes what r292982 was supposed to, which ended up only really making the necessary test changes. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 293310
*	AMDGPU: Implement early ifcvt target hooks.	Matt Arsenault	2017-01-25	1	-0/+5
	Leave early ifcvt disabled for now since there are some shader-db regressions. This causes some immediate improvements, but could be better. The cost checking that the pass does is based on critical path length for out-of-order CPUs, which we do not want, so it skips many cases we do want. llvm-svn: 293016
*	AMDGPU add support for spilling to a user sgpr pointed buffers	Tom Stellard	2017-01-25	1	-7/+16
	Summary: This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1]. Patch By: Dave Airlie Reviewers: nhaehnle, arsenm, tstellarAMD Reviewed By: arsenm Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25428 llvm-svn: 293000
*	Enable FeatureFlatForGlobal on Volcanic Islands	Matt Arsenault	2017-01-24	1	-0/+1
	This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
*	AMDGPU: Combine fp16/fp64 subtarget features	Matt Arsenault	2017-01-23	1	-4/+3
	The same control register controls both, and they are set to the same defaults. Keep the old names around as aliases. llvm-svn: 292837
*	[AMDGPU] Add subtarget features for SDWA/DPP	Sam Kolton	2017-01-20	1	-0/+10
	Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28900 llvm-svn: 292596
*	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	Eugene Zelenko	2016-12-09	1	-5/+12
	llvm-svn: 289282
*	[AMDGPU] Scalarization of global uniform loads.	Alexander Timofeev	2016-12-08	1	-0/+4
	Summary: LC can currently select a scalar load for uniform memory access based only on the read-only memory address space. This restriction originates from the fact that in HW prior to VI the vector and scalar caches are not coherent. With MemoryDependenceAnalysis we can check that the memory location corresponding to the memory operand of the LOAD is not clobbered along all paths from the function entry. Reviewers: rampitec, tstellarAMD, arsenm Subscribers: wdng, arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D26917 llvm-svn: 289076
*	[AMDGPU] Add f16 support (VI+)	Konstantin Zhuravlyov	2016-11-13	1	-0/+4
	Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753
*	[AMDGPU] Check if type transforms to i16 (VI+) when getting AMDGPUISD::FFBH_U32	Konstantin Zhuravlyov	2016-11-01	1	-4/+4
	This will prevent the following regressions when enabling i16 support (D18049): test/CodeGen/AMDGPU/ctlz.ll, test/CodeGen/AMDGPU/ctlz_zero_undef.ll. Differential Revision: https://reviews.llvm.org/D25802 llvm-svn: 285716
*	AMDGPU: Use 1/2pi inline imm on VI	Matt Arsenault	2016-10-29	1	-0/+5
	I'm guessing at how it is supposed to be printed. llvm-svn: 285490
*	AMDGPU: Add definitions for scalar store instructions	Matt Arsenault	2016-10-28	1	-0/+5
	Also add glc bit to the scalar loads since they exist on VI and change the caching behavior. This currently has an assembler bug where the glc bit is incorrectly accepted on SI/CI which do not have it. llvm-svn: 285463
*	AMDGPU: Diagnose using too many SGPRs	Matt Arsenault	2016-10-28	1	-0/+2
	This is possible when using inline asm. llvm-svn: 285447
*	AMDGPU/SI: Handle hazard with > 8 byte VMEM stores	Tom Stellard	2016-10-27	1	-0/+4
	Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25577 llvm-svn: 285359
*	AMDGPU: Refactor processor definition to use ISA version features	Yaxun Liu	2016-10-26	1	-1/+4
	Add missing ISA versions 7.0.2/8.0.4/8.1.0 to the backend. Refactor processor definition to use ISA version features. Fixed ISA version for stoney. Based on Laurent Morichetti's patch. Differential Revision: https://reviews.llvm.org/D25919 llvm-svn: 285210
*	AMDGPU : Add a function to enable and disable IEEEBit for SC and shader respectively.	Wei Ding	2016-10-19	1	-0/+4
	Differential Revision: http://reviews.llvm.org/D25789 llvm-svn: 284655
*	AMDGPU/SI: Don't allow unaligned scratch access	Tom Stellard	2016-10-14	1	-0/+5
	Summary: The hardware doesn't support this. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25523 llvm-svn: 284257
*	AMDGPU: Add instruction definitions for VGPR indexing	Matt Arsenault	2016-10-12	1	-0/+10
	VI added a second method of indexing into VGPRs besides using v_movrel*. llvm-svn: 284027
*	AMDGPU/SI: Update ISA version numbers for Tonga and Polaris10/11.	Changpeng Fang	2016-10-11	1	-0/+1
	Differential Revision: http://reviews.llvm.org/D25454 Reviewers: tstellarAMD llvm-svn: 283893
*	[AMDGPU] Ask subtarget if waitcnt instruction is needed before barrier instruction	Konstantin Zhuravlyov	2016-09-30	1	-0/+6
	Differential Revision: https://reviews.llvm.org/D24985 llvm-svn: 282875
*	AMDGPU/SI: Include implicit arguments in kernarg_segment_byte_size	Tom Stellard	2016-09-23	1	-0/+14
	Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D24835 llvm-svn: 282223
*	AMDGPU: Use i64 scalar compare instructions	Matt Arsenault	2016-09-17	1	-0/+4
	VI added eq/ne for i64, so use them. llvm-svn: 281800
*	AMDGPU/SI: Add support for triples with the mesa3d operating system	Tom Stellard	2016-09-16	1	-1/+9
	Summary: mesa3d will use the same kernel calling convention as amdhsa, but it will handle everything else like the default 'unknown' OS type. Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D22783 llvm-svn: 281779
*	AMDGPU/SI: Make sure llvm.amdgcn.implicitarg.ptr() is 8-byte aligned for HSA	Tom Stellard	2016-09-09	1	-0/+4
	Reviewers: arsenm Subscribers: arsenm, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24405 llvm-svn: 281080
*	[AMDGPU] Wave and register controls	Konstantin Zhuravlyov	2016-09-06	1	-8/+83
	- Implemented amdgpu-flat-work-group-size attribute
	- Implemented amdgpu-num-active-waves-per-eu attribute
	- Implemented amdgpu-num-sgpr attribute
	- Implemented amdgpu-num-vgpr attribute
	- Dynamic LDS constraints are in a separate patch
	Patch by Tom Stellard and Konstantin Zhuravlyov
	Differential Revision: https://reviews.llvm.org/D21562
	llvm-svn: 280747
*	AMDGPU/SI: Implement a custom MachineSchedStrategy	Tom Stellard	2016-08-29	1	-0/+6
	Summary: GCNSchedStrategy re-uses most of GenericScheduler; it just uses a different method to compute the excess and critical register pressure limits. It's not enabled by default; to enable it, pass -misched=gcn to llc.
	Shader DB stats: 32464 shaders in 17874 tests
	Totals:
	SGPRS: 1542846 -> 1643125 (6.50 %)
	VGPRS: 1005595 -> 904653 (-10.04 %)
	Spilled SGPRs: 29929 -> 27745 (-7.30 %)
	Spilled VGPRs: 334 -> 352 (5.39 %)
	Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
	Code Size: 36688188 -> 37034900 (0.95 %) bytes
	LDS: 1913 -> 1913 (0.00 %) blocks
	Max Waves: 254101 -> 265125 (4.34 %)
	Wait states: 0 -> 0 (0.00 %)
	Totals from affected shaders:
	SGPRS: 1338220 -> 1438499 (7.49 %)
	VGPRS: 886221 -> 785279 (-11.39 %)
	Spilled SGPRs: 29869 -> 27685 (-7.31 %)
	Spilled VGPRs: 334 -> 352 (5.39 %)
	Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
	Code Size: 34315716 -> 34662428 (1.01 %) bytes
	LDS: 1551 -> 1551 (0.00 %) blocks
	Max Waves: 188127 -> 199151 (5.86 %)
	Wait states: 0 -> 0 (0.00 %)
	Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick
	Subscribers: arsenm, kzhuravl, llvm-commits
	Differential Revision: https://reviews.llvm.org/D23688
	llvm-svn: 279995
*	AMDGPU: Fix crashes on memory functions	Matt Arsenault	2016-08-11	1	-0/+7
	llvm-svn: 278369
*	AMDGPU/SI: Increase SGPR limit to 96 on Tonga/Iceland	Marek Olsak	2016-08-05	1	-1/+3
	Summary: This is the setting of the Vulkan closed source driver. It decreases the max wave count from 10 to 8.
	26010 shaders in 14650 tests
	Totals:
	VGPRS: 829593 -> 808440 (-2.55 %)
	Spilled SGPRs: 81878 -> 42226 (-48.43 %)
	Spilled VGPRs: 367 -> 358 (-2.45 %)
	Scratch VGPRs: 1764 -> 1748 (-0.91 %) dwords per thread
	Code Size: 36677864 -> 35923932 (-2.06 %) bytes
	There is a massive decrease in SGPR spilling in general and -7.4% spilled VGPRs for DiRT Showdown (= SGPRs spilled to scratch?)
	Reviewers: arsenm, tstellarAMD, nhaehnle
	Subscribers: arsenm, llvm-commits, kzhuravl
	Differential Revision: https://reviews.llvm.org/D23034
	llvm-svn: 277867
*	AMDGPU: Delete dead code	Matt Arsenault	2016-07-25	1	-6/+0
	llvm-svn: 276675
*	AMDGPU: Delete more dead code	Matt Arsenault	2016-07-22	1	-33/+4
	Remove dead code from r600 intrinsic removal. Remove unset members, rename StackSize to be less ambiguous. llvm-svn: 276436
*	AMDGPU: Add feature for unaligned access	Matt Arsenault	2016-07-01	1	-0/+5
	llvm-svn: 274398
*	Target: Remove unused arguments from overrideSchedPolicy, NFC	Duncan P. N. Exon Smith	2016-07-01	1	-1/+0
	TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr* arguments (begin and end) that invite implicit conversions from MachineInstrBundleIterator. One option would be to change their type to an iterator, but since they don't seem to have been used since the API was added in 2010, I'm deleting the dead code. llvm-svn: 274304
*	AMDGPU: Move subtarget feature checks into passes	Matt Arsenault	2016-06-27	1	-5/+0
	llvm-svn: 273937
*	[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header	Konstantin Zhuravlyov	2016-06-25	1	-0/+10
	Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue. Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
	- offset 0: work group ID x
	- offset 4: work group ID y
	- offset 8: work group ID z
	- offset 16: work item ID x
	- offset 20: work item ID y
	- offset 24: work item ID z
	Set
	- amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
	- amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
	- amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled
	Differential Revision: http://reviews.llvm.org/D20335
	llvm-svn: 273769
*	AMDGPU: Remove disable-irstructurizer subtarget feature	Matt Arsenault	2016-06-24	1	-5/+0
	The only real reason to use it is for testing, so replace it with a command line option instead of a potentially function dependent feature. llvm-svn: 273653
*	AMDGPU: Cleanup subtarget handling.	Matt Arsenault	2016-06-24	1	-144/+232
	Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
*	AMDGPU: Fix i64 global cmpxchg	Matt Arsenault	2016-06-09	1	-0/+4
	This was using extract_subreg sub0 to extract the low register of the result instead of sub0_sub1, producing an invalid copy. There doesn't seem to be a way to use the compound subreg indices in tablegen since those are generated, so manually select it. llvm-svn: 272344
*	[AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegs	Konstantin Zhuravlyov	2016-05-24	1	-3/+3
	Differential Revision: http://reviews.llvm.org/D20081 llvm-svn: 270594
*	AMDGPU: Fix promote alloca pass creating huge arrays	Matt Arsenault	2016-05-16	1	-1/+10
	This was assuming it could use all memory before, which is a bad decision because it restricts occupancy. By default, only try to use enough space that could reduce occupancy to 7, an arbitrarily chosen limit. Based on the existing LDS usage, try to round up to the limit in the current tier instead of further hurting occupancy. This isn't ideal, because it doesn't accurately know how much space is going to be used for alignment padding. llvm-svn: 269708
*	[AMDGPU] Move reserved vgpr count for trap handler usage to SIMachineFunctionInfo + minor commenting changes	Konstantin Zhuravlyov	2016-04-26	1	-4/+0
	Differential Revision: http://reviews.llvm.org/D19537 llvm-svn: 267573
*	[AMDGPU] Reserve VGPRs for trap handler usage if instructed	Konstantin Zhuravlyov	2016-04-26	1	-0/+9
	Differential Revision: http://reviews.llvm.org/D19235 llvm-svn: 267563
*	[AMDGPU] Add insert nops pass based on subtarget features instead of cl::opt	Konstantin Zhuravlyov	2016-04-18	1	-0/+5
	Also,
	- Skip pass if machine module does not have debug info
	- Minor comment changes
	- Added test
	Differential Revision: http://reviews.llvm.org/D19079
	llvm-svn: 266626
*	[NFC] Header cleanup	Mehdi Amini	2016-04-18	1	-2/+2
	Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595
*	AMDGPU: Add skeleton GlobalIsel implementation	Tom Stellard	2016-04-14	1	-0/+8
	Summary: This adds the necessary target code to be able to run the ir translator. Lowering function arguments and returns is a nop and there is no support for RegBankSelect. Reviewers: arsenm, qcolombet Subscribers: arsenm, joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19077 llvm-svn: 266356
*	AMDGPU: Add a shader calling convention	Nicolai Haehnle	2016-04-06	1	-1/+1
	This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
*	AMDGPU: More bits of frame index are known to be zero	Matt Arsenault	2016-02-27	1	-5/+0
	The maximum private allocation for the whole GPU is 4G, so the maximum possible index for a single workitem is the maximum size divided by the smallest granularity for a dispatch. This increases the number of known zero high bits, which enables more offset folding. The maximum private size per workitem with this is 128M but may be smaller still. llvm-svn: 262153
*	AMDGPU: Split vi-insts subtarget feature	Matt Arsenault	2016-02-27	1	-1/+10
	This will be more useful for marking which builtins are acceptable for which subtargets. llvm-svn: 262121