bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	AMDGPU/SI: Extend promoting alloca to vector to arrays of up to 16 elements	Changpeng Fang	2018-02-16	1	-2/+2
	Summary: This patch extends the promotion of alloca to vector to the arrays of up to 16 elements. Also we introduce an option, -disable-promote-alloca-to-vector, to switch promotion to vector off, if needed. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D33559 llvm-svn: 325372
*	[AMDGPU] Switch to the new addr space mapping by default	Yaxun Liu	2018-02-02	1	-16/+16
	This requires corresponding clang change. Differential Revision: https://reviews.llvm.org/D40955 llvm-svn: 324101
*	[AMDGPU] Eliminate barrier if workgroup size is not greater than wavefront size	Stanislav Mekhanoshin	2017-04-06	1	-1/+1
	If a workgroup size is known to be not greater than wavefront size the s_barrier instruction is not needed since all threads are guaranteed to come to the same point at the same time. Differential Revision: https://reviews.llvm.org/D31731 llvm-svn: 299659
*	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel	Matt Arsenault	2017-03-21	1	-4/+4
	Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
*	[AMDGPU] Account workgroup size in LDS occupancy limits	Stanislav Mekhanoshin	2017-02-01	1	-6/+6
	Functions matching LDS use to occupancy return results for a workgroup of 64 workitems. The numbers have to be adjusted for bigger workgroups. For example, a workgroup of size 256 already occupies 4 waves just by itself. Given that all numbers of LDS use in the compiler are per workgroup, occupancy shall be multiplied by 4 in this case. Each 64 workitems are still limited by the same number, but 4 subgroups of 64 workitems each can afford 4 times more LDS to get the same occupancy. In addition, this change initializes LDS size in the subtarget to a real value for SI+ targets. This is required since LDS size is a variable in these calculations. Differential Revision: https://reviews.llvm.org/D29423 llvm-svn: 293837
*	Enable FeatureFlatForGlobal on Volcanic Islands	Matt Arsenault	2017-01-24	1	-2/+2
	This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
*	[AMDGPU] Wave and register controls	Konstantin Zhuravlyov	2016-09-06	1	-1/+1
	- Implemented amdgpu-flat-work-group-size attribute - Implemented amdgpu-num-active-waves-per-eu attribute - Implemented amdgpu-num-sgpr attribute - Implemented amdgpu-num-vgpr attribute - Dynamic LDS constraints are in a separate patch Patch by Tom Stellard and Konstantin Zhuravlyov Differential Revision: https://reviews.llvm.org/D21562 llvm-svn: 280747
*	AMDGPU/SI: Enable load-store-opt by default.	Changpeng Fang	2016-05-26	1	-2/+10
	Summary: Enable load-store-opt by default, and update LIT tests. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D20694 llvm-svn: 270894
*	AMDGPU: Fix promote alloca pass creating huge arrays	Matt Arsenault	2016-05-16	1	-3/+3
	This was assuming it could use all memory before, which is a bad decision because it restricts occupancy. By default, only try to use enough space that could reduce occupancy to 7, an arbitrarily chosen limit. Based on the existing LDS usage, try to round up to the limit in the current tier instead of further hurting occupancy. This isn't ideal, because it doesn't accurately know how much space is going to be used for alignment padding. llvm-svn: 269708
*	AMDGPU: Change private_element_size to 4	Matt Arsenault	2016-05-11	1	-10/+41
	llvm-svn: 269145
*	AMDGPU: Fix mishandling array allocations when promoting alloca	Matt Arsenault	2016-04-28	1	-10/+10
	The canonical form for allocas is a single allocation of the array type. In case we see a non-canonical array alloca, make sure we aren't replacing this with an array N times smaller. llvm-svn: 267916
*	AMDGPU: Remove some old intrinsic uses from tests	Matt Arsenault	2016-02-11	1	-10/+12
	llvm-svn: 260493
*	AMDGPU: Do not promote allocas with non-inbounds GEPs	Matt Arsenault	2016-02-02	1	-4/+4
	If we can't assume the pointer value is within the bounds of the object, it seems risky to try to replace the pointer calculations. llvm-svn: 259573
*	AMDGPU: Switch barrier intrinsics to using convergent	Matt Arsenault	2015-12-19	1	-5/+5
	noduplicate prevents unrolling of small loops that happen to have barriers in them. If a loop has a barrier in it, it is OK to duplicate it for the unroll. llvm-svn: 256075
*	AMDGPU: Split LDS vector loads	Matt Arsenault	2015-11-24	1	-16/+8
	If properly aligned this could allow using ds_read_b64. llvm-svn: 253975
*	R600 -> AMDGPU rename	Tom Stellard	2015-06-13	1	-0/+91
	llvm-svn: 239657
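
The arithmetic described in the "[AMDGPU] Account workgroup size in LDS occupancy limits" entry above can be illustrated with a small sketch. This is not the code from that commit: the function names and the constant below are hypothetical, and the only facts taken from the log are the 64-workitem baseline and the example that a 256-workitem workgroup occupies 4 waves, so the per-64 occupancy figure is multiplied by 4.

```python
# Illustrative sketch only; names are hypothetical, not LLVM APIs.
WAVEFRONT_SIZE = 64  # workitems per wave, the baseline used in the commit message


def waves_per_workgroup(workgroup_size: int, wavefront_size: int = WAVEFRONT_SIZE) -> int:
    """Number of waves a single workgroup occupies (ceiling division)."""
    if workgroup_size <= 0:
        raise ValueError("workgroup_size must be positive")
    return -(-workgroup_size // wavefront_size)


def scale_occupancy(occupancy_for_64: int, workgroup_size: int) -> int:
    """Scale an occupancy figure computed for a 64-workitem workgroup
    to a larger workgroup, as the commit message describes."""
    return occupancy_for_64 * waves_per_workgroup(workgroup_size)
```

For a 256-workitem workgroup, `waves_per_workgroup(256)` gives 4, so an LDS-limited occupancy computed for 64 workitems is scaled by 4. Any hardware cap on the result is deliberately left out, since the log does not state one.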