bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU: Don't form fmed3 if it will require materialization	Matt Arsenault	2018-09-18	1	-0/+62
	If there is a single use constant, it can be folded into the min/max, but not into med3. llvm-svn: 342443
*	DAG: Enhance isKnownNeverNaN	Matt Arsenault	2018-08-03	1	-41/+32
	Add a parameter for testing specifically for sNaNs - at least one instruction pattern on AMDGPU needs to check specifically for this. Also handle more cases, and add a target hook for custom nodes, similar to the hooks for known bits. llvm-svn: 338910
*	AMDGPU/GCN: Bring processors in sync with AMDGPUUsage	Konstantin Zhuravlyov	2017-12-08	1	-2/+2
	- Add gfx704
	- Change bonaire to gfx704
	- Remove gfx804
	- Remove gfx901
	- Remove gfx903
	Differential Revision: https://reviews.llvm.org/D40046 llvm-svn: 320194
*	AMDGPU: Fix -enable-var-scope violations	Matt Arsenault	2017-11-12	1	-7/+7
	llvm-svn: 318004
*	AMDGPU: Start selecting global instructions	Matt Arsenault	2017-07-29	1	-75/+75
	llvm-svn: 309470
*	AMDGPU: Allow SIShrinkInstructions to work in non-SSA	Matt Arsenault	2017-07-10	1	-2/+2
	Immediates can be folded as long as the immediate is a vreg. Also undo commuting instructions if it didn't fold an immediate. llvm-svn: 307575
*	[AMDGPU] Narrow lshl from 64 to 32 bit if possible	Stanislav Mekhanoshin	2017-05-22	1	-4/+4
	Turn expensive 64 bit shift into 32 bit if shift does not overflow int: shl (ext x) => zext (shl x) Differential Revision: https://reviews.llvm.org/D33367 llvm-svn: 303569
*	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel	Matt Arsenault	2017-03-21	1	-42/+42
	Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
*	AMDGPU: Use v_med3_{f16\|i16\|u16}	Matt Arsenault	2017-02-27	1	-4/+83
	llvm-svn: 296401
*	AMDGPU: Generalize matching of v_med3_f32	Matt Arsenault	2017-01-31	1	-6/+732
	I think this is safe as long as no inputs are known to ever be nans. Also add an intrinsic for fmed3 to be able to handle all safe math cases. llvm-svn: 293598
*	DAG: Consider nnan in isKnownNeverNaN	Matt Arsenault	2017-01-18	1	-0/+16
	llvm-svn: 292328
*	AMDGPU: Remove superfluous string attributes from tests	Matt Arsenault	2016-07-11	1	-9/+9
	Also fix v_mac.ll not testing right thing for fneg llvm-svn: 275129
*	AMDGPU: Run SIFoldOperands after PeepholeOptimizer	Matt Arsenault	2016-04-14	1	-2/+2
	PeepholeOptimizer cleans up redundant copies, which makes the operand folding more effective.
	shader-db stats:
	Totals:
	SGPRS: 34200 -> 34336 (0.40 %)
	VGPRS: 22118 -> 21655 (-2.09 %)
	Code Size: 632144 -> 633460 (0.21 %) bytes
	LDS: 11 -> 11 (0.00 %) blocks
	Scratch: 10240 -> 11264 (10.00 %) bytes per wave
	Max Waves: 8822 -> 8918 (1.09 %)
	Wait states: 0 -> 0 (0.00 %)
	Totals from affected shaders:
	SGPRS: 7704 -> 7840 (1.77 %)
	VGPRS: 5169 -> 4706 (-8.96 %)
	Code Size: 234444 -> 235760 (0.56 %) bytes
	LDS: 2 -> 2 (0.00 %) blocks
	Scratch: 0 -> 1024 (0.00 %) bytes per wave
	Max Waves: 1188 -> 1284 (8.08 %)
	Wait states: 0 -> 0 (0.00 %)
	Increases:
	SGPRS: 35 (0.01 %) VGPRS: 1 (0.00 %) Code Size: 59 (0.02 %) LDS: 0 (0.00 %) Scratch: 1 (0.00 %) Max Waves: 48 (0.02 %) Wait states: 0 (0.00 %)
	Decreases:
	SGPRS: 26 (0.01 %) VGPRS: 54 (0.02 %) Code Size: 68 (0.03 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) Max Waves: 4 (0.00 %) Wait states: 0 (0.00 %)
	llvm-svn: 266378
*	AMDGPU: Match fmed3 patterns with legacy fmin/fmax	Matt Arsenault	2016-01-28	1	-0/+23
	llvm-svn: 259090
*	AMDGPU: Match some med3 patterns	Matt Arsenault	2016-01-28	1	-0/+131
	llvm-svn: 259089
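
Several entries in this log concern matching med3 (median-of-three) patterns built from min and max operations. As a rough illustration of why such patterns can be recognized, and not a rendering of the LLVM code itself, the sketch below writes a median of three purely in terms of min/max and shows that a clamp with ordered bounds computes the same value. The names med3_via_minmax, clamp_via_minmax and self_check are hypothetical and introduced only for this example. Integers are used deliberately: the floating-point case carries the NaN caveat noted in the 2017-01-31 entry, which this sketch does not model.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not an LLVM function): median of three integers
// expressed only with min and max.
constexpr int med3_via_minmax(int a, int b, int c) {
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Hypothetical helper: clamp x into [lo, hi] written as min(max(x, lo), hi).
// Assumes lo <= hi; under that assumption it equals the median of x, lo, hi.
constexpr int clamp_via_minmax(int x, int lo, int hi) {
    return std::min(std::max(x, lo), hi);
}

// Small self-check over a few orderings of the inputs.
inline void self_check() {
    assert(med3_via_minmax(1, 2, 3) == 2);
    assert(med3_via_minmax(3, 1, 2) == 2);
    assert(med3_via_minmax(2, 3, 1) == 2);
    // With ordered bounds, the clamp and the median agree.
    assert(clamp_via_minmax(5, 0, 3) == med3_via_minmax(5, 0, 3));
    assert(clamp_via_minmax(-4, 0, 3) == med3_via_minmax(-4, 0, 3));
}
```

Calling self_check() from a test harness exercises the identity on a handful of inputs. It says nothing about when forming med3 is profitable, which is the question the 2018-09-18 entry addresses.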