bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AMDGPU] Switch scalarize global loads ON by default	Alexander Timofeev	2017-07-04	1	-3/+3
	Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307097
*	Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default"	NAKAMURA Takumi	2017-07-04	1	-3/+3
	It broke a testcase. Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll llvm-svn: 307054
*	[AMDGPU] Switch scalarize global loads ON by default	Alexander Timofeev	2017-07-03	1	-3/+3
	Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307026
*	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel	Matt Arsenault	2017-03-21	1	-9/+9
	Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
*	[DAGCombiner] add missing folds for scalar select of {-1,0,1}	Sanjay Patel	2017-02-24	1	-8/+4
	The motivation for filling out these select-of-constants cases goes back to D24480, where we discussed removing an IR fold from add(zext) --> select. And that goes back to: https://reviews.llvm.org/rL75531 https://reviews.llvm.org/rL159230 The idea is that we should always canonicalize patterns like this to a select-of-constants in IR because that's the smallest IR and the best for value tracking. Note that we currently do the opposite in some cases (like the cases in this patch). I.e., the proposed folds in this patch already exist in InstCombine today: https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineSelect.cpp#L1151 As this patch shows, most targets generate better machine code for simple ext/add/not ops rather than a select of constants. So the follow-up steps to make this less of a patchwork of special-case folds and missing IR canonicalization: 1. Have DAGCombiner convert any select of constants into ext/add/not ops. 2. Have InstCombine canonicalize in the other direction (create more selects). Differential Revision: https://reviews.llvm.org/D30180 llvm-svn: 296137
*	AMDGPU/SI: Fix trunc i16 pattern	Jan Vesely	2017-02-23	1	-31/+60
	Hit on ASICs that support 16bit instructions. Differential Revision: https://reviews.llvm.org/D30281 llvm-svn: 295990
*	Revert "AMDGPU: Enable ConstrainCopy DAG mutation"	Konstantin Zhuravlyov	2016-11-17	1	-4/+3
	This reverts commit r287146. This breaks a few conformance tests. llvm-svn: 287233
*	AMDGPU: Enable ConstrainCopy DAG mutation	Matt Arsenault	2016-11-16	1	-3/+4
	This fixes a probably unintended divergence from the default scheduler behavior. llvm-svn: 287146
*	AMDGPU: Use unsigned compare for eq/ne	Matt Arsenault	2016-09-30	1	-4/+4
	For some reason there are both of these available, except for scalar 64-bit compares, which only have u64. I'm not sure why there are both (I'm guessing it's for the one bit inputs we don't use), but for consistency always use the unsigned one. llvm-svn: 282832
*	AMDGPU: Support commuting with immediate in src0	Matt Arsenault	2016-09-08	1	-2/+2
	llvm-svn: 280970
*	AMDGPU/SI: Implement a custom MachineSchedStrategy	Tom Stellard	2016-08-29	1	-2/+2
	Summary: GCNSchedStrategy re-uses most of GenericScheduler; it just uses a different method to compute the excess and critical register pressure limits. It's not enabled by default; to enable it you need to pass -misched=gcn to llc.
	Shader DB stats: 32464 shaders in 17874 tests
	Totals:
	SGPRS: 1542846 -> 1643125 (6.50 %)
	VGPRS: 1005595 -> 904653 (-10.04 %)
	Spilled SGPRs: 29929 -> 27745 (-7.30 %)
	Spilled VGPRs: 334 -> 352 (5.39 %)
	Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
	Code Size: 36688188 -> 37034900 (0.95 %) bytes
	LDS: 1913 -> 1913 (0.00 %) blocks
	Max Waves: 254101 -> 265125 (4.34 %)
	Wait states: 0 -> 0 (0.00 %)
	Totals from affected shaders:
	SGPRS: 1338220 -> 1438499 (7.49 %)
	VGPRS: 886221 -> 785279 (-11.39 %)
	Spilled SGPRs: 29869 -> 27685 (-7.31 %)
	Spilled VGPRs: 334 -> 352 (5.39 %)
	Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
	Code Size: 34315716 -> 34662428 (1.01 %) bytes
	LDS: 1551 -> 1551 (0.00 %) blocks
	Max Waves: 188127 -> 199151 (5.86 %)
	Wait states: 0 -> 0 (0.00 %)
	Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23688 llvm-svn: 279995
*	AMDGPU/SI: Enable the post-ra scheduler	Tom Stellard	2016-04-30	1	-1/+1
	Summary: This includes a hazard recognizer implementation to replace some of the hazard handling we had during frame index elimination. Reviewers: arsenm Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18602 llvm-svn: 268143
*	AMDGPU/SI: use S_AND for i1 trunc	Marek Olsak	2015-10-29	1	-4/+4
	llvm-svn: 251630
*	R600 -> AMDGPU rename	Tom Stellard	2015-06-13	1	-0/+100
	llvm-svn: 239657