bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	DAG: Handle odd vector sizes in calling conv splitting	Matt Arsenault	2018-09-10	1	-4/+3
	This already worked if only one register piece was used, but didn't if a type was split into multiple, unequal-sized pieces. Fixes not splitting v3i16/v3f16 into two registers for AMDGPU. This will also allow fixing the ABI for 16-bit vectors in a future commit so that it's the same for all subtargets. llvm-svn: 341801
*	AMDGPU: Use splat vectors for undefs when folding canonicalize	Matt Arsenault	2018-08-12	1	-8/+59
	If one of the elements is undef, use the canonicalized constant from the other element instead of 0. Splat vectors are more useful for other optimizations, such as matching vector clamps. This was breaking on clamps of half3 from the undef 4th component. llvm-svn: 339512
*	AMDGPU: Push fcanonicalize through partially constant build_vector	Matt Arsenault	2018-08-06	1	-0/+173
	This usually avoids some re-packing code, and may help find canonical sources. llvm-svn: 339072
*	DAG: Fix vector widening fcanonicalize	Matt Arsenault	2018-08-02	1	-0/+20
	llvm-svn: 338715
*	AMDGPU: Fix scalarizing v4f16 fcanonicalize	Matt Arsenault	2018-08-02	1	-0/+19
	llvm-svn: 338714
*	DAG: Fix PromoteFloatResult for fcanonicalize	Matt Arsenault	2018-07-31	1	-83/+101
	llvm-svn: 338382
*	AMDGPU: Reduce code size with fcanonicalize (fneg x)	Matt Arsenault	2018-07-30	1	-44/+67
	When fcanonicalize is lowered to a mul, we can use -1.0 for free and avoid the cost of the bigger encoding for source modifiers. llvm-svn: 338244
*	AMDGPU: Make v2i16/v2f16 legal on VI	Matt Arsenault	2018-05-22	1	-11/+6
	This usually results in better code. Fixes using inline asm with short2, and also fixes having a different ABI for function parameters between VI and gfx9. Partially cleans up the mess used for lowering of the d16 operations. Making v4f16 legal will help clean this up more, but this requires additional work. llvm-svn: 332953
*	[AMDGPU] Enabled v2.16 literals for VOP3P	Stanislav Mekhanoshin	2018-04-17	1	-1/+1
	Literal encoding needs op_sel_hi to select the low 16 bits in this case. Differential Revision: https://reviews.llvm.org/D45745 llvm-svn: 330230
*	AMDGPU/GCN: Bring processors in sync with AMDGPUUsage	Konstantin Zhuravlyov	2017-12-08	1	-1/+1
	- Add gfx704 - Change bonaire to gfx704 - Remove gfx804 - Remove gfx901 - Remove gfx903 Differential Revision: https://reviews.llvm.org/D40046 llvm-svn: 320194
*	[AMDGPU] SDWA: add support for PRESERVE into SDWA peephole.	Sam Kolton	2017-12-04	1	-4/+3
	Summary: Reviewers: arsenm, vpykhtin, rampitec Subscribers: kzhuravl, wdng, nhaehnle, mgorny, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D37817 llvm-svn: 319662
*	[AMDGPU] Use v_pk_max_f16 for fcanonicalize	Stanislav Mekhanoshin	2017-09-06	1	-5/+5
	Differential Revision: https://reviews.llvm.org/D37325 llvm-svn: 312676
*	[AMDGPU] Fixed encoding of v_pk_mul_f16 in fcanonicalize	Stanislav Mekhanoshin	2017-09-06	1	-6/+5
	Differential Revision: https://reviews.llvm.org/D37522 llvm-svn: 312660
*	[AMDGPU] Use v_max_f* for fcanonicalize	Stanislav Mekhanoshin	2017-08-30	1	-21/+17
	If denorms are not flushed we can use max instead of multiplication by 1. For double that is simply faster, while for float and half it is shorter, because mul uses constant bus and VOP3. Differential Revision: https://reviews.llvm.org/D36856 llvm-svn: 312095
*	[AMDGPU] Switch scalarize global loads ON by default	Alexander Timofeev	2017-07-04	1	-4/+14
	Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307097
*	Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default"	NAKAMURA Takumi	2017-07-04	1	-14/+4
	It broke a testcase. Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll llvm-svn: 307054
*	[AMDGPU] Switch scalarize global loads ON by default	Alexander Timofeev	2017-07-03	1	-4/+14
	Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307026
*	[AMDGPU] Untangle SDWA pass from SIShrinkInstructions	Stanislav Mekhanoshin	2017-06-03	1	-3/+3
	Remove dependency of SDWA pass on SIShrinkInstructions. The goal is to move SDWA even higher in the stack to avoid a second run of MachineLICM, MachineCSE and SIFoldOperands. Also added handling to preserve original src modifiers. Differential Revision: https://reviews.llvm.org/D33860 llvm-svn: 304665
*	[AMDGPU] Allow SDWA in instructions with immediates and SGPRs	Stanislav Mekhanoshin	2017-05-30	1	-10/+11
	The encoding does not allow SDWA in an instruction with scalar operands, either literals or SGPRs. It is, however, possible to copy these operands into a VGPR first. Several copies of the value are produced if multiple SDWA conversions were done. To clean up, runs of MachineLICM (to hoist copies out of loops), MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace an SGPR-to-VGPR copy with an immediate copy directly into the VGPR) are added after the SDWA pass. Differential Revision: https://reviews.llvm.org/D33583 llvm-svn: 304219
*	AMDGPU: Temporarily disable packed inlinable literals (v2f16, v2i16)	Konstantin Zhuravlyov	2017-04-21	1	-1/+1
	Differential Revision: https://reviews.llvm.org/D32361 llvm-svn: 301028
*	[AMDGPU] Resubmit SDWA peephole: enable by default	Sam Kolton	2017-04-06	1	-1/+1
	Reviewers: vpykhtin, rampitec, arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31671 llvm-svn: 299654
*	Revert r299536. [AMDGPU] SDWA peephole: enable by default.	Ivan Krasin	2017-04-05	1	-1/+1
	Reason: breaks multiple bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/3988 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1173 Original Review URL: https://reviews.llvm.org/D31671 llvm-svn: 299583
*	[AMDGPU] SDWA peephole: enable by default	Sam Kolton	2017-04-05	1	-1/+1
	Reviewers: vpykhtin, rampitec, arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31671 llvm-svn: 299536
*	AMDGPU: Remove unnecessary ands when f16 is legal	Matt Arsenault	2017-03-31	1	-10/+14
	Add a new node to act as a fancy bitcast from f16 operations to i32 that implicitly zero the high 16 bits of the result. Alternatively, could try making v2f16 legal and canonicalizing on build_vectors. llvm-svn: 299246
*	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel	Matt Arsenault	2017-03-21	1	-42/+42
	Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
*	AMDGPU: Support v2i16/v2f16 packed operations	Matt Arsenault	2017-02-27	1	-21/+37
	llvm-svn: 296396
*	AMDGPU: Use source mods with fcanonicalize	Matt Arsenault	2017-01-31	1	-0/+35
	llvm-svn: 293654
*	Enable FeatureFlatForGlobal on Volcanic Islands	Matt Arsenault	2017-01-24	1	-1/+1
	This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
*	AMDGPU: Combine fp16/fp64 subtarget features	Matt Arsenault	2017-01-23	1	-10/+10
	The same control register controls both, and both are set to the same defaults. Keep the old names around as aliases. llvm-svn: 292837
*	DAG: Allow legalization of fcanonicalize vector types	Matt Arsenault	2017-01-23	1	-0/+214
	llvm-svn: 292814
*	AMDGPU: Implement f16 fcanonicalize	Matt Arsenault	2016-12-22	1	-0/+172
	llvm-svn: 290300
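
The bulk conversion described in the r298444 entry above (perl -pi -e 's/define void/define amdgpu_kernel void/') can be approximated in Python. This is only a sketch for illustration: the function name mark_kernels and the sample IR string are hypothetical and not part of the commit. It mirrors the one-liner by rewriting the first match on each line, and it does not reproduce the manual undo that the commit mentions for the one test that wanted a non-kernel.

```python
import re

# Same pattern and replacement as the perl one-liner in the commit message.
PATTERN = re.compile(r"define void")
REPLACEMENT = "define amdgpu_kernel void"


def mark_kernels(ir_text: str) -> str:
    """Rewrite 'define void' to 'define amdgpu_kernel void', line by line.

    perl's s/// without the /g flag replaces only the first match on each
    line, so count=1 is used here to match that behaviour.
    """
    lines = ir_text.splitlines(keepends=True)
    return "".join(PATTERN.sub(REPLACEMENT, line, count=1) for line in lines)


# Hypothetical test input, made up for this example.
sample = "define void @foo(i32 %x) {\n  ret void\n}\n"
converted = mark_kernels(sample)
print(converted)
```

Applying the function a second time leaves the text unchanged, because the rewritten line no longer contains the literal string "define void".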