bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	AMDGPU/GlobalISel: Select G_SHL	Matt Arsenault	2019-07-16	3	-0/+698
\| \| \| \| \| \| \| \| \| \|	I think this manages to not break the DAG handling with the divergent predicates because the standalone divergent patterns end up with a higher priority than the pattern on the instruction definition. The 16-bit versions don't work yet. llvm-svn: 366254
*	[AMDGPU] Change register type for v32 vectors	Stanislav Mekhanoshin	2019-07-16	1	-0/+29
\| \| \| \| \| \| \| \| \| \|	When it is AReg_1024 this results in unnecessary copying of 32-element vectors into AGPRs even though they are not intended for an mfma instruction. Differential Revision: https://reviews.llvm.org/D64815 llvm-svn: 366252
*	AMDGPU/GlobalISel: Fix selection of private stores	Matt Arsenault	2019-07-16	1	-0/+280
\| \| \| \|	llvm-svn: 366249
*	AMDGPU/GlobalISel: Select private loads	Matt Arsenault	2019-07-16	1	-0/+1158
\| \| \| \|	llvm-svn: 366248
*	AMDGPU/GlobalISel: Select flat stores	Matt Arsenault	2019-07-16	7	-52/+1646
\| \| \| \|	llvm-svn: 366246
*	AMDGPU/GlobalISel: Select flat loads	Matt Arsenault	2019-07-16	2	-9/+3357
\| \| \| \| \| \| \| \|	Now that the patterns use the new PatFrag address space support, the only blocker to importing most load patterns is the addressing mode complex patterns. llvm-svn: 366237
*	[AMDGPU] Optimize atomic max/min	Jay Foad	2019-07-16	1	-0/+108
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Extend the atomic optimizer to handle signed and unsigned max and min operations, as well as add and subtract. Reviewers: arsenm, sheredom, critson, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64328 llvm-svn: 366235
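The commit above does not spell out the mechanism, so the following is only a rough sketch of the general idea behind wave-level atomic optimization as applied to max: reduce across the lanes first, issue a single atomic, then reconstruct each lane's return value from an exclusive prefix. The function names and the dictionary standing in for memory are hypothetical and are not taken from the pass.

```python
def per_lane_atomic_max(memory, addr, lane_values):
    """Reference behaviour: every lane issues its own atomic max in lane order."""
    results = []
    for v in lane_values:
        results.append(memory[addr])          # atomic returns the old value
        memory[addr] = max(memory[addr], v)
    return results


def wave_atomic_max(memory, addr, lane_values):
    """Sketch: one atomic for the whole wave, same per-lane results."""
    # Exclusive prefix max: what earlier lanes contributed before each lane.
    prefix = []
    running = None
    for v in lane_values:
        prefix.append(running)
        running = v if running is None else max(running, v)

    old = memory[addr]
    memory[addr] = max(old, running)          # the single atomic update

    # Each lane sees the old memory value combined with earlier lanes' values.
    return [old if p is None else max(old, p) for p in prefix]


if __name__ == "__main__":
    a, b = {0: 5}, {0: 5}
    print(per_lane_atomic_max(a, 0, [3, 9, 2, 11]), a)
    print(wave_atomic_max(b, 0, [3, 9, 2, 11]), b)
```

Min, and the signed and unsigned variants, would follow the same shape with a different combining function; how the real pass computes the prefix is not covered here.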
*	[AMDGPU] Add the adjusted FP as a livein register.	Michael Liao	2019-07-16	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64145 llvm-svn: 366223
*	AMDGPU/GlobalISel: Fix test failures in release build	Matt Arsenault	2019-07-16	13	-463/+400
\| \| \| \| \| \| \| \| \| \| \| \|	Apparently the check for legal instructions during instruction select does not happen without an asserts build, so these would successfully select in release, and fail in debug. Make s16 and/or/xor legal. These can just be selected directly to the 32-bit operation, as is already done in SelectionDAG, so just make them legal. llvm-svn: 366210
*	[AMDGPU] Enable merging m0 initializations.	Austin Kerbow	2019-07-15	1	-7/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Enable hoisting and merging m0 defs that are initialized with the same immediate value. Fix a bug where removed instructions are not considered to interfere with other inits, and make sure not to hoist inits before block prologues. Reviewers: rampitec, arsenm Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64766 llvm-svn: 366135
*	AMDGPU/GlobalISel: Allow scalar s1 and/or/xor	Matt Arsenault	2019-07-15	5	-162/+1873
\| \| \| \| \| \| \| \|	If a 1-bit value is in a 32-bit VGPR, the scalar opcodes set SCC to whether the result is 0. If the inputs are SCC, these can be copied to a 32-bit SGPR to produce an SCC result. llvm-svn: 366125
*	AMDGPU/GlobalISel: Select G_AND/G_OR/G_XOR	Matt Arsenault	2019-07-15	3	-24/+1762
\| \| \| \|	llvm-svn: 366121
*	AMDGPU/GlobalISel: Don't constrain source register of VCC copies	Matt Arsenault	2019-07-15	1	-4/+27
\| \| \| \| \| \| \| \| \| \| \| \| \|	This is a hack until I come up with a better way of dealing with the pseudo-register banks used for boolean values. If the use instruction constrains the register, the selector for the def instruction won't see that the bank was VCC. A 1-bit SReg_32 could ambiguously have been SCCRegBank or VCCRegBank in wave32. This is necessary to successfully select branches with an and/or/xor condition. llvm-svn: 366120
*	AMDGPU/GlobalISel: Fix selecting vcc->vcc bank copies	Matt Arsenault	2019-07-15	1	-3/+31
\| \| \| \| \| \| \| \| \|	The extra test change is correct, although how it arrives there is a bug that needs work. With wave32, the test for isVCC ambiguously reports true for an SCC or VCC source. A new allocatable pseudo register class for SCC may be necessary. llvm-svn: 366119
*	AMDGPU/GlobalISel: Fix not constraining result reg of copies to VCC	Matt Arsenault	2019-07-15	1	-0/+26
\| \| \| \|	llvm-svn: 366118
*	AMDGPU/GlobalISel: Fix handling of sgpr (not scc bank) s1 to VCC	Matt Arsenault	2019-07-15	1	-9/+36
\| \| \| \| \| \|	This was emitting a copy from a 32-bit register to a 64-bit. llvm-svn: 366117
*	AMDGPU/GlobalISel: Custom legalize G_INSERT_VECTOR_ELT	Matt Arsenault	2019-07-15	1	-3/+38
\| \| \| \|	llvm-svn: 366116
*	AMDGPU/GlobalISel: Custom legalize G_EXTRACT_VECTOR_ELT	Matt Arsenault	2019-07-15	1	-102/+99
\| \| \| \| \| \|	Turn the constant cases into G_EXTRACTs. llvm-svn: 366115
*	AMDGPU/GlobalISel: Fix G_ICMP for wave32	Matt Arsenault	2019-07-15	1	-6/+7
\| \| \| \|	llvm-svn: 366114
*	GlobalISel: Implement narrowScalar for vector extract/insert indexes	Matt Arsenault	2019-07-15	2	-2/+63
\| \| \| \|	llvm-svn: 366113
*	AMDGPU/GlobalISel: Widen vector extracts	Matt Arsenault	2019-07-15	1	-0/+366
\| \| \| \|	llvm-svn: 366103
*	AMDGPU/GlobalISel: Handle llvm.amdgcn.if.break	Matt Arsenault	2019-07-15	2	-0/+53
\| \| \| \|	llvm-svn: 366102
*	AMDGPU/GlobalISel: Select llvm.amdgcn.end.cf	Matt Arsenault	2019-07-15	2	-0/+75
\| \| \| \|	llvm-svn: 366099
*	AMDGPU: Add 24-bit mul intrinsics	Matt Arsenault	2019-07-15	6	-11/+609
\| \| \| \| \| \| \| \| \| \| \|	Insert these during codegenprepare. This works around a DAG issue where generic combines eliminate the and operation asserting that the high bits are zero, which then exposes an unknown read source to the mul combine. It isn't worth the hassle of trying to insert an AssertZext or something to try to deal with it. llvm-svn: 366094
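To illustrate why the high bits matter for a 24-bit multiply, here is a minimal sketch. It assumes an unsigned 24-bit multiply that uses only the low 24 bits of each operand and returns the low 32 bits of the product; the helper names are hypothetical, not the intrinsic names. The two functions agree exactly when the top 8 bits of both operands are zero, which is the fact the masking and operation would otherwise have to prove.

```python
MASK24 = (1 << 24) - 1
MASK32 = (1 << 32) - 1


def mul_u24(a, b):
    """Hypothetical model of an unsigned 24-bit multiply: low 24 bits in, low 32 bits out."""
    return ((a & MASK24) * (b & MASK24)) & MASK32


def mul_u32(a, b):
    """Ordinary 32-bit multiply, truncated to 32 bits."""
    return (a * b) & MASK32


if __name__ == "__main__":
    # High 8 bits are zero on both operands: the two multiplies agree.
    print(hex(mul_u24(0x00ABCDEF, 0x1234)), hex(mul_u32(0x00ABCDEF, 0x1234)))
    # Bit 24 is set on the first operand: the 24-bit form silently drops it.
    print(hex(mul_u24(0x01000001, 2)), hex(mul_u32(0x01000001, 2)))
```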
*	AMDGPU/GlobalISel: Select easy cases for G_BUILD_VECTOR	Matt Arsenault	2019-07-15	1	-0/+152
\| \| \| \|	llvm-svn: 366087
*	AMDGPU/GlobalISel: RegBankSelect for G_CONCAT_VECTORS	Matt Arsenault	2019-07-15	1	-0/+69
\| \| \| \|	llvm-svn: 366086
*	[AMDGPU] fixed scheduler crash in gfx908	Stanislav Mekhanoshin	2019-07-15	1	-0/+22
\| \| \| \| \| \| \| \| \|	For some reason the scheduler can send down an SUnit without an instruction. Differential Revision: https://reviews.llvm.org/D64709 llvm-svn: 366074
*	[AMDGPU] use v32f32 for 3 mfma intrinsics	Stanislav Mekhanoshin	2019-07-12	4	-44/+48
\| \| \| \| \| \| \| \| \|	These should really use v32f32, but were defined as v32i32 due to the lack of the v32f32 type. Differential Revision: https://reviews.llvm.org/D64667 llvm-svn: 365972
*	AMDGPU: Drop remnants of byval support for shaders	Matt Arsenault	2019-07-12	13	-38/+30
\| \| \| \| \| \| \| \|	Before 2018, mesa used to use byval interchangeably with inreg, which didn't really make sense. Fix tests still using it to avoid breaking in a future commit. llvm-svn: 365953
*	[AMDGPU] Fix DPP combiner check for exec modification	Jay Foad	2019-07-12	6	-2/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: r363675 changed the exec modification helper function, now called execMayBeModifiedBeforeUse, so that if no UseMI is specified it checks all instructions in the basic block, even beyond the last use. That meant that the DPP combiner no longer worked in any basic block that ended with a control flow instruction, and in particular it didn't work on code sequences generated by the atomic optimizer. Fix it by reinstating the old behaviour but in a new helper function execMayBeModifiedBeforeAnyUse, and limiting the number of instructions scanned. Reviewers: arsenm, vpykhtin Subscribers: kzhuravl, nemanjai, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kbarton, MaskRay, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64393 llvm-svn: 365910
*	[AMDGPU] Restrict v_cndmask_b32 abs/neg modifiers to f32	Jay Foad	2019-07-12	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: D64497 allowed abs/neg source modifiers on v_cndmask_b32 but it doesn't make any sense to apply them to f16 operands; they would interpret the bits of the value as an f32, giving nonsensical results. This patch restricts them to f32 operands. Reviewers: arsenm, hakzsam Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64636 llvm-svn: 365904
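A small sketch of why f32-style modifiers do not do anything sensible to an f16 operand. It assumes, for illustration only, that the f16 value sits in the low 16 bits of a 32-bit register and that the neg modifier flips bit 31; the helper names are hypothetical.

```python
import struct

F32_SIGN = 1 << 31   # sign bit position when the register is read as f32
F16_SIGN = 1 << 15   # sign bit position of an f16 in the low half


def f16_bits(x):
    """Encode a Python float as IEEE half-precision bits."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def f16_from_bits(bits):
    """Decode the low 16 bits of a register as an IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def neg_as_f32(reg):
    """Model of a neg modifier that treats the register as f32."""
    return reg ^ F32_SIGN


if __name__ == "__main__":
    reg = f16_bits(1.5)                              # f16 in the low half (assumption)
    print(f16_from_bits(neg_as_f32(reg)))            # still positive: wrong bit flipped
    print(f16_from_bits(reg ^ F16_SIGN))             # what an f16 negation would need
```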
*	[AMDGPU] Skip calculating callee saved registers for entry function.	Michael Liao	2019-07-11	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64596 llvm-svn: 365846
*	AMDGPU: s_waitcnt field should be treated as unsigned	Matt Arsenault	2019-07-11	1	-0/+12
\| \| \| \| \| \| \|	Also make it an ImmLeaf, so it should work with global isel as well, which was part of the point of moving it in the first place. llvm-svn: 365842
*	[AMDGPU] gfx908 agpr spilling	Stanislav Mekhanoshin	2019-07-11	2	-0/+396
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D64594 llvm-svn: 365833
*	[AMDGPU] gfx908 hazard recognizer	Stanislav Mekhanoshin	2019-07-11	1	-0/+457
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D64593 llvm-svn: 365829
*	[AMDGPU] gfx908 mfma support	Stanislav Mekhanoshin	2019-07-11	8	-2/+1724
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D64584 llvm-svn: 365824
*	[DAGCombine] narrowInsertExtractVectorBinOp - add CONCAT_VECTORS support	Simon Pilgrim	2019-07-11	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already split extract_subvector(binop(insert_subvector(v,x),insert_subvector(w,y))) -> binop(x,y). This patch adds support for extract_subvector(binop(concat_vectors(),concat_vectors())) cases as well. In particular this means we don't have to wait for X86 lowering to convert concat_vectors to insert_subvector chains, which helps avoid some cases where demandedelts/combine calls occur too late to split large vector ops. The fast-isel-store.ll load folding regression is annoying but I don't think it is that critical. Differential Revision: https://reviews.llvm.org/D63653 llvm-svn: 365785
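The concat_vectors case described above can be sketched on plain Python lists. This only models the algebraic identity for an element-wise binop, not the DAG combine itself; the helper names mirror the node names for readability, and narrowed_extract is a hypothetical name. The sketch also assumes the extracted subvector lines up exactly with one concat operand.

```python
import operator


def concat_vectors(*parts):
    """Join equally sized subvectors into one wide vector."""
    return [x for part in parts for x in part]


def binop(op, lhs, rhs):
    """Element-wise binary operation on two vectors of equal length."""
    return [op(a, b) for a, b in zip(lhs, rhs)]


def extract_subvector(vec, start, width):
    return vec[start:start + width]


def narrowed_extract(op, lhs_parts, rhs_parts, start, width):
    """Compute the extract on the narrow operands, or None if it does not line up."""
    part_width = len(lhs_parts[0])
    if width != part_width or start % part_width != 0:
        return None
    i = start // part_width
    return binop(op, lhs_parts[i], rhs_parts[i])


if __name__ == "__main__":
    lhs = [[1, 2], [3, 4]]
    rhs = [[10, 20], [30, 40]]
    wide = extract_subvector(
        binop(operator.add, concat_vectors(*lhs), concat_vectors(*rhs)), 2, 2)
    print(wide, narrowed_extract(operator.add, lhs, rhs, 2, 2))
```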
*	[AMDGPU] Regenerate idot tests. NFCI.	Simon Pilgrim	2019-07-11	3	-56/+56
\| \| \| \| \| \|	Reduces diff in D63281. llvm-svn: 365754
*	[AMDGPU] gfx908 atomic fadd and atomic pk_fadd	Stanislav Mekhanoshin	2019-07-11	1	-0/+72
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D64435 llvm-svn: 365717
*	[AMDGPU] gfx908 dot instruction support	Stanislav Mekhanoshin	2019-07-11	2	-0/+6
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D64431 llvm-svn: 365715
*	GlobalISel: Legalization for G_FMINNUM/G_FMAXNUM	Matt Arsenault	2019-07-10	2	-0/+1066
\| \| \| \|	llvm-svn: 365658
*	AMDGPU: Serialize mode from MachineFunctionInfo	Matt Arsenault	2019-07-10	1	-15/+9
\| \| \| \|	llvm-svn: 365653
*	[AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32	Jay Foad	2019-07-10	3	-40/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: D59191 added support for these modifiers in the assembler and disassembler. This patch just teaches instruction selection that it can use them. Reviewers: arsenm, tstellar Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64497 llvm-svn: 365640
*	AMDGPU/GlobalISel: Add support for wide loads >= 256-bits	Tom Stellard	2019-07-10	3	-0/+548
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds support for the most commonly used wide load types: <8xi32>, <16xi32>, <4xi64>, and <8xi64> Reviewers: arsenm Reviewed By: arsenm Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57399 llvm-svn: 365586
*	GlobalISel: Implement lower for G_FCOPYSIGN	Matt Arsenault	2019-07-09	2	-176/+671
\| \| \| \| \| \| \| \| \|	In SelectionDAG AMDGPU treated these as legal, but this was mostly because the bitcasts required for FP types were painful. Theoretically the bitpattern should eventually match to bfi, so don't bother trying to get the patterns to import. llvm-svn: 365583
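As an illustration of the bit pattern a copysign lowering reduces to, here is a minimal f32 sketch: keep the magnitude bits of the first operand and take the sign bit from the second. The helper names are hypothetical, and this models only the arithmetic, not the legalizer code.

```python
import struct

SIGN_MASK = 0x80000000
MASK32 = 0xFFFFFFFF


def f32_to_bits(x):
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits):
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def copysign_bits(mag, sign):
    """(mag & ~SIGN_MASK) | (sign & SIGN_MASK), done on the raw f32 bits."""
    magnitude = f32_to_bits(mag) & ~SIGN_MASK & MASK32
    sign_bit = f32_to_bits(sign) & SIGN_MASK
    return bits_to_f32(magnitude | sign_bit)


if __name__ == "__main__":
    print(copysign_bits(1.5, -0.0))
    print(copysign_bits(-2.0, 3.0))
```

The and/or pair is a bitfield insert of the sign bit into the magnitude operand, which is the kind of pattern the commit expects to match to bfi.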
*	AMDGPU/GlobalISel: Fix legality for G_BUILD_VECTOR	Matt Arsenault	2019-07-09	19	-192/+600
\| \| \| \|	llvm-svn: 365575
*	GlobalISel: Combine unmerge of merge with intermediate cast	Matt Arsenault	2019-07-09	1	-0/+484
\| \| \| \| \| \| \|	This eliminates some illegal intermediate vectors when operations are scalarized. llvm-svn: 365566
*	[AMDGPU] gfx908 register file changes	Stanislav Mekhanoshin	2019-07-09	1	-3/+3
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D64438 llvm-svn: 365546
*	[AMDGPU] gfx908 target	Stanislav Mekhanoshin	2019-07-09	3	-0/+12
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D64429 llvm-svn: 365525
*	AMDGPU: Fix test failing since r365512	Matt Arsenault	2019-07-09	1	-1/+1
\| \| \| \|	llvm-svn: 365521