bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU: Custom lower v2i32 loads and stores	Matt Arsenault	2016-05-02	1	-7/+39
	This will allow us to split up 64-bit private accesses when necessary. llvm-svn: 268296
*	AMDGPU/SI: Use v_readfirstlane_b32 when restoring SGPRs spilled to scratch	Tom Stellard	2016-05-02	1	-2/+1
	We were using v_readlane_b32 with the lane set to zero, but this won't work if thread 0 is not active. Differential Revision: http://reviews.llvm.org/D19745 llvm-svn: 268295
*	AMDGPU: Make i64 loads/stores promote to v2i32	Matt Arsenault	2016-05-02	2	-55/+12
	Now that unaligned access expansion should not attempt to produce i64 accesses, we can remove the hack in PreprocessISelDAG where this is done. This allows splitting i64 private accesses while allowing the new add nodes indexing the vector components to be folded with the base pointer arithmetic. llvm-svn: 268293
*	Fix instance of -Winconsistent-missing-override in AMDGPU code	Reid Kleckner	2016-05-02	1	-1/+1
	llvm-svn: 268289
*	AMDGPU/SI: Set the kill flag on temp VGPRs used to restore SGPRs from scratch	Tom Stellard	2016-05-02	1	-1/+1
	Summary: When we restore an SGPR value from scratch, we first load it into a temporary VGPR and then use v_readlane_b32 to copy the value from the VGPR back into an SGPR. We weren't setting the kill flag on the VGPR in the v_readlane_b32 instruction, so the register scavenger wasn't able to re-use this temp value later. I wasn't able to create a lit test for this. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19744 llvm-svn: 268287
*	AMDGPU: Move R600 specific code out of AMDGPUISelLowering.cpp	Tom Stellard	2016-05-02	3	-39/+51
	Reviewers: arsenm Subscribers: jvesely, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19736 llvm-svn: 268267
*	AMDGPU/SI: Fix bug in SIInstrInfo::insertWaitStates() uncovered by r268260	Tom Stellard	2016-05-02	1	-1/+2
	We can't use MI->getDebugLoc() when MI is an iterator that could be MBB.end(). llvm-svn: 268265
*	AMDGPU/SI: Use the hazard recognizer to break SMEM soft clauses	Tom Stellard	2016-05-02	3	-4/+72
	Summary: Add support for detecting hazards in SMEM soft clauses, so that we only break the clauses when necessary, either by adding s_nop or re-ordering other alu instructions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18870 llvm-svn: 268260
*	AMDGPU: llvm.SI.fs.constant is a source of divergence	Nicolai Haehnle	2016-05-02	1	-0/+1
	Summary: This intrinsic is used to get flat-shaded fragment shader inputs. Those are uniform across a primitive, but a fragment shader wave may process pixels from multiple primitives (as indicated by the prim_mask), and so that's where divergence can arise. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19747 llvm-svn: 268259
*	AMDGPU/SI: Use hazard recognizer to detect DPP hazards	Tom Stellard	2016-05-02	3	-55/+27
	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18603 llvm-svn: 268247
*	Silence unused variable warnings; NFC.	Aaron Ballman	2016-05-02	1	-9/+4
	llvm-svn: 268234
*	Add missing override.	Rafael Espindola	2016-04-30	1	-1/+2
	llvm-svn: 268163
*	AMDGPU/SI: Remove wait state handling for SMRD in SIInsertWaits	Tom Stellard	2016-04-30	1	-6/+0
	This was supposed to be part of r268143. llvm-svn: 268154
*	AMDGPU/SI: Enable the post-ra scheduler	Tom Stellard	2016-04-30	9	-18/+324
	Summary: This includes a hazard recognizer implementation to replace some of the hazard handling we had during frame index elimination. Reviewers: arsenm Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18602 llvm-svn: 268143
*	AMDGPU: Fix crash with unreachable terminators.	Matt Arsenault	2016-04-29	1	-12/+27
	If a block has no successors because it ends in unreachable, this was accessing an invalid iterator. Also stop counting instructions that don't emit any real instructions. llvm-svn: 268119
*	AMDGPU: Add kernarg.segment.ptr intrinsic	Matt Arsenault	2016-04-29	1	-0/+5
	llvm-svn: 268105
*	AMDGPU/SI: Move post regalloc run of SIShrinkInstructions	Matt Arsenault	2016-04-29	1	-5/+1
	Move to addPreEmitPass. This is so it runs after post-RA scheduling so we can merge s_nops emitted by the scheduler and hazard recognizer. llvm-svn: 268095
*	Fixed/Recommitted r267733 "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD."	Artem Tamazov	2016-04-29	6	-19/+39
	Previously reverted by r267752. r267733 review: Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 268066
*	AMDGPU/SI: Add offset field to ds_permute/ds_bpermute instructions	Tom Stellard	2016-04-29	3	-12/+8
	Summary: These instructions can add an immediate offset to the address, like other ds instructions. Reviewers: arsenm Subscribers: arsenm, scchan Differential Revision: http://reviews.llvm.org/D19233 llvm-svn: 268043
*	AMDGPU/SI: Assembler: Unify parsing/printing of operands.	Nikolay Haustov	2016-04-29	4	-622/+365
	Summary: The goal is for each operand type to have its own parse function and at the same time share common code for tracking state as different instruction types share operand types (e.g. glc/glc_flat, etc). Introduce parseAMDGPUOperand which can parse any optional operand. DPP and Clamp/OMod have custom handling for now. Sam also suggested to have class hierarchy for operand types instead of table. This can be done in separate change. Remove parseVOP3OptionalOps, parseDS*OptionalOps, parseFlatOptionalOps, parseMubufOptionalOps, parseDPPOptionalOps. Reduce number of definitions of AsmOperand's and MatchClasses' by using common base class. Rename AsmMatcher/InstPrinter methods accordingly. Print immediate type when printing parsed immediate operand. Use 'off' if offset/index register is unused instead of skipping it to make it more readable (also agreed with SP3). Update tests. Reviewers: tstellarAMD, SamWot, artem.tamazov Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19584 llvm-svn: 268015
*	AMDGPU: Stop reporting an addressing mode for unknown addrspace	Matt Arsenault	2016-04-29	1	-1/+6
	This was being treated the same as private, which has an immediate offset. For unknown, it probably means it's for a computation not actually being used for accessing memory, so it should not have a nontrivial addressing mode. llvm-svn: 268002
*	AMDGPU: Emit error if too much LDS is used	Matt Arsenault	2016-04-28	1	-0/+5
	llvm-svn: 267922
*	AMDGPU: Fix mishandling array allocations when promoting alloca	Matt Arsenault	2016-04-28	1	-1/+3
	The canonical form for allocas is a single allocation of the array type. In case we see a non-canonical array alloca, make sure we aren't replacing this with an array N times smaller. llvm-svn: 267916
*	[CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase.	Craig Topper	2016-04-28	1	-8/+2
	This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior. llvm-svn: 267853
*	AMDGPU: Account for globals in AMDGPUPromoteAlloca pass	Matt Arsenault	2016-04-27	1	-2/+4
	Patch by Bas Nieuwenhuizen llvm-svn: 267791
*	Revert "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD."	Chad Rosier	2016-04-27	6	-39/+13
	This reverts commit r267733 due to a -Werror,-Wunused-function error. llvm-svn: 267752
*	Silence a -Wdangling-else	Reid Kleckner	2016-04-27	1	-1/+2
	llvm-svn: 267737
*	[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD.	Artem Tamazov	2016-04-27	6	-13/+39
	Added support of TTMP quads. Reworked M0 exclusion machinery for SMRD and similar instructions to enable usage of TTMP registers in those instructions as destinations. Tests added. Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 267733
*	AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsic	Nicolai Haehnle	2016-04-27	2	-14/+78
	Summary: So it appears that to guarantee some of the ordering requirements of a GLSL memoryBarrier() executed in the shader, we need to emit an s_waitcnt. (We can't use an s_barrier, because memoryBarrier() may appear anywhere in the shader, in particular it may appear in non-uniform control flow.) Reviewers: arsenm, mareko, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19203 llvm-svn: 267729
*	[AMDGPU][llvm-mc] s_getreg/setreg* - Support symbolic names of hardware registers.	Artem Tamazov	2016-04-27	2	-13/+42
	Possibility to specify code of hardware register kept. Disassemble to symbolic name, if name is known. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19335 llvm-svn: 267724
*	[CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.	Ahmed Bougacha	2016-04-26	3	-36/+28
	Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606
*	[AMDGPU] Move reserved vgpr count for trap handler usage to SIMachineFunctionInfo + minor commenting changes	Konstantin Zhuravlyov	2016-04-26	6	-9/+20
	Differential Revision: http://reviews.llvm.org/D19537 llvm-svn: 267573
*	[AMDGPU] Reserve VGPRs for trap handler usage if instructed	Konstantin Zhuravlyov	2016-04-26	6	-1/+48
	Differential Revision: http://reviews.llvm.org/D19235 llvm-svn: 267563
*	[AMDGPU] Assembler: basic support for SDWA instructions	Sam Kolton	2016-04-26	8	-58/+414
	Support for SDWA instructions for VOP1 and VOP2 encoding. Not done yet: - converters to support optional operands and modifiers - VOPC - sext() modifier - intrinsics - VOP2b (see vop_dpp.s) - V_MAC_F32 (see vop_dpp.s) Differential Revision: http://reviews.llvm.org/D19360 llvm-svn: 267553
*	Add optimization bisect opt-in calls for AMDGPU passes	Andrew Kaylor	2016-04-25	7	-1/+19
	Differential Revision: http://reviews.llvm.org/D19450 llvm-svn: 267485
*	AMDGPU/SI: Optimize adjacent s_nop instructions	Matt Arsenault	2016-04-25	1	-0/+27
	Use the operand for how long to wait. This is somewhat distasteful, since it would be better to just emit s_nop with the right argument in the first place. This would require changing TII::insertNoop to emit N operands, which would be easy. Slightly more problematic is that the post-RA scheduler and hazard recognizer represent nops as a single null node, and would require inventing another way of representing N nops. llvm-svn: 267456
*	AMDGPU: Implement addrspacecast	Matt Arsenault	2016-04-25	5	-71/+124
	llvm-svn: 267452
*	AMDGPU: Add queue ptr intrinsic	Matt Arsenault	2016-04-25	4	-3/+18
	llvm-svn: 267451
*	AMDGPU: Add DAG to debug dump	Matt Arsenault	2016-04-25	1	-2/+2
	Also reorder case to match enum order llvm-svn: 267449
*	Fix incorrect redundant expression in target AMDGPU.	Etienne Bergeron	2016-04-25	1	-1/+1
	Summary: The expression is detected as a redundant expression. It turns out this is probably a bug. ``` /home/etienneb/llvm/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:306:26: warning: both side of operator are equivalent [misc-redundant-expression] if (isSMRD(FirstLdSt) && isSMRD(FirstLdSt)) { ``` Reviewers: rnk, tstellarAMD Subscribers: arsenm, cfe-commits Differential Revision: http://reviews.llvm.org/D19460 llvm-svn: 267415
*	[AMDGPU][llvm-mc] s_getreg/setreg* - Add hwreg(...) syntax.	Artem Tamazov	2016-04-25	5	-3/+127
	Added hwreg(reg[,offset,width]) syntax. Default offset = 0, default width = 32. Possibility to specify 16-bit immediate kept. Added out-of-range checks. Disassembling is always to hwreg(...) format. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19329 llvm-svn: 267410
*	Fix a couple assertions that can never fire because they just contained the text string which always evaluates to true.	Craig Topper	2016-04-24	1	-1/+1
	Add a ! so they'll evaluate to false. llvm-svn: 267312
*	AMDGPU: sext_inreg (srl x, K), vt -> bfe x, K, vt.Size	Matt Arsenault	2016-04-22	1	-0/+16
	llvm-svn: 267244
*	AMDGPU: Re-visit nodes in performAndCombine	Matt Arsenault	2016-04-22	1	-0/+5
	This fixes test regressions when i64 loads/stores are made to promote. llvm-svn: 267240
*	[AMDGPU] Insert nop pass: take care of outstanding feedback	Konstantin Zhuravlyov	2016-04-22	2	-21/+18
	- Switch a few loops to range-based for loops - Fix nop insertion at the end of BB - Fix formatting - Check for endpgm Differential Revision: http://reviews.llvm.org/D19380 llvm-svn: 267167
*	AMDGPU/SI: add llvm.amdgcn.ps.live intrinsic	Nicolai Haehnle	2016-04-22	4	-16/+55
	Summary: This intrinsic returns true if the current thread belongs to a live pixel and false if it belongs to a pixel that we are executing only for derivative computation. It will be used by Mesa to implement gl_HelperInvocation. Note that for pixels that are killed during the shader, this implementation also returns true, but it doesn't matter because those pixels are always disabled in the EXEC mask. This unearthed a corner case in the instruction verifier, which complained about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but correct code, so make the verifier accept it as such. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19191 llvm-svn: 267102
*	AMDGPU: Fix debug name of pass to better match	Matt Arsenault	2016-04-21	1	-1/+1
	I get this wrong every time I try to debug this. llvm-svn: 267030
*	Split IntrReadArgMem into IntrReadMem and IntrArgMemOnly	Nicolai Haehnle	2016-04-21	1	-1/+1
	Summary: IntrReadWriteArgMem simply becomes IntrArgMemOnly. So there are fewer intrinsic properties that express their orthogonality better, and correspond more closely to the corresponding IR attributes. Suggested by: Philip Reames Reviewers: joker.eph, reames, tstellarAMD Subscribers: jholewinski, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19291 llvm-svn: 267021
*	[AMDGPU] Assembler: prevent parseDPPCtrlOps from eating invalid tokens	Sam Kolton	2016-04-21	1	-2/+14
	Reviewers: nhaustov, tstellarAMD Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D19317 llvm-svn: 266984
*	AMDGPU/SI: Assembler: improvements to support trap handlers.	Nikolay Haustov	2016-04-20	2	-70/+124
	Add ParseAMDGPURegister which can be invoked recursively for parsing lists. Rename getRegForName to getSpecialRegForName. Support legacy SP3 register list syntax: [s2,s3,s4,s5] or [flat_scratch_lo,flat_scratch_hi]. Add 64-bit registers TBA, TMA where missing. Add some tests. Differential Revision: http://reviews.llvm.org/D19163 llvm-svn: 266865
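
Note on r267410 (hwreg(reg[,offset,width]) syntax): the entry in this log states the defaults (offset = 0, width = 32) and mentions out-of-range checks, but not how the three fields end up in the 16-bit immediate. The sketch below is one plausible packing and is not taken from the commit. The bit layout (id in bits [5:0], offset in bits [10:6], width minus one in bits [15:11]) is an assumption based on the published ISA encoding of s_getreg/s_setreg, and the names isValidHwreg, encodeHwreg and encodeHwregChecked are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMED field layout of the 16-bit immediate (not stated in this log):
//   id        -> bits [5:0]
//   offset    -> bits [10:6]
//   width - 1 -> bits [15:11]
constexpr unsigned IdBits = 6;
constexpr unsigned OffsetBits = 5;
constexpr unsigned OffsetShift = 6;
constexpr unsigned WidthShift = 11;

// Hypothetical range check in the spirit of the "out-of-range checks"
// the commit message mentions. Defaults follow the message: offset 0,
// width 32.
constexpr bool isValidHwreg(unsigned Id, unsigned Offset = 0,
                            unsigned Width = 32) {
  return Id < (1u << IdBits) && Offset < (1u << OffsetBits) &&
         Width >= 1 && Width <= 32;
}

// Hypothetical packer: assumes the arguments already passed isValidHwreg.
constexpr uint16_t encodeHwreg(unsigned Id, unsigned Offset = 0,
                               unsigned Width = 32) {
  return static_cast<uint16_t>(Id | (Offset << OffsetShift) |
                               ((Width - 1) << WidthShift));
}

// Runtime variant that asserts on out-of-range fields before packing.
inline uint16_t encodeHwregChecked(unsigned Id, unsigned Offset = 0,
                                   unsigned Width = 32) {
  assert(isValidHwreg(Id, Offset, Width) && "hwreg field out of range");
  return encodeHwreg(Id, Offset, Width);
}

// hwreg(1) with the default offset and width.
static_assert(encodeHwreg(1) == 0xF801, "id=1, offset=0, width=32");
// hwreg(6, 1, 4): explicit offset and width.
static_assert(encodeHwreg(6, 1, 4) == 0x1846, "id=6, offset=1, width=4");
```

Under this assumed layout, hwreg(1) packs to 0xF801 and hwreg(6, 1, 4) packs to 0x1846. Storing width minus one is what lets a 5-bit field express the full 1..32 range, which is why a default width of 32 is representable at all.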