bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[AMDGPU] Prevent spills before exec mask is restored	Stanislav Mekhanoshin	2017-01-20	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Inline spiller can decide to move a spill as early as possible in the basic block. It will skip phis and labels, but we also need to make sure it skips instructions in the basic block prologue which restore exec mask. Added isPositionLike callback in TargetInstrInfo to detect instructions which shall be skipped in addition to common phis, labels etc. Differential Revision: https://reviews.llvm.org/D27997 llvm-svn: 292554
*	AMDGPU: Fix asan errors when folding operands	Matt Arsenault	2016-12-10	1	-2/+2
\| \| \| \| \| \| \|	This was failing when trying to fold immediates into operand 1 of a phi, which only has one statically known operand. llvm-svn: 289337
*	AMDGPU: Fix handling of 16-bit immediates	Matt Arsenault	2016-12-10	1	-4/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type. Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER. The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already. llvm-svn: 289306
*	AMDGPU: Refactor exp instructions	Matt Arsenault	2016-12-05	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Structure the definitions a bit more like the other classes. The main change here is to split EXP with the done bit set to a separate opcode, so we can set mayLoad = 1 so that it won't be reordered before the other exp stores, since this has the special constraint that if the done bit is set then this should be the last exp in the shader. Previously all exp instructions were inferred to have unmodeled side effects. llvm-svn: 288695
*	AMDGPU/SI: Avoid creating unnecessary copies in the SIFixSGPRCopies pass	Tom Stellard	2016-11-16	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: 1. Don't try to copy values to and from the same register class. 2. Replace copies of registers with immediate values with v_mov/s_mov instructions. The main purpose of this change is to make MachineSink do a better job of determining when it is beneficial to split a critical edge, since the pass assumes that copies will become move instructions. This prevents a regression in uniform-cfg.ll if we enable critical edge splitting for AMDGPU. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23408 llvm-svn: 287131
*	AMDGPU: Workaround for instruction size with literals	Matt Arsenault	2016-11-01	1	-0/+8
\| \| \| \| \| \| \| \| \| \|	Instructions with a 32-bit base encoding with an optional 32-bit literal encoded after them report their size as 4 for the disassembler. Consider these when computing the MachineInstr size. This fixes problems caused by size estimate consistency in BranchRelaxation. llvm-svn: 285743
*	AMDGPU/SI: Don't use non-0 waitcnt values when waiting on Flat instructions	Tom Stellard	2016-10-28	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Flat instructions can return out of order, so we always need to wait for all the outstanding flat operations. Reviewers: tony-tye, arsenm Subscribers: kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D25998 llvm-svn: 285479
*	AMDGPU: Add definitions for scalar store instructions	Matt Arsenault	2016-10-28	1	-0/+10
\| \| \| \| \| \| \| \| \| \|	Also add glc bit to the scalar loads since they exist on VI and change the caching behavior. This currently has an assembler bug where the glc bit is incorrectly accepted on SI/CI which do not have it. llvm-svn: 285463
*	[AMDGPU] Emit 32-bit lo/hi got and pc relative variant kinds for external and global address space variables	Konstantin Zhuravlyov	2016-10-14	1	-1/+12
\| \| \| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D25562 llvm-svn: 284196
*	BranchRelaxation: Support expanding unconditional branches	Matt Arsenault	2016-10-06	1	-0/+24
\| \| \| \| \| \| \|	AMDGPU needs to expand unconditional branches in a new block with an indirect branch. llvm-svn: 283464
*	AMDGPU: Use SOPK compare instructions	Matt Arsenault	2016-09-16	1	-0/+11
\| \| \| \|	llvm-svn: 281780
*	Finish renaming remaining analyzeBranch functions	Matt Arsenault	2016-09-14	1	-2/+2
\| \| \| \|	llvm-svn: 281535
*	Revert "AMDGPU: Use SOPK compare instructions"	Matt Arsenault	2016-09-14	1	-11/+0
\| \| \| \| \| \|	Accidentally committed llvm-svn: 281514
*	AMDGPU: Use SOPK compare instructions	Matt Arsenault	2016-09-14	1	-0/+11
\| \| \| \|	llvm-svn: 281513
*	Make analyzeBranch family of instruction names consistent	Matt Arsenault	2016-09-14	1	-1/+1
\| \| \| \| \| \| \|	analyzeBranch was renamed to start with a lowercase letter; rename the related set to match. llvm-svn: 281506
*	AArch64: Use TTI branch functions in branch relaxation	Matt Arsenault	2016-09-14	1	-2/+4
\| \| \| \| \| \| \| \| \|	The main change is to return the code size from InsertBranch/RemoveBranch. Patch mostly by Tim Northover llvm-svn: 281505
*	AMDGPU: Implement is{LoadFrom\|StoreTo}FrameIndex	Matt Arsenault	2016-09-10	1	-0/+16
\| \| \| \|	llvm-svn: 281128
*	AMDGPU: Support commuting with immediate in src0	Matt Arsenault	2016-09-08	1	-1/+10
\| \| \| \|	llvm-svn: 280970
*	AMDGPU: Fix not estimating MBB operand sizes correctly	Matt Arsenault	2016-08-13	1	-0/+6
\| \| \| \|	llvm-svn: 278590
*	AMDGPU: Stay in WQM for non-intrinsic stores	Nicolai Haehnle	2016-08-02	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Two types of stores are possible in pixel shaders: stores to memory that are explicitly requested at the API level, and stores that are an implementation detail of register spilling or lowering of arrays. For the first kind of store, we must ensure that helper pixels have no effect and hence WQM must be disabled. The second kind of store must always be executed, because the written value may be loaded again in a way that is relevant for helper pixels as well -- and there are no externally visible effects anyway. This is a candidate for the 3.9 release branch. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D22675 llvm-svn: 277504
*	TargetInstrInfo: add virtual function getInstSizeInBytes	Sjoerd Meijer	2016-07-29	1	-1/+1
\| \| \| \| \| \| \| \| \|	This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of subclasses already implement. Differential Revision: https://reviews.llvm.org/D22885 llvm-svn: 277126
*	Rename AnalyzeBranch* to analyzeBranch*.	Jacques Pienaar	2016-07-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect. Reviewers: tstellarAMD, mcrosier Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai Differential Revision: https://reviews.llvm.org/D22409 llvm-svn: 275564
*	AMDGPU/SI: Add support for R_AMDGPU_GOTPCREL	Tom Stellard	2016-07-13	1	-0/+6
\| \| \| \| \| \| \| \| \| \|	Reviewers: rafael, ruiu, tony-tye, arsenm, kzhuravl Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21484 llvm-svn: 275268
*	AMDGPU: WQM cleanups	Matt Arsenault	2016-07-13	1	-0/+8
\| \| \| \| \| \| \| \|	- Add new TTI instruction checks - Don't use const for blocks that are mutated. - Checking isBranch and isTerminator should be redundant llvm-svn: 275252
*	AMDGPU: Treat texture gather instructions more like other MIMG instructions	Nicolai Haehnle	2016-07-11	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Setting MIMG to 0 has a bunch of unexpected side effects, including that isVMEM returns false which leads to incorrect treatment in the hazard recognizer. The reason I noticed it is that it also leads to incorrect treatment in VGPR-to-SGPR copies, which is one cause of the referenced bug. The only reason why MIMG was set to 0 is to signal the special handling of dmasks, but that can be checked differently. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96877 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22210 llvm-svn: 275113
*	AMDGPU: Move R600 only pieces into R600 classes	Matt Arsenault	2016-07-09	1	-2/+0
\| \| \| \|	llvm-svn: 274979
*	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC	Duncan P. N. Exon Smith	2016-06-30	1	-50/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr*` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189
*	AMDGPU: Remove unused function	Matt Arsenault	2016-06-28	1	-6/+0
\| \| \| \|	llvm-svn: 274033
*	AMDGPU: Define a schedule class for COPY.	Matthias Braun	2016-06-24	1	-0/+8
\| \| \| \| \| \| \| \| \| \|	COPY was lacking a scheduling class, define it to avoid regressions in the upcoming change to the bidirectional MachineScheduler. Approved by tstellar on IRC. Differential Revision: http://reviews.llvm.org/D21540 llvm-svn: 273751
*	AMDGPU: Cleanup subtarget handling.	Matt Arsenault	2016-06-24	1	-2/+3
\| \| \| \| \| \| \| \| \|	Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
*	AMDGPU/SI: Set INDEX_STRIDE for scratch coalescing	Marek Olsak	2016-06-13	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Mesa and other users must set this to enable coalescing: - STRIDE = 0 - SWIZZLE_ENABLE = 1 This makes one particular compute shader 8x faster. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21136 llvm-svn: 272556
*	Pass DebugLoc and SDLoc by const ref.	Benjamin Kramer	2016-06-12	1	-4/+3
\| \| \| \| \| \| \| \|	This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
*	AMDGPU: Add function for getting instruction size	Matt Arsenault	2016-06-06	1	-0/+2
\| \| \| \|	llvm-svn: 271936
*	AMDGPU: Handle cbranch vccz/vccnz	Matt Arsenault	2016-05-21	1	-1/+5
\| \| \| \|	llvm-svn: 270297
*	AMDGPU: Implement ReverseBranchCondition	Matt Arsenault	2016-05-21	1	-0/+3
\| \| \| \|	llvm-svn: 270296
*	AMDGPU: Implement AnalyzeBranch	Matt Arsenault	2016-05-21	1	-1/+21
\| \| \| \| \| \|	Original patch by Tom Stellard llvm-svn: 270295
*	Add missing override.	Rafael Espindola	2016-04-30	1	-1/+2
\| \| \| \|	llvm-svn: 268163
*	AMDGPU/SI: Enable the post-ra scheduler	Tom Stellard	2016-04-30	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This includes a hazard recognizer implementation to replace some of the hazard handling we had during frame index elimination. Reviewers: arsenm Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18602 llvm-svn: 268143
*	[MachineScheduler]Add support for store clustering	Jun Bum Lim	2016-04-15	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Perform store clustering just like load clustering. This change adds StoreClusterMutation in machine-scheduler. To control StoreClusterMutation, added enableClusterStores() in TargetInstrInfo.h. This is enabled only on AArch64 for now. This change also adds support for unscaled stores which were not handled in getMemOpBaseRegImmOfs(). llvm-svn: 266437
*	AMDGPU/SI: Add MachineBasicBlock parameter to SIInstrInfo::insertWaitStates	Tom Stellard	2016-04-07	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This makes it possible to insert nops at the end of blocks. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18549 llvm-svn: 265678
*	AMDGPU: Add SIWholeQuadMode pass	Nicolai Haehnle	2016-03-21	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Whole quad mode is already enabled for pixel shaders that compute derivatives, but it must be suspended for instructions that cause a shader to have side effects (i.e. stores and atomics). This pass addresses the issue by storing the real (initial) live mask in a register, masking EXEC before instructions that require exact execution and (re-)enabling WQM where required. This pass is run before register coalescing so that we can use machine SSA for analysis. The changes in this patch expose a problem with the second machine scheduling pass: target independent instructions like COPY implicitly use EXEC when they operate on VGPRs, but this fact is not encoded in the MIR. This can lead to miscompilation because instructions are moved past changes to EXEC. This patch fixes the problem by adding use-implicit operands to target independent instructions. Some general codegen passes are relaxed to work with such implicit use operands. Reviewers: arsenm, tstellarAMD, mareko Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18162 llvm-svn: 263982
*	AMDGPU/SI: Handle wait states required for DPP instructions	Tom Stellard	2016-03-14	1	-0/+8
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17543 llvm-svn: 263447
*	AMDGPU: R600 code splitting cleanup	Matt Arsenault	2016-03-11	1	-3/+3
\| \| \| \| \| \| \|	Move a few functions only used by R600 to R600 specific code, fix header macros to stop using R600, mark classes as final. llvm-svn: 263204
*	[TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC.	Chad Rosier	2016-03-09	1	-1/+1
\| \| \| \| \| \|	http://reviews.llvm.org/D17967 llvm-svn: 263021
*	AMDGPU/SI: Use v_readfirstlane to legalize SMRD with VGPR base pointer	Tom Stellard	2016-02-20	1	-9/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Instead of trying to replace SMRD instructions with a VGPR base pointer with an equivalent MUBUF instruction, we now copy the base pointer to SGPRs using v_readfirstlane. This is safe to do, because any load selected as an SMRD instruction has been proven to have a uniform base pointer, so each thread in the wave will have the same pointer value in VGPRs. This will fix some errors on VI from trying to replace SMRD instructions with addr64-enabled MUBUF instructions that don't exist. Reviewers: arsenm, cfang, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17305 llvm-svn: 261385
*	AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions	Tom Stellard	2016-02-12	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16603 llvm-svn: 260765
*	AMDGPU: Set element_size in private resource descriptor	Matt Arsenault	2016-02-12	1	-1/+1
\| \| \| \| \| \| \| \| \|	Introduce a subtarget feature for this, and leave the default with the current behavior which assumes up to 16-byte loads/stores can be used. The field also seems to have the ability to be set to 2 bytes, but I'm not sure what that would be used for. llvm-svn: 260651
*	AMDGPU/SI: Make sure MIMG descriptors and samplers stay in SGPRs	Tom Stellard	2016-02-11	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It's possible to have resource descriptors and samplers stored in VGPRs, either by a VMEM instruction or in the case of samplers, floating-point calculations. When this happens, we need to use v_readfirstlane to copy these values back to sgprs. Reviewers: mareko, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17102 llvm-svn: 260599
*	AMDGPU: Remove some purely R600 functions from AMDGPUInstrInfo	Tom Stellard	2016-02-05	1	-14/+0
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16862 llvm-svn: 259900
*	AMDGPU: Move subtarget specific code out of AMDGPUInstrInfo.cpp	Tom Stellard	2016-01-28	1	-5/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Also delete all the stub functions that are identical to the implementations in TargetInstrInfo.cpp. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16609 llvm-svn: 259054