bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Fix typo	Matt Arsenault	2017-04-18	1	-1/+1
\| \| \| \|	llvm-svn: 300597
*	[AMDGPU] added SIInstrInfo::getAddNoCarry() helper	Stanislav Mekhanoshin	2017-04-14	1	-3/+1
\| \| \| \| \| \| \| \|	Addressed rest of post submit comments from D31993. Differential Revision: https://reviews.llvm.org/D32057 llvm-svn: 300288
*	Revert "Correct register pressure calculation in presence of subregs"	Stanislav Mekhanoshin	2017-02-24	1	-16/+0
\| \| \| \| \| \| \| \|	This reverts commit r296009. It broke one out of tree target and also does not account for all partial lines added or removed when calculating PressureDiff. llvm-svn: 296182
*	Correct register pressure calculation in presence of subregs	Stanislav Mekhanoshin	2017-02-23	1	-0/+16
\| \| \| \| \| \| \| \| \| \|	If a subreg is used in an instruction it counts as a whole superreg for the purpose of register pressure calculation. This patch corrects improper register pressure calculation by examining operand's lane mask. Differential Revision: https://reviews.llvm.org/D29835 llvm-svn: 296009
*	AMDGPU: Don't use stack space for SGPR->VGPR spills	Matt Arsenault	2017-02-21	1	-23/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before frame offsets are calculated, try to eliminate the frame indexes used by SGPR spills. Then we can delete them after. I think for now we can be sure that no other instruction will be re-using the same frame indexes. It should be easy to notice if this assumption ever breaks since everything asserts if it tries to use a dead frame index later. The unused emergency stack slot seems to still be left behind, so an additional 4 bytes is still wasted. llvm-svn: 295753
*	AMDGPU: Merge initial gfx9 support	Matt Arsenault	2017-02-18	1	-0/+6
\| \| \| \|	llvm-svn: 295554
*	[AMDGPU] Override PSet for M0	Stanislav Mekhanoshin	2017-02-10	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \|	This change returns empty PSet list for M0 register. Otherwise its PSet as defined by tablegen is SReg_32. This results in incorrect register pressure calculation every time an instruction uses M0. Such uses count as SReg_32 PSet and inadequately increase pressure on SGPRs. Differential Revision: https://reviews.llvm.org/D29798 llvm-svn: 294691
*	[AMDGPU] Implement register pressure callbacks	Stanislav Mekhanoshin	2017-02-08	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement getRegPressureLimit and getRegPressureSetLimit callbacks in SIRegisterInfo. This makes the standard converge scheduler behave almost the same as GCNScheduler, sometimes slightly better, sometimes a bit worse. In general it is also possible to switch GCNScheduler to use these callbacks instead of getMaxWaves(), which also makes GCNScheduler slightly better on some tests and slightly worse on others. A big win is the behavior with the converge scheduler. Note that these are used not only by scheduling, but also in places like MachineLICM. Differential Revision: https://reviews.llvm.org/D29700 llvm-svn: 294518
*	[AMDGPU] Move register related queries to subtarget class	Konstantin Zhuravlyov	2017-02-08	1	-208/+10
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D29318 llvm-svn: 294440
*	AMDGPU add support for spilling to a user sgpr pointed buffers	Tom Stellard	2017-01-25	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1]. Patch By: Dave Airlie Reviewers: nhaehnle, arsenm, tstellarAMD Reviewed By: arsenm Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25428 llvm-svn: 293000
*	[AMDGPU] Do not allow register coalescer to create big superregs	Stanislav Mekhanoshin	2017-01-18	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Limit register coalescer by not allowing it to artificially increase size of registers beyond dword. Such super-registers are in fact register sequences and not distinct HW registers. With more super-regs we would need to allocate adjacent registers and constrain regalloc more than needed. Moreover, our super registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2, VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates register allocation even more, resulting in excessive spilling. Differential Revision: https://reviews.llvm.org/D28782 llvm-svn: 292413
*	[CodeGen] Rename MachineInstrBuilder::addOperand. NFC	Diana Picus	2017-01-13	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \|	Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891
*	Extract LaneBitmask into a separate type	Krzysztof Parzyszek	2016-12-15	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Specifically avoid implicit conversions from/to integral types to avoid potential errors when changing the underlying type. For example, a typical initialization of a "full" mask was "LaneMask = ~0u", which would result in a value of 0x00000000FFFFFFFF if the type was extended to uint64_t. Differential Revision: https://reviews.llvm.org/D27454 llvm-svn: 289820
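The pitfall described in this commit message can be reproduced in a few lines. The following is a minimal sketch, assuming a 32-bit `unsigned`; the `Mask` class and `widenedAllOnes` are hypothetical illustrations and are not LLVM's actual `LaneBitmask` implementation.

```cpp
#include <cstdint>

// The pitfall: ~0u is computed as a 32-bit unsigned value first, so
// converting it to a 64-bit type zero-extends instead of setting all bits.
constexpr std::uint64_t widenedAllOnes() { return ~0u; }

// Hypothetical wrapper type (illustration only, not LLVM's LaneBitmask).
// The explicit constructor blocks implicit conversions from integers, so a
// "full" mask has to be requested by name rather than spelled as ~0u.
class Mask {
public:
  using Type = std::uint64_t;

  constexpr Mask() = default;
  explicit constexpr Mask(Type V) : Bits(V) {}

  // All bits set, computed in the underlying type.
  static constexpr Mask getAll() { return Mask(~Type(0)); }
  static constexpr Mask getNone() { return Mask(0); }

  constexpr Type getAsInteger() const { return Bits; }
  constexpr bool operator==(Mask Other) const { return Bits == Other.Bits; }

private:
  Type Bits = 0;
};
```

With a type like this, `Mask M = ~0u;` fails to compile, while `Mask::getAll()` stays correct if `Type` is widened later.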
*	AMDGPU: Fix handling of 16-bit immediates	Matt Arsenault	2016-12-10	1	-13/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type. Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER. The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already. llvm-svn: 289306
*	AMDGPU/SI: Don't reserve XNACK when it's disabled	Marek Olsak	2016-12-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This frees 2 additional scalar registers. These are results from all of my 3 patches combined: Polaris: Spilled SGPRs: 2231 -> 1517 (-32.00 %) Tonga: Spilled SGPRs: 3829 -> 2608 (-31.89 %) Spilled VGPRs: 100 -> 84 (-16.00 %) Tonga even spills SGPRs via VGPRs to scratch. That's a compute shader limited to 64 VGPRs. Reviewers: tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27151 llvm-svn: 289262
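The percentages quoted above follow directly from the before and after spill counts. A small sketch of that arithmetic, using a hypothetical helper that returns the reduction in hundredths of a percent so it can stay in integer math:

```cpp
// Hypothetical helper (not part of LLVM): reduction from Before to After,
// expressed in hundredths of a percent and rounded to the nearest unit.
constexpr unsigned long long
reductionHundredthsOfPercent(unsigned long long Before,
                             unsigned long long After) {
  return ((Before - After) * 10000 + Before / 2) / Before;
}
```

For the numbers in the message, 2231 -> 1517 gives 3200 (32.00 %), 3829 -> 2608 gives 3189 (31.89 %), and 100 -> 84 gives 1600 (16.00 %).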
*	AMDGPU/SI: Don't reserve FLAT_SCR on non-HSA targets & without stack objects	Marek Olsak	2016-12-09	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This frees 2 scalar registers. Reviewers: tstellarAMD Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27150 llvm-svn: 289261
*	AMDGPU/SI: Allow using SGPRs 96-101 on VI	Marek Olsak	2016-12-09	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: There is no point in setting SGPRS=104, because VI allocates SGPRs in multiples of 16, so 104 -> 112. That enables us to use all 102 SGPRs for general purposes. Reviewers: tstellarAMD Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27149 llvm-svn: 289260
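The reasoning about allocation granularity can be checked with a round-up computation. This is a sketch under the assumption stated in the message (SGPRs on VI are allocated in multiples of 16); the function name `allocatedSGPRs` is hypothetical and does not correspond to an LLVM API.

```cpp
// Granularity taken from the commit message: VI allocates SGPRs in
// multiples of 16.
constexpr unsigned SGPRAllocGranule = 16;

// Hypothetical helper: round a requested SGPR count up to the next
// multiple of the allocation granule.
constexpr unsigned allocatedSGPRs(unsigned Requested) {
  return (Requested + SGPRAllocGranule - 1) / SGPRAllocGranule *
         SGPRAllocGranule;
}
```

Both a request of 102 and a request of 104 round up to 112, while a request of 96 stays at 96. Any request in the range 97 to 112 therefore costs the same allocation, which is why limiting general-purpose use to 96 registers gains nothing.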
*	[AMDGPU] Fix number of reserved SGPRs on CI to reflect flat scratch use	Stanislav Mekhanoshin	2016-12-08	1	-0/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D27225 llvm-svn: 289095
*	AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg	Nicolai Haehnle	2016-12-08	1	-5/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Without the fix to isFrameOffsetLegal to consider the instruction's immediate offset, the new test case hits the corresponding assertion in resolveFrameIndex, because the LocalStackSlotAllocation pass re-uses a different base register. With only the fix to isFrameOffsetLegal, code quality reduces in a bunch of places because frame base registers are added where they're not needed. This is addressed by properly implementing needsFrameBaseReg, which also helps to avoid unnecessary zero frame indices in a bunch of other places. Fixes piglit glsl-1.50/execution/variable-indexing/gs-output-array-vec4-index-wr.shader_test Reviewers: arsenm, tstellarAMD Subscribers: qcolombet, kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D27344 llvm-svn: 289048
*	AMDGPU: remove a couple of unused variables	Saleem Abdulrasool	2016-12-03	1	-14/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lib/Target/AMDGPU/SIRegisterInfo.cpp: In member function 'void llvm::SIRegisterInfo::spillSGPR(llvm::MachineBasicBlock::iterator, int, llvm::RegScavenger*) const': lib/Target/AMDGPU/SIRegisterInfo.cpp:572:30: warning: variable 'SubRC' set but not used [-Wunused-but-set-variable] const TargetRegisterClass *SubRC = nullptr; ^ lib/Target/AMDGPU/SIRegisterInfo.cpp: In member function 'void llvm::SIRegisterInfo::restoreSGPR(llvm::MachineBasicBlock::iterator, int, llvm::RegScavenger*) const': lib/Target/AMDGPU/SIRegisterInfo.cpp:723:30: warning: variable 'SubRC' set but not used [-Wunused-but-set-variable] const TargetRegisterClass *SubRC = nullptr; ^ The variable was assigned to, but never used. The functions called did not mutate state. Simplify the logic and remove the variable. Identified by gcc 5.4.0. llvm-svn: 288601
*	AMDGPU: Use wider scalar spills for SGPR spilling	Matt Arsenault	2016-12-02	1	-15/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the spill is for the whole wave, these don't have the swizzling problems that vector stores do and a single 4-byte allocation is enough to spill a 64 element register. This should reduce the number of spill instructions and put all the spills for a register in the same cacheline. This should save allocated private size, but for now it doesn't. The extra slots are allocated for each component, but never used because the frame layout is essentially finalized before frame indices are replaced. For always using the scalar store path, this should probably be moved into processFunctionBeforeFrameFinalized. llvm-svn: 288445
*	AMDGPU: Materialize frame index before add	Matt Arsenault	2016-11-29	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \|	It isn't generally safe to fold the frame index directly into the operand since it will possibly not be an inline immediate after it is expanded. This surprisingly seems to produce better code, since the FI doesn't prevent folding other immediate operands. llvm-svn: 288185
*	AMDGPU/SI: Add back reverted SGPR spilling code, but disable it	Marek Olsak	2016-11-25	1	-75/+200
\| \| \| \| \| \|	suggested as a better solution by Matt llvm-svn: 287942
*	Revert "AMDGPU: Implement SGPR spilling with scalar stores"	Marek Olsak	2016-11-25	1	-99/+7
\| \| \| \| \| \|	This reverts commit 4404d0d6e354e80dd7f8f0a0e12d8ad809cf007e. llvm-svn: 287936
*	Revert "AMDGPU: Fix MMO when splitting spill"	Marek Olsak	2016-11-25	1	-71/+44
\| \| \| \| \| \|	This reverts commit 79d4f8b8b1ce430c3d5dac4fc72a9eebaed24fe1. llvm-svn: 287935
*	Revert "AMDGPU: Fix adding extra implicit def of register"	Marek Olsak	2016-11-25	1	-25/+14
\| \| \| \| \| \|	This reverts commit e834ce5976567575621901fb967b8018b9916d71. llvm-svn: 287934
*	Revert "AMDGPU: Fix not setting kill flag on temp reg when spilling"	Marek Olsak	2016-11-25	1	-1/+1
\| \| \| \| \| \|	This reverts commit 057bbbe4ae170247ba37f08f2e70ef185267d1bb. llvm-svn: 287933
*	Revert "AMDGPU: Make m0 unallocatable"	Marek Olsak	2016-11-25	1	-1/+1
\| \| \| \| \| \|	This reverts commit 124ad83dae04514f943902446520c859adee0e96. llvm-svn: 287932
*	Revert "AMDGPU: Remove m0 spilling code"	Marek Olsak	2016-11-25	1	-3/+37
\| \| \| \| \| \|	This reverts commit f18de36554eb22416f8ba58e094e0272523a4301. llvm-svn: 287931
*	Revert "AMDGPU: Preserve m0 value when spilling"	Marek Olsak	2016-11-25	1	-34/+5
\| \| \| \| \| \|	This reverts commit a5a179ffd94fd4136df461ec76fb30f04afa87ce. llvm-svn: 287930
*	AMDGPU: Preserve m0 value when spilling	Matt Arsenault	2016-11-24	1	-5/+34
\| \| \| \|	llvm-svn: 287844
*	TRI: Add hook to pass scavenger during frame elimination	Matt Arsenault	2016-11-24	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \|	The scavenger was not passed if requiresFrameIndexScavenging was enabled. I need to be able to test for the availability of an unallocatable register here, so I can't create a virtual register for it. It might be better to just always use the scavenger and stop creating virtual registers. llvm-svn: 287843
*	AMDGPU: Remove m0 spilling code	Matt Arsenault	2016-11-24	1	-37/+3
\| \| \| \| \| \|	Since m0 isn't allocatable it should never be spilled anymore. llvm-svn: 287842
*	AMDGPU: Make m0 unallocatable	Matt Arsenault	2016-11-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	m0 may need to be written for spill code, so we don't want general code uses relying on the value stored in it. This introduces a few code quality regressions where copies from m0 are not coalesced into copies of a copy of m0. llvm-svn: 287841
*	AMDGPU: Fix not setting kill flag on temp reg when spilling	Matt Arsenault	2016-11-23	1	-1/+1
\| \| \| \|	llvm-svn: 287808
*	AMDGPU: Fix adding extra implicit def of register	Matt Arsenault	2016-11-23	1	-14/+25
\| \| \| \| \| \| \|	In the scalar case, there's no reason to add an additional def of the same register. llvm-svn: 287807
*	AMDGPU: Fix MMO when splitting spill	Matt Arsenault	2016-11-23	1	-44/+71
\| \| \| \| \| \| \| \| \| \|	The size and offset were wrong. The size of the object was being used for the size of the access, when here it is really being split into 4-byte accesses. The underlying object size is set in the MachinePointerInfo, which also didn't have the offset set. llvm-svn: 287806
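The fix described here amounts to giving each piece of a split spill its own 4-byte access size and its own offset, rather than the size of the whole object. A minimal sketch of that bookkeeping, with hypothetical names (`SubAccess`, `numSubAccesses`, `subAccess`) that stand in for the real memory-operand handling and are not LLVM APIs:

```cpp
// Size of one split access in bytes, per the commit message.
constexpr unsigned DwordSize = 4;

// Hypothetical record for one piece of a split spill.
struct SubAccess {
  unsigned Offset; // byte offset of this piece within the spilled object
  unsigned Size;   // size of this access, not of the whole object
};

// Number of 4-byte pieces an object of ObjectSize bytes is split into.
constexpr unsigned numSubAccesses(unsigned ObjectSize) {
  return ObjectSize / DwordSize;
}

// Offset and size for the piece at position Index.
constexpr SubAccess subAccess(unsigned Index) {
  return SubAccess{Index * DwordSize, DwordSize};
}
```

For a 16-byte object this yields four accesses at offsets 0, 4, 8 and 12, each of size 4. The bug described above corresponds to recording size 16 and offset 0 for every piece.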
*	Fix spelling mistakes in AMDGPU target comments. NFC.	Simon Pilgrim	2016-11-18	1	-1/+1
\| \| \| \| \| \|	Identified by Pedro Giffuni in PR27636. llvm-svn: 287333
*	AMDGPU/SI: Avoid creating unnecessary copies in the SIFixSGPRCopies pass	Tom Stellard	2016-11-16	1	-11/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: 1. Don't try to copy values to and from the same register class. 2. Replace copies of registers with immediate values with v_mov/s_mov instructions. The main purpose of this change is to make MachineSink do a better job of determining when it is beneficial to split a critical edge, since the pass assumes that copies will become move instructions. This prevents a regression in uniform-cfg.ll if we enable critical edge splitting for AMDGPU. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23408 llvm-svn: 287131
*	AMDGPU: Implement SGPR spilling with scalar stores	Matt Arsenault	2016-11-13	1	-7/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This avoids the nasty problems caused by using memory instructions that read the exec mask while spilling / restoring registers used for control flow masking, but only for VI when these were added. This always uses the scalar stores when enabled currently, but it may be better to still try to spill to a VGPR and use this on the fallback memory path. The cache also needs to be flushed before wave termination if a scalar store is used. llvm-svn: 286766
*	AMDGPU: Try to fix (non-clang?) bot builds	Matt Arsenault	2016-11-07	1	-10/+10
\| \| \| \|	llvm-svn: 286120
*	AMDGPU: Refactor copyPhysReg	Matt Arsenault	2016-11-07	1	-0/+103
\| \| \| \| \| \|	Separate the subregister splitting logic to re-use later. llvm-svn: 286118
*	AMDGPU: Stop creating unused virtual registers	Matt Arsenault	2016-11-01	1	-2/+5
\| \| \| \| \| \| \|	These are only used in the spill to VMEM path. Move them to the one use. llvm-svn: 285756
*	AMDGPU: Fix using incorrect private resource with no allocation	Matt Arsenault	2016-10-28	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \|	It's possible to have a use of the private resource descriptor or scratch wave offset registers even though there are no allocated stack objects. This would result in continuing to use the maximum number of reserved registers. This could go over the number of SGPRs available on VI, or violate the SGPR limit requested by the function attributes. llvm-svn: 285435
*	Reapply "AMDGPU: Don't use offen if it is 0"	Matt Arsenault	2016-10-26	1	-9/+95
\| \| \| \| \| \|	This reverts r283003 llvm-svn: 285203
*	AMDGPU: Fix use-after-frees	Nicolai Haehnle	2016-10-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25312 llvm-svn: 284215
*	AMDGPU: Do not re-use tmpreg in spill/restore lowering	Matthias Braun	2016-10-05	1	-2/+2
\| \| \| \| \| \| \| \| \|	The register scavenging code does not support multiple definitions of the same vreg. Differential Revision: https://reviews.llvm.org/D25220 llvm-svn: 283369
*	AMDGPU: Factor SGPR spilling into separate functions	Matt Arsenault	2016-10-04	1	-129/+160
\| \| \| \|	llvm-svn: 283175
*	AMDGPU: Fix typo	Matt Arsenault	2016-10-03	1	-1/+1
\| \| \| \|	llvm-svn: 283108
*	Revert "AMDGPU: Don't use offen if it is 0"	Mehdi Amini	2016-10-01	1	-95/+9
\| \| \| \| \| \| \|	This reverts commit r282999. Tests are not passing: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/20038 llvm-svn: 283003