path: root/llvm/test/CodeGen/AMDGPU/spill-m0.ll
* [AMDGPU] Remove -amdgpu-spill-sgpr-to-smem. (Jay Foad, 2019-10-18, 1 file, -130/+2)

    Summary: The implementation was never completed and never used except
    in tests.

    Reviewers: arsenm, mareko

    Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl,
    dstuttard, tpr, t-tye, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D69163

    llvm-svn: 375293
* AMDGPU: Don't use frame virtual registers (Matt Arsenault, 2019-08-29, 1 file, -4/+4)

    SGPR spills aren't really handled after SILowerSGPRSpills. In order to
    directly control what happens if the scavenger needs to spill, the
    scavenger needs to be used directly. There is an alternative to
    spilling in these contexts anyway, since the frame register can be
    incremented and restored.

    This does present another possible issue if spilling is needed for the
    unused carry out when an add is needed. I think this can be avoided by
    using a scalar add (although that clobbers SCC, which happens anyway).

    llvm-svn: 370281
* AMDGPU: Fix capitalized register names in asm constraints (Matt Arsenault, 2019-06-14, 1 file, -6/+6)

    This was a workaround a long time ago, but the canonical lower case
    names work now.

    llvm-svn: 363459
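For reference, this is the constraint style the change settles on in spill-m0.ll; a minimal sketch (the asm body is illustrative):

    define amdgpu_kernel void @init_m0() {
      ; Lowercase physical register names in the constraint string; the
      ; uppercase spelling "={M0}" was the old workaround this removes.
      %m0 = call i32 asm sideeffect "s_mov_b32 m0, -1", "={m0}"()
      ret void
    }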
* RegAllocFast: Leave unassigned virtreg entries in map (Matthias Braun, 2018-11-07, 1 file, -12/+12)

    Set `LiveReg::PhysReg` to zero when freeing a register instead of
    removing the entry from `LiveRegMap`. This way no iterators get
    invalidated and we can avoid passing around and updating iterators all
    over the place.

    This does not change any allocator decisions. It is not completely NFC
    because the arbitrary iteration order through `LiveRegMap` in
    `spillAll()` changes, so we may get a different order in those spill
    sequences (the number of spills does not change).

    This is in preparation for https://reviews.llvm.org/D52010.

    llvm-svn: 346298
* [AMDGPU] Remove FeatureVGPRSpilling (Scott Linder, 2018-10-31, 1 file, -5/+5)

    This feature is only relevant to shaders, and is no longer used. When
    disabled, lowering of reserved registers for shaders causes a compiler
    crash.

    Remove the feature and add a test for compilation of shaders at
    OptNone.

    Differential Revision: https://reviews.llvm.org/D53829

    llvm-svn: 345763
* [AMDGPU] Don't force WQM for DS op (Tim Renouf, 2018-05-07, 1 file, -1/+3)

    Summary: Previously, all DS ops forced WQM in a pixel shader. That was
    a hack to allow graphics frontends using ds_swizzle to implement
    explicit derivatives, on SI/CI at least, where DPP is not available.
    But it forced WQM for _any_ DS op.

    With this commit, DS ops no longer force WQM. Both graphics frontends
    (Mesa and LLPC) need to change to issue an explicit llvm.amdgcn.wqm
    intrinsic call when calculating explicit derivatives. The required
    Mesa change is "amd/common: use llvm.amdgcn.wqm for explicit
    derivatives".

    Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl,
    dstuttard, t-tye, llvm-commits

    Differential Revision: https://reviews.llvm.org/D46051

    Change-Id: I9b745b626fa91bbd66456e6cf41ee07eeea42f81

    llvm-svn: 331633
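For illustration, a hedged sketch (not taken from the commit) of the IR a frontend would now emit for an explicit derivative via ds_swizzle; the swizzle mask and value flow are illustrative, but both intrinsics exist with these signatures:

    declare i32 @llvm.amdgcn.ds.swizzle(i32, i32)
    declare i32 @llvm.amdgcn.wqm.i32(i32)

    define amdgpu_ps float @explicit_derivative(float %v) {
      %bits = bitcast float %v to i32
      ; Read a neighboring lane's value (the mask 32853 is illustrative).
      %swz = call i32 @llvm.amdgcn.ds.swizzle(i32 %bits, i32 32853)
      ; Explicitly request whole-quad mode for this computation, rather
      ; than the DS op implicitly forcing WQM for the whole shader.
      %wqm = call i32 @llvm.amdgcn.wqm.i32(i32 %swz)
      %res = bitcast i32 %wqm to float
      ret float %res
    }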
* [AMDGPU] Change constant addr space to 4 (Yaxun Liu, 2018-02-13, 1 file, -2/+2)

    Differential Revision: https://reviews.llvm.org/D43170

    llvm-svn: 325030
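A minimal example of the new numbering (written with modern opaque pointers); constant, read-only data is now addrspace(4), while global data remains addrspace(1):

    define amdgpu_kernel void @load_constant(ptr addrspace(4) %in, ptr addrspace(1) %out) {
      ; Loads from addrspace(4) may be selected as scalar (SMEM) loads.
      %v = load i32, ptr addrspace(4) %in
      store i32 %v, ptr addrspace(1) %out
      ret void
    }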
* [AMDGPU] exp should not be in WQM mode (Tim Renouf, 2017-09-11, 1 file, -4/+4)

    An mrt exp with vm=1 must be in exact (non-WQM) mode, as it also
    exports the exec mask as the valid mask to determine which pixels to
    render. This commit marks any exp as needing to be in exact mode.

    Actually, if there are multiple mrt exps, only one needs to have vm=1,
    and only that one needs to be in exact mode. But that is an
    optimization for another day.

    Differential Revision: https://reviews.llvm.org/D36305

    llvm-svn: 312915
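For reference, the intrinsic form such an export takes; the final i1 operand is the vm bit the message refers to (the operand values are illustrative):

    declare void @llvm.amdgcn.exp.f32(i32, i32, float, float, float, float, i1, i1)

    define amdgpu_ps void @export_mrt0(float %r, float %g, float %b, float %a) {
      ; tgt=0 (MRT 0), en=15 (all channels), done=true, vm=true. With
      ; vm=1 the exec mask is exported as the valid mask, so this exp
      ; must execute in exact (non-WQM) mode.
      call void @llvm.amdgcn.exp.f32(i32 0, i32 15, float %r, float %g, float %b, float %a, i1 true, i1 true)
      ret void
    }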
* RegisterScavenging: Followup to r305625 (Matthias Braun, 2017-06-20, 1 file, -4/+4)

    This makes some improvements/cleanup to the recently introduced
    scavengeRegisterBackwards() functionality:

    - Rewrite the findSurvivorBackwards algorithm to use the existing
      LiveRegUnit::accumulateBackward() code. This also avoids the
      Available and Candidates bitsets and needs just 1 LiveRegUnit
      instance (= 1 bitset).
    - Pick registers in allocation order instead of register number order.

    llvm-svn: 305817
* [AMDGPU] Turn on the new waitcnt insertion pass. Adjust tests. (Mark Searles, 2017-06-02, 1 file, -2/+0)

    -enable-si-insert-waitcnts=1 becomes the default;
    -enable-si-insert-waitcnts=0 selects the old pass.

    Differential Revision: https://reviews.llvm.org/D33730

    llvm-svn: 304551
* [AMDGPU] Merge M0 initializations (Stanislav Mekhanoshin, 2017-04-24, 1 file, -10/+12)

    Merges equivalent initializations of M0 and hoists them into a common
    dominator block. Technically the same code can be used with any
    register, physical or virtual.

    Differential Revision: https://reviews.llvm.org/D32279

    llvm-svn: 301228
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel (Matt Arsenault, 2017-03-21, 1 file, -3/+3)

    Currently the default C calling convention functions are treated the
    same as compute kernels. Make this explicit so the default calling
    convention can be changed to a non-kernel.

    Converted with

        perl -pi -e 's/define void/define amdgpu_kernel void/'

    on the relevant test directories (and undone in the one place that
    actually wanted a non-kernel).

    llvm-svn: 298444
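The shape of the conversion on a typical test function (the function itself is hypothetical):

    ; Before: define void @test(ptr addrspace(1) %out) { ... }
    ; After: the kernel calling convention is explicit.
    define amdgpu_kernel void @test(ptr addrspace(1) %out) {
      store i32 0, ptr addrspace(1) %out
      ret void
    }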
* AMDGPU: Always allocate emergency stack slot at offset 0 (Matt Arsenault, 2017-02-22, 1 file, -13/+13)

    This allows us to ensure that 0 is never a valid pointer to a user
    object, and ensures that the offset is always legal without needing a
    register to access it. This comes at the cost of usable offsets and
    wasted stack space.

    llvm-svn: 295877
* AMDGPU: Add cvt.pkrtz intrinsic (Matt Arsenault, 2017-02-22, 1 file, -4/+4)

    Convert llvm.SI.packf16 test uses.

    llvm-svn: 295797
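For reference, the new intrinsic as it appears in the updated tests; it packs two floats into two half-precision values with round-toward-zero:

    declare <2 x half> @llvm.amdgcn.cvt.pkrtz(float, float)

    define amdgpu_ps <2 x half> @pack(float %x, float %y) {
      ; Replaces uses of the old llvm.SI.packf16 intrinsic.
      %p = call <2 x half> @llvm.amdgcn.cvt.pkrtz(float %x, float %y)
      ret <2 x half> %p
    }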
* AMDGPU: Remove some uses of llvm.SI.export in tests (Matt Arsenault, 2017-02-22, 1 file, -6/+4)

    Merge some of the old, smaller tests into more complete versions.

    llvm-svn: 295792
* AMDGPU: Remove SI_fs_constant and SI_fs_interp intrinsics (Matt Arsenault, 2017-02-16, 1 file, -6/+7)

    Update test uses with expansion in terms of new intrinsics.

    llvm-svn: 295269
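A hedged sketch of the replacement pattern: interpolation is now written with llvm.amdgcn.interp.p1/p2, whose final operand is the primitive mask that is materialized in m0 (the attribute and channel indices are illustrative):

    declare float @llvm.amdgcn.interp.p1(float, i32, i32, i32)
    declare float @llvm.amdgcn.interp.p2(float, float, i32, i32, i32)

    define amdgpu_ps float @interp_attr0(float %i, float %j, i32 inreg %prim_mask) {
      ; Two-stage interpolation of attribute 0, channel 0; the last
      ; operand is copied into m0 before the v_interp instructions.
      %p1 = call float @llvm.amdgcn.interp.p1(float %i, i32 0, i32 0, i32 %prim_mask)
      %p2 = call float @llvm.amdgcn.interp.p2(float %p1, float %j, i32 0, i32 0, i32 %prim_mask)
      ret float %p2
    }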
* AMDGPU/SI: Don't reserve FLAT_SCR on non-HSA targets & without stack objects (Marek Olsak, 2016-12-09, 1 file, -7/+8)

    Summary: This frees 2 scalar registers.

    Reviewers: tstellarAMD

    Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl,
    tony-tye

    Differential Revision: https://reviews.llvm.org/D27150

    llvm-svn: 289261
* AMDGPU: Use wider scalar spills for SGPR spilling (Matt Arsenault, 2016-12-02, 1 file, -19/+10)

    Since the spill is for the whole wave, these don't have the swizzling
    problems that vector stores do, and a single 4-byte allocation is
    enough to spill a 64-element register. This should reduce the number
    of spill instructions and put all the spills for a register in the
    same cache line.

    This should save allocated private size, but for now it doesn't. The
    extra slots are allocated for each component, but never used because
    the frame layout is essentially finalized before frame indices are
    replaced. For always using the scalar store path, this should probably
    be moved into processFunctionBeforeFrameFinalized.

    llvm-svn: 288445
* AMDGPU: Disallow exec as SMEM instruction operand (Matt Arsenault, 2016-11-29, 1 file, -19/+75)

    This is not in the list of valid inputs for the encoding. When
    spilling, copies from exec can be folded directly into the spill
    instruction, which results in broken stores.

    This only fixes the operand constraints; more codegen work is required
    to avoid emitting the invalid spills.

    This sort of breaks the dbg.value test. Because the register class of
    the s_load_dwordx2 changes, there is a copy to SReg_64, and the copy
    is the operand of dbg_value. The copy is later dead, and removed from
    the dbg_value.

    llvm-svn: 288191
* AMDGPU/SI: Add back reverted SGPR spilling code, but disable it (Marek Olsak, 2016-11-25, 1 file, -12/+98)

    Suggested as a better solution by Matt.

    llvm-svn: 287942
* Revert "AMDGPU: Implement SGPR spilling with scalar stores"Marek Olsak2016-11-251-18/+3
| | | | | | This reverts commit 4404d0d6e354e80dd7f8f0a0e12d8ad809cf007e. llvm-svn: 287936
* Revert "AMDGPU: Make m0 unallocatable"Marek Olsak2016-11-251-16/+15
| | | | | | This reverts commit 124ad83dae04514f943902446520c859adee0e96. llvm-svn: 287932
* Revert "AMDGPU: Preserve m0 value when spilling"Marek Olsak2016-11-251-70/+0
| | | | | | This reverts commit a5a179ffd94fd4136df461ec76fb30f04afa87ce. llvm-svn: 287930
* AMDGPU: Preserve m0 value when spilling (Matt Arsenault, 2016-11-24, 1 file, -0/+70)

    llvm-svn: 287844
* AMDGPU: Make m0 unallocatable (Matt Arsenault, 2016-11-24, 1 file, -15/+16)

    m0 may need to be written for spill code, so we don't want general
    code uses relying on the value stored in it.

    This introduces a few code quality regressions where copies from m0
    are not coalesced into copies of a copy of m0.

    llvm-svn: 287841
* AMDGPU: Implement SGPR spilling with scalar stores (Matt Arsenault, 2016-11-13, 1 file, -3/+18)

    This avoids the nasty problems caused by using memory instructions
    that read the exec mask while spilling / restoring registers used for
    control flow masking, but only for VI, where these were added.

    This always uses the scalar stores when enabled currently, but it may
    be better to still try to spill to a VGPR and use this on the fallback
    memory path.

    The cache also needs to be flushed before wave termination if a scalar
    store is used.

    llvm-svn: 286766
* AMDGPU: Fix using incorrect private resource with no allocation (Matt Arsenault, 2016-10-28, 1 file, -0/+2)

    It's possible to have a use of the private resource descriptor or
    scratch wave offset registers even though there are no allocated
    stack objects. This would result in continuing to use the maximum
    number of reserved registers. This could go over the number of SGPRs
    available on VI, or violate the SGPR limit requested by the function
    attributes.

    llvm-svn: 285435
* AMDGPU: Use unsigned compare for eq/ne (Matt Arsenault, 2016-09-30, 1 file, -1/+1)

    For some reason both of these are available, except for scalar 64-bit
    compares, which only have u64. I'm not sure why there are both (I'm
    guessing it's for the one-bit inputs we don't use), but for
    consistency always use the unsigned one.

    llvm-svn: 282832
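For context, the kind of IR this affects; an eq/ne compare is sign-agnostic, and on the scalar 64-bit path only the unsigned form (s_cmp_eq_u64) is available:

    define amdgpu_kernel void @cmp_eq_i64(ptr addrspace(1) %out, i64 %a, i64 %b) {
      ; eq/ne has no signedness, so the unsigned compare is always usable.
      %cmp = icmp eq i64 %a, %b
      %ext = zext i1 %cmp to i32
      store i32 %ext, ptr addrspace(1) %out
      ret void
    }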
* AMDGPU: Fix spilling of m0 (Matt Arsenault, 2016-09-03, 1 file, -0/+78)

    readlane/writelane do not support using m0 as the output/input.
    Constrain the register class of spill vregs to try to avoid this, but
    also handle spilling of the physreg when necessary by inserting an
    additional copy to a normal SGPR.

    llvm-svn: 280584
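A hedged sketch of the kind of test this added to spill-m0.ll: m0 is defined by inline asm and kept live across control flow, so when SGPR spilling kicks in (the real test forces it with codegen flags) the spiller must route m0 through an ordinary SGPR, since readlane/writelane cannot use m0 directly (the control flow and asm bodies are illustrative):

    define amdgpu_kernel void @spill_m0(i32 %cond, ptr addrspace(1) %out) {
    entry:
      ; Define a value that lives in the physical register m0.
      %m0 = call i32 asm sideeffect "s_mov_b32 m0, 0", "={m0}"()
      %cmp = icmp eq i32 %cond, 0
      br i1 %cmp, label %if, label %endif

    if:
      ; Keep %m0 live across extra code so it must be saved and restored.
      call void asm sideeffect "v_nop", ""()
      br label %endif

    endif:
      store i32 %m0, ptr addrspace(1) %out
      ret void
    }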