summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/flat-scratch-reg.ll
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Enable code object v3 for AMDHSA onlyKonstantin Zhuravlyov2018-11-151-3/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D54186 llvm-svn: 346923
* Revert r345542: AMDGPU: Enable code object v3 by defaultKonstantin Zhuravlyov2018-10-301-3/+3
| | | | | | It breaks mesa. llvm-svn: 345662
* AMDGPU: Enable code object v3 by defaultKonstantin Zhuravlyov2018-10-291-3/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D53525 llvm-svn: 345542
* AMDGPU: Use correct register names in inline assemblyMatt Arsenault2017-06-081-7/+7
| | | | | | Fixes using physical registers in inline asm from clang. llvm-svn: 305004
* AMDGPU: Use MachineRegisterInfo to find max used registerMatt Arsenault2017-04-171-12/+47
| | | | | | | | | | Avoid looping through program to determine register counts. This avoids needing to look at regmask operands. Also fixes some counting errors with flat_scr when there are no stack objects. llvm-svn: 300482
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernelMatt Arsenault2017-03-211-4/+4
| | | | | | | | | | | | Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
* Enable FeatureFlatForGlobal on Volcanic IslandsMatt Arsenault2017-01-241-1/+1
| | | | | | | | | | | This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
* AMDGPU/SI: Don't reserve XNACK when it's disabledMarek Olsak2016-12-091-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This frees 2 additional scalar registers. These are results from all of my 3 patches combined: Polaris: Spilled SGPRs: 2231 -> 1517 (-32.00 %) Tonga: Spilled SGPRs: 3829 -> 2608 (-31.89 %) Spilled VGPRs: 100 -> 84 (-16.00 %) Tonga even spills SGPRs via VGPRs to scratch. That's a compute shader limited to 64 VGPRs. Reviewers: tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27151 llvm-svn: 289262
* AMDGPU/SI: Don't reserve FLAT_SCR on non-HSA targets & without stack objectsMarek Olsak2016-12-091-21/+34
| | | | | | | | | | | | Summary: This frees 2 scalar registers. Reviewers: tstellarAMD Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27150 llvm-svn: 289261
* AMDGPU : Add XNACK feature to GPUs that support it.Wei Ding2016-09-061-1/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D24276 llvm-svn: 280742
* AMDGPU: Add missing tests for xnack option for HSAMatt Arsenault2016-07-261-5/+21
| | | | llvm-svn: 276765
* AMDGPU/SI: xnack_mask is always reserved on VINicolai Haehnle2016-01-071-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Somehow, I first interpreted the docs as saying space for xnack_mask is only reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and went back to actually test what is happening, and it turns out that xnack_mask is always reserved at least on Tonga and Carrizo, in the sense that flat_scr is always fixed below the SGPRs that are used to implement xnack_mask, whether or not they are actually used. I confirmed this by writing a shader using inline assembly to tease out the aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so xnack_mask is s[76:77] and vcc is s[78:79]). This patch changes both the calculation of the total number of SGPRs and the various register reservations to account for this. It ought to be possible to use the gap left by xnack_mask when the feature isn't used, but this patch doesn't try to do that. (Note that the same applies to vcc.) Note that previously, even before my earlier change in r256794, the SGPRs that alias to xnack_mask could end up being used as well when flat_scr was unused and the total number of SGPRs happened to fall on the right alignment (e.g. highest regular SGPR being used s29 and VCC used would lead to number of SGPRs being 32, where s28 and s29 alias with xnack_mask). So if there were some conflict due to such aliasing, we should have noticed that already. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15898 llvm-svn: 257073
* AMDGPU: add +xnack featureNicolai Haehnle2016-01-041-6/+11
| | | | | | | | | | | | | | | | | | | Summary: Enabling this feature will account for the two SGPRs used by the hardware to store the XNACK_MASK physically. The hardware only requires this reservation when the XNACK feature is explicitly enabled. At some point, HSA will probably want to do that, but it does increase SGPR register pressure, so leave it disabled by default for now (but do add a small test). Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15869 llvm-svn: 256794
* AMDGPU/SI: Reserve appropriate number of sgprs for flat scratch init.Tom Stellard2015-12-171-0/+36
Reviewers: tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15583 Patch by: Changpeng Fang llvm-svn: 255908
OpenPOWER on IntegriCloud