summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/nested-calls.ll
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] Created a sub-register class for the return address operand in the ↵Christudasan Devadasan2019-07-091-5/+5
| | | | | | | | | | | | | | return instruction. Function return instruction lowering, currently uses the fixed register pair s[30:31] for holding the return address. It can be any SGPR pair other than the CSRs. Created an SGPR pair sub-register class exclusive of the CSRs, and used this regclass while lowering the return instruction. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D63924 llvm-svn: 365512
* AMDGPU: Make s34 the FP registerMatt Arsenault2019-07-081-16/+17
| | | | | | | | | | | | | | | | | | | | | | | Make the FP register callee saved. This is tricky because now the FP needs to be spilled in the prolog relative to the incoming SP register, rather than the frame register used throughout the rest of the function. I don't like how this bypassess the standard mechanism for CSR spills just to get the correct insert point. I may look for a better solution, since all CSR VGPRs may also need to have all lanes activated. Another option might be to make getFrameIndexReference change the base register if the frame index is a CSR, and then try to figure out the right insertion point in emitProlog. If there is a free VGPR lane available for SGPR spilling, try to use it for the FP. If that would require intrtoducing a new VGPR spill, try to use a free call clobbered SGPR. Only fallback to introducing a new VGPR spill as a last resort. This also doesn't attempt to handle SGPR spilling with scalar stores. llvm-svn: 365372
* AMDGPU: Always use s33 for global scratch wave offsetMatt Arsenault2019-06-201-6/+6
| | | | | | | | | Every called function could possibly need this to calculate the absolute address of stack objectst, and this avoids inserting a copy around every call site in the kernel. It's also somewhat cleaner to keep this in a callee saved SGPR. llvm-svn: 363990
* AMDGPU: Don't fix emergency stack slot at offset 0Matt Arsenault2019-06-051-2/+2
| | | | | | | | | | | | | | | | | | | | | This forced the caller to be aware of this, which is an ugly ABI feature. Partially reverts r295877. The original reasons for doing this are mostly fixed. Alloca is now in a non-0 address space, so it should be OK to have 0 as a valid pointer. Since we treat the absolute address as the pointer value, this part only really needed to apply to kernels. Since r357093, we avoid the need to increment/decrement the offset register in more cases, and since r354816 the scavenger can fail without spilling, so it's less critical that we try to avoid an offset that fits in the MUBUF offset. Restrict to callable functions for now to split this into 2 steps to limit thte number of test updates and in case anything breaks. llvm-svn: 362665
* AMDGPU: Activate all lanes when spilling CSR VGPR for SGPR spillsMatt Arsenault2019-05-241-3/+9
| | | | | | | If some lanes weren't active on entry to the function, this could clobber their VGPR values. llvm-svn: 361655
* [SchedModel] Fix for read advance cycles with implicit pseudo operands.Jonas Paulsson2018-10-301-2/+2
| | | | | | | | | | | | | | | | | | The SchedModel allows the addition of ReadAdvances to express that certain operands of the instructions are needed at a later point than the others. RegAlloc may add pseudo operands that are not part of the instruction descriptor, and therefore cannot have any read advance entries. This meant that in some cases the desired read advance was nullified by such a pseudo operand, which still had the original latency. This patch fixes this by making sure that such pseudo operands get a zero latency during DAG construction. Review: Matthias Braun, Ulrich Weigand. https://reviews.llvm.org/D49671 llvm-svn: 345606
* AMDGPU: Increase default stack alignmentMatt Arsenault2018-03-291-4/+4
| | | | | | | 8 and 16-byte values are common, so increase the default alignment to avoid realigning the stack in most functions. llvm-svn: 328821
* [AMDGPU] Switch to the new addr space mapping by defaultYaxun Liu2018-02-021-5/+5
| | | | | | | | This requires corresponding clang change. Differential Revision: https://reviews.llvm.org/D40955 llvm-svn: 324101
* AMDGPU: Remove error on calls for amdgcnMatt Arsenault2017-08-031-3/+3
| | | | | | | | Repurpose the -amdgpu-function-calls flag. Rather than require it to emit a call, only use it to run the always inline path or not. llvm-svn: 310003
* AMDGPU: Fix clobbering CSR VGPRs when spilling SGPR to itMatt Arsenault2017-08-021-4/+16
| | | | llvm-svn: 309783
* AMDGPU: Initial implementation of callsMatt Arsenault2017-08-011-0/+41
Includes a hack to fix the type selected for the GlobalAddress of the function, which will be fixed by changing the default datalayout to use generic pointers for 0. llvm-svn: 309732
OpenPOWER on IntegriCloud