summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Annotate necessity of flat-scratch-initMatt Arsenault2017-07-181-7/+8
| | | | | | | | As an approximation of the existing handling to avoid regressions. Fixes using too many registers with calls on subtargets with the SGPR allocation bug. llvm-svn: 308326
* AMDGPU: Figure out private memory regs after loweringMatt Arsenault2017-07-181-5/+8
| | | | | | | | | | | | | | | | | | Introduce pseudo-registers for registers needed for stack access, which are replaced during finalizeLowering. Note these pseudo-registers are currently only used for the used register location, and not for determining their input argument register. This is better because it avoids the need to try to predict whether a call will be emitted from the IR, and also detects stack objects introduced by legalization. Test changes are from the HasStackObjects check being more accurate since stack objects introduced during legalization are now known. llvm-svn: 308325
* AMDGPU: Annotate features from x work item/group IDs.Matt Arsenault2017-07-171-13/+26
| | | | | | | | This wasn't necessary before since they are always enabled for kernels, but this is necessary if they need to be forwarded to a callable function. llvm-svn: 308226
* AMDGPU: Detect kernarg segment pointerMatt Arsenault2017-07-141-1/+4
| | | | | | | | This is necessary to pass the kernarg segment pointer to callee functions. Also don't unconditionally enable for kernels. llvm-svn: 307978
* AMDGPU: Setup SP/FP in callee function prolog/epilogMatt Arsenault2017-06-261-0/+1
| | | | llvm-svn: 306312
* AMDGPU: Partially fix implicit.buffer.ptr intrinsic handlingMatt Arsenault2017-06-261-5/+5
| | | | | | | | | | | | | | This should not be treated as a different version of private_segment_buffer. These are distinct things with different uses and register classes, and requires the function argument info to have more context about the function's type and environment. Also add missing test coverage for the intrinsic, and emit an error for HSA. This also encovers that the intrinsic is broken unless there happen to be stack objects. llvm-svn: 306264
* AMDGPU: Start defining a calling conventionMatt Arsenault2017-05-171-7/+12
| | | | | | | | Partially implement callee-side for arguments and return values. byval doesn't work properly, and most likely sret or other on-stack return values most as well. llvm-svn: 303308
* AMDGPU: GFX9 GS and HS shaders always have the scratch wave offset in SGPR5Marek Olsak2017-05-041-1/+7
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D32645 llvm-svn: 302200
* AMDGPU: Add StackPtr and FramePtr registers to MFIMatt Arsenault2017-04-241-0/+2
| | | | | | These will be necessary for setting up call sequences. llvm-svn: 301208
* AMDGPU: Refactor SIMachineFunctionInfo slightlyMatt Arsenault2017-04-111-15/+25
| | | | | | Prepare for handling non-entry functions. llvm-svn: 299999
* AMDGPU: Refactor argument loweringMatt Arsenault2017-04-111-1/+1
| | | | | | | Split into smaller functions and prepare for handling non-entry functions. llvm-svn: 299998
* AMDGPU: Don't use stack space for SGPR->VGPR spillsMatt Arsenault2017-02-211-42/+51
| | | | | | | | | | | | | | | | Before frame offsets are calculated, try to eliminate the frame indexes used by SGPR spills. Then we can delete them after. I think for now we can be sure that no other instruction will be re-using the same frame indexes. It should be easy to notice if this assumption ever breaks since everything asserts if it tries to use a dead frame index later. The unused emergency stack slot seems to still be left behind, so an additional 4 bytes is still wasted. llvm-svn: 295753
* AMDGPU add support for spilling to a user sgpr pointed buffersTom Stellard2017-01-251-2/+13
| | | | | | | | | | | | | | | | | Summary: This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1]. Patch By: Dave Airlie Reviewers: nhaehnle, arsenm, tstellarAMD Reviewed By: arsenm Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25428 llvm-svn: 293000
* AMDGPU/SI: Make a function constTom Stellard2016-12-201-1/+0
| | | | llvm-svn: 290185
* AMDGPU/SI: Add a MachineMemOperand to MIMG instructionsTom Stellard2016-12-201-0/+1
| | | | | | | | | | | | | | | Summary: Without a MachineMemOperand, the scheduler was assuming MIMG instructions were ordered memory references, so no loads or stores could be reordered across them. Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27536 llvm-svn: 290179
* AMDGPU/SI: Add support for triples with the mesa3d operating systemTom Stellard2016-09-161-1/+1
| | | | | | | | | | | | | | Summary: mesa3d will use the same kernel calling convention as amdhsa, but it will handle everything else like the default 'unknown' OS type. Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D22783 llvm-svn: 281779
* [AMDGPU] Wave and register controlsKonstantin Zhuravlyov2016-09-061-14/+4
| | | | | | | | | | | | | | - Implemented amdgpu-flat-work-group-size attribute - Implemented amdgpu-num-active-waves-per-eu attribute - Implemented amdgpu-num-sgpr attribute - Implemented amdgpu-num-vgpr attribute - Dynamic LDS constraints are in a separate patch Patch by Tom Stellard and Konstantin Zhuravlyov Differential Revision: https://reviews.llvm.org/D21562 llvm-svn: 280747
* AMDGPU: Remove unused tracking of flat instructionsMatt Arsenault2016-08-111-1/+0
| | | | llvm-svn: 278361
* MachineFunction: Return reference for getFrameInfo(); NFCMatthias Braun2016-07-281-4/+4
| | | | | | | getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. llvm-svn: 277017
* AMDGPU/SI: Don't use reserved VGPRs for SGPR spillingTom Stellard2016-07-281-1/+2
| | | | | | | | | | | | | | | Summary: We were using reserved VGPRs for SGPR spilling and this was causing some programs with a workgroup size of 1024 to use more than 64 registers, which is illegal. Reviewers: arsenm, mareko, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D22032 llvm-svn: 276980
* AMDGPU: Make AMDGPUMachineFunction fields privateMatt Arsenault2016-07-261-3/+0
| | | | | | | | | ABIArgOffset is a problem because properly fsetting the KernArgSize requires that the reserved area before the real kernel arguments be correctly aligned, which requires fixing clover. llvm-svn: 276766
* AMDGPU: Add HSA dispatch id intrinsicMatt Arsenault2016-07-221-1/+11
| | | | llvm-svn: 276437
* AMDGPU/SI: Emit the number of SGPR and VGPR spillsMarek Olsak2016-07-131-0/+2
| | | | | | | | | | | | | | | | | | | | | Summary: v2: don't count SGPRs spilled to scratch twice I think this is sufficient. It doesn't count private memory usage, which happens often and uses scratch but isn't technically a spill. The private memory usage can be computed by: [scratch_per_thread - vgpr_spills - a random multiple of SGPR spills]. The fact SGPR spills add very high numbers to the scratch size make that computation a guessing game, but I don't have a solution to that. Reviewers: tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D22197 llvm-svn: 275288
* SIMachineFunctionInfo.cpp: Appease msc18 to use std::array.NAKAMURA Takumi2016-06-271-2/+2
| | | | llvm-svn: 273860
* Reformat.NAKAMURA Takumi2016-06-271-1/+1
| | | | llvm-svn: 273859
* Reformat blank lines.NAKAMURA Takumi2016-06-271-2/+0
| | | | llvm-svn: 273858
* [AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in ↵Konstantin Zhuravlyov2016-06-251-4/+6
| | | | | | | | | | | | | | | | | | | | | | | the kernel code header Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue. Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format: - offset 0: work group ID x - offset 4: work group ID y - offset 8: work group ID z - offset 16: work item ID x - offset 20: work item ID y - offset 24: work item ID z Set - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled Differential Revision: http://reviews.llvm.org/D20335 llvm-svn: 273769
* AMDGPU: Cleanup subtarget handling.Matt Arsenault2016-06-241-5/+6
| | | | | | | | | Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
* AMDGPU: Add option to disable spilling SGPRs to VGPRs.Matt Arsenault2016-06-231-2/+9
| | | | | | This can help debug spilling problems. llvm-svn: 273605
* [AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegsKonstantin Zhuravlyov2016-05-241-3/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D20081 llvm-svn: 270594
* [AMDGPU] Move reserved vgpr count for trap handler usage to ↵Konstantin Zhuravlyov2016-04-261-0/+4
| | | | | | | | SIMachineFunctionInfo + minor commenting changes Differential Revision: http://reviews.llvm.org/D19537 llvm-svn: 267573
* AMDGPU: Add queue ptr intrinsicMatt Arsenault2016-04-251-0/+3
| | | | llvm-svn: 267451
* AMDGPU: allow specifying a workgroup size that needs to fit in a compute unitTom Stellard2016-04-141-6/+7
| | | | | | | | | | | | | | | | | | | Summary: For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use less registers, as we have to run more waves per SIMD. This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults, to 256, as that has no wave restrictions. Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug. Reviewers: mareko, arsenm, tstellarAMD, nhaehnle Subscribers: FireBurn, kerberizer, llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18340 Patch By: Bas Nieuwenhuizen llvm-svn: 266337
* AMDGPU/SI: Use the correct scratch wave offset register for shaders.Tom Stellard2016-04-141-3/+6
| | | | | | | | | | | | | | | | | | | | | | Summary: The code previously always used s1 as it was using the user + system SGPR information for compute kernels. This is incorrect for Mesa shaders though, The register should be the next SGPR after all user and system SGPR's. We use that Mesa adds arguments for all input and system SGPR's and take the next available SGPR for the scratch wave offset register. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewers: mareko, arsenm, nhaehnle, tstellarAMD Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18941 Patch By: Bas Nieuwenhuizen llvm-svn: 266336
* AMDGPU: Add a shader calling conventionNicolai Haehnle2016-04-061-3/+5
| | | | | | | | | | | This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
* AMDGPU/SI: Add support for spiling SGPRs to scratch bufferTom Stellard2016-03-041-10/+5
| | | | | | | | | | | | | | Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732
* AMDGPU: Set flat_scratch from flat_scratch_init regMatt Arsenault2016-02-121-4/+20
| | | | | | | | | | | | | | This was hardcoded to the static private size, but this would be missing the offset and additional size for someday when we have dynamic sizing. Also stops always initializing flat_scratch even when unused. In the future we should stop emitting this unless flat instructions are used to access private memory. For example this will initialize it almost always on VI because flat is used for global access. llvm-svn: 260658
* AMDGPU/SI: Add s_waitcnt at the end of non-void functionsMarek Olsak2016-01-131-0/+1
| | | | | | | | | | | | | | Summary: v2: Make ReturnsVoid private, so that I can another 8 lines of code and look more productive. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16034 llvm-svn: 257622
* AMDGPU/SI: Add new target attribute InitialPSInputAddrMarek Olsak2016-01-131-1/+4
| | | | | | | | | | | | | | | | | | | | | Summary: This allows Mesa to pass initial SPI_PS_INPUT_ADDR to LLVM. The register assigns VGPR locations to PS inputs, while the ENA register determines whether or not they are loaded. Mesa needs to set some inputs as not-movable, so that a pixel shader prolog binary appended at the beginning can assume where some inputs are. v2: Make PSInputAddr private, because there is never enough silly getters and setters for people to read. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16030 llvm-svn: 257591
* AMDGPU: Avoid assertions after SGPR spilling failedNicolai Haehnle2016-01-041-0/+11
| | | | | | | | | | | | | | | | | | | | Summary: The comment explains it: emitError does not necessarily exit the compilation process, and then using NoRegister leads to assertions later on. This generates incorrect code, of course, but the user should know to not use the result when an error has been emitted. It would be nice to have a test-case for this inside the LLVM repository, but llc exits on error. shader-db tests trigger the underlying issue at least on Tonga. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15826 llvm-svn: 256757
* AMDGPU: Rework how private buffer passed for HSAMatt Arsenault2015-11-301-9/+71
| | | | | | | | | | | | | | | | If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the progloue. This also only selectively enables all of the input registers which are really required instead of always enabling them. llvm-svn: 254331
* AMDGPU: Remove SIPrepareScratchRegsMatt Arsenault2015-11-301-0/+8
| | | | | | | | | | | | | | | | | | | | | | It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed. The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex. Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions. The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs. llvm-svn: 254329
* AMDGPU: Check feature attributes in SIMachineFunctionInfoMatt Arsenault2015-11-251-3/+36
| | | | llvm-svn: 254091
* AMDGPU: Also track whether SGPRs were spilledMatt Arsenault2015-11-051-0/+1
| | | | llvm-svn: 252145
* MachineRegisterInfo: Remove UsedPhysReg infrastructureMatthias Braun2015-07-141-1/+0
| | | | | | | | | | | | | We have a detailed def/use lists for every physical register in MachineRegisterInfo anyway, so there is little use in maintaining an additional bitset of which ones are used. Removing it frees us from extra book keeping. This simplifies VirtRegMap. Differential Revision: http://reviews.llvm.org/D10911 llvm-svn: 242173
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+77
| | | | llvm-svn: 239657
* Revert "AMDGPU: Add core backend files for R600/SI codegen v6"Tom Stellard2012-07-161-18/+0
| | | | | | This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea. llvm-svn: 160303
* AMDGPU: Add core backend files for R600/SI codegen v6Tom Stellard2012-07-161-0/+18
llvm-svn: 160270
OpenPOWER on IntegriCloud