summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/AMDGPUCallingConv.td
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] Adjust number of SGPRs available in Calling ConventionRyan Taylor2019-08-281-18/+2
| | | | | | | | | This reduces the number of SGPRs due to some concerns about running out of SGPRs if you make all the SGPRs that aren't reserved available for the calling convention. Change-Id: Idb4ca4dc72f5b6808cb524ff7270915a8de5b4c1 llvm-svn: 370215
* AMDGPU: Decompose all values to 32-bit pieces for calling conventionsMatt Arsenault2019-07-191-7/+0
| | | | | | | | | | This is the more natural lowering, and presents more opportunities to reduce 64-bit ops to 32-bit. This should also help avoid issues graphics shaders have had with 64-bit values, and simplify argument lowering in globalisel. llvm-svn: 366578
* AMDGPU: Add pass to lower SGPR spillsMatt Arsenault2019-07-031-0/+10
| | | | | | | | | | | | | | | This is split out from my patches to split register allocation into a separate SGPR and VGPR phase, and has some parts that aren't yet used (like maintaining LiveIntervals). This simplifies making the frame pointer register callee saved. As it is now, the code to determine callee saves needs to predict all the possible SGPR spills and how many callee saved VGPRs are needed. By handling this before PrologEpilogInserter, it's possible to just check the spill objects that already exist. Change-Id: I29e6df4034afcf949e06f8ef44206acb94696f04 llvm-svn: 365095
* AMDGPU: Fix not marking new gfx10 SGPRs as CSRsMatt Arsenault2019-05-211-3/+3
| | | | llvm-svn: 361330
* [AMDGPU] Increases available SGPR for Calling ConventionRyan Taylor2019-05-151-2/+20
| | | | | | | | | | | | | | | | Summary: SGPR in CC can be either hw initialized or set by other chained shaders and so this increases the SGPR count availalbe to CC to 105. Change-Id: I3dfadc750fe4a3e2bd07117a2899fd13f3e2fef3 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61261 llvm-svn: 360778
* [AMDGPU] Added v5i32 and v5f32 register classesTim Renouf2019-03-221-1/+2
| | | | | | | | | | They are not used by anything yet, but a subsequent commit will start using them for image ops that return 5 dwords. Differential Revision: https://reviews.llvm.org/D58903 Change-Id: I63e1904081e39a6d66e4eb96d51df25ad399d271 llvm-svn: 356735
* [AMDGPU] Support for v3i32/v3f32Tim Renouf2019-03-211-1/+2
| | | | | | | | | | | | | | | Added support for dwordx3 for most load/store types, but not DS, and not intrinsics yet. SI (gfx6) does not have dwordx3 instructions, so they are not enabled there. Some of this patch is from Matt Arsenault, also of AMD. Differential Revision: https://reviews.llvm.org/D58902 Change-Id: I913ef54f1433a7149da8d72f4af54dbb13436bd9 llvm-svn: 356659
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* AMDGPU: Partially fix handling of packed amdgpu_ps argumentsMatt Arsenault2018-08-011-3/+3
| | | | | | | | Fixes annoying limitations when writing tests. Also remove more leftover code for manually scalarizing arguments and return values. llvm-svn: 338618
* AMDGPU: Refactor Subtarget classesTom Stellard2018-07-111-2/+2
| | | | | | | | | | | | | | | | | Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation. Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851
* AMDGPU/GlobalISel: Implement custom kernel arg loweringMatt Arsenault2018-07-051-15/+0
| | | | | | | | | | | | | Avoid using allocateKernArg / AssignFn. We do not want any of the type splitting properties of normal calling convention lowering. For now at least this exists alongside the IR argument lowering pass. This is necessary to handle struct padding correctly while some arguments are still skipped by the IR argument lowering pass. llvm-svn: 336373
* AMDGPU: Separate R600 and GCN TableGen filesTom Stellard2018-06-281-16/+1
| | | | | | | | | | | | | | | | | | | | | Summary: We now have two sets of generated TableGen files, one for R600 and one for GCN, so each sub-target now has its own tables of instructions, registers, ISel patterns, etc. This should help reduce compile time since each sub-target now only has to consider information that is specific to itself. This will also help prevent the R600 sub-target from slowing down new features for GCN, like disassembler support, GlobalISel, etc. Reviewers: arsenm, nhaehnle, jvesely Reviewed By: arsenm Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46365 llvm-svn: 335942
* AMDGPU: Make v4i16/v4f16 legalMatt Arsenault2018-06-151-2/+2
| | | | | | | Some image loads return these, and it's awkward working around them not being legal. llvm-svn: 334835
* AMDGPU: Initial implementation of callsMatt Arsenault2017-08-011-0/+4
| | | | | | | | | Includes a hack to fix the type selected for the GlobalAddress of the function, which will be fixed by changing the default datalayout to use generic pointers for 0. llvm-svn: 309732
* AMDGPU: Start defining a calling conventionMatt Arsenault2017-05-171-1/+49
| | | | | | | | Partially implement callee-side for arguments and return values. byval doesn't work properly, and most likely sret or other on-stack return values most as well. llvm-svn: 303308
* AMDGPU: Stop using CCAssignToRegWithShadowMatt Arsenault2017-04-061-19/+5
| | | | | | | This does not do what it is attempting to use it for and requires working around in LowerFormalArguments. llvm-svn: 299667
* AMDGPU: Add f16 to shader calling conventionsMatt Arsenault2017-02-271-3/+3
| | | | | | Mostly useful for writing tests for f16 features. llvm-svn: 296370
* AMDGPU: Fix kernel argument alignment impacting stack sizeMatt Arsenault2016-06-181-1/+1
| | | | | | | | Don't use AllocateStack because kernel arguments have nothing to do with the stack. The ensureMaxAlignment call was still changing the stack alignment. llvm-svn: 273080
* AMDGPU: Add a shader calling conventionNicolai Haehnle2016-04-061-4/+2
| | | | | | | | | | | This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
* AMDGPU/SI: Add support for non-void functionsMarek Olsak2016-01-131-0/+31
| | | | | | | | | | | | | | | | | | | Summary: Return values can be stored in SGPRs (i32) and VGPRs (f32). This will be used by functions which expect some bytecode or other binary to be appended at the end. It allows defining in which registers the return values will be stored. v2: don't do this for compute shaders Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16033 llvm-svn: 257621
* AMDGPU/SI: Allow more shader inputsMarek Olsak2016-01-131-6/+30
| | | | | | | | | | Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16032 llvm-svn: 257593
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+82
llvm-svn: 239657
OpenPOWER on IntegriCloud