summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/SIInstrFormats.td
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] Added MI bit IsDOTStanislav Mekhanoshin2019-09-171-0/+5
| | | | | | | | NFC, needed for future commit. Differential Revision: https://reviews.llvm.org/D67669 llvm-svn: 372151
* [AMDGPU] Extend MIMG opcode to 8 bitsStanislav Mekhanoshin2019-07-121-2/+3
| | | | | | | | This is NFC, but required for future commit. Differential Revision: https://reviews.llvm.org/D64649 llvm-svn: 365940
* [AMDGPU] gfx908 mAI instructions, MC partStanislav Mekhanoshin2019-07-091-0/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D64446 llvm-svn: 365563
* [AMDGPU] hazard recognizer for fp atomic to s_denorm_modeStanislav Mekhanoshin2019-06-211-0/+5
| | | | | | | | | This requires 3 wait states unless there is a wait or VALU in between. Differential Revision: https://reviews.llvm.org/D63619 llvm-svn: 364074
* [AMDGPU] gfx1010 MIMG implementationStanislav Mekhanoshin2019-05-011-6/+26
| | | | | | Differential Revision: https://reviews.llvm.org/D61339 llvm-svn: 359698
* [AMDGPU] predicate and feature refactoringStanislav Mekhanoshin2019-04-051-6/+7
| | | | | | | | | We have done some predicate and feature refactoring lately but did not upstream it. This is to sync. Differential revision: https://reviews.llvm.org/D60292 llvm-svn: 357791
* AMDGPU: Remove GCN features and predicatesMatt Arsenault2019-02-081-5/+0
| | | | | | | These are no longer necessary since the R600 tablegen files are split out now. llvm-svn: 353548
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [AMDGPU] Add new Mode Register passTim Corringham2018-12-101-0/+6
| | | | | | | | | | | | | | | A new pass to manage the Mode register. Currently this just manages the floating point double precision rounding requirements, but is intended to be easily extended to encompass all Mode register settings. The immediate motivation comes from the requirement to use the round-to-zero rounding mode for the 16 bit interpolation instructions, where the rounding mode setting is shared between 16 and 64 bit operations. llvm-svn: 348754
* AMDGPU: Refactor Subtarget classesTom Stellard2018-07-111-2/+2
| | | | | | | | | | | | | | | | | Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation. Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851
* [AMDGPU] Add VALU to V_INTERP InstructionsRyan Taylor2018-07-051-0/+1
| | | | | | | | | | | | Wait states are not properly being inserted after buffer_store for v_interp instructions. Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can check and insert the appropriate wait states when needed. Differential Revision: https://reviews.llvm.org/D48772 Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64 llvm-svn: 336339
* AMDGPU: Separate R600 and GCN TableGen filesTom Stellard2018-06-281-1/+1
| | | | | | | | | | | | | | | | | | | | | Summary: We now have two sets of generated TableGen files, one for R600 and one for GCN, so each sub-target now has its own tables of instructions, registers, ISel patterns, etc. This should help reduce compile time since each sub-target now only has to consider information that is specific to itself. This will also help prevent the R600 sub-target from slowing down new features for GCN, like disassembler support, GlobalISel, etc. Reviewers: arsenm, nhaehnle, jvesely Reviewed By: arsenm Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46365 llvm-svn: 335942
* AMDGPU: Refactor MIMG instruction TableGen using generic tablesNicolai Haehnle2018-06-211-14/+0
| | | | | | | | | | | | | | | | | | | | Summary: This allows us to access rich information about MIMG opcodes from C++ code. Simplifying the mapping between equivalent opcodes of different data size becomes quite natural. This also flattens the MIMG-related class and multiclass hierarchy a little, and collapses together some of the scaffolding for sample and gather4 opcodes. Change-Id: I1a2549fdc1e881ff100e5393d2d87e73729a0ccd Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48016 llvm-svn: 335227
* AMDGPU: Turn D16 for MIMG instructions into a regular operandNicolai Haehnle2018-06-211-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This allows us to reduce the number of different machine instruction opcodes, which reduces the table sizes and helps flatten the TableGen multiclass hierarchies. We can do this because for each hardware MIMG opcode, we have a full set of IMAGE_xxx_Vn_Vm machine instructions for all required sizes of vdata and vaddr registers. Instead of having separate D16 machine instructions, a packed D16 instructions loading e.g. 4 components can simply use the same V2 opcode variant that non-D16 instructions use. We still require a TSFlag for D16 buffer instructions, because the D16-ness of buffer instructions is part of the opcode. Renaming the flag should help avoid future confusion. The one non-obvious code change is that for gather4 instructions, the disassembler can no longer automatically decide whether to use a V2 or a V4 variant. The existing logic which choose the correct variant for other MIMG instruction is extended to cover gather4 as well. As a bonus, some of the assembler error messages are now more helpful (e.g., complaining about a wrong data size instead of a non-existing instruction). While we're at it, delete a whole bunch of dead legacy TableGen code. Change-Id: I89b02c2841c06f95e662541433e597f5d4553978 Reviewers: arsenm, rampitec, kzhuravl, artem.tamazov, dp, rtaylor Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47434 llvm-svn: 335222
* [MachineOperand][Target] MachineOperand::isRenamable semantics changesGeoff Berry2018-02-231-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a target option AllowRegisterRenaming that is used to opt in to post-register-allocation renaming of registers. This is set to 0 by default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq fields of all opcodes to be set to 1, causing MachineOperand::isRenamable to always return false. Set the AllowRegisterRenaming flag to 1 for all in-tree targets that have lit tests that were effected by enabling COPY forwarding in MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC, RISCV, Sparc, SystemZ and X86). Add some more comments describing the semantics of the MachineOperand::isRenamable function and how it is set and maintained. Change isRenamable to check the operand's opcode hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of relying on it being consistently reflected in the IsRenamable bit setting. Clear the IsRenamable bit when changing an operand's register value. Remove target code that was clearing the IsRenamable bit when changing registers/opcodes now that this is done conservatively by default. Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in one place covering all opcodes that have constant pipe read limit restrictions. Reviewers: qcolombet, MatzeB Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D43042 llvm-svn: 325931
* [AMDGPU] isRenamable fixes to support copy forwardingGeoff Berry2018-01-301-0/+2
| | | | | | | | | | | Mark more opcodes as hasExtraSrcRegAllocReq so that their operands will be marked as not renamable, to avoid copy forwarding violating the constraint that only one operand may use the constant bus. These changes fix a few mis-compiles when copy forwarding is enabled in MachineCopyPropagation by D41835 (and were reviewed as part of that change). llvm-svn: 323794
* AMDGPU/SI: Add d16 support for image intrinsics.Changpeng Fang2018-01-181-0/+7
| | | | | | | | | | | | | Summary: This patch implements d16 support for image load, image store and image sample intrinsics. Reviewers: Matt, Brian. Differential Revision: https://reviews.llvm.org/D3991 llvm-svn: 322903
* [AMDGPU][MC][GFX8][GFX9] Corrected names of integer ↵Dmitry Preobrazhensky2017-11-201-4/+4
| | | | | | | | | | | | v_{add/addc/sub/subrev/subb/subbrev} See bug 34765: https://bugs.llvm.org//show_bug.cgi?id=34765 Reviewers: tamazov, SamWot, arsenm, vpykhtin Differential Revision: https://reviews.llvm.org/D40088 llvm-svn: 318675
* [AMDGPU][MC][GFX9][disassembler] Corrected decoding of op_sel_hi for v_mad_mix*Dmitry Preobrazhensky2017-11-171-0/+5
| | | | | | | | | | See bug 35148: https://bugs.llvm.org//show_bug.cgi?id=35148 Reviewers: tamazov, SamWot, arsenm Differential Revision: https://reviews.llvm.org/D39492 llvm-svn: 318526
* AMDGPU: Remove global isGCN predicatesMatt Arsenault2017-10-031-0/+9
| | | | | | | | | | | | | | These are problematic because they apply to everything, and can easily clobber whatever more specific predicate you are trying to add to a function. Currently instructions use SubtargetPredicate/PredicateControl to apply this to patterns applied to an instruction definition, but not to free standing Pats. Add a wrapper around Pat so the special PredicateControls requirements can be appended to the final predicate list like how Mips does it. llvm-svn: 314742
* AMDGPU: Fold clamp modifier for packed instructionsMatt Arsenault2017-08-311-8/+19
| | | | llvm-svn: 312297
* [AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodesDmitry Preobrazhensky2017-08-161-0/+5
| | | | | | | | | | See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152 Reviewers: SamWot, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D36674 llvm-svn: 311006
* [AMDGPU][MC][GFX9] Added 16-bit renamed and "_legacy" VALU opcodesDmitry Preobrazhensky2017-08-091-0/+5
| | | | | | | | | | See Bug 33629: https://bugs.llvm.org//show_bug.cgi?id=33629 Reviewers: vpykhtin, SamWot, arsenm Differential Revision: https://reviews.llvm.org/D36322 llvm-svn: 310497
* AMDGPU: Introduce maybeAtomic instruction flagKonstantin Zhuravlyov2017-07-211-1/+6
| | | | | | Testing is in the follow up change llvm-svn: 308779
* [AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifierDmitry Preobrazhensky2017-07-211-0/+5
| | | | | | | | | | See bug 33591: https://bugs.llvm.org//show_bug.cgi?id=33591 Reviewers: vpykhtin, artem.tamazov, SamWot, arsenm Differential Revision: https://reviews.llvm.org/D35424 llvm-svn: 308740
* [AMDGPU][MC] Fixed bugs in export instructionDmitry Preobrazhensky2017-05-191-8/+8
| | | | | | | | | | | | See Bugs 33019, 33056: https://bugs.llvm.org//show_bug.cgi?id=33019 https://bugs.llvm.org//show_bug.cgi?id=33056 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D33288 llvm-svn: 303423
* AMDGPU: Unify divergent function exits.Matt Arsenault2017-03-241-6/+6
| | | | | | | | | | StructurizeCFG can't handle cases with multiple returns creating regions with multiple exits. Create a copy of UnifyFunctionExitNodes that only unifies exit nodes that skips exit nodes with uniform branch sources. llvm-svn: 298729
* AMDGPU: Add VOP3P instruction formatMatt Arsenault2017-02-271-0/+2
| | | | | | | | Add a few non-VOP3P but instructions related to packed. Includes hack with dummy operands for the benefit of the assembler llvm-svn: 296368
* AMDGPU: Fold FP clamp as modifier bitMatt Arsenault2017-02-221-0/+5
| | | | | | | | | | | The manual is unclear on the details of this. It's not clear to me if denormals are not allowed with clamp, or if that is only omod. Not allowing denorms for fp16 or fp64 isn't useful so I also question if that is really a restriction. Same with whether this is valid without IEEE mode enabled. llvm-svn: 295905
* AMDGPU: Fix vintrp disassemblyMatt Arsenault2016-12-101-1/+1
| | | | llvm-svn: 289292
* AMDGPU: Clean up instruction bitsMatt Arsenault2016-12-091-42/+55
| | | | | | | | | Sort the instruction bits by type and make sure there is one for each format. Also cleanup namespaces. llvm-svn: 289229
* AMDGPU/SI: Don't mark VINTRP instructions as mayLoadTom Stellard2016-12-091-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: These instructions technically do read from memory, but the memory is considered to be out of bounds for normal load/store instructions. shader-db stats: SGPRS: 1416075 -> 1413323 (-0.19 %) VGPRS: 867413 -> 863935 (-0.40 %) Spilled SGPRs: 1409 -> 1354 (-3.90 %) Spilled VGPRs: 63 -> 63 (0.00 %) Private memory VGPRs: 880 -> 880 (0.00 %) Scratch size: 2648 -> 2632 (-0.60 %) dwords per thread Code Size: 37889052 -> 37897340 (0.02 %) bytes LDS: 2147 -> 2147 (0.00 %) blocks Max Waves: 279243 -> 280369 (0.40 %) Wait states: 0 -> 0 (0.00 %) Reviewers: nhaehnle, mareko, arsenm Subscribers: kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27593 llvm-svn: 289219
* AMDGPU: Refactor exp instructionsMatt Arsenault2016-12-051-9/+22
| | | | | | | | | | | | | | | Structure the definitions a bit more like the other classes. The main change here is to split EXP with the done bit set to a separate opcode, so we can set mayLoad = 1 so that it won't be reordered before the other exp stores, since this has the special constraint that if the done bit is set then this should be the last exp in she shader. Previously all exp instructions were inferred to have unmodeled side effects. llvm-svn: 288695
* [AMDGPU] TableGen: change individual instruction flags to bit type from bits<1>Sam Kolton2016-11-151-35/+35
| | | | | | | | | | | | Summary: This is needed to be able to use this flags in InstrMappings. Reviewers: tstellarAMD, vpykhtin Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D26666 llvm-svn: 286960
* AMDGPU: Workaround for instruction size with literalsMatt Arsenault2016-11-011-0/+5
| | | | | | | | | | Instructions with a 32-bit base encoding with an optional 32-bit literal encoded after them report their size as 4 for the disassembler. Consider these when computing the MachineInstr size. This fixes problems caused by size estimate consistency in BranchRelaxation. llvm-svn: 285743
* AMDGPU: Add definitions for scalar store instructionsMatt Arsenault2016-10-281-0/+6
| | | | | | | | | | Also add glc bit to the scalar loads since they exist on VI and change the caching behavior. This currently has an assembler bug where the glc bit is incorrectly accepted on SI/CI which do not have it. llvm-svn: 285463
* [AMDGPU] Refactor VOP1 and VOP2 instruction TD definitionsValery Pykhtin2016-09-231-166/+0
| | | | | | Differential revision: https://reviews.llvm.org/D24738 llvm-svn: 282234
* [AMDGPU] Refactor VOPC instruction TD definitionsValery Pykhtin2016-09-191-29/+0
| | | | | | Differential Revision: https://reviews.llvm.org/D24546 llvm-svn: 281903
* AMDGPU: Allow some control flow intrinsics to be CSEdMatt Arsenault2016-09-161-0/+3
| | | | | | | | | | | These clean up some unnecessary or instructions in cases with complex loops. In the original testcase I noticed this, the same or with exec was repeated 5 or 6 times in a row. With this only one is emitted or sometimes a copy. llvm-svn: 281786
* AMDGPU: Use SOPK compare instructionsMatt Arsenault2016-09-161-0/+5
| | | | llvm-svn: 281780
* Revert "AMDGPU: Use SOPK compare instructions"Matt Arsenault2016-09-141-5/+0
| | | | | | Accidentally committed llvm-svn: 281514
* AMDGPU: Use SOPK compare instructionsMatt Arsenault2016-09-141-0/+5
| | | | llvm-svn: 281513
* [AMDGPU] Refactor MUBUF/MTBUF instructionsValery Pykhtin2016-09-101-93/+0
| | | | | | Differential revision: https://reviews.llvm.org/D24295 llvm-svn: 281137
* AMDGPU: Implement is{LoadFrom|StoreTo}FrameIndexMatt Arsenault2016-09-101-3/+5
| | | | llvm-svn: 281128
* [AMDGPU] Assembler: match e32 VOP instructions before e64.Sam Kolton2016-09-091-0/+3
| | | | | | | | | | | | | | | | | | | Summary: Split assembler match table in 4 tables with assembler variants: Default - all instructions except VOP3, SDWA and DPP - VOP3 - SDWA - DPP First match Default table then VOP3, SDWA and DPP. Reviewers: tstellarAMD, artem.tamazov, vpykhtin Subscribers: arsenm, wdng, nhaehnle, AMDGPU Differential Revision: https://reviews.llvm.org/D24252 llvm-svn: 281023
* [AMDGPU] Refactor FLAT TD instructionsValery Pykhtin2016-09-051-36/+0
| | | | | | Differential revision: https://reviews.llvm.org/D24072 llvm-svn: 280655
* [AMDGPU] Scalar Memory instructions TD refactoringValery Pykhtin2016-09-011-54/+0
| | | | | | Differential revision: https://reviews.llvm.org/D23996 llvm-svn: 280349
* [AMDGPU] Refactor SOP instructions TD files.Valery Pykhtin2016-08-301-126/+0
| | | | | | Differential revision: https://reviews.llvm.org/D23617 llvm-svn: 280101
* AMDGPU: Remove unneeded implicit exec uses/defsMatt Arsenault2016-08-271-0/+19
| | | | | | | SI_BREAK, SI_IF_BREAK, and SI_ELSE_BREAK do not def exec. SI_IF_BREAK and SI_ELSE_BREAK do not read it either. llvm-svn: 279909
* AMDGPU: Stay in WQM for non-intrinsic storesNicolai Haehnle2016-08-021-0/+6
| | | | | | | | | | | | | | | | | | | | | | | Summary: Two types of stores are possible in pixel shaders: stores to memory that are explicitly requested at the API level, and stores that are an implementation detail of register spilling or lowering of arrays. For the first kind of store, we must ensure that helper pixels have no effect and hence WQM must be disabled. The second kind of store must always be executed, because the written value may be loaded again in a way that is relevant for helper pixels as well -- and there are no externally visible effects anyway. This is a candidate for the 3.9 release branch. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D22675 llvm-svn: 277504
OpenPOWER on IntegriCloud