summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/VOP2Instructions.td
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Split VOP2Inst into VOP2Inst_e32/e64/sdwaKonstantin Zhuravlyov2018-09-271-10/+32
| | | | llvm-svn: 343259
* AMDGPU/NFC: Simplify VOP_MAC_F16/F32Konstantin Zhuravlyov2018-09-271-11/+2
| | | | llvm-svn: 343254
* [AMDGPU] Divergence driven instruction selection. Part 1.Alexander Timofeev2018-09-211-24/+78
| | | | | | | | | | | | | Summary: This change is the first part of the AMDGPU target description change. The aim of it is the effective splitting the vector and scalar flows at the selection stage. Selection uses predicate functions based on the framework implemented earlier - https://reviews.llvm.org/D35267 Differential revision: https://reviews.llvm.org/D52019 Reviewers: rampitec llvm-svn: 342719
* AMDGPU: Fix getInstSizeInBytesNicolai Haehnle2018-08-291-9/+0
| | | | | | | | | | | | | | | | | | | | | | | Summary: Add some optional code to validate getInstSizeInBytes for emitted instructions. This flushed out some issues which are fixed by this patch: - Streamline getInstSizeInBytes - Properly define the VI readlane/writelane instruction as VOP3 - Fix the inline constant determination. Specifically, this change fixes an issue where a 32-bit value of 0xffffffff was recorded as unsigned. This is equal to -1 when restricting to a 32-bit comparison, and an inline constant can be used. Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D50629 Change-Id: Id87c3b7975839da0de8156a124b0ce98c5fb47f2 llvm-svn: 340903
* AMDGPU: Improve hack for packing conversion opsMatt Arsenault2018-08-011-5/+5
| | | | | | | | | | | Mutate the node type during selection when it doesn't matter. This avoids an intermediate bitcast node on targets with legal i16/f16. Also fixes missing output modifiers on v_cvt_pkrtz_f32_f16, which I assume are OK. llvm-svn: 338619
* AMDGPU: Add Vega12 and Vega20Matt Arsenault2018-04-301-0/+20
| | | | | | | | Changes by Matt Arsenault Konstantin Zhuravlyov llvm-svn: 331215
* [AMDGPU][MC][VI][GFX9] Added support of SDWA/DPP for v_cndmask_b32Dmitry Preobrazhensky2018-04-161-1/+23
| | | | | | | | | See bug 36356: https://bugs.llvm.org/show_bug.cgi?id=36356 Differential Revision: https://reviews.llvm.org/D45446 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 330123
* AMDGPU: Introduce common SOP_Pseudo and VOP_Pseudo TableGen base classesNicolai Haehnle2018-03-261-12/+2
| | | | | | | Differential revision: https://reviews.llvm.org/D44820 Change-Id: I732979e2964006aa15d78a333d8886e6855f319a llvm-svn: 328496
* [AMDGPU] added writelane intrinsicTim Renouf2018-02-281-4/+9
| | | | | | | | | | | | | | | | | Summary: For use by LLPC SPV_AMD_shader_ballot extension. The v_writelane instruction was already implemented for use by SGPR spilling, but I had to add an extra dummy operand tied to the destination, to represent that all lanes except the selected one keep the old value of the destination register. .ll test changes were due to schedule changes caused by that new operand. Differential Revision: https://reviews.llvm.org/D42838 llvm-svn: 326353
* AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16}Marek Olsak2018-01-311-4/+4
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D41663 llvm-svn: 323908
* [AMDGPU] Copy impdefs from pseudo to real instructionsStanislav Mekhanoshin2018-01-151-0/+1
| | | | | | | | | | In some cases we do not copy implicit defs from pseudo to real VOP instructions. It has no visible impact at the moment thus no tests are affected or added. Differential Revision: https://reviews.llvm.org/D41783 llvm-svn: 322496
* [AMDGPU][MC][GFX9] Corrected mapping of GFX9 v_add/sub/subrev_u32Dmitry Preobrazhensky2017-11-291-9/+14
| | | | | | | | | | | When translating pseudo to MC, v_add/sub/subrev_u32 shall be mapped via a separate table as GFX8 has opcodes with the same names. These instructions shall also be labelled as renamed for pseudoToMCOpcode to handle them correctly. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D40550 llvm-svn: 319311
* [AMDGPU][MC][GFX8][GFX9] Corrected names of integer ↵Dmitry Preobrazhensky2017-11-201-45/+120
| | | | | | | | | | | | v_{add/addc/sub/subrev/subb/subbrev} See bug 34765: https://bugs.llvm.org//show_bug.cgi?id=34765 Reviewers: tamazov, SamWot, arsenm, vpykhtin Differential Revision: https://reviews.llvm.org/D40088 llvm-svn: 318675
* AMDGPU: Remove global isGCN predicatesMatt Arsenault2017-10-031-14/+14
| | | | | | | | | | | | | | These are problematic because they apply to everything, and can easily clobber whatever more specific predicate you are trying to add to a function. Currently instructions use SubtargetPredicate/PredicateControl to apply this to patterns applied to an instruction definition, but not to free standing Pats. Add a wrapper around Pat so the special PredicateControls requirements can be appended to the final predicate list like how Mips does it. llvm-svn: 314742
* [AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodesDmitry Preobrazhensky2017-08-161-4/+4
| | | | | | | | | | See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152 Reviewers: SamWot, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D36674 llvm-svn: 311006
* [AMDGPU] Add pseudo "old" source to all DPP instructionsConnor Abbott2017-08-071-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: All instructions with the DPP modifier may not write to certain lanes of the output if bound_ctrl=1 is set or any bits in bank_mask or row_mask aren't set, so the destination register may be both defined and modified. The right way to handle this is to add a constraint that the destination register is the same as one of the inputs. We could tie the destination to the first source, but that would be too restrictive for some use-cases where we want the destination to be some other value before the instruction executes. Instead, add a fake "old" source and tie it to the destination. Effectively, the "old" source defines what value unwritten lanes will get. We'll expose this functionality to users with a new intrinsic later. Also, we want to use DPP instructions for computing derivatives, which means we need to set WQM for them. We also need to enable the entire wavefront when using DPP intrinsics to implement nonuniform subgroup reductions, since otherwise we'll get incorrect results in some cases. To accomodate this, add a new operand to all DPP instructions which will be interpreted by the SI WQM pass. This will be exposed with a new intrinsic later. We'll also add support for Whole Wavefront Mode later. I also fixed llvm.amdgcn.mov.dpp to overwrite the source and fixed up the test. However, I could also keep the old behavior (where lanes that aren't written are undefined) if people want it. Reviewers: tstellar, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34716 llvm-svn: 310283
* AMDGPU: Add encoding for carryless add/sub instructionsMatt Arsenault2017-07-201-0/+14
| | | | llvm-svn: 308639
* [AMDGPU] resubmit r308179: CodeGen: check dst operand type to determine if ↵Sam Kolton2017-07-181-3/+8
| | | | | | omod is supported for VOP3 instructions llvm-svn: 308310
* Revert r308179 which causes tablegen to spam stderr on every build.Chandler Carruth2017-07-181-8/+3
| | | | | | | Original commit log: [AMDGPU] CodeGen: check dst operand type to determine if omod is supported for VOP3 instructions llvm-svn: 308270
* [AMDGPU] CodeGen: check dst operand type to determine if omod is supported ↵Sam Kolton2017-07-171-3/+8
| | | | | | | | | | | | | | | | for VOP3 instructions Summary: Previously, CodeGen checked first src operand type to determine if omod is supported by instruction. This isn't correct for some instructions: e.g. V_CMP_EQ_F32 has floating-point src operands but desn't support omod. Changed .td files to check if dst operand instead of src operand. Reviewers: arsenm, vpykhtin Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D35350 llvm-svn: 308179
* [AMDGPU] SDWA: remove support for VOP2 instructions that have only 64-bit ↵Sam Kolton2017-06-221-11/+15
| | | | | | | | | | | | | | | | encoding Summary: Despite that this instructions are listed in VOP2, they are treated as VOP3 in specs. They should not support SDWA. There are no real instructions for them, but there are pseudo instructions. Reviewers: arsenm, vpykhtin, cfang Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34403 llvm-svn: 305999
* [AMDGPU] simplify add x, *ext (setcc) => addc|subb x, 0, setccStanislav Mekhanoshin2017-06-211-0/+9
| | | | | | | | | This simplification allows to avoid generating v_cndmask_b32 to serialize condition code between compare and use. Differential Revision: https://reviews.llvm.org/D34300 llvm-svn: 305962
* [AMDGPU] SDWA: merge VI and GFX9 pseudo instructionsSam Kolton2017-06-211-26/+7
| | | | | | | | | | | | Summary: Previously there were two separate pseudo instruction for SDWA on VI and on GFX9. Created one pseudo instruction that is union of both of them. Added verifier to check that operands conform either VI or GFX9. Reviewers: dp, arsenm, vpykhtin Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov Differential Revision: https://reviews.llvm.org/D34026 llvm-svn: 305886
* [AMDGPU] SDWA: Add assembler support for GFX9Sam Kolton2017-05-231-12/+58
| | | | | | | | | | | | | | | Summary: Added separate pseudo and real instruction for GFX9 SDWA instructions. Currently supports only in assembler. Depends D32493 Reviewers: vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33132 llvm-svn: 303620
* [AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64Dmitry Preobrazhensky2017-05-151-11/+22
| | | | | | | | | | See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D33123 llvm-svn: 303070
* [AMDGPU][MC] Corrected v_madak/madmk to avoid printing "_e32" in ↵Dmitry Preobrazhensky2017-05-101-6/+12
| | | | | | | | | | | | disassembler output See bug 32927: https://bugs.llvm.org//show_bug.cgi?id=32927 Reviewers: vpykhtin, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D32913 llvm-svn: 302648
* AMDGPU: Fix crash when disassembling VOP3 macMatt Arsenault2017-04-101-0/+2
| | | | | | | | | | | | The unused dummy src2_modifiers is missing, so it crashes when trying to print it. I tried to fully remove src2_modifiers, but there are some irritations in the places where it is converted to mad since it starts to require modifying use lists while iterating over them. llvm-svn: 299861
* [AMDGPU][MC] Fix for Bug 28167 + LIT testsDmitry Preobrazhensky2017-04-051-1/+4
| | | | | | | | | | | | Corrected src0 for v_writelane_b32: - Enabled inline constants and literals for SI/CI (VOP2) - Enabled inline constants for VI (VOP3) Reviewers: vpykhtin, arsenm https://reviews.llvm.org/D31463 llvm-svn: 299555
* [AMDGPU][MC] Fix for Bug 30829 + LIT testsDmitry Preobrazhensky2017-03-031-0/+2
| | | | | | | | Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction). Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code). Added LIT tests. llvm-svn: 296873
* AMDGPU: Add VOP3P instruction formatMatt Arsenault2017-02-271-3/+4
| | | | | | | | Add a few non-VOP3P but instructions related to packed. Includes hack with dummy operands for the benefit of the assembler llvm-svn: 296368
* AMDGPU: Add cvt.pkrtz intrinsicMatt Arsenault2017-02-221-1/+1
| | | | | | Convert llvm.SI.packf16 test uses llvm-svn: 295797
* AMDGPU: Fix trailing whitespaceMatt Arsenault2017-02-101-3/+3
| | | | llvm-svn: 294694
* AMDGPU: Undo sub x, c -> add x, -c canonicalizationMatt Arsenault2017-01-301-0/+8
| | | | | | | | | This is worse if the original constant is an inline immediate. This should also be done for 64-bit adds, but requires fixing operand folding bugs first. llvm-svn: 293540
* [AMDGPU] Add subtarget features for SDWA/DPPSam Kolton2017-01-201-4/+4
| | | | | | | | | | Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28900 llvm-svn: 292596
* [AMDGPU] Assembler: SDWA/DPP should not accept scalar registers and ↵Sam Kolton2017-01-111-4/+4
| | | | | | | | | | | | immediate operands Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28157 llvm-svn: 291668
* [AMDGPU] Assembler: support SDWA and DPP for VOP2b instructionsSam Kolton2016-12-271-1/+26
| | | | | | | | | | Reviewers: nhaustov, artem.tamazov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28051 llvm-svn: 290599
* AMDGPU: Use i16 for i16 shift amountMatt Arsenault2016-12-221-6/+6
| | | | llvm-svn: 290351
* [AMDGPU] Add pseudo SDWA instructionsSam Kolton2016-12-221-22/+28
| | | | | | | | | | | | Summary: This is needed for later SDWA support in CodeGen. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27412 llvm-svn: 290338
* [AMDGPU] Disassembler: fix for disaasembling v_mac_f32/16_dpp/sdwaSam Kolton2016-12-221-4/+11
| | | | | | | | | | | | Summary: Real instruction should copy constraints from real instruction. This allows auto-generated disassembler to correctly process tied operands. Reviewers: nhaustov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27847 llvm-svn: 290336
* AMDGPU: Fix name for v_ashrrev_i16Matt Arsenault2016-12-161-3/+3
| | | | llvm-svn: 289967
* AMDGPU: Fix handling of 16-bit immediatesMatt Arsenault2016-12-101-2/+4
| | | | | | | | | | | | | | | | | | Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type. Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER. The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already. llvm-svn: 289306
* AMDGPU: Select i16 instructions to VOP3 formsMatt Arsenault2016-12-091-10/+10
| | | | | | | | | | | | | | These were selecting directly to the VOP2 form instead of VOP3 like the i32 instructions. Fixes regressions in future commits where an immediate isn't folded because it was initially used for the second operand. Because uniform 16-bit operations are promoted to i32, it's difficult to get a simple testcase where this matters. Fold failures in SIFoldOperands here tend to be hidden by commute and fold in SIShrinkInstructions. llvm-svn: 289189
* AMDGPU: Fix commuting v_sub_u16Matt Arsenault2016-12-081-1/+1
| | | | | | | | The correct commutable opcode was set to itself, so this was simply swapping the operands to commute instead of also changing the opcode to v_subrev_u16. llvm-svn: 289093
* AMDGPU/SI: Remove zero_extend patterns for i16 ops selected to 32-bit instsTom Stellard2016-11-181-3/+14
| | | | | | | | | | | | | | Summary: The 32-bit instructions don't zero the high 16-bits like the 16-bit instructions do. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26828 llvm-svn: 287342
* AMDGPU/SI: Fix pattern for i16 = sign_extend i1Tom Stellard2016-11-151-1/+5
| | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26670 llvm-svn: 287035
* [AMDGPU] Add f16 support (VI+)Konstantin Zhuravlyov2016-11-131-19/+40
| | | | | | Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753
* AMDGPU: Add VI i16 supportTom Stellard2016-11-101-0/+72
| | | | | | | | Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 286464
* Revert "AMDGPU: Add VI i16 support"Tom Stellard2016-11-041-72/+0
| | | | | | This reverts commit r285939 and r285948. These broke some conformance tests. llvm-svn: 285995
* AMDGPU: Add VI i16 supportTom Stellard2016-11-031-0/+72
| | | | | | | | Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 285939
* [AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx ↵Sam Kolton2016-10-071-0/+1
| | | | | | | | | | | | to AMDGPUBaseInfo.h Reviewers: artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25084 llvm-svn: 283560
OpenPOWER on IntegriCloud