summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/SIInstructions.td
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU/SI: Refactor fixup handling for constant addrspace variablesTom Stellard2016-06-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Re-commit this after fixing a bug where we were trying to use a reference to a Triple object that had already been destroyed. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 llvm-svn: 272705
* Revert "AMDGPU/SI: Refactor fixup handling for constant addrspace variables"Tom Stellard2016-06-141-3/+3
| | | | | | This reverts commit r272675. llvm-svn: 272677
* AMDGPU/SI: Refactor fixup handling for constant addrspace variablesTom Stellard2016-06-141-3/+3
| | | | | | | | | | | | | | | | | | | Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 llvm-svn: 272675
* AMDGPU: Fix i64 global cmpxchgMatt Arsenault2016-06-091-31/+0
| | | | | | | | | | This was using extract_subreg sub0 to extract the low register of the result instead of sub0_sub1, producing an invalid copy. There doesn't seem to be a way to use the compound subreg indices in tablegen since those are generated, so manually select it. llvm-svn: 272344
* [AMDGPU][llvm-mc] v_cndmask_b32: src2 is mandatory; do not enforce VOP2 when ↵Artem Tamazov2016-06-061-9/+3
| | | | | | | | | | | | src2 == VCC. Another step for unification llvm assembler/disassembler with sp3. Besides, CodeGen output is a bit improved, thus changes in CodeGen tests. Assembler/Disassembler tests updated/added. Differential Revision: http://reviews.llvm.org/D20796 llvm-svn: 271900
* AMDGPU: Add fract intrinsicMatt Arsenault2016-05-281-17/+17
| | | | | | | | | Remove broken patterns matching it. This was matching the unsafe math pattern and expanding the fix for the buggy instruction from the pattern. The problems are also on CI. Remove the workarounds and only use fract with unsafe math or from the intrinsic. llvm-svn: 271078
* AMDGPU: Fix v2i64/v2f64 bitcastsMatt Arsenault2016-05-251-0/+2
| | | | | | | These operations tend to get promoted away to v4i32 so this doesn't happen often. llvm-svn: 270740
* AMDGPU: Make CONST_DATA_PTR available to R600Jan Vesely2016-05-131-1/+1
| | | | | | | | | | | | Rename to AMDGPUconstdata_ptr Reviewers: tstellard Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D19786 llvm-svn: 269474
* AMDGPU: Make some instructions convergentMatt Arsenault2016-05-111-1/+7
| | | | llvm-svn: 269147
* [AMDGPU][llvm-mc] Some refactoring of .td filesArtem Tamazov2016-05-061-20/+0
| | | | | | | | | Some custom Operands and AsmOperandClasses moved to proper place. No functional changes. Differential Revision: http://reviews.llvm.org/D20012 llvm-svn: 268780
* [AMDGPU][llvm-mc] Add support for sendmsg(...) syntax.Artem Tamazov2016-05-061-1/+2
| | | | | | | | | | | | | | | | | | | Added support for sendmsg(MSG[, OP[, STREAM_ID]]) syntax in s_sendmsg and s_sendmsghalt instructions. The syntax matches the SP3 assembler/disassembler rules. That is why implicit inputs (like M0 and EXEC) are not printed to disassembly output anymore. sendmsg(...) allows only known message types and attributes, even if literals are used instead of symbolic names. However, raw literal (without "sendmsg") still can be used, and that allows for any 16-bit value. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19596 llvm-svn: 268762
* Fixed/Recommitted r267733 "[AMDGPU][llvm-mc] Add support of TTMP quads. ↵Artem Tamazov2016-04-291-7/+7
| | | | | | | | | | | Rework M0 exclusion for SMRD." Previously reverted by r267752. r267733 review: Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 268066
* Revert "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for ↵Chad Rosier2016-04-271-7/+7
| | | | | | | | SMRD." This reverts commit r267733 due to a -Werror,-Wunused-function error. llvm-svn: 267752
* [AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD.Artem Tamazov2016-04-271-7/+7
| | | | | | | | | | | Added support of TTMP quads. Reworked M0 exclusion machinery for SMRD and similar instructions to enable usage of TTMP registers in those instructions as destinations. Tests added. Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 267733
* AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsicNicolai Haehnle2016-04-271-1/+8
| | | | | | | | | | | | | | | | | Summary: So it appears that to guarantee some of the ordering requirements of a GLSL memoryBarrier() executed in the shader, we need to emit an s_waitcnt. (We can't use an s_barrier, because memoryBarrier() may appear anywhere in the shader, in particular it may appear in non-uniform control flow.) Reviewers: arsenm, mareko, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19203 llvm-svn: 267729
* [AMDGPU] Assembler: basic support for SDWA instructionsSam Kolton2016-04-261-4/+4
| | | | | | | | | | | | | | | Support for SDWA instructions for VOP1 and VOP2 encoding. Not done yet: - converters for support optional operands and modifiers - VOPC - sext() modifier - intrinsics - VOP2b (see vop_dpp.s) - V_MAC_F32 (see vop_dpp.s) Differential Revision: http://reviews.llvm.org/D19360 llvm-svn: 267553
* [AMDGPU][llvm-mc] s_getreg/setreg* - Add hwreg(...) syntax.Artem Tamazov2016-04-251-3/+6
| | | | | | | | | | | | | Added hwreg(reg[,offset,width]) syntax. Default offset = 0, default width = 32. Possibility to specify 16-bit immediate kept. Added out-of-range checks. Disassembling is always to hwreg(...) format. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19329 llvm-svn: 267410
* AMDGPU/SI: add llvm.amdgcn.ps.live intrinsicNicolai Haehnle2016-04-221-0/+8
| | | | | | | | | | | | | | | | | | | | | | | Summary: This intrinsic returns true if the current thread belongs to a live pixel and false if it belongs to a pixel that we are executing only for derivative computation. It will be used by Mesa to implement gl_HelperInvocation. Note that for pixels that are killed during the shader, this implementation also returns true, but it doesn't matter because those pixels are always disabled in the EXEC mask. This unearthed a corner case in the instruction verifier, which complained about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but correct code, so make the verifier accept it as such. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19191 llvm-svn: 267102
* Add IntrWrite[Arg]Mem intrinsic propertyNicolai Haehnle2016-04-191-23/+12
| | | | | | | | | | | | | | | | | | | | | | Summary: This property is used to mark an intrinsic that only writes to memory, but neither reads from memory nor has other side effects. An example where this is useful is the llvm.amdgcn.buffer.store.format.* intrinsic, which corresponds to a store instruction that goes through a special buffer descriptor rather than through a plain pointer. With this property, the intrinsic should still be handled as having side effects at the LLVM IR level, but machine scheduling can make smarter decisions. Reviewers: tstellarAMD, arsenm, joker.eph, reames Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18291 llvm-svn: 266826
* [AMDGPU][llvm-mc] s_setreg* - Fix order of operandsArtem Tamazov2016-04-181-2/+2
| | | | | | | | Order should match the sp3 syntax, where destination (simm16 denoting the hwreg) is coming first. Differential Revision: http://reviews.llvm.org/D19161 llvm-svn: 266617
* AMDGPU: Directly emit m0 initialization with s_mov_b32Matt Arsenault2016-04-141-1/+17
| | | | | | | | | | | | | Currently what comes out of instruction selection is a register initialized to -1, and then copied to m0. MachineCSE doesn't consider copies, but we want these to be CSEed. This isn't much of a problem currently, because SIFoldOperands is run immediately after. This avoids regressions when SIFoldOperands is run later from leaving all copies to m0. llvm-svn: 266377
* AMDGPU: Implement canonicalizeMatt Arsenault2016-04-141-0/+10
| | | | | | Also add generic DAG node for it. llvm-svn: 266272
* AMDGPU: add llvm.amdgcn.buffer.load/store intrinsicsNicolai Haehnle2016-04-121-59/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: They correspond to BUFFER_LOAD/STORE_DWORD[_X2,X3,X4] and mostly behave like llvm.amdgcn.buffer.load/store.format. They will be used by Mesa for SSBO and atomic counters at least when robust buffer access behavior is desired. (These instructions perform no format conversion and do buffer range checking per component.) As a side effect of sharing patterns with llvm.amdgcn.buffer.store.format, it has become trivial to add support for the f32 and v2f32 variants of that intrinsic, so the patch does so. Also DAG-ify (and fix) some tests that I noticed intermittent failures in while developing this patch. Some tests were (temporarily) adjusted for the required mayLoad/hasSideEffects changes to the BUFFER_STORE_DWORD* instructions. See also http://reviews.llvm.org/D18291. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18292 llvm-svn: 266126
* AMDGPU: Implement i64 global atomicsMatt Arsenault2016-04-121-14/+34
| | | | llvm-svn: 266075
* AMDGPU: Add atomic_inc + atomic_dec intrinsicsMatt Arsenault2016-04-121-5/+17
| | | | | | | These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074
* AMDGPU/SI: Implement atomic load/store for i32 and i64Jan Vesely2016-04-071-7/+41
| | | | | | | | | | Standard load/store instructions with GLC bit set. Reviewers: tstellardAMD, arsenm Differential Revision: http://reviews.llvm.org/D18760 llvm-svn: 265709
* [AMDGPU] fix readlane/readfirstlane src vgpr operand type.Valery Pykhtin2016-04-071-2/+2
| | | | | | | | | For VGPR_32 operand disassembler expects a VGPR register encoded as 0..255 (enum8 src operand). readfirstlane/readline actually has enum9 operand and this change fixes VGPR_32 to VS_32 (enum9 encoding). Differential Revision: http://reviews.llvm.org/D18696 llvm-svn: 265670
* [AMDGPU] AsmParser: disable DPP for unsupported instructions. New dpp tests. ↵Sam Kolton2016-04-061-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fix v_nop_dpp. Summary: 1. Disable DPP encoding for instructions that do not support it: - VOP1: - v_readfirstlane_b32 - v_clrexcp - v_movreld_b32 - v_movrels_b32 - v_movrelsd_b32 - VOP2: - v_madmk_f16/32 - v_madak_f16/32 - VOPC, VINTRP, VOP3 2. Fix DPP for v_nop 3. New DPP tests for VOP1 and VOP2 instructions Reviewers: nhaustov, tstellarAMD, vpykhtin Subscribers: tstellarAMD, arsenm Differential Revision: http://reviews.llvm.org/D18552 llvm-svn: 265538
* AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2}Tom Stellard2016-04-011-1/+34
| | | | | | | | | | | | | | | | | Summary: Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. 32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý. Patch by: Vedran Miletić Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: jvesely, scchan, kanarayan, arsenm Differential Revision: http://reviews.llvm.org/D17280 llvm-svn: 265170
* AMDGPU: Add frexp_exp intrinsicMatt Arsenault2016-03-301-2/+2
| | | | llvm-svn: 264944
* AMDGPU: Remove atomic inc/dec patternsMatt Arsenault2016-03-231-23/+0
| | | | | | | There is no benefit to these since materializing the constant 1 requires the same number of instructions as materializing uint_max llvm-svn: 264215
* AMDGPU: Add frexp_mant intrinsicMatt Arsenault2016-03-211-2/+2
| | | | llvm-svn: 263948
* AMDGPU: Overload return type of llvm.amdgcn.buffer.load.formatNicolai Haehnle2016-03-181-36/+43
| | | | | | | | | | | | | | | | Summary: Allow the selection of BUFFER_LOAD_FORMAT_x and _XY. Do this now before the frontend patches land in Mesa. Eventually, we may want to automatically reduce the size of loads at the LLVM IR level, which requires such overloads, and in some cases Mesa can generate them directly. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18255 llvm-svn: 263792
* AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsicsNicolai Haehnle2016-03-181-1/+103
| | | | | | | | | | | | | | | | | | | | Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used by Mesa to implement atomics with buffer semantics. The intrinsic interface matches that of buffer.load.format and buffer.store.format, except that the GLC bit is not exposed (it is automatically deduced based on whether the return value is used). The change of hasSideEffects is required for TableGen to accept the pattern that matches the intrinsic. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, rivanvx, llvm-commits Differential Revision: http://reviews.llvm.org/D18151 llvm-svn: 263791
* AMDGPU: use ComplexPattern for offsets in llvm.amdgcn.buffer.load/store.formatNicolai Haehnle2016-03-181-13/+29
| | | | | | | | | | | | | | | | | | | | | Summary: We cannot easily deduce that an offset is in an SGPR, but the Mesa frontend cannot easily make use of an explicit soffset parameter either. Furthermore, it is likely that in the future, LLVM will be in a better position than the frontend to choose an SGPR offset if possible. Since there aren't any frontend uses of these intrinsics in upstream repositories yet, I would like to take this opportunity to change the intrinsic signatures to a single offset parameter, which is then selected to immediate offsets or voffsets using a ComplexPattern. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18218 llvm-svn: 263790
* AMDGPU/SI: Implement GroupStaticSize Intrinsic for Dynamic LDSChangpeng Fang2016-03-151-0/+5
| | | | | | | | | | | | | | | | | | | Summary: Static LDS size is saved in MachineFunctionInfo::LDSSize, We define a pseudo instruction with usesCustomInserter bit set. Then, in EmitInstrWithCustomInserter, we replace this pseudo instruction with a mov of MachineFunctionInfo::LDSSize. Reviewers: arsenm tstellarAMD Subscribers llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18064 llvm-svn: 263563
* [AMDGPU] Assembler: SOP* instruction fixesNikolay Haustov2016-03-141-20/+20
| | | | | | | | | | | | | | s_bitset0_b64, s_bitset1_b64 has 32-bit src0, not 64-bit. s_rfe_b64 has just one destination operand and no source. Uncomment S_BITCMP* and S_SETVSKIP, adjust SOPC_* classes for that. Add s_memrealtime test and change comments in smem.s to follow common style. Change test for s_memtime to use non-zero register to make it really test encoding. Add tests for s_buffer_load*. Add tests for SOPC instructions (same for SI and VI) Differential Revision: http://reviews.llvm.org/D18040 llvm-svn: 263420
* [AMDGPU] Assembler: change v_madmk operands to have same order as mad.Nikolay Haustov2016-03-111-2/+2
| | | | | | | | | | The constant is now at source operand 1 (previously at 2). This is also how it is in legacy AMD sp3 assembler. Update tests. Differential Revision: http://reviews.llvm.org/D17984 llvm-svn: 263212
* AMDGPU/SI: add llvm.amdgcn.buffer.load/store.format intrinsicsNicolai Haehnle2016-03-101-12/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: They correspond to BUFFER_LOAD/STORE_FORMAT_XYZW and will be used by Mesa to implement the GL_ARB_shader_image_load_store extension. The intention is that for llvm.amdgcn.buffer.load.format, LLVM will decide whether one of the _X/_XY/_XYZ opcodes can be used (similar to image sampling and loads). However, this is not currently implemented. For llvm.amdgcn.buffer.store, LLVM cannot decide to use one of the "smaller" opcodes and therefore the intrinsic is overloaded. Currently, only the v4f32 is actually implemented since GLSL also only has a vec4 variant of the store instructions, although it's conceivable that Mesa will want to be smarter about this in the future. BUFFER_LOAD_FORMAT_XYZW is already exposed via llvm.SI.vs.load.input, which has a legacy name, pretends not to access memory, and does not capture the full flexibility of the instruction. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17277 llvm-svn: 263140
* AMDGPU/SI: Define S_GETREG IntrinsicChangpeng Fang2016-03-101-0/+12
| | | | | | | | | | | | | | Summary: Define s_getreg intrinsic to generate s_getreg instruction to read hardware registers. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D17892 llvm-svn: 263124
* [AMDGPU] Fix SMEM instructions encoding/operand namingsValery Pykhtin2016-03-101-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D17651 llvm-svn: 263108
* [AMDGPU] Assembler: Fix s_setpc_b64Nikolay Haustov2016-03-091-1/+1
| | | | | | | | s_setpc_b64 has just one 64-bit source which is the address of instruction to jump to. Differential Revision: http://reviews.llvm.org/D17888 llvm-svn: 263005
* AMDGPU: Match more med3 integer patternsMatt Arsenault2016-03-071-0/+3
| | | | llvm-svn: 262864
* [AMDGPU] SOPxx instructions operand naming fixed in td files.Valery Pykhtin2016-03-061-32/+32
| | | | | | | | | dst -> sdst ssrcN -> srcN Differential Revision: http://reviews.llvm.org/D17646 llvm-svn: 262801
* AMDGPU/SI: Add support for spiling SGPRs to scratch bufferTom Stellard2016-03-041-2/+3
| | | | | | | | | | | | | | Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732
* AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsicsNikolay Haustov2016-03-041-17/+52
| | | | | | | | | | | These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. Initial change by Nicolai H.hnle Differential Revision: http://reviews.llvm.org/D17401 llvm-svn: 262701
* AMDGPU: Add s_sleep intrinsicMatt Arsenault2016-02-271-1/+13
| | | | llvm-svn: 262120
* AMDGPU: Implement readcyclecounterMatt Arsenault2016-02-271-1/+16
| | | | | | | | | | This matches the behavior of the HSAIL clock instruction. s_realmemtime is used if the subtarget supports it, and falls back to s_memtime if not. Also introduces new intrinsics for each of s_memtime / s_memrealtime. llvm-svn: 262119
* [AMDGPU] Assembler: Basic support for MIMGNikolay Haustov2016-02-261-16/+18
| | | | | | | | | | | Add parsing and printing of image operands. Matches legacy sp3 assembler. Change image instruction order to have data/image/sampler operands in the beginning. This is needed because optional operands in MC are always last. Update SITargetLowering for new order. Add basic MC test. Update CodeGen tests. Review: http://reviews.llvm.org/D17574 llvm-svn: 261995
* [AMDGPU] Assembler: Simplify handling of optional operandsNikolay Haustov2016-02-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | Resubmit with index problem fixed. Verified with valgrind. Prepare to support DPP encodings. For DPP encodings, we want row_mask/bank_mask/bound_ctrl to be optional operands. However this means that when parsing instruction which has no mnemonic prefix, we cannot add both default values for VOP3 and for DPP optional operands to OperandVector - neither instructions would match. So add default values for optional operands to MCInst during conversion instead. Mark more operands as IsOptional = 1 in .td files. Do not add default values for optional operands to OperandVector in AMDGPUAsmParser. Add default values for optional operands during conversion using new helper addOptionalImmOperand. Change to cvtVOP3_2_mod to check instruction flag instead of presence of modifiers. In the future, cvtVOP3* functions can be combined into one. Separate cvtFlat and cvtFlatAtomic. Fix CNDMASK_B32 definition to have no modifiers. Review: http://reviews.llvm.org/D17445 llvm-svn: 261856
OpenPOWER on IntegriCloud