diff --git a/llvm/docs/AMDGPUModifierSyntax.rst b/llvm/docs/AMDGPUModifierSyntax.rst
new file mode 100644
index 00000000000..bc2ddd0bffe
--- /dev/null
+++ b/llvm/docs/AMDGPUModifierSyntax.rst
@@ -0,0 +1,1248 @@
+======================================
+Syntax of AMDGPU Instruction Modifiers
+======================================
+
+.. contents::
+ :local:
+
+Conventions
+===========
+
+The following notation is used throughout this document:
+
+    ============ ========================================================
+    Notation     Description
+    ============ ========================================================
+    {0..N}       Any integer value in the range from 0 to N (inclusive).
+    <x>          Syntax and meaning of *x* are explained elsewhere.
+    ============ ========================================================
+
+.. _amdgpu_syn_modifiers:
+
+Modifiers
+=========
+
+DS Modifiers
+------------
+
+.. _amdgpu_synid_ds_offset8:
+
+ds_offset8
+~~~~~~~~~~
+
+Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0.
+
+Used with DS instructions that have two addresses.
+
+    ==================== ========================================================
+    Syntax               Description
+    ==================== ========================================================
+    offset:{0..0xFF}     Specifies an unsigned 8-bit offset as a non-negative
+                         :ref:`integer number <amdgpu_synid_integer_number>`.
+    ==================== ========================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ offset:255
+ offset:0xff
+
+.. _amdgpu_synid_ds_offset16:
+
+ds_offset16
+~~~~~~~~~~~
+
+Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0.
+
+Used with DS instructions that have one address.
+
+    ==================== ========================================================
+    Syntax               Description
+    ==================== ========================================================
+    offset:{0..0xFFFF}   Specifies an unsigned 16-bit offset as a non-negative
+                         :ref:`integer number <amdgpu_synid_integer_number>`.
+    ==================== ========================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ offset:65535
+ offset:0xffff
+
+.. _amdgpu_synid_sw_offset16:
+
+sw_offset16
+~~~~~~~~~~~
+
+This is a special modifier which may be used with the *ds_swizzle_b32* instruction only.
+It specifies a swizzle pattern in numeric or symbolic form. The default value is 0.
+
+See AMD documentation for more information.
+
+    ======================================================== ================================================================
+    Syntax                                                   Description
+    ======================================================== ================================================================
+    offset:{0..0xFFFF}                                       Specifies a 16-bit swizzle pattern.
+    offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})    Specifies a quad permute mode pattern.
+
+                                                             Each number is a lane *id*.
+    offset:swizzle(BITMASK_PERM, "<mask>")                   Specifies a bitmask permute mode pattern.
+
+                                                             The pattern converts a 5-bit lane *id* to another
+                                                             lane *id* with which the lane interacts.
+
+                                                             *mask* is a 5-character sequence which
+                                                             specifies how to transform the bits of the
+                                                             lane *id*.
+
+                                                             The following characters are allowed:
+
+                                                             * "0" - set bit to 0.
+
+                                                             * "1" - set bit to 1.
+
+                                                             * "p" - preserve bit.
+
+                                                             * "i" - invert bit.
+
+    offset:swizzle(BROADCAST,{2..32},{0..N})                 Specifies a broadcast mode.
+
+                                                             Broadcasts the value of any particular lane to
+                                                             all lanes in its group.
+
+                                                             The first numeric parameter is a group
+                                                             size and must be equal to 2, 4, 8, 16 or 32.
+
+                                                             The second numeric parameter is an index of the
+                                                             lane being broadcast.
+
+                                                             The index must be less than the group size.
+    offset:swizzle(SWAP,{1..16})                             Specifies a swap mode.
+
+                                                             Swaps the neighboring groups of
+                                                             1, 2, 4, 8 or 16 lanes.
+    offset:swizzle(REVERSE,{2..32})                          Specifies a reverse mode.
+
+                                                             Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
+    ======================================================== ================================================================
+
+Numeric parameters may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
+Examples:
+
+.. code-block:: nasm
+
+ offset:255
+ offset:0xffff
+ offset:swizzle(QUAD_PERM, 0, 1, 2, 3)
+ offset:swizzle(BITMASK_PERM, "01pi0")
+ offset:swizzle(BROADCAST, 2, 0)
+ offset:swizzle(SWAP, 8)
+ offset:swizzle(REVERSE, 30 + 2)
+
+.. _amdgpu_synid_gds:
+
+gds
+~~~
+
+Specifies whether to use GDS or LDS memory (LDS is the default).
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    gds              Use GDS memory.
+    ================ ================================================
+
+
+EXP Modifiers
+-------------
+
+.. _amdgpu_synid_done:
+
+done
+~~~~
+
+Specifies if this is the last export from the shader to the target. By default, the current
+instruction does not finish an export sequence.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    done             Indicates the last export operation.
+    ================ ================================================
+
+.. _amdgpu_synid_compr:
+
+compr
+~~~~~
+
+Indicates if the data are compressed (data are not compressed by default).
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    compr            Data are compressed.
+    ================ ================================================
+
+.. _amdgpu_synid_vm:
+
+vm
+~~
+
+Specifies valid mask flag state (off by default).
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    vm               Set valid mask flag.
+    ================ ================================================
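+
+As an illustration only, the EXP flags described above are written after the
+export operands and separated by spaces. The combinations below are a sketch
+assembled from the syntax tables in this section, not an exhaustive list.
+
+.. code-block:: nasm
+
+    done
+    done vm
+    done compr vm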
+
+FLAT Modifiers
+--------------
+
+.. _amdgpu_synid_flat_offset12:
+
+flat_offset12
+~~~~~~~~~~~~~
+
+Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
+
+Cannot be used with *global/scratch* opcodes. GFX9 only.
+
+    ==================== ========================================================
+    Syntax               Description
+    ==================== ========================================================
+    offset:{0..4095}     Specifies a 12-bit unsigned offset as a non-negative
+                         :ref:`integer number <amdgpu_synid_integer_number>`.
+    ==================== ========================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ offset:4095
+ offset:0xff
+
+.. _amdgpu_synid_flat_offset13:
+
+flat_offset13
+~~~~~~~~~~~~~
+
+Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
+
+Can be used with *global/scratch* opcodes only. GFX9 only.
+
+    ======================== ========================================================
+    Syntax                   Description
+    ======================== ========================================================
+    offset:{-4096..+4095}    Specifies a 13-bit signed offset as an
+                             :ref:`integer number <amdgpu_synid_integer_number>`.
+    ======================== ========================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ offset:-4000
+ offset:0x10
+
+glc
+~~~
+
+See a description :ref:`here<amdgpu_synid_glc>`.
+
+slc
+~~~
+
+See a description :ref:`here<amdgpu_synid_slc>`.
+
+tfe
+~~~
+
+See a description :ref:`here<amdgpu_synid_tfe>`.
+
+nv
+~~
+
+See a description :ref:`here<amdgpu_synid_nv>`.
+
+MIMG Modifiers
+--------------
+
+.. _amdgpu_synid_dmask:
+
+dmask
+~~~~~
+
+Specifies which channels (image components) are used by the operation. By default, no channels
+are used.
+
+    ================ ========================================================
+    Syntax           Description
+    ================ ========================================================
+    dmask:{0..15}    Specifies image channels as a non-negative
+                     :ref:`integer number <amdgpu_synid_integer_number>`.
+
+                     Each bit corresponds to one of 4 image
+                     components (RGBA).
+
+                     A bit value of 0 means that the component
+                     is not used; a bit value of 1 means
+                     that the component is used.
+    ================ ========================================================
+
+This modifier has some limitations depending on instruction kind:
+
+    ================================================ ========================
+    Instruction Kind                                 Valid dmask Values
+    ================================================ ========================
+    32-bit atomic *cmpswap*                          0x3
+    32-bit atomic instructions except for *cmpswap*  0x1
+    64-bit atomic *cmpswap*                          0xF
+    64-bit atomic instructions except for *cmpswap*  0x3
+    *gather4*                                        0x1, 0x2, 0x4, 0x8
+    Other instructions                               any value
+    ================================================ ========================
+
+Examples:
+
+.. code-block:: nasm
+
+ dmask:0xf
+ dmask:0b1111
+ dmask:3
+
+.. _amdgpu_synid_unorm:
+
+unorm
+~~~~~
+
+Specifies whether the address is normalized or not (the address is normalized by default).
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    unorm            Force the address to be unnormalized.
+    ================ ================================================
+
+glc
+~~~
+
+See a description :ref:`here<amdgpu_synid_glc>`.
+
+slc
+~~~
+
+See a description :ref:`here<amdgpu_synid_slc>`.
+
+.. _amdgpu_synid_r128:
+
+r128
+~~~~
+
+Specifies texture resource size. The default size is 256 bits.
+
+GFX7 and GFX8 only.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    r128             Specifies a 128-bit texture resource size.
+    ================ ================================================
+
+.. WARNING:: Using this modifier should decrease *rsrc* register size from 8 to 4 dwords, but the assembler does not currently support this feature.
+
+tfe
+~~~
+
+See a description :ref:`here<amdgpu_synid_tfe>`.
+
+.. _amdgpu_synid_lwe:
+
+lwe
+~~~
+
+Specifies LOD warning status (LOD warning is disabled by default).
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    lwe              Enables LOD warning.
+    ================ ================================================
+
+.. _amdgpu_synid_da:
+
+da
+~~
+
+Specifies if an array index must be sent to TA. By default, the array index is not sent.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    da               Send an array index to TA.
+    ================ ================================================
+
+.. _amdgpu_synid_d16:
+
+d16
+~~~
+
+Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    d16              Enables 16-bit data mode.
+
+                     On loads, convert data in memory to 16-bit
+                     format before storing it in VGPRs.
+
+                     For stores, convert 16-bit data in VGPRs to
+                     32 bits before writing it to memory.
+
+                     Note that GFX8.0 does not support data packing.
+                     Each 16-bit data element occupies 1 VGPR.
+
+                     GFX8.1 and GFX9 support data packing.
+                     Each pair of 16-bit data elements
+                     occupies 1 VGPR.
+    ================ ================================================
+
+.. _amdgpu_synid_a16:
+
+a16
+~~~
+
+Specifies the size of image address components: 16 or 32 bits (32 bits by default). GFX9 only.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    a16              Enables 16-bit image address components.
+    ================ ================================================
+
+Miscellaneous Modifiers
+-----------------------
+
+.. _amdgpu_synid_glc:
+
+glc
+~~~
+
+This modifier has a different meaning for loads, stores, and atomic operations.
+The default value is off (0).
+
+See AMD documentation for details.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    glc              Set glc bit to 1.
+    ================ ================================================
+
+.. _amdgpu_synid_slc:
+
+slc
+~~~
+
+Specifies cache policy. The default value is off (0).
+
+See AMD documentation for details.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    slc              Set slc bit to 1.
+    ================ ================================================
+
+.. _amdgpu_synid_tfe:
+
+tfe
+~~~
+
+Controls access to partially resident textures. The default value is off (0).
+
+See AMD documentation for details.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    tfe              Set tfe bit to 1.
+    ================ ================================================
+
+.. _amdgpu_synid_nv:
+
+nv
+~~
+
+Specifies whether the instruction operates on non-volatile memory. By default, memory is volatile.
+
+GFX9 only.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    nv               Indicates that the instruction operates on
+                     non-volatile memory.
+    ================ ================================================
+
+MUBUF/MTBUF Modifiers
+---------------------
+
+.. _amdgpu_synid_idxen:
+
+idxen
+~~~~~
+
+Specifies whether address components include an index. By default, no components are used.
+
+Can be used together with :ref:`offen<amdgpu_synid_offen>`.
+
+Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    idxen            Address components include an index.
+    ================ ================================================
+
+.. _amdgpu_synid_offen:
+
+offen
+~~~~~
+
+Specifies whether address components include an offset. By default, no components are used.
+
+Can be used together with :ref:`idxen<amdgpu_synid_idxen>`.
+
+Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    offen            Address components include an offset.
+    ================ ================================================
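+
+As noted above, *idxen* and *offen* can be combined. The fragment below is an
+illustrative sketch of such a combination; it is derived from the syntax tables
+in this section rather than taken from a specific instruction.
+
+.. code-block:: nasm
+
+    idxen offen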
+
+.. _amdgpu_synid_addr64:
+
+addr64
+~~~~~~
+
+Specifies whether a 64-bit address is used. By default, no address is used.
+
+GFX7 only. Cannot be used with :ref:`offen<amdgpu_synid_offen>` and
+:ref:`idxen<amdgpu_synid_idxen>` modifiers.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    addr64           A 64-bit address is used.
+    ================ ================================================
+
+.. _amdgpu_synid_buf_offset12:
+
+buf_offset12
+~~~~~~~~~~~~
+
+Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
+
+    ==================== ========================================================
+    Syntax               Description
+    ==================== ========================================================
+    offset:{0..0xFFF}    Specifies a 12-bit unsigned offset as a non-negative
+                         :ref:`integer number <amdgpu_synid_integer_number>`.
+    ==================== ========================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ offset:0
+ offset:0x10
+
+glc
+~~~
+
+See a description :ref:`here<amdgpu_synid_glc>`.
+
+slc
+~~~
+
+See a description :ref:`here<amdgpu_synid_slc>`.
+
+.. _amdgpu_synid_lds:
+
+lds
+~~~
+
+Specifies where to store the result: VGPRs or LDS (VGPRs by default).
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    lds              Store result in LDS.
+    ================ ================================================
+
+tfe
+~~~
+
+See a description :ref:`here<amdgpu_synid_tfe>`.
+
+.. _amdgpu_synid_dfmt:
+
+dfmt
+~~~~
+
+TBD
+
+.. _amdgpu_synid_nfmt:
+
+nfmt
+~~~~
+
+TBD
+
+SMRD/SMEM Modifiers
+-------------------
+
+glc
+~~~
+
+See a description :ref:`here<amdgpu_synid_glc>`.
+
+nv
+~~
+
+See a description :ref:`here<amdgpu_synid_nv>`.
+
+VINTRP Modifiers
+----------------
+
+.. _amdgpu_synid_high:
+
+high
+~~~~
+
+Specifies which half of the LDS word to use. The low half of the LDS word is used by default.
+GFX9 only.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    high             Use high half of LDS word.
+    ================ ================================================
+
+VOP1/VOP2 DPP Modifiers
+-----------------------
+
+GFX8 and GFX9 only.
+
+.. _amdgpu_synid_dpp_ctrl:
+
+dpp_ctrl
+~~~~~~~~
+
+Specifies how data are shared between threads. This is a mandatory modifier.
+There is no default value.
+
+Note. The lanes of a wavefront are organized in four banks and four rows.
+
+    ======================================== ========================================================
+    Syntax                                   Description
+    ======================================== ========================================================
+    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
+    row_mirror                               Mirror threads within row.
+    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
+    row_bcast:15                             Broadcast the 15th thread of each row to the next row.
+    row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
+    wave_shl:1                               Wavefront left shift by 1 thread.
+    wave_rol:1                               Wavefront left rotate by 1 thread.
+    wave_shr:1                               Wavefront right shift by 1 thread.
+    wave_ror:1                               Wavefront right rotate by 1 thread.
+    row_shl:{1..15}                          Row shift left by 1-15 threads.
+    row_shr:{1..15}                          Row shift right by 1-15 threads.
+    row_ror:{1..15}                          Row rotate right by 1-15 threads.
+    ======================================== ========================================================
+
+Note: Numeric parameters may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
+Examples:
+
+.. code-block:: nasm
+
+ quad_perm:[0, 1, 2, 3]
+ row_shl:3
+
+.. _amdgpu_synid_row_mask:
+
+row_mask
+~~~~~~~~
+
+Controls which rows are enabled for data sharing. By default, all rows are enabled.
+
+Note. The lanes of a wavefront are organized in four banks and four rows.
+
+    ==================== ========================================================
+    Syntax               Description
+    ==================== ========================================================
+    row_mask:{0..15}     Specifies a *row mask* as a non-negative
+                         :ref:`integer number <amdgpu_synid_integer_number>`.
+
+                         Each of 4 bits in the mask controls one
+                         row (0 - disabled, 1 - enabled).
+    ==================== ========================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ row_mask:0xf
+ row_mask:0b1010
+ row_mask:0b1111
+
+.. _amdgpu_synid_bank_mask:
+
+bank_mask
+~~~~~~~~~
+
+Controls which banks are enabled for data sharing. By default, all banks are enabled.
+
+Note. The lanes of a wavefront are organized in four banks and four rows.
+
+    ==================== ========================================================
+    Syntax               Description
+    ==================== ========================================================
+    bank_mask:{0..15}    Specifies a *bank mask* as a non-negative
+                         :ref:`integer number <amdgpu_synid_integer_number>`.
+
+                         Each of 4 bits in the mask controls one
+                         bank (0 - disabled, 1 - enabled).
+    ==================== ========================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ bank_mask:0x3
+ bank_mask:0b0011
+ bank_mask:0b1111
+
+.. _amdgpu_synid_bound_ctrl:
+
+bound_ctrl
+~~~~~~~~~~
+
+Controls data sharing when accessing an invalid lane. By default, data sharing with
+invalid lanes is disabled.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    bound_ctrl:0     Enables data sharing with invalid lanes.
+
+                     Accessing data from an invalid lane will
+                     return zero.
+    ================ ================================================
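+
+As an illustration only, the DPP modifiers described in this section are
+written together after the instruction operands, separated by spaces. The
+fragments below are a sketch built from the syntax tables above; the mask
+values are arbitrary and were chosen only to show the notation.
+
+.. code-block:: nasm
+
+    quad_perm:[0, 1, 2, 3] row_mask:0xf bank_mask:0xf
+    row_shl:1 row_mask:0xa bank_mask:0x1 bound_ctrl:0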
+
+VOP1/VOP2/VOPC SDWA Modifiers
+-----------------------------
+
+GFX8 and GFX9 only.
+
+clamp
+~~~~~
+
+See a description :ref:`here<amdgpu_synid_clamp>`.
+
+omod
+~~~~
+
+See a description :ref:`here<amdgpu_synid_omod>`.
+
+GFX9 only.
+
+.. _amdgpu_synid_dst_sel:
+
+dst_sel
+~~~~~~~
+
+Selects which bits in the destination are affected. By default, all bits are affected.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    dst_sel:DWORD    Use bits 31:0.
+    dst_sel:BYTE_0   Use bits 7:0.
+    dst_sel:BYTE_1   Use bits 15:8.
+    dst_sel:BYTE_2   Use bits 23:16.
+    dst_sel:BYTE_3   Use bits 31:24.
+    dst_sel:WORD_0   Use bits 15:0.
+    dst_sel:WORD_1   Use bits 31:16.
+    ================ ================================================
+
+
+.. _amdgpu_synid_dst_unused:
+
+dst_unused
+~~~~~~~~~~
+
+Controls what to do with the bits in the destination which are not selected
+by :ref:`dst_sel<amdgpu_synid_dst_sel>`.
+By default, unused bits are preserved.
+
+    ============================ ================================================
+    Syntax                       Description
+    ============================ ================================================
+    dst_unused:UNUSED_PAD        Pad with zeros.
+    dst_unused:UNUSED_SEXT       Sign-extend upper bits, zero lower bits.
+    dst_unused:UNUSED_PRESERVE   Preserve bits.
+    ============================ ================================================
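+
+Because *dst_unused* describes the bits left over by *dst_sel*, the two are
+usually read as a pair. The fragments below are an illustrative sketch based on
+the syntax tables above; the chosen values are arbitrary.
+
+.. code-block:: nasm
+
+    dst_sel:BYTE_0 dst_unused:UNUSED_PRESERVE
+    dst_sel:WORD_1 dst_unused:UNUSED_PAD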
+
+.. _amdgpu_synid_src0_sel:
+
+src0_sel
+~~~~~~~~
+
+Controls which bits of src0 are used. By default, all bits are used.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    src0_sel:DWORD   Use bits 31:0.
+    src0_sel:BYTE_0  Use bits 7:0.
+    src0_sel:BYTE_1  Use bits 15:8.
+    src0_sel:BYTE_2  Use bits 23:16.
+    src0_sel:BYTE_3  Use bits 31:24.
+    src0_sel:WORD_0  Use bits 15:0.
+    src0_sel:WORD_1  Use bits 31:16.
+    ================ ================================================
+
+.. _amdgpu_synid_src1_sel:
+
+src1_sel
+~~~~~~~~
+
+Controls which bits of src1 are used. By default, all bits are used.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    src1_sel:DWORD   Use bits 31:0.
+    src1_sel:BYTE_0  Use bits 7:0.
+    src1_sel:BYTE_1  Use bits 15:8.
+    src1_sel:BYTE_2  Use bits 23:16.
+    src1_sel:BYTE_3  Use bits 31:24.
+    src1_sel:WORD_0  Use bits 15:0.
+    src1_sel:WORD_1  Use bits 31:16.
+    ================ ================================================
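+
+For illustration, the source selectors can be combined with the destination
+modifiers described earlier in this section. The fragments below are a sketch
+assembled from the syntax tables above; the values are arbitrary.
+
+.. code-block:: nasm
+
+    src0_sel:BYTE_1 src1_sel:WORD_0
+    dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD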
+
+.. _amdgpu_synid_sdwa_operand_modifiers:
+
+VOP1/VOP2/VOPC SDWA Operand Modifiers
+-------------------------------------
+
+Operand modifiers are not used separately. They are applied to source operands.
+
+GFX8 and GFX9 only.
+
+abs
+~~~
+
+See a description :ref:`here<amdgpu_synid_abs>`.
+
+neg
+~~~
+
+See a description :ref:`here<amdgpu_synid_neg>`.
+
+.. _amdgpu_synid_sext:
+
+sext
+~~~~
+
+Sign-extends the value of a sub-dword operand to fill all 32 bits.
+Has no effect for 32-bit operands.
+
+Valid for integer operands only.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    sext(<operand>)  Sign-extend operand value.
+    ================ ================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ sext(v4)
+ sext(v255)
+
+VOP3 Modifiers
+--------------
+
+.. _amdgpu_synid_vop3_op_sel:
+
+vop3_op_sel
+~~~~~~~~~~~
+
+Selects the low [15:0] or high [31:16] operand bits for source and destination operands.
+By default, low bits are used for all operands.
+
+The number of values specified with the op_sel modifier must match the number of instruction
+operands (both source and destination). First value controls src0, second value controls src1
+and so on, except that the last value controls destination.
+The value 0 selects the low bits, while 1 selects the high bits.
+
+Note. op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
+by op_sel must be 0.
+
+GFX9 only.
+
+    ======================================== ================================================================
+    Syntax                                   Description
+    ======================================== ================================================================
+    op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 1 source operand.
+    op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
+    op_sel:[{0..1},{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
+    ======================================== ================================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ op_sel:[0,0]
+ op_sel:[0,1]
+
+.. _amdgpu_synid_clamp:
+
+clamp
+~~~~~
+
+The meaning of the clamp modifier depends on the instruction.
+
+For *v_cmp* instructions, the clamp modifier indicates that the compare signals
+if a floating point exception occurs. By default, signaling is disabled.
+Not supported by GFX7.
+
+For integer operations, the clamp modifier indicates that the result must be clamped
+to the largest and smallest representable value. By default, there is no clamping.
+Integer clamping is not supported by GFX7.
+
+For floating point operations, the clamp modifier indicates that the result must be clamped
+to the range [0.0, 1.0]. By default, there is no clamping.
+
+Note. The clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    clamp            Enables clamping (or signaling).
+    ================ ================================================
+
+.. _amdgpu_synid_omod:
+
+omod
+~~~~
+
+Specifies if an output modifier must be applied to the result.
+By default, no output modifiers are applied.
+
+Note. Output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
+
+Output modifiers are valid for f32 and f64 floating point results only.
+They must not be used with f16.
+
+Note. *v_cvt_f16_f32* is an exception. This instruction produces f16 result
+but accepts output modifiers.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    mul:2            Multiply the result by 2.
+    mul:4            Multiply the result by 4.
+    div:2            Multiply the result by 0.5.
+    ================ ================================================
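+
+Examples (illustrative fragments taken directly from the syntax table above):
+
+.. code-block:: nasm
+
+    mul:2
+    mul:4
+    div:2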
+
+.. _amdgpu_synid_vop3_operand_modifiers:
+
+VOP3 Operand Modifiers
+----------------------
+
+Operand modifiers are not used separately. They are applied to source operands.
+
+.. _amdgpu_synid_abs:
+
+abs
+~~~
+
+Computes the absolute value of its operand. Applied before :ref:`neg<amdgpu_synid_neg>` (if any).
+Valid for floating point operands only.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    abs(<operand>)   Get absolute value of operand.
+    \|<operand>|     The same as above.
+    ================ ================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ abs(v36)
+ |v36|
+
+.. _amdgpu_synid_neg:
+
+neg
+~~~
+
+Computes the negated value of its operand. Applied after :ref:`abs<amdgpu_synid_abs>` (if any).
+Valid for floating point operands only.
+
+    ================ ================================================
+    Syntax           Description
+    ================ ================================================
+    neg(<operand>)   Get negative value of operand.
+    -<operand>       The same as above.
+    ================ ================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ neg(v[0])
+ -v4
+
+VOP3P Modifiers
+---------------
+
+This section describes modifiers of *regular* VOP3P instructions.
+
+*v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16*
+instructions use these modifiers :ref:`in a special manner<amdgpu_synid_mad_mix>`.
+
+GFX9 only.
+
+.. _amdgpu_synid_op_sel:
+
+op_sel
+~~~~~~
+
+Selects the low [15:0] or high [31:16] operand bits as input to the operation
+which results in the lower-half of the destination.
+By default, low bits are used for all operands.
+
+The number of values specified by the *op_sel* modifier must match the number of source
+operands. First value controls src0, second value controls src1 and so on.
+
+The value 0 selects the low bits, while 1 selects the high bits.
+
+    ================================ ================================================================
+    Syntax                           Description
+    ================================ ================================================================
+    op_sel:[{0..1}]                  Select operand bits for instructions with 1 source operand.
+    op_sel:[{0..1},{0..1}]           Select operand bits for instructions with 2 source operands.
+    op_sel:[{0..1},{0..1},{0..1}]    Select operand bits for instructions with 3 source operands.
+    ================================ ================================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ op_sel:[0,0]
+ op_sel:[0,1,0]
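+
+A sketch of the modifier on a complete instruction, assuming the packed
+instruction *v_pk_add_f16* as the host. The instruction and register numbers
+are illustrative and are not taken from this document:
+
+.. code-block:: nasm
+
+ ; The lower half of v0 is computed from v1[15:0] and v2[31:16].
+ v_pk_add_f16 v0, v1, v2 op_sel:[0,1]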
+
+.. _amdgpu_synid_op_sel_hi:
+
+op_sel_hi
+~~~~~~~~~
+
+Selects the low [15:0] or high [31:16] operand bits as input to the operation
+that produces the upper half of the destination.
+By default, high bits are used for all operands.
+
+The number of values specified by the *op_sel_hi* modifier must match the number of source
+operands. The first value controls src0, the second value controls src1, and so on.
+
+The value 0 selects the low bits, while 1 selects the high bits.
+
+ =================================== =============================================================
+ Syntax                              Description
+ =================================== =============================================================
+ op_sel_hi:[{0..1}]                  Select operand bits for instructions with 1 source operand.
+ op_sel_hi:[{0..1},{0..1}]           Select operand bits for instructions with 2 source operands.
+ op_sel_hi:[{0..1},{0..1},{0..1}]    Select operand bits for instructions with 3 source operands.
+ =================================== =============================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ op_sel_hi:[0,0]
+ op_sel_hi:[0,0,1]
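+
+For illustration, the line below combines *op_sel_hi* with
+:ref:`op_sel<amdgpu_synid_op_sel>` on a packed instruction. The choice of
+*v_pk_add_f16* and of the registers is an assumption made for this sketch:
+
+.. code-block:: nasm
+
+ ; The lower half of v0 uses v1[15:0] and v2[31:16] (op_sel).
+ ; The upper half of v0 uses v1[31:16] and v2[15:0] (op_sel_hi).
+ v_pk_add_f16 v0, v1, v2 op_sel:[0,1] op_sel_hi:[1,0]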
+
+.. _amdgpu_synid_neg_lo:
+
+neg_lo
+~~~~~~
+
+Specifies whether to change the sign of operand values selected by
+:ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
+as input to the operation that produces the lower half of the destination.
+
+The number of values specified by this modifier must match the number of source
+operands. The first value controls src0, the second value controls src1, and so on.
+
+The value 0 indicates that the corresponding operand value is used unmodified;
+the value 1 indicates that the negated value of the operand is used.
+
+By default, operand values are used unmodified.
+
+This modifier is valid for floating-point operands only.
+
+ ================================ ==================================================================
+ Syntax                           Description
+ ================================ ==================================================================
+ neg_lo:[{0..1}]                  Select affected operands for instructions with 1 source operand.
+ neg_lo:[{0..1},{0..1}]           Select affected operands for instructions with 2 source operands.
+ neg_lo:[{0..1},{0..1},{0..1}]    Select affected operands for instructions with 3 source operands.
+ ================================ ==================================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ neg_lo:[0]
+ neg_lo:[0,1]
+
+.. _amdgpu_synid_neg_hi:
+
+neg_hi
+~~~~~~
+
+Specifies whether to change the sign of operand values selected by
+:ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
+as input to the operation that produces the upper half of the destination.
+
+The number of values specified by this modifier must match the number of source
+operands. The first value controls src0, the second value controls src1, and so on.
+
+The value 0 indicates that the corresponding operand value is used unmodified;
+the value 1 indicates that the negated value of the operand is used.
+
+By default, operand values are used unmodified.
+
+This modifier is valid for floating-point operands only.
+
+ =============================== ==================================================================
+ Syntax                          Description
+ =============================== ==================================================================
+ neg_hi:[{0..1}]                 Select affected operands for instructions with 1 source operand.
+ neg_hi:[{0..1},{0..1}]          Select affected operands for instructions with 2 source operands.
+ neg_hi:[{0..1},{0..1},{0..1}]   Select affected operands for instructions with 3 source operands.
+ =============================== ==================================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ neg_hi:[1,0]
+ neg_hi:[0,1,1]
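+
+The following sketch shows *neg_hi* together with
+:ref:`neg_lo<amdgpu_synid_neg_lo>` on a three-source packed instruction.
+*v_pk_fma_f16* and the register numbers are assumed here for illustration only:
+
+.. code-block:: nasm
+
+ ; neg_lo negates the src0 value selected by op_sel.
+ ; neg_hi negates the src2 value selected by op_sel_hi.
+ v_pk_fma_f16 v0, v1, v2, v3 neg_lo:[1,0,0] neg_hi:[0,0,1]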
+
+clamp
+~~~~~
+
+See a description :ref:`here<amdgpu_synid_clamp>`.
+
+.. _amdgpu_synid_mad_mix:
+
+VOP3P V_MAD_MIX Modifiers
+-------------------------
+
+*v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16* instructions
+use *op_sel* and *op_sel_hi* modifiers
+in a manner different from *regular* VOP3P instructions.
+
+See a description below.
+
+GFX9 only.
+
+.. _amdgpu_synid_mad_mix_op_sel:
+
+mad_mix_op_sel
+~~~~~~~~~~~~~~
+
+This modifier has meaning only for 16-bit source operands, as indicated by
+:ref:`mad_mix_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
+It selects either the low [15:0] or high [31:16] operand bits
+as input to the operation.
+
+The number of values specified by the *op_sel* modifier must match the number of source
+operands. The first value controls src0, the second value controls src1, and so on.
+
+The value 0 selects the low 16 bits; the value 1 selects the high 16 bits.
+
+By default, low bits are used for all operands.
+
+ =============================== ================================================
+ Syntax                          Description
+ =============================== ================================================
+ op_sel:[{0..1},{0..1},{0..1}]   Select location of each 16-bit source operand.
+ =============================== ================================================
+
+Examples:
+
+.. code-block:: nasm
+
+ op_sel:[0,0,1]
+
+.. _amdgpu_synid_mad_mix_op_sel_hi:
+
+mad_mix_op_sel_hi
+~~~~~~~~~~~~~~~~~
+
+Selects the size of source operands: either 32 bits or 16 bits.
+By default, 32 bits are used for all source operands.
+
+The number of values specified by the *op_sel_hi* modifier must match the number of source
+operands. The first value controls src0, the second value controls src1, and so on.
+
+The value 0 indicates 32 bits; the value 1 indicates 16 bits.
+
+The location of the 16 bits within the operand may be specified by
+:ref:`mad_mix_op_sel<amdgpu_synid_mad_mix_op_sel>`.
+
+ ======================================== ====================================
+ Syntax                                   Description
+ ======================================== ====================================
+ op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand.
+ ======================================== ====================================
+
+Examples:
+
+.. code-block:: nasm
+
+ op_sel_hi:[1,1,1]
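+
+As a sketch of how *op_sel_hi* and
+:ref:`mad_mix_op_sel<amdgpu_synid_mad_mix_op_sel>` interact, the line below
+treats src0 and src1 as 32-bit values and src2 as a 16-bit value taken from
+the high bits. The register numbers are illustrative assumptions:
+
+.. code-block:: nasm
+
+ ; op_sel_hi:[0,0,1]: only src2 is a 16-bit operand.
+ ; op_sel:[0,0,1]: src2 is read from v3[31:16].
+ v_mad_mix_f32 v0, v1, v2, v3 op_sel:[0,0,1] op_sel_hi:[0,0,1]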
+
+abs
+~~~
+
+See a description :ref:`here<amdgpu_synid_abs>`.
+
+neg
+~~~
+
+See a description :ref:`here<amdgpu_synid_neg>`.
+
+clamp
+~~~~~
+
+See a description :ref:`here<amdgpu_synid_clamp>`.