Diffstat (limited to 'llvm/docs/AMDGPUModifierSyntax.rst')
-rw-r--r-- | llvm/docs/AMDGPUModifierSyntax.rst | 296 |
1 files changed, 190 insertions, 106 deletions
diff --git a/llvm/docs/AMDGPUModifierSyntax.rst b/llvm/docs/AMDGPUModifierSyntax.rst index d66e94dcb91..526016d4f4f 100644 --- a/llvm/docs/AMDGPUModifierSyntax.rst +++ b/llvm/docs/AMDGPUModifierSyntax.rst @@ -34,19 +34,21 @@ Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0. Used with DS instructions which have 2 addresses. - =================== ===================================================== + =================== ==================================================================== Syntax Description - =================== ===================================================== + =================== ==================================================================== offset:{0..0xFF} Specifies an unsigned 8-bit offset as a positive - :ref:`integer number <amdgpu_synid_integer_number>`. - =================== ===================================================== + :ref:`integer number <amdgpu_synid_integer_number>` + or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. + =================== ==================================================================== Examples: .. parsed-literal:: - offset:255 offset:0xff + offset:2-x + offset:-x-y .. _amdgpu_synid_ds_offset16: @@ -57,12 +59,13 @@ Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0. Used with DS instructions which have 1 address. - ==================== ====================================================== + ==================== ==================================================================== Syntax Description - ==================== ====================================================== + ==================== ==================================================================== offset:{0..0xFFFF} Specifies an unsigned 16-bit offset as a positive - :ref:`integer number <amdgpu_synid_integer_number>`. 
- ==================== ====================================================== + :ref:`integer number <amdgpu_synid_integer_number>` + or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. + ==================== ==================================================================== Examples: @@ -70,6 +73,7 @@ Examples: offset:65535 offset:0xffff + offset:-x-y .. _amdgpu_synid_sw_offset16: @@ -95,7 +99,7 @@ See AMD documentation for more information. *mask* is a 5 character sequence which specifies how to transform the bits of the - lane *id*. + lane *id*. The following characters are allowed: @@ -116,7 +120,7 @@ See AMD documentation for more information. size and must be equal to 2, 4, 8, 16 or 32. The second numeric parameter is an index of the - lane being broadcasted. + lane being broadcasted. The index must not exceed group size. offset:swizzle(SWAP,{1..16}) Specifies a swap mode. @@ -128,7 +132,7 @@ See AMD documentation for more information. Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes. ======================================================= =========================================================== -Numeric parameters may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or +Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. Examples: @@ -137,7 +141,7 @@ Examples: offset:255 offset:0xffff - offset:swizzle(QUAD_PERM, 0, 1, 2 ,3) + offset:swizzle(QUAD_PERM, 0, 1, 2, 3) offset:swizzle(BITMASK_PERM, "01pi0") offset:swizzle(BROADCAST, 2, 0) offset:swizzle(SWAP, 8) @@ -212,19 +216,20 @@ Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0. Cannot be used with *global/scratch* opcodes. GFX9 only. 
- ================= ====================================================== + ================= ==================================================================== Syntax Description - ================= ====================================================== + ================= ==================================================================== offset:{0..4095} Specifies a 12-bit unsigned offset as a positive - :ref:`integer number <amdgpu_synid_integer_number>`. - ================= ====================================================== + :ref:`integer number <amdgpu_synid_integer_number>` + or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. + ================= ==================================================================== Examples: .. parsed-literal:: offset:4095 - offset:0xff + offset:x-0xff .. _amdgpu_synid_flat_offset13s: @@ -235,12 +240,13 @@ Specifies an immediate signed 13-bit offset, in bytes. The default value is 0. Can be used with *global/scratch* opcodes only. GFX9 only. - ============================ ======================================================= - Syntax Description - ============================ ======================================================= - offset:{-4096..4095} Specifies a 13-bit signed offset as an - :ref:`integer number <amdgpu_synid_integer_number>`. - ============================ ======================================================= + ===================== ==================================================================== + Syntax Description + ===================== ==================================================================== + offset:{-4096..4095} Specifies a 13-bit signed offset as an + :ref:`integer number <amdgpu_synid_integer_number>` + or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. + ===================== ==================================================================== Examples: @@ -248,6 +254,7 @@ Examples: offset:-4000 offset:0x10 + offset:-x .. 
_amdgpu_synid_flat_offset12s: @@ -260,12 +267,13 @@ Can be used with *global/scratch* opcodes only. GFX10 only. - ============================ ======================================================= - Syntax Description - ============================ ======================================================= - offset:{-2048..2047} Specifies a 12-bit signed offset as an - :ref:`integer number <amdgpu_synid_integer_number>`. - ============================ ======================================================= + ===================== ==================================================================== + Syntax Description + ===================== ==================================================================== + offset:{-2048..2047} Specifies a 12-bit signed offset as an + :ref:`integer number <amdgpu_synid_integer_number>` + or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. + ===================== ==================================================================== Examples: @@ -273,6 +281,7 @@ Examples: offset:-2000 offset:0x10 + offset:-x+y .. _amdgpu_synid_flat_offset11: @@ -285,19 +294,20 @@ Cannot be used with *global/scratch* opcodes. GFX10 only. - ================= ====================================================== + ================= ==================================================================== Syntax Description - ================= ====================================================== + ================= ==================================================================== offset:{0..2047} Specifies an 11-bit unsigned offset as a positive - :ref:`integer number <amdgpu_synid_integer_number>`. - ================= ====================================================== + :ref:`integer number <amdgpu_synid_integer_number>` + or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. + ================= ==================================================================== Examples: .. 
parsed-literal:: offset:2047 - offset:0xff + offset:x+0xff dlc ~~~ @@ -340,19 +350,18 @@ dmask Specifies which channels (image components) are used by the operation. By default, no channels are used. - =============== ===================================================== + =============== ==================================================================== Syntax Description - =============== ===================================================== + =============== ==================================================================== dmask:{0..15} Specifies image channels as a positive - :ref:`integer number <amdgpu_synid_integer_number>`. + :ref:`integer number <amdgpu_synid_integer_number>` + or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. - Each bit corresponds to one of 4 image - components (RGBA). + Each bit corresponds to one of 4 image components (RGBA). - If the specified bit value - is 0, the component is not used, value 1 means - that the component is used. - =============== ===================================================== + If the specified bit value is 0, the component is not used, + value 1 means that the component is used. + =============== ==================================================================== This modifier has some limitations depending on instruction kind: @@ -373,7 +382,7 @@ Examples: dmask:0xf dmask:0b1111 - dmask:3 + dmask:x|y|z .. _amdgpu_synid_unorm: @@ -468,7 +477,7 @@ Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7. Each 16-bit data element occupies 1 VGPR. GFX8.1, GFX9 and GFX10 support data packing. - Each pair of 16-bit data elements + Each pair of 16-bit data elements occupies 1 VGPR. ======================================== ================================================ @@ -684,18 +693,19 @@ offset12 Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0. 
- =============================== ====================================================== - Syntax Description - =============================== ====================================================== - offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive - :ref:`integer number <amdgpu_synid_integer_number>`. - =============================== ====================================================== + ================== ==================================================================== + Syntax Description + ================== ==================================================================== + offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive + :ref:`integer number <amdgpu_synid_integer_number>` + or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. + ================== ==================================================================== Examples: .. parsed-literal:: - offset:0 + offset:x+y offset:0x10 glc @@ -782,14 +792,18 @@ GFX10 only. dpp8_sel ~~~~~~~~ -Selects which lane to pull data from, within a group of 8 lanes. This is a mandatory modifier. +Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier. There is no default value. GFX10 only. -The *dpp8_sel* modifier must specify exactly 8 values, each ranging from 0 to 7. +The *dpp8_sel* modifier must specify exactly 8 values. First value selects which lane to read from to supply data into lane 0. -Second value controls value for lane 1 and so on. +Second value controls lane 1 and so on. + +Each value may be specified as either +an :ref:`integer number<amdgpu_synid_integer_number>` or +an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. =============================================================== =========================== Syntax Description @@ -811,7 +825,7 @@ fi Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero. -Note. 
*Inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero. +Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero. GFX10 only. @@ -822,6 +836,9 @@ GFX10 only. fi:1 Fetch pre-exist values from inactive lanes. ==================================== ===================================================== +Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or +:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. + DPP/DPP16 Modifiers ------------------- @@ -837,7 +854,7 @@ There is no default value. GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10. -Note. The lanes of a wavefront are organized in four *rows* and four *banks*. +Note: the lanes of a wavefront are organized in four *rows* and four *banks*. ======================================== ================================================ Syntax Description @@ -856,7 +873,7 @@ Note. The lanes of a wavefront are organized in four *rows* and four *banks*. row_ror:{1..15} Row rotate right by 1-15 threads. ======================================== ================================================ -Note: Numeric parameters may be specified as either +Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. @@ -877,7 +894,7 @@ There is no default value. GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9. -Note. The lanes of a wavefront are organized in four *rows* and four *banks*. +Note: the lanes of a wavefront are organized in four *rows* and four *banks*. (There are only two rows in *wave32* mode.) ======================================== ==================================================== @@ -894,7 +911,7 @@ Note. The lanes of a wavefront are organized in four *rows* and four *banks*. row_ror:{1..15} Row rotate right by 1-15 threads. 
======================================== ==================================================== -Note: Numeric parameters may be specified as either +Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. @@ -912,21 +929,21 @@ row_mask Controls which rows are enabled for data sharing. By default, all rows are enabled. -Note. The lanes of a wavefront are organized in four *rows* and four *banks*. +Note: the lanes of a wavefront are organized in four *rows* and four *banks*. (There are only two rows in *wave32* mode.) - ======================================== ===================================================== - Syntax Description - ======================================== ===================================================== - row_mask:{0..15} Specifies a *row mask* as a positive - :ref:`integer number <amdgpu_synid_integer_number>`. + ================= ==================================================================== + Syntax Description + ================= ==================================================================== + row_mask:{0..15} Specifies a *row mask* as a positive + :ref:`integer number <amdgpu_synid_integer_number>` + or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. - Each of 4 bits in the mask controls one - row (0 - disabled, 1 - enabled). + Each of 4 bits in the mask controls one row + (0 - disabled, 1 - enabled). - In *wave32* mode the values should be limited to - {0..7}. - ======================================== ===================================================== + In *wave32* mode the values should be limited to 0..7. + ================= ==================================================================== Examples: @@ -934,7 +951,7 @@ Examples: row_mask:0xf row_mask:0b1010 - row_mask:0b1111 + row_mask:x|y .. 
_amdgpu_synid_bank_mask: @@ -943,18 +960,19 @@ bank_mask Controls which banks are enabled for data sharing. By default, all banks are enabled. -Note. The lanes of a wavefront are organized in four *rows* and four *banks*. +Note: the lanes of a wavefront are organized in four *rows* and four *banks*. (There are only two rows in *wave32* mode.) - ======================================== ======================================================= - Syntax Description - ======================================== ======================================================= - bank_mask:{0..15} Specifies a *bank mask* as a positive - :ref:`integer number <amdgpu_synid_integer_number>`. + ================== ==================================================================== + Syntax Description + ================== ==================================================================== + bank_mask:{0..15} Specifies a *bank mask* as a positive + :ref:`integer number <amdgpu_synid_integer_number>` + or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. - Each of 4 bits in the mask controls one - bank (0 - disabled, 1 - enabled). - ======================================== ======================================================= + Each of 4 bits in the mask controls one bank + (0 - disabled, 1 - enabled). + ================== ==================================================================== Examples: @@ -962,7 +980,7 @@ Examples: bank_mask:0x3 bank_mask:0b0011 - bank_mask:0b1111 + bank_mask:x&y .. _amdgpu_synid_bound_ctrl: @@ -988,7 +1006,7 @@ fi Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero. -Note. *Inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero. +Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero. GFX10 only. @@ -1001,6 +1019,9 @@ GFX10 only. fi:1 Fetch pre-exist values from inactive lanes. 
======================================== ================================================== +Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or +:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. + SDWA Modifiers -------------- @@ -1037,7 +1058,6 @@ Selects which bits in the destination are affected. By default, all bits are aff dst_sel:WORD_1 Use bits 31:16. ======================================== ================================================ - .. _amdgpu_synid_dst_unused: dst_unused @@ -1151,7 +1171,7 @@ operands (both source and destination). First value controls src0, second value and so on, except that the last value controls destination. The value 0 selects the low bits, while 1 selects the high bits. -Note. op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified +Note: op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified by op_sel must be 0. GFX9 and GFX10 only. @@ -1164,6 +1184,10 @@ GFX9 and GFX10 only. op_sel:[{0..1},{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands. ======================================== ============================================================ +Note: numeric values may be specified as either +:ref:`integer numbers<amdgpu_synid_integer_number>` or +:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. + Examples: .. parsed-literal:: @@ -1189,7 +1213,7 @@ Integer clamping is not supported by GFX7. For floating point operations, clamp modifier indicates that the result must be clamped to the range [0.0, 1.0]. By default, there is no clamping. -Note. Clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any). +Note: clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any). 
======================================== ================================================ Syntax Description @@ -1205,12 +1229,12 @@ omod Specifies if an output modifier must be applied to the result. By default, no output modifiers are applied. -Note. Output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any). +Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any). Output modifiers are valid for f32 and f64 floating point results only. They must not be used with f16. -Note. *v_cvt_f16_f32* is an exception. This instruction produces f16 result +Note: *v_cvt_f16_f32* is an exception. This instruction produces f16 result but accepts output modifiers. ======================================== ================================================ @@ -1221,6 +1245,16 @@ but accepts output modifiers. div:2 Multiply the result by 0.5. ======================================== ================================================ +Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or +:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. + +Examples: + +.. parsed-literal:: + + mul:2 + mul:x // x must be equal to 2 or 4 + .. _amdgpu_synid_vop3_operand_modifiers: VOP3 Operand Modifiers @@ -1233,15 +1267,19 @@ Operand modifiers are not used separately. They are applied to source operands. abs ~~~ -Computes absolute value of its operand. Applied before :ref:`neg<amdgpu_synid_neg>` (if any). -Valid for floating point operands only. +Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>` +(if any). Valid for floating point operands only. 
- ======================================== ================================================ + ======================================== ==================================================== Syntax Description - ======================================== ================================================ - abs(<operand>) Get absolute value of operand. - \|<operand>| The same as above. - ======================================== ================================================ + ======================================== ==================================================== + abs(<operand>) Get the absolute value of a floating-point operand. + \|<operand>| The same as above (an SP3 syntax). + ======================================== ==================================================== + +Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|' +may be misinterpreted. Such operands should be enclosed into additional parentheses as shown +in examples below. Examples: @@ -1249,28 +1287,50 @@ Examples: abs(v36) \|v36| + abs(x|y) // ok + \|(x|y)| // additional parentheses are required .. _amdgpu_synid_neg: neg ~~~ -Computes negative value of its operand. Applied after :ref:`abs<amdgpu_synid_abs>` (if any). -Valid for floating point operands only. +Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>` +(if any). Valid for floating point operands only. - ======================================== ================================================ - Syntax Description - ======================================== ================================================ - neg(<operand>) Get negative value of operand. - -<operand> The same as above. 
- ======================================== ================================================ + ================== ==================================================== + Syntax Description + ================== ==================================================== + neg(<operand>) Get the negative value of a floating-point operand. + The operand may include an optional + :ref:`abs<amdgpu_synid_abs>` modifier. + -<operand> The same as above (an SP3 syntax). + ================== ==================================================== + +Note: SP3 syntax is supported with limitations because of a potential ambiguity. +Currently it is allowed in the following cases: + +* Before a register. +* Before an :ref:`abs<amdgpu_synid_abs>` modifier. +* Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier. + +In all other cases "-" is handled as a part of an expression that follows the sign. Examples: .. parsed-literal:: + // Operands with negate modifiers neg(v[0]) - -v4 + neg(1.0) + neg(abs(v0)) + -v5 + -abs(v5) + -\|v5| + + // Operands without negate modifiers + -1 + -x+y VOP3P Modifiers --------------- @@ -1304,6 +1364,10 @@ The value 0 selects the low bits, while 1 selects the high bits. op_sel:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands. ================================= ============================================================= +Note: numeric values may be specified as either +:ref:`integer numbers<amdgpu_synid_integer_number>` or +:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. + Examples: .. parsed-literal:: @@ -1333,6 +1397,10 @@ The value 0 selects the low bits, while 1 selects the high bits. op_sel_hi:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands. 
=================================== ============================================================= +Note: numeric values may be specified as either +:ref:`integer numbers<amdgpu_synid_integer_number>` or +:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. + Examples: .. parsed-literal:: @@ -1367,6 +1435,10 @@ This modifier is valid for floating point operands only. neg_lo:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands. ================================ ================================================================== +Note: numeric values may be specified as either +:ref:`integer numbers<amdgpu_synid_integer_number>` or +:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. + Examples: .. parsed-literal:: @@ -1401,6 +1473,10 @@ This modifier is valid for floating point operands only. neg_hi:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands. =============================== ================================================================== +Note: numeric values may be specified as either +:ref:`integer numbers<amdgpu_synid_integer_number>` or +:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. + Examples: .. parsed-literal:: @@ -1419,7 +1495,7 @@ VOP3P V_MAD_MIX Modifiers ------------------------- *v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16* instructions -use *op_sel* and *op_sel_hi* modifiers +use *op_sel* and *op_sel_hi* modifiers in a manner different from *regular* VOP3P instructions. See a description below. @@ -1449,6 +1525,10 @@ By default, low bits are used for all operands. op_sel:[{0..1},{0..1},{0..1}] Select location of each 16-bit source operand. =============================== ================================================ +Note: numeric values may be specified as either +:ref:`integer numbers<amdgpu_synid_integer_number>` or +:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. + Examples: .. 
parsed-literal:: @@ -1477,6 +1557,10 @@ The location of 16 bits in the operand may be specified by op_sel_hi:[{0..1},{0..1},{0..1}] Select size of each source operand. ======================================== ==================================== +Note: numeric values may be specified as either +:ref:`integer numbers<amdgpu_synid_integer_number>` or +:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. + Examples: .. parsed-literal::
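Reviewer's sketch (not part of the patch): the offset tables touched by this diff give each range as an N-bit signed or unsigned interval, and the change lets the value come from an absolute expression as well as from an integer literal. The Python below reproduces those intervals from the bit width so the documented bounds can be checked. The names `offset_range` and `fits` are hypothetical. That the assembler range-checks an expression only after evaluating it is an assumption, not something this diff states.

```python
def offset_range(bits: int, signed: bool) -> tuple[int, int]:
    """Return the inclusive (low, high) range of a bits-wide offset field."""
    if signed:
        # Two's complement: e.g. 13 bits gives -4096..4095.
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    # Unsigned: e.g. 12 bits gives 0..4095.
    return 0, (1 << bits) - 1


def fits(value: int, bits: int, signed: bool) -> bool:
    """Check an already evaluated offset value against its field range.

    Assumption: an absolute expression such as x-0xff has been reduced
    to a plain integer before this check is made.
    """
    low, high = offset_range(bits, signed)
    return low <= value <= high


# Ranges as documented in the tables of this diff.
print(offset_range(8, False))   # DS offset, 2 addresses
print(offset_range(13, True))   # GFX9 global/scratch offset
print(offset_range(12, True))   # GFX10 global/scratch offset
print(offset_range(11, False))  # GFX10 flat offset
```

With `x` bound to `0x1fe`, for example, `fits(x - 0xff, 12, False)` accepts the value written as `offset:x-0xff` in the GFX9 flat example, while `fits(-1, 12, False)` rejects a negative result for an unsigned field.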