 llvm/docs/AMDGPUModifierSyntax.rst | 296
 1 file changed, 190 insertions, 106 deletions
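
Reading note: most hunks in this diff extend numeric modifier values from plain integer numbers to absolute expressions (for example ``offset:2-x`` or ``dmask:x|y|z``), while the documented ranges stay the same. The Python sketch below is one way to picture that: evaluate the expression against known symbol values, then check the result against the documented range. It is not the assembler's implementation. The names ``evaluate``, ``check_modifier`` and the ``RANGES`` keys are hypothetical, and the sketch assumes Python's operator precedence, which may differ from the assembler's expression rules.

```python
import ast
import operator

# Integer operators accepted by this sketch (an assumption, not the assembler's full set).
_BINARY = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.BitOr: operator.or_, ast.BitAnd: operator.and_, ast.BitXor: operator.xor,
    ast.LShift: operator.lshift, ast.RShift: operator.rshift,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos, ast.Invert: operator.invert}

# Ranges copied from the syntax tables in the diff; the keys are hypothetical labels.
RANGES = {
    "ds_offset8": (0, 0xFF),
    "ds_offset16": (0, 0xFFFF),
    "flat_offset12": (0, 4095),
    "flat_offset13s": (-4096, 4095),
    "flat_offset12s": (-2048, 2047),
    "flat_offset11": (0, 2047),
    "dmask": (0, 15),
    "row_mask": (0, 15),
    "bank_mask": (0, 15),
}


def evaluate(text, symbols):
    """Evaluate an integer expression such as '2-x' using the given symbol values."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.Name):
            if node.id not in symbols:
                raise ValueError(f"undefined symbol: {node.id}")
            return symbols[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported expression: {text!r}")

    return walk(ast.parse(text, mode="eval"))


def check_modifier(kind, text, symbols):
    """Return the evaluated value, or raise ValueError if it is out of range."""
    value = evaluate(text, symbols)
    low, high = RANGES[kind]
    if not low <= value <= high:
        raise ValueError(f"{kind}: {value} is outside {low}..{high}")
    return value


print(check_modifier("ds_offset8", "2-x", {"x": -10}))  # 12
```

An expression such as ``x+0xff`` with ``x = 2000`` would be rejected for the 11-bit offset in this sketch, since 2255 exceeds 2047.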
diff --git a/llvm/docs/AMDGPUModifierSyntax.rst b/llvm/docs/AMDGPUModifierSyntax.rst
index d66e94dcb91..526016d4f4f 100644
--- a/llvm/docs/AMDGPUModifierSyntax.rst
+++ b/llvm/docs/AMDGPUModifierSyntax.rst
@@ -34,19 +34,21 @@ Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0.
Used with DS instructions which have 2 addresses.
- =================== =====================================================
+ =================== ====================================================================
Syntax Description
- =================== =====================================================
+ =================== ====================================================================
offset:{0..0xFF} Specifies an unsigned 8-bit offset as a positive
- :ref:`integer number <amdgpu_synid_integer_number>`.
- =================== =====================================================
+ :ref:`integer number <amdgpu_synid_integer_number>`
+ or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+ =================== ====================================================================
Examples:
.. parsed-literal::
- offset:255
offset:0xff
+ offset:2-x
+ offset:-x-y
.. _amdgpu_synid_ds_offset16:
@@ -57,12 +59,13 @@ Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0.
Used with DS instructions which have 1 address.
- ==================== ======================================================
+ ==================== ====================================================================
Syntax Description
- ==================== ======================================================
+ ==================== ====================================================================
offset:{0..0xFFFF} Specifies an unsigned 16-bit offset as a positive
- :ref:`integer number <amdgpu_synid_integer_number>`.
- ==================== ======================================================
+ :ref:`integer number <amdgpu_synid_integer_number>`
+ or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+ ==================== ====================================================================
Examples:
@@ -70,6 +73,7 @@ Examples:
offset:65535
offset:0xffff
+ offset:-x-y
.. _amdgpu_synid_sw_offset16:
@@ -95,7 +99,7 @@ See AMD documentation for more information.
*mask* is a 5 character sequence which
specifies how to transform the bits of the
- lane *id*.
+ lane *id*.
The following characters are allowed:
@@ -116,7 +120,7 @@ See AMD documentation for more information.
size and must be equal to 2, 4, 8, 16 or 32.
The second numeric parameter is an index of the
- lane being broadcasted.
+ lane being broadcasted.
The index must not exceed group size.
offset:swizzle(SWAP,{1..16}) Specifies a swap mode.
@@ -128,7 +132,7 @@ See AMD documentation for more information.
Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
======================================================= ===========================================================
-Numeric parameters may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
+Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Examples:
@@ -137,7 +141,7 @@ Examples:
offset:255
offset:0xffff
- offset:swizzle(QUAD_PERM, 0, 1, 2 ,3)
+ offset:swizzle(QUAD_PERM, 0, 1, 2, 3)
offset:swizzle(BITMASK_PERM, "01pi0")
offset:swizzle(BROADCAST, 2, 0)
offset:swizzle(SWAP, 8)
@@ -212,19 +216,20 @@ Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
Cannot be used with *global/scratch* opcodes. GFX9 only.
- ================= ======================================================
+ ================= ====================================================================
Syntax Description
- ================= ======================================================
+ ================= ====================================================================
offset:{0..4095} Specifies a 12-bit unsigned offset as a positive
- :ref:`integer number <amdgpu_synid_integer_number>`.
- ================= ======================================================
+ :ref:`integer number <amdgpu_synid_integer_number>`
+ or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+ ================= ====================================================================
Examples:
.. parsed-literal::
offset:4095
- offset:0xff
+ offset:x-0xff
.. _amdgpu_synid_flat_offset13s:
@@ -235,12 +240,13 @@ Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
Can be used with *global/scratch* opcodes only. GFX9 only.
- ============================ =======================================================
- Syntax Description
- ============================ =======================================================
- offset:{-4096..4095} Specifies a 13-bit signed offset as an
- :ref:`integer number <amdgpu_synid_integer_number>`.
- ============================ =======================================================
+ ===================== ====================================================================
+ Syntax Description
+ ===================== ====================================================================
+ offset:{-4096..4095} Specifies a 13-bit signed offset as an
+ :ref:`integer number <amdgpu_synid_integer_number>`
+ or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+ ===================== ====================================================================
Examples:
@@ -248,6 +254,7 @@ Examples:
offset:-4000
offset:0x10
+ offset:-x
.. _amdgpu_synid_flat_offset12s:
@@ -260,12 +267,13 @@ Can be used with *global/scratch* opcodes only.
GFX10 only.
- ============================ =======================================================
- Syntax Description
- ============================ =======================================================
- offset:{-2048..2047} Specifies a 12-bit signed offset as an
- :ref:`integer number <amdgpu_synid_integer_number>`.
- ============================ =======================================================
+ ===================== ====================================================================
+ Syntax Description
+ ===================== ====================================================================
+ offset:{-2048..2047} Specifies a 12-bit signed offset as an
+ :ref:`integer number <amdgpu_synid_integer_number>`
+ or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+ ===================== ====================================================================
Examples:
@@ -273,6 +281,7 @@ Examples:
offset:-2000
offset:0x10
+ offset:-x+y
.. _amdgpu_synid_flat_offset11:
@@ -285,19 +294,20 @@ Cannot be used with *global/scratch* opcodes.
GFX10 only.
- ================= ======================================================
+ ================= ====================================================================
Syntax Description
- ================= ======================================================
+ ================= ====================================================================
offset:{0..2047} Specifies an 11-bit unsigned offset as a positive
- :ref:`integer number <amdgpu_synid_integer_number>`.
- ================= ======================================================
+ :ref:`integer number <amdgpu_synid_integer_number>`
+ or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+ ================= ====================================================================
Examples:
.. parsed-literal::
offset:2047
- offset:0xff
+ offset:x+0xff
dlc
~~~
@@ -340,19 +350,18 @@ dmask
Specifies which channels (image components) are used by the operation. By default, no channels
are used.
- =============== =====================================================
+ =============== ====================================================================
Syntax Description
- =============== =====================================================
+ =============== ====================================================================
dmask:{0..15} Specifies image channels as a positive
- :ref:`integer number <amdgpu_synid_integer_number>`.
+ :ref:`integer number <amdgpu_synid_integer_number>`
+ or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
- Each bit corresponds to one of 4 image
- components (RGBA).
+ Each bit corresponds to one of 4 image components (RGBA).
- If the specified bit value
- is 0, the component is not used, value 1 means
- that the component is used.
- =============== =====================================================
+ If the specified bit value is 0, the component is not used,
+ value 1 means that the component is used.
+ =============== ====================================================================
This modifier has some limitations depending on instruction kind:
@@ -373,7 +382,7 @@ Examples:
dmask:0xf
dmask:0b1111
- dmask:3
+ dmask:x|y|z
.. _amdgpu_synid_unorm:
@@ -468,7 +477,7 @@ Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
Each 16-bit data element occupies 1 VGPR.
GFX8.1, GFX9 and GFX10 support data packing.
- Each pair of 16-bit data elements
+ Each pair of 16-bit data elements
occupies 1 VGPR.
======================================== ================================================
@@ -684,18 +693,19 @@ offset12
Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
- =============================== ======================================================
- Syntax Description
- =============================== ======================================================
- offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive
- :ref:`integer number <amdgpu_synid_integer_number>`.
- =============================== ======================================================
+ ================== ====================================================================
+ Syntax Description
+ ================== ====================================================================
+ offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive
+ :ref:`integer number <amdgpu_synid_integer_number>`
+ or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+ ================== ====================================================================
Examples:
.. parsed-literal::
- offset:0
+ offset:x+y
offset:0x10
glc
@@ -782,14 +792,18 @@ GFX10 only.
dpp8_sel
~~~~~~~~
-Selects which lane to pull data from, within a group of 8 lanes. This is a mandatory modifier.
+Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier.
There is no default value.
GFX10 only.
-The *dpp8_sel* modifier must specify exactly 8 values, each ranging from 0 to 7.
+The *dpp8_sel* modifier must specify exactly 8 values.
First value selects which lane to read from to supply data into lane 0.
-Second value controls value for lane 1 and so on.
+Second value controls lane 1 and so on.
+
+Each value may be specified as either
+an :ref:`integer number<amdgpu_synid_integer_number>` or
+an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
=============================================================== ===========================
Syntax Description
@@ -811,7 +825,7 @@ fi
Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero.
-Note. *Inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
+Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
GFX10 only.
@@ -822,6 +836,9 @@ GFX10 only.
fi:1 Fetch pre-exist values from inactive lanes.
==================================== =====================================================
+Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
DPP/DPP16 Modifiers
-------------------
@@ -837,7 +854,7 @@ There is no default value.
GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10.
-Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
+Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
======================================== ================================================
Syntax Description
@@ -856,7 +873,7 @@ Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
row_ror:{1..15} Row rotate right by 1-15 threads.
======================================== ================================================
-Note: Numeric parameters may be specified as either
+Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
@@ -877,7 +894,7 @@ There is no default value.
GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9.
-Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
+Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
(There are only two rows in *wave32* mode.)
======================================== ====================================================
@@ -894,7 +911,7 @@ Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
row_ror:{1..15} Row rotate right by 1-15 threads.
======================================== ====================================================
-Note: Numeric parameters may be specified as either
+Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
@@ -912,21 +929,21 @@ row_mask
Controls which rows are enabled for data sharing. By default, all rows are enabled.
-Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
+Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
(There are only two rows in *wave32* mode.)
- ======================================== =====================================================
- Syntax Description
- ======================================== =====================================================
- row_mask:{0..15} Specifies a *row mask* as a positive
- :ref:`integer number <amdgpu_synid_integer_number>`.
+ ================= ====================================================================
+ Syntax Description
+ ================= ====================================================================
+ row_mask:{0..15} Specifies a *row mask* as a positive
+ :ref:`integer number <amdgpu_synid_integer_number>`
+ or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
- Each of 4 bits in the mask controls one
- row (0 - disabled, 1 - enabled).
+ Each of 4 bits in the mask controls one row
+ (0 - disabled, 1 - enabled).
- In *wave32* mode the values should be limited to
- {0..7}.
- ======================================== =====================================================
+ In *wave32* mode the values should be limited to 0..7.
+ ================= ====================================================================
Examples:
@@ -934,7 +951,7 @@ Examples:
row_mask:0xf
row_mask:0b1010
- row_mask:0b1111
+ row_mask:x|y
.. _amdgpu_synid_bank_mask:
@@ -943,18 +960,19 @@ bank_mask
Controls which banks are enabled for data sharing. By default, all banks are enabled.
-Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
+Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
(There are only two rows in *wave32* mode.)
- ======================================== =======================================================
- Syntax Description
- ======================================== =======================================================
- bank_mask:{0..15} Specifies a *bank mask* as a positive
- :ref:`integer number <amdgpu_synid_integer_number>`.
+ ================== ====================================================================
+ Syntax Description
+ ================== ====================================================================
+ bank_mask:{0..15} Specifies a *bank mask* as a positive
+ :ref:`integer number <amdgpu_synid_integer_number>`
+ or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
- Each of 4 bits in the mask controls one
- bank (0 - disabled, 1 - enabled).
- ======================================== =======================================================
+ Each of 4 bits in the mask controls one bank
+ (0 - disabled, 1 - enabled).
+ ================== ====================================================================
Examples:
@@ -962,7 +980,7 @@ Examples:
bank_mask:0x3
bank_mask:0b0011
- bank_mask:0b1111
+ bank_mask:x&y
.. _amdgpu_synid_bound_ctrl:
@@ -988,7 +1006,7 @@ fi
Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero.
-Note. *Inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
+Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
GFX10 only.
@@ -1001,6 +1019,9 @@ GFX10 only.
fi:1 Fetch pre-exist values from inactive lanes.
======================================== ==================================================
+Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
SDWA Modifiers
--------------
@@ -1037,7 +1058,6 @@ Selects which bits in the destination are affected. By default, all bits are aff
dst_sel:WORD_1 Use bits 31:16.
======================================== ================================================
-
.. _amdgpu_synid_dst_unused:
dst_unused
@@ -1151,7 +1171,7 @@ operands (both source and destination). First value controls src0, second value
and so on, except that the last value controls destination.
The value 0 selects the low bits, while 1 selects the high bits.
-Note. op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
+Note: op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
by op_sel must be 0.
GFX9 and GFX10 only.
@@ -1164,6 +1184,10 @@ GFX9 and GFX10 only.
op_sel:[{0..1},{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
======================================== ============================================================
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
Examples:
.. parsed-literal::
@@ -1189,7 +1213,7 @@ Integer clamping is not supported by GFX7.
For floating point operations, clamp modifier indicates that the result must be clamped
to the range [0.0, 1.0]. By default, there is no clamping.
-Note. Clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
+Note: clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
======================================== ================================================
Syntax Description
@@ -1205,12 +1229,12 @@ omod
Specifies if an output modifier must be applied to the result.
By default, no output modifiers are applied.
-Note. Output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
+Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
Output modifiers are valid for f32 and f64 floating point results only.
They must not be used with f16.
-Note. *v_cvt_f16_f32* is an exception. This instruction produces f16 result
+Note: *v_cvt_f16_f32* is an exception. This instruction produces f16 result
but accepts output modifiers.
======================================== ================================================
@@ -1221,6 +1245,16 @@ but accepts output modifiers.
div:2 Multiply the result by 0.5.
======================================== ================================================
+Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
+Examples:
+
+.. parsed-literal::
+
+ mul:2
+ mul:x // x must be equal to 2 or 4
+
.. _amdgpu_synid_vop3_operand_modifiers:
VOP3 Operand Modifiers
@@ -1233,15 +1267,19 @@ Operand modifiers are not used separately. They are applied to source operands.
abs
~~~
-Computes absolute value of its operand. Applied before :ref:`neg<amdgpu_synid_neg>` (if any).
-Valid for floating point operands only.
+Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>`
+(if any). Valid for floating point operands only.
- ======================================== ================================================
+ ======================================== ====================================================
Syntax Description
- ======================================== ================================================
- abs(<operand>) Get absolute value of operand.
- \|<operand>| The same as above.
- ======================================== ================================================
+ ======================================== ====================================================
+ abs(<operand>) Get the absolute value of a floating-point operand.
+ \|<operand>| The same as above (an SP3 syntax).
+ ======================================== ====================================================
+
+Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|'
+may be misinterpreted. Such operands should be enclosed in additional parentheses, as shown
+in the examples below.
Examples:
@@ -1249,28 +1287,50 @@ Examples:
abs(v36)
\|v36|
+ abs(x|y) // ok
+ \|(x|y)| // additional parentheses are required
.. _amdgpu_synid_neg:
neg
~~~
-Computes negative value of its operand. Applied after :ref:`abs<amdgpu_synid_abs>` (if any).
-Valid for floating point operands only.
+Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>`
+(if any). Valid for floating point operands only.
- ======================================== ================================================
- Syntax Description
- ======================================== ================================================
- neg(<operand>) Get negative value of operand.
- -<operand> The same as above.
- ======================================== ================================================
+ ================== ====================================================
+ Syntax Description
+ ================== ====================================================
+ neg(<operand>) Get the negative value of a floating-point operand.
+ The operand may include an optional
+ :ref:`abs<amdgpu_synid_abs>` modifier.
+ -<operand> The same as above (an SP3 syntax).
+ ================== ====================================================
+
+Note: SP3 syntax is supported with limitations because of a potential ambiguity.
+Currently it is allowed in the following cases:
+
+* Before a register.
+* Before an :ref:`abs<amdgpu_synid_abs>` modifier.
+* Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier.
+
+In all other cases, "-" is treated as part of the expression that follows it.
Examples:
.. parsed-literal::
+ // Operands with negate modifiers
neg(v[0])
- -v4
+ neg(1.0)
+ neg(abs(v0))
+ -v5
+ -abs(v5)
+ -\|v5|
+
+ // Operands without negate modifiers
+ -1
+ -x+y
VOP3P Modifiers
---------------
@@ -1304,6 +1364,10 @@ The value 0 selects the low bits, while 1 selects the high bits.
op_sel:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
================================= =============================================================
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
Examples:
.. parsed-literal::
@@ -1333,6 +1397,10 @@ The value 0 selects the low bits, while 1 selects the high bits.
op_sel_hi:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
=================================== =============================================================
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
Examples:
.. parsed-literal::
@@ -1367,6 +1435,10 @@ This modifier is valid for floating point operands only.
neg_lo:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands.
================================ ==================================================================
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
Examples:
.. parsed-literal::
@@ -1401,6 +1473,10 @@ This modifier is valid for floating point operands only.
neg_hi:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands.
=============================== ==================================================================
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
Examples:
.. parsed-literal::
@@ -1419,7 +1495,7 @@ VOP3P V_MAD_MIX Modifiers
-------------------------
*v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16* instructions
-use *op_sel* and *op_sel_hi* modifiers
+use *op_sel* and *op_sel_hi* modifiers
in a manner different from *regular* VOP3P instructions.
See a description below.
@@ -1449,6 +1525,10 @@ By default, low bits are used for all operands.
op_sel:[{0..1},{0..1},{0..1}] Select location of each 16-bit source operand.
=============================== ================================================
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
Examples:
.. parsed-literal::
@@ -1477,6 +1557,10 @@ The location of 16 bits in the operand may be specified by
op_sel_hi:[{0..1},{0..1},{0..1}] Select size of each source operand.
======================================== ====================================
+Note: numeric values may be specified as either
+:ref:`integer numbers<amdgpu_synid_integer_number>` or
+:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
+
Examples:
.. parsed-literal::
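
Reading note: the ordering rules stated in the hunks above (*abs* is applied before *neg*; output modifiers are applied before clamping) can be pictured with a small Python sketch. This is an illustration only and does not model hardware floating-point behavior such as NaN or denormal handling. The function names and the string keys used for the output modifiers are hypothetical.

```python
def apply_source_modifiers(value, use_abs=False, use_neg=False):
    """Apply operand modifiers in the documented order: abs first, then neg."""
    if use_abs:
        value = abs(value)
    if use_neg:
        value = -value
    return value


# Output modifier factors; div:2 multiplies the result by 0.5.
_OMOD_FACTOR = {None: 1.0, "mul:2": 2.0, "mul:4": 4.0, "div:2": 0.5}


def apply_result_modifiers(result, omod=None, clamp=False):
    """Apply the output modifier first, then clamp to [0.0, 1.0] if requested."""
    result = result * _OMOD_FACTOR[omod]
    if clamp:
        result = min(max(result, 0.0), 1.0)
    return result


# -|v| for v = -3.0: abs gives 3.0, neg then gives -3.0.
print(apply_source_modifiers(-3.0, use_abs=True, use_neg=True))  # -3.0
# 0.5 * 4 = 2.0, which clamping then limits to 1.0.
print(apply_result_modifiers(0.5, omod="mul:4", clamp=True))  # 1.0
```

Swapping the order in either function would change the results: negating before taking the absolute value would always yield a non-negative operand, and clamping before scaling could produce results above 1.0.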