diff options
Diffstat (limited to 'llvm/docs/LangRef.rst')
-rw-r--r-- | llvm/docs/LangRef.rst | 332 |
1 files changed, 332 insertions, 0 deletions
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst index dad99e3352d..dc6350c5520 100644 --- a/llvm/docs/LangRef.rst +++ b/llvm/docs/LangRef.rst @@ -11687,6 +11687,338 @@ Examples: %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c + +Experimental Vector Reduction Intrinsics +---------------------------------------- + +Horizontal reductions of vectors can be expressed using the following +intrinsics. Each one takes a vector operand as an input and applies its +respective operation across all elements of the vector, returning a single +scalar result of the same element type. + + +'``llvm.experimental.vector.reduce.add.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32> %a) + declare i64 @llvm.experimental.vector.reduce.add.i64.v2i64(<2 x i64> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.add.*``' intrinsics do an integer ``ADD`` +reduction of a vector, returning the result as a scalar. The return type matches +the element-type of the vector input. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of integer values. + +'``llvm.experimental.vector.reduce.fadd.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %a) + declare double @llvm.experimental.vector.reduce.fadd.f64.v2f64(double %acc, <2 x double> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.fadd.*``' intrinsics do a floating point +``ADD`` reduction of a vector, returning the result as a scalar. The return type +matches the element-type of the vector input. + +If the intrinsic call has fast-math flags, then the reduction will not preserve +the associativity of an equivalent scalarized counterpart. If it does not have +fast-math flags, then the reduction will be *ordered*, implying that the +operation respects the associativity of a scalarized reduction. + + +Arguments: +"""""""""" +The first argument to this intrinsic is a scalar accumulator value, which is +only used when there are no fast-math flags attached. This argument may be undef +when fast-math flags are used. + +The second argument must be a vector of floating point values. + +Examples: +""""""""" + +.. code-block:: llvm + + %fast = call fast float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float undef, <4 x float> %input) ; fast reduction + %ord = call float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction + + +'``llvm.experimental.vector.reduce.mul.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 @llvm.experimental.vector.reduce.mul.i32.v4i32(<4 x i32> %a) + declare i64 @llvm.experimental.vector.reduce.mul.i64.v2i64(<2 x i64> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.mul.*``' intrinsics do an integer ``MUL`` +reduction of a vector, returning the result as a scalar. The return type matches +the element-type of the vector input. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of integer values. + +'``llvm.experimental.vector.reduce.fmul.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %a) + declare double @llvm.experimental.vector.reduce.fmul.f64.v2f64(double %acc, <2 x double> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.fmul.*``' intrinsics do a floating point +``MUL`` reduction of a vector, returning the result as a scalar. The return type +matches the element-type of the vector input. + +If the intrinsic call has fast-math flags, then the reduction will not preserve +the associativity of an equivalent scalarized counterpart. If it does not have +fast-math flags, then the reduction will be *ordered*, implying that the +operation respects the associativity of a scalarized reduction. + + +Arguments: +"""""""""" +The first argument to this intrinsic is a scalar accumulator value, which is +only used when there are no fast-math flags attached. This argument may be undef +when fast-math flags are used. + +The second argument must be a vector of floating point values. + +Examples: +""""""""" + +.. code-block:: llvm + + %fast = call fast float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float undef, <4 x float> %input) ; fast reduction + %ord = call float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction + +'``llvm.experimental.vector.reduce.and.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.and.*``' intrinsics do a bitwise ``AND`` +reduction of a vector, returning the result as a scalar. The return type matches +the element-type of the vector input. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of integer values. + +'``llvm.experimental.vector.reduce.or.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction +of a vector, returning the result as a scalar. The return type matches the +element-type of the vector input. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of integer values. + +'``llvm.experimental.vector.reduce.xor.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 @llvm.experimental.vector.reduce.xor.i32.v4i32(<4 x i32> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR`` +reduction of a vector, returning the result as a scalar. The return type matches +the element-type of the vector input. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of integer values. + +'``llvm.experimental.vector.reduce.smax.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 @llvm.experimental.vector.reduce.smax.i32.v4i32(<4 x i32> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.smax.*``' intrinsics do a signed integer +``MAX`` reduction of a vector, returning the result as a scalar. The return type +matches the element-type of the vector input. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of integer values. + +'``llvm.experimental.vector.reduce.smin.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 @llvm.experimental.vector.reduce.smin.i32.v4i32(<4 x i32> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.smin.*``' intrinsics do a signed integer +``MIN`` reduction of a vector, returning the result as a scalar. The return type +matches the element-type of the vector input. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of integer values. + +'``llvm.experimental.vector.reduce.umax.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 @llvm.experimental.vector.reduce.umax.i32.v4i32(<4 x i32> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.umax.*``' intrinsics do an unsigned +integer ``MAX`` reduction of a vector, returning the result as a scalar. The +return type matches the element-type of the vector input. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of integer values. + +'``llvm.experimental.vector.reduce.umin.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 @llvm.experimental.vector.reduce.umin.i32.v4i32(<4 x i32> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.umin.*``' intrinsics do an unsigned +integer ``MIN`` reduction of a vector, returning the result as a scalar. The +return type matches the element-type of the vector input. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of integer values. + +'``llvm.experimental.vector.reduce.fmax.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare float @llvm.experimental.vector.reduce.fmax.f32.v4f32(<4 x float> %a) + declare double @llvm.experimental.vector.reduce.fmax.f64.v2f64(<2 x double> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.fmax.*``' intrinsics do a floating point +``MAX`` reduction of a vector, returning the result as a scalar. The return type +matches the element-type of the vector input. + +If the intrinsic call has the ``nnan`` fast-math flag then the operation can +assume that NaNs are not present in the input vector. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of floating point values. + +'``llvm.experimental.vector.reduce.fmin.*``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare float @llvm.experimental.vector.reduce.fmin.f32.v4f32(<4 x float> %a) + declare double @llvm.experimental.vector.reduce.fmin.f64.v2f64(<2 x double> %a) + +Overview: +""""""""" + +The '``llvm.experimental.vector.reduce.fmin.*``' intrinsics do a floating point +``MIN`` reduction of a vector, returning the result as a scalar. The return type +matches the element-type of the vector input. + +If the intrinsic call has the ``nnan`` fast-math flag then the operation can +assume that NaNs are not present in the input vector. + +Arguments: +"""""""""" +The argument to this intrinsic must be a vector of floating point values. + Half Precision Floating Point Intrinsics ---------------------------------------- |