summaryrefslogtreecommitdiffstats
path: root/llvm/docs
diff options
context:
space:
mode:
authorSander de Smalen <sander.desmalen@arm.com>2019-06-11 08:22:10 +0000
committerSander de Smalen <sander.desmalen@arm.com>2019-06-11 08:22:10 +0000
commitcbeb563cfb1752044fb8771586ae9bbd89d2a07b (patch)
treedd9dec7d2ce2d7f949c97d9624df5ea1bbbf551d /llvm/docs
parente2acbeb94cf28cf6a8c82e09073df79aa1e846be (diff)
downloadbcm5719-llvm-cbeb563cfb1752044fb8771586ae9bbd89d2a07b.tar.gz
bcm5719-llvm-cbeb563cfb1752044fb8771586ae9bbd89d2a07b.zip
Change semantics of fadd/fmul vector reductions.
This patch changes how LLVM handles the accumulator/start value in the reduction, by never ignoring it regardless of the presence of fast-math flags on callsites. This change introduces the following new intrinsics to replace the existing ones: llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul and adds functionality to auto-upgrade existing LLVM IR and bitcode. Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D60261 llvm-svn: 363035
Diffstat (limited to 'llvm/docs')
-rw-r--r--llvm/docs/LangRef.rst58
1 files changed, 26 insertions, 32 deletions
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 20e94b48f8a..d2250443d0b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -13733,37 +13733,34 @@ Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.
-'``llvm.experimental.vector.reduce.fadd.*``' Intrinsic
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+'``llvm.experimental.vector.reduce.v2.fadd.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
- declare float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %a)
- declare double @llvm.experimental.vector.reduce.fadd.f64.v2f64(double %acc, <2 x double> %a)
+ declare float @llvm.experimental.vector.reduce.v2.fadd.f32.v4f32(float %start_value, <4 x float> %a)
+ declare double @llvm.experimental.vector.reduce.v2.fadd.f64.v2f64(double %start_value, <2 x double> %a)
Overview:
"""""""""
-The '``llvm.experimental.vector.reduce.fadd.*``' intrinsics do a floating-point
+The '``llvm.experimental.vector.reduce.v2.fadd.*``' intrinsics do a floating-point
``ADD`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.
-If the intrinsic call has fast-math flags, then the reduction will not preserve
-the associativity of an equivalent scalarized counterpart. If it does not have
-fast-math flags, then the reduction will be *ordered*, implying that the
-operation respects the associativity of a scalarized reduction.
+If the intrinsic call has the 'reassoc' or 'fast' flags set, then the
+reduction will not preserve the associativity of an equivalent scalarized
+counterpart. Otherwise the reduction will be *ordered*, thus implying that
+the operation respects the associativity of a scalarized reduction.
Arguments:
""""""""""
-The first argument to this intrinsic is a scalar accumulator value, which is
-only used when there are no fast-math flags attached. This argument may be undef
-when fast-math flags are used. The type of the accumulator matches the
-element-type of the vector input.
-
+The first argument to this intrinsic is a scalar start value for the reduction.
+The type of the start value matches the element-type of the vector input.
The second argument must be a vector of floating-point values.
Examples:
@@ -13771,8 +13768,8 @@ Examples:
.. code-block:: llvm
- %fast = call fast float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float undef, <4 x float> %input) ; fast reduction
- %ord = call float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction
+ %unord = call reassoc float @llvm.experimental.vector.reduce.v2.fadd.f32.v4f32(float 0.0, <4 x float> %input) ; unordered reduction
+ %ord = call float @llvm.experimental.vector.reduce.v2.fadd.f32.v4f32(float %start_value, <4 x float> %input) ; ordered reduction
'``llvm.experimental.vector.reduce.mul.*``' Intrinsic
@@ -13797,37 +13794,34 @@ Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.
-'``llvm.experimental.vector.reduce.fmul.*``' Intrinsic
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+'``llvm.experimental.vector.reduce.v2.fmul.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
"""""""
::
- declare float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %a)
- declare double @llvm.experimental.vector.reduce.fmul.f64.v2f64(double %acc, <2 x double> %a)
+ declare float @llvm.experimental.vector.reduce.v2.fmul.f32.v4f32(float %start_value, <4 x float> %a)
+ declare double @llvm.experimental.vector.reduce.v2.fmul.f64.v2f64(double %start_value, <2 x double> %a)
Overview:
"""""""""
-The '``llvm.experimental.vector.reduce.fmul.*``' intrinsics do a floating-point
+The '``llvm.experimental.vector.reduce.v2.fmul.*``' intrinsics do a floating-point
``MUL`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.
-If the intrinsic call has fast-math flags, then the reduction will not preserve
-the associativity of an equivalent scalarized counterpart. If it does not have
-fast-math flags, then the reduction will be *ordered*, implying that the
-operation respects the associativity of a scalarized reduction.
+If the intrinsic call has the 'reassoc' or 'fast' flags set, then the
+reduction will not preserve the associativity of an equivalent scalarized
+counterpart. Otherwise the reduction will be *ordered*, thus implying that
+the operation respects the associativity of a scalarized reduction.
Arguments:
""""""""""
-The first argument to this intrinsic is a scalar accumulator value, which is
-only used when there are no fast-math flags attached. This argument may be undef
-when fast-math flags are used. The type of the accumulator matches the
-element-type of the vector input.
-
+The first argument to this intrinsic is a scalar start value for the reduction.
+The type of the start value matches the element-type of the vector input.
The second argument must be a vector of floating-point values.
Examples:
@@ -13835,8 +13829,8 @@ Examples:
.. code-block:: llvm
- %fast = call fast float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float undef, <4 x float> %input) ; fast reduction
- %ord = call float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction
+ %unord = call reassoc float @llvm.experimental.vector.reduce.v2.fmul.f32.v4f32(float 1.0, <4 x float> %input) ; unordered reduction
+ %ord = call float @llvm.experimental.vector.reduce.v2.fmul.f32.v4f32(float %start_value, <4 x float> %input) ; ordered reduction
'``llvm.experimental.vector.reduce.and.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OpenPOWER on IntegriCloud