1 files changed, 441 insertions, 0 deletions
diff --git a/llvm/docs/TransformMetadata.rst b/llvm/docs/TransformMetadata.rst
new file mode 100644
index 00000000000..68649424b71
--- /dev/null
+++ b/llvm/docs/TransformMetadata.rst
@@ -0,0 +1,441 @@
+.. _transformation-metadata:
+
+============================
+Code Transformation Metadata
+============================
+
+.. contents::
+   :local:
+
+Overview
+========
+
+LLVM transformation passes can be controlled by attaching metadata to
+the code to transform. By default, transformation passes use heuristics
+to determine whether or not to perform transformations, and when doing
+so, other details of how the transformations are applied (e.g., which
+vectorization factor to select).
+Unless the optimizer is otherwise directed, transformations are applied
+conservatively. This conservatism generally allows the optimizer to
+avoid unprofitable transformations, but in practice, this results in the
+optimizer not applying transformations that would be highly profitable.
+
+Frontends can give additional hints to LLVM passes on which
+transformations they should apply. This can be additional knowledge that
+cannot be derived from the emitted IR, or directives passed from the
+user/programmer. OpenMP pragmas are an example of the latter.
+
+If any such metadata is dropped from the program, the code's semantics
+must not change.
+
+Metadata on Loops
+=================
+
+Attributes can be attached to loops as described in :ref:`llvm.loop`.
+Attributes can describe properties of the loop, disable transformations,
+force specific transformations and set transformation options.
+
+Because metadata nodes are immutable (with the exception of
+``MDNode::replaceOperandWith`` which is dangerous to use on uniqued
+metadata), in order to add or remove a loop attributes, a new ``MDNode``
+must be created and assigned as the new ``llvm.loop`` metadata. Any
+connection between the old ``MDNode`` and the loop is lost. The
+``llvm.loop`` node is also used as LoopID (``Loop::getLoopID()``), i.e.
+the loop effectively gets a new identifier. For instance,
+``llvm.mem.parallel_loop_access`` references the LoopID. Therefore, if
+the parallel access property is to be preserved after adding/removing
+loop attributes, any ``llvm.mem.parallel_loop_access`` reference must be
+updated to the new LoopID.
+
+Transformation Metadata Structure
+=================================
+
+Some attributes describe code transformations (unrolling, vectorizing,
+loop distribution, etc.). They can either be a hint to the optimizer
+that a transformation might be beneficial, instruction to use a specific
+option, , or convey a specific request from the user (such as
+``#pragma clang loop`` or ``#pragma omp simd``).
+
+If a transformation is forced but cannot be carried-out for any reason,
+an optimization-missed warning must be emitted. Semantic information
+such as a transformation being safe (e.g.
+``llvm.mem.parallel_loop_access``) can be unused by the optimizer
+without generating a warning.
+
+Unless explicitly disabled, any optimization pass may heuristically
+determine whether a transformation is beneficial and apply it. If
+metadata for another transformation was specified, applying a different
+transformation before it might be inadvertent due to being applied on a
+different loop or the loop not existing anymore. To avoid having to
+explicitly disable an unknown number of passes, the attribute
+``llvm.loop.disable_nonforced`` disables all optional, high-level,
+restructuring transformations.
+
+The following example avoids the loop being altered before being
+vectorized, for instance being unrolled.
+
+.. code-block:: llvm
+
+      br i1 %exitcond, label %for.exit, label %for.header, !llvm.loop !0
+    ...
+    !0 = distinct !{!0, !1, !2}
+    !1 = !{!"llvm.loop.vectorize.enable", i1 true}
+    !2 = !{!"llvm.loop.disable_nonforced"}
+
+After a transformation is applied, follow-up attributes are set on the
+transformed and/or new loop(s). This allows additional attributes
+including followup-transformations to be specified. Specifying multiple
+transformations in the same metadata node is possible for compatibility
+reasons, but their execution order is undefined. For instance, when
+``llvm.loop.vectorize.enable`` and ``llvm.loop.unroll.enable`` are
+specified at the same time, unrolling may occur either before or after
+vectorization.
+
+As an example, the following instructs a loop to be vectorized and only
+then unrolled.
+
+.. code-block:: llvm
+
+    !0 = distinct !{!0, !1, !2, !3}
+    !1 = !{!"llvm.loop.vectorize.enable", i1 true}
+    !2 = !{!"llvm.loop.disable_nonforced"}
+    !3 = !{!"llvm.loop.vectorize.followup_vectorized", !{"llvm.loop.unroll.enable"}}
+
+If, and only if, no followup is specified, the pass may add attributes itself.
+For instance, the vectorizer adds a ``llvm.loop.isvectorized`` attribute and
+all attributes from the original loop excluding its loop vectorizer
+attributes. To avoid this, an empty followup attribute can be used, e.g.
+
+.. code-block:: llvm
+
+    !3 = !{!"llvm.loop.vectorize.followup_vectorized"}
+
+The followup attributes of a transformation that cannot be applied will
+never be added to a loop and are therefore effectively ignored. This means
+that any followup-transformation in such attributes requires that its
+prior transformations are applied before the followup-transformation.
+The user should receive a warning about the first transformation in the
+transformation chain that could not be applied if it a forced
+transformation. All following transformations are skipped.
+
+Pass-Specific Transformation Metadata
+=====================================
+
+Transformation options are specific to each transformation. In the
+following, we present the model for each LLVM loop optimization pass and
+the metadata to influence them.
+
+Loop Vectorization and Interleaving
+-----------------------------------
+
+Loop vectorization and interleaving is interpreted as a single
+transformation. It is interpreted as forced if
+``!{"llvm.loop.vectorize.enable", i1 true}`` is set.
+
+Assuming the pre-vectorization loop is
+
+.. code-block:: c
+
+    for (int i = 0; i < n; i+=1) // original loop
+      Stmt(i);
+
+then the code after vectorization will be approximately (assuming an
+SIMD width of 4):
+
+.. code-block:: c
+
+    int i = 0;
+    if (rtc) {
+      for (; i + 3 < n; i+=4) // vectorized/interleaved loop
+        Stmt(i:i+3);
+    }
+    for (; i < n; i+=1) // epilogue loop
+      Stmt(i);
+
+where ``rtc`` is a generated runtime check.
+
+``llvm.loop.vectorize.followup_vectorized`` will set the attributes for
+the vectorized loop. If not specified, ``llvm.loop.isvectorized`` is
+combined with the original loop's attributes to avoid it being
+vectorized multiple times.
+
+``llvm.loop.vectorize.followup_epilogue`` will set the attributes for
+the remainder loop. If not specified, it will have the original loop's
+attributes combined with ``llvm.loop.isvectorized`` and
+``llvm.loop.unroll.runtime.disable`` (unless the original loop already
+has unroll metadata).
+
+The attributes specified by ``llvm.loop.vectorize.followup_all`` are
+added to both loops.
+
+When using a follow-up attribute, it replaces any automatically deduced
+attributes for the generated loop in question. Therefore it is
+recommended to add ``llvm.loop.isvectorized`` to
+``llvm.loop.vectorize.followup_all`` which avoids that the loop
+vectorizer tries to optimize the loops again.
+
+Loop Unrolling
+--------------
+
+Unrolling is interpreted as forced any ``!{!"llvm.loop.unroll.enable"}``
+metadata or option (``llvm.loop.unroll.count``, ``llvm.loop.unroll.full``)
+is present. Unrolling can be full unrolling, partial unrolling of a loop
+with constant trip count or runtime unrolling of a loop with a trip
+count unknown at compile-time.
+
+If the loop has been unrolled fully, there is no followup-loop. For
+partial/runtime unrolling, the original loop of
+
+.. code-block:: c
+
+    for (int i = 0; i < n; i+=1) // original loop
+      Stmt(i);
+
+is transformed into (using an unroll factor of 4):
+
+.. code-block:: c
+
+    int i = 0;
+    for (; i + 3 < n; i+=4) // unrolled loop
+      Stmt(i);
+      Stmt(i+1);
+      Stmt(i+2);
+      Stmt(i+3);
+    }
+    for (; i < n; i+=1) // remainder loop
+      Stmt(i);
+
+``llvm.loop.unroll.followup_unrolled`` will set the loop attributes of
+the unrolled loop. If not specified, the attributes of the original loop
+without the ``llvm.loop.unroll.*`` attributes are copied and
+``llvm.loop.unroll.disable`` added to it.
+
+``llvm.loop.unroll.followup_remainder`` defines the attributes of the
+remainder loop. If not specified the remainder loop will have no
+attributes. The remainder loop might not be present due to being fully
+unrolled in which case this attribute has no effect.
+
+Attributes defined in ``llvm.loop.unroll.followup_all`` are added to the
+unrolled and remainder loops.
+
+To avoid that the partially unrolled loop is unrolled again, it is
+recommended to add ``llvm.loop.unroll.disable`` to
+``llvm.loop.unroll.followup_all``. If no follow-up attribute specified
+for a generated loop, it is added automatically.
+
+Unroll-And-Jam
+--------------
+
+Unroll-and-jam uses the following transformation model (here with an
+unroll factor if 2). Currently, it does not support a fallback version
+when the transformation is unsafe.
+
+.. code-block:: c
+
+    for (int i = 0; i < n; i+=1) { // original outer loop
+      Fore(i);
+      for (int j = 0; j < m; j+=1) // original inner loop
+        SubLoop(i, j);
+      Aft(i);
+    }
+
+.. code-block:: c
+
+    int i = 0;
+    for (; i + 1 < n; i+=2) { // unrolled outer loop
+      Fore(i);
+      Fore(i+1);
+      for (int j = 0; j < m; j+=1) { // unrolled inner loop
+        SubLoop(i, j);
+        SubLoop(i+1, j);
+      }
+      Aft(i);
+      Aft(i+1);
+    }
+    for (; i < n; i+=1) { // remainder outer loop
+      Fore(i);
+      for (int j = 0; j < m; j+=1) // remainder inner loop
+        SubLoop(i, j);
+      Aft(i);
+    }
+
+``llvm.loop.unroll_and_jam.followup_outer`` will set the loop attributes
+of the unrolled outer loop. If not specified, the attributes of the
+original outer loop without the ``llvm.loop.unroll.*`` attributes are
+copied and ``llvm.loop.unroll.disable`` added to it.
+
+``llvm.loop.unroll_and_jam.followup_inner`` will set the loop attributes
+of the unrolled inner loop. If not specified, the attributes of the
+original inner loop are used unchanged.
+
+``llvm.loop.unroll_and_jam.followup_remainder_outer`` sets the loop
+attributes of the outer remainder loop. If not specified it will not
+have any attributes. The remainder loop might not be present due to
+being fully unrolled.
+
+``llvm.loop.unroll_and_jam.followup_remainder_inner`` sets the loop
+attributes of the inner remainder loop. If not specified it will have
+the attributes of the original inner loop. It the outer remainder loop
+is unrolled, the inner remainder loop might be present multiple times.
+
+Attributes defined in ``llvm.loop.unroll_and_jam.followup_all`` are
+added to all of the aforementioned output loops.
+
+To avoid that the unrolled loop is unrolled again, it is
+recommended to add ``llvm.loop.unroll.disable`` to
+``llvm.loop.unroll_and_jam.followup_all``. It suppresses unroll-and-jam
+as well as an additional inner loop unrolling. If no follow-up
+attribute specified for a generated loop, it is added automatically.
+
+Loop Distribution
+-----------------
+
+The LoopDistribution pass tries to separate vectorizable parts of a loop
+from the non-vectorizable part (which otherwise would make the entire
+loop non-vectorizable). Conceptually, it transforms a loop such as
+
+.. code-block:: c
+
+    for (int i = 1; i < n; i+=1) { // original loop
+      A[i] = i;
+      B[i] = 2 + B[i];
+      C[i] = 3 + C[i - 1];
+    }
+
+into the following code:
+
+.. code-block:: c
+
+    if (rtc) {
+      for (int i = 1; i < n; i+=1) // coincident loop
+        A[i] = i;
+      for (int i = 1; i < n; i+=1) // coincident loop
+        B[i] = 2 + B[i];
+      for (int i = 1; i < n; i+=1) // sequential loop
+        C[i] = 3 + C[i - 1];
+    } else {
+      for (int i = 1; i < n; i+=1) { // fallback loop
+        A[i] = i;
+        B[i] = 2 + B[i];
+        C[i] = 3 + C[i - 1];
+      }
+    }
+
+where ``rtc`` is a generated runtime check.
+
+``llvm.loop.distribute.followup_coincident`` sets the loop attributes of
+all loops without loop-carried dependencies (i.e. vectorizable loops).
+There might be more than one such loops. If not defined, the loops will
+inherit the original loop's attributes.
+
+``llvm.loop.distribute.followup_sequential`` sets the loop attributes of the
+loop with potentially unsafe dependencies. There should be at most one
+such loop. If not defined, the loop will inherit the original loop's
+attributes.
+
+``llvm.loop.distribute.followup_fallback`` defines the loop attributes
+for the fallback loop, which is a copy of the original loop for when
+loop versioning is required. If undefined, the fallback loop inherits
+all attributes from the original loop.
+
+Attributes defined in ``llvm.loop.distribute.followup_all`` are added to
+all of the aforementioned output loops.
+
+It is recommended to add ``llvm.loop.disable_nonforced`` to
+``llvm.loop.distribute.followup_fallback``. This avoids that the
+fallback version (which is likely never executed) is further optimzed
+which would increase the code size.
+
+Versioning LICM
+---------------
+
+The pass hoists code out of loops that are only loop-invariant when
+dynamic conditions apply. For instance, it transforms the loop
+
+.. code-block:: c
+
+    for (int i = 0; i < n; i+=1) // original loop
+      A[i] = B[0];
+
+into:
+
+.. code-block:: c
+
+    if (rtc) {
+      auto b = B[0];
+      for (int i = 0; i < n; i+=1) // versioned loop
+        A[i] = b;
+    } else {
+      for (int i = 0; i < n; i+=1) // unversioned loop
+        A[i] = B[0];
+    }
+
+The runtime condition (``rtc``) checks that the array ``A`` and the
+element `B[0]` do not alias.
+
+Currently, this transformation does not support followup-attributes.
+
+Loop Interchange
+----------------
+
+Currently, the ``LoopInterchange`` pass does not use any metadata.
+
+Ambiguous Transformation Order
+==============================
+
+If there multiple transformations defined, the order in which they are
+executed depends on the order in LLVM's pass pipeline, which is subject
+to change. The default optimization pipeline (anything higher than
+``-O0``) has the following order.
+
+When using the legacy pass manager:
+
+ - LoopInterchange (if enabled)
+ - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
+ - VersioningLICM (if enabled)
+ - LoopDistribute
+ - LoopVectorizer
+ - LoopUnrollAndJam (if enabled)
+ - LoopUnroll (partial and runtime unrolling)
+
+When using the legacy pass manager with LTO:
+
+ - LoopInterchange (if enabled)
+ - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
+ - LoopVectorizer
+ - LoopUnroll (partial and runtime unrolling)
+
+When using the new pass manager:
+
+ - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
+ - LoopDistribute
+ - LoopVectorizer
+ - LoopUnrollAndJam (if enabled)
+ - LoopUnroll (partial and runtime unrolling)
+
+Leftover Transformations
+========================
+
+Forced transformations that have not been applied after the last
+transformation pass should be reported to the user. The transformation
+passes themselves cannot be responsible for this reporting because they
+might not be in the pipeline, there might be multiple passes able to
+apply a transformation (e.g. ``LoopInterchange`` and Polly) or a
+transformation attribute may be 'hidden' inside another passes' followup
+attribute.
+
+The pass ``-transform-warning`` (``WarnMissedTransformationsPass``)
+emits such warnings. It should be placed after the last transformation
+pass.
+
+The current pass pipeline has a fixed order in which transformations
+passes are executed. A transformation can be in the followup of a pass
+that is executed later and thus leftover. For instance, a loop nest
+cannot be distributed and then interchanged with the current pass
+pipeline. The loop distribution will execute, but there is no loop
+interchange pass following such that any loop interchange metadata will
+be ignored. The ``-transform-warning`` should emit a warning in this
+case.
+
+Future versions of LLVM may fix this by executing transformations using
+a dynamic ordering.