Diffstat (limited to 'llvm/docs/TransformMetadata.rst')
| -rw-r--r-- | llvm/docs/TransformMetadata.rst | 441 |
1 files changed, 441 insertions, 0 deletions
diff --git a/llvm/docs/TransformMetadata.rst b/llvm/docs/TransformMetadata.rst
new file mode 100644
index 00000000000..68649424b71
--- /dev/null
+++ b/llvm/docs/TransformMetadata.rst
@@ -0,0 +1,441 @@
+.. _transformation-metadata:
+
+============================
+Code Transformation Metadata
+============================
+
+.. contents::
+   :local:
+
+Overview
+========
+
+LLVM transformation passes can be controlled by attaching metadata to
+the code to transform. By default, transformation passes use heuristics
+to determine whether or not to perform transformations, and when doing
+so, other details of how the transformations are applied (e.g., which
+vectorization factor to select).
+Unless the optimizer is otherwise directed, transformations are applied
+conservatively. This conservatism generally allows the optimizer to
+avoid unprofitable transformations, but in practice it can also prevent
+the optimizer from applying transformations that would be highly
+profitable.
+
+Frontends can give additional hints to LLVM passes on which
+transformations they should apply. This can be additional knowledge that
+cannot be derived from the emitted IR, or directives passed from the
+user/programmer. OpenMP pragmas are an example of the latter.
+
+If any such metadata is dropped from the program, the code's semantics
+must not change.
+
+Metadata on Loops
+=================
+
+Attributes can be attached to loops as described in :ref:`llvm.loop`.
+Attributes can describe properties of the loop, disable transformations,
+force specific transformations and set transformation options.
+
+Because metadata nodes are immutable (with the exception of
+``MDNode::replaceOperandWith`` which is dangerous to use on uniqued
+metadata), in order to add or remove a loop attribute, a new ``MDNode``
+must be created and assigned as the new ``llvm.loop`` metadata. Any
+connection between the old ``MDNode`` and the loop is lost.
+The ``llvm.loop`` node is also used as LoopID (``Loop::getLoopID()``), i.e.
+the loop effectively gets a new identifier. For instance,
+``llvm.mem.parallel_loop_access`` references the LoopID. Therefore, if
+the parallel access property is to be preserved after adding/removing
+loop attributes, any ``llvm.mem.parallel_loop_access`` reference must be
+updated to the new LoopID.
+
+Transformation Metadata Structure
+=================================
+
+Some attributes describe code transformations (unrolling, vectorizing,
+loop distribution, etc.). They can either be a hint to the optimizer
+that a transformation might be beneficial, an instruction to use a
+specific option, or convey a specific request from the user (such as
+``#pragma clang loop`` or ``#pragma omp simd``).
+
+If a transformation is forced but cannot be carried out for any reason,
+an optimization-missed warning must be emitted. Semantic information
+such as a transformation being safe (e.g.
+``llvm.mem.parallel_loop_access``) can be unused by the optimizer
+without generating a warning.
+
+Unless explicitly disabled, any optimization pass may heuristically
+determine whether a transformation is beneficial and apply it. If
+metadata for another transformation was specified, a different
+transformation applied before it can have unintended effects: the
+specified transformation may then be applied to a different loop, or the
+loop may no longer exist. To avoid having to
+explicitly disable an unknown number of passes, the attribute
+``llvm.loop.disable_nonforced`` disables all optional, high-level,
+restructuring transformations.
+
+The following example avoids the loop being altered before being
+vectorized, for instance being unrolled.
+
+.. code-block:: llvm
+
+    br i1 %exitcond, label %for.exit, label %for.header, !llvm.loop !0
+    ...
+    !0 = distinct !{!0, !1, !2}
+    !1 = !{!"llvm.loop.vectorize.enable", i1 true}
+    !2 = !{!"llvm.loop.disable_nonforced"}
+
+After a transformation is applied, follow-up attributes are set on the
+transformed and/or new loop(s). This allows additional attributes
+including followup-transformations to be specified. Specifying multiple
+transformations in the same metadata node is possible for compatibility
+reasons, but their execution order is undefined. For instance, when
+``llvm.loop.vectorize.enable`` and ``llvm.loop.unroll.enable`` are
+specified at the same time, unrolling may occur either before or after
+vectorization.
+
+As an example, the following instructs a loop to be vectorized and only
+then unrolled.
+
+.. code-block:: llvm
+
+    !0 = distinct !{!0, !1, !2, !3}
+    !1 = !{!"llvm.loop.vectorize.enable", i1 true}
+    !2 = !{!"llvm.loop.disable_nonforced"}
+    !3 = !{!"llvm.loop.vectorize.followup_vectorized", !{!"llvm.loop.unroll.enable"}}
+
+If, and only if, no followup is specified, the pass may add attributes itself.
+For instance, the vectorizer adds a ``llvm.loop.isvectorized`` attribute and
+all attributes from the original loop excluding its loop vectorizer
+attributes. To avoid this, an empty followup attribute can be used, e.g.
+
+.. code-block:: llvm
+
+    !3 = !{!"llvm.loop.vectorize.followup_vectorized"}
+
+The followup attributes of a transformation that cannot be applied will
+never be added to a loop and are therefore effectively ignored. This means
+that any followup-transformation in such attributes requires that its
+prior transformations are applied before the followup-transformation.
+The user should receive a warning about the first transformation in the
+transformation chain that could not be applied if it is a forced
+transformation. All following transformations are skipped.
+
+Pass-Specific Transformation Metadata
+=====================================
+
+Transformation options are specific to each transformation.
+In the following, we present the model for each LLVM loop optimization
+pass and the metadata to influence them.
+
+Loop Vectorization and Interleaving
+-----------------------------------
+
+Loop vectorization and interleaving are interpreted as a single
+transformation. It is interpreted as forced if
+``!{!"llvm.loop.vectorize.enable", i1 true}`` is set.
+
+Assuming the pre-vectorization loop is
+
+.. code-block:: c
+
+    for (int i = 0; i < n; i+=1) // original loop
+      Stmt(i);
+
+then the code after vectorization will be approximately (assuming an
+SIMD width of 4):
+
+.. code-block:: c
+
+    int i = 0;
+    if (rtc) {
+      for (; i + 3 < n; i+=4) // vectorized/interleaved loop
+        Stmt(i:i+3);
+    }
+    for (; i < n; i+=1) // epilogue loop
+      Stmt(i);
+
+where ``rtc`` is a generated runtime check.
+
+``llvm.loop.vectorize.followup_vectorized`` will set the attributes for
+the vectorized loop. If not specified, ``llvm.loop.isvectorized`` is
+combined with the original loop's attributes to avoid it being
+vectorized multiple times.
+
+``llvm.loop.vectorize.followup_epilogue`` will set the attributes for
+the remainder loop. If not specified, it will have the original loop's
+attributes combined with ``llvm.loop.isvectorized`` and
+``llvm.loop.unroll.runtime.disable`` (unless the original loop already
+has unroll metadata).
+
+The attributes specified by ``llvm.loop.vectorize.followup_all`` are
+added to both loops.
+
+When using a follow-up attribute, it replaces any automatically deduced
+attributes for the generated loop in question. Therefore it is
+recommended to add ``llvm.loop.isvectorized`` to
+``llvm.loop.vectorize.followup_all``, which prevents the loop
+vectorizer from trying to optimize the loops again.
+
+Loop Unrolling
+--------------
+
+Unrolling is interpreted as forced if any ``!{!"llvm.loop.unroll.enable"}``
+metadata or option (``llvm.loop.unroll.count``, ``llvm.loop.unroll.full``)
+is present.
+Unrolling can be full unrolling, partial unrolling of a loop
+with constant trip count or runtime unrolling of a loop with a trip
+count unknown at compile-time.
+
+If the loop has been unrolled fully, there is no followup-loop. For
+partial/runtime unrolling, the original loop of
+
+.. code-block:: c
+
+    for (int i = 0; i < n; i+=1) // original loop
+      Stmt(i);
+
+is transformed into (using an unroll factor of 4):
+
+.. code-block:: c
+
+    int i = 0;
+    for (; i + 3 < n; i+=4) { // unrolled loop
+      Stmt(i);
+      Stmt(i+1);
+      Stmt(i+2);
+      Stmt(i+3);
+    }
+    for (; i < n; i+=1) // remainder loop
+      Stmt(i);
+
+``llvm.loop.unroll.followup_unrolled`` will set the loop attributes of
+the unrolled loop. If not specified, the attributes of the original loop
+without the ``llvm.loop.unroll.*`` attributes are copied and
+``llvm.loop.unroll.disable`` added to it.
+
+``llvm.loop.unroll.followup_remainder`` defines the attributes of the
+remainder loop. If not specified, the remainder loop will have no
+attributes. The remainder loop might not be present due to being fully
+unrolled, in which case this attribute has no effect.
+
+Attributes defined in ``llvm.loop.unroll.followup_all`` are added to the
+unrolled and remainder loops.
+
+To prevent the partially unrolled loop from being unrolled again, it is
+recommended to add ``llvm.loop.unroll.disable`` to
+``llvm.loop.unroll.followup_all``. If no follow-up attribute is specified
+for a generated loop, it is added automatically.
+
+Unroll-And-Jam
+--------------
+
+Unroll-and-jam uses the following transformation model (here with an
+unroll factor of 2). Currently, it does not support a fallback version
+when the transformation is unsafe.
+
+.. code-block:: c
+
+    for (int i = 0; i < n; i+=1) { // original outer loop
+      Fore(i);
+      for (int j = 0; j < m; j+=1) // original inner loop
+        SubLoop(i, j);
+      Aft(i);
+    }
+
+.. code-block:: c
+
+    int i = 0;
+    for (; i + 1 < n; i+=2) { // unrolled outer loop
+      Fore(i);
+      Fore(i+1);
+      for (int j = 0; j < m; j+=1) { // unrolled inner loop
+        SubLoop(i, j);
+        SubLoop(i+1, j);
+      }
+      Aft(i);
+      Aft(i+1);
+    }
+    for (; i < n; i+=1) { // remainder outer loop
+      Fore(i);
+      for (int j = 0; j < m; j+=1) // remainder inner loop
+        SubLoop(i, j);
+      Aft(i);
+    }
+
+``llvm.loop.unroll_and_jam.followup_outer`` will set the loop attributes
+of the unrolled outer loop. If not specified, the attributes of the
+original outer loop without the ``llvm.loop.unroll.*`` attributes are
+copied and ``llvm.loop.unroll.disable`` added to it.
+
+``llvm.loop.unroll_and_jam.followup_inner`` will set the loop attributes
+of the unrolled inner loop. If not specified, the attributes of the
+original inner loop are used unchanged.
+
+``llvm.loop.unroll_and_jam.followup_remainder_outer`` sets the loop
+attributes of the outer remainder loop. If not specified, it will not
+have any attributes. The remainder loop might not be present due to
+being fully unrolled.
+
+``llvm.loop.unroll_and_jam.followup_remainder_inner`` sets the loop
+attributes of the inner remainder loop. If not specified, it will have
+the attributes of the original inner loop. If the outer remainder loop
+is unrolled, the inner remainder loop might be present multiple times.
+
+Attributes defined in ``llvm.loop.unroll_and_jam.followup_all`` are
+added to all of the aforementioned output loops.
+
+To prevent the unrolled loop from being unrolled again, it is
+recommended to add ``llvm.loop.unroll.disable`` to
+``llvm.loop.unroll_and_jam.followup_all``. It suppresses unroll-and-jam
+as well as an additional inner loop unrolling. If no follow-up
+attribute is specified for a generated loop, it is added automatically.
+
+Loop Distribution
+-----------------
+
+The LoopDistribution pass tries to separate vectorizable parts of a loop
+from the non-vectorizable part (which otherwise would make the entire
+loop non-vectorizable).
+Conceptually, it transforms a loop such as
+
+.. code-block:: c
+
+    for (int i = 1; i < n; i+=1) { // original loop
+      A[i] = i;
+      B[i] = 2 + B[i];
+      C[i] = 3 + C[i - 1];
+    }
+
+into the following code:
+
+.. code-block:: c
+
+    if (rtc) {
+      for (int i = 1; i < n; i+=1) // coincident loop
+        A[i] = i;
+      for (int i = 1; i < n; i+=1) // coincident loop
+        B[i] = 2 + B[i];
+      for (int i = 1; i < n; i+=1) // sequential loop
+        C[i] = 3 + C[i - 1];
+    } else {
+      for (int i = 1; i < n; i+=1) { // fallback loop
+        A[i] = i;
+        B[i] = 2 + B[i];
+        C[i] = 3 + C[i - 1];
+      }
+    }
+
+where ``rtc`` is a generated runtime check.
+
+``llvm.loop.distribute.followup_coincident`` sets the loop attributes of
+all loops without loop-carried dependencies (i.e. vectorizable loops).
+There might be more than one such loop. If not defined, the loops will
+inherit the original loop's attributes.
+
+``llvm.loop.distribute.followup_sequential`` sets the loop attributes of the
+loop with potentially unsafe dependencies. There should be at most one
+such loop. If not defined, the loop will inherit the original loop's
+attributes.
+
+``llvm.loop.distribute.followup_fallback`` defines the loop attributes
+for the fallback loop, which is a copy of the original loop for when
+loop versioning is required. If undefined, the fallback loop inherits
+all attributes from the original loop.
+
+Attributes defined in ``llvm.loop.distribute.followup_all`` are added to
+all of the aforementioned output loops.
+
+It is recommended to add ``llvm.loop.disable_nonforced`` to
+``llvm.loop.distribute.followup_fallback``. This prevents the
+fallback version (which is likely never executed) from being optimized
+further, which would increase the code size.
+
+Versioning LICM
+---------------
+
+The pass hoists code out of loops when that code is loop-invariant only
+under dynamic conditions. For instance, it transforms the loop
+
+.. code-block:: c
+
+    for (int i = 0; i < n; i+=1) // original loop
+      A[i] = B[0];
+
+into:
+
+.. code-block:: c
+
+    if (rtc) {
+      auto b = B[0];
+      for (int i = 0; i < n; i+=1) // versioned loop
+        A[i] = b;
+    } else {
+      for (int i = 0; i < n; i+=1) // unversioned loop
+        A[i] = B[0];
+    }
+
+The runtime condition (``rtc``) checks that the array ``A`` and the
+element ``B[0]`` do not alias.
+
+Currently, this transformation does not support followup-attributes.
+
+Loop Interchange
+----------------
+
+Currently, the ``LoopInterchange`` pass does not use any metadata.
+
+Ambiguous Transformation Order
+==============================
+
+If there are multiple transformations defined, the order in which they are
+executed depends on the order in LLVM's pass pipeline, which is subject
+to change. The default optimization pipeline (anything higher than
+``-O0``) has the following order.
+
+When using the legacy pass manager:
+
+ - LoopInterchange (if enabled)
+ - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
+ - VersioningLICM (if enabled)
+ - LoopDistribute
+ - LoopVectorizer
+ - LoopUnrollAndJam (if enabled)
+ - LoopUnroll (partial and runtime unrolling)
+
+When using the legacy pass manager with LTO:
+
+ - LoopInterchange (if enabled)
+ - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
+ - LoopVectorizer
+ - LoopUnroll (partial and runtime unrolling)
+
+When using the new pass manager:
+
+ - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
+ - LoopDistribute
+ - LoopVectorizer
+ - LoopUnrollAndJam (if enabled)
+ - LoopUnroll (partial and runtime unrolling)
+
+Leftover Transformations
+========================
+
+Forced transformations that have not been applied after the last
+transformation pass should be reported to the user. The transformation
+passes themselves cannot be responsible for this reporting because they
+might not be in the pipeline, there might be multiple passes able to
+apply a transformation (e.g.
+``LoopInterchange`` and Polly) or a
+transformation attribute may be 'hidden' inside another pass's followup
+attribute.
+
+The pass ``-transform-warning`` (``WarnMissedTransformationsPass``)
+emits such warnings. It should be placed after the last transformation
+pass.
+
+The current pass pipeline has a fixed order in which transformation
+passes are executed. A transformation can be in the followup of a pass
+that is executed later and thus leftover. For instance, a loop nest
+cannot be distributed and then interchanged with the current pass
+pipeline. The loop distribution will execute, but no loop interchange
+pass follows it, so any loop interchange metadata will
+be ignored. The ``-transform-warning`` pass should emit a warning in this
+case.
+
+Future versions of LLVM may fix this by executing transformations using
+a dynamic ordering.
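+
+As a minimal sketch of a leftover transformation, using only attributes
+named in this document (the metadata node numbering is arbitrary and the
+exact behavior depends on the pass pipeline in use), the following
+requests vectorization only as a follow-up of partial unrolling. In the
+pipelines listed above, ``LoopVectorizer`` runs before ``LoopUnroll``,
+so the vectorization request would presumably be left over and is the
+kind of case ``-transform-warning`` is meant to report.
+
+.. code-block:: llvm
+
+    !0 = distinct !{!0, !1, !2, !3}
+    !1 = !{!"llvm.loop.unroll.count", i32 4}
+    !2 = !{!"llvm.loop.disable_nonforced"}
+    ; Attributes for the unrolled loop; vectorization is requested only here.
+    !3 = !{!"llvm.loop.unroll.followup_unrolled", !4}
+    !4 = !{!"llvm.loop.vectorize.enable", i1 true}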

