summaryrefslogtreecommitdiffstats
path: root/clang/docs/LanguageExtensions.rst
diff options
context:
space:
mode:
Diffstat (limited to 'clang/docs/LanguageExtensions.rst')
-rw-r--r--clang/docs/LanguageExtensions.rst38
1 files changed, 35 insertions, 3 deletions
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index c7151644616..0b0a4fb925d 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -2050,9 +2050,9 @@ Extensions for loop hint optimizations
The ``#pragma clang loop`` directive is used to specify hints for optimizing the
subsequent for, while, do-while, or c++11 range-based for loop. The directive
-provides options for vectorization, interleaving, and unrolling. Loop hints can
-be specified before any loop and will be ignored if the optimization is not safe
-to apply.
+provides options for vectorization, interleaving, unrolling and
+distribution. Loop hints can be specified before any loop and will be ignored if
+the optimization is not safe to apply.
Vectorization and Interleaving
------------------------------
@@ -2147,6 +2147,38 @@ to the same code size limit as with ``unroll(enable)``.
Unrolling of a loop can be prevented by specifying ``unroll(disable)``.
+Loop Distribution
+-----------------
+
+Loop Distribution allows splitting a loop into multiple loops. This is
+beneficial for example when the entire loop cannot be vectorized but some of the
+resulting loops can.
+
+If ``distribute(enable))'' is specified and the loop has memory dependencies
+that inhibit vectorization, the compiler will attempt to isolate the offending
+operations into a new loop. This optimization is not enabled by default, only
+loops marked with the pragma are considered.
+
+.. code-block:: c++
+
+ #pragma clang loop distribute(enable)
+ for (i = 0; i < N; ++i) {
+ S1: A[i + 1] = A[i] + B[i];
+ S2: C[i] = D[i] * E[i];
+ }
+
+This loop will be split into two loops between statements S1 and S2. The
+second loop containing S2 will be vectorized.
+
+Loop Distribution is currently not enabled by default in the optimizer because
+it can hurt performance in some cases. For example, instruction-level
+parallelism could be reduced by sequentializing the execution of the
+statements S1 and S2 above.
+
+If Loop Distribution is turned on globally with
+``-mllvm -enable-loop-distribution``, specifying ``distribute(disable)`` can
+be used the disable it on a per-loop basis.
+
Additional Information
----------------------
OpenPOWER on IntegriCloud