path: root/llvm/docs/Vectorizers.rst
author: Nadav Rotem <nrotem@apple.com> 2013-01-03 01:47:02 +0000
committer: Nadav Rotem <nrotem@apple.com> 2013-01-03 01:47:02 +0000
commit: a616d68f2c184a79d1c2b38bafaaee262b2c1114
tree: e31e8f9523657842e7d3eaec00a0953a47b4ab3f /llvm/docs/Vectorizers.rst
parent: 40785ae18f9cdd166e63a3fde7f72ba2f123541e
LoopVectorizer: Document the unrolling feature.
llvm-svn: 171445
Diffstat (limited to 'llvm/docs/Vectorizers.rst')
-rw-r--r-- llvm/docs/Vectorizers.rst | 36
1 files changed, 34 insertions, 2 deletions
diff --git a/llvm/docs/Vectorizers.rst b/llvm/docs/Vectorizers.rst
index 3410f183781..b4c5458953b 100644
--- a/llvm/docs/Vectorizers.rst
+++ b/llvm/docs/Vectorizers.rst
@@ -159,8 +159,8 @@ The Loop Vectorizer can vectorize loops that count backwards.
Scatter / Gather
^^^^^^^^^^^^^^^^
-The Loop Vectorizer can vectorize code that becomes scatter/gather
-memory accesses.
+The Loop Vectorizer can vectorize code that becomes a sequence of scalar instructions
+that scatter/gather memory.
.. code-block:: c++
@@ -203,6 +203,38 @@ See the table below for a list of these functions.
| | | fmuladd |
+-----+-----+---------+
+
+Partial unrolling during vectorization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Modern processors feature multiple execution units, and only programs that contain a
+high degree of parallelism can fully utilize the entire width of the machine.
+
+The Loop Vectorizer increases the instruction-level parallelism (ILP) by
+partially unrolling loops.
+
+In the example below the entire array is accumulated into the variable 'sum'.
+This is inefficient because each addition depends on the result of the previous
+one, so the processor can use only a single 'adder' at a time. By unrolling the
+code the Loop Vectorizer allows two or more execution ports to be used.
+
+.. code-block:: c++
+
+  int foo(int *A, int *B, int n) {
+    unsigned sum = 0;
+    for (int i = 0; i < n; ++i)
+      sum += A[i];
+    return sum;
+  }
+
+At the moment the unrolling feature is not enabled by default and needs to be enabled
+in opt or clang using the following flag:
+
+.. code-block:: console
+
+ -force-vector-unroll=2
+
+
Performance
-----------
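
Reviewer note on the partial-unrolling hunk above: the effect it describes can be sketched by hand. The code below is only an illustration under stated assumptions, not the transformation the Loop Vectorizer actually emits. The function names `sum_scalar` and `sum_unrolled2` are hypothetical, and the unroll factor of 2 simply mirrors the `-force-vector-unroll=2` flag from the patch. Splitting the reduction into two independent accumulators breaks the single dependency chain, so two additions can be in flight at once.

```cpp
// Scalar reference: a single accumulator, so every add waits on the previous one.
unsigned sum_scalar(const int *A, int n) {
  unsigned sum = 0;
  for (int i = 0; i < n; ++i)
    sum += A[i];
  return sum;
}

// Hand-unrolled by 2 (illustrative only): two independent accumulators
// form two dependency chains that the processor can execute in parallel.
unsigned sum_unrolled2(const int *A, int n) {
  unsigned sum0 = 0, sum1 = 0;
  int i = 0;
  for (; i + 1 < n; i += 2) {
    sum0 += A[i];      // chain 0: even indices
    sum1 += A[i + 1];  // chain 1: odd indices
  }
  if (i < n)           // leftover element when n is odd
    sum0 += A[i];
  return sum0 + sum1;  // combine the partial sums once, after the loop
}
```

Because unsigned addition wraps and is associative, both functions return the same value for any input; the same reordering would not be safe by default for floating-point sums.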