author | Tobias Grosser <tobias@grosser.es> | 2016-03-25 19:23:44 +0000 |
---|---|---|
committer | Tobias Grosser <tobias@grosser.es> | 2016-03-25 19:23:44 +0000 |
commit | 054ca24be76042452f6496bf401039adbc49df22 (patch) | |
tree | 2de0d35ce0632f8d2cd9abb418b3b8f5f66502c5 | |
parent | aae261004265189ef9b3f48172d3381e18b9488f (diff) | |
docs: Describe Polly in the LLVM pass pipeline
llvm-svn: 264446
-rw-r--r-- | polly/docs/Architecture.rst | 79 | ||||
-rw-r--r-- | polly/docs/images/LLVM-Passes-all.pdf | bin | 0 -> 162694 bytes | |||
-rw-r--r-- | polly/docs/images/LLVM-Passes-all.png | bin | 0 -> 94585 bytes | |||
-rw-r--r-- | polly/docs/images/LLVM-Passes-early.pdf | bin | 0 -> 158890 bytes | |||
-rw-r--r-- | polly/docs/images/LLVM-Passes-early.png | bin | 0 -> 84569 bytes | |||
-rw-r--r-- | polly/docs/images/LLVM-Passes-late.pdf | bin | 0 -> 114846 bytes | |||
-rw-r--r-- | polly/docs/images/LLVM-Passes-late.png | bin | 0 -> 64574 bytes | |||
-rw-r--r-- | polly/docs/images/LLVM-Passes-only.pdf | bin | 0 -> 86415 bytes | |||
-rw-r--r-- | polly/docs/images/LLVM-Passes-only.png | bin | 0 -> 44027 bytes | |||
-rw-r--r-- | polly/docs/images/LLVM-Passes-only.pngfg | bin | 0 -> 201732 bytes |
10 files changed, 79 insertions, 0 deletions
diff --git a/polly/docs/Architecture.rst b/polly/docs/Architecture.rst
index 136b2501288..f2ae2b10132 100644
--- a/polly/docs/Architecture.rst
+++ b/polly/docs/Architecture.rst
@@ -12,3 +12,82 @@ and inserted into the LLVM-IR module.
 
 .. image:: images/architecture.png
     :align: center
+
+Polly in the LLVM pass pipeline
+-------------------------------
+
+The standard LLVM pass pipeline as it is used in -O1/-O2/-O3 mode of clang/opt
+consists of a sequence of passes that can be grouped in different conceptual
+phases. The first phase, we call it here **Canonicalization**, is a scalar
+canonicalization phase that contains passes like -mem2reg, -instcombine,
+-simplifycfg, or early loop unrolling. It has the goal of removing and
+simplifying the given IR as much as possible focusing mostly on scalar
+optimizations. The second phase consists of three conceptual groups that are
+executed in the so-called **Inliner cycle**. This is again a set of **Scalar
+Simplification** passes, a set of **Simple Loop Optimizations**, and the
+**Inliner** itself. Even though these passes make up the majority of the LLVM
+pass pipeline, the primary goal of these passes is still canonicalization
+without losing semantic information that complicates later analysis. As part of
+the inliner cycle, the LLVM inliner step-by-step tries to inline functions, runs
+canonicalization passes to exploit newly exposed simplification opportunities,
+and then tries to inline the further simplified functions. Some simple loop
+optimizations are executed as part of the inliner cycle. Even though they
+perform some optimizations, their primary goal is still the simplification of
+the program code. Loop invariant code motion is one such optimization that
+besides being beneficial for program performance also allows us to move
+computation out of loops and in the best case enables us to eliminate certain
+loops completely. Only after the inliner cycle has been finished, a last
+**Target Specialization** phase is run, where IR complexity is deliberately
+increased to take advantage of target specific features that maximize the
+execution performance on the device we target. One of the principal
+optimizations in this phase is vectorization, but also target specific loop
+unrolling, or some loop transformations (e.g., distribution) that expose more
+vectorization opportunities.
+
+.. image:: images/LLVM-Passes-only.png
+    :align: center
+
+Polly can conceptually be run at three different positions in the pass pipeline.
+As an early optimizer before the standard LLVM pass pipeline, as a later
+optimizer as part of the target specialization sequence, and theoretically also
+with the loop optimizations in the inliner cycle. We only discuss the first two
+options, as running Polly in the inliner cycle is likely to disturb the inliner
+and is consequently not a good idea.
+
+.. image:: images/LLVM-Passes-all.png
+    :align: center
+
+Running Polly early before the standard pass pipeline has the benefit that the
+LLVM-IR processed by Polly is still very close to the original input code.
+Hence, it is less likely that transformations applied by LLVM change the IR in
+ways not easily understandable for the programmer. As a result, user feedback is
+likely better and it is less likely that kernels that in C seem a perfect fit
+for Polly have been transformed such that Polly cannot handle them any
+more. On the other hand, codes that require inlining to be optimized won't
+benefit if Polly is scheduled at this position. The additional set of
+canonicalization passes required will result in a small, but general compile
+time increase and some random run-time performance changes due to slightly
+different IR being passed through the optimizers. To force Polly to run early in
+the pass pipeline use the option *-polly-position=early* (default today).
+
+.. image:: images/LLVM-Passes-early.png
+    :align: center
+
+Running Polly right before the vectorizer has the benefit that the full inlining
+cycle has been run and as a result even heavily templated C++ code could
+theoretically benefit from Polly (more work is necessary to make Polly here
+really effective). As the IR that is passed to Polly has already been
+canonicalized, there is also no need to run additional canonicalization passes.
+General compile time is almost not affected by Polly, as detection of loop
+kernels is generally very fast and the actual optimization and cleanup passes
+are only run on functions which contain loop kernels that are worth optimizing.
+However, due to the many optimizations that LLVM runs before Polly the IR that
+reaches Polly often has additional scalar dependences that make Polly a lot less
+efficient. To force Polly to run before the vectorizer in the pass pipeline use
+the option *-polly-position=before-vectorizer*. This position is not yet the
+default for Polly, but work is on its way to be effective even in presence of
+scalar dependences. After this work has been completed, Polly will likely use
+this position by default.
+
+.. image:: images/LLVM-Passes-late.png
+    :align: center
diff --git a/polly/docs/images/LLVM-Passes-all.pdf b/polly/docs/images/LLVM-Passes-all.pdf
Binary files differ
new file mode 100644
index 00000000000..e2c6cf6684c
--- /dev/null
+++ b/polly/docs/images/LLVM-Passes-all.pdf
diff --git a/polly/docs/images/LLVM-Passes-all.png b/polly/docs/images/LLVM-Passes-all.png
Binary files differ
new file mode 100644
index 00000000000..0df6f76f20a
--- /dev/null
+++ b/polly/docs/images/LLVM-Passes-all.png
diff --git a/polly/docs/images/LLVM-Passes-early.pdf b/polly/docs/images/LLVM-Passes-early.pdf
Binary files differ
new file mode 100644
index 00000000000..4d959a16f5a
--- /dev/null
+++ b/polly/docs/images/LLVM-Passes-early.pdf
diff --git a/polly/docs/images/LLVM-Passes-early.png b/polly/docs/images/LLVM-Passes-early.png
Binary files differ
new file mode 100644
index 00000000000..54991529d29
--- /dev/null
+++ b/polly/docs/images/LLVM-Passes-early.png
diff --git a/polly/docs/images/LLVM-Passes-late.pdf b/polly/docs/images/LLVM-Passes-late.pdf
Binary files differ
new file mode 100644
index 00000000000..f32665990ec
--- /dev/null
+++ b/polly/docs/images/LLVM-Passes-late.pdf
diff --git a/polly/docs/images/LLVM-Passes-late.png b/polly/docs/images/LLVM-Passes-late.png
Binary files differ
new file mode 100644
index 00000000000..f70895f37cb
--- /dev/null
+++ b/polly/docs/images/LLVM-Passes-late.png
diff --git a/polly/docs/images/LLVM-Passes-only.pdf b/polly/docs/images/LLVM-Passes-only.pdf
Binary files differ
new file mode 100644
index 00000000000..146909ec2c9
--- /dev/null
+++ b/polly/docs/images/LLVM-Passes-only.pdf
diff --git a/polly/docs/images/LLVM-Passes-only.png b/polly/docs/images/LLVM-Passes-only.png
Binary files differ
new file mode 100644
index 00000000000..24b56bfc283
--- /dev/null
+++ b/polly/docs/images/LLVM-Passes-only.png
diff --git a/polly/docs/images/LLVM-Passes-only.pngfg b/polly/docs/images/LLVM-Passes-only.pngfg
Binary files differ
new file mode 100644
index 00000000000..f89e38b24f6
--- /dev/null
+++ b/polly/docs/images/LLVM-Passes-only.pngfg
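The two positions described in the added documentation are selected with the *-polly-position* option. As a rough sketch of how that option might be passed through clang, the snippet below assembles the flags for either position. It assumes a clang build with Polly linked in, where Polly is enabled with `-mllvm -polly` at `-O3`. The helper name `polly_flags` and the file `kernel.c` are hypothetical and are not part of the commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of Polly): print the clang flags that enable
# Polly at one of the two pipeline positions discussed in the documentation.
polly_flags() {
  case "$1" in
    early|before-vectorizer)
      # -mllvm forwards the following option to LLVM; -polly turns Polly on.
      printf '%s\n' "-O3 -mllvm -polly -mllvm -polly-position=$1"
      ;;
    *)
      echo "unknown Polly position: $1" >&2
      return 1
      ;;
  esac
}

# Assumed usage, requires a Polly-enabled clang and a source file kernel.c:
#   clang $(polly_flags early) -c kernel.c -o kernel.o
#   clang $(polly_flags before-vectorizer) -c kernel.c -o kernel.o
```

Keeping the position in one place like this makes it easy to compile the same kernel both ways and compare the results, which is the trade-off the two documentation paragraphs describe.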