path: root/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
Commit message | Author | Age | Files | Lines
* Add a new option -run-slp-after-loop-vectorization. | James Molloy | 2014-08-06 | 1 | -15/+44
| | | | | | This swaps the order of the loop vectorizer and the SLP/BB vectorizers. It is disabled by default so we can do performance testing - ideally we want to change to having the loop vectorizer running first, and the SLP vectorizer using its leftovers instead of the other way around. llvm-svn: 214963
* Don't internalize all but main by default. | Rafael Espindola | 2014-08-05 | 1 | -8/+1
| | | | | | | | | | | | | | | This is mostly a cleanup, but it changes a fairly old behavior. Every "real" LTO user was already disabling the silly internalize pass and creating the internalize pass itself. The difference with this patch is for "opt -std-link-opts" and the C api. Now to get a usable behavior out of opt one doesn't need the funny looking command line: opt -internalize -disable-internalize -internalize-public-api-list=foo,bar -std-link-opts llvm-svn: 214919
* Add scoped-noalias metadata | Hal Finkel | 2014-07-24 | 1 | -0/+1
| This commit adds scoped noalias metadata. The primary motivations for this
| feature are:
|   1. To preserve noalias function attribute information when inlining
|   2. To provide the ability to model block-scope C99 restrict pointers
|
| Neither of these two abilities is added here, only the necessary
| infrastructure. In fact, there should be no change to existing functionality,
| only the addition of new features. The logic that converts noalias function
| parameters into this metadata during inlining will come in a follow-up commit.
|
| What is added here is the ability to generally specify noalias memory-access
| sets. Regarding the metadata, alias-analysis scopes are defined similarly to
| TBAA nodes:
|
|   !scope0 = metadata !{ metadata !"scope of foo()" }
|   !scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
|   !scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
|   !scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
|   !scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }
|
| Loads and stores can be tagged with an alias-analysis scope, and also with a
| noalias tag for a specific scope:
|
|   ... = load %ptr1, !alias.scope !{ !scope1 }
|   ... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }
|
| When evaluating an aliasing query, if one of the instructions is associated
| with an alias.scope id that is identical to the noalias scope associated with
| the other instruction, or is a descendant (in the scope hierarchy) of the
| noalias scope associated with the other instruction, then the two memory
| accesses are assumed not to alias.
|
| Note that if the first element of the scope metadata is a string, then it can
| be combined across functions and translation units. The string can be
| replaced by a self-reference to create globally unique scope identifiers.
|
| [Note: This overview is slightly stylized, since the metadata nodes really
| need to just be numbers (!0 instead of !scope0), and the scope lists are also
| global unnamed metadata.]
|
| Existing noalias metadata in a callee is "cloned" for use by the inlined
| code. This is necessary because the aliasing scopes are unique to each call
| site (because of possible control dependencies on the aliasing properties).
| For example, consider a function:
|
|   foo(noalias a, noalias b) { *a = *b; }
|
| that gets inlined into
|
|   bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); }
|
| Just because we know that a1 does not alias with b1 at the first call site,
| and a2 does not alias with b2 at the second call site, we cannot let inlining
| these functions have the metadata imply that a1 does not alias with b2.
|
| llvm-svn: 213864
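The aliasing-query rule described in this message can be sketched as a small standalone model. This is only an illustration of the rule as worded above, not LLVM's implementation: the `Scope` and `Access` types and the `assumedNoAlias` function are hypothetical names introduced here, and scopes are modeled as plain structs with a parent pointer rather than metadata nodes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model of an alias-analysis scope: a name plus an optional
// parent scope (nullptr for a root such as "scope of foo()").
struct Scope {
  std::string name;
  const Scope *parent;
};

// Hypothetical model of the tags attached to one load or store.
struct Access {
  std::vector<const Scope *> aliasScopes;   // models !alias.scope
  std::vector<const Scope *> noaliasScopes; // models !noalias
};

// True if `s` is `ancestor` itself or a descendant of it in the hierarchy.
bool isSameOrDescendant(const Scope *s, const Scope *ancestor) {
  assert(ancestor != nullptr && "a noalias scope must be non-null");
  for (; s != nullptr; s = s->parent)
    if (s == ancestor)
      return true;
  return false;
}

// True if some alias.scope of `a` is identical to, or a descendant of, some
// noalias scope of `b`.
bool coveredBy(const Access &a, const Access &b) {
  for (const Scope *as : a.aliasScopes)
    for (const Scope *ns : b.noaliasScopes)
      if (isSameOrDescendant(as, ns))
        return true;
  return false;
}

// The rule from the commit message, applied in both directions.
bool assumedNoAlias(const Access &a, const Access &b) {
  return coveredBy(a, b) || coveredBy(b, a);
}
```

With the two loads from the message, the first (tagged `!alias.scope !{ !scope1 }`) and the second (tagged `!noalias !{ !scope1 }`) would be reported as not aliasing under this model, because the scope ids are identical.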
* MergedLoadStoreMotion pass | Gerolf Hoflehner | 2014-07-18 | 1 | -1/+4
| | | | | | | | | | | Merges equivalent loads on both sides of a hammock/diamond and hoists them into the header. Merges equivalent stores on both sides of a hammock/diamond and sinks them to the footer. This can enable if-conversion and better tolerate load misses and store operand latencies. llvm-svn: 213396
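A source-level sketch of the effect on a diamond may help. The pass itself works on LLVM IR; the two functions below are hypothetical illustrations written by hand, showing a shape the pass could start from and a shape it could produce, and they only demonstrate that the two forms compute the same result.

```cpp
#include <cassert>

// Before: both arms of the diamond load *p and store to *out.
int beforeMotion(bool c, const int *p, int *out) {
  assert(p != nullptr && out != nullptr);
  if (c) {
    int v = *p;   // load on the "then" side
    *out = v + 1; // store on the "then" side
  } else {
    int v = *p;   // equivalent load on the "else" side
    *out = v * 2; // equivalent store (same address) on the "else" side
  }
  return *out;
}

// After: the load is hoisted into the header and the store is sunk into the
// footer, leaving only arithmetic in the two arms.
int afterMotion(bool c, const int *p, int *out) {
  assert(p != nullptr && out != nullptr);
  int v = *p; // hoisted load (header)
  int r;
  if (c)
    r = v + 1;
  else
    r = v * 2;
  *out = r;   // sunk store (footer)
  return *out;
}
```

Once the arms contain no memory operations, the branch is a candidate for if-conversion into a select, which is the opportunity the message mentions.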
* Run interprocedural const prop before global optimizer | Gerolf Hoflehner | 2014-07-03 | 1 | -1/+1
| | | | | | | | | | | Exposes more constant globals that can be removed by the global optimizer. A specific example is the removal of the static global block address array in clang/test/CodeGen/indirect-goto.c. This change impacts only lower optimization levels. With LTO interprocedural const prop runs already before global opt. llvm-svn: 212284
* Add LoadCombine pass. | Michael J. Spencer | 2014-05-29 | 1 | -0/+11
| | | | | | | | This pass is disabled by default. Use -combine-loads to enable it in -O[1-3]. Differential revision: http://reviews.llvm.org/D3580 llvm-svn: 209791
* Add an extension point for peephole optimizers. | Peter Collingbourne | 2014-05-25 | 1 | -0/+9
| | | | | | | | | | This extension point allows adding passes that perform peephole optimizations similar to the instruction combiner. These passes will be inserted after each instance of the instruction combiner pass. Differential Revision: http://reviews.llvm.org/D3905 llvm-svn: 209595
* Reapply: Add slp vectorization to LTO passes. The bug it exposed has been ↵ | Yi Jiang | 2014-05-05 | 1 | -0/+3
| | | | | | fixed by r207983. <radar://16641956> llvm-svn: 208013
* Revert r207571 - Add slp vectorization to LTO passes | Yi Jiang | 2014-04-30 | 1 | -3/+0
| | | | llvm-svn: 207693
* Add slp vectorization to LTO passes | Yi Jiang | 2014-04-29 | 1 | -0/+3
| | | | llvm-svn: 207571
* [C++] Use 'nullptr'. Transforms edition. | Craig Topper | 2014-04-25 | 1 | -4/+4
| | | | llvm-svn: 207196
* PMBuilder: Expose an option to disable tail calls | Duncan P. N. Exon Smith | 2014-04-18 | 1 | -1/+3
| | | | | | | | Adds API to allow frontends to disable tail calls in PassManagerBuilder. <rdar://problem/16050591> llvm-svn: 206542
* LTO: Add more loop simplification passes to LTO | Duncan P. N. Exon Smith | 2014-04-15 | 1 | -1/+3
| | | | | | | | | Similar to r202051, add missing loop simplification passes to the LTO optimization pipeline. Patch by Rafael Espindola. llvm-svn: 206306
* Move partial/runtime unrolling late in the pipeline | Hal Finkel | 2014-03-31 | 1 | -1/+4
| | | | | | | | | | | | | | | | The generic (concatenation) loop unroller is currently placed early in the standard optimization pipeline. This is a good place to perform full unrolling, but not the right place to perform partial/runtime unrolling. However, most targets don't enable partial/runtime unrolling, so this never mattered. However, even some x86 cores benefit from partial/runtime unrolling of very small loops, and follow-up commits will enable this. First, we need to move partial/runtime unrolling late in the optimization pipeline (importantly, this is after SLP and loop vectorization, as vectorization can drastically change the size of a loop), while keeping the full unrolling where it is now. This change does just that. llvm-svn: 205264
* LTO: Add the loop vectorizer to the LTO pipeline. | Arnold Schwaighofer | 2014-02-24 | 1 | -0/+3
| | | | | | | | | | | | | During the LTO phase LICM will move loop invariant global variables out of loops (informed by GlobalModRef). This makes more loops countable presenting opportunity for the loop vectorizer. Adding the loop vectorizer improves some TSVC benchmarks and twolf/ref dataset (5%) on x86-64. radar://15970632 llvm-svn: 202051
* [cleanup] Move the Dominators.h and Verifier.h headers into the IR | Chandler Carruth | 2014-01-13 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | directory. These passes are already defined in the IR library, and it doesn't make any sense to have the headers in Analysis. Long term, I think there is going to be a much better way to divide these matters. The dominators code should be fully separated into the abstract graph algorithm and have that put in Support where it becomes obvious that even Clang's CFGBlocks can use it. Then the verifier can manually construct dominance information from the Support-driven interface while the Analysis library can provide a pass which both caches, reconstructs, and supports a nice update API. But those are very long term, and so I don't want to leave the really confusing structure until that day arrives. llvm-svn: 199082
* Add #pragma vectorize enable/disable to LLVM | Renato Golin | 2013-12-05 | 1 | -24/+12
| | | | | | | | | | | | | | | | | | | | | | | | The intended behaviour is to force vectorization on the presence of the flag (either turn on or off), and to continue the behaviour as expected in its absence. Tests were added to make sure that all cases are covered in opt. No tests were added in other tools with the assumption that they should use the PassManagerBuilder in the same way. This patch also removes the outdated -late-vectorize flag, which was on by default and not helping much. The pragma metadata is being attached to the same place as other loop metadata, but nothing forbids one from attaching it to a function (to enable #pragma optimize) or basic blocks (to hint the basic-block vectorizers), etc. The logic should be the same all around. Patches to Clang to produce the metadata will be produced after the initial implementation is agreed upon and committed. Patches to other vectorizers (such as SLP and BB) will be added once we're happy with the pass manager changes. llvm-svn: 196537
* Add a loop rerolling flag to the PassManagerBuilder | Hal Finkel | 2013-11-17 | 1 | -1/+2
| | | | | | | | | This adds a boolean member variable to the PassManagerBuilder to control loop rerolling (just like we have for unrolling and the various vectorization options). This is necessary for control by the frontend. Loop rerolling remains disabled by default at all optimization levels. llvm-svn: 194966
* Add a loop rerolling pass | Hal Finkel | 2013-11-16 | 1 | -0/+6
| This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The
| transformation aims to take loops like this:
|
|   for (int i = 0; i < 3200; i += 5) {
|     a[i] += alpha * b[i];
|     a[i + 1] += alpha * b[i + 1];
|     a[i + 2] += alpha * b[i + 2];
|     a[i + 3] += alpha * b[i + 3];
|     a[i + 4] += alpha * b[i + 4];
|   }
|
| and turn them into this:
|
|   for (int i = 0; i < 3200; ++i) {
|     a[i] += alpha * b[i];
|   }
|
| and loops like this:
|
|   for (int i = 0; i < 500; ++i) {
|     x[3*i] = foo(0);
|     x[3*i+1] = foo(0);
|     x[3*i+2] = foo(0);
|   }
|
| and turn them into this:
|
|   for (int i = 0; i < 1500; ++i) {
|     x[i] = foo(0);
|   }
|
| There are two motivations for this transformation:
|
|   1. Code-size reduction (especially relevant, obviously, when compiling for
|      code size).
|
|   2. Providing greater choice to the loop vectorizer (and generic unroller) to
|      choose the unrolling factor (and a better ability to vectorize). The loop
|      vectorizer can take vector lengths and register pressure into account
|      when choosing an unrolling factor, for example, and a pre-unrolled loop
|      limits that choice. This is especially problematic if the manual
|      unrolling was optimized for a machine different from the current target.
|
| The current implementation is limited to single basic-block loops only. The
| rerolling recognition should work regardless of how the loop iterations are
| intermixed within the loop body (subject to dependency and side-effect
| constraints), but the significant restriction is that the order of the
| instructions in each iteration must be identical. This seems sufficient to
| capture all current use cases.
|
| This pass is not currently enabled by default at any optimization level.
|
| llvm-svn: 194939
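The first pair of loops in this message can be written out as two standalone functions to check that the rerolled form computes the same values. This is a hand-written sketch for illustration only, not output of the pass; the function names `unrolledAxpy` and `rerolledAxpy` and the use of `std::vector<float>` are assumptions made here, since the message does not give the element type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Manually unrolled by 5, as in the "before" loop of the commit message.
// Assumes the length is a multiple of 5 (3200 in the message).
void unrolledAxpy(std::vector<float> &a, const std::vector<float> &b,
                  float alpha) {
  assert(a.size() == b.size() && a.size() % 5 == 0);
  for (std::size_t i = 0; i < a.size(); i += 5) {
    a[i] += alpha * b[i];
    a[i + 1] += alpha * b[i + 1];
    a[i + 2] += alpha * b[i + 2];
    a[i + 3] += alpha * b[i + 3];
    a[i + 4] += alpha * b[i + 4];
  }
}

// Rerolled form: one element per iteration, as in the "after" loop.
void rerolledAxpy(std::vector<float> &a, const std::vector<float> &b,
                  float alpha) {
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    a[i] += alpha * b[i];
}
```

The rerolled version leaves the choice of unrolling or vectorization factor to later passes, which is the second motivation listed in the message.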
* Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list". | Rafael Espindola | 2013-10-31 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | There are two ways one could implement hiding of linkonce_odr symbols in LTO: * LLVM tells the linker which symbols can be hidden if not used from native files. * The linker tells LLVM which symbols are not used from other object files, but will be put in the dso symbol table if present. GOLD's API is the second option. It was implemented almost 1:1 in llvm by passing the list down to internalize. LLVM already had partial support for the first option. It is also very similar to how ld64 handles hiding these symbols when *not* doing LTO. This patch then * removes the APIs for the DSO list. * marks LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr global values and other linkonce_odr whose address is not used. * makes the gold plugin responsible for handling the API mismatch. llvm-svn: 193800
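The marking rule in the second bullet of this patch description can be restated as a small predicate. This is a toy model of the rule as worded in the message, not code from LLVM or the LTO API: the `Linkage` enum, the `GlobalInfo` struct, and `canBeHidden` are hypothetical names introduced for illustration.

```cpp
// Hypothetical, simplified view of a global value's properties.
enum class Linkage { External, Internal, LinkOnceODR };

struct GlobalInfo {
  Linkage linkage;
  bool unnamedAddr; // models the unnamed_addr attribute
  bool addressUsed; // whether the symbol's address is used
};

// Models "marks LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr
// unnamed_addr global values and other linkonce_odr whose address is not
// used", as stated in the commit message.
bool canBeHidden(const GlobalInfo &g) {
  if (g.linkage != Linkage::LinkOnceODR)
    return false;
  return g.unnamedAddr || !g.addressUsed;
}
```

Under this reading, a linkonce_odr symbol whose address is significant and used stays visible, while every other linkonce_odr symbol may be hidden by the linker when no native file refers to it.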
* Mark some command line flags as hidden | Nadav Rotem | 2013-10-18 | 1 | -3/+3
| | | | llvm-svn: 193013
* Optimize linkonce_odr unnamed_addr functions during LTO. | Rafael Espindola | 2013-10-03 | 1 | -1/+1
| | | | | | | | | | | Generalize the API so we can distinguish symbols that are needed just for a DSO symbol table from those that are used from some native .o. The symbols that are only wanted for the dso symbol table can be dropped if llvm can prove every other dso has a copy (linkonce_odr) and the address is not important (unnamed_addr). llvm-svn: 191922
* Enable late-vectorization by default. | Nadav Rotem | 2013-09-03 | 1 | -1/+1
| This patch changes the default setting for the LateVectorization flag that
| controls where the loop-vectorizer is run.
|
| Perf gains:
|   SingleSource/Benchmarks/Shootout/matrix -37.33%
|   MultiSource/Benchmarks/PAQ8p/paq8p -22.83%
|   SingleSource/Benchmarks/Linpack/linpack-pc -16.22%
|   SingleSource/Benchmarks/Shootout-C++/ary3 -15.16%
|   MultiSource/Benchmarks/TSVC/NodeSplitting-flt/NodeSplitting-flt -10.34%
|   MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl -7.12%
|
| Regressions:
|   SingleSource/Benchmarks/Misc/lowercase 15.10%
|   MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt 13.18%
|   SingleSource/Benchmarks/Shootout-C++/matrix 8.27%
|   SingleSource/Benchmarks/CoyoteBench/lpbench 7.30%
|
| llvm-svn: 189858
* Random cleanup: No need to use a std::vector here, since ↵ | Bill Wendling | 2013-08-30 | 1 | -5/+4
| | | | | | createInternalizePass uses an ArrayRef. llvm-svn: 189632
* Vectorizer/PassManager: I am working on moving the vectorizer out of the ↵ | Nadav Rotem | 2013-08-28 | 1 | -46/+18
| | | | | | | | | SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons: 1. They are a kind of canonicalization. 2. The performance measurements show that it is better to keep them in. There should be no functional change if you are not enabling the LateVectorization mode. llvm-svn: 189539
* Disable unrolling in the loop vectorizer when disabled in the pass manager | Hal Finkel | 2013-08-28 | 1 | -2/+2
| | | | | | | | | | | | | | | | | When unrolling is disabled in the pass manager, the loop vectorizer should also not unroll loops. This will allow the -fno-unroll-loops option in Clang to behave as expected (even for vectorizable loops). The loop vectorizer's -force-vector-unroll option will (continue to) override the pass-manager setting (including -force-vector-unroll=0 to force use of the internal auto-selection logic). In order to test this, I added a flag to opt (-disable-loop-unrolling) to force disable unrolling through opt (the analog of -fno-unroll-loops in Clang). Also, this fixes a small bug in opt where the loop vectorizer was enabled only after the pass manager populated the queue of passes (the global_alias.ll test needed a slight update to the RUN line as a result of this fix). llvm-svn: 189499
* Also remove logic in LateVectorize | Arnold Schwaighofer | 2013-08-13 | 1 | -1/+1
| | | | llvm-svn: 188285
* Remove logic that decides whether to vectorize or not depending on O-levels | Arnold Schwaighofer | 2013-08-13 | 1 | -1/+1
| | | | | | I have moved this logic into clang and opt. llvm-svn: 188281
* Factor FlattenCFG out from SimplifyCFG | Tom Stellard | 2013-08-06 | 1 | -2/+2
| | | | | | Patch by: Mei Ye llvm-svn: 187764
* Move the optlevel check to the frontend. | Nadav Rotem | 2013-08-01 | 1 | -1/+1
| | | | llvm-svn: 187628
* Only enable SLP-vectorization on O3 builds. | Nadav Rotem | 2013-08-01 | 1 | -1/+1
| | | | llvm-svn: 187595
* SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch ↵ | Tom Stellard | 2013-07-27 | 1 | -2/+2
| | | | | | | | | | | | | | conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278
* Add a flag to defer vectorization into a phase after the inliner and its | Chandler Carruth | 2013-06-24 | 1 | -16/+66
| | | | | | | | | | | | | CGSCC pass manager. This should insulate the inlining decisions from the vectorization decisions, however it may have both compile time and code size problems so it is just an experimental option right now. Adding this based on a discussion with Arnold and it seems at least worth having this flag for us to both run some experiments to see if this strategy is workable. It may solve some of the regressions seen with the loop vectorizer. llvm-svn: 184698
* Remove the simplify-libcalls pass (finally) | Meador Inge | 2013-06-20 | 1 | -5/+1
| | | | | | | | | | | This commit completely removes what is left of the simplify-libcalls pass. All of the functionality has now been migrated to the instcombine and functionattrs passes. The following C API functions are now NOPs: 1. LLVMAddSimplifyLibCallsPass 2. LLVMPassManagerBuilderSetDisableSimplifyLibCalls llvm-svn: 184459
* Disable vectorization for -Oz. | Nadav Rotem | 2013-06-17 | 1 | -1/+1
| | | | llvm-svn: 184089
* Enable the loop vectorizer by default for -Os and -O2. | Nadav Rotem | 2013-06-17 | 1 | -7/+1
| | | | llvm-svn: 184084
* Jeffrey Yasskin volunteered to benchmark the vectorizer on -O2 or -Os when ↵ | Nadav Rotem | 2013-06-06 | 1 | -1/+7
| | | | | | compiling chrome. This patch adds a new flag to enable vectorization on all levels and not only on -O3. It should go away once we make a decision. llvm-svn: 183456
* This patch breaks up Wrap.h so that it does not have to include all of | Filip Pizlo | 2013-05-01 | 1 | -1/+0
| | | | | | | | | | | | | | | | | | | | | | | | the things, and renames it to CBindingWrapping.h. I also moved CBindingWrapping.h into Support/. This new file just contains the macros for defining different wrap/unwrap methods. The calls to those macros, as well as any custom wrap/unwrap definitions (like for array of Values for example), are put into corresponding C++ headers. Doing this required some #include surgery, since some .cpp files relied on the fact that including Wrap.h implicitly caused the inclusion of a bunch of other things. This also now means that the C++ headers will include their corresponding C API headers; for example Value.h must include llvm-c/Core.h. I think this is harmless, since the C API headers contain just external function declarations and some C types, so I don't believe there should be any nasty dependency issues here. llvm-svn: 180881
* Move C++ code out of the C headers and into either C++ headers | Eric Christopher | 2013-04-22 | 1 | -0/+9
| | | | | | | or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063
* SLPVectorizer: Make it a function pass and add code for hoisting the ↵ | Nadav Rotem | 2013-04-15 | 1 | -4/+2
| | | | | | vector-gather sequence out of loops. llvm-svn: 179562
* Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make ↵ | Nadav Rotem | 2013-04-15 | 1 | -1/+12
| | | | | | -fslp-vectorize run the slp-vectorizer. llvm-svn: 179508
* Rename the slp-vectorizer clang/llvm flags. No functionality change. | Nadav Rotem | 2013-04-15 | 1 | -3/+3
| | | | llvm-svn: 179505
* Use LLVMBool instead of 'bool' in the C API. Based on a patch by Peter Zotov! | Nick Lewycky | 2013-03-10 | 1 | -3/+3
| | | | llvm-svn: 176793
* Generalize my previous fix for -print-options. | Andrew Trick | 2013-03-06 | 1 | -1/+1
| | | | | | | Always print options that differ from their implicit default. At least for simple option types. llvm-svn: 176572
* Give -loop-vectorize an explicit default. | Andrew Trick | 2013-03-06 | 1 | -1/+1
| | | | | | This way, clang -mllvm -print-options shows that the driver is overriding it. llvm-svn: 176569
* Unroll again after running BBVectorize | Hal Finkel | 2013-01-29 | 1 | -0/+4
| | | | | | | | Because BBVectorize may significantly shorten a loop body, unroll again after vectorization. This is especially important when using runtime or partial unrolling. llvm-svn: 173730
* Remove the long defunct 'DefaultPasses' header. We have a pass manager | Chandler Carruth | 2013-01-07 | 1 | -1/+0
| | | | | | | builder these days, and this thing hasn't seen updates for a very long time. llvm-svn: 171741
* Move the loop vectorizer from O2 to O3. It looks like the increase in code ↵ | Nadav Rotem | 2013-01-04 | 1 | -1/+1
| | | | | | size actually hurts the performance on many programs. llvm-svn: 171471
* Remove duplicate includes. | Roman Divacky | 2012-12-21 | 1 | -1/+0
| | | | llvm-svn: 170902
* Enable the loop vectorizer in clang and not in the pass manager, so that we ↵ | Nadav Rotem | 2012-12-18 | 1 | -1/+1
| | | | | | can disable it in clang. llvm-svn: 170470