summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Scalar/Scalar.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [UnrollAndJam] New Unroll and Jam passDavid Green2018-07-011-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder Loop So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 336062
* [instsimplify] Move the instsimplify pass to use more obvious file namesChandler Carruth2018-06-291-0/+1
| | | | | | | | | | | | | | | | and diretory. Also cleans up all the associated naming to be consistent and removes the public access to the pass ID which was unused in LLVM. Also runs clang-format over parts that changed, which generally cleans up a bunch of formatting. This is in preparation for doing some internal cleanups to the pass. Differential Revision: https://reviews.llvm.org/D47352 llvm-svn: 336028
* Revert 333358 as it's failing on some builders.David Green2018-05-271-5/+0
| | | | | | I'm guessing the tests reply on the ARM backend being built. llvm-svn: 333359
* [UnrollAndJam] Add a new Unroll and Jam passDavid Green2018-05-271-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now-jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 333358
* Restore the LoopInstSimplify pass, reverting r327329 that removed it.Chandler Carruth2018-05-251-0/+1
| | | | | | | | | | | | | | | The plan had always been to move towards using this rather than so much in-pass simplification within the loop pipeline, but we never got around to it.... until only a couple months after it was removed due to disuse. =/ This commit is just a pure revert of the removal. I will add tests and do some basic cleanup in follow-up commits. Then I'll wire it into the loop pass pipeline. Differential Revision: https://reviews.llvm.org/D47353 llvm-svn: 333250
* [LoopGuardWidening] Split out a loop pass version of GuardWideningPhilip Reames2018-04-271-0/+1
| | | | | | | | The idea is to have a pass which performs the same transformation as GuardWidening, but can be run within a loop pass manager without disrupting the pass manager structure. As demonstrated by the test case, this doesn't quite get there because of issues with post dom, but it gives a good step in the right direction. the motivation is purely to reduce compile time since we can now preserve locality during the loop walk. This patch only includes a legacy pass. A follow up will add a new style pass as well. llvm-svn: 331060
* Fix some layering in AggressiveInstCombine (avoiding inclusion of Scalar.h)David Blaikie2018-04-241-4/+0
| | | | llvm-svn: 330726
* InstCombine: Fix layering by not including Scalar.h in InstCombineDavid Blaikie2018-04-241-4/+0
| | | | | | | | (notionally Scalar.h is part of libLLVMScalarOpts, so it shouldn't be included by InstCombine which doesn't/shouldn't need to depend on ScalarOpts) llvm-svn: 330669
* [AggressiveInstCombine] Add aggressive inst combiner to the LLVM C API.Craig Topper2018-04-241-0/+4
| | | | | | I just tried to copy what was done for regular InstCombine. Hopefully I didn't miss anything. llvm-svn: 330668
* Oops - moved slightly too many things from Scalar to Utils. Move ↵David Blaikie2018-03-281-0/+4
| | | | | | LoopSimplifyCFG things back llvm-svn: 328720
* Transforms: Introduce Transforms/Utils.h rather than spreading the ↵David Blaikie2018-03-281-12/+0
| | | | | | | | | declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. llvm-svn: 328717
* Finish moving the IPSCCP pass from Scalar to IPO - moving the registrationDavid Blaikie2018-03-221-1/+0
| | | | llvm-svn: 328259
* [New PM][IRCE] port of Inductive Range Check Elimination pass to the new ↵Fedor Sergeev2018-03-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | pass manager There are two nontrivial details here: * Loop structure update interface is quite different with new pass manager, so the code to add new loops was factored out * BranchProbabilityInfo is not a loop analysis, so it can not be just getResult'ed from within the loop pass. It cant even be queried through getCachedResult as LoopCanonicalization sequence (e.g. LoopSimplify) might invalidate BPI results. Complete solution for BPI will likely take some time to discuss and figure out, so for now this was partially solved by making BPI optional in IRCE (skipping a couple of profitability checks if it is absent). Most of the IRCE tests got their corresponding new-pass-manager variant enabled. Only two of them depend on BPI, both marked with TODO, to be turned on when BPI starts being available for loop passes. Reviewers: chandlerc, mkazantsev, sanjoy, asbirlea Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43795 llvm-svn: 327619
* Remove the LoopInstSimplify pass (-loop-instsimplify)Vedant Kumar2018-03-121-1/+0
| | | | | | | | | | | | LoopInstSimplify is unused and untested. Reading through the commit history the pass also seems to have a high maintenance burden. It would be best to retire the pass for now. It should be easy to recover if we need something similar in the future. Differential Revision: https://reviews.llvm.org/D44053 llvm-svn: 327329
* [PM] port Rewrite Statepoints For GC to the new pass manager.Fedor Sergeev2017-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The port is nearly straightforward. The only complication is related to the analyses handling, since one of the analyses used in this module pass is domtree, which is a function analysis. That requires asking for the results of each function and disallows a single interface for run-on-module pass action. Decided to copy-paste the main body of this pass. Most of its code is requesting analyses anyway, so not that much of a copy-paste. The rest of the code movement is to transform all the implementation helper functions like stripNonValidData into non-member statics. Extended all the related LLVM tests with new-pass-manager use. No failures. Reviewers: sanjoy, anna, reames Reviewed By: anna Subscribers: skatkov, llvm-commits Differential Revision: https://reviews.llvm.org/D41162 llvm-svn: 320796
* Rename CountingFunctionInserter and use for both mcount and cygprofile ↵Hans Wennborg2017-11-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | calls, before and after inlining Clang implements the -finstrument-functions flag inherited from GCC, which inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit. This is useful for getting a trace of how the functions in a program are executed. Normally, the calls remain even if a function is inlined into another function, but it is useful to be able to turn this off for users who are interested in a lower-level trace, i.e. one that reflects what functions are called post-inlining. (We use this to generate link order files for Chromium.) LLVM already has a pass for inserting similar instrumentation calls to mcount(), which it does after inlining. This patch renames and extends that pass to handle calls both to mcount and the cygprofile functions, before and/or after inlining as controlled by function attributes. Differential Revision: https://reviews.llvm.org/D39287 llvm-svn: 318195
* Recommit r317351 : Add CallSiteSplitting passJun Bum Lim2017-11-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This recommit r317351 after fixing a buildbot failure. Original commit message: Summary: This change add a pass which tries to split a call-site to pass more constrained arguments if its argument is predicated in the control flow so that we can expose better context to the later passes (e.g, inliner, jump threading, or IPA-CP based function cloning, etc.). As of now we support two cases : 1) If a call site is dominated by an OR condition and if any of its arguments are predicated on this OR condition, try to split the condition with more constrained arguments. For example, in the code below, we try to split the call site since we can predicate the argument (ptr) based on the OR condition. Split from : if (!ptr || c) callee(ptr); to : if (!ptr) callee(null ptr) // set the known constant value else if (c) callee(nonnull ptr) // set non-null attribute in the argument 2) We can also split a call-site based on constant incoming values of a PHI For example, from : BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2, label %BB1 BB1: br label %BB2 BB2: %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ] call void @bar(i32 %p) to BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2-split0, label %BB1 BB1: br label %BB2-split1 BB2-split0: call void @bar(i32 0) br label %BB2 BB2-split1: call void @bar(i32 1) br label %BB2 BB2: %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ] llvm-svn: 317362
* Revert "Add CallSiteSplitting pass"Jun Bum Lim2017-11-031-1/+0
| | | | | | | | Revert due to Buildbot failure. This reverts commit r317351. llvm-svn: 317353
* Add CallSiteSplitting passJun Bum Lim2017-11-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change add a pass which tries to split a call-site to pass more constrained arguments if its argument is predicated in the control flow so that we can expose better context to the later passes (e.g, inliner, jump threading, or IPA-CP based function cloning, etc.). As of now we support two cases : 1) If a call site is dominated by an OR condition and if any of its arguments are predicated on this OR condition, try to split the condition with more constrained arguments. For example, in the code below, we try to split the call site since we can predicate the argument (ptr) based on the OR condition. Split from : if (!ptr || c) callee(ptr); to : if (!ptr) callee(null ptr) // set the known constant value else if (c) callee(nonnull ptr) // set non-null attribute in the argument 2) We can also split a call-site based on constant incoming values of a PHI For example, from : BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2, label %BB1 BB1: br label %BB2 BB2: %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ] call void @bar(i32 %p) to BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2-split0, label %BB1 BB1: br label %BB2-split1 BB2-split0: call void @bar(i32 0) br label %BB2 BB2-split1: call void @bar(i32 1) br label %BB2 BB2: %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ] Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide Reviewed By: davidxl Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D39137 llvm-svn: 317351
* Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."Clement Courbet2017-11-021-1/+0
| | | | | | | | | undefined reference to `llvm::TargetPassConfig::ID' on clang-ppc64le-linux-multistage This reverts commit eea333c33fa73ad225ef28607795984829f65688. llvm-svn: 317213
* [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.Clement Courbet2017-11-021-0/+1
| | | | | | | | | | | | | | | | | Summary: This is mostly a noop (most of the test diffs are renamed blocks). There are a few temporary register renames (eax<->ecx) and a few blocks are shuffled around. See the discussion in PR33325 for more details. Reviewers: spatel Subscribers: mgorny Differential Revision: https://reviews.llvm.org/D39456 llvm-svn: 317211
* [SimplifyCFG] use pass options and remove the latesimplifycfg passSanjay Patel2017-10-281-6/+1
| | | | | | | | | | | | | | | | | This is no-functional-change-intended. This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and D35411 (defer folding unconditional branches) with pass parameters rather than a named "latesimplifycfg" pass. Now that we have individual options to control the functionality, we could decouple when these fire (but that's an independent patch if desired). The next planned step would be to add another option bit to disable the sinking transform mentioned in D38566. This should also make it clear that the new pass manager needs to be updated to limit simplifycfg in the same way as the old pass manager. Differential Revision: https://reviews.llvm.org/D38631 llvm-svn: 316835
* [DivRempairs] add a pass to optimize div/rem pairs (PR31028)Sanjay Patel2017-09-091-0/+1
| | | | | | | | | | | | | | | | | | This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented as an independent pass, so there's no stretching of scope and feature creep for an existing pass. I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost this same functionality as an addition to CGP in the motivating example of PR31028: https://bugs.llvm.org/show_bug.cgi?id=31028 The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and undo the hoisting that is done here. Decomposing remainder may allow removing some code from the backend (PPC and possibly others). Differential Revision: https://reviews.llvm.org/D37121 llvm-svn: 312862
* Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that ↵Clement Courbet2017-09-011-0/+1
| | | | | | | | | | turns chains of integer Add missing header. This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf. llvm-svn: 312322
* Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains ↵Clement Courbet2017-09-011-1/+0
| | | | | | | | | | of integer" Break build This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b. llvm-svn: 312317
* [MergeICmps] MergeICmps is a new optimization pass that turns chains of integerClement Courbet2017-09-011-0/+1
| | | | | | | | | | | | | | | | | comparisons into memcmp. Thanks to recent improvements in the LLVM codegen, the memcmp is typically inlined as a chain of efficient hardware comparisons. This typically benefits C++ member or nonmember operator==(). For now this is disabled by default until: - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete - Benchmarks show that this is always useful. Differential Revision: https://reviews.llvm.org/D33987 llvm-svn: 312315
* Remove the LoadCombine pass. It was never enabled and is unsupported.Eric Christopher2017-06-221-1/+0
| | | | | | Based on discussions with the author on mailing lists. llvm-svn: 306067
* Sort the remaining #include lines in include/... and lib/....Chandler Carruth2017-06-061-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
* [GVNSink] GVNSink passJames Molloy2017-05-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for commiting but I've uploaded it to gather some initial thoughts. This pass attempts to sink instructions into successors, reducing static instruction count and enabling if-conversion. We use a variant of global value numbering to decide what can be sunk. Consider: [ %a1 = add i32 %b, 1 ] [ %c1 = add i32 %d, 1 ] [ %a2 = xor i32 %a1, 1 ] [ %c2 = xor i32 %c1, 1 ] \ / [ %e = phi i32 %a2, %c2 ] [ add i32 %e, 4 ] GVN would number %a1 and %c1 differently because they compute different results - the VN of an instruction is a function of its opcode and the transitive closure of its operands. This is the key property for hoisting and CSE. What we want when sinking however is for a numbering that is a function of the *uses* of an instruction, which allows us to answer the question "if I replace %a1 with %c1, will it contribute in an equivalent way to all successive instructions?". The (new) PostValueTable class in GVN provides this mapping. This pass has some shown really impressive improvements especially for codesize already on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG. Differential revision: https://reviews.llvm.org/D24805 llvm-svn: 303850
* [PM/LoopUnswitch] Introduce a new, simpler loop unswitch pass.Chandler Carruth2017-04-271-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, this pass only focuses on *trivial* loop unswitching. At that reduced problem it remains significantly better than the current loop unswitch: - Old pass is worse than cubic complexity. New pass is (I think) linear. - New pass is much simpler in its design by focusing on full unswitching. (See below for details on this). - New pass doesn't carry state for thresholds between pass iterations. - New pass doesn't carry state for correctness (both miscompile and infloop) between pass iterations. - New pass produces substantially better code after unswitching. - New pass can handle more trivial unswitch cases. - New pass doesn't recompute the dominator tree for the entire function and instead incrementally updates it. I've ported all of the trivial unswitching test cases from the old pass to the new one to make sure that major functionality isn't lost in the process. For several of the test cases I've worked to improve the precision and rigor of the CHECKs, but for many I've just updated them to handle the new IR produced. My initial motivation was the fact that the old pass carried state in very unreliable ways between pass iterations, and these mechansims were incompatible with the new pass manager. However, I discovered many more improvements to make along the way. This pass makes two very significant assumptions that enable most of these improvements: 1) Focus on *full* unswitching -- that is, completely removing whatever control flow construct is being unswitched from the loop. In the case of trivial unswitching, this means removing the trivial (exiting) edge. In non-trivial unswitching, this means removing the branch or switch itself. This is in opposition to *partial* unswitching where some part of the unswitched control flow remains in the loop. Partial unswitching only really applies to switches and to folded branches. These are very similar to full unrolling and partial unrolling. The full form is an effective canonicalization, the partial form needs a complex cost model, cannot be iterated, isn't canonicalizing, and should be a separate pass that runs very late (much like unrolling). 2) Leverage LLVM's Loop machinery to the fullest. The original unswitch dates from a time when a great deal of LLVM's loop infrastructure was missing, ineffective, and/or unreliable. As a consequence, a lot of complexity was added which we no longer need. With these two overarching principles, I think we can build a fast and effective unswitcher that fits in well in the new PM and in the canonicalization pipeline. Some of the remaining functionality around partial unswitching may not be relevant today (not many test cases or benchmarks I can find) but if they are I'd like to add support for them as a separate layer that runs very late in the pipeline. Purely to make reviewing and introducing this code more manageable, I've split this into first a trivial-unswitch-only pass and in the next patch I'll add support for full non-trivial unswitching against a *fixed* threshold, exactly like full unrolling. I even plan to re-use the unrolling thresholds, as these are incredibly similar cost tradeoffs: we're cloning a loop body in order to end up with simplified control flow. We should only do that when the total growth is reasonably small. One of the biggest changes with this pass compared to the previous one is that previously, each individual trivial exiting edge from a switch was unswitched separately as a branch. Now, we unswitch the entire switch at once, with cases going to the various destinations. This lets us unswitch multiple exiting edges in a single operation and also avoids numerous extremely bad behaviors, where we would introduce 1000s of branches to test for thousands of possible values, all of which would take the exact same exit path bypassing the loop. Now we will use a switch with 1000s of cases that can be efficiently lowered into a jumptable. This avoids relying on somehow forming a switch out of the branches or getting horrible code if that fails for any reason. Another significant change is that this pass actively updates the CFG based on unswitching. For trivial unswitching, this is actually very easy because of the definition of loop simplified form. Doing this makes the code coming out of loop unswitch dramatically more friendly. We still should run loop-simplifycfg (at the least) after this to clean up, but it will have to do a lot less work. Finally, this pass makes much fewer attempts to simplify instructions based on the unswitch. Something like loop-instsimplify, instcombine, or GVN can be used to do increasingly powerful simplifications based on the now dominating predicate. The old simplifications are things that something like loop-instsimplify should get today or a very, very basic loop-instcombine could get. Keeping that logic separate is a big simplifying technique. Most of the code in this pass that isn't in the old one has to do with achieving specific goals: - Updating the dominator tree as we go - Unswitching all cases in a switch in a single step. I think it is still shorter than just the trivial unswitching code in the old pass despite having this functionality. Differential Revision: https://reviews.llvm.org/D32409 llvm-svn: 301576
* Split the SimplifyCFG pass into two variants.Joerg Sonnenberger2017-03-261-0/+5
| | | | | | | | | | | | | | | | | | | | | | | The first variant contains all current transformations except transforming switches into lookup tables. The second variant contains all current transformations. The switch-to-lookup-table conversion results in code that is more difficult to analyze and optimize by other passes. Most importantly, it can inhibit Dead Code Elimination. As such it is often beneficial to only apply this transformation very late. A common example is inlining, which can often result in range restrictions for the switch expression. Changes in execution time according to LNT: SingleSource/Benchmarks/Misc/fp-convert +3.03% MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk -11.20% MultiSource/Benchmarks/Olden/perimeter/perimeter -10.43% and a couple of smaller changes. For perimeter it also results 2.6% a smaller binary. Differential Revision: https://reviews.llvm.org/D30333 llvm-svn: 298799
* Split NewGVN class into a legacy pass and an impl, instead of a merged class.Daniel Berlin2017-03-121-1/+1
| | | | llvm-svn: 297576
* NVPTX: Move InferAddressSpaces to generic codeMatt Arsenault2017-01-311-0/+1
| | | | llvm-svn: 293579
* [Guards] Introduce loop-predication passArtur Pilipenko2017-01-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces guard based loop predication optimization. The new LoopPredication pass tries to convert loop variant range checks to loop invariant by widening checks across loop iterations. For example, it will convert for (i = 0; i < n; i++) { guard(i < len); ... } to for (i = 0; i < n; i++) { guard(n - 1 < len); ... } After this transformation the condition of the guard is loop invariant, so loop-unswitch can later unswitch the loop by this condition which basically predicates the loop by the widened condition: if (n - 1 < len) for (i = 0; i < n; i++) { ... } else deoptimize This patch relies on an NFC change to make ScalarEvolution::isMonotonicPredicate public (revision 293062). Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D29034 llvm-svn: 293064
* [GVN] Initial check-in of a new global value numbering algorithm.Davide Italiano2016-12-221-0/+5
| | | | | | | | | | | | | | | | | | | | | The code have been developed by Daniel Berlin over the years, and the new implementation goal is that of addressing shortcomings of the current GVN infrastructure, i.e. long compile time for large testcases, lack of phi predication, no load/store value numbering etc... The current code just implements the "core" GVN algorithm, although other pieces (load coercion, phi handling, predicate system) are already implemented in a branch out of tree. Once the core is stable, we'll start adding pieces on top of the base framework. The test currently living in test/Transform/NewGVN are a copy of the ones in GVN, with proper `XFAIL` (missing features in NewGVN). A flag will be added in a future commit to enable NewGVN, so that interested parties can exercise this code easily. Differential Revision: https://reviews.llvm.org/D26224 llvm-svn: 290346
* Add Loop Sink pass to reverse the LICM based of basic block frequency.Dehao Chen2016-10-271-0/+5
| | | | | | | | | | | | Summary: LICM may hoist instructions to preheader speculatively. Before code generation, we need to sink down the hoisted instructions inside to loop if it's beneficial. This pass is a reverse of LICM: looking at instructions in preheader and sinks the instruction to basic blocks inside the loop body if basic block frequency is smaller than the preheader frequency. Reviewers: hfinkel, davidxl, chandlerc Subscribers: anna, modocache, mgorny, beanz, reames, dberlin, chandlerc, mcrosier, junbuml, sanjoy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22778 llvm-svn: 285308
* [EarlyCSE] Change C API pass interface for EarlyCSE w/ MemorySSAGeoff Berry2016-09-011-2/+6
| | | | | | | | | | Previous change broke the C API for creating an EarlyCSE pass w/ MemorySSA by adding a bool parameter to control whether MemorySSA was used or not. This broke the OCaml bindings. Instead, change the old C API entry point back and add a new one to request an EarlyCSE pass with MemorySSA. llvm-svn: 280379
* [EarlyCSE] Optionally use MemorySSA. NFC.Geoff Berry2016-08-311-2/+3
| | | | | | | | | | | | | | | | | Summary: Use MemorySSA, if requested, to do less conservative memory dependency checking. This change doesn't enable the MemorySSA enhanced EarlyCSE in the default pipelines, so should be NFC. Reviewers: dberlin, sanjoy, reames, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19821 llvm-svn: 280279
* [PM] Port LoopDataPrefetch to new pass managerTeresa Johnson2016-08-131-1/+1
| | | | | | | | | | | | | | | | Summary: Refactor the existing support into a LoopDataPrefetch implementation class and a LoopDataPrefetchLegacyPass class that invokes it. Add a new LoopDataPrefetchPass for the new pass manager that utilizes the LoopDataPrefetch implementation class. Reviewers: mehdi_amini Subscribers: sanjoy, mzolotukhin, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23483 llvm-svn: 278591
* [PM] Port SpeculativeExecution to the new PMMichael Kuperstein2016-08-011-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D23033 llvm-svn: 277393
* [PM] Port LowerGuardIntrinsic to the new PM.Michael Kuperstein2016-07-281-1/+1
| | | | llvm-svn: 277057
* [PM] Port NaryReassociate to the new PMWei Mi2016-07-211-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D22648 llvm-svn: 276349
* [LoopDist] Port to new PMAdam Nemet2016-07-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: The direct motivation for the port is to ensure that the OptRemarkEmitter tests work with the new PM. This remains a function pass because we not only create multiple loops but could also version the original loop. In the test I need to invoke opt with -passes='require<aa>,loop-distribute'. LoopDistribute does not directly depend on AA however LAA does. LAA uses getCachedResult so I *think* we need manually pull in 'aa'. Reviewers: davidxl, silvas Subscribers: sanjoy, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22437 llvm-svn: 275811
* [PM] Convert LoopInstSimplify Pass to new PMDehao Chen2016-07-151-1/+1
| | | | | | | | | | | | Summary: Convert LoopInstSimplify to new PM. Unfortunately there is no exisiting unittest for this pass. Reviewers: davidxl, silvas Subscribers: silvas, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22280 llvm-svn: 275576
* code hoisting pass based on GVNSebastian Pop2016-07-151-0/+5
| | | | | | | | | | | | | This pass hoists duplicated computations in the program. The primary goal of gvn-hoist is to reduce the size of functions before inline heuristics to reduce the total cost of function inlining. Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki. Important algorithmic contributions by Daniel Berlin under the form of reviews. Differential Revision: http://reviews.llvm.org/D19338 llvm-svn: 275561
* [PM] Port Dead Loop Deletion Pass to the new PMJun Bum Lim2016-07-141-1/+1
| | | | | | | | | | | | Summary: Port Dead Loop Deletion Pass to the new pass manager. Reviewers: silvas, davide Subscribers: llvm-commits, sanjoy, mcrosier Differential Revision: https://reviews.llvm.org/D21483 llvm-svn: 275453
* Revert r275401, it caused PR28551.Nico Weber2016-07-141-5/+0
| | | | llvm-svn: 275420
* code hoisting pass based on GVNSebastian Pop2016-07-141-0/+5
| | | | | | | | | | | | | This pass hoists duplicated computations in the program. The primary goal of gvn-hoist is to reduce the size of functions before inline heuristics to reduce the total cost of function inlining. Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki. Important algorithmic contributions by Daniel Berlin under the form of reviews. Differential Revision: http://reviews.llvm.org/D19338 llvm-svn: 275401
* New pass manager for LICM.Dehao Chen2016-07-121-1/+1
| | | | | | | | | | | | Summary: Port LICM to the new pass manager. Reviewers: davidxl, silvas Subscribers: krasin, vitalybuka, silvas, davide, sanjoy, llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21772 llvm-svn: 275222
* [PM] Port LoopIdiomRecognize Pass to new PMDehao Chen2016-07-121-1/+1
| | | | | | | | | | | | Summary: Port LoopIdiomRecognize Pass to new PM Reviewers: davidxl Subscribers: davide, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D22250 llvm-svn: 275202
OpenPOWER on IntegriCloud