summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Scalar/Scalar.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [BasicBlockUtils] Add utility to remove redundant dbg.value instrsBjorn Pettersson2019-12-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a RemoveRedundantDbgInstrs to BasicBlockUtils with the goal to remove redundant dbg intrinsics from a basic block. This can be useful after various transforms, as it might be simpler to do a filtering of dbg intrinsics after the transform than during the transform. One primary use case would be to replace a too aggressive removal done by MergeBlockIntoPredecessor, seen at loop rotate (not done in this patch). The elimination algorithm currently focuses on dbg.value intrinsics and is doing two iterations over the BB. First we iterate backward starting at the last instruction in the BB. Whenever a consecutive sequence of dbg.value instructions are found we keep the last dbg.value for each variable found (variable fragments are identified using the {DILocalVariable, FragmentInfo, inlinedAt} triple as given by the DebugVariable helper class). Next we iterate forward starting at the first instruction in the BB. Whenever we find a dbg.value describing a DebugVariable (identified by {DILocalVariable, inlinedAt}) we save the {DIValue, DIExpression} that describes that variables value. But if the variable already was mapped to the same {DIValue, DIExpression} pair we instead drop the second dbg.value. To ease the process of making lit tests for this utility a new pass is introduced called RedundantDbgInstElimination. It can be executed by opt using -redundant-dbg-inst-elim. Reviewers: aprantl, jmorse, vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71478
* [Matrix] Add first set of matrix intrinsics and initial lowering pass.Florian Hahn2019-12-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first patch adding an initial set of matrix intrinsics and a corresponding lowering pass. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html The first patch introduces four new intrinsics (transpose, multiply, columnwise load and store) and a LowerMatrixIntrinsics pass, that lowers those intrinsics to vector operations. Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix embedded in a <16 x float> vector) and the intrinsics take the dimension information as parameters. Those parameters need to be ConstantInt. For the memory layout, we initially assume column-major, but in the RFC we also described how to extend the intrinsics to support row-major as well. For the initial lowering, we split the input of the intrinsics into a set of column vectors, transform those column vectors and concatenate the result columns to a flat result vector. This allows us to lower the intrinsics without any shape propagation, as mentioned in the RFC. In follow-up patches, we plan to submit the following improvements: * Shape propagation to eliminate the embedding/splitting for each intrinsic. * Fused & tiled lowering of multiply and other operations. * Optimization remarks highlighting matrix expressions and costs. * Generate loops for operations on large matrixes. * More general block processing for operation on large vectors, exploiting shape information. We would like to add dedicated transpose, columnwise load and store intrinsics, even though they are not strictly necessary. For example, we could instead emit a large shufflevector instruction instead of the transpose. But we expect that to (1) become unwieldy for larger matrixes (even for 16x16 matrixes, the resulting shufflevector masks would be huge), (2) risk instcombine making small changes, causing us to fail to detect the transpose, preventing better lowerings For the load/store, we are additionally planning on exploiting the intrinsics for better alias analysis. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70456
* Reapply r374743 with a fix for the ocaml bindingJoerg Sonnenberger2019-10-141-0/+5
| | | | | | | | | | | | | | | | | | | Add a pass to lower is.constant and objectsize intrinsics This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374784
* Revert "Add a pass to lower is.constant and objectsize intrinsics"Dmitri Gribenko2019-10-141-5/+0
| | | | | | | This reverts commit r374743. It broke the build with Ocaml enabled: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218 llvm-svn: 374768
* Add a pass to lower is.constant and objectsize intrinsicsJoerg Sonnenberger2019-10-131-0/+5
| | | | | | | | | | | | | | | | | This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374743
* [LLVM-C][Ocaml] Add MergeFunctions and DCE passAditya Kumar2019-09-291-0/+4
| | | | | | | | | | | | | | | | | | | MergeFunctions and DCE pass are missing from OCaml/C-api. This patch adds them. Differential Revision: https://reviews.llvm.org/D65071 Reviewers: whitequark, hiraditya, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Tags: #llvm Authored by: kren1 llvm-svn: 373170
* Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of ↵Dmitri Gribenko2019-09-101-1/+0
| | | | | | | | | CodeGen into opt pipeline."" This reverts commit r371502, it broke tests (clang/test/CodeGenCXX/auto-var-init.cpp). llvm-svn: 371507
* Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into ↵Clement Courbet2019-09-101-0/+1
| | | | | | | | opt pipeline." With a fix for sanitizer breakage (see explanation in D60318). llvm-svn: 371502
* Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into ↵Clement Courbet2019-06-261-1/+0
| | | | | | | | | | | | | | opt pipeline." Breaks sanitizers: libFuzzer :: cxxstring.test libFuzzer :: memcmp.test libFuzzer :: recommended-dictionary.test libFuzzer :: strcmp.test libFuzzer :: value-profile-mem.test libFuzzer :: value-profile-strcmp.test llvm-svn: 364416
* [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.Clement Courbet2019-06-261-0/+1
| | | | | | | | | This allows later passes (in particular InstCombine) to optimize more cases. One that's important to us is `memcmp(p, q, constant) < 0` and memcmp(p, q, constant) > 0. llvm-svn: 364412
* [MergeICmps] Make the pass compatible with the new pass manager.Clement Courbet2019-05-231-1/+1
| | | | | | | | | | | | Reviewers: gchatelet, spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62287 llvm-svn: 361490
* Add basic loop fusion pass.Kit Barton2019-04-171-0/+1
| | | | | | | | | | | | | | | | | | | | This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Differential Revision: https://reviews.llvm.org/D55851 llvm-svn: 358607
* Revert "Add basic loop fusion pass." Per request.Eric Christopher2019-04-171-1/+0
| | | | | | This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358553
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-171-0/+1
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-171-1/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* Add basic loop fusion pass.Kit Barton2019-04-171-0/+1
| | | | | | | | | | | | | | | | | | | This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Phabricator: https://reviews.llvm.org/D55851 llvm-svn: 358543
* Default lowering for experimental.widenable.conditionMax Kazantsev2019-01-311-0/+1
| | | | | | | | | | | Introduces a pass that provides default lowering strategy for the `experimental.widenable.condition` intrinsic, replacing all its uses with `i1 true`. Differential Revision: https://reviews.llvm.org/D56096 Reviewed By: reames llvm-svn: 352739
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.Michael Kruse2018-12-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g. #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944
* Introduce llvm.experimental.widenable_condition intrinsicMax Kazantsev2018-12-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new instinsic `@llvm.experimental.widenable_condition` that allows explicit representation for guards. It is an alternative to using `@llvm.experimental.guard` intrinsic that does not contain implicit control flow. We keep finding places where `@llvm.experimental.guard` is not supported or treated too conservatively, and there are 2 reasons to that: - `@llvm.experimental.guard` has memory write side effect to model implicit control flow, and this sometimes confuses passes and analyzes that work with memory; - Not all passes and analysis are aware of the semantics of guards. These passes treat them as regular throwing call and have no idea that the condition of guard may be used to prove something. One well-known place which had caused us troubles in the past is explicit loop iteration count calculation in SCEV. Another example is new loop unswitching which is not aware of guards. Whenever a new pass appears, we potentially have this problem there. Rather than go and fix all these places (and commit to keep track of them and add support in future), it seems more reasonable to leverage the existing optimizer's logic as much as possible. The only significant difference between guards and regular explicit branches is that guard's condition can be widened. It means that a guard contains (explicitly or implicitly) a `deopt` block successor, and it is always legal to go there no matter what the guard condition is. The other successor is a guarded block, and it is only legal to go there if the condition is true. This patch introduces a new explicit form of guards alternative to `@llvm.experimental.guard` intrinsic. Now a widenable guard can be represented in the CFG explicitly like this: %widenable_condition = call i1 @llvm.experimental.widenable.condition() %new_condition = and i1 %cond, %widenable_condition br i1 %new_condition, label %guarded, label %deopt guarded: ; Guarded instructions deopt: call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ] The new intrinsic `@llvm.experimental.widenable.condition` has semantics of an `undef`, but the intrinsic prevents the optimizer from folding it early. This form should exploit all optimization boons provided to `br` instuction, and it still can be widened by replacing the result of `@llvm.experimental.widenable.condition()` with `and` with any arbitrary boolean value (as long as the branch that is taken when it is `false` has a deopt and has no side-effects). For more motivation, please check llvm-dev discussion "[llvm-dev] Giving up using implicit control flow in guards". This patch introduces this new intrinsic with respective LangRef changes and a pass that converts old-style guards (expressed as intrinsics) into the new form. The naming discussion is still ungoing. Merging this to unblock further items. We can later change the name of this intrinsic. Reviewed By: reames, fedor.sergeev, sanjoy Differential Revision: https://reviews.llvm.org/D51207 llvm-svn: 348593
* [PM] Port Scalarizer to the new pass manager.Mikael Holmen2018-11-211-1/+2
| | | | | | | | | | | | | | Patch by: markus (Markus Lavin) Reviewers: chandlerc, fedor.sergeev Reviewed By: fedor.sergeev Subscribers: llvm-commits, Ka-Ka, bjope Differential Revision: https://reviews.llvm.org/D54695 llvm-svn: 347392
* [LLVM-C][OCaml] Add UnifyFunctionExitNodes pass to C and OCaml APIswhitequark2018-09-181-0/+5
| | | | | | | | | | | | | | | | Summary: Adds LLVMAddUnifyFunctionExitNodesPass to expose createUnifyFunctionExitNodesPass to the C and OCaml APIs. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52212 llvm-svn: 342476
* [LLVM-C][OCaml] Add LowerAtomic pass to C and OCaml APIswhitequark2018-09-181-0/+4
| | | | | | | | | | | | | | | | Summary: Adds LLVMAddLowerAtomicPass to expose createLowerAtomicPass in the C and OCaml APIs. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52211 llvm-svn: 342475
* [UnrollAndJam] New Unroll and Jam passDavid Green2018-07-011-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder Loop So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 336062
* [instsimplify] Move the instsimplify pass to use more obvious file namesChandler Carruth2018-06-291-0/+1
| | | | | | | | | | | | | | | | and diretory. Also cleans up all the associated naming to be consistent and removes the public access to the pass ID which was unused in LLVM. Also runs clang-format over parts that changed, which generally cleans up a bunch of formatting. This is in preparation for doing some internal cleanups to the pass. Differential Revision: https://reviews.llvm.org/D47352 llvm-svn: 336028
* Revert 333358 as it's failing on some builders.David Green2018-05-271-5/+0
| | | | | | I'm guessing the tests reply on the ARM backend being built. llvm-svn: 333359
* [UnrollAndJam] Add a new Unroll and Jam passDavid Green2018-05-271-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now-jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 333358
* Restore the LoopInstSimplify pass, reverting r327329 that removed it.Chandler Carruth2018-05-251-0/+1
| | | | | | | | | | | | | | | The plan had always been to move towards using this rather than so much in-pass simplification within the loop pipeline, but we never got around to it.... until only a couple months after it was removed due to disuse. =/ This commit is just a pure revert of the removal. I will add tests and do some basic cleanup in follow-up commits. Then I'll wire it into the loop pass pipeline. Differential Revision: https://reviews.llvm.org/D47353 llvm-svn: 333250
* [LoopGuardWidening] Split out a loop pass version of GuardWideningPhilip Reames2018-04-271-0/+1
| | | | | | | | The idea is to have a pass which performs the same transformation as GuardWidening, but can be run within a loop pass manager without disrupting the pass manager structure. As demonstrated by the test case, this doesn't quite get there because of issues with post dom, but it gives a good step in the right direction. the motivation is purely to reduce compile time since we can now preserve locality during the loop walk. This patch only includes a legacy pass. A follow up will add a new style pass as well. llvm-svn: 331060
* Fix some layering in AggressiveInstCombine (avoiding inclusion of Scalar.h)David Blaikie2018-04-241-4/+0
| | | | llvm-svn: 330726
* InstCombine: Fix layering by not including Scalar.h in InstCombineDavid Blaikie2018-04-241-4/+0
| | | | | | | | (notionally Scalar.h is part of libLLVMScalarOpts, so it shouldn't be included by InstCombine which doesn't/shouldn't need to depend on ScalarOpts) llvm-svn: 330669
* [AggressiveInstCombine] Add aggressive inst combiner to the LLVM C API.Craig Topper2018-04-241-0/+4
| | | | | | I just tried to copy what was done for regular InstCombine. Hopefully I didn't miss anything. llvm-svn: 330668
* Oops - moved slightly too many things from Scalar to Utils. Move ↵David Blaikie2018-03-281-0/+4
| | | | | | LoopSimplifyCFG things back llvm-svn: 328720
* Transforms: Introduce Transforms/Utils.h rather than spreading the ↵David Blaikie2018-03-281-12/+0
| | | | | | | | | declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. llvm-svn: 328717
* Finish moving the IPSCCP pass from Scalar to IPO - moving the registrationDavid Blaikie2018-03-221-1/+0
| | | | llvm-svn: 328259
* [New PM][IRCE] port of Inductive Range Check Elimination pass to the new ↵Fedor Sergeev2018-03-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | pass manager There are two nontrivial details here: * Loop structure update interface is quite different with new pass manager, so the code to add new loops was factored out * BranchProbabilityInfo is not a loop analysis, so it can not be just getResult'ed from within the loop pass. It cant even be queried through getCachedResult as LoopCanonicalization sequence (e.g. LoopSimplify) might invalidate BPI results. Complete solution for BPI will likely take some time to discuss and figure out, so for now this was partially solved by making BPI optional in IRCE (skipping a couple of profitability checks if it is absent). Most of the IRCE tests got their corresponding new-pass-manager variant enabled. Only two of them depend on BPI, both marked with TODO, to be turned on when BPI starts being available for loop passes. Reviewers: chandlerc, mkazantsev, sanjoy, asbirlea Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43795 llvm-svn: 327619
* Remove the LoopInstSimplify pass (-loop-instsimplify)Vedant Kumar2018-03-121-1/+0
| | | | | | | | | | | | LoopInstSimplify is unused and untested. Reading through the commit history the pass also seems to have a high maintenance burden. It would be best to retire the pass for now. It should be easy to recover if we need something similar in the future. Differential Revision: https://reviews.llvm.org/D44053 llvm-svn: 327329
* [PM] port Rewrite Statepoints For GC to the new pass manager.Fedor Sergeev2017-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The port is nearly straightforward. The only complication is related to the analyses handling, since one of the analyses used in this module pass is domtree, which is a function analysis. That requires asking for the results of each function and disallows a single interface for run-on-module pass action. Decided to copy-paste the main body of this pass. Most of its code is requesting analyses anyway, so not that much of a copy-paste. The rest of the code movement is to transform all the implementation helper functions like stripNonValidData into non-member statics. Extended all the related LLVM tests with new-pass-manager use. No failures. Reviewers: sanjoy, anna, reames Reviewed By: anna Subscribers: skatkov, llvm-commits Differential Revision: https://reviews.llvm.org/D41162 llvm-svn: 320796
* Rename CountingFunctionInserter and use for both mcount and cygprofile ↵Hans Wennborg2017-11-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | calls, before and after inlining Clang implements the -finstrument-functions flag inherited from GCC, which inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit. This is useful for getting a trace of how the functions in a program are executed. Normally, the calls remain even if a function is inlined into another function, but it is useful to be able to turn this off for users who are interested in a lower-level trace, i.e. one that reflects what functions are called post-inlining. (We use this to generate link order files for Chromium.) LLVM already has a pass for inserting similar instrumentation calls to mcount(), which it does after inlining. This patch renames and extends that pass to handle calls both to mcount and the cygprofile functions, before and/or after inlining as controlled by function attributes. Differential Revision: https://reviews.llvm.org/D39287 llvm-svn: 318195
* Recommit r317351 : Add CallSiteSplitting passJun Bum Lim2017-11-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This recommit r317351 after fixing a buildbot failure. Original commit message: Summary: This change add a pass which tries to split a call-site to pass more constrained arguments if its argument is predicated in the control flow so that we can expose better context to the later passes (e.g, inliner, jump threading, or IPA-CP based function cloning, etc.). As of now we support two cases : 1) If a call site is dominated by an OR condition and if any of its arguments are predicated on this OR condition, try to split the condition with more constrained arguments. For example, in the code below, we try to split the call site since we can predicate the argument (ptr) based on the OR condition. Split from : if (!ptr || c) callee(ptr); to : if (!ptr) callee(null ptr) // set the known constant value else if (c) callee(nonnull ptr) // set non-null attribute in the argument 2) We can also split a call-site based on constant incoming values of a PHI For example, from : BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2, label %BB1 BB1: br label %BB2 BB2: %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ] call void @bar(i32 %p) to BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2-split0, label %BB1 BB1: br label %BB2-split1 BB2-split0: call void @bar(i32 0) br label %BB2 BB2-split1: call void @bar(i32 1) br label %BB2 BB2: %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ] llvm-svn: 317362
* Revert "Add CallSiteSplitting pass"Jun Bum Lim2017-11-031-1/+0
| | | | | | | | Revert due to Buildbot failure. This reverts commit r317351. llvm-svn: 317353
* Add CallSiteSplitting passJun Bum Lim2017-11-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change add a pass which tries to split a call-site to pass more constrained arguments if its argument is predicated in the control flow so that we can expose better context to the later passes (e.g, inliner, jump threading, or IPA-CP based function cloning, etc.). As of now we support two cases : 1) If a call site is dominated by an OR condition and if any of its arguments are predicated on this OR condition, try to split the condition with more constrained arguments. For example, in the code below, we try to split the call site since we can predicate the argument (ptr) based on the OR condition. Split from : if (!ptr || c) callee(ptr); to : if (!ptr) callee(null ptr) // set the known constant value else if (c) callee(nonnull ptr) // set non-null attribute in the argument 2) We can also split a call-site based on constant incoming values of a PHI For example, from : BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2, label %BB1 BB1: br label %BB2 BB2: %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ] call void @bar(i32 %p) to BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2-split0, label %BB1 BB1: br label %BB2-split1 BB2-split0: call void @bar(i32 0) br label %BB2 BB2-split1: call void @bar(i32 1) br label %BB2 BB2: %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ] Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide Reviewed By: davidxl Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D39137 llvm-svn: 317351
* Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."Clement Courbet2017-11-021-1/+0
| | | | | | | | | undefined reference to `llvm::TargetPassConfig::ID' on clang-ppc64le-linux-multistage This reverts commit eea333c33fa73ad225ef28607795984829f65688. llvm-svn: 317213
* [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.Clement Courbet2017-11-021-0/+1
| | | | | | | | | | | | | | | | | Summary: This is mostly a noop (most of the test diffs are renamed blocks). There are a few temporary register renames (eax<->ecx) and a few blocks are shuffled around. See the discussion in PR33325 for more details. Reviewers: spatel Subscribers: mgorny Differential Revision: https://reviews.llvm.org/D39456 llvm-svn: 317211
* [SimplifyCFG] use pass options and remove the latesimplifycfg passSanjay Patel2017-10-281-6/+1
| | | | | | | | | | | | | | | | | This is no-functional-change-intended. This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and D35411 (defer folding unconditional branches) with pass parameters rather than a named "latesimplifycfg" pass. Now that we have individual options to control the functionality, we could decouple when these fire (but that's an independent patch if desired). The next planned step would be to add another option bit to disable the sinking transform mentioned in D38566. This should also make it clear that the new pass manager needs to be updated to limit simplifycfg in the same way as the old pass manager. Differential Revision: https://reviews.llvm.org/D38631 llvm-svn: 316835
* [DivRempairs] add a pass to optimize div/rem pairs (PR31028)Sanjay Patel2017-09-091-0/+1
| | | | | | | | | | | | | | | | | | This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented as an independent pass, so there's no stretching of scope and feature creep for an existing pass. I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost this same functionality as an addition to CGP in the motivating example of PR31028: https://bugs.llvm.org/show_bug.cgi?id=31028 The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and undo the hoisting that is done here. Decomposing remainder may allow removing some code from the backend (PPC and possibly others). Differential Revision: https://reviews.llvm.org/D37121 llvm-svn: 312862
* Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that ↵Clement Courbet2017-09-011-0/+1
| | | | | | | | | | turns chains of integer Add missing header. This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf. llvm-svn: 312322
* Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains ↵Clement Courbet2017-09-011-1/+0
| | | | | | | | | | of integer" Break build This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b. llvm-svn: 312317
* [MergeICmps] MergeICmps is a new optimization pass that turns chains of integerClement Courbet2017-09-011-0/+1
| | | | | | | | | | | | | | | | | comparisons into memcmp. Thanks to recent improvements in the LLVM codegen, the memcmp is typically inlined as a chain of efficient hardware comparisons. This typically benefits C++ member or nonmember operator==(). For now this is disabled by default until: - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete - Benchmarks show that this is always useful. Differential Revision: https://reviews.llvm.org/D33987 llvm-svn: 312315
* Remove the LoadCombine pass. It was never enabled and is unsupported.Eric Christopher2017-06-221-1/+0
| | | | | | Based on discussions with the author on mailing lists. llvm-svn: 306067
OpenPOWER on IntegriCloud