summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/IPO/InlineSimple.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Sink all InitializePasses.h includesReid Kleckner2019-11-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211
* [CallSite removal] move InlineCost to CallBase usageFedor Sergeev2019-04-231-3/+3
| | | | | | | | | | | Converting InlineCost interface and its internals into CallBase usage. Inliners themselves are still not converted. Reviewed By: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D60636 llvm-svn: 358982
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-011-1/+1
| | | | | | | | | | | | | | | | We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
* Remove redundant includes from lib/Transforms.Michael Zolotukhin2017-12-131-2/+0
| | | | llvm-svn: 320628
* [Inliner] Only compute fully inline cost when remarks are enabled.Davide Italiano2017-08-251-1/+11
| | | | | | | | Prior to this change (and after r311371), we computed it unconditionally, causin gsevere compile time regressions (in some cases, 5 to 10x). llvm-svn: 311804
* [InlineCost] Add cl::opt to allow full inline cost to be computed for ↵Haicheng Wu2017-08-211-1/+2
| | | | | | | | | | | | | | | debugging purposes. Currently, the inline cost model will bail once the inline cost exceeds the inline threshold in order to avoid unnecessary compile-time. However, when debugging it is useful to compute the full cost, so this command line option is added to override the default behavior. I took over this work from Chad Rosier (mcrosier@codeaurora.org). Differential Revision: https://reviews.llvm.org/D35850 llvm-svn: 311371
* Do not inline hot callsites for samplepgo in thinlto compile phase.Dehao Chen2017-03-211-2/+6
| | | | | | | | | | | | | | Summary: Because SamplePGO passes will be invoked twice in ThinLTO build: once at compile phase, the other at backend. We want to make sure the IR at the 2nd phase matches the hot part in profile, thus we do not want to inline hot callsites in the first phase. Reviewers: tejohnson, eraman Reviewed By: tejohnson Subscribers: mehdi_amini, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D31201 llvm-svn: 298428
* Improve PGO support for the new inlinerEaswaran Raman2017-01-201-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | This adds the following to the new PM based inliner in PGO mode: * Use block frequency analysis to derive callsite's profile count and use that to adjust thresholds of hot and cold callsites. * Incrementally update the BFI of the caller after a callee gets inlined into it. This incremental update is only within an invocation of the run method - BFI is not preserved across calls to run. Update the function entry count of the callee after inlining it into a caller. * I've tuned the thresholds for the hot and cold callsites using a hacked up version of the old inliner that explicitly computes BFI on a set of internal benchmarks and spec. Once the new PM based pipeline stabilizes (IIRC Chandler mentioned there are known issues) I'll benchmark this again and adjust the thresholds if required. Inliner PGO support. Differential revision: https://reviews.llvm.org/D28331 llvm-svn: 292666
* Apply clang-tidy's performance-unnecessary-value-param to LLVM.Benjamin Kramer2017-01-131-1/+1
| | | | | | | With some minor manual fixes for using function_ref instead of std::function. No functional change intended. llvm-svn: 291904
* [PM] Provide an initial, minimal port of the inliner to the new pass manager.Chandler Carruth2016-12-201-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This doesn't implement *every* feature of the existing inliner, but tries to implement the most important ones for building a functional optimization pipeline and beginning to sort out bugs, regressions, and other problems. Notable, but intentional omissions: - No alloca merging support. Why? Because it isn't clear we want to do this at all. Active discussion and investigation is going on to remove it, so for simplicity I omitted it. - No support for trying to iterate on "internally" devirtualized calls. Why? Because it adds what I suspect is inappropriate coupling for little or no benefit. We will have an outer iteration system that tracks devirtualization including that from function passes and iterates already. We should improve that rather than approximate it here. - Optimization remarks. Why? Purely to make the patch smaller, no other reason at all. The last one I'll probably work on almost immediately. But I wanted to skip it in the initial patch to try to focus the change as much as possible as there is already a lot of code moving around and both of these *could* be skipped without really disrupting the core logic. A summary of the different things happening here: 1) Adding the usual new PM class and rigging. 2) Fixing minor underlying assumptions in the inline cost analysis or inline logic that don't generally hold in the new PM world. 3) Adding the core pass logic which is in essence a loop over the calls in the nodes in the call graph. This is a bit duplicated from the old inliner, but only a handful of lines could realistically be shared. (I tried at first, and it really didn't help anything.) All told, this is only about 100 lines of code, and most of that is the mechanics of wiring up analyses from the new PM world. 4) Updating the LazyCallGraph (in the new PM) based on the *newly inlined* calls and references. This is very minimal because we cannot form cycles. 5) When inlining removes the last use of a function, eagerly nuking the body of the function so that any "one use remaining" inline cost heuristics are immediately refined, and queuing these functions to be completely deleted once inlining is complete and the call graph updated to reflect that they have become dead. 6) After all the inlining for a particular function, updating the LazyCallGraph and the CGSCC pass manager to reflect the function-local simplifications that are done immediately and internally by the inline utilties. These are the exact same fundamental set of CG updates done by arbitrary function passes. 7) Adding a bunch of test cases to specifically target CGSCC and other subtle aspects in the new PM world. Many thanks to the careful review from Easwaran and Sanjoy and others! Differential Revision: https://reviews.llvm.org/D24226 llvm-svn: 290161
* Revert @llvm.assume with operator bundles (r289755-r289757)Daniel Jasper2016-12-191-1/+7
| | | | | | | This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086
* Remove the AssumptionCacheHal Finkel2016-12-151-7/+1
| | | | | | | | | After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756
* Add a new method to create SimpleInliner instance and make pre-inliner use this.Easwaran Raman2016-08-111-0/+4
| | | | | | | | This adds a createFunctionInliningPass pass that takes an InlineParams object and use this to create the pre-inliner pass. This prevents the regular inliner's threshold flag from influencing the preinliner. Differential revision: https://reviews.llvm.org/D23377 llvm-svn: 278377
* Do not directly use inline threshold cl options in cost analysis.Easwaran Raman2016-08-101-14/+8
| | | | | | | | This adds an InlineParams struct which is populated from the command line options by getInlineParams and passed to getInlineCost for the call analyzer to use. Differential revision: https://reviews.llvm.org/D22120 llvm-svn: 278189
* [Inliner] clang-format various parts of the inliner prior to changesChandler Carruth2016-08-031-6/+6
| | | | | | here. NFC. llvm-svn: 277557
* Avoid using a raw AssumptionCacheTracker in various inliner functions.Sean Silva2016-07-231-1/+6
| | | | | | | | | | This unblocks the new PM part of River's patch in https://reviews.llvm.org/D22706 Conveniently, this same change was needed for D21921 and so these changes are just spun out from there. llvm-svn: 276515
* Use ProfileSummaryInfo in inline cost analysis.Easwaran Raman2016-06-091-1/+3
| | | | | | | | Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo. Differential revision: http://reviews.llvm.org/D21045 llvm-svn: 272321
* Make InlineSimple's one-arg constructor explicit. NFCJustin Lebar2016-03-291-1/+2
| | | | llvm-svn: 264744
* Reformat a comment in InlineSimple.cpp. NFCJustin Lebar2016-03-291-3/+3
| | | | llvm-svn: 264743
* Revert revisions 262636, 262643, 262679, and 262682.Easwaran Raman2016-03-081-2/+1
| | | | llvm-svn: 262883
* Infrastructure for PGO enhancements in inlinerEaswaran Raman2016-03-031-1/+2
| | | | | | | | | | | | This patch provides the following infrastructure for PGO enhancements in inliner: Enable the use of block level profile information in inliner Incremental update of block frequency information during inlining Update the function entry counts of callees when they get inlined into callers. Differential Revision: http://reviews.llvm.org/D16381 llvm-svn: 262636
* Refactor threshold computation for inline cost analysisEaswaran Raman2016-01-141-16/+10
| | | | | | Differential Revision: http://reviews.llvm.org/D15401 llvm-svn: 257832
* Refactor inline costs analysis by removing the InlineCostAnalysis classEaswaran Raman2015-12-281-8/+13
| | | | | | | | | | InlineCostAnalysis is an analysis pass without any need for it to be one. Once it stops being an analysis pass, it doesn't maintain any useful state and the member functions inside can be made free functions. NFC. Differential Revision: http://reviews.llvm.org/D15701 llvm-svn: 256521
* [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatibleChandler Carruth2015-09-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are *within* a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loop holes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167
* [PM] Split the AssumptionTracker immutable pass into two separate APIs:Chandler Carruth2015-01-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | a cache of assumptions for a single function, and an immutable pass that manages those caches. The motivation for this change is two fold. Immutable analyses are really hacks around the current pass manager design and don't exist in the new design. This is usually OK, but it requires that the core logic of an immutable pass be reasonably partitioned off from the pass logic. This change does precisely that. As a consequence it also paves the way for the *many* utility functions that deal in the assumptions to live in both pass manager worlds by creating an separate non-pass object with its own independent API that they all rely on. Now, the only bits of the system that deal with the actual pass mechanics are those that actually need to deal with the pass mechanics. Once this separation is made, several simplifications become pretty obvious in the assumption cache itself. Rather than using a set and callback value handles, it can just be a vector of weak value handles. The callers can easily skip the handles that are null, and eventually we can wrap all of this up behind a filter iterator. For now, this adds boiler plate to the various passes, but this kind of boiler plate will end up making it possible to port these passes to the new pass manager, and so it will end up factored away pretty reasonably. llvm-svn: 225131
* Add an Assumption-Tracking PassHal Finkel2014-09-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | This adds an immutable pass, AssumptionTracker, which keeps a cache of @llvm.assume call instructions within a module. It uses callback value handles to keep stale functions and intrinsics out of the map, and it relies on any code that creates new @llvm.assume calls to notify it of the new instructions. The benefit is that code needing to find @llvm.assume intrinsics can do so directly, without scanning the function, thus allowing the cost of @llvm.assume handling to be negligible when none are present. The current design is intended to be lightweight. We don't keep track of anything until we need a list of assumptions in some function. The first time this happens, we scan the function. After that, we add/remove @llvm.assume calls from the cache in response to registration calls and ValueHandle callbacks. There are no new direct test cases for this pass, but because it calls it validation function upon module finalization, we'll pick up detectable inconsistencies from the other tests that touch @llvm.assume calls. This pass will be used by follow-up commits that make use of @llvm.assume. llvm-svn: 217334
* Feed AA to the inliner and use AA->getModRefBehavior in AddAliasScopeMetadataHal Finkel2014-09-011-0/+2
| | | | | | | | | | | | This feeds AA through the IFI structure into the inliner so that AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which functions only access their arguments (instead of just hard-coding some knowledge of memory intrinsics). Most of the information is only available from BasicAA; this is important for preserving alias scoping information for target-specific intrinsics when doing the noalias parameter attribute to metadata conversion. llvm-svn: 216866
* [C++] Use 'nullptr'. Transforms edition.Craig Topper2014-04-251-2/+2
| | | | llvm-svn: 207196
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-1/+2
| | | | | | | | | | | | | | | | | definition below all of the header #include lines, lib/Transforms/... edition. This one is tricky for two reasons. We again have a couple of passes that define something else before the includes as well. I've sunk their name macros with the DEBUG_TYPE. Also, InstCombine contains headers that need DEBUG_TYPE, so now those headers #define and #undef DEBUG_TYPE around their code, leaving them well formed modular headers. Fixing these headers was a large motivation for all of these changes, as "leaky" macros of this form are hard on the modules implementation. llvm-svn: 206844
* Revive SizeOptLevel-explaining comments that were dropped in r203669Eli Bendersky2014-03-121-2/+2
| | | | llvm-svn: 203675
* Move duplicated code into a helper function (exposed through overload).Eli Bendersky2014-03-121-0/+17
| | | | | | | | | | | | | | | | | There's a bit of duplicated "magic" code in opt.cpp and Clang's CodeGen that computes the inliner threshold from opt level and size opt level. This patch moves the code to a function that lives alongside the inliner itself, providing a convenient overload to the inliner creation. A separate patch can be committed to Clang to use this once it's committed to LLVM. Standalone tools that use the inlining pass can also avoid duplicating this code and fearing it will go out of sync. Note: this patch also restructures the conditinal logic of the computation to be cleaner. llvm-svn: 203669
* [C++11] Add 'override' keyword to virtual methods that override their base ↵Craig Topper2014-03-051-3/+3
| | | | | | class. llvm-svn: 202953
* [Modules] Move CallSite into the IR library where it belogs. It isChandler Carruth2014-03-041-1/+1
| | | | | | | abstracting between a CallInst and an InvokeInst, both of which are IR concepts. llvm-svn: 202816
* [PM] Split the CallGraph out from the ModulePass which creates theChandler Carruth2013-11-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CallGraph. This makes the CallGraph a totally generic analysis object that is the container for the graph data structure and the primary interface for querying and manipulating it. The pass logic is separated into its own class. For compatibility reasons, the pass provides wrapper methods for most of the methods on CallGraph -- they all just forward. This will allow the new pass manager infrastructure to provide its own analysis pass that constructs the same CallGraph object and makes it available. The idea is that in the new pass manager, the analysis pass's 'run' method returns a concrete analysis 'result'. Here, that result is a 'CallGraph'. The 'run' method will typically do only minimal work, deferring much of the work into the implementation of the result object in order to be lazy about computing things, but when (like DomTree) there is *some* up-front computation, the analysis does it prior to handing the result back to the querying pass. I know some of this is fairly ugly. I'm happy to change it around if folks can suggest a cleaner interim state, but there is going to be some amount of unavoidable ugliness during the transition period. The good thing is that this is very limited and will naturally go away when the old pass infrastructure goes away. It won't hang around to bother us later. Next up is the initial new-PM-style call graph analysis. =] llvm-svn: 195722
* Spell "Actual" correctlyDavid Majnemer2013-11-031-1/+1
| | | | llvm-svn: 193954
* Merge CallGraph and BasicCallGraph.Rafael Espindola2013-10-311-1/+1
| | | | llvm-svn: 193734
* Make the inline cost a proper analysis pass. This remains essentiallyChandler Carruth2013-01-211-11/+14
| | | | | | | | | | | | | | | | a dynamic analysis done on each call to the routine. However, now it can use the standard pass infrastructure to reference other analyses, instead of a silly setter method. This will become more interesting as I teach it about more analysis passes. This updates the two inliner passes to use the inline cost analysis. Doing so highlights how utterly redundant these two passes are. Either we should find a cheaper way to do always inlining, or we should merge the two and just fiddle with the thresholds to get the desired behavior. I'm leaning increasingly toward the latter as it would also remove the Inliner sub-class split. llvm-svn: 173030
* Clean up the formatting and doxygen for the simple inliner a bit. NoChandler Carruth2013-01-211-18/+29
| | | | | | functionality changed. llvm-svn: 173028
* Move all of the header files which are involved in modelling the LLVM IRChandler Carruth2013-01-021-6/+6
| | | | | | | | | | | | | | | | | | | | | into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
* Add 'using' declarations to suppress -Woverloaded-virtual warnings.Matt Beaumont-Gay2012-12-041-0/+1
| | | | llvm-svn: 169214
* Use the new script to sort the includes of every file under lib.Chandler Carruth2012-12-031-5/+5
| | | | | | | | | | | | | | | | | Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
* Move TargetData to DataLayout.Micah Villmow2012-10-081-2/+2
| | | | llvm-svn: 165402
* Remove a bunch of empty, dead, and no-op methods from all of theseChandler Carruth2012-03-311-9/+0
| | | | | | | | | | interfaces. These methods were used in the old inline cost system where there was a persistent cache that had to be updated, invalidated, and cleared. We're now doing more direct computations that don't require this intricate dance. Even if we resume some level of caching, it would almost certainly have a simpler and more narrow interface than this. llvm-svn: 153813
* Initial commit for the rewrite of the inline cost analysis to operateChandler Carruth2012-03-311-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | on a per-callsite walk of the called function's instructions, in breadth-first order over the potentially reachable set of basic blocks. This is a major shift in how inline cost analysis works to improve the accuracy and rationality of inlining decisions. A brief outline of the algorithm this moves to: - Build a simplification mapping based on the callsite arguments to the function arguments. - Push the entry block onto a worklist of potentially-live basic blocks. - Pop the first block off of the *front* of the worklist (for breadth-first ordering) and walk its instructions using a custom InstVisitor. - For each instruction's operands, re-map them based on the simplification mappings available for the given callsite. - Compute any simplification possible of the instruction after re-mapping, and store that back int othe simplification mapping. - Compute any bonuses, costs, or other impacts of the instruction on the cost metric. - When the terminator is reached, replace any conditional value in the terminator with any simplifications from the mapping we have, and add any successors which are not proven to be dead from these simplifications to the worklist. - Pop the next block off of the front of the worklist, and repeat. - As soon as the cost of inlining exceeds the threshold for the callsite, stop analyzing the function in order to bound cost. The primary goal of this algorithm is to perfectly handle dead code paths. We do not want any code in trivially dead code paths to impact inlining decisions. The previous metric was *extremely* flawed here, and would always subtract the average cost of two successors of a conditional branch when it was proven to become an unconditional branch at the callsite. There was no handling of wildly different costs between the two successors, which would cause inlining when the path actually taken was too large, and no inlining when the path actually taken was trivially simple. There was also no handling of the code *path*, only the immediate successors. These problems vanish completely now. See the added regression tests for the shiny new features -- we skip recursive function calls, SROA-killing instructions, and high cost complex CFG structures when dead at the callsite being analyzed. Switching to this algorithm required refactoring the inline cost interface to accept the actual threshold rather than simply returning a single cost. The resulting interface is pretty bad, and I'm planning to do lots of interface cleanup after this patch. Several other refactorings fell out of this, but I've tried to minimize them for this patch. =/ There is still more cleanup that can be done here. Please point out anything that you see in review. I've worked really hard to try to mirror at least the spirit of all of the previous heuristics in the new model. It's not clear that they are all correct any more, but I wanted to minimize the change in this single patch, it's already a bit ridiculous. One heuristic that is *not* yet mirrored is to allow inlining of functions with a dynamic alloca *if* the caller has a dynamic alloca. I will add this back, but I think the most reasonable way requires changes to the inliner itself rather than just the cost metric, and so I've deferred this for a subsequent patch. The test case is XFAIL-ed until then. As mentioned in the review mail, this seems to make Clang run about 1% to 2% faster in -O0, but makes its binary size grow by just under 4%. I've looked into the 4% growth, and it can be fixed, but requires changes to other parts of the inliner. llvm-svn: 153812
* Rip out support for 'llvm.noinline'. This thing has a strange history...Chandler Carruth2012-03-161-45/+0
| | | | | | | | | | | | | | | | | | | | | It was added in 2007 as the first cut at supporting no-inline attributes, but we didn't have function attributes of any form at the time. However, it was added without any mention in the LangRef or other documentation. Later on, in 2008, Devang added function notes for 'inline=never' and then turned them into proper function attributes. From that point onward, as far as I can tell, the world moved on, and no one has touched 'llvm.noinline' in any meaningful way since. It's time has now come. We have had better mechanisms for doing this for a long time, all the frontends I'm aware of use them, and this is just holding back progress. Given that it was never a documented feature of the IR, I've provided no auto-upgrade support. If people know of real, in-the-wild bitcode that relies on this, yell at me and I'll add it, but I *seriously* doubt anyone cares. llvm-svn: 152904
* Start removing the use of an ad-hoc 'never inline' set and insteadChandler Carruth2012-03-161-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | directly query the function information which this set was representing. This simplifies the interface of the inline cost analysis, and makes the always-inline pass significantly more efficient. Previously, always-inline would first make a single set of every function in the module *except* those marked with the always-inline attribute. It would then query this set at every call site to see if the function was a member of the set, and if so, refuse to inline it. This is quite wasteful. Instead, simply check the function attribute directly when looking at the callsite. The normal inliner also had similar redundancy. It added every function in the module with the noinline attribute to its set to ignore, even though inside the cost analysis function we *already tested* the noinline attribute and produced the same result. The only tricky part of removing this is that we have to be able to correctly remove only the functions inlined by the always-inline pass when finalizing, which requires a bit of a hack. Still, much less of a hack than the set of all non-always-inline functions was. While I was touching this function, I switched a heavy-weight set to a vector with sort+unique. The algorithm already had a two-phase insert and removal pattern, we were just needlessly paying the uniquing cost on every insert. This probably speeds up some compiles by a small amount (-O0 compiles with lots of always-inline, so potentially heavy libc++ users), but I've not tried to measure it. I believe there is no functional change here, but yell if you spot one. None are intended. Finally, the direction this is going in is to greatly simplify the inline cost query interface so that we can replace its implementation with a much more clever one. Along the way, all the APIs get simplified, so it seems incrementally good. llvm-svn: 152903
* Add comment.Chad Rosier2012-02-251-1/+2
| | | | llvm-svn: 151431
* Add support for disabling llvm.lifetime intrinsics in the AlwaysInliner. TheseChad Rosier2012-02-251-1/+1
| | | | | | | | are optimization hints, but at -O0 we're not optimizing. This becomes a problem when the alwaysinline attribute is abused. rdar://10921594 llvm-svn: 151429
* Inlining and unrolling heuristics should be aware of free truncs.Andrew Trick2011-10-011-0/+2
| | | | | | | | | | We want heuristics to be based on accurate data, but more importantly we don't want llvm to behave randomly. A benign trunc inserted by an upstream pass should not cause a wild swings in optimization level. See PR11034. It's a general problem with threshold-based heuristics, but we can make it less bad. llvm-svn: 140919
OpenPOWER on IntegriCloud