summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis/CGSCCPassManager.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Revert "[CallGraph] Refine call graph for indirect calls with !callees metadata"Benjamin Kramer2019-08-161-2/+1
| | | | | | | This reverts commit r369025. Crashes clang, test case is on the mailing list. llvm-svn: 369096
* [CallGraph] Refine call graph for indirect calls with !callees metadataMark Lacey2019-08-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For indirect call sites having a small set of possible callees, !callees metadata can be used to indicate what those callees are. This patch updates the call graph and lazy call graph analyses so that they consider this metadata when encountering call sites. For the call graph, it adds a new external call graph node to the graph for each unique !callees metadata node. A call graph edge connects an indirect call site with the external node associated with the !callees metadata that is attached to it. And there is an edge from this external node to each of the callees indicated by the metadata. Similarly, for the lazy call graph, the patch adds Ref edges from a caller to the possible callees indicated by the metadata. The primary purpose of the patch is to facilitate iterating over the functions in a module such that all of the callees indicated by a given !callees metadata node will be visited prior to the functions containing call sites annotated by that node. This property is required by optimizations performing a bottom-up traversal of the SCC DAG. For example, the inliner can be made to inline through an indirect call. If the call site is annotated with !callees metadata, this patch ensures that the inliner will have visited all of the callees prior to the caller, allowing it to reliably compute the cost of inlining one or more of the potential callees. Original patch by @mssimpso. I've made some small changes to get it to apply, build, and pass tests on the top of tree, as well as some minor tweaks to formatting and functionality. Subscribers: mehdi_amini, hiraditya, llvm-commits, mssimpso Tags: #llvm Differential Revision: https://reviews.llvm.org/D39339 llvm-svn: 369025
* [NewPM] Fix a nasty bug with analysis invalidation in the new PM.Chandler Carruth2019-03-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The issue here is that we actually allow CGSCC passes to mutate IR (and therefore invalidate analyses) outside of the current SCC. At a minimum, we need to support mutating parent and ancestor SCCs to support the ArgumentPromotion pass which rewrites all calls to a function. However, the analysis invalidation infrastructure is heavily based around not needing to invalidate the same IR-unit at multiple levels. With Loop passes for example, they don't invalidate other Loops. So we need to customize how we handle CGSCC invalidation. Doing this without gratuitously re-running analyses is even harder. I've avoided most of these by using an out-of-band preserved set to accumulate the cross-SCC invalidation, but it still isn't perfect in the case of re-visiting the same SCC repeatedly *but* it coming off the worklist. Unclear how important this use case really is, but I wanted to call it out. Another wrinkle is that in order for this to successfully propagate to function analyses, we have to make sure we have a proxy from the SCC to the Function level. That requires pre-creating the necessary proxy. The motivating test case now works cleanly and is added for ArgumentPromotion. Thanks for the review from Philip and Wei! Differential Revision: https://reviews.llvm.org/D59869 llvm-svn: 357137
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [NewPM] fixing asserts on deleted loop in -print-after-allFedor Sergeev2018-12-111-1/+4
| | | | | | | | | | | IR-printing AfterPass instrumentation might be called on a loop that has just been invalidated. We should skip printing it to avoid spurious asserts. Reviewed By: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D54740 llvm-svn: 348887
* [New PM] Introducing PassInstrumentation frameworkFedor Sergeev2018-09-201-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies on different IRUnits (e.g. Analyses). * PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Made getName helper to return std::string (instead of StringRef initially) to fix asan builtbot failures on CGSCC tests. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342664
* Temporarily Revert "[New PM] Introducing PassInstrumentation framework"Eric Christopher2018-09-201-12/+0
| | | | | | | | as it was causing failures in the asan buildbot. This reverts commit r342597. llvm-svn: 342616
* [New PM] Introducing PassInstrumentation frameworkFedor Sergeev2018-09-191-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies on different IRUnits (e.g. Analyses). * PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342597
* Revert rL342544: [New PM] Introducing PassInstrumentation frameworkFedor Sergeev2018-09-191-12/+0
| | | | | | A bunch of bots fail to compile unittests. Reverting. llvm-svn: 342552
* [New PM] Introducing PassInstrumentation frameworkFedor Sergeev2018-09-191-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies on different IRUnits (e.g. Analyses). * PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342544
* Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-141-17/+18
| | | | | | | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
* Fixed spelling mistake in comments of LLVM Analysis passesVedant Kumar2018-02-281-4/+4
| | | | | | | | Patch by Reshabh Sharma! Differential Revision: https://reviews.llvm.org/D43861 llvm-svn: 326352
* Use a BumpPtrAllocator for Loop objectsSanjoy Das2017-09-281-1/+1
| | | | | | | | | | | | | | | Summary: And now that we no longer have to explicitly free() the Loop instances, we can (with more ease) use the destructor of LoopBase to do what LoopBase::clear() was doing. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38201 llvm-svn: 314375
* [PM/CGSCC] Teach the CGSCC pass manager components to gracefully handleChandler Carruth2017-09-141-1/+6
| | | | | | | | | | | | | | | | | | | | | invalidated SCCs even when we do not have an updated SCC to redirect towards. This comes up in a fairly subtle and surprising circumstance: we need to have a connected but internal node in the call graph which later becomes a disconnected island, and then gets deleted. All of this needs to happen mid-CGSCC walk. Because it is disconnected, we have no way of computing a new "current" SCC when it gets deleted. Instead, we need to explicitly check for a deleted "current" SCC and bail out of the current CGSCC step. This will bubble all the way up to the post-order walk and then resume correctly. I've included minimal tests for this bug. The specific behavior matches something we've seen in the wild with the new PM combined with ThinLTO and sample PGO, but I've not yet confirmed whether this is the only issue there. llvm-svn: 313242
* [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use ↵Eugene Zelenko2017-08-311-16/+31
| | | | | | warnings; other minor fixes. Also affected in files (NFC). llvm-svn: 312289
* [PM] Switch the CGSCC debug messages to use the standard LLVM debugChandler Carruth2017-08-111-29/+22
| | | | | | | | | | | | | | | | printing techniques with a DEBUG_TYPE controlling them. It was a mistake to start re-purposing the pass manager `DebugLogging` variable for generic debug printing -- those logs are intended to be very minimal and primarily used for testing. More detailed and comprehensive logging doesn't make sense there (it would only make for brittle tests). Moreover, we kept forgetting to propagate the `DebugLogging` variable to various places making it also ineffective and/or unavailable. Switching to `DEBUG_TYPE` makes this a non-issue. llvm-svn: 310695
* [LCG] Switch one of the update methods for the LazyCallGraph to supportChandler Carruth2017-08-091-55/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | limited batch updates. Specifically, allow removing multiple reference edges starting from a common source node. There are a few constraints that play into supporting this form of batching: 1) The way updates occur during the CGSCC walk, about the most we can functionally batch together are those with a common source node. This also makes the batching simpler to implement, so it seems a worthwhile restriction. 2) The far and away hottest function for large C++ files I measured (generated code for protocol buffers) showed a huge amount of time was spent removing ref edges specifically, so it seems worth focusing there. 3) The algorithm for removing ref edges is very amenable to this restricted batching. There are just both API and implementation special casing for the non-batch case that gets in the way. Once removed, supporting batches is nearly trivial. This does modify the API in an interesting way -- now, we only preserve the target RefSCC when the RefSCC structure is unchanged. In the face of any splits, we create brand new RefSCC objects. However, all of the users were OK with it that I could find. Only the unittest needed interesting updates here. How much does batching these updates help? I instrumented the compiler when run over a very large generated source file for a protocol buffer and found that the majority of updates are intrinsically updating one function at a time. However, nearly 40% of the total ref edges removed are removed as part of a batch of removals greater than one, so these are the cases batching can help with. When compiling the IR for this file with 'opt' and 'O3', this patch reduces the total time by 8-9%. Differential Revision: https://reviews.llvm.org/D36352 llvm-svn: 310450
* [PM] Fix a likely more critical infloop bug in the CGSCC pass manager.Chandler Carruth2017-08-081-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | This was just a bad oversight on my part. The code in question should never have worked without this fix. But it turns out, there are relatively few places that involve libfunctions that participate in a single SCC, and unless they do, this happens to not matter. The effect of not having this correct is that each time through this routine, the edge from write_wrapper to write was toggled between a call edge and a ref edge. First time through, it becomes a demoted call edge and is turned into a ref edge. Next time it is a promoted call edge from a ref edge. On, and on it goes forever. I've added the asserts which should have always been here to catch silly mistakes like this in the future as well as a test case that will actually infloop without the fix. The other (much scarier) infinite-inlining issue I think didn't actually occur in practice, and I simply misdiagnosed this minor issue as that much more scary issue. The other issue *is* still a real issue, but I'm somewhat relieved that so far it hasn't happened in real-world code yet... llvm-svn: 310342
* [PM] Add a comment clarifying what a particular predicate is doing.Chandler Carruth2017-08-011-0/+8
| | | | | | | This came up as a point of confusion while working on a fundamental problem with the combination of CGSCC iteration and the inliner. llvm-svn: 309662
* [PM/LCG] Teach the LazyCallGraph to maintain reference edges from everyChandler Carruth2017-07-151-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | function to every defined function known to LLVM as a library function. LLVM can introduce calls to these functions either by replacing other library calls or by recognizing patterns (such as memset_pattern or vector math patterns) and replacing those with calls. When these library functions are actually defined in the module, we need to have reference edges to them initially so that we visit them during the CGSCC walk in the right order and can effectively rebuild the call graph afterward. This was discovered when building code with Fortify enabled as that is a common case of both inline definitions of library calls and simplifications of code into calling them. This can in extreme cases of LTO-ing with libc introduce *many* more reference edges. I discussed a bunch of different options with folks but all of them are unsatisfying. They either make the graph operations substantially more complex even when there are *no* defined libfuncs, or they introduce some other complexity into the callgraph. So this patch goes with the simplest possible solution of actual synthetic reference edges. If this proves to be a memory problem, I'm happy to implement one of the clever techniques to save memory here. llvm-svn: 308088
* [PM] Fix a silly bug in my recent update to the CG update logic.Chandler Carruth2017-07-121-1/+1
| | | | | | | | I used the wrong variable to update. This was even covered by a unittest I wrote, and the comments for the unittest were correct (if confusing) but the test itself just matched the buggy behavior. =[ llvm-svn: 307764
* [PM] Fix a nasty bug in the new PM where we failed to properlyChandler Carruth2017-07-091-14/+37
| | | | | | | | | | | | | | | | | | | | | | | | invalidation of analyses when merging SCCs. While I've added a bunch of testing of this, it takes something much more like the inliner to really trigger this as you need to have partially-analyzed SCCs with updates at just the right time. So I've added a direct test for this using the inliner and verifying the domtree. Without the changes here, this test ends up finding a stale dominator tree. However, to handle this properly, we need to invalidate analyses *before* merging the SCCs. After talking to Philip and Sanjoy about this they convinced me this was the right approach. To do this, we need a callback mechanism when merging SCCs so we can observe the cycle that will be merged before the merge happens. This API update ended up being surprisingly easy. With this commit, the new PM passes the test-suite again. It hadn't since MemorySSA was enabled for EarlyCSE as that also will find this bug very quickly. llvm-svn: 307498
* [PM] Add unittesting of the call graph update logic with complexChandler Carruth2017-07-091-6/+51
| | | | | | | | | | | dependencies between analyses. This uncovers even more issues with the proxies and the splitting apart of SCCs which are fixed in this patch. I discovered this while trying to add more rigorous testing for a change I'm making to the call graph update invalidation logic. llvm-svn: 307497
* [PM] Finish implementing and fix a chain of bugs uncovered by testingChandler Carruth2017-07-091-22/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the invalidation propagation logic from an SCC to a Function. I wrote the infrastructure to test this but didn't actually use it in the unit test where it was designed to be used. =[ My bad. Once I actually added it to the test case I discovered that it also hadn't been properly implemented, so I've implemented it. The logic in the FAM proxy for an SCC pass to propagate invalidation follows the same ideas as the FAM proxy for a Module pass, but the implementation is a bit different to reflect the fact that it is forwarding just for an SCC. However, implementing this correctly uncovered a surprising "bug" (it was conservatively correct but relatively very expensive) in how we handle invalidation when splitting one SCC into multiple SCCs. We did an eager invalidation when in reality we should be deferring invaliadtion for the *current* SCC to the CGSCC pass manager and just invaliating the newly constructed SCCs. Otherwise we end up invalidating too much too soon. This was exposed by the inliner test case that I've updated. Now, we invalidate *just* the split off '(test1_f)' SCC when doing the CG update, and then the inliner finishes and invalidates the '(test1_g, test1_h)' SCC's analyses. The first few attempts at fixing this hit still more bugs, but all of those are covered by existing tests. For example, the inliner should also preserve the FAM proxy to avoid unnecesasry invalidation, and this is safe because the CG update routines it uses handle any necessary adjustments to the FAM proxy. Finally, the unittests for the CGSCC pass manager needed a bunch of updates where we weren't correctly preserving the FAM proxy because it hadn't been fully implemented and failing to preserve it didn't matter. Note that this doesn't yet fix the current crasher due to MemSSA finding a stale dominator tree, but without this the fix to that crasher doesn't really make any sense when testing because it relies on the proxy behavior. llvm-svn: 307487
* [PM/LCG] Teach the LazyCallGraph how to replace a function withoutChandler Carruth2017-02-091-27/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | disturbing the graph or having to update edges. This is motivated by porting argument promotion to the new pass manager. Because of how LLVM IR Function objects work, in order to change their signature a new object needs to be created. This is efficient and straight forward in the IR but previously was very hard to implement in LCG. We could easily replace the function a node in the graph represents. The challenging part is how to handle updating the edges in the graph. LCG previously used an edge to a raw function to represent a node that had not yet been scanned for calls and references. This was the core of its laziness. However, that model causes this kind of update to be very hard: 1) The keys to lookup an edge need to be `Function*`s that would all need to be updated when we update the node. 2) There will be some unknown number of edges that haven't transitioned from `Function*` edges to `Node*` edges. All of this complexity isn't necessary. Instead, we can always build a node around any function, always pointing edges at it and always using it as the key to lookup an edge. To maintain the laziness, we need to sink the *edges* of a node into a secondary object and explicitly model transitioning a node from empty to populated by scanning the function. This design seems much cleaner in a number of ways, but importantly there is now exactly *one* place where the `Function*` has to be updated! Some other cleanups that fall out of this include having something to model the *entry* edges more accurately. Rather than hand rolling parts of the node in the graph itself, we have an explicit `EdgeSequence` object that gives us exactly the functionality needed. We also have a consistent place to define the edge iterators and can use them for both the entry edges and the internal edges of the graph. The API used to model the separation between a node and its edges is intentionally very thin as most clients are expected to deal with nodes that have populated edges. We model this exactly as an optional does with an additional method to populate the edges when that is a reasonable thing for a client to do. This is based on API design suggestions from Richard Smith and David Blaikie, credit goes to them for helping pick how to model this without it being either too explicit or too implicit. The patch is somewhat noisy due to shifting around iterator types and new syntax for walking the edges of a node, but most of the functionality change is in the `Edge`, `EdgeSequence`, and `Node` types. Differential Revision: https://reviews.llvm.org/D29577 llvm-svn: 294653
* Revert r293017 and fix the actual underlying issue.Chandler Carruth2017-02-071-1/+1
| | | | | | | | | | | | | | | | | | | | | The patch committed in r293017, as discussed on the list, doesn't really make sense but was causing an actual issue to go away. The issue turns out to be that in one place the extra template arguments were dropped from the OuterAnalysisManagerProxy. This in turn caused the types used in one set of places to access the key to be completely different from the types used in another set of places for both Loop and CGSCC cases where there are extra arguments. I have literally no idea how anything seemed to work with this bug in place. It blows my mind. But it did except for mingw64 in a DLL build. I've added a really handy static assert that helps ensure we don't break this in the future. It immediately diagnoses the issue with a compile failure and a very clear error message. Much better that staring at backtraces on a build bot. =] llvm-svn: 294267
* [PM/LCG] Remove the lazy RefSCC formation from the LazyCallGraph duringChandler Carruth2017-02-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iteration. The lazy formation of RefSCCs isn't really the most important part of the laziness here -- that has to do with walking the functions themselves -- and isn't essential to maintain. Originally, there were incremental update algorithms that relied on updates happening predominantly near the most recent RefSCC formed, but those have been replaced with ones that have much tighter general case bounds at this point. We do still perform asserts that only scale well due to this incrementality, but those are easy to place behind EXPENSIVE_CHECKS. Removing this simplifies the entire analysis by having a single up-front step that builds all of the RefSCCs in a direct Tarjan walk. We can even easily replace this with other or better algorithms at will and with much less confusion now that there is no iterator-based incremental logic involved. This removes a lot of complexity from LCG. Another advantage of moving in this direction is that it simplifies testing the system substantially as we no longer have to worry about observing and mutating the graph half-way through the RefSCC formation. We still need a somewhat special iterator for RefSCCs because we want the iterator to remain stable in the face of graph updates. However, this now merely involves relative indexing to the current RefSCC's position in the sequence which isn't too hard. Differential Revision: https://reviews.llvm.org/D29381 llvm-svn: 294227
* Rewind instantiations of OuterAnalysisManagerProxy in r289317, r291651, and ↵NAKAMURA Takumi2017-01-251-1/+1
| | | | | | | | | | r291662. I found root class should be instantiated for variadic tempate to instantiate static member explicitly. This will fix failures in mingw DLL build. llvm-svn: 293017
* [PM] Teach the CGSCC's CG update utility to more carefully invalidateChandler Carruth2016-12-281-10/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | analyses when we're about to break apart an SCC. We can't wait until after breaking apart the SCC to invalidate things: 1) Which SCC do we then invalidate? All of them? 2) Even if we invalidate all of them, a newly created SCC may not have a proxy that will convey the invalidation to functions! Previously we only invalidated one of the SCCs and too late. This led to stale analyses remaining in the cache. And because the caching strategy actually works, they would get used and chaos would ensue. Doing invalidation early is somewhat pessimizing though if we *know* that the SCC structure won't change. So it turns out that the design to make the mutation API force the caller to know the *kind* of mutation in advance was indeed 100% correct and we didn't do enough of it. So this change also splits two cases of switching a call edge to a ref edge into two separate APIs so that callers can clearly test for this and take the easy path without invalidating when appropriate. This is particularly important in this case as we expect most inlines to be between functions in separate SCCs and so the common case is that we don't have to so aggressively invalidate analyses. The LCG API change in turn needed some basic cleanups and better testing in its unittest. No interesting functionality changed there other than more coverage of the returned sequence of SCCs. While this seems like an obvious improvement over the current state, I'd like to revisit the core concept of invalidating within the CG-update layer at all. I'm wondering if we would be better served forcing the callers to handle the invalidation beforehand in the cases that they can handle it. An interesting example is when we want to teach the inliner to *update and preserve* analyses. But we can cross that bridge when we get there. With this patch, the new pass manager an build all of the LLVM test suite at -O3 and everything passes. =D I haven't bootstrapped yet and I'm sure there are still plenty of bugs, but this gives a nice baseline so I'm going to increasingly focus on fleshing out the missing functionality, especially the bits that are just turned off right now in order to let us establish this baseline. llvm-svn: 290664
* [PM] Introduce the facilities for registering cross-IR-unit dependenciesChandler Carruth2016-12-271-5/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | that require deferred invalidation. This handles the other real-world invalidation scenario that we have cases of: a function analysis which caches references to a module analysis. We currently do this in the AA aggregation layer and might well do this in other places as well. Since this is relative rare, the technique is somewhat more cumbersome. Analyses need to register themselves when accessing the outer analysis manager's proxy. This proxy is already necessarily present to allow access to the outer IR unit's analyses. By registering here we can track and trigger invalidation when that outer analysis goes away. To make this work we need to enhance the PreservedAnalyses infrastructure to support a (slightly) more explicit model for "sets" of analyses, and allow abandoning a single specific analyses even when a set covering that analysis is preserved. That allows us to describe the scenario of preserving all Function analyses *except* for the one where deferred invalidation has triggered. We also need to teach the invalidator API to support direct ID calls instead of always going through a template to dispatch so that we can just record the ID mapping. I've introduced testing of all of this both for simple module<->function cases as well as for more complex cases involving a CGSCC layer. Much like the previous patch I've not tried to fully update the loop pass management layer because that layer is due to be heavily reworked to use similar techniques to the CGSCC to handle updates. As that happens, we'll have a better testing basis for adding support like this. Many thanks to both Justin and Sean for the extensive reviews on this to help bring the API design and documentation into a better state. Differential Revision: https://reviews.llvm.org/D27198 llvm-svn: 290594
* [PM] Remove now-dead extern template and explicit instantiationChandler Carruth2016-12-221-2/+0
| | | | | | | | | | | | declarations. We're using a custom class here instead of the helper template, these bits just didn't get deleted when the other bits did get deleted. This was found by a really nice MSVC warning about explicitly instantiating a template where some member functions aren't defined and thus can't be instantiatied. llvm-svn: 290327
* [PM] Rework a loop in the CGSCC update logic to be more conservative andChandler Carruth2016-12-201-7/+11
| | | | | | | | | | clear. The current RefSCC can occur in exactly one position so we should just enforce that and leverage the property rather than checking for it anywhere. This addresses review comments made on another patch. llvm-svn: 290162
* [PM] Support invalidation of inner analysis managers from a pass over the ↵Chandler Carruth2016-12-101-2/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | outer IR unit. Summary: This never really got implemented, and was very hard to test before a lot of the refactoring changes to make things more robust. But now we can test it thoroughly and cleanly, especially at the CGSCC level. The core idea is that when an inner analysis manager proxy receives the invalidation event for the outer IR unit, it needs to walk the inner IR units and propagate it to the inner analysis manager for each of those units. For example, each function in the SCC needs to get an invalidation event when the SCC gets one. The function / module interaction is somewhat boring here. This really becomes interesting in the face of analysis-backed IR units. This patch effectively handles all of the CGSCC layer's needs -- both invalidating SCC analysis and invalidating function analysis when an SCC gets invalidated. However, this second aspect doesn't really handle the LoopAnalysisManager well at this point. That one will need some change of design in order to fully integrate, because unlike the call graph, the entire function behind a LoopAnalysis's results can vanish out from under us, and we won't even have a cached API to access. I'd like to try to separate solving the loop problems into a subsequent patch though in order to keep this more focused so I've adapted them to the API and updated the tests that immediately fail, but I've not added the level of testing and validation at that layer that I have at the CGSCC layer. An important aspect of this change is that the proxy for the FunctionAnalysisManager at the SCC pass layer doesn't work like the other proxies for an inner IR unit as it doesn't directly manage the FunctionAnalysisManager and invalidation or clearing of it. This would create an ever worsening problem of dual ownership of this responsibility, split between the module-level FAM proxy and this SCC-level FAM proxy. Instead, this patch changes the SCC-level FAM proxy to work in terms of the module-level proxy and defer to it to handle much of the updates. It only does SCC-specific invalidation. This will become more important in subsequent patches that support more complex invalidaiton scenarios. Reviewers: jlebar Subscribers: mehdi_amini, mcrosier, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D27197 llvm-svn: 289317
* [PM] Basic cleanups to CGSCC update code, NFC.Chandler Carruth2016-12-061-41/+36
| | | | | | | Just using InstIterator, simpler loop structures, and making better use of the visit callback infrastructure. llvm-svn: 288790
* [PM] Extend the explicit 'invalidate' method API on analysis results toChandler Carruth2016-11-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | accept an Invalidator that allows them to invalidate themselves if their dependencies are in turn invalidated. Rather than recording the dependency graph ahead of time when analysis get results from other analyses, this simply lets each result trigger the immediate invalidation of any analyses they actually depend on. They do this in a way that has three nice properties: 1) They don't have to handle transitive dependencies because the infrastructure will recurse for them. 2) The invalidate methods are still called only once. We just dynamically discover the necessary topological ordering, everything is memoized nicely. 3) The infrastructure still provides a default implementation and can access it so that only analyses which have dependencies need to do anything custom. To make this work at all, the invalidation logic also has to defer the deletion of the result objects themselves so that they can remain alive until we have collected the complete set of results to invalidate. A unittest is added here that has exactly the dependency pattern we are concerned with. It hit the use-after-free described by Sean in much detail in the long thread about analysis invalidation before this change, and even in an intermediate form of this change where we failed to defer the deletion of the result objects. There is an important problem with doing dependency invalidation that *isn't* solved here: we don't *enforce* that results correctly invalidate all the analyses whose results they depend on. I actually looked at what it would take to do that, and it isn't as hard as I had thought but the complexity it introduces seems very likely to outweigh the benefit. The technique would be to provide a base class for an analysis result that would be populated with other results, and automatically provide the invalidate method which immediately does the correct thing. This approach has some nice pros IMO: - Handles the case we care about and nothing else: only *results* that depend on other analyses trigger extra invalidation. - Localized to the result rather than centralized in the analysis manager. - Ties the storage of the reference to another result to the triggering of the invalidation of that analysis. - Still supports extending invalidation in customized ways. But the down sides here are: - Very heavy-weight meta-programming is needed to provide this base class. - Requires a pretty awful API for accessing the dependencies. Ultimately, I fear it will not pull its weight. But we can re-evaluate this at any point if we start discovering consistent problems where the invalidation and dependencies get out of sync. It will fit as a clean layer on top of the facilities in this patch that we can add if and when we need it. Note that I'm not really thrilled with the names for these APIs... The name "Invalidator" seems ok but not great. The method name "invalidate" also. In review some improvements were suggested, but they really need *other* uses of these terms to be updated as well so I'm going to do that in a follow-up commit. I'm working on the actual fixes to various analyses that need to use these, but I want to try to get tests for each of them so we don't regress. And those changes are seperable and obvious so once this goes in I should be able to roll them out throughout LLVM. Many thanks to Sean, Justin, and others for help reviewing here. Differential Revision: https://reviews.llvm.org/D23738 llvm-svn: 288077
* [PM] Remove weird marking of invalidated analyses as "preserved".Chandler Carruth2016-11-281-4/+8
| | | | | | | | | | | | | | | | | | | | | This never made a lot of sense. They've been invalidated for one IR unit but they aren't really preserved in any normal sense. It seemed like it would be an elegant way of communicating to outer IR units that pass managers and adaptors had already handled invalidation, but we've since ended up adding sets that model this more clearly: we're now using the 'AllAnalysesOn<IRUnitT>' set to handle cases where the trick of "preserving" invalidated analyses didn't work. This patch moves to rely on that technique exclusively and removes the cumbersome API aspect of updating the preserved set when doing invalidation. This in turn will simplify a *number* of upcoming patches. This has a side benefit of exposing a number of places where we were failing to mark the 'AllAnalysesOn<IRUnitT>' set as preserved. This patch fixes those, and with those fixes shouldn't change any observable behavior. llvm-svn: 288023
* Fixup r279618, instantiate ↵NAKAMURA Takumi2016-08-301-2/+2
| | | | | | | | | | | *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC>, instead of *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC,LazyCallGraph&>, for PassID. Or they were not instantiated as expected; llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID llvm-svn: 280105
* [PM] Introduce basic update capabilities to the new PM's CGSCC passChandler Carruth2016-08-241-5/+342
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | manager, including both plumbing and logic to handle function pass updates. There are three fundamentally tied changes here: 1) Plumbing *some* mechanism for updating the CGSCC pass manager as the CG changes while passes are running. 2) Changing the CGSCC pass manager infrastructure to have support for the underlying graph to mutate mid-pass run. 3) Actually updating the CG after function passes run. I can separate them if necessary, but I think its really useful to have them together as the needs of #3 drove #2, and that in turn drove #1. The plumbing technique is to extend the "run" method signature with extra arguments. We provide the call graph that intrinsically is available as it is the basis of the pass manager's IR units, and an output parameter that records the results of updating the call graph during an SCC passes's run. Note that "...UpdateResult" isn't a *great* name here... suggestions very welcome. I tried a pretty frustrating number of different data structures and such for the innards of the update result. Every other one failed for one reason or another. Sometimes I just couldn't keep the layers of complexity right in my head. The thing that really worked was to just directly provide access to the underlying structures used to walk the call graph so that their updates could be informed by the *particular* nature of the change to the graph. The technique for how to make the pass management infrastructure cope with mutating graphs was also something that took a really, really large number of iterations to get to a place where I was happy. Here are some of the considerations that drove the design: - We operate at three levels within the infrastructure: RefSCC, SCC, and Node. In each case, we are working bottom up and so we want to continue to iterate on the "lowest" node as the graph changes. Look at how we iterate over nodes in an SCC running function passes as those function passes mutate the CG. We continue to iterate on the "lowest" SCC, which is the one that continues to contain the function just processed. - The call graph structure re-uses SCCs (and RefSCCs) during mutation events for the *highest* entry in the resulting new subgraph, not the lowest. This means that it is necessary to continually update the current SCC or RefSCC as it shifts. This is really surprising and subtle, and took a long time for me to work out. I actually tried changing the call graph to provide the opposite behavior, and it breaks *EVERYTHING*. The graph update algorithms are really deeply tied to this particualr pattern. - When SCCs or RefSCCs are split apart and refined and we continually re-pin our processing to the bottom one in the subgraph, we need to enqueue the newly formed SCCs and RefSCCs for subsequent processing. Queuing them presents a few challenges: 1) SCCs and RefSCCs use wildly different iteration strategies at a high level. We end up needing to converge them on worklist approaches that can be extended in order to be able to handle the mutations. 2) The order of the enqueuing need to remain bottom-up post-order so that we don't get surprising order of visitation for things like the inliner. 3) We need the worklists to have set semantics so we don't duplicate things endlessly. We don't need a *persistent* set though because we always keep processing the bottom node!!!! This is super, super surprising to me and took a long time to convince myself this is correct, but I'm pretty sure it is... Once we sink down to the bottom node, we can't re-split out the same node in any way, and the postorder of the current queue is fixed and unchanging. 4) We need to make sure that the "current" SCC or RefSCC actually gets enqueued here such that we re-visit it because we continue processing a *new*, *bottom* SCC/RefSCC. - We also need the ability to *skip* SCCs and RefSCCs that get merged into a larger component. We even need the ability to skip *nodes* from an SCC that are no longer part of that SCC. This led to the design you see in the patch which uses SetVector-based worklists. The RefSCC worklist is always empty until an update occurs and is just used to handle those RefSCCs created by updates as the others don't even exist yet and are formed on-demand during the bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and we push new SCCs onto it and blacklist existing SCCs on it to get the desired processing. We then *directly* update these when updating the call graph as I was never able to find a satisfactory abstraction around the update strategy. Finally, we need to compute the updates for function passes. This is mostly used as an initial customer of all the update mechanisms to drive their design to at least cover some real set of use cases. There are a bunch of interesting things that came out of doing this: - It is really nice to do this a function at a time because that function is likely hot in the cache. This means we want even the function pass adaptor to support online updates to the call graph! - To update the call graph after arbitrary function pass mutations is quite hard. We have to build a fairly comprehensive set of data structures and then process them. Fortunately, some of this code is related to the code for building the cal graph in the first place. Unfortunately, very little of it makes any sense to share because the nature of what we're doing is so very different. I've factored out the one part that made sense at least. - We need to transfer these updates into the various structures for the CGSCC pass manager. Once those were more sanely worked out, this became relatively easier. But some of those needs necessitated changes to the LazyCallGraph interface to make it significantly easier to extract the changed SCCs from an update operation. - We also need to update the CGSCC analysis manager as the shape of the graph changes. When an SCC is merged away we need to clear analyses associated with it from the analysis manager which we didn't have support for in the analysis manager infrsatructure. New SCCs are easy! But then we have the case that the original SCC has its shape changed but remains in the call graph. There we need to *invalidate* the analyses associated with it. - We also need to invalidate analyses after we *finish* processing an SCC. But the analyses we need to invalidate here are *only those for the newly updated SCC*!!! Because we only continue processing the bottom SCC, if we split SCCs apart the original one gets invalidated once when its shape changes and is not processed farther so its analyses will be correct. It is the bottom SCC which continues being processed and needs to have the "normal" invalidation done based on the preserved analyses set. All of this is mostly background and context for the changes here. Many thanks to all the reviewers who helped here. Especially Sanjoy who caught several interesting bugs in the graph algorithms, David, Sean, and others who all helped with feedback. Differential Revision: http://reviews.llvm.org/D21464 llvm-svn: 279618
* [NFC] Header cleanupMehdi Amini2016-04-181-2/+0
| | | | | | | | | | | | | | Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595
* [PM] Implement the final conclusion as to how the analysis IDs shouldChandler Carruth2016-03-111-3/+0
| | | | | | | | | | | | | | | | | | | | work in the face of the limitations of DLLs and templated static variables. This requires passes that use the AnalysisBase mixin provide a static variable themselves. So as to keep their APIs clean, I've made these private and befriended the CRTP base class (which is the common practice). I've added documentation to AnalysisBase for why this is necessary and at what point we can go back to the much simpler system. This is clearly a better pattern than the extern template as it caught *numerous* places where the template magic hadn't been applied and things were "just working" but would eventually have broken mysteriously. llvm-svn: 263216
* [PM] Appease mingw32's auto-import DLL build with minimal tweaks, with fix ↵NAKAMURA Takumi2016-02-281-0/+3
| | | | | | | | for clang. char AnalysisBase::ID should be declared as extern and defined in one module. llvm-svn: 262188
* Revert r262185, "[PM] Appease mingw32's auto-import DLL build with minimal ↵NAKAMURA Takumi2016-02-281-3/+0
| | | | | | | | tweaks." I'll rework soon. llvm-svn: 262186
* [PM] Appease mingw32's auto-import DLL build with minimal tweaks.NAKAMURA Takumi2016-02-281-0/+3
| | | | | | char AnalysisBase::ID should be declared as extern and defined in one module. llvm-svn: 262185
* [PM] Provide explicit instantiation declarations and definitions for theChandler Carruth2016-02-271-0/+2
| | | | | | PassManager and AnalysisManager template specializations as well. llvm-svn: 262128
* [PM] Provide two templates for the two directionalities of analysisChandler Carruth2016-02-271-55/+8
| | | | | | | | | | | | | | | | | | | | | | | | manager proxies and use those rather than repeating their definition four times. There are real differences between the two directions: outer AMs are const and don't need to have invalidation tracked. But every proxy in a particular direction is identical except for the analysis manager type and the IR unit they proxy into. This makes them prime candidates for nice templates. I've started introducing explicit template instantiation declarations and definitions as well because we really shouldn't be emitting all this everywhere. I'm going to go back and add the same for the other templates like this in a follow-up patch. I've left the analysis manager as an opaque type rather than using two IR units and requiring it to be an AnalysisManager template specialization. I think its important that users retain the ability to provide their own custom analysis management layer and provided it has the appropriate API everything should Just Work. llvm-svn: 262127
* [PM] Introduce CRTP mixin base classes to help define passes andChandler Carruth2016-02-261-8/+0
| | | | | | | | | | | | | | | | | analyses in the new pass manager. These just handle really basic stuff: turning a type name into a string statically that is nice to print in logs, and getting a static unique ID for each analysis. Sadly, the format of passes in anonymous namespaces makes using their names in tests really annoying so I've customized the names of the no-op passes to keep tests sane to read. This is the first of a few simplifying refactorings for the new pass manager that should reduce boilerplate and confusion. llvm-svn: 262004
* [PM] Remove an overly aggressive assert now that I can actually test theChandler Carruth2016-02-231-1/+0
| | | | | | | | | | | | | | | | | pattern that triggers it. This essentially requires an immutable function analysis, as that will survive anything we do to invalidate it. When we have such patterns, the function analysis manager will not get cleared between runs of the proxy. If we actually need an assert about how things are queried, we can add more elaborate machinery for computing it, but so far I'm not aware of significant value provided. Thanks to Justin Lebar for noticing this when he made a (seemingly innocuous) change to FunctionAttrs that is enough to trigger it in one test there. Now it is covered by a direct test of the pass manager code. llvm-svn: 261627
* [PM] Improve the API and comments around the analysis manager proxies.Chandler Carruth2016-02-231-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | These are really handles that ensure the analyses get cleared at appropriate places, and as such copying doesn't really make sense. Instead, they should look more like unique ownership objects. Make that the case. Relatedly, if you create a temporary of one and move out of it its destructor shouldn't actually clear anything. I don't think there is any code that can trigger this currently, but it seems like a more robust implementation. If folks want, I can add a unittest that forces this to be exercised, but that seems somewhat pointless -- whether a temporary is ever created in the innards of AnalysisManager is not really something we should be adding a reliance on, but I didn't want to leave a timebomb in the code here. If anyone has a cleaner way to represent this, I'm all ears, but I wanted to assure myself that this wasn't in fact responsible for another bug I'm chasing down (it wasn't) and figured I'd commit that. llvm-svn: 261594
* [PM] Remove the defunt CGSCC-specific debug flag.Chandler Carruth2015-01-131-4/+0
| | | | | | | | Even before I sunk the debug flag into the opt tool this had been made obsolete by factoring the pass and analysis managers into a single set of templates that all used the core flag. No functionality changed here. llvm-svn: 225842
* [PM] Refactor the new pass manager to use a single template to implementChandler Carruth2015-01-131-32/+0
| | | | | | | | | | | | | | | | | | | | | | | the generic functionality of the pass managers themselves. In the new infrastructure, the pass "manager" isn't actually interesting at all. It just pipelines a single chunk of IR through N passes. We don't need to know anything about the IR or the passes to do this really and we can replace the 3 implementations of the exact same functionality with a single generic PassManager template, complementing the single generic AnalysisManager template. I've left typedefs in place to give convenient names to the various obvious instantiations of the template. With this, I think I've nuked almost all of the redundant logic in the managers, and I think the overall design is actually simpler for having single templates that clearly indicate there is no special logic here. The logging is made somewhat more annoying by this change, but I don't think the difference is worth having heavy-weight traits to help log things. llvm-svn: 225783
OpenPOWER on IntegriCloud