summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert "blockfreq: Approximate irreducible control flow"Duncan P. N. Exon Smith2014-04-251-210/+20
| | | | | | | | | | | | This reverts commit r207286. It causes an ICE on the cmake-llvm-x86_64-linux buildbot [1]: llvm/lib/Analysis/BlockFrequencyInfo.cpp: In lambda function: llvm/lib/Analysis/BlockFrequencyInfo.cpp:182:1: internal compiler error: in get_expr_operands, at tree-ssa-operands.c:1035 [1]: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/12093/steps/build_llvm/logs/stdio llvm-svn: 207287
* blockfreq: Approximate irreducible control flowDuncan P. N. Exon Smith2014-04-251-20/+210
| | | | | | | | | | | | | | | | | | | | | | Previously, irreducible backedges were ignored. With this commit, irreducible SCCs are discovered on the fly, and modelled as loops with multiple headers. This approximation specifies the headers of irreducible sub-SCCs as its entry blocks and all nodes that are targets of a backedge within it (excluding backedges within true sub-loops). Block frequency calculations act as if we insert a new block that intercepts all the edges to the headers. All backedges and entries to the irreducible SCC point to this imaginary block. This imaginary block has an edge (with even probability) to each header block. The result is now reasonable enough that I've added a number of testcases for irreducible control flow. I've outlined in `BlockFrequencyInfoImpl.h` ways to improve the approximation. <rdar://problem/14292693> llvm-svn: 207286
* blockfreq: Further shift logic to LoopDataDuncan P. N. Exon Smith2014-04-251-27/+12
| | | | | | | | | Move a lot of the loop-related logic that was sprinkled around the code into `LoopData`. <rdar://problem/14292693> llvm-svn: 207258
* SCC: Change clients to use const, NFCDuncan P. N. Exon Smith2014-04-252-3/+3
| | | | | | | | | | It's fishy to be changing the `std::vector<>` owned by the iterator, and no one actual does it, so I'm going to remove the ability in a subsequent commit. First, update the users. <rdar://problem/14292693> llvm-svn: 207252
* [LCG] During the incremental update of an SCC, switch to using theChandler Carruth2014-04-251-26/+26
| | | | | | | | | | | | | | | | | | SCCMap to test for nodes that have been re-added to the root SCC rather than a set vector. We already have done the SCCMap lookup, we juts need to test it in two different ways. In turn, do most of the processing of these nodes as they go into the root SCC rather than lazily. This simplifies the final loop to just stitch the root SCC into its children's parent sets. No functionlatiy changed. However, this makes a few things painfully obvious, which was my intent. =] There is tons of repeated code introduced here and elsewhere. I'm splitting the refactoring of that code into helpers from this change so its clear that this is the change which switches the datastructures used around, and the other is a pure factoring & deduplication of code change. llvm-svn: 207217
* [LCG] During the incremental re-build of an SCC after removing an edge,Chandler Carruth2014-04-251-4/+5
| | | | | | | | | | | | | | | | | | remove the nodes in the SCC from the SCC map entirely prior to the DFS walk. This allows the SCC map to represent both the state of not-yet-re-added-to-an-SCC and added-back-to-this-SCC independently. The first is being missing from the SCC map, the second is mapping back to 'this'. In a subsequent commit, I'm going to use this property to simplify the new node list for this SCC. In theory, I think this also makes the contract for orphaning a node from the graph slightly less confusing. Now it is also orphaned from the SCC graph. Still, this isn't quite right either, and so I'm not adding test cases here. I'll add test cases for the behavior of orphaning nodes when the code *actually* supports it. The change here is mostly incidental, my goal is simplifying the algorithm. llvm-svn: 207213
* [LCG] Rather than doing a linear time SmallSetVector removal of eachChandler Carruth2014-04-251-6/+7
| | | | | | | | | child from the worklist, wait until we actually need to pop another element off of the worklist and skip over any that were already visited by the DFS. This also enables swapping the nodes of the SCC into the worklist. No functionality changed. llvm-svn: 207212
* [LCG] Remove a completely unnecessary loop. It wasn't even doing anyChandler Carruth2014-04-251-60/+54
| | | | | | | | | thing, just mucking up the code. I feel bad that I even wrote this loop. Very sorry. The diff is huge because of the indent change, but I promise all this is doing is realizing that the outer two loops were actually the exact same loops, and we didn't need two of them. llvm-svn: 207202
* [LCG] Now that the loop structure of the core SCC finding routine isChandler Carruth2014-04-251-1/+8
| | | | | | | | | factored into a more reasonable form, replace the tail call with a simple outer-loop continuation. It's sad that C++ makes this so awkward to write, but it seems more direct and clear than the tail call at this point. llvm-svn: 207201
* blockfreq: Only one mass distribution per nodeDuncan P. N. Exon Smith2014-04-251-61/+11
| | | | | | | | | | Remove the concepts of "forward" and "general" mass distributions, which was wrong. The split might have made sense in an early version of the algorithm, but it's definitely wrong now. <rdar://problem/14292693> llvm-svn: 207195
* blockfreq: Document assertionDuncan P. N. Exon Smith2014-04-251-1/+1
| | | | | | <rdar://problem/14292693> llvm-svn: 207194
* blockfreq: Document high-level functionsDuncan P. N. Exon Smith2014-04-251-1/+1
| | | | | | <rdar://problem/14292693> llvm-svn: 207191
* blockfreq: Scale LoopData::Scale on the way downDuncan P. N. Exon Smith2014-04-251-23/+12
| | | | | | | | | | | Rather than scaling loop headers and then scaling all the loop members by the header frequency, scale `LoopData::Scale` itself, and scale the loop members by it. It's much more obvious what's going on this way, and doesn't cost any extra multiplies. <rdar://problem/14292693> llvm-svn: 207189
* blockfreq: unwrapLoopPackage() => unwrapLoop()Duncan P. N. Exon Smith2014-04-251-10/+8
| | | | | | <rdar://problem/14292693> llvm-svn: 207188
* blockfreq: Pass the Loop directly into unwrapLoopPackage()Duncan P. N. Exon Smith2014-04-251-6/+4
| | | | | | <rdar://problem/14292693> llvm-svn: 207187
* blockfreq: Unwrap from LoopsDuncan P. N. Exon Smith2014-04-251-4/+2
| | | | | | | | When unwrapping loops, just visit the loops rather than all nodes. <rdar://problem/14292693> llvm-svn: 207186
* blockfreq: Separate unwrapLoops() from finalizeMetrics()Duncan P. N. Exon Smith2014-04-251-5/+9
| | | | | | <rdar://problem/14292693> llvm-svn: 207185
* blockfreq: Expose getPackagedNode()Duncan P. N. Exon Smith2014-04-251-24/+1
| | | | | | | | | Make `getPackagedNode()` a member function of `BlockFrequencyInfoImplBase` so that it's available for templated code. <rdar://problem/14292693> llvm-svn: 207183
* blockfreq: Store the header with the membersDuncan P. N. Exon Smith2014-04-251-2/+2
| | | | | | <rdar://problem/14292693> llvm-svn: 207182
* blockfreq: Encapsulate LoopData::HeaderDuncan P. N. Exon Smith2014-04-251-14/+12
| | | | | | <rdar://problem/14292693> llvm-svn: 207181
* blockfreq: Use LoopData directlyDuncan P. N. Exon Smith2014-04-251-30/+28
| | | | | | | | Instead of passing around loop headers, pass around `LoopData` directly. <rdar://problem/14292693> llvm-svn: 207179
* blockfreq: Use a std::list for LoopsDuncan P. N. Exon Smith2014-04-251-1/+1
| | | | | | | | | | | | As pointed out by David Blaikie in code review, a `std::list<T>` is simpler than a `std::vector<std::unique_ptr<T>>`. Another option is a `std::deque<T>` (which allocates in chunks), but I'd like to leave open the option of inserting in the middle of the sequence for handling irreducible control flow on the fly. <rdar://problem/14292693> llvm-svn: 207177
* [LCG] Switch a weird do/while loop that actually couldn't fail itsChandler Carruth2014-04-241-5/+4
| | | | | | | condition into an obviously infinite loop with an assert about the degenerate condition. No functionality changed. llvm-svn: 207147
* [LCG] Incorporate the core trick of improvements on the naive Tarjan'sChandler Carruth2014-04-241-41/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | algorithm here: http://dl.acm.org/citation.cfm?id=177301. The idea of isolating the roots has even more relevance when using the stack not just to implement the DFS but also to implement the recursive step. Because we use it for the recursive step, to isolate the roots we need to maintain two stacks: one for our recursive DFS walk, and another of the nodes that have been walked. The nice thing is that the latter will be half the size. It also fixes a complete hack where we scanned backwards over the stack to find the next potential-root to continue processing. Now that is always the top of the DFS stack. While this is a really nice improvement already (IMO) it further opens the door for two important simplifications: 1) De-duplicating some of the code across the two different walks. I've actually made the duplication a bit worse in some senses with this patch because the two are starting to converge. 2) Dramatically simplifying the loop structures of both walks. I wanted to do those separately as they'll be essentially *just* CFG restructuring. This patch on the other hand actually uses different datastructures to implement the algorithm itself. llvm-svn: 207098
* [LCG] Rotate logic applied to the top of the DFSStack to instead beChandler Carruth2014-04-241-25/+24
| | | | | | | | | | | | | applied prior to pushing a node onto the DFSStack. This is the first step toward avoiding the stack entirely for leaf nodes. It also simplifies things a bit and I think is pointing the way toward factoring some more of the shared logic out of the two implementations. It is also making it more obvious how to restructure the loops themselves to be a bit easier to read (although no different in terms of functionality). llvm-svn: 207095
* [LCG] Switch the parent SCC tracking from a SmallSetVector toChandler Carruth2014-04-241-2/+2
| | | | | | | | | | | | | | | | a SmallPtrSet. Currently, there is no need for stable iteration in this dimension, and I now thing there won't need to be going forward. If this is ever re-introduced in any form, it needs to not be a SetVector based solution because removal cannot be linear. There will be many SCCs with large numbers of parents. When encountering these, the incremental SCC update for intra-SCC edge removal was quadratic due to linear removal (kind of). I'm really hoping we can avoid having an ordering property here at all though... llvm-svn: 207091
* [LCG] We don't actually need a set in each SCC to track the nodes. WeChandler Carruth2014-04-241-7/+1
| | | | | | | can use the node -> SCC mapping in the top-level graph to test this on the rare occasions we need it. llvm-svn: 207090
* [C++] Use 'nullptr'.Craig Topper2014-04-244-18/+18
| | | | llvm-svn: 207083
* [LCG] Normalize the post-order SCC iterator to just iterate over the SCCChandler Carruth2014-04-231-2/+2
| | | | | | values rather than having pointers in weird places. llvm-svn: 207053
* [LCG] Switch the primary node iterator to be a *much* more normal C++Chandler Carruth2014-04-231-39/+33
| | | | | | iterator, returning a Node by reference on dereference. llvm-svn: 207048
* [LCG] Make the insertion and query paths into the LCG which cannot failChandler Carruth2014-04-231-4/+4
| | | | | | | | return references to better model this property. No functionality changed. llvm-svn: 207047
* [LCG] Switch the SCC lookup to be in terms of call graph nodes ratherChandler Carruth2014-04-231-8/+8
| | | | | | | | | | than functions. So far, this access pattern is *much* more common. It seems likely that any user of this interface is going to have nodes at the point that they are querying the SCCs. No functionality changed. llvm-svn: 207045
* [LCG] Switch the primary SCC building code to use the negative low-linkChandler Carruth2014-04-231-2/+4
| | | | | | | | | | | | | | | values rather than an expensive dense map query to test whether children have already been popped into an SCC. This matches the incremental SCC building code. I've also included the assert that I put there but updated both of their text. No functionality changed here. I still don't have any great ideas for sharing the code between the two implementations, but I may try a brute-force approach to factoring it at some point. llvm-svn: 207042
* [LCG] Add the first round of mutation support to the lazy call graph.Chandler Carruth2014-04-231-0/+233
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements the core functionality necessary to remove an edge from the call graph and correctly update both the basic graph and the SCC structure. As part of that it has to run a tiny (in number of nodes) Tarjan-style DFS walk of an SCC being mutated to compute newly formed SCCs, etc. This is *very rough* and a WIP. I have a bunch of FIXMEs for code cleanup that will reduce the boilerplate in this change substantially. I also have a bunch of simplifications to various parts of both algorithms that I want to make, but first I'd like to have a more holistic picture. Ideally, I'd also like more testing. I'll probably add quite a few more unit tests as I go here to cover the various different aspects and corner cases of removing edges from the graph. Still, this is, so far, successfully updating the SCC graph in-place without disrupting the identity established for the existing SCCs even when we do challenging things like delete the critical edge that made an SCC cycle at all and have to reform things as a tree of smaller SCCs. Getting this to work is really critical for the new pass manager as it is going to associate significant state with the SCC instance and needs it to be stable. That is also the motivation behind the return of the newly formed SCCs. Eventually, I'll wire this all the way up to the public API so that the pass manager can use it to correctly re-enqueue newly formed SCCs into a fresh postorder traversal. llvm-svn: 206968
* [LCG] Implement Tarjan's algorithm correctly this time. We have to walkChandler Carruth2014-04-231-34/+42
| | | | | | | | | | | | up the stack finishing the exploration of each entries children before we're finished in addition to accounting for their low-links. Added a unittest that really hammers home the need for this with interlocking cycles that would each appear distinct otherwise and crash or compute the wrong result. As part of this, nuke a stale fixme and bring the rest of the implementation still more closely in line with the original algorithm. llvm-svn: 206966
* [LCG] Add a unittest for the LazyCallGraph. I had a weak moment andChandler Carruth2014-04-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | resisted this for too long. Just with the basic testing here I was able to exercise the analysis in more detail and sift out both type signature bugs in the API and a bug in the DFS numbering. All of these are fixed here as well. The unittests will be much more important for the mutation support where it is necessary to craft minimal mutations and then inspect the state of the graph. There is just no way to do that with a standard FileCheck test. However, unittesting these kinds of analyses is really quite easy, especially as they're designed with the new pass manager where there is essentially no infrastructure required to rig up the core logic and exercise it at an API level. As a minor aside about the DFS numbering bug, the DFS numbering used in LCG is a bit unusual. Rather than numbering from 0, we number from 1, and use 0 as the sentinel "unvisited" state. Other implementations often use '-1' for this, but I find it easier to deal with 0 and it shouldn't make any real difference provided someone doesn't write silly bugs like forgetting to actually initialize the DFS numbering. Oops. ;] llvm-svn: 206954
* [LCG] Hoist the logic for forming a new SCC from the top of the DFSStackChandler Carruth2014-04-231-41/+47
| | | | | | | into a helper function. I plan to re-use it for doing incremental DFS-based updates to the SCCs when we mutate the call graph. llvm-svn: 206948
* [LCG] Switch the Callee sets to be DenseMaps pointing to the index intoChandler Carruth2014-04-231-7/+8
| | | | | | | | | the Callee list. This is going to be quite important to prevent removal from going quadratic. No functionality changed at this point, this is one of the refactoring patches I've broken out of my initial work toward mutation updates of the call graph. llvm-svn: 206938
* blockfreq: Skip irreducible backedges inside functionsDuncan P. N. Exon Smith2014-04-221-1/+1
| | | | | | | | | | | | The branch that skips irreducible backedges was only active when propagating mass at the top-level. In particular, when propagating mass through a loop recognized by `LoopInfo` with irreducible control flow inside, irreducible backedges would not be skipped. Not sure where that idea came from, but the result was that mass was lost until after loop exit. Added a testcase that covers this case. llvm-svn: 206860
* blockfreq: Rename PackagedLoops => LoopsDuncan P. N. Exon Smith2014-04-221-1/+1
| | | | llvm-svn: 206859
* blockfreq: Use a pointer for ContainingLoop tooDuncan P. N. Exon Smith2014-04-221-9/+9
| | | | llvm-svn: 206858
* blockfreq: Use pointers to loops instead of an indexDuncan P. N. Exon Smith2014-04-221-4/+5
| | | | | | | | | | | | | Store pointers directly to loops inside the nodes. This could have been done without changing the type stored in `std::vector<>`. However, rather than computing the number of loops before constructing them (which `LoopInfo` doesn't provide directly), I've switched to a `vector<unique_ptr<LoopData>>`. This adds some heap overhead, but the number of loops is typically small. llvm-svn: 206857
* blockfreq: Implement clear() explicitlyDuncan P. N. Exon Smith2014-04-221-1/+5
| | | | | | | | | This was implicitly with copy assignment before, which fails to actually clear `std::vector<>`'s heap storage. Move assignment would work, but since MSVC can't imply those anyway, explicitly `clear()`-ing members makes more sense. llvm-svn: 206856
* blockfreq: Rename PackagedLoopData => LoopDataDuncan P. N. Exon Smith2014-04-221-7/+7
| | | | | | No functionality change. llvm-svn: 206855
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-2221-25/+44
| | | | | | | | | | definition below all the header #include lines, lib/Analysis/... edition. This one has a bit extra as there were *other* #define's before #include lines in addition to DEBUG_TYPE. I've sunk all of them as a block. llvm-svn: 206843
* [Modules] Remove potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-1/+2
| | | | | | | | | | | | define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837
* [Modules] Make Support/Debug.h modular. This requires it to not changeChandler Carruth2014-04-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | behavior based on other files defining DEBUG_TYPE, which means it cannot define DEBUG_TYPE at all. This is actually better IMO as it forces folks to define relevant DEBUG_TYPEs for their files. However, it requires all files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't already. I've updated all such files in LLVM and will do the same for other upstream projects. This still leaves one important change in how LLVM uses the DEBUG_TYPE macro going forward: we need to only define the macro *after* header files have been #include-ed. Previously, this wasn't possible because Debug.h required the macro to be pre-defined. This commit removes that. By defining DEBUG_TYPE after the includes two things are fixed: - Header files that need to provide a DEBUG_TYPE for some inline code can do so by defining the macro before their inline code and undef-ing it afterward so the macro does not escape. - We no longer have rampant ODR violations due to including headers with different DEBUG_TYPE definitions. This may be mostly an academic violation today, but with modules these types of violations are easy to check for and potentially very relevant. Where necessary to suppor headers with DEBUG_TYPE, I have moved the definitions below the includes in this commit. I plan to move the rest of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big enough. The comments in Debug.h, which were hilariously out of date already, have been updated to reflect the recommended practice going forward. llvm-svn: 206822
* blockfreq: Some cleanup of UnsignedFloatDuncan P. N. Exon Smith2014-04-211-22/+19
| | | | | | | Change `PositiveFloat` to `UnsignedFloat`, and fix some of the comments to indicate that it's disappearing eventually. llvm-svn: 206771
* Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl"Duncan P. N. Exon Smith2014-04-213-2/+939
| | | | | | | | | This reverts commit r206707, reapplying r206704. The preceding commit to CalcSpillWeights should have sorted out the failing buildbots. <rdar://problem/14292693> llvm-svn: 206766
* [PM] Add a new-PM-style CGSCC pass manager using the newly addedChandler Carruth2014-04-212-0/+168
| | | | | | | | | | | | | | | | | | | | | | | LazyCallGraph analysis framework. Wire it up all the way through the opt driver and add some very basic testing that we can build pass pipelines including these components. Still a lot more to do in terms of testing that all of this works, but the basic pieces are here. There is a *lot* of boiler plate here. It's something I'm going to actively look at reducing, but I don't have any immediate ideas that don't end up making the code terribly complex in order to fold away the boilerplate. Until I figure out something to minimize the boilerplate, almost all of this is based on the code for the existing pass managers, copied and heavily adjusted to suit the needs of the CGSCC pass management layer. The actual CG management still has a bunch of FIXMEs in it. Notably, we don't do *any* updating of the CG as it is potentially invalidated. I wanted to get this in place to motivate the new analysis, and add update APIs to the analysis and the pass management layers in concert to make sure that the *right* APIs are present. llvm-svn: 206745
OpenPOWER on IntegriCloud