summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis
Commit message (Collapse)AuthorAgeFilesLines
* Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl"Duncan P. N. Exon Smith2014-04-183-2/+938
| | | | | | | | | | | | | | This reverts commit r206556, effectively reapplying commit r206548 and its fixups in r206549 and r206550. In an intervening commit I've added target triples to the tests that were failing remotely [1] (but passing locally). I'm hoping the mystery is solved? I'll revert this again if the tests are still failing remotely. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 llvm-svn: 206622
* [LCG] Remove all of the complexity stemming from supporting copying.Chandler Carruth2014-04-181-42/+21
| | | | | | | | | | | | | | Reality is that we're never going to copy one of these. Supporting this was becoming a nightmare because nothing even causes it to compile most of the time. Lots of subtle errors built up that wouldn't have been caught by any "normal" testing. Also, make the move assignment actually work rather than the bogus swap implementation that would just infloop if used. As part of that, factor out the graph pointer updates into a helper to share between move construction and move assignment. llvm-svn: 206583
* [LCG] Add support for building persistent and connected SCCs to theChandler Carruth2014-04-181-4/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LazyCallGraph. This is the start of the whole point of this different abstraction, but it is just the initial bits. Here is a run-down of what's going on here. I'm planning to incorporate some (or all) of this into comments going forward, hopefully with better editing and wording. =] The crux of the problem with the traditional way of building SCCs is that they are ephemeral. The new pass manager however really needs the ability to associate analysis passes and results of analysis passes with SCCs in order to expose these analysis passes to the SCC passes. Making this work is kind-of the whole point of the new pass manager. =] So, when we're building SCCs for the call graph, we actually want to build persistent nodes that stick around and can be reasoned about later. We'd also like the ability to walk the SCC graph in more complex ways than just the traditional postorder traversal of the current CGSCC walk. That means that in addition to being persistent, the SCCs need to be connected into a useful graph structure. However, we still want the SCCs to be formed lazily where possible. These constraints are quite hard to satisfy with the SCC iterator. Also, using that would bypass our ability to actually add data to the nodes of the call graph to facilite implementing the Tarjan walk. So I've re-implemented things in a more direct and embedded way. This immediately makes it easy to get the persistence and connectivity correct, and it also allows leveraging the existing nodes to simplify the algorithm. I've worked somewhat to make this implementation more closely follow the traditional paper's nomenclature and strategy, although it is still a bit obtuse because it isn't recursive, using an explicit stack and a tail call instead, and it is interruptable, resuming each time we need another SCC. The other tricky bit here, and what actually took almost all the time and trials and errors I spent building this, is exactly *what* graph structure to build for the SCCs. The naive thing to build is the call graph in its newly acyclic form. I wrote about 4 versions of this which did precisely this. Inevitably, when I experimented with them across various use cases, they became incredibly awkward. It was all implementable, but it felt like a complete wrong fit. Square peg, round hole. There were two overriding aspects that pushed me in a different direction: 1) We want to discover the SCC graph in a postorder fashion. That means the root node will be the *last* node we find. Using the call-SCC DAG as the graph structure of the SCCs results in an orphaned graph until we discover a root. 2) We will eventually want to walk the SCC graph in parallel, exploring distinct sub-graphs independently, and synchronizing at merge points. This again is not helped by the call-SCC DAG structure. The structure which, quite surprisingly, ended up being completely natural to use is the *inverse* of the call-SCC DAG. We add the leaf SCCs to the graph as "roots", and have edges to the caller SCCs. Once I switched to building this structure, everything just fell into place elegantly. Aside from general cleanups (there are FIXMEs and too few comments overall) that are still needed, the other missing piece of this is support for iterating across levels of the SCC graph. These will become useful for implementing #2, but they aren't an immediate priority. Once SCCs are in good shape, I'll be working on adding mutation support for incremental updates and adding the pass manager that this analysis enables. llvm-svn: 206581
* Revert "blockfreq: Rewrite BlockFrequencyInfoImpl"Duncan P. N. Exon Smith2014-04-183-938/+2
| | | | | | | | | | | This reverts commits r206548, r206549 and r206549. There are some unit tests failing that aren't failing locally [1], so reverting until I have time to investigate. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 llvm-svn: 206556
* blockfreq: Really fix r206548 (and r206549)Duncan P. N. Exon Smith2014-04-181-32/+0
| | | | | | Turns out this code is dead. llvm-svn: 206554
* blockfreq: Fixing MSVC after r206548?Duncan P. N. Exon Smith2014-04-181-2/+2
| | | | llvm-svn: 206549
* blockfreq: Rewrite BlockFrequencyInfoImplDuncan P. N. Exon Smith2014-04-183-2/+970
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rewrite the shared implementation of BlockFrequencyInfo and MachineBlockFrequencyInfo entirely. The old implementation had a fundamental flaw: precision losses from nested loops (or very wide branches) compounded past loop exits (and convergence points). The @nested_loops testcase at the end of test/Analysis/BlockFrequencyAnalysis/basic.ll is motivating. This function has three nested loops, with branch weights in the loop headers of 1:4000 (exit:continue). The old analysis gives non-sensical results: Printing analysis 'Block Frequency Analysis' for function 'nested_loops': ---- Block Freqs ---- entry = 1.0 for.cond1.preheader = 1.00103 for.cond4.preheader = 5.5222 for.body6 = 18095.19995 for.inc8 = 4.52264 for.inc11 = 0.00109 for.end13 = 0.0 The new analysis gives correct results: Printing analysis 'Block Frequency Analysis' for function 'nested_loops': block-frequency-info: nested_loops - entry: float = 1.0, int = 8 - for.cond1.preheader: float = 4001.0, int = 32007 - for.cond4.preheader: float = 16008001.0, int = 128064007 - for.body6: float = 64048012001.0, int = 512384096007 - for.inc8: float = 16008001.0, int = 128064007 - for.inc11: float = 4001.0, int = 32007 - for.end13: float = 1.0, int = 8 Most importantly, the frequency leaving each loop matches the frequency entering it. The new algorithm leverages BlockMass and PositiveFloat to maintain precision, separates "probability mass distribution" from "loop scaling", and uses dithering to eliminate probability mass loss. I have unit tests for these types out of tree, but it was decided in the review to make the classes private to BlockFrequencyInfoImpl, and try to shrink them (or remove them entirely) in follow-up commits. The new algorithm should generally have a complexity advantage over the old. The previous algorithm was quadratic in the worst case. The new algorithm is still worst-case quadratic in the presence of irreducible control flow, but it's linear without it. The key difference between the old algorithm and the new is that control flow within a loop is evaluated separately from control flow outside, limiting propagation of precision problems and allowing loop scale to be calculated independently of mass distribution. Loops are visited bottom-up, their loop scales are calculated, and they are replaced by pseudo-nodes. Mass is then distributed through the function, which is now a DAG. Finally, loops are revisited top-down to multiply through the loop scales and the masses distributed to pseudo nodes. There are some remaining flaws. - Irreducible control flow isn't modelled correctly. LoopInfo and MachineLoopInfo ignore irreducible edges, so this algorithm will fail to scale accordingly. There's a note in the class documentation about how to get closer. See also the comments in test/Analysis/BlockFrequencyInfo/irreducible.ll. - Loop scale is limited to 4096 per loop (2^12) to avoid exhausting the 64-bit integer precision used downstream. - The "bias" calculation proposed on llvmdev is *not* incorporated here. This will be added in a follow-up commit, once comments from this review have been handled. llvm-svn: 206548
* remove some dead codeNuno Lopes2014-04-173-20/+0
| | | | | | | | | | | | | | | lib/Analysis/IPA/InlineCost.cpp | 18 ------------------ lib/Analysis/RegionPass.cpp | 1 - lib/Analysis/TypeBasedAliasAnalysis.cpp | 1 - lib/Transforms/Scalar/LoopUnswitch.cpp | 21 --------------------- lib/Transforms/Utils/LCSSA.cpp | 2 -- lib/Transforms/Utils/LoopSimplify.cpp | 6 ------ utils/TableGen/AsmWriterEmitter.cpp | 13 ------------- utils/TableGen/DFAPacketizerEmitter.cpp | 7 ------- utils/TableGen/IntrinsicEmitter.cpp | 2 -- 9 files changed, 71 deletions(-) llvm-svn: 206506
* Reverse 206485.Gerolf Hoflehner2014-04-171-8/+2
| | | | | | | | | After some discussions the preferred semantics of the always_inline attribute is inline always when the compiler can determine that it it safe to do so. llvm-svn: 206487
* [LCG] Just move the allocator (now that we can) when moving a callChandler Carruth2014-04-171-28/+14
| | | | | | | | | graph. This simplifies the custom move constructor operation to one of walking the graph and updating the 'up' pointers to point to the new location of the graph. Switch the nodes from a reference to a pointer for the 'up' edge to facilitate this. llvm-svn: 206450
* [LCG] Remove the Module reference member which we weren't using forChandler Carruth2014-04-171-3/+3
| | | | | | anything and doesn't make sense if assigning. llvm-svn: 206449
* Inline a function when the always_inline attributeGerolf Hoflehner2014-04-171-2/+8
| | | | | | | | | | is set even when it contains a indirect branch. The attribute overrules correctness concerns like the escape of a local block address. This is for rdar://16501761 llvm-svn: 206429
* RegionInfo: Do not access a value that was just moved awayTobias Grosser2014-04-151-1/+1
| | | | | | This fixes a regression introduced in r206310. llvm-svn: 206328
* Use unique_ptr to manage ownership of child Regions within llvm::RegionDavid Blaikie2014-04-153-31/+35
| | | | llvm-svn: 206310
* [C++11] More 'nullptr' conversion. In some cases just using a boolean check ↵Craig Topper2014-04-1539-505/+512
| | | | | | instead of comparing to nullptr. llvm-svn: 206243
* Fix a bug in which BranchProbabilityInfo wasn't setting branch weights of ↵Akira Hatanaka2014-04-141-0/+3
| | | | | | | | | | | | basic blocks inside loops correctly. Previously, BranchProbabilityInfo::calcLoopBranchHeuristics would determine the weights of basic blocks inside loops even when it didn't have enough information to estimate the branch probabilities correctly. This patch fixes the function to exit early if it doesn't see any exit edges or back edges and let the later heuristics determine the weights. This fixes PR18705 and <rdar://problem/15991090>. Differential Revision: http://reviews.llvm.org/D3363 llvm-svn: 206194
* blockfreq: Rename BlockFrequencyImpl to BlockFrequencyInfoImplDuncan P. N. Exon Smith2014-04-111-2/+2
| | | | | | | | | | | | This is a shared implementation class for BlockFrequencyInfo and MachineBlockFrequencyInfo, not for BlockFrequency, a related (but distinct) class. No functionality change. <rdar://problem/14292693> llvm-svn: 206083
* blockfreq: Use getSuccessorIndex()Duncan P. N. Exon Smith2014-04-111-5/+3
| | | | | | | | No functionality change. <rdar://problem/14292693> llvm-svn: 206082
* Delinearize: Extend informationin -analyze outputTobias Grosser2014-04-091-0/+4
| | | | llvm-svn: 205838
* divide by the result of the gcdSebastian Pop2014-04-081-1/+1
| | | | | | used to fail with 'Step should divide Start with no remainder.' llvm-svn: 205802
* handle special cases when findGCD returns 1Sebastian Pop2014-04-081-1/+6
| | | | | | used to fail with 'Step should divide Start with no remainder.' llvm-svn: 205801
* in findGCD of multiply expr return the gcdSebastian Pop2014-04-081-2/+4
| | | | | | we used to return 1 instead of the gcd llvm-svn: 205800
* Handle vlas during inline cost computation if they'll be turnedEric Christopher2014-04-071-1/+10
| | | | | | | | | | | into a constant size alloca by inlining. Ran a run over the testsuite, no results out of the noise, fixes the testcase in the PR. PR19115. llvm-svn: 205710
* Use TopTTI->getGEPCost from within getUserCostHal Finkel2014-04-011-4/+4
| | | | | | | | | | | The implementation of getUserCost had duplicated (and hard-coded) the default logic in getGEPCost. Instead, it is better to use getGEPCost directly, which limits the default logic to the implementation of one function, and allows targets to override the behavior. No functionality change intended. llvm-svn: 205346
* PR15967 Fix in basicaa for faulty returning no alias.Arnold Schwaighofer2014-03-261-11/+38
| | | | | | | | | | | | | | This commit consist of two parts. The first part fix the PR15967. The wrong conclusion was made when the MaxLookup limit was reached. The fix introduce a out parameter (MaxLookupReached) to DecomposeGEPExpression that the function aliasGEP can act upon. The second part is introducing the constant MaxLookupSearchDepth to make sure that DecomposeGEPExpression and GetUnderlyingObject use the same search depth. This is a small cleanup to clarify the original algorithm. Patch by Karl-Johan Karlsson! llvm-svn: 204859
* blockfreq: Implement Pass::releaseMemory()Duncan P. N. Exon Smith2014-03-251-9/+10
| | | | | | | | | | Implement Pass::releaseMemory() in BlockFrequencyInfo and MachineBlockFrequencyInfo. Just delete the private implementation when not in use. Switch to a std::unique_ptr to make the logic more clear. <rdar://problem/14292693> llvm-svn: 204741
* ScalarEvolution: Compute exit counts for loops with a power-of-2 step.Benjamin Kramer2014-03-251-0/+10
| | | | | | | | | | | | | If we have a loop of the form for (unsigned n = 0; n != (k & -32); n += 32) {} then we know that n is always divisible by 32 and the loop must terminate. Even if we have a condition where the loop counter will overflow it'll always hold this invariant. PR19183. Our loop vectorizer creates this pattern and it's also occasionally formed by loop counters derived from pointers. llvm-svn: 204728
* Simplify loop that worked around bugs in old GCC/Xcode.Erik Verbruggen2014-03-251-8/+2
| | | | | | GCC 4.0.1 and Xcode 2 are no longer supported for building llvm/clang. llvm-svn: 204705
* Allow constant folding of ceil function whenever feasibleKarthik Bhat2014-03-241-0/+3
| | | | llvm-svn: 204583
* [Constant Hoisting] Make the constant materialization cost operand dependentJuergen Ributzka2014-03-211-8/+8
| | | | | | | | | Extend the target hook to take also the operand index into account when calculating the cost of the constant materialization. Related to <rdar://problem/16381500> llvm-svn: 204435
* Revert "[Constant Hoisting] Extend coverage of the constant hoisting pass."Juergen Ributzka2014-03-201-8/+8
| | | | | | I will break this up into smaller pieces for review and recommit. llvm-svn: 204393
* [Constant Hoisting] Extend coverage of the constant hoisting pass.Juergen Ributzka2014-03-201-8/+8
| | | | | | | | | This commit extends the coverage of the constant hoisting pass, adds additonal debug output and updates the function names according to the style guide. Related to <rdar://problem/16381500> llvm-svn: 204389
* Add stride normalization to SCEV Normalize/Denormalize transformation.Michael Zolotukhin2014-03-181-3/+26
| | | | llvm-svn: 204161
* [C++11] Change DebugInfoFinder to use range-based loopsAlon Mishne2014-03-181-12/+8
| | | | | | Also changes the iterators to return actual DI type over MDNode. llvm-svn: 204130
* Consistent use of the noduplicate attribute.Eli Bendersky2014-03-173-5/+5
| | | | | | | | | The "noduplicate" attribute of call instructions is sometimes queried directly and sometimes through the cannotDuplicate() predicate. This patch streamlines all queries to use the cannotDuplicate() predicate. It also adds this predicate to InvokeInst, to mirror what CallInst has. llvm-svn: 204049
* Remove some dead assignements found by scan-buildArnaud A. de Grandmaison2014-03-151-6/+1
| | | | llvm-svn: 204013
* PR17473:Michael Zolotukhin2014-03-121-3/+21
| | | | | | | Don't normalize an expression during postinc transformation unless it's invertible. llvm-svn: 203719
* Test commitMichael Zolotukhin2014-03-121-0/+1
| | | | llvm-svn: 203716
* IR: add a second ordering operand to cmpxhg for failureTim Northover2014-03-111-1/+1
| | | | | | | | | | | | | | | The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 llvm-svn: 203559
* [TTI] There is actually no realistic way to pop TTI implementations offChandler Carruth2014-03-101-10/+0
| | | | | | | | | | | | | | the stack of the analysis group because they are all immutable passes. This is made clear by Craig's recent work to use override systematically -- we weren't overriding anything for 'finalizePass' because there is no such thing. This is kind of a lame restriction on the API -- we can no longer push and pop things, we just set up the stack and run. However, I'm not invested in building some better solution on top of the existing (terrifying) immutable pass and legacy pass manager. llvm-svn: 203437
* [LCG] Ran clang-format over this too and it pointed out some fixes.Chandler Carruth2014-03-101-4/+6
| | | | llvm-svn: 203435
* [LCG] Simplify a bunch of the LCG code with range for loops and auto.Chandler Carruth2014-03-091-37/+29
| | | | | | | Still more work to be done here to leverage C++11, but this clears out the glaring issues. llvm-svn: 203395
* [C++11] Add range based accessors for the Use-Def chain of a Value.Chandler Carruth2014-03-0913-83/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This requires a number of steps. 1) Move value_use_iterator into the Value class as an implementation detail 2) Change it to actually be a *Use* iterator rather than a *User* iterator. 3) Add an adaptor which is a User iterator that always looks through the Use to the User. 4) Wrap these in Value::use_iterator and Value::user_iterator typedefs. 5) Add the range adaptors as Value::uses() and Value::users(). 6) Update *all* of the callers to correctly distinguish between whether they wanted a use_iterator (and to explicitly dig out the User when needed), or a user_iterator which makes the Use itself totally opaque. Because #6 requires churning essentially everything that walked the Use-Def chains, I went ahead and added all of the range adaptors and switched them to range-based loops where appropriate. Also because the renaming requires at least churning every line of code, it didn't make any sense to split these up into multiple commits -- all of which would touch all of the same lies of code. The result is still not quite optimal. The Value::use_iterator is a nice regular iterator, but Value::user_iterator is an iterator over User*s rather than over the User objects themselves. As a consequence, it fits a bit awkwardly into the range-based world and it has the weird extra-dereferencing 'operator->' that so many of our iterators have. I think this could be fixed by providing something which transforms a range of T&s into a range of T*s, but that *can* be separated into another patch, and it isn't yet 100% clear whether this is the right move. However, this change gets us most of the benefit and cleans up a substantial amount of code around Use and User. =] llvm-svn: 203364
* [C++11] Convert sort predicates into lambdas.Benjamin Kramer2014-03-071-10/+7
| | | | | | No functionality change. llvm-svn: 203288
* Allow constant folding of round function whenever feasibleKarthik Bhat2014-03-071-0/+7
| | | | llvm-svn: 203198
* Teach lint about address spacesMatt Arsenault2014-03-061-6/+5
| | | | llvm-svn: 203132
* Replace OwningPtr<T> with std::unique_ptr<T>.Ahmed Charles2014-03-061-1/+1
| | | | | | | | | | This compiles with no changes to clang/lld/lldb with MSVC and includes overloads to various functions which are used by those projects and llvm which have OwningPtr's as parameters. This should allow out of tree projects some time to move. There are also no changes to libs/Target, which should help out of tree targets have time to move, if necessary. llvm-svn: 203083
* Allow constant folding of copysignKarthik Bhat2014-03-061-0/+7
| | | | llvm-svn: 203076
* [Layering] Move InstVisitor.h into the IR library as it is prettyChandler Carruth2014-03-064-4/+4
| | | | | | obviously coupled to the IR. llvm-svn: 203064
* [Layering] Move DebugInfo.h into the IR library where its implementationChandler Carruth2014-03-061-1/+1
| | | | | | already lives. llvm-svn: 203046
OpenPOWER on IntegriCloud