path: root/llvm/lib/Analysis
Commit message | Author | Age | Files | Lines
...
* Reverted: Track validity of pass resultsSerge Pavlov2017-01-154-7/+2
| | | | | | This reverts commits r291882 and the related r291887. llvm-svn: 292062
* [PM] Teach the optimization remarks emitter to handle invalidationChandler Carruth2017-01-151-0/+12
| | | | | | | | | | events. This pass sometimes has a pointer to BlockFrequencyInfo so it needs custom invalidation logic. It is also otherwise immutable so we can reduce the number of invalidations that happen substantially. llvm-svn: 292058
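As a rough sketch of what such custom invalidation logic looks like in the new pass manager (treat the details as illustrative rather than a verbatim copy of the patch), the result checks whether the analysis it caches a pointer to has itself been invalidated:

    // Sketch: a result that holds a pointer to BlockFrequencyInfo must
    // report itself invalid whenever that dependency is invalidated.
    bool OptimizationRemarkEmitter::invalidate(
        Function &F, const PreservedAnalyses &PA,
        FunctionAnalysisManager::Invalidator &Inv) {
      // Otherwise this analysis has no state of its own to invalidate.
      return BFI && Inv.invalidate<BlockFrequencyAnalysis>(F, PA);
    }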
* [PM] Introduce an analysis set used to preserve all analyses overChandler Carruth2017-01-155-0/+46
| | | | | | | | | | | | | | | a function's CFG when that CFG is unchanged. This allows transformation passes to simply claim they preserve the CFG and analysis passes to check for the CFG being preserved to remove the fanout of all analyses being listed in all passes. I've gone through and removed or cleaned up as many of the comments reminding us to do this as I could. Differential Revision: https://reviews.llvm.org/D28627 llvm-svn: 292054
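In practice a transformation pass that leaves the CFG alone can now report that in one line. A minimal sketch, assuming the set is exposed as CFGAnalyses and used via PreservedAnalyses::preserveSet as in current LLVM; the pass name and helper are hypothetical:

    // Hypothetical in-place rewrite pass: it never adds or removes blocks
    // or edges, so every CFG-only analysis can be kept.
    PreservedAnalyses MyRewritePass::run(Function &F,
                                         FunctionAnalysisManager &AM) {
      if (!rewriteInPlace(F))            // hypothetical helper
        return PreservedAnalyses::all();
      PreservedAnalyses PA;
      PA.preserveSet<CFGAnalyses>();     // "I preserve anything CFG-based"
      return PA;
    }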
* [PM] The assumption cache is fundamentally designed to be self-updating,Chandler Carruth2017-01-151-1/+0
| | | | | | | | | | | | | | mark it as never invalidated in the new PM. The old PM already required this to work, and after a discussion with Hal this seems to really be the only sensible answer. The cache gracefully degrades as the IR is mutated, and most things which do this should already be incrementally updating the cache. This gets rid of a bunch of logic preserving and testing the invalidation of this analysis. llvm-svn: 292039
* Removing potentially error-prone fallthrough. NFCMarcello Maggioni2017-01-141-0/+1
| | | | | | | | If other cases are ever added between fabs and default, this fallthrough could cause fabs to fall through to the next case, resulting in a bug. Better to get rid of it immediately just to be sure. llvm-svn: 292003
* Compute summary before calling extractProfTotalWeightEaswaran Raman2017-01-141-11/+11
| | | | | | | | | | extractProfTotalWeight checks if the profile type is sample profile, but before that we have to ensure that the summary is available. Also expanded the unittest to test the case where there is no summary. Differential Revision: https://reviews.llvm.org/D28708 llvm-svn: 291982
* [SCEV] Limit recursion depth of constant evolving.Michael Liao2017-01-131-3/+10
| | | | | | | | | | - For a loop body with VERY complicated exit condition evaluation, constant evolving may run out of stack on platforms such as Windows. Need to limit the recursion depth. Differential Revision: https://reviews.llvm.org/D28629 llvm-svn: 291927
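The fix is the usual pattern of threading an explicit depth through the recursion and bailing out at a cap; a minimal generic sketch, where the cap of 32 and all names are assumptions rather than values from the patch:

    // Depth-limited recursion: give up cleanly instead of risking stack
    // overflow on pathologically deep expressions.
    struct Node { const Node *Ops[2] = {nullptr, nullptr}; };

    constexpr unsigned MaxEvalDepth = 32;   // assumed cap

    bool canEvaluate(const Node *N, unsigned Depth = 0) {
      if (!N)
        return true;                        // nothing to evaluate
      if (Depth > MaxEvalDepth)
        return false;                       // too deep: bail out conservatively
      // Pass Depth + 1 down so the bound covers the whole evaluation.
      return canEvaluate(N->Ops[0], Depth + 1) &&
             canEvaluate(N->Ops[1], Depth + 1);
    }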
* Remove unused lambda captures. NFCMalcolm Parsons2017-01-131-3/+3
| | | | llvm-svn: 291916
* Apply clang-tidy's performance-unnecessary-value-param to LLVM.Benjamin Kramer2017-01-131-2/+2
| | | | | | | With some minor manual fixes for using function_ref instead of std::function. No functional change intended. llvm-svn: 291904
* RegionPass: Set isExecuted flag correctlyTobias Grosser2017-01-131-0/+1
| | | | | | | This was forgotten in r291882. Without this fix, the Polly build bots are broken. llvm-svn: 291887
* Track validity of pass resultsSerge Pavlov2017-01-133-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | Running tests with expensive checks enabled exhibits some problems with verification of pass results. First, the pass verification may require results of analysis that are not available. For instance, verification of loop info requires results of dominator tree analysis. A pass may be marked as conserving loop info but does not need to be dependent on DominatorTreePass. When a pass manager tries to verify that loop info is valid, it needs dominator tree, but corresponding analysis may be already destroyed as no user of it remained. Another case is a pass that is skipped. For instance, entities with linkage available_externally do not need code generation and such passes are skipped for them. In this case result verification must also be skipped. To solve these problems this change introduces a special flag to the Pass structure to mark passes that have valid results. If this flag is reset, verifications dependent on the pass result are skipped. Differential Revision: https://reviews.llvm.org/D27190 llvm-svn: 291882
* ProfileSummaryInfo improvements.Easwaran Raman2017-01-131-5/+48
| | | | | | | | | | | * Add is{Hot|Cold}CallSite methods * Fix a bug in isHotBB where it was looking for MD_prof on a return instruction * Use MD_prof data only if sample profiling was used to collect profiles. * Add a unit test to ProfileSummaryInfo Differential Revision: https://reviews.llvm.org/D28584 llvm-svn: 291878
* [SCEV] Simplify SolveLinEquationWithOverflow a bit.Eli Friedman2017-01-121-7/+8
| | | | | | Cleanup in preparation for generalizing it. llvm-svn: 291808
* [Devirtualization] MemDep returns non-local !invariant.group dependenciesPiotr Padlewski2017-01-121-8/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Memory Dependence Analysis was limited to returning only local dependencies for invariant.group handling. Now it returns NonLocal when it finds one, and the found dependency can then be retrieved by asking getNonLocalPointerDependency. Thanks to this we are able to devirtualize loops!

    void indirect(A &a, int n) {
      for (int i = 0; i < n; i++)
        a.foo();
    }

    void test(int n) {
      A a;
      indirect(a, n);
    }

After inlining, a.foo() will be changed to a direct call, even if foo and A::A() are external (but only if the vtable definition is available). Reviewers: nlewycky, dberlin, chandlerc, rsmith Subscribers: mehdi_amini, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D28137 llvm-svn: 291762
* [SCEV] Make howFarToZero max backedge-taken count check for precondition.Eli Friedman2017-01-111-0/+17
| | | | | | | | | Refines max backedge-taken count if a loop like "for (int i = 0; i != n; ++i) { /* body */ }" is rotated. Differential Revision: https://reviews.llvm.org/D28536 llvm-svn: 291704
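For reference, this is roughly what the rotated form of that loop looks like; the guard is what supplies the precondition the max backedge-taken count computation can now use (illustrative C++, not code from the patch):

    void rotated(int n) {
      int i = 0;
      if (i != n) {          // guard inserted by rotation: body runs at least once
        do {
          // loop body
          ++i;
        } while (i != n);    // backedge taken n - 1 times for positive n
      }
    }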
* [SCEV] Make howFarToZero use a simpler formula for max backedge-taken count.Eli Friedman2017-01-111-11/+2
| | | | | | | | | This is both easier to understand, and produces a tighter bound in certain cases. Differential Revision: https://reviews.llvm.org/D28393 llvm-svn: 291701
* [MemDep] NFC variable name changePiotr Padlewski2017-01-111-3/+3
| | | | llvm-svn: 291679
* Make processing @llvm.assume more efficient - Add affected values to the ↵Hal Finkel2017-01-113-2/+114
| | | | | | | | | | | | | | | | | | | | | | | | assumption cache Here's my second try at making @llvm.assume processing more efficient. My previous attempt, which leveraged operand bundles, r289755, didn't end up working: it did make assume processing more efficient but eliminating the assumption cache made ephemeral value computation too expensive. This is a more-targeted change. We'll keep the assumption cache, but extend it to keep a map of affected values (i.e. values about which an assumption might provide some information) to the corresponding assumption intrinsics. This allows ValueTracking and LVI to find assumptions relevant to the value being queried without scanning all assumptions in the function. The fact that ValueTracking started doing O(number of assumptions in the function) work, for every known-bits query, has become prohibitively expensive in some cases. As discussed during the review, this is a pragmatic fix that, longer term, will likely be replaced by a more-principled solution (perhaps based on an extended SSA form). Differential Revision: https://reviews.llvm.org/D28459 llvm-svn: 291671
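A minimal sketch of the cache-side idea, using stand-in types (the real implementation lives in AssumptionCache; the names here are illustrative):

    #include <unordered_map>
    #include <vector>

    // Stand-ins for llvm::Value and an @llvm.assume call site.
    struct Value {};
    struct AssumeCall { std::vector<Value *> Affected; };

    class AssumptionCacheSketch {
      // Map each value an assumption says something about to the assume
      // calls mentioning it, so a query about one value no longer has to
      // scan every assumption in the function.
      std::unordered_map<Value *, std::vector<AssumeCall *>> AffectedValues;

    public:
      void registerAssumption(AssumeCall *A) {
        for (Value *V : A->Affected)
          AffectedValues[V].push_back(A);
      }

      // Cheap lookup of only the assumptions relevant to V.
      const std::vector<AssumeCall *> *assumptionsFor(Value *V) const {
        auto It = AffectedValues.find(V);
        return It == AffectedValues.end() ? nullptr : &It->second;
      }
    };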
* [PM] Separate the LoopAnalysisManager from the LoopPassManager and moveChandler Carruth2017-01-115-95/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | the latter to the Transforms library. While the loop PM uses an analysis to form the IR units, the current plan is to have the PM itself establish and enforce both loop simplified form and LCSSA. This would be a layering violation in the analysis library. Fundamentally, the idea behind the loop PM is to *transform* loops in addition to running passes over them, so it really seemed like the most natural place to sink this was into the transforms library. We can't just move *everything* because we also have loop analyses that rely on a subset of the invariants. So this patch splits the loop infrastructure into the analysis management that has to be part of the analysis library, and the transform-aware pass manager. This also required splitting the loop analyses' printer passes out to the transforms library, which makes sense to me as running these will transform the code into LCSSA in theory. I haven't split the unittest though because testing one component without the other seems nearly intractable. Differential Revision: https://reviews.llvm.org/D28452 llvm-svn: 291662
* [X86] updating TTI costs for arithmetic instructions on X86\SLM arch.Mohammed Agabaria2017-01-112-3/+7
| | | | | | | | | Updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. Also added a special optimization case which replaces pmulld with a pmullw/pmulhw/pshuf sequence when the real operand bitwidth is <= 16. Differential Revision: https://reviews.llvm.org/D28104 llvm-svn: 291657
* [PM] Rewrite the loop pass manager to use a worklist and augmented runChandler Carruth2017-01-115-54/+198
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arguments much like the CGSCC pass manager. This is a major redesign following the pattern established for the CGSCC layer to support updates to the set of loops during the traversal of the loop nest and to support invalidation of analyses. An additional significant burden in the loop PM is that so many passes require access to a large number of function analyses. Manually ensuring these are cached, available, and preserved has been a long-standing burden in LLVM even with the help of the automatic scheduling in the old pass manager. And it made the new pass manager extremely unwieldy. With this design, we can package the common analyses up while in a function pass and make them immediately available to all the loop passes. While in some cases this is unnecessary, I think the simplicity afforded is worth it. This does not (yet) address loop simplified form or LCSSA form, but those are the next things on my radar and I have a clear plan for them. While the patch is very large, most of it is either mechanically updating loop passes to the new API or the new testing for the loop PM. The code for it is reasonably compact. I have not yet updated all of the loop passes to correctly leverage the update mechanisms demonstrated in the unittests. I'll do that in follow-up patches along with improved FileCheck tests for those passes that ensure things work in more realistic scenarios. In many cases, there isn't much we can do with these until the loop simplified form and LCSSA form are in place. Differential Revision: https://reviews.llvm.org/D28292 llvm-svn: 291651
* InstSimplify: Refactor function to use more switchesMatt Arsenault2017-01-111-38/+51
| | | | llvm-svn: 291634
* InstSimplify: Eliminate fabs on known positiveMatt Arsenault2017-01-112-22/+64
| | | | llvm-svn: 291624
* Refactor inline threshold update code.Easwaran Raman2017-01-091-22/+19
| | | | | | | | | | Functional change: Previously, if a callee is cold, we used ColdThreshold whenever it was lower than the existing threshold, irrespective of whether we were optimizing for minsize (-Oz) or not. But -Oz uses a very low threshold to begin with, and inlining with -Oz is expected to be tuned for lowering code size, so there is no good reason to set an even lower threshold for cold callees. We now lower the threshold for cold callees only when -Oz is not used. For the default values of -inlinethreshold and -inlinecold-threshold, this change has no effect, and it simplifies the code. NFC changes: Group all threshold updates that are guarded by !Caller->optForMinSize(), and within that group the threshold updates that require profile summary info. Differential revision: https://reviews.llvm.org/D28369 llvm-svn: 291487
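The resulting decision logic is roughly the following sketch (names are illustrative, not the inliner's actual code):

    #include <algorithm>

    int selectThreshold(int DefaultThreshold, int ColdThreshold,
                        bool CallerOptForMinSize, bool CalleeIsCold) {
      // -Oz already starts from a very low, size-oriented threshold, so do
      // not push it even lower for cold callees.
      if (CallerOptForMinSize)
        return DefaultThreshold;
      // Otherwise a cold callee gets the smaller of the two thresholds.
      return CalleeIsCold ? std::min(DefaultThreshold, ColdThreshold)
                          : DefaultThreshold;
    }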
* Intrinsic::Bitreverse is safe to speculateXin Tong2017-01-091-0/+1
| | | | | | | | | | | | Summary: Intrinsic::Bitreverse is safe to speculate Reviewers: hfinkel, mkuper, arsenm, jmolloy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28471 llvm-svn: 291456
* [PM] Teach SCEV to invalidate itself when its dependencies becomeChandler Carruth2017-01-091-0/+12
| | | | | | | | | | | invalid. This fixes use-after-free bugs that will arise with any interesting use of SCEV. I've added a dedicated test that works diligently to trigger these kinds of bugs in the new pass manager and also checks for them explicitly as well as triggering ASan failures when things go squirrelly. llvm-svn: 291426
* [MemDep] NFC walk invariant.group graph only downPiotr Padlewski2017-01-081-26/+16
| | | | | | | | | | | | | | Summary: By using stripPointerCasts we can get to the root value and then walk down the bitcast graph Reviewers: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28181 llvm-svn: 291405
* [LCSSA] Fix some typos. NFCI.Davide Italiano2017-01-081-3/+3
| | | | llvm-svn: 291404
* [InstSimplify] Optimize away udivs in the presence of range metadataDavid Majnemer2017-01-061-0/+10
| | | | | | We know that udiv %V, C can be optimized away to 0 if %V is ult C. llvm-svn: 291296
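In other words: if the dividend's !range metadata proves it is unsigned-less-than the divisor, the whole udiv folds to zero. A hedged sketch of the underlying check using LLVM's ConstantRange/APInt utilities (illustrative, not the exact patch):

    #include "llvm/ADT/APInt.h"
    #include "llvm/IR/ConstantRange.h"
    using namespace llvm;

    // True if every value the dividend can take (per its known range) is
    // strictly less than the divisor, so 'udiv %V, C' can only be 0.
    bool udivIsAlwaysZero(const ConstantRange &DividendRange, const APInt &Divisor) {
      return DividendRange.getUnsignedMax().ult(Divisor);
    }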
* [InstSimplify] Optimize away urems in the presence of range metadataDavid Majnemer2017-01-061-0/+10
| | | | | | We know that urem %V, C can be optimized away to %V if %V is ult C. llvm-svn: 291282
* ThinLTO: add early "dead-stripping" on the IndexTeresa Johnson2017-01-051-4/+27
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Using the linker-supplied list of "preserved" symbols, we can compute the list of "dead" symbols, i.e. the ones that are not transitively reachable from a "preserved" symbol on the reference graph. Right now we are using this information to mark these functions as non-eligible for import. The impact is twofold: - Reduction of compile time: we don't import these functions anywhere, nor the functions these symbols are calling. - The limited number of imports/exports leads to better internalization. Patch originally by Mehdi Amini. Reviewers: mehdi_amini, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23488 llvm-svn: 291177
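The dead-symbol computation itself is a plain reachability walk over the reference graph; a generic sketch with illustrative types (not the summary-index API):

    #include <cstdint>
    #include <queue>
    #include <unordered_map>
    #include <unordered_set>
    #include <vector>

    using GUID = uint64_t;

    // Mark every symbol transitively reachable from the preserved roots as
    // live; anything left unmarked is "dead" and skipped for importing.
    std::unordered_set<GUID>
    computeLiveSymbols(const std::vector<GUID> &PreservedRoots,
                       const std::unordered_map<GUID, std::vector<GUID>> &Refs) {
      std::unordered_set<GUID> Live(PreservedRoots.begin(), PreservedRoots.end());
      std::queue<GUID> Worklist;
      for (GUID Root : PreservedRoots)
        Worklist.push(Root);
      while (!Worklist.empty()) {
        GUID Cur = Worklist.front();
        Worklist.pop();
        auto It = Refs.find(Cur);
        if (It == Refs.end())
          continue;
        for (GUID Ref : It->second)
          if (Live.insert(Ref).second)   // newly discovered -> visit later
            Worklist.push(Ref);
      }
      return Live;
    }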
* [ThinLTO] Use DenseSet instead of SmallPtrSet for holding GUIDsTeresa Johnson2017-01-051-4/+4
| | | | | | | | Should fix some more bot failures from r291108. This should have been a DenseSet, since GUID is not a pointer type. It caused some bots to fail, but for some reason I wasn't getting a build failure. llvm-svn: 291115
* [ThinLTO] Subsume all importing checks into a single flagTeresa Johnson2017-01-051-26/+71
| | | | | | | | | | | | | | | | | | | Summary: This adds a new summary flag NotEligibleToImport that subsumes several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal and IsNotViableToInline). It also subsumes the checking of references on the summary that was being done during the thin link by eligibleForImport() for each candidate. It is much more efficient to do that checking once during the per-module summary build and record it in the summary. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28169 llvm-svn: 291108
* Currently isLikelyComplexAddressComputation tries to figure out if the given ↵Mohammed Agabaria2017-01-051-2/+3
| | | | | | | | | | | stride seems to be 'complex' and needs some extra cost for address computation handling. This code seems to be target dependent, and the answer may not be the same for all targets. This change passes the decision whether the given stride is complex or not to the target by sending stride information via SCEV to getAddressComputationCost instead of 'IsComplex'. Specifically, on X86 targets we don't see any significant address computation cost for strided accesses in general. Differential Revision: https://reviews.llvm.org/D27518 llvm-svn: 291106
* [ValueTracking] remove stale comments; NFCSanjay Patel2017-01-021-6/+0
| | | | | | | The checks were improved with: https://reviews.llvm.org/rL290194 llvm-svn: 290826
* AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns.Elena Demikhovsky2017-01-021-0/+32
| | | | | | | | | | | The X86 target does not provide any target-specific cost calculation for interleave patterns. It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since we have 3-src shuffles that significantly reduce the cost. In this patch I calculate the cost on AVX-512. It will allow comparing the interleave pattern with gather/scatter and choosing a better solution (PR31426). * Shuffle-broadcast cost will be changed in Simon's upcoming patch. Differential Revision: https://reviews.llvm.org/D28118 llvm-svn: 290810
* Fix an issue with isGuaranteedToTransferExecutionToSuccessorSanjoy Das2016-12-311-6/+20
| | | | | | | | | | | | | | I'm not sure if this was intentional, but today isGuaranteedToTransferExecutionToSuccessor returns true for readonly and argmemonly calls that may throw. This commit changes the function to not implicitly infer nounwind this way. Even if we eventually specify readonly calls as not throwing, isGuaranteedToTransferExecutionToSuccessor is not the best place to infer that. We should instead teach FunctionAttrs or some other such pass to tag readonly functions / calls as nounwind instead. llvm-svn: 290794
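A minimal model of the distinction (not the real LLVM API): what a call does to memory and whether it can unwind are independent facts, and only the latter decides whether the next instruction is guaranteed to run.

    struct CallInfo {
      bool OnlyReadsMemory;  // readonly / argmemonly
      bool NoUnwind;         // nounwind
    };

    // Ignoring other abnormal exits (noreturn etc.) for brevity: a call is
    // guaranteed to reach its successor only if unwinding is ruled out
    // explicitly, not merely because it is read-only.
    bool guaranteedToReachSuccessor(const CallInfo &C) {
      return C.NoUnwind;
    }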
* Avoid const_cast; NFCSanjoy Das2016-12-311-2/+3
| | | | llvm-svn: 290793
* [ValueTracking] make dominator tree requirement explicit for ↵Sanjay Patel2016-12-311-1/+6
| | | | | | | | | | | | | | | | | isKnownNonNullFromDominatingCondition(); NFCI I don't think this hole is currently exposed, but I crashed regression tests for jump-threading and loop-vectorize after I added calls to isKnownNonNullAt() in InstSimplify as part of trying to solve PR28430: https://llvm.org/bugs/show_bug.cgi?id=28430 That's because they call into value tracking with a context instruction, but no other parts of the query structure filled in. For more background, see the discussion in: https://reviews.llvm.org/D27855 llvm-svn: 290786
* [LVI] Remove count/erase idiom in favor of checking result value of erasePhilip Reames2016-12-301-6/+2
| | | | | | Minor compile time win. Avoids an additional O(N) scan in the case where we are removing an element and costs nothing when we aren't. llvm-svn: 290768
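The idiom in question, shown on a plain std::set (a sketch of the pattern rather than the LVI code):

    #include <set>

    void forget(std::set<int> &Cache, int V) {
      // Before: two lookups -- count() followed by erase().
      //   if (Cache.count(V))
      //     Cache.erase(V);
      // After: erase() already reports how many elements it removed, so one
      // call both removes the entry and tells us whether it was present.
      bool WasPresent = Cache.erase(V) != 0;
      (void)WasPresent;  // use this instead of a separate count() check
    }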
* [MemDep] Handle gep with zeros for invariant.groupPiotr Padlewski2016-12-301-20/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: gep 0, 0 is equivalent to bitcast. LLVM canonicalizes it to getelementptr because SROA can then handle it. A simple case like

    void g(A &a) {
      z(a);
      if (glob)
        a.foo();
    }

    void testG() {
      A a;
      g(a);
    }

was not devirtualized with -fstrict-vtable-pointers because of the lack of handling for gep 0 in Memory Dependence Analysis. Reviewers: dberlin, nlewycky, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28126 llvm-svn: 290763
* [LVI] Manually hoist computation from loopPhilip Reames2016-12-301-7/+12
| | | | | | Minor compile time win. Not known to be a hot spot, just something I noticed while reading. llvm-svn: 290759
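The change is ordinary loop-invariant hoisting done by hand; a generic illustration (not the LVI code itself):

    #include <cmath>
    #include <vector>

    double sumScaled(const std::vector<double> &Xs, double Base, double Exp) {
      // Hoisted: the scale factor does not depend on the loop, so compute it
      // once instead of on every iteration.
      const double Scale = std::pow(Base, Exp);
      double Sum = 0.0;
      for (double X : Xs)
        Sum += X * Scale;
      return Sum;
    }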
* [PM] Teach the CGSCC's CG update utility to more carefully invalidateChandler Carruth2016-12-282-20/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | analyses when we're about to break apart an SCC. We can't wait until after breaking apart the SCC to invalidate things: 1) Which SCC do we then invalidate? All of them? 2) Even if we invalidate all of them, a newly created SCC may not have a proxy that will convey the invalidation to functions! Previously we only invalidated one of the SCCs, and too late. This led to stale analyses remaining in the cache. And because the caching strategy actually works, they would get used and chaos would ensue. Doing invalidation early is somewhat pessimizing though if we *know* that the SCC structure won't change. So it turns out that the design to make the mutation API force the caller to know the *kind* of mutation in advance was indeed 100% correct and we didn't do enough of it. So this change also splits two cases of switching a call edge to a ref edge into two separate APIs so that callers can clearly test for this and take the easy path without invalidating when appropriate. This is particularly important in this case as we expect most inlines to be between functions in separate SCCs and so the common case is that we don't have to so aggressively invalidate analyses. The LCG API change in turn needed some basic cleanups and better testing in its unittest. No interesting functionality changed there other than more coverage of the returned sequence of SCCs. While this seems like an obvious improvement over the current state, I'd like to revisit the core concept of invalidating within the CG-update layer at all. I'm wondering if we would be better served forcing the callers to handle the invalidation beforehand in the cases that they can handle it. An interesting example is when we want to teach the inliner to *update and preserve* analyses. But we can cross that bridge when we get there. With this patch, the new pass manager can build all of the LLVM test suite at -O3 and everything passes. =D I haven't bootstrapped yet and I'm sure there are still plenty of bugs, but this gives a nice baseline so I'm going to increasingly focus on fleshing out the missing functionality, especially the bits that are just turned off right now in order to let us establish this baseline. llvm-svn: 290664
* [LCG] Teach the ref edge removal to handle a ref edge that is trivialChandler Carruth2016-12-281-1/+7
| | | | | | | | | | | due to a call cycle. This actually crashed the ref removal before. I've added a unittest that covers this kind of interesting graph structure and mutation. llvm-svn: 290645
* [PM] Teach MemDep to invalidate its result object when its cachedChandler Carruth2016-12-271-0/+18
| | | | | | | | analysis handles become invalid. Add a test case for its invalidation logic. llvm-svn: 290620
* [PM] Remove a pointless optimization.Chandler Carruth2016-12-272-6/+0
| | | | | | | | There is no need to do this within an analysis. That method shouldn't even be reached if this predicate holds as the actual useful optimization is in the analysis manager itself. llvm-svn: 290614
* [ThinLTO] Fix "||" vs "|" mixup.Teresa Johnson2016-12-271-1/+1
| | | | | | | | | | | | | | The effect of the bug was that we would incorrectly create summaries for global and weak values defined in module asm (since we were essentially testing for bit 1 which is SF_Undefined, and the RecordStreamer ignores local undefined references). This would have resulted in conservatively disabling importing of anything referencing globals and weaks defined in module asm. Added these cases to the test which now fails without this bug fix. Fixes PR31459. llvm-svn: 290610
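The class of bug being fixed, on a generic flag word (illustrative, not the actual streamer code): logical '||' collapses its operands to 0 or 1 instead of merging or testing flag bits.

    #include <cstdint>

    enum : uint32_t { SF_Undefined = 1u << 0, SF_Global = 1u << 1, SF_Weak = 1u << 2 };

    uint32_t combineFlags(uint32_t A, uint32_t B) {
      // Buggy: '||' yields 0 or 1; since 1 is the SF_Undefined flag value,
      // any nonzero inputs make the result look like an undefined symbol.
      //   return A || B;
      // Fixed: bitwise '|' actually merges the two flag sets.
      return A | B;
    }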
* [MemDep] Operand visited twice bugfixPiotr Padlewski2016-12-271-0/+1
| | | | | | | | Because the operand was not marked as seen, it was visited twice. This doesn't change the behavior of the optimization; it just saves a redundant visit, so no test changes. llvm-svn: 290607
* [PM] Teach BasicAA how to invalidate its result object.Chandler Carruth2016-12-271-0/+18
| | | | | | | | | | | | | | | This requires custom handling because BasicAA caches handles to other analyses and so it needs to trigger indirect invalidation. This fixes one of the common crashes when using the new PM in real pipelines. I've also tweaked a regression test to check that we are at least handling the most immediate case. I'm going to work at re-structuring this test some to both scale better (rather than all being in one file) and check more invalidation paths in a follow-up commit, but I wanted to get the basic bug fix in place. llvm-svn: 290603
* [PM] Teach the AAManager and AAResults layer (the worst offender forChandler Carruth2016-12-271-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | inter-analysis dependencies) to use the new invalidation infrastructure. This teaches it to invalidate itself when any of the peer function AA results that it uses become invalid. We do this by just tracking the originating IDs. I've kept it in a somewhat clunky API since some users of AAResults are outside the new PM right now. We can clean this API up if/when those users go away. Secondly, it uses the registration on the outer analysis manager proxy to trigger deferred invalidation when a module analysis result becomes invalid. I've included test cases that specifically try to trigger use-after-free in both of these cases and they would crash or hang pretty horribly for me even without ASan. Now they work nicely. The `InvalidateAnalysis` utility pass required some tweaking to be useful in this context and it still is pretty garbage. I'd like to switch it back to the previous implementation and teach the explicit invalidate method on the AnalysisManager to take care of correctly triggering indirect invalidation, but I wanted to go ahead and send this out so folks could see how all of this stuff works together in practice. And, you know, that it does actually work. =] Differential Revision: https://reviews.llvm.org/D27205 llvm-svn: 290595