summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis
Commit message (Collapse)AuthorAgeFilesLines
* [PatternMatch] Stabilize the matching order of commutative matchersRoman Lebedev2018-04-271-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, we 1. match `LHS` matcher to the `first` operand of binary operator, 2. and then match `RHS` matcher to the `second` operand of binary operator. If that does not match, we swap the `LHS` and `RHS` matchers: 1. match `RHS` matcher to the `first` operand of binary operator, 2. and then match `LHS` matcher to the `second` operand of binary operator. This works ok. But it complicates writing of commutative matchers, where one would like to match (`m_Value()`) the value on one side, and use (`m_Specific()`) it on the other side. This is additionally complicated by the fact that `m_Specific()` stores the `Value *`, not `Value **`, so it won't work at all out of the box. The last problem is trivially solved by adding a new `m_c_Specific()` that stores the `Value **`, not `Value *`. I'm choosing to add a new matcher, not change the existing one because i guess all the current users are ok with existing behavior, and this additional pointer indirection may have performance drawbacks. Also, i'm storing pointer, not reference, because for some mysterious-to-me reason it did not work with the reference. The first one appears trivial, too. Currently, we 1. match `LHS` matcher to the `first` operand of binary operator, 2. and then match `RHS` matcher to the `second` operand of binary operator. If that does not match, we swap the ~~`LHS` and `RHS` matchers~~ **operands**: 1. match ~~`RHS`~~ **`LHS`** matcher to the ~~`first`~~ **`second`** operand of binary operator, 2. and then match ~~`LHS`~~ **`RHS`** matcher to the ~~`second`~ **`first`** operand of binary operator. Surprisingly, `$ ninja check-llvm` still passes with this. But i expect the bots will disagree.. The motivational unittest is included. I'd like to use this in D45664. Reviewers: spatel, craig.topper, arsenm, RKSimon Reviewed By: craig.topper Subscribers: xbolva00, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D45828 llvm-svn: 331085
* [MustExecute/LICM] Special case first instruction in throwing headerPhilip Reames2018-04-271-2/+5
| | | | | | | | | | | We currently have a hard to solve analysis problem around the order of instructions within a potentially throwing block. We can't cheaply determine whether a given instruction is before the first potential throw in the block. While we're working on that in the background, special case the first instruction within the header. why this particular special case? Well, headers are guaranteed to execute if the loop does, and it turns out we tend to produce this form in practice. In a follow on patch, I tend to extend LICM with an alternate approach which works for any instruction in the header before the first throw, but this is the best I can come up with other users of the analysis (such as store promotion.) Note: I can't show the difference in the analysis result since we're ORing in the expensive instruction walk used by SCEV. Using the full walk is not suitable for a general solution. llvm-svn: 331079
* [SCEV] Add trivial case handling for umin utilities. NFC.Serguei Katkov2018-04-271-2/+11
| | | | | | | | | Reviewers: sanjoy, mkazantsev Reviewed By: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46175 llvm-svn: 331022
* [SCEV] Introduce bulk umin creation utilitiesSerguei Katkov2018-04-271-19/+45
| | | | | | | | | | | | | | | | Add new umin creation method which accepts a list of operands. SCEV does not represents umin which is required in getExact, so it transforms umin to umax with not. As a result the transformation of tree of max to max with several operands does not work. We just use the new introduced method for creation umin from several operands. Reviewers: sanjoy, mkazantsev Reviewed By: sanjoy Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46047 llvm-svn: 331015
* Revert "[SimplifyLibcalls] Replace locked IO with unlocked IO"Matt Morehouse2018-04-271-41/+1
| | | | | | This reverts r331002 due to sanitizer bot breakage. llvm-svn: 331011
* [SimplifyLibcalls] Replace locked IO with unlocked IODavid Bolvansky2018-04-261-1/+41
| | | | | | | | | | | | Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed, Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D45736 llvm-svn: 331002
* [TTI, AArch64] Add transpose shuffle kindMatthew Simpson2018-04-261-10/+74
| | | | | | | | | | | | | | This patch adds a new shuffle kind useful for transposing a 2xn matrix. These transpose shuffle masks read corresponding even- or odd-numbered vector elements from two n-dimensional source vectors and write each result into consecutive elements of an n-dimensional destination vector. The transpose shuffle kind is meant to model the TRN1 and TRN2 AArch64 instructions. As such, this patch also considers transpose shuffles in the AArch64 implementation of getShuffleCost. Differential Revision: https://reviews.llvm.org/D45982 llvm-svn: 330941
* Revert "[SCEV] Make computeExitLimit more simple and more powerful"Max Kazantsev2018-04-261-17/+58
| | | | | | | | | | | This reverts commit 023c8be90980e0180766196cba86f81608b35d38. This patch triggers miscompile of zlib on PowerPC platform. Most likely it is caused by some pre-backend PPC-specific pass, but we don't clearly know the reason yet. So we temporally revert this patch with intention to return it once the problem is resolved. See bug 37229 for details. llvm-svn: 330893
* [CaptureTracking] Fixup const correctness of DomTree arg (NFC)Daniel Neilson2018-04-241-3/+3
| | | | | | | | | Summary: The PointerMayBeCapturedBefore function's DomTree arg should be const instead of non-const. There are no non-const uses of it in the function. llvm-svn: 330769
* [LVI] Fix typo. NFCXin Tong2018-04-241-1/+1
| | | | llvm-svn: 330688
* Reland r301880(!): "[InstSimplify] Handle selects of GEPs with 0 offset"George Burgess IV2018-04-241-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I was reminded today that this patch got reverted in r301885. I can no longer reproduce the failure that caused the revert locally (...almost one year later), and the patch applied pretty cleanly, so I guess we'll see if the bots still get angry about it. The original breakage was InstSimplify complaining (in "assertion failed" form) about getting passed some crazy IR when running `ninja check-sanitizer`. I'm unable to find traces of what, exactly, said crazy IR was. I suppose we'll find out pretty soon if that's still the case. :) Original commit: Author: gbiv Date: Mon May 1 18:12:08 2017 New Revision: 301880 URL: http://llvm.org/viewvc/llvm-project?rev=301880&view=rev Log: [InstSimplify] Handle selects of GEPs with 0 offset In particular (since it wouldn't fit nicely in the summary): (select (icmp eq V 0) P (getelementptr P V)) -> (getelementptr P V) Differential Revision: https://reviews.llvm.org/D31435 llvm-svn: 330667
* [DSE] Teach the pass that atomic memory intrinsics are stores.Daniel Neilson2018-04-231-4/+20
| | | | | | | | | | | | | | | | | | | Summary: This change teaches DSE that the atomic memory intrinsics are stores that can be eliminated, and can allow other stores to be eliminated. This change specifically does not teach DSE that these intrinsics can be partially eliminated (i.e. length reduced, and dest/src changed); that will be handled in another change. Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith Reviewed By: efriedma Subscribers: dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D45535 llvm-svn: 330629
* [LoopSimplify] Fix incorrect SCEV invalidationMax Kazantsev2018-04-231-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | In the function `simplifyOneLoop` we optimistically assume that changes in the inner loop only affect this very loop and have no impact on its parents. In fact, after rL329047 has been merged, we can now calculate exit counts for outer loops which may depend on inner loops. Thus, we need to invalidate all parents when we do something to a loop. There is an evidence of incorrect behavior of `simplifyOneLoop`: when we insert `SE->verify()` check in the end of this funciton, it fails on a bunch of existing test, in particular: LLVM :: Transforms/LoopUnroll/peel-loop-not-forced.ll LLVM :: Transforms/LoopUnroll/peel-loop-pgo.ll LLVM :: Transforms/LoopUnroll/peel-loop.ll LLVM :: Transforms/LoopUnroll/peel-loop2.ll Note that previously we have fixed issues of this variety, see rL328483. This patch makes this function invalidate the outermost loop properly. Differential Revision: https://reviews.llvm.org/D45937 Reviewed By: chandlerc llvm-svn: 330576
* [PatternMatch] allow undef elements when matching a vector zeroSanjay Patel2018-04-221-12/+8
| | | | | | | | | | | | | | | | | | | | | | | | | This is the last step in getting constant pattern matchers to allow undef elements in constant vectors. I'm adding a dedicated m_ZeroInt() function and building m_Zero() from that. In most cases, calling code can be updated to use m_ZeroInt() directly when there's no need to match pointers, but I'm leaving that efficiency optimization as a follow-up step because it's not always clear when that's ok. There are just enough icmp folds in InstSimplify that can be used for integer or pointer types, that we probably still want a generic m_Zero() for those cases. Otherwise, we could eliminate it (and possibly add a m_NullPtr() as an alias for isa<ConstantPointerNull>()). We're conservatively returning a full zero vector (zeroinitializer) in InstSimplify/InstCombine on some of these folds (see diffs in InstSimplify), but I'm not sure if that's actually necessary in all cases. We may be able to propagate an undef lane instead. One test where this happens is marked with 'TODO'. llvm-svn: 330550
* [BasicAA] Return MayAlias for the pointer plus variable offset toShiva Chen2018-04-161-6/+6
| | | | | | | | structure object member Differential Revision: https://reviews.llvm.org/D45510 llvm-svn: 330106
* [InstCombine] Simplify 'add' to 'or' if no common bits are set.Roman Lebedev2018-04-151-0/+8
| | | | | | | | | | | | | | | | | | | | | | | Summary: In order to get the whole fold as specified in [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]], let's first handle the simple straight-forward things. Let's start with the `and` -> `or` simplification. The one obvious thing missing here: the constant mask is not handled. I have an idea how to handle it, but it will require some thinking, and is not strictly required here, so i've left that for later. https://rise4fun.com/Alive/Pkmg Reviewers: spatel, craig.topper, eli.friedman, jingyue Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45631 llvm-svn: 330101
* [LV] Introduce TTI::getMinimumVFKrzysztof Parzyszek2018-04-131-0/+4
| | | | | | | | | | | The function getMinimumVF(ElemWidth) will return the minimum VF for a vector with elements of size ElemWidth bits. This value will only apply to targets for which TTI::shouldMaximizeVectorBandwidth returns true. The value of 0 indicates that there is no minimum VF. Differential Revision: https://reviews.llvm.org/D45271 llvm-svn: 330062
* Fix a typo in a comment; NFCGeorge Burgess IV2018-04-121-1/+1
| | | | llvm-svn: 329935
* [InstSimplify] fix formatting; NFCSanjay Patel2018-04-101-7/+7
| | | | llvm-svn: 329736
* [DA] Improve alias checking in dependence analysisDavid Green2018-04-101-10/+37
| | | | | | | | | | | | Improve the alias analysis to account for cases where we know that src/dst pairs cannot alias due to things like TBAA. As we know they are noalias, we know no dependency can occur. Also fixes issues around the size parameter to AA being incorrect. Differential Revision: https://reviews.llvm.org/D42381 llvm-svn: 329692
* [MemorySSA] remove cruft; NFC.George Burgess IV2018-04-091-23/+1
| | | | | | | | | | | | The caching walker used to hold its own caches, which made its `reset()` function meaningful. Since caching has been moved out of it, there's no reason to continue to have these cache-related methods. Similarly, the EXPENSIVE_CHECKS block that's getting removed used to rerun the query with caching disabled. Since that's how we always do queries now, it's redundant. llvm-svn: 329638
* [MemorySSA] Remove redundant assert; NFCGeorge Burgess IV2018-04-091-3/+0
| | | | | | | The `if (!Def && !Use) return nullptr;` right above this assert sort of defeats the purpose. llvm-svn: 329632
* [MemorySSAUpdater] Mark Phi users of a node being moved as non-optimizeZhaoshi Zheng2018-04-091-0/+17
| | | | | | | | | | Fix PR36484, as suggested: <quote> during moves, mark the direct users of the erased things that were phis as "not to be optimized" <quote> llvm-svn: 329621
* [NFC] Loosen restriction on preheader to fix buildbotMax Kazantsev2018-04-061-5/+5
| | | | llvm-svn: 329379
* Don't inline @llvm.icall.branch.funnelVitaly Buka2018-04-041-9/+14
| | | | | | | | | | | | | | Summary: @llvm.icall.branch.funnel is musttail with variable number of arguments. After inlining current backend can't separate call targets from call arguments. Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45116 llvm-svn: 329235
* [MemorySSA] Fix spelling errors in MemorySSA.cpp. NFCZhaoshi Zheng2018-04-041-2/+2
| | | | llvm-svn: 329230
* [Analysis] Support aligned new/delete functions.Eric Fiselier2018-04-042-0/+45
| | | | | | | | | | | | | | | | Summary: Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well. This allows the compiler to perform certain optimizations including eliding new/delete calls. Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer Reviewed By: bkramer Subscribers: ckennelly, llvm-commits Differential Revision: https://reviews.llvm.org/D44769 llvm-svn: 329218
* Revert "[Analysis] Support aligned new/delete functions."Eric Fiselier2018-04-042-45/+0
| | | | | | This reverts commit bee3bbd9bdd3ab3364b8fb0cdb6326bc1ae740e0. llvm-svn: 329217
* [Analysis] Support aligned new/delete functions.Eric Fiselier2018-04-042-0/+45
| | | | | | | | | | | | | | | | Summary: Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well. This allows the compiler to perform certain optimizations including eliding new/delete calls. Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer Reviewed By: bkramer Subscribers: ckennelly, llvm-commits Differential Revision: https://reviews.llvm.org/D44769 llvm-svn: 329215
* Make helpers static. NFC.Benjamin Kramer2018-04-041-1/+3
| | | | llvm-svn: 329170
* [SCEV] Prove implications for SCEVUnknown PhisMax Kazantsev2018-04-041-0/+118
| | | | | | | | | | | | | | | | | This patch teaches SCEV how to prove implications for SCEVUnknown nodes that are Phis. If we need to prove `Pred` for `LHS, RHS`, and `LHS` is a Phi with possible incoming values `L1, L2, ..., LN`, then if we prove `Pred` for `(L1, RHS), (L2, RHS), ..., (LN, RHS)` then we can also prove it for `(LHS, RHS)`. If both `LHS` and `RHS` are Phis from the same block, it is sufficient to prove the predicate for values that come from the same predecessor block. The typical case that it handles is that we sometimes need to prove that `Phi(Len, Len - 1) >= 0` given that `Len > 0`. The new logic was added to `isImpliedViaOperations` and only uses it and non-recursive reasoning to prove the facts we need, so it should not hurt compile time a lot. Differential Revision: https://reviews.llvm.org/D44001 Reviewed By: anna llvm-svn: 329150
* [SLP] Fix PR36481: vectorize reassociated instructions.Alexey Bataev2018-04-031-0/+61
| | | | | | | | | | | | | | | | | | Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Patch does not support reordering of the repeated instruction, this must be handled in the separate patch. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 329085
* Revert "[SLP] Fix PR36481: vectorize reassociated instructions."Benjamin Kramer2018-04-031-61/+0
| | | | | | This reverts commit r328980 and r329046. Makes the vectorizer crash. llvm-svn: 329071
* [SCEV] Fix PR36974.Serguei Katkov2018-04-031-5/+6
| | | | | | | | | | | | | The patch changes the usage of dominate to properlyDominate to satisfy the condition !(a < a) while using std::max. It is actually NFC due to set data structure is used to keep the Loops and no two identical loops can be in collection. So in reality there is no difference between usage of dominate and properlyDominate in this particular case. However it might be changed so it is better to fix it. llvm-svn: 329051
* [SCEV] Make computeExitLimit more simple and more powerfulMax Kazantsev2018-04-031-58/+17
| | | | | | | | | | | | | | | | | | | | | | | Current implementation of `computeExitLimit` has a big piece of code the only purpose of which is to prove that after the execution of this block the latch will be executed. What it currently checks is actually a subset of situations where the exiting block dominates latch. This patch replaces all these checks for simple particular cases with domination check over loop's latch which is the only necessary condition of taking the exiting block into consideration. This change allows to calculate exact loop taken count for simple loops like for (int i = 0; i < 100; i++) { if (cond) {...} else {...} if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44677 Reviewed By: efriedma llvm-svn: 329047
* [SLP] Fix PR36481: vectorize reassociated instructions.Alexey Bataev2018-04-021-0/+61
| | | | | | | | | | | | | | | Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 328980
* [Analysis] Change std::sort to llvm::sort in response to r327219Mandeep Singh Grang2018-04-018-21/+21
| | | | | | | | | | | | | | | | | | | | | | Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer D44363 for a list of all the required patches. Reviewers: sanjoy, dexonsmith, hfinkel, RKSimon Reviewed By: dexonsmith Subscribers: david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D44944 llvm-svn: 328925
* [ThinLTO] Add an option to force summary call edges cold for debuggingTeresa Johnson2018-03-311-1/+18
| | | | | | | | | | | | | | Summary: Useful to selectively disable importing into specific modules for debugging/triaging/workarounds. Reviewers: eraman Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D45062 llvm-svn: 328909
* Fix a bunch of typoes. NFCFangrui Song2018-03-301-1/+1
| | | | llvm-svn: 328907
* Fix an accidental circular dependencePhilip Reames2018-03-291-1/+1
| | | | llvm-svn: 328816
* [NFC] Fix meaningless assert in SCEVMax Kazantsev2018-03-291-2/+2
| | | | llvm-svn: 328764
* [MemorySSA] Turn an assert into a conditionGeorge Burgess IV2018-03-291-2/+2
| | | | | | | | | | | | Eli pointed out that variadic functions are totally a thing, so this assert is incorrect. No test-case is provided, since the only way this assert fires is if a specific DenseMap falls back to doing `isEqual` checks, and that seems fairly brittle (and requires a pyramid of growing `call void (i8, ...) @varargs(i8 0)`). llvm-svn: 328755
* [MemorySSA] Consider callsite args for hashing and equality.George Burgess IV2018-03-291-9/+20
| | | | | | | | | | | | | | We use a `DenseMap<MemoryLocOrCall, MemlocStackInfo>` to keep track of prior work when optimizing uses in MemorySSA. Because we weren't accounting for callsite arguments in either the hash code or equality tests for `MemoryLocOrCall`s, we optimized uses too aggressively in some rare cases. Fix by Daniel Berlin. Should fix PR36883. llvm-svn: 328748
* Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header ↵David Blaikie2018-03-281-0/+2
| | | | | | | | dependency Thanks to echristo for the pointers on direction. llvm-svn: 328737
* [NFC] OptPassGate extracted from OptBisectFedor Sergeev2018-03-273-3/+3
| | | | | | | | | | | | | | | Summary: This is an NFC refactoring of the OptBisect class to split it into an optional pass gate interface used by LLVMContext and the Optional Pass Bisector (OptBisect) used for debugging of optional passes. This refactoring is needed for D44464, which introduces setOptPassGate() method to allow implementations other than OptBisect. Patch by Yevgeny Rouban. Reviewers: andrew.w.kaylor, fedor.sergeev, vsk, dberlin, Eugene.Zelenko, reames, skatkov Reviewed By: fedor.sergeev Differential Revision: https://reviews.llvm.org/D44821 llvm-svn: 328637
* [LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per targetKrzysztof Parzyszek2018-03-271-0/+4
| | | | | | | | The default implementation returns false and keeps the current behavior. Differential Revision: https://reviews.llvm.org/D44735 llvm-svn: 328632
* [NFC] Fix comments in getExact()Max Kazantsev2018-03-271-12/+11
| | | | llvm-svn: 328612
* [SCEV] Make exact taken count calculation more optimisticMax Kazantsev2018-03-271-6/+16
| | | | | | | | | | | | | | | | | | | | | Currently, `getExact` fails if it sees two exit counts in different blocks. There is no solid reason to do so, given that we only calculate exact non-taken count for exiting blocks that dominate latch. Using this fact, we can simply take min out of all exits of all blocks to get the exact taken count. This patch makes the calculation more optimistic with enforcing our assumption with asserts. It allows us to calculate exact backedge taken count in trivial loops like for (int i = 0; i < 100; i++) { if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44676 Reviewed By: fhahn llvm-svn: 328611
* [SCEV] Add one more case in computeConstantDifferenceMax Kazantsev2018-03-271-10/+18
| | | | | | | | | | This patch teaches `computeConstantDifference` handle calculation of constant difference between `(X + C1)` and `(X + C2)` which is `(C2 - C1)`. Differential Revision: https://reviews.llvm.org/D43759 Reviewed By: anna llvm-svn: 328609
* [MemorySSA] Fix exponential compile-time updating MemorySSA.Eli Friedman2018-03-261-15/+28
| | | | | | | | | | | | | | | | | | | | | | | MemorySSAUpdater::getPreviousDefRecursive is a recursive algorithm, for each block, it computes the previous definition for each predecessor, then takes those definitions and combines them. But currently it doesn't remember results which it already computed; this means it can visit the same block multiple times, which adds up to exponential time overall. To fix this, this patch adds a cache. If we computed the result for a block already, we don't need to visit it again because we'll come up with the same result. Well, unless we RAUW a MemoryPHI; in that case, the TrackingVH will be updated automatically. This matches the original source paper for this algorithm. The testcase isn't really a test for the bug, but it adds coverage for the case where tryRemoveTrivialPhi erases an existing PHI node. (It's hard to write a good regression test for a performance issue.) Differential Revision: https://reviews.llvm.org/D44715 llvm-svn: 328577
OpenPOWER on IntegriCloud