summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis
Commit message (Collapse)AuthorAgeFilesLines
* Some cleanup for r209568.Michael Zolotukhin2014-05-261-9/+7
| | | | llvm-svn: 209634
* Implement sext(C1 + C2*X) --> sext(C1) + sext(C2*X) andMichael Zolotukhin2014-05-241-0/+35
| | | | | | | | | | | sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformation in Scalar Evolution. That helps SLP-vectorizer to recognize consecutive loads/stores. <rdar://problem/14860614> llvm-svn: 209568
* Fix and improve SCEV ComputeBackedgeTankCount.Andrew Trick2014-05-231-19/+46
| | | | | | | | | | | | | This is a follow-up to r209358: PR19799: Indvars miscompile due to an incorrect max backedge taken count from SCEV. That fix was incomplete as pointed out by Arnold and Michael Z. The code was also too confusing. It needed a careful rewrite with more unit tests. This version will also happen to optimize more cases. <rdar://17005101> PR19799: Indvars miscompile... llvm-svn: 209545
* ScalarEvolution: Fix handling of AddRecs in isKnownPredicateJustin Bogner2014-05-231-12/+24
| | | | | | | | | | | | | ScalarEvolution::isKnownPredicate() can wrongly reduce a comparison when both the LHS and RHS are SCEVAddRecExprs. This checks that both LHS and RHS are guarded in the case when both are SCEVAddRecExprs. The test case is against indvars because I could not find a way to directly test SCEV. Patch by Sanjay Patel! llvm-svn: 209487
* Fix a bug in SCEV's backedge taken count computation from my prior fix in Jan.Andrew Trick2014-05-221-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This has to do with the trip count computation for loops with multiple exits, which is quite subtle. Most passes just ask for a single trip count number, so we must be conservative assuming any exit could be taken. Normally, we rely on the "exact" trip count, which was correctly given as "unknown". However, SCEV also gives a "max" back-edge taken count. The loops max BE taken count is conservatively a maximum over the max of each exit's non-exiting iterations count. Note that some exit tests can be skipped so the max loop back-edge taken count can actually exceed the max non-exiting iterations for some exits. However, when we know the loop *latch* cannot be skipped, we can directly use its max taken count disregarding other exits. I previously took the minimum here without checking whether the other exit could be skipped. The correct, and simpler thing to do here is just to directly use the loop latch's max non-exiting iterations as the loops max back-edge count. In the problematic test case, the first loop exit had a max of zero non-exiting iterations, but could be skipped. The loop latch was known not to be skipped but had max of one non-exiting iteration. We incorrectly claimed the loop back-edge could be taken zero times, when it is actually taken one time. Fixes Loop %for.body.i: <multiple exits> Unpredictable backedge-taken count. Loop %for.body.i: max backedge-taken count is 1. llvm-svn: 209358
* Clean up language and grammar.Eric Christopher2014-05-201-2/+2
| | | | | | | Based on a patch by jfcaron3@gmail.com! PR19806 llvm-svn: 209216
* Teach isKnownNonNull that a nonnull return is not null. Add a test for this ↵Nick Lewycky2014-05-201-0/+5
| | | | | | case as well as the case of a nonnull attribute (already handled but not tested). llvm-svn: 209193
* Add 'nonnull', a new parameter and return attribute which indicates that the ↵Nick Lewycky2014-05-201-2/+2
| | | | | | pointer is not null. Instcombine will elide comparisons between these and null. Patch by Luqman Aden! llvm-svn: 209185
* Check the alwaysinline attribute on the call as well as on the caller.Peter Collingbourne2014-05-191-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D3815 llvm-svn: 209150
* InstSimplify: Improve handling of ashr/lshrDavid Majnemer2014-05-161-1/+21
| | | | | | | | | | | | | | Summary: Analyze the range of values produced by ashr/lshr cst, %V when it is being used in an icmp. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3774 llvm-svn: 209000
* InstSimplify: Optimize using dividend in sdivDavid Majnemer2014-05-161-0/+4
| | | | | | | | | | | | | | | Summary: The dividend in an sdiv tells us the largest and smallest possible results. Use this fact to optimize comparisons against an sdiv with a constant dividend. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3795 llvm-svn: 208999
* Add C API for thread yielding callback.Juergen Ributzka2014-05-162-2/+9
| | | | | | | | | | | | | | | | | | | | | Sometimes a LLVM compilation may take more time then a client would like to wait for. The problem is that it is not possible to safely suspend the LLVM thread from the outside. When the timing is bad it might be possible that the LLVM thread holds a global mutex and this would block any progress in any other thread. This commit adds a new yield callback function that can be registered with a context. LLVM will try to yield by calling this callback function, but there is no guaranteed frequency. LLVM will only do so if it can guarantee that suspending the thread won't block any forward progress in other LLVM contexts in the same process. Once the client receives the call back it can suspend the thread safely and resume it at another time. Related to <rdar://problem/16728690> llvm-svn: 208945
* Instead of littering asserts throughout the code after every call toJay Foad2014-05-151-38/+27
| | | | | | | computeKnownBits, consolidate them into one assert at the end of computeKnownBits itself. llvm-svn: 208876
* Teach the constant folder to look through bitcast constant expressionsChandler Carruth2014-05-151-0/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | much more effectively when trying to constant fold a load of a constant. Previously, we only handled bitcasts by trying to find a totally generic byte representation of the constant and use that. Now, we look through the bitcast to see what constant we might fold the load into, and then try to form a constant expression cast of the found value that would be equivalent to loading the value. You might wonder why on earth this actually matters. Well, turns out that the Itanium ABI causes us to create a single array for a vtable where the first elements are virtual base offsets, followed by the virtual function pointers. Because the array is homogenous the element type is consistently i8* and we inttoptr the virtual base offsets into the initial elements. Then constructors bitcast these pointers to i64 pointers prior to loading them. Boom, no more constant folding of virtual base offsets. This is the first fix to LLVM to address the *insane* performance Eric Niebler discovered with Clang on his range comprehensions[1]. There is more to come though, this doesn't *really* fix the problem fully. [1]: http://ericniebler.com/2014/04/27/range-comprehensions/ llvm-svn: 208856
* Fix typosAlp Toker2014-05-151-2/+2
| | | | llvm-svn: 208839
* Rename ComputeMaskedBits to computeKnownBits. "Masked" has beenJay Foad2014-05-144-85/+85
| | | | | | inappropriate since it lost its Mask parameter in r154011. llvm-svn: 208811
* InstSimplify: Optimize signed icmp of -(zext V)David Majnemer2014-05-141-0/+22
| | | | | | | | | | | | | | | | Summary: We know that -(zext V) will always be <= zero, simplify signed icmps that have these. Uncovered using http://www.cs.utah.edu/~regehr/souper/ Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3754 llvm-svn: 208809
* Update the comments for ComputeMaskedBits, which lost its Mask parameterJay Foad2014-05-141-2/+2
| | | | | | in r154011. llvm-svn: 208757
* use nullptr instead of NULLSebastian Pop2014-05-121-4/+4
| | | | llvm-svn: 208622
* do not assert when delinearization failsSebastian Pop2014-05-121-8/+30
| | | | llvm-svn: 208615
* use isZero()Sebastian Pop2014-05-121-6/+5
| | | | llvm-svn: 208614
* SCEV: Use range-based for loop and fold variable into assert.Benjamin Kramer2014-05-101-6/+4
| | | | llvm-svn: 208476
* move findArrayDimensions to ScalarEvolutionSebastian Pop2014-05-092-10/+10
| | | | | | | we do not use the information from SCEVAddRecExpr to compute the shape of the array, so a better place for this function is in ScalarEvolution. llvm-svn: 208456
* fix typo in debug messageSebastian Pop2014-05-091-2/+2
| | | | llvm-svn: 208455
* Correct formatting.Tobias Grosser2014-05-081-4/+4
| | | | | | | | Sorry for the commit spam. My clang-format crashed on me and the vim plugin did not print an error, but instead just left the formatting untouched. llvm-svn: 208358
* Use std::remove_if to remove elements from a vectorTobias Grosser2014-05-081-5/+4
| | | | | Suggested-by: Benjamin Kramer <benny.kra@gmail.com> llvm-svn: 208357
* Use a range loop.Rafael Espindola2014-05-081-4/+2
| | | | llvm-svn: 208343
* Revert "SCEV: Use I = vector<>.erase(I) to iterate and delete at the same time"Tobias Grosser2014-05-081-3/+6
| | | | | | as committed in r208282. The original commit was incorrect. llvm-svn: 208286
* SCEV: Use I = vector<>.erase(I) to iterate and delete at the same timeTobias Grosser2014-05-081-6/+3
| | | | llvm-svn: 208282
* avoid segfaultingSebastian Pop2014-05-071-2/+1
| | | | | | *Quotient and *Remainder don't have to be initialized. llvm-svn: 208238
* do not collect undef termsSebastian Pop2014-05-071-1/+36
| | | | llvm-svn: 208237
* split delinearization pass in 3 stepsSebastian Pop2014-05-073-397/+484
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To compute the dimensions of the array in a unique way, we split the delinearization analysis in three steps: - find parametric terms in all memory access functions - compute the array dimensions from the set of terms - compute the delinearized access functions for each dimension The first step is executed on all the memory access functions such that we gather all the patterns in which an array is accessed. The second step reduces all this information in a unique description of the sizes of the array. The third step is delinearizing each memory access function following the common description of the shape of the array computed in step 2. This rewrite of the delinearization pass also solves a problem we had with the previous implementation: because the previous algorithm was by induction on the structure of the SCEV, it would not correctly recognize the shape of the array when the memory access was not following the nesting of the loops: for example, see polly/test/ScopInfo/multidim_only_ivs_3d_reverse.ll ; void foo(long n, long m, long o, double A[n][m][o]) { ; ; for (long i = 0; i < n; i++) ; for (long j = 0; j < m; j++) ; for (long k = 0; k < o; k++) ; A[i][k][j] = 1.0; Starting with this patch we no longer delinearize access functions that do not contain parameters, for example in test/Analysis/DependenceAnalysis/GCD.ll ;; for (long int i = 0; i < 100; i++) ;; for (long int j = 0; j < 100; j++) { ;; A[2*i - 4*j] = i; ;; *B++ = A[6*i + 8*j]; these accesses will not be delinearized as the upper bound of the loops are constants, and their access functions do not contain SCEVUnknown parameters. llvm-svn: 208232
* [C++11] Add NArySCEV->Operands iterator rangeTobias Grosser2014-05-071-8/+6
| | | | llvm-svn: 208158
* blockfreq: Move include to .cppDuncan P. N. Exon Smith2014-05-061-0/+1
| | | | llvm-svn: 208035
* [LCG] Add the last (and most complex) of the edge insertion mutationChandler Carruth2014-05-041-0/+119
| | | | | | | | | | | | | operations on the call graph. This one forms a cycle, and while not as complex as removing an internal edge from an SCC, it involves a reasonable amount of work to find all of the nodes newly connected in a cycle. Also somewhat alarming is the worst case complexity here: it might have to walk roughly the entire SCC inverse DAG to insert a single edge. This is carefully documented in the API (I hope). llvm-svn: 207935
* [TBAA] Fix handling of mixed TBAA (path-aware and non-path-aware TBAA).Juergen Ributzka2014-05-031-2/+7
| | | | | | | | | | | | | | This fix simply ensures that both metadata nodes are path-aware before performing path-aware alias analysis. This issue isn't normally triggered in LLVM, because we perform an autoupgrade of the TBAA metadata to the new format when reading in LL or BC files. This issue only appears when a client creates the IR manually and mixes old and new TBAA metadata format. This fixes <rdar://problem/16760860>. llvm-svn: 207923
* [LCG] Add the other simple edge insertion API to the call graph. ThisChandler Carruth2014-05-011-0/+15
| | | | | | | | | just connects an SCC to one of its descendants directly. Not much of an impact. The last one is the hard one -- connecting an SCC to one of its ancestors, and thereby forming a cycle such that we have to merge all the SCCs participating in the cycle. llvm-svn: 207751
* [LCG] Don't lookup the child SCC twice. Spotted this by inspection, andChandler Carruth2014-05-011-2/+2
| | | | | | no functionality changed. llvm-svn: 207750
* [LCG] Add some basic methods for querying the parent/child relationshipsChandler Carruth2014-05-011-0/+15
| | | | | | | | of SCCs in the SCC DAG. Exercise them in the big graph test case. These will be especially useful for establishing invariants in insertion logic. llvm-svn: 207749
* [LCG] Add the really, *really* boring edge insertion case: adding anChandler Carruth2014-04-301-4/+19
| | | | | | | | | | edge entirely within an existing SCC. Shockingly, making the connected component more connected is ... a total snooze fest. =] Anyways, its wired up, and I even added a test case to make sure it pretty much sorta works. =D llvm-svn: 207631
* [LCG] Actually test the *basic* edge removal bits (IE, the non-SCCChandler Carruth2014-04-301-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | bits), and discover that it's totally broken. Yay tests. Boo bug. Fix the basic edge removal so that it works by nulling out the removed edges rather than actually removing them. This leaves the indices valid in the map from callee to index, and preserves some of the locality for iterating over edges. The iterator is made bidirectional to reflect that it now has to skip over null entries, and the skipping logic is layered onto it. As future work, I would like to track essentially the "load factor" of the edge list, and when it falls below a threshold do a compaction. An alternative I considered (and continue to consider) is storing the callees in a doubly linked list where each element of the list is in a set (which is essentially the classical linked-hash-table datastructure). The problem with that approach is that either you need to heap allocate the linked list nodes and use pointers to them, or use a bucket hash table (with even *more* linked list pointer overhead!), etc. It's pretty easy to get 5x overhead for values that are just pointers. So far, I think punching holes in the vector, and periodic compaction is likely to be much more efficient overall in the space/time tradeoff. llvm-svn: 207619
* raw_ostream: Forward declare OpenFlags and include FileSystem.h only where ↵Benjamin Kramer2014-04-291-0/+1
| | | | | | necessary. llvm-svn: 207593
* blockfreq: Defer to BranchProbability::scale()Duncan P. N. Exon Smith2014-04-291-26/+0
| | | | | | `BlockMass` can now defer to `BranchProbability::scale()`. llvm-svn: 207547
* blockfreq: Remove more extra typenames from r207438Duncan P. N. Exon Smith2014-04-281-2/+2
| | | | llvm-svn: 207440
* Reapply "blockfreq: Approximate irreducible control flow"Duncan P. N. Exon Smith2014-04-281-20/+210
| | | | | | | | | | This reverts commit r207287, reapplying r207286. I'm hoping that declaring an explicit struct and instantiating `addBlockEdges()` directly works around the GCC crash from r207286. This is a lot more boilerplate, though. llvm-svn: 207438
* [LCG] Add the most basic of edge insertion to the lazy call graph. ThisChandler Carruth2014-04-281-0/+15
| | | | | | | just handles the pre-DFS case. Also add some test cases for this case to make sure it works. llvm-svn: 207411
* [LCG] Make the return of the IntraSCC removal method actually match itsChandler Carruth2014-04-281-5/+3
| | | | | | | | contract (and be much more useful). It now provides exactly the post-order traversal a caller might need to perform on newly formed SCCs. llvm-svn: 207410
* [inliner] Significantly improve the compile time in cases like PR19499Chandler Carruth2014-04-281-3/+23
| | | | | | | | | | | | | | | by avoiding inlining massive switches merely because they have no instructions in them. These switches still show up where we fail to form lookup tables, and in those cases they are actually going to cause a very significant code size hit anyways, so inlining them is not the right call. The right way to fix any performance regressions stemming from this is to enhance the switch-to-lookup-table logic to fire in more places. This makes PR19499 about 5x less bad. It uncovers a second compile time problem in that test case that is unrelated (surprisingly!). llvm-svn: 207403
* [C++] Use 'nullptr'.Craig Topper2014-04-281-1/+1
| | | | llvm-svn: 207394
* [LCG] Re-organize the methods for mutating a call graph to make theirChandler Carruth2014-04-271-76/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | API requirements much more obvious. The key here is that there are two totally different use cases for mutating the graph. Prior to doing any SCC formation, it is very easy to mutate the graph. There may be users that want to do small tweaks here, and then use the already-built graph for their SCC-based operations. This method remains on the graph itself and is documented carefully as being cheap but unavailable once SCCs are formed. Once SCCs are formed, and there is some in-flight DFS building them, we have to be much more careful in how we mutate the graph. These mutation operations are sunk onto the SCCs themselves, which both simplifies things (the code was already there!) and helps make it obvious that these interfaces are only applicable within that context. The other primary constraint is that the edge being mutated is actually related to the SCC on which we call the method. This helps make it obvious that you cannot arbitrarily mutate some other SCC. I've tried to write much more complete documentation for the interesting mutation API -- intra-SCC edge removal. Currently one aspect of this documentation is a lie (the result list of SCCs) but we also don't even have tests for that API. =[ I'm going to add tests and fix it to match the documentation next. llvm-svn: 207339
OpenPOWER on IntegriCloud