path: root/llvm/lib/Analysis
* Allow isDereferenceablePointer to look through some bitcasts
  Hal Finkel, 2014-07-10 (1 file changed, -1/+1)

  isDereferenceablePointer should not give up upon encountering any bitcast. If
  we're casting from a pointer to a larger type to a pointer to a smaller type,
  we can continue by examining the bitcast's operand. This missing capability
  was noted in a comment in the function.

  In order for this to work, isDereferenceablePointer now takes an optional
  DataLayout pointer (essentially all callers already had such a pointer
  available). Most code uses isDereferenceablePointer through
  isSafeToSpeculativelyExecute (which already took an optional DataLayout
  pointer), and to enable the LICM test case, LICM needs to actually provide its
  DL pointer to isSafeToSpeculativelyExecute (which it was not doing previously).

  llvm-svn: 212686

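  A minimal sketch of the bitcast case described above, with a simplified helper
  name made up for illustration; this shows the idea rather than the actual LLVM
  implementation:

    #include "llvm/IR/DataLayout.h"
    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    // Only look through a bitcast when the source element type is at least as
    // large as the destination's, so every byte the new type reads is covered
    // by the old object.
    static bool isDereferenceableSketch(const Value *V, const DataLayout *DL) {
      if (const BitCastInst *BC = dyn_cast<BitCastInst>(V)) {
        Type *SrcTy = BC->getSrcTy()->getPointerElementType();
        Type *DstTy = BC->getDestTy()->getPointerElementType();
        if (DL && SrcTy->isSized() && DstTy->isSized() &&
            DL->getTypeStoreSize(SrcTy) >= DL->getTypeStoreSize(DstTy))
          return isDereferenceableSketch(BC->getOperand(0), DL);
      }
      // ... allocas, byval arguments, globals, etc. handled as before ...
      return false;
    }
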
* Improve BasicAA CS-CS queries
  Hal Finkel, 2014-07-08 (3 files changed, -126/+142)

  BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
  and uses that information to form more-accurate answers to CallSite vs. Loc
  ModRef queries. Unfortunately, it did not use this information when answering
  CallSite vs. CallSite queries.

  Generically, when an intrinsic takes one or more pointers and the intrinsic is
  marked only to read/write from its arguments, the offset/size is unknown. As a
  result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
  Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
  arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
  size information for some intrinsics, it did not do the same for CallSite vs.
  CallSite queries.

  This change refactors the intrinsic-specific logic in BasicAA into a generic
  AA query function: getArgLocation, which is overridden by BasicAA to supply
  the intrinsic-specific knowledge, and used by AA's generic implementation.
  This allows the intrinsic-specific knowledge to be used by both CallSite vs.
  Loc and CallSite vs. CallSite queries, and simplifies the BasicAA
  implementation.

  Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
  (all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
  getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
  this function (which is an improvement).

  llvm-svn: 212572

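  As a conceptual illustration of why argument sizes matter for CallSite vs.
  CallSite queries (the buffer and sizes below are made-up example values, not
  code from the patch):

    #include <cstring>

    // Two memset calls that write disjoint 16-byte regions cannot interfere,
    // but an alias-analysis query can only prove NoModRef between them if it
    // knows each call accesses exactly 16 bytes at its pointer argument rather
    // than an unknown-sized region.
    void initHalves(char *Buf) {
      std::memset(Buf, 0, 16);       // writes Buf[0..15]
      std::memset(Buf + 16, 1, 16);  // writes Buf[16..31]: disjoint from the first
    }
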
* fixed typos in comments
  Sanjay Patel, 2014-07-06 (1 file changed, -2/+2)

  llvm-svn: 212424

* InstSimplify: Fix a bug when INT_MIN is in an sdiv
  David Majnemer, 2014-07-04 (1 file changed, -3/+9)

  When INT_MIN is the numerator in an sdiv, we would not properly handle
  overflow when calculating the bounds of possible values; abs(INT_MIN) is not a
  meaningful number. Instead, check and handle INT_MIN by reasoning that the
  largest value is INT_MIN/-2 and the smallest value is INT_MIN.

  This fixes PR20199.

  llvm-svn: 212307

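  A small illustration of the bounds involved, using i8 in place of a full-width
  integer (the values are example numbers chosen here, not taken from the
  patch):

    #include <cstdint>
    #include <cstdio>

    int main() {
      int8_t Min = INT8_MIN;          // -128, stands in for INT_MIN
      std::printf("%d\n", Min / -2);  // 64: the largest possible result
      std::printf("%d\n", Min / 1);   // -128: the smallest possible result
      // Min / -1 is the overflow case: abs(INT_MIN) is not representable.
      return 0;
    }
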
* [CostModel][x86] Improved cost model for alternate shuffles.
  Andrea Di Biagio, 2014-07-03 (1 file changed, -3/+34)

  This patch:
  1) Improves the cost model for x86 alternate shuffles (originally added at
     revision 211339);
  2) Teaches the Cost Model Analysis pass how to analyze alternate shuffles.

  Alternate shuffles are a special kind of blend; on x86, we can often easily
  lower an alternate shuffle into a single blend instruction (depending on the
  subtarget features). The existing cost model didn't take into account
  subtarget features. Also, it had a couple of "dead" entries for vector types
  that are never legal (example: on x86, types v2i32 and v2f32 are not legal;
  those are always either promoted or widened to 128-bit vector types). The new
  x86 cost model takes into account what target features we have before
  returning the shuffle cost (i.e. the number of instructions after the blend is
  lowered/expanded).

  This patch also teaches the Cost Model Analysis how to identify and analyze
  alternate shuffles (i.e. 'SK_Alternate' shufflevector instructions):
  - added function 'isAlternateVectorMask';
  - added some logic to check if an instruction is an alternate shuffle and, in
    that case, call the target specific TTI to get the corresponding shuffle
    cost;
  - added a test to verify the cost model analysis on alternate shuffles.

  llvm-svn: 212296

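  A rough sketch of what a check like 'isAlternateVectorMask' might look for;
  this is an assumption about its shape based on the description above, not the
  code added by the patch. An alternate shuffle of two N-element vectors picks
  even lanes from one source and odd lanes from the other, e.g. the mask
  <0, 5, 2, 7> for N = 4:

    #include "llvm/ADT/ArrayRef.h"

    static bool isAlternateVectorMaskSketch(llvm::ArrayRef<int> Mask) {
      unsigned NumElts = Mask.size();
      bool EvenFromFirst = true, EvenFromSecond = true;
      for (unsigned I = 0; I != NumElts; ++I) {
        if (Mask[I] < 0)
          continue; // undef lane, compatible with either pattern
        unsigned FromFirst = (I % 2 == 0) ? I : I + NumElts;
        unsigned FromSecond = (I % 2 == 0) ? I + NumElts : I;
        EvenFromFirst &= (unsigned)Mask[I] == FromFirst;
        EvenFromSecond &= (unsigned)Mask[I] == FromSecond;
      }
      return EvenFromFirst || EvenFromSecond;
    }
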
* Add new lines to debugging information.
  Richard Trieu, 2014-07-03 (1 file changed, -1/+1)

  Differential Revision: http://reviews.llvm.org/D4262

  llvm-svn: 212250

* Suppress inlining when the block address is taken
  Gerolf Hoflehner, 2014-07-01 (1 file changed, -6/+13)

  Inlining functions with block addresses can cause many problems and requires a
  rich infrastructure to support, including escape analysis. At this point the
  safest approach to address these problems is by blocking inlining from
  happening.

  Background: There have been reports of Ruby segmentation faults triggered by
  inlining functions with block addresses like

    // Ruby code snippet
    vm_exec_core() {
      finish_insn_seq_0 = &&INSN_LABEL_finish;
      INSN_LABEL_finish:
        ;
    }

  This kind of scenario can also happen when LLVM picks a subset of blocks for
  inlining, which is the case with the actual code in the Ruby environment.

  LLVM suppresses inlining for such functions when there is an indirect branch.
  The attached patch does so even when there is no indirect branch. Note that
  user code like the above would not make much sense: using the global for
  jumping across function boundaries would be illegal.

  Why was there a segfault: in the snippet above the block with the label is
  recognized as dead, so it is eliminated. Instead of a block address the cloner
  stores a constant (sic!) into the global, resulting in the segfault (when the
  global is used in a goto).

  Why had it worked in the past then: by luck. In older versions vm_exec_core
  was also inlined, but the label address used was the block label address in
  vm_exec_core. So the global jump ended up in the original function rather than
  in the caller, which accidentally happened to work.

  Test case ./tools/clang/test/CodeGen/indirect-goto.c will fail as a result of
  this commit.

  rdar://17245966

  llvm-svn: 212077

* Revert "Introduce a string_ostream string builder facilty"Alp Toker2014-06-263-11/+12
| | | | | | Temporarily back out commits r211749, r211752 and r211754. llvm-svn: 211814
* This patch removed duplicate code for matching patterns
  Dinesh Dwivedi, 2014-06-26 (1 file changed, -106/+1)

  The removed patterns are now handled in SimplifyUsingDistributiveLaws()
  (after r211261).

  Differential Revision: http://reviews.llvm.org/D4253

  llvm-svn: 211768

* MSVC build fix following r211749
  Alp Toker, 2014-06-26 (1 file changed, -2/+4)

  Avoid strndup()

  llvm-svn: 211752

* Introduce a string_ostream string builder facility
  Alp Toker, 2014-06-26 (3 files changed, -11/+8)

  string_ostream is a safe and efficient string builder that combines opaque
  stack storage with a built-in ostream interface.

  small_string_ostream<bytes> additionally permits an explicit stack storage
  size other than the default 128 bytes to be provided. Beyond that, storage is
  transferred to the heap.

  This convenient class can be used in most places an
  std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
  would previously have been used, in order to guarantee consistent access
  without byte truncation.

  The patch also converts much of LLVM to use the new facility. These changes
  include several probable bug fixes for truncated output, a programming error
  that's no longer possible with the new interface.

  llvm-svn: 211749

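  A before/after usage sketch of the pattern this facility targets; the
  string_ostream spelling follows the commit message and is shown only for
  illustration, since the facility was reverted again in r211814 (see the entry
  above):

    #include "llvm/Support/raw_ostream.h"
    #include <string>

    // The existing pattern that string_ostream was meant to replace.
    std::string buildMessageOld(int N) {
      std::string Buf;
      llvm::raw_string_ostream OS(Buf);
      OS << "value = " << N;
      return OS.str();
    }

    // The intended replacement, as described in the commit message:
    //   string_ostream OS;
    //   OS << "value = " << N;
    //   // read the result via OS.str()
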
* Support: Move class ScaledNumber
  Duncan P. N. Exon Smith, 2014-06-24 (1 file changed, -190/+0)

  ScaledNumber has been cleaned up enough to pull out of BFI now. Still work to
  do there (tests for shifting, bloated printing code, etc.), but it seems clean
  enough for its new home.

  llvm-svn: 211562

* BFI: Un-floatify more language
  Duncan P. N. Exon Smith, 2014-06-24 (1 file changed, -23/+24)

  llvm-svn: 211561

* Support: Extract ScaledNumbers::MinScale and MaxScale
  Duncan P. N. Exon Smith, 2014-06-24 (1 file changed, -10/+5)

  llvm-svn: 211558

* BFI: Change language from "exponent" to "scale"
  Duncan P. N. Exon Smith, 2014-06-23 (1 file changed, -8/+8)

  llvm-svn: 211557

* BFI: Rename UnsignedFloat => ScaledNumber
  Duncan P. N. Exon Smith, 2014-06-23 (1 file changed, -17/+17)

  A lot of the docs and API are out of date, but I'll leave that for a separate
  commit.

  llvm-svn: 211555

* SCEVExpander: Fold constant PHIs harder. The logic below only understands proper IVs. PR20093.
  Benjamin Kramer, 2014-06-21 (1 file changed, -1/+2)

  llvm-svn: 211433

* Add back functionality removed in r210497.
  Richard Trieu, 2014-06-21 (4 files changed, -8/+16)

  Instead of asserting, output a message stating that a null pointer was found.

  llvm-svn: 211430

* Support: Write ScaledNumber::getQuotient() and getProduct()
  Duncan P. N. Exon Smith, 2014-06-20 (1 file changed, -91/+0)

  llvm-svn: 211409

* [ValueTracking] Extend range metadata to call/invoke
  Jingyue Wu, 2014-06-19 (1 file changed, -5/+12)

  Summary:
  With this patch, range metadata can be added to call/invoke, including
  IntrinsicInst. Previously, it could only be added to load.

  Rename computeKnownBitsLoad to computeKnownBitsFromRangeMetadata because range
  metadata is not only used by load.

  Update the language reference to reflect this change.

  Test Plan:
  Add several tests in range-2.ll to confirm the verifier is happy with having
  range metadata on call/invoke.

  Add two tests in AddOverFlow.ll to confirm annotating range metadata to
  call/invoke can benefit InstCombine.

  Reviewers: meheff, nlewycky, reames, hfinkel, eliben

  Reviewed By: eliben

  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D4187

  llvm-svn: 211281

* Move optimization of some cases of (A & C1)|(B & C2) from instcombine to instsimplify.
  Nick Lewycky, 2014-06-19 (1 file changed, -0/+32)

  Patch by Rahul Jain, plus some last minute changes by me -- you can blame me
  for any bugs.

  llvm-svn: 211252

* Make instsimplify's analysis of icmp eq/ne use computeKnownBits to determine whether the icmp is always true or false.
  Nick Lewycky, 2014-06-19 (1 file changed, -0/+19)

  Patch by Suyog Sarda!

  llvm-svn: 211251

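  A minimal sketch of the known-bits argument behind this fold (an illustration
  of the idea, not the patch's code): if some bit is known one in one operand
  and known zero in the other, the values cannot be equal, so icmp eq folds to
  false and icmp ne to true.

    #include "llvm/ADT/APInt.h"

    // KnownZero/KnownOne are the bit masks computeKnownBits reports for each
    // operand; any overlap between "known one" of one operand and "known zero"
    // of the other proves the values differ.
    static bool knownNotEqual(const llvm::APInt &AKnownZero,
                              const llvm::APInt &AKnownOne,
                              const llvm::APInt &BKnownZero,
                              const llvm::APInt &BKnownOne) {
      return (AKnownOne & BKnownZero).getBoolValue() ||
             (AKnownZero & BKnownOne).getBoolValue();
    }
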
* Removing an "if (!this)" check from two print methods. The condition willRichard Trieu2014-06-094-2/+8
| | | | | | | never be true in a well-defined context. The checking for null pointers has been moved into the caller logic so it does not rely on undefined behavior. llvm-svn: 210497
* Remove old fenv.h workaround for a historic clang driver bug
  Alp Toker, 2014-06-09 (1 file changed, -9/+2)

  Tested and works fine with clang using libstdc++. All indications are that
  this was fixed some time ago and isn't a problem with any clang version we
  support.

  I've added a note in PR6907 which is still open for some reason.

  llvm-svn: 210485

* Fold FEnv.h into the implementation
  Alp Toker, 2014-06-09 (1 file changed, -7/+41)

  Support headers shouldn't use config.h definitions, and they should never be
  undefined like this.

  ConstantFolding.cpp was the only user of this facility and already includes
  config.h for other math features, so it makes sense to move the checks there
  at point of use.

  (The implicit config.h was also quite dangerous -- removing the FEnv.h include
  would have silently disabled math constant folding without causing any tests
  to fail. Need to investigate -Wundef once the cleanup is done.)

  This eliminates the last config.h include from LLVM headers, paving the way
  for more consistent configuration checks.

  llvm-svn: 210483

* ScalarEvolution: Derive element size from the type of the loaded element
  Tobias Grosser, 2014-06-08 (1 file changed, -1/+1)

  Before, we were looking at the size of the pointer type that specifies the
  location from which to load the element. This did not make any sense at all.

  This change fixes a bug in the delinearization where we failed to delinearize
  certain load instructions.

  llvm-svn: 210435

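  A minimal sketch of the corrected computation; the helper name is made up for
  illustration (the patch itself touches ScalarEvolution::getElementSize). The
  element size comes from the type of the value being loaded, not from the
  pointer operand's type:

    #include "llvm/IR/DataLayout.h"
    #include "llvm/IR/Instructions.h"
    #include <cstdint>

    static uint64_t elementSizeOfLoad(const llvm::LoadInst *LI,
                                      const llvm::DataLayout &DL) {
      // Correct: size of the loaded element, e.g. 4 bytes for an i32 load.
      return DL.getTypeAllocSize(LI->getType());
      // Incorrect (the old behavior): using the size of the pointer type that
      // specifies the location, which is 8 bytes on a 64-bit target regardless
      // of what is actually loaded.
    }
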
* Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute.
  Tom Roeder, 2014-06-05 (3 files changed, -0/+42)

  It includes a pass that rewrites all indirect calls to jumptable functions to
  pass through these tables.

  This also adds backend support for generating the jump-instruction tables on
  ARM and X86. Note that since the jumptable attribute creates a second function
  pointer for a function, any function marked with jumptable must also be marked
  with unnamed_addr.

  llvm-svn: 210280

* Add a Constant version of stripPointerCasts.
  Rafael Espindola, 2014-06-04 (1 file changed, -1/+1)

  Thanks to rnk for the suggestion.

  llvm-svn: 210205

* implement missing SCEVDivision case
  Sebastian Pop, 2014-05-29 (1 file changed, -0/+9)

  Without this case we would end up in an infinite recursion: the remainder is
  zero, so Numerator - Remainder is equal to Numerator, and so we would
  recursively ask for the division of Numerator by Denominator.

  llvm-svn: 209838

* fail to find dimensions when ElementSize is nullptr
  Sebastian Pop, 2014-05-29 (1 file changed, -1/+1)

  When ScalarEvolution::getElementSize returns nullptr it is safe to return
  early in ScalarEvolution::findArrayDimensions, such that we avoid later
  problems when we try to divide the terms by ElementSize.

  llvm-svn: 209837

* test check-in: added missing parenthesis in comment
  Sanjay Patel, 2014-05-28 (1 file changed, -1/+1)

  llvm-svn: 209763

* avoid type mismatch when building SCEVs
  Sebastian Pop, 2014-05-27 (1 file changed, -0/+26)

  This is a corner case I have stumbled upon when dealing with ARM64 type
  conversions. I was not able to extract a testcase for the community codebase
  to fail on. The patch conservatively discards a division that would have ended
  up in an ICE due to a type mismatch when building a multiply expression. I
  have also added code to a place that builds add expressions and in which we
  should be careful not to pass in operands of different types.

  llvm-svn: 209694

* do not use the GCD to compute the delinearization strides
  Sebastian Pop, 2014-05-27 (1 file changed, -59/+8)

  We do not need to compute the GCD anymore after we removed the constant
  coefficients from the terms: the terms are now all parametric expressions and
  there is no need to recognize constant terms that divide only a subset of the
  terms. We only rely on the size of the terms, i.e., the number of operands in
  the multiply expressions, to sort the terms and recognize the parametric
  dimensions.

  llvm-svn: 209693

* remove BasePointer before delinearizing
  Sebastian Pop, 2014-05-27 (3 files changed, -29/+41)

  No functional change is intended: instead of relying on the delinearization to
  come up with the base pointer as a remainder of the divisions in the
  delinearization, we just compute it from the array access and use that value.
  We subtract the base pointer from the SCEV to be delinearized, and that
  simplifies the work of the delinearizer.

  llvm-svn: 209692

* remove constant terms
  Sebastian Pop, 2014-05-27 (3 files changed, -25/+80)

  The delinearization is needed only to remove the non linearity induced by
  expressions involving multiplications of parameters and induction variables.
  There is no problem in dealing with constant times parameters, or constant
  times an induction variable.

  For this reason, the current patch discards all constant terms and multipliers
  before running the delinearization algorithm on the terms. The only thing
  remaining in the term expressions are parameters and multiply expressions of
  parameters: these simplified term expressions are passed to the array shape
  recognizer that will not recognize constant dimensions anymore: these will be
  recognized as different strides in parametric subscripts.

  The only important special case of a constant dimension is the size of
  elements. Instead of relying on the delinearization to infer the size of an
  element, compute the element size from the base address type. This is a much
  more precise way of computing the element size than before, as we would have
  mixed together the size of an element with the strides of the innermost
  dimension.

  llvm-svn: 209691

* Some cleanup for r209568.
  Michael Zolotukhin, 2014-05-26 (1 file changed, -9/+7)

  llvm-svn: 209634

* Implement sext(C1 + C2*X) --> sext(C1) + sext(C2*X) and sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformation in Scalar Evolution
  Michael Zolotukhin, 2014-05-24 (1 file changed, -0/+35)

  That helps SLP-vectorizer to recognize consecutive loads/stores.

  <rdar://problem/14860614>

  llvm-svn: 209568

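  A small numeric illustration of the add-rec form of the transformation, using
  an i8 recurrence {2,+,4} widened to i32. The constants are example values
  chosen here, and the rewrite is only valid when the narrow recurrence does not
  wrap, which is the condition Scalar Evolution has to establish:

    #include <cassert>
    #include <cstdint>

    int main() {
      for (int32_t I = 0; I < 10; ++I) {
        int8_t Rec = static_cast<int8_t>(2 + 4 * I);    // {2,+,4} evaluated in i8
        int32_t SextOfRec = Rec;                        // sext({2,+,4})
        int32_t Split = 2 + static_cast<int8_t>(4 * I); // sext(2) + sext({0,+,4})
        assert(SextOfRec == Split);                     // the two forms agree
      }
      return 0;
    }
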
* Fix and improve SCEV ComputeBackedgeTakenCount.
  Andrew Trick, 2014-05-23 (1 file changed, -19/+46)

  This is a follow-up to r209358: PR19799: Indvars miscompile due to an
  incorrect max backedge taken count from SCEV.

  That fix was incomplete as pointed out by Arnold and Michael Z. The code was
  also too confusing. It needed a careful rewrite with more unit tests. This
  version will also happen to optimize more cases.

  <rdar://17005101> PR19799: Indvars miscompile...

  llvm-svn: 209545

* ScalarEvolution: Fix handling of AddRecs in isKnownPredicate
  Justin Bogner, 2014-05-23 (1 file changed, -12/+24)

  ScalarEvolution::isKnownPredicate() can wrongly reduce a comparison when both
  the LHS and RHS are SCEVAddRecExprs. This checks that both LHS and RHS are
  guarded in the case when both are SCEVAddRecExprs.

  The test case is against indvars because I could not find a way to directly
  test SCEV.

  Patch by Sanjay Patel!

  llvm-svn: 209487

* Fix a bug in SCEV's backedge taken count computation from my prior fix in Jan.
  Andrew Trick, 2014-05-22 (1 file changed, -8/+6)

  This has to do with the trip count computation for loops with multiple exits,
  which is quite subtle. Most passes just ask for a single trip count number, so
  we must be conservative, assuming any exit could be taken.

  Normally, we rely on the "exact" trip count, which was correctly given as
  "unknown". However, SCEV also gives a "max" back-edge taken count. The loop's
  max BE taken count is conservatively a maximum over the max of each exit's
  non-exiting iteration counts. Note that some exit tests can be skipped, so the
  max loop back-edge taken count can actually exceed the max non-exiting
  iterations for some exits. However, when we know the loop *latch* cannot be
  skipped, we can directly use its max taken count, disregarding other exits. I
  previously took the minimum here without checking whether the other exit could
  be skipped. The correct, and simpler, thing to do here is just to directly use
  the loop latch's max non-exiting iterations as the loop's max back-edge count.

  In the problematic test case, the first loop exit had a max of zero
  non-exiting iterations, but could be skipped. The loop latch was known not to
  be skipped but had a max of one non-exiting iteration. We incorrectly claimed
  the loop back-edge could be taken zero times, when it is actually taken one
  time.

  Fixes:
  Loop %for.body.i: <multiple exits> Unpredictable backedge-taken count.
  Loop %for.body.i: max backedge-taken count is 1.

  llvm-svn: 209358

* Clean up language and grammar.
  Eric Christopher, 2014-05-20 (1 file changed, -2/+2)

  Based on a patch by jfcaron3@gmail.com!
  PR19806

  llvm-svn: 209216

* Teach isKnownNonNull that a nonnull return is not null. Add a test for this case as well as the case of a nonnull attribute (already handled but not tested).
  Nick Lewycky, 2014-05-20 (1 file changed, -0/+5)

  llvm-svn: 209193

* Add 'nonnull', a new parameter and return attribute which indicates that the pointer is not null. Instcombine will elide comparisons between these and null. Patch by Luqman Aden!
  Nick Lewycky, 2014-05-20 (1 file changed, -2/+2)

  llvm-svn: 209185

* Check the alwaysinline attribute on the call as well as on the caller.
  Peter Collingbourne, 2014-05-19 (1 file changed, -1/+1)

  Differential Revision: http://reviews.llvm.org/D3815

  llvm-svn: 209150

* InstSimplify: Improve handling of ashr/lshr
  David Majnemer, 2014-05-16 (1 file changed, -1/+21)

  Summary:
  Analyze the range of values produced by ashr/lshr cst, %V when it is being
  used in an icmp.

  Reviewers: nicholas

  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D3774

  llvm-svn: 209000

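  A small worked instance of the range fact this relies on (the constant 30 is
  an example value chosen here, not from the patch): for a fixed non-negative
  dividend C, lshr C, %V can never exceed C, so a comparison such as
  icmp ugt (lshr C, %V), C always folds to false.

    #include <cassert>
    #include <cstdint>

    int main() {
      const uint32_t C = 30;
      for (uint32_t V = 0; V < 32; ++V)
        assert((C >> V) <= C); // the shifted value never exceeds the dividend
      return 0;
    }
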
* InstSimplify: Optimize using dividend in sdiv
  David Majnemer, 2014-05-16 (1 file changed, -0/+4)

  Summary:
  The dividend in an sdiv tells us the largest and smallest possible results.
  Use this fact to optimize comparisons against an sdiv with a constant
  dividend.

  Reviewers: nicholas

  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D3795

  llvm-svn: 208999

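  A brief illustration of the dividend-based bounds (100 is an example constant,
  not from the patch): the result of sdiv 100, %X always lies in [-100, 100], so
  a comparison like icmp sgt (sdiv 100, %X), 100 can be folded without knowing
  anything about %X.

    #include <cassert>
    #include <cstdint>

    int main() {
      for (int32_t X = -1000; X <= 1000; ++X) {
        if (X == 0)
          continue; // division by zero is undefined
        int32_t R = 100 / X;
        assert(R >= -100 && R <= 100); // bounded by the constant dividend
      }
      return 0;
    }
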
* Add C API for thread yielding callback.
  Juergen Ributzka, 2014-05-16 (2 files changed, -2/+9)

  Sometimes an LLVM compilation may take more time than a client would like to
  wait for. The problem is that it is not possible to safely suspend the LLVM
  thread from the outside. When the timing is bad it might be possible that the
  LLVM thread holds a global mutex and this would block any progress in any
  other thread.

  This commit adds a new yield callback function that can be registered with a
  context. LLVM will try to yield by calling this callback function, but there
  is no guaranteed frequency. LLVM will only do so if it can guarantee that
  suspending the thread won't block any forward progress in other LLVM contexts
  in the same process.

  Once the client receives the callback it can suspend the thread safely and
  resume it at another time.

  Related to <rdar://problem/16728690>

  llvm-svn: 208945

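  A sketch of how a client might register such a callback through the C API; the
  function and typedef names below are assumed from this commit's C API
  addition, so treat them as illustrative rather than authoritative:

    #include "llvm-c/Core.h"

    // Called by LLVM at points where suspending this thread is known to be safe.
    static void yieldHandler(LLVMContextRef Ctx, void *OpaqueHandle) {
      (void)Ctx;
      (void)OpaqueHandle;
      // The client may park the compilation thread here and resume it later.
    }

    int main() {
      LLVMContextRef Ctx = LLVMContextCreate();
      LLVMContextSetYieldCallback(Ctx, yieldHandler, /*OpaqueHandle=*/nullptr);
      // ... build and run a compilation on Ctx; LLVM may invoke yieldHandler ...
      LLVMContextDispose(Ctx);
      return 0;
    }
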
* Instead of littering asserts throughout the code after every call to computeKnownBits, consolidate them into one assert at the end of computeKnownBits itself.
  Jay Foad, 2014-05-15 (1 file changed, -38/+27)

  llvm-svn: 208876

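  The consolidated invariant presumably looks something like the following (a
  sketch of the shape of the check, not a verbatim copy of the patch): no bit
  may be simultaneously known zero and known one.

    #include "llvm/ADT/APInt.h"
    #include <cassert>

    static void checkKnownBitsInvariant(const llvm::APInt &KnownZero,
                                        const llvm::APInt &KnownOne) {
      // Checked once at the end of computeKnownBits instead of after every call.
      assert((KnownZero & KnownOne) == 0 && "Bits known to be one AND zero?");
      (void)KnownZero;
      (void)KnownOne;
    }
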
* Teach the constant folder to look through bitcast constant expressions much more effectively when trying to constant fold a load of a constant.
  Chandler Carruth, 2014-05-15 (1 file changed, -0/+50)

  Previously, we only handled bitcasts by trying to find a totally generic byte
  representation of the constant and use that. Now, we look through the bitcast
  to see what constant we might fold the load into, and then try to form a
  constant expression cast of the found value that would be equivalent to
  loading the value.

  You might wonder why on earth this actually matters. Well, it turns out that
  the Itanium ABI causes us to create a single array for a vtable where the
  first elements are virtual base offsets, followed by the virtual function
  pointers. Because the array is homogeneous the element type is consistently
  i8* and we inttoptr the virtual base offsets into the initial elements. Then
  constructors bitcast these pointers to i64 pointers prior to loading them.
  Boom, no more constant folding of virtual base offsets.

  This is the first fix to LLVM to address the *insane* performance Eric Niebler
  discovered with Clang on his range comprehensions[1]. There is more to come
  though, this doesn't *really* fix the problem fully.

  [1]: http://ericniebler.com/2014/04/27/range-comprehensions/

  llvm-svn: 208856

* Fix typos
  Alp Toker, 2014-05-15 (1 file changed, -2/+2)

  llvm-svn: 208839
