summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis
Commit message (Collapse)AuthorAgeFilesLines
* Revert r212572 "improve BasicAA CS-CS queries", it causes PR20303.Nick Lewycky2014-07-153-142/+126
| | | | llvm-svn: 213024
* Look through addrspacecast in IsConstantOffsetFromGlobalMatt Arsenault2014-07-141-1/+2
| | | | llvm-svn: 213000
* Look through addrspacecast in GetPointerBaseWithConstantOffsetMatt Arsenault2014-07-141-1/+2
| | | | llvm-svn: 212999
* InstSimplify: Correct sdiv x / -1David Majnemer2014-07-141-11/+13
| | | | | | | | | | | Determining the bounds of x/ -1 would start off with us dividing it by INT_MIN. Suffice to say, this would not work very well. Instead, handle it upfront by checking for -1 and mapping it to the range: [INT_MIN + 1, INT_MAX. This means that the result of our division can be any value other than INT_MIN. llvm-svn: 212981
* InstSimplify: The upper bound of X / C was missing a rounding stepDavid Majnemer2014-07-141-1/+9
| | | | | | | | | | | | | | | | | | Summary: When calculating the upper bound of X / -8589934592, we would perform the following calculation: Floor[INT_MAX / 8589934592] However, flooring the result would make us wrongly come to the conclusion that 1073741824 was not in the set of possible values. Instead, use the ceiling of the result. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4502 llvm-svn: 212976
* Templatify DominanceFrontier.Matt Arsenault2014-07-121-109/+26
| | | | | | Theoretically this should now work for MachineBasicBlocks. llvm-svn: 212885
* BFI: Add constructor for WeightDuncan P. N. Exon Smith2014-07-121-5/+1
| | | | llvm-svn: 212868
* BFI: Clean up BlockMassDuncan P. N. Exon Smith2014-07-121-10/+0
| | | | | | | | | | | | | | Implementation is small now -- the interesting logic was moved to `BranchProbability` a while ago. Move it into `bfi_detail` and get rid of the related TODOs. I was originally planning to define it within `BlockFrequencyInfoImpl` (or `BFIIBase`), but it seems cleaner in a namespace. Besides, `isPodLike` needs to be specialized before `BlockMass` can be used in some of the other data structures, and there isn't a clear way to do that. llvm-svn: 212866
* BFI: Mark the end of namespacesDuncan P. N. Exon Smith2014-07-111-1/+2
| | | | llvm-svn: 212861
* Allow isDereferenceablePointer to look through some bitcastsHal Finkel2014-07-101-1/+1
| | | | | | | | | | | | | | | | isDereferenceablePointer should not give up upon encountering any bitcast. If we're casting from a pointer to a larger type to a pointer to a small type, we can continue by examining the bitcast's operand. This missing capability was noted in a comment in the function. In order for this to work, isDereferenceablePointer now takes an optional DataLayout pointer (essentially all callers already had such a pointer available). Most code uses isDereferenceablePointer though isSafeToSpeculativelyExecute (which already took an optional DataLayout pointer), and to enable the LICM test case, LICM needs to actually provide its DL pointer to isSafeToSpeculativelyExecute (which it was not doing previously). llvm-svn: 212686
* Improve BasicAA CS-CS queriesHal Finkel2014-07-083-126/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | BasicAA contains knowledge of certain intrinsics, such as memcpy and memset, and uses that information to form more-accurate answers to CallSite vs. Loc ModRef queries. Unfortunately, it did not use this information when answering CallSite vs. CallSite queries. Generically, when an intrinsic takes one or more pointers and the intrinsic is marked only to read/write from its arguments, the offset/size is unknown. As a result, the generic code that answers CallSite vs. CallSite (and CallSite vs. Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's arguments. While BasicAA's CallSite vs. Loc override could use more-accurate size information for some intrinsics, it did not do the same for CallSite vs. CallSite queries. This change refactors the intrinsic-specific logic in BasicAA into a generic AA query function: getArgLocation, which is overridden by BasicAA to supply the intrinsic-specific knowledge, and used by AA's generic implementation. This allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and CallSite vs. CallSite queries, and simplifies the BasicAA implementation. Currently, only one function, Mac's memset_pattern16, is handled by BasicAA (all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's getModRefBehavior override now also returns OnlyAccessesArgumentPointees for this function (which is an improvement). llvm-svn: 212572
* fixed typos in commentsSanjay Patel2014-07-061-2/+2
| | | | llvm-svn: 212424
* InstSimplify: Fix a bug when INT_MIN is in a sdivDavid Majnemer2014-07-041-3/+9
| | | | | | | | | | | | | When INT_MIN is the numerator in a sdiv, we would not properly handle overflow when calculating the bounds of possible values; abs(INT_MIN) is not a meaningful number. Instead, check and handle INT_MIN by reasoning that the largest value is INT_MIN/-2 and the smallest value is INT_MIN. This fixes PR20199. llvm-svn: 212307
* [CostModel][x86] Improved cost model for alternate shuffles.Andrea Di Biagio2014-07-031-3/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch: 1) Improves the cost model for x86 alternate shuffles (originally added at revision 211339); 2) Teaches the Cost Model Analysis pass how to analyze alternate shuffles. Alternate shuffles are a special kind of blend; on x86, we can often easily lowered alternate shuffled into single blend instruction (depending on the subtarget features). The existing cost model didn't take into account subtarget features. Also, it had a couple of "dead" entries for vector types that are never legal (example: on x86 types v2i32 and v2f32 are not legal; those are always either promoted or widened to 128-bit vector types). The new x86 cost model takes into account what target features we have before returning the shuffle cost (i.e. the number of instructions after the blend is lowered/expanded). This patch also teaches the Cost Model Analysis how to identify and analyze alternate shuffles (i.e. 'SK_Alternate' shufflevector instructions): - added function 'isAlternateVectorMask'; - added some logic to check if an instruction is a alternate shuffle and, in case, call the target specific TTI to get the corresponding shuffle cost; - added a test to verify the cost model analysis on alternate shuffles. llvm-svn: 212296
* Add new lines to debugging information.Richard Trieu2014-07-031-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D4262 llvm-svn: 212250
* Suppress inlining when the block address is takenGerolf Hoflehner2014-07-011-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inlining functions with block addresses can cause many problem and requires a rich infrastructure to support including escape analysis. At this point the safest approach to address these problems is by blocking inlining from happening. Background: There have been reports on Ruby segmentation faults triggered by inlining functions with block addresses like //Ruby code snippet vm_exec_core() { finish_insn_seq_0 = &&INSN_LABEL_finish; INSN_LABEL_finish: ; } This kind of scenario can also happen when LLVM picks a subset of blocks for inlining, which is the case with the actual code in the Ruby environment. LLVM suppresses inlining for such functions when there is an indirect branch. The attached patch does so even when there is no indirect branch. Note that user code like above would not make much sense: using the global for jumping across function boundaries would be illegal. Why was there a segfault: In the snipped above the block with the label is recognized as dead So it is eliminated. Instead of a block address the cloner stores a constant (sic!) into the global resulting in the segfault (when the global is used in a goto). Why had it worked in the past then: By luck. In older versions vm_exec_core was also inlined but the label address used was the block label address in vm_exec_core. So the global jump ended up in the original function rather than in the caller which accidentally happened to work. Test case ./tools/clang/test/CodeGen/indirect-goto.c will fail as a result of this commit. rdar://17245966 llvm-svn: 212077
* Revert "Introduce a string_ostream string builder facilty"Alp Toker2014-06-263-11/+12
| | | | | | Temporarily back out commits r211749, r211752 and r211754. llvm-svn: 211814
* This patch removed duplicate code for matching patterns Dinesh Dwivedi2014-06-261-106/+1
| | | | | | | | | which are now handled in SimplifyUsingDistributiveLaws() (after r211261) Differential Revision: http://reviews.llvm.org/D4253 llvm-svn: 211768
* MSVC build fix following r211749Alp Toker2014-06-261-2/+4
| | | | | | Avoid strndup() llvm-svn: 211752
* Introduce a string_ostream string builder faciltyAlp Toker2014-06-263-11/+8
| | | | | | | | | | | | | | | | | | | | string_ostream is a safe and efficient string builder that combines opaque stack storage with a built-in ostream interface. small_string_ostream<bytes> additionally permits an explicit stack storage size other than the default 128 bytes to be provided. Beyond that, storage is transferred to the heap. This convenient class can be used in most places an std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair would previously have been used, in order to guarantee consistent access without byte truncation. The patch also converts much of LLVM to use the new facility. These changes include several probable bug fixes for truncated output, a programming error that's no longer possible with the new interface. llvm-svn: 211749
* Support: Move class ScaledNumberDuncan P. N. Exon Smith2014-06-241-190/+0
| | | | | | | | ScaledNumber has been cleaned up enough to pull out of BFI now. Still work to do there (tests for shifting, bloated printing code, etc.), but it seems clean enough for its new home. llvm-svn: 211562
* BFI: Un-floatify more languageDuncan P. N. Exon Smith2014-06-241-23/+24
| | | | llvm-svn: 211561
* Support: Extract ScaledNumbers::MinScale and MaxScaleDuncan P. N. Exon Smith2014-06-241-10/+5
| | | | llvm-svn: 211558
* BFI: Change language from "exponent" to "scale"Duncan P. N. Exon Smith2014-06-231-8/+8
| | | | llvm-svn: 211557
* BFI: Rename UnsignedFloat => ScaledNumberDuncan P. N. Exon Smith2014-06-231-17/+17
| | | | | | | A lot of the docs and API are out of date, but I'll leave that for a separate commit. llvm-svn: 211555
* SCEVExpander: Fold constant PHIs harder. The logic below only understands ↵Benjamin Kramer2014-06-211-1/+2
| | | | | | | | proper IVs. PR20093. llvm-svn: 211433
* Add back functionality removed in r210497.Richard Trieu2014-06-214-8/+16
| | | | | | Instead of asserting, output a message stating that a null pointer was found. llvm-svn: 211430
* Support: Write ScaledNumber::getQuotient() and getProduct()Duncan P. N. Exon Smith2014-06-201-91/+0
| | | | llvm-svn: 211409
* [ValueTracking] Extend range metadata to call/invokeJingyue Wu2014-06-191-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: With this patch, range metadata can be added to call/invoke including IntrinsicInst. Previously, it could only be added to load. Rename computeKnownBitsLoad to computeKnownBitsFromRangeMetadata because range metadata is not only used by load. Update the language reference to reflect this change. Test Plan: Add several tests in range-2.ll to confirm the verifier is happy with having range metadata on call/invoke. Add two tests in AddOverFlow.ll to confirm annotating range metadata to call/invoke can benefit InstCombine. Reviewers: meheff, nlewycky, reames, hfinkel, eliben Reviewed By: eliben Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4187 llvm-svn: 211281
* Move optimization of some cases of (A & C1)|(B & C2) from instcombine to ↵Nick Lewycky2014-06-191-0/+32
| | | | | | instsimplify. Patch by Rahul Jain, plus some last minute changes by me -- you can blame me for any bugs. llvm-svn: 211252
* Make instsimplify's analysis of icmp eq/ne use computeKnownBits to determine ↵Nick Lewycky2014-06-191-0/+19
| | | | | | whether the icmp is always true or false. Patch by Suyog Sarda! llvm-svn: 211251
* Removing an "if (!this)" check from two print methods. The condition willRichard Trieu2014-06-094-2/+8
| | | | | | | never be true in a well-defined context. The checking for null pointers has been moved into the caller logic so it does not rely on undefined behavior. llvm-svn: 210497
* Remove old fenv.h workaround for a historic clang driver bugAlp Toker2014-06-091-9/+2
| | | | | | | | | | | Tested and works fine with clang using libstdc++. All indications are that this was fixed some time ago and isn't a problem with any clang version we support. I've added a note in PR6907 which is still open for some reason. llvm-svn: 210485
* Fold FEnv.h into the implementationAlp Toker2014-06-091-7/+41
| | | | | | | | | | | | | | | | | | Support headers shouldn't use config.h definitions, and they should never be undefined like this. ConstantFolding.cpp was the only user of this facility and already includes config.h for other math features, so it makes sense to move the checks there at point of use. (The implicit config.h was also quite dangerous -- removing the FEnv.h include would have silently disabled math constant folding without causing any tests to fail. Need to investigate -Wundef once the cleanup is done.) This eliminates the last config.h include from LLVM headers, paving the way for more consistent configuration checks. llvm-svn: 210483
* ScalarEvolution: Derive element size from the type of the loaded elementTobias Grosser2014-06-081-1/+1
| | | | | | | | | | Before, we where looking at the size of the pointer type that specifies the location from which to load the element. This did not make any sense at all. This change fixes a bug in the delinearization where we failed to delinerize certain load instructions. llvm-svn: 210435
* Add a new attribute called 'jumptable' that creates jump-instruction tables ↵Tom Roeder2014-06-053-0/+42
| | | | | | | | | | | | for functions marked with this attribute. It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables. This also adds backend support for generating the jump-instruction tables on ARM and X86. Note that since the jumptable attribute creates a second function pointer for a function, any function marked with jumptable must also be marked with unnamed_addr. llvm-svn: 210280
* Add a Constant version of stripPointerCasts.Rafael Espindola2014-06-041-1/+1
| | | | | | Thanks to rnk for the suggestion. llvm-svn: 210205
* implement missing SCEVDivision caseSebastian Pop2014-05-291-0/+9
| | | | | | | | without this case we would end on an infinite recursion: the remainder is zero, so Numerator - Remainder is equal to Numerator and so we would recursively ask for the division of Numerator by Denominator. llvm-svn: 209838
* fail to find dimensions when ElementSize is nullptrSebastian Pop2014-05-291-1/+1
| | | | | | | | when ScalarEvolution::getElementSize returns nullptr it is safe to early return in ScalarEvolution::findArrayDimensions such that we avoid later problems when we try to divide the terms by ElementSize. llvm-svn: 209837
* test check-in: added missing parenthesis in commentSanjay Patel2014-05-281-1/+1
| | | | llvm-svn: 209763
* avoid type mismatch when building SCEVsSebastian Pop2014-05-271-0/+26
| | | | | | | | | | | This is a corner case I have stumbled upon when dealing with ARM64 type conversions. I was not able to extract a testcase for the community codebase to fail on. The patch conservatively discards a division that would have ended up in an ICE due to a type mismatch when building a multiply expression. I have also added code to a place that builds add expressions and in which we should be careful not to pass in operands of different types. llvm-svn: 209694
* do not use the GCD to compute the delinearization stridesSebastian Pop2014-05-271-59/+8
| | | | | | | | | | | We do not need to compute the GCD anymore after we removed the constant coefficients from the terms: the terms are now all parametric expressions and there is no need to recognize constant terms that divide only a subset of the terms. We only rely on the size of the terms, i.e., the number of operands in the multiply expressions, to sort the terms and recognize the parametric dimensions. llvm-svn: 209693
* remove BasePointer before delinearizingSebastian Pop2014-05-273-29/+41
| | | | | | | | | | No functional change is intended: instead of relying on the delinearization to come up with the base pointer as a remainder of the divisions in the delinearization, we just compute it from the array access and use that value. We substract the base pointer from the SCEV to be delinearized and that simplifies the work of the delinearizer. llvm-svn: 209692
* remove constant termsSebastian Pop2014-05-273-25/+80
| | | | | | | | | | | | | | | | | | | | | | The delinearization is needed only to remove the non linearity induced by expressions involving multiplications of parameters and induction variables. There is no problem in dealing with constant times parameters, or constant times an induction variable. For this reason, the current patch discards all constant terms and multipliers before running the delinearization algorithm on the terms. The only thing remaining in the term expressions are parameters and multiply expressions of parameters: these simplified term expressions are passed to the array shape recognizer that will not recognize constant dimensions anymore: these will be recognized as different strides in parametric subscripts. The only important special case of a constant dimension is the size of elements. Instead of relying on the delinearization to infer the size of an element, compute the element size from the base address type. This is a much more precise way of computing the element size than before, as we would have mixed together the size of an element with the strides of the innermost dimension. llvm-svn: 209691
* Some cleanup for r209568.Michael Zolotukhin2014-05-261-9/+7
| | | | llvm-svn: 209634
* Implement sext(C1 + C2*X) --> sext(C1) + sext(C2*X) andMichael Zolotukhin2014-05-241-0/+35
| | | | | | | | | | | sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformation in Scalar Evolution. That helps SLP-vectorizer to recognize consecutive loads/stores. <rdar://problem/14860614> llvm-svn: 209568
* Fix and improve SCEV ComputeBackedgeTankCount.Andrew Trick2014-05-231-19/+46
| | | | | | | | | | | | | This is a follow-up to r209358: PR19799: Indvars miscompile due to an incorrect max backedge taken count from SCEV. That fix was incomplete as pointed out by Arnold and Michael Z. The code was also too confusing. It needed a careful rewrite with more unit tests. This version will also happen to optimize more cases. <rdar://17005101> PR19799: Indvars miscompile... llvm-svn: 209545
* ScalarEvolution: Fix handling of AddRecs in isKnownPredicateJustin Bogner2014-05-231-12/+24
| | | | | | | | | | | | | ScalarEvolution::isKnownPredicate() can wrongly reduce a comparison when both the LHS and RHS are SCEVAddRecExprs. This checks that both LHS and RHS are guarded in the case when both are SCEVAddRecExprs. The test case is against indvars because I could not find a way to directly test SCEV. Patch by Sanjay Patel! llvm-svn: 209487
* Fix a bug in SCEV's backedge taken count computation from my prior fix in Jan.Andrew Trick2014-05-221-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This has to do with the trip count computation for loops with multiple exits, which is quite subtle. Most passes just ask for a single trip count number, so we must be conservative assuming any exit could be taken. Normally, we rely on the "exact" trip count, which was correctly given as "unknown". However, SCEV also gives a "max" back-edge taken count. The loops max BE taken count is conservatively a maximum over the max of each exit's non-exiting iterations count. Note that some exit tests can be skipped so the max loop back-edge taken count can actually exceed the max non-exiting iterations for some exits. However, when we know the loop *latch* cannot be skipped, we can directly use its max taken count disregarding other exits. I previously took the minimum here without checking whether the other exit could be skipped. The correct, and simpler thing to do here is just to directly use the loop latch's max non-exiting iterations as the loops max back-edge count. In the problematic test case, the first loop exit had a max of zero non-exiting iterations, but could be skipped. The loop latch was known not to be skipped but had max of one non-exiting iteration. We incorrectly claimed the loop back-edge could be taken zero times, when it is actually taken one time. Fixes Loop %for.body.i: <multiple exits> Unpredictable backedge-taken count. Loop %for.body.i: max backedge-taken count is 1. llvm-svn: 209358
* Clean up language and grammar.Eric Christopher2014-05-201-2/+2
| | | | | | | Based on a patch by jfcaron3@gmail.com! PR19806 llvm-svn: 209216
OpenPOWER on IntegriCloud