path: root/llvm/lib/Transforms/InstCombine
Commit message (Author, Date, Files, Lines)
* Check for all known bits on ret in InstCombine (Hal Finkel, 2014-09-07, 2 files, -0/+19)

  From a combination of @llvm.assume calls (and perhaps through other means, such as range metadata), it is possible that all bits of a return value might be known. Previously, InstCombine did not check for this (which is understandable given the expectation that constant propagation handles such cases), but it meant that we'd miss simple cases where assumptions are involved.

  llvm-svn: 217346
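  A minimal sketch of the kind of case this catches (hypothetical IR, not taken from the commit's tests; the exact fold depends on the pass version):

    define i32 @known_ret(i32 %a) {
      %and = and i32 %a, 7
      %cmp = icmp eq i32 %and, 5
      tail call void @llvm.assume(i1 %cmp)
      ret i32 %and                 ; every bit of %and is known, so this can fold to ret i32 5
    }
    declare void @llvm.assume(i1)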
* Add additional patterns for @llvm.assume in ValueTracking (Hal Finkel, 2014-09-07, 1 file, -0/+24)

  This builds on r217342, which added the infrastructure to compute known bits using assumptions (@llvm.assume calls). That original commit added only a few patterns (to catch common cases related to determining pointer alignment); this change adds several other patterns for simple cases.

  r217342 handled assume(v & b = a): for those bits in the mask that are known to be one, we can propagate known bits from a to v. It also had a known-bits transfer for assume(a = b).

  This patch adds:

  assume(~(v & b) = a): for those bits in the mask that are known to be one, we can propagate inverted known bits from a to v.
  assume(v | b = a): for those bits in b that are known to be zero, we can propagate known bits from a to v.
  assume(~(v | b) = a): for those bits in b that are known to be zero, we can propagate inverted known bits from a to v.
  assume(v ^ b = a): for those bits in b that are known to be zero, we can propagate known bits from a to v; for those bits in b that are known to be one, we can propagate inverted known bits from a to v.
  assume(~(v ^ b) = a): for those bits in b that are known to be zero, we can propagate inverted known bits from a to v; for those bits in b that are known to be one, we can propagate known bits from a to v.
  assume(v << c = a): for those bits in a that are known, we can propagate them to known bits in v shifted to the right by c.
  assume(~(v << c) = a): for those bits in a that are known, we can propagate them inverted to known bits in v shifted to the right by c.
  assume(v >> c = a): for those bits in a that are known, we can propagate them to known bits in v shifted to the left by c.
  assume(~(v >> c) = a): for those bits in a that are known, we can propagate them inverted to known bits in v shifted to the left by c.
  assume(v >=_s c) where c is non-negative: the sign bit of v is zero.
  assume(v >_s c) where c is at least -1: the sign bit of v is zero.
  assume(v <=_s c) where c is negative: the sign bit of v is one.
  assume(v <_s c) where c is non-positive: the sign bit of v is one.
  assume(v <=_u c): transfer the known high zero bits.
  assume(v <_u c): transfer the known high zero bits (if c is known to be a power of 2, transfer one more).

  A small addition to InstCombine was necessary for some of the test cases. The problem is that, when InstCombine simplified and, or, etc., it would fail to check the 'do I know all of the bits' condition before checking less specific conditions, and so would not fully constant-fold the result. I'm not sure how to trigger this aside from using assumptions, so I've just included the change here.

  llvm-svn: 217343
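  As a hypothetical illustration of one of the new comparison patterns (assume(v <_s c) with c non-positive), not taken from the commit itself:

    define i32 @sign_known(i32 %v) {
      %cmp = icmp slt i32 %v, 0
      call void @llvm.assume(i1 %cmp)
      %r = lshr i32 %v, 31         ; the sign bit of %v is known one, so %r can fold to 1
      ret i32 %r
    }
    declare void @llvm.assume(i1)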
* Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.) (Hal Finkel, 2014-09-07, 13 files, -188/+265)

  This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and a DomTree pointer; most of the changes just pass this new information along when it is easily available from InstSimplify, InstCombine, etc.

  As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use, because an @llvm.assume must dominate a use in order to apply to it (assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses.

  The significant changes are all in ValueTracking. Two main changes: first, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure and made internal static versions of the relevant functions that take this structure as a parameter. The new code does as you might expect: it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive.

  Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, assume(a == b) from recursing on itself. The context and DT params are used to find applicable assumptions: an assumption needs to dominate the context instruction, or come after it deterministically. In the latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption), because otherwise the assumption would prove its feeding comparison trivial and would be removed.

  This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests).

  llvm-svn: 217342
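  A sketch of the masked-bit propagation this commit wires up (hypothetical IR; names and constants are illustrative only):

    define i32 @masked_bits(i32 %v) {
      %m = and i32 %v, 15
      %cmp = icmp eq i32 %m, 1
      call void @llvm.assume(i1 %cmp)
      %r = and i32 %v, 3           ; the low four bits of %v are known to be 0001, so %r can fold to 1
      ret i32 %r
    }
    declare void @llvm.assume(i1)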
* Add an Assumption-Tracking Pass (Hal Finkel, 2014-09-07, 3 files, -4/+23)

  This adds an immutable pass, AssumptionTracker, which keeps a cache of @llvm.assume call instructions within a module. It uses callback value handles to keep stale functions and intrinsics out of the map, and it relies on any code that creates new @llvm.assume calls to notify it of the new instructions. The benefit is that code needing to find @llvm.assume intrinsics can do so directly, without scanning the function, thus allowing the cost of @llvm.assume handling to be negligible when none are present.

  The current design is intended to be lightweight. We don't keep track of anything until we need a list of assumptions in some function. The first time this happens, we scan the function. After that, we add/remove @llvm.assume calls from the cache in response to registration calls and ValueHandle callbacks.

  There are no new direct test cases for this pass, but because it calls its validation function upon module finalization, we'll pick up detectable inconsistencies from the other tests that touch @llvm.assume calls.

  This pass will be used by follow-up commits that make use of @llvm.assume.

  llvm-svn: 217334
* InstCombine: Remove a special case pattern (David Majnemer, 2014-09-05, 1 file, -15/+18)

  The special case did not work when run under -reassociate and can easily be expressed by a further generalization of an existing pattern.

  llvm-svn: 217227
* Revert "Revert two GEP-related InstCombine commits"David Majnemer2014-09-011-11/+42
| | | | | | | | | | | | | | This reverts commit r216698 which reverted r216523 and r216598. We would attempt to perform the transformation even if the match() failed because, as a side effect, it would set V. This would trick us into believing that we correctly found a place to correctly apply the transform. An additional test case was added to getelementptr.ll so that we might not regress in the future. llvm-svn: 216890
* InstCombine: Respect recursion depth in visitUDivOperand (David Majnemer, 2014-08-30, 1 file, -4/+4)

  llvm-svn: 216817
* InstCombine: Try harder to combine icmp instructions (David Majnemer, 2014-08-30, 1 file, -3/+25)

  Consider:
    (and (icmp X, Y), (and Z, (icmp A, B)))

  It may be possible to combine (icmp X, Y) with (icmp A, B). If we successfully combine, create an 'and' instruction with Z.

  This fixes PR20814.

  N.B. There is room for improvement after this change but I'm not convinced it's worth chasing yet.

  llvm-svn: 216814
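  A hypothetical instance of the shape described above (illustrative only; it assumes the usual range fold between the two compares):

    define i1 @nested_and(i32 %x, i1 %z) {
      %c1 = icmp ult i32 %x, 10
      %c2 = icmp ult i32 %x, 20
      %inner = and i1 %c2, %z
      %outer = and i1 %c1, %inner  ; %c1 implies %c2, so this can become: and i1 %c1, %z
      ret i1 %outer
    }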
* Revert two GEP-related InstCombine commits (David Majnemer, 2014-08-29, 1 file, -40/+11)

  This reverts commits r216523 and r216598; people have reported regressions.

  llvm-svn: 216698
* InstCombine: Remove redundant combines (David Majnemer, 2014-08-28, 1 file, -15/+0)

  InstSimplify already handles icmp (X+Y), X (and things like it) appropriately. The first thing that InstCombine does is run InstSimplify on the instruction.

  llvm-svn: 216659
* InstSimplify: Move a transform from InstCombine to InstSimplify (David Majnemer, 2014-08-28, 1 file, -10/+0)

  Several combines involving icmp (shl C2, %X) C1 can be simplified without introducing any new instructions. Move them to InstSimplify; while we are at it, make them more powerful.

  llvm-svn: 216642
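  One plausible instance of such a simplification (a hypothetical sketch; whether this exact case is covered depends on the version):

    define i1 @shl_cmp(i32 %x) {
      %shl = shl i32 4, %x
      %cmp = icmp eq i32 %shl, 6   ; 4 << %x can never equal 6, so this can fold to false
      ret i1 %cmp
    }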
* InstCombine: Combine gep X, (Y-X) to Y (David Majnemer, 2014-08-27, 1 file, -14/+25)

  We try to perform this transform in InstSimplify but we aren't always able to: sometimes we need to insert a bitcast if X and Y don't have the same type.

  llvm-svn: 216598
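  A minimal sketch of the fold (hypothetical IR, in the typed-pointer GEP syntax of the time):

    define i8* @gep_diff(i8* %x, i8* %y) {
      %xi = ptrtoint i8* %x to i64
      %yi = ptrtoint i8* %y to i64
      %d = sub i64 %yi, %xi
      %gep = getelementptr i8* %x, i64 %d   ; gep X, (Y-X) can fold to Y
      ret i8* %gep
    }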
* Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created (Craig Topper, 2014-08-27, 1 file, -1/+1)

  llvm-svn: 216525
* InstCombine: Optimize GEPs involving ptrtoint better (David Majnemer, 2014-08-27, 1 file, -11/+29)

  We supported transforming:
    (gep i8* X, -(ptrtoint Y))
  to:
    (inttoptr (sub (ptrtoint X), (ptrtoint Y)))

  However, this only fired if 'X' had type i8*. Generalize this to support various types of different sizes. This results in much better CodeGen, especially for pointers to packed structs.

  llvm-svn: 216523
* This patch enables SimplifyUsingDistributiveLaws() to handle the following patterns (Dinesh Dwivedi, 2014-08-26, 2 files, -54/+45)

  (X >> Z) & (Y >> Z) -> (X & Y) >> Z for all shifts
  (X >> Z) | (Y >> Z) -> (X | Y) >> Z for all shifts
  (X >> Z) ^ (Y >> Z) -> (X ^ Y) >> Z for all shifts

  These patterns were previously handled separately in visitAnd()/visitOr()/visitXor().

  Differential Revision: http://reviews.llvm.org/D4951

  llvm-svn: 216443
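  A hypothetical example of the first pattern (illustrative only):

    define i32 @shift_and(i32 %x, i32 %y, i32 %z) {
      %xs = lshr i32 %x, %z
      %ys = lshr i32 %y, %z
      %r = and i32 %xs, %ys        ; can become (x & y) >> z: one shift instead of two
      ret i32 %r
    }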
* InstCombine: Properly optimize or'ing bittests together (David Majnemer, 2014-08-24, 1 file, -0/+42)

  CFE, with -O3, would turn:

    bool f(unsigned x) {
      bool a = x & 1;
      bool b = x & 2;
      return a | b;
    }

  into:

    %1 = lshr i32 %x, 1
    %2 = or i32 %1, %x
    %3 = and i32 %2, 1
    %4 = icmp ne i32 %3, 0

  This sort of thing exposes a nasty pathology in GCC, ICC and LLVM. Instead, we would rather want:

    %1 = and i32 %x, 3
    %2 = icmp ne i32 %1, 0

  Things get a bit more interesting in the following case:

    %1 = lshr i32 %x, %y
    %2 = or i32 %1, %x
    %3 = and i32 %2, 1
    %4 = icmp ne i32 %3, 0

  Replacing it with the following sequence is better:

    %1 = shl nuw i32 1, %y
    %2 = or i32 %1, 1
    %3 = and i32 %2, %x
    %4 = icmp ne i32 %3, 0

  This sequence is preferable because %1 doesn't involve %x and could potentially be hoisted out of loops if it is invariant; we only perform this transform in the non-constant case if we know we won't increase register pressure.

  llvm-svn: 216343
* InstCombine: Don't unconditionally preserve 'nuw' when shrinking constants (David Majnemer, 2014-08-22, 1 file, -6/+12)

  Consider:
    %add = add nuw i32 %a, -16777216
    %and = and i32 %add, 255

  Regardless of whether or not we demand the sign bit of %add, we cannot replace -16777216 with 2130706432 without also removing 'nuw' from the instruction.

  llvm-svn: 216273
* InstCombine: sub nsw %x, C -> add nsw %x, -C if C isn't INT_MIN (David Majnemer, 2014-08-22, 1 file, -1/+4)

  We can preserve nsw during this transform if -C won't overflow.

  llvm-svn: 216269
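  For instance (a hypothetical sketch of the transform as stated):

    define i32 @sub_to_add(i32 %x) {
      %r = sub nsw i32 %x, 7       ; -7 is representable, so this can become: add nsw i32 %x, -7
      ret i32 %r
    }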
* InstCombine: Don't unconditionally preserve 'nsw' when shrinking constants (David Majnemer, 2014-08-22, 1 file, -0/+8)

  Consider:
    %add = add nsw i32 %a, -16777216
    %and = and i32 %add, 255

  Regardless of whether or not we demand the sign bit of %add, we cannot replace -16777216 with 2130706432 without also removing 'nsw' from the instruction.

  This fixes PR20377.

  llvm-svn: 216261
* Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size (Craig Topper, 2014-08-21, 2 files, -3/+3)

  llvm-svn: 216158
* InstCombine: Fold ((A | B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1 (David Majnemer, 2014-08-21, 2 files, -0/+46)

  Adapted from a patch by Richard Smith, test-case written by me.

  llvm-svn: 216157
* New InstCombine pattern: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain conditions (Yi Jiang, 2014-08-20, 1 file, -0/+55)

  llvm-svn: 216135
* InstCombine: Annotate sub with nuw when we prove it's safe (David Majnemer, 2014-08-20, 2 files, -0/+19)

  We can prove that a 'sub' can be a 'sub nuw' if the left-hand side is negative and the right-hand side is non-negative.

  llvm-svn: 216045
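  A hypothetical sketch of operands satisfying that proof obligation (the or/and pin the sign bits):

    define i32 @sub_nuw(i32 %a, i32 %b) {
      %lhs = or i32 %a, -2147483648    ; sign bit known one: %lhs is negative
      %rhs = and i32 %b, 2147483647    ; sign bit known zero: %rhs is non-negative
      %sub = sub i32 %lhs, %rhs        ; as unsigned values %lhs > %rhs, so 'nuw' can be added
      ret i32 %sub
    }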
* InstCombine: Annotate sub with nsw when we prove it's safe (David Majnemer, 2014-08-19, 2 files, -1/+40)

  We can prove that a 'sub' can be a 'sub nsw' under certain conditions:
  - The sign bits of the operands are the same.
  - Both operands have more than 1 sign bit.

  The subtraction cannot be a signed overflow in either case.

  llvm-svn: 216037
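  A hypothetical sketch of the second condition (the shifts give both operands extra sign bits):

    define i32 @sub_nsw(i32 %a, i32 %b) {
      %x = lshr i32 %a, 2          ; at least two leading zero (sign) bits
      %y = lshr i32 %b, 2          ; likewise
      %sub = sub i32 %x, %y        ; the difference cannot overflow signed i32, so 'nsw' can be added
      ret i32 %sub
    }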
* InstCombine: ((A & ~B) ^ (~A & B)) to A ^ BMayur Pandey2014-08-191-0/+10
| | | | | | | | | | | | | Proof using CVC3 follows: $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR((A & ~B),(~A & B)) = BVXOR(A,B); $ cvc3 t.cvc Valid. Differential Revision: http://reviews.llvm.org/D4898 llvm-svn: 215974
* test commit (spelling correction) (Mayur Pandey, 2014-08-19, 1 file, -1/+1)

  llvm-svn: 215970
* Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to ↵Craig Topper2014-08-182-3/+3
| | | | | | | | avoid needing to mention the size." Getting a weird buildbot failure that I need to investigate. llvm-svn: 215870
* Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size (Craig Topper, 2014-08-17, 2 files, -3/+3)

  llvm-svn: 215868
* Remove an InstCombine that transformed patterns like (x * uitofp i1 y) to (select y, x, 0.0) when the multiply has fast math flags set (Owen Anderson, 2014-08-17, 1 file, -30/+0)

  While this might seem like an obvious canonicalization, there is one subtle problem with it. The result of the original expression is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN. This means that the new expression is strictly more defined than the original one.

  One unfortunate consequence of this is that the transform is not reversible! It's always legal to increase the defined-ness of an expression, but it's not legal to reduce it. Thus, targets that prefer the original form of the expression cannot reverse the transform to recover it. Another way to think of it is that the transform has lost source-level information (the fast math flags), which is undesirable.

  llvm-svn: 215825
* InstCombine: Fix a potential bug in 0 - (X sdiv C) -> (X sdiv -C) (David Majnemer, 2014-08-16, 1 file, -1/+1)

  While *most* (X sdiv 1) operations will get caught by InstSimplify, it is still possible for an sdiv to appear in the worklist which hasn't been simplified yet. This means that it is possible for 0 - (X sdiv 1) to get transformed into (X sdiv -1); dividing by -1 can make the transform produce undef values instead of the proper result.

  Sorry for the lack of a testcase; it's a bit problematic because it relies on the exact order of operations in the worklist.

  llvm-svn: 215818
* InstCombine: Combine mul with div (David Majnemer, 2014-08-16, 1 file, -2/+75)

  We can combine a mul with a div if one of the operands is a multiple of the other:

    %mul = mul nsw nuw %a, C1
    %ret = udiv %mul, C2
      =>
    %ret = mul nsw %a, (C1 / C2)

  This can expose further optimization opportunities if we end up multiplying or dividing by a power of 2. Consider this small example:

    define i32 @f(i32 %a) {
      %mul = mul nuw i32 %a, 14
      %div = udiv exact i32 %mul, 7
      ret i32 %div
    }

  which gets CodeGen'd to:

    imull $14, %edi, %eax
    imulq $613566757, %rax, %rcx
    shrq  $32, %rcx
    subl  %ecx, %eax
    shrl  %eax
    addl  %ecx, %eax
    shrl  $2, %eax
    retq

  We can now transform this into:

    define i32 @f(i32 %a) {
      %shl = shl nuw i32 %a, 1
      ret i32 %shl
    }

  which gets CodeGen'd to:

    leal (%rdi,%rdi), %eax
    retq

  This fixes PR20681.

  llvm-svn: 215815
* InstCombine: ((A | ~B) ^ (~A | B)) to A ^ B (David Majnemer, 2014-08-14, 1 file, -0/+10)

  Proof using CVC3 follows:

    $ cat t.cvc
    A, B : BITVECTOR(32);
    QUERY BVXOR((A | ~B), (~A | B)) = BVXOR(A, B);
    $ cvc3 t.cvc
    Valid.

  Patch by Mayur Pandey!

  Differential Revision: http://reviews.llvm.org/D4883

  llvm-svn: 215621
* Added InstCombine Transform for ((B | C) & A) | B -> B | (A & C) (David Majnemer, 2014-08-14, 1 file, -0/+4)

  Transform ((B | C) & A) | B --> B | (A & C)

  Z3 link: http://rise4fun.com/Z3/hP6p

  Patch by Sonam Kumari!

  Differential Revision: http://reviews.llvm.org/D4865

  llvm-svn: 215619
* Canonicalize header guards into a common format (Benjamin Kramer, 2014-08-13, 2 files, -4/+4)

  Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful).

  Changes made by clang-tidy with minor tweaks.

  llvm-svn: 215558
* InstCombine: Combine (xor (or %a, %b) (xor %a, %b)) to (and %a, %b) (Karthik Bhat, 2014-08-13, 1 file, -0/+12)

  Correctness proof of the transform using CVC3:

    $ cat t.cvc
    A, B : BITVECTOR(32);
    QUERY BVXOR(A | B, BVXOR(A, B)) = A & B;
    $ cvc3 t.cvc
    Valid.

  llvm-svn: 215524
* Allow bitcast + struct GEP transform to work with addrspacecast (Matt Arsenault, 2014-08-12, 1 file, -3/+20)

  llvm-svn: 215467
* InstCombine: Combine (add (and %a, %b) (or %a, %b)) to (add %a, %b) (David Majnemer, 2014-08-11, 1 file, -1/+23)

  What follows below is a correctness proof of the transform using CVC3.

    $ cat t.cvc
    A, B : BITVECTOR(32);
    QUERY BVPLUS(32, A & B, A | B) = BVPLUS(32, A, B);
    $ cvc3 < t.cvc
    Valid.

  llvm-svn: 215400
* This patch implements transform for pattern "(A & ~B) ^ (~A) -> ~(A & B)".Suyog Sarda2014-08-011-0/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D4653 llvm-svn: 214479
* This patch implements transform for pattern "(A | B) & ((~A) ^ B) -> (A & B)".Suyog Sarda2014-08-011-0/+10
| | | | | | Differential Revision: http://reviews.llvm.org/D4628 llvm-svn: 214478
* This patch implements transform for pattern "( A & (~B)) | (A ^ B) -> (A ^ B)"Suyog Sarda2014-08-011-0/+10
| | | | | | Differential Revision: http://reviews.llvm.org/D4652 llvm-svn: 214477
* This patch implements transform for pattern "(A & B) | ((~A) ^ B) -> (~A ^ B)".Suyog Sarda2014-08-011-0/+10
| | | | | | | | Patch Credit to Ankit Jain ! Differential Revision: http://reviews.llvm.org/D4655 llvm-svn: 214476
* InstCombine: Correctly propagate NSW/NUW for x-(-A) -> x+A (David Majnemer, 2014-07-31, 1 file, -3/+9)

  We can only propagate the nsw bit if both subtraction instructions are marked with the appropriate flag.

  N.B. We only propagate the nsw bit in InstCombine because the nuw case is already handled in InstSimplify.

  This fixes PR20189.

  llvm-svn: 214385
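  For example (a hypothetical sketch; both subs carry nsw, so the flag survives):

    define i32 @sub_neg(i32 %x, i32 %a) {
      %neg = sub nsw i32 0, %a
      %r = sub nsw i32 %x, %neg    ; can become: add nsw i32 %x, %a
      ret i32 %r
    }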
* InstCombine: Simplify (A ^ B) or/and (A ^ B ^ C) (David Majnemer, 2014-07-30, 1 file, -0/+24)

  While we can already transform A | (A ^ B) into A | B, things get bad once we have (A ^ B) | (A ^ B ^ Cst) because reassociation will morph this into (A ^ B) | ((A ^ Cst) ^ B). Our existing patterns fail once this happens.

  To fix this, we add a new pattern which looks through the tree of xor binary operators to see that, in fact, there exists a redundant xor operation.

  What follows below is a correctness proof of the transform using CVC3.

    $ cat t.cvc
    A, B, C : BITVECTOR(64);
    QUERY BVXOR(A, B) | BVXOR(BVXOR(B, C), A) = BVXOR(A, B) | C;
    QUERY BVXOR(BVXOR(A, C), B) | BVXOR(A, B) = BVXOR(A, B) | C;
    QUERY BVXOR(A, B) & BVXOR(BVXOR(B, C), A) = BVXOR(A, B) & ~C;
    QUERY BVXOR(BVXOR(A, C), B) & BVXOR(A, B) = BVXOR(A, B) & ~C;
    $ cvc3 < t.cvc
    Valid.
    Valid.
    Valid.
    Valid.

  llvm-svn: 214342
* Canonicalization for @llvm.assume (Hal Finkel, 2014-07-25, 1 file, -0/+17)

  Adds simple logical canonicalization of assumption intrinsics to InstCombine, currently:
  - invariant(a && b) -> invariant(a); invariant(b)
  - invariant(!(a || b)) -> invariant(!a); invariant(!b)

  llvm-svn: 213977
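  A sketch of the first canonicalization in IR terms (hypothetical, with 'invariant' spelled as the @llvm.assume intrinsic):

    define void @split_assume(i1 %a, i1 %b) {
      %and = and i1 %a, %b
      call void @llvm.assume(i1 %and)  ; can become two calls: assume(%a) and assume(%b)
      ret void
    }
    declare void @llvm.assume(i1)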
* AA metadata refactoring (introduce AAMDNodes) (Hal Finkel, 2014-07-24, 1 file, -6/+7)

  In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode*).

  This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode* in AliasAnalysis::Location, etc.

  No functionality change intended.

  llvm-svn: 213859
* This patch implements the optimization described in PR19753: optimize comparisons with "ashr/lshr exact" of a constant (Suyog Sarda, 2014-07-22, 2 files, -0/+95)

  It handles the errors which were seen in PR19958, where wrong code was being emitted due to an earlier patch. Added code for lshr as well as non-exact right shifts.

  It implements:
    (icmp eq/ne (ashr/lshr const2, A), const1)
      -> (icmp eq/ne A, Log2(const2/const1))
      -> (icmp eq/ne A, Log2(const2) - Log2(const1))

  Differential Revision: http://reviews.llvm.org/D4068

  llvm-svn: 213678
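  A hypothetical instance of the formula (Log2(64) - Log2(8) = 3):

    define i1 @shr_cmp(i32 %a) {
      %shr = lshr exact i32 64, %a
      %cmp = icmp eq i32 %shr, 8   ; can become: icmp eq i32 %a, 3
      ret i1 %cmp
    }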
* Added InstCombine transform for pattern "(A & B) ^ (A ^ B) -> (A | B)" (Suyog Sarda, 2014-07-22, 1 file, -0/+8)

  Patch idea by Ankit Jain!

  Differential Revision: http://reviews.llvm.org/D4618

  llvm-svn: 213677
* Added InstCombine Transform for patterns "((~A & B) | A) -> (A | B)" and "((A & B) | ~A) -> (~A | B)" (Suyog Sarda, 2014-07-22, 1 file, -0/+10)

  Original patch credit to Ankit Jain!

  Differential Revision: http://reviews.llvm.org/D4591

  llvm-svn: 213676
* This patch implements transform for pattern "(A | B) ^ (~A) -> (A | ~B)".Suyog Sarda2014-07-221-0/+6
| | | | | | | | Patch Credit to Ankit Jain !! Differential Revision: http://reviews.llvm.org/D4588 llvm-svn: 213662
* fixed typo in comment (Sanjay Patel, 2014-07-22, 1 file, -1/+1)

  llvm-svn: 213614