summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] Teach getDemandedBitsLHSMask to handle constant splat vectorsCraig Topper2017-09-201-11/+7
| | | | | | | | This replaces a ConstantInt dyn_cast with m_APInt Differential Revision: https://reviews.llvm.org/D38100 llvm-svn: 313840
* [IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declareReid Kleckner2017-09-201-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This implements the design discussed on llvm-dev for better tracking of variables that live in memory through optimizations: http://lists.llvm.org/pipermail/llvm-dev/2017-September/117222.html This is tracked as PR34136 llvm.dbg.addr is intended to be produced and used in almost precisely the same way as llvm.dbg.declare is today, with the exception that it is control-dependent. That means that dbg.addr should always have a position in the instruction stream, and it will allow passes that optimize memory operations on local variables to insert llvm.dbg.value calls to reflect deleted stores. See SourceLevelDebugging.rst for more details. The main drawback to generating DBG_VALUE machine instrs is that they usually cause LLVM to emit a location list for DW_AT_location. The next step will be to teach DwarfDebug.cpp how to recognize more DBG_VALUE ranges as not needing a location list, and possibly start setting DW_AT_start_offset for variables whose lifetimes begin mid-scope. Reviewers: aprantl, dblaikie, probinson Subscribers: eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37768 llvm-svn: 313825
* [InstCombine] Handle (X & C2) < C1 --> (X & C2) == 0Craig Topper2017-09-201-6/+13
| | | | | | | | We already did (X & C2) > C1 --> (X & C2) != 0, if any bit set in (X & C2) will produce a result greater than C1. But there is an equivalent inverse condition with <= C1 (which will be canonicalized to < C1+1) Differential Revision: https://reviews.llvm.org/D38065 llvm-svn: 313819
* [InstCombine] Use APInt::getActiveBits() to avoid creating an APInt from a ↵Craig Topper2017-09-201-1/+1
| | | | | | trailing zero count to do a comparison. NFCI llvm-svn: 313792
* [InstCombine] Add select simplificationsQuentin Colombet2017-09-204-52/+44
| | | | | | | | | | | | | | | | | In these cases, two selects have constant selectable operands for both the true and false components and have the same conditional expression. We then create two arithmetic operations of the same type and feed a final select operation using the result of the true arithmetic for the true operand and the result of the false arithmetic for the false operand and reuse the original conditionl expression. The arithmetic operations are naturally folded as a consequence, leaving only the newly formed select to replace the old arithmetic operation. Patch by: Michael Berg <michael_c_berg@apple.com> Differential Revision: https://reviews.llvm.org/D37019 llvm-svn: 313774
* [X86] Remove VPERM2F128/VPERM2I128 intrinsics and autoupgrade to native ↵Craig Topper2017-09-161-74/+0
| | | | | | | | shuffles. I've moved the test cases from the InstCombine optimizations to the backend to keep the coverage we had there. It covered every possible immediate so I've preserved the resulting shuffle mask for each of those immediates. llvm-svn: 313450
* [InstCombine] Add a flag to disable LowerDbgDeclareReid Kleckner2017-09-131-1/+30
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This should improve optimized debug info for address-taken variables at the cost of inaccurate debug info in some situations. We patched this into clang and deployed this change to Chromium developers, and this significantly improved debuggability of optimized code. The long-term solution to PR34136 seems more and more like it's going to take a while, so I would like to commit this change under a flag so that it can be used as a stop-gap measure. This flag should really help so for C++ aggregates like std::string and std::vector, which are typically address-taken, even after inlining, and cannot be SROA-ed. Reviewers: aprantl, dblaikie, probinson, dberlin Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D36596 llvm-svn: 313108
* Test commitUriel Korach2017-09-101-1/+1
| | | | llvm-svn: 312878
* Merge isKnownNonNull into isKnownNonZeroNuno Lopes2017-09-091-2/+2
| | | | | | | | | It now knows the tricks of both functions. Also, fix a bug that considered allocas of non-zero address space to be always non null Differential Revision: https://reviews.llvm.org/D37628 llvm-svn: 312869
* [ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to ↵Sanjay Patel2017-09-052-15/+19
| | | | | | | | | | | | | | | | | | | | | null constants This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite(): https://bugs.llvm.org/show_bug.cgi?id=27145 In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno with a constant operand. But while looking at those patterns, I realized we were missing a canonicalization for nonzero constants. Rather than limiting to just folds for constants, we're adding a general value tracking method for this based on an existing DAG helper. By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps() and pick up missing vector folds. Differential Revision: https://reviews.llvm.org/D37427 llvm-svn: 312591
* [InstCombine] Move foldSelectICmpAnd helper function earlier in the file to ↵Craig Topper2017-09-051-105/+105
| | | | | | enable reuse in a future patch. llvm-svn: 312518
* [InstCombine] In foldSelectIntoOp, avoid creating a Constant before we know ↵Craig Topper2017-09-051-17/+18
| | | | | | | | | | for sure we're going to use it and avoid an unnecessary call to m_APInt. Instead of creating a Constant and then calling m_APInt with it (which will always return true). Just create an APInt initially, and use that for the checks in isSelect01 function. If it turns out we do need the Constant, create it from the APInt. This is a refactor for a future patch that will do some more checks of the constant values here. llvm-svn: 312517
* [InstCombine] replace unnecessary fcmp fold with assertSanjay Patel2017-09-021-6/+3
| | | | | | See https://reviews.llvm.org/rL312411 for related InstSimplify tests. llvm-svn: 312421
* [InstCombine] combine foldAndOfFCmps and foldOrOfFcmps; NFCISanjay Patel2017-09-022-77/+35
| | | | | | | | In addition to removing chunks of duplicated code, we don't want these to diverge. If there's a fold for one, there should be a fold of the other via DeMorgan's Laws. llvm-svn: 312420
* [InstCombine] fix misnamed locals and use them to reduce code; NFCISanjay Patel2017-09-021-34/+34
| | | | | | | | | We had these locals: Value *Op0RHS = LHS->getOperand(1); Value *Op1LHS = RHS->getOperand(0); ...so we confusingly transposed the meaning of left/right and op0/op1. llvm-svn: 312418
* [InstCombine] remove unnecessary code; NFCSanjay Patel2017-09-021-3/+0
| | | | llvm-svn: 312416
* [InstCombine] move related functions next to each other; NFCSanjay Patel2017-09-021-51/+51
| | | | | | | | This makes it easier to see that they're almost duplicates. As with the similar icmp functions, there should be identical folds for both logic ops because those are DeMorganized variants. llvm-svn: 312415
* [InstCombine] use local variable to reduce code duplication; NFCISanjay Patel2017-09-021-11/+9
| | | | llvm-svn: 312414
* [InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through ↵Craig Topper2017-09-011-1/+1
| | | | | | | | | | | | | | | | truncate instructions This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask. This allows some code to be removed from InstSimplify that was doing this functionality. This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appear with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out. There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary. Differential Revision: https://reviews.llvm.org/D37158 llvm-svn: 312382
* [InstCombine] Don't require the compare types to be the same in ↵Craig Topper2017-09-011-3/+2
| | | | | | | | | | getMaskedTypeForICmpPair. A future patch will make the code look through truncates feeding the compare. So the compares might be different types but the pretruncated types might be the same. This should be safe because we still require the same Value* to be used truncated or not in both compares. So that serves to ensure the types are the same. llvm-svn: 312381
* [InstCombine] When converting decomposeBitTestICmp's APInt return to ↵Craig Topper2017-09-011-2/+2
| | | | | | | | ConstantInt, make sure we use the type from the Value* that was also returned from decomposeBitTestICmp. Previously we used the type from the LHS of the compare, but a future patch will change decomposeBitTestICmp to look through truncates so it will return a pretruncated Value* and the type needs to match that. llvm-svn: 312380
* [InstCombine] improve demanded vector elements analysis of insertelementSanjay Patel2017-08-312-12/+11
| | | | | | | | | | | | | Recurse instead of returning on the first found optimization. Also, return early in the caller instead of continuing because that allows another round of simplification before we might potentially lose undef information from a shuffle mask by eliminating the shuffle. As noted in the review, we could probably do better and be more efficient by moving all of demanded elements into a separate pass, but this is yet another quick fix to instcombine. Differential Revision: https://reviews.llvm.org/D37236 llvm-svn: 312248
* [InstCombine] remove unnecessary vector select fold; NFCISanjay Patel2017-08-301-4/+0
| | | | | | | | | | This code is double-dead: 1. We simplify all selects with constant true/false condition in InstSimplify. I've minimized/moved the tests to show that works as expected. 2. All remaining vector selects with a constant condition are canonicalized to shufflevector, so we really can't see this pattern. llvm-svn: 312123
* [InstCombine] Fold insert sequence if first ins has multiple users.Florian Hahn2017-08-301-6/+18
| | | | | | | | | | | | | | | | | | | | | | | Summary: If the first insertelement instruction has multiple users and inserts at position 0, we can re-use this instruction when folding a chain of insertelement instructions. As we need to generate the first insertelement instruction anyways, this should be a strict improvement. We could get rid of the restriction of inserting at position 0 by creating a different shufflemask, but it is probably worth to keep the first insertelement instruction with position 0, as this is easier to do efficiently than at other positions I think. Reviewers: grosser, mkuper, fpetrogalli, efriedma Reviewed By: fpetrogalli Subscribers: gareevroman, llvm-commits Differential Revision: https://reviews.llvm.org/D37064 llvm-svn: 312110
* [InstCombine] Support vector splats in transformZExtICmpCraig Topper2017-08-291-7/+7
| | | | | | | | | | This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the existing scalar code. One test didn't vectorize optimize properly due to another bug so a TODO has been added. Differential Revision: https://reviews.llvm.org/D37253 llvm-svn: 312023
* [InstCombine] Teach foldSelectICmpAndOr to handle vector splatsCraig Topper2017-08-291-6/+8
| | | | | | | | This was pretty close to working already. While I was here I went ahead and passed the ICmpInst pointer from the caller instead of doing a dyn_cast that can never fail. Differential Revision: https://reviews.llvm.org/D37237 llvm-svn: 311960
* [InstCombine] Teach select01 helper of foldSelectIntoOp to handle vector splatsCraig Topper2017-08-281-7/+6
| | | | | | | | We were handling some vectors in foldSelectIntoOp, but not if the operand of the bin op was any kind of vector constant. This patch fixes it to treat vector splats the same as scalars. Differential Revision: https://reviews.llvm.org/D37232 llvm-svn: 311940
* [InstCombine] Call hasNoSignedWrap instead of hasNoUnsignedWrap to get the ↵Craig Topper2017-08-281-1/+1
| | | | | | | | | | NSW flag when handling Add in SimplifyDemandedUseBits. This is a typo from r311789. This should fix PR34349. llvm-svn: 311902
* [InstCombine] Don't fall back to only calling computeKnownBits if the upper ↵Craig Topper2017-08-251-23/+24
| | | | | | | | | | | | | | | | bit of Add/Sub is demanded. Just create an all 1s demanded mask and continue recursing like normal. The recursive calls should be able to handle an all 1s mask and do the right thing. The only time we should care about knowing whether the upper bit was demanded is when we need to know if we should clear the NSW/NUW flags. Now that we have a consistent path through the code for all cases, use KnownBits::computeForAddSub to compute the known bits at the end since we already have the LHS and RHS. My larger goal here is to move the code that turns add into xor if only 1 bit is demanded and no bits below it are non-zero from InstCombiner::OptAndOp to here. This will allow it to be more general instead of just looking for 'add' and 'and' with constant RHS. Differential Revision: https://reviews.llvm.org/D36486 llvm-svn: 311789
* [InstCombine] Consider more cases where SimplifyDemandedUseBits does not ↵Amjad Aboud2017-08-251-2/+5
| | | | | | | | | | convert AShr to LShr. There are cases where AShr have better chance to be optimized than LShr, especially when the demanded bits are not known to be Zero, and also known to be similar to the sign bit. Differential Revision: https://reviews.llvm.org/D36936 llvm-svn: 311773
* [InstCombine] fix and enhance udiv/urem narrowingSanjay Patel2017-08-241-24/+41
| | | | | | | | | | | | | | There are 3 small independent changes here: 1. Account for multiple uses in the pattern matching: avoid the transform if it increases the instruction count. 2. Add a missing fold for the case where the numerator is the constant: http://rise4fun.com/Alive/E2p 3. Enable all folds for vector types. There's still one more potential change - use "shouldChangeType()" to keep from transforming to an illegal integer type. Differential Revision: https://reviews.llvm.org/D36988 llvm-svn: 311726
* [InstCombine] Fold branches with irrelevant conditions to a constant.Davide Italiano2017-08-231-4/+3
| | | | | | | | | | | | | | | | InstCombine folds instructions with irrelevant conditions to undef. This, as Nuno confirmed is a bug. (see https://bugs.llvm.org/show_bug.cgi?id=33409#c1 ) Given the original motivation for the change is that of removing an USE, we now fold to false instead (which reaches the same goal without undesired side effects). Fixes PR33409. Differential Revision: https://reviews.llvm.org/D36975 llvm-svn: 311540
* [InstCombine] Remove unused argument. NFCCraig Topper2017-08-232-5/+4
| | | | llvm-svn: 311529
* [InstCombine] Replace a simple matcher with a plain old dyn_cast. NFCCraig Topper2017-08-231-2/+1
| | | | llvm-svn: 311528
* [InstCombine] Remove an unnecessary dyn_cast to Instruction and a switch ↵Craig Topper2017-08-232-25/+18
| | | | | | | | over two opcodes. Just dyn_cast to the specific instruction classes individually. NFC Change the helper methods to take the more specific class as well. llvm-svn: 311527
* [InstCombine] Remove check for sext of vector icmp from shouldOptimizeCastCraig Topper2017-08-221-6/+0
| | | | | | | | | | | | Looks like for 'and' and 'or' we end up performing at least some of the transformations this is bocking in a round about way anyway. For 'and sext(cmp1), sext(cmp2) we end up later turning it into 'select cmp1, sext(cmp2), 0'. Then we optimize that back to sext (and cmp1, cmp2). This is the same result we would have gotten if shouldOptimizeCast hadn't blocked it. We do something analogous for 'or'. With this patch we allow that transformation to happen directly in foldCastedBitwiseLogic. And we now support the same thing for 'xor'. This is definitely opening up many other cases, but since we already went around it for some cases hopefully it's ok. Differential Revision: https://reviews.llvm.org/D36213 llvm-svn: 311508
* [InstCombine] Move the checks for pointer types in getMaskedTypeForICmpPair ↵Craig Topper2017-08-211-12/+6
| | | | | | | | | | earlier in the function I don't think there's any reason to have them scattered about and on all 4 operands. We already have an early check that both compares must be the same type. And within a given compare the LHS and RHS must have the same type. Beyond that I don't think there's anyway this function returns anything valid for pointer types. So let's just return early and be done with it. Differential Revision: https://reviews.llvm.org/D36561 llvm-svn: 311383
* [InstCombine] Teach foldSelectICmpAnd to recognize a (icmp slt X, 0) and ↵Craig Topper2017-08-211-19/+49
| | | | | | | | | | (icmp sgt X, -1) as equivalent to an and with the sign bit of the truncated type This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely. Differential Revision: https://reviews.llvm.org/D36498 llvm-svn: 311362
* [InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructionsCraig Topper2017-08-211-1/+6
| | | | | | | | | | | | | | | | | Summary: If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear. Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits. Reviewers: spatel, majnemer, davide Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36944 llvm-svn: 311343
* [InstCombine] Teach canEvaluateTruncated to handle arithmetic shift ↵Amjad Aboud2017-08-161-0/+17
| | | | | | | | (including those with vector splat shift amount) Differential Revision: https://reviews.llvm.org/D36784 llvm-svn: 311050
* [InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) ↵Craig Topper2017-08-161-17/+22
| | | | | | | | | | + C1 support splat vectors This also uses decomposeBitTestICmp to decode the compare. Differential Revision: https://reviews.llvm.org/D36781 llvm-svn: 311044
* [InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle ↵Craig Topper2017-08-151-10/+18
| | | | | | | | | | vector shifts with splat shift amount We were only allowing ConstantInt before. This patch allows splat of ConstantInt too. Differential Revision: https://reviews.llvm.org/D36763 llvm-svn: 310970
* [InstCombine] Added support for (X >>s C) << C --> X & (-1 << C)Amjad Aboud2017-08-151-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D36743 llvm-svn: 310949
* [InstCombine] sink sext after ashrSanjay Patel2017-08-151-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Narrow ops are better for bit-tracking, and in the case of vectors, may enable better codegen. As the trunc test shows, this can allow follow-on simplifications. There's a block of code in visitTrunc that deals with shifted ops with FIXME comments. It may be possible to remove some of that now, but I want to make sure there are no problems with this step first. http://rise4fun.com/Alive/Y3a Name: hoist_ashr_ahead_of_sext_1 %s = sext i8 %x to i32 %r = ashr i32 %s, 3 ; shift value is < than source bit width => %a = ashr i8 %x, 3 %r = sext i8 %a to i32 Name: hoist_ashr_ahead_of_sext_2 %s = sext i8 %x to i32 %r = ashr i32 %s, 8 ; shift value is >= than source bit width => %a = ashr i8 %x, 7 ; so clamp this shift value %r = sext i8 %a to i32 Name: junc_the_trunc %a = sext i16 %v to i32 %s = ashr i32 %a, 18 %t = trunc i32 %s to i16 => %t = ashr i16 %v, 15 llvm-svn: 310942
* Remove checks for debug info intrinsics in use lists, NFCReid Kleckner2017-08-143-4/+0
| | | | | | | These haven't done anything since debug info intrinsics stopped appearing in Value use lists in 2014. llvm-svn: 310892
* Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of ↵Craig Topper2017-08-141-3/+15
| | | | | | | | | | | | | | | | | | | | decomposeBitTestICmp and use it in the InstSimplify" This recommits r310869, with the moved files and no extra changes. Original commit message: This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310889
* Revert r310869 "[InstSimplify][InstCombine] Modify the interface of ↵Craig Topper2017-08-141-15/+3
| | | | | | | | decomposeBitTestICmp and use it in the InstSimplify" Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything. llvm-svn: 310873
* [InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and ↵Craig Topper2017-08-141-3/+15
| | | | | | | | | | | | | | | | use it in the InstSimplify This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310869
* [InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstantsCraig Topper2017-08-141-85/+19
| | | | | | | | | | | | | | | | | Summary: These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code. Next step is to use m_APInt instead of ConstantInt. Reviewers: spatel, efriedma, davide, majnemer Reviewed By: spatel Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D36439 llvm-svn: 310806
* [InstCombine] Make (X|C1)^C2 -> X^(C1^C2) iff X&~C1 == 0 work for splat vectorsCraig Topper2017-08-101-23/+18
| | | | | | | | This also corrects the description to match what was actually implemented. The old comment said X^(C1|C2), but it implemented X^((C1|C2)&~(C1&C2)). I believe ((C1|C2)&~(C1&C2)) is equivalent to (C1^C2). Differential Revision: https://reviews.llvm.org/D36505 llvm-svn: 310658
OpenPOWER on IntegriCloud