summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
Commit message (Collapse)AuthorAgeFilesLines
* InstCombine: try to transform A-B < 0 into A < BDavid Majnemer2014-12-311-0/+20
| | | | | | | We are allowed to move the 'B' to the right hand side if we an prove there is no signed overflow and if the comparison itself is signed. llvm-svn: 225034
* Removed extra line from a comment to test first commit. NFC.Ankur Garg2014-11-281-1/+0
| | | | llvm-svn: 222916
* InstCombine: Silence a parenthesis warningDavid Majnemer2014-11-221-1/+1
| | | | llvm-svn: 222609
* [InstCombine] Re-commit of r218721 (Optimize icmp-select-icmp sequence)Gerolf Hoflehner2014-11-211-4/+150
| | | | | | | Fixes the self-host fail. Note that this commit activates dominator analysis in the combiner by default (like the original commit did). llvm-svn: 222590
* InstCombine: Rely on cmpxchg's return code when it's strongDavid Majnemer2014-11-061-0/+16
| | | | | | | Comparing the result of a cmpxchg instruction can be replaced with an extractvalue of the cmpxchg success indicator. llvm-svn: 221498
* InstCombine: Don't assume that m_ZExt matches an InstructionDavid Majnemer2014-11-011-2/+2
| | | | | | | | | | | | m_ZExt might bind against a ConstantExpr instead of an Instruction. Assuming this, using cast<Instruction>, results in InstCombine crashing. Instead, introduce ZExtOperator to bridge both Instruction and ConstantExpr ZExts. This fixes PR21445. llvm-svn: 221069
* InstCombine: Combine (X+cst) < 0 --> X < -cstDavid Majnemer2014-11-011-0/+6
| | | | | | | | | | | | | This can happen pretty often in code that looks like: int foo = bar - 1; if (foo < 0) do stuff In this case, bar < 1 is an equivalent condition. This transform requires that the add instruction be annotated with nsw. llvm-svn: 221045
* InstCombine: Remove overzealous assertsDavid Majnemer2014-10-251-12/+19
| | | | | | | | | | | | | | These asserts can trigger if the worklist iteration order is sufficiently unlucky. Instead of adding special case logic to handle these edge conditions, just bail out on trying to transform them: InstSimplify will get them when it reaches them on the worklist. This fixes PR21378. N.B. No test case is included because any test would rely on the fragile worklist iteration order. llvm-svn: 220612
* InstCombine: Simplify FoldICmpCstShrCstDavid Majnemer2014-10-211-48/+16
| | | | | | | | | This function was complicated by the fact that it tried to perform canonicalizations that were already preformed by InstSimplify. Remove this extra code and move the tests over to InstSimplify. Add asserts to make sure our preconditions hold before we make any assumptions. llvm-svn: 220314
* InstCombine: Optimize icmp eq/ne (shl Const2, A), Const1David Majnemer2014-10-191-1/+48
| | | | | | | | | | | | | | | The following implements the optimization for sequences of the form: icmp eq/ne (shl Const2, A), Const1 Such sequences can be transformed to: icmp eq/ne A, (TrailingZeros(Const1) - TrailingZeros(Const2)) This handles only the equality operators for now. Other operators need to be handled. Patch by Ankur Garg! llvm-svn: 220162
* [InstCombine] Fix wrong folding of constant comparisons involving ashr and ↵Andrea Di Biagio2014-10-091-1/+2
| | | | | | | | | | | | | | negative values. This patch fixes a bug in method InstCombiner::FoldCmpCstShrCst where we wrongly computed the distance between the highest bits set of two negative values. This fixes PR21222. Differential Revision: http://reviews.llvm.org/D5700 llvm-svn: 219406
* Revert "[InstCombine] re-commit r218721 with fix for pr21199"Justin Bogner2014-10-081-134/+2
| | | | | | | | | | | | | This seems to cause a miscompile when building clang, which causes a bootstrapped clang to fail or crash in several of its tests. See: http://lab.llvm.org:8013/builders/clang-x86_64-darwin11-RA/builds/1184 http://bb.pgr.jp/builders/clang-3stage-x86_64-linux/builds/7813 This reverts commit r219282. llvm-svn: 219317
* [InstCombine] re-commit r218721 with fix for pr21199Gerolf Hoflehner2014-10-081-2/+134
| | | | | | | | The icmp-select-icmp optimization targets select-icmp.eq only. This is now ensured by testing the branch predicate explictly. This commit also includes the test case for pr21199. llvm-svn: 219282
* Revert r219175 - [InstCombine] re-commit r218721 icmp-select-icmp optimizationHans Wennborg2014-10-081-138/+2
| | | | | | This seems to have caused PR21199. llvm-svn: 219264
* [InstCombine] re-commit r218721 icmp-select-icmp optimizationGerolf Hoflehner2014-10-071-2/+138
| | | | | | | | | Takes care of the assert that caused build fails. Rather than asserting the code checks now that the definition and use are in the same block, and does not attempt to optimize when that is not the case. llvm-svn: 219175
* Revert r218721, r218735.Evgeniy Stepanov2014-10-011-142/+2
| | | | | | | | | | Failing bootstrap on Linux (arm, x86). http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13139/steps/bootstrap%20clang/logs/stdio http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/470 http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/8518 llvm-svn: 218752
* [InstCombine] Fix for assert build failures caused by r218721Gerolf Hoflehner2014-10-011-1/+7
| | | | | | | | | The icmp-select-icmp optimization made the implicit assumption that the select-icmp instructions are in the same block and asserted on it. The fix explicitly checks for that condition and conservatively suppresses the optimization when it is violated. llvm-svn: 218735
* [InstCombine] Optimize icmp-select-icmpGerolf Hoflehner2014-10-011-2/+136
| | | | | | | | | | | | | | | | | | | | | | | | | | | In special cases select instructions can be eliminated by replacing them with a cheaper bitwise operation even when the select result is used outside its home block. The instances implemented are patterns like %x=icmp.eq %y=select %x,%r, null %z=icmp.eq|neq %y, null br %z,true, false ==> %x=icmp.ne %y=icmp.eq %r,null %z=or %x,%y br %z,true,false The optimization is integrated into the instruction combiner and performed only when all uses of the select result can be replaced by the select operand proper. For this dominator information is used and dominance is now a required analysis pass in the combiner. The optimization itself is iterative. The critical step is to replace the select result with the non-constant select operand. So the select becomes local and the combiner iteratively works out simpler code pattern and eventually eliminates the select. rdar://17853760 llvm-svn: 218721
* [InstCombine] Fix wrong folding of constant comparison involving ahsr and ↵Andrea Di Biagio2014-09-171-9/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | negative quantities (PR20945). Example: define i1 @foo(i32 %a) { %shr = ashr i32 -9, %a %cmp = icmp ne i32 %shr, -5 ret i1 %cmp } Before this fix, the instruction combiner wrongly thought that %shr could have never been equal to -5. Therefore, %cmp was always folded to 'true'. However, when %a is equal to 1, then %cmp evaluates to 'false'. Therefore, in this example, it is not valid to fold %cmp to 'true'. The problem was only affecting the case where the comparison was between negative quantities where one of the quantities was obtained from arithmetic shift of a negative constant. This patch fixes the problem with the wrong folding (fixes PR20945). With this patch, the 'icmp' from the example is now simplified to a comparison between %a and 1. This still allows us to get rid of the arithmetic shift (%shr). llvm-svn: 217950
* Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)Hal Finkel2014-09-071-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and also a DomTree pointer, and most of the changes are just to pass this new information when it is easily available from InstSimplify, InstCombine, etc. As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use (because we care that the @llvm.assume dominates the use because assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses. The significant changes are all in ValueTracking. Two main changes: First, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure, and I made internal static versions of the relevant functions that take this structure as a parameter. The new code does as you might expect, it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive. Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, the assume(a == b), to recurse on itself. The context and DT params are used to find applicable assumptions. An assumption needs to dominate the context instruction, or come after it deterministically. In this latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption) because otherwise the assumption would prove its feeding comparison trivial and would be removed. This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests). llvm-svn: 217342
* InstCombine: Remove redundant combinesDavid Majnemer2014-08-281-15/+0
| | | | | | | | InstSimplify already handles icmp (X+Y), X (and things like it) appropriately. The first thing that InstCombine does is run InstSimplify on the instruction. llvm-svn: 216659
* InstSimplify: Move a transform from InstCombine to InstSimplifyDavid Majnemer2014-08-281-10/+0
| | | | | | | | Several combines involving icmp (shl C2, %X) C1 can be simplified without introducing any new instructions. Move them to InstSimplify; while we are at it, make them more powerful. llvm-svn: 216642
* InstCombine: Properly optimize or'ing bittests togetherDavid Majnemer2014-08-241-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CFE, with -03, would turn: bool f(unsigned x) { bool a = x & 1; bool b = x & 2; return a | b; } into: %1 = lshr i32 %x, 1 %2 = or i32 %1, %x %3 = and i32 %2, 1 %4 = icmp ne i32 %3, 0 This sort of thing exposes a nasty pathology in GCC, ICC and LLVM. Instead, we would rather want: %1 = and i32 %x, 3 %2 = icmp ne i32 %1, 0 Things get a bit more interesting in the following case: %1 = lshr i32 %x, %y %2 = or i32 %1, %x %3 = and i32 %2, 1 %4 = icmp ne i32 %3, 0 Replacing it with the following sequence is better: %1 = shl nuw i32 1, %y %2 = or i32 %1, 1 %3 = and i32 %2, %x %4 = icmp ne i32 %3, 0 This sequence is preferable because %1 doesn't involve %x and could potentially be hoisted out of loops if it is invariant; only perform this transform in the non-constant case if we know we won't increase register pressure. llvm-svn: 216343
* This patch implements optimization as mentioned in PR19753: Optimize ↵Suyog Sarda2014-07-221-0/+93
| | | | | | | | | | | | | | | | | comparisons with "ashr/lshr exact" of a constanst. It handles the errors which were seen in PR19958 where wrong code was being emitted due to earlier patch. Added code for lshr as well as non-exact right shifts. It implements : (icmp eq/ne (ashr/lshr const2, A), const1)" -> (icmp eq/ne A, Log2(const2/const1)) -> (icmp eq/ne A, Log2(const2) - Log2(const1)) Differential Revision: http://reviews.llvm.org/D4068 llvm-svn: 213678
* InstCombine: Simplify code, no functionality change.Benjamin Kramer2014-07-071-16/+2
| | | | llvm-svn: 212449
* InstCombine: Disable umul.with.overflow recognition for vectors.Benjamin Kramer2014-06-241-1/+5
| | | | | | It doesn't make a lot on most targets and the code isn't ready for it. PR20113. llvm-svn: 211583
* Look through addrspacecasts when turning ptr comparisons intoMatt Arsenault2014-06-091-5/+21
| | | | | | index comparisons. llvm-svn: 210488
* Revert 209903 and 210040.Rafael Espindola2014-06-071-40/+0
| | | | | | | | | | | | The messages were "PR19753: Optimize comparisons with "ashr exact" of a constanst." "Added support to optimize comparisons with "lshr exact" of a constant." They were not correctly handling signed/unsigned operation differences, causing pr19958. llvm-svn: 210393
* Added support to optimize comparisons with "lshr exact" of a constant.Rafael Espindola2014-06-021-6/+29
| | | | | | Patch by Rahul Jain. llvm-svn: 210040
* Added inst combine tarnsform for (1 << X) & C pattrens where C is (some ↵Dinesh Dwivedi2014-06-021-8/+24
| | | | | | | | | | | | PowerOf2 - 1) This patch can handles following cases from http://nondot.org/sabre/LLVMNotes/InstCombine.txt "((1 << X) & 7) == 0" ==> "X > 2" "((1 << X) & 7) != 0" ==> "X < 3". Differential Revision: http://reviews.llvm.org/D3678 llvm-svn: 210007
* PR19753: Optimize comparisons with "ashr exact" of a constanst.Rafael Espindola2014-05-301-0/+17
| | | | | | Patch by suyog sarda. llvm-svn: 209903
* InstCombine: Optimize -x s< cstDavid Majnemer2014-05-151-0/+10
| | | | | | | | | | | | | | Summary: This gets rid of a sub instruction by moving the negation to the constant when valid. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3773 llvm-svn: 208827
* Rename ComputeMaskedBits to computeKnownBits. "Masked" has beenJay Foad2014-05-141-1/+1
| | | | | | inappropriate since it lost its Mask parameter in r154011. llvm-svn: 208811
* [C++] Use 'nullptr'. Transforms edition.Craig Topper2014-04-251-79/+79
| | | | llvm-svn: 207196
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-1/+2
| | | | | | | | | | | | | | | | | definition below all of the header #include lines, lib/Transforms/... edition. This one is tricky for two reasons. We again have a couple of passes that define something else before the includes as well. I've sunk their name macros with the DEBUG_TYPE. Also, InstCombine contains headers that need DEBUG_TYPE, so now those headers #define and #undef DEBUG_TYPE around their code, leaving them well formed modular headers. Fixing these headers was a large motivation for all of these changes, as "leaky" macros of this form are hard on the modules implementation. llvm-svn: 206844
* [Modules] Sink all the DEBUG_TYPE defines for InstCombine out of theChandler Carruth2014-04-211-0/+1
| | | | | | | | | | | header files and into the cpp files. These files will require more touches as the header files actually use DEBUG(). Eventually, I'll have to introduce a matched #define and #undef of DEBUG_TYPE for the header files, but that comes as step N of many to clean all of this up. llvm-svn: 206777
* Use APInt arithmetic, fixed typo. Thanks to Benjamin Kramer for noticing that.Serge Pavlov2014-04-141-2/+2
| | | | llvm-svn: 206144
* Recognize test for overflow in integer multiplication.Serge Pavlov2014-04-131-0/+240
| | | | | | | | | | | | | | | | | | If multiplication involves zero-extended arguments and the result is compared as in the patterns: %mul32 = trunc i64 %mul64 to i32 %zext = zext i32 %mul32 to i64 %overflow = icmp ne i64 %mul64, %zext or %overflow = icmp ugt i64 %mul64 , 0xffffffff then the multiplication may be replaced by call to umul.with.overflow. This change fixes PR4917 and PR4918. Differential Revision: http://llvm-reviews.chandlerc.com/D2814 llvm-svn: 206137
* Revert "InstCombine: merge constants in both operands of icmp."Erik Verbruggen2014-03-281-14/+0
| | | | | | | | | This reverts commit r204912, and follow-up commit r204948. This introduced a performance regression, and the fix is not completely clear yet. llvm-svn: 205010
* InstCombine: Don't combine constants on unsigned icmpsReid Kleckner2014-03-271-1/+2
| | | | | | | | | Fixes a miscompile introduced in r204912. It would miscompile code like (unsigned)(a + -49) <= 5U. The transform would turn this into (unsigned)a < 55U, which would return true for values in [0, 49], when it should not. llvm-svn: 204948
* InstCombine: merge constants in both operands of icmp.Erik Verbruggen2014-03-271-0/+13
| | | | | | | | | | Transform: icmp X+Cst2, Cst into: icmp X, Cst-Cst2 when Cst-Cst2 does not overflow, and the add has nsw. llvm-svn: 204912
* [C++11] Add range based accessors for the Use-Def chain of a Value.Chandler Carruth2014-03-091-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This requires a number of steps. 1) Move value_use_iterator into the Value class as an implementation detail 2) Change it to actually be a *Use* iterator rather than a *User* iterator. 3) Add an adaptor which is a User iterator that always looks through the Use to the User. 4) Wrap these in Value::use_iterator and Value::user_iterator typedefs. 5) Add the range adaptors as Value::uses() and Value::users(). 6) Update *all* of the callers to correctly distinguish between whether they wanted a use_iterator (and to explicitly dig out the User when needed), or a user_iterator which makes the Use itself totally opaque. Because #6 requires churning essentially everything that walked the Use-Def chains, I went ahead and added all of the range adaptors and switched them to range-based loops where appropriate. Also because the renaming requires at least churning every line of code, it didn't make any sense to split these up into multiple commits -- all of which would touch all of the same lies of code. The result is still not quite optimal. The Value::use_iterator is a nice regular iterator, but Value::user_iterator is an iterator over User*s rather than over the User objects themselves. As a consequence, it fits a bit awkwardly into the range-based world and it has the weird extra-dereferencing 'operator->' that so many of our iterators have. I think this could be fixed by providing something which transforms a range of T&s into a range of T*s, but that *can* be separated into another patch, and it isn't yet 100% clear whether this is the right move. However, this change gets us most of the benefit and cleans up a substantial amount of code around Use and User. =] llvm-svn: 203364
* [Modules] Move the ConstantRange class into the IR library. This isChandler Carruth2014-03-041-1/+1
| | | | | | | | | | a bit surprising, as the class is almost entirely abstracted away from any particular IR, however it encodes the comparsion predicates which mutate ranges as ICmp predicate codes. This is reasonable as they're used for both instructions and constants. Thus, it belongs in the IR library with instructions and constants. llvm-svn: 202838
* [Modules] Move the LLVM IR pattern match header into the IR library, itChandler Carruth2014-03-041-1/+1
| | | | | | obviously is coupled to the IR. llvm-svn: 202818
* [Modules] Move GetElementPtrTypeIterator into the IR library. As itsChandler Carruth2014-03-041-1/+1
| | | | | | | | | name might indicate, it is an iterator over the types in an instruction in the IR.... You see where this is going. Another step of modularizing the support library. llvm-svn: 202815
* Make some DataLayout pointers const.Rafael Espindola2014-02-241-1/+1
| | | | | | No functionality change. Just reduces the noise of an upcoming patch. llvm-svn: 202087
* Rename many DataLayout variables from TD to DL.Rafael Espindola2014-02-211-23/+23
| | | | | | | | | I am really sorry for the noise, but the current state where some parts of the code use TD (from the old name: TargetData) and other parts use DL makes it hard to write a patch that changes where those variables come from and how they are passed along. llvm-svn: 201827
* Remove a very old instcombine where we would turn sequences of selects intoOwen Anderson2014-02-121-25/+0
| | | | | | | | | | | | | logical operations on the i1's driving them. This is a bad idea for every target I can think of (confirmed with micro tests on all of: x86-64, ARM, AArch64, Mips, and PowerPC) because it forces the i1 to be materialized into a general purpose register, whereas consuming it directly into a select generally allows it to exist only transiently in a predicate or flags register. Chandler ran a set of performance tests with this change, and reported no measurable change on x86-64. llvm-svn: 201275
* Fix known typosAlp Toker2014-01-241-2/+2
| | | | | | | Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018
* InstCombine: Hoist 3 copies of AddOne/SubOne into a header.Benjamin Kramer2014-01-191-9/+0
| | | | llvm-svn: 199605
OpenPOWER on IntegriCloud