path: root/llvm/lib/Transforms/InstCombine
Commit log, most recent first. Each entry shows the commit subject, author, date, and diffstat (files changed, -deleted/+added lines).
* [InstCombine] fold a shifted bool zext to a select (2nd try) (Sanjay Patel, 2019-10-15; 1 file, -0/+6)

The 1st attempt at rL374828 inserted the code at the wrong position (outside of the constant-shift-amount block). Trying again with an additional test to verify const-ness.

For a constant shift amount, add the following fold:
  shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)
https://rise4fun.com/Alive/IZ9

Fixes PR42257.
Based on original patch by @zvi (Zvi Rackover)

Differential Revision: https://reviews.llvm.org/D63382

llvm-svn: 374886
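To make the fold concrete, a minimal IR sketch (function and value names are illustrative, not from the patch):

```
define i32 @shl_zext_bool(i1 %x) {
  ; shifting a zero-extended bool left by a constant amount...
  %z = zext i1 %x to i32
  %r = shl i32 %z, 3
  ret i32 %r
  ; ...becomes a select between the shifted-in constant and zero:
  ;   %r = select i1 %x, i32 8, i32 0
}
```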
* Revert [InstCombine] fold a shifted bool zext to a select (Sanjay Patel, 2019-10-14; 1 file, -6/+0)

This reverts r374828 (git commit 1f40f15d54aac06421448b6de131231d2d78bc75) due to bot breakage.

llvm-svn: 374851
* [InstCombine] fold a shifted bool zext to a select (Sanjay Patel, 2019-10-14; 1 file, -0/+6)

For a constant shift amount, add the following fold:
  shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)
https://rise4fun.com/Alive/IZ9

Fixes PR42257.
Based on original patch by @zvi (Zvi Rackover)

Differential Revision: https://reviews.llvm.org/D63382

llvm-svn: 374828
* [NFC][InstCombine] Some preparatory cleanup in dropRedundantMaskingOfLeftShiftInput() (Roman Lebedev, 2019-10-13; 1 file, -23/+21)

llvm-svn: 374734
* [InstCombine] don't assume 'inbounds' for bitcast deref-or-null pointer in non-default address space (Sanjay Patel, 2019-10-13; 1 file, -2/+10)

Follow-up to D68244 to account for a corner case discussed in:
https://bugs.llvm.org/show_bug.cgi?id=43501

Add one more restriction: if the pointer is deref-or-null and in a non-default (non-zero) address space, we can't assume inbounds.

Differential Revision: https://reviews.llvm.org/D68706

llvm-svn: 374728
* [InstCombine] Fold conditional sign-extend of high-bit-extract into high-bit-extract-with-signext (PR42389) (Roman Lebedev, 2019-10-07; 1 file, -0/+110)

This can come up in Bit Stream abstractions. The pattern looks big/scary, but it can't be simplified any further. It is only this simple because a number of my preparatory folds have already happened (shift amount reassociation / shift amount reassociation in bit test, sign bit test detection).

Highlights:
* There are two main flavors: https://rise4fun.com/Alive/zWi
  The difference is add vs. sub, and left-shift of -1 vs. 1.
* Since we only change the shift opcode, we can preserve the exact-ness: https://rise4fun.com/Alive/4u4
* There can be truncation after high-bit-extraction: https://rise4fun.com/Alive/slHc1 (the main pattern I'm after!)
  Which means that we need to ignore zext of shift amounts and of NBits.
* The sign-extending magic can be extended itself (in the add pattern via sext, in the sub pattern via zext; not the other way around!): https://rise4fun.com/Alive/NhG (or those sext/zext can be sunk into `select`!)
  Which again means we should pay attention when matching NBits.
* We can have both truncation of extraction and widening of magic: https://rise4fun.com/Alive/XTw

In other words, I don't believe we need any checks on the bitwidths of any of these constructs.

This is worsened in general by the fact that we may have `sext` instead of `zext` for shift amounts, and we don't yet canonicalize to `zext`, although we should. I have not done anything about that here.

Also, we really should have something to weed out `sub` like these, by folding them into the `add` variant.

https://bugs.llvm.org/show_bug.cgi?id=42389

llvm-svn: 373964
* [InstCombine] Move isSignBitCheck(), handle rest of the predicates (Roman Lebedev, 2019-10-07; 2 files, -28/+39)

True, no test coverage is being added here. But the non-canonical predicates that are already handled here have no test coverage as far as I can tell. I tried to add tests for them, but all the patterns already get handled elsewhere.

llvm-svn: 373962
* [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask (Roman Lebedev, 2019-10-07; 1 file, -70/+62)

Summary:
Currently, we pre-check whether we need to produce a mask or not. This involves some rather magical constants. I'd like to extend this fold to also handle the situation when there's also a `trunc` before the outer shift. That will require another set of magical constants. It's ugly.

Instead, we can just compute the mask, and check whether the mask is a pass-through (all-ones) or not. This way we don't need to have any magical numbers.

This change is NFC other than the fact that we now compute the mask and then check if we need (and can!) apply it.

Reviewers: spatel
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68470

llvm-svn: 373961
* [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift amounts (Roman Lebedev, 2019-10-07; 1 file, -0/+33)

Summary:
When we do `ConstantExpr::getZExt()`, that "extends" `undef` to `0`, which means that for patterns a/b we'd assume that we must not produce any bits for that channel, while in reality we simply didn't care about that channel - i.e. we don't need to mask it.

Reviewers: spatel
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68239

llvm-svn: 373960
* [InstCombine] fold fneg disguised as select+fmul (PR43497) (Sanjay Patel, 2019-10-06; 1 file, -18/+49)

Extends rL373230 and solves the motivating bug (although in a narrow way):
https://bugs.llvm.org/show_bug.cgi?id=43497

llvm-svn: 373851
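A minimal IR sketch of the kind of pattern this targets; the real fold also has fast-math-flag requirements that I'm glossing over (`nsz` assumed here), and the names are illustrative:

```
define float @fneg_disguised(i1 %cond, float %x) {
  ; multiplying by a select of +/-1.0 is a disguised conditional negation
  %sel = select i1 %cond, float -1.0, float 1.0
  %r = fmul nsz float %sel, %x
  ret float %r
  ; after the fold this becomes roughly:
  ;   %neg = fneg float %x
  ;   %r = select i1 %cond, float %neg, float %x
}
```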
* [InstCombine] don't assume 'inbounds' for bitcast pointer to GEP transform (PR43501) (Sanjay Patel, 2019-10-06; 1 file, -2/+9)

https://bugs.llvm.org/show_bug.cgi?id=43501

We can't declare a GEP 'inbounds' in general. But we may salvage that information if we have known dereferenceable bytes on the source pointer.

Differential Revision: https://reviews.llvm.org/D68244

llvm-svn: 373847
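A sketch of the transform being restricted, assuming the usual bitcast-to-zero-index-GEP canonicalization (types and names illustrative):

```
; before
%c = bitcast [4 x i32]* %p to i32*

; after: the GEP may only carry 'inbounds' when %p is known dereferenceable
; (e.g. it has a dereferenceable(16) attribute); otherwise it must stay plain:
%c = getelementptr [4 x i32], [4 x i32]* %p, i64 0, i64 0
```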
* [InstCombine] Fold 'icmp eq/ne (?trunc (lshr/ashr %x, bitwidth(x)-1)), 0' -> 'icmp sge/slt %x, 0' (Roman Lebedev, 2019-10-04; 1 file, -0/+28)

We do indeed already get it right in some cases, but only transitively, with one-use restrictions. Since we only need to produce a single comparison, it makes sense to match the pattern directly:
https://rise4fun.com/Alive/kPg

llvm-svn: 373802
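A minimal IR sketch of one instance of the fold (names illustrative):

```
define i1 @signbit_check(i64 %x) {
  ; extract the sign bit, truncate, compare against zero
  %s = lshr i64 %x, 63
  %t = trunc i64 %s to i32
  %r = icmp eq i32 %t, 0
  ret i1 %r
  ; after the fold this is a single sign test:
  ;   %r = icmp sge i64 %x, 0
}
```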
* [InstCombine] Right-shift shift amount reassociation with truncation (PR43564, PR42391) (Roman Lebedev, 2019-10-04; 1 file, -15/+19)

Initially (D65380) I believed that if we have rightshift-trunc-rightshift, we can't do any folding. But as it usually happens, I was wrong.
https://rise4fun.com/Alive/GEw
https://rise4fun.com/Alive/gN2O

In https://bugs.llvm.org/show_bug.cgi?id=43564 we happen to have this very sequence of two right shifts separated by trunc. And as it just so happens, we apparently can fold the pattern if the total shift amount is either 0, or it's equal to the bitwidth of the innermost widest shift - i.e. if we are left with only the original sign bit. Which is exactly what is wanted there.

llvm-svn: 373801
* [Alignment][NFC] Remove StoreInst::setAlignment(unsigned) (Guillaume Chatelet, 2019-10-03; 3 files, -18/+17)

Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu, jdoerfert
Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D68268

llvm-svn: 373595
* [InstCombine] Bypass high bit extract before variable sign-extension (PR43523) (Roman Lebedev, 2019-10-02; 2 files, -0/+74)

https://rise4fun.com/Alive/8BY - valid for lshr+trunc+variable sext
https://rise4fun.com/Alive/7jk - the variable sext can be redundant
https://rise4fun.com/Alive/Qslu - 'exact'-ness of the first shift can be preserved
https://rise4fun.com/Alive/IF63 - without trunc we could view this as the more general "drop redundant mask before right-shift", but let's handle it here for now
https://rise4fun.com/Alive/iip - likewise, without trunc, the variable sext can be redundant.

There are more patterns for sure - e.g. we can have 'lshr' as the final shift, but that might be best handled by some more generic transform, e.g. "drop redundant masking before right-shift" (PR42456).

I'm singling out this sext patch because you can only extract high bits with `*shr` (unlike abstract bit masking), and I *know* this fold is wanted by existing code. I don't believe there is much to review here, so I'm gonna opt into post-review mode here.

https://bugs.llvm.org/show_bug.cgi?id=43523

llvm-svn: 373542
* [InstCombine] Deal with -(trunc(X >>u 63)) -> trunc(X >>s 63) (Roman Lebedev, 2019-10-01; 1 file, -12/+25)

Identical to its trunc-less variant: just pretend to hoist the trunc, and everything else still holds:
https://rise4fun.com/Alive/JRU

llvm-svn: 373364
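A minimal IR sketch of the fold (names illustrative):

```
define i32 @negate_trunc_signbit(i64 %x) {
  ; negate the truncated unsigned sign-bit extract (0 or 1 -> 0 or -1)
  %s = lshr i64 %x, 63
  %t = trunc i64 %s to i32
  %r = sub i32 0, %t
  ret i32 %r
  ; after the fold the negation is absorbed into an arithmetic shift:
  ;   %s = ashr i64 %x, 63
  ;   %r = trunc i64 %s to i32
}
```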
* [InstCombine] Preserve 'exact' in -(X >>u 31) -> (X >>s 31) fold (Roman Lebedev, 2019-10-01; 1 file, -2/+6)

https://rise4fun.com/Alive/yR4

llvm-svn: 373363
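A sketch of what preserving the flag means here (illustrative):

```
; before: 'exact' promises no non-zero bits are shifted out
%s = lshr exact i32 %x, 31
%r = sub i32 0, %s

; after: the flag carries over to the arithmetic shift
%r = ashr exact i32 %x, 31
```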
* [InstCombine][NFC] visitShl(): call SimplifyQuery::getWithInstruction() once (Roman Lebedev, 2019-09-30; 1 file, -10/+9)

llvm-svn: 373249
* [InstCombine] fold negate disguised as select+mul (Sanjay Patel, 2019-09-30; 1 file, -0/+19)

```
Name: negate if true
%sel = select i1 %cond, i32 -1, i32 1
%r = mul i32 %sel, %x
=>
%m = sub i32 0, %x
%r = select i1 %cond, i32 %m, i32 %x

Name: negate if false
%sel = select i1 %cond, i32 1, i32 -1
%r = mul i32 %sel, %x
=>
%m = sub i32 0, %x
%r = select i1 %cond, i32 %x, i32 %m
```
https://rise4fun.com/Alive/Nlh

llvm-svn: 373230
* [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned) (Guillaume Chatelet, 2019-09-30; 2 files, -6/+7)

Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet
Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D68141

llvm-svn: 373207
* [Alignment][NFC] Remove LoadInst::setAlignment(unsigned) (Guillaume Chatelet, 2019-09-30; 3 files, -6/+6)

Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jdoerfert
Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D68142

llvm-svn: 373195
* [InstCombine] Simplify shift-by-sext to shift-by-zext (Roman Lebedev, 2019-09-27; 1 file, -0/+7)

Summary:
This is valid for any `sext` bitwidth pair:
```
Processing /tmp/opt.ll..

----------------------------------------
%signed = sext %y
%r = shl %x, %signed
  ret %r
=>
%unsigned = zext %y
%r = shl %x, %unsigned
  ret %r
%signed = sext %y

Done: 2016
Optimization is correct!
```
(This isn't so for funnel shifts; there it's illegal for e.g. i6->i7.)

Main motivation is the C++ semantics:
```
int shl(int a, char b) {
    return a << b;
}
```
ends as
```
%3 = sext i8 %1 to i32
%4 = shl i32 %0, %3
```
https://godbolt.org/z/0jgqUq
which is, as this shows, too pessimistic.

There is another problem here - we can only do the fold if the sext is one-use. But we can trivially have cases where several shifts have the same sext shift amount. This should be resolved later.

Reviewers: spatel, nikic, RKSimon
Reviewed By: spatel
Subscribers: efriedma, hiraditya, nlopes, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68103

llvm-svn: 373106
* [InstCombine] Use m_Zero instead of isNullValue() when checking if a GEP index is all zeroes, to prevent an infinite loop (Craig Topper, 2019-09-26; 1 file, -1/+1)

The test case here previously infinite-looped. Only one element from the GEP is used, so SimplifyDemandedVectorElts would replace the other lanes in each index with undef, leading to the first index being <0, undef, undef, undef>. But there's a GEP transform that tries to replace an index into a 0-sized type with a zero index, and its zero-index check only works on ConstantInt 0 or ConstantAggregateZero, so it would turn the index back into zeroinitializer, resulting in a loop. The fix is to use m_Zero() to allow a vector of zeroes and undefs.

Differential Revision: https://reviews.llvm.org/D67977

llvm-svn: 373000
* [InstCombine] Don't assume CmpInst has been visited in getFlippedStrictnessPredicateAndConstant (Bjorn Pettersson, 2019-09-26; 1 file, -9/+4)

Summary:
Remove an assumption (assert) that the CmpInst has already been simplified in getFlippedStrictnessPredicateAndConstant. The solution is to simply bail out instead of hitting the assertion; we assume that any profitable rewrite will happen in the next iteration of InstCombine.

The reason we can't assume that the CmpInst has already been simplified is that the worklist does not guarantee such an ordering.

Solves https://bugs.llvm.org/show_bug.cgi?id=43376

Reviewers: spatel, lebedev.ri
Reviewed By: lebedev.ri
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68022

llvm-svn: 372972
* [InstCombine] foldUnsignedUnderflowCheck(): one last pattern with 'sub' (PR43251) (Roman Lebedev, 2019-09-25; 1 file, -0/+10)

https://rise4fun.com/Alive/0j9

llvm-svn: 372930
* [InstCombine] Fold (A - B) u>=/u< A --> B u>/u<= A iff B != 0 (Roman Lebedev, 2019-09-25; 3 files, -20/+14)

https://rise4fun.com/Alive/KtL

This also shows that the fold added in D67412 / r372257 was too specific; the new fold allows those test cases to be handled more generically, therefore I delete the now-dead code.

This is yet again motivated by D67122 "[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour".

llvm-svn: 372912
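A minimal sketch of the new fold, assuming %b is known non-zero from context (names illustrative):

```
; before: unsigned-underflow check written against the subtraction
%sub = sub i8 %a, %b
%r = icmp uge i8 %sub, %a

; after, given %b != 0: underflow happened exactly when %b u> %a
%r = icmp ugt i8 %b, %a
```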
* [InstCombine] Limit FMul constant folding for fma simplifications (Florian Hahn, 2019-09-25; 1 file, -3/+15)

As @reames pointed out post-commit, rL371518 adds additional rounding in some cases when doing constant folding of the multiplication. This breaks a guarantee llvm.fma makes and must be avoided.

This patch reapplies rL371518, but splits off the simplifications not requiring rounding from SimplifyFMulInst as SimplifyFMAFMul.

Reviewers: spatel, lebedev.ri, reames, scanon
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D67434

llvm-svn: 372899
* [PatternMatch] Make m_Br more flexible, add matchers for BB values (Florian Hahn, 2019-09-25; 1 file, -5/+3)

Currently m_Br only takes references to BasicBlock*, which limits its flexibility: you have to declare a variable even if you ignore the result, or add extra checks to make sure the matched BB matches an expected one.

This patch adds m_BasicBlock and m_SpecificBB matchers, which can be used like the existing matchers for constants or values.

I also had a look at the existing uses and updated a few. IMO it makes the code a bit more explicit.

Reviewers: spatel, craig.topper, RKSimon, majnemer, lebedev.ri
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D68013

llvm-svn: 372885
* [GCRelocate] Add a peephole to canonicalize base pointer relocation (Philip Reames, 2019-09-24; 1 file, -1/+12)

If we generate the gc.relocate, and then later prove two arguments to the statepoint are equivalent, we should canonicalize the gc.relocate to the form we would have produced if this had been known before rewriting.

llvm-svn: 372771
* [InstCombine] (a+b) < a && (a+b) != 0 -> (0-b) < a iff a/b != 0 (PR43259) (Roman Lebedev, 2019-09-24; 1 file, -4/+23)

Summary:
This is again motivated by the D67122 sanitizer check enhancement. That patch seemingly worsens `-fsanitize=pointer-overflow` overhead from 25% to 50%, which strongly implies missing folds.

For
```
#include <cassert>
char* test(char& base, signed long offset) {
  __builtin_assume(offset < 0);
  return &base + offset;
}
```
we produce https://godbolt.org/z/r40U47, and again those two icmp's can be merged:
```
Name: 0
Pre: C != 0
%adjusted = add i8 %base, C
%not_null = icmp ne i8 %adjusted, 0
%no_underflow = icmp ult i8 %adjusted, %base
%r = and i1 %not_null, %no_underflow
=>
%neg_offset = sub i8 0, C
%r = icmp ugt i8 %base, %neg_offset
```
https://rise4fun.com/Alive/ALap
https://rise4fun.com/Alive/slnN

There are 3 other variants of this pattern; I believe they will all go into InstSimplify.

https://bugs.llvm.org/show_bug.cgi?id=43259

Reviewers: spatel, xbolva00, nikic
Reviewed By: spatel
Subscribers: efriedma, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67849

llvm-svn: 372768
* [InstCombine] (a+b) <= a && (a+b) != 0 -> (0-b) < a (PR43259) (Roman Lebedev, 2019-09-24; 1 file, -2/+21)

Summary:
This is again motivated by the D67122 sanitizer check enhancement. That patch seemingly worsens `-fsanitize=pointer-overflow` overhead from 25% to 50%, which strongly implies missing folds.

This pattern isn't exactly what we get there (strict vs. non-strict predicate), but this pattern does not require known-bits analysis, so it is best to handle it first.
```
Name: 0
%adjusted = add i8 %base, %offset
%not_null = icmp ne i8 %adjusted, 0
%no_underflow = icmp ule i8 %adjusted, %base
%r = and i1 %not_null, %no_underflow
=>
%neg_offset = sub i8 0, %offset
%r = icmp ugt i8 %base, %neg_offset
```
https://rise4fun.com/Alive/knp

There are 3 other variants of this pattern; they will all go into InstSimplify: https://rise4fun.com/Alive/bIDZ

https://bugs.llvm.org/show_bug.cgi?id=43259

Reviewers: spatel, xbolva00, nikic
Reviewed By: spatel
Subscribers: hiraditya, majnemer, vsk, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67846

llvm-svn: 372767
* [InstCombine] Fold a shifty implementation of clamp-to-allones (Huihui Zhang, 2019-09-24; 1 file, -0/+15)

Summary:
Fold
  or(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X)
into
  X s> Y ? -1 : X
https://rise4fun.com/Alive/d8Ab

clamp255 is a common operator in image processing; it can be implemented in a shifty way as "(255 - X) >> 31 | X & 255". Folding the shift into a select enables more optimization, e.g., vmin generation for the ARM target.

Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon
Reviewed By: lebedev.ri
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67800

llvm-svn: 372678
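A minimal IR sketch of the shifty clamp-to-allones pattern (names illustrative):

```
define i32 @clamp_to_allones(i32 %x, i32 %y) {
  ; (y - x) s>> 31 is all-ones exactly when x s> y
  %sub = sub nsw i32 %y, %x
  %sh = ashr i32 %sub, 31
  %r = or i32 %sh, %x
  ret i32 %r
  ; after the fold:
  ;   %cmp = icmp sgt i32 %x, %y
  ;   %r = select i1 %cmp, i32 -1, i32 %x
}
```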
* [InstCombine] Fold a shifty implementation of clamp-to-zero (Huihui Zhang, 2019-09-24; 1 file, -0/+14)

Summary:
Fold
  and(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X)
into
  X s> Y ? X : 0
https://rise4fun.com/Alive/lFH

Folding the shift into a select enables more optimization, e.g., vmax generation for the ARM target.

Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon
Reviewed By: lebedev.ri
Subscribers: xbolva00, andreadb, craig.topper, RKSimon, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67799

llvm-svn: 372676
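The symmetric sketch for clamp-to-zero; the same sign mask now feeds an `and` (names illustrative):

```
define i32 @clamp_to_zero(i32 %x, i32 %y) {
  ; the mask is all-ones when x s> y, zero otherwise
  %sub = sub nsw i32 %y, %x
  %sh = ashr i32 %sub, 31
  %r = and i32 %sh, %x
  ret i32 %r
  ; after the fold:
  ;   %cmp = icmp sgt i32 %x, %y
  ;   %r = select i1 %cmp, i32 %x, i32 0
}
```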
* [InstCombine] foldOrOfICmps(): Acquire SimplifyQuery with set CxtI (Roman Lebedev, 2019-09-23; 1 file, -2/+4)

Extracted from https://reviews.llvm.org/D67849#inline-610377

llvm-svn: 372654
* [InstCombine] foldAndOfICmps(): Acquire SimplifyQuery with set CxtI (Roman Lebedev, 2019-09-23; 1 file, -2/+4)

Extracted from https://reviews.llvm.org/D67849#inline-610377

llvm-svn: 372653
* [InstCombine] Annotate strndup calls with dereferenceable_or_null (David Bolvansky, 2019-09-23; 1 file, -9/+18)

"Implementations are free to malloc() a buffer containing either (size + 1) bytes or (strnlen(s, size) + 1) bytes. Applications should not assume that strndup() will allocate (size + 1) bytes when strlen(s) is smaller than size."

llvm-svn: 372647
* [IR] Add getExtendedType() to IntegerType and Type (dispatching to IntegerType or VectorType) (Roman Lebedev, 2019-09-23; 1 file, -10/+2)

llvm-svn: 372638
* [InstCombine] dropRedundantMaskingOfLeftShiftInput(): improve comment (Roman Lebedev, 2019-09-23; 1 file, -4/+4)

llvm-svn: 372637
* [SLC] Convert some strndup calls to strdup calls (David Bolvansky, 2019-09-23; 1 file, -3/+3)

Summary:
Motivation:
- If we can fold it to strdup, we should (strndup does more things than strdup).
- Annotation mechanism (works well for strdup).

strdup and strndup are part of C 20 (currently POSIX functions), so we should optimize them.

Reviewers: efriedma, jdoerfert
Reviewed By: jdoerfert
Subscribers: lebedev.ri, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67679

llvm-svn: 372636
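A sketch of the kind of conversion this enables, assuming the source string's length is known and below the bound (names illustrative):

```
@.str = private constant [6 x i8] c"hello\00"

declare i8* @strndup(i8*, i64)

define i8* @dup_prefix() {
  %s = getelementptr inbounds [6 x i8], [6 x i8]* @.str, i64 0, i64 0
  ; strlen("hello") = 5 < 100, so this copies the whole string anyway
  ; and can be converted to: %p = call i8* @strdup(i8* %s)
  %p = call i8* @strndup(i8* %s, i64 100)
  ret i8* %p
}
```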
* [InstCombine] dropRedundantMaskingOfLeftShiftInput(): pat. c/d/e with mask (PR42563) (Roman Lebedev, 2019-09-23; 1 file, -3/+28)

Summary:
If we have a pattern `(x & (-1 >> maskNbits)) << shiftNbits`, we already know (have a fold) that will drop the `& (-1 >> maskNbits)` mask iff `(shiftNbits-maskNbits) s>= 0` (i.e. `shiftNbits u>= maskNbits`). So even if `(shiftNbits-maskNbits) s< 0`, we can still fold, we will just need to apply a **constant** mask afterwards:
```
Name: c, normal+mask
%t0 = lshr i32 -1, C1
%t1 = and i32 %t0, %x
%r = shl i32 %t1, C2
=>
%n0 = shl i32 %x, C2
%n1 = i32 ((-(C2-C1))+32)
%n2 = zext i32 %n1 to i64
%n3 = lshr i64 -1, %n2
%n4 = trunc i64 %n3 to i32
%r = and i32 %n0, %n4
```
https://rise4fun.com/Alive/gslRa

Naturally, the old `%masked` will have to be one-use. This is not valid for pattern f - where "masking" is done via `ashr`.

https://bugs.llvm.org/show_bug.cgi?id=42563

Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67725

llvm-svn: 372630
* [InstCombine] dropRedundantMaskingOfLeftShiftInput(): pat. a/b with mask (PR42563) (Roman Lebedev, 2019-09-23; 1 file, -3/+30)

Summary:
And this is **finally** the interesting part of that fold!

If we have a pattern `(x & (~(-1 << maskNbits))) << shiftNbits`, we already know (have a fold) that will drop the `& (~(-1 << maskNbits))` mask iff `(maskNbits+shiftNbits) u>= bitwidth(x)`. But that is actually ignorant; there's a more general fold here: in this pattern, `(maskNbits+shiftNbits)` actually correlates with the number of low bits that will remain in the final value. So even if `(maskNbits+shiftNbits) u< bitwidth(x)`, we can still fold, we will just need to apply a **constant** mask afterwards:
```
Name: a, normal+mask
%onebit = shl i32 -1, C1
%mask = xor i32 %onebit, -1
%masked = and i32 %mask, %x
%r = shl i32 %masked, C2
=>
%n0 = shl i32 %x, C2
%n1 = add i32 C1, C2
%n2 = zext i32 %n1 to i64
%n3 = shl i64 -1, %n2
%n4 = xor i64 %n3, -1
%n5 = trunc i64 %n4 to i32
%r = and i32 %n0, %n5
```
https://rise4fun.com/Alive/F5R

Naturally, the old `%masked` will have to be one-use. A similar fold exists for patterns c, d, e; will post that patch later.

https://bugs.llvm.org/show_bug.cgi?id=42563

Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67677

llvm-svn: 372629
* [InstCombine] foldUnsignedUnderflowCheck(): s/Subtracted/ZeroCmpOp/ (Roman Lebedev, 2019-09-23; 1 file, -7/+7)

llvm-svn: 372625
* [InstCombine] allow icmp+binop folds before min/max bailout (PR43310) (Sanjay Patel, 2019-09-22; 2 files, -3/+4)

This has the potential to uncover missed analysis/folds as shown in the min/max code comment/test, but fewer restrictions on icmp folds should be better in general to solve cases like:
https://bugs.llvm.org/show_bug.cgi?id=43310

llvm-svn: 372510
* [InstCombine] Simplify @llvm.usub.with.overflow+non-zero check (PR43251) (Roman Lebedev, 2019-09-19; 1 file, -0/+21)

Summary:
This is again motivated by the D67122 sanitizer check enhancement. That patch seemingly worsens `-fsanitize=pointer-overflow` overhead from 25% to 50%, which strongly implies missing folds.

In this particular case, given
```
char* test(char& base, unsigned long offset) {
  return &base - offset;
}
```
it will end up producing something like https://godbolt.org/z/luGEju, which after optimizations reduces down to roughly
```
declare void @use64(i64)
define i1 @test(i8* dereferenceable(1) %base, i64 %offset) {
  %base_int = ptrtoint i8* %base to i64
  %adjusted = sub i64 %base_int, %offset
  call void @use64(i64 %adjusted)
  %not_null = icmp ne i64 %adjusted, 0
  %no_underflow = icmp ule i64 %adjusted, %base_int
  %no_underflow_and_not_null = and i1 %not_null, %no_underflow
  ret i1 %no_underflow_and_not_null
}
```
Without D67122 there was no `%not_null`, and in this particular case we can "get rid of it" by merging the two checks: here we are checking `Base u>= Offset && (Base u- Offset) != 0`, but that is simply `Base u> Offset`.

Alive proofs: https://rise4fun.com/Alive/QOs

The `@llvm.usub.with.overflow` pattern itself is not handled here, because this is the main pattern that we currently consider canonical.

https://bugs.llvm.org/show_bug.cgi?id=43251

Reviewers: spatel, nikic, xbolva00, majnemer
Reviewed By: xbolva00, majnemer
Subscribers: vsk, majnemer, xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67356

llvm-svn: 372341
* [InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251) (Roman Lebedev, 2019-09-18; 1 file, -0/+52)

Summary:
I don't have a direct motivational case for this, but it would be good to have it for completeness/symmetry. This pattern is basically the motivational pattern from https://bugs.llvm.org/show_bug.cgi?id=43251, but with a different predicate that requires that the offset is non-zero.

The completeness bit comes from the fact that a similar pattern (offset != zero) will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259, so it seems good not to overlook very similar patterns.

Proofs: https://rise4fun.com/Alive/21b

Also, there is something odd with `isKnownNonZero()`: if the non-zero knowledge was specified as an assumption, it didn't pick it up (PR43267).

With this, I see no other missing folds for https://bugs.llvm.org/show_bug.cgi?id=43251.

Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67412

llvm-svn: 372257
* [InstCombine] dropRedundantMaskingOfLeftShiftInput(): some cleanup before upcoming patch (Roman Lebedev, 2019-09-18; 1 file, -5/+8)

llvm-svn: 372245
* [NFC][InstCombine] dropRedundantMaskingOfLeftShiftInput(): some NFC diff shaving (Roman Lebedev, 2019-09-17; 1 file, -8/+11)

llvm-svn: 372171
* [InstCombine] Annotate strdup with deref_or_null (David Bolvansky, 2019-09-17; 1 file, -0/+6)

llvm-svn: 372098
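A sketch of what the annotation could look like when the source string's length is known; the strlen+1 bound is my assumption for this illustration, and the names are not from the patch:

```
@.str = private constant [6 x i8] c"hello\00"

declare i8* @strdup(i8*)

define i8* @dup() {
  %s = getelementptr inbounds [6 x i8], [6 x i8]* @.str, i64 0, i64 0
  ; the source is a known 5-byte string, so the result, if non-null,
  ; points at a buffer of at least 6 dereferenceable bytes
  %p = call dereferenceable_or_null(6) i8* @strdup(i8* %s)
  ret i8* %p
}
```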
* [InstCombine] remove unneeded one-use checks for icmp fold (Sanjay Patel, 2019-09-16; 1 file, -4/+1)

Related folds were added in rL125734. The code comment about register pressure is discussed in more detail in:
https://bugs.llvm.org/show_bug.cgi?id=2698
But 10 years later, perf testing bzip2 with this change now shows a slight (0.2% average) improvement on Haswell, although that's probably within test noise.

Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
i.e., if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns.

rL371940 and rL371981 are related patches in this series.

llvm-svn: 372007
* [InstCombine] remove unneeded one-use checks for icmp fold (Sanjay Patel, 2019-09-16; 1 file, -4/+1)

This fold and several others were added in rL125734 (https://reviews.llvm.org/rL125734) with no explanation for the one-use checks other than the code comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
i.e., if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns.

rL371940 is a related patch in this series.

llvm-svn: 371981