summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
...
* [InstCombine] add tests for shift-logic-shift; NFCSanjay Patel2019-11-051-0/+171
| | | | | | This is based on existing CodeGen test files for x86 and AArch64. The corresponding potential transform is shown in: rL370617
* [InstCombine] dropRedundantMaskingOfLeftShiftInput(): truncation (PR42563)Roman Lebedev2019-11-0511-108/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: That fold keeps growing and growing :( I think this may be one of the last pieces for it. Since D67677/D67725, the fold knowns the general form of the pattern - where some masking is needed: https://rise4fun.com/Alive/F5R https://rise4fun.com/Alive/gslRa But there is one more huge piece missing - if you are extracting some bits, it is not impossible that the origin is wider than the extraction, i.e. there may be a truncation. And we don't deal with that yet. But we can, and the generalization remains fully identical: https://rise4fun.com/Alive/Uar https://rise4fun.com/Alive/5SW After a preparatory cleanup i think the diff looks rather clean. One missing piece is that in some patterns (especially pat. b), `-1` only needs to be `-1` in final type, but that is for later.. https://bugs.llvm.org/show_bug.cgi?id=42563 Reviewers: spatel, nikic Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69125
* [InstCombine] regenerate test checks; NFCSanjay Patel2019-11-013-186/+162
| | | | Avoid subsequent test noise from improved CHECK-LABEL matching.
* [ValueTracking] Allow context-sensitive nullness check for non-pointersJohannes Doerfert2019-10-311-2/+2
| | | | | | | | | | | | | | | Same as D60846 but with a fix for the problem encountered there which was a missing context adjustment in the handling of PHI nodes. The test that caused D60846 to be reverted was added in e15ab8f277c7. Reviewers: nikic, nlopes, mkazantsev,spatel, dlrobertson, uabelho, hakzsam Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69571
* [InstCombine] Add the test that triggered the D60846 revertJohannes Doerfert2019-10-311-0/+42
| | | | This is in preparation of D69571.
* [InstCombine] add fast-math-flags for better test coverage; NFCSanjay Patel2019-10-311-50/+50
| | | | In all cases, we currently unintentionally drop the FMF on the new select.
* [InstCombine] regenerate test checks; NFCSanjay Patel2019-10-311-48/+63
|
* [InstCombine] Canonicalize uadd.with.overflow to uadd.satDavid Green2019-10-311-8/+2
| | | | | | | | | | | This adds some patterns to transform uadd.with.overflow to uadd.sat (with usub.with.overflow to usub.sat too). The patterns selects from UINTMAX (or 0 for subs) depending on whether the operation overflowed. Signed patterns are a little more involved (they can wrap in two directions), but can be added here in a followup patch too. Differential Revision: https://reviews.llvm.org/D69245
* [InstCombine] keep assumption before sinking callstyker2019-10-311-0/+192
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: in the following C code the branch is not removed by clang in O3. ``` int f1(char* p) { int i1 = __builtin_strlen(p); if (!p) return -1; return i1; } ``` The issue is that the call to strlen is sunk to the following block by instcombine. In its new place the call to strlen doesn't dominate the use in the icmp anymore so value tracking can't see that p cannot be null. This patch resolves the issue by inserting an assumption at the place of the call before sinking a call when that call can be used to prove an argument to be nonnull. This resolves this issue at O3. Reviewers: majnemer, xbolva00, fhahn, jdoerfert, spatel, efriedma Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69477
* [InstCombine] make icmp vector canonicalization safe for constant with undef ↵Sanjay Patel2019-10-297-9/+9
| | | | | | | | | | | | | | | | | elements This is a fix for: https://bugs.llvm.org/show_bug.cgi?id=43730 ...and as shown there, we have existing test cases that show potential miscompiles. We could just bail out for vector constants that contain any undef elements, or we can do as shown here: allow the transform, but replace the undefs with a safe value. For most of the tests shown, this results in a full splat constant (no undefs) which is probably a win for further IR analysis because we conservatively don't match undefs in most cases. Codegen can probably recover these kinds of undef lanes via demanded elements analysis if that's profitable. Differential Revision: https://reviews.llvm.org/D69519
* [InstCombine] add tests for icmp predicate canonicalization with vector ↵Sanjay Patel2019-10-291-5/+14
| | | | | | types; NFC Increase coverage for D69519.
* [ConstantFold] Fold extractelement of getelementptrJay Foad2019-10-281-1/+1
| | | | | | | | | | | | | | Summary: Getelementptr has vector type if any of its operands are vectors (the scalar operands being implicitly broadcast to all vector elements). Extractelement applied to a vector getelementptr can be folded by applying the extractelement in turn to all of the vector operands. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69379
* [InstCombine] Extra combine for uadd_satDavid Green2019-10-281-8/+4
| | | | | | | This is an extra fold for a canonical form of uadd_sat, as shown in D68651. It essentially selects uadd from an add and a select. Differential Revision: https://reviews.llvm.org/D69244
* [InstCombine][NFC] Tests for uadd.sat and sadd.sat canonicalisation.David Green2019-10-282-0/+778
|
* [InstCombine] Signed saturation patternsDavid Green2019-10-221-124/+30
| | | | | | | | | | | | This adds an instcombine matcher for code that attempts to perform signed saturating arithmetic by casting to a higher type. Unsigned cases are already matched, this adds extra matches for the more complex signed cases, which involves matching the min(max(add a b)) nodes with proper extends to ensure legality. Differential Revision: https://reviews.llvm.org/D68651 llvm-svn: 375505
* [InstCombine] Signed saturation tests. NFCDavid Green2019-10-221-0/+597
| | | | llvm-svn: 375503
* Pre-commit test cases for D64713.Jay Foad2019-10-212-0/+64
| | | | llvm-svn: 375418
* [InstCombine] Allow values with multiple users in SimplifyDemandedVectorEltsPiotr Sobczak2019-10-211-4/+71
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Allow for ignoring the check for a single use in SimplifyDemandedVectorElts to be able to simplify operands if DemandedElts is known to contain the union of elements used by all users. It is a responsibility of a caller of SimplifyDemandedVectorElts to supply correct DemandedElts. Simplify a series of extractelement instructions if only a subset of elements is used. Reviewers: reames, arsenm, majnemer, nhaehnle Reviewed By: nhaehnle Subscribers: wdng, jvesely, nhaehnle, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67345 llvm-svn: 375395
* [InstCombine] conditional sign-extend of high-bit-extract: 'or' pattern.Roman Lebedev2019-10-201-3/+3
| | | | | | | | | | | | | | In this pattern, all the "magic" bits that we'd `add` are all high sign bits, and in the value we'd be adding to they are all unset, not unexpectedly, so we can have an `or` there: https://rise4fun.com/Alive/ups It is possible that `haveNoCommonBitsSet()` should be taught about this pattern so that we never have an `add` variant, but the reasoning would need to be recursive (because of that `select`), so i'm not really sure that would be worth it just yet. llvm-svn: 375378
* [NFC][InstCombine] conditional sign-extend of high-bit-extract: 'and' pat. ↵Roman Lebedev2019-10-201-0/+99
| | | | | | | | | | | can be 'or' pattern. In this pattern, all the "magic" bits that we'd add are all high sign bits, and in the value we'd be adding to they are all unset, not unexpectedly, so we can have an `or` there: https://rise4fun.com/Alive/ups llvm-svn: 375377
* [InstCombine] Fold uadd.sat(a, b) == 0 and usub.sat(a, b) == 0Nikita Popov2019-10-201-10/+8
| | | | | | | | | | | | | This adds folds for comparing uadd.sat/usub.sat with zero: * uadd.sat(a, b) == 0 => a == 0 && b == 0 => (a | b) == 0 * usub.sat(a, b) == 0 => a <= b And inverted forms for !=. Differential Revision: https://reviews.llvm.org/D69224 llvm-svn: 375374
* [InstCombine] Add tests for uadd/sub.sat(a, b) == 0; NFCNikita Popov2019-10-201-0/+44
| | | | llvm-svn: 375372
* [InstCombine] Shift amount reassociation in shifty sign bit test (PR43595)Roman Lebedev2019-10-201-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This problem consists of several parts: * Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`. This is trivial, and easy to do, we have a fold for it. * Shift amount reassociation - if we have two identical shifts, and we can simplify-add their shift amounts together, then we likely can just perform them as a single shift. But this is finicky, has one-use restrictions, and shift opcodes must be identical. But there is a super-pattern where both of these work together. to produce sign bit test from two shifts + comparison. We do indeed already handle this in most cases. But since we get that fold transitively, it has one-use restrictions. And what's worse, in this case the right-shifts aren't required to be identical, and we can't handle that transitively: If the total shift amount is bitwidth-1, only a sign bit will remain in the output value. But if we look at this from the perspective of two shifts, we can't fold - we can't possibly know what bit pattern we'd produce via two shifts, it will be *some* kind of a mask produced from original sign bit, but we just can't tell it's shape: https://rise4fun.com/Alive/cM0 https://rise4fun.com/Alive/9IN But it will *only* contain sign bit and zeros. So from the perspective of sign bit test, we're good: https://rise4fun.com/Alive/FRz https://rise4fun.com/Alive/qBU Superb! So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a sudo-analysis mode that will ignore extra-uses, and will only check whether a) those are two right shifts and b) they end up with bitwidth(x)-1 shift amount and return either the original value that we sign-checking, or null. This does not have any functionality change for the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()`. All that being said, as disscussed in the review, this yet again increases usage of instsimplify in instcombine as utility. Some day that may need to be reevaluated. https://bugs.llvm.org/show_bug.cgi?id=43595 Reviewers: spatel, efriedma, vsk Reviewed By: spatel Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68930 llvm-svn: 375371
* [InstCombine] Fix miscompile bug in canEvaluateShuffledBjorn Pettersson2019-10-181-5/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add restrictions in canEvaluateShuffled to prevent that we for example transform %0 = insertelement <2 x i16> undef, i16 %a, i32 0 %1 = srem <2 x i16> %0, <i16 2, i16 1> %2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0> into %1 = insertelement <2 x i16> undef, i16 %a, i32 1 %2 = srem <2 x i16> %1, <i16 undef, i16 2> as having an undef denominator makes the srem undefined (for all vector elements). Fixes: https://bugs.llvm.org/show_bug.cgi?id=43689 Reviewers: spatel, lebedev.ri Reviewed By: spatel, lebedev.ri Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69038 llvm-svn: 375208
* [InstCombine] Pre-commit of test case showing miscompile bug in ↵Bjorn Pettersson2019-10-181-0/+101
| | | | | | | | | | | | | | | | | | | | canEvaluateShuffled Adding the reproducer from https://bugs.llvm.org/show_bug.cgi?id=43689, showing that instcombine is doing a bad transform. It transforms %0 = insertelement <2 x i16> undef, i16 %a, i32 0 %1 = srem <2 x i16> %0, <i16 2, i16 1> %2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0> into %1 = insertelement <2 x i16> undef, i16 %a, i32 1 %2 = srem <2 x i16> %1, <i16 undef, i16 2> The undef denominator makes the whole srem undefined. llvm-svn: 375207
* [NFC][InstCombine] Tests for "fold variable mask before variable ↵Roman Lebedev2019-10-1711-0/+2430
| | | | | | | | shift-of-trunc" (PR42563) https://bugs.llvm.org/show_bug.cgi?id=42563 llvm-svn: 375135
* [Analysis] Don't assume that unsigned overflow can't happen in EmitGEPOffset ↵Mikhail Maltsev2019-10-175-25/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (PR42699) Summary: Currently when computing a GEP offset using the function EmitGEPOffset for the following instruction getelementptr inbounds i32, i32* %p, i64 %offs we get mul nuw i64 %offs, 4 Unfortunately we cannot assume that unsigned wrapping won't happen here because %offs is allowed to be negative. Making such assumptions can lead to miscompilations: see the new test test24_neg_offs in InstCombine/icmp.ll. Without the patch InstCombine would generate the following comparison: icmp eq i64 %offs, 4611686018427387902; 0x3ffffffffffffffe Whereas the correct value to compare with is -2. This patch replaces the NUW flag with NSW in the multiplication instructions generated by EmitGEPOffset and adjusts the test suite. https://bugs.llvm.org/show_bug.cgi?id=42699 Reviewers: chandlerc, craig.topper, ostannard, lebedev.ri, spatel, efriedma, nlopes, aqjune Reviewed By: lebedev.ri Subscribers: reames, lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68342 llvm-svn: 375089
* [InstCombine][AMDGPU] Fix crash with v3i16/v3f16 buffer intrinsicsPiotr Sobczak2019-10-161-0/+45
| | | | | | | | | | | | | | | | | | | | Summary: This is something of a workaround to avoid a crash later on in type legalizer (WidenVectorResult()). Also added some f16 tests, including a non-working v3f16 case with a FIXME. Reviewers: arsenm, tpr, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68865 llvm-svn: 374993
* [InstCombine] fold a shifted bool zext to a select (2nd try)Sanjay Patel2019-10-152-8/+15
| | | | | | | | | | | | | | | | | | | | The 1st attempt at rL374828 inserted the code at the wrong position (outside of the constant-shift-amount block). Trying again with an additional test to verify const-ness. For a constant shift amount, add the following fold. shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0) https://rise4fun.com/Alive/IZ9 Fixes PR42257. Based on original patch by @zvi (Zvi Rackover) Differential Revision: https://reviews.llvm.org/D63382 llvm-svn: 374886
* Revert [InstCombine] fold a shifted bool zext to a selectSanjay Patel2019-10-142-4/+8
| | | | | | This reverts r374828 (git commit 1f40f15d54aac06421448b6de131231d2d78bc75) due to bot breakage llvm-svn: 374851
* [InstCombine] fold a shifted bool zext to a selectSanjay Patel2019-10-142-8/+4
| | | | | | | | | | | | | | | For a constant shift amount, add the following fold. shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0) https://rise4fun.com/Alive/IZ9 Fixes PR42257. Based on original patch by @zvi (Zvi Rackover) Differential Revision: https://reviews.llvm.org/D63382 llvm-svn: 374828
* [InstCombine] add tests for select/shift transforms; NFCSanjay Patel2019-10-142-0/+59
| | | | | | A transform proposal for the shift form is in D63382. llvm-svn: 374818
* [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperatorCameron McInally2019-10-144-17/+17
| | | | | | | | Reapply r374240 with fix for Ocaml test, namely Bindings/OCaml/core.ml. Differential Revision: https://reviews.llvm.org/D61675 llvm-svn: 374782
* [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in ↵Sanjay Patel2019-10-131-2/+2
| | | | | | | | | | | | | | non-default address space Follow-up to D68244 to account for a corner case discussed in: https://bugs.llvm.org/show_bug.cgi?id=43501 Add one more restriction: if the pointer is deref-or-null and in a non-default (non-zero) address space, we can't assume inbounds. Differential Revision: https://reviews.llvm.org/D68706 llvm-svn: 374728
* [NFC][InstCombine] More test for "sign bit test via shifts" pattern (PR43595)Roman Lebedev2019-10-136-13/+377
| | | | | | | | | | | | | | | | While that pattern is indirectly handled via reassociateShiftAmtsOfTwoSameDirectionShifts(), that incursme one-use restriction on truncation, which is pointless since we know that we'll produce a single instruction. Additionally, *if* we are only looking for sign bit, we don't need shifts to be identical, which isn't the case in general, and is the blocker for me in bug in question: https://bugs.llvm.org/show_bug.cgi?id=43595 llvm-svn: 374726
* [InstCombine] Add test case for PR43617 (NFC)Evandro Menezes2019-10-101-0/+10
| | | | | | Also, refactor check in `LibCallSimplifier::optimizeLog()`. llvm-svn: 374453
* Revert "[IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator"Dmitri Gribenko2019-10-104-17/+17
| | | | | | | This reverts commit r374240. It broke OCaml tests: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19014 llvm-svn: 374354
* [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperatorCameron McInally2019-10-094-17/+17
| | | | | | | | Also update Clang to call Builder.CreateFNeg(...) for UnaryMinus. Differential Revision: https://reviews.llvm.org/D61675 llvm-svn: 374240
* [InstCombine] add another test for gep inbounds; NFCSanjay Patel2019-10-091-0/+11
| | | | llvm-svn: 374190
* [InstCombine] Fold conditional sign-extend of high-bit-extract into ↵Roman Lebedev2019-10-071-19/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | high-bit-extract-with-signext (PR42389) This can come up in Bit Stream abstractions. The pattern looks big/scary, but it can't be simplified any further. It only is so simple because a number of my preparatory folds had happened already (shift amount reassociation / shift amount reassociation in bit test, sign bit test detection). Highlights: * There are two main flavors: https://rise4fun.com/Alive/zWi The difference is add vs. sub, and left-shift of -1 vs. 1 * Since we only change the shift opcode, we can preserve the exact-ness: https://rise4fun.com/Alive/4u4 * There can be truncation after high-bit-extraction: https://rise4fun.com/Alive/slHc1 (the main pattern i'm after!) Which means that we need to ignore zext of shift amounts and of NBits. * The sign-extending magic can be extended itself (in add pattern via sext, in sub pattern via zext. not the other way around!) https://rise4fun.com/Alive/NhG (or those sext/zext can be sinked into `select`!) Which again means we should pay attention when matching NBits. * We can have both truncation of extraction and widening of magic: https://rise4fun.com/Alive/XTw In other words, i don't believe we need to have any checks on bitwidths of any of these constructs. This is worsened in general by the fact that we may have `sext` instead of `zext` for shift amounts, and we don't yet canonicalize to `zext`, although we should. I have not done anything about that here. Also, we really should have something to weed out `sub` like these, by folding them into `add` variant. https://bugs.llvm.org/show_bug.cgi?id=42389 llvm-svn: 373964
* [InstCombine][NFC] Tests for "conditional sign-extend of high-bit-extract" ↵Roman Lebedev2019-10-071-0/+1040
| | | | | | | | pattern (PR42389) https://bugs.llvm.org/show_bug.cgi?id=42389 llvm-svn: 373963
* [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift ↵Roman Lebedev2019-10-075-5/+5
| | | | | | | | | | | | | | | | | | | | | | amounts Summary: When we do `ConstantExpr::getZExt()`, that "extends" `undef` to `0`, which means that for patterns a/b we'd assume that we must not produce any bits for that channel, while in reality we simply didn't care about that channel - i.e. we don't need to mask it. Reviewers: spatel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68239 llvm-svn: 373960
* [InstCombine] fold fneg disguised as select+fmul (PR43497)Sanjay Patel2019-10-061-12/+16
| | | | | | | Extends rL373230 and solves the motivating bug (although in a narrow way): https://bugs.llvm.org/show_bug.cgi?id=43497 llvm-svn: 373851
* [InstCombine] add fast-math-flags for better test coverage; NFCSanjay Patel2019-10-061-4/+4
| | | | llvm-svn: 373848
* [InstCombine] don't assume 'inbounds' for bitcast pointer to GEP transform ↵Sanjay Patel2019-10-065-13/+48
| | | | | | | | | | | | (PR43501) https://bugs.llvm.org/show_bug.cgi?id=43501 We can't declare a GEP 'inbounds' in general. But we may salvage that information if we have known dereferenceable bytes on the source pointer. Differential Revision: https://reviews.llvm.org/D68244 llvm-svn: 373847
* [InstCombine] Fold 'icmp eq/ne (?trunc (lshr/ashr %x, bitwidth(x)-1)), 0' -> ↵Roman Lebedev2019-10-042-13/+9
| | | | | | | | | | | 'icmp sge/slt %x, 0' We do indeed already get it right in some cases, but only transitively, with one-use restrictions. Since we only need to produce a single comparison, it makes sense to match the pattern directly: https://rise4fun.com/Alive/kPg llvm-svn: 373802
* [InstCombine] Right-shift shift amount reassociation with truncation ↵Roman Lebedev2019-10-043-89/+33
| | | | | | | | | | | | | | | | | | | (PR43564, PR42391) Initially (D65380) i believed that if we have rightshift-trunc-rightshift, we can't do any folding. But as it usually happens, i was wrong. https://rise4fun.com/Alive/GEw https://rise4fun.com/Alive/gN2O In https://bugs.llvm.org/show_bug.cgi?id=43564 we happen to have this very sequence, of two right shifts separated by trunc. And "just" so that happens, we apparently can fold the pattern if the total shift amount is either 0, or it's equal to the bitwidth of the innermost widest shift - i.e. if we are left with only the original sign bit. Which is exactly what is wanted there. llvm-svn: 373801
* [NFC][InstCombine] Autogenerate shift.ll testRoman Lebedev2019-10-041-114/+114
| | | | llvm-svn: 373800
* [NFC][InstCombine] Autogenerate icmp-shr-lt-gt.ll testRoman Lebedev2019-10-041-88/+89
| | | | llvm-svn: 373799
* [NFC][InstCombine] Tests for bit test via highest sign-bit extract (w/ ↵Roman Lebedev2019-10-041-0/+182
| | | | | | | | trunc) (PR43564) https://rise4fun.com/Alive/x5IS llvm-svn: 373798
OpenPOWER on IntegriCloud