path: root/llvm/lib/Transforms/InstCombine
Commit message | Author | Age | Files | Lines
...
* [InstCombine] reassociate loop invariant GEP chains to enable LICMSebastian Pop2018-03-261-0/+17
  This change brings performance of zlib up by 10%. The example below is from a
  hot loop in longest_match() from zlib.

    do.body:
      %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
      %idx.ext = zext i32 %cur_match.addr.0 to i64
      %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
      %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
      %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1

  In this example %idx.ext1 is loop invariant. It will be moved above the use of
  the loop induction variable %idx.ext so that it can be hoisted out of the loop
  by LICM. The operands that have loop-carried dependences will be sunk down the
  GEP chain. This patch will produce the following output:

    do.body:
      %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
      %idx.ext = zext i32 %cur_match.addr.0 to i64
      %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
      %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
      %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext

  llvm-svn: 328539
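The GEP reordering in the entry above can be pictured at source level. The following is only a sketch under assumptions: the function names, the `offsets` array (standing in for the loop-varying %idx.ext) and `inv` (standing in for the loop-invariant %idx.ext1) are hypothetical and are not taken from zlib or from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Original chain order: ((win + idx) + inv) - 1.
// The invariant offset sits behind the loop-varying one, so nothing hoists.
inline unsigned sum_original(const std::uint8_t *win, const std::size_t *offsets,
                             std::size_t n, std::size_t inv) {
  assert(inv >= 1 && "sketch assumes the final address stays in bounds");
  unsigned sum = 0;
  for (std::size_t i = 0; i < n; ++i) {
    const std::uint8_t *p = win + offsets[i]; // loop-varying index first
    p = p + inv;                              // loop-invariant index second
    p = p - 1;
    sum += *p;
  }
  return sum;
}

// Reassociated chain order: ((win + inv) - 1) + idx.
// The invariant prefix is computed once, outside the loop.
inline unsigned sum_reassociated(const std::uint8_t *win,
                                 const std::size_t *offsets, std::size_t n,
                                 std::size_t inv) {
  assert(inv >= 1 && "sketch assumes the final address stays in bounds");
  const std::uint8_t *base = win + inv - 1; // hoisted invariant part
  unsigned sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += base[offsets[i]]; // only the loop-varying index remains
  return sum;
}
```

Both functions read the same addresses; only the order in which the offsets are applied differs, which is what lets a later pass move the invariant part out of the loop.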
* [InstCombine] distribute fmul over fadd/fsubSanjay Patel2018-03-262-100/+15
  This replaces a large chunk of code that was looking for compound patterns
  that include these sub-patterns. Existing tests ensure that all of the
  previous examples are still folded as expected.

  We still need to loosen the FMF check.

  llvm-svn: 328502
* [InstCombine] check uses before creating instructions for fmul distributionSanjay Patel2018-03-261-1/+1
  As the tests show, we could create extra instructions without any obvious
  benefit.

  llvm-svn: 328498
* [PatternMatch] allow undef elements when matching vector FP +0.0Sanjay Patel2018-03-255-11/+11
  This continues the FP constant pattern matching improvements from:
  https://reviews.llvm.org/rL327627
  https://reviews.llvm.org/rL327339
  https://reviews.llvm.org/rL327307

  Several integer constant matchers also have this ability.

  I'm separating matching of integer/pointer null from FP positive zero and
  renaming/commenting to make the functionality clearer.

  llvm-svn: 328461
* [InstCombine] peek through more icmp of FP cast + bitcastSanjay Patel2018-03-251-4/+14
  This is an extension of rL328426 as noted in D44367.

  llvm-svn: 328448
* [InstCombine] peek through FP casts for sign-bit compares (PR36682)Sanjay Patel2018-03-241-0/+9
  This pattern came up in PR36682:
  https://bugs.llvm.org/show_bug.cgi?id=36682
  https://godbolt.org/g/LhuD9A

  Equality checks are planned as a follow-up enhancement.

  Differential Revision: https://reviews.llvm.org/D44367

  llvm-svn: 328426
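As an illustration of why a sign-bit compare can look through an FP cast, here is a small sketch. It assumes the pattern is of the form "convert a signed integer to float, reinterpret the bits as an integer, compare against zero"; the exact patterns matched by the patch may differ, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a float as a signed 32-bit integer (an IR bitcast).
inline std::int32_t bits_of(float f) {
  static_assert(sizeof(float) == sizeof(std::int32_t),
                "sketch assumes a 32-bit float");
  std::int32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Before: sign test on the bits of the converted value.
inline bool sign_via_fp(std::int32_t x) {
  return bits_of(static_cast<float>(x)) < 0;
}

// After: sign test directly on the integer.
inline bool sign_direct(std::int32_t x) { return x < 0; }

// Spot-check a few interesting inputs, including both extremes and zero.
inline void check_sign_bit_samples() {
  const std::int32_t samples[] = {INT32_MIN, -123456789, -1, 0,
                                  1,         123456789,  INT32_MAX};
  for (std::int32_t x : samples)
    assert(sign_via_fp(x) == sign_direct(x));
}
```

The conversion may round the magnitude, but it never changes the sign, and zero converts to +0.0, so the sign bit of the float always agrees with the sign of the integer.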
* [InstCombine] fix formatting; NFCSanjay Patel2018-03-241-37/+30
  llvm-svn: 328425
* [InstCombine] simplify code for FP intrinsic shrinking; NFCISanjay Patel2018-03-231-10/+5
  llvm-svn: 328372
* [InstCombine] reduce code duplication; NFCSanjay Patel2018-03-231-56/+49
  llvm-svn: 328323
* [InstCombine] improve variable name; NFCSanjay Patel2018-03-231-12/+10
  llvm-svn: 328322
* [InstCombineCalls] Update deprecated API usage (NFC)Daniel Neilson2018-03-221-1/+1
  Summary:
  Just updating a call to MemSetInst::getAlignment() to
  MemSetInst::getDestAlignment(). The former has been deprecated.

  llvm-svn: 328227
* [InstCombine] add folds for xor-of-icmp signbit tests (PR36682)Sanjay Patel2018-03-221-0/+28
  This is a retry of r328119 which was reverted at r328145 because it could
  crash by trying to combine icmps with different operand types. This version
  has a check for that and additional tests.

  Original commit message:

  This is part of solving:
  https://bugs.llvm.org/show_bug.cgi?id=36682

  There's also a leftover improvement from the long-ago-closed:
  https://bugs.llvm.org/show_bug.cgi?id=5438

  https://rise4fun.com/Alive/dC1

  llvm-svn: 328197
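One identity of the xor-of-sign-bit-tests kind can be checked exhaustively on a small range. This is a sketch under the assumption that the fold resembles "(X < 0) xor (Y < 0) becomes (X xor Y) < 0"; the patch may cover other predicate combinations, and the function name is hypothetical.

```cpp
#include <cassert>

// Exhaustively compare the two forms over the 8-bit signed range.
inline void check_xor_of_signbit_tests() {
  for (int x = -128; x <= 127; ++x) {
    for (int y = -128; y <= 127; ++y) {
      const bool before = (x < 0) != (y < 0); // xor of two sign tests
      const bool after = (x ^ y) < 0;         // one sign test on the xor
      assert(before == after);
    }
  }
}
```

Note that both sides of the xor here have the same type. Per the commit message, the first attempt crashed when the two icmps had different operand types, so any such rewrite has to check that first.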
* Fix a couple of layering violations in TransformsDavid Blaikie2018-03-216-6/+6
  Remove #include of Transforms/Scalar.h from Transforms/Utils to fix layering.

  Transforms depends on Transforms/Utils, not the other way around. So remove
  the header and the "createStripGCRelocatesPass" function declaration (&
  definition) that is unused and motivated this dependency.

  Move Transforms/Utils/Local.h into Analysis because it's used by
  Analysis/MemoryBuiltins.cpp.

  llvm-svn: 328165
* Revert r328119 "[InstCombine] add folds for xor-of-icmp signbit tests (PR36682)"Reid Kleckner2018-03-211-30/+0
  This asserts when compiling safe_numerics_unittest.cpp in Chromium with MSan.

  llvm-svn: 328145
* [InstCombine] add folds for xor-of-icmp signbit tests (PR36682)Sanjay Patel2018-03-211-0/+30
  This is part of solving:
  https://bugs.llvm.org/show_bug.cgi?id=36682

  There's also a leftover improvement from the long-ago-closed:
  https://bugs.llvm.org/show_bug.cgi?id=5438

  https://rise4fun.com/Alive/dC1

  llvm-svn: 328119
* [InstCombine] canonicalize fcmp+select to fabsSanjay Patel2018-03-191-1/+31
  This is complicated by -0.0 and nan. This is based on the DAG patterns as
  shown in D44091. I'm hoping that we can just remove those DAG folds and
  always rely on IR canonicalization to handle the matching to fabs.

  We would still need to delete the broken code from DAGCombiner to fix
  PR36600: https://bugs.llvm.org/show_bug.cgi?id=36600

  Differential Revision: https://reviews.llvm.org/D44550

  llvm-svn: 327858
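A short sketch of why -0.0 complicates matching a compare-and-select to fabs. The function name `select_abs` is hypothetical, and the code assumes default (non fast-math) floating-point compilation.

```cpp
#include <cassert>
#include <cmath>

// The naive "absolute value" written as a compare plus a select.
inline double select_abs(double x) { return x < 0.0 ? -x : x; }

inline void check_select_abs() {
  // Ordinary values agree with fabs.
  assert(select_abs(-2.5) == std::fabs(-2.5));
  assert(select_abs(2.5) == std::fabs(2.5));

  // -0.0 is not less than 0.0, so the select returns it unchanged,
  // sign bit and all, whereas fabs clears the sign bit.
  assert(std::signbit(select_abs(-0.0)));
  assert(!std::signbit(std::fabs(-0.0)));
}
```

NaN inputs raise a similar issue: every ordered compare with NaN is false, so the select passes the NaN through with its original sign bit, while fabs clears it. Which compare predicates and fast-math flags make the rewrite safe is exactly what the patch has to sort out.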
* [InstCombine] peek through unsigned FP casts for zero-equality compares (PR36682)Roman Lebedev2018-03-181-0/+9
  Summary:
  This pattern came up in PR36682 / D44390
  https://bugs.llvm.org/show_bug.cgi?id=36682
  https://reviews.llvm.org/D44390
  https://godbolt.org/g/oKvT5H

  See also D44416

  Reviewers: spatel, majnemer, efriedma, arsenm

  Reviewed By: spatel

  Subscribers: wdng, llvm-commits

  Differential Revision: https://reviews.llvm.org/D44424

  llvm-svn: 327799
* [InstCombine] add nnan requirement for sqrt(x) * sqrt(y) -> sqrt(x*y)Sanjay Patel2018-03-181-1/+3
  This is similar to D43765.

  llvm-svn: 327797
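A concrete input shows why sqrt(x) * sqrt(y) -> sqrt(x*y) needs a no-NaNs assumption. This is a minimal sketch with hypothetical function names, assuming default IEEE-754 behavior (no fast-math flags on the compiler).

```cpp
#include <cassert>
#include <cmath>

inline double sqrt_separate(double x, double y) {
  return std::sqrt(x) * std::sqrt(y);
}
inline double sqrt_combined(double x, double y) { return std::sqrt(x * y); }

inline void check_sqrt_fold() {
  // Perfect squares: both forms give exactly 6.
  assert(sqrt_separate(4.0, 9.0) == 6.0);
  assert(sqrt_combined(4.0, 9.0) == 6.0);

  // Two negative inputs: the separate form is NaN * NaN = NaN,
  // but the product is positive, so the combined form is 1.
  assert(std::isnan(sqrt_separate(-1.0, -1.0)));
  assert(sqrt_combined(-1.0, -1.0) == 1.0);
}
```

The rewrite turns a NaN result into an ordinary number, so it is only valid when the instruction is allowed to assume that NaNs do not occur.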
* Simplify more cases of logical ops of masked icmps.Hiroshi Yamauchi2018-03-131-17/+199
  Summary:
  For example,

    ((X & 255) != 0) && ((X & 15) == 8) -> ((X & 15) == 8).
    ((X & 7) != 0) && ((X & 15) == 8) -> false.

  Reviewers: davidxl

  Reviewed By: davidxl

  Subscribers: llvm-commits

  Differential Revision: https://reviews.llvm.org/D43835

  llvm-svn: 327450
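The two example simplifications in the entry above can be checked by brute force. The helper names below are hypothetical; only the expressions come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// First example: the wider "!= 0" test is implied by the narrower "== 8" test.
inline bool before_1(std::uint32_t x) { return (x & 255) != 0 && (x & 15) == 8; }
inline bool after_1(std::uint32_t x) { return (x & 15) == 8; }

// Second example: (x & 15) == 8 forces (x & 7) == 0, so the conjunction
// can never be true.
inline bool before_2(std::uint32_t x) { return (x & 7) != 0 && (x & 15) == 8; }

// The masks only look at the low 8 bits, so covering 16 bits is more than
// enough to exercise every case.
inline void check_masked_icmp_folds() {
  for (std::uint32_t x = 0; x <= 0xFFFF; ++x) {
    assert(before_1(x) == after_1(x));
    assert(!before_2(x));
  }
}
```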
* [InstCombine] fix fmul reassociation to avoid creating an extra fdivSanjay Patel2018-03-131-6/+20
  This was supposed to be an NFC refactoring that will eventually allow
  eliminating the isFast() predicate, but there's a rare possibility that we
  would pessimize the code as shown in the test case because we failed to check
  'hasOneUse()' properly. This version also removes an inefficiency of the old
  code; we would look for:

    (X * C) * C1 --> X * (C * C1)

  ...but that pattern is always handled by SimplifyAssociativeOrCommutative().

  llvm-svn: 327404
* [InstCombine] Replace calls to getNumUses with hasNUses or hasNUsesOrMoreCraig Topper2018-03-122-5/+5
  getNumUses is a linear time operation. It traverses the user linked list to
  the end and counts as it goes. Since we are only interested in small constant
  counts, we should use hasNUses or hasNUsesOrMore, which terminate the
  traversal as soon as they can provide the answer.

  There are still two other locations in InstCombine, but changing those would
  force a rebase of D44266, which if accepted would remove them.

  Differential Revision: https://reviews.llvm.org/D44398

  llvm-svn: 327315
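The difference between counting all uses and asking "are there exactly N?" can be sketched on a plain singly linked list. This is an illustration only: `UseList` and the three free functions are hypothetical stand-ins and are not LLVM's actual `Value` interface.

```cpp
#include <cassert>
#include <forward_list>

// Hypothetical stand-in for a value's use list.
using UseList = std::forward_list<int>;

// Linear: always walks the whole list.
inline unsigned get_num_uses(const UseList &uses) {
  unsigned n = 0;
  for (auto it = uses.begin(); it != uses.end(); ++it)
    ++n;
  return n;
}

// Walks at most n + 1 nodes: true iff the list has exactly n elements.
inline bool has_n_uses(const UseList &uses, unsigned n) {
  auto it = uses.begin();
  for (; n != 0; --n, ++it)
    if (it == uses.end())
      return false; // ran out early: fewer than n
  return it == uses.end();
}

// Walks at most n nodes: true iff the list has n or more elements.
inline bool has_n_uses_or_more(const UseList &uses, unsigned n) {
  auto it = uses.begin();
  for (; n != 0; --n, ++it)
    if (it == uses.end())
      return false;
  return true;
}

inline void check_use_counts() {
  const UseList uses{1, 2, 3};
  assert(get_num_uses(uses) == 3);
  assert(has_n_uses(uses, 3) && !has_n_uses(uses, 2) && !has_n_uses(uses, 4));
  assert(has_n_uses_or_more(uses, 2) && !has_n_uses_or_more(uses, 4));
}
```

For a value with thousands of uses, a query such as "exactly two uses?" stops after three nodes instead of visiting all of them.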
* [Transforms] Add missing header for InstructionCombining.cpp, in order to export LLVMInitializeInstCombine as extern "C".Eugene Zelenko2018-03-061-0/+1
  Fixes PR35947.

  Patch by Brenton Bostick.

  Differential revision: https://reviews.llvm.org/D44140

  llvm-svn: 326843
* [InstCombine] simplify min/max canonicalization; NFCISanjay Patel2018-03-061-10/+5
  llvm-svn: 326828
* [ValueTracking] move helpers for SelectPatterns from InstCombine to ValueTrackingSanjay Patel2018-03-061-51/+11
  Most of the folds based on SelectPatternResult belong in InstSimplify rather
  than InstCombine, so the helper code should be available to other
  passes/analysis.

  llvm-svn: 326812
* [InstCombine] Don't blow up in foldICmpWithCastAndCast on vector icmp instructions.Daniel Neilson2018-03-051-1/+8
  Summary:
  Presently, InstCombiner::foldICmpWithCastAndCast() implicitly assumes that it
  is only invoked with icmp instructions of integer type. If that assumption is
  broken, and it is called with an icmp of vector type, then it fails
  (asserts/crashes).

  This patch addresses the deficiency. It allows it to simplify
  icmp (ptrtoint x), (ptrtoint/c) of vector type into a compare of the inputs,
  much as is done when the type is integer.

  Reviewers: apilipenko, fedor.sergeev, mkazantsev, anna

  Reviewed By: anna

  Subscribers: llvm-commits

  Differential Revision: https://reviews.llvm.org/D44063

  llvm-svn: 326730
* [InstCombine] Add constant vector support to getMinimumFPType for visitFPTrunc.Craig Topper2018-03-051-0/+34
  This patch teaches getMinimumFPType to support shrinking a vector of
  ConstantFPs. This should improve our ability to combine vector fptrunc with
  fp binops.

  Differential Revision: https://reviews.llvm.org/D43774

  llvm-svn: 326729
* [InstCombine] (~X) - (~Y) --> Y - XSanjay Patel2018-03-031-0/+5
  llvm-svn: 326660
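The fold in the title above follows from ~X = -X - 1 in two's complement: (-X - 1) - (-Y - 1) = Y - X. A brute-force check over 8-bit values, with a hypothetical function name, might look like this:

```cpp
#include <cassert>

// Compare both forms modulo 2^8 for every pair of 8-bit values.
inline void check_not_sub_fold() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned y = 0; y < 256; ++y) {
      const unsigned before = ((~x) - (~y)) & 0xFFu; // (~X) - (~Y)
      const unsigned after = (y - x) & 0xFFu;        // Y - X
      assert(before == after);
    }
  }
}
```

Unsigned arithmetic wraps, so the identity holds modulo any power of two; masking to 8 bits just keeps the exhaustive loop small.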
* [InstCombine] rearrange visitFMul; NFCISanjay Patel2018-03-021-73/+69
  Put the simplest non-FMF folds first, so it's easier to see what's left to
  fix/group/add with the FMF folds.

  llvm-svn: 326632
* [InstCombine] Rewrite the binary op shrinking in visitFPTrunc to avoid creating overly small ConstantFPs that we'll just need to extend again.Craig Topper2018-03-021-47/+43
  Instead of returning the smaller FP constant we now return the minimal Type
  the constant can fit into. We also return the Type of the input to any fp
  extends. The legality checks are then done on just the size of these Types.
  If we find something profitable we then emit FPTruncs in front of the smaller
  binop and assume those FPTruncs will be constant folded or combined with any
  ConstantFPs or fpextends.

  Differential Revision: https://reviews.llvm.org/D44038

  llvm-svn: 326617
* [InstCombine] partly fix FMF for fmul+log2 foldSanjay Patel2018-03-021-52/+17
  The code was checking that all of the instructions in the sequence are
  'fast', but that's not necessary. The final multiply is all that we need to
  check (tests adjusted). The fmul doesn't need to be fully 'fast' either, but
  that can be another patch.

  llvm-svn: 326608
* [InstCombine] allow fmul fold with less than 'fast'Sanjay Patel2018-03-021-1/+1
  This is a retry of r326502 with updates to the reassociate test file that I
  missed the first time.

  @test15_reassoc in the supposed -reassociate test file (except that it tests
  2 other passes too...) shows that there's no clear responsibility for
  reassociation transforms.

  Instcombine now gets that case, but only because the constant values are
  identical. Otherwise, it would still miss that pattern.

  Reassociate doesn't get that case because it hasn't been updated to use less
  than 'fast' FMF.

  llvm-svn: 326513
* revert r326502: [InstCombine] allow fmul fold with less than 'fast'Sanjay Patel2018-03-011-1/+1
  I forgot that I added tests for 'reassoc' to -reassociate, but surprisingly
  that file calls -instcombine too, so it is affected. I'll update that file
  and try again.

  llvm-svn: 326510
* [InstCombine] allow fmul fold with less than 'fast'Sanjay Patel2018-03-011-1/+1
  llvm-svn: 326502
* [InstCombine] simplify code for (X*Y) * X => (X*X) * Y ; NFCISanjay Patel2018-03-011-35/+17
  llvm-svn: 326444
* [InstCombine] simplify code for X * -1.0 --> -X; NFCSanjay Patel2018-02-281-7/+3
  I've added random FMF to one of the tests to show those are propagated.

  llvm-svn: 326377
* [InstCombine] Split the FP constant code out of lookThroughFPExtensions and use nullptr as a sentinelCraig Topper2018-02-281-15/+20
  Currently this code's control flow very much assumes that there are no
  meaningful checks after determining that it's a ConstantFP. So whenever it
  wants to stop it just does "return V". But V is also the variable name it
  uses when it wants to return a new value. So 'return V' appears multiple
  times with different meanings.

  This patch just moves all the code into a helper function and returns nullptr
  when it wants to stop.

  I've split this from D43774 while I try to figure out how to best handle the
  vector case there. But this change by itself at least seemed like a
  readability improvement.

  Differential Revision: https://reviews.llvm.org/D43833

  llvm-svn: 326361
* [InstCombine] move invariant call out of loop; NFCSanjay Patel2018-02-281-4/+4
  We really shouldn't need a 2-loop here at all, but that's another cleanup.

  llvm-svn: 326330
* [InstCombine] move constant check into foldBinOpIntoSelectOrPhi; NFCISanjay Patel2018-02-286-34/+29
  Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly vague.
  The constant check is redundant in some cases, but it allows removing
  duplication for most of the calls.

  llvm-svn: 326329
* [InstCombine] allow fdiv folds with less than fully 'fast' opsSanjay Patel2018-02-261-13/+3
  Note: gcc appears to allow this fold with -freciprocal-math alone, but
  clang/llvm require more than that with this patch. The wording in the
  definitions seems fuzzy enough that it could go either way, but we'll err on
  the conservative side of FMF interpretation.

  This patch also changes the newly created fmul to have FMF propagated by the
  last fdiv rather than intersecting the FMF of the fdivs. This matches the
  behavior of other folds near here. The new fmul is only used to produce an
  intermediate op for the final fdiv result, so it shouldn't be any stricter
  than that result. The previous behavior could result in dropping FMF via
  other folds in instcombine or CSE.

  Differential Revision: https://reviews.llvm.org/D43398

  llvm-svn: 326098
* [InstCombine] simplify code for fabs(X) * fabs(X) -> X * X; NFCSanjay Patel2018-02-231-13/+4
  llvm-svn: 325968
* [InstSimplify] sqrt(X) * sqrt(X) --> XSanjay Patel2018-02-231-4/+0
  This was misplaced in InstCombine. We can loosen the FMF as a follow-up step.

  llvm-svn: 325965
* [InstCombine] allow fmul-sqrt folds with less than full -ffast-mathSanjay Patel2018-02-231-15/+8
  Also, add a Builder method for intrinsics to reduce code duplication for
  clients.

  llvm-svn: 325960
* [InstCombine] refactor fmul with negated op folds; NFCISanjay Patel2018-02-231-24/+18
  The existing code was inefficiently looking for 'nsz' variants. That's
  unnecessary because we canonicalize those to the expected form with -0.0.

  We may also want to adjust or remove the fold that sinks negation. We don't
  do that for fdiv (or integer ops?). That should be uniform? It may also lead
  to missed optimization as in PR21914:
  https://bugs.llvm.org/show_bug.cgi?id=21914
  ...or we just have to fix other passes to avoid that problem.

  llvm-svn: 325924
* [InstCombine] use FMF-copying functions to reduce code; NFCISanjay Patel2018-02-231-28/+12
  llvm-svn: 325923
* [InstCombine] add and use Create*FMF functions; NFCSanjay Patel2018-02-211-15/+7
  llvm-svn: 325730
* [InstCombine] C / -X --> -C / XSanjay Patel2018-02-211-8/+17
  We already do this in DAGCombiner, but it should also be good to eliminate
  the fsub use in IR.

  This is similar to rL325648.

  llvm-svn: 325649
* [InstCombine] -X / C --> X / -C for FPSanjay Patel2018-02-201-5/+12
  We already do this in DAGCombiner, but it should also be good to eliminate
  the fsub use in IR.

  llvm-svn: 325648
* [InstCombine] remove unneeded operand swap: NFCISanjay Patel2018-02-201-3/+0
  FMul is commutative, so complexity-based canonicalization should always take
  care of the swap via SimplifyAssociativeOrCommutative().

  llvm-svn: 325628
* [InstCombine] remove unneeded dyn_cast to prevent unused variable warningSanjay Patel2018-02-201-2/+1
  llvm-svn: 325597
* [InstCombine] remove compound fdiv pattern foldsSanjay Patel2018-02-201-27/+1
  These are fdiv-with-constant-divisor, so they already become reciprocal
  multiplies. The last gap for vector ops should be closed with rL325590.

  It's possible that we're missing folds for some edge cases with denormal
  intermediate constants after deleting these, but there are no tests for those
  patterns, and it would be better to handle denormals more consistently (and
  less conservatively) as noted in TODO comments.

  llvm-svn: 325595