| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change brings performance of zlib up by 10%. The example below is from a
hot loop in longest_match() from zlib.
do.body:
%cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
%idx.ext = zext i32 %cur_match.addr.0 to i64
%add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
%add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
%add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1
In this example %idx.ext1 is a loop invariant. It will be moved above the use of
loop induction variable %idx.ext such that it can be hoisted out of the loop by
LICM. The operands that have dependences carried by the loop will be sinked down
in the GEP chain. This patch will produce the following output:
do.body:
%cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
%idx.ext = zext i32 %cur_match.addr.0 to i64
%add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
%add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
%add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext
llvm-svn: 328539
|
|
|
|
|
|
|
|
|
|
| |
This replaces a large chunk of code that was looking for compound
patterns that include these sub-patterns. Existing tests ensure that
all of the previous examples are still folded as expected.
We still need to loosen the FMF check.
llvm-svn: 328502
|
|
|
|
|
|
| |
As the tests show, we could create extra instructions without any obvious benefit.
llvm-svn: 328498
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This continues the FP constant pattern matching improvements from:
https://reviews.llvm.org/rL327627
https://reviews.llvm.org/rL327339
https://reviews.llvm.org/rL327307
Several integer constant matchers also have this ability. I'm
separating matching of integer/pointer null from FP positive zero
and renaming/commenting to make the functionality clearer.
llvm-svn: 328461
|
|
|
|
|
|
| |
This is an extension of rL328426 as noted in D44367.
llvm-svn: 328448
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pattern came up in PR36682:
https://bugs.llvm.org/show_bug.cgi?id=36682
https://godbolt.org/g/LhuD9A
Equality checks are planned as a follow-up enhancement.
Differential Revision: https://reviews.llvm.org/D44367
llvm-svn: 328426
|
|
|
|
| |
llvm-svn: 328425
|
|
|
|
| |
llvm-svn: 328372
|
|
|
|
| |
llvm-svn: 328323
|
|
|
|
| |
llvm-svn: 328322
|
|
|
|
|
|
|
|
| |
Summary:
Just updating a call to MemSetInst::getAlignment() to MemSetInst::getDestAlignment(). The
former has been deprecated.
llvm-svn: 328227
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a retry of r328119 which was reverted at r328145 because
it could crash by trying to combine icmps with different operand
types. This version has a check for that and additional tests.
Original commit message:
This is part of solving:
https://bugs.llvm.org/show_bug.cgi?id=36682
There's also a leftover improvement from the long-ago-closed:
https://bugs.llvm.org/show_bug.cgi?id=5438
https://rise4fun.com/Alive/dC1
llvm-svn: 328197
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering.
Transforms depends on Transforms/Utils, not the other way around. So
remove the header and the "createStripGCRelocatesPass" function
declaration (& definition) that is unused and motivated this dependency.
Move Transforms/Utils/Local.h into Analysis because it's used by
Analysis/MemoryBuiltins.cpp.
llvm-svn: 328165
|
|
|
|
|
|
|
| |
This asserts when compiling safe_numerics_unittest.cpp in Chromium with
MSan.
llvm-svn: 328145
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is part of solving:
https://bugs.llvm.org/show_bug.cgi?id=36682
There's also a leftover improvement from the long-ago-closed:
https://bugs.llvm.org/show_bug.cgi?id=5438
https://rise4fun.com/Alive/dC1
llvm-svn: 328119
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is complicated by -0.0 and nan. This is based on the DAG patterns
as shown in D44091. I'm hoping that we can just remove those DAG folds
and always rely on IR canonicalization to handle the matching to fabs.
We would still need to delete the broken code from DAGCombiner to fix
PR36600:
https://bugs.llvm.org/show_bug.cgi?id=36600
Differential Revision: https://reviews.llvm.org/D44550
llvm-svn: 327858
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR36682)
Summary:
This pattern came up in PR36682 / D44390
https://bugs.llvm.org/show_bug.cgi?id=36682
https://reviews.llvm.org/D44390
https://godbolt.org/g/oKvT5H
See also D44416
Reviewers: spatel, majnemer, efriedma, arsenm
Reviewed By: spatel
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D44424
llvm-svn: 327799
|
|
|
|
|
|
| |
This is similar to D43765.
llvm-svn: 327797
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For example,
((X & 255) != 0) && ((X & 15) == 8) -> ((X & 15) == 8).
((X & 7) != 0) && ((X & 15) == 8) -> false.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43835
llvm-svn: 327450
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was supposed to be an NFC refactoring that will eventually allow
eliminating the isFast() predicate, but there's a rare possibility
that we would pessimize the code as shown in the test case because
we failed to check 'hasOneUse()' properly. This version also removes
an inefficiency of the old code; we would look for:
(X * C) * C1 --> X * (C * C1)
...but that pattern is always handled by
SimplifyAssociativeOrCommutative().
llvm-svn: 327404
|
|
|
|
|
|
|
|
|
|
| |
getNumUses is a linear time operation. It traverses the user linked list to the end and counts as it goes. Since we are only interested in small constant counts, we should use hasNUses or hasNUsesMore more that terminate the traversal as soon as it can provide the answer.
There are still two other locations in InstCombine, but changing those would force a rebase of D44266 which if accepted would remove them.
Differential Revision: https://reviews.llvm.org/D44398
llvm-svn: 327315
|
|
|
|
|
|
|
|
|
|
| |
export LLVMInitializeInstCombine as extern "C". Fixes PR35947.
Patch by Brenton Bostick.
Differential revision: https://reviews.llvm.org/D44140
llvm-svn: 326843
|
|
|
|
| |
llvm-svn: 326828
|
|
|
|
|
|
|
|
|
| |
ValueTracking
Most of the folds based on SelectPatternResult belong in InstSimplify rather than
InstCombine, so the helper code should be available to other passes/analysis.
llvm-svn: 326812
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions.
Summary:
Presently, InstCombiner::foldICmpWithCastAndCast() implicitly assumes that it is
only invoked with icmp instructions of integer type. If that assumption is broken,
and it is called with an icmp of vector type, then it fails (asserts/crashes).
This patch addresses the deficiency. It allows it to simplify
icmp (ptrtoint x), (ptrtoint/c) of vector type into a compare of the inputs,
much as is done when the type is integer.
Reviewers: apilipenko, fedor.sergeev, mkazantsev, anna
Reviewed By: anna
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44063
llvm-svn: 326730
|
|
|
|
|
|
|
|
| |
This patch teaches getMinimumFPType to support shrinking a vector of ConstantFPs. This should improve our ability to combine vector fptrunc with fp binops.
Differential Revision: https://reviews.llvm.org/D43774
llvm-svn: 326729
|
|
|
|
| |
llvm-svn: 326660
|
|
|
|
|
|
|
| |
Put the simplest non-FMF folds first, so it's easier to
see what's left to fix/group/add with the FMF folds.
llvm-svn: 326632
|
|
|
|
|
|
|
|
|
|
| |
creating overly small ConstantFPs that we'll just need to extend again.
Instead of returning the smaller FP constant we now return the minimal Type the constant can fit into. We also return the Type of the input to any fp extends. The legality checks are then done on just the size of these Types. If we find something profitable we then emit FPTruncs in front of the smaller binop and assume those FPTruncs will be constant folded or combined with any ConstantFPs or fpextends.
Differential Revision: https://reviews.llvm.org/D44038
llvm-svn: 326617
|
|
|
|
|
|
|
|
|
|
| |
The code was checking that all of the instructions in the
sequence are 'fast', but that's not necessary. The final
multiply is all that we need to check (tests adjusted).
The fmul doesn't need to be fully 'fast' either, but that
can be another patch.
llvm-svn: 326608
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a retry of r326502 with updates to the reassociate
test file that I missed the first time.
@test15_reassoc in the supposed -reassociate test file
(except that it tests 2 other passes too...) shows that
there's no clear responsiblity for reassociation transforms.
Instcombine now gets that case, but only because the
constant values are identical. Otherwise, it would still
miss that pattern.
Reassociate doesn't get that case because it hasn't been
updated to use less than 'fast' FMF.
llvm-svn: 326513
|
|
|
|
|
|
|
|
| |
I forgot that I added tests for 'reassoc' to -reassociate, but
suprisingly that file calls -instcombine too, so it is affected.
I'll update that file and try again.
llvm-svn: 326510
|
|
|
|
| |
llvm-svn: 326502
|
|
|
|
| |
llvm-svn: 326444
|
|
|
|
|
|
| |
I've added random FMF to one of the tests to show those are propagated.
llvm-svn: 326377
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
use nullptr as a sentinel
Currently this code's control flow very much assumes that there are no meaningful checks after determining that it's a ConstantFP. So whenever it wants to stop it just does "return V". But V is also the variable name it uses when it wants to return a new value. So 'return V' appears multiple times with different meanings.
This patch just moves all the code into a helper function and returns nullptr when it wants to stop.
I've split this from D43774 while I try to figure out how to best handle the vector case there. But this change by itself at least seemed like a readability improvement.
Differential Revision: https://reviews.llvm.org/D43833
llvm-svn: 326361
|
|
|
|
|
|
| |
We really shouldn't need a 2-loop here at all, but that's another cleanup.
llvm-svn: 326330
|
|
|
|
|
|
|
|
| |
Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly
vague. The constant check is redundant in some cases, but it allows
removing duplication for most of the calls.
llvm-svn: 326329
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note: gcc appears to allow this fold with -freciprocal-math alone,
but clang/llvm require more than that with this patch. The wording
in the definitions seems fuzzy enough that it could go either way,
but we'll err on the conservative side of FMF interpretation.
This patch also changes the newly created fmul to have FMF propagated
by the last fdiv rather than intersecting the FMF of the fdivs. This
matches the behavior of other folds near here. The new fmul is only
used to produce an intermediate op for the final fdiv result, so it
shouldn't be any stricter than that result. The previous behavior
could result in dropping FMF via other folds in instcombine or CSE.
Differential Revision: https://reviews.llvm.org/D43398
llvm-svn: 326098
|
|
|
|
| |
llvm-svn: 325968
|
|
|
|
|
|
| |
This was misplaced in InstCombine. We can loosen the FMF as a follow-up step.
llvm-svn: 325965
|
|
|
|
|
|
| |
Also, add a Builder method for intrinsics to reduce code duplication for clients.
llvm-svn: 325960
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing code was inefficiently looking for 'nsz' variants.
That's unnecessary because we canonicalize those to the expected
form with -0.0.
We may also want to adjust or remove the fold that sinks negation.
We don't do that for fdiv (or integer ops?). That should be uniform?
It may also lead to missed optimization as in PR21914:
https://bugs.llvm.org/show_bug.cgi?id=21914
...or we just have to fix other passes to avoid that problem.
llvm-svn: 325924
|
|
|
|
| |
llvm-svn: 325923
|
|
|
|
| |
llvm-svn: 325730
|
|
|
|
|
|
|
|
|
| |
We already do this in DAGCombiner, but it should
also be good to eliminate the fsub use in IR.
This is similar to rL325648.
llvm-svn: 325649
|
|
|
|
|
|
|
| |
We already do this in DAGCombiner, but it should
also be good to eliminate the fsub use in IR.
llvm-svn: 325648
|
|
|
|
|
|
|
| |
FMul is commutative, so complexity-based canonicalization should always
take care of the swap via SimplifyAssociativeOrCommutative().
llvm-svn: 325628
|
|
|
|
| |
llvm-svn: 325597
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are fdiv-with-constant-divisor, so they already become
reciprocal multiplies. The last gap for vector ops should be
closed with rL325590.
It's possible that we're missing folds for some edge cases
with denormal intermediate constants after deleting these,
but there are no tests for those patterns, and it would be
better to handle denormals more consistently (and less
conservatively) as noted in TODO comments.
llvm-svn: 325595
|