| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
x, log2(pow(2.0,x)) == x
Summary: This patch enables folding following instructions under -ffast-math flag: log10(pow(10.0,x)) -> x, log2(pow(2.0,x)) -> x
Reviewers: hfinkel, spatel, efriedma, craig.topper, zvi, majnemer, lebedev.ri
Reviewed By: spatel, lebedev.ri
Subscribers: lebedev.ri, llvm-commits
Differential Revision: https://reviews.llvm.org/D41940
llvm-svn: 352981
|
| |
|
|
| |
llvm-svn: 352734
|
| |
|
|
|
|
|
| |
Move constant folding tests into ConstantFolding/bitcount.ll and drop
various tests in other places. Add coverage for undefs.
llvm-svn: 349806
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If a saturating add/sub has one constant operand, then we can
determine the possible range of outputs it can produce, and simplify
an icmp comparison based on that.
The implementation is based on a similar existing mechanism for
simplifying binary operator + icmps.
Differential Revision: https://reviews.llvm.org/D55735
llvm-svn: 349369
|
| |
|
|
|
|
|
| |
These test comparisons with saturating add/sub in non-canonical
form.
llvm-svn: 349309
|
| |
|
|
|
|
|
|
| |
If a saturating add/sub with a constant operand is compared to
another constant, we should be able to determine that the condition
is always true/false in some cases (but currently don't).
llvm-svn: 349261
|
| |
|
|
|
|
|
|
| |
Extracting from a splat constant is always handled by InstSimplify.
Move the test for this from InstCombine to InstSimplify to make
sure that stays true.
llvm-svn: 348423
|
| |
|
|
|
|
|
|
| |
These tests should probably go under a separate test file because they
should fold with just -constprop, but they're similar to the scalar
tests already in here.
llvm-svn: 348045
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an almost direct move of the functionality from InstCombine to
InstSimplify. There's no reason not to do this in InstSimplify because
we never create a new value with this transform.
(There's a question of whether any dominance-based transform belongs in
either of these passes, but that's a separate issue.)
I've changed 1 of the conditions for the fold (1 of the blocks for the
branch must be the block we started with) into an assert because I'm not
sure how that could ever be false.
We need 1 extra check to make sure that the instruction itself is in a
basic block because passes other than InstCombine may be using InstSimplify
as an analysis on values that are not wired up yet.
The 3-way compare changes show that InstCombine has some kind of
phase-ordering hole. Otherwise, we would have already gotten the intended
final result that we now show here.
llvm-svn: 347896
|
| |
|
|
|
|
|
|
| |
Splitting these off from the D54666.
Patch by: nikic (Nikita Popov)
llvm-svn: 347332
|
| |
|
|
|
|
|
|
|
| |
These are part of D54666, so adding them here before the patch to
show the baseline (currently unoptimized) results.
Patch by: @nikic (Nikita Popov)
llvm-svn: 347331
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for saturating add/sub in InstructionSimplify. In particular, the following simplifications are supported:
sat(X + 0) -> X
sat(X + undef) -> -1
sat(X uadd MAX) -> MAX
(and commutative variants)
sat(X - 0) -> X
sat(X - X) -> 0
sat(X - undef) -> 0
sat(undef - X) -> 0
sat(0 usub X) -> 0
sat(X usub MAX) -> 0
Patch by: @nikic (Nikita Popov)
Differential Revision: https://reviews.llvm.org/D54532
llvm-svn: 347330
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the issue noticed in D54532.
The problem is that cst_pred_ty-based matchers like m_Zero() currently do not match
scalar undefs (as expected), but *do* match vector undefs. This may lead to optimization
inconsistencies in rare cases.
There is only one existing test for which output changes, reverting the change from D53205.
The reason here is that vector fsub undef, %x is no longer matched as an m_FNeg(). While I
think that the new output is technically worse than the previous one, it is consistent with
scalar, and I don't think it's really important either way (generally that undef should have
been folded away prior to reassociation.)
I've also added another test case for this issue based on InstructionSimplify. It took some
effort to find that one, as in most cases undef folds are either checked first -- and in the
cases where they aren't it usually happens to not make a difference in the end. This is the
only case I was able to come up with. Prior to this patch the test case simplified to undef
in the scalar case, but zeroinitializer in the vector case.
Patch by: @nikic (Nikita Popov)
Differential Revision: https://reviews.llvm.org/D54631
llvm-svn: 347318
|
| |
|
|
|
|
|
|
| |
These are baseline tests for D54532.
Patch based on the original tests by:
@nikic (Nikita Popov)
llvm-svn: 347060
|
| |
|
|
|
|
|
|
| |
This is a baseline test for D54631.
Patch by:
@nikic (Nikita Popov)
llvm-svn: 347055
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a problem seen in common rotate idioms as noted in:
https://bugs.llvm.org/show_bug.cgi?id=34924
Note that we are not canonicalizing standard IR (shifts and logic) to the intrinsics yet.
(Although I've written this before...) I think this is the last step before we enable
that transform. Ie, we could regress code by doing that transform without this
simplification in place.
In PR34924, I questioned whether this is a valid transform for target-independent IR,
but I convinced myself this is ok. If we're speculating a funnel shift by turning cmp+br
into select, then SimplifyCFG has already determined that the transform is justified.
It's possible that SimplifyCFG is not taking into account profile or other metadata,
but if that's true, then it's a bug independent of funnel shifts.
Also, we do have CGP code to restore a guard like this around an intrinsic if it can't
be lowered cheaply. But that isn't necessary for funnel shift because the default
expansion in SelectionDAGBuilder includes this same cmp+select.
Differential Revision: https://reviews.llvm.org/D54552
llvm-svn: 346960
|
| |
|
|
|
|
|
| |
The cases are just different enough that we should have
complete tests to avoid bugs from typos in the code.
llvm-svn: 346902
|
| |
|
|
| |
llvm-svn: 346881
|
| |
|
|
|
|
|
|
|
| |
This is NFCI for InstCombine because it calls InstSimplify,
so I left the tests for this transform there. As noted in
the code comment, we can allow this fold more often by using
FMF and/or value tracking.
llvm-svn: 346169
|
| |
|
|
|
|
|
| |
These are translated from InstCombine's test file with the same name.
We should move the transform from InstCombine to InstSimplify.
llvm-svn: 346168
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is retrying the fold from rL345717
(reverted at rL347780)
...with a fix for the miscompile
demonstrated by PR39510:
https://bugs.llvm.org/show_bug.cgi?id=39510
Original commit message:
This is a fix for PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475
We managed to get some of these patterns using computeKnownBits in https://reviews.llvm.org/D47041, but that
can't be used for nabs(). Instead, put in some range-based logic, so we can fold
both abs/nabs with icmp with a constant value.
Alive proofs:
https://rise4fun.com/Alive/21r
Name: abs_nsw_is_positive
%cmp = icmp slt i32 %x, 0
%negx = sub nsw i32 0, %x
%abs = select i1 %cmp, i32 %negx, i32 %x
%r = icmp sgt i32 %abs, -1
=>
%r = i1 true
Name: abs_nsw_is_not_negative
%cmp = icmp slt i32 %x, 0
%negx = sub nsw i32 0, %x
%abs = select i1 %cmp, i32 %negx, i32 %x
%r = icmp slt i32 %abs, 0
=>
%r = i1 false
Name: nabs_is_negative_or_0
%cmp = icmp slt i32 %x, 0
%negx = sub i32 0, %x
%nabs = select i1 %cmp, i32 %x, i32 %negx
%r = icmp slt i32 %nabs, 1
=>
%r = i1 true
Name: nabs_is_not_over_0
%cmp = icmp slt i32 %x, 0
%negx = sub i32 0, %x
%nabs = select i1 %cmp, i32 %x, i32 %negx
%r = icmp sgt i32 %nabs, 0
=>
%r = i1 false
Differential Revision: https://reviews.llvm.org/D53844
llvm-svn: 345832
|
| |
|
|
|
|
| |
Verify that set intersection/subset are not confused.
llvm-svn: 345831
|
| |
|
|
|
|
|
| |
This can miscompile as shown in PR39510:
https://bugs.llvm.org/show_bug.cgi?id=39510
llvm-svn: 345780
|
| |
|
|
|
|
| |
This is the inverted case for the transform added with D53874 / rL345725.
llvm-svn: 345728
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This re-raises some of the open questions about how to apply and use fast-math-flags in IR from PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086
...but given the current implementation (no FMF on casts), this is likely the only way to predicate the
transform.
This is part of solving PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475
Differential Revision: https://reviews.llvm.org/D53874
llvm-svn: 345725
|
| |
|
|
| |
llvm-svn: 345722
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a fix for PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475
We managed to get some of these patterns using computeKnownBits in D47041, but that
can't be used for nabs(). Instead, put in some range-based logic, so we can fold
both abs/nabs with icmp with a constant value.
Alive proofs:
https://rise4fun.com/Alive/21r
Name: abs_nsw_is_positive
%cmp = icmp slt i32 %x, 0
%negx = sub nsw i32 0, %x
%abs = select i1 %cmp, i32 %negx, i32 %x
%r = icmp sgt i32 %abs, -1
=>
%r = i1 true
Name: abs_nsw_is_not_negative
%cmp = icmp slt i32 %x, 0
%negx = sub nsw i32 0, %x
%abs = select i1 %cmp, i32 %negx, i32 %x
%r = icmp slt i32 %abs, 0
=>
%r = i1 false
Name: nabs_is_negative_or_0
%cmp = icmp slt i32 %x, 0
%negx = sub i32 0, %x
%nabs = select i1 %cmp, i32 %x, i32 %negx
%r = icmp slt i32 %nabs, 1
=>
%r = i1 true
Name: nabs_is_not_over_0
%cmp = icmp slt i32 %x, 0
%negx = sub i32 0, %x
%nabs = select i1 %cmp, i32 %x, i32 %negx
%r = icmp sgt i32 %nabs, 0
=>
%r = i1 false
Differential Revision: https://reviews.llvm.org/D53844
llvm-svn: 345717
|
| |
|
|
|
|
|
| |
This is part of a problem noted in PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475
llvm-svn: 345615
|
| |
|
|
| |
llvm-svn: 345541
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Depends on D52765
Reviewers: aheejin, dschuff
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52766
llvm-svn: 344799
|
| |
|
|
|
|
|
| |
These should all be handled using "dyn_castNegVal",
but that misses vectors with undef elements.
llvm-svn: 344790
|
| |
|
|
|
|
| |
https://reviews.llvm.org/D52934
llvm-svn: 344084
|
| |
|
|
|
|
| |
This should be fixed with D52934.
llvm-svn: 343936
|
| |
|
|
|
|
|
|
|
|
|
| |
Remove duplicate tests from InstCombine that were added with
D50582. I left negative tests there to verify that nothing
in InstCombine tries to go overboard. If isKnownNeverNaN is
improved to handle the FP binops or other cases, we should
have coverage under InstSimplify, so we could remove more
duplicate tests from InstCombine at that time.
llvm-svn: 340279
|
| |
|
|
|
|
|
|
|
| |
This is a slight modification of the tests from D50582;
change half of the predicates to 'uno' so we have coverage
for that side too. All of the positive tests can fold to a
constant (true/false), so that should happen in instsimplify.
llvm-svn: 340276
|
| |
|
|
| |
llvm-svn: 339396
|
| |
|
|
| |
llvm-svn: 339176
|
| |
|
|
| |
llvm-svn: 339174
|
| |
|
|
| |
llvm-svn: 339171
|
| |
|
|
|
|
|
|
| |
Instcombine gets some, but not all, of these cases via
it's internal reassociation transforms. It fails in
all cases with vector types.
llvm-svn: 339168
|
| |
|
|
| |
llvm-svn: 339144
|
| |
|
|
| |
llvm-svn: 339142
|
| |
|
|
| |
llvm-svn: 339141
|
| |
|
|
|
|
|
| |
Also fix apparently missing test coverage for any of the
handling here.
llvm-svn: 339023
|
| |
|
|
|
|
|
|
|
|
|
| |
This is the second patch of the series which intends to enable jump threading for an inlined method whose return type is std::pair<int, bool> or std::pair<bool, int>.
The first patch is https://reviews.llvm.org/rL338485.
This patch handles code sequences that merges two values using `shl` and `or`, then extracts one value using `and`.
Differential Revision: https://reviews.llvm.org/D49981
llvm-svn: 338817
|
| |
|
|
| |
llvm-svn: 338719
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the NAN checks suggested in PR37776:
https://bugs.llvm.org/show_bug.cgi?id=37776
If both operands to maxnum are NAN, that should get constant folded, so we don't
have to handle that case. This is the same assumption as other FP ops in this
function. Returning 'false' is always conservatively correct.
Copying from the bug report:
Currently, we have this for "when is cannotBeOrderedLessThanZero
(mustBePositiveOrNaN) true for maxnum":
L
-------------------
| Pos | Neg | NaN |
------------------------
|Pos | x | x | x |
------------------------
R |Neg | x | | x |
------------------------
|NaN | x | x | x |
------------------------
The cases with (Neg & NaN) are wrong. We should have:
L
-------------------
| Pos | Neg | NaN |
------------------------
|Pos | x | x | x |
------------------------
R |Neg | x | | |
------------------------
|NaN | x | | x |
------------------------
Differential Revision: https://reviews.llvm.org/D50081
llvm-svn: 338716
|
| |
|
|
| |
llvm-svn: 338652
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch intends to enable jump threading when a method whose return type is std::pair<int, bool> or std::pair<bool, int> is inlined.
For example, jump threading does not happen for the if statement in func.
std::pair<int, bool> callee(int v) {
int a = dummy(v);
if (a) return std::make_pair(dummy(v), true);
else return std::make_pair(v, v < 0);
}
int func(int v) {
std::pair<int, bool> rc = callee(v);
if (rc.second) {
// do something
}
SROA executed before the method inlining replaces std::pair by i64 without splitting in both callee and func since at this point no access to the individual fields is seen to SROA.
After inlining, jump threading fails to identify that the incoming value is a constant due to additional instructions (like or, and, trunc).
This series of patch add patterns in InstructionSimplify to fold extraction of members of std::pair. To help jump threading, actually we need to optimize the code sequence spanning multiple BBs.
These patches does not handle phi by itself, but these additional patterns help NewGVN pass, which calls instsimplify to check opportunities for simplifying instructions over phi, apply phi-of-ops optimization to result in successful jump threading.
SimplifyDemandedBits in InstCombine, can do more general optimization but this patch aims to provide opportunities for other optimizers by supporting a simple but common case in InstSimplify.
This first patch in the series handles code sequences that merges two values using shl and or and then extracts one value using lshr.
Differential Revision: https://reviews.llvm.org/D48828
llvm-svn: 338485
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Proof: https://rise4fun.com/Alive/L5J
Reviewers: lebedev.ri, spatel
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49975
llvm-svn: 338383
|