| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Start using the uadd.sat and usub.sat intrinsics for the existing
canonicalizations. These intrinsics should optimize better than
expanded IR, have better handling in the X86 backend and should
be no worse than expanded IR in other backends, as far as we know.
rL357012 already introduced use of uadd.sat for the add+umin pattern.
Differential Revision: https://reviews.llvm.org/D58872
llvm-svn: 357103
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Follow-up to rL355221.
This isn't specifically called for within PR14613,
but we'll get there eventually if it's not already
requested in some other bug report.
https://rise4fun.com/Alive/5b0
Name: smax
Pre: WillNotOverflowSignedSub(C1,C0)
%a = add nsw i8 %x, C0
%cond = icmp sgt i8 %a, C1
%r = select i1 %cond, i8 %a, i8 C1
=>
%c2 = icmp sgt i8 %x, C1-C0
%u2 = select i1 %c2, i8 %x, i8 C1-C0
%r = add nsw i8 %u2, C0
Name: smin
Pre: WillNotOverflowSignedSub(C1,C0)
%a = add nsw i32 %x, C0
%cond = icmp slt i32 %a, C1
%r = select i1 %cond, i32 %a, i32 C1
=>
%c2 = icmp slt i32 %x, C1-C0
%u2 = select i1 %c2, i32 %x, i32 C1-C0
%r = add nsw i32 %u2, C0
llvm-svn: 355272
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the motivating cases from PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
...moving the add enables us to narrow the
min/max which eliminates zext/trunc which
enables signficantly better vectorization.
But that bug is still not completely fixed.
https://rise4fun.com/Alive/5KQ
Name: umax
Pre: C1 u>= C0
%a = add nuw i8 %x, C0
%cond = icmp ugt i8 %a, C1
%r = select i1 %cond, i8 %a, i8 C1
=>
%c2 = icmp ugt i8 %x, C1-C0
%u2 = select i1 %c2, i8 %x, i8 C1-C0
%r = add nuw i8 %u2, C0
Name: umin
Pre: C1 u>= C0
%a = add nuw i32 %x, C0
%cond = icmp ult i32 %a, C1
%r = select i1 %cond, i32 %a, i32 C1
=>
%c2 = icmp ult i32 %x, C1-C0
%u2 = select i1 %c2, i32 %x, i32 C1-C0
%r = add nuw i32 %u2, C0
llvm-svn: 355221
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Yet another pattern variation suggested by:
https://bugs.llvm.org/show_bug.cgi?id=14613
There are 8 more potential commuted patterns here on top of the
8 that were already handled (rL354221, rL354276, rL354393).
We have the obvious commute of the 'add' + commute of the cmp
predicate/operands (ugt/ult) + commute of the select operands:
Name: base
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ult i32 %x, %y
%r = select i1 %c, i32 -1, i32 %a
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a
Name: ugt
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ugt i32 %y, %x
%r = select i1 %c, i32 -1, i32 %a
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a
Name: commute select
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ult i32 %y, %x
%r = select i1 %c, i32 %a, i32 -1
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a
Name: ugt + commute select
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ugt i32 %x, %y
%r = select i1 %c, i32 %a, i32 -1
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a
https://rise4fun.com/Alive/den
llvm-svn: 354887
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
Name: uaddsat, -1 fval
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ugt i32 %notx, %y
%r = select i1 %c, i32 %a, i32 -1
=>
%a = add i32 %x, %y
%c2 = icmp ugt i32 %y, %a
%r = select i1 %c2, i32 -1, i32 %a
Name: uaddsat, -1 fval + ult
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ult i32 %y, %notx
%r = select i1 %c, i32 %a, i32 -1
=>
%a = add i32 %x, %y
%c2 = icmp ugt i32 %y, %a
%r = select i1 %c2, i32 -1, i32 %a
https://rise4fun.com/Alive/nTp
llvm-svn: 354393
|
| |
|
|
|
|
|
|
|
|
| |
This is no-functional-change-intended, but that was also
true when it was part of rL354276, and I managed to lose
2 predicates for the fold with constant...causing much bot
distress. So this time I'm adding a couple of negative tests
to avoid that.
llvm-svn: 354384
|
| |
|
|
|
|
|
| |
This reverts commit 079b610c29b4a428b3ae7b64dbac0378facf6632.
Bots are failing after this change on a stage 2 compile of clang.
llvm-svn: 354277
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
Name: uaddsat, -1 fval
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ugt i32 %notx, %y
%r = select i1 %c, i32 %a, i32 -1
=>
%a = add i32 %x, %y
%c2 = icmp ugt i32 %y, %a
%r = select i1 %c2, i32 -1, i32 %a
Name: uaddsat, -1 fval + ult
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ult i32 %y, %notx
%r = select i1 %c, i32 %a, i32 -1
=>
%a = add i32 %x, %y
%c2 = icmp ugt i32 %y, %a
%r = select i1 %c2, i32 -1, i32 %a
https://rise4fun.com/Alive/nTp
llvm-svn: 354276
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
Name: not op
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ult i32 %notx, %y
%r = select i1 %c, i32 -1, i32 %a
=>
%a = add i32 %x, %y
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a
Name: not op ugt
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ugt i32 %y, %notx
%r = select i1 %c, i32 -1, i32 %a
=>
%a = add i32 %x, %y
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a
https://rise4fun.com/Alive/niom
(The matching here is still incomplete.)
llvm-svn: 354224
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
(The matching here is incomplete. Trying to take minimal steps
to make sure we don't induce infinite looping from existing
canonicalizations of the 'select'.)
llvm-svn: 354221
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm circling back around to a loose end from D51929.
The backend (either CGP or DAG) doesn't recognize this pattern, so we end up with different asm for these IR variants.
Regardless of any future changes to canonicalize to saturation/overflow intrinsics, we want to get raw IR variations
into the minimal number of raw IR forms. If/when we can canonicalize to intrinsics, that will make that step easier.
Pre: C2 == ~C1
%a = add i32 %x, C1
%c = icmp ugt i32 %x, C2
%r = select i1 %c, i32 -1, i32 %a
=>
%a = add i32 %x, C1
%c2 = icmp ult i32 %x, C2
%r = select i1 %c2, i32 %a, i32 -1
https://rise4fun.com/Alive/pkH
Differential Revision: https://reviews.llvm.org/D57352
llvm-svn: 352536
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
|
| |
|
|
|
|
|
|
|
| |
This is matching the equivalent of the DAG expansion,
so it should never end up with worse perf than the
original code even if the target doesn't have a rotate
instruction.
llvm-svn: 350672
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The cttz/ctlz intrinsics have a parameter specifying whether the
result is undefined for zero. cttz(x, false) can be relaxed to
cttz(x, true) if x is known non-zero, and in fact such an optimization
is already performed. However, this currently doesn't work if x is
non-zero as a result of a select rather than an explicit branch.
This patch adds handling for this case, thus allowing
x != 0 ? cttz(x, false) : y to simplify to x != 0 ? cttz(x, true) : y.
Differential Revision: https://reviews.llvm.org/D55786
llvm-svn: 350463
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The final piece of IR-level analysis to allow this was committed with:
rL350188
Using the intrinsics should improve transforms based on cost models
like vectorization and inlining.
The backend should be prepared too, so we can now canonicalize more
sequences of shift/logic to the intrinsics and know that the end
result should be equal or better to the original code even if the
target does not have an actual rotate instruction.
llvm-svn: 350199
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an almost direct move of the functionality from InstCombine to
InstSimplify. There's no reason not to do this in InstSimplify because
we never create a new value with this transform.
(There's a question of whether any dominance-based transform belongs in
either of these passes, but that's a separate issue.)
I've changed 1 of the conditions for the fold (1 of the blocks for the
branch must be the block we started with) into an assert because I'm not
sure how that could ever be false.
We need 1 extra check to make sure that the instruction itself is in a
basic block because passes other than InstCombine may be using InstSimplify
as an analysis on values that are not wired up yet.
The 3-way compare changes show that InstCombine has some kind of
phase-ordering hole. Otherwise, we would have already gotten the intended
final result that we now show here.
llvm-svn: 347896
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These asserts are based on the assumption that the order of true/false operands in a select and those in the compare would always be the same.
This fixes PR39595.
Reviewers: craig.topper, spatel, dmgreen
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54359
llvm-svn: 346874
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The cmp+branch variant of this pattern is shown in:
https://bugs.llvm.org/show_bug.cgi?id=34924
...and as discussed there, we probably can't transform
that without a rotate intrinsic. We do have that now
via funnel shift, but we're not quite ready to
canonicalize IR to that form yet. The case with 'select'
should already be transformed though, so that's this patch.
The sequence with negation followed by masking is what we
use in the backend and partly in clang (though that part
should be updated).
https://rise4fun.com/Alive/TplC
%cmp = icmp eq i32 %shamt, 0
%sub = sub i32 32, %shamt
%shr = lshr i32 %x, %shamt
%shl = shl i32 %x, %sub
%or = or i32 %shr, %shl
%r = select i1 %cmp, i32 %x, i32 %or
=>
%neg = sub i32 0, %shamt
%masked = and i32 %shamt, 31
%maskedneg = and i32 %neg, 31
%shl2 = lshr i32 %x, %masked
%shr2 = shl i32 %x, %maskedneg
%r = or i32 %shl2, %shr2
llvm-svn: 346807
|
| |
|
|
|
|
|
|
|
| |
This is NFCI for InstCombine because it calls InstSimplify,
so I left the tests for this transform there. As noted in
the code comment, we can allow this fold more often by using
FMF and/or value tracking.
llvm-svn: 346169
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
It looks like we correctly removed edge cases with 0.0 from D50714,
but we were a bit conservative because getBinOpIdentity() doesn't
distinguish between +0.0 and -0.0 and 'nsz' is effectively always
true for fcmp (see discussion in:
https://bugs.llvm.org/show_bug.cgi?id=38086
Without this change, we would get regressions by canonicalizing
to +0.0 in all fcmp, and that's a step towards solving:
https://bugs.llvm.org/show_bug.cgi?id=39475
llvm-svn: 346143
|
| |
|
|
|
|
|
| |
This is another step towards completely removing the fake
binop queries for not/neg/fneg.
llvm-svn: 345036
|
| |
|
|
| |
llvm-svn: 345034
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The IRBuilder CreateIntrinsic method wouldn't allow you to specify the
types that you wanted the intrinsic to be mangled with. To fix this
I've:
- Added an ArrayRef<Type *> member to both CreateIntrinsic overloads.
- Used that array to pass into the Intrinsic::getDeclaration call.
- Added a CreateUnaryIntrinsic to replace the most common use of
CreateIntrinsic where the type was auto-deduced from operand 0.
- Added a bunch more unit tests to test Create*Intrinsic calls that
weren't being tested (including the FMF flag that wasn't checked).
This was suggested as part of the AMDGPU specific atomic optimizer
review (https://reviews.llvm.org/D51969).
Differential Revision: https://reviews.llvm.org/D52087
llvm-svn: 343962
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
invertible
Summary: This restores the combine that was reverted in r341883. The infinite loop from the failing test no longer occurs due to changes from r342163.
Reviewers: spatel, dmgreen
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52070
llvm-svn: 342797
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
are freely invertible.
This allows the xor to be removed completely.
This might help with recomitting r341674, but seems good regardless.
Coincidentally fixes PR38915.
Differential Revision: https://reviews.llvm.org/D51964
llvm-svn: 342163
|
| |
|
|
|
|
|
|
|
| |
I accidentally committed this diff with rL342147 because
I had applied D51964. We probably do need those checks,
but D51964 has tests and more discussion/motivation,
so they should be re-added with that patch.
llvm-svn: 342149
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I don't have a test case for this, but it's motivated by
the discussion in D51964, and I've added TODO comments for
the better fix - move simplifications into instsimplify
because that's more efficient and reduces risk of infinite
loops in instcombine caused by transforms trying to do the
opposite folds.
In this case, we know that the transform that tries to move
'not' through min/max can be fooled by the multiple uses
of a value in another min/max, so try to squash the
foldSPFofSPF() patterns first.
llvm-svn: 342147
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Revert min/max changes in rL341674 dues to high compile times causing timeouts (PR38897).
Checking in to unblock failing builds. Patch available for post-commit review and re-revert once resolved.
Working on a smaller reproducer for PR38897.
Reviewers: craig.topper, spatel
Subscribers: sanjoy, jlebar, llvm-commits
Differential Revision: https://reviews.llvm.org/D51897
llvm-svn: 341883
|
| |
|
|
|
|
|
|
|
|
|
|
| |
invertible
If the ~X wasn't able to simplify above the max/min, we might be able to simplify it by moving it below the max/min.
I had to modify the ~(min/max ~X, Y) transform to prevent getting stuck in a loop when we saw the new ~(max/min X, ~Y) before the ~Y had been folded away to remove the new not.
Differential Revision: https://reviews.llvm.org/D51398
llvm-svn: 341674
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If OtherOpT or OtherOpF have scalar types and the condition is a vector,
we would create an invalid select.
Reviewers: spatel, john.brawn, mssimpso, craig.topper
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D51781
llvm-svn: 341666
|
| |
|
|
|
|
| |
We were calling getNumUses to check for 1 or 2 uses. But getNumUses is linear in the number of uses. We can instead use !hasNUsesOrMore(3) which will stop the linear scan as soon as it determines there are at least 3 uses even if there are more.
llvm-svn: 340939
|
| |
|
|
| |
llvm-svn: 340790
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Follow up for https://reviews.llvm.org/rL339520 and https://reviews.llvm.org/rL338300
Alive:
```
%A = fcmp oeq float %x, 0.0
%B = fadd nsz float %x, %z
%C = select i1 %A, float %B, float %y
=>
%C = select i1 %A, float %z, float %y
----------
%A = fcmp oeq float %x, 0.0
%B = fadd nsz float %x, %z
%C = select %A, float %B, float %y
=>
%C = select %A, float %z, float %y
Done: 1
Optimization is correct
%A = fcmp une float %x, -0.0
%B = fadd nsz float %x, %z
%C = select i1 %A, float %y, float %B
=>
%C = select i1 %A, float %y, float %z
----------
%A = fcmp une float %x, -0.0
%B = fadd nsz float %x, %z
%C = select %A, float %y, float %B
=>
%C = select %A, float %y, float %z
Done: 1
Optimization is correct
```
Reviewers: spatel, lebedev.ri
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50714
llvm-svn: 340538
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
intersection
Summary: This change address bug 38641
Reviewers: spatel, wristow
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D50996
llvm-svn: 340222
|
| |
|
|
| |
llvm-svn: 340150
|
| |
|
|
| |
llvm-svn: 339938
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Basic version was merged - https://reviews.llvm.org/D49954
This adds support for FP & non-commutative opcodes
Precommited tests: https://reviews.llvm.org/rL338727
Reviewers: spatel, lebedev.ri
Reviewed By: spatel
Subscribers: jfb
Differential Revision: https://reviews.llvm.org/D50190
llvm-svn: 339520
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is a retry of rL339439 with a fix for the problem that
caused the original commit to be reverted at rL339446.
That problem was that the compare can be integer while
the binop is FP or vice-versa, so we need to use the binop
type when we ask for the identity constant.
A test to guard against the problem was added at rL339453.
llvm-svn: 339469
|
| |
|
|
|
|
|
| |
That was supposed to be NFC, but it exposed a logic hole somewhere that
caused bots to fail.
llvm-svn: 339446
|
| |
|
|
|
|
|
| |
This should make it easier to folow and to add the planned enhancements
such as D50190.
llvm-svn: 339439
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fold
%A = icmp eq i8 %x, 0
%B = xor i8 %x, %z
%C = select i1 %A, i8 %B, i8 %y
To
%C = select i1 %A, i8 %z, i8 %y
Fixes https://bugs.llvm.org/show_bug.cgi?id=38345
Proof: https://rise4fun.com/Alive/43J
Reviewers: lebedev.ri, spatel
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49954
llvm-svn: 338300
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D48754
llvm-svn: 338092
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D49238
llvm-svn: 337143
|
| |
|
|
|
|
|
|
|
|
| |
When adjusting a cmp in order to canonicalize an abs/nabs select pattern we need
to use the type of the existing operand when creating a new operand not the
type of a select operand, as the two may be different.
This fixes PR37686.
llvm-svn: 334019
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is the planned enhancement to D47163 / rL333611.
We want to match cmp/select sizes because that will be recognized
as min/max more easily and lead to better codegen (especially for
vector types).
As mentioned in D47163, this improves some of the tests that would
also be folded by D46380, so we may want to adjust that patch to
match the new patterns where the extend op occurs after the select.
llvm-svn: 333689
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already do this for min/max (see the blob above the diff),
so we should do the same for abs/nabs.
A sign-bit check (<s 0) is used as a predicate for other IR
transforms and it's likely the best for codegen.
This might solve the motivating cases for D47037 and D47041,
but I think those patches still make sense. We can't guarantee
this canonicalization if the icmp has more than one use.
Differential Revision: https://reviews.llvm.org/D47076
llvm-svn: 332819
|
| |
|
|
|
|
| |
min/max and ignore abs/nabs.
llvm-svn: 332770
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add logic for the special case when a cmp+select can clearly be
reduced to just a bitwise logic instruction, and remove an
over-reaching chunk of general purpose bit magic. The primary goal
is to remove cases where we are not improving the IR instruction
count when doing these select transforms, and in all cases here that
is true.
In the motivating 3-way compare tests, there are further improvements
because we can combine/propagate select values (not sure if that
belongs in instcombine, but it's there for now).
DAGCombiner has folds to turn some of these selects into bit magic,
so there should be no difference in the end result in those cases.
Not all constant combinations are handled there yet, however, so it
is possible that some targets will see more cmov/csel codegen with
this change in IR canonicalization.
Ideally, we'll go further to *not* turn selects into multiple
logic/math ops in instcombine, and we'll canonicalize to selects.
But we should make sure that this step does not result in regressions
first (and if it does, we should fix those in the backend).
The general direction for this change was discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105373.html
http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html
Alive proofs for the new bit magic:
https://rise4fun.com/Alive/XG7
Differential Revision: https://reviews.llvm.org/D46086
llvm-svn: 331486
|
| |
|
|
|
|
|
|
|
|
| |
As discussed in D45862, we want to delete parts of
this code because it can create more instructions
than it removes. But we also want to preserve some
folds that are winners, so tidy up what's here to
make splitting the good from bad a bit easier.
llvm-svn: 330841
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The fold added in D45108 did not account for the fact that
the and instruction is commutative, and if the mask is a variable,
the mask variable and the fold variable may be swapped.
I have noticed this by accident when looking into [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]]
This extends/generalizes that fold, so it is handled too.
Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45539
llvm-svn: 330001
|