Avoids warnings in Release builds. NFC.
---
The MVE VADC instruction reads and writes the carry bit at bit 29 of
the FPSCR register. The corresponding ACLE intrinsic is specified to
work with an integer in which the carry bit is stored at bit 0. So if
a user writes a code sequence in C that passes the carry from one VADC
to the next, like this:

  s0 = vadcq_u32(a0, b0, &carry);
  s1 = vadcq_u32(a1, b1, &carry);

then clang will generate IR for each of those operations that shifts
the carry bit up into bit 29 before the VADC, and after it, shifts it
back down and masks off all but the low bit. But in this situation
what you really wanted was two consecutive VADC instructions, so that
the second one directly reads the value left in FPSCR by the first,
without wasting several instructions on pointlessly clearing the other
flag bits in between.
This commit explains to InstCombine that the other bits of the flags
operand don't matter, and adds a test that demonstrates that all the
code between the two VADC instructions can be optimized away as a
result.
Reviewers: dmgreen, miyuki, ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67162
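To make the bit-layout mismatch concrete, here is a minimal standalone sketch (not clang's actual codegen; the helper names are made up for illustration):

```
#include <cassert>
#include <cstdint>

// Hypothetical helpers mirroring the conversions described above: the ACLE
// intrinsic keeps the carry in bit 0; the hardware keeps it in FPSCR bit 29.
static uint32_t carryInToFpscr(unsigned carry) {
  return (carry & 1u) << 29;   // shift the carry up into bit 29
}
static unsigned carryOutOfFpscr(uint32_t fpscr) {
  return (fpscr >> 29) & 1u;   // shift back down, mask off the other flags
}

int main() {
  // Round-tripping is the identity on the carry bit, which is why all the
  // shifting and masking between two back-to-back VADCs is removable.
  for (unsigned c = 0; c <= 1; ++c)
    assert(carryOutOfFpscr(carryInToFpscr(c)) == c);
}
```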
---
This adds an instcombine matcher for code that attempts to perform signed
saturating arithmetic by casting to a higher type. Unsigned cases are already
matched; this adds extra matches for the more complex signed cases, which
involve matching the min(max(add a b)) nodes with the proper extends to ensure
legality.
Differential Revision: https://reviews.llvm.org/D68651
llvm-svn: 375505
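For illustration, a sketch of the kind of source-level idiom the new matcher targets: signed saturating i16 addition done via a widening add and a min/max clamp (names are illustrative, not from the patch):

```
#include <algorithm>
#include <cassert>
#include <cstdint>

// Signed saturating add via a higher type: sext -> add -> min(max(...)) ->
// trunc, the node shape the new matcher looks for.
static int16_t sadd_sat16(int16_t a, int16_t b) {
  int32_t s = int32_t(a) + int32_t(b);                  // add in i32
  s = std::min(std::max(s, int32_t(INT16_MIN)), int32_t(INT16_MAX));
  return int16_t(s);                                    // trunc back to i16
}

int main() {
  assert(sadd_sat16(32767, 1) == 32767);    // clamps at INT16_MAX
  assert(sadd_sat16(-32768, -1) == -32768); // clamps at INT16_MIN
  assert(sadd_sat16(100, 23) == 123);       // in-range values pass through
}
```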
---
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69303
llvm-svn: 375499
---
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69302
llvm-svn: 375498
---
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: hiraditya, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69253
llvm-svn: 375419
---
As noted in post-commit review of rL375378.
llvm-svn: 375397
---
Summary:
Allow for ignoring the check for a single use in SimplifyDemandedVectorElts
to be able to simplify operands if DemandedElts is known to contain
the union of elements used by all users.
It is the responsibility of the caller of SimplifyDemandedVectorElts to
supply correct DemandedElts.
Simplify a series of extractelement instructions if only a subset of
elements is used.
Reviewers: reames, arsenm, majnemer, nhaehnle
Reviewed By: nhaehnle
Subscribers: wdng, jvesely, nhaehnle, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67345
llvm-svn: 375395
---
In this pattern, all the "magic" bits that we'd `add` are high sign bits,
and in the value we'd be adding them to, those bits are all unset (not
unexpectedly), so we can use an `or` there:
https://rise4fun.com/Alive/ups
It is possible that `haveNoCommonBitsSet()` should be taught about this
pattern so that we never have an `add` variant, but the reasoning would
need to be recursive (because of that `select`), so I'm not really sure
that would be worth it just yet.
llvm-svn: 375378
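The underlying identity is easy to check directly (a small verification sketch, not the InstCombine code):

```
#include <cassert>
#include <cstdint>

int main() {
  // Exhaustive over the low 8 bits: when two values share no set bits, the
  // addition can never carry, so `add` and `or` coincide.
  for (uint32_t a = 0; a < 256; ++a)
    for (uint32_t b = 0; b < 256; ++b)
      if ((a & b) == 0)
        assert(a + b == (a | b));
}
```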
---
This adds folds for comparing uadd.sat/usub.sat with zero:
* uadd.sat(a, b) == 0 => a == 0 && b == 0 => (a | b) == 0
* usub.sat(a, b) == 0 => a <= b
And inverted forms for !=.
Differential Revision: https://reviews.llvm.org/D69224
llvm-svn: 375374
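Both folds can be verified exhaustively at i8 against reference implementations of the intrinsics' semantics (a sketch; uadd_sat/usub_sat are stand-ins for the intrinsics):

```
#include <cassert>
#include <cstdint>

// Reference semantics of the saturating intrinsics, specialized to i8.
static uint8_t uadd_sat(uint8_t a, uint8_t b) {
  unsigned s = unsigned(a) + unsigned(b);
  return s > 0xFF ? 0xFF : uint8_t(s);
}
static uint8_t usub_sat(uint8_t a, uint8_t b) { return a > b ? a - b : 0; }

int main() {
  for (unsigned a = 0; a <= 0xFF; ++a)
    for (unsigned b = 0; b <= 0xFF; ++b) {
      assert((uadd_sat(a, b) == 0) == ((a | b) == 0)); // first fold
      assert((usub_sat(a, b) == 0) == (a <= b));       // second fold
    }
}
```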
---
Summary:
This problem consists of several parts:
* Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`.
This is trivial and easy to do; we have a fold for it.
* Shift amount reassociation - if we have two identical shifts,
and we can simplify-add their shift amounts together,
then we likely can just perform them as a single shift.
But this is finicky, has one-use restrictions,
and shift opcodes must be identical.
But there is a super-pattern where both of these work together
to produce a sign-bit test from two shifts plus a comparison.
We do indeed already handle this in most cases.
But since we get that fold transitively, it has one-use restrictions.
And what's worse, in this case the right-shifts aren't required to be
identical, and we can't handle that transitively:
If the total shift amount is bitwidth-1, only a sign bit will remain
in the output value. But if we look at this from the perspective of
two shifts, we can't fold: we can't possibly know what bit pattern
we'd produce via two shifts. It will be *some* kind of mask
derived from the original sign bit, but we just can't tell its shape:
https://rise4fun.com/Alive/cM0 https://rise4fun.com/Alive/9IN
But it will *only* contain the sign bit and zeros.
So from the perspective of a sign-bit test, we're good:
https://rise4fun.com/Alive/FRz https://rise4fun.com/Alive/qBU
Superb!
So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a
pseudo-analysis mode that ignores extra uses and only checks
whether (a) those are two right shifts and (b) they end up with a total
shift amount of bitwidth(x)-1, returning either the original value whose
sign we are checking, or null.
There is no functional change for
the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()` uses.
All that being said, as discussed in the review, this yet again
increases the usage of InstSimplify as a utility within InstCombine.
Some day that may need to be reevaluated.
https://bugs.llvm.org/show_bug.cgi?id=43595
Reviewers: spatel, efriedma, vsk
Reviewed By: spatel
Subscribers: xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68930
llvm-svn: 375371
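The core observation is easy to spot-check (a sketch of the semantics, not of the transform): two logical right shifts whose amounts total bitwidth-1 leave only the sign bit:

```
#include <cassert>
#include <cstdint>

int main() {
  for (int32_t x : {0, 1, -1, 42, -42, INT32_MIN, INT32_MAX})
    for (uint32_t s1 = 0; s1 <= 31; ++s1) {
      uint32_t s2 = 31 - s1;                  // total shift is bitwidth-1
      uint32_t r = ((uint32_t)x >> s1) >> s2; // two lshr's, possibly unequal
      assert((r != 0) == (x < 0));            // i.e. 'icmp slt %x, 0'
    }
}
```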
---
Summary:
Add restrictions in canEvaluateShuffled to prevent us from, for example,
transforming
%0 = insertelement <2 x i16> undef, i16 %a, i32 0
%1 = srem <2 x i16> %0, <i16 2, i16 1>
%2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0>
into
%1 = insertelement <2 x i16> undef, i16 %a, i32 1
%2 = srem <2 x i16> %1, <i16 undef, i16 2>
as having an undef denominator makes the srem undefined (for all
vector elements).
Fixes: https://bugs.llvm.org/show_bug.cgi?id=43689
Reviewers: spatel, lebedev.ri
Reviewed By: spatel, lebedev.ri
Subscribers: lebedev.ri, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69038
llvm-svn: 375208
---
dropRedundantMaskingOfLeftShiftInput()
llvm-svn: 375153
---
Summary:
This is something of a workaround to avoid a crash later on in the type
legalizer (WidenVectorResult()).
Also added some f16 tests, including a non-working v3f16 case with
a FIXME.
Reviewers: arsenm, tpr, nhaehnle
Reviewed By: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68865
llvm-svn: 374993
---
The first attempt (rL374828) inserted the code
at the wrong position (outside of the constant-shift-amount
block). Trying again with an additional test to verify
const-ness.
For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)
https://rise4fun.com/Alive/IZ9
Fixes PR42257.
Based on original patch by @zvi (Zvi Rackover)
Differential Revision: https://reviews.llvm.org/D63382
llvm-svn: 374886
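A quick sanity check of the equivalence (a sketch; x models the i1 operand):

```
#include <cassert>
#include <cstdint>

int main() {
  // shl (zext (i1 X)), ShAmt  ==  select (X, 1 << ShAmt, 0), checked on i32.
  for (unsigned sh = 0; sh < 32; ++sh)
    for (bool x : {false, true})
      assert(((uint32_t)x << sh) == (x ? (1u << sh) : 0u));
}
```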
---
This reverts r374828 (git commit 1f40f15d54aac06421448b6de131231d2d78bc75) due to bot breakage
llvm-svn: 374851
---
For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)
https://rise4fun.com/Alive/IZ9
Fixes PR42257.
Based on original patch by @zvi (Zvi Rackover)
Differential Revision: https://reviews.llvm.org/D63382
llvm-svn: 374828
---
dropRedundantMaskingOfLeftShiftInput()
llvm-svn: 374734
---
non-default address space
Follow-up to D68244 to account for a corner case discussed in:
https://bugs.llvm.org/show_bug.cgi?id=43501
Add one more restriction: if the pointer is deref-or-null and in a non-default
(non-zero) address space, we can't assume inbounds.
Differential Revision: https://reviews.llvm.org/D68706
llvm-svn: 374728
---
high-bit-extract-with-signext (PR42389)
This can come up in bit-stream abstractions.
The pattern looks big/scary, but it can't be simplified any further.
It is only this simple because a number of my preparatory folds have
already happened (shift amount reassociation / shift amount
reassociation in bit test, sign bit test detection).
Highlights:
* There are two main flavors: https://rise4fun.com/Alive/zWi
The difference is add vs. sub, and left-shift of -1 vs. 1
* Since we only change the shift opcode,
we can preserve the exact-ness: https://rise4fun.com/Alive/4u4
* There can be truncation after high-bit-extraction:
https://rise4fun.com/Alive/slHc1 (the main pattern I'm after!)
Which means that we need to ignore zext of shift amounts and of NBits.
* The sign-extending magic can be extended itself (in the add pattern
via sext, in the sub pattern via zext; not the other way around!)
https://rise4fun.com/Alive/NhG
(or those sext/zext can be sunk into `select`!)
Which again means we should pay attention when matching NBits.
* We can have both truncation of the extraction and widening of the magic:
https://rise4fun.com/Alive/XTw
In other words, I don't believe we need any checks on the
bitwidths of any of these constructs.
This is worsened in general by the fact that we may have `sext` instead
of `zext` for shift amounts, and we don't yet canonicalize to `zext`,
although we should. I have not done anything about that here.
Also, we really should have something to weed out `sub`s like these
by folding them into the `add` variant.
https://bugs.llvm.org/show_bug.cgi?id=42389
llvm-svn: 373964
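For a concrete feel, one member of the pattern family written out in C++ (an illustrative sketch of just the sub-flavor sign-extending magic; assumes arithmetic right shift of signed ints, implementation-defined before C++20 but universal in practice):

```
#include <cassert>
#include <cstdint>

int main() {
  for (uint32_t x : {0u, 1u, 0x80000000u, 0xDEADBEEFu, 0x7FFFFFFFu})
    for (uint32_t n = 1; n <= 31; ++n) {
      uint32_t m = 1u << (n - 1);              // sign bit of the n-bit field
      uint32_t hi = x >> (32 - n);             // high-bit extraction (lshr)
      int32_t magic = (int32_t)((hi ^ m) - m); // sign-extend via xor/sub magic
      int32_t ref = (int32_t)x >> (32 - n);    // what it all simplifies to
      assert(magic == ref);
    }
}
```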
---
True, no test coverage is being added here. But the non-canonical
predicates that are already handled here have no test coverage as far
as I can tell. I tried to add tests for them, but all the patterns
already get handled elsewhere.
llvm-svn: 373962
---
deal with mask
Summary:
Currently, we pre-check whether we need to produce a mask or not.
This involves some rather magical constants.
I'd like to extend this fold to also handle the situation
when there's also a `trunc` before the outer shift.
That will require another set of magical constants.
It's ugly.
Instead, we can just compute the mask, and check
whether the mask is a pass-through (all-ones) or not.
This way we don't need to have any magical numbers.
This change is NFC other than the fact that we now compute
the mask and then check whether we need to (and can!) apply it.
Reviewers: spatel
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68470
llvm-svn: 373961
---
amounts
Summary:
When we do `ConstantExpr::getZExt()`, that "extends" `undef` to `0`,
which means that for patterns a/b we'd assume that we must not produce
any bits for that channel, while in reality we simply didn't care
about that channel - i.e. we don't need to mask it.
Reviewers: spatel
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68239
llvm-svn: 373960
---
Extends rL373230 and solves the motivating bug (although in a narrow way):
https://bugs.llvm.org/show_bug.cgi?id=43497
llvm-svn: 373851
---
(PR43501)
https://bugs.llvm.org/show_bug.cgi?id=43501
We can't declare a GEP 'inbounds' in general. But we may salvage that information if
we have known dereferenceable bytes on the source pointer.
Differential Revision: https://reviews.llvm.org/D68244
llvm-svn: 373847
---
'icmp sge/slt %x, 0'
We do indeed already get it right in some cases, but only transitively,
with one-use restrictions. Since we only need to produce a single
comparison, it makes sense to match the pattern directly:
https://rise4fun.com/Alive/kPg
llvm-svn: 373802
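The semantics being matched, restated as a checkable sketch:

```
#include <cassert>
#include <cstdint>

int main() {
  // lshr %x, 31 isolates the sign bit, so testing the result against zero
  // is exactly 'icmp slt/sge %x, 0'.
  for (int32_t x : {0, 1, -1, 123, -123, INT32_MIN, INT32_MAX}) {
    uint32_t signbit = (uint32_t)x >> 31;
    assert((signbit != 0) == (x < 0));   // icmp slt %x, 0
    assert((signbit == 0) == (x >= 0));  // icmp sge %x, 0
  }
}
```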
---
(PR43564, PR42391)
Initially (D65380) I believed that if we have rightshift-trunc-rightshift,
we can't do any folding. But as usually happens, I was wrong.
https://rise4fun.com/Alive/GEw
https://rise4fun.com/Alive/gN2O
In https://bugs.llvm.org/show_bug.cgi?id=43564 we happen to have
this very sequence, of two right shifts separated by trunc.
And it just so happens that we can apparently fold the pattern
if the total shift amount is either 0, or it's equal to the bitwidth
of the innermost widest shift - i.e. if we are left with only the
original sign bit. Which is exactly what is wanted there.
llvm-svn: 373801
---
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, bollu, jdoerfert
Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D68268
llvm-svn: 373595
---
https://rise4fun.com/Alive/8BY - valid for lshr+trunc+variable sext
https://rise4fun.com/Alive/7jk - the variable sext can be redundant
https://rise4fun.com/Alive/Qslu - 'exact'-ness of the first shift can be preserved
https://rise4fun.com/Alive/IF63 - without trunc we could view this as
more general "drop redundant mask before right-shift",
but let's handle it here for now
https://rise4fun.com/Alive/iip - likewise, without trunc, variable sext can be redundant.
There are more patterns for sure - e.g. we can have 'lshr' as the final shift,
but that might be best handled by some more generic transform, e.g.
"drop redundant masking before right-shift" (PR42456)
I'm singling out this sext patch because you can only extract
high bits with `*shr` (unlike abstract bit masking),
and I *know* this fold is wanted by existing code.
I don't believe there is much to review here,
so I'm going to opt into post-review mode here.
https://bugs.llvm.org/show_bug.cgi?id=43523
llvm-svn: 373542
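The lshr+trunc+sext chain and its replacement, restated in C++ for one width pair (a sketch; relies on the usual wrap-around uint-to-int8_t conversion that mainstream compilers provide):

```
#include <cassert>
#include <cstdint>

int main() {
  for (uint32_t x : {0u, 0xFFu, 0x80000000u, 0xDEADBEEFu, 0x7F000000u}) {
    int32_t viaSextTrunc = (int32_t)(int8_t)(x >> 24); // lshr + trunc + sext
    int32_t viaAshr = (int32_t)x >> 24;                // a single ashr
    assert(viaSextTrunc == viaAshr);
  }
}
```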
---
Identical to its trunc-less variant; we just pretend to hoist the
trunc, and everything else still holds:
https://rise4fun.com/Alive/JRU
llvm-svn: 373364
---
https://rise4fun.com/Alive/yR4
llvm-svn: 373363
---
llvm-svn: 373249
---
Name: negate if true
%sel = select i1 %cond, i32 -1, i32 1
%r = mul i32 %sel, %x
=>
%m = sub i32 0, %x
%r = select i1 %cond, i32 %m, i32 %x
Name: negate if false
%sel = select i1 %cond, i32 1, i32 -1
%r = mul i32 %sel, %x
=>
%m = sub i32 0, %x
%r = select i1 %cond, i32 %x, i32 %m
https://rise4fun.com/Alive/Nlh
llvm-svn: 373230
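Both transforms are easy to check (a sketch; the multiply is done in unsigned to model IR's wrapping mul and avoid signed-overflow UB at INT32_MIN):

```
#include <cassert>
#include <cstdint>

int main() {
  for (int32_t x : {0, 1, -5, 1000, INT32_MIN, INT32_MAX})
    for (bool cond : {false, true}) {
      int32_t sel = cond ? -1 : 1;
      int32_t viaMul = (int32_t)((uint32_t)sel * (uint32_t)x);  // mul %sel, %x
      int32_t viaSel = cond ? (int32_t)(0u - (uint32_t)x) : x;  // negate-if
      assert(viaMul == viaSel);
    }
}
```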
---
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D68141
llvm-svn: 373207
---
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, jdoerfert
Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D68142
llvm-svn: 373195
---
Summary:
This is valid for any `sext` bitwidth pair:
```
Processing /tmp/opt.ll..
----------------------------------------
%signed = sext %y
%r = shl %x, %signed
ret %r
=>
%unsigned = zext %y
%r = shl %x, %unsigned
ret %r
Done: 2016
Optimization is correct!
```
(This isn't so for funnel shifts; there it's illegal for e.g. i6->i7.)
Main motivation is the C++ semantics:
```
int shl(int a, char b) {
return a << b;
}
```
ends as
```
%3 = sext i8 %1 to i32
%4 = shl i32 %0, %3
```
https://godbolt.org/z/0jgqUq
which is, as this shows, too pessimistic.
There is another problem here - we can only do the fold
if sext is one-use. But we can trivially have cases
where several shifts have the same sext shift amount.
This should be resolved later.
Reviewers: spatel, nikic, RKSimon
Reviewed By: spatel
Subscribers: efriedma, hiraditya, nlopes, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68103
llvm-svn: 373106
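The reasoning restated as a checkable sketch: an in-range shift amount is necessarily non-negative, so sext and zext of it agree, and out-of-range amounts are poison for shl anyway:

```
#include <cassert>
#include <cstdint>

int main() {
  for (int8_t amt = 0; amt < 32; ++amt) {    // every valid i32 shift amount
    int32_t viaSext = (int32_t)amt;          // sext i8 -> i32
    int32_t viaZext = (int32_t)(uint8_t)amt; // zext i8 -> i32
    assert(viaSext == viaZext);
    assert((1u << viaSext) == (1u << viaZext));
  }
}
```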
---
index is all zeroes to prevent an infinite loop.
The test case here previously went into an infinite loop. Only one element
from the GEP is used, so SimplifyDemandedVectorElts would replace the other
lanes in each index with undef, leading to the first index being
<0, undef, undef, undef>. But there's a GEP transform that tries to replace
an index into a 0-sized type with a zero index. That zero-index check only
works on ConstantInt 0 or ConstantAggregateZero, so it would turn the index
back into zeroinitializer, resulting in a loop.
The fix is to use m_Zero() to allow a vector of zeroes and undefs.
Differential Revision: https://reviews.llvm.org/D67977
llvm-svn: 373000
---
getFlippedStrictnessPredicateAndConstant
Summary:
Remove the assumption (assert) that the CmpInst has already been
simplified in getFlippedStrictnessPredicateAndConstant. The solution is
to simply bail out instead of hitting the assertion. Instead, we assume
that any profitable rewrite will happen in the next iteration
of InstCombine.
The reason why we can't assume that the CmpInst has already been
simplified is that the worklist does not guarantee such an ordering.
Solves https://bugs.llvm.org/show_bug.cgi?id=43376
Reviewers: spatel, lebedev.ri
Reviewed By: lebedev.ri
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68022
llvm-svn: 372972
---
(PR43251)
https://rise4fun.com/Alive/0j9
llvm-svn: 372930
---
https://rise4fun.com/Alive/KtL
This also shows that the fold added in D67412 / r372257
was too specific, and the new fold allows those test cases
to be handled more generically, so I deleted the now-dead code.
This is yet again motivated by
D67122 "[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour"
llvm-svn: 372912
---
As @reames pointed out post-commit, rL371518 adds additional rounding
in some cases, when doing constant folding of the multiplication.
This breaks a guarantee llvm.fma makes and must be avoided.
This patch reapplies rL371518, but splits off the simplifications not
requiring rounding from SimplifyFMulInst as SimplifyFMAFMul.
Reviewers: spatel, lebedev.ri, reames, scanon
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D67434
llvm-svn: 372899
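A small demonstration of why the extra rounding is observable (a sketch using the classic error-recovery idiom, assuming IEEE-754 doubles):

```
#include <cassert>
#include <cfloat>
#include <cmath>

int main() {
  double x = 1.0 + DBL_EPSILON, y = 1.0 + DBL_EPSILON;
  double p = x * y;            // rounded product: the epsilon^2 term is lost
  // Single rounding: fma computes x*y - p exactly, recovering the lost bits.
  assert(std::fma(x, y, -p) != 0.0);
  // Two roundings: the separately rounded product cancels to exactly zero.
  assert(x * y + (-p) == 0.0);
}
```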
---
Currently m_Br only takes references to BasicBlock*, which limits its
flexibility. For example, you have to declare a variable even if you
ignore the result, or you need additional checks to make sure the
matched BB matches an expected one.
This patch adds m_BasicBlock and m_SpecificBB matchers, which can be
used like the existing matchers for constants or values.
I also had a look at the existing uses and updated a few. IMO it makes
the code a bit more explicit.
Reviewers: spatel, craig.topper, RKSimon, majnemer, lebedev.ri
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D68013
llvm-svn: 372885
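A sketch of how the new matchers read in use (assumes the m_BasicBlock/m_SpecificBB API this patch introduces; builds against LLVM headers, not standalone):

```
#include "llvm/IR/Instructions.h"
#include "llvm/IR/PatternMatch.h"

using namespace llvm;
using namespace PatternMatch;

// Returns true if I is a conditional branch whose true successor is Expected.
// m_SpecificBB checks the successor inside the match itself; m_BasicBlock
// just binds the other successor instead of forcing a manual comparison.
static bool branchesTo(Instruction *I, BasicBlock *Expected) {
  Value *Cond;
  BasicBlock *Other;
  return match(I, m_Br(m_Value(Cond), m_SpecificBB(Expected),
                       m_BasicBlock(Other)));
}
```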
---
If we generate the gc.relocate and then later prove two arguments to the statepoint are equivalent, we should canonicalize the gc.relocate to the form we would have produced had this been known before rewriting.
llvm-svn: 372771
---
| |
Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.
For
```
#include <cassert>
char* test(char& base, signed long offset) {
__builtin_assume(offset < 0);
return &base + offset;
}
```
We produce
https://godbolt.org/z/r40U47
and again those two icmp's can be merged:
```
Name: 0
Pre: C != 0
%adjusted = add i8 %base, C
%not_null = icmp ne i8 %adjusted, 0
%no_underflow = icmp ult i8 %adjusted, %base
%r = and i1 %not_null, %no_underflow
=>
%neg_offset = sub i8 0, C
%r = icmp ugt i8 %base, %neg_offset
```
https://rise4fun.com/Alive/ALap
https://rise4fun.com/Alive/slnN
There are 3 other variants of this pattern;
I believe they will all go into InstSimplify.
https://bugs.llvm.org/show_bug.cgi?id=43259
Reviewers: spatel, xbolva00, nikic
Reviewed By: spatel
Subscribers: efriedma, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67849
llvm-svn: 372768
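Since the type is narrow, the fold can be verified exhaustively at i8 (a verification sketch of the summary's pattern; all arithmetic is modulo 256):

```
#include <cassert>
#include <cstdint>

int main() {
  for (unsigned c = 1; c <= 0xFF; ++c)            // every constant C != 0
    for (unsigned base = 0; base <= 0xFF; ++base) {
      uint8_t adjusted = (uint8_t)(base + c);
      bool lhs = (adjusted != 0) && (adjusted < (uint8_t)base);
      bool rhs = (uint8_t)base > (uint8_t)(0u - c);  // base u> -C
      assert(lhs == rhs);
    }
}
```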
---
Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.
This pattern isn't exactly what we get there
(strict vs. non-strict predicate), but this pattern does not
require known-bits analysis, so it is best to handle it first.
```
Name: 0
%adjusted = add i8 %base, %offset
%not_null = icmp ne i8 %adjusted, 0
%no_underflow = icmp ule i8 %adjusted, %base
%r = and i1 %not_null, %no_underflow
=>
%neg_offset = sub i8 0, %offset
%r = icmp ugt i8 %base, %neg_offset
```
https://rise4fun.com/Alive/knp
There are 3 other variants of this pattern;
they will all go into InstSimplify:
https://rise4fun.com/Alive/bIDZ
https://bugs.llvm.org/show_bug.cgi?id=43259
Reviewers: spatel, xbolva00, nikic
Reviewed By: spatel
Subscribers: hiraditya, majnemer, vsk, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67846
llvm-svn: 372767
---
Summary:
Fold
or(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X)
into
X s> Y ? -1 : X
https://rise4fun.com/Alive/d8Ab
clamp255 is a common operator in image processing; it can be implemented
in a shifty way as "(255 - X) >> 31 | X & 255". Folding the shift into a
select enables more optimization, e.g., vmin generation for the ARM target.
Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon
Reviewed By: lebedev.ri
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67800
llvm-svn: 372678
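The shifty clamp255 checks out against the obvious version (a sketch; assumes arithmetic right shift of negative signed ints, implementation-defined before C++20 but universal in practice):

```
#include <cassert>
#include <cstdint>

int main() {
  for (int32_t x = 0; x <= 1000; ++x) {
    uint8_t shifty = (uint8_t)(((255 - x) >> 31) | (x & 255));
    uint8_t obvious = (uint8_t)(x > 255 ? 255 : x);
    assert(shifty == obvious);
  }
}
```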
---
Summary:
Fold
and(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X)
into
X s> Y ? X : 0
https://rise4fun.com/Alive/lFH
Folding the shift into a select enables more optimization,
e.g., vmax generation for the ARM target.
Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon
Reviewed By: lebedev.ri
Subscribers: xbolva00, andreadb, craig.topper, RKSimon, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67799
llvm-svn: 372676
---
Extracted from https://reviews.llvm.org/D67849#inline-610377
llvm-svn: 372654
---
Extracted from https://reviews.llvm.org/D67849#inline-610377
llvm-svn: 372653
---
"Implementations are free to malloc() a buffer containing either (size + 1) bytes or (strnlen(s, size) + 1) bytes. Applications should not assume that strndup() will allocate (size + 1) bytes when strlen(s) is smaller than size."
llvm-svn: 372647