bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[NFC][InstCombine] Tests for right-shift shift amount reassociation (w/ ↵	Roman Lebedev	2019-10-04	3	-34/+402
\| \| \| \| \| \| \| \|	trunc) (PR43564, PR42391) https://rise4fun.com/Alive/GEw llvm-svn: 373797
*	[InstCombine] add tests for fneg disguised as fmul; NFC	Sanjay Patel	2019-10-04	1	-0/+74
\| \| \| \|	llvm-svn: 373788
*	[FPEnv] Strict FP tests should use the requisite function attributes.	Kevin P. Neal	2019-10-04	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A set of function attributes is required in any function that uses constrained floating point intrinsics. None of our tests use these attributes. This patch fixes this. These tests have been tested against the IR verifier changes in D68233. Reviewed by: andrew.w.kaylor, cameron.mcinally, uweigand Approved by: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D67925 llvm-svn: 373761
*	[NFC][InstCombine] Some tests for sub-of-negatible pattern	Roman Lebedev	2019-10-03	1	-0/+292
\| \| \| \| \| \| \| \| \| \| \|	As we have previously estabilished, `sub` is an outcast, and should be considered non-canonical iff it can be converted to `add`. It can be converted to `add` if it's second operand can be negated. So far we mostly only do that for constants and negation itself, but we should be more through. llvm-svn: 373597
*	[InstCombine] Bypass high bit extract before variable sign-extension (PR43523)	Roman Lebedev	2019-10-02	1	-26/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	https://rise4fun.com/Alive/8BY - valid for lshr+trunc+variable sext https://rise4fun.com/Alive/7jk - the variable sext can be redundant https://rise4fun.com/Alive/Qslu - 'exact'-ness of first shift can be preserver https://rise4fun.com/Alive/IF63 - without trunc we could view this as more general "drop redundant mask before right-shift", but let's handle it here for now https://rise4fun.com/Alive/iip - likewise, without trunc, variable sext can be redundant. There's more patterns for sure - e.g. we can have 'lshr' as the final shift, but that might be best handled by some more generic transform, e.g. "drop redundant masking before right-shift" (PR42456) I'm singling-out this sext patch because you can only extract high bits with `shr` (unlike abstract bit masking), and i know* this fold is wanted by existing code. I don't believe there is much to review here, so i'm gonna opt into post-review mode here. https://bugs.llvm.org/show_bug.cgi?id=43523 llvm-svn: 373542
*	[NFC][InstCombine] Add tests for 'variable sext of variable high bit ↵	Roman Lebedev	2019-10-02	1	-0/+584
\| \| \| \| \| \| \| \|	extract' pattern (PR43523) https://bugs.llvm.org/show_bug.cgi?id=43523 llvm-svn: 373541
*	[InstCombine] Transform bcopy to memmove	David Bolvansky	2019-10-02	1	-0/+25
\| \| \| \| \| \| \|	bcopy is still widely used mainly for network apps. Sadly, LLVM has no optimizations for bcopy, but there are some for memmove. Since bcopy == memmove, it is profitable to transform bcopy to memmove and use current optimizations for memmove for free here. llvm-svn: 373537
*	[InstCombine] Precommit tests for D68265	Florian Hahn	2019-10-02	1	-2/+204
\| \| \| \|	llvm-svn: 373458
*	[InstCombine] Deal with -(trunc(X >>u 63)) -> trunc(X >>s 63)	Roman Lebedev	2019-10-01	1	-17/+12
\| \| \| \| \| \| \| \|	Identical to it's trunc-less variant, just pretent-to hoist trunc, and everything else still holds: https://rise4fun.com/Alive/JRU llvm-svn: 373364
*	[InstCombine] Preserve 'exact' in -(X >>u 31) -> (X >>s 31) fold	Roman Lebedev	2019-10-01	1	-2/+2
\| \| \| \| \| \|	https://rise4fun.com/Alive/yR4 llvm-svn: 373363
*	[NFC][InstCombine] (Better) tests for sign-bit-smearing pattern	Roman Lebedev	2019-10-01	2	-0/+279
\| \| \| \| \| \| \|	https://rise4fun.com/Alive/JRU https://rise4fun.com/Alive/yR4 <- we can preserve 'exact' llvm-svn: 373362
*	Revert [InstCombine] sprintf(dest, "%s", str) -> memccpy(dest, str, 0, MAX)	David Bolvansky	2019-10-01	2	-9/+8
\| \| \| \| \| \|	Seems to be slower than memcpy + strlen. llvm-svn: 373335
*	[InstCombine] sprintf(dest, "%s", str) -> memccpy(dest, str, 0, MAX)	David Bolvansky	2019-10-01	2	-8/+9
\| \| \| \|	llvm-svn: 373333
*	[InstCombine] Expand the simplification of log()	Evandro Menezes	2019-09-30	1	-34/+28
\| \| \| \| \| \| \| \| \|	Expand the simplification of special cases of `log()` to include `log2()` and `log10()` as well as intrinsics and more types. Differential revision: https://reviews.llvm.org/D67199 llvm-svn: 373261
*	[NFC][InstCombine] Redundant-left-shift-input-masking: add some more undef tests	Roman Lebedev	2019-09-30	5	-0/+115
\| \| \| \|	llvm-svn: 373248
*	[InstCombine] fold negate disguised as select+mul	Sanjay Patel	2019-09-30	1	-8/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Name: negate if true %sel = select i1 %cond, i32 -1, i32 1 %r = mul i32 %sel, %x => %m = sub i32 0, %x %r = select i1 %cond, i32 %m, i32 %x Name: negate if false %sel = select i1 %cond, i32 1, i32 -1 %r = mul i32 %sel, %x => %m = sub i32 0, %x %r = select i1 %cond, i32 %x, i32 %m https://rise4fun.com/Alive/Nlh llvm-svn: 373230
*	[InstCombine] add tests for negate disguised as mul; NFC	Sanjay Patel	2019-09-30	1	-0/+74
\| \| \| \|	llvm-svn: 373222
*	[InstCombine] Simplify shift-by-sext to shift-by-zext	Roman Lebedev	2019-09-27	2	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is valid for any `sext` bitwidth pair: ``` Processing /tmp/opt.ll.. ---------------------------------------- %signed = sext %y %r = shl %x, %signed ret %r => %unsigned = zext %y %r = shl %x, %unsigned ret %r %signed = sext %y Done: 2016 Optimization is correct! ``` (This isn't so for funnel shifts, there it's illegal for e.g. i6->i7.) Main motivation is the C++ semantics: ``` int shl(int a, char b) { return a << b; } ``` ends as ``` %3 = sext i8 %1 to i32 %4 = shl i32 %0, %3 ``` https://godbolt.org/z/0jgqUq which is, as this shows, too pessimistic. There is another problem here - we can only do the fold if sext is one-use. But we can trivially have cases where several shifts have the same sext shift amount. This should be resolved, later. Reviewers: spatel, nikic, RKSimon Reviewed By: spatel Subscribers: efriedma, hiraditya, nlopes, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68103 llvm-svn: 373106
*	[NFC][InstCombine] Revisit shift-by-signext tests	Roman Lebedev	2019-09-27	1	-18/+86
\| \| \| \|	llvm-svn: 373055
*	[InstCombine][NFC] Add tests for shift-by-signext	Roman Lebedev	2019-09-26	1	-0/+105
\| \| \| \|	llvm-svn: 373013
*	[InstCombine][NFC] Regenerate load-cmp.ll test	Roman Lebedev	2019-09-26	1	-25/+25
\| \| \| \|	llvm-svn: 373012
*	[NFC] Precommit tests for D68089	David Bolvansky	2019-09-26	1	-0/+79
\| \| \| \|	llvm-svn: 373006
*	[InstCombine] Use m_Zero instead of isNullValue() when checking if a GEP ↵	Craig Topper	2019-09-26	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \|	index is all zeroes to prevent an infinite loop. The test case here previously infinite looped. Only one element from the GEP is used so SimplifyDemandedVectorElts would replace the other lanes in each index with undef leading to the first index being <0, undef, undef, undef>. But there's a GEP transform that tries to replace an index into a 0 sized type with a zero index. But the zero index check only works on ConstantInt 0 or ConstantAggregateZero so it would turn the index back to zeroinitializer. Resulting in a loop. The fix is to use m_Zero() to allow a vector of zeroes and undefs. Differential Revision: https://reviews.llvm.org/D67977 llvm-svn: 373000
*	[InstCombine] Don't assume CmpInst has been visited in ↵	Bjorn Pettersson	2019-09-26	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	getFlippedStrictnessPredicateAndConstant Summary: Removing an assumption (assert) that the CmpInst already has been simplified in getFlippedStrictnessPredicateAndConstant. Solution is to simply bail out instead of hitting the assertion. Instead we assume that any profitable rewrite will happen in the next iteration of InstCombine. The reason why we can't assume that the CmpInst already has been simplified is that the worklist does not guarantee such an ordering. Solves https://bugs.llvm.org/show_bug.cgi?id=43376 Reviewers: spatel, lebedev.ri Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68022 llvm-svn: 372972
*	[InstCombine] foldUnsignedUnderflowCheck(): one last pattern with 'sub' ↵	Roman Lebedev	2019-09-25	1	-8/+4
\| \| \| \| \| \| \| \|	(PR43251) https://rise4fun.com/Alive/0j9 llvm-svn: 372930
*	[NFC][InstCombine] Tests for 'base u<= offset && (base - offset) != 0' ↵	Roman Lebedev	2019-09-25	1	-0/+33
\| \| \| \| \| \|	pattern (PR43251) llvm-svn: 372929
*	[InstSimplify] Handle more 'A </>/>=/<= B &&/\|\| (A - B) !=/== 0' patterns ↵	Roman Lebedev	2019-09-25	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \|	(PR43251) https://rise4fun.com/Alive/sl9s https://rise4fun.com/Alive/2plN https://bugs.llvm.org/show_bug.cgi?id=43251 llvm-svn: 372928
*	[InstSimplify] Match 1.0 and 0.0 for both operands in SimplifyFMAMul	Florian Hahn	2019-09-25	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Because we do not constant fold multiplications in SimplifyFMAMul, we match 1.0 and 0.0 for both operands, as multiplying by them is guaranteed to produce an exact result (if it is allowed to do so). Note that it is not enough to just swap the operands to ensure a constant is on the RHS, as we want to also cover the case with 2 constants. Reviewers: lebedev.ri, spatel, reames, scanon Reviewed By: lebedev.ri, reames Differential Revision: https://reviews.llvm.org/D67553 llvm-svn: 372915
*	[InstCombine] Fold (A - B) u>=/u< A --> B u>/u<= A iff B != 0	Roman Lebedev	2019-09-25	2	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	https://rise4fun.com/Alive/KtL This also shows that the fold added in D67412 / r372257 was too specific, and the new fold allows those test cases to be handled more generically, therefore i delete now-dead code. This is yet again motivated by D67122 "[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour" llvm-svn: 372912
*	[NFC][InstCombine] Add tests for (X - Y) < X --> Y <= X iff Y != 0	Roman Lebedev	2019-09-25	1	-0/+111
\| \| \| \| \| \| \| \|	https://rise4fun.com/Alive/KtL This should go to InstCombiner::foldICmpBinO(), next to "Convert sub-with-unsigned-overflow comparisons into a comparison of args." llvm-svn: 372911
*	[InstCombine] Limit FMul constant folding for fma simplifications.	Florian Hahn	2019-09-25	1	-10/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As @reames pointed out post-commit, rL371518 adds additional rounding in some cases, when doing constant folding of the multiplication. This breaks a guarantee llvm.fma makes and must be avoided. This patch reapplies rL371518, but splits off the simplifications not requiring rounding from SimplifFMulInst as SimplifyFMAFMul. Reviewers: spatel, lebedev.ri, reames, scanon Reviewed By: reames Differential Revision: https://reviews.llvm.org/D67434 llvm-svn: 372899
*	[GCRelocate] Add a peephole to canonicalize base pointer relocation	Philip Reames	2019-09-24	1	-0/+11
\| \| \| \| \| \|	If we generate the gc.relocate, and then later prove two arguments to the statepoint are equivalent, we should canonicalize the gc.relocate to the form we would have produced if this had been known before rewriting. llvm-svn: 372771
*	[InstCombine] (a+b) < a && (a+b) != 0 -> (0-b) < a iff a/b != 0 (PR43259)	Roman Lebedev	2019-09-24	1	-34/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is again motivated by D67122 sanitizer check enhancement. That patch seemingly worsens `-fsanitize=pointer-overflow` overhead from 25% to 50%, which strongly implies missing folds. For ``` #include <cassert> char* test(char& base, signed long offset) { __builtin_assume(offset < 0); return &base + offset; } ``` We produce https://godbolt.org/z/r40U47 and again those two icmp's can be merged: ``` Name: 0 Pre: C != 0 %adjusted = add i8 %base, C %not_null = icmp ne i8 %adjusted, 0 %no_underflow = icmp ult i8 %adjusted, %base %r = and i1 %not_null, %no_underflow => %neg_offset = sub i8 0, C %r = icmp ugt i8 %base, %neg_offset ``` https://rise4fun.com/Alive/ALap https://rise4fun.com/Alive/slnN There are 3 other variants of this pattern, i believe they all will go into InstSimplify. https://bugs.llvm.org/show_bug.cgi?id=43259 Reviewers: spatel, xbolva00, nikic Reviewed By: spatel Subscribers: efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67849 llvm-svn: 372768
*	[InstCombine] (a+b) <= a && (a+b) != 0 -> (0-b) < a (PR43259)	Roman Lebedev	2019-09-24	1	-26/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is again motivated by D67122 sanitizer check enhancement. That patch seemingly worsens `-fsanitize=pointer-overflow` overhead from 25% to 50%, which strongly implies missing folds. This pattern isn't exactly what we get there (strict vs. non-strict predicate), but this pattern does not require known-bits analysis, so it is best to handle it first. ``` Name: 0 %adjusted = add i8 %base, %offset %not_null = icmp ne i8 %adjusted, 0 %no_underflow = icmp ule i8 %adjusted, %base %r = and i1 %not_null, %no_underflow => %neg_offset = sub i8 0, %offset %r = icmp ugt i8 %base, %neg_offset ``` https://rise4fun.com/Alive/knp There are 3 other variants of this pattern, they all will go into InstSimplify: https://rise4fun.com/Alive/bIDZ https://bugs.llvm.org/show_bug.cgi?id=43259 Reviewers: spatel, xbolva00, nikic Reviewed By: spatel Subscribers: hiraditya, majnemer, vsk, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67846 llvm-svn: 372767
*	[InstCombine] Fold a shifty implementation of clamp-to-allones.	Huihui Zhang	2019-09-24	1	-35/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fold or(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X) into X s> Y ? -1 : X https://rise4fun.com/Alive/d8Ab clamp255 is a common operator in image processing, can be implemented in a shifty way "(255 - X) >> 31 \| X & 255". Fold shift into select enables more optimization, e.g., vmin generation for ARM target. Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon Reviewed By: lebedev.ri Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67800 llvm-svn: 372678
*	[InstCombine] Fold a shifty implementation of clamp-to-zero.	Huihui Zhang	2019-09-24	1	-32/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fold and(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X) into X s> Y ? X : 0 https://rise4fun.com/Alive/lFH Fold shift into select enables more optimization, e.g., vmax generation for ARM target. Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon Reviewed By: lebedev.ri Subscribers: xbolva00, andreadb, craig.topper, RKSimon, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67799 llvm-svn: 372676
*	[NFC][InstCombine] Add tests for shifty implementation of clamping.	Huihui Zhang	2019-09-23	2	-0/+473
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Clamp negative to zero and clamp positive to allOnes are common operation in image saturation. Add tests for shifty implementation of clamping, as prepare work for folding: and(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X) --> X s> 0 ? X : 0; or(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X) --> X s> Y ? allOnes : X. Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67798 llvm-svn: 372671
*	[InstCombine] Annotate strndup calls with dereferenceable_or_null	David Bolvansky	2019-09-23	2	-2/+2
\| \| \| \| \| \|	"Implementations are free to malloc() a buffer containing either (size + 1) bytes or (strnlen(s, size) + 1) bytes. Applications should not assume that strndup() will allocate (size + 1) bytes when strlen(s) is smaller than size." llvm-svn: 372647
*	[SLC] Convert some strndup calls to strdup calls	David Bolvansky	2019-09-23	2	-7/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Motivation: - If we can fold it to strdup, we should (strndup does more things than strdup). - Annotation mechanism. (Works for strdup well). strdup and strndup are part of C 20 (currently posix fns), so we should optimize them. Reviewers: efriedma, jdoerfert Reviewed By: jdoerfert Subscribers: lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67679 llvm-svn: 372636
*	[InstCombine] dropRedundantMaskingOfLeftShiftInput(): pat. c/d/e with mask ↵	Roman Lebedev	2019-09-23	3	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR42563) Summary: If we have a pattern `(x & (-1 >> maskNbits)) << shiftNbits`, we already know (have a fold) that will drop the `& (-1 >> maskNbits)` mask iff `(shiftNbits-maskNbits) s>= 0` (i.e. `shiftNbits u>= maskNbits`). So even if `(shiftNbits-maskNbits) s< 0`, we can still fold, we will just need to apply a constant mask afterwards: ``` Name: c, normal+mask %t0 = lshr i32 -1, C1 %t1 = and i32 %t0, %x %r = shl i32 %t1, C2 => %n0 = shl i32 %x, C2 %n1 = i32 ((-(C2-C1))+32) %n2 = zext i32 %n1 to i64 %n3 = lshr i64 -1, %n2 %n4 = trunc i64 %n3 to i32 %r = and i32 %n0, %n4 ``` https://rise4fun.com/Alive/gslRa Naturally, old `%masked` will have to be one-use. This is not valid for pattern f - where "masking" is done via `ashr`. https://bugs.llvm.org/show_bug.cgi?id=42563 Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67725 llvm-svn: 372630
*	[InstCombine] dropRedundantMaskingOfLeftShiftInput(): pat. a/b with mask ↵	Roman Lebedev	2019-09-23	2	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR42563) Summary: And this is finally the interesting part of that fold! If we have a pattern `(x & (~(-1 << maskNbits))) << shiftNbits`, we already know (have a fold) that will drop the `& (~(-1 << maskNbits))` mask iff `(maskNbits+shiftNbits) u>= bitwidth(x)`. But that is actually ignorant, there's more general fold here: In this pattern, `(maskNbits+shiftNbits)` actually correlates with the number of low bits that will remain in the final value. So even if `(maskNbits+shiftNbits) u< bitwidth(x)`, we can still fold, we will just need to apply a constant mask afterwards: ``` Name: a, normal+mask %onebit = shl i32 -1, C1 %mask = xor i32 %onebit, -1 %masked = and i32 %mask, %x %r = shl i32 %masked, C2 => %n0 = shl i32 %x, C2 %n1 = add i32 C1, C2 %n2 = zext i32 %n1 to i64 %n3 = shl i64 -1, %n2 %n4 = xor i64 %n3, -1 %n5 = trunc i64 %n4 to i32 %r = and i32 %n0, %n5 ``` https://rise4fun.com/Alive/F5R Naturally, old `%masked` will have to be one-use. Similar fold exists for patterns c,d,e, will post patch later. https://bugs.llvm.org/show_bug.cgi?id=42563 Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67677 llvm-svn: 372629
*	[InstCombine] allow icmp+binop folds before min/max bailout (PR43310)	Sanjay Patel	2019-09-22	2	-14/+8
\| \| \| \| \| \| \| \| \|	This has the potential to uncover missed analysis/folds as shown in the min/max code comment/test, but fewer restrictions on icmp folds should be better in general to solve cases like: https://bugs.llvm.org/show_bug.cgi?id=43310 llvm-svn: 372510
*	[InstCombine] add tests for icmp fold hindered by min/max; NFC	Sanjay Patel	2019-09-22	1	-0/+36
\| \| \| \|	llvm-svn: 372509
*	[NFC][InstCombine] Fixup newly-added tests	Roman Lebedev	2019-09-20	2	-8/+8
\| \| \| \|	llvm-svn: 372413
*	[InstCombine] Tests for (a+b)<=a && (a+b)!=0 fold (PR43259)	Roman Lebedev	2019-09-20	2	-4/+189
\| \| \| \| \| \| \|	https://rise4fun.com/Alive/knp https://rise4fun.com/Alive/ALap llvm-svn: 372402
*	[InstCombine] Simplify @llvm.usub.with.overflow+non-zero check (PR43251)	Roman Lebedev	2019-09-19	1	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is again motivated by D67122 sanitizer check enhancement. That patch seemingly worsens `-fsanitize=pointer-overflow` overhead from 25% to 50%, which strongly implies missing folds. In this particular case, given ``` char* test(char& base, unsigned long offset) { return &base - offset; } ``` it will end up producing something like https://godbolt.org/z/luGEju which after optimizations reduces down to roughly ``` declare void @use64(i64) define i1 @test(i8* dereferenceable(1) %base, i64 %offset) { %base_int = ptrtoint i8* %base to i64 %adjusted = sub i64 %base_int, %offset call void @use64(i64 %adjusted) %not_null = icmp ne i64 %adjusted, 0 %no_underflow = icmp ule i64 %adjusted, %base_int %no_underflow_and_not_null = and i1 %not_null, %no_underflow ret i1 %no_underflow_and_not_null } ``` Without D67122 there was no `%not_null`, and in this particular case we can "get rid of it", by merging two checks: Here we are checking: `Base u>= Offset && (Base u- Offset) != 0`, but that is simply `Base u> Offset` Alive proofs: https://rise4fun.com/Alive/QOs The `@llvm.usub.with.overflow` pattern itself is not handled here because this is the main pattern, that we currently consider canonical. https://bugs.llvm.org/show_bug.cgi?id=43251 Reviewers: spatel, nikic, xbolva00, majnemer Reviewed By: xbolva00, majnemer Subscribers: vsk, majnemer, xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67356 llvm-svn: 372341
*	[InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251)	Roman Lebedev	2019-09-18	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I don't have a direct motivational case for this, but it would be good to have this for completeness/symmetry. This pattern is basically the motivational pattern from https://bugs.llvm.org/show_bug.cgi?id=43251 but with different predicate that requires that the offset is non-zero. The completeness bit comes from the fact that a similar pattern (offset != zero) will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259, so it'd seem to be good to not overlook very similar patterns.. Proofs: https://rise4fun.com/Alive/21b Also, there is something odd with `isKnownNonZero()`, if the non-zero knowledge was specified as an assumption, it didn't pick it up (PR43267) With this, i see no other missing folds for https://bugs.llvm.org/show_bug.cgi?id=43251 Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67412 llvm-svn: 372257
*	[NFC][InstCombine] More tests for PR42563 "Dropping pointless masking before ↵	Roman Lebedev	2019-09-18	7	-67/+313
\| \| \| \| \| \| \| \| \| \|	left shift" For patterns c/d/e we too can deal with the pattern even if we can't just drop the mask, we can just apply it afterwars: https://rise4fun.com/Alive/gslRa llvm-svn: 372244
*	[SimplifyLibCalls] fix crash with empty function name (PR43347)	Sanjay Patel	2019-09-18	1	-0/+12
\| \| \| \| \| \| \| \|	...and improve some variable names while here. https://bugs.llvm.org/show_bug.cgi?id=43347 llvm-svn: 372227
*	[NFC][InstCombine] More tests for "Dropping pointless masking before left ↵	Roman Lebedev	2019-09-17	4	-50/+260
\| \| \| \| \| \| \| \| \| \| \| \| \|	shift" (PR42563) While we already fold that pattern if the sum of shift amounts is not smaller than bitwidth, there's painfully obvious generalization: https://rise4fun.com/Alive/F5R I.e. the "sub of shift amounts" tells us how many bits will be left in the output. If it's less than bitwidth, we simply need to apply a mask, which is constant. llvm-svn: 372170