bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[InstCombine][AMDGPU] Simplify tbuffer loads	Piotr Sobczak	2019-08-30	1	-0/+3
	Summary: Add missing tbuffer load intrinsics in SimplifyDemandedVectorElts.
	Reviewers: arsenm, nhaehnle
	Reviewed By: arsenm
	Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66926
	llvm-svn: 370475
*	[InstCombine] reduce duplicated code; NFC	Sanjay Patel	2019-08-29	1	-10/+13
	llvm-svn: 370399
*	[InstCombine] Fold '((%x * %y) u/ %x) != %y' to '@llvm.umul.with.overflow' + ↵	Roman Lebedev	2019-08-29	2	-5/+30
	overflow bit extraction
	Summary: `((%x * %y) u/ %x) != %y` is one of (3?) common ways to check that some unsigned multiplication (will not) overflow. Currently, we don't catch it. We could:
	```
	$ /repositories/alive2/build-Clang-unknown/alive -root-only ~/llvm-patch1.ll
	Processing /home/lebedevri/llvm-patch1.ll..
	----------------------------------------
	Name: no overflow
	  %o0 = mul i4 %y, %x
	  %o1 = udiv i4 %o0, %x
	  %r = icmp ne i4 %o1, %y
	  ret i1 %r
	=>
	  %n0 = umul_overflow i4 %x, %y
	  %o0 = extractvalue {i4, i1} %n0, 0
	  %o1 = udiv %o0, %x
	  %r = extractvalue {i4, i1} %n0, 1
	  ret %r
	Done: 1
	Optimization is correct!
	----------------------------------------
	Name: no overflow
	  %o0 = mul i4 %y, %x
	  %o1 = udiv i4 %o0, %x
	  %r = icmp eq i4 %o1, %y
	  ret i1 %r
	=>
	  %n0 = umul_overflow i4 %x, %y
	  %o0 = extractvalue {i4, i1} %n0, 0
	  %o1 = udiv %o0, %x
	  %n1 = extractvalue {i4, i1} %n0, 1
	  %r = xor %n1, -1
	  ret i1 %r
	Done: 1
	Optimization is correct!
	```
	Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon
	Reviewed By: nikic
	Subscribers: hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D65144
	llvm-svn: 370348
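The equivalence claimed in the message above can be checked by brute force. The following is only a sketch: it models the `i4` type from the Alive proof with masked 32-bit arithmetic, and the names `umulOverflows`, `divisionIdiom` and `divisionIdiomMatchesOverflowBit` are hypothetical helpers, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Toy bit width, matching the i4 type used in the Alive proofs.
constexpr uint32_t kBits = 4;
constexpr uint32_t kMask = (1u << kBits) - 1;

// Reference answer: does x * y overflow a kBits-wide unsigned integer?
bool umulOverflows(uint32_t x, uint32_t y) {
  return x * y > kMask;
}

// Source idiom ((x * y) u/ x) != y, with the multiply wrapping at kBits.
bool divisionIdiom(uint32_t x, uint32_t y) {
  assert(x != 0 && x <= kMask && y <= kMask);  // udiv by zero is undefined
  uint32_t wrapped = (x * y) & kMask;
  return wrapped / x != y;
}

// Exhaustively compare both forms for every non-zero x and every y.
bool divisionIdiomMatchesOverflowBit() {
  for (uint32_t x = 1; x <= kMask; ++x)
    for (uint32_t y = 0; y <= kMask; ++y)
      if (divisionIdiom(x, y) != umulOverflows(x, y))
        return false;
  return true;
}
```

Calling `divisionIdiomMatchesOverflowBit()` walks all 15 x 16 input pairs; the `eq` variant in the second proof is simply the negation of `divisionIdiom`.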
*	[InstCombine] Fold '(-1 u/ %x) u< %y' to '@llvm.umul.with.overflow' + ↵	Roman Lebedev	2019-08-29	1	-0/+47
	overflow bit extraction
	Summary: `(-1 u/ %x) u< %y` is one of (3?) common ways to check that some unsigned multiplication (will not) overflow. Currently, we don't catch it. We could:
	```
	----------------------------------------
	Name: no overflow
	  %o0 = udiv i4 -1, %x
	  %r = icmp ult i4 %o0, %y
	=>
	  %o0 = udiv i4 -1, %x
	  %n0 = umul_overflow i4 %x, %y
	  %r = extractvalue {i4, i1} %n0, 1
	Done: 1
	Optimization is correct!
	----------------------------------------
	Name: no overflow, swapped
	  %o0 = udiv i4 -1, %x
	  %r = icmp ugt i4 %y, %o0
	=>
	  %o0 = udiv i4 -1, %x
	  %n0 = umul_overflow i4 %x, %y
	  %r = extractvalue {i4, i1} %n0, 1
	Done: 1
	Optimization is correct!
	----------------------------------------
	Name: overflow
	  %o0 = udiv i4 -1, %x
	  %r = icmp uge i4 %o0, %y
	=>
	  %o0 = udiv i4 -1, %x
	  %n0 = umul_overflow i4 %x, %y
	  %n1 = extractvalue {i4, i1} %n0, 1
	  %r = xor %n1, -1
	Done: 1
	Optimization is correct!
	----------------------------------------
	Name: overflow
	  %o0 = udiv i4 -1, %x
	  %r = icmp ule i4 %y, %o0
	=>
	  %o0 = udiv i4 -1, %x
	  %n0 = umul_overflow i4 %x, %y
	  %n1 = extractvalue {i4, i1} %n0, 1
	  %r = xor %n1, -1
	Done: 1
	Optimization is correct!
	```
	As can be observed from the tests, while simply forming the `@llvm.umul.with.overflow` is easy, if we were looking for the inverted answer, then more work needs to be done to clean up the now-pointless control flow that was guarding against division by zero. This is being addressed in follow-up patches.
	Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon
	Reviewed By: nikic, xbolva00
	Subscribers: hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D65143
	llvm-svn: 370347
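As a rough illustration of why `(-1 u/ %x) u< %y` is an overflow check, the sketch below compares the idiom against a 64-bit wide multiply for 32-bit operands. The function names and the sample values are assumptions made for this example; nothing here is taken from the patch itself.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

constexpr uint32_t kU32Max = std::numeric_limits<uint32_t>::max();

// Reference: compute the product in 64 bits and see whether it fits in 32.
bool umulOverflows32(uint32_t x, uint32_t y) {
  return static_cast<uint64_t>(x) * y > kU32Max;
}

// Source idiom (-1 u/ x) u< y: y exceeds the largest safe multiplier for x.
bool maxDivIdiom(uint32_t x, uint32_t y) {
  assert(x != 0);  // the udiv is undefined for x == 0
  return kU32Max / x < y;
}

// Compare both forms on boundary-heavy sample values (not exhaustive).
bool maxDivIdiomAgreesOnSamples() {
  const uint32_t samples[] = {1u, 2u, 3u, 65535u, 65536u, 65537u,
                              0x7FFFFFFFu, 0x80000000u, kU32Max};
  for (uint32_t x : samples)
    for (uint32_t y : samples)
      if (maxDivIdiom(x, y) != umulOverflows32(x, y))
        return false;
  return true;
}
```

The inverted forms in the proofs (`uge` / `ule`) correspond to `!maxDivIdiom(x, y)`, which is where the extra `xor %n1, -1` comes from.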
*	[InstCombine] Shift amount reassociation in bittest: trunc-of-lshr (PR42399)	Roman Lebedev	2019-08-29	1	-14/+58
	Summary: Finally, the fold I was looking forward to :)
	The legality check is muddy, I doubt I've grokked the full generalization, but it handles all the cases I care about and can come up with: https://rise4fun.com/Alive/26j
	I.e. we can perform the fold if any of the following is true:
	* The shift amount is either zero or one less than the widest bit width
	* Either of the values being shifted has at most the lowest bit set
	* The value that is being shifted by `shl` (which is not truncated) should have no fewer leading zeros than the total shift amount
	* The value that is being shifted by `lshr` (which is truncated) should have no fewer leading zeros than the widest bit width minus total shift amount minus one
	I strongly suspect there is some better generalization, but I'm not aware of it as of right now. For now I also avoided using actual `computeKnownBits()`, but restricted it to constants.
	Reviewers: spatel, nikic, xbolva00
	Reviewed By: spatel
	Subscribers: hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66383
	llvm-svn: 370324
*	Fix variable set but not used warning on NDEBUG builds. NFCI.	Simon Pilgrim	2019-08-29	1	-2/+2
	llvm-svn: 370317
*	[InstCombine] clean up wrap propagation for reassociated ops; NFCI	Sanjay Patel	2019-08-28	1	-11/+15
	Always true/false checks were flagged by static analysis; https://bugs.llvm.org/show_bug.cgi?id=43143
	I have not confirmed the logic difference in propagating nsw vs. nuw, but presumably we would have noticed a bug by now if that was wrong.
	llvm-svn: 370248
*	Reduce scope of variable only used in a local pattern match. NFCI.	Simon Pilgrim	2019-08-28	1	-1/+1
	llvm-svn: 370224
*	[InstCombine] Disable recursion in foldGEPICmp for vector pointer GEPs	Craig Topper	2019-08-28	1	-2/+4
	Due to missing vector support in this function, recursion can generate worse code in some cases.
	llvm-svn: 370221
*	Fix uninitialized variable warning in cppcheck. NFCI.	Simon Pilgrim	2019-08-28	1	-1/+1
	InstCombiner::MaxArraySizeForCombine is set outside the constructor so we need to ensure it has a default initialization value.
	llvm-svn: 370220
*	[NFC] Added a comment to avoid possible confusion	David Bolvansky	2019-08-28	1	-0/+2
	llvm-svn: 370217
*	Remove duplicate 'BitWidth' variable. NFCI.	Simon Pilgrim	2019-08-28	1	-1/+0
	llvm-svn: 370212
*	InstCombiner::visitSelectInst - rename Pred to MinMaxPred to stop shadow ↵	Simon Pilgrim	2019-08-28	1	-5/+6
	variable warning. NFCI.
	We have a lot of Predicate variables, all similarly named....
	llvm-svn: 370207
*	Annotate return values of allocation functions with dereferenceable_or_null	David Bolvansky	2019-08-28	1	-0/+33
	Summary: Example
	define dso_local noalias i8* @_Z6maixxnv() local_unnamed_addr #0 {
	entry:
	  %call = tail call noalias dereferenceable_or_null(64) i8* @malloc(i64 64) #6
	  ret i8* %call
	}
	Reviewers: jdoerfert
	Reviewed By: jdoerfert
	Subscribers: aaron.ballman, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66651
	llvm-svn: 370168
*	[InstCombine] Disable some portions of foldGEPICmp for GEPs that return a ↵	Craig Topper	2019-08-27	1	-11/+26
	vector of pointers. Fix other portions.
	llvm-svn: 370114
*	[InstCombine] Fold select with ctlz to cttz	David Bolvansky	2019-08-27	1	-0/+38
	Summary: Handle pattern [0]:
	int ctz(unsigned int a)
	{
	  int c = __clz(a & -a);
	  return a ? 31 - c : c;
	}
	In reality, the compiler can generate much better code for cttz, so fold away this pattern.
	https://godbolt.org/z/c5kPtV
	[0] https://community.arm.com/community-help/f/discussions/2114/count-trailing-zeros
	Reviewers: spatel, nikic, lebedev.ri, dmgreen, hfinkel
	Reviewed By: hfinkel
	Subscribers: hfinkel, javed.absar, kristof.beyls, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66308
	llvm-svn: 370037
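To see why the `ctz` pattern in this message really is a count-trailing-zeros, here is a small portable sketch. `clz32` and `ctz32` are hypothetical loop-based stand-ins written for this example (`clz32` is assumed to return 32 for zero, as ARM's `__clz` does); they are not the intrinsics the patch deals with.

```cpp
#include <cstdint>

// Portable count-leading-zeros; returns 32 for zero, like ARM's __clz.
int clz32(uint32_t a) {
  int n = 0;
  for (uint32_t bit = 0x80000000u; bit != 0 && (a & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Portable count-trailing-zeros; returns 32 for zero.
int ctz32(uint32_t a) {
  if (a == 0)
    return 32;
  int n = 0;
  while ((a & 1u) == 0) {
    a >>= 1;
    ++n;
  }
  return n;
}

// The source pattern from the commit message, using the helpers above.
int ctzViaClz(uint32_t a) {
  int c = clz32(a & (0u - a));  // a & -a isolates the lowest set bit
  return a ? 31 - c : c;
}

// Compare both forms on every single-bit value, on zero, and on some
// multi-bit values derived from each single-bit value.
bool ctzFormsAgree() {
  if (ctzViaClz(0) != ctz32(0))
    return false;
  for (int i = 0; i < 32; ++i) {
    uint32_t single = 1u << i;
    uint32_t mixed = single | 0x80000000u | (single << 1);
    if (ctzViaClz(single) != ctz32(single) || ctzViaClz(mixed) != ctz32(mixed))
      return false;
  }
  return true;
}
```

The select on `a` exists only to handle the zero input, which is why the whole expression can collapse to a single cttz.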
*	msan, codegen, instcombine: Keep more lifetime markers used for msan	Vitaly Buka	2019-08-26	1	-0/+1
	Reviewers: eugenis
	Subscribers: hiraditya, cfe-commits, #sanitizers, llvm-commits
	Tags: #clang, #sanitizers, #llvm
	Differential Revision: https://reviews.llvm.org/D66695
	llvm-svn: 369979
*	[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null ↵	Philip Reames	2019-08-26	1	-1/+6
	for vectors
	Extend the transform introduced in https://reviews.llvm.org/D66608 to work for vector geps as well.
	Differential Revision: https://reviews.llvm.org/D66671
	llvm-svn: 369949
*	[Constant] Add 'isElementWiseEqual()' method	Roman Lebedev	2019-08-24	1	-16/+2
	Promoting it from InstCombine's tryToReuseConstantFromSelectInComparison().
	Return true if this constant and a constant 'Y' are element-wise equal. This is identical to just comparing the pointers, with the exception that for vectors, if only one of the constants has an `undef` element in some lane, the constants still match.
	llvm-svn: 369842
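The matching rule described above can be pictured with a toy model. This is not the LLVM implementation: `ToyVectorConstant` and `toyElementWiseEqual` are hypothetical names, integer lanes stand in for constants, and `std::nullopt` stands in for an `undef` lane.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Toy stand-in for a vector constant: std::nullopt models an `undef` lane.
using ToyVectorConstant = std::vector<std::optional<int>>;

// Lanes match when they hold equal values, or when at least one side is
// undef in that lane. Vectors of different length never match.
bool toyElementWiseEqual(const ToyVectorConstant &a,
                         const ToyVectorConstant &b) {
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (!a[i].has_value() || !b[i].has_value())
      continue;  // an undef lane is compatible with anything
    if (*a[i] != *b[i])
      return false;
  }
  return true;
}
```

For instance, `{1, undef}` and `{1, 2}` match under this model, while `{1, 3}` and `{1, 2}` do not.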
*	[InstCombine] matchThreeWayIntCompare(): commutativity awareness	Roman Lebedev	2019-08-24	1	-13/+42
	Summary: `matchThreeWayIntCompare()` looks for
	```
	select i1 (a == b),
	       i32 Equal,
	       i32 (select i1 (a < b), i32 Less, i32 Greater)
	```
	but both of these selects/compares can be in their commuted form, so out of 8 variants, only the two most basic ones are handled. This fixes a regression introduced in D66232.
	Reviewers: spatel, nikic, efriedma, xbolva00
	Reviewed By: spatel
	Subscribers: hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66607
	llvm-svn: 369841
*	[InstCombine] Try to reuse constant from select in leading comparison	Roman Lebedev	2019-08-24	2	-12/+112
	Summary: If we have e.g.:
	```
	%t = icmp ult i32 %x, 65536
	%r = select i1 %t, i32 %y, i32 65535
	```
	the constants `65535` and `65536` are suspiciously close. We could perform a transformation to deduplicate them:
	```
	Name: ult
	%t = icmp ult i32 %x, 65536
	%r = select i1 %t, i32 %y, i32 65535
	=>
	%t.inv = icmp ugt i32 %x, 65535
	%r = select i1 %t.inv, i32 65535, i32 %y
	```
	https://rise4fun.com/Alive/avb
	While this may seem esoteric, this should certainly be good for vectors (less constant pool usage) and for opt-for-size - need to have only one constant.
	But the real fun part here is that it allows further transformation, in particular it finishes cleaning up the `clamp` folding, see e.g. `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`. We start with e.g.
	```
	%dont_need_to_clamp_positive = icmp sle i32 %X, 32767
	%dont_need_to_clamp_negative = icmp sge i32 %X, -32768
	%clamp_limit = select i1 %dont_need_to_clamp_positive, i32 -32768, i32 32767
	%dont_need_to_clamp = and i1 %dont_need_to_clamp_positive, %dont_need_to_clamp_negative
	%R = select i1 %dont_need_to_clamp, i32 %X, i32 %clamp_limit
	```
	without this patch we currently produce
	```
	%1 = icmp slt i32 %X, 32768
	%2 = icmp sgt i32 %X, -32768
	%3 = select i1 %2, i32 %X, i32 -32768
	%R = select i1 %1, i32 %3, i32 32767
	```
	which isn't really a `clamp` - both comparisons are performed on the original value, this patch changes it into
	```
	%1.inv = icmp sgt i32 %X, 32767
	%2 = icmp sgt i32 %X, -32768
	%3 = select i1 %2, i32 %X, i32 -32768
	%R = select i1 %1.inv, i32 32767, i32 %3
	```
	and then the magic happens! Some further transform finishes polishing it and we finally get:
	```
	%t1 = icmp sgt i32 %X, -32768
	%t2 = select i1 %t1, i32 %X, i32 -32768
	%t3 = icmp slt i32 %t2, 32767
	%R = select i1 %t3, i32 %t2, i32 32767
	```
	which is beautiful and just what we want.
	Proofs for `getFlippedStrictnessPredicateAndConstant()` for de-canonicalization: https://rise4fun.com/Alive/THl
	Proofs for the fold itself: https://rise4fun.com/Alive/THl
	Reviewers: spatel, dmgreen, nikic, xbolva00
	Reviewed By: spatel
	Subscribers: hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66232
	llvm-svn: 369840
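The two rewrites in this message can be spot-checked numerically. The sketch below mirrors the IR snippets as plain C++ selects; all function names are made up for the example, and the checks only sample values around the thresholds rather than proving anything.

```cpp
#include <cstdint>

// Before: icmp ult %x, 65536 ; select %t, %y, 65535
uint32_t selectBefore(uint32_t x, uint32_t y) { return x < 65536u ? y : 65535u; }

// After: icmp ugt %x, 65535 ; select %t.inv, 65535, %y (one constant only)
uint32_t selectAfter(uint32_t x, uint32_t y) { return x > 65535u ? 65535u : y; }

// Clamp as produced without the patch: both compares look at the original X.
int32_t clampBefore(int32_t X) {
  bool c1 = X < 32768;
  bool c2 = X > -32768;
  int32_t t3 = c2 ? X : -32768;
  return c1 ? t3 : 32767;
}

// Final canonical form: the second compare uses the already-clamped value.
int32_t clampFinal(int32_t X) {
  bool t1 = X > -32768;
  int32_t t2 = t1 ? X : -32768;
  bool t3 = t2 < 32767;
  return t3 ? t2 : 32767;
}

// Sample both pairs on a window that covers every threshold.
bool selectAndClampFormsAgree() {
  for (uint32_t x = 65000u; x < 66000u; ++x)
    if (selectBefore(x, 7u) != selectAfter(x, 7u))
      return false;
  for (int32_t X = -40000; X <= 40000; ++X)
    if (clampBefore(X) != clampFinal(X))
      return false;
  return true;
}
```

In `clampFinal` the result is literally `min(max(X, -32768), 32767)`, which is the shape later passes can recognize as a clamp.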
*	Fix a bug in just submitted rL369789	Philip Reames	2019-08-23	1	-1/+4
	Started implementing the vector case and realized the scalar case hadn't handled the GEP producing a different type than the base correctly. It's entertaining seeing what slips through review when we're focused on the 'hard' parts. :(
	Also adding an extra vector test as it happened to be in workspace and wasn't worth separating.
	llvm-svn: 369795
*	[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null	Philip Reames	2019-08-23	1	-0/+21
	This generalizes the isGEPKnownNonNull rule from ValueTracking to apply when we do not know if the base is non-null, and thus need to replace one condition with another.
	The core notion is that an inbounds GEP can only form null if the base pointer is null and the offset is zero. However, if the offset is non-zero, the "inbounds" marker makes the result poison. Thus, we're free to ignore the case where the offset is non-zero. Similarly, there's no case under which a non-null base can result in a null result without generating poison.
	Differential Revision: https://reviews.llvm.org/D66608
	llvm-svn: 369789
*	[instcombine] icmp eq/ne (sub C, Y), C -> icmp eq/ne Y, 0	Philip Reames	2019-08-21	1	-0/+5
	Noticed while looking at pr43028.
	llvm-svn: 369541
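The fold in the subject line, `icmp eq/ne (sub C, Y), C -> icmp eq/ne Y, 0`, holds under wrapping arithmetic because subtracting Y from C leaves C unchanged only when Y is zero. A minimal exhaustive check on 8-bit values (the helper name is hypothetical, and 8 bits is chosen only to keep the loop small):

```cpp
#include <cstdint>

// For every 8-bit C and Y, (C - Y) == C must hold exactly when Y == 0.
bool subCompareFoldHolds() {
  for (unsigned c = 0; c < 256; ++c)
    for (unsigned y = 0; y < 256; ++y) {
      uint8_t diff = static_cast<uint8_t>(c - y);  // wraps like `sub i8`
      if ((diff == c) != (y == 0))
        return false;
    }
  return true;
}
```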
*	[InstCombine] narrow icmp with extended operands of different widths	Sanjay Patel	2019-08-21	1	-6/+17
	An intermediate extend is used to widen the narrow operand to the width of the other (wider) operand. At that point, we have the same logic as the existing transform that was restricted to folds of equal width zext/sext.
	This mostly solves PR42700: https://bugs.llvm.org/show_bug.cgi?id=42700
	llvm-svn: 369519
*	[InstCombine] add helper function for icmp+zext/sext; NFC	Sanjay Patel	2019-08-20	1	-69/+75
	llvm-svn: 369421
*	[InstCombine] make fold for icmp with sext more efficient; NFC	Sanjay Patel	2019-08-20	1	-13/+7
	We were creating 2 instructions and relying on a subsequent fold to invert a not(icmp). Create the final icmp directly instead.
	llvm-svn: 369411
*	[InstCombine] improve readability for icmp with cast folds; NFC	Sanjay Patel	2019-08-20	2	-47/+42
	1. Update function name and stale code comments.
	2. Use variable names that are less ambiguous.
	3. Move operand checks into the function as early exits.
	llvm-svn: 369390
*	[InstCombine] simplify min/max of min/max with same operands (PR35607)	Sanjay Patel	2019-08-20	1	-0/+10
	This is the original integer variant requested in: https://bugs.llvm.org/show_bug.cgi?id=35607
	As noted in the TODO and several similar TODOs around this block, we could do this in instsimplify, but then it would cost more because we would be trying to match min/max via ValueTracking in 2 different places.
	There are 4 commuted variants for each of smin/smax/umin/umax that are not matched here. There are also icmp predicate variants that are not included in the affected test file because they are already handled by instsimplify by folding the final icmp to true/false.
	https://rise4fun.com/Alive/3KVc
	Name: smax(smax, smin)
	  %c1 = icmp slt i32 %x, %y
	  %c2 = icmp slt i32 %y, %x
	  %min = select i1 %c1, i32 %x, i32 %y
	  %max = select i1 %c2, i32 %x, i32 %y
	  %c3 = icmp sgt i32 %max, %min
	  %r = select i1 %c3, i32 %max, i32 %min
	=>
	  %r = %max
	Name: smin(smax, smin)
	  %c1 = icmp slt i32 %x, %y
	  %c2 = icmp slt i32 %y, %x
	  %min = select i1 %c1, i32 %x, i32 %y
	  %max = select i1 %c2, i32 %x, i32 %y
	  %c3 = icmp sgt i32 %max, %min
	  %r = select i1 %c3, i32 %min, i32 %max
	=>
	  %r = %min
	Name: umax(umax, umin)
	  %c1 = icmp ult i32 %x, %y
	  %c2 = icmp ult i32 %y, %x
	  %min = select i1 %c1, i32 %x, i32 %y
	  %max = select i1 %c2, i32 %x, i32 %y
	  %c3 = icmp ult i32 %min, %max
	  %r = select i1 %c3, i32 %max, i32 %min
	=>
	  %r = %max
	Name: umin(umax, umin)
	  %c1 = icmp ult i32 %x, %y
	  %c2 = icmp ult i32 %y, %x
	  %min = select i1 %c1, i32 %x, i32 %y
	  %max = select i1 %c2, i32 %x, i32 %y
	  %c3 = icmp ult i32 %min, %max
	  %r = select i1 %c3, i32 %min, i32 %max
	=>
	  %r = %min
	llvm-svn: 369386
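The signed proofs above translate almost line for line into C++ selects. The sketch below (hypothetical function names, 8-bit range chosen only so the check can be exhaustive) confirms that the outer min/max is redundant.

```cpp
// Mirrors "smax(smax, smin)": the outer select should always yield %max.
int smaxOfMaxMin(int x, int y) {
  bool c1 = x < y;
  bool c2 = y < x;
  int mn = c1 ? x : y;
  int mx = c2 ? x : y;
  bool c3 = mx > mn;
  return c3 ? mx : mn;
}

// Mirrors "smin(smax, smin)": the outer select should always yield %min.
int sminOfMaxMin(int x, int y) {
  bool c1 = x < y;
  bool c2 = y < x;
  int mn = c1 ? x : y;
  int mx = c2 ? x : y;
  bool c3 = mx > mn;
  return c3 ? mn : mx;
}

// Exhaustive check over the signed 8-bit range.
bool minMaxOfMinMaxFolds() {
  for (int x = -128; x <= 127; ++x)
    for (int y = -128; y <= 127; ++y) {
      int mx = y < x ? x : y;
      int mn = x < y ? x : y;
      if (smaxOfMaxMin(x, y) != mx || sminOfMaxMin(x, y) != mn)
        return false;
    }
  return true;
}
```

The `x == y` case is the only one where `%c3` is false, and there `%min` and `%max` are the same value, so either arm of the outer select is fine.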
*	[InstCombine] Cherry-pick NFC cleanups of ↵	Roman Lebedev	2019-08-18	1	-5/+8
	foldShiftIntoShiftInAnotherHandOfAndInICmp() from D66383
	llvm-svn: 369207
*	Revert r367891 - "[InstCombine] combine mul+shl separated by zext"	Sanjay Patel	2019-08-16	1	-13/+2
	This reverts commit 5dbb90bfe14ace30224239cac7c61a1422fa5144.
	As noted in the post-commit thread for r367891, this can create a multiply that is lowered to a libcall that may not exist.
	We need to improve the backend decomposition for integer multiply before trying to re-land this (if it's still worthwhile after doing the backend work).
	llvm-svn: 369174
*	[InstCombine] canonicalize a scalar-select-of-vectors to vector select	Sanjay Patel	2019-08-16	1	-0/+27
	This pattern may arise more frequently with an enhancement to SLP vectorization suggested in PR42755: https://bugs.llvm.org/show_bug.cgi?id=42755
	...but we should handle this pattern to make things easier for the backend either way.
	For all in-tree targets that I looked at, codegen for typical vector sizes looks better when we change to a vector select, so this is safe to do without a cost model (in other words, as a target-independent canonicalization).
	For example, if the condition of the select is a scalar, we end up with something like this on x86:
	  vpcmpgtd %xmm0, %xmm1, %xmm0
	  vpextrb $12, %xmm0, %eax
	  testb $1, %al
	  jne LBB0_2
	  ## %bb.1:
	  vmovaps %xmm3, %xmm2
	  LBB0_2:
	  vmovaps %xmm2, %xmm0
	Rather than the splat-condition variant:
	  vpcmpgtd %xmm0, %xmm1, %xmm0
	  vpshufd $255, %xmm0, %xmm0 ## xmm0 = xmm0[3,3,3,3]
	  vblendvps %xmm0, %xmm2, %xmm3, %xmm0
	Differential Revision: https://reviews.llvm.org/D66095
	llvm-svn: 369140
*	[InstCombine] Shift amount reassociation in bittest: trunc-of-shl (PR42399)	Roman Lebedev	2019-08-16	1	-19/+56
	Summary: This is a continuation of D63829 / https://bugs.llvm.org/show_bug.cgi?id=42399
	I thought the naive pattern would solve my issue, but nope, it involved truncation, thus more folds are needed.
	This isn't really the fold I'm interested in, I need trunc-of-lshr, but I've decided to start with `shl` because it's simpler. In this case, no extra legality checks are needed: https://rise4fun.com/Alive/CAb
	We should be careful about not increasing instruction count, since we need to produce `zext` because `and` is done in the wider type.
	Reviewers: spatel, nikic, xbolva00
	Reviewed By: spatel
	Subscribers: hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66057
	llvm-svn: 369117
*	[InstCombine] Refactor getFlippedStrictnessPredicateAndConstant() out of ↵	Roman Lebedev	2019-08-14	2	-32/+50
	canonicalizeCmpWithConstant(), NFCI
	I'd like to use it elsewhere, hopefully without reinventing the wheel. No functional change intended so far.
	llvm-svn: 368820
*	[InstCombine] Non-canonical clamp-like pattern handling	Roman Lebedev	2019-08-13	1	-0/+146
	Summary: Given a pattern like:
	```
	%old_cmp1 = icmp slt i32 %x, C2
	%old_replacement = select i1 %old_cmp1, i32 %target_low, i32 %target_high
	%old_x_offseted = add i32 %x, C1
	%old_cmp0 = icmp ult i32 %old_x_offseted, C0
	%r = select i1 %old_cmp0, i32 %x, i32 %old_replacement
	```
	it can be rewritten as a more canonical pattern:
	```
	%new_cmp1 = icmp slt i32 %x, -C1
	%new_cmp2 = icmp sge i32 %x, C0-C1
	%new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x
	%r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low
	```
	Iff `-C1 s<= C2 s<= C0-C1`
	Also, `ULT` predicate can also be `UGE`; or `UGT` iff `C0 != -1` (+invert result)
	Also, `SLT` predicate can also be `SGE`; or `SGT` iff `C2 != INT_MAX` (+invert result)
	If `C1 == 0`, then all 3 instructions must be one-use; else at most either `%old_cmp1` or `%old_x_offseted` can have extra uses.
	NOTE: if we could reuse `%old_cmp1` as one of the comparisons we'll have to build, this could be less limiting.
	So there are two icmp's, each one with 3 predicate variants, so there are 9 fold variants:
	|     | ULT | UGE | UGT |
	| SLT | https://rise4fun.com/Alive/yIJ | https://rise4fun.com/Alive/5BfN | https://rise4fun.com/Alive/INH |
	| SGE | https://rise4fun.com/Alive/hd8 | https://rise4fun.com/Alive/Abk | https://rise4fun.com/Alive/PlzS |
	| SGT | https://rise4fun.com/Alive/VYG | https://rise4fun.com/Alive/oMY | https://rise4fun.com/Alive/KrzC |
	{F9730206}
	This fold was brought up in https://reviews.llvm.org/D65148#1603922 by @dmgreen, and is needed to unblock that patch. This patch requires D65530.
	Reviewers: spatel, nikic, xbolva00, dmgreen
	Reviewed By: spatel
	Subscribers: hiraditya, llvm-commits, dmgreen
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D65765
	llvm-svn: 368687
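A concrete instance makes the `SLT`/`ULT` variant of this rewrite easier to follow. The sketch below is an illustration only: it uses 8-bit values instead of `i32` so every input can be tried, and the constants `kC0 = 48`, `kC1 = 16`, `kC2 = 0` are arbitrary picks that satisfy `-C1 s<= C2 s<= C0-C1`. None of the names come from the patch.

```cpp
#include <cstdint>

// Hypothetical constants with -C1 (-16) s<= C2 (0) s<= C0-C1 (32).
constexpr int kC0 = 48;
constexpr int kC1 = 16;
constexpr int kC2 = 0;

// Old, non-canonical form: offset-and-unsigned-compare range check plus a
// select that chooses which limit to use.
int oldClampLike(int8_t x, int low, int high) {
  bool cmp1 = x < kC2;                               // icmp slt %x, C2
  int replacement = cmp1 ? low : high;
  uint8_t offseted = static_cast<uint8_t>(x + kC1);  // add wraps at 8 bits
  bool cmp0 = offseted < static_cast<uint8_t>(kC0);  // icmp ult
  return cmp0 ? x : replacement;
}

// New, canonical form: two signed compares against the range bounds.
int newClampLike(int8_t x, int low, int high) {
  bool cmp1 = x < -kC1;        // icmp slt %x, -C1
  bool cmp2 = x >= kC0 - kC1;  // icmp sge %x, C0-C1
  int clampedLow = cmp1 ? low : x;
  return cmp2 ? high : clampedLow;
}

// Try every 8-bit x, with limits that cannot be confused with any x.
bool clampLikeFormsAgree() {
  for (int x = -128; x <= 127; ++x) {
    int8_t v = static_cast<int8_t>(x);
    if (oldClampLike(v, -1000, 1000) != newClampLike(v, -1000, 1000))
      return false;
  }
  return true;
}
```

With these constants both forms pass `x` through on `[-16, 31]`, return `low` below that range and `high` above it.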
*	[InstCombine][NFC] Rename IsFreeToInvert() -> isFreeToInvert() for consistency	Roman Lebedev	2019-08-13	4	-18/+18
	As per https://reviews.llvm.org/D65530#inline-592325
	llvm-svn: 368686
*	[InstCombine] foldXorOfICmps(): don't give up on non-single-use ICmp's if ↵	Roman Lebedev	2019-08-13	2	-10/+63
	all users are freely invertible
	Summary: This is rather unconventional..
	As the comment there says, we don't have many folds for xor-of-icmps, we try to turn them into an and-of-icmps, for which we have plenty of folds. But if the ICmp we need to invert is not single-use - we give up.
	As discussed in https://reviews.llvm.org/D65148#1603922, we may have a non-canonical CLAMP pattern, with bit math and select-of-threshold that we'll potentially clamp. As can be seen in `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`, out of all 8 variations of the pattern, only two are not canonicalized into the variant with and+icmp instead of bit math. The reason is that the ICmp we need to invert is not single-use - we give up.
	We indeed can't perform this fold at will, the general rule is that we should not increase instruction count in InstCombine. But we wouldn't end up increasing instruction count if we can adapt every other user to the inverted value. This way the `not` we create will get folded, and in the end the instruction count did not increase.
	For that, of course, we need to look at the users of a Value, which is again rather unconventional for InstCombine :S
	Thus I'm proposing to be a little bit more insistent in `foldXorOfICmps()`. The alternatives would be to not create that `not`, but add duplicate code to manually invert all users; or to add some even less general combine to handle some more specific pattern[s].
	Reviewers: spatel, nikic, RKSimon, craig.topper
	Reviewed By: spatel
	Subscribers: hiraditya, jdoerfert, dmgreen, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D65530
	llvm-svn: 368685
*	[InstCombine] x /c fabs(x) -> copysign(1.0, x)	David Bolvansky	2019-08-12	1	-0/+11
	Summary:
	x / fabs(x) -> copysign(1.0, x)
	fabs(x) / x -> copysign(1.0, x)
	Reviewers: spatel, foad, RKSimon, efriedma
	Reviewed By: spatel
	Subscribers: lebedev.ri, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D65898
	llvm-svn: 368570
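A quick numeric sketch shows where the two expressions agree and where they do not. The helper name is hypothetical. The log entry does not say which fast-math flags the fold requires, so the remark about zero and infinity below is an observation about IEEE arithmetic, not a statement of the patch's preconditions.

```cpp
#include <cmath>

// True when x / fabs(x) and copysign(1.0, x) produce the same value.
// For zero and infinity the division yields NaN, so the comparison fails.
bool divByFabsMatchesCopysign(double x) {
  return x / std::fabs(x) == std::copysign(1.0, x);
}
```

For finite non-zero inputs the quotient is exactly `1.0` or `-1.0`, so a fold like this presumably has to assume that NaN and infinity cases can be ignored.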
*	[InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): avoid ↵	Roman Lebedev	2019-08-12	1	-6/+5
	constantexpr pitfall (PR42962)
	Instead of matching value and then blindly casting to BinaryOperator just to get the opcode, just match instruction and do no cast.
	Fixes https://bugs.llvm.org/show_bug.cgi?id=42962
	llvm-svn: 368554
*	[InstCombine][NFC] Use SimplifyAddInst() instead of ↵	Roman Lebedev	2019-08-10	1	-2/+2
	SimplifyBinOp(Instruction::BinaryOps::Add, )
	llvm-svn: 368521
*	[InstCombine] Shift amount reassociation in bittest: relax one-use check ↵	Roman Lebedev	2019-08-10	1	-1/+11
	when shifting constant
	If one of the values being shifted is a constant, since the new shift amount is known-constant, the new shift will end up being constant-folded, so we don't need that one-use restriction.
	llvm-svn: 368519
*	[InstCombine] Shift amount reassociation in bittest: drop pointless one-use ↵	Roman Lebedev	2019-08-10	1	-2/+2
	restriction
	That one-use restriction is not needed for correctness - we have already ensured that one of the shifts will go away, so we know we won't increase the instruction count. So there is no need for that restriction.
	llvm-svn: 368518
*	[Transforms] Rename hasUnaryFloatFn() and getUnaryFloatFn() (NFC)	Evandro Menezes	2019-08-09	1	-2/+2
	Rename `hasUnaryFloatFn()` to `hasFloatFn()` and `getUnaryFloatFn()` to `getFloatFnName()`.
	llvm-svn: 368449
*	[InstCombine] Propagate fast math flags through selects	Jay Foad	2019-08-07	1	-4/+8
	Summary: In SimplifySelectsFeedingBinaryOp, propagate fast math flags from the outer op into both arms of the new select, to take advantage of simplifications that require fast math flags.
	Reviewers: mcberg2017, majnemer, spatel, arsenm, xbolva00
	Subscribers: wdng, javed.absar, kristof.beyls, hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D65658
	llvm-svn: 368175
*	[InstCombine] Recommit: Shift amount reassociation: shl-trunc-shl pattern	Roman Lebedev	2019-08-07	1	-24/+72
	This was initially committed in r368059 but got reverted in r368084 because there was faulty logic in how the shift amounts type mismatch was being handled (it simply wasn't).
	I've added an explicit bailout before we SimplifyAddInst() - I don't think it's designed in general to handle differently-typed values, even though the actual problem only comes from ConstantExpr's.
	I have also changed the common type deduction, to not just blindly look past zext, but try to do that so that in the end types match.
	Differential Revision: https://reviews.llvm.org/D65380
	llvm-svn: 368141
*	Revert [InstCombine] Shift amount reassociation: shl-trunc-shl pattern	Reid Kleckner	2019-08-06	1	-66/+24
	This reverts r368059 (git commit 0f957109761913c563922f1afd4ceb29ef21bbd0)
	This caused Clang to assert while self-hosting and compiling SystemZInstrInfo.cpp. Reduction is running.
	llvm-svn: 368084
*	[InstCombine] Shift amount reassociation: shl-trunc-shl pattern	Roman Lebedev	2019-08-06	1	-24/+66
	Summary: Currently `reassociateShiftAmtsOfTwoSameDirectionShifts()` only handles two shifts one after another. If the shifts are `shl`, we still can easily perform the fold, with no extra legality checks: https://rise4fun.com/Alive/OQbM
	If we have right-shift however, we won't be able to make it any simpler than it already is.
	After this the only thing missing here is constant-folding: (`NewShAmt >= bitwidth(X)`)
	* If it's a logical shift, then constant-fold to `0` (not `undef`)
	* If it's an `ashr`, then a splat of the original signbit
	https://rise4fun.com/Alive/E1K
	https://rise4fun.com/Alive/i0V
	Reviewers: spatel, nikic, xbolva00
	Reviewed By: spatel
	Subscribers: hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D65380
	llvm-svn: 368059
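The shl-trunc-shl reassociation can be modelled with a 16-bit value truncated to 8 bits. This is a sketch under assumptions: the widths, the function names, and the reading of the fold as `shl (trunc (shl x, a)), b` becoming `shl (trunc x), a + b` are chosen for illustration, and the `total >= 8` branch models the constant-folding case that the message lists as still missing.

```cpp
#include <cassert>
#include <cstdint>

// Original shape: shl i16, then trunc to i8, then shl i8.
uint8_t shlTruncShlBefore(uint16_t x, unsigned a, unsigned b) {
  assert(a < 16 && b < 8);  // out-of-range shift amounts are not modelled
  uint16_t inner = static_cast<uint16_t>(static_cast<uint32_t>(x) << a);
  uint8_t narrow = static_cast<uint8_t>(inner);
  return static_cast<uint8_t>(static_cast<uint32_t>(narrow) << b);
}

// Reassociated shape: trunc first, then a single shl by the summed amount.
uint8_t shlTruncShlAfter(uint16_t x, unsigned a, unsigned b) {
  unsigned total = a + b;
  if (total >= 8)
    return 0;  // every bit is shifted out of the narrow type
  uint8_t narrow = static_cast<uint8_t>(x);
  return static_cast<uint8_t>(static_cast<uint32_t>(narrow) << total);
}

// Exhaustive over all 16-bit x and all in-range shift amounts.
bool shlTruncShlFormsAgree() {
  for (uint32_t x = 0; x <= 0xFFFFu; ++x)
    for (unsigned a = 0; a < 16; ++a)
      for (unsigned b = 0; b < 8; ++b)
        if (shlTruncShlBefore(static_cast<uint16_t>(x), a, b) !=
            shlTruncShlAfter(static_cast<uint16_t>(x), a, b))
          return false;
  return true;
}
```

Left shifts only ever move bits upward, so truncating before or after the first shift keeps the same low bits; that is why no extra legality check is needed for `shl`, unlike the right-shift case.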
*	[InstCombine] combine mul+shl separated by zext	Sanjay Patel	2019-08-05	1	-2/+13
	This appears to slightly help patterns similar to what's shown in PR42874: https://bugs.llvm.org/show_bug.cgi?id=42874
	...but not in the way requested. That fix will require some later IR and/or backend pass to decompose multiply/shifts into something more optimal per target. Those transforms already exist in some basic forms, but probably need enhancing to catch more cases.
	https://rise4fun.com/Alive/Qzv2
	llvm-svn: 367891
*	[InstCombine] add extra use constraint for shl-zext fold	Sanjay Patel	2019-08-05	1	-1/+1
	As the test shows, we can end up with more instructions than we started with if we don't include the extra-use check.
	llvm-svn: 367880
*	[InstCombine] fold cmp+select using select operand equivalence	Sanjay Patel	2019-08-02	1	-2/+62
	As discussed in PR42696: https://bugs.llvm.org/show_bug.cgi?id=42696
	...but won't help that case yet.
	We have an odd situation where a select operand equivalence fold was implemented in InstSimplify when it could have been done more generally in InstCombine if we allow dropping of {nsw,nuw,exact} from a binop operand.
	Here's an example: https://rise4fun.com/Alive/Xplr
	  %cmp = icmp eq i32 %x, 2147483647
	  %add = add nsw i32 %x, 1
	  %sel = select i1 %cmp, i32 -2147483648, i32 %add
	=>
	  %sel = add i32 %x, 1
	I've left the InstSimplify code in place for now, but my guess is that we'd prefer to remove that as a follow-up to save on code duplication and compile-time.
	Differential Revision: https://reviews.llvm.org/D65576
	llvm-svn: 367695
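The example in this message says that a select guarding the one overflowing input is equivalent to a plain wrapping add once `nsw` is dropped. A small sketch of that equivalence follows; the function names are hypothetical, and the cast back to `int32_t` assumes two's-complement conversion, which is guaranteed from C++20 and implementation-defined (but universal in practice) before that.

```cpp
#include <cstdint>

// Select form: special-case INT32_MAX so the signed add never overflows.
int32_t incrementWithSelect(int32_t x) {
  return x == INT32_MAX ? INT32_MIN : x + 1;
}

// Folded form: `add i32 %x, 1` without nsw, modelled via unsigned wraparound.
int32_t incrementWrapping(int32_t x) {
  return static_cast<int32_t>(static_cast<uint32_t>(x) + 1u);
}

// Spot-check the boundary values and a few ordinary ones.
bool selectEquivalenceHolds() {
  const int32_t samples[] = {INT32_MIN, -2, -1, 0, 1, 41,
                             INT32_MAX - 1, INT32_MAX};
  for (int32_t x : samples)
    if (incrementWithSelect(x) != incrementWrapping(x))
      return false;
  return true;
}
```

The select's constant arm is exactly what the wrapping add produces for the compared value, which is the "operand equivalence" the subject line refers to.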