bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[InstCombine] allow bitcast to/from FP for vector insert/extract transform	Sanjay Patel	2018-10-04	1	-4/+31
\| \| \| \| \| \| \| \|	This is a follow-up to rL343482 / D52439. This was a pattern that initially caused the commit to be reverted because the transform requires a bitcast as shown here. llvm-svn: 343794
*	[InstCombine] allow SimplifyDemandedVectorElts to work with FP binops	Sanjay Patel	2018-10-03	1	-18/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We're a long way from D50992 and D51553, but this is where we have to start. We weren't back-propagating undefs into binop constant values for anything but add/sub/mul/and/or/xor. This is likely because we have to be careful about not introducing UB/poison with div/rem/shift. But I suspect we already are getting the poison part wrong for add/sub/mul (although it may not be possible to expose the bug currently because we use SimplifyDemandedVectorElts from a limited set of opcodes). See the discussion/implementation from D48987 and D49047. This patch just enables functionality for FP ops because those do not have UB/poison potential. llvm-svn: 343727
*	[InstCombine] clean up foldVectorBinop(); NFC	Sanjay Patel	2018-10-03	1	-11/+12
\| \| \| \| \| \| \| \|	1. Fix include ordering. 2. Improve variable name (width is bitwidth not number-of-elements). 3. Add local Opcode variable to reduce code duplication. llvm-svn: 343694
*	[InstCombine] name change: foldShuffledBinop -> foldVectorBinop; NFC	Sanjay Patel	2018-10-03	6	-20/+20
\| \| \| \| \| \| \|	This function will deal with more than shuffles with D50992, and I have another potential per-element fold that could live here. llvm-svn: 343692
*	[InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A	David Green	2018-10-02	2	-1/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is an attempt to get out of a local-minimum that instcombine currently gets stuck in. We essentially combine two optimisations at once, ~a - ~b = b-a and min(~a, ~b) = ~max(a, b), only doing the transform if the result is at least neutral. This involves using IsFreeToInvert, which has been expanded a little to include selects that can be easily inverted. This is trying to fix PR35875, using the ideas from Sanjay. It is a large improvement to one of our rgb to cmy kernels. Differential Revision: https://reviews.llvm.org/D52177 llvm-svn: 343569
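The two identities combined in this commit, `~a - ~b = b - a` and `min(~a, ~b) = ~max(a, b)`, can be checked by brute force on 8-bit values. The sketch below is only an exhaustive check, not the InstCombine implementation; the helper names (`not8`, `smin`, `smax`, `check_fold`) are hypothetical, and i8 is assumed purely to keep the search small.

```python
BITS = 8
MASK = (1 << BITS) - 1


def not8(v):
    """Bitwise not on an 8-bit value (unsigned representation)."""
    return ~v & MASK


def to_signed(v):
    """Reinterpret an 8-bit unsigned representation as a signed integer."""
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def smin(a, b):
    return a if to_signed(a) <= to_signed(b) else b


def smax(a, b):
    return a if to_signed(a) >= to_signed(b) else b


def check_fold():
    for a in range(1 << BITS):
        for o in range(1 << BITS):
            # ~a - ~o == o - a (arithmetic modulo 2^8)
            assert (not8(a) - not8(o)) & MASK == (o - a) & MASK
            # 'not' reverses the order, for signed and unsigned compares
            assert smin(not8(a), not8(o)) == not8(smax(a, o))
            assert min(not8(a), not8(o)) == not8(max(a, o))
            # Combined: ~A - min(~A, O) == max(A, ~O) - A
            assert (not8(a) - smin(not8(a), o)) & MASK == (smax(a, not8(o)) - a) & MASK
            # Combined: ~A - max(~A, O) == min(A, ~O) - A
            assert (not8(a) - smax(not8(a), o)) & MASK == (smin(a, not8(o)) - a) & MASK
    return True
```

Calling `check_fold()` walks all 65536 operand pairs and raises `AssertionError` on the first counterexample. It says nothing about profitability, which the commit handles separately through IsFreeToInvert.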
*	[InstCombine] Handle vector compares in foldGEPIcmp(), take 2	Jesper Antonsson	2018-10-01	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a continuation of the fix for PR34627 "InstCombine assertion at vector gep/icmp folding". (I just realized bugpoint had fuzzed the original test for me, so I had fixed another trigger of the same assert in adjacent code in InstCombine.) This patch avoids optimizing an icmp (to look only at the base pointers) when the resulting icmp would have a different type. The patch adds a testcase and also cleans up and shrinks the pre-existing test for the adjacent assert trigger. Reviewers: lebedev.ri, majnemer, spatel Reviewed By: lebedev.ri Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52494 llvm-svn: 343486
*	[InstCombine] try to convert vector insert+extract to trunc; 2nd try	Sanjay Patel	2018-10-01	1	-2/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was originally committed at rL343407, but reverted at rL343458 because it crashed trying to handle a case where the destination type is FP. This version of the patch adds a check for that possibility. Tests added at rL343480. Original commit message: This transform is requested for the backend in: https://bugs.llvm.org/show_bug.cgi?id=39016 ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch. In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform. Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343482
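The shift-then-truncate idea and its endianness dependence can be modeled outside of LLVM. The sketch below assumes that viewing a scalar as a vector of narrower elements behaves like storing the scalar to memory and reading one element back; the function names and parameters are hypothetical and do not correspond to the patch.

```python
def extract_via_memory(scalar, scalar_bits, elt_bits, index, byteorder):
    """Reference model: lay out the scalar in memory, then read one element."""
    raw = scalar.to_bytes(scalar_bits // 8, byteorder)
    step = elt_bits // 8
    chunk = raw[index * step:(index + 1) * step]
    return int.from_bytes(chunk, byteorder)


def extract_via_shift(scalar, scalar_bits, elt_bits, index, byteorder):
    """Shift right (if needed) and truncate, as described in the commit message."""
    num_elts = scalar_bits // elt_bits
    # Little-endian: element 0 holds the LSBs of the scalar.
    # Big-endian: the last element holds the LSBs.
    chunk_index = index if byteorder == "little" else num_elts - 1 - index
    shift = chunk_index * elt_bits
    return (scalar >> shift) & ((1 << elt_bits) - 1)
```

When the extracted element holds the least significant bits, the shift amount is zero and only the truncate remains, which is the "simplest case" mentioned above.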
*	Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"	Hans Wennborg	2018-10-01	1	-44/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This caused Chromium builds to fail with "Illegal Trunc" assertion. See https://crbug.com/890723 for repro. > This transform is requested for the backend in: > https://bugs.llvm.org/show_bug.cgi?id=39016 > ...but I figured it was worth doing in IR too, and it's probably > easier to implement here, so that's this patch. > > In the simplest case, we are just truncating a scalar value. If the > extract index doesn't correspond to the LSBs of the scalar, then we > have to shift-right before the truncate. Endian-ness makes this tricky, > but hopefully the ASCII-art helps visualize the transform. > > Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343458
*	[InstCombine] try to convert vector insert+extract to trunc	Sanjay Patel	2018-09-30	1	-2/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This transform is requested for the backend in: https://bugs.llvm.org/show_bug.cgi?id=39016 ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch. In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform. Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343407
*	[InstCombine] allow lengthening of insertelement to eliminate shuffles	Sanjay Patel	2018-09-30	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	As noted in post-commit comments for D52548, the limitation on increasing vector length can be applied by opcode. As a first step, this patch only allows insertelement to be widened because that has no logical downsides for IR and has little risk of pessimizing codegen. This may cause PR39132 to go into hiding during a full compile, but that bug is not fixed. llvm-svn: 343406
*	[InstCombine] fix formatting in vector evaluators; NFC	Sanjay Patel	2018-09-29	2	-14/+13
\| \| \| \| \| \|	We need to alter the functionality as shown in D52548. llvm-svn: 343379
*	[InstCombine] don't propagate wider shufflevector arguments to predecessors	Sanjay Patel	2018-09-28	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	InstCombine would propagate shufflevector insts that had wider output vectors onto predecessors, which would sometimes push undef's onto the divisor of a div/rem and result in bad codegen. I've fixed this by just banning propagating shufflevector back if the result of the shufflevector is wider than the input vectors. Patch by: @sheredom (Neil Henning) Differential Revision: https://reviews.llvm.org/D52548 llvm-svn: 343329
*	[InstCombine] Without infinites, fold (C / X) < 0.0 --> (X < 0)	Sanjay Patel	2018-09-27	1	-0/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When C is not zero and infinities are not allowed, (C / X) > 0 is a sign test. Depending on the sign of C, the predicate must be swapped. E.g.: foo(double X) { if ((-2.0 / X) <= 0) ... } => foo(double X) { if (X >= 0) ... } Patch by: @marels (Martin Elshuber) Differential Revision: https://reviews.llvm.org/D51942 llvm-svn: 343228
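A small numeric spot check of the sign-test idea, written as a sketch rather than a proof. The function name `fold_agrees` is hypothetical, and the sample values are deliberately finite, nonzero, and far from overflow or underflow, since the fold is only claimed when infinities are excluded.

```python
def fold_agrees(c, x):
    """Compare (c / x) < 0.0 with the folded sign test on x.

    Assumes c != 0, x != 0, and that c / x neither overflows nor underflows.
    """
    original = (c / x) < 0.0
    # For positive c the predicate is kept; for negative c it is swapped.
    folded = (x < 0.0) if c > 0.0 else (x > 0.0)
    return original == folded


def check_samples():
    constants = (2.0, -2.0, 0.5, -3.0)
    values = (-8.0, -0.25, 0.25, 1.0, 1024.0)
    return all(fold_agrees(c, x) for c in constants for x in values)
```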
*	[InstCombine] narrow binops on concatenated vectors (PR33026)	Sanjay Patel	2018-09-25	1	-6/+28
\| \| \| \| \| \| \| \| \|	The motivating case from: https://bugs.llvm.org/show_bug.cgi?id=33026 ...has no shuffles now. This kind of pattern may occur during vectorization when targets have lumpy ISAs like SSE/AVX. llvm-svn: 342988
*	[InstCombine] add bitcast+extelt helper function; NFC	Sanjay Patel	2018-09-24	1	-14/+26
\| \| \| \| \| \| \| \|	We can handle patterns where the elements have different sizes, so refactoring ahead of trying to add another blob within these clauses. llvm-svn: 342918
*	[InstCombine] improve variable name and use 'match'; NFC	Sanjay Patel	2018-09-24	1	-13/+15
\| \| \| \| \| \| \| \| \| \| \|	'width' of a vector usually refers to the bit-width. https://bugs.llvm.org/show_bug.cgi?id=39016 shows a case where we could extend this fold to handle a case where the number of elements in the bitcasted vector is not equal to the resulting value. llvm-svn: 342902
*	[InstCombine][x86] try even harder to convert blendv intrinsic to generic IR ↵	Sanjay Patel	2018-09-22	1	-7/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR38814) Follow-up to rL342324 (D52059): Missing optimizations with blendv are shown in: https://bugs.llvm.org/show_bug.cgi?id=38814 This is an easier and more powerful solution than adding pattern matching for a few special cases in the backend. The potential danger with this transform in IR is that the condition value can get separated from the select, and the backend might not be able to make a blendv out of it again. llvm-svn: 342806
*	[InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely ↵	Craig Topper	2018-09-22	1	-10/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	invertible Summary: This restores the combine that was reverted in r341883. The infinite loop from the failing test no longer occurs due to changes from r342163. Reviewers: spatel, dmgreen Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52070 llvm-svn: 342797
*	[InstCombine] Handle vector compares in foldGEPIcmp()	Jesper Antonsson	2018-09-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is to fix PR38984 "InstCombine assertion at vector gep/icmp folding": https://bugs.llvm.org/show_bug.cgi?id=38984 Reviewers: majnemer, spatel, lattner, lebedev.ri Reviewed By: lebedev.ri Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D52263 llvm-svn: 342647
*	[InstCombine] foldICmpWithLowBitMaskedVal(): handle uncanonical ((-1 << y) ↵	Roman Lebedev	2018-09-19	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	>> y) mask Summary: The last low-bit-mask-pattern-producing-pattern i can think of. https://rise4fun.com/Alive/UGzE <- non-canonical But we can not canonicalize it because of extra uses. https://bugs.llvm.org/show_bug.cgi?id=38123 Reviewers: spatel, craig.topper, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52148 llvm-svn: 342548
*	[InstCombine] foldICmpWithLowBitMaskedVal(): handle uncanonical ((1 << ↵	Roman Lebedev	2018-09-19	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	y)+(-1)) mask Summary: Same as D52146. `((1 << y)+(-1))` is simply a non-canonical version of `~(-1 << y)`: https://rise4fun.com/Alive/0vl We can not canonicalize it due to the extra uses. But we can handle it here. Reviewers: spatel, craig.topper, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52147 llvm-svn: 342547
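The claim that `((1 << y)+(-1))` and `~(-1 << y)` produce the same mask can be confirmed for every in-range shift amount of an 8-bit type. This is a sketch with hypothetical helper names; shift amounts equal to or larger than the bit width are skipped because they do not yield a defined value in LLVM IR.

```python
BITS = 8
MASK = (1 << BITS) - 1  # the all-ones value, i.e. -1 in i8


def mask_add_form(y):
    """(1 << y) + (-1), computed modulo 2^8."""
    return ((1 << y) + MASK) & MASK


def mask_not_form(y):
    """~(-1 << y), computed modulo 2^8."""
    return ~(MASK << y) & MASK


def masks_agree():
    # Only shift amounts below the bit width are considered.
    return all(mask_add_form(y) == mask_not_form(y) for y in range(BITS))
```

Both forms set exactly the low `y` bits, so any fold that only requires "all-ones in the low bits" applies to either one.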
*	[InstCombine] foldICmpWithLowBitMaskedVal(): handle ~(-1 << y) mask	Roman Lebedev	2018-09-19	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Two folds are happening here: 1. https://rise4fun.com/Alive/oaFX 2. And then `foldICmpWithHighBitMask()` (D52001): https://rise4fun.com/Alive/wsP4 This change doesn't just add the handling for eq/ne predicates, it actually builds upon the previous `foldICmpWithLowBitMaskedVal()` work, so all the 16 fold variants* are immediately supported. I'm indeed only testing these two predicates. I do not feel like re-proving all 16 folds, because they were already proven for the general case of constant with all-ones in low bits. So as long as the mask produces all-ones in low bits, I'm pretty sure the fold is valid. But if required, I can re-prove them; let me know. eq/ne are commutative - 4 folds; ult/ule/ugt/uge - are not commutative (the commuted variant is InstSimplified), 4 folds; slt/sle/sgt/sge are not commutative - 4 folds. 12 folds in total. https://bugs.llvm.org/show_bug.cgi?id=38123 https://bugs.llvm.org/show_bug.cgi?id=38708 Reviewers: spatel, craig.topper, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52146 llvm-svn: 342546
*	[InstCombine] Support (sub (sext x), (sext y)) --> (sext (sub x, y)) and ↵	Craig Topper	2018-09-15	2	-7/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(sub (zext x), (zext y)) --> (zext (sub x, y)) Summary: If the sub doesn't overflow in the original type we can move it above the sext/zext. This is similar to what we do for add. The overflow checking for sub is currently weaker than add, so the test cases are constructed for what is supported. Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52075 llvm-svn: 342335
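The overflow precondition can be illustrated with an exhaustive i8-to-i16 check. This is a sketch with hypothetical helper names, not the code from the patch: it confirms that doing the subtraction in the narrow type and extending afterwards matches the wide subtraction whenever the narrow subtraction does not overflow.

```python
def to_signed8(v):
    """Reinterpret an 8-bit unsigned representation as a signed integer."""
    return v - 256 if v & 0x80 else v


def sext8to16(v):
    return to_signed8(v) & 0xFFFF


def wide_sub_sext(x, y):
    """sub (sext x), (sext y), evaluated in i16."""
    return (sext8to16(x) - sext8to16(y)) & 0xFFFF


def narrow_sub_then_sext(x, y):
    """sext (sub x, y), with the sub evaluated in i8."""
    return sext8to16((x - y) & 0xFF)


def check_sub_narrowing():
    for x in range(256):
        for y in range(256):
            diff = to_signed8(x) - to_signed8(y)
            if -128 <= diff <= 127:  # no signed overflow in i8
                assert wide_sub_sext(x, y) == narrow_sub_then_sext(x, y)
            if x >= y:  # no unsigned overflow in i8; zext is the identity here
                assert (x - y) & 0xFFFF == (x - y) & 0xFF
    return True
```

Without the precondition the two forms can differ: for example, `x = 0x7F` and `y = 0x80` (127 and -128) overflow in i8, and the narrow result sign-extends to a different i16 value.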
*	[InstCombine][x86] try harder to convert blendv intrinsic to generic IR ↵	Sanjay Patel	2018-09-15	1	-7/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR38814) Missing optimizations with blendv are shown in: https://bugs.llvm.org/show_bug.cgi?id=38814 If this works, it's an easier and more powerful solution than adding pattern matching for a few special cases in the backend. The potential danger with this transform in IR is that the condition value can get separated from the select, and the backend might not be able to make a blendv out of it again. I don't think that's too likely, but I've kept this patch minimal with a 'TODO', so we can test that theory in the wild before expanding the transform. Differential Revision: https://reviews.llvm.org/D52059 llvm-svn: 342324
*	[InstCombine] Inefficient pattern for high-bits checking 3 (PR38708)	Roman Lebedev	2018-09-15	1	-6/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It is sometimes important to check that some newly-computed value is non-negative and only n bits wide (where n is a variable.) There are many ways to check that: https://godbolt.org/z/o4RB8D The last variant seems best? (I'm sure there are some other variations I haven't thought of..) The last (as far as I know?) pattern, non-canonical due to the extra use. https://godbolt.org/z/aCMsPk https://rise4fun.com/Alive/I6f https://bugs.llvm.org/show_bug.cgi?id=38708 Reviewers: spatel, craig.topper, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52062 llvm-svn: 342321
*	[InstCombine] refactor mul narrowing folds; NFCI	Sanjay Patel	2018-09-14	4	-112/+60
\| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to rL342278: The test diffs are all cosmetic due to the change in value naming, but I'm including that to show that the new code does perform these folds rather than something else in instcombine. D52075 should be able to use this code too rather than duplicating all of the logic. llvm-svn: 342292
*	[InstCombine] add/use overflowing math helper functions; NFC	Sanjay Patel	2018-09-14	2	-3/+19
\| \| \| \| \| \| \| \|	The mul case can already be refactored to use this similar to rL342278. The sub case is proposed in D52075. llvm-svn: 342289
*	[InstCombine] refactor add narrowing folds; NFCI	Sanjay Patel	2018-09-14	2	-71/+44
\| \| \| \| \| \| \| \| \|	The test diffs are all cosmetic due to the change in value naming, but I'm including that to show that the new code does perform these folds rather than something else in instcombine. llvm-svn: 342278
*	[InstCombine] Inefficient pattern for high-bits checking 2 (PR38708)	Roman Lebedev	2018-09-13	1	-19/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It is sometimes important to check that some newly-computed value is non-negative and only n bits wide (where n is a variable.) There are many ways to check that: https://godbolt.org/z/o4RB8D The last variant seems best? (I'm sure there are some other variations i haven't thought of..) More complicated, canonical pattern: https://rise4fun.com/Alive/uhA We do need to have two `switch()`'es like this, to not mismatch the swappable predicates. https://bugs.llvm.org/show_bug.cgi?id=38708 Reviewers: spatel, craig.topper, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52001 llvm-svn: 342173
*	[InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y ↵	Craig Topper	2018-09-13	2	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	are freely invertible. This allows the xor to be removed completely. This might help with recommitting r341674, but seems good regardless. Coincidentally fixes PR38915. Differential Revision: https://reviews.llvm.org/D51964 llvm-svn: 342163
*	[InstCombine] remove checks for IsFreeToInvert()	Sanjay Patel	2018-09-13	1	-3/+1
\| \| \| \| \| \| \| \| \|	I accidentally committed this diff with rL342147 because I had applied D51964. We probably do need those checks, but D51964 has tests and more discussion/motivation, so they should be re-added with that patch. llvm-svn: 342149
*	[InstCombine] reorder folds to reduce chance of infinite loops	Sanjay Patel	2018-09-13	1	-22/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I don't have a test case for this, but it's motivated by the discussion in D51964, and I've added TODO comments for the better fix - move simplifications into instsimplify because that's more efficient and reduces risk of infinite loops in instcombine caused by transforms trying to do the opposite folds. In this case, we know that the transform that tries to move 'not' through min/max can be fooled by the multiple uses of a value in another min/max, so try to squash the foldSPFofSPF() patterns first. llvm-svn: 342147
*	[InstCombine] Inefficient pattern for high-bits checking (PR38708)	Roman Lebedev	2018-09-12	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It is sometimes important to check that some newly-computed value is non-negative and only `n` bits wide (where `n` is a variable.) There are many ways to check that: https://godbolt.org/z/o4RB8D The last variant seems best? (I'm sure there are some other variations i haven't thought of..) Let's handle the second variant first, since it is much simpler. https://rise4fun.com/Alive/LYjY https://bugs.llvm.org/show_bug.cgi?id=38708 Reviewers: spatel, craig.topper, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51985 llvm-svn: 342067
*	[InstCombine] add folds for unsigned-overflow compares	Sanjay Patel	2018-09-11	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Name: op_ugt_sum %a = add i8 %x, %y %r = icmp ugt i8 %x, %a => %notx = xor i8 %x, -1 %r = icmp ugt i8 %y, %notx Name: sum_ult_op %a = add i8 %x, %y %r = icmp ult i8 %a, %x => %notx = xor i8 %x, -1 %r = icmp ugt i8 %y, %notx https://rise4fun.com/Alive/ZRxI AFAICT, this doesn't interfere with any add-saturation patterns because those have >1 use for the 'add'. But this should be better for IR analysis and codegen in the basic cases. This is another fold inspired by PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 llvm-svn: 342004
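Both Alive patterns above can be checked exhaustively for i8 without an SMT solver. The following is a brute-force sketch (the function name is hypothetical) that compares the original unsigned-overflow compare with the folded form for every operand pair.

```python
def check_overflow_compare_folds():
    for x in range(256):
        for y in range(256):
            a = (x + y) & 0xFF   # %a = add i8 %x, %y
            notx = x ^ 0xFF      # %notx = xor i8 %x, -1
            folded = y > notx    # icmp ugt i8 %y, %notx
            assert (x > a) == folded   # op_ugt_sum
            assert (a < x) == folded   # sum_ult_op
    return True
```

The intuition is that the 8-bit sum wraps exactly when `y` exceeds `255 - x`, and `255 - x` equals `~x` in i8.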
*	[InstCombine] add folds for icmp with xor mask constant	Sanjay Patel	2018-09-11	1	-10/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These are the folds in Alive; Name: xor_ult Pre: isPowerOf2(-C1) %xor = xor i8 %x, C1 %r = icmp ult i8 %xor, C1 => %r = icmp ugt i8 %x, ~C1 Name: xor_ugt Pre: isPowerOf2(C1+1) %xor = xor i8 %x, C1 %r = icmp ugt i8 %xor, C1 => %r = icmp ugt i8 %x, C1 https://rise4fun.com/Alive/Vty The ugt case in its simplest form was already handled by DemandedBits, but that's not ideal as shown in the multi-use test. I'm not sure if these are all of the symmetrical folds, but I adjusted the existing code for one of the folds to try to show the similarities. There's no obvious connection, but this is another preliminary step for PR14613... https://bugs.llvm.org/show_bug.cgi?id=14613 llvm-svn: 341997
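The two Alive folds, including their preconditions, can also be brute-forced over all i8 constants and inputs. This is a sketch with hypothetical helper names; `is_pow2` models Alive's `isPowerOf2` on 8-bit values and treats zero as not a power of two.

```python
def is_pow2(v):
    """True if the 8-bit value v has exactly one bit set."""
    return v != 0 and (v & (v - 1)) == 0


def check_xor_mask_folds():
    for c1 in range(256):
        for x in range(256):
            xor = x ^ c1
            # xor_ult: Pre isPowerOf2(-C1), so C1 is a high-bit mask
            if is_pow2(-c1 & 0xFF):
                assert (xor < c1) == (x > (~c1 & 0xFF))
            # xor_ugt: Pre isPowerOf2(C1+1), so C1 is a low-bit mask
            if is_pow2((c1 + 1) & 0xFF):
                assert (xor > c1) == (x > c1)
    return True
```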
*	[InstCombine] enhance vector demanded elements to look at a vector select ↵	Sanjay Patel	2018-09-11	1	-2/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	condition operand I noticed that we were not back-propagating undef lanes to shuffle masks when we have a shuffle that reduces the vector width. This is part of investigating/solving PR38691: https://bugs.llvm.org/show_bug.cgi?id=38691 The DAG equivalent was proposed with: D51696 Differential Revision: https://reviews.llvm.org/D51433 llvm-svn: 341981
*	[InstCombine] Fix incorrect usage of getPrimitiveSizeInBits when we should ↵	Craig Topper	2018-09-11	1	-2/+1
\| \| \| \| \| \| \| \| \| \|	be using the element size for vectors For vectors, getPrimitiveSizeInBits returns the full vector width. This code should be using the element size for vectors. This could be fixed by calling getScalarSizeInBits, but it's even easier to just get it from the APInt we're checking. Differential Revision: https://reviews.llvm.org/D51938 llvm-svn: 341971
*	[InstCombine] Use dyn_cast instead of match(m_Constant). NFC	Craig Topper	2018-09-11	1	-4/+2
\| \| \| \|	llvm-svn: 341962
*	[InstCombine] Support (mul (sext x), cst) --> (sext (mul x, cst')) and (mul ↵	Craig Topper	2018-09-11	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	(zext x), cst) --> (zext (mul x, cst')) for vectors constants. Similar to D51236, but for mul instead of add. Differential Revision: https://reviews.llvm.org/D51900 llvm-svn: 341961
*	[InstCombine] Partially revert rL341674 due to PR38897.	Alina Sbirlea	2018-09-10	1	-35/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Revert min/max changes in rL341674 due to high compile times causing timeouts (PR38897). Checking in to unblock failing builds. Patch available for post-commit review and re-revert once resolved. Working on a smaller reproducer for PR38897. Reviewers: craig.topper, spatel Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D51897 llvm-svn: 341883
*	[InstCombine] use SelectInst operand names to make code clearer; NFC	Sanjay Patel	2018-09-10	1	-8/+10
\| \| \| \| \| \|	Cleanup step for D51433. llvm-svn: 341850
*	InstCombine: move hasOneUse check to the top of foldICmpAddConstant	Tim Northover	2018-09-10	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	There were two combines not covered by the check before now, neither of which actually differed from normal in the benefit analysis. The most recent seems to be because it was just added at the top of the function (naturally). The older is from way back in 2008 (r46687) when we just didn't put those checks in so routinely, and has been diligently maintained since. llvm-svn: 341831
*	[InstCombine] narrow vector select with padded condition and extracted ↵	Sanjay Patel	2018-09-07	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	result (PR38691) shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask) --> sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask) The motivating case from: https://bugs.llvm.org/show_bug.cgi?id=38691 ...is the last regression test. In that case, we're just left with the narrow select. Note that if we do create new shuffles, they use the existing extraction identity mask, so there's no danger that this transform creates arbitrary shuffles. Differential Revision: https://reviews.llvm.org/D51496 llvm-svn: 341708
*	[InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely ↵	Craig Topper	2018-09-07	2	-13/+46
\| \| \| \| \| \| \| \| \| \| \| \|	invertible If the ~X wasn't able to simplify above the max/min, we might be able to simplify it by moving it below the max/min. I had to modify the ~(min/max ~X, Y) transform to prevent getting stuck in a loop when we saw the new ~(max/min X, ~Y) before the ~Y had been folded away to remove the new not. Differential Revision: https://reviews.llvm.org/D51398 llvm-svn: 341674
*	[InstCombine] Do not fold scalar ops over select with vector condition.	Florian Hahn	2018-09-07	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	If OtherOpT or OtherOpF have scalar types and the condition is a vector, we would create an invalid select. Reviewers: spatel, john.brawn, mssimpso, craig.topper Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D51781 llvm-svn: 341666
*	[InstCombine] add xor+not folds	Sanjay Patel	2018-09-06	1	-0/+16
\| \| \| \| \| \| \| \| \| \|	This fold is needed to avoid a regression when we try to recommit rL300977. We can't see the most basic win currently because demanded bits changes the patterns: https://rise4fun.com/Alive/plpp llvm-svn: 341559
*	[InstCombine] fix formatting in SimplifyDemandedVectorElts->Select; NFCI	Sanjay Patel	2018-09-06	1	-12/+16
\| \| \| \| \| \| \| \|	I'm preparing to add the same functionality both here and to the DAG version of this code in D51696 / D51433, so try to make those cases as similar as possible to avoid bugs. llvm-svn: 341545
*	[InstCombine] fix xor-or-xor fold to check uses and handle commutes	Sanjay Patel	2018-09-04	1	-32/+21
\| \| \| \| \| \| \| \| \| \| \| \|	I'm probably missing some way to use m_Deferred to remove the code duplication, but that can be a follow-up. The improvement in demand_shrink_nsw.ll is an example of missing the fold because the pattern matching was deficient. I didn't try to follow the bits in that test, but Alive says it's correct: https://rise4fun.com/Alive/ugc llvm-svn: 341426
*	[InstCombine] make ((X & C) ^ C) form consistent for vectors	Sanjay Patel	2018-09-04	1	-4/+2
\| \| \| \| \| \|	It would be better to create a 'not' here, but that's not possible yet. llvm-svn: 341410
*	[InstCombine] simplify code for xor folds; NFCI	Sanjay Patel	2018-09-04	1	-40/+23
\| \| \| \| \| \| \| \| \|	This is just a cleanup step. The TODO comments show what is wrong with the 'and' version of the fold. Fixing this should be part of recommitting: rL300977 llvm-svn: 341405