bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[InstCombine] remove overzealous assert for shuffles (PR41419)	Sanjay Patel	2019-04-08	1	-2/+2
\| \| \| \| \| \| \| \| \|	As the TODO indicates, instsimplify could be improved. Should fix: https://bugs.llvm.org/show_bug.cgi?id=41419 llvm-svn: 357910
*	[InstCombine] Handle vector gep with scalar argument in ↵	Mikael Holmen	2019-04-01	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	evaluateInDifferentElementOrder Summary: This fixes PR41270. The recursive function evaluateInDifferentElementOrder expects to be called on a vector Value, so when we call it on a vector GEP's arguments, we must first check that the argument is indeed a vector. Reviewers: reames, spatel Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60058 llvm-svn: 357389
*	Revert "[InstCombine] Handle vector gep with scalar argument in ↵	Mikael Holmen	2019-04-01	1	-8/+1
\| \| \| \| \| \| \| \| \| \| \|	evaluateInDifferentElementOrder" This reverts commit 75216a6dbcfe5fb55039ef06a07e419fa875f4a5. I'll recommit with a better commit message with reference to the phabricator review. llvm-svn: 357387
*	[InstCombine] Handle vector gep with scalar argument in ↵	Mikael Holmen	2019-04-01	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \|	evaluateInDifferentElementOrder This fixes PR41270. The recursive function evaluateInDifferentElementOrder expects to be called on a vector Value, so when we call it on a vector GEP's arguments, we must first check that the argument is indeed a vector. llvm-svn: 357385
*	[InstCombine] canonicalize select shuffles by commuting	Sanjay Patel	2019-03-31	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In PR41304: https://bugs.llvm.org/show_bug.cgi?id=41304 ...we have a case where we want to fold a binop of select-shuffle (blended) values. Rather than try to match commuted variants of the pattern, we can canonicalize the shuffles and check for mask equality with commuted operands. We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a special case that the backend is required to handle because we already canonicalize vector select to this shuffle form. So there should be no codegen difference from this change. It's possible that this improves CSE in IR though. Differential Revision: https://reviews.llvm.org/D60016 llvm-svn: 357366
*	[InstCombine] move shuffle canonicalizations before other transforms	Sanjay Patel	2019-03-29	1	-30/+27
\| \| \| \| \| \| \| \| \|	This may not be NFC, but I'm not sure how to expose any diffs in tests. In theory, it should be slightly more efficient and possibly more profitable to do the canonicalizations (which can increase the undef elements in the mask) ahead of SimplifyDemandedVectorElts(). llvm-svn: 357272
*	[InstCombine] limit extracting shuffle transform based on uses	Sanjay Patel	2019-02-05	1	-0/+5
\| \| \| \| \| \| \| \| \| \|	As discussed in D53037, this can lead to worse codegen, and we don't generally expect the backend to be able to optimize arbitrary shuffles. If there's only one use of the 1st shuffle, that means it's getting removed, so that should always be safe. llvm-svn: 353235
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[InstCombine] refactor isCheapToScalarize(); NFC	Sanjay Patel	2018-12-18	1	-33/+25
\| \| \| \| \| \| \| \| \|	As the FIXME indicates, this has the potential to go overboard. So I'm not sure if it's even worth keeping this vs. iteratively doing simple matches, but we might as well clean it up. llvm-svn: 349523
*	InstCombine: Scalarize single use icmp/fcmp	Matt Arsenault	2018-12-10	1	-0/+12
\| \| \| \|	llvm-svn: 348801
*	[InstCombine] remove dead code from visitExtractElement	Sanjay Patel	2018-12-05	1	-6/+0
\| \| \| \| \| \| \| \|	Extracting from a splat constant is always handled by InstSimplify. Move the test for this from InstCombine to InstSimplify to make sure that stays true. llvm-svn: 348423
*	[InstCombine] reduce duplication in visitExtractElementInst; NFC	Sanjay Patel	2018-12-05	1	-38/+32
\| \| \| \|	llvm-svn: 348418
*	[InstCombine] try to turn shuffle into insertelement	Sanjay Patel	2018-10-30	1	-0/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC' The motivating case is at least a couple of steps away: I noticed that SLPVectorizer does not analyze shuffles as well as sequences of insert/extract in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 ...so SLP may fail to vectorize when source code has shuffles to start with or instcombine has converted insert/extract to shuffles. Independent of that, an insertelement is always a simpler op for IR analysis vs. a shuffle, so we should transform to insert when possible. I don't think there's any codegen concern here - if a target can't insert a scalar directly to some fixed element in a vector (x86?), then this should get expanded to the insert+shuffle that we started with. Differential Revision: https://reviews.llvm.org/D53507 llvm-svn: 345607
*	[InstCombine] use 'match' to simplify code; NFC	Sanjay Patel	2018-10-20	1	-59/+56
\| \| \| \|	llvm-svn: 344855
*	[InstCombine] make code more flexible with lambda; NFC	Sanjay Patel	2018-10-20	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I couldn't tell from svn history when these checks were added, but it pre-dates the split of instcombine into its own directory at rL92459. The motivation for changing the check is partly shown by the code in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 There are also existing regression tests for SLPVectorizer with sequences of extract+insert that are likely assumed to become shuffles by the vectorizer cost models. llvm-svn: 344854
*	[InstCombine] add explanatory comment for strange vector logic; NFC	Sanjay Patel	2018-10-20	1	-0/+16
\| \| \| \|	llvm-svn: 344852
*	[InstCombine] combine a shuffle and an extract subvector shuffle	Sanjay Patel	2018-10-14	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \|	This is part of the missing IR-level folding noted in D52912. This should be ok as a canonicalization because the new shuffle mask can't be any more complicated than the existing shuffle mask. If there's some target where the shorter vector shuffle is not legal, it should just end up expanding to something like the pair of shuffles that we're starting with here. Differential Revision: https://reviews.llvm.org/D53037 llvm-svn: 344476
*	revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization	Sanjay Patel	2018-10-10	1	-30/+0
\| \| \| \| \| \|	This commit accidentally included the diffs from D53057. llvm-svn: 344178
*	[InstCombine] reverse 'trunc X to <N x i1>' canonicalization	Sanjay Patel	2018-10-09	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	icmp ne (and X, 1), 0 --> trunc X to N x i1 Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors. The motivating vector case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) { %c = fcmp ole <4 x float> %x, %y %s = sext <4 x i1> %c to <4 x i32> %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1> %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3> %cond = or <4 x i32> %s1, %s2 %condtr = trunc <4 x i32> %cond to <4 x i1> %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w ret <4 x float> %r } Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch): AVX before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vandps LCPI0_0(%rip), %xmm0, %xmm0 vxorps %xmm1, %xmm1, %xmm1 vpcmpeqd %xmm1, %xmm0, %xmm0 vblendvps %xmm0, %xmm3, %xmm2, %xmm0 AVX after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vblendvps %xmm0, %xmm2, %xmm3, %xmm0 AVX512f before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1] vptestnmd %zmm1, %zmm0, %k1 vblendmps %zmm3, %zmm2, %zmm0 {%k1} AVX512f after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpslld $31, %xmm0, %xmm0 vptestmd %zmm0, %zmm0, %k1 vblendmps %zmm2, %zmm3, %zmm0 {%k1} AArch64 before: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b movi v1.4s, #1 and v0.16b, v0.16b, v1.16b cmeq v0.4s, v0.4s, #0 bsl v0.16b, v3.16b, v2.16b AArch64 after: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b bsl v0.16b, v2.16b, v3.16b PowerPC-le before: xvcmpgesp 34, 35, 34 vspltisw 0, 1 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxlxor 35, 35, 35 xxland 34, 0, 32 vcmpequw 2, 2, 3 xxsel 34, 36, 37, 34 PowerPC-le after: xvcmpgesp 34, 35, 34 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxsel 34, 37, 36, 0 Differential Revision: https://reviews.llvm.org/D52747 llvm-svn: 344082
*	[InstCombine] make helper function 'static'; NFC	Sanjay Patel	2018-10-09	1	-2/+2
\| \| \| \|	llvm-svn: 344056
*	[InstCombine] allow bitcast to/from FP for vector insert/extract transform	Sanjay Patel	2018-10-04	1	-4/+31
\| \| \| \| \| \| \| \|	This is a follow-up to rL343482 / D52439. This was a pattern that initially caused the commit to be reverted because the transform requires a bitcast as shown here. llvm-svn: 343794
*	[InstCombine] try to convert vector insert+extract to trunc; 2nd try	Sanjay Patel	2018-10-01	1	-2/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was originally committed at rL343407, but reverted at rL343458 because it crashed trying to handle a case where the destination type is FP. This version of the patch adds a check for that possibility. Tests added at rL343480. Original commit message: This transform is requested for the backend in: https://bugs.llvm.org/show_bug.cgi?id=39016 ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch. In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform. Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343482
*	Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"	Hans Wennborg	2018-10-01	1	-44/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This caused Chromium builds to fail with "Illegal Trunc" assertion. See https://crbug.com/890723 for repro. > This transform is requested for the backend in: > https://bugs.llvm.org/show_bug.cgi?id=39016 > ...but I figured it was worth doing in IR too, and it's probably > easier to implement here, so that's this patch. > > In the simplest case, we are just truncating a scalar value. If the > extract index doesn't correspond to the LSBs of the scalar, then we > have to shift-right before the truncate. Endian-ness makes this tricky, > but hopefully the ASCII-art helps visualize the transform. > > Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343458
*	[InstCombine] try to convert vector insert+extract to trunc	Sanjay Patel	2018-09-30	1	-2/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This transform is requested for the backend in: https://bugs.llvm.org/show_bug.cgi?id=39016 ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch. In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform. Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343407
*	[InstCombine] allow lengthening of insertelement to eliminate shuffles	Sanjay Patel	2018-09-30	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	As noted in post-commit comments for D52548, the limitation on increasing vector length can be applied by opcode. As a first step, this patch only allows insertelement to be widened because that has no logical downsides for IR and has little risk of pessimizing codegen. This may cause PR39132 to go into hiding during a full compile, but that bug is not fixed. llvm-svn: 343406
*	[InstCombine] fix formatting in vector evaluators; NFC	Sanjay Patel	2018-09-29	1	-13/+13
\| \| \| \| \| \|	We need to alter the functionality as shown in D52548. llvm-svn: 343379
*	[InstCombine] don't propagate wider shufflevector arguments to predecessors	Sanjay Patel	2018-09-28	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	InstCombine would propagate shufflevector insts that had wider output vectors onto predecessors, which would sometimes push undef's onto the divisor of a div/rem and result in bad codegen. I've fixed this by just banning propagating shufflevector back if the result of the shufflevector is wider than the input vectors. Patch by: @sheredom (Neil Henning) Differential Revision: https://reviews.llvm.org/D52548 llvm-svn: 343329
*	[InstCombine] add bitcast+extelt helper function; NFC	Sanjay Patel	2018-09-24	1	-14/+26
\| \| \| \| \| \| \| \|	We can handle patterns where the elements have different sizes, so refactoring ahead of trying to add another blob within these clauses. llvm-svn: 342918
*	[InstCombine] improve variable name and use 'match'; NFC	Sanjay Patel	2018-09-24	1	-13/+15
\| \| \| \| \| \| \| \| \| \| \|	'width' of a vector usually refers to the bit-width. https://bugs.llvm.org/show_bug.cgi?id=39016 shows a case where we could extend this fold to handle a case where the number of elements in the bitcasted vector is not equal to the resulting value. llvm-svn: 342902
*	[InstCombine] narrow vector select with padded condition and extracted ↵	Sanjay Patel	2018-09-07	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	result (PR38691) shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask) --> sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask) The motivating case from: https://bugs.llvm.org/show_bug.cgi?id=38691 ...is the last regression test. In that case, we're just left with the narrow select. Note that if we do create new shuffles, they use the existing extraction identity mask, so there's no danger that this transform creates arbitrary shuffles. Differential Revision: https://reviews.llvm.org/D51496 llvm-svn: 341708
*	[InstCombine] move declarations closer to uses; NFC	Sanjay Patel	2018-08-29	1	-5/+3
\| \| \| \|	llvm-svn: 340930
*	[InstCombine] remove unnecessary shuffle undef folding	Sanjay Patel	2018-08-29	1	-7/+0
\| \| \| \| \| \| \| \|	Add a test for constant folding to show that (shuffle undef, undef, mask) should already be handled via instsimplify. llvm-svn: 340926
*	Remove trailing space	Fangrui Song	2018-07-30	1	-1/+1
\| \| \| \| \| \|	sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338293
*	[InstCombine] allow flag propagation when using safe constant	Sanjay Patel	2018-07-10	1	-2/+3
\| \| \| \| \| \| \|	This corresponds with the code for the single binop pattern added in rL336684. llvm-svn: 336696
*	[InstCombine] safely allow non-commutative binop identity constant folds	Sanjay Patel	2018-07-10	1	-8/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was originally intended with D48893, but as discussed there, we have to make the folds safe from producing extra poison. This should give the single binop folds the same capabilities as the existing folds for 2-binops+shuffle. LLVM binary opcode review: there are a total of 18 binops. There are 7 commutative binops (add, mul, and, or, xor, fadd, fmul) which we already fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr, fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't bother with sub/fsub with constant operand 1 because those are canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18. llvm-svn: 336684
*	[InstCombine] drop poison flags when shuffle mask undef propagates to constant	Sanjay Patel	2018-07-10	1	-0/+5
\| \| \| \|	llvm-svn: 336679
*	[InstCombine] allow more shuffle-binop folds with safe constants	Sanjay Patel	2018-07-10	1	-10/+13
\| \| \| \| \| \| \| \| \| \| \| \|	The case with 2 variables is more complicated than the case where we eliminate the shuffle entirely because a shuffle with an undef mask element creates an undef result. I'm not aware of any current analysis/transform that recognizes that undef propagating to a div/rem/shift, but we have to guard against the possibility. llvm-svn: 336668
*	[InstCombine] allow more shuffle folds using safe constants	Sanjay Patel	2018-07-09	1	-8/+2
\| \| \| \| \| \| \| \| \| \|	getSafeVectorConstantForBinop() was calling getBinOpIdentity() assuming that the constant we wanted was operand 1 (RHS). That's wrong, but I don't think we could expose a bug or even a suboptimal fold from that because the callers have other guards for any binop that would have been affected. llvm-svn: 336617
*	[InstCombine] generalize safe vector constant utility	Sanjay Patel	2018-07-09	1	-20/+4
\| \| \| \| \| \| \| \| \| \| \|	This is almost NFC, but there could be some case where the original code had undefs in the constants (rather than just the shuffle mask), and we'll use safe constants rather than undefs now. The FIXME noted in foldShuffledBinop() is already visible in existing tests, so correcting that is the next step. llvm-svn: 336558
*	[InstCombine] fix shuffle-of-binops transform to avoid poison/undef	Sanjay Patel	2018-07-09	1	-21/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As noted in D48987, there are many different ways for this transform to go wrong. In particular, the poison potential for shifts means we have to more careful with those ops. I added tests to make that behavior visible for all of the different cases that I could find. This is a partial fix. To make this review easier, I did not make changes for the single binop pattern (handled in foldSelectShuffleWith1Binop()). I also left out some potential optimizations noted with TODO comments. I'll follow-up once we're confident that things are correct here. The goal is to correct all marked FIXME tests to either avoid the shuffle transform or do it safely. Note that distinguishing when the shuffle mask contains undefs and using getBinOpIdentity() allows for some improvements to div/rem patterns, so there are wins along with the missed opportunities and fixes. Differential Revision: https://reviews.llvm.org/D49047 llvm-svn: 336546
*	[InstCombine] fold shuffle-with-binop and common value	Sanjay Patel	2018-07-03	1	-0/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the last significant change suggested in PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806#c5 ...though there are several follow-ups noted in the code comments in this patch to complete this transform. It's possible that a binop feeding a select-shuffle has been eliminated by earlier transforms (or the code was just written like this in the 1st place), so we'll fail to match the patterns that have 2 binops from: D48401, D48678, D48662, D48485. In that case, we can try to materialize identity constants for the remaining binop to fill in the "ghost" lanes of the vector (where we just want to pass through the original values of the source operand). I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor). Differential Revision: https://reviews.llvm.org/D48830 llvm-svn: 336196
*	[InstCombine] reverse canonicalization of add --> or to allow more shuffle ↵	Sanjay Patel	2018-07-02	1	-12/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	folding This extends D48485 to allow another pair of binops (add/or) to be combined either with or without a leading shuffle: or X, C --> add X, C (when X and C have no common bits set) Here, we need value tracking to determine that the 'or' can be reversed into an 'add', and we've added general infrastructure to allow extending to other opcodes or moving to where other passes could use that functionality. Differential Revision: https://reviews.llvm.org/D48662 llvm-svn: 336128
*	[InstCombine] enhance shuffle-of-binops to allow different variable ops ↵	Sanjay Patel	2018-06-29	1	-12/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR37806) This was discussed in D48401 as another improvement for: https://bugs.llvm.org/show_bug.cgi?id=37806 If we have 2 different variable values, then we shuffle (select) those lanes, shuffle (select) the constants, and then perform the binop. This eliminates a binop. The new shuffle uses the same shuffle mask as the existing shuffle, so there's no danger of creating a difficult shuffle. All of the earlier constraints still apply, but we also check for extra uses to avoid creating more instructions than we'll remove. Additionally, we're disallowing the fold for div/rem because that could expose a UB hole. Differential Revision: https://reviews.llvm.org/D48678 llvm-svn: 335974
*	[InstCombine] fix opcode check in shuffle fold	Sanjay Patel	2018-06-28	1	-1/+1
\| \| \| \| \| \| \| \| \|	There's no way to expose this difference currently, but we should use the updated variable because the original opcodes can go stale if we transform into something new. llvm-svn: 335920
*	[InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806)	Sanjay Patel	2018-06-28	1	-5/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is an enhancement to D48401 that was discussed in: https://bugs.llvm.org/show_bug.cgi?id=37806 We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other direction because that's generally better of course). This allows us to remove the shuffle as we do in the regular opcodes-are-the-same cases. This requires a small hack to make sure we don't introduce any extra poison: https://rise4fun.com/Alive/ZGv Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There are planned enhancements for opcode transforms such or -> add. Note that there's a different fold needed if we've already managed to simplify away a binop as seen in the test based on PR37806, but we manage to get that one case here because this fold is positioned above the demanded elements fold currently. Differential Revision: https://reviews.llvm.org/D48485 llvm-svn: 335888
*	[InstCombine] rearrange shuffle-of-binops logic; NFC	Sanjay Patel	2018-06-22	1	-17/+12
\| \| \| \| \| \| \| \|	The commutative matcher makes things more complicated here, and I'm planning an enhancement where this form is more readable. llvm-svn: 335343
*	[InstCombine] fix shuffle-of-binops bug	Sanjay Patel	2018-06-21	1	-2/+8
\| \| \| \| \| \| \| \| \|	With non-commutative binops, we could be using the same variable value as operand 0 in 1 binop and operand 1 in the other, so we have to check for that possibility and bail out. llvm-svn: 335312
*	[InstCombine] fold vector select of binops with constant ops to 1 binop ↵	Sanjay Patel	2018-06-21	1	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR37806) This is the simplest case from PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806 If we have a common variable operand used in a pair of binops with vector constants that are vector selected together, then we can constant shuffle the constant vectors to eliminate the shuffle instruction. This has some tricky parts that are hopefully addressed in the tests and their respective comments: 1. If the shuffle mask contains an undef element, then that lane of the result is undef: http://llvm.org/docs/LangRef.html#shufflevector-instruction Therefore, we can replace the constant in that lane with an undef value except for div/rem. With div/rem, an undef in the divisor would cause the whole op to be undef. So I'm using the same hack as in D47686 - replace the undefs with '1'. 2. Intersect the wrapping and FMF of the original binops for the new binop. There should be no extra poison or fast-math potential in the new binop that wasn't possible in the original code. 3. Disregard other uses. Given that we're eliminating uses (shortening the dependency chain), I think that's always the right IR canonicalization. But I purposely chose the udiv test to demonstrate the scenario where both intermediate values have other uses because that seems likely worse for codegen with an expensive math op. This seems like a very rare possibility to me, so I don't think it requires a backend patch first. Differential Revision: https://reviews.llvm.org/D48401 llvm-svn: 335283
*	[InstCombine] Gracefully handle out of range extractelement indices	Simon Pilgrim	2017-12-27	1	-3/+5
\| \| \| \| \| \| \| \|	InstSimplify is responsible for handling these, but we shouldn't just assert here. Reduced from oss-fuzz #4808 test case llvm-svn: 321489
*	Reintroduce r320049, r320014 and r319894.	Igor Laevsky	2017-12-13	1	-0/+4
\| \| \| \| \| \|	OpenGL issues should be fixed by now. llvm-svn: 320568