bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[InstCombine] Allow values with multiple users in SimplifyDemandedVectorElts	Piotr Sobczak	2019-10-21	1	-11/+90
	Summary: Allow ignoring the single-use check in SimplifyDemandedVectorElts so that operands can be simplified when DemandedElts is known to contain the union of elements used by all users. It is the responsibility of the caller of SimplifyDemandedVectorElts to supply a correct DemandedElts. Simplify a series of extractelement instructions if only a subset of elements is used. Reviewers: reames, arsenm, majnemer, nhaehnle Reviewed By: nhaehnle Subscribers: wdng, jvesely, nhaehnle, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67345 llvm-svn: 375395
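The "union of elements used by all users" idea can be illustrated with a toy model. This is only a sketch: the function names are hypothetical, plain Python lists stand in for IR vectors, `None` stands in for undef, and nothing here is LLVM's actual API.

```python
def demanded_union(num_elts, extract_indices):
    """Return a bitmask with bit i set when some user reads lane i."""
    demanded = 0
    for idx in extract_indices:
        if not 0 <= idx < num_elts:
            raise ValueError(f"lane {idx} is out of range for {num_elts} elements")
        demanded |= 1 << idx
    return demanded


def drop_undemanded(vec, demanded):
    """Replace lanes that no user reads with None (standing in for undef)."""
    return [v if (demanded >> i) & 1 else None for i, v in enumerate(vec)]


# Two extractelement users read lanes 0 and 2 of a 4-element vector.
mask = demanded_union(4, [0, 2])
simplified = drop_undemanded(["a", "b", "c", "d"], mask)
```

The simplification is only sound if the mask really covers every user, which is why the commit places that burden on the caller.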
*	[InstCombine] Fix miscompile bug in canEvaluateShuffled	Bjorn Pettersson	2019-10-18	1	-7/+11
	Summary: Add restrictions in canEvaluateShuffled to prevent transforming, for example,
	  %0 = insertelement <2 x i16> undef, i16 %a, i32 0
	  %1 = srem <2 x i16> %0, <i16 2, i16 1>
	  %2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0>
	into
	  %1 = insertelement <2 x i16> undef, i16 %a, i32 1
	  %2 = srem <2 x i16> %1, <i16 undef, i16 2>
	as having an undef denominator makes the srem undefined (for all vector elements).
	Fixes: https://bugs.llvm.org/show_bug.cgi?id=43689
	Reviewers: spatel, lebedev.ri Reviewed By: spatel, lebedev.ri Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69038 llvm-svn: 375208
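The hazard in this commit can be reproduced on paper with a small model. The sketch below is an illustration only: `shuffle` and `divisor_is_safe` are hypothetical helpers, lists stand in for vectors, and `None` stands in for undef.

```python
UNDEF = None  # stand-in for an undef lane


def shuffle(v1, v2, mask):
    """Model shufflevector: pick lanes from the concatenation of v1 and v2."""
    lanes = list(v1) + list(v2)
    return [UNDEF if m is UNDEF else lanes[m] for m in mask]


def divisor_is_safe(divisor):
    """A vector srem is only well defined if no divisor lane is undef or zero."""
    return all(d is not UNDEF and d != 0 for d in divisor)


divisor = [2, 1]
mask = [UNDEF, 0]
# Pushing the shuffle above the srem applies the mask to the constant divisor.
moved_divisor = shuffle(divisor, [UNDEF, UNDEF], mask)
```

Here `moved_divisor` has an undef lane, matching the `<i16 undef, i16 2>` operand in the commit message, so the reordered srem is no longer safe.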
*	[InstCombine] fold extract+insert into identity shuffle	Sanjay Patel	2019-09-08	1	-0/+52
	This is similar to the existing fold for splats added with rL365379. If we can adjust the shuffle mask to include another element in an identity mask (if it changes vector length, that's an extract/insert subvector operation in the backend), then that can eliminate extractelement/insertelement pairs in IR. All targets are expected to lower shuffles with identity masks efficiently. llvm-svn: 371340
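A rough model of this fold, assuming for simplicity that the source and result vectors have the same length. The helper `fold_insert_of_extract` is hypothetical and does not reproduce the exact legality checks in InstCombine.

```python
UNDEF = None  # stand-in for an undef lane


def shuffle(v1, v2, mask):
    """Model shufflevector: pick lanes from the concatenation of v1 and v2."""
    lanes = list(v1) + list(v2)
    return [UNDEF if m is UNDEF else lanes[m] for m in mask]


def fold_insert_of_extract(mask, idx):
    """If mask is an identity mask (undef lanes allowed), return the mask that
    also selects lane idx of the source; otherwise return None."""
    if not 0 <= idx < len(mask):
        return None
    if any(m is not UNDEF and m != i for i, m in enumerate(mask)):
        return None
    new_mask = list(mask)
    new_mask[idx] = idx
    return new_mask


x = [10, 11, 12, 13]
undef4 = [UNDEF] * 4
mask = [0, 1, UNDEF, UNDEF]

# Original sequence: shuffle, then insert (extract x, 2) at lane 2.
before = shuffle(x, undef4, mask)
before[2] = x[2]

# Folded form: a single shuffle with one more identity lane.
after = shuffle(x, undef4, fold_insert_of_extract(mask, 2))
```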
*	[InstCombine] fold insertelement into splat of same scalar	Sanjay Patel	2019-07-08	1	-0/+37
	Forming the canonical splat shuffle improves analysis and may allow follow-on transforms (although some possibilities are missing as shown in the test diffs). The backend generically turns these patterns into build_vector, so there should be no codegen regressions. All targets are expected to be able to lower splats efficiently. llvm-svn: 365379
*	[InstCombine] canonicalize insert+splat to/from element 0 of vector	Sanjay Patel	2019-07-08	1	-0/+38
	We recognize a splat from element 0 in (VectorUtils) llvm::getSplatValue() and also in ShuffleVectorInst::isZeroEltSplatMask(), so this converts to that form for better matching. The backend generically turns these patterns into build_vector, so there should be no codegen difference. llvm-svn: 365342
*	[InstCombine] allow undef elements when forming splat from chain of insertelements	Sanjay Patel	2019-07-04	1	-4/+17
	We allow forming a splat (broadcast) shuffle, but we were conservatively limiting that to cases where all elements of the vector are specified. It should be safe from a codegen perspective to allow undefined lanes of the vector because the expansion of a splat shuffle would become the chain of inserts again. Forming splat shuffles can reduce IR and help enable further IR transforms. Motivating bugs: https://bugs.llvm.org/show_bug.cgi?id=42174 https://bugs.llvm.org/show_bug.cgi?id=16739 Differential Revision: https://reviews.llvm.org/D63848 llvm-svn: 365147
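The relationship between an insertelement chain with undefined lanes and a splat shuffle can be sketched as follows. All names are hypothetical and the model ignores everything except lane contents. It shows that expanding the splat gives back the same lanes as the original chain.

```python
UNDEF = None  # stand-in for an undef lane


def insert_chain(num_elts, scalar, lanes):
    """Build a vector by inserting the same scalar into some lanes of undef."""
    vec = [UNDEF] * num_elts
    for lane in lanes:
        vec[lane] = scalar
    return vec


def as_splat(vec):
    """Return (scalar, mask) if every defined lane holds the same value.

    The mask reads element 0 for defined lanes and stays undef elsewhere.
    Returns None when the vector is not a splat.
    """
    defined = [v for v in vec if v is not UNDEF]
    if not defined or any(v != defined[0] for v in defined):
        return None
    return defined[0], [UNDEF if v is UNDEF else 0 for v in vec]


def expand_splat(scalar, mask):
    """Evaluate: shuffle (insertelement undef, scalar, 0), undef, mask."""
    base = [scalar] + [UNDEF] * (len(mask) - 1)
    return [UNDEF if m is UNDEF else base[m] for m in mask]


vec = insert_chain(4, 7, [0, 1, 3])  # lane 2 is never written
scalar, mask = as_splat(vec)
```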
*	[InstCombine] simplify code for inserts -> splat; NFC	Sanjay Patel	2019-06-26	1	-25/+17
	llvm-svn: 364441
*	[InstCombine] prevent crashing with invalid extractelement index	Sanjay Patel	2019-05-26	1	-2/+3
	This was found/reduced from a fuzzer report: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956 llvm-svn: 361729
*	[InstSimplify] fold insertelement-of-extractelement	Sanjay Patel	2019-05-24	1	-5/+0
	This was partly handled in InstCombine (only the constant index case), so delete that and zap it more generally in InstSimplify. llvm-svn: 361576
*	[InstCombine] remove redundant fold for extractelement; NFC	Sanjay Patel	2019-05-23	1	-4/+0
	The out-of-bounds index pattern is handled by InstSimplify, so the extractelement should be eliminated next time it is visited. llvm-svn: 361570
*	[InstCombine] remove redundant fold for insertelement; NFC	Sanjay Patel	2019-05-23	1	-4/+0
	The out-of-bounds index pattern is handled by InstSimplify. llvm-svn: 361569
*	[InstSimplify] insertelement V, undef, ? --> V	Sanjay Patel	2019-05-23	1	-4/+0
	This was part of InstCombine, but it's better placed in InstSimplify. InstCombine also had an unreachable but weaker fold for insertelement with undef index, so that is deleted. llvm-svn: 361559
*	[InstCombine] be more careful when transforming a shuffle mask	Sanjay Patel	2019-05-23	1	-4/+21
	This is reduced from a fuzzer test: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890 Usually, demanded elements should be able to simplify shuffle mask elements that are pointing to undef elements of its source operands, but that doesn't happen in the test case. llvm-svn: 361533
*	[InstCombine] fold shuffles of insert_subvectors	Sanjay Patel	2019-05-22	1	-1/+52
	This should be a valid exception to the general rule of not creating new shuffle masks in IR... because we already do it. :) Also, DAG combining/legalization will undo this by widening the shuffle back out if needed. Explanation for how we already do this: SLP or vector source can create chains of insert/extract as shown in 1 of the examples from PR16739: https://godbolt.org/z/NlK7rA https://bugs.llvm.org/show_bug.cgi?id=16739 And we expect instcombine or DAGCombine to clean that up by creating relatively simple shuffles. Differential Revision: https://reviews.llvm.org/D62024 llvm-svn: 361338
*	[InstCombine] move bitcast after insertelement-with-bitcasted-operands	Sanjay Patel	2019-05-17	1	-0/+14
	llvm-svn: 361058
*	[InstCombine] remove overzealous assert for shuffles (PR41419)	Sanjay Patel	2019-04-08	1	-2/+2
	As the TODO indicates, instsimplify could be improved. Should fix: https://bugs.llvm.org/show_bug.cgi?id=41419 llvm-svn: 357910
*	[InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder	Mikael Holmen	2019-04-01	1	-1/+8
	Summary: This fixes PR41270. The recursive function evaluateInDifferentElementOrder expects to be called on a vector Value, so when we call it on a vector GEP's arguments, we must first check that the argument is indeed a vector. Reviewers: reames, spatel Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60058 llvm-svn: 357389
*	Revert "[InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder"	Mikael Holmen	2019-04-01	1	-8/+1
	This reverts commit 75216a6dbcfe5fb55039ef06a07e419fa875f4a5. I'll recommit with a better commit message with reference to the phabricator review. llvm-svn: 357387
*	[InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder	Mikael Holmen	2019-04-01	1	-1/+8
	This fixes PR41270. The recursive function evaluateInDifferentElementOrder expects to be called on a vector Value, so when we call it on a vector GEP's arguments, we must first check that the argument is indeed a vector. llvm-svn: 357385
*	[InstCombine] canonicalize select shuffles by commuting	Sanjay Patel	2019-03-31	1	-0/+9
	In PR41304: https://bugs.llvm.org/show_bug.cgi?id=41304 ...we have a case where we want to fold a binop of select-shuffle (blended) values. Rather than try to match commuted variants of the pattern, we can canonicalize the shuffles and check for mask equality with commuted operands. We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a special case that the backend is required to handle because we already canonicalize vector select to this shuffle form. So there should be no codegen difference from this change. It's possible that this improves CSE in IR though. Differential Revision: https://reviews.llvm.org/D60016 llvm-svn: 357366
*	[InstCombine] move shuffle canonicalizations before other transforms	Sanjay Patel	2019-03-29	1	-30/+27
	This may not be NFC, but I'm not sure how to expose any diffs in tests. In theory, it should be slightly more efficient and possibly more profitable to do the canonicalizations (which can increase the undef elements in the mask) ahead of SimplifyDemandedVectorElts(). llvm-svn: 357272
*	[InstCombine] limit extracting shuffle transform based on uses	Sanjay Patel	2019-02-05	1	-0/+5
	As discussed in D53037, this can lead to worse codegen, and we don't generally expect the backend to be able to optimize arbitrary shuffles. If there's only one use of the 1st shuffle, that means it's getting removed, so that should always be safe. llvm-svn: 353235
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[InstCombine] refactor isCheapToScalarize(); NFC	Sanjay Patel	2018-12-18	1	-33/+25
	As the FIXME indicates, this has the potential to go overboard. So I'm not sure if it's even worth keeping this vs. iteratively doing simple matches, but we might as well clean it up. llvm-svn: 349523
*	InstCombine: Scalarize single use icmp/fcmp	Matt Arsenault	2018-12-10	1	-0/+12
	llvm-svn: 348801
*	[InstCombine] remove dead code from visitExtractElement	Sanjay Patel	2018-12-05	1	-6/+0
	Extracting from a splat constant is always handled by InstSimplify. Move the test for this from InstCombine to InstSimplify to make sure that stays true. llvm-svn: 348423
*	[InstCombine] reduce duplication in visitExtractElementInst; NFC	Sanjay Patel	2018-12-05	1	-38/+32
	llvm-svn: 348418
*	[InstCombine] try to turn shuffle into insertelement	Sanjay Patel	2018-10-30	1	-0/+70
	shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC' The motivating case is at least a couple of steps away: I noticed that SLPVectorizer does not analyze shuffles as well as sequences of insert/extract in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 ...so SLP may fail to vectorize when source code has shuffles to start with or instcombine has converted insert/extract to shuffles. Independent of that, an insertelement is always a simpler op for IR analysis vs. a shuffle, so we should transform to insert when possible. I don't think there's any codegen concern here - if a target can't insert a scalar directly to some fixed element in a vector (x86?), then this should get expanded to the insert+shuffle that we started with. Differential Revision: https://reviews.llvm.org/D53507 llvm-svn: 345607
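The pattern `shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC'` can be checked on concrete lanes. The sketch assumes both shuffle operands and the mask have the same length. `fold_shuffle_to_insert` is a hypothetical helper and is more conservative than the real transform may be.

```python
UNDEF = None  # stand-in for an undef lane


def shuffle(v1, v2, mask):
    """Model shufflevector: pick lanes from the concatenation of v1 and v2."""
    lanes = list(v1) + list(v2)
    return [UNDEF if m is UNDEF else lanes[m] for m in mask]


def fold_shuffle_to_insert(index_c, mask):
    """For shuffle (insert ?, Scalar, index_c), V1, mask, return the lane
    index_c' such that the result equals insert V1, Scalar, index_c'.
    Returns None when the mask does not have that shape."""
    n = len(mask)
    new_index = None
    for lane, m in enumerate(mask):
        if m == n + lane:  # lane copied unchanged from V1
            continue
        if m == index_c and new_index is None:  # the one lane holding Scalar
            new_index = lane
            continue
        return None
    return new_index


inserted = [1, 99, 3, 4]  # insert ?, 99, 1
v1 = [5, 6, 7, 8]
mask = [4, 5, 6, 1]  # lanes 0..2 from V1, lane 3 reads the scalar

before = shuffle(inserted, v1, mask)
new_index = fold_shuffle_to_insert(1, mask)
after = list(v1)
after[new_index] = 99  # insert V1, 99, new_index
```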
*	[InstCombine] use 'match' to simplify code; NFC	Sanjay Patel	2018-10-20	1	-59/+56
	llvm-svn: 344855
*	[InstCombine] make code more flexible with lambda; NFC	Sanjay Patel	2018-10-20	1	-4/+10
	I couldn't tell from svn history when these checks were added, but it pre-dates the split of instcombine into its own directory at rL92459. The motivation for changing the check is partly shown by the code in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 There are also existing regression tests for SLPVectorizer with sequences of extract+insert that are likely assumed to become shuffles by the vectorizer cost models. llvm-svn: 344854
*	[InstCombine] add explanatory comment for strange vector logic; NFC	Sanjay Patel	2018-10-20	1	-0/+16
	llvm-svn: 344852
*	[InstCombine] combine a shuffle and an extract subvector shuffle	Sanjay Patel	2018-10-14	1	-0/+38
	This is part of the missing IR-level folding noted in D52912. This should be ok as a canonicalization because the new shuffle mask can't be any more complicated than the existing shuffle mask. If there's some target where the shorter vector shuffle is not legal, it should just end up expanding to something like the pair of shuffles that we're starting with here. Differential Revision: https://reviews.llvm.org/D53037 llvm-svn: 344476
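Combining two single-source shuffles amounts to composing their masks. The sketch below shows that composition in general form. It is an assumption-laden model (`compose_masks` is a hypothetical name, and both shuffles take undef as their second operand), not the specific matcher added by this commit.

```python
UNDEF = None  # stand-in for an undef lane


def shuffle(v1, v2, mask):
    """Model shufflevector: pick lanes from the concatenation of v1 and v2."""
    lanes = list(v1) + list(v2)
    return [UNDEF if m is UNDEF else lanes[m] for m in mask]


def compose_masks(inner_mask, outer_mask):
    """Return M' with shuf X, undef, M' == shuf (shuf X, undef, inner), undef, outer."""
    n = len(inner_mask)
    composed = []
    for m in outer_mask:
        if m is UNDEF or m >= n:  # undef lane, or a lane of the undef operand
            composed.append(UNDEF)
        else:
            composed.append(inner_mask[m])
    return composed


x = [10, 11, 12, 13]
undef4 = [UNDEF] * 4
inner = [3, 2, 1, 0]  # reverse the vector
outer = [0, 1]        # extract the low half

two_step = shuffle(shuffle(x, undef4, inner), undef4, outer)
one_step = shuffle(x, undef4, compose_masks(inner, outer))
```

Every lane of the composed mask is copied from the inner mask or is undef, which is consistent with the commit's claim that the new mask cannot be more complicated than the existing one.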
*	revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization	Sanjay Patel	2018-10-10	1	-30/+0
	This commit accidentally included the diffs from D53057. llvm-svn: 344178
*	[InstCombine] reverse 'trunc X to <N x i1>' canonicalization	Sanjay Patel	2018-10-09	1	-0/+30
	icmp ne (and X, 1), 0 --> trunc X to N x i1
	Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors.
	The motivating vector case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549
	  define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) {
	    %c = fcmp ole <4 x float> %x, %y
	    %s = sext <4 x i1> %c to <4 x i32>
	    %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1>
	    %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
	    %cond = or <4 x i32> %s1, %s2
	    %condtr = trunc <4 x i32> %cond to <4 x i1>
	    %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w
	    ret <4 x float> %r
	  }
	Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch):
	AVX before:
	  vcmpleps %xmm1, %xmm0, %xmm0
	  vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
	  vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
	  vorps %xmm0, %xmm1, %xmm0
	  vandps LCPI0_0(%rip), %xmm0, %xmm0
	  vxorps %xmm1, %xmm1, %xmm1
	  vpcmpeqd %xmm1, %xmm0, %xmm0
	  vblendvps %xmm0, %xmm3, %xmm2, %xmm0
	AVX after:
	  vcmpleps %xmm1, %xmm0, %xmm0
	  vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
	  vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
	  vorps %xmm0, %xmm1, %xmm0
	  vblendvps %xmm0, %xmm2, %xmm3, %xmm0
	AVX512f before:
	  vcmpleps %xmm1, %xmm0, %xmm0
	  vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
	  vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
	  vorps %xmm0, %xmm1, %xmm0
	  vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1]
	  vptestnmd %zmm1, %zmm0, %k1
	  vblendmps %zmm3, %zmm2, %zmm0 {%k1}
	AVX512f after:
	  vcmpleps %xmm1, %xmm0, %xmm0
	  vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
	  vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
	  vorps %xmm0, %xmm1, %xmm0
	  vpslld $31, %xmm0, %xmm0
	  vptestmd %zmm0, %zmm0, %k1
	  vblendmps %zmm2, %zmm3, %zmm0 {%k1}
	AArch64 before:
	  fcmge v0.4s, v1.4s, v0.4s
	  zip1 v1.4s, v0.4s, v0.4s
	  zip2 v0.4s, v0.4s, v0.4s
	  orr v0.16b, v1.16b, v0.16b
	  movi v1.4s, #1
	  and v0.16b, v0.16b, v1.16b
	  cmeq v0.4s, v0.4s, #0
	  bsl v0.16b, v3.16b, v2.16b
	AArch64 after:
	  fcmge v0.4s, v1.4s, v0.4s
	  zip1 v1.4s, v0.4s, v0.4s
	  zip2 v0.4s, v0.4s, v0.4s
	  orr v0.16b, v1.16b, v0.16b
	  bsl v0.16b, v2.16b, v3.16b
	PowerPC-le before:
	  xvcmpgesp 34, 35, 34
	  vspltisw 0, 1
	  vmrglw 3, 2, 2
	  vmrghw 2, 2, 2
	  xxlor 0, 35, 34
	  xxlxor 35, 35, 35
	  xxland 34, 0, 32
	  vcmpequw 2, 2, 3
	  xxsel 34, 36, 37, 34
	PowerPC-le after:
	  xvcmpgesp 34, 35, 34
	  vmrglw 3, 2, 2
	  vmrghw 2, 2, 2
	  xxlor 0, 35, 34
	  xxsel 34, 37, 36, 0
	Differential Revision: https://reviews.llvm.org/D52747 llvm-svn: 344082
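The equivalence behind `icmp ne (and X, 1), 0 --> trunc X to i1` can be confirmed per lane by brute force. This is a small sanity check over 8-bit values only, with hypothetical helper names. It models a single lane as an unsigned Python integer.

```python
def icmp_ne_and_1(x):
    """Model: icmp ne (and x, 1), 0."""
    return (x & 1) != 0


def trunc(x, bits):
    """Model integer truncation: keep only the low `bits` bits."""
    return x % (1 << bits)


def trunc_to_i1(x):
    """Model: trunc x to i1, reported as a boolean."""
    return trunc(x, 1) == 1


# Compare both forms for every i8 bit pattern.
mismatches = [x for x in range(256) if icmp_ne_and_1(x) != trunc_to_i1(x)]
```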
*	[InstCombine] make helper function 'static'; NFC	Sanjay Patel	2018-10-09	1	-2/+2
	llvm-svn: 344056
*	[InstCombine] allow bitcast to/from FP for vector insert/extract transform	Sanjay Patel	2018-10-04	1	-4/+31
	This is a follow-up to rL343482 / D52439. This was a pattern that initially caused the commit to be reverted because the transform requires a bitcast as shown here. llvm-svn: 343794
*	[InstCombine] try to convert vector insert+extract to trunc; 2nd try	Sanjay Patel	2018-10-01	1	-2/+46
	This was originally committed at rL343407, but reverted at rL343458 because it crashed trying to handle a case where the destination type is FP. This version of the patch adds a check for that possibility. Tests added at rL343480. Original commit message: This transform is requested for the backend in: https://bugs.llvm.org/show_bug.cgi?id=39016 ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch. In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform. Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343482
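The shift-then-truncate rule, including its dependence on endianness, can be illustrated by comparing it against a byte-level reference. This is a sketch with hypothetical function names. It assumes an integer scalar whose width is a multiple of the element width, and element widths that are whole bytes.

```python
def extract_via_memory(x, src_bits, elt_bits, index, little_endian):
    """Reference: lay the scalar out in memory order and read element `index`."""
    order = "little" if little_endian else "big"
    raw = x.to_bytes(src_bits // 8, order)
    step = elt_bits // 8
    return int.from_bytes(raw[index * step:(index + 1) * step], order)


def extract_via_shift_trunc(x, src_bits, elt_bits, index, little_endian):
    """Shift the wanted element down to the low bits, then truncate."""
    num_elts = src_bits // elt_bits
    # On big-endian targets element 0 holds the most significant bits.
    chunk = index if little_endian else num_elts - 1 - index
    return (x >> (chunk * elt_bits)) & ((1 << elt_bits) - 1)


x = 0x11223344
results = [
    (extract_via_memory(x, 32, bits, i, le), extract_via_shift_trunc(x, 32, bits, i, le))
    for bits in (8, 16)
    for le in (True, False)
    for i in range(32 // bits)
]
```

When the element index already corresponds to the least significant bits, the shift amount is zero and only the truncate remains, which is the "simplest case" the message mentions.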
*	Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"	Hans Wennborg	2018-10-01	1	-44/+2
	This caused Chromium builds to fail with "Illegal Trunc" assertion. See https://crbug.com/890723 for repro. > This transform is requested for the backend in: > https://bugs.llvm.org/show_bug.cgi?id=39016 > ...but I figured it was worth doing in IR too, and it's probably > easier to implement here, so that's this patch. > > In the simplest case, we are just truncating a scalar value. If the > extract index doesn't correspond to the LSBs of the scalar, then we > have to shift-right before the truncate. Endian-ness makes this tricky, > but hopefully the ASCII-art helps visualize the transform. > > Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343458
*	[InstCombine] try to convert vector insert+extract to trunc	Sanjay Patel	2018-09-30	1	-2/+44
	This transform is requested for the backend in: https://bugs.llvm.org/show_bug.cgi?id=39016 ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch. In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform. Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343407
*	[InstCombine] allow lengthening of insertelement to eliminate shuffles	Sanjay Patel	2018-09-30	1	-2/+8
	As noted in post-commit comments for D52548, the limitation on increasing vector length can be applied by opcode. As a first step, this patch only allows insertelement to be widened because that has no logical downsides for IR and has little risk of pessimizing codegen. This may cause PR39132 to go into hiding during a full compile, but that bug is not fixed. llvm-svn: 343406
*	[InstCombine] fix formatting in vector evaluators; NFC	Sanjay Patel	2018-09-29	1	-13/+13
	We need to alter the functionality as shown in D52548. llvm-svn: 343379
*	[InstCombine] don't propagate wider shufflevector arguments to predecessors	Sanjay Patel	2018-09-28	1	-1/+2
	InstCombine would propagate shufflevector insts that had wider output vectors onto predecessors, which would sometimes push undef's onto the divisor of a div/rem and result in bad codegen. I've fixed this by just banning propagating shufflevector back if the result of the shufflevector is wider than the input vectors. Patch by: @sheredom (Neil Henning) Differential Revision: https://reviews.llvm.org/D52548 llvm-svn: 343329
*	[InstCombine] add bitcast+extelt helper function; NFC	Sanjay Patel	2018-09-24	1	-14/+26
	We can handle patterns where the elements have different sizes, so refactoring ahead of trying to add another blob within these clauses. llvm-svn: 342918
*	[InstCombine] improve variable name and use 'match'; NFC	Sanjay Patel	2018-09-24	1	-13/+15
	'width' of a vector usually refers to the bit-width. https://bugs.llvm.org/show_bug.cgi?id=39016 shows a case where we could extend this fold to handle a case where the number of elements in the bitcasted vector is not equal to the resulting value. llvm-svn: 342902
*	[InstCombine] narrow vector select with padded condition and extracted result (PR38691)	Sanjay Patel	2018-09-07	1	-0/+38
	shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask --> sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask) The motivating case from: https://bugs.llvm.org/show_bug.cgi?id=38691 ...is the last regression test. In that case, we're just left with the narrow select. Note that if we do create new shuffles, they use the existing extraction identity mask, so there's no danger that this transform creates arbitrary shuffles. Differential Revision: https://reviews.llvm.org/D51496 llvm-svn: 341708
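The narrowing rule can be checked lane by lane on a small example. The sketch uses hypothetical helpers, lists for vectors, and `None` for undef. It covers only the case where the wide mask pads the condition with undef lanes and the narrow mask extracts the original lanes.

```python
UNDEF = None  # stand-in for an undef lane


def shuffle(v1, v2, mask):
    """Model shufflevector: pick lanes from the concatenation of v1 and v2."""
    lanes = list(v1) + list(v2)
    return [UNDEF if m is UNDEF else lanes[m] for m in mask]


def select(cond, x, y):
    """Model a vector select; an undef condition lane gives an undef result."""
    return [UNDEF if c is UNDEF else (a if c else b) for c, a, b in zip(cond, x, y)]


narrow_cond = [True, False]
wide_mask = [0, 1, UNDEF, UNDEF]  # pad the condition to 4 lanes
narrow_mask = [0, 1]              # extract the low 2 lanes
x, y = [1, 2, 3, 4], [5, 6, 7, 8]
undef2, undef4 = [UNDEF] * 2, [UNDEF] * 4

# Before: wide select on a padded condition, then extract.
wide_cond = shuffle(narrow_cond, undef2, wide_mask)
before = shuffle(select(wide_cond, x, y), undef4, narrow_mask)

# After: extract the operands first, then select at the narrow width.
after = select(
    narrow_cond,
    shuffle(x, undef4, narrow_mask),
    shuffle(y, undef4, narrow_mask),
)
```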
*	[InstCombine] move declarations closer to uses; NFC	Sanjay Patel	2018-08-29	1	-5/+3
	llvm-svn: 340930
*	[InstCombine] remove unnecessary shuffle undef folding	Sanjay Patel	2018-08-29	1	-7/+0
	Add a test for constant folding to show that (shuffle undef, undef, mask) should already be handled via instsimplify. llvm-svn: 340926
*	Remove trailing space	Fangrui Song	2018-07-30	1	-1/+1
	sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h} llvm-svn: 338293
*	[InstCombine] allow flag propagation when using safe constant	Sanjay Patel	2018-07-10	1	-2/+3
	This corresponds with the code for the single binop pattern added in rL336684. llvm-svn: 336696
*	[InstCombine] safely allow non-commutative binop identity constant folds	Sanjay Patel	2018-07-10	1	-8/+11
	This was originally intended with D48893, but as discussed there, we have to make the folds safe from producing extra poison. This should give the single binop folds the same capabilities as the existing folds for 2-binops+shuffle. LLVM binary opcode review: there are a total of 18 binops. There are 7 commutative binops (add, mul, and, or, xor, fadd, fmul) which we already fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr, fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't bother with sub/fsub with constant operand 1 because those are canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18. llvm-svn: 336684
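The opcode tally in this message (7 + 6 + 3 + 2 = 18) can be written out as a partition of the binary opcodes. The group names below are labels chosen for this illustration, and the opcode list reflects the 18 binary operators the message refers to.

```python
ALL_BINOPS = {
    "add", "fadd", "sub", "fsub", "mul", "fmul",
    "udiv", "sdiv", "fdiv", "urem", "srem", "frem",
    "shl", "lshr", "ashr", "and", "or", "xor",
}

GROUPS = {
    "commutative, already folded": {"add", "mul", "and", "or", "xor", "fadd", "fmul"},
    "newly folded by this patch": {"shl", "lshr", "ashr", "fdiv", "udiv", "sdiv"},
    "no folds (remainders)": {"srem", "urem", "frem"},
    "canonicalized to add/fadd": {"sub", "fsub"},
}

sizes = [len(ops) for ops in GROUPS.values()]
covered = set().union(*GROUPS.values())
```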