path: root/llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
Commit log (each entry: commit message, author, date, files changed, lines removed/added)
* [InstCombine] replace shuffle's insertelement operand if inserted scalar is not demanded (Sanjay Patel, 2019-12-10, 1 file, -1/+27)

    This pattern is noted as a regression from:
    D70246
    ...where we removed an over-aggressive shuffle simplification.

    SimplifyDemandedVectorElts fails to catch this case when the insert has multiple uses, so I'm proposing to pattern match the minimal sequence directly. This fold does not conflict with any of our current shuffle undef/poison semantics.

    Differential Revision: https://reviews.llvm.org/D71220
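The idea behind this fold can be illustrated with a small model. The sketch below is not the patch itself: the helpers `insertelement`, `shufflevector`, and `op0_lane_is_demanded` are hypothetical Python stand-ins for the IR instructions, and `None` is assumed to represent undef. It shows that when the shuffle mask never reads the inserted lane, the shuffle can use the insert's base vector directly.

```python
UNDEF = None  # assumed stand-in for an undef lane or undef mask element

def insertelement(vec, scalar, index):
    """Return a copy of vec with one lane overwritten."""
    out = list(vec)
    out[index] = scalar
    return out

def shufflevector(v0, v1, mask):
    """Select lanes from v0 ++ v1; an UNDEF mask element yields an UNDEF lane."""
    both = list(v0) + list(v1)
    return [UNDEF if m is UNDEF else both[m] for m in mask]

def op0_lane_is_demanded(mask, lane):
    """Lanes of the first operand are mask values 0..N-1."""
    return lane in mask

base = [10, 11, 12, 13]
other = [20, 21, 22, 23]
ins = insertelement(base, 99, 0)   # scalar inserted into lane 0
mask = [4, 1, 2, 3]                # result lane 0 comes from `other`

before = shufflevector(ins, other, mask)
after = shufflevector(base, other, mask)   # insert operand replaced by its base
```

Because lane 0 of the first operand is never selected, `before` and `after` hold the same lanes, and the insert (which may have other users) no longer feeds the shuffle.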
* [InstCombine] Revert aafde063aaf09285c701c80cd4b543c2beb523e8 and 6749dc3446671df05235d0a218c426a314ac33cd related to bitcast handling of x86_mmx (Craig Topper, 2019-12-03, 1 file, -7/+0)

    This reverts these two commits:
    [InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double.
    [InstCombine] Don't transform bitcasts between x86_mmx and v1i64 into insertelement/extractelement

    We're seeing at least one internal test failure related to a bitcast that was previously before an inline assembly block containing emms being placed after it. This leads to the mmx state ending up not empty after the emms. IR has no way to make any specific guarantees about this. Reverting these patches to get back to previous behavior which at least worked for this test.
* [InstCombine] remove shuffle mask canonicalization that creates undef elements (Sanjay Patel, 2019-11-25, 1 file, -10/+10)

    This is NFC-intended because SimplifyDemandedVectorElts() does the same transform later. As discussed in D70641, we may want to change that behavior, so we need to isolate where it happens.
* [InstCombine] prevent infinite loop from conflicting shuffle mask transforms (Sanjay Patel, 2019-11-25, 1 file, -2/+4)

    The pattern in question is currently not possible because we aggressively (wrongly) transform mask elements to undef values if they choose from an undef operand. That, however, would change if we tighten our semantics for shuffles as discussed in D70641. Adding this check gives us the flexibility to make that change with minimal overhead for current definitions.
* [InstCombine] simplify code for shuffle mask canonicalization; NFC (Sanjay Patel, 2019-11-25, 1 file, -6/+4)

    We never use the local 'Mask' before returning, so that was dead code.
* [InstCombine] remove dead code from shuffle mask canonicalization; NFC (Sanjay Patel, 2019-11-25, 1 file, -2/+2)
* [InstCombine] simplify loop for shuffle mask canonicalization; NFC (Sanjay Patel, 2019-11-25, 1 file, -4/+4)
* [InstCombine] remove identity shuffle simplification for mask with undefs (Sanjay Patel, 2019-11-24, 1 file, -24/+0)

    And simultaneously enhance SimplifyDemandedVectorElts() to recognize that pattern. That preserves some of the old optimizations in IR.

    Given a shuffle that includes undef elements in an otherwise identity mask like:

      define <4 x float> @shuffle(<4 x float> %arg) {
        %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 undef, i32 1, i32 2, i32 3>
        ret <4 x float> %shuf
      }

    We were simplifying that to the input operand. But as discussed in PR43958:
    https://bugs.llvm.org/show_bug.cgi?id=43958
    ...that means that per-vector-element poison that would be stopped by the shuffle can now leak to the result.

    Also note that we still have (and there are tests for) the same transform with no undef elements in the mask (a fully-defined identity mask). I don't think there's any controversy about that case - it's a valid transform under any interpretation of shufflevector/undef/poison.

    Looking at a few of the diffs into codegen, I don't see any difference in final asm. So depending on your perspective, that's good (no real loss of optimization power) or bad (poison exists in the DAG, so we only partially fixed the bug).

    Differential Revision: https://reviews.llvm.org/D70246
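The poison concern described in this commit can be sketched with a toy model. This is only an illustration under assumed semantics: `UNDEF` and `POISON` are hypothetical sentinel strings, and `shufflevector` is a Python stand-in in which an undef mask element produces an undef lane regardless of the input.

```python
UNDEF = "undef"    # assumed sentinel for an undef lane or mask element
POISON = "poison"  # assumed sentinel for a poison lane

def shufflevector(v0, v1, mask):
    """Select lanes from v0 ++ v1; an UNDEF mask element yields UNDEF."""
    both = list(v0) + list(v1)
    return [UNDEF if m == UNDEF else both[m] for m in mask]

arg = [POISON, 1.0, 2.0, 3.0]   # lane 0 of the input is poison
undef_vec = [UNDEF] * 4

# Identity mask with an undef element: lane 0 of the result is undef, not poison.
shuf = shufflevector(arg, undef_vec, [UNDEF, 1, 2, 3])

# Fully-defined identity mask: the result is exactly the input.
full = shufflevector(arg, undef_vec, [0, 1, 2, 3])
```

In this model, replacing `shuf` with `arg` would turn an undef lane into a poison lane, which is the leak the commit removes. Replacing `full` with `arg` changes nothing, matching the note that the fully-defined identity case remains valid.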
* [InstCombine] remove duplicate code for simplifying a shuffle; NFCI (Sanjay Patel, 2019-11-14, 1 file, -7/+0)

    The transform is already handled by InstSimplify or earlier in InstCombine, so trying to do it again is not necessary.
* [InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double. (Craig Topper, 2019-11-10, 1 file, -0/+7)

    The _m64 type is represented in IR as <1 x i64>. The x86-64 ABI on Linux passes <1 x i64> as a double. MMX intrinsics use x86_mmx type in IR. These things result in a lot of bitcasts in mmx code. There's another instcombine that tries to turn bitcast <1 x i64> to double into extractelement and a bitcast. The combine here tries to reverse this extractelement conversion if we see an mmx type.
* [InstCombine] Allow values with multiple users in SimplifyDemandedVectorElts (Piotr Sobczak, 2019-10-21, 1 file, -11/+90)

    Summary:
    Allow for ignoring the check for a single use in SimplifyDemandedVectorElts to be able to simplify operands if DemandedElts is known to contain the union of elements used by all users.

    It is a responsibility of a caller of SimplifyDemandedVectorElts to supply correct DemandedElts.

    Simplify a series of extractelement instructions if only a subset of elements is used.

    Reviewers: reames, arsenm, majnemer, nhaehnle
    Reviewed By: nhaehnle
    Subscribers: wdng, jvesely, nhaehnle, llvm-commits
    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D67345
    llvm-svn: 375395
* [InstCombine] Fix miscompile bug in canEvaluateShuffled (Bjorn Pettersson, 2019-10-18, 1 file, -7/+11)

    Summary:
    Add restrictions in canEvaluateShuffled to prevent that we for example transform

      %0 = insertelement <2 x i16> undef, i16 %a, i32 0
      %1 = srem <2 x i16> %0, <i16 2, i16 1>
      %2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0>

    into

      %1 = insertelement <2 x i16> undef, i16 %a, i32 1
      %2 = srem <2 x i16> %1, <i16 undef, i16 2>

    as having an undef denominator makes the srem undefined (for all vector elements).

    Fixes: https://bugs.llvm.org/show_bug.cgi?id=43689

    Reviewers: spatel, lebedev.ri
    Reviewed By: spatel, lebedev.ri
    Subscribers: lebedev.ri, hiraditya, llvm-commits
    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D69038
    llvm-svn: 375208
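The hazard in this miscompile can be reproduced in a small model. The sketch below assumes `None` represents undef; `remap_constant` and `is_safe_divisor` are hypothetical helpers, not functions from the LLVM source. It pushes the shuffle mask through the constant divisor the way the bad transform did and checks whether the result is still a safe denominator.

```python
UNDEF = None  # assumed stand-in for an undef lane or mask element

def remap_constant(const, mask):
    """Reorder a constant vector by a shuffle mask; unselected lanes become UNDEF."""
    out = [UNDEF] * len(mask)
    for new_lane, old_lane in enumerate(mask):
        if old_lane is not UNDEF:
            out[new_lane] = const[old_lane]
    return out

def is_safe_divisor(vec):
    """A divisor vector is safe only if no lane is undef or zero."""
    return all(x is not UNDEF and x != 0 for x in vec)

divisor = [2, 1]      # <i16 2, i16 1> from the original srem
mask = [UNDEF, 0]     # <i32 undef, i32 0> from the shufflevector

remapped = remap_constant(divisor, mask)
```

The remapped divisor has an undef lane, so evaluating the srem in the shuffled element order is not safe even though the original divisor was.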
* [InstCombine] fold extract+insert into identity shuffle (Sanjay Patel, 2019-09-08, 1 file, -0/+52)

    This is similar to the existing fold for splats added with:
    rL365379

    If we can adjust the shuffle mask to include another element in an identity mask (if it changes vector length, that's an extract/insert subvector operation in the backend), then that can eliminate extractelement/insertelement pairs in IR.

    All targets are expected to lower shuffles with identity masks efficiently.

    llvm-svn: 371340
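As a rough illustration of this fold (a sketch under assumptions, not the committed code), the model below uses hypothetical Python helpers and `None` for undef. It shows an identity shuffle with undef mask elements followed by an extract/insert of the same lane, and the single identity shuffle with one more mask element filled in.

```python
UNDEF = None  # assumed stand-in for an undef lane or mask element

def insertelement(vec, scalar, index):
    out = list(vec)
    out[index] = scalar
    return out

def shufflevector(v0, v1, mask):
    both = list(v0) + list(v1)
    return [UNDEF if m is UNDEF else both[m] for m in mask]

x = ["a", "b", "c", "d"]
undef_vec = [UNDEF] * 4

# Partial identity shuffle, then extract lane 2 of x and insert it into lane 2.
partial = shufflevector(x, undef_vec, [0, 1, UNDEF, UNDEF])
chained = insertelement(partial, x[2], 2)

# The same value as one shuffle whose identity mask now covers lane 2 as well.
folded = shufflevector(x, undef_vec, [0, 1, 2, UNDEF])
```

Both forms produce the same lanes, so the extractelement/insertelement pair can be absorbed into the mask.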
* [InstCombine] fold insertelement into splat of same scalar (Sanjay Patel, 2019-07-08, 1 file, -0/+37)

    Forming the canonical splat shuffle improves analysis and may allow follow-on transforms (although some possibilities are missing as shown in the test diffs).

    The backend generically turns these patterns into build_vector, so there should be no codegen regressions. All targets are expected to be able to lower splats efficiently.

    llvm-svn: 365379
* [InstCombine] canonicalize insert+splat to/from element 0 of vector (Sanjay Patel, 2019-07-08, 1 file, -0/+38)

    We recognize a splat from element 0 in (VectorUtils) llvm::getSplatValue() and also in ShuffleVectorInst::isZeroEltSplatMask(), so this converts to that form for better matching.

    The backend generically turns these patterns into build_vector, so there should be no codegen difference.

    llvm-svn: 365342
* [InstCombine] allow undef elements when forming splat from chain of insertelements (Sanjay Patel, 2019-07-04, 1 file, -4/+17)

    We allow forming a splat (broadcast) shuffle, but we were conservatively limiting that to cases where all elements of the vector are specified. It should be safe from a codegen perspective to allow undefined lanes of the vector because the expansion of a splat shuffle would become the chain of inserts again.

    Forming splat shuffles can reduce IR and help enable further IR transforms.

    Motivating bugs:
    https://bugs.llvm.org/show_bug.cgi?id=42174
    https://bugs.llvm.org/show_bug.cgi?id=16739

    Differential Revision: https://reviews.llvm.org/D63848
    llvm-svn: 365147
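A minimal sketch of the equivalence this commit relies on, using hypothetical Python helpers with `None` as undef: a chain of inserts of the same scalar that leaves one lane unspecified matches a single insert into lane 0 followed by a splat shuffle whose mask is undef for that lane.

```python
UNDEF = None  # assumed stand-in for an undef lane or mask element

def insertelement(vec, scalar, index):
    out = list(vec)
    out[index] = scalar
    return out

def shufflevector(v0, v1, mask):
    both = list(v0) + list(v1)
    return [UNDEF if m is UNDEF else both[m] for m in mask]

undef_vec = [UNDEF] * 4
x = 7

# Chain of insertelements that never writes lane 2.
chain = undef_vec
for lane in (0, 1, 3):
    chain = insertelement(chain, x, lane)

# One insert into lane 0, then a splat mask with an undef element for lane 2.
splat = shufflevector(insertelement(undef_vec, x, 0), undef_vec, [0, 0, UNDEF, 0])
```

The two values agree lane for lane, including the lane that stays undef.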
* [InstCombine] simplify code for inserts -> splat; NFC (Sanjay Patel, 2019-06-26, 1 file, -25/+17)

    llvm-svn: 364441
* [InstCombine] prevent crashing with invalid extractelement index (Sanjay Patel, 2019-05-26, 1 file, -2/+3)

    This was found/reduced from a fuzzer report:
    https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956

    llvm-svn: 361729
* [InstSimplify] fold insertelement-of-extractelement (Sanjay Patel, 2019-05-24, 1 file, -5/+0)

    This was partly handled in InstCombine (only the constant index case), so delete that and zap it more generally in InstSimplify.

    llvm-svn: 361576
* [InstCombine] remove redundant fold for extractelement; NFC (Sanjay Patel, 2019-05-23, 1 file, -4/+0)

    The out-of-bounds index pattern is handled by InstSimplify, so the extractelement should be eliminated next time it is visited.

    llvm-svn: 361570
* [InstCombine] remove redundant fold for insertelement; NFC (Sanjay Patel, 2019-05-23, 1 file, -4/+0)

    The out-of-bounds index pattern is handled by InstSimplify.

    llvm-svn: 361569
* [InstSimplify] insertelement V, undef, ? --> V (Sanjay Patel, 2019-05-23, 1 file, -4/+0)

    This was part of InstCombine, but it's better placed in InstSimplify. InstCombine also had an unreachable but weaker fold for insertelement with undef index, so that is deleted.

    llvm-svn: 361559
* [InstCombine] be more careful when transforming a shuffle mask (Sanjay Patel, 2019-05-23, 1 file, -4/+21)

    This is reduced from a fuzzer test:
    https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890

    Usually, demanded elements should be able to simplify shuffle mask elements that are pointing to undef elements of its source operands, but that doesn't happen in the test case.

    llvm-svn: 361533
* [InstCombine] fold shuffles of insert_subvectors (Sanjay Patel, 2019-05-22, 1 file, -1/+52)

    This should be a valid exception to the general rule of not creating new shuffle masks in IR... because we already do it. :)

    Also, DAG combining/legalization will undo this by widening the shuffle back out if needed.

    Explanation for how we already do this: SLP or vector source can create chains of insert/extract as shown in 1 of the examples from PR16739:
    https://godbolt.org/z/NlK7rA
    https://bugs.llvm.org/show_bug.cgi?id=16739

    And we expect instcombine or DAGCombine to clean that up by creating relatively simple shuffles.

    Differential Revision: https://reviews.llvm.org/D62024
    llvm-svn: 361338
* [InstCombine] move bitcast after insertelement-with-bitcasted-operands (Sanjay Patel, 2019-05-17, 1 file, -0/+14)

    llvm-svn: 361058
* [InstCombine] remove overzealous assert for shuffles (PR41419) (Sanjay Patel, 2019-04-08, 1 file, -2/+2)

    As the TODO indicates, instsimplify could be improved.

    Should fix:
    https://bugs.llvm.org/show_bug.cgi?id=41419

    llvm-svn: 357910
* [InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder (Mikael Holmen, 2019-04-01, 1 file, -1/+8)

    Summary:
    This fixes PR41270.

    The recursive function evaluateInDifferentElementOrder expects to be called on a vector Value, so when we call it on a vector GEP's arguments, we must first check that the argument is indeed a vector.

    Reviewers: reames, spatel
    Reviewed By: spatel
    Subscribers: llvm-commits
    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D60058
    llvm-svn: 357389
* Revert "[InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder" (Mikael Holmen, 2019-04-01, 1 file, -8/+1)

    This reverts commit 75216a6dbcfe5fb55039ef06a07e419fa875f4a5.

    I'll recommit with a better commit message with reference to the phabricator review.

    llvm-svn: 357387
* [InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder (Mikael Holmen, 2019-04-01, 1 file, -1/+8)

    This fixes PR41270.

    The recursive function evaluateInDifferentElementOrder expects to be called on a vector Value, so when we call it on a vector GEP's arguments, we must first check that the argument is indeed a vector.

    llvm-svn: 357385
* [InstCombine] canonicalize select shuffles by commuting (Sanjay Patel, 2019-03-31, 1 file, -0/+9)

    In PR41304:
    https://bugs.llvm.org/show_bug.cgi?id=41304
    ...we have a case where we want to fold a binop of select-shuffle (blended) values.

    Rather than try to match commuted variants of the pattern, we can canonicalize the shuffles and check for mask equality with commuted operands.

    We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a special case that the backend is required to handle because we already canonicalize vector select to this shuffle form.

    So there should be no codegen difference from this change. It's possible that this improves CSE in IR though.

    Differential Revision: https://reviews.llvm.org/D60016
    llvm-svn: 357366
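Commuting a select shuffle can be sketched as follows. This is an illustrative model only: `commute_mask` is a hypothetical helper, and a select (blend) mask is assumed to be one in which result lane i takes lane i from either operand.

```python
UNDEF = None  # assumed stand-in for an undef mask element

def shufflevector(v0, v1, mask):
    both = list(v0) + list(v1)
    return [UNDEF if m is UNDEF else both[m] for m in mask]

def commute_mask(mask, num_elts):
    """Rewrite mask indices so they stay correct after swapping the two operands."""
    out = []
    for m in mask:
        if m is UNDEF:
            out.append(UNDEF)
        elif m < num_elts:
            out.append(m + num_elts)   # was operand 0, now operand 1
        else:
            out.append(m - num_elts)   # was operand 1, now operand 0
    return out

a = ["a0", "a1", "a2", "a3"]
b = ["b0", "b1", "b2", "b3"]
mask = [0, 5, 2, 7]                    # select mask: lane i from a or from b

original = shufflevector(a, b, mask)
commuted = shufflevector(b, a, commute_mask(mask, 4))
```

The commuted form is still a select mask and yields the same lanes, so two blends that differ only in operand order can be compared by mask equality after canonicalizing.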
* [InstCombine] move shuffle canonicalizations before other transforms (Sanjay Patel, 2019-03-29, 1 file, -30/+27)

    This may not be NFC, but I'm not sure how to expose any diffs in tests. In theory, it should be slightly more efficient and possibly more profitable to do the canonicalizations (which can increase the undef elements in the mask) ahead of SimplifyDemandedVectorElts().

    llvm-svn: 357272
* [InstCombine] limit extracting shuffle transform based on uses (Sanjay Patel, 2019-02-05, 1 file, -0/+5)

    As discussed in D53037, this can lead to worse codegen, and we don't generally expect the backend to be able to optimize arbitrary shuffles. If there's only one use of the 1st shuffle, that means it's getting removed, so that should always be safe.

    llvm-svn: 353235
* Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. (Chandler Carruth, 2019-01-19, 1 file, -4/+3)

    We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.

    Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.

    llvm-svn: 351636
* [InstCombine] refactor isCheapToScalarize(); NFC (Sanjay Patel, 2018-12-18, 1 file, -33/+25)

    As the FIXME indicates, this has the potential to go overboard. So I'm not sure if it's even worth keeping this vs. iteratively doing simple matches, but we might as well clean it up.

    llvm-svn: 349523
* InstCombine: Scalarize single use icmp/fcmp (Matt Arsenault, 2018-12-10, 1 file, -0/+12)

    llvm-svn: 348801
* [InstCombine] remove dead code from visitExtractElement (Sanjay Patel, 2018-12-05, 1 file, -6/+0)

    Extracting from a splat constant is always handled by InstSimplify. Move the test for this from InstCombine to InstSimplify to make sure that stays true.

    llvm-svn: 348423
* [InstCombine] reduce duplication in visitExtractElementInst; NFC (Sanjay Patel, 2018-12-05, 1 file, -38/+32)

    llvm-svn: 348418
* [InstCombine] try to turn shuffle into insertelement (Sanjay Patel, 2018-10-30, 1 file, -0/+70)

    shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC'

    The motivating case is at least a couple of steps away: I noticed that SLPVectorizer does not analyze shuffles as well as sequences of insert/extract in PR34724:
    https://bugs.llvm.org/show_bug.cgi?id=34724
    ...so SLP may fail to vectorize when source code has shuffles to start with or instcombine has converted insert/extract to shuffles.

    Independent of that, an insertelement is always a simpler op for IR analysis vs. a shuffle, so we should transform to insert when possible.

    I don't think there's any codegen concern here - if a target can't insert a scalar directly to some fixed element in a vector (x86?), then this should get expanded to the insert+shuffle that we started with.

    Differential Revision: https://reviews.llvm.org/D53507
    llvm-svn: 345607
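The pattern stated in this commit, shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC', can be checked on a concrete case. The sketch below uses hypothetical Python stand-ins for the IR instructions; the particular mask and index values are chosen for illustration only.

```python
UNDEF = None  # assumed stand-in for an undef mask element

def insertelement(vec, scalar, index):
    out = list(vec)
    out[index] = scalar
    return out

def shufflevector(v0, v1, mask):
    both = list(v0) + list(v1)
    return [UNDEF if m is UNDEF else both[m] for m in mask]

unknown = ["?", "?", "?", "?"]           # the base vector does not matter
v1 = ["p", "q", "r", "t"]
ins = insertelement(unknown, "s", 1)     # Scalar inserted at IndexC = 1

# Every lane comes from v1 in place, except lane 2 which reads the inserted scalar.
mask = [4, 5, 1, 7]
as_shuffle = shufflevector(ins, v1, mask)

# Equivalent single insert into v1 at the new index IndexC' = 2.
as_insert = insertelement(v1, "s", 2)
```

Here the only element taken from the first operand is the inserted scalar, so the shuffle is the same as inserting that scalar into `v1`.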
* [InstCombine] use 'match' to simplify code; NFC (Sanjay Patel, 2018-10-20, 1 file, -59/+56)

    llvm-svn: 344855
* [InstCombine] make code more flexible with lambda; NFC (Sanjay Patel, 2018-10-20, 1 file, -4/+10)

    I couldn't tell from svn history when these checks were added, but it pre-dates the split of instcombine into its own directory at rL92459.

    The motivation for changing the check is partly shown by the code in PR34724:
    https://bugs.llvm.org/show_bug.cgi?id=34724

    There are also existing regression tests for SLPVectorizer with sequences of extract+insert that are likely assumed to become shuffles by the vectorizer cost models.

    llvm-svn: 344854
* [InstCombine] add explanatory comment for strange vector logic; NFC (Sanjay Patel, 2018-10-20, 1 file, -0/+16)

    llvm-svn: 344852
* [InstCombine] combine a shuffle and an extract subvector shuffle (Sanjay Patel, 2018-10-14, 1 file, -0/+38)

    This is part of the missing IR-level folding noted in D52912. This should be ok as a canonicalization because the new shuffle mask can't be any more complicated than the existing shuffle mask. If there's some target where the shorter vector shuffle is not legal, it should just end up expanding to something like the pair of shuffles that we're starting with here.

    Differential Revision: https://reviews.llvm.org/D53037
    llvm-svn: 344476
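One way to picture this combine is as mask composition. The sketch below is an assumption-laden model rather than the committed logic: `compose_masks` is a hypothetical helper, the inner shuffle is assumed to extract the leading lanes of a wider vector, and the outer shuffle is assumed to have an undef second operand.

```python
UNDEF = None  # assumed stand-in for an undef lane or mask element

def shufflevector(v0, v1, mask):
    both = list(v0) + list(v1)
    return [UNDEF if m is UNDEF else both[m] for m in mask]

def compose_masks(inner, outer):
    """Map each outer mask element through the inner (extracting) mask."""
    return [UNDEF if o is UNDEF or o >= len(inner) else inner[o] for o in outer]

x = [100, 101, 102, 103, 104, 105, 106, 107]
inner = [0, 1, 2, 3]          # extract the low 4 lanes of the 8-lane vector
outer = [3, UNDEF, 1, 0]      # shuffle applied to the extracted subvector

sub = shufflevector(x, [UNDEF] * 8, inner)
two_step = shufflevector(sub, [UNDEF] * 4, outer)

combined = compose_masks(inner, outer)
one_step = shufflevector(x, [UNDEF] * 8, combined)
```

The combined mask has the same length as the outer mask and only renames indices, which is consistent with the commit's point that the new mask is no more complicated than the existing one.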
* revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization (Sanjay Patel, 2018-10-10, 1 file, -30/+0)

    This commit accidentally included the diffs from D53057.

    llvm-svn: 344178
* [InstCombine] reverse 'trunc X to <N x i1>' canonicalization (Sanjay Patel, 2018-10-09, 1 file, -0/+30)

    icmp ne (and X, 1), 0 --> trunc X to N x i1

    Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors.

    The motivating vector case is from PR37549:
    https://bugs.llvm.org/show_bug.cgi?id=37549

      define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) {
        %c = fcmp ole <4 x float> %x, %y
        %s = sext <4 x i1> %c to <4 x i32>
        %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1>
        %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
        %cond = or <4 x i32> %s1, %s2
        %condtr = trunc <4 x i32> %cond to <4 x i1>
        %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w
        ret <4 x float> %r
      }

    Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch):

    AVX before:
      vcmpleps %xmm1, %xmm0, %xmm0
      vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
      vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
      vorps %xmm0, %xmm1, %xmm0
      vandps LCPI0_0(%rip), %xmm0, %xmm0
      vxorps %xmm1, %xmm1, %xmm1
      vpcmpeqd %xmm1, %xmm0, %xmm0
      vblendvps %xmm0, %xmm3, %xmm2, %xmm0

    AVX after:
      vcmpleps %xmm1, %xmm0, %xmm0
      vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
      vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
      vorps %xmm0, %xmm1, %xmm0
      vblendvps %xmm0, %xmm2, %xmm3, %xmm0

    AVX512f before:
      vcmpleps %xmm1, %xmm0, %xmm0
      vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
      vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
      vorps %xmm0, %xmm1, %xmm0
      vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1]
      vptestnmd %zmm1, %zmm0, %k1
      vblendmps %zmm3, %zmm2, %zmm0 {%k1}

    AVX512f after:
      vcmpleps %xmm1, %xmm0, %xmm0
      vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
      vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
      vorps %xmm0, %xmm1, %xmm0
      vpslld $31, %xmm0, %xmm0
      vptestmd %zmm0, %zmm0, %k1
      vblendmps %zmm2, %zmm3, %zmm0 {%k1}

    AArch64 before:
      fcmge v0.4s, v1.4s, v0.4s
      zip1 v1.4s, v0.4s, v0.4s
      zip2 v0.4s, v0.4s, v0.4s
      orr v0.16b, v1.16b, v0.16b
      movi v1.4s, #1
      and v0.16b, v0.16b, v1.16b
      cmeq v0.4s, v0.4s, #0
      bsl v0.16b, v3.16b, v2.16b

    AArch64 after:
      fcmge v0.4s, v1.4s, v0.4s
      zip1 v1.4s, v0.4s, v0.4s
      zip2 v0.4s, v0.4s, v0.4s
      orr v0.16b, v1.16b, v0.16b
      bsl v0.16b, v2.16b, v3.16b

    PowerPC-le before:
      xvcmpgesp 34, 35, 34
      vspltisw 0, 1
      vmrglw 3, 2, 2
      vmrghw 2, 2, 2
      xxlor 0, 35, 34
      xxlxor 35, 35, 35
      xxland 34, 0, 32
      vcmpequw 2, 2, 3
      xxsel 34, 36, 37, 34

    PowerPC-le after:
      xvcmpgesp 34, 35, 34
      vmrglw 3, 2, 2
      vmrghw 2, 2, 2
      xxlor 0, 35, 34
      xxsel 34, 37, 36, 0

    Differential Revision: https://reviews.llvm.org/D52747
    llvm-svn: 344082
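The equivalence behind the rewrite icmp ne (and X, 1), 0 --> trunc X to N x i1 is that both forms keep only the low bit of each lane. A minimal per-lane sketch, with hypothetical helper names and Python integers standing in for the i32 lanes:

```python
def icmp_ne_and_1(vec):
    """Model of: icmp ne (and X, 1), 0, applied lane by lane."""
    return [(x & 1) != 0 for x in vec]

def trunc_to_i1(vec):
    """Model of: trunc X to <N x i1>, which keeps only the lowest bit of each lane."""
    return [(x & 0b1) == 1 for x in vec]

lanes = list(range(-8, 8))   # includes negative values; Python's & is two's complement
mask_form = icmp_ne_and_1(lanes)
trunc_form = trunc_to_i1(lanes)
```

Both helpers reduce to the same bit test in this model, which is why the commit treats the choice between them as a canonicalization question rather than a semantic one.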
* [InstCombine] make helper function 'static'; NFC (Sanjay Patel, 2018-10-09, 1 file, -2/+2)

    llvm-svn: 344056
* [InstCombine] allow bitcast to/from FP for vector insert/extract transform (Sanjay Patel, 2018-10-04, 1 file, -4/+31)

    This is a follow-up to rL343482 / D52439.

    This was a pattern that initially caused the commit to be reverted because the transform requires a bitcast as shown here.

    llvm-svn: 343794
* [InstCombine] try to convert vector insert+extract to trunc; 2nd try (Sanjay Patel, 2018-10-01, 1 file, -2/+46)

    This was originally committed at rL343407, but reverted at rL343458 because it crashed trying to handle a case where the destination type is FP. This version of the patch adds a check for that possibility. Tests added at rL343480.

    Original commit message:

    This transform is requested for the backend in:
    https://bugs.llvm.org/show_bug.cgi?id=39016
    ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch.

    In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform.

    Differential Revision: https://reviews.llvm.org/D52439
    llvm-svn: 343482
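The shift-then-truncate relationship and its dependence on endianness can be sketched with plain byte manipulation. This is a model under assumptions, not the patch: `bitcast_extract` and `shift_trunc` are hypothetical helpers, and the bitcast is modeled as reinterpreting the scalar's in-memory bytes as a vector of narrower integers.

```python
def bitcast_extract(x, scalar_bits, elt_bits, index, byteorder):
    """Reinterpret x's bytes as narrow elements and read element `index`."""
    raw = x.to_bytes(scalar_bits // 8, byteorder)
    step = elt_bits // 8
    return int.from_bytes(raw[index * step:(index + 1) * step], byteorder)

def shift_trunc(x, scalar_bits, elt_bits, index, byteorder):
    """Same value computed as a logical shift right followed by a truncate."""
    num_elts = scalar_bits // elt_bits
    # Little-endian: element 0 holds the LSBs. Big-endian: the last element does.
    chunk = index if byteorder == "little" else num_elts - 1 - index
    return (x >> (chunk * elt_bits)) & ((1 << elt_bits) - 1)

x = 0x1122334455667788
results = {
    (order, i): (bitcast_extract(x, 64, 16, i, order), shift_trunc(x, 64, 16, i, order))
    for order in ("little", "big")
    for i in range(4)
}
```

For every index and byte order the two computations agree; only the shift amount changes with endianness, which is the tricky part the commit message mentions.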
* Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc" (Hans Wennborg, 2018-10-01, 1 file, -44/+2)

    This caused Chromium builds to fail with "Illegal Trunc" assertion. See https://crbug.com/890723 for repro.

    > This transform is requested for the backend in:
    > https://bugs.llvm.org/show_bug.cgi?id=39016
    > ...but I figured it was worth doing in IR too, and it's probably
    > easier to implement here, so that's this patch.
    >
    > In the simplest case, we are just truncating a scalar value. If the
    > extract index doesn't correspond to the LSBs of the scalar, then we
    > have to shift-right before the truncate. Endian-ness makes this tricky,
    > but hopefully the ASCII-art helps visualize the transform.
    >
    > Differential Revision: https://reviews.llvm.org/D52439

    llvm-svn: 343458
* [InstCombine] try to convert vector insert+extract to trunc (Sanjay Patel, 2018-09-30, 1 file, -2/+44)

    This transform is requested for the backend in:
    https://bugs.llvm.org/show_bug.cgi?id=39016
    ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch.

    In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform.

    Differential Revision: https://reviews.llvm.org/D52439
    llvm-svn: 343407
* [InstCombine] allow lengthening of insertelement to eliminate shuffles (Sanjay Patel, 2018-09-30, 1 file, -2/+8)

    As noted in post-commit comments for D52548, the limitation on increasing vector length can be applied by opcode. As a first step, this patch only allows insertelement to be widened because that has no logical downsides for IR and has little risk of pessimizing codegen.

    This may cause PR39132 to go into hiding during a full compile, but that bug is not fixed.

    llvm-svn: 343406