summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/InstCombine/shuffle_select.ll
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] remove identity shuffle simplification for mask with undefsSanjay Patel2019-11-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And simultaneously enhance SimplifyDemandedVectorElts() to rcognize that pattern. That preserves some of the old optimizations in IR. Given a shuffle that includes undef elements in an otherwise identity mask like: define <4 x float> @shuffle(<4 x float> %arg) { %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 undef, i32 1, i32 2, i32 3> ret <4 x float> %shuf } We were simplifying that to the input operand. But as discussed in PR43958: https://bugs.llvm.org/show_bug.cgi?id=43958 ...that means that per-vector-element poison that would be stopped by the shuffle can now leak to the result. Also note that we still have (and there are tests for) the same transform with no undef elements in the mask (a fully-defined identity mask). I don't think there's any controversy about that case - it's a valid transform under any interpretation of shufflevector/undef/poison. Looking at a few of the diffs into codegen, I don't see any difference in final asm. So depending on your perspective, that's good (no real loss of optimization power) or bad (poison exists in the DAG, so we only partially fixed the bug). Differential Revision: https://reviews.llvm.org/D70246
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-171-0/+1466
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-171-1466/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* [InstCombine] remove overzealous assert for shuffles (PR41419)Sanjay Patel2019-04-081-0/+10
| | | | | | | | | As the TODO indicates, instsimplify could be improved. Should fix: https://bugs.llvm.org/show_bug.cgi?id=41419 llvm-svn: 357910
* [InstCombine] canonicalize select shuffles by commutingSanjay Patel2019-03-311-14/+14
| | | | | | | | | | | | | | | | | | | | In PR41304: https://bugs.llvm.org/show_bug.cgi?id=41304 ...we have a case where we want to fold a binop of select-shuffle (blended) values. Rather than try to match commuted variants of the pattern, we can canonicalize the shuffles and check for mask equality with commuted operands. We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a special case that the backend is required to handle because we already canonicalize vector select to this shuffle form. So there should be no codegen difference from this change. It's possible that this improves CSE in IR though. Differential Revision: https://reviews.llvm.org/D60016 llvm-svn: 357366
* [ValueTracking] More accurate unsigned add overflow detectionNikita Popov2019-02-281-1/+1
| | | | | | | | | | | | Part of D58593. Compute precise overflow conditions based on all known bits, rather than just the sign bits. Unsigned a + b overflows iff a > ~b, and we can determine whether this always/never happens based on the minimal and maximal values achievable for a and ~b subject to the known bits constraint. llvm-svn: 355072
* [InstCombine] drop poison flags in SimplifyVectorDemandedEltsSanjay Patel2018-10-041-1/+1
| | | | | | | | | | | | | | | We established the (unfortunately complicated) rules for UB/poison propagation with vector ops in: D48893 D48987 D49047 It's clear from the affected tests that we are potentially creating poison where none existed before the transforms. For add/sub/mul, the answer is simple: just drop the flags because the extra undef vector lanes are generally more valuable for analysis and codegen. llvm-svn: 343819
* [InstCombine] allow SimplifyDemandedVectorElts to work with FP binopsSanjay Patel2018-10-031-7/+7
| | | | | | | | | | | | | | | | | We're a long way from D50992 and D51553, but this is where we have to start. We weren't back-propagating undefs into binop constant values for anything but add/sub/mul/and/or/xor. This is likely because we have to be careful about not introducing UB/poison with div/rem/shift. But I suspect we already are getting the poison part wrong for add/sub/mul (although it may not be possible to expose the bug currently because we use SimplifyDemandedVectorElts from a limited set of opcodes). See the discussion/implementation from D48987 and D49047. This patch just enables functionality for FP ops because those do not have UB/poison potential. llvm-svn: 343727
* [InstCombine] allow flag propagation when using safe constantSanjay Patel2018-07-101-7/+4
| | | | | | | This corresponds with the code for the single binop pattern added in rL336684. llvm-svn: 336696
* [InstCombine] safely allow non-commutative binop identity constant foldsSanjay Patel2018-07-101-28/+14
| | | | | | | | | | | | | | | | This was originally intended with D48893, but as discussed there, we have to make the folds safe from producing extra poison. This should give the single binop folds the same capabilities as the existing folds for 2-binops+shuffle. LLVM binary opcode review: there are a total of 18 binops. There are 7 commutative binops (add, mul, and, or, xor, fadd, fmul) which we already fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr, fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't bother with sub/fsub with constant operand 1 because those are canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18. llvm-svn: 336684
* [InstCombine] drop poison flags when shuffle mask undef propagates to constantSanjay Patel2018-07-101-4/+3
| | | | llvm-svn: 336679
* [InstCombine] allow more shuffle-binop folds with safe constantsSanjay Patel2018-07-101-14/+14
| | | | | | | | | | | | The case with 2 variables is more complicated than the case where we eliminate the shuffle entirely because a shuffle with an undef mask element creates an undef result. I'm not aware of any current analysis/transform that recognizes that undef propagating to a div/rem/shift, but we have to guard against the possibility. llvm-svn: 336668
* [InstCombine] allow more shuffle folds using safe constantsSanjay Patel2018-07-091-8/+4
| | | | | | | | | | getSafeVectorConstantForBinop() was calling getBinOpIdentity() assuming that the constant we wanted was operand 1 (RHS). That's wrong, but I don't think we could expose a bug or even a suboptimal fold from that because the callers have other guards for any binop that would have been affected. llvm-svn: 336617
* [InstCombine] correct test comments; NFCSanjay Patel2018-07-091-2/+2
| | | | llvm-svn: 336570
* [InstCombine] fix shuffle-of-binops transform to avoid poison/undef Sanjay Patel2018-07-091-47/+58
| | | | | | | | | | | | | | | | | | | | As noted in D48987, there are many different ways for this transform to go wrong. In particular, the poison potential for shifts means we have to more careful with those ops. I added tests to make that behavior visible for all of the different cases that I could find. This is a partial fix. To make this review easier, I did not make changes for the single binop pattern (handled in foldSelectShuffleWith1Binop()). I also left out some potential optimizations noted with TODO comments. I'll follow-up once we're confident that things are correct here. The goal is to correct all marked FIXME tests to either avoid the shuffle transform or do it safely. Note that distinguishing when the shuffle mask contains undefs and using getBinOpIdentity() allows for some improvements to div/rem patterns, so there are wins along with the missed opportunities and fixes. Differential Revision: https://reviews.llvm.org/D49047 llvm-svn: 336546
* [InstCombine] add more tests for potentially poisonous shifts; NFCSanjay Patel2018-07-061-0/+43
| | | | llvm-svn: 336454
* [InstCombine] add more tests with poison and undef; NFCSanjay Patel2018-07-061-5/+540
| | | | | | | | As discussed in D48987 and D48893, there are many different ways to go wrong depending on the binop (and as shown here we already do go wrong in some cases). llvm-svn: 336450
* [InstCombine] add tests for shuffle+binop with constant op1; NFCSanjay Patel2018-07-031-4/+25
| | | | | | This adds coverage for a planned enhancement for ConstantExpr::getBinOpIdentity() noted in D48830. llvm-svn: 336220
* [Constants] add identity constants for fadd/fmulSanjay Patel2018-07-031-4/+2
| | | | | | | | | | | | | | | | As the test diffs show, the current users of getBinOpIdentity() are InstCombine and Reassociate. SLP vectorizer is a candidate for using this functionality too (D28907). The InstCombine shuffle improvements are part of the planned enhancements noted in D48830. InstCombine actually has several other uses of getBinOpIdentity() via SimplifyUsingDistributiveLaws(), but we don't call that for any FP ops. Fixing that might be another part of removing the custom reassociation in InstCombine that is only done for fadd+fmul. llvm-svn: 336215
* [InstCombine] fold shuffle-with-binop and common valueSanjay Patel2018-07-031-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | This is the last significant change suggested in PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806#c5 ...though there are several follow-ups noted in the code comments in this patch to complete this transform. It's possible that a binop feeding a select-shuffle has been eliminated by earlier transforms (or the code was just written like this in the 1st place), so we'll fail to match the patterns that have 2 binops from: D48401, D48678, D48662, D48485. In that case, we can try to materialize identity constants for the remaining binop to fill in the "ghost" lanes of the vector (where we just want to pass through the original values of the source operand). I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor). Differential Revision: https://reviews.llvm.org/D48830 llvm-svn: 336196
* [InstCombine] reverse canonicalization of add --> or to allow more shuffle ↵Sanjay Patel2018-07-021-12/+8
| | | | | | | | | | | | | | | | folding This extends D48485 to allow another pair of binops (add/or) to be combined either with or without a leading shuffle: or X, C --> add X, C (when X and C have no common bits set) Here, we need value tracking to determine that the 'or' can be reversed into an 'add', and we've added general infrastructure to allow extending to other opcodes or moving to where other passes could use that functionality. Differential Revision: https://reviews.llvm.org/D48662 llvm-svn: 336128
* [InstCombine] adjust shuffle tests with IR flags; NFCSanjay Patel2018-07-021-4/+3
| | | | | | | | | Due to current limitations in constant analysis, we need flags on add or mul to show propagation for the potential transform suggested in these tests (no other binops currently report identity constants). llvm-svn: 336101
* [InstCombine] add tests for shuffle-binop; NFCSanjay Patel2018-07-021-37/+256
| | | | | | This is another pattern mentioned in PR37806. llvm-svn: 336096
* [InstCombine] add more tests for shuffle-binop folds; NFCSanjay Patel2018-06-291-1/+73
| | | | | | | The mul+shl tests add coverage for the fold enabled with D48678. The and+or tests are not handled yet; that's D48662. llvm-svn: 335984
* [InstCombine] enhance shuffle-of-binops to allow different variable ops ↵Sanjay Patel2018-06-291-35/+28
| | | | | | | | | | | | | | | | | | | | | | | (PR37806) This was discussed in D48401 as another improvement for: https://bugs.llvm.org/show_bug.cgi?id=37806 If we have 2 different variable values, then we shuffle (select) those lanes, shuffle (select) the constants, and then perform the binop. This eliminates a binop. The new shuffle uses the same shuffle mask as the existing shuffle, so there's no danger of creating a difficult shuffle. All of the earlier constraints still apply, but we also check for extra uses to avoid creating more instructions than we'll remove. Additionally, we're disallowing the fold for div/rem because that could expose a UB hole. Differential Revision: https://reviews.llvm.org/D48678 llvm-svn: 335974
* [InstCombine] adjust shuffle tests; NFCSanjay Patel2018-06-281-5/+5
| | | | | | Use xor for the extra uses test because div/rem have other problems. llvm-svn: 335924
* [InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806)Sanjay Patel2018-06-281-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | This is an enhancement to D48401 that was discussed in: https://bugs.llvm.org/show_bug.cgi?id=37806 We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other direction because that's generally better of course). This allows us to remove the shuffle as we do in the regular opcodes-are-the-same cases. This requires a small hack to make sure we don't introduce any extra poison: https://rise4fun.com/Alive/ZGv Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There are planned enhancements for opcode transforms such or -> add. Note that there's a different fold needed if we've already managed to simplify away a binop as seen in the test based on PR37806, but we manage to get that one case here because this fold is positioned above the demanded elements fold currently. Differential Revision: https://reviews.llvm.org/D48485 llvm-svn: 335888
* [InstCombine] add tests for vector-select-of-binops with 2 variables; NFCSanjay Patel2018-06-271-0/+263
| | | | llvm-svn: 335778
* [InstCombine] add more tests for shuffle with different binops; NFCSanjay Patel2018-06-271-0/+36
| | | | llvm-svn: 335756
* [InstCombine] add shuffle+binops test from PR37806; NFCSanjay Patel2018-06-221-0/+15
| | | | | | | | | This one shows another pattern that we'll need to match in some cases, but the current ordering of folds allows us to match this as 2 binops before simplification takes place. llvm-svn: 335347
* [InstCombine] add tests for shuffle-with-different-binops; NFCSanjay Patel2018-06-221-0/+42
| | | | llvm-svn: 335345
* [InstCombine] fix shuffle-of-binops bugSanjay Patel2018-06-211-2/+3
| | | | | | | | | With non-commutative binops, we could be using the same variable value as operand 0 in 1 binop and operand 1 in the other, so we have to check for that possibility and bail out. llvm-svn: 335312
* [InstCombine] add test for shuffle-of-binops; NFCSanjay Patel2018-06-211-0/+14
| | | | | | This shows a miscompile that was missed in rL335283. llvm-svn: 335311
* [InstCombine] fold vector select of binops with constant ops to 1 binop ↵Sanjay Patel2018-06-211-48/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (PR37806) This is the simplest case from PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806 If we have a common variable operand used in a pair of binops with vector constants that are vector selected together, then we can constant shuffle the constant vectors to eliminate the shuffle instruction. This has some tricky parts that are hopefully addressed in the tests and their respective comments: 1. If the shuffle mask contains an undef element, then that lane of the result is undef: http://llvm.org/docs/LangRef.html#shufflevector-instruction Therefore, we can replace the constant in that lane with an undef value except for div/rem. With div/rem, an undef in the divisor would cause the whole op to be undef. So I'm using the same hack as in D47686 - replace the undefs with '1'. 2. Intersect the wrapping and FMF of the original binops for the new binop. There should be no extra poison or fast-math potential in the new binop that wasn't possible in the original code. 3. Disregard other uses. Given that we're eliminating uses (shortening the dependency chain), I think that's always the right IR canonicalization. But I purposely chose the udiv test to demonstrate the scenario where both intermediate values have other uses because that seems likely worse for codegen with an expensive math op. This seems like a very rare possibility to me, so I don't think it requires a backend patch first. Differential Revision: https://reviews.llvm.org/D48401 llvm-svn: 335283
* [InstCombine] fix typo in test comment; NFCSanjay Patel2018-06-201-1/+1
| | | | llvm-svn: 335165
* [InstCombine] add vector select of binops tests (PR37806)Sanjay Patel2018-06-201-0/+256
These represent the most basic requested transform - a matching operand and 2 constant operands. llvm-svn: 335151
OpenPOWER on IntegriCloud