path: root/llvm/lib/Target/X86/Utils/X86ShuffleDecode.cpp
Commit message | Author | Age | Files | Lines
* [X86][XOP] Fixed VPPERM permute op decoding (PR27472). | Simon Pilgrim | 2016-04-24 | 1 | -1/+1
  Fixed issue with VPPERM target shuffle mask decoding that was incorrectly masking off the 3-bit permute op with a 2-bit mask.
  llvm-svn: 267346
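A minimal sketch of the selector-byte decoding this fix concerns, not the actual LLVM routine. It relies on the XOP encoding in which bits [4:0] of each selector byte are a byte index and bits [7:5] are the permute op (0 = copy source byte, 4 = zero fill). The function name and the `kSentinelZero` value are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sentinel meaning "this result byte is zero"; not LLVM's constant.
constexpr int kSentinelZero = -2;

// Decodes VPPERM selector bytes into a shuffle mask. Returns false (and clears
// Mask) when a selector asks for an op that a plain shuffle cannot express.
bool decodeVPPERMSelectors(const std::vector<uint8_t> &Selectors,
                           std::vector<int> &Mask) {
  Mask.clear();
  for (uint8_t Sel : Selectors) {
    // The permute op is 3 bits wide. Masking with 0x3 instead of 0x7 would
    // fold op 4 (zero fill) into op 0 (copy source byte).
    unsigned Op = (Sel >> 5) & 0x7;
    if (Op == 4) {
      Mask.push_back(kSentinelZero);
      continue;
    }
    if (Op != 0) {
      Mask.clear();
      return false;
    }
    Mask.push_back(Sel & 0x1F); // byte index 0..31 across both sources
  }
  return true;
}
```

With a 2-bit mask, the selector 0x80 would decode as "copy byte 0" rather than "zero", which is the kind of miscompare the one-line fix removes.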
* [NFC] Header cleanup | Mehdi Amini | 2016-04-18 | 1 | -0/+1
  Removed some unused headers, replaced some headers with forward class declarations.
  Found using simple scripts like this one:
  clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
  Patch by Eugene Kosov <claprix@yandex.ru>
  Differential Revision: http://reviews.llvm.org/D19219
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 266595
* [X86][XOP] Added VPPERM constant mask decoding and target shuffle combining support | Simon Pilgrim | 2016-04-16 | 1 | -1/+40
  Added additional test that peeks through bitcast to v16i8 mask
  llvm-svn: 266533
* [X86][AVX512] Fixed VPERMT2* shuffle mask decoding and enabled target shuffle combining. | Simon Pilgrim | 2016-03-06 | 1 | -2/+3
  Patch to add support for target shuffle combining of X86ISD::VPERMV3 nodes, including support for detecting unary shuffles.
  This uncovered several issues with the X86ISD::VPERMV3 shuffle mask decoding of non-64 bit shuffle mask elements - the bit masking wasn't being correctly computed.
  Removed non-constant pool mask decode path as we have no way of testing it right now.
  Differential Revision: http://reviews.llvm.org/D17916
  llvm-svn: 262809
* [X86][AVX] Improved VPERMILPS variable shuffle mask decoding. | Simon Pilgrim | 2016-03-05 | 1 | -0/+18
  Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool.
  Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization.
  Followup to D17681
  llvm-svn: 262784
* [X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test. | Simon Pilgrim | 2016-03-03 | 1 | -3/+3
  PSHUFB decoder was assuming that input was 128 or 256-bit vector only.
  llvm-svn: 262661
* [X86][SSE] Added support for MOVHPD/MOVLPD + MOVHPS/MOVLPS shuffle decoding. | Simon Pilgrim | 2016-02-07 | 1 | -0/+11
  llvm-svn: 260034
* [X86][SSE] Refactored PMOVZX shuffle decoding to use scalar input types | Simon Pilgrim | 2016-02-06 | 1 | -4/+2
  First step towards being able to decode AVX512 PMOVZX instructions without a massive bloat in the shuffle decode switch statement.
  This should also make it easier to decode X86ISD::VZEXT target shuffles in the future.
  llvm-svn: 259995
* [X86] Move shuffle decoding for constant pool into the X86CodeGen library to remove a layering violation in the Util library. | Craig Topper | 2015-12-31 | 1 | -165/+0
  llvm-svn: 256680
* [X86] Fix an unused variable warning in release builds. | Craig Topper | 2015-12-26 | 1 | -0/+2
  llvm-svn: 256453
* [X86] Add support for printing shuffle comments for AVX512 PSHUFB instructions. | Craig Topper | 2015-12-26 | 1 | -4/+2
  llvm-svn: 256452
* [X86] Fix shuffle decoding for variable VPERMIL to be tolerant of the Constant type not matching due to folding in the constant pool and to get VPERMILPD correct. | Craig Topper | 2015-12-26 | 1 | -32/+54
  llvm-svn: 256433
* AVX512: Implemented DAG lowering for shuff64x2/shufi64x2 instructions (shuffle packed values at 128-bit granularity) | Igor Breger | 2015-10-15 | 1 | -0/+20
  Differential Revision: http://reviews.llvm.org/D13648
  llvm-svn: 250400
* [X86][MMX] Added shuffle decodes for MMX/3DNow! shuffles. | Simon Pilgrim | 2015-09-13 | 1 | -2/+13
  Added shuffle decodes for MMX PUNPCK + PSHUFW shuffles.
  Added shuffle decodes for 3DNow! PSWAPD shuffles.
  llvm-svn: 247526
* AVX-512: Lowering for 512-bit vector shuffles. | Elena Demikhovsky | 2015-09-08 | 1 | -0/+70
  Vector types: <8 x 64>, <16 x 32>, <32 x 16> float and integer.
  Differential Revision: http://reviews.llvm.org/D10683
  llvm-svn: 246981
* Fix gcc warnings of different enum and non-enum types in ternaries | Denis Protivensky | 2015-07-07 | 1 | -1/+1
  llvm-svn: 241567
* [X86][AVX] Add support for shuffle decoding of vperm2f128/vperm2i128 with zero'd lanes | Simon Pilgrim | 2015-07-06 | 1 | -5/+3
  The vperm2f128/vperm2i128 shuffle mask decoding was not attempting to deal with shuffles that give zero lanes. This patch fixes this so that the assembly printer can provide shuffle comments.
  As this decoder is also used in X86ISelLowering for shuffle combining, I've added an early-out to match existing behaviour. The hope is that we can add zero support in the future, this would allow other ops' decodes (e.g. insertps) to be combined as well.
  Differential Revision: http://reviews.llvm.org/D10593
  llvm-svn: 241516
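A rough sketch of how a vperm2f128/vperm2i128 immediate maps to a shuffle mask with zeroed lanes, assuming the documented encoding: for each 128-bit half of the result, two bits select one of the four source halves, and bit 3 of that nibble (bits 3 and 7 of the immediate) zeroes the half. The function name and `kSentinelZero` are hypothetical, not LLVM's identifiers.

```cpp
#include <vector>

// Hypothetical sentinel meaning "this element is zero".
constexpr int kSentinelZero = -2;

// NumElts is the element count of one 256-bit vector (e.g. 4 for v4f64).
// Mask indices 0..NumElts-1 refer to the first source, NumElts..2*NumElts-1
// to the second.
std::vector<int> decodeVPERM2X128Mask(unsigned NumElts, unsigned Imm) {
  const unsigned HalfSize = NumElts / 2;
  std::vector<int> Mask;
  for (unsigned Half = 0; Half != 2; ++Half) {
    unsigned Ctrl = Imm >> (Half * 4);           // control nibble for this half
    unsigned Begin = (Ctrl & 0x3) * HalfSize;    // first element of chosen half
    for (unsigned I = 0; I != HalfSize; ++I)
      Mask.push_back((Ctrl & 0x8) ? kSentinelZero
                                  : static_cast<int>(Begin + I));
  }
  return Mask;
}
```

For example, an immediate of 0x81 on a 4-element type keeps the high half of the first source in the low lane and zeroes the high lane.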
* [X86][SSE4A] Shuffle lowering using SSE4A EXTRQ/INSERTQ instructions | Simon Pilgrim | 2015-07-06 | 1 | -0/+74
  This patch adds support for v8i16 and v16i8 shuffle lowering using the immediate versions of the SSE4A EXTRQ and INSERTQ instructions. Although rather limited (they can only act on the lower 64-bits of the source vectors, leave the upper 64-bits of the result vector undefined and don't have VEX encoded variants), the instructions are still useful for the zero extension of any lane (EXTRQ) or inserting a lane into another vector (INSERTQ). Testing demonstrated that it wasn't typically worth it to use these instructions for v2i64 or v4i32 vector shuffles although they are capable of it.
  As well as adding specific pattern matching for the shuffles, the patch uses EXTRQ for zero extension cases where SSE41 isn't available and it's more efficient than the SSE2 'unpack' default approach. It also adds shuffle decode support for the EXTRQ / INSERTQ cases when the instructions are handling full byte-sized extractions / insertions.
  From this foundation, future patches will be able to make use of the instructions for situations that use their ability to extract/insert at the bit level.
  Differential Revision: http://reviews.llvm.org/D10146
  llvm-svn: 241508
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) | Alexander Kornienko | 2015-06-23 | 1 | -1/+1
  Apparently, the style needs to be agreed upon first.
  llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFC | Alexander Kornienko | 2015-06-19 | 1 | -1/+1
  The patch is generated using this command:
  tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
    -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
    llvm/lib/
  Thanks to Eugene Kosov for the original patch!
  llvm-svn: 240137
* Reformat. | NAKAMURA Takumi | 2015-05-25 | 1 | -19/+19
  llvm-svn: 238126
* Prune CRLFs. | NAKAMURA Takumi | 2015-05-25 | 1 | -434/+434
  llvm-svn: 238125
* X86: silence a GCC warning | Saleem Abdulrasool | 2015-01-31 | 1 | -1/+1
  GCC 4.9 gives the following warning:
  warning: enumeral and non-enumeral type in conditional expression
  Cast the enumeral value to an integer within the ternary operation. NFC.
  llvm-svn: 227692
* Remove unused variable. | Diego Novillo | 2015-01-31 | 1 | -2/+2
  Summary: This variable is only used inside an assert. This breaks builds with asserts disabled.
  OK for trunk?
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D7314
  llvm-svn: 227691
* [X86][SSE] Shuffle mask decode support for zero extend, scalar float/double moves and integer load instructions | Simon Pilgrim | 2015-01-31 | 1 | -382/+414
  This patch adds shuffle mask decodes for integer zero extends (pmovzx** and movq xmm,xmm) and scalar float/double loads/moves (movss/movsd).
  Also adds shuffle mask decodes for integer loads (movd/movq).
  Differential Revision: http://reviews.llvm.org/D7228
  llvm-svn: 227688
* [X86][SSE] movddup shuffle mask decodes | Simon Pilgrim | 2015-01-21 | 1 | -6/+20
  Patch to provide shuffle decodes and asm comments for the SSE3/AVX1 movddup double duplication instructions.
  llvm-svn: 226705
* Revert most of r225597 | David Majnemer | 2015-01-11 | 1 | -47/+30
  We can't rely on a DataLayout enlightened constant folder.
  llvm-svn: 225599
* X86: Properly decode shuffle masks when the constant pool type is weird | David Majnemer | 2015-01-11 | 1 | -46/+56
  It's possible for the constant pool entry for the shuffle mask to come from a completely different operation. This occurs when Constants have the same bit pattern but have different types.
  Make DecodePSHUFBMask tolerant of types which, after a bitcast, are appropriately sized vector types.
  This fixes PR22188.
  llvm-svn: 225597
* [X86][SSE] pslldq/psrldq shuffle mask decodes | Simon Pilgrim | 2014-10-14 | 1 | -0/+29
  Patch to provide shuffle decodes and asm comments for the pslldq/psrldq SSE2/AVX2 byte shift instructions.
  Differential Revision: http://reviews.llvm.org/D5598
  llvm-svn: 219738
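A sketch of what decoding these byte shifts into shuffle masks could look like. It assumes the shift is applied independently within each 128-bit (16-byte) lane and that vacated bytes are zero, which is how pslldq/psrldq behave; the function names and the `kSentinelZero` value are illustrative only.

```cpp
#include <vector>

// Hypothetical sentinel meaning "this byte is zero".
constexpr int kSentinelZero = -2;

// PSLLDQ: shift each 16-byte lane left (towards higher byte indices) by Imm.
std::vector<int> decodePSLLDQMask(unsigned NumBytes, unsigned Imm) {
  std::vector<int> Mask;
  for (unsigned Lane = 0; Lane < NumBytes; Lane += 16)
    for (unsigned I = 0; I < 16; ++I)
      Mask.push_back(I >= Imm ? static_cast<int>(I - Imm + Lane)
                              : kSentinelZero);
  return Mask;
}

// PSRLDQ: shift each 16-byte lane right (towards lower byte indices) by Imm.
std::vector<int> decodePSRLDQMask(unsigned NumBytes, unsigned Imm) {
  std::vector<int> Mask;
  for (unsigned Lane = 0; Lane < NumBytes; Lane += 16)
    for (unsigned I = 0; I < 16; ++I) {
      unsigned Base = I + Imm;
      Mask.push_back(Base < 16 ? static_cast<int>(Base + Lane)
                               : kSentinelZero);
    }
  return Mask;
}
```

Passing 16 for `NumBytes` models the SSE2 form and 32 the AVX2 form; the per-lane loop is what keeps the two halves of a 256-bit shift independent.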
* [x86] Implement v16i16 support with AVX2 in the new vector shuffle lowering. | Chandler Carruth | 2014-09-25 | 1 | -4/+11
  This also implements the fancy blend lowering for v16i16 using AVX2 and teaches the X86 backend to print shuffle masks for 256-bit PSHUFB and PBLENDW instructions. It also makes the mask decoding correct for PBLENDW instructions. The yaks, they are legion.
  Tests are updated accordingly. There are some missing tests for the VBLENDVB lowering, but I'll add those in a follow-up as this commit has accumulated enough cruft already.
  llvm-svn: 218430
* [x86] Teach the vector comment parsing and printing to correctly handle undef in the shuffle mask. | Chandler Carruth | 2014-09-23 | 1 | -28/+74
  This shows up when we're printing comments during lowering and we still have an IR-level constant hanging around that models undef.
  A nice consequence of this is *much* prettier test cases where the undef lanes actually show up as undef rather than as a particular set of values. This also allows us to print shuffle comments in cases that use undef such as the recently added variable VPERMILPS lowering. Now those test cases have nice shuffle comments attached with their details.
  The shuffle lowering for PSHUFB has been augmented to use undef, and the shuffle combining has been augmented to comprehend it.
  llvm-svn: 218301
* [x86] Teach the AVX1 path of the new vector shuffle lowering one more trick that I missed. | Chandler Carruth | 2014-09-23 | 1 | -0/+23
  VPERMILPS has a non-immediate memory operand mode that allows it to do asymmetric shuffles in the two 128-bit lanes. Use this rather than two shuffles and a blend.
  However, it turns out the variable shuffle path to VPERMILPS (and VPERMILPD, although that one offers no functional difference from the immediate operand other than variability) wasn't even plumbed through codegen. Do such plumbing so that we can reasonably emit a variable-masked VPERMILP instruction. Also plumb basic comment parsing and printing through so that the tests are reasonable.
  There are still a few tests which don't show the shuffle pattern. These are tests with undef lanes. I'll teach the shuffle decoding and printing to handle undef mask entries in a follow-up. I've looked at the masks and they seem reasonable.
  llvm-svn: 218300
* Fix assert when decoding PSHUFB mask | Robert Lougher | 2014-09-22 | 1 | -6/+4
  The PSHUFB mask decode routine used to assert if the mask index was out of range (<0 or greater than the size of the vector). The problem is, we can legitimately have a PSHUFB with a large index using intrinsics. The instruction only uses the least significant 4 bits. This change removes the assert and masks the index to match the instruction behaviour.
  llvm-svn: 218242
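The masking behaviour described here can be sketched as follows. This is an illustration rather than the LLVM decoder: it assumes the usual PSHUFB semantics, where a control byte with bit 7 set produces zero and otherwise only the low 4 bits select a byte within the same 128-bit lane. The function name and `kSentinelZero` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sentinel meaning "this result byte is zero".
constexpr int kSentinelZero = -2;

// RawMask holds one control byte per result byte (16, 32 or 64 entries).
std::vector<int> decodePSHUFBMask(const std::vector<uint8_t> &RawMask) {
  std::vector<int> Mask;
  Mask.reserve(RawMask.size());
  for (std::size_t I = 0; I < RawMask.size(); ++I) {
    uint8_t M = RawMask[I];
    if (M & 0x80) { // high bit set: the instruction writes zero
      Mask.push_back(kSentinelZero);
      continue;
    }
    // Out-of-range indices are not an error: only the low 4 bits are used,
    // and they index within the 16-byte lane that contains byte I.
    std::size_t LaneBase = I & ~static_cast<std::size_t>(0xF);
    Mask.push_back(static_cast<int>(LaneBase + (M & 0xF)));
  }
  return Mask;
}
```

A control byte of 0x13 therefore decodes to byte 3 of the current lane instead of tripping a range check.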
* [x86] Teach the x86 DAG combiner to form MOVSLDUP and MOVSHDUP instructions when it finds an appropriate pattern. | Chandler Carruth | 2014-09-15 | 1 | -0/+16
  These are lovely instructions, and it's a shame to not use them. =] They are fast, and can have loads folded into their operands, etc.
  I've also plumbed the comment shuffle decoding through the various layers so that the test cases are printed nicely.
  llvm-svn: 217758
* [x86] Teach the instruction printer to decode immediate operands to BLENDPS, BLENDPD, and PBLENDW instructions into pretty shuffle comments. | Chandler Carruth | 2014-08-15 | 1 | -0/+7
  These will be used in my next commit as part of test cases for AVX shuffles which can directly use blend in more places.
  llvm-svn: 215701
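As an illustration of what decoding a blend immediate into a shuffle comment involves, here is a small sketch. It assumes each immediate bit picks the second source for one element, and that for vectors with more than 8 elements (e.g. 256-bit PBLENDW) the 8-bit immediate repeats per 128-bit lane. The function name and parameters are hypothetical.

```cpp
#include <vector>

// NumElts: elements per vector; EltBits: element width in bits.
// Mask indices 0..NumElts-1 refer to the first source, NumElts..2*NumElts-1
// to the second.
std::vector<int> decodeBlendMask(unsigned NumElts, unsigned EltBits,
                                 unsigned Imm) {
  std::vector<int> Mask;
  for (unsigned I = 0; I < NumElts; ++I) {
    // With more than 8 elements the same immediate applies to every lane.
    unsigned Bit = NumElts > 8 ? I % (128 / EltBits) : I;
    Mask.push_back(((Imm >> Bit) & 1) ? static_cast<int>(NumElts + I)
                                      : static_cast<int>(I));
  }
  return Mask;
}
```

For a 4 x 32-bit BLENDPS with immediate 0x5, elements 0 and 2 come from the second source and elements 1 and 3 from the first.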
* [x86] Largely complete the use of PSHUFB in the new vector shuffle lowering with a small addition to it and adding PSHUFB combining. | Chandler Carruth | 2014-08-02 | 1 | -1/+19
  There is one obvious place in the new vector shuffle lowering where we should form PSHUFBs directly: when without them we will unpack a vector of i8s across two different registers and do a potentially 4-way blend as i16s only to re-pack them into i8s afterward. This is the crazy expensive fallback path for i8 shuffles and we can just directly use pshufb here as it will always be cheaper (the unpack and pack are two instructions so even a single shuffle between them hits our three instruction limit for forming PSHUFB).
  However, this doesn't generate very good code in many cases, and it leaves a bunch of common patterns not using PSHUFB. So this patch also adds support for extracting a shuffle mask from PSHUFB in the X86 lowering code, and uses it to handle PSHUFBs in the recursive shuffle combining. This allows us to combine through them, combine multiple ones together, and generally produce sufficiently high quality code.
  Extracting the PSHUFB mask is annoyingly complex because it could be either pre-legalization or post-legalization. At least this doesn't have to deal with re-materialized constants. =] I've added decode routines to handle the different patterns that show up at this level and we dispatch through them as appropriate.
  The two primary test cases are updated. For the v16 test case there is still a lot of room for improvement. Since I was going through it systematically I left behind a bunch of FIXME lines that I'm hoping to turn into ALL lines by the end of this.
  llvm-svn: 214628
* Fix broken assert. | Nick Lewycky | 2014-07-26 | 1 | -1/+1
  llvm-svn: 214019
* X86ShuffleDecode.cpp: Silence a warning. [-Wunused-variable] | NAKAMURA Takumi | 2014-07-26 | 1 | -2/+2
  llvm-svn: 214016
* [x86] Teach the X86 backend to print shuffle comments for PSHUFB instructions which happen to have a constant mask. | Chandler Carruth | 2014-07-25 | 1 | -0/+33
  Currently, this only handles a very narrow set of cases, but those happen to be the cases that I care about for testing shuffles sanely.
  This is a bit trickier than other shuffle instructions because we're decoding constants out of the constant pool. The current MC layer makes it completely impossible to inspect a constant pool entry, so we have to do it at the MI level and attach the comment to the streamer on its way out. So no joy for disassembling, but it does make test cases and asm dumps *much* nicer.
  Sorry for no test cases, but it didn't really seem that valuable to go trolling through existing old test cases and updating them. I'll have lots of testing of this in the upcoming patch for SSSE3 emission in the new vector shuffle lowering code paths.
  llvm-svn: 213986
* Replace ValueTypes.h with MachineValueType.h if possible. | Patrik Hagglund | 2014-03-15 | 1 | -1/+1
  Utilize the previous move of MVT to a separate header for all trivial cases (that don't need any further restructuring).
  Reviewed By: Tim Northover
  llvm-svn: 204003
* Replace '#include ValueTypes.h' with forward declarations. | Patrik Hagglund | 2014-03-12 | 1 | -0/+1
  In some cases the include is pushed "downstream" (or removed if unused).
  llvm-svn: 203644
* Fix 256-bit PALIGNR comment decoding to understand that it works on independent 128-bit lanes. | Craig Topper | 2013-01-28 | 1 | -2/+11
  llvm-svn: 173674
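A sketch of per-lane PALIGNR decoding for byte elements, assuming the AVX2 form concatenates and shifts each 128-bit lane of the two sources independently. The function name, the operand-numbering convention for the mask, and the restriction to shift amounts of at most 16 are assumptions made for this example, not details taken from the commit.

```cpp
#include <cassert>
#include <vector>

// NumBytes: vector width in bytes (16 or 32). Mask indices 0..NumBytes-1 name
// bytes of the operand supplying the low part of each lane's concatenation;
// NumBytes..2*NumBytes-1 name bytes of the other operand.
std::vector<int> decodePALIGNRMask(unsigned NumBytes, unsigned Imm) {
  // Assumption of this sketch: larger shifts, which pull in zeros, are not
  // modelled.
  assert(NumBytes % 16 == 0 && Imm <= 16);
  std::vector<int> Mask;
  for (unsigned Lane = 0; Lane < NumBytes; Lane += 16)
    for (unsigned I = 0; I < 16; ++I) {
      unsigned Base = I + Imm;
      // Running off the end of this lane continues in the same lane of the
      // other operand, not in the next lane of this one.
      if (Base >= 16)
        Base += NumBytes - 16;
      Mask.push_back(static_cast<int>(Base + Lane));
    }
  return Mask;
}
```

Treating a 256-bit PALIGNR as one 32-byte shift would instead make result byte 12 (with a shift of 4) read byte 16 of the same operand, which is the kind of wrong comment the per-lane loop avoids.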
* Fix inconsistent usage of PALIGN and PALIGNR when referring to the same instruction. | Craig Topper | 2013-01-28 | 1 | -1/+2
  llvm-svn: 173667
* X86: Decode PALIGN operands so I don't have to do it in my head. | Benjamin Kramer | 2013-01-26 | 1 | -0/+8
  llvm-svn: 173572
* Use MVT instead of EVT as the argument to all the shuffle decode functions. Simplify some of the decode functions. | Craig Topper | 2012-05-06 | 1 | -22/+18
  llvm-svn: 156268
* Add shuffle decode support for VPERMQ/VPERMPD. | Craig Topper | 2012-05-06 | 1 | -0/+8
  llvm-svn: 156265
* Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982. | Craig Topper | 2012-05-03 | 1 | -4/+2
  llvm-svn: 156059
* Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter. | Craig Topper | 2012-05-02 | 1 | -20/+32
  llvm-svn: 155982
* Don't decode vperm2i128 or vperm2f128 into a shuffle if bit 3 or 7 of the immediate is set. | Craig Topper | 2012-04-17 | 1 | -0/+3
  llvm-svn: 154907
* Factor out target shuffle mask decoding from getShuffleScalarElt and use a SmallVector of int instead of unsigned for shuffle mask in decode functions. Preparation for another change. | Craig Topper | 2012-03-20 | 1 | -16/+10
  llvm-svn: 153079