path: root/llvm/lib/Target/X86/X86ISelLowering.cpp
Commit log for this file, most recent first. Each entry lists the commit message, author, date, and files/lines changed.
...
* Remove unused variable.
  Diego Novillo, 2017-02-14 (1 file, -1/+0)
  llvm-svn: 295065
* [X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise UNDEF inputs
  Simon Pilgrim, 2017-02-14 (1 file, -7/+21)
  Add support for specifying an UNPCK input as UNDEF.
  llvm-svn: 295061
* [X86][SSE] Move unary inputs handling inside matchVectorShuffleWithUNPCK.
  Simon Pilgrim, 2017-02-14 (1 file, -2/+3)
  llvm-svn: 295053
* [X86][SSE] Tidyup matchVectorShuffleWithUNPCK helper function call.
  Simon Pilgrim, 2017-02-14 (1 file, -7/+3)
  Don't bother setting the V1/V2 operands again for unary shuffles. Don't bother legalizing the value type unless the match succeeds.
  llvm-svn: 295051
* Fix indentation. NFCI.
  Simon Pilgrim, 2017-02-13 (1 file, -1/+1)
  llvm-svn: 294959
* [X86][SSE] Create matchVectorShuffleWithUNPCK helper function.
  Simon Pilgrim, 2017-02-13 (1 file, -46/+42)
  Currently only used by target shuffle combining - will use it for lowering as well in a future patch.
  llvm-svn: 294943
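  For readers unfamiliar with the UNPCK patterns this helper matches, here is a small illustrative C++ intrinsics sketch (not part of the commit; function names are hypothetical, compile with -msse2). The comments give the element orders the matcher looks for.

    #include <emmintrin.h>

    // UNPCKL on v4i32 interleaves the low halves: result = {a0, b0, a1, b1},
    // i.e. shuffle mask <0, 4, 1, 5>.
    __m128i unpack_lo_i32(__m128i a, __m128i b) {
      return _mm_unpacklo_epi32(a, b);
    }

    // UNPCKH on v4i32 interleaves the high halves: result = {a2, b2, a3, b3},
    // i.e. shuffle mask <2, 6, 3, 7>.
    __m128i unpack_hi_i32(__m128i a, __m128i b) {
      return _mm_unpackhi_epi32(a, b);
    }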
* [X86] Genericize the handling of INSERT_SUBVECTOR from an EXTRACT_SUBVECTOR to support 512-bit vectors with 128-bit or 256-bit subvectors.
  Craig Topper, 2017-02-13 (1 file, -21/+18)
  We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operation for 512-bit vectors.
  llvm-svn: 294931
* [X86] Don't let LowerEXTRACT_SUBVECTOR call getNode for EXTRACT_SUBVECTOR.
  Craig Topper, 2017-02-12 (1 file, -5/+7)
  Calling getNode results in the simplifications inside getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG-combine-like operation occur during this legalize step, but we don't handle everything quite the same way; I think we don't recursively add the removed nodes to the DAG combiner worklist.
  llvm-svn: 294929
* [X86] Fix typo in function name. NFCI.
  Simon Pilgrim, 2017-02-12 (1 file, -2/+2)
  convertBitVectorToUnsiged -> convertBitVectorToUnsigned
  llvm-svn: 294914
* [X86][SSE] Update argument names to match function name. NFCI.
  Simon Pilgrim, 2017-02-12 (1 file, -12/+13)
  The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently.
  llvm-svn: 294900
* [X86][AVX2] Add support for combining target shuffles to VPMOVZX
  Simon Pilgrim, 2017-02-12 (1 file, -6/+11)
  Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch.
  llvm-svn: 294896
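  As a hedged illustration only (not from the patch; the helper name is hypothetical, compile with -msse2): the 128-bit form of the pattern this kind of combine recognises is a shuffle that interleaves a vector with zeros, which is equivalent to a zero extension; this commit extends the matching to 256-bit shuffles.

    #include <emmintrin.h>

    // Interleaving the low 8 bytes of v with zeros produces the same result as
    // zero-extending those bytes to 16-bit lanes, so the shuffle can be
    // combined into PMOVZXBW (or VPMOVZXBW) when the target supports it.
    __m128i zext_lo_bytes(__m128i v) {
      return _mm_unpacklo_epi8(v, _mm_setzero_si128());
    }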
* [X86] Move code for using blendi for insert_subvector out to an isel pattern.
  Craig Topper, 2017-02-11 (1 file, -27/+0)
  This gives the DAG combiner more opportunity to optimize without needing to dig through the blend.
  llvm-svn: 294876
* [X86][SSE] Use VSEXT/VZEXT constant folding for SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG
  Simon Pilgrim, 2017-02-11 (1 file, -1/+6)
  Preparatory step for PR31712.
  llvm-svn: 294874
* [X86][SSE] Improve VSEXT/VZEXT constant folding.
  Simon Pilgrim, 2017-02-11 (1 file, -11/+18)
  Generalize VSEXT/VZEXT constant folding to work with any target constant bits source, not just BUILD_VECTOR.
  llvm-svn: 294873
* [X86][SSE] Add early-out when trying to match blend shuffle. NFCI.
  Simon Pilgrim, 2017-02-11 (1 file, -3/+4)
  llvm-svn: 294864
* Fix indentation in X86ISelLowering. NFC
  Amaury Sechet, 2017-02-11 (1 file, -8/+8)
  llvm-svn: 294859
* [X86][SSE] Convert getTargetShuffleMaskIndices to use getTargetConstantBitsFromNode.
  Simon Pilgrim, 2017-02-11 (1 file, -75/+25)
  Removes duplicate constant extraction code in getTargetShuffleMaskIndices. getTargetConstantBitsFromNode adds support for VZEXT_MOVL(SCALAR_TO_VECTOR) and fails if the caller doesn't support undef bits.
  llvm-svn: 294856
* [X86] Merge repeated getScalarValueSizeInBits calls. NFCI.
  Simon Pilgrim, 2017-02-11 (1 file, -7/+7)
  llvm-svn: 294852
* [X86] Bitcast subvector before broadcasting it.
  Ahmed Bougacha, 2017-02-10 (1 file, -1/+10)
  Since r274013, we've been looking through bitcasts on broadcast inputs. In the scalar-folding case (from a load, build_vector, or sc2vec), the input type didn't matter, as we'd simply bitcast the resulting scalar back. However, when broadcasting a 128-bit-lane-aligned element, we create an EXTRACT_SUBVECTOR. Use proper types, by creating an extract_subvector of the original input type.
  llvm-svn: 294774
* [X86][SSE] Use SDValue::getConstantOperandVal helper. NFCI.
  Simon Pilgrim, 2017-02-10 (1 file, -11/+6)
  Also reordered an if statement to test low cost comparisons first.
  llvm-svn: 294748
* [X86][SSE] Add support for extracting target constants from BUILD_VECTOR
  Simon Pilgrim, 2017-02-10 (1 file, -0/+17)
  In some cases we call getTargetConstantBitsFromNode for nodes that haven't been lowered from BUILD_VECTOR yet.
  Note: We're getting very close to being able to move most of the constant extraction code from getTargetShuffleMaskIndices into getTargetConstantBitsFromNode.
  llvm-svn: 294746
* [X86][SSE] Add missing comment describing combining to SHUFPS. NFCI
  Simon Pilgrim, 2017-02-10 (1 file, -0/+2)
  llvm-svn: 294745
* [X86] Remove duplicate call to getValueType. NFCI.
  Simon Pilgrim, 2017-02-09 (1 file, -4/+3)
  llvm-svn: 294640
* Convert to for-range loop. NFCI.
  Simon Pilgrim, 2017-02-09 (1 file, -3/+3)
  llvm-svn: 294610
* [X86][MMX] Remove the (long time) unused MMX_PINSRW ISD opcode.
  Simon Pilgrim, 2017-02-09 (1 file, -1/+0)
  llvm-svn: 294596
* [X86][btver2] PR31902: Fix a crash in combineOrCmpEqZeroToCtlzSrl under fast math.
  Pierre Gousseau, 2017-02-09 (1 file, -1/+1)
  In combineOrCmpEqZeroToCtlzSrl, replace "getConstantOperand == 0" by "isNullConstant" to account for floating point constants.
  Differential Revision: https://reviews.llvm.org/D29756
  llvm-svn: 294588
* [X86][SSE] Attempt to break register dependencies during lowerBuildVector
  Simon Pilgrim, 2017-02-09 (1 file, -11/+40)
  LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into an UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register. This patch attempts to break the register dependency either by always zeroing the vector beforehand or (if we're inserting to the 0'th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))), which lowers to (V)MOVD and performs a similar function. Additionally, (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD.
  On pre-SSE41 LowerBuildVectorv16i8 we go a little further and use VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros, avoiding the vector zeroing at the cost of a scalar zero extension; this can probably be brought over to some of the other cases in a future patch (load folding etc.).
  Differential Revision: https://reviews.llvm.org/D29720
  llvm-svn: 294581
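  A hedged, intrinsics-level sketch of the dependency-breaking idea (illustration only, not code from the patch; function names are hypothetical, compile with -msse2): starting the build with a MOVD-style scalar move writes the whole register, so there is no false dependency on its previous contents, whereas inserting straight into an uninitialised vector leaves one.

    #include <emmintrin.h>

    __m128i build_v8i16_from_scalars(int e0, int e1) {
      // _mm_cvtsi32_si128 lowers to (V)MOVD: it writes element 0 and zeroes the
      // upper lanes, so the register gets a fresh value with no dependency on
      // whatever it previously held.
      __m128i v = _mm_cvtsi32_si128(e0);
      // Subsequent PINSRW inserts only merge into an already-defined register.
      v = _mm_insert_epi16(v, e1, 1);
      return v;
    }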
* [X86] Clzero intrinsic and its addition under znver1
  Craig Topper, 2017-02-09 (1 file, -0/+25)
  This patch does the following:
  1. Adds an intrinsic int_x86_clzero which works with __builtin_ia32_clzero.
  2. Identifies the clzero feature using cpuid info (function 8000_0008, checks if EBX[0]=1).
  3. Adds the clzero feature under the znver1 architecture.
  4. The custom inserter is added in lowering.
  5. A testcase is added to check the intrinsic.
  6. The clzero instruction is added to the assembler test.
  Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me.
  Differential revision: https://reviews.llvm.org/D29385
  llvm-svn: 294558
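  For context (not part of the commit): on the C/C++ side the builtin named above clears the cache line containing the given address. A minimal hedged usage sketch, assuming a compiler with this support and -march=znver1 or -mclzero:

    // Zero the cache line containing 'p' using the CLZERO instruction.
    // __builtin_ia32_clzero is the builtin that the new int_x86_clzero
    // intrinsic is wired up to; it takes the address whose line is cleared.
    void zero_cache_line(void *p) {
      __builtin_ia32_clzero(p);
    }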
* [X86][SSE] Tidyup LowerBuildVectorv16i8 and LowerBuildVectorv8i16. NFCI.
  Simon Pilgrim, 2017-02-08 (1 file, -20/+18)
  Run clang-format and standardized variable names between functions.
  llvm-svn: 294456
* [x86] improve comments for SHRUNKBLEND node creation; NFC
  Sanjay Patel, 2017-02-07 (1 file, -25/+24)
  llvm-svn: 294344
* [x86] use range-for loops; NFCI
  Sanjay Patel, 2017-02-07 (1 file, -8/+6)
  llvm-svn: 294337
* [x86] use getSignBit() for clarity; NFCI
  Sanjay Patel, 2017-02-07 (1 file, -4/+3)
  llvm-svn: 294333
* [X86][SSE] Ensure that vector shift-by-immediate inputs are correctly bitcast to the result type
  Simon Pilgrim, 2017-02-07 (1 file, -6/+10)
  vXi8/vXi64 vector shifts are often shifted as vYi16/vYi32 types but we weren't always remembering to bitcast the input. Tested with a new assert as we don't currently manipulate these shifts enough for test cases to catch them.
  llvm-svn: 294308
* [X86][SSE] Combine shuffle nodes with multiple uses if all the users are being combined.
  Simon Pilgrim, 2017-02-06 (1 file, -9/+19)
  Currently we only combine shuffle nodes if they have a single user, to prevent us from causing code bloat by splitting the shuffles into several different combines. This doesn't take into account that in some cases we will already have combined all the users while recursively walking up the shuffle tree. This patch keeps a list of all the shuffle nodes that have been combined so far and permits combining of further shuffle nodes if all their users are in that list.
  Differential Revision: https://reviews.llvm.org/D29399
  llvm-svn: 294183
* [X86][SSE] Replace insert_vector_elt(vec, -1, idx) with shuffle
  Simon Pilgrim, 2017-02-05 (1 file, -8/+12)
  Similar to what we already do for zero element insertion, we can quickly rematerialize 'allbits' vectors, so this avoids an unnecessary GPR value and its insertion into a vector.
  llvm-svn: 294162
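  A hedged illustration of the source-level pattern this targets (not from the commit; the helper name is hypothetical, compile with -msse4.1): writing a constant all-ones element into one lane, which can now be lowered as a blend with a rematerialised all-ones vector instead of moving -1 through a GPR.

    #include <smmintrin.h>

    // Set lane 2 of a v4i32 vector to all-ones (-1).
    __m128i set_lane2_allones(__m128i v) {
      return _mm_insert_epi32(v, -1, 2);
    }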
* [X86] In LowerTRUNCATE, create an ISD::VECTOR_SHUFFLE instead of explicitly creating a PSHUFB.
  Craig Topper, 2017-02-05 (1 file, -25/+13)
  This will be lowered by regular shuffle lowering to a PSHUFB later. Similar was already done for several other shuffles in this function. The test changes are because the old code used explicit zeroing for elements that could have been undef. While I was here I also changed other shuffle vectors in the same function to use the same input twice instead of creating UNDEF nodes; getVectorShuffle can create the UNDEF for us.
  llvm-svn: 294130
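  For illustration only (not code from the commit; the helper name is hypothetical, compile with -mssse3): vector truncation can be expressed as a byte shuffle, which is the PSHUFB form that shuffle lowering ultimately produces here.

    #include <tmmintrin.h>

    // Truncate each 32-bit lane to 16 bits and pack the results into the low
    // 8 bytes; a -1 mask byte yields zero in that destination byte.
    __m128i trunc_v4i32_to_v4i16(__m128i v) {
      const __m128i mask = _mm_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13,
                                         -1, -1, -1, -1, -1, -1, -1, -1);
      return _mm_shuffle_epi8(v, mask);
    }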
* [X86] Add support for folding (insert_subvector vec1, (extract_subvector vec2, idx1), idx1) -> (blendi vec2, vec1).
  Craig Topper, 2017-02-04 (1 file, -2/+25)
  llvm-svn: 294112
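  A hedged sketch of the shape being folded, at the intrinsics level (illustration only; the function name is hypothetical, compile with -mavx): when a subvector is extracted and re-inserted at the same index, the insert/extract pair reduces to a blend of the two source vectors.

    #include <immintrin.h>

    // Replace the high 128-bit half of 'a' with the high 128-bit half of 'b'.
    // Because the extract and insert use the same index, this is just a blend.
    __m256 copy_high_half(__m256 a, __m256 b) {
      __m128 hi = _mm256_extractf128_ps(b, 1);
      return _mm256_insertf128_ps(a, hi, 1);
    }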
* [X86] Simplify the code that turns INSERT_SUBVECTOR into BLENDI. NFCI
  Craig Topper, 2017-02-04 (1 file, -19/+8)
  llvm-svn: 294111
* [X86][SSE] Add support for combining scalar_to_vector(extract_vector_elt) into a target shuffle.
  Simon Pilgrim, 2017-02-03 (1 file, -0/+14)
  Correctly flagging upper elements as undef.
  llvm-svn: 294020
* [X86] Mark 256-bit and 512-bit INSERT_SUBVECTOR operations as legal and remove the custom lowering.
  Craig Topper, 2017-02-03 (1 file, -27/+6)
  llvm-svn: 293969
* [X86] Avoid sorted order check in release builds
  Reid Kleckner, 2017-02-02 (1 file, -4/+6)
  Effectively reverts r290248 and fixes the unused function warning with ifndef NDEBUG.
  llvm-svn: 293945
* [X86] Move turning 256-bit INSERT_SUBVECTORS into BLENDI from legalize to DAG combine.
  Craig Topper, 2017-02-02 (1 file, -44/+39)
  On one test this seems to have given more chance for DAG combine to do other INSERT_SUBVECTOR/EXTRACT_SUBVECTOR combines before the BLENDI was created. Looks like we can still improve more by teaching DAG combine to optimize INSERT_SUBVECTOR/EXTRACT_SUBVECTOR with BLENDI.
  llvm-svn: 293944
* [X86][SSE] Use MOVMSK for all_of/any_of reduction patterns
  Simon Pilgrim, 2017-02-02 (1 file, -0/+83)
  This is a first attempt at using the MOVMSK instructions to replace all_of/any_of reduction patterns (i.e. an and/or + shuffle chain). So far this only matches patterns where we are reducing an all/none bits source vector (i.e. a comparison result), but we should be able to expand on this in conjunction with improvements to 'bool vector' handling both in the x86 backend as well as the vectorizers etc.
  Differential Revision: https://reviews.llvm.org/D28810
  llvm-svn: 293880
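  For illustration (not part of the patch; function names are hypothetical, compile with -msse2): hand-written MOVMSK versions of the all_of/any_of reductions this combine is meant to produce from and/or + shuffle chains over a comparison result.

    #include <emmintrin.h>
    #include <cstdio>

    // all_of: every byte lane of the comparison is 0xFF, so all 16 sign bits
    // collected by PMOVMSKB are set.
    bool all_equal(__m128i a, __m128i b) {
      return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF;
    }

    // any_of: at least one byte lane of the comparison is 0xFF.
    bool any_equal(__m128i a, __m128i b) {
      return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) != 0;
    }

    int main() {
      __m128i x = _mm_set1_epi8(7), y = _mm_set1_epi8(7);
      std::printf("all=%d any=%d\n", all_equal(x, y), any_equal(x, y));
      return 0;
    }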
* [X86] Remove some unused DAGCombinerInfo parameters. NFC
  Craig Topper, 2017-02-02 (1 file, -7/+4)
  llvm-svn: 293873
* [X86] Move some INSERT_SUBVECTOR optimizations from legalize to DAG combine.
  Craig Topper, 2017-02-02 (1 file, -53/+74)
  This moves creation of SUBV_BROADCAST and merging of adjacent loads that are being inserted together. This is a step towards removing legalization of INSERT_SUBVECTOR except for vXi1 cases.
  llvm-svn: 293872
* [X86][SSE] Remove unused argument. NFCI.
  Simon Pilgrim, 2017-02-01 (1 file, -9/+6)
  llvm-svn: 293777
* [X86][SSE] Merge SSE2 PINSRW lowering with SSE41 PINSRB/PINSRW lowering. NFCI.
  Simon Pilgrim, 2017-02-01 (1 file, -32/+21)
  These are identical apart from the extra SSE41 guard for PINSRB.
  llvm-svn: 293766
* [X86][SSE] Add support for combining PINSRB into a target shuffle.
  Simon Pilgrim, 2017-01-31 (1 file, -4/+7)
  llvm-svn: 293637
* [X86] Silence unused variable warning in Release builds.
  Benjamin Kramer, 2017-01-31 (1 file, -4/+5)
  llvm-svn: 293631
* [X86][SSE] Detect unary PBLEND shuffles.
  Simon Pilgrim, 2017-01-31 (1 file, -0/+1)
  These can appear during shuffle combining.
  llvm-svn: 293628