| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 295065
|
|
|
|
|
|
| |
Add support for specifying an UNPCK input as UNDEF
llvm-svn: 295061
|
|
|
|
| |
llvm-svn: 295053
|
|
|
|
|
|
|
|
| |
Don't bother setting the V1/V2 operands again for unary shuffles.
Don't bother legalizing the value type unless the match succeeds.
llvm-svn: 295051
|
|
|
|
| |
llvm-svn: 294959
|
|
|
|
|
|
| |
Currently only used by target shuffle combining - will use it for lowering as well in a future patch.
llvm-svn: 294943
|
|
|
|
|
|
|
|
| |
to support 512-bit vectors with 128-bit or 256-bit subvectors.
We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operation for 512-bit vectors.
llvm-svn: 294931
|
|
|
|
|
|
| |
This results in the simplifications inside of getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG-combine-like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist.
llvm-svn: 294929
|
|
|
|
|
|
| |
convertBitVectorToUnsiged -> convertBitVectorToUnsigned
llvm-svn: 294914
|
|
|
|
|
|
| |
The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently.
llvm-svn: 294900
|
|
|
|
|
|
| |
Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch.
llvm-svn: 294896
|
|
|
|
|
|
| |
pattern. This gives the DAG combiner more opportunity to optimize without needing to dig through the blend.
llvm-svn: 294876
|
|
|
|
|
|
|
|
| |
SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG
Preparatory step for PR31712
llvm-svn: 294874
|
|
|
|
|
|
| |
Generalize VSEXT/VZEXT constant folding to work with any target constant bits source, not just BUILD_VECTOR.
llvm-svn: 294873
|
|
|
|
| |
llvm-svn: 294864
|
|
|
|
| |
llvm-svn: 294859
|
|
|
|
|
|
|
|
|
|
| |
getTargetConstantBitsFromNode.
Removes duplicate constant extraction code in getTargetShuffleMaskIndices.
getTargetConstantBitsFromNode - adds support for VZEXT_MOVL(SCALAR_TO_VECTOR) and fails if the caller doesn't support undef bits.
llvm-svn: 294856
|
|
|
|
| |
llvm-svn: 294852
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since r274013, we've been looking through bitcasts on broadcast inputs.
In the scalar-folding case (from a load, build_vector, or sc2vec),
the input type didn't matter, as we'd simply bitcast the resulting
scalar back.
However, when broadcasting a 128-bit-lane-aligned element, we create an
EXTRACT_SUBVECTOR. Use proper types, by creating an extract_subvector
of the original input type.
llvm-svn: 294774
|
|
|
|
|
|
| |
Also reordered an if statement to test low-cost comparisons first.
llvm-svn: 294748
|
|
|
|
|
|
|
| |
In some cases we call getTargetConstantBitsFromNode for nodes that haven't been lowered from BUILD_VECTOR yet.
Note: We're getting very close to being able to move most of the constant extraction code from getTargetShuffleMaskIndices into getTargetConstantBitsFromNode.
llvm-svn: 294746
|
|
|
|
| |
llvm-svn: 294745
|
|
|
|
| |
llvm-svn: 294640
|
|
|
|
| |
llvm-svn: 294610
|
|
|
|
| |
llvm-svn: 294596
|
|
|
|
|
|
|
|
|
|
| |
math.
In combineOrCmpEqZeroToCtlzSrl, replace "getConstantOperand == 0" by "isNullConstant" to account for floating point constants.
Differential Revision: https://reviews.llvm.org/D29756
llvm-svn: 294588
|
|
|
|
|
|
|
|
|
|
|
|
| |
LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into an UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register.
This patch attempts to break the register dependency by either always zeroing the vector beforehand or (if we're inserting into the 0'th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))), which lowers to (V)MOVD and performs a similar function. Additionally, (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD.
On pre-SSE41 LowerBuildVectorv16i8 we go a little further and use VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros, avoiding the vector zeroing at the cost of a scalar zero extension. This can probably be brought over to some of the other cases (load folding etc.) in a future patch.
Differential Revision: https://reviews.llvm.org/D29720
llvm-svn: 294581
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch does the following.
1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero
2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1)
3. Adds the clzero feature under znver1 architecture.
4. The custom inserter is added in Lowering.
5. A testcase is added to check the intrinsic.
6. The clzero instruction is added to assembler test.
Patch by Ganesh Gopalasubramanian with a couple of formatting tweaks, a disassembler test, and using update_llc_test.py from me.
Differential revision: https://reviews.llvm.org/D29385
llvm-svn: 294558
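The feature check described in step 2 can be sketched as a plain bit test. This is a minimal illustration, not the code from the patch: the helper name hasClzeroBit is hypothetical, and it only assumes what the message states, namely that bit 0 of EBX from CPUID function 8000_0008 signals clzero.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): takes the EBX value returned by
// CPUID function 0x80000008 and reports whether bit 0 is set, which the
// commit message above describes as the clzero feature bit.
constexpr bool hasClzeroBit(std::uint32_t Ebx) {
  return (Ebx & 1u) != 0;
}

// Compile-time sanity checks of the bit test itself.
static_assert(hasClzeroBit(0x1u), "bit 0 set means clzero is reported");
static_assert(!hasClzeroBit(0x2u), "other bits do not imply clzero");
```

On an x86 host with GCC or Clang, the EBX value could be obtained with `__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>`. The sketch leaves that call out so that it stays portable.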
|
|
|
|
|
|
| |
Ran clang-format and standardized variable names between functions.
llvm-svn: 294456
|
|
|
|
| |
llvm-svn: 294344
|
|
|
|
| |
llvm-svn: 294337
|
|
|
|
| |
llvm-svn: 294333
|
|
|
|
|
|
|
|
|
|
| |
bitcast to the result type
vXi8/vXi64 vector shifts are often shifted as vYi16/vYi32 types but we weren't always remembering to bitcast the input.
Tested with a new assert as we don't currently manipulate these shifts enough for test cases to catch them.
llvm-svn: 294308
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
being combined.
Currently we only combine shuffle nodes if they have a single user to prevent us from causing code bloat by splitting the shuffles into several different combines.
We don't take into account that in some cases we will already have combined all the users while recursing up the shuffle tree.
This patch keeps a list of all the shuffle nodes that have been combined so far and permits combining of further shuffle nodes if all of their users are in that list.
Differential Revision: https://reviews.llvm.org/D29399
llvm-svn: 294183
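The user check described above can be modelled with a toy graph. This is a rough sketch under stated assumptions: ToyNode, allUsersCombined and mayCombine are hypothetical names, and the structure is a stand-in for illustration rather than LLVM's SDNode or the actual combiner code.

```cpp
#include <algorithm>
#include <vector>

// Toy stand-in for a DAG node (hypothetical; not LLVM's SDNode).
struct ToyNode {
  std::vector<const ToyNode *> Users; // nodes that consume this node's value
};

// True if every user of N already appears in the list of combined nodes.
bool allUsersCombined(const ToyNode &N,
                      const std::vector<const ToyNode *> &Combined) {
  return std::all_of(N.Users.begin(), N.Users.end(), [&](const ToyNode *U) {
    return std::find(Combined.begin(), Combined.end(), U) != Combined.end();
  });
}

// Old rule: combine only single-user nodes. Relaxed rule from the message:
// also combine when all users have themselves been combined already.
bool mayCombine(const ToyNode &N,
                const std::vector<const ToyNode *> &Combined) {
  return N.Users.size() == 1 || allUsersCombined(N, Combined);
}
```

A linear search is enough for a sketch. A real implementation would likely pick a container suited to the expected list size.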
|
|
|
|
|
|
| |
Similar to what we already do for zero elt insertion, we can quickly rematerialize 'allbits' vectors to avoid an unnecessary GPR value and insertion into a vector.
llvm-svn: 294162
|
|
|
|
|
|
|
|
|
|
|
|
| |
creating a PSHUFB. This will be lowered by regular shuffle lowering to a PSHUFB later.
The same was already done for several other shuffles in this function.
The test changes are because the old code used explicit zeroing for elements that could have been undef.
While I was here I also changed other shuffle vectors in the same function to use the same input twice instead of creating UNDEF nodes. getVectorShuffle can create the UNDEF for us.
llvm-svn: 294130
|
|
|
|
|
|
| |
vec2, idx1), idx1) -> (blendi vec2, vec1).
llvm-svn: 294112
|
|
|
|
| |
llvm-svn: 294111
|
|
|
|
|
|
|
|
| |
into a target shuffle.
Correctly flagging upper elements as undef.
llvm-svn: 294020
|
|
|
|
|
|
| |
remove the custom lowering.
llvm-svn: 293969
|
|
|
|
|
|
|
| |
Effectively reverts r290248 and fixes the unused function warning with
ifndef NDEBUG.
llvm-svn: 293945
|
|
|
|
|
|
|
|
| |
DAG combine.
On one test this seems to have given more chance for DAG combine to do other INSERT_SUBVECTOR/EXTRACT_SUBVECTOR combines before the BLENDI was created. Looks like we can still improve more by teaching DAG combine to optimize INSERT_SUBVECTOR/EXTRACT_SUBVECTOR with BLENDI.
llvm-svn: 293944
|
|
|
|
|
|
|
|
|
|
| |
This is a first attempt at using the MOVMSK instructions to replace all_of/any_of reduction patterns (i.e. an and/or + shuffle chain).
So far this only matches patterns where we are reducing an all/none bits source vector (i.e. a comparison result) but we should be able to expand on this in conjunction with improvements to 'bool vector' handling both in the x86 backend as well as the vectorizers etc.
Differential Revision: https://reviews.llvm.org/D28810
llvm-svn: 293880
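The idea behind the reduction can be shown with a scalar model. The sketch below is not the backend code and uses no intrinsics: Vec16, moveMask, allOf and anyOf are hypothetical names, and moveMask emulates a byte-wise MOVMSK by collecting each lane's sign bit. It assumes, as the message does, that the input lanes are all-ones or all-zeros comparison results.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Sixteen byte lanes, modelling a 128-bit vector of i8 (hypothetical type).
using Vec16 = std::array<std::int8_t, 16>;

// Scalar model of a byte-wise MOVMSK: bit I of the result is the sign bit
// of lane I.
inline std::uint32_t moveMask(const Vec16 &V) {
  std::uint32_t Mask = 0;
  for (std::size_t I = 0; I < V.size(); ++I)
    if (V[I] < 0)
      Mask |= 1u << I;
  return Mask;
}

// all_of: every lane of the comparison result is set.
inline bool allOf(const Vec16 &Cmp) { return moveMask(Cmp) == 0xFFFFu; }

// any_of: at least one lane of the comparison result is set.
inline bool anyOf(const Vec16 &Cmp) { return moveMask(Cmp) != 0; }
```

A single mask extraction followed by one scalar compare stands in for the and/or + shuffle chain that would otherwise reduce the vector step by step.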
|
|
|
|
| |
llvm-svn: 293873
|
|
|
|
|
|
|
|
| |
This moves creation of SUBV_BROADCAST and merging of adjacent loads that are being inserted together.
This is a step towards removing legalizing of INSERT_SUBVECTOR except for vXi1 cases.
llvm-svn: 293872
|
|
|
|
| |
llvm-svn: 293777
|
|
|
|
|
|
| |
These are identical apart from the extra SSE41 guard for PINSRB.
llvm-svn: 293766
|
|
|
|
| |
llvm-svn: 293637
|
|
|
|
| |
llvm-svn: 293631
|
|
|
|
|
|
| |
These can appear during shuffle combining.
llvm-svn: 293628
|