| Commit message | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
into the same location of an undef vector can just use the original input to the extract.
llvm-svn: 294932
|
| |
|
|
|
|
|
|
| |
to support 512-bit vectors with 128-bit or 256-bit subvectors.
We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operation for 512-bit vectors.
llvm-svn: 294931
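The rewrite described in r294931 can be illustrated with a small model. This is a minimal sketch, assuming vectors are plain Python lists and indices are counted in elements; the helper names (`extract_subvector`, `insert_subvector`, `shuffle`, `insert_of_extract_mask`) are hypothetical and are not LLVM APIs.

```python
def extract_subvector(vec, idx, width):
    """Return `width` consecutive elements of `vec` starting at `idx`."""
    return vec[idx:idx + width]


def insert_subvector(vec, sub, idx):
    """Return a copy of `vec` with `sub` written at element offset `idx`."""
    return vec[:idx] + sub + vec[idx + len(sub):]


def shuffle(a, b, mask):
    """Two-input shuffle: mask values < len(a) pick from a, the rest from b."""
    both = a + b
    return [both[m] for m in mask]


def insert_of_extract_mask(num_elts, sub_elts, extract_idx, insert_idx):
    """Shuffle mask equivalent to inserting src[extract_idx:...] into dst."""
    mask = list(range(num_elts))  # default: keep every dst element
    for k in range(sub_elts):
        # Lanes covered by the insert read from the second shuffle input.
        mask[insert_idx + k] = num_elts + extract_idx + k
    return mask
```

In this model, when the extract and insert indices are equal every lane either keeps its own element or takes the element in the same lane of the other input, which is the shape of a blend. When the indices differ, elements cross lanes and a general shuffle is needed.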
|
| |
|
|
|
|
|
|
| |
EXTRACT_SUBVECTOR from an INSERT_SUBVECTOR.
This gives more parallelism opportunities for AVX-512 when dealing with 128-bit extracts from 512-bit vectors.
llvm-svn: 294930
|
| |
|
|
|
|
| |
This results in the simplifications inside of getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG-combine-like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist.
llvm-svn: 294929
|
| |
|
|
|
|
| |
deadness
llvm-svn: 294926
|
| |
|
|
| |
llvm-svn: 294925
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The bug was introduced with:
https://reviews.llvm.org/rL294863
...and manifests as a selection failure in x86, but that's actually
another bug. This fix prevents wrong codegen with -0.0, but in the
more common case when we have NSZ and NNAN (-ffast-math), we should
still be able to fold this setcc/compare.
llvm-svn: 294924
|
| |
|
|
|
|
| |
This reverts commit r294919
llvm-svn: 294923
|
| |
|
|
| |
llvm-svn: 294922
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds support for placing predicateinfo such that it affects critical edges.
This fixes the issues mentioned by Nuno on the mailing list.
Depends on D29519
Reviewers: davide, nlopes
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29606
llvm-svn: 294921
|
| |
|
|
| |
llvm-svn: 294920
|
| |
|
|
| |
llvm-svn: 294919
|
| |
|
|
|
|
| |
convertBitVectorToUnsiged -> convertBitVectorToUnsigned
llvm-svn: 294914
|
| |
|
|
|
|
| |
the VEX equivalents as a guide.
llvm-svn: 294908
|
| |
|
|
|
|
|
|
| |
instruction from AVX/SSE.
I can't prove that we can select this instruction or the AVX/SSE version, but I'm adding it for consistency for now so I can continue matching the load folding tables.
llvm-svn: 294907
|
| |
|
|
|
|
| |
they are stores. AVX-512 version was already named with 'mr'.
llvm-svn: 294906
|
| |
|
|
| |
llvm-svn: 294905
|
| |
|
|
|
|
| |
The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently.
llvm-svn: 294900
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I found one special case of this transform for 'slt 0', so I removed that and added the general transform.
Alive code to check correctness:
Name: slt_no_overflow
Pre: WillNotOverflowSignedSub(C1, C2)
%a = add nsw i8 %x, C2
%b = icmp slt %a, C1
=>
%b = icmp slt %x, C1 - C2
Name: sgt_no_overflow
Pre: WillNotOverflowSignedSub(C1, C2)
%a = add nsw i8 %x, C2
%b = icmp sgt %a, C1
=>
%b = icmp sgt %x, C1 - C2
http://rise4fun.com/Alive/MH
Differential Revision: https://reviews.llvm.org/D29774
llvm-svn: 294898
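The Alive proofs in r294898 can be cross-checked by brute force at small bit widths. This is a minimal sketch, assuming a two's-complement model of the fixed-width integers; the function names (`fits`, `wrap`, `mismatches`, `fold_is_sound`) are hypothetical and only mirror the precondition `WillNotOverflowSignedSub(C1, C2)` from the commit message.

```python
import operator


def fits(v, bits):
    """True if v is representable as a signed `bits`-wide integer."""
    return -(1 << (bits - 1)) <= v < (1 << (bits - 1))


def wrap(v, bits):
    """Wrap v to a signed `bits`-wide integer (two's complement)."""
    half = 1 << (bits - 1)
    return ((v + half) % (1 << bits)) - half


def mismatches(pred, c1, c2, bits=8):
    """List the x values where the original and folded compares disagree."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    folded_rhs = wrap(c1 - c2, bits)  # constant the fold would emit
    bad = []
    for x in range(lo, hi):
        if not fits(x + c2, bits):
            continue  # 'add nsw' overflowed: poison, nothing to compare
        if pred(x + c2, c1) != pred(x, folded_rhs):
            bad.append(x)
    return bad


def fold_is_sound(pred, bits):
    """Check every (C1, C2) pair that satisfies the no-overflow precondition."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    return all(
        not mismatches(pred, c1, c2, bits)
        for c1 in range(lo, hi)
        for c2 in range(lo, hi)
        if fits(c1 - c2, bits)
    )
```

Calling `fold_is_sound(operator.lt, 5)` or `fold_is_sound(operator.gt, 5)` exercises the slt and sgt forms at a width small enough to enumerate quickly. Calling `mismatches(operator.lt, 127, -1)` shows why the precondition matters: `C1 - C2` wraps at i8, and the folded compare disagrees with the original.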
|
| |
|
|
|
|
|
|
|
|
|
| |
Enhancing value tracking's analysis of null-ness was suggested in D27855, so here's a first attempt at that.
This is part of solving:
https://llvm.org/bugs/show_bug.cgi?id=28430
Differential Revision: https://reviews.llvm.org/D28204
llvm-svn: 294897
|
| |
|
|
|
|
| |
Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch.
llvm-svn: 294896
|
| |
|
|
|
|
| |
function returned true or undef.
llvm-svn: 294895
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
proven larger than the loop-count
This fixes PR31098: Try to statically resolve data dependences whose
compile-time-unknown distance can be proven larger than the loop-count,
instead of resorting to runtime dependence checks (which are not always
possible).
For vectorization it is sufficient to prove that the dependence distance
is >= VF; but in some cases we can prune unknown dependence distances early,
and even before selecting the VF, and without a runtime test, by comparing
the distance against the loop iteration count. Since the vectorized code
will be executed only if LoopCount >= VF, proving distance >= LoopCount
also guarantees that distance >= VF. This check is also equivalent to the
Strong SIV Test.
Reviewers: mkuper, anemet, sanjoy
Differential Revision: https://reviews.llvm.org/D28044
llvm-svn: 294892
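The argument in r294892 can be seen on a toy loop. This is a minimal sketch, assuming the loop body `a[i + d] = a[i] + 1` and a vectorized form that loads a whole chunk before storing it; `run_scalar` and `run_vectorized` are hypothetical names for illustration and do not model the actual vectorizer.

```python
def run_scalar(a, n, d):
    """Reference loop: n iterations with dependence distance d."""
    a = list(a)
    for i in range(n):
        a[i + d] = a[i] + 1
    return a


def run_vectorized(a, n, d, vf):
    """Same loop in chunks of vf: all loads of a chunk precede its stores."""
    a = list(a)
    i = 0
    while i + vf <= n:
        loaded = [a[i + k] for k in range(vf)]  # models one vector load
        for k in range(vf):                     # models one vector store
            a[i + d + k] = loaded[k] + 1
        i += vf
    for j in range(i, n):                       # scalar remainder
        a[j + d] = a[j] + 1
    return a
```

When `d >= n`, the elements read (`a[0..n-1]`) and the elements written (`a[d..d+n-1]`) never overlap, so both versions agree for any `vf`, without knowing `vf` in advance. With a short distance such as `d = 1` and `vf = 4`, the chunked version reads stale values and the results diverge.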
|
| |
|
|
|
|
|
| |
The reference is here:
https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
llvm-svn: 294890
|
| |
|
|
|
|
|
|
|
|
|
| |
default pipeline.
A clang with this patch built with ASan and asserts can build all of the
test-suite as well, so it seems to not uncover any latent problems.
Differential Revision: https://reviews.llvm.org/D29853
llvm-svn: 294888
|
| |
|
|
|
|
|
|
|
|
| |
All the invalidation issues and bugs in this seem to be fixed; it has
survived a full build of the test suite plus SPEC with asserts and ASan
enabled on the Clang binary used.
Differential Revision: https://reviews.llvm.org/D29815
llvm-svn: 294887
|
| |
|
|
| |
llvm-svn: 294882
|
| |
|
|
|
|
| |
pattern. This gives the DAG combiner more opportunity to optimize without needing to dig through the blend.
llvm-svn: 294876
|
| |
|
|
|
|
| |
generic to support larger concats.
llvm-svn: 294875
|
| |
|
|
|
|
|
|
| |
SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG
Preparatory step for PR31712
llvm-svn: 294874
|
| |
|
|
|
|
| |
Generalize VSEXT/VZEXT constant folding to work with any target constant bits source, not just BUILD_VECTOR.
llvm-svn: 294873
|
| |
|
|
| |
llvm-svn: 294864
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I don't know if anything other than x86 vectors is affected by this change, but this may allow
us to remove target-specific intrinsics for blendv* (vector selects). The simplification arises
from the fact that blendv* instructions only use the sign-bit when deciding which vector element
to choose for the destination vector. The mechanism to fold VSELECT into SHRUNKBLEND nodes already
exists in x86 lowering; this demanded bits change just enables the transform to fire more often.
The original motivation starts with a bug for DSE of masked stores that seems completely unrelated,
but I've explained the likely steps in this series here:
https://llvm.org/bugs/show_bug.cgi?id=11210
Differential Revision: https://reviews.llvm.org/D29687
llvm-svn: 294863
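The sign-bit observation in r294863 can be shown with a byte-level model. This is a minimal sketch, assuming a variable blend that selects the second source when bit 7 of the mask byte is set (as the byte form of blendv does); `blendv_bytes` and `keep_demanded_bits` are hypothetical helpers, not LLVM or intrinsic names.

```python
def blendv_bytes(a, b, mask):
    """Per-byte select: take b[i] when bit 7 of mask[i] is set, else a[i]."""
    return [y if m & 0x80 else x for x, y, m in zip(a, b, mask)]


def keep_demanded_bits(mask):
    """Clear every mask bit the blend never reads, leaving only the sign bit."""
    return [m & 0x80 for m in mask]
```

Because `blendv_bytes(a, b, mask)` and `blendv_bytes(a, b, keep_demanded_bits(mask))` agree for every mask value, whatever computes the low seven bits of each mask byte is not demanded by the blend and is a candidate for simplification.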
|
| |
|
|
| |
llvm-svn: 294859
|
| |
|
|
| |
llvm-svn: 294858
|
| |
|
|
| |
llvm-svn: 294857
|
| |
|
|
|
|
|
|
|
|
| |
getTargetConstantBitsFromNode.
Removes duplicate constant extraction code in getTargetShuffleMaskIndices.
getTargetConstantBitsFromNode - adds support for VZEXT_MOVL(SCALAR_TO_VECTOR) and fails if the caller doesn't support undef bits.
llvm-svn: 294856
|
| |
|
|
| |
llvm-svn: 294852
|
| |
|
|
| |
llvm-svn: 294851
|
| |
|
|
| |
llvm-svn: 294850
|
| |
|
|
| |
llvm-svn: 294849
|
| |
|
|
| |
llvm-svn: 294847
|
| |
|
|
|
|
|
|
| |
All commutations confirmed to give identical results - note PFMAX/PFMIN do not
PFSUB<->PFSUBR should be commutable as well
llvm-svn: 294846
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
it is dead or unreachable, as it should be.
This also makes the leader of INITIAL undef, enabling us to handle
irreducibility properly.
Summary:
This lets us verify, more than we do now, that we didn't screw up
value numbering.
Reviewers: davide
Subscribers: Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D29842
llvm-svn: 294844
|
| |
|
|
| |
llvm-svn: 294843
|
| |
|
|
| |
llvm-svn: 294837
|
| |
|
|
| |
llvm-svn: 294830
|
| |
|
|
| |
llvm-svn: 294829
|
| |
|
|
| |
llvm-svn: 294827
|
| |
|
|
|
|
|
|
|
|
|
|
| |
is available.
It seems the execution dependency pass prefers FP instructions, even when most of the consuming code is integer, if a vextractf128 instruction produced the register. Without AVX2 we don't have the corresponding integer instruction available.
This patch suppresses the domain on these instructions to GenericDomain if AVX2 is not supported so that they are ignored by domain fixing. If AVX2 is supported we'll report the correct domain and allow them to switch between integer and fp.
Overall I think this produces better results in the modified test cases.
llvm-svn: 294824
|