| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Currently only used by target shuffle combining - will use it for lowering as well in a future patch.
llvm-svn: 294943
| |
common check prefix ALL. NFC.
llvm-svn: 294938
| |
consistency between VEX/EVEX versions of the same instruction.
Differential Revision: https://reviews.llvm.org/D29873
llvm-svn: 294937
| |
Before this patch, compile time was about 21s; after this patch it is
under 2s (see the timings below).
Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
DAGCombiner - trunk
time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
real 0m1.685s
DAGCombiner + Speed patch
time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
real 0m1.655s
MachineCombiner w/o Speed patch
time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
real 0m21.614s
MachineCombiner + Speed patch
time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
real 0m1.593s
The test spill_fdiv.ll is attached to D29627
D29627 should be closed.
llvm-svn: 294936
| |
reductions.
Currently, LLVM supports vectorization of horizontal reduction
instructions with the initial value set to 0. This patch adds support for
vectorizing reductions with non-zero initial values. It also supports
vectorization of instructions with some extra arguments, like:
```
float f(float x[], int a, int b) {
  float p = a % b;
  p += x[0] + 3;
  for (int i = 1; i < 32; i++)
    p += x[i];
  return p;
}
```
This patch allows vectorization of this kind of horizontal reduction.
Differential Revision: https://reviews.llvm.org/D29727
llvm-svn: 294934
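To make the shape of such a reduction concrete, here is a minimal hand-written sketch in C++. It is not the SLP vectorizer's output: `reduce_original`, `reduce_split`, and the 4-lane width are hypothetical names and choices made for illustration. It separates the 32-element sum (the part that can be reduced in lanes) from the extra arguments (`a % b` and the constant 3), which are folded in afterwards. Reordering floating-point additions like this is only valid under fast-math style assumptions; the two functions agree exactly only when every intermediate sum is exactly representable.

```cpp
// Original form from the commit message: the accumulator starts from a
// non-zero value (a % b) and picks up an extra operand (+ 3).
float reduce_original(const float x[], int a, int b) {
  float p = a % b;
  p += x[0] + 3;
  for (int i = 1; i < 32; i++)
    p += x[i];
  return p;
}

// Hypothetical hand-split equivalent: sum the 32 elements in 4 lanes,
// combine the lanes, then add the extra arguments at the end.
float reduce_split(const float x[], int a, int b) {
  float lanes[4] = {0.0f, 0.0f, 0.0f, 0.0f};
  for (int i = 0; i < 32; i += 4)
    for (int l = 0; l < 4; l++)
      lanes[l] += x[i + l];
  float sum = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
  // Extra arguments that are not part of the lane-wise reduction.
  return sum + static_cast<float>(a % b) + 3.0f;
}
```

With small integer-valued inputs both functions return the same value, because no rounding occurs in either summation order.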
| |
into the same location of an undef vector can just use the original input to the extract.
llvm-svn: 294932
| |
to support 512-bit vectors with 128-bit or 256-bit subvectors.
We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as vshuf operations for 512-bit vectors.
llvm-svn: 294931
| |
EXTRACT_SUBVECTOR from an INSERT_SUBVECTOR.
This gives more parallelism opportunities for AVX-512 when dealing with 128-bit extracts from 512-bit vectors.
llvm-svn: 294930
| |
This results in the simplifications inside of getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG-combine-like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist.
llvm-svn: 294929
| |
why they fail.
llvm-svn: 294928
| |
eliminates no-use readonly/readnone calls, even if they are not marked nounwind. NewGVN only eliminates them if they are marked nounwind, and thus, trivially dead.
llvm-svn: 294927
| |
deadness
llvm-svn: 294926
| |
llvm-svn: 294925
| |
The bug was introduced with:
https://reviews.llvm.org/rL294863
...and manifests as a selection failure in x86, but that's actually
another bug. This fix prevents wrong codegen with -0.0, but in the
more common case when we have NSZ and NNAN (-ffast-math), we should
still be able to fold this setcc/compare.
llvm-svn: 294924
| |
This reverts commit r294919
llvm-svn: 294923
| |
llvm-svn: 294922
| |
Summary:
This adds support for placing predicateinfo such that it affects critical edges.
This fixes the issues mentioned by Nuno on the mailing list.
Depends on D29519
Reviewers: davide, nlopes
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29606
llvm-svn: 294921
| |
llvm-svn: 294920
| |
llvm-svn: 294919
| |
convertBitVectorToUnsiged - convertBitVectorToUnsigned
llvm-svn: 294914
| |
core files on FreeBSD have additional notes to capture state. Process
those notes when dumping the notes.
llvm-svn: 294909
| |
the VEX equivalents as a guide.
llvm-svn: 294908
| |
instruction from AVX/SSE.
I can't prove that we can select this instruction or the AVX/SSE version, but I'm adding it for consistency for now so I can continue matching the load folding tables.
llvm-svn: 294907
| |
they are stores. AVX-512 version was already named with 'mr'.
llvm-svn: 294906
| |
llvm-svn: 294905
| |
The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently.
llvm-svn: 294900
| |
I found one special case of this transform for 'slt 0', so I removed that and added the general transform.
Alive code to check correctness:
Name: slt_no_overflow
Pre: WillNotOverflowSignedSub(C1, C2)
%a = add nsw i8 %x, C2
%b = icmp slt %a, C1
=>
%b = icmp slt %x, C1 - C2
Name: sgt_no_overflow
Pre: WillNotOverflowSignedSub(C1, C2)
%a = add nsw i8 %x, C2
%b = icmp sgt %a, C1
=>
%b = icmp sgt %x, C1 - C2
http://rise4fun.com/Alive/MH
Differential Revision: https://reviews.llvm.org/D29774
llvm-svn: 294898
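As a cross-check of the Alive proofs above, the fold can also be brute-forced over all 8-bit values. The following is a sketch, not part of the patch; `fitsI8` and `countSltSgtMismatches` are hypothetical helper names. It models the precondition `WillNotOverflowSignedSub(C1, C2)` by skipping constant pairs whose difference does not fit in `i8`, and models `add nsw` by skipping inputs where the addition would overflow (those produce poison, so the fold is free to return anything).

```cpp
#include <cstdint>

// True when v is representable as a signed 8-bit integer.
static bool fitsI8(int v) { return v >= INT8_MIN && v <= INT8_MAX; }

// Counts (x, C1, C2) triples for which the folded compare disagrees with
// the original add + compare. All arithmetic is done in int, so nothing
// wraps; the i8 range restrictions are applied explicitly.
long countSltSgtMismatches() {
  long mismatches = 0;
  for (int c1 = INT8_MIN; c1 <= INT8_MAX; ++c1) {
    for (int c2 = INT8_MIN; c2 <= INT8_MAX; ++c2) {
      if (!fitsI8(c1 - c2))
        continue; // Pre: WillNotOverflowSignedSub(C1, C2)
      for (int x = INT8_MIN; x <= INT8_MAX; ++x) {
        if (!fitsI8(x + c2))
          continue; // add nsw would overflow: result is poison
        int a = x + c2;
        if ((a < c1) != (x < c1 - c2)) ++mismatches; // slt_no_overflow
        if ((a > c1) != (x > c1 - c2)) ++mismatches; // sgt_no_overflow
      }
    }
  }
  return mismatches;
}
```

Under these assumptions the count is zero, which matches what the two Alive proofs state for `i8`.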
| |
Enhancing value tracking's analysis of null-ness was suggested in D27855, so here's a first attempt at that.
This is part of solving:
https://llvm.org/bugs/show_bug.cgi?id=28430
Differential Revision: https://reviews.llvm.org/D28204
llvm-svn: 294897
| |
Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch.
llvm-svn: 294896
| |
function returned true or undef.
llvm-svn: 294895
| |
proven larger than the loop-count
This fixes PR31098: Try to statically resolve data dependences whose
compile-time-unknown distance can be proven larger than the loop-count,
instead of resorting to runtime dependence checking (which is not always
possible).
For vectorization it is sufficient to prove that the dependence distance
is >= VF, but in some cases we can prune unknown dependence distances early,
and even before selecting the VF, and without a runtime test, by comparing
the distance against the loop iteration count. Since the vectorized code
will be executed only if LoopCount >= VF, proving distance >= LoopCount
also guarantees that distance >= VF. This check is also equivalent to the
Strong SIV Test.
Reviewers: mkuper, anemet, sanjoy
Differential Revision: https://reviews.llvm.org/D28044
llvm-svn: 294892
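The reasoning above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the LoopAccessAnalysis implementation: `runScalar` and `runChunked` are hypothetical names, the loop body `a[i + dist] = a[i] + 1` is an invented example, and "vectorized" execution is modelled as loading a whole chunk of `vf` elements before storing any of them. The caller is assumed to pass a vector with at least `n + dist` elements.

```cpp
#include <cstddef>
#include <vector>

// Scalar reference: a[i + dist] = a[i] + 1 for i in [0, n).
std::vector<int> runScalar(std::vector<int> a, std::size_t n,
                           std::size_t dist) {
  for (std::size_t i = 0; i < n; ++i)
    a[i + dist] = a[i] + 1;
  return a;
}

// Models a loop vectorized with width vf: each chunk performs all of its
// loads first and only then all of its stores.
std::vector<int> runChunked(std::vector<int> a, std::size_t n,
                            std::size_t dist, std::size_t vf) {
  std::vector<int> tmp(vf);
  std::size_t i = 0;
  for (; i + vf <= n; i += vf) {
    for (std::size_t l = 0; l < vf; ++l)
      tmp[l] = a[i + l];
    for (std::size_t l = 0; l < vf; ++l)
      a[i + l + dist] = tmp[l] + 1;
  }
  for (; i < n; ++i) // scalar remainder iterations
    a[i + dist] = a[i] + 1;
  return a;
}
```

When `dist >= n`, the loop reads only `a[0..n-1]` and writes only `a[dist..dist+n-1]`, so the two ranges never overlap and both functions produce the same array for any `vf`. With a short distance such as `dist = 1` and `vf = 4`, the chunked version loads stale values and the results diverge.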
| |
The reference is here:
https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
llvm-svn: 294890
| |
llvm-svn: 294889
| |
default pipeline.
A clang with this patch built with ASan and asserts can build all of the
test-suite as well, so it seems to not uncover any latent problems.
Differential Revision: https://reviews.llvm.org/D29853
llvm-svn: 294888
| |
All the invalidation issues and bugs in this seem to be fixed; it has
survived a full build of the test suite plus SPEC with asserts and ASan
enabled on the Clang binary used.
Differential Revision: https://reviews.llvm.org/D29815
llvm-svn: 294887
| |
llvm-svn: 294885
| |
llvm-svn: 294884
| |
llvm-svn: 294883
| |
llvm-svn: 294882
| |
llvm::createPromoteMemoryToRegisterPass() added in r294870.
llvm-svn: 294881
| |
llvm-svn: 294878
| |
pattern. This gives the DAG combiner more opportunity to optimize without needing to dig through the blend.
llvm-svn: 294876
| |
generic to support larger concats.
llvm-svn: 294875
| |
SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG
Preparatory step for PR31712
llvm-svn: 294874
| |
Generalize VSEXT/VZEXT constant folding to work with any target constant bits source, not just BUILD_VECTOR.
llvm-svn: 294873
| |
Many quoted code blocks were not in sync with the actual toy.cpp
files. Improve tutorial text slightly in several places.
Added some step descriptions crucial to avoid crashes (like
InitializeNativeTarget* calls).
Solve/workaround problems with Windows (JIT'ed method not found, using
custom and standard library functions from host process).
Patch by: Moritz Kroll <moritz.kroll@gmx.de>
Differential Revision: https://reviews.llvm.org/D29864
llvm-svn: 294870
| |
llvm-svn: 294867
| |
llvm-svn: 294866
| |
llvm-svn: 294865
| |
llvm-svn: 294864
|