| Commit message | Author | Age | Files | Lines |
|
In preparation for demandedelts support
llvm-svn: 286463
|
llvm-svn: 286461
|
In preparation for demandedelts support
llvm-svn: 286457
|
We were failing to extract a constant splat shift value if the shifted value was being masked.
The (shl (and (setcc) N01CV) N1CV) -> (and (setcc) N01CV<<N1CV) combine was unnecessarily preventing this.
llvm-svn: 286454
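The combine named in this message can be checked with plain integer arithmetic. The sketch below assumes 32-bit lanes and assumes the setcc result is either all-zeros or all-ones per lane; the helper names are hypothetical and are not taken from the patch.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit lanes with Python integers


def shl32(value, amount):
    """Logical left shift truncated to 32 bits."""
    return (value << amount) & MASK32


def shift_of_masked_setcc(setcc, mask_const, shift_const):
    """Original form: (shl (and (setcc) C1) C2)."""
    return shl32(setcc & mask_const, shift_const)


def masked_setcc_with_shifted_const(setcc, mask_const, shift_const):
    """Combined form: (and (setcc) C1 << C2)."""
    return setcc & shl32(mask_const, shift_const)
```

Both forms agree only because the setcc lane is 0 or all-ones; for an arbitrary value the shift cannot simply be dropped from the left operand.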
|
Fails to match constant shift value due to presence of AND mask.
llvm-svn: 286452
|
llvm-svn: 286448
|
In preparation for demandedelts support
llvm-svn: 286447
|
instruction when available.
llvm-svn: 286435
|
ISD::FP_TO_SINT in the intrinsics table and delete patterns. While nearby also move CVTDQ2PS patterns into their instructions.
This allows these intrinsics to also use EVEX instructions.
llvm-svn: 286434
|
table. Removes extra patterns and allows legacy intrinsic to select EVEX encoded instructions when available.
llvm-svn: 286433
|
handle shuffles.
llvm-svn: 286425
|
represents a relocatable immediate.", with a fix for 32-bit x86.
Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions
that take a global address operand.
llvm-svn: 286420
|
This patch adds support for fptoui to 2i32 from both 2f64 and 2f32, building on Simon's change for the signed version in r284459 and using AVX-512 instructions.
If we don't have VLX support we need to use a 512-bit operation for v2f64->v2i32 and extract the result.
It also recognises that cvttpd2udq zeroes the upper 64-bits of the xmm result.
Differential Revision: https://reviews.llvm.org/D26331
llvm-svn: 286345
|
Summary: This allows the SSE intrinsic to use the EVEX instruction when available. It also fixes EVEX to not use a weird (v4i32 (fp_to_sint v2f64)) node and it merges some isel patterns. This also fixes some cases that weren't combining vzmovl with cvttpd2dq to remove extra moves.
Reviewers: delena, zvi, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26330
llvm-svn: 286344
|
of a 256 or 512-bit subvector extract.
llvm-svn: 286343
|
256-bits of a 512-bit vector to use a 256-bit aligned store.
Previously we were only checking for 16 byte alignment instead of 32 byte alignment. Fixes PR30947.
llvm-svn: 286342
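The alignment condition described above can be illustrated with a small predicate. This is only a sketch under the assumption that an aligned store of N bytes needs the address to be at least N-byte aligned; the function name is hypothetical and is not the code from the patch.

```python
def can_use_aligned_store(known_alignment_bytes, store_size_bits):
    """Return True if an aligned store of the given width is safe.

    Assumption for illustration: an aligned vector store requires the
    known alignment to be at least the store size in bytes.
    """
    store_size_bytes = store_size_bits // 8
    return known_alignment_bytes >= store_size_bytes
```

With this predicate, a 256-bit store from a 16-byte-aligned address is rejected, which is the case the message says was previously accepted.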
|
aligned store instructions when the original store was only 16 byte aligned if the store is from the lower bits of a subvector extract.
llvm-svn: 286341
|
set is also enabled.
Summary:
This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912.
Reviewers: delena, zvi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26322
llvm-svn: 286339
|
The smallest tests that expose this are codegen tests (because SelectionDAGBuilder::visitSelect() uses matchSelectPattern
to create UMAX/UMIN nodes), but it's also possible to see the effects in IR alone with folds of min/max pairs.
If these were written as unsigned compares in IR, InstCombine canonicalizes the unsigned compares to signed compares.
I.e., without this patch, running the optimizer pessimizes the codegen for this case:
define <4 x i32> @umax_vec(<4 x i32> %x) {
  %cmp = icmp ugt <4 x i32> %x, <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647>
  %sel = select <4 x i1> %cmp, <4 x i32> %x, <4 x i32> <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647>
  ret <4 x i32> %sel
}

$ ./opt umax.ll -S | ./llc -o - -mattr=avx
        vpmaxud LCPI0_0(%rip), %xmm0, %xmm0

$ ./opt -instcombine umax.ll -S | ./llc -o - -mattr=avx
        vpxor %xmm1, %xmm1, %xmm1
        vpcmpgtd %xmm0, %xmm1, %xmm1
        vmovaps LCPI0_0(%rip), %xmm2 ## xmm2 = [2147483647,2147483647,2147483647,2147483647]
        vblendvps %xmm1, %xmm0, %xmm2, %xmm0
Differential Revision: https://reviews.llvm.org/D26096
llvm-svn: 286318
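The canonicalization mentioned above rests on a simple equivalence for 32-bit values: an unsigned compare x > 2147483647 gives the same answer as a signed compare x < 0. A minimal sketch of one lane follows; the helper names are hypothetical, and Python integers stand in for i32.

```python
INT32_MAX = 2147483647
MASK32 = 0xFFFFFFFF


def to_signed32(bits):
    """Reinterpret a 32-bit pattern as a signed integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def umax_unsigned_form(bits):
    """select (icmp ugt x, INT32_MAX), x, INT32_MAX on one lane."""
    bits &= MASK32
    return bits if bits > INT32_MAX else INT32_MAX


def umax_signed_form(bits):
    """Same select after the compare is rewritten as a signed x < 0."""
    bits &= MASK32
    return bits if to_signed32(bits) < 0 else INT32_MAX
```

Both functions return the same value for every 32-bit input, so the signed form is still an unsigned max and can be matched as one.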
|
llvm-svn: 286241
|
Fixed an issue with vector usage of TargetLowering::isConstTrueVal / TargetLowering::isConstFalseVal boolean result matching.
The comment said we shouldn't handle constant splat vectors with undef elements, but the actual code was returning false if the build vector contained no undef elements.
This patch now ignores the number of undefs (getConstantSplatNode will return null if the build vector is all undefs).
The change has also unearthed a couple of missed opportunities in AVX512 comparison code that will need to be addressed.
Differential Revision: https://reviews.llvm.org/D26031
llvm-svn: 286238
|
This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available.
This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when it's useful.
Differential Revision: https://reviews.llvm.org/D25910
llvm-svn: 286233
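The CTPOP-based expansion can be sketched on a single 32-bit value. This follows the well-known bit-smearing approach from "Hacker's Delight" and is an illustration only, not the DAG expansion from the patch; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def popcount32(value):
    """Count set bits in a 32-bit value (stands in for CTPOP)."""
    return bin(value & MASK32).count("1")


def ctlz32_via_ctpop(value):
    """Count leading zeros using only shifts, ORs, NOT and CTPOP."""
    value &= MASK32
    # Smear the highest set bit into every lower bit position.
    value |= value >> 1
    value |= value >> 2
    value |= value >> 4
    value |= value >> 8
    value |= value >> 16
    # The bits still clear are exactly the leading zeros.
    return popcount32(~value & MASK32)
```

Every step is a shift, OR, NOT or population count, so the same sequence applies lane-wise to a vector without scalarizing.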
|
and regenerate. This will make a change in a future patch easier to see. NFC
llvm-svn: 286216
|
llvm-svn: 286105
|
cpu/triple duplication
llvm-svn: 286104
|
selects and native zext/sext.
This mostly reuses earlier autoupgrade support for the sse and avx equivalents. Just needed to add the code to add the select.
llvm-svn: 286092
|
legacy intrinsics and a select.
llvm-svn: 286089
|
llvm-svn: 286075
|
In preparation for demandedelts support
llvm-svn: 286074
|
upgrade them to a select and the older AVX2 intrinsic.
llvm-svn: 286073
|
Instead upgrade them to a select and the older SSE/AVX2 intrinsic.
llvm-svn: 286072
|
llvm-svn: 286071
|
in xmm. Instead upgrade them to a select and the older SSE/AVX2 intrinsic.
llvm-svn: 286070
|
already exists in the avx512f test file.
llvm-svn: 286069
|
In preparation for demandedelts support
llvm-svn: 286068
|
for these test cases will be improved for AVX512 in a future commit.
llvm-svn: 286063
|
addr:)) -> VCVTPS2PDZ128rm
llvm-svn: 286059
|
-show-mc-encoding to show where we aren't using EVEX instructions.
llvm-svn: 286058
|
instruction when available.
llvm-svn: 286057
|
they can use EVEX instructions when available.
llvm-svn: 286056
|
see when VEX or EVEX encoded instructions are being emitted. Make sure the tests all have an avx2 command line and an skx command line.
llvm-svn: 286055
|
valid int64_t to the set.
Summary:
SmallSetVector uses DenseSet, but that means we need to reserve some
values for the empty and tombstone keys.
It seems to me we should have a general way to let us store full-range
ints inside of DenseSets, and furthermore that we probably shouldn't
silently let you add ints into DenseSets without explicitly promising
that they're in range. But that's a battle for another day; for now,
just fix this code, since we currently do something Very Bad when
compiling ffmpeg.
Fixes PR30914.
Reviewers: jeremyhu
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D26323
llvm-svn: 286038
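The hazard described in this message is that a hash set with reserved empty and tombstone keys cannot hold every int64_t. A minimal sketch of a guarded insert follows. The two sentinel values are assumptions chosen for illustration (the real ones come from the key type's DenseMapInfo specialization), and the function names are hypothetical.

```python
INT64_MIN = -(1 << 63)
INT64_MAX = (1 << 63) - 1

# Assumption: the two reserved keys sit at the extremes of the range.
# The actual sentinel values are defined by the container's key traits.
EMPTY_KEY = INT64_MAX
TOMBSTONE_KEY = INT64_MIN


def is_storable(value):
    """True if the value does not collide with a reserved key."""
    return value not in (EMPTY_KEY, TOMBSTONE_KEY)


def insert_checked(values, value):
    """Insert into the set only when the value is not a reserved key.

    Returns True if the value was inserted, False if it was skipped.
    """
    if not is_storable(value):
        return False
    values.add(value)
    return True
```

The caller decides what skipping means; the point is that the range check happens explicitly before the insert instead of being left to the container.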
|
Broadcast from memory instructions should be treated as moves. They can't be unfolded.
Fixes PR30693.
llvm-svn: 285998
|
llvm-svn: 285997
|
This fixes selection of KANDN instructions and allows us to remove an extra set of patterns for KNOT and KXNOR.
Reviewers: delena, igorb
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26134
llvm-svn: 285878
|
2 new intrinsics covering AVX-512 compress/expand functionality.
This implementation includes syntax, DAG builder, operation lowering and tests.
Does not include: handling of illegal data types, codegen prepare pass and the cost model.
llvm-svn: 285876
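For readers unfamiliar with compress/expand, the lane behaviour can be modelled on Python lists. This sketch assumes the usual AVX-512 semantics (expand reads consecutive memory elements into the enabled lanes, compress writes the enabled lanes consecutively); the function names are hypothetical and do not reflect the intrinsic signatures.

```python
def expand_load(memory, mask, passthru):
    """Fill enabled lanes from consecutive memory elements.

    Disabled lanes keep the corresponding passthru value.
    """
    source = iter(memory)
    return [next(source) if enabled else fallback
            for enabled, fallback in zip(mask, passthru)]


def compress_store(values, mask):
    """Return the enabled lanes packed together, in lane order."""
    return [value for value, enabled in zip(values, mask) if enabled]
```

For example, compressing lanes [1, 2, 3, 4] under mask [1, 0, 1, 0] yields [1, 3], and expanding [10, 20] under mask [0, 1, 0, 1] with a zero passthru yields [0, 10, 0, 20].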
|
bits (PR30841)
This bug was exposed by using nsw/nuw for more aggressive folds in:
https://reviews.llvm.org/rL284844
The changes mimic the IR demanded bits logic in InstCombiner::SimplifyDemandedUseBits(),
but we can't just flip flag bits in the DAG; we have to create a new node that has the
bits cleared.
This should fix:
https://llvm.org/bugs/show_bug.cgi?id=30841
llvm-svn: 285656
|
llvm-svn: 285560
|
In D26098, Davide Italiano submitted a .s file instead of the .ll file
that was the last stage of the review.
llvm-svn: 285559
|
started from shuffles.
llvm-svn: 285546
|