| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pattern we replaced these with may be too hard to match as demonstrated by
PR41496 and PR41316.
This patch restores the intrinsics and then we can start focusing
on the optimizing the intrinsics.
I've mostly reverted the original patch that removed them. Though I modified
the avx512 intrinsics to not have masking built in.
Differential Revision: https://reviews.llvm.org/D60674
llvm-svn: 358427
|
|
|
|
|
|
| |
As noted on PR40203, for gcc compatibility we need to support non-immediate values in the 'slli/srli/srai' shift by immediate vector intrinsics.
llvm-svn: 350619
|
|
|
|
|
|
|
|
|
|
|
|
| |
intrinsics (clang)
This emits SADD_SAT/SSUB_SAT generic intrinsics for the SSE signed saturated math intrinsics.
LLVM counterpart: https://reviews.llvm.org/D55894
Differential Revision: https://reviews.llvm.org/D55890
llvm-svn: 349743
|
|
|
|
|
|
|
|
|
|
| |
generic intrinsics (clang)
Sibling patch to D55855, this emits UADD_SAT/USUB_SAT generic intrinsics for the SSE saturated math intrinsics instead of expanding to a IR code sequence that could be difficult to reassemble.
Differential Revision: https://reviews.llvm.org/D55879
llvm-svn: 349631
|
|
|
|
|
|
|
|
|
|
| |
This adds
_mm_loadu_epi8, _mm256_loadu_epi8, _mm512_loadu_epi8
_mm_loadu_epi16, _mm256_loadu_epi16, _mm512_loadu_epi16
_mm_storeu_epi8, _mm256_storeu_epi8, _mm512_storeu_epi8
_mm_storeu_epi16, _mm256_storeu_epi16, _mm512_storeu_epi16
llvm-svn: 344862
|
|
|
|
|
|
|
|
|
|
|
|
| |
These aren't documented in the Intel Intrinsics Guide, but are supported by gcc and icc.
Includes these intrinsics:
_ktestc_mask8_u8, _ktestz_mask8_u8, _ktest_mask8_u8
_ktestc_mask16_u8, _ktestz_mask16_u8, _ktest_mask16_u8
_ktestc_mask32_u8, _ktestz_mask32_u8, _ktest_mask32_u8
_ktestc_mask64_u8, _ktestz_mask64_u8, _ktest_mask64_u8
llvm-svn: 341265
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds:
_cvtmask8_u32, _cvtmask16_u32, _cvtmask32_u32, _cvtmask64_u64
_cvtu32_mask8, _cvtu32_mask16, _cvtu32_mask32, _cvtu64_mask64
_load_mask8, _load_mask16, _load_mask32, _load_mask64
_store_mask8, _store_mask16, _store_mask32, _store_mask64
These are currently missing from the Intel Intrinsics Guide webpage.
llvm-svn: 341251
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the following intrinsics:
_kshiftli_mask8
_kshiftli_mask16
_kshiftli_mask32
_kshiftli_mask64
_kshiftri_mask8
_kshiftri_mask16
_kshiftri_mask32
_kshiftri_mask64
llvm-svn: 341234
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the following intrinsics:
_kadd_mask64
_kadd_mask32
_kadd_mask16
_kadd_mask8
These are missing from the Intel Intrinsics Guide, but are implemented by both gcc and icc.
llvm-svn: 340879
|
|
|
|
|
|
|
|
| |
names for 16 bit masks.
This matches gcc and icc despite not being documented in the Intel Intrinsics Guide.
llvm-svn: 340798
|
|
|
|
|
|
|
|
|
|
| |
64-bit mask registers.
This also adds a second intrinsic name for the 16-bit mask versions.
These intrinsics match gcc and icc. They just aren't published in the Intel Intrinsics Guide so I only recently found they existed.
llvm-svn: 340719
|
|
|
|
|
|
| |
builtin instead.
llvm-svn: 339843
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations.
Reviewers: craig.topper, spatel, RKSimon
Reviewed By: craig.topper
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D46892
llvm-svn: 339651
|
|
|
|
| |
llvm-svn: 334385
|
|
|
|
|
|
| |
and immediate range checking.
llvm-svn: 334265
|
|
|
|
|
|
|
|
|
|
| |
intrinsics.
We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits.
This also improves our -O0 codegen slightly because we're able to see the zeroinitializer in the shuffle. It looks like it got lost behind a store+load previously.
llvm-svn: 334208
|
|
|
|
|
|
|
|
| |
to pass the first input a second time.
This is more consistent with other usages of builtin_shufflevector. Later optimization passes or codegen will detect the duplicate vector and replace it with undef. Using _mm_undefined just puts a zeroinitializer that still needs to be optimized out later.
llvm-svn: 333944
|
|
|
|
|
|
|
|
|
|
| |
Intel Intrinsics Guide.
We had quite a few for different element sizes of integers sometimes with strange target features attached to them.
We only need a single version for each of _m128i, _m256i, and _m512i with the target feature that first introduced those types.
llvm-svn: 333568
|
|
|
|
|
|
| |
to a single version without masking. Use select builtins with appropriate operand instead.
llvm-svn: 333387
|
|
|
|
|
|
|
|
| |
in IR instead.
Someday maybe we'll use selects for all the builtins.
llvm-svn: 332825
|
|
|
|
|
|
|
|
|
|
| |
builtins.
As long as the destination type is a 256 or 128 bit vector with the same number of elements we can use __builtin_convertvector to directly generate trunc IR instruction which will be handled natively by the backend.
Differential Revision: https://reviews.llvm.org/D46742
llvm-svn: 332266
|
|
|
|
|
|
|
|
| |
The LLVM commit introduces a crash in LLVM's instruction selection.
I filed http://llvm.org/PR37260 with the test case.
llvm-svn: 330997
|
|
|
|
|
|
|
|
|
|
|
| |
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations.
Patch by tkrupa
Differential Revision: https://reviews.llvm.org/D44786
llvm-svn: 330323
|
|
|
|
|
|
|
|
|
|
| |
intrinsic and a select.
This makes it consistent with the 128/256-bit functions.
Someday maybe we'll have all the masking moved to selects.
llvm-svn: 329775
|
|
|
|
|
|
| |
We now use a vselect node in IR around an unmasked builtin. This makes it consistent with the 128 and 256 bit versions.
llvm-svn: 325560
|
|
|
|
|
|
|
|
| |
The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior.
Fixes PR36360.
llvm-svn: 324954
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
integer shift/and/or
Summary:
kunpck intrinsics were removed in favor of native IR a few months ago. The implementation lowers them as by operation on the integer types passed to the intrinsic and then just shifting, masking, and oring them together. A special X86 DAG combine was added to recognize this patter and turn it into a concat_vector operation.
I think it makes more sense to keep the IR implementation closer to vector operations on vXi1. Given that we expect these builtins to be used around other builtins that operate on k-registers which we try to represent in IR with vXi1. InstCombine should be able to get rid of the bitcasts between integers and vXi1 leaving only the vector operations.
Reviewers: RKSimon, spatel, zvi, jina.nahias
Reviewed By: RKSimon
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42016
llvm-svn: 322461
|
|
|
|
|
|
| |
zeroinitializer.
llvm-svn: 322038
|
|
|
|
|
|
|
|
|
| |
This patch, together with a matching llvm patch (https://reviews.llvm.org/D39720), implements the lowering of X86 kunpack intrinsics to IR.
Differential Revision: https://reviews.llvm.org/D39719
Change-Id: Id5d3cb394ad33b98be79a6783d1d15569e2b798d
llvm-svn: 319777
|
|
|
|
|
|
|
|
|
| |
Change Header files of the intrinsics for lowering test and testn intrinsics to IR code.
Removed test and testn builtins from clang
Differential Revision: https://reviews.llvm.org/D38737
llvm-svn: 318035
|
|
|
|
|
|
|
|
| |
This patch, together with a matching llvm patch (https://reviews.llvm.org/D37669), implements the lowering of X86 mask set1 intrinsics to IR.
Differential Revision: https://reviews.llvm.org/D37668
llvm-svn: 313624
|
|
|
|
|
|
|
|
| |
This patch, together with a matching llvm patch (https://reviews.llvm.org/D37693), implements the lowering of X86 ABS intrinsics to IR.
Differential Revision: https://reviews.llvm.org/D37694
llvm-svn: 313133
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D37562
llvm-svn: 313011
|
|
|
|
| |
llvm-svn: 299442
|
|
|
|
|
|
|
|
|
|
|
|
| |
generic intrinsics.
This patch is a part two of two reviews, one for the clang and the other for LLVM.
In this patch, I covered the clang side, by introducing the intrinsic to the front end.
This is done by creating a generic replacement.
Differential Revision: https://reviews.llvm.org/D31394a
llvm-svn: 299431
|
|
|
|
|
|
| |
typecasts instead of the intrinsic header.
llvm-svn: 298041
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
x86 has undef SSE/AVX intrinsics that should represent a bogus register operand.
This is not the same as LLVM's undef value which can take on multiple bit patterns.
There are better solutions / follow-ups to this discussed here:
https://bugs.llvm.org/show_bug.cgi?id=32176
...but this should prevent miscompiles with a one-line code change.
Differential Revision: https://reviews.llvm.org/D30834
llvm-svn: 297588
|
|
|
|
|
|
|
|
| |
unmasked builtins.
These new unmasked builtins will enable us to easily support optimizing these builtins in InstCombine in the backend.
llvm-svn: 295291
|
|
|
|
|
|
|
|
| |
version without masking so wrap it with select.
This will allow the backend to constant fold these to generic shuffle vectors like 128-bit and 256-bit without having to working about handling masking.
llvm-svn: 289345
|
|
|
|
|
|
| |
unmasked versions and selects.
llvm-svn: 287313
|
|
|
|
|
|
|
|
| |
element builtins over to the newly added unmasked builtins and a select.
This should also fix PR30691 since the new builtins are handled like the legacy builtins in the backend.
llvm-svn: 286714
|
|
|
|
|
|
| |
native IR like we do for 128/256-bit, but with the addition of masking.
llvm-svn: 284956
|
|
|
|
| |
llvm-svn: 284936
|
|
|
|
|
|
|
|
|
|
|
|
| |
The X86 clang/test/CodeGen/*builtins.c tests define the mm_malloc.h include
guard as a hack for avoiding its inclusion (mm_malloc.h requires a hosted
environment since it expects stdlib.h to be available - which is not the case
in these internal clang codegen tests).
This patch removes this hack and instead passes -ffreestanding to clang cc1.
Differential Revision: https://reviews.llvm.org/D24825
llvm-svn: 282581
|
|
|
|
| |
llvm-svn: 280597
|
|
|
|
| |
llvm-svn: 280596
|
|
|
|
|
|
| |
possible problems in headers.
llvm-svn: 277696
|
|
|
|
| |
llvm-svn: 274544
|
|
|
|
|
|
|
|
| |
_mm{|256|512}_mask_cvt{s|us|}epi16_storeu_epi8 intrinsics
Differential Revision: http://reviews.llvm.org/D21729
llvm-svn: 274532
|
|
|
|
|
|
| |
The mangling of their names was changed in order to support arbitrary addrspace pointers as arguments in rL274043.
llvm-svn: 274044
|