path: root/clang/test/CodeGen/avx512bw-builtins.c
Commit message | Author | Age | Files | Lines
* [X86] Restore the pavg intrinsics. | Craig Topper | 2019-04-15 | 1 | -44/+6
  The pattern we replaced these with may be too hard to match, as demonstrated by PR41496 and PR41316. This patch restores the intrinsics, and then we can start focusing on optimizing the intrinsics. I've mostly reverted the original patch that removed them, though I modified the avx512 intrinsics to not have masking built in.
  Differential Revision: https://reviews.llvm.org/D60674
  llvm-svn: 358427
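For orientation, the rounding average that the pavg intrinsics compute can be modeled one lane at a time in plain C. This is only a scalar sketch, assuming each lane computes (a + b + 1) >> 1 in a wider type; the names `avg_epu8_lane` and `avg_epu8_selfcheck` are hypothetical and do not come from clang's headers or from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one unsigned 8-bit pavg lane:
   widen to 16 bits, add with a rounding bias of 1, shift right, narrow. */
uint8_t avg_epu8_lane(uint8_t a, uint8_t b) {
    uint16_t wide = (uint16_t)((uint16_t)a + (uint16_t)b + 1u);
    return (uint8_t)(wide >> 1);
}

/* Small self-check of the model; call it from a test driver. */
void avg_epu8_selfcheck(void) {
    assert(avg_epu8_lane(0, 0) == 0);
    assert(avg_epu8_lane(0, 1) == 1);     /* rounds up */
    assert(avg_epu8_lane(255, 255) == 255); /* no overflow in the wide type */
}
```

The widening step is what makes the pattern awkward to recognize once other optimizations have rewritten parts of it, which is the kind of matching difficulty the commit message refers to.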
* [X86] Add shift-by-immediate tests for non-immediate/out-of-range values | Simon Pilgrim | 2019-01-08 | 1 | -0/+60
  As noted on PR40203, for gcc compatibility we need to support non-immediate values in the 'slli/srli/srai' shift by immediate vector intrinsics.
  llvm-svn: 350619
* [X86][SSE] Auto upgrade PADDS/PSUBS intrinsics to SADD_SAT/SSUB_SAT generic intrinsics (clang) | Simon Pilgrim | 2018-12-20 | 1 | -12/+12
  This emits SADD_SAT/SSUB_SAT generic intrinsics for the SSE signed saturated math intrinsics.
  LLVM counterpart: https://reviews.llvm.org/D55894
  Differential Revision: https://reviews.llvm.org/D55890
  llvm-svn: 349743
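As a reminder of what signed saturation means per lane, here is a minimal scalar sketch in C. It assumes a 16-bit lane that clamps to the int16_t range instead of wrapping; `sadd_sat_i16` and `sadd_sat_selfcheck` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one signed 16-bit saturating-add lane:
   compute the exact sum in 32 bits, then clamp to the int16_t range. */
int16_t sadd_sat_i16(int16_t a, int16_t b) {
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* Small self-check of the model; call it from a test driver. */
void sadd_sat_selfcheck(void) {
    assert(sadd_sat_i16(INT16_MAX, 1) == INT16_MAX);  /* clamps high */
    assert(sadd_sat_i16(INT16_MIN, -1) == INT16_MIN); /* clamps low */
    assert(sadd_sat_i16(100, -50) == 50);             /* in range */
}
```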
* [X86][SSE] Auto upgrade PADDUS/PSUBUS intrinsics to UADD_SAT/USUB_SAT generic intrinsics (clang) | Simon Pilgrim | 2018-12-19 | 1 | -36/+12
  Sibling patch to D55855, this emits UADD_SAT/USUB_SAT generic intrinsics for the SSE saturated math intrinsics instead of expanding to an IR code sequence that could be difficult to reassemble.
  Differential Revision: https://reviews.llvm.org/D55879
  llvm-svn: 349631
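To illustrate why an expanded sequence can be harder to recognize than a single saturating operation, here is a scalar C sketch of unsigned saturating add and subtract written as compare-and-select. This is one plausible formulation chosen for illustration, not necessarily the sequence clang emitted before this patch; all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of an unsigned 8-bit saturating add:
   a wrapped sum is smaller than either operand, so select the maximum then. */
uint8_t uadd_sat_u8(uint8_t a, uint8_t b) {
    uint8_t sum = (uint8_t)(a + b);
    return sum < a ? UINT8_MAX : sum;
}

/* Hypothetical scalar model of an unsigned 8-bit saturating subtract:
   results that would go below zero are clamped to zero. */
uint8_t usub_sat_u8(uint8_t a, uint8_t b) {
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Small self-check of the model; call it from a test driver. */
void usat_selfcheck(void) {
    assert(uadd_sat_u8(200, 100) == 255);
    assert(uadd_sat_u8(1, 2) == 3);
    assert(usub_sat_u8(5, 10) == 0);
    assert(usub_sat_u8(10, 5) == 5);
}
```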
* [X86] Add more intrinsics to match icc. | Craig Topper | 2018-10-20 | 1 | -1/+29
  This adds
  _mm_loadu_epi8, _mm256_loadu_epi8, _mm512_loadu_epi8
  _mm_loadu_epi16, _mm256_loadu_epi16, _mm512_loadu_epi16
  _mm_storeu_epi8, _mm256_storeu_epi8, _mm512_storeu_epi8
  _mm_storeu_epi16, _mm256_storeu_epi16, _mm512_storeu_epi16
  llvm-svn: 344862
* [X86] Add ktest intrinsics to match gcc and icc. | Craig Topper | 2018-08-31 | 1 | -0/+68
  These aren't documented in the Intel Intrinsics Guide, but are supported by gcc and icc. Includes these intrinsics:
  _ktestc_mask8_u8, _ktestz_mask8_u8, _ktest_mask8_u8
  _ktestc_mask16_u8, _ktestz_mask16_u8, _ktest_mask16_u8
  _ktestc_mask32_u8, _ktestz_mask32_u8, _ktest_mask32_u8
  _ktestc_mask64_u8, _ktestz_mask64_u8, _ktest_mask64_u8
  llvm-svn: 341265
* [X86] Add k-mask conversion and load/store intrinsics to match gcc and icc. | Craig Topper | 2018-08-31 | 1 | -0/+54
  This adds:
  _cvtmask8_u32, _cvtmask16_u32, _cvtmask32_u32, _cvtmask64_u64
  _cvtu32_mask8, _cvtu32_mask16, _cvtu32_mask32, _cvtu64_mask64
  _load_mask8, _load_mask16, _load_mask32, _load_mask64
  _store_mask8, _store_mask16, _store_mask32, _store_mask64
  These are currently missing from the Intel Intrinsics Guide webpage.
  llvm-svn: 341251
* [X86] Add kshift intrinsics to match gcc and icc. | Craig Topper | 2018-08-31 | 1 | -0/+32
  This adds the following intrinsics:
  _kshiftli_mask8
  _kshiftli_mask16
  _kshiftli_mask32
  _kshiftli_mask64
  _kshiftri_mask8
  _kshiftri_mask16
  _kshiftri_mask32
  _kshiftri_mask64
  llvm-svn: 341234
* [X86] Add kadd intrinsics to match gcc and icc. | Craig Topper | 2018-08-28 | 1 | -0/+22
  This adds the following intrinsics:
  _kadd_mask64
  _kadd_mask32
  _kadd_mask16
  _kadd_mask8
  These are missing from the Intel Intrinsics Guide, but are implemented by both gcc and icc.
  llvm-svn: 340879
* [X86] Add kortest intrinsics for 8, 32, and 64 bit masks. Add new intrinsic names for 16 bit masks. | Craig Topper | 2018-08-28 | 1 | -0/+92
  This matches gcc and icc despite not being documented in the Intel Intrinsics Guide.
  llvm-svn: 340798
* [X86] Add intrinsics for kand/kandn/knot/kor/kxnor/kxor with 8, 32, and 64-bit mask registers. | Craig Topper | 2018-08-27 | 1 | -0/+130
  This also adds a second intrinsic name for the 16-bit mask versions. These intrinsics match gcc and icc. They just aren't published in the Intel Intrinsics Guide so I only recently found they existed.
  llvm-svn: 340719
* [X86] Remove masking from the 512-bit padds and psubs builtins. Use select builtin instead. | Craig Topper | 2018-08-16 | 1 | -12/+20
  llvm-svn: 339843
* [X86] Lowering addus/subus intrinsics to native IR | Tomasz Krupa | 2018-08-14 | 1 | -27/+71
  Summary: This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations.
  Reviewers: craig.topper, spatel, RKSimon
  Reviewed By: craig.topper
  Subscribers: cfe-commits
  Differential Revision: https://reviews.llvm.org/D46892
  llvm-svn: 339651
* [X86] Remove masking from dbpsadbw builtins, use select builtin instead. | Craig Topper | 2018-06-11 | 1 | -3/+5
  llvm-svn: 334385
* [X86] Add builtins for pshufd, pshuflw, and pshufhw to enable target feature and immediate range checking. | Craig Topper | 2018-06-08 | 1 | -6/+6
  llvm-svn: 334265
* [X86] Add back builtins for _mm_slli_si128/_mm_srli_si128 and similar intrinsics. | Craig Topper | 2018-06-07 | 1 | -2/+2
  We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits. This also improves our -O0 codegen slightly because we're able to see the zeroinitializer in the shuffle. It looks like it got lost behind a store+load previously.
  llvm-svn: 334208
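The idea of expressing a byte shift as a shuffle against a zero vector can be sketched in scalar C. This is a model of a single 128-bit lane under the assumption that byte 0 is the least significant byte and that shift amounts of 16 or more produce all zeros; the function names are hypothetical and the code is not taken from CGBuiltin.cpp.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a 128-bit byte shift left written as a
   two-source shuffle: indices 0..15 pick from a zero vector, indices
   16..31 pick from the input. Output byte i uses index 16 + i - imm. */
void byte_shift_left_128(const uint8_t in[16], unsigned imm, uint8_t out[16]) {
    uint8_t concat[32];
    memset(concat, 0, 16);        /* first shuffle source: zeroinitializer */
    memcpy(concat + 16, in, 16);  /* second shuffle source: the input */
    if (imm > 16) imm = 16;       /* anything past 16 selects only zeros */
    for (unsigned i = 0; i < 16; ++i)
        out[i] = concat[16 + i - imm];
}

/* Small self-check of the model; call it from a test driver. */
void byte_shift_selfcheck(void) {
    uint8_t in[16], out[16];
    for (unsigned i = 0; i < 16; ++i) in[i] = (uint8_t)(i + 1);
    byte_shift_left_128(in, 3, out);
    assert(out[0] == 0 && out[2] == 0); /* zeros shifted in */
    assert(out[3] == 1 && out[15] == 13);
}
```

Because the zero source is an explicit operand of the shuffle, it stays visible as a constant rather than being hidden behind a memory round trip, which is the -O0 improvement the commit message describes.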
* [X86] Avoid passing _mm_undefined* to builtin_shufflevector if we are able to pass the first input a second time. | Craig Topper | 2018-06-04 | 1 | -6/+6
  This is more consistent with other usages of builtin_shufflevector. Later optimization passes or codegen will detect the duplicate vector and replace it with undef. Using _mm_undefined just puts a zeroinitializer that still needs to be optimized out later.
  llvm-svn: 333944
* [X86] Reduce the number of setzero intrinsics to just the set defined by the Intel Intrinsics Guide. | Craig Topper | 2018-05-30 | 1 | -2/+2
  We had quite a few for different element sizes of integers sometimes with strange target features attached to them. We only need a single version for each of _m128i, _m256i, and _m512i with the target feature that first introduced those types.
  llvm-svn: 333568
* [X86] Merge the 3 different flavors of masked vpermi2var/vpermt2var builtins to a single version without masking. Use select builtins with appropriate operand instead. | Craig Topper | 2018-05-29 | 1 | -4/+7
  llvm-svn: 333387
* [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead. | Craig Topper | 2018-05-20 | 1 | -3/+5
  Someday maybe we'll use selects for all the builtins.
  llvm-svn: 332825
* [X86] Use __builtin_convertvector to replace some of the avx512 truncate builtins. | Craig Topper | 2018-05-14 | 1 | -3/+5
  As long as the destination type is a 256 or 128 bit vector with the same number of elements, we can use __builtin_convertvector to directly generate a trunc IR instruction, which will be handled natively by the backend.
  Differential Revision: https://reviews.llvm.org/D46742
  llvm-svn: 332266
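A small illustration of element-wise truncation with `__builtin_convertvector`, using generic vector extensions instead of the AVX-512 headers so that it does not depend on a particular target feature. This is a sketch that assumes a compiler supporting both the `vector_size` attribute and `__builtin_convertvector` (Clang, or a recent GCC); the typedef and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical vector types: 8 x u16 (128 bits) and 8 x u8 (64 bits).
   The element count matches, which is what __builtin_convertvector requires. */
typedef uint16_t u16x8 __attribute__((vector_size(16)));
typedef uint8_t u8x8 __attribute__((vector_size(8)));

/* Truncate each 16-bit element to its low 8 bits. */
void truncate_u16x8_to_u8x8(const uint16_t in[8], uint8_t out[8]) {
    u16x8 wide;
    u8x8 narrow;
    memcpy(&wide, in, sizeof wide);
    narrow = __builtin_convertvector(wide, u8x8); /* element-wise conversion */
    memcpy(out, &narrow, sizeof narrow);
}

/* Small self-check of the sketch; call it from a test driver. */
void truncate_selfcheck(void) {
    const uint16_t in[8] = {0x1234, 0x00FF, 0x0100, 0xFFFF, 1, 2, 3, 0x8080};
    uint8_t out[8];
    truncate_u16x8_to_u8x8(in, out);
    assert(out[0] == 0x34 && out[2] == 0x00 && out[7] == 0x80);
}
```

Unsigned element types are used here so that the narrowing is plain modulo arithmetic in C terms.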
* [x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics | Chandler Carruth | 2018-04-26 | 1 | -205/+39
  The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case.
  llvm-svn: 330997
* Lowering x86 adds/addus/subs/subus intrinsics (clang) | Alexander Ivchenko | 2018-04-19 | 1 | -39/+205
  This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations.
  Patch by tkrupa
  Differential Revision: https://reviews.llvm.org/D44786
  llvm-svn: 330323
* [X86] Replace 512-bit masked pmaddubsw and pmaddwd intrinsic with unmasked intrinsic and a select. | Craig Topper | 2018-04-11 | 1 | -6/+10
  This makes it consistent with the 128/256-bit functions. Someday maybe we'll have all the masking moved to selects.
  llvm-svn: 329775
* [X86] Remove mask from 512 bit pmulhrsw/pmulhw/pmulhuw builtins. | Craig Topper | 2018-02-20 | 1 | -9/+15
  We now use a vselect node in IR around an unmasked builtin. This makes it consistent with the 128 and 256 bit versions.
  llvm-svn: 325560
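The "unmasked operation wrapped in a select" shape that recurs throughout this log can be modeled in scalar C. This is a sketch only: it assumes lane i of the result comes from the computed vector when mask bit i is set and from a passthrough vector otherwise (a zero passthrough gives zero-masking). The names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a per-lane select driven by a bit mask.
   lanes must be at most 32 so that every lane has a mask bit. */
void select_lanes_i16(uint32_t mask, const int16_t *computed,
                      const int16_t *passthru, int16_t *out, unsigned lanes) {
    for (unsigned i = 0; i < lanes; ++i)
        out[i] = ((mask >> i) & 1u) ? computed[i] : passthru[i];
}

/* Small self-check of the model; call it from a test driver. */
void select_selfcheck(void) {
    const int16_t computed[4] = {10, 20, 30, 40};
    const int16_t passthru[4] = {1, 2, 3, 4};
    int16_t out[4];
    select_lanes_i16(0x5u, computed, passthru, out, 4); /* bits 0 and 2 set */
    assert(out[0] == 10 && out[1] == 2 && out[2] == 30 && out[3] == 4);
}
```

Keeping the select separate from the arithmetic means the arithmetic can be optimized on its own, independent of the masking.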
* [X86] Reverse the operand order of the implementation of the kunpack builtins. | Craig Topper | 2018-02-12 | 1 | -2/+2
  The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior. Fixes PR36360.
  llvm-svn: 324954
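The operand order described above can be written down as a scalar C model: the second operand fills the low half of the result and the first operand fills the high half. This is a sketch with hypothetical function names, taking the already-narrowed halves as inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 16+16 -> 32 bit mask unpack:
   b lands in bits 15..0, a lands in bits 31..16. */
uint32_t kunpackw_model(uint16_t a, uint16_t b) {
    return ((uint32_t)a << 16) | (uint32_t)b;
}

/* Hypothetical model of a 32+32 -> 64 bit mask unpack:
   b lands in bits 31..0, a lands in bits 63..32. */
uint64_t kunpackd_model(uint32_t a, uint32_t b) {
    return ((uint64_t)a << 32) | (uint64_t)b;
}

/* Small self-check of the model; call it from a test driver. */
void kunpack_selfcheck(void) {
    assert(kunpackw_model(0xAAAAu, 0x5555u) == 0xAAAA5555u);
    assert(kunpackd_model(0x1u, 0x2u) == 0x0000000100000002ull);
}
```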
* [X86] Implement old kunpck intrinsics using vector ops on vXi1 instead of integer shift/and/or | Craig Topper | 2018-01-14 | 1 | -12/+11
  Summary: kunpck intrinsics were removed in favor of native IR a few months ago. The implementation lowers them by operating on the integer types passed to the intrinsic, just shifting, masking, and oring them together. A special X86 DAG combine was added to recognize this pattern and turn it into a concat_vector operation. I think it makes more sense to keep the IR implementation closer to vector operations on vXi1, given that we expect these builtins to be used around other builtins that operate on k-registers, which we try to represent in IR with vXi1. InstCombine should be able to get rid of the bitcasts between integers and vXi1, leaving only the vector operations.
  Reviewers: RKSimon, spatel, zvi, jina.nahias
  Reviewed By: RKSimon
  Subscribers: cfe-commits
  Differential Revision: https://reviews.llvm.org/D42016
  llvm-svn: 322461
* [X86] Replace cvt*2mask intrinsics with native IR using 'icmp slt X, zeroinitializer'. | Craig Topper | 2018-01-08 | 1 | -2/+4
  llvm-svn: 322038
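A signed less-than-zero comparison per element is the same thing as collecting each element's sign bit into a mask. The scalar C sketch below models that for sixteen 8-bit lanes; the function names are hypothetical and the lane count is chosen only to keep the example small.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a vector-to-mask conversion:
   mask bit i is set exactly when lane i is negative (its sign bit is 1). */
uint16_t movepi8_mask_model(const int8_t lanes[16]) {
    uint16_t mask = 0;
    for (unsigned i = 0; i < 16; ++i)
        if (lanes[i] < 0)
            mask |= (uint16_t)(1u << i);
    return mask;
}

/* Small self-check of the model; call it from a test driver. */
void movepi8_mask_selfcheck(void) {
    int8_t lanes[16] = {0};
    lanes[0] = -1;
    lanes[15] = -128;
    lanes[7] = 127; /* positive, so its bit stays clear */
    assert(movepi8_mask_model(lanes) == 0x8001u);
}
```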
* [x86][AVX512] Lowering kunpack intrinsics to LLVM IR | Jina Nahias | 2017-12-05 | 1 | -6/+16
  This patch, together with a matching llvm patch (https://reviews.llvm.org/D39720), implements the lowering of X86 kunpack intrinsics to IR.
  Differential Revision: https://reviews.llvm.org/D39719
  Change-Id: Id5d3cb394ad33b98be79a6783d1d15569e2b798d
  llvm-svn: 319777
* [X86] test/testn intrinsics lowering to IR. clang side | Uriel Korach | 2017-11-13 | 1 | -8/+20
  Change the header files of the intrinsics to lower the test and testn intrinsics to IR code. Removed test and testn builtins from clang.
  Differential Revision: https://reviews.llvm.org/D38737
  llvm-svn: 318035
* Lowering Mask Set1 intrinsics to LLVM IR | Jina Nahias | 2017-09-19 | 1 | -4/+194
  This patch, together with a matching llvm patch (https://reviews.llvm.org/D37669), implements the lowering of X86 mask set1 intrinsics to IR.
  Differential Revision: https://reviews.llvm.org/D37668
  llvm-svn: 313624
* [X86] [PATCH] [intrinsics] Lowering X86 ABS intrinsics to IR. (clang) | Uriel Korach | 2017-09-13 | 1 | -6/+22
  This patch, together with a matching llvm patch (https://reviews.llvm.org/D37693), implements the lowering of X86 ABS intrinsics to IR.
  Differential Revision: https://reviews.llvm.org/D37694
  llvm-svn: 313133
* [X86] Lower _mm[256|512]_[mask[z]]_avg_epu[8|16] intrinsics to native llvm IR | Yael Tsafrir | 2017-09-12 | 1 | -6/+48
  Differential Revision: https://reviews.llvm.org/D37562
  llvm-svn: 313011
* Fix problem with test. | Michael Zuckerman | 2017-04-04 | 1 | -4/+4
  llvm-svn: 299442
* [X86][Clang] Converting __mm{|256|512}_movm_epi{8|16|32|64} LLVMIR call into generic intrinsics. | Michael Zuckerman | 2017-04-04 | 1 | -2/+4
  This patch is part two of two reviews, one for clang and the other for LLVM. In this patch, I covered the clang side by introducing the intrinsic to the front end. This is done by creating a generic replacement.
  Differential Revision: https://reviews.llvm.org/D31394a
  llvm-svn: 299431
* [AVX-512] Fix test cases that were using the builtins directly without typecasts instead of the intrinsic header. | Craig Topper | 2017-03-17 | 1 | -3/+3
  llvm-svn: 298041
* [x86] these aren't the undefs you're looking for (PR32176) | Sanjay Patel | 2017-03-12 | 1 | -6/+6
  x86 has undef SSE/AVX intrinsics that should represent a bogus register operand. This is not the same as LLVM's undef value which can take on multiple bit patterns.
  There are better solutions / follow-ups to this discussed here: https://bugs.llvm.org/show_bug.cgi?id=32176 ...but this should prevent miscompiles with a one-line code change.
  Differential Revision: https://reviews.llvm.org/D30834
  llvm-svn: 297588
* [AVX-512] Replace 512-bit masked packss/packus builtins with new unmasked builtins. | Craig Topper | 2017-02-16 | 1 | -12/+20
  These new unmasked builtins will enable us to easily support optimizing these builtins in InstCombine in the backend.
  llvm-svn: 295291
* [AVX-512] Remove masking from 512-bit pshufb builtin. The backend now has a version without masking so wrap it with select. | Craig Topper | 2016-12-10 | 1 | -3/+5
  This will allow the backend to constant fold these to generic shuffle vectors like 128-bit and 256-bit without having to worry about handling masking.
  llvm-svn: 289345
* [AVX-512] Replace masked 16-bit element variable shift builtins with new unmasked versions and selects. | Craig Topper | 2016-11-18 | 1 | -9/+15
  llvm-svn: 287313
* [AVX-512] Convert the rest of the masked shift by immediate and by single element builtins over to the newly added unmasked builtins and a select. | Craig Topper | 2016-11-12 | 1 | -18/+30
  This should also fix PR30691 since the new builtins are handled like the legacy builtins in the backend.
  llvm-svn: 286714
* [AVX-512] Replace 64-bit element and 512-bit vector pmin/pmax builtins with native IR like we do for 128/256-bit, but with the addition of masking. | Craig Topper | 2016-10-24 | 1 | -24/+64
  llvm-svn: 284956
* [AVX-512] Replace 512-bit pmovzx/sx builtins with native IR. | Craig Topper | 2016-10-23 | 1 | -6/+10
  llvm-svn: 284936
* [X86] Remove the mm_malloc.h include guard hack from the X86 builtins tests | Elad Cohen | 2016-09-28 | 1 | -4/+2
  The X86 clang/test/CodeGen/*builtins.c tests define the mm_malloc.h include guard as a hack for avoiding its inclusion (mm_malloc.h requires a hosted environment since it expects stdlib.h to be available - which is not the case in these internal clang codegen tests). This patch removes this hack and instead passes -ffreestanding to clang cc1.
  Differential Revision: https://reviews.llvm.org/D24825
  llvm-svn: 282581
* [AVX-512] Remove masked integer mullo builtins and replace with native IR. | Craig Topper | 2016-09-03 | 1 | -2/+4
  llvm-svn: 280597
* [AVX-512] Remove masked integer add/sub builtins and replace with native IR. | Craig Topper | 2016-09-03 | 1 | -8/+16
  llvm-svn: 280596
* After PR28761 use -Wall with -Werror in builtins tests to identify possible problems in headers. | Eric Christopher | 2016-08-04 | 1 | -2/+2
  llvm-svn: 277696
* [X86][AVX512] Converted the VBROADCAST intrinsics to generic IR | Simon Pilgrim | 2016-07-05 | 1 | -13/+16
  llvm-svn: 274544
* [Clang][BuiltIn][AVX512] adding _mm{|256|512}_mask_cvt{s|us|}epi16_storeu_epi8 intrinsics | Michael Zuckerman | 2016-07-05 | 1 | -0/+20
  Differential Revision: http://reviews.llvm.org/D21729
  llvm-svn: 274532
* Update the expected masked load/store intrinsics names in tests | Artur Pilipenko | 2016-06-28 | 1 | -6/+6
  The mangling of their names was changed in order to support arbitrary addrspace pointers as arguments in rL274043.
  llvm-svn: 274044