path: root/clang/lib/Headers
Commit message | Author | Age | Files | Lines
...
* [X86] Undef the vector reduction helper macros when we're done with them.Craig Topper2018-05-231-0/+8
| | | | | | These are implementation helper macros; we shouldn't expose them to user code if we don't need to. llvm-svn: 333064
* [X86] In the floating point max reduction intrinsics, negate infinity before ↵Craig Topper2018-05-231-2/+2
| | | | | | | | feeding it to set1. Previously we negated the whole vector after splatting infinity. But it's better to negate the infinity before splatting. This generates IR with the negate already folded with the infinity constant. llvm-svn: 333062
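A minimal scalar sketch of why negative infinity shows up in a max reduction; this is not the header's code. It assumes the splatted constant fills the masked-off lanes, which the message above does not state. `LANES` and `masked_reduce_max` are hypothetical names.

```c
#include <math.h>

#define LANES 16 /* hypothetical lane count, chosen to resemble a 512-bit float vector */

/* Masked max reduction over an array standing in for a vector.
 * Lanes whose mask bit is clear are replaced by -INFINITY, which is the
 * identity element for max, so they can never win the comparison. */
static float masked_reduce_max(unsigned short mask, const float v[LANES]) {
    float acc = -INFINITY; /* the already-negated constant that would be splatted */
    for (int i = 0; i < LANES; ++i) {
        float lane = ((mask >> i) & 1) ? v[i] : -INFINITY;
        acc = (lane > acc) ? lane : acc;
    }
    return acc;
}
```

Writing the constant as `-INFINITY` up front, rather than negating a whole vector of `INFINITY`, mirrors the change described: the negation applies to a single constant and can be folded at compile time.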
* [X86] Remove mask argument from more builtins that are handled completely in ↵Craig Topper2018-05-233-404/+233
| | | | | | CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead. llvm-svn: 333061
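A minimal scalar sketch of the "unmasked operation plus select" pattern described above, not the header's code. Arrays stand in for vectors, and `add_unmasked`, `select_lanes` and `add_masked` are hypothetical names, not Clang builtins.

```c
#include <stdint.h>

#define LANES 8 /* hypothetical lane count */

/* Stand-in for an unmasked builtin: lane-wise wrapping 32-bit add. */
static void add_unmasked(const int32_t a[LANES], const int32_t b[LANES],
                         int32_t out[LANES]) {
    for (int i = 0; i < LANES; ++i)
        out[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
}

/* Stand-in for the select: pick on_true where the mask bit is set,
 * otherwise on_false. */
static void select_lanes(uint8_t mask, const int32_t on_true[LANES],
                         const int32_t on_false[LANES], int32_t out[LANES]) {
    for (int i = 0; i < LANES; ++i)
        out[i] = ((mask >> i) & 1) ? on_true[i] : on_false[i];
}

/* Masked form built from the two pieces: compute every lane, then keep
 * the pass-through value `src` in lanes whose mask bit is clear. */
static void add_masked(const int32_t src[LANES], uint8_t mask,
                       const int32_t a[LANES], const int32_t b[LANES],
                       int32_t out[LANES]) {
    int32_t tmp[LANES];
    add_unmasked(a, b, tmp);
    select_lanes(mask, tmp, src, out);
}
```

With this split the operation itself needs no mask parameter; a zero-masking variant would pass an all-zero array as `src`.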
* [X86] As mentioned in post-commit feedback in D47174, move the 128 bit f16c ↵Craig Topper2018-05-225-135/+90
| | | | | | | | | | intrinsics into f16cintrin.h and remove __emmintrin_f16c.h These were included in emmintrin.h to match Intel Intrinsics Guide documentation. But this is because icc is capable of emulating them on targets that don't support F16C using library calls. Clang/LLVM doesn't have this emulation support. So it makes more sense to include them in immintrin.h instead. I've left a comment behind to hopefully deter someone from trying to move them again in the future. llvm-svn: 333033
* [X86] Remove mask argument from some builtins that are handled completely in ↵Craig Topper2018-05-223-74/+50
| | | | | | CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead. llvm-svn: 333027
* [X86] Another attempt at fixing the intrinsic module map for r333014.Craig Topper2018-05-221-1/+1
| | | | llvm-svn: 333026
* [X86] Add two missing #endif directives to immintrin.h that should have been ↵Craig Topper2018-05-221-0/+2
| | | | | | in r333014. llvm-svn: 333023
* [X86] Add __emmintrin_f16c.h to module map and CMakeLists.Craig Topper2018-05-222-0/+6
| | | | | | I missed this in r333014 llvm-svn: 333020
* [X86] Move 128-bit f16c intrinsics to __emmintrin_f16c.h include from ↵Craig Topper2018-05-224-110/+149
| | | | | | | | | | | | emmintrin.h. Move 256-bit f16c intrinsics back to f16cintrin.h Intel documents the 128-bit versions as being in emmintrin.h and the 256-bit version as being in immintrin.h. This patch makes a new __emmintrin_f16c.h to hold the 128-bit versions to be included from emmintrin.h. And makes the existing f16cintrin.h contain the 256-bit versions and include it from immintrin.h with an error if it's included directly. Differential Revision: https://reviews.llvm.org/D47174 llvm-svn: 333014
* [X86] Prevent inclusion of __wmmintrin_aes.h and __wmmintrin_pclmul.h ↵Craig Topper2018-05-222-2/+10
| | | | | | without including wmmintrin.h llvm-svn: 332929
* [X86] Use __builtin_convertvector to implement some of the packed integer to ↵Craig Topper2018-05-216-108/+72
| | | | | | | | | | | | packed float conversion intrinsics. I believe this is safe assuming the default FP environment. The conversion might be inexact, but it can never overflow the FP type so this shouldn't be undefined behavior for the uitofp/sitofp instructions. We already do something similar for scalar conversions. Differential Revision: https://reviews.llvm.org/D46863 llvm-svn: 332882
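A minimal sketch of a packed integer to packed float conversion written with `__builtin_convertvector`, not the header's code. It assumes a compiler with GNU vector extensions (Clang, or GCC 9 and later); the typedef names and `cvt_i32_to_f32` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical 128-bit vector types: four 32-bit lanes each. */
typedef int32_t v4i32 __attribute__((vector_size(16)));
typedef float v4f32 __attribute__((vector_size(16)));

/* Lane-wise signed int -> float conversion. Each lane behaves like a
 * scalar (float) cast: large values may round, but every int32_t fits
 * in float's range, so the conversion cannot overflow. */
static v4f32 cvt_i32_to_f32(v4i32 v) {
    return __builtin_convertvector(v, v4f32);
}
```

Each result lane should match the scalar cast of the same input lane under the default rounding mode, which is the equivalence the message above relies on.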
* [X86] Remove some preprocessor feature checks from intrinsic headersCraig Topper2018-05-212-4/+0
| | | | | | | | | | | | | | | | | | | Summary: These look to be a couple of things that weren't removed when we switched to target attributes. Removing the popcnt check makes including just smmintrin.h also include popcntintrin.h. The popcnt file itself already contains target attributes. The prefetch ones are just wrappers around __builtin_prefetch, which we have graceful fallbacks for in the backend if the exact instruction isn't available. So there's no reason to hide them. And it makes them available in functions that have the right target attribute but not a -march command line flag. Reviewers: echristo, RKSimon, spatel, DavidKreitzer Reviewed By: echristo Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D47029 llvm-svn: 332830
* [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select ↵Craig Topper2018-05-206-152/+92
| | | | | | | | in IR instead. Someday maybe we'll use selects for all the builtins. llvm-svn: 332825
* [X86] Fix a bad cast from mask16 to mask8 in _mm256_mask_cvtepi16_epi8 ↵Craig Topper2018-05-181-2/+2
| | | | | | introduced in r332266. llvm-svn: 332738
* [CUDA] Make std::min/max work when compiling in C++14 mode with a C++11 stdlib.Justin Lebar2018-05-171-18/+29
| | | | | | | | | | Reviewers: rsmith Subscribers: sanjoy, cfe-commits, tra Differential Revision: https://reviews.llvm.org/D46993 llvm-svn: 332619
* [X86] Revert part of r332266: Use __builtin_convertvector to replace some of ↵Craig Topper2018-05-152-24/+20
| | | | | | | | the avx512 truncate builtins. The masking doesn't work right in the backend for the ones that produce byte or word elements without avx512bw. llvm-svn: 332322
* [X86] Use __builtin_convertvector to replace some of the avx512 truncate ↵Craig Topper2018-05-144-66/+56
| | | | | | | | | | builtins. As long as the destination type is a 256 or 128 bit vector with the same number of elements we can use __builtin_convertvector to directly generate trunc IR instruction which will be handled natively by the backend. Differential Revision: https://reviews.llvm.org/D46742 llvm-svn: 332266
* [X86] Use select instruction and fpextend in the implementation of ↵Craig Topper2018-05-141-9/+6
| | | | | | _mm512_mask_cvtps_pd and _mm512_maskz_cvtps_pd. llvm-svn: 332213
* [X86] Use __builtin_convertvector to implement _mm512_cvtps_pd.Craig Topper2018-05-141-5/+1
| | | | | | | | If we're using the default rounding mode we can let __builtin_convertvector generate an fpextend. This matches 128 and 256 bit. If we're using the version that takes an explicit rounding mode argument we would need to look at the immediate to see if it's CUR_DIRECTION. llvm-svn: 332210
* [X86] Emit better code for _mm_cvtu32_sd, _mm_cvtu64_sd, _mm_cvtu32_ss, and ↵Craig Topper2018-05-131-7/+8
| | | | | | | | | | _mm_cvtu64_ss. We can use direct C code for these that will use uitofp and insertelement instructions. For the versions that take an explicit rounding mode we can't do this. llvm-svn: 332203
* [X86] Fix the file header name on fmaintrin.hCraig Topper2018-05-111-1/+1
| | | | llvm-svn: 332108
* [X86] Assume alignment of movdir64b dst argumentGabor Buella2018-05-111-2/+7
| | | | | | | | | | Reviewers: craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46683 llvm-svn: 332091
* [X86] ptwrite intrinsicGabor Buella2018-05-105-0/+60
| | | | | | | | | | Reviewers: craig.topper, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46540 llvm-svn: 331962
* [X86] Change the implementation of scalar masked load/store intrinsics to ↵Craig Topper2018-05-101-26/+10
| | | | | | | | | | not use a 512-bit intermediate vector. This is unnecessary for AVX512VL supporting CPUs like SKX. We can just emit a 128-bit masked load/store here no matter what. The backend will widen it to 512-bits on KNL CPUs. Fixes the frontend portion of PR37386. Need to fix the backend to optimize the new sequences well. llvm-svn: 331958
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-0921-803/+803
| | | | | | | | | | | | | | | | | | | This is similar to the LLVM change https://reviews.llvm.org/D46290. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46320 llvm-svn: 331834
* [x86] Introduce the encl[u|s|v] intrinsicsGabor Buella2018-05-084-0/+76
| | | | | | | | | | Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46435 llvm-svn: 331743
* [x86] Introduce the pconfig intrinsicGabor Buella2018-05-085-0/+57
| | | | | | | | | | Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46431 llvm-svn: 331740
* [X86] Make _mm256_gf2p8mul_epi8 require avx features since it's 256 bits.Craig Topper2018-05-071-6/+10
| | | | | | | | Without this we throw an error on the header file instead of the user code when the right features aren't enabled in clang. Rename the other DEFAULT_FN_ATTRS defines to _Z for 512-bit since I used _Y for this case. llvm-svn: 331682
* [X86] Fix some inconsistent formatting in the first line of our intrinsics ↵Craig Topper2018-05-0413-15/+13
| | | | | | | | headers. Some were too long and some were too short. llvm-svn: 331559
* Revert "Emit an error when mixing <stdatomic.h> and <atomic>"Volodymyr Sapsai2018-05-021-4/+0
| | | | | | | | | | | | | It reverts r331378 as it caused test failures ThreadSanitizer-x86_64 :: Darwin/gcd-groups-destructor.mm ThreadSanitizer-x86_64 :: Darwin/libcxx-shared-ptr-stress.mm ThreadSanitizer-x86_64 :: Darwin/xpc-race.mm Only clang part of the change is reverted, libc++ part remains as is because it emits error less aggressively. llvm-svn: 331392
* Emit an error when mixing <stdatomic.h> and <atomic>Volodymyr Sapsai2018-05-021-0/+4
| | | | | | | | | | | | | | | | | | | | | Atomics in C and C++ are incompatible at the moment and mixing the headers can result in confusing error messages. Emit an error explicitly telling about the incompatibility. Introduce the macro `__ALLOW_STDC_ATOMICS_IN_CXX__` that allows to choose in C++ between C atomics and C++ atomics. rdar://problem/27435938 Reviewers: rsmith, EricWF, mclow.lists Reviewed By: mclow.lists Subscribers: jkorous-apple, christof, bumblebritches57, JonChesterfield, smeenai, cfe-commits Differential Revision: https://reviews.llvm.org/D45470 llvm-svn: 331378
* [X86] directstore and movdir64b intrinsicsGabor Buella2018-05-015-0/+67
| | | | | | | | | | Reviewers: spatel, craig.topper, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45984 llvm-svn: 331249
* [X86] Add support for _mm512_mullox_epi64 and _mm512_mask_mullox_epi64 ↵Craig Topper2018-04-261-0/+12
| | | | | | | | | | intrinsics to match icc. On AVX512F targets we'll produce an emulated sequence using 3 pmuludqs with shifts and adds. On AVX512DQ we'll use vpmullq. Fixes PR37140. llvm-svn: 330923
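A minimal scalar sketch of how one 64-bit lane's low product can be built from three 32x32->64 multiplies with shifts and adds, as the message above describes. It is not the generated instruction sequence; `mul32x32` and `mullo64_emulated` are hypothetical names.

```c
#include <stdint.h>

/* Models one pmuludq lane: multiply the low 32 bits of each operand,
 * producing a full 64-bit product. */
static uint64_t mul32x32(uint64_t a, uint64_t b) {
    return (a & 0xFFFFFFFFu) * (b & 0xFFFFFFFFu);
}

/* Low 64 bits of a * b from three 32x32 multiplies.
 * With a = ah*2^32 + al and b = bh*2^32 + bl:
 *   a*b mod 2^64 = al*bl + ((ah*bl + al*bh) << 32)
 * The ah*bh term is shifted out entirely, so it is never computed. */
static uint64_t mullo64_emulated(uint64_t a, uint64_t b) {
    uint64_t lo_lo = mul32x32(a, b);       /* al * bl */
    uint64_t hi_lo = mul32x32(a >> 32, b); /* ah * bl */
    uint64_t lo_hi = mul32x32(a, b >> 32); /* al * bh */
    return lo_lo + ((hi_lo + lo_hi) << 32);
}
```

All arithmetic is unsigned and wraps modulo 2^64, so the result equals the native `a * b` on `uint64_t`.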
* [CUDA] Enable CUDA compilation with CUDA-9.2Artem Belevich2018-04-241-1/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D45827 llvm-svn: 330753
* [X86] Add recently added intrinsic headers to the module map.Craig Topper2018-04-241-0/+3
| | | | llvm-svn: 330744
* [X86] Consistently use double underscore at the beginning of the include ↵Craig Topper2018-04-249-27/+27
| | | | | | | | guards in our intrinsic headers. Most files used double underscore, but a few used single. This converges them all to double. llvm-svn: 330743
* [X86] Remove '#ifdef __x86_64__' around mask_set1_epi64 intrinsics.Craig Topper2018-04-242-7/+0
| | | | | | The unmasked versions already didn't have this restriction. I don't think gcc or icc limit these to 64-bit mode so we shouldn't either. llvm-svn: 330681
* [X86] WaitPKG intrinsicsGabor Buella2018-04-204-0/+62
| | | | | | | | | | Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45254 llvm-svn: 330463
* [CUDA] added missing __ldg(const signed char *)Artem Belevich2018-04-181-0/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D45780 llvm-svn: 330280
* [X86] Introduce cldemote intrinsicGabor Buella2018-04-134-0/+48
| | | | | | | | | | Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45257 llvm-svn: 329993
* [X86] Introduce wbinvd intrinsicGabor Buella2018-04-121-0/+5
| | | | | | | | | | | | A previously missing intrinsic for an old instruction. Reviewers: craig.topper, echristo Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45311 llvm-svn: 329937
* [x86] wbnoinvd intrinsicGabor Buella2018-04-114-1/+45
| | | | | | | | | | | | | | The WBNOINVD instruction writes back all modified cache lines in the processor’s internal cache to main memory but does not invalidate (flush) the internal caches. Reviewers: craig.topper, zvi, ashlykov Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D43817 llvm-svn: 329848
* [X86] Fix typo in intrinsic header file __mask16->__mmask16 from r329775.Craig Topper2018-04-111-2/+2
| | | | llvm-svn: 329777
* [X86] Replace 512-bit masked pmaddubsw and pmaddwd intrinsic with unmasked ↵Craig Topper2018-04-111-32/+21
| | | | | | | | | | intrinsic and a select. This makes it consistent with the 128/256-bit functions. Someday maybe we'll have all the masking moved to selects. llvm-svn: 329775
* Fix typos in clangAlexander Kornienko2018-04-064-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399
* [DOXYGEN] Fix doxygen and content issues in mmintrin.hDouglas Yung2018-03-091-9/+11
| | | | | | | | | | - Fix instruction mappings/listings for various intrinsics This patch was made by Craig Flores Differential Revision: https://reviews.llvm.org/D41517 llvm-svn: 327090
* [X86] Fix typo in cpuid.h, bit_AVX51SER->bit_AVX512ER.Craig Topper2018-03-061-1/+1
| | | | llvm-svn: 326807
* [x86][CET] Introduce _get_ssp, _inc_ssp intrinsicsAlexander Ivchenko2018-03-051-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The _get_ssp intrinsic can be used to retrieve the shadow stack pointer, independent of the current arch -- in contrast with the rdsspd and the rdsspq intrinsics. Also, this intrinsic returns zero on CPUs which don't support CET. The rdssp[d|q] instruction is decoded as nop, essentially just returning the input operand, which is zero. Example result of compilation: ``` xorl %eax, %eax movl %eax, %ecx rdsspq %rcx # NOP when CET is not supported movq %rcx, %rax # return zero ``` Reviewers: craig.topper Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D43814 llvm-svn: 326689
* [X86] Remove some masked cvt builtins that can be replaced with legacy ↵Craig Topper2018-02-241-77/+66
| | | | | | sse/avx builtins and a select. llvm-svn: 326039
* [X86] Remove __builtin_ia32_permvarsf256_mask and ↵Craig Topper2018-02-241-38/+19
| | | | | | __builtin_ia32_permvarsi256_mask and use the avx2 unmasked versions and a select instead. llvm-svn: 326022