path: root/clang/lib/Headers
Commit message | Author | Age | Files | Lines
* [X86] Cast to __v4hi instead of __m64 in the implementation of ↵Craig Topper2020-02-121-2/+2
| | | | | | | | | | | _mm_extract_pi16 and _mm_insert_pi16. __m64 is a vector of 1 long long. But the builtins these intrinsics are calling expect a vector of 4 shorts. Fixes PR44589 (cherry picked from commit 16b9410caa35da976fa5f3cf6dd3d6f3776d51ca)
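The lane semantics behind this fix can be sketched in portable C. This is only an illustration of why the builtins want four 16-bit lanes rather than one 64-bit scalar; it is not the code in the header, and the helper names extract_lane16 and insert_lane16 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: return 16-bit lane n (0..3) of a 64-bit value,
 * zero-extended. Lane 0 is the least significant 16 bits. */
static unsigned extract_lane16(uint64_t v, unsigned n) {
    return (unsigned)((v >> (16u * (n & 3u))) & 0xFFFFu);
}

/* Hypothetical helper: return v with 16-bit lane n (0..3) replaced by x. */
static uint64_t insert_lane16(uint64_t v, uint16_t x, unsigned n) {
    unsigned shift = 16u * (n & 3u);
    uint64_t mask = (uint64_t)0xFFFFu << shift;
    return (v & ~mask) | ((uint64_t)x << shift);
}
```

Treating the value as a single long long hides the lane structure; a four-lane view makes the lane index meaningful.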
* [CUDA] Assume the latest known CUDA version if we've found an unknown one.Artem Belevich2020-01-291-1/+1
| | | | | | | | | | | | | | | | | This makes clang somewhat forward-compatible with new CUDA releases without having to patch it for every minor release without adding any new function. If an unknown version is found, clang issues a warning (can be disabled with -Wno-cuda-unknown-version) and assumes that it has detected the latest known version. CUDA releases are usually supersets of older ones feature-wise, so it should be sufficient to keep released clang versions working with minor CUDA updates without having to upgrade clang, too. Differential Revision: https://reviews.llvm.org/D73231 (cherry picked from commit 12fefeef203ab4ef52d19bcdbd4180608a4deae1)
* [CUDA] Fix order of memcpy arguments in __shfl_*(<64-bit type>).Artem Belevich2020-01-241-2/+2
| | | | | | Wrong argument order resulted in broken shfl ops for 64-bit types. (cherry picked from commit cc14de88da27a8178976972bdc8211c31f7ca9ae)
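A minimal sketch of the pattern involved, assuming a 64-bit value is split into two 32-bit halves, each half is processed separately, and the halves are recombined. The names apply_per_half, identity32 and invert32 are hypothetical stand-ins, not the CUDA wrapper code. The point is that memcpy takes the destination first, so swapping the arguments overwrites the input with uninitialized data.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a per-lane 32-bit operation. */
typedef uint32_t (*op32_fn)(uint32_t);

/* Apply a 32-bit operation to each half of a 64-bit value. */
static uint64_t apply_per_half(uint64_t val, op32_fn op) {
    uint32_t halves[2];
    uint64_t out;

    memcpy(halves, &val, sizeof val);  /* destination first: halves <- val */
    halves[0] = op(halves[0]);
    halves[1] = op(halves[1]);
    memcpy(&out, halves, sizeof out);  /* destination first: out <- halves */
    return out;
}

static uint32_t identity32(uint32_t x) { return x; }
static uint32_t invert32(uint32_t x) { return ~x; }
```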
* Work around PR43337: don't try to use the vec_sel overloads for vector long ↵Richard Smith2020-01-171-2/+2
| | | | | | long, since clang's <altivec.h> doesn't provide it yet! (cherry picked from commit 388eaa1270c2762d61b756759b6db8cf15bd3a83)
* [X86] Mark various pointer arguments in builtins as constWarren Ristow2019-12-1910-118/+118
| | | | | | | | | | | Enabling `-Wcast-qual` identified many casts in various system headers that were dropping the `const` qualifier. Fixing those missing qualifiers pointed out that a few of the definitions of the builtins did not properly identify their arguments as `const` pointers. This commit fixes those builtin definitions, and the system header files so that they no longer drop the qualifier. Differential Revision: https://reviews.llvm.org/D71718
* [ARM][CMSE] Add CMSE header and builtinsMomchil Velikov2019-12-122-0/+218
| | | | | | | | | | | | This is patch C2 as mentioned in RFC http://lists.llvm.org/pipermail/cfe-dev/2019-March/061834.html This adds CMSE builtin functions, and introduces the arm_cmse.h header, which has useful macros, functions, and data types for end-users of CMSE. Patch by Javed Absar. Differential Revision: https://reviews.llvm.org/D70817
* [X86] Remove forward declaration of _invpcid from intrin.h. Rely on inline ↵Craig Topper2019-11-251-1/+0
| | | | | | | | | | | version from immintrin.h The forward declaration had a cdecl calling convention, but the inline version did not. This leads to a conflict if the default calling convention is not cdecl. Fix this by just removing the forward declaration. Fixes PR41503
* [X86] Fix the implementation of __readcr3/__writecr3 to work in 64-bit modeCraig Topper2019-11-141-9/+16
| | | | | | | | | | | | We need to use a 64-bit type in 64-bit mode so a 64-bit register will get used in the generated assembly. I've also changed the constraints to just use "r" instead of "q". "q" forces the operand into one of the a/b/c/d registers in 32-bit mode, but I see no reason that would matter here. Fixes Nico's note in PR19301 over 4 years ago. Differential Revision: https://reviews.llvm.org/D70101
* [PowerPC][Altivec] Fix offsets for vec_xl and vec_xstNemanja Ivanovic2019-11-071-20/+40
| | | | | | | | As we currently have it implemented in altivec.h, the offsets for these two intrinsics are element offsets. The documentation in the ABI (as well as the implementation in both XL and GCC) states that these should be byte offsets. Differential revision: https://reviews.llvm.org/D63636
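The difference between the two offset conventions can be sketched with scalar loads in portable C. This is an illustration only: the helper names are hypothetical and the real intrinsics operate on vectors.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: offset is counted in bytes from base. */
static uint32_t load_at_byte_offset(const uint32_t *base, ptrdiff_t byte_off) {
    uint32_t v;
    memcpy(&v, (const unsigned char *)base + byte_off, sizeof v);
    return v;
}

/* Hypothetical helper: offset is counted in elements from base. */
static uint32_t load_at_element_offset(const uint32_t *base, ptrdiff_t elem_off) {
    return base[elem_off];
}
```

With 4-byte elements, a byte offset of 8 addresses the same data as an element offset of 2, so code written for one convention reads the wrong location under the other.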
* [PowerPC][Altivec] Emit correct builtin for single precision vec_all_neNemanja Ivanovic2019-11-071-1/+1
| | | | | | | We currently emit a double precision comparison instruction for this, whereas we need to emit the single precision version. Differential revision: https://reviews.llvm.org/D64024
* [Headers] Fix compatibility between arm_acle.h and intrin.hEli Friedman2019-10-291-0/+2
| | | | | | Make sure they don't both define __nop. Differential Revision: https://reviews.llvm.org/D69012
* [ARM][AArch64] Implement __cls, __clsl and __clsll intrinsics from ACLEvhscampos2019-10-281-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Writing support for three ACLE functions: unsigned int __cls(uint32_t x) unsigned int __clsl(unsigned long x) unsigned int __clsll(uint64_t x) CLS stands for "Count number of leading sign bits". In AArch64, these two intrinsics can be translated into the 'cls' instruction directly. In AArch32, on the other hand, this functionality is achieved by implementing it in terms of clz (count number of leading zeros). Reviewers: compnerd Reviewed By: compnerd Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69250
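A minimal sketch of expressing "count leading sign bits" in terms of "count leading zeros", as the summary describes for AArch32. The helper names clz32 and cls32 are hypothetical, and the portable loop stands in for a hardware clz; this is not the code from arm_acle.h.

```c
#include <stdint.h>

/* Portable count-leading-zeros; returns 32 for an input of 0. */
static unsigned clz32(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Count the bits after the sign bit that are equal to the sign bit. */
static unsigned cls32(uint32_t x) {
    /* XOR with a sign fill turns every leading copy of the sign bit,
     * including the sign bit itself, into a zero. */
    uint32_t sign_fill = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    /* Subtract 1 because the sign bit itself is not counted. */
    return clz32(x ^ sign_fill) - 1;
}
```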
* [ARM][AArch64] Implement __arm_rsrf, __arm_rsrf64, __arm_wsrf & __arm_wsrf64vhscampos2019-10-281-0/+4
| | | | | | | | | | | | | | | | | Summary: Adding support for ACLE intrinsics. Patch by Michael Platings. Reviewers: chill, t.p.northover, efriedma Reviewed By: chill Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69297
* Fix a spelling mistake in a couple of intrinsic description comments. NFCGreg Bedwell2019-10-271-2/+2
|
* [clang,ARM] Initial ACLE intrinsics for MVE.Simon Tatham2019-10-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit sets up the infrastructure for auto-generating <arm_mve.h> and doing clang-side code generation for the builtins it relies on, and demonstrates that it works by implementing a representative sample of the ACLE intrinsics, more or less matching the ones introduced in LLVM IR by D67158,D68699,D68700. Like NEON, that header file will provide a set of vector types like uint16x8_t and C functions with names like vaddq_u32(). Unlike NEON, the ACLE spec for <arm_mve.h> includes a polymorphism system, so that you can write plain vaddq() and disambiguate by the vector types you pass to it. Unlike the corresponding NEON code, I've arranged to make every user- facing ACLE intrinsic into a clang builtin, and implement all the code generation inside clang. So <arm_mve.h> itself contains nothing but typedefs and function declarations, with the latter all using the new `__attribute__((__clang_builtin))` system to arrange that the user- facing function names correspond to the right internal BuiltinIDs. So the new MveEmitter tablegen system specifies the full sequence of IRBuilder operations that each user-facing ACLE intrinsic should translate into. Where possible, the ACLE intrinsics map to standard IR operations such as vector-typed `add` and `fadd`; where no standard representation exists, I call down to the sample IR intrinsics introduced in an earlier commit. Doing it like this means that you get the polymorphism for free just by using __attribute__((overloadable)): the clang overload resolution decides which function declaration is the relevant one, and _then_ its BuiltinID is looked up, so by the time we're doing code generation, that's all been resolved by the standard system. 
It also means that you get really nice error messages if the user passes the wrong combination of types: clang will show the declarations from the header file and explain why each one doesn't match. (The obvious alternative approach would be to have wrapper functions in <arm_mve.h> which pass their arguments to the underlying builtins. But that doesn't work in the case where one of the arguments has to be a constant integer: the wrapper function can't pass the constantness through. So you'd have to do that case using a macro instead, and then use C11 `_Generic` to handle the polymorphism. Then you have to add horrible workarounds because `_Generic` requires even the untaken branches to type-check successfully, and //then// if the user gets the types wrong, the error message is totally unreadable!) Reviewers: dmgreen, miyuki, ostannard Subscribers: mgorny, javed.absar, kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D67161
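For comparison, the macro-based alternative mentioned in the parenthesis can be sketched with C11 `_Generic`. This is a toy with scalar types and hypothetical names (add_s32, add_f32, add_any), not anything from <arm_mve.h>; it only shows the shape of the dispatch that the commit chose to avoid.

```c
/* Hypothetical type-specific functions standing in for intrinsics. */
static int add_s32(int a, int b) { return a + b; }
static float add_f32(float a, float b) { return a + b; }

/* Polymorphic front end: _Generic selects a function by the static
 * type of the first argument, then the selected function is called. */
#define add_any(a, b) \
    _Generic((a), int: add_s32, float: add_f32)((a), (b))
```

Passing an argument whose type matches no association (a double, say) is rejected at the macro expansion site rather than against a readable list of declarations, which is the diagnostic problem the message describes.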
* [X86] Always define the tzcnt intrinsics even when _MSC_VER is defined.Craig Topper2019-10-112-85/+93
| | | | | | | | | | | | These intrinsics use llvm.cttz intrinsics so are always available even without the bmi feature. We already don't check for the bmi feature on the intrinsics themselves. But we were blocking the include of the header file with _MSC_VER unless BMI was enabled on the command line. Fixes PR30506. llvm-svn: 374516
* [x86] Adding support for some missing intrinsics: _castf32_u32, ↵Pengfei Wang2019-09-251-0/+68
| | | | | | | | | | | | | | | | | | | | _castf64_u64, _castu32_f32, _castu64_f64 Summary: Adding support for some missing intrinsics: _castf32_u32, _castf64_u64, _castu32_f32, _castu64_f64 Reviewers: craig.topper, LuoYuanke, RKSimon, pengfei Reviewed By: RKSimon Subscribers: llvm-commits Patch by yubing (Bing Yu) Differential Revision: https://reviews.llvm.org/D67212 llvm-svn: 372802
* Fix reliance on -flax-vector-conversions in AVX intrinsics headers andRichard Smith2019-09-171-2/+2
| | | | | | corresponding tests. llvm-svn: 372063
* Remove reliance on lax vector conversions from altivec.h in VSX mode.Richard Smith2019-09-171-19/+22
| | | | llvm-svn: 372061
* Remove reliance on lax vector conversions from altivec.h and its test.Richard Smith2019-09-131-17/+23
| | | | llvm-svn: 371814
* [PowerPC][Altivec] Fix constant argument for vec_dssJinsong Ji2019-09-041-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is similar to vec_ct* in https://reviews.llvm.org/rL304205. The argument must be a constant, otherwise instruction selection will fail. always_inline is not enough for isel to always fold everything away at -O0. The fix is to turn the function into macros in altivec.h. Fixes https://bugs.llvm.org/show_bug.cgi?id=43072 Reviewers: nemanjai, hfinkel, #powerpc, wuzish Reviewed By: #powerpc, wuzish Subscribers: wuzish, kbarton, MaskRay, shchenz, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D66699 llvm-svn: 370902
* [CUDA] Use activemask.b32 instruction to implement __activemask w/ CUDA-9.2+Artem Belevich2019-09-031-1/+9
| | | | | | | | | | | vote.ballot instruction is gone in recent CUDA versions and vote.sync.ballot can not be used because it needs a thread mask parameter. Fortunately PTX 6.2 (introduced with CUDA-9.2) provides activemask.b32 instruction for this. Differential Revision: https://reviews.llvm.org/D66665 llvm-svn: 370792
* [x86] Fix bugs of some intrinsic functions in CLANG : _mm512_stream_ps, ↵Pengfei Wang2019-09-031-3/+3
| | | | | | | | | | | | | | | | _mm512_stream_pd, _mm512_stream_si512 Reviewers: craig.topper, pengfei, LuoYuanke, RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Patch by Bing Yu (yubing) Differential Revision: https://reviews.llvm.org/D66786 llvm-svn: 370691
* [x86] Adding support for some missing intrinsics: _mm512_cvtsi512_si32Pengfei Wang2019-08-291-0/+17
| | | | | | | | | | | | | | | | | | Summary: Adding support for some missing intrinsics: _mm512_cvtsi512_si32 Reviewers: craig.topper, pengfei, LuoYuanke, spatel, RKSimon Reviewed By: craig.topper Subscribers: llvm-commits Patch by Bing Yu (yubing) Differential Revision: https://reviews.llvm.org/D66785 llvm-svn: 370297
* [OpenCL] Fix declaration of enqueue_markerYaxun Liu2019-08-221-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D66512 llvm-svn: 369641
* [OpenCL] Fix lang mode predefined macros for C++ mode.Anastasia Stulova2019-08-122-117/+110
| | | | | | | | | | | | | In C++ mode we should only avoid adding __OPENCL_C_VERSION__, all other predefined macros about the language mode are still valid. This change also fixes the language version check in the headers accordingly. Differential Revision: https://reviews.llvm.org/D65941 llvm-svn: 368552
* [clang] Fixed x86 cpuid NSC signatureRaphael Isemann2019-08-101-2/+2
| | | | | | | | | | | | | | | | | | Summary: The signature "Geode by NSC" for NSC vendor is wrong. In lib/Headers/cpuid.h, signature_NSC_edx and signature_NSC_ecx constants are inverted (cpuid signature order is ebx # edx # ecx). Reviewers: teemperor, rsmith, craig.topper Reviewed By: teemperor, craig.topper Subscribers: craig.topper, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D65978 llvm-svn: 368510
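The register order can be checked with a small sketch that assembles the 12-character vendor string from the three registers. The function names are hypothetical, and the hex values in the usage note are derived here from the ASCII bytes of "Geode by NSC" rather than quoted from cpuid.h.

```c
#include <stdint.h>

/* Write the four bytes of reg to dst, least significant byte first. */
static void put_le32(char *dst, uint32_t reg) {
    for (int i = 0; i < 4; i++)
        dst[i] = (char)((reg >> (8 * i)) & 0xFFu);
}

/* Assemble the vendor string in the order ebx, edx, ecx. */
static void cpuid_vendor(uint32_t ebx, uint32_t edx, uint32_t ecx,
                         char out[13]) {
    put_le32(out, ebx);
    put_le32(out + 4, edx);
    put_le32(out + 8, ecx);
    out[12] = '\0';
}
```

Calling cpuid_vendor(0x646f6547, 0x79622065, 0x43534e20, buf) yields "Geode by NSC"; swapping the last two values yields "Geod NSCe by", which is the kind of mismatch the inverted constants caused.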
* [PowerPC] [Clang] Port SSE3, SSSE3 and SSE4 intrinsics to PowerPCQiu Chaofan2019-08-094-0/+733
| | | | | | | | | | | | | | | Port existing headers which include x86 intrinsics implementation to PowerPC platform (using Altivec), along with tests. Also, tests about including these intrinsic headers are combined. The headers are mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D65630 llvm-svn: 368392
* [AArch64] Add support for Transactional Memory Extension (TME)Momchil Velikov2019-07-311-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | Re-commit r366322 after some fixes TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Differential Revision: https://reviews.llvm.org/D64416 Patch by Javed Absar and Momchil Velikov llvm-svn: 367428
* [PowerPC] [Clang] Add platform guards to PPC vector intrinsics headersQiu Chaofan2019-07-304-0/+25
| | | | | | | | | | | | Move the platform check out of PPC Linux toolchain code and add platform guards to the intrinsic headers, since they are supported currently only on 64-bit PowerPC targets. Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D64849 llvm-svn: 367281
* [X86] Remove const from some intrinsics that shouldn't have themPaul Robinson2019-07-221-3/+3
| | | | llvm-svn: 366699
* [OpenCL] Define CLK_NULL_EVENT without castSven van Haastregt2019-07-191-1/+1
| | | | | | | | | | | | | | Defining CLK_NULL_EVENT with a `(void*)` cast has the (unintended?) side-effect that the address space will be fixed (as generic in OpenCL 2.0 mode). The consequence is that any target specific address space for the clk_event_t type will not be applied. It is not clear why the void pointer cast was needed in the first place, and it seems we can do without it. Differential Revision: https://reviews.llvm.org/D63876 llvm-svn: 366546
* [PowerPC][Clang] Remove use of malloc in mm_mallocQiu Chaofan2019-07-181-4/+0
| | | | | | | | | Remove the dependency on malloc in the implementation of the mm_malloc function in the PowerPC intrinsics, along with the alignment assumption about glibc. Reviewed By: Hal Finkel Differential Revision: https://reviews.llvm.org/D64850 llvm-svn: 366406
* Revert [AArch64] Add support for Transactional Memory Extension (TME)Momchil Velikov2019-07-171-23/+1
| | | | | | This reverts r366322 (git commit 4b8da3a503e434ddbc08ecf66582475765f449bc) llvm-svn: 366355
* [AArch64] Add support for Transactional Memory Extension (TME)Momchil Velikov2019-07-171-1/+23
| | | | | | | | | | | | | | | | | | | | | | | TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Patch by Javed Absar and Momchil Velikov Differential Revision: https://reviews.llvm.org/D64416 llvm-svn: 366322
* [AArch64] Implement __jcvt intrinsic from Armv8.3-AKyrylo Tkachov2019-07-161-0/+8
| | | | | | | | | | | | | | | | The jcvt intrinsic defined in ACLE [1] is available when __ARM_FEATURE_JCVT is defined. This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function. The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used. I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example). make check-all didn't show any new failures. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics Differential Revision: https://reviews.llvm.org/D64495 llvm-svn: 366197
* [SystemZ] Add support for new cpu architecture - arch13Ulrich Weigand2019-07-121-0/+406
| | | | | | | | | | | | | | | | | This patch series adds support for the next-generation arch13 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10303. Note: No currently available Z system supports the arch13 architecture. Once new systems become available, the official system name will be added as supported -march name. llvm-svn: 365933
* [X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform ↵Craig Topper2019-07-101-2/+10
| | | | | | | | the store as a <2 x float> instead of i64. This is similar to what we do for loadl_pi and loadh_pi. llvm-svn: 365669
* [OpenCL] Restore ATOMIC_VAR_INITSven van Haastregt2019-06-241-1/+6
| | | | | | | | | We accidentally lost the ATOMIC_VAR_INIT and ATOMIC_FLAG_INIT macros in r363794. Also put the `memory_order` typedef back inside a `>= CL2.0` guard. llvm-svn: 364174
* [OpenCL] Remove more duplicates from opencl-c.hSven van Haastregt2019-06-241-29/+0
| | | | | | | | Identified the duplicate declarations using sort lib/Headers/opencl-c.h | uniq -c | grep ' 2' llvm-svn: 364173
* [OpenCL][PR41963] Add generic addr space to old atomics in C++ modeAnastasia Stulova2019-06-211-0/+45
| | | | | | | | | Add overloads with generic address space pointer to old atomics. This is currently only added for C++ compilation mode. Differential Revision: https://reviews.llvm.org/D62335 llvm-svn: 364071
* [OpenCL] Remove duplicate read_image declarationsSven van Haastregt2019-06-211-47/+0
| | | | | | Patch by Pierre Gondois. llvm-svn: 364020
* [X86] Make _mm_mask_cvtps_ph, _mm_maskz_cvtps_ph, _mm256_mask_cvtps_ph, and ↵Craig Topper2019-06-202-44/+8
| | | | | | | | | | | | | | _mm256_maskz_cvtps_ph aliases for their corresponding cvt_roundps_ph intrinsic. These intrinsics should always take an immediate for the rounding mode. The base instruction comes from before EVEX embedded rounding. The user should always provide the immediate rather than us assuming CUR_DIRECTION. Make the 512-bit versions also explicit aliases instead of copy-pasting the code. llvm-svn: 363961
* AIX system headers need stdint.h and inttypes.h to be re-enterableXing Xue2019-06-202-0/+10
| | | | | | | | | | | | | | | | | Summary: AIX system headers need stdint.h and inttypes.h to be re-enterable when macro _STD_TYPES_T is defined so that limit macro definitions such as UINT32_MAX can be found. This patch attempts to allow that on AIX. Reviewers: hubert.reinterpretcast, jasonliu, mclow.lists, EricWF Reviewed by: hubert.reinterpretcast, mclow.lists Subscribers: jfb, jsji, christof, cfe-commits, libcxx-commits, llvm-commits Tags: #LLVM, #clang, #libc++ Differential Revision: https://reviews.llvm.org/D59253 llvm-svn: 363939
* [X86] Correct the __min_vector_width__ attribute on a few intrinsics.Craig Topper2019-06-192-5/+5
| | | | llvm-svn: 363890
* [OpenCL] Split type and macro definitions into opencl-c-base.hSven van Haastregt2019-06-194-535/+577
| | | | | | | | | | | | | | | | Using the -fdeclare-opencl-builtins option will require a way to predefine types and macros such as `int4`, `CLK_GLOBAL_MEM_FENCE`, etc. Move these out of opencl-c.h into opencl-c-base.h such that the latter can be shared by -fdeclare-opencl-builtins and -finclude-default-header. This changes the behaviour of -finclude-default-header when -fdeclare-opencl-builtins is specified: instead of including the full header, it will include the header with only the base definitions. Differential revision: https://reviews.llvm.org/D63256 llvm-svn: 363794
* [PowerPC] [Clang] Port SSE2 intrinsics to PowerPCZi Xuan Wu2019-06-122-0/+2319
| | | | | | | | | | | | | | Port emmintrin.h, which includes the Intel SSE2 intrinsics implementation, to the PowerPC platform (using Altivec). The new headers containing those implementations are located in a directory named ppc_wrappers, which has higher priority when the platform is PowerPC on Linux. They are mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. It's a follow-up patch of D62121. Patched by: Qiu Chaofan <qiucf@cn.ibm.com> Differential Revision: https://reviews.llvm.org/D62569 llvm-svn: 363122
* [X86] Enable intrinsics that convert float and bf16 data to each otherPengfei Wang2019-06-112-0/+130
| | | | | | | | | | | | | | | | Scalar version : _mm_cvtsbh_ss , _mm_cvtness_sbh Vector version: _mm512_cvtpbh_ps , _mm256_cvtpbh_ps _mm512_maskz_cvtpbh_ps , _mm256_maskz_cvtpbh_ps _mm512_mask_cvtpbh_ps , _mm256_mask_cvtpbh_ps Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62363 llvm-svn: 363018
* [X86] Add ENQCMD instructionsPengfei Wang2019-06-064-0/+69
| | | | | | | | | | | | For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Patch by Tianqing Wang (tianqing) Differential Revision: https://reviews.llvm.org/D62282 llvm-svn: 362685
* [OpenCL] Undefine cl_intel_planar_yuv extensionAndrew Savonichev2019-06-031-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Remove unnecessary definition (otherwise the extension will be defined where it's not supposed to be defined). The following code alone is enough for the extension to become known to clang: #pragma OPENCL EXTENSION cl_intel_planar_yuv : begin // some declarations #pragma OPENCL EXTENSION cl_intel_planar_yuv : end Patch by: Dmitry Sidorov <dmitry.sidorov@intel.com> Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Tags: #clang Differential Revision: https://reviews.llvm.org/D58666 llvm-svn: 362398