| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D38191
llvm-svn: 314135
|
|
|
|
|
|
|
|
| |
Also fixed a syntax error in activemask().
Differential Revision: https://reviews.llvm.org/D38188
llvm-svn: 314129
|
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D38231
Change-Id: I80bbff9cbe93e4be54d8a761ef9723edf3f57c57
llvm-svn: 314102
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D38147
llvm-svn: 313899
|
|
|
|
|
|
|
|
| |
instructions/intrinsics/builtins.
Differential Revision: https://reviews.llvm.org/D38148
llvm-svn: 313898
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D38090
llvm-svn: 313820
|
|
|
|
|
|
|
|
| |
This patch, together with a matching llvm patch (https://reviews.llvm.org/D37669), implements the lowering of X86 mask set1 intrinsics to IR.
Differential Revision: https://reviews.llvm.org/D37668
llvm-svn: 313624
|
|
|
|
|
|
|
|
|
|
|
|
| |
a backend isel failure.
The __builtin_ia32_pbroadcastq512_mem_mask we were previously trying to use in 32-bit mode is not implemented in the x86 backend and causes isel to fail in release builds. In debug builds it fails even earlier during legalization with an llvm_unreachable.
While there add the missing test case for this intrinsic for this for 64-bit mode.
This fixes PR34631. D37668 should be able to recover this for 32-bit mode soon. But I wanted to fix the crash ahead of that.
llvm-svn: 313392
|
|
|
|
|
|
|
|
|
|
| |
In CUDA-9 some of device-side math functions that we need are conditionally
defined within '#if _GLIBCXX_MATH_H'. We need to temporarily undo the guard
around inclusion of math_functions.hpp.
Differential Revision: https://reviews.llvm.org/D37906
llvm-svn: 313369
|
|
|
|
|
|
| |
This was a typo in SVN r282447, where it was added.
llvm-svn: 313232
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D34695
llvm-svn: 313152
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D37562
llvm-svn: 313011
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.
On LLVM side NVPTX added sm_70 GPU type which bumps required
PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment.
Differential Revision: https://reviews.llvm.org/D37576
llvm-svn: 312734
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Tests have to live in the test-suite, and so will come in a separate
patch.
Fixes PR34360.
Reviewers: tra
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D37539
llvm-svn: 312681
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR33977)
Based off the Intel Intrinsics guide, we should expect a void const* argument.
Prevents 'passing 'const void *' to parameter of type 'void *' discards qualifiers' warnings.
Differential Revision: https://reviews.llvm.org/D37449
llvm-svn: 312523
|
|
|
|
|
|
|
|
|
|
| |
__builtin_shufflevector instead builtins
This patch implements the broadcastf32x2/broadcasti32x2 intrinsics using __builtin_shufflevector.
Differential Revision: https://reviews.llvm.org/D37287
llvm-svn: 312135
|
|
|
|
|
|
|
|
| |
GCC will interpret `__attribute__((__aligned__))` as 8-byte alignment on
ARM, but clang will not. Explicitly specify the alignment. This
mirrors the declaration in libunwind.
llvm-svn: 311576
|
|
|
|
|
|
|
|
|
|
| |
The C++ ABI requires that the exception object (which under AEABI is the
`_Unwind_Control_Block`) is double-word aligned. The attribute was
applied to the `_Unwind_Exception` type, but not the
`_Unwind_Control_Block`. This should fix the libunwind test for the
alignment of the exception type.
llvm-svn: 311563
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OpenCL spec v2.0 s6.13.6:
gentype select (gentype a,
gentype b,
igentype c)
gentype select (gentype a,
gentype b,
ugentype c)
igentype and ugentype must have the same number
of elements and bits as gentype.
Differential Revision: https://reviews.llvm.org/D36259
llvm-svn: 310160
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OpenCL 2.0 atomic builtin functions have a scope argument which is ideally
represented as synchronization scope argument in LLVM atomic instructions.
Clang supports translating Clang atomic builtin functions to LLVM atomic
instructions. However it currently does not support synchronization scope
of LLVM atomic instructions. Without this, users have to use LLVM assembly
code to implement OpenCL atomic builtin functions.
This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin
functions, which supports generating LLVM atomic instructions with
synchronization scope operand.
Currently only constant memory scope argument is supported. Support of
non-constant memory scope argument will be added later.
Differential Revision: https://reviews.llvm.org/D28691
llvm-svn: 310082
|
|
|
|
|
|
|
|
|
| |
This fixes PR31504 and it's a follow up from adding #include_next<float.h>
for Darwin in r289018.
rdar://problem/29856682
llvm-svn: 309752
|
|
|
|
|
|
|
|
|
|
|
|
| |
alignment (PR33830)
Clang specifies a max type alignment of 16 bytes on darwin targets (annoyingly in the driver not via cc1), meaning that the builtin nontemporal stores don't correctly align the loads/stores to 32 or 64 bytes when required, resulting in lowering to temporal unaligned loads/stores.
This patch casts the vectors to explicitly aligned types prior to the load/store to ensure that the require alignment is respected.
Differential Revision: https://reviews.llvm.org/D35996
llvm-svn: 309488
|
|
|
|
| |
llvm-svn: 309383
|
|
|
|
|
|
|
|
| |
The EHABI definition was being inlined into the users even when EHABI
was not in use. Adjust the condition to ensure that the right version
is defined.
llvm-svn: 309327
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure that we define the `_Unwind_Control_Block` structure used on ARM
EHABI targets. This is needed for building libc++abi with the unwind.h
from the resource dir. A minor fallout of this is that we needed to
create a typedef for _Unwind_Exception to work across ARM EHABI and
non-EHABI targets. The structure definitions here are based originally
on the documentation from ARM under the "Exception Handling ABI for the
ARM® Architecture" Section 7.2. They are then adjusted to more closely
reflect the definition in libunwind from LLVM. Those changes are
compatible in layout but permit easier use in libc++abi and help
maintain compatibility between libunwind and the compiler provided
definition.
llvm-svn: 309226
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This fixes compiling with headers from the Windows SDK for ARM64.
Reviewers: compnerd, ruiu, mstorsjo
Reviewed By: compnerd, mstorsjo
Subscribers: mgorny, aemerson, javed.absar, kristof.beyls, llvm-commits, cfe-commits
Differential Revision: https://reviews.llvm.org/D35862
llvm-svn: 309081
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch updates the vecintrin.h header file to provide the new
set of high-level vector built-in functions. This matches the
updated definition implemented by other compilers for the platform,
indicated by the pre-defined macro __VEC__ == 10302.
Note that some of the new functions (notably those involving the
vector float data type) are only available with -march=z14
(indicated by __ARCH__ == 12).
llvm-svn: 308199
|
|
|
|
|
|
|
|
|
|
| |
Corrected several typos and incorrect parameters description that Sony
's techinical writer found during review.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 307838
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The _bit_scan_forward and _bit_scan_reverse intrinsics were accidentally
masked under the preprocessor checks that prune intrinsics definitions for the
benefit of faster compile-time on Windows. This patch moves the
definitons out of that region.
Fixes pr33722
Reviewers: craig.topper, aaboud, thakis
Reviewed By: craig.topper
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D35184
llvm-svn: 307524
|
|
|
|
| |
llvm-svn: 307507
|
|
|
|
|
|
|
|
| |
maximum level support before accessing the leaf. Rename level to leaf everywhere.
This matches gcc behavior.
llvm-svn: 307506
|
|
|
|
|
|
|
|
|
| |
Sony's techinical writer found during review.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 304840
|
|
|
|
|
|
|
|
|
|
| |
The second argument must be a constant, otherwise instruction selection
will fail. always_inline is not enough for isel to always fold
everything away at -O0.
Sadly the overloading turned this into a big macro mess. Fixes PR33212.
llvm-svn: 304205
|
|
|
|
|
|
|
|
|
|
| |
AVX512_VPOPCNTDQ is a new feature set that was published by Intel.
The patch represents the Clang side of the addition of six intrinsics for two new machine instructions (vpopcntd and vpopcntq).
It also includes the addition of the new feature set.
Differential Revision: https://reviews.llvm.org/D33170
llvm-svn: 303857
|
|
|
|
|
|
|
|
|
|
| |
The vec_xxsldwi builtin is missing from altivec.h. This has been requested by
developers working on libvpx for VP9 support for Google.
The patch fixes PR: https://bugs.llvm.org/show_bug.cgi?id=32653
Differential Revision: https://reviews.llvm.org/D33236
llvm-svn: 303766
|
|
|
|
|
|
|
|
|
|
| |
The vec_xxpermdi builtin is missing from altivec.h. This has been requested by
developers working on libvpx for VP9 support for Google.
The patch fixes PR: https://bugs.llvm.org/show_bug.cgi?id=32653
Differential Revision: https://reviews.llvm.org/D33053
llvm-svn: 303760
|
|
|
|
|
|
|
|
|
| |
(2) Removed uncessary anymore \c commands, since the same effect will be achived by <c> ... </c> sequence.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 303228
|
|
|
|
|
|
|
|
|
| |
Separated very long brief sections into two sections.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 303031
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: Anastasia, cfe-commits
Reviewed By: Anastasia
Subscribers: bader, yaxunl
Differential Revision: https://reviews.llvm.org/D32897
llvm-svn: 302630
|
|
|
|
|
|
| |
Now provided in lwpintrin.h
llvm-svn: 302559
|
|
|
|
|
|
| |
LWP / lwpintrin.h is now supported
llvm-svn: 302557
|
|
|
|
|
|
|
|
| |
This patch adds support for the the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4).
Differential Revision: https://reviews.llvm.org/D32770
llvm-svn: 302418
|
|
|
|
|
|
|
|
|
|
| |
Implemented the remaining integer data processing intrinsics from
the ARM ACLE v2.1 spec, such as parallel arithemtic and DSP style
multiplications.
Differential Revision: https://reviews.llvm.org/D32282
llvm-svn: 302131
|
|
|
|
| |
llvm-svn: 301749
|
|
|
|
|
|
|
|
|
|
|
|
| |
- I removed doxygen comments for the intrinsics that "alias" the other existing documented intrinsics and that only sligtly differ in spelling (single underscores vs. double underscores).
#define _tzcnt_u16(a) (__tzcnt_u16((a)))
It will be very hard to keep the documentation for these "aliases" in sync with the documentation for the intrinsics they alias to. Out of sync documentation will be more confusing than no documentation.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 301652
|
|
|
|
|
|
| |
Matches _mm_set_ps1 implementation
llvm-svn: 301637
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
size_t is usually defined as unsigned long, but on 64-bit platforms,
stdint.h currently defines SIZE_MAX using "ull" (unsigned long long).
Although this is the same width, it doesn't necessarily have the same
alignment or calling convention. It also triggers printf warnings when
using the format flag "%zu" to print SIZE_MAX.
This changes SIZE_MAX to reuse the compiler-provided __SIZE_MAX__, and
provides similar fixes for the other integers:
- INTPTR_MIN
- INTPTR_MAX
- UINTPTR_MAX
- PTRDIFF_MIN
- PTRDIFF_MAX
- INTMAX_MIN
- INTMAX_MAX
- UINTMAX_MAX
- INTMAX_C()
- UINTMAX_C()
... and fixes the typedefs for intptr_t and uintptr_t to use
__INTPTR_TYPE__ and __UINTPTR_TYPE__ instead of int32_t, effectively
reverting r89224, r89226, and r89237 (r89221 already having been
effectively reverted).
We can probably also kill __INTPTR_WIDTH__, __INTMAX_WIDTH__, and
__UINTMAX_WIDTH__ in a follow-up, but I was hesitant to delete all the
per-target CHECK lines in this commit since those might serve their own
purpose.
rdar://problem/11811377
llvm-svn: 301593
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch makes the header `stdatomic.h` work when `-fms-compatibility` is specified.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D32322
llvm-svn: 300919
|
|
|
|
|
|
|
|
|
|
|
| |
- To be consistent with the rest of the intrinsics headers, I removed the tags <i> .. </i> for marking instruction names in italics in in smmintrin.h.
- Formatting changes to fit into 80 characters.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 300578
|
|
|
|
|
|
|
|
|
|
| |
MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics.
LLVM companion patch: D31767.
Differential Revision: https://reviews.llvm.org/D31766
llvm-svn: 300326
|