| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
Keep looking for decl-specifiers after an unknown identifier. Don't
issue diagnostics about an error type specifier conflicting with later
type specifiers.
llvm-svn: 360117
|
|
|
|
|
|
|
|
|
|
|
| |
This matches the behavior of the old pass manager. There are some
targets that don't have target machine at all (e.g. le32, spir) which
whose tests would never run with new pass manager. Similarly, we would
need to disable tests for targets that are disabled.
Differential Revision: https://reviews.llvm.org/D58374
llvm-svn: 360100
|
|
|
|
|
|
|
|
|
|
|
|
| |
In MinGW, setjmp isn't expanded as a builtin in the compiler (like it
is for MSVC), but manually hooked up as calls to the right underlying
functions in headers. Using the actual CRT's real setjmp/longjmp
functions requires this intrinsic. (Currently this is worked around by
using MinGW specific reimplementations of setjmp/longjmp on aarch64.)
Differential Revision: https://reviews.llvm.org/D61592
llvm-svn: 360082
|
|
|
|
| |
llvm-svn: 360022
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lake
Summary:
1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake;
2. Enable intrinsics for VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision.
For more details about BF16 intrinsic, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference
Patch by LiuTianle
Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, spatel, RKSimon
Reviewed By: craig.topper
Subscribers: mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60552
llvm-svn: 360018
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Related llvm patch: D60348.
Patch co-authored by Sanjin Sijaric.
Reviewers: rnk, efriedma, TomTan, ssijaric, ostannard
Reviewed By: efriedma
Subscribers: dmajor, richard.townsend.arm, ostannard, javed.absar, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60349
llvm-svn: 359932
|
|
|
|
| |
llvm-svn: 359823
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to alignment section in below ARM64 ABI document, MSVC could increase
alignment of global data based on its total size. Clang doesn't do this. Compile
the same symbol into different alignments by Clang and MSVC could cause link
error because some instruction encodings, like 64-bit LDR/STR with immediate,
require the target to be 8 bytes aligned, and linker could choose code stream
with such LDR/STR instruction from MSVC and 4 bytes aligned data from Clang into
final image, which actually cannot be linked together
(see https://bugs.llvm.org/show_bug.cgi?id=41506 for more details).
https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#alignment
Differential Revision: https://reviews.llvm.org/D61225
llvm-svn: 359744
|
|
|
|
|
|
| |
short options in tests. NFC
llvm-svn: 359662
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
C guarantees that brace-init with fewer initializers than members in the
aggregate will initialize the rest of the aggregate as-if it were static
initialization. In turn static initialization guarantees that padding is
initialized to zero bits.
Quoth the Standard:
C17 6.7.9 Initialization ❡21
If there are fewer initializers in a brace-enclosed list than there are elements
or members of an aggregate, or fewer characters in a string literal used to
initialize an array of known size than there are elements in the array, the
remainder of the aggregate shall be initialized implicitly the same as objects
that have static storage duration.
C17 6.7.9 Initialization ❡10
If an object that has automatic storage duration is not initialized explicitly,
its value is indeterminate. If an object that has static or thread storage
duration is not initialized explicitly, then:
* if it has pointer type, it is initialized to a null pointer;
* if it has arithmetic type, it is initialized to (positive or unsigned) zero;
* if it is an aggregate, every member is initialized (recursively) according to
these rules, and any padding is initialized to zero bits;
* if it is a union, the first named member is initialized (recursively)
according to these rules, and any padding is initialized to zero bits;
<rdar://problem/50188861>
Reviewers: glider, pcc, kcc, rjmccall, erik.pilkington
Subscribers: jkorous, dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61280
llvm-svn: 359628
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds support for __builtin_dcbf for PPC.
__builtin_dcbf copies the contents of a modified block from the data cache
to main memory and flushes the copy from the data cache.
Differential revision: https://reviews.llvm.org/D59843
llvm-svn: 359517
|
|
|
|
|
|
|
|
| |
Add the rest of test cases covering functions defined in mmintrin.h on PowerPC.
Reviewed By: Jinsong Ji
llvm-svn: 359393
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
us emitting the operand of __builtin_constant_p if it has side-effects.
Original commit message:
Fix interactions between __builtin_constant_p and constexpr to match
current trunk GCC.
GCC permits information from outside the operand of
__builtin_constant_p (but in the same constant evaluation context) to be
used within that operand; clang now does so too. A few other minor
deviations from GCC's behavior showed up in my testing and are also
fixed (matching GCC):
* Clang now supports nullptr_t as the argument type for
__builtin_constant_p
* Clang now returns true from __builtin_constant_p if called with a
null pointer
* Clang now returns true from __builtin_constant_p if called with an
integer cast to pointer type
llvm-svn: 359367
|
|
|
|
|
|
|
|
|
|
|
|
| |
This provides intrinsics support for Memory Tagging Extension (MTE),
which was introduced with the Armv8.5-a architecture.
These intrinsics are available when __ARM_FEATURE_MEMORY_TAGGING is defined.
Each intrinsic is described in detail in the ACLE Q1 2019 documentation:
https://developer.arm.com/docs/101028/latest
Reviewed By: Tim Nortover, David Spickett
Differential Revision: https://reviews.llvm.org/D60485
llvm-svn: 359348
|
|
|
|
|
|
|
|
|
| |
This reverts r359250 (git commit 4730604bd3a361c68b92b18bf73a5daa15afe9f4)
The newly added test should use -cc1 and -emit-llvm and there are other
test failures that need fixing.
llvm-svn: 359251
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Statically link certain runtime library functions for MSVC/GNU Windows
environments. This is consistent with MSVC behavior.
Fixes LNK4286 and LNK4217 warnings from link.exe when linking the static
CRT:
LINK : warning LNK4286: symbol '__std_terminate' defined in 'libvcruntime.lib(ehhelpers.obj)' is imported by 'ASAN_NOINST_TEST_OBJECTS.asan_noinst_test.cc.x86_64-calls.o'
LINK : warning LNK4286: symbol '__std_terminate' defined in 'libvcruntime.lib(ehhelpers.obj)' is imported by 'ASAN_NOINST_TEST_OBJECTS.asan_test_main.cc.x86_64-calls.o'
LINK : warning LNK4217: symbol '_CxxThrowException' defined in 'libvcruntime.lib(throw.obj)' is imported by 'ASAN_NOINST_TEST_OBJECTS.gtest-all.cc.x86_64-calls.o' in function '"int `public: static class UnitTest::GetInstance * __cdecl testing::UnitTest::GetInstance(void)'::`1'::dtor$5" (?dtor$5@?0??GetInstance@UnitTest@testing@@SAPEAV12@XZ@4HA)'
Reviewers: mstorsjo, efriedma, TomTan, compnerd, smeenai, mgrang
Subscribers: abdulras, theraven, smeenai, pcc, mehdi_amini, javed.absar, inglorion, kristof.beyls, dexonsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D55229
llvm-svn: 359250
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These builtins provide access to the new integer and
sub-integer variants of MMA (matrix multiply-accumulate) instructions
provided by CUDA-10.x on sm_75 (AKA Turing) GPUs.
Also added a feature for PTX 6.4. While Clang/LLVM does not generate
any PTX instructions that need it, we still need to pass it through to
ptxas in order to be able to compile code that uses the new 'mma'
instruction as inline assembly (e.g used by NVIDIA's CUTLASS library
https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101)
Differential Revision: https://reviews.llvm.org/D60279
llvm-svn: 359248
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pass manager
Currently InstrProf lowering is not enabled for Clang PGO instrumentation in
the new pass manager. The following command
"-fprofile-instr-generate -fexperimental-new-pass-manager ..." is broken.
This CL enables InstrProf lowering pass for Clang PGO instrumentation in the
new pass manager.
Differential Revision: https://reviews.llvm.org/D61138
llvm-svn: 359215
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The opt level was not being passed down to the ThinLTO backend when
invoked via clang (for distributed ThinLTO).
This exposed an issue where the new PM was asserting if the Thin or
regular LTO backend pipelines were invoked with -O0 (not a new issue,
could be provoked by invoking in-process *LTO backends via linker using
new PM and -O0). Fix this similar to the old PM where -O0 only does the
necessary lowering of type metadata (WPD and LowerTypeTest passes) and
then quits, rather than asserting.
Reviewers: xur
Subscribers: mehdi_amini, inglorion, eraman, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits, pcc
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D61022
llvm-svn: 359025
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without this patch, APSInt inherits APInt::isNegative, which merely
checks the sign bit without regard to whether the type is actually
signed. isNonNegative and isStrictlyPositive call isNegative and so
are also affected.
This patch adjusts APSInt to override isNegative, isNonNegative, and
isStrictlyPositive with implementations that consider whether the type
is signed.
A large set of Clang OpenMP tests are affected. Without this patch,
these tests assume that `true` is not a valid argument for clauses
like `collapse`. Indeed, `true` fails APInt::isStrictlyPositive but
not APSInt::isStrictlyPositive. This patch adjusts those tests to
assume `true` should be accepted.
This patch also adds tests revealing various other similar fixes due
to APSInt::isNegative calls in Clang's ExprConstant.cpp and
SemaExpr.cpp: `++` and `--` overflow in `constexpr`, evaluated object
size based on `alloc_size`, `<<` and `>>` shift count validation, and
OpenMP array section validation.
Reviewed By: lebedev.ri, ABataev, hfinkel
Differential Revision: https://reviews.llvm.org/D59712
llvm-svn: 359012
|
|
|
|
|
|
| |
For the clang driver, -DLLVM_ENABLE_ASSERTIONS=off builds default to discard value names.
llvm-svn: 358953
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Port mmintrin.h which include x86 MMX intrinsics implementation to PowerPC platform (using Altivec).
To make the include process correct, PowerPC's toolchain class is overrided to insert new headers directory (named ppc_wrappers) into the path. Basic test cases for several intrinsic functions are added.
The header is mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.
Reviewed By: Jinsong Ji
Differential Revision: https://reviews.llvm.org/D59924
llvm-svn: 358949
|
|
|
|
|
|
|
|
| |
by deadcode elimination. Improve CHECK lines to check IR types used. NFC
I plan to use this as the basis for backend IR test cases. We currently crash hard for using 32 or 64 bit mask registers without avx512bw.
llvm-svn: 358435
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pattern we replaced these with may be too hard to match as demonstrated by
PR41496 and PR41316.
This patch restores the intrinsics and then we can start focusing
on the optimizing the intrinsics.
I've mostly reverted the original patch that removed them. Though I modified
the avx512 intrinsics to not have masking built in.
Differential Revision: https://reviews.llvm.org/D60674
llvm-svn: 358427
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[MS] Add metadata for __declspec(allocator)
Original summary:
Emit !heapallocsite in the metadata for calls to functions marked with
__declspec(allocator). Eventually this will be emitted as S_HEAPALLOCSITE debug
info in codeview.
Differential Revision: https://reviews.llvm.org/D60237
llvm-svn: 358307
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Implements the intrinsics define on the ACLE to extract half precision fp scalar elements from float16x4_t and float16x8_t vector types.
a.k.a:
vduph_lane_f16
vduph_laneq_f16
Reviewers: pablooliveira, olista01, LukeGeeson, DavidSpickett
Reviewed By: DavidSpickett
Subscribers: DavidSpickett, javed.absar, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60272
llvm-svn: 358276
|
|
|
|
| |
llvm-svn: 358125
|
|
|
|
|
|
| |
Patch by Brad Moody.
llvm-svn: 358104
|
|
|
|
|
|
|
| |
There were some errors in the committed test checks, left in due to a git
stash apply mishap.
llvm-svn: 357993
|
|
|
|
|
|
|
| |
One of the tests in riscv64-lp64-lp64f-lp64d would have had a different
lowering for lp64f/lp64d as a float argument was missed.
llvm-svn: 357991
|
|
|
|
|
|
|
|
|
| |
float patches
Split tests in to files representing the subset of RISC-V ABIs they should
have identical output for.
llvm-svn: 357989
|
|
|
|
|
|
|
| |
This reverts commit e7bd735bb03a7b8141e32f7d6cb98e8914d8799e.
Reverting because of buildbot failure.
llvm-svn: 357952
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Emit !heapallocsite in the metadata for calls to functions marked with
__declspec(allocator). Eventually this will be emitted as S_HEAPALLOCSITE debug
info in codeview.
Reviewers: rnk
Subscribers: jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60237
llvm-svn: 357928
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In PR41304:
https://bugs.llvm.org/show_bug.cgi?id=41304
...we have a case where we want to fold a binop of select-shuffle (blended) values.
Rather than try to match commuted variants of the pattern, we can canonicalize the
shuffles and check for mask equality with commuted operands.
We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a
special case that the backend is required to handle because we already canonicalize
vector select to this shuffle form.
So there should be no codegen difference from this change. It's possible that this
improves CSE in IR though.
Differential Revision: https://reviews.llvm.org/D60016
llvm-svn: 357366
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
PowerPC64/PowerPC64le supports the builtin function __builtin_setrnd to set the floating point rounding mode. This function will use the least significant two bits of integer argument to set the floating point rounding mode.
double __builtin_setrnd(int mode);
The effective values for mode are:
0 - round to nearest
1 - round to zero
2 - round to +infinity
3 - round to -infinity
Note that the mode argument will modulo 4, so if the int argument is greater than 3, it will only use the least significant two bits of the mode. Namely, builtin_setrnd(102)) is equal to builtin_setrnd(2).
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D59403
llvm-svn: 357242
|
|
|
|
|
|
|
|
|
|
|
| |
Future versions of MSVC make these intrinsics available on x86 & x64,
according to:
http://lists.llvm.org/pipermail/cfe-dev/2019-March/061711.html
The purpose of these builtins is to emit plain, non-atomic, volatile
stores when /volatile:ms (-cc1 -fms-volatile) is enabled.
llvm-svn: 357220
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These are all implemented by icc as well.
I made bit_scan_forward/reverse forward to the __bsfd/__bsrq since we also have
__bsfq/__bsrq.
Note, when lzcnt is enabled the bsr intrinsics generates lzcnt+xor instead of bsr.
Reviewers: RKSimon, spatel
Subscribers: cfe-commits, llvm-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59682
llvm-svn: 356848
|
|
|
|
|
|
| |
Add Exynos M5 test cases.
llvm-svn: 356794
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the result of discussions on the list about how to deal with intrinsics
which require codegen to disambiguate them via only the integer/fp overloads.
It causes problems for GlobalISel as some of that information is lost during
translation, while with other operations like IR instructions the information is
encoded into the instruction opcode.
This patch changes clang to emit the new faddp intrinsic if the vector operands
to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to
upgrade existing calls to aarch64.neon.addp with fp vector arguments, and
we remove the workarounds introduced for GlobalISel in r355865.
This is a more permanent solution to PR40968.
Differential Revision: https://reviews.llvm.org/D59655
llvm-svn: 356722
|
|
|
|
|
|
|
|
|
|
| |
Use the new cx8 feature flag that was added to the backend to represent support for cmpxchg8b. Use this flag to set the MaxAtomicInlineWidth.
This also assumes all the cmpxchg instructions are enabled for CK_Generic which is what cc1 defaults to when nothing is specified.
Differential Revision: https://reviews.llvm.org/D59566
llvm-svn: 356709
|
|
|
|
|
|
|
|
|
|
| |
Remove popcnt feature flag from _popcnt32/_popcnt64 and move to ia32intrin.h to match gcc
gcc and icc both implement popcntd and popcntq which we did not. gcc doesn't seem to require a feature flag for the _popcnt32/_popcnt64 spelling and will use a libcall if its not supported.
Differential Revision: https://reviews.llvm.org/D59567
llvm-svn: 356689
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After https://reviews.llvm.org/rL355317 we noticed that quite a decent
amount of code redeclares builtins (memcpy in particular, I believe
reduced from an MSVC header) with a calling convention specified.
This gets particularly troublesome when the user specifies a new
'default' calling convention on the command line.
When looking to add a diagnostic for this case, it was noticed that we
had 3 other diagnostics that differed only slightly. This patch ALSO
unifies those under a 'select'. Unfortunately, the order of words in
ONE of these diagnostics was reversed ("'thiscall' calling convention"
vs "calling convention 'thiscall'"), so this patch also standardizes on
the former.
Differential Revision: https://reviews.llvm.org/D59560
Change-Id: I79f99fe7c2301640755ffdd774b46eb44526bb22
llvm-svn: 356663
|
|
|
|
|
|
|
|
|
| |
gcc has these intrinsics in ia32intrin.h as well. And icc implements them
though they aren't documented in the Intel Intrinsics Guide.
Differential Revision: https://reviews.llvm.org/D59533
llvm-svn: 356609
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This test is failing after r356499 (verified with `ninja check-clang-codegen`). Update the register selection used in the test from x0 to x8.
Reviewers: arsenm, MatzeB, efriedma
Reviewed By: efriedma
Subscribers: efriedma, wdng, javed.absar, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59557
llvm-svn: 356517
|
|
|
|
|
|
|
|
|
|
|
| |
The attribute pass_dynamic_object_size(n) behaves exactly like
pass_object_size(n), but instead of evaluating __builtin_object_size on calls,
it evaluates __builtin_dynamic_object_size, which has the potential to produce
runtime code when the object size can't be determined statically.
Differential revision: https://reviews.llvm.org/D58757
llvm-svn: 356515
|
|
|
|
|
|
|
|
| |
external linkage when marked as dllexport and targeting the MSVC ABI.
Patch thanks to Zahira Ammarguellat.
llvm-svn: 356458
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
`wasm.throw` builtin's first 'tag' argument should be an immediate index
into the event section.
Reviewers: dschuff, craig.topper
Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59448
llvm-svn: 356436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is another attempt at what Erich Keane tried to do in r355322.
This adds rolb, rolw, rold, rolq and their ror equivalent as always_inline wrappers around __builtin_rotate* which will lower to funnel shift intrinsics in IR.
Additionally, when _MSC_VER is not defined we will define _rotl, _lrotl, _rotr, _lrotr as macros to one of the always_inline intrinsics mentioned above. Making sure that _lrotl/_lrotr use either 32 or 64 bit based on the size of long. These need to be macros because we have builtins with the same name for MS compatibility, but _MSC_VER isn't always defined when those builtins are enabled.
We also define _rotwl and _rotwr as macros aliasing to rolw/rorw just like gcc to complete the set. These don't need to be gated with _MSC_VER because these aren't MS builtins.
I've added tests both for non-MS and -ms-extensions with and without _MSC_VER being defined.
Differential Revision: https://reviews.llvm.org/D59346
llvm-svn: 356423
|
|
|
|
| |
llvm-svn: 356385
|
|
|
|
| |
llvm-svn: 356354
|