| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
| |
In `CodeGenFunction::EmitAArch64BuiltinExpr()`, bulk move all of the aarch64 MSVC-builtin cases to an earlier point in the function (the `// Handle non-overloaded intrinsics first` switch block) in order to avoid an unreachable in `GetNeonType()`. The NEON type-overloading logic is not appropriate for the Windows builtins.
Fixes https://llvm.org/pr42775
Differential Revision: https://reviews.llvm.org/D65403
llvm-svn: 367323
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is useful for targets which have prefetch instructions for non-default address spaces.
<rdar://problem/42662136>
Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65254
llvm-svn: 367032
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Modified the intrinsics
int_addressofreturnaddress,
int_frameaddress & int_sponentry.
This commit depends on the changes in rL366679
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D64563
llvm-svn: 366683
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add immutable WASM global `__tls_align` which stores the alignment
requirements of the TLS segment.
Add `__builtin_wasm_tls_align()` intrinsic to get this alignment in Clang.
The expected usage has now changed to:
__wasm_init_tls(memalign(__builtin_wasm_tls_align(),
__builtin_wasm_tls_size()));
Reviewers: tlively, aheejin, sbc100, sunfish, alexcrichton
Reviewed By: tlively
Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65028
llvm-svn: 366624
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add `__builtin_wasm_tls_base` so that LeakSanitizer can find the thread-local
block and scan through it for memory leaks.
Reviewers: tlively, aheejin, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D64900
llvm-svn: 366475
|
|
|
|
| |
llvm-svn: 366286
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Thread local variables are placed inside a `.tdata` segment. Their symbols are
offsets from the start of the segment. The address of a thread local variable
is computed as `__tls_base` + the offset from the start of the segment.
`.tdata` segment is a passive segment and `memory.init` is used once per thread
to initialize the thread local storage.
`__tls_base` is a wasm global. Since each thread has its own wasm instance,
it is effectively thread local. Currently, `__tls_base` must be initialized
at thread startup, and so cannot be used with dynamic libraries.
`__tls_base` is to be initialized with a new linker-synthesized function,
`__wasm_init_tls`, which takes as an argument a block of memory to use as the
storage for thread locals. It then initializes the block of memory and sets
`__tls_base`. As `__wasm_init_tls` will handle the memory initialization,
the memory does not have to be zeroed.
To help allocating memory for thread-local storage, a new compiler intrinsic
is introduced: `__builtin_wasm_tls_size()`. This instrinsic function returns
the size of the thread-local storage for the current function.
The expected usage is to run something like the following upon thread startup:
__wasm_init_tls(malloc(__builtin_wasm_tls_size()));
Reviewers: tlively, aheejin, kripken, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D64537
llvm-svn: 366272
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The jcvt intrinsic defined in ACLE [1] is available when ARM_FEATURE_JCVT is defined.
This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function.
The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used.
I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example).
make check-all didn't show any new failures.
[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
Differential Revision: https://reviews.llvm.org/D64495
llvm-svn: 366197
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch applies clang-tidy's bugprone-argument-comment tool
to LLVM, clang and lld source trees. Here is how I created this
patch:
$ git clone https://github.com/llvm/llvm-project.git
$ cd llvm-project
$ mkdir build
$ cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \
-DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \
-DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \
-DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm
$ ninja
$ parallel clang-tidy -checks='-*,bugprone-argument-comment' \
-config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \
::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h}
llvm-svn: 366177
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch series adds support for the next-generation arch13
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10303.
Note: No currently available Z system supports the arch13
architecture. Once new systems become available, the
official system name will be added as supported -march name.
llvm-svn: 365933
|
|
|
|
| |
llvm-svn: 365798
|
|
|
|
|
|
|
|
| |
the store as a <2 x float> instead of i64.
This is similar to what we do for loadl_pi and loadh_pi.
llvm-svn: 365669
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Remove unused variable. This fixes bug:
https://bugs.llvm.org/show_bug.cgi?id=42526
Signed-off-by: pengfei <pengfei.wang@intel.com>
Reviewers: RKSimon, xiangzhangllvm, craig.topper
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64389
llvm-svn: 365473
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For background of BPF CO-RE project, please refer to
http://vger.kernel.org/bpfconf2019.html
In summary, BPF CO-RE intends to compile bpf programs
adjustable on struct/union layout change so the same
program can run on multiple kernels with adjustment
before loading based on native kernel structures.
In order to do this, we need keep track of GEP(getelementptr)
instruction base and result debuginfo types, so we
can adjust on the host based on kernel BTF info.
Capturing such information as an IR optimization is hard
as various optimization may have tweaked GEP and also
union is replaced by structure it is impossible to track
fieldindex for union member accesses.
Three intrinsic functions, preserve_{array,union,struct}_access_index,
are introducted.
addr = preserve_array_access_index(base, index, dimension)
addr = preserve_union_access_index(base, di_index)
addr = preserve_struct_access_index(base, gep_index, di_index)
here,
base: the base pointer for the array/union/struct access.
index: the last access index for array, the same for IR/DebugInfo layout.
dimension: the array dimension.
gep_index: the access index based on IR layout.
di_index: the access index based on user/debuginfo types.
If using these intrinsics blindly, i.e., transforming all GEPs
to these intrinsics and later on reducing them to GEPs, we have
seen up to 7% more instructions generated. To avoid such an overhead,
a clang builtin is proposed:
base = __builtin_preserve_access_index(base)
such that user wraps to-be-relocated GEPs in this builtin
and preserve_*_access_index intrinsics only apply to
those GEPs. Such a buyin will prevent performance degradation
if people do not use CO-RE, even for programs which use
bpf_probe_read().
For example, for the following example,
$ cat test.c
struct sk_buff {
int i;
int b1:1;
int b2:2;
union {
struct {
int o1;
int o2;
} o;
struct {
char flags;
char dev_id;
} dev;
int netid;
} u[10];
};
static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr)
= (void *) 4;
#define _(x) (__builtin_preserve_access_index(x))
int bpf_prog(struct sk_buff *ctx) {
char dev_id;
bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id));
return dev_id;
}
$ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \
test.c >& log
The generated IR looks like below:
...
define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {
%2 = alloca %struct.sk_buff*, align 8
%3 = alloca i8, align 1
store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49
call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51
%4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45
%5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
%6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs(
%struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19
%7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(
[10 x %union.anon]* %6, i32 1, i32 5), !dbg !53
%8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(
%union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26
%9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53
%10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(
%struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34
%11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52
%12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55
%13 = sext i8 %12 to i32, !dbg !54
call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56
ret i32 %13, !dbg !57
}
!19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20)
!26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27)
!34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35)
Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index
attached to instructions to provide struct/union debuginfo type information.
For &ctx->u[5].dev.dev_id,
. The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout.
. The "%7 = ..." represents array subscript "5".
. The "%8 = ..." represents union member "dev" with index 1 for DI layout.
. The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout.
Basically, traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and
examining all preserve_*_access_index calls, the debuginfo struct/union/array access index
can be achieved.
The intrinsics also contain enough information to regenerate codes for IR layout.
For array and structure intrinsics, the proper GEP can be constructed.
For union intrinsics, replacing all uses of "addr" with "base" should be enough.
Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D61809
llvm-svn: 365438
|
|
|
|
|
|
|
|
|
| |
This reverts commit r365435.
Forgot adding the Differential Revision link. Will add to the
commit message and resubmit.
llvm-svn: 365436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For background of BPF CO-RE project, please refer to
http://vger.kernel.org/bpfconf2019.html
In summary, BPF CO-RE intends to compile bpf programs
adjustable on struct/union layout change so the same
program can run on multiple kernels with adjustment
before loading based on native kernel structures.
In order to do this, we need keep track of GEP(getelementptr)
instruction base and result debuginfo types, so we
can adjust on the host based on kernel BTF info.
Capturing such information as an IR optimization is hard
as various optimization may have tweaked GEP and also
union is replaced by structure it is impossible to track
fieldindex for union member accesses.
Three intrinsic functions, preserve_{array,union,struct}_access_index,
are introducted.
addr = preserve_array_access_index(base, index, dimension)
addr = preserve_union_access_index(base, di_index)
addr = preserve_struct_access_index(base, gep_index, di_index)
here,
base: the base pointer for the array/union/struct access.
index: the last access index for array, the same for IR/DebugInfo layout.
dimension: the array dimension.
gep_index: the access index based on IR layout.
di_index: the access index based on user/debuginfo types.
If using these intrinsics blindly, i.e., transforming all GEPs
to these intrinsics and later on reducing them to GEPs, we have
seen up to 7% more instructions generated. To avoid such an overhead,
a clang builtin is proposed:
base = __builtin_preserve_access_index(base)
such that user wraps to-be-relocated GEPs in this builtin
and preserve_*_access_index intrinsics only apply to
those GEPs. Such a buyin will prevent performance degradation
if people do not use CO-RE, even for programs which use
bpf_probe_read().
For example, for the following example,
$ cat test.c
struct sk_buff {
int i;
int b1:1;
int b2:2;
union {
struct {
int o1;
int o2;
} o;
struct {
char flags;
char dev_id;
} dev;
int netid;
} u[10];
};
static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr)
= (void *) 4;
#define _(x) (__builtin_preserve_access_index(x))
int bpf_prog(struct sk_buff *ctx) {
char dev_id;
bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id));
return dev_id;
}
$ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \
test.c >& log
The generated IR looks like below:
...
define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {
%2 = alloca %struct.sk_buff*, align 8
%3 = alloca i8, align 1
store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49
call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51
%4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45
%5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
%6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs(
%struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19
%7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(
[10 x %union.anon]* %6, i32 1, i32 5), !dbg !53
%8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(
%union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26
%9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53
%10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(
%struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34
%11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52
%12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55
%13 = sext i8 %12 to i32, !dbg !54
call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56
ret i32 %13, !dbg !57
}
!19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20)
!26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27)
!34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35)
Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index
attached to instructions to provide struct/union debuginfo type information.
For &ctx->u[5].dev.dev_id,
. The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout.
. The "%7 = ..." represents array subscript "5".
. The "%8 = ..." represents union member "dev" with index 1 for DI layout.
. The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout.
Basically, traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and
examining all preserve_*_access_index calls, the debuginfo struct/union/array access index
can be achieved.
The intrinsics also contain enough information to regenerate codes for IR layout.
For array and structure intrinsics, the proper GEP can be constructed.
For union intrinsics, replacing all uses of "addr" with "base" should be enough.
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 365435
|
|
|
|
|
|
| |
llvm::partition_point. NFC
llvm-svn: 365006
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D63209
llvm-svn: 363341
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scalar version :
_mm_cvtsbh_ss , _mm_cvtness_sbh
Vector version:
_mm512_cvtpbh_ps , _mm256_cvtpbh_ps
_mm512_maskz_cvtpbh_ps , _mm256_maskz_cvtpbh_ps
_mm512_mask_cvtpbh_ps , _mm256_mask_cvtpbh_ps
Patch by Shengchen Kan (skan)
Differential Revision: https://reviews.llvm.org/D62363
llvm-svn: 363018
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM IR recently added a Type parameter to the byval Attribute, so that
when pointers become opaque and no longer have an element type the
information will still be present in IR.
For now the Type parameter is optional (which is why Clang didn't need
this change at the time), but it will become mandatory soon.
llvm-svn: 362652
|
|
|
|
|
|
|
|
|
|
| |
Support intel AVX512 VP2INTERSECT instructions in clang
Patch by Xiang Zhang (xiangzhangllvm)
Differential Revision: https://reviews.llvm.org/D62367
llvm-svn: 362196
|
|
|
|
|
|
|
|
|
|
|
|
| |
As for other floating-point rounding builtins that can be optimized
when build with -fno-math-errno, this patch adds support for lrint
and llrint. It currently only optimize for AArch64 backend.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D62019
llvm-svn: 361878
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
assignment, move assignment operators on Expr, Stmt and Decl
Reviewers: ilya-biryukov, rsmith
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62187
llvm-svn: 361468
|
|
|
|
| |
llvm-svn: 361238
|
|
|
|
|
|
|
|
|
|
| |
overloaded result type. Make result type for llvm.llround overloaded instead of fixing to i64
We shouldn't really make assumptions about possible sizes for long and long long. And longer term we should probably support vectorizing these intrinsics. By making the result types not fixed we can support vectors as well.
Differential Revision: https://reviews.llvm.org/D62026
llvm-svn: 361169
|
|
|
|
|
|
|
|
| |
Previously we were doing this so that the 256 bit selectw builtin could be used in the implementation of the 512->256 bit conversion intrinsic.
After this commit we now use a masked convert builtin that will emit the intrinsic call and the 256-bit select from custom code in CGBuiltin. Then the header only needs to call that one intrinsic.
llvm-svn: 360924
|
|
|
|
|
|
|
|
|
|
|
|
| |
As for other floating-point rounding builtins that can be optimized
when build with -fno-math-errno, this patch adds support for lround
and llround. It currently only optimize for AArch64 backend.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D61392
llvm-svn: 360896
|
|
|
|
|
|
|
|
|
|
|
|
| |
In MinGW, setjmp isn't expanded as a builtin in the compiler (like it
is for MSVC), but manually hooked up as calls to the right underlying
functions in headers. Using the actual CRT's real setjmp/longjmp
functions requires this intrinsic. (Currently this is worked around by
using MinGW specific reimplementations of setjmp/longjmp on aarch64.)
Differential Revision: https://reviews.llvm.org/D61592
llvm-svn: 360082
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lake
Summary:
1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake;
2. Enable intrinsics for VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision.
For more details about BF16 intrinsic, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference
Patch by LiuTianle
Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, spatel, RKSimon
Reviewed By: craig.topper
Subscribers: mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60552
llvm-svn: 360018
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
us emitting the operand of __builtin_constant_p if it has side-effects.
Original commit message:
Fix interactions between __builtin_constant_p and constexpr to match
current trunk GCC.
GCC permits information from outside the operand of
__builtin_constant_p (but in the same constant evaluation context) to be
used within that operand; clang now does so too. A few other minor
deviations from GCC's behavior showed up in my testing and are also
fixed (matching GCC):
* Clang now supports nullptr_t as the argument type for
__builtin_constant_p
* Clang now returns true from __builtin_constant_p if called with a
null pointer
* Clang now returns true from __builtin_constant_p if called with an
integer cast to pointer type
llvm-svn: 359367
|
|
|
|
|
|
|
|
|
|
|
|
| |
This provides intrinsics support for Memory Tagging Extension (MTE),
which was introduced with the Armv8.5-a architecture.
These intrinsics are available when __ARM_FEATURE_MEMORY_TAGGING is defined.
Each intrinsic is described in detail in the ACLE Q1 2019 documentation:
https://developer.arm.com/docs/101028/latest
Reviewed By: Tim Nortover, David Spickett
Differential Revision: https://reviews.llvm.org/D60485
llvm-svn: 359348
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These builtins provide access to the new integer and
sub-integer variants of MMA (matrix multiply-accumulate) instructions
provided by CUDA-10.x on sm_75 (AKA Turing) GPUs.
Also added a feature for PTX 6.4. While Clang/LLVM does not generate
any PTX instructions that need it, we still need to pass it through to
ptxas in order to be able to compile code that uses the new 'mma'
instruction as inline assembly (e.g used by NVIDIA's CUTLASS library
https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101)
Differential Revision: https://reviews.llvm.org/D60279
llvm-svn: 359248
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Implements the intrinsics define on the ACLE to extract half precision fp scalar elements from float16x4_t and float16x8_t vector types.
a.k.a:
vduph_lane_f16
vduph_laneq_f16
Reviewers: pablooliveira, olista01, LukeGeeson, DavidSpickett
Reviewed By: DavidSpickett
Subscribers: DavidSpickett, javed.absar, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60272
llvm-svn: 358276
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
alloca isn’t auto-init’d right now because it’s a different path in clang that
all the other stuff we support (it’s a builtin, not an expression).
Interestingly, alloca doesn’t have a type (as opposed to even VLA) so we can
really only initialize it with memset.
<rdar://problem/49794007>
Subscribers: jkorous, dexonsmith, cfe-commits, rjmccall, glider, kees, kcc, pcc
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60548
llvm-svn: 358243
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of enqueue_kernel builtin function
Summary:
https://reviews.llvm.org/D53809 fixed wrong address space(assert in debug build)
generated for event_ret argument. But exactly the same problem exists for
event_wait_list argument. This patch should fix both.
Reviewers: Anastasia, yaxunl
Reviewed By: Anastasia
Subscribers: kristina, ebevhan, cfe-commits
Differential Revision: https://reviews.llvm.org/D59985
llvm-svn: 358151
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow the optimizer to remove unnecessary EH cleanups surrounding calls
to os_log_helper, to save some code size.
As a follow-up, it might be worthwhile to add a BasicNoexcept exception
spec to os_log_helper, and to then teach CGCall to emit direct calls for
callees which can't throw. This could save some compile-time.
Differential Revision: https://reviews.llvm.org/D60108
llvm-svn: 357501
|
|
|
|
|
|
|
|
|
|
|
| |
Future versions of MSVC make these intrinsics available on x86 & x64,
according to:
http://lists.llvm.org/pipermail/cfe-dev/2019-March/061711.html
The purpose of these builtins is to emit plain, non-atomic, volatile
stores when /volatile:ms (-cc1 -fms-volatile) is enabled.
llvm-svn: 357220
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the result of discussions on the list about how to deal with intrinsics
which require codegen to disambiguate them via only the integer/fp overloads.
It causes problems for GlobalISel as some of that information is lost during
translation, while with other operations like IR instructions the information is
encoded into the instruction opcode.
This patch changes clang to emit the new faddp intrinsic if the vector operands
to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to
upgrade existing calls to aarch64.neon.addp with fp vector arguments, and
we remove the workarounds introduced for GlobalISel in r355865.
This is a more permanent solution to PR40968.
Differential Revision: https://reviews.llvm.org/D59655
llvm-svn: 356722
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Because in wasm we merge all catch clauses into one big catchpad, in
case none of the types in catch handlers matches after we test against
each of them, we should unwind to the next EH enclosing scope. For this,
we should NOT use a call to `__cxa_rethrow` but rather a call to our own
rethrow intrinsic, because what we're trying to do here is just to
transfer the control flow into the next enclosing EH pad (or the
caller). Calls to `__cxa_rethrow` should only be used after a call to
`__cxa_begin_catch`.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59353
llvm-svn: 356317
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r353765. After talking with our c stdlib folks, we decided
to use the existing pass_object_size attribute to implement _FORTIFY_SOURCE
wrappers, like Bionic does (I didn't realize that pass_object_size could be used
for this purpose). Sorry for the flip/flop, and thanks to James Y. Knight for
pointing this out to me.
llvm-svn: 356103
|
|
|
|
|
|
|
| |
Big sorry. This undoes the indentation mess I made
in r354751.
llvm-svn: 354752
|
|
|
|
|
|
|
| |
Minor style fix to avoid going over 80 cols in handling
of case for Builtin::BI__builtin_assume_aligned. NFC.
llvm-svn: 354751
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
implements llvm intrinsics and clang intrinsics for
memory.init and data.drop.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57736
llvm-svn: 353983
|
|
|
|
|
|
|
|
|
| |
Argument evaluation order is different between gcc and clang, so pull out
the Builder calls to make the generated IR independent of the host compiler's
argument evaluation order. Thanks to rnk for reminding me of this clang/gcc
difference.
llvm-svn: 353969
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This attribute applies to declarations of C stdlib functions
(sprintf, memcpy...) that have known fortified variants
(__sprintf_chk, __memcpy_chk, ...). When applied, clang will emit
calls to the fortified variant functions instead of calls to the
defaults.
In GCC, this is done by adding gnu_inline-style wrapper functions,
but that doesn't work for us for variadic functions because we don't
support __builtin_va_arg_pack (and have no intention to).
This attribute takes two arguments, the first is 'type' argument
passed through to __builtin_object_size, and the second is a flag
argument that gets passed through to the variadic checking variants.
rdar://47905754
Differential revision: https://reviews.llvm.org/D57918
llvm-svn: 353765
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The various EltSize, Offset, DataLayout, and StructLayout arguments
are all computable from the Address's element type and the DataLayout
which the CGBuilder already has access to.
After having previously asserted that the computed values are the same
as those passed in, now remove the redundant arguments from
CGBuilder's Create*GEP functions.
Differential Revision: https://reviews.llvm.org/D57767
llvm-svn: 353629
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we are calling `__builtin_constant_p` with ObjC objects of
different classes, we hit the assertion
> Assertion failed: (isa<X>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file include/llvm/Support/Casting.h, line 254.
It happens because LLVM types for `ObjCInterfaceType` are opaque and
have no name (see `CodeGenTypes::ConvertType`). As the result, for
different ObjC classes we have different `is_constant` intrinsics with
the same name `llvm.is.constant.p0s_s`. When we try to reuse an
intrinsic with the same name, we fail because of type mismatch.
Fix by bitcasting `ObjCObjectPointerType` to `id` prior to passing as an
argument to `__builtin_constant_p`. This results in using intrinsic
`llvm.is.constant.p0i8` and correct types.
rdar://problem/47499250
Reviewers: rjmccall, ahatanak, void
Reviewed By: void, ahatanak
Subscribers: ddunbar, jkorous, hans, dexonsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D57427
llvm-svn: 353577
|
|
|
|
|
|
|
|
|
|
| |
r344765 added those intrinsics, but used the wrong types.
Patch by Mike Hommey
Differential Revision: https://reviews.llvm.org/D57636
llvm-svn: 353493
|
|
|
|
|
|
|
|
|
|
| |
The MSDN document was also updated to reflect this, but it probably will take a few days to show in below link.
https://docs.microsoft.com/en-us/cpp/intrinsics/fastfail
Differential Revision: https://reviews.llvm.org/D57631
llvm-svn: 353337
|
|
|
|
|
|
|
|
| |
Change various functions to use FunctionCallee or Function*.
Pass function type through __builtin_dump_struct's dumpRecord helper.
llvm-svn: 353199
|