Summary: Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics. Add unit tests.
Reviewers: simon_tatham, ostannard, dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70547
Summary: Add MVE VMUL intrinsics. Remove annoying "t1" from VMUL* instructions. Add unit tests.
Reviewers: simon_tatham, ostannard, dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70546
Summary: Add MVE VABD intrinsics. Add unit tests.
Reviewers: simon_tatham, ostannard, dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70545
Revert "…increment/decrement (PR44054)"
The assertion that was added does not hold; it breaks on
test-suite/MultiSource/Applications/SPASS/analyze.c.
Will reduce the testcase and revisit.
This reverts commits 9872ea4ed1de4c49300430e4f1f4dfc110a79ab9 and 870f3542d3e0d06d208442bdca6482866b59171b.
This replaces the A32 NEON vqadds, vqaddu, vqsubs and vqsubu intrinsics
with the target independent sadd_sat, uadd_sat, ssub_sat and usub_sat.
This helps generate vqadds from standard IR nodes, which might be
produced from the vectoriser. The old variants are removed in the
process.
Differential Revision: https://reviews.llvm.org/D69350
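A minimal sketch of what this means for users of arm_neon.h (the IR mapping in the comment restates the description above):
```
#include <arm_neon.h>

/* With this change, the saturating NEON intrinsics lower to the
   target-independent saturating add/sub IR nodes (e.g. sadd_sat)
   instead of ARM-specific intrinsics. */
int32x4_t saturating_add(int32x4_t a, int32x4_t b) {
    return vqaddq_s32(a, b);
}
```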
In particular, don't hardcode the signature of the handler:
it takes the source filepath, so the lengths of the buffers will not match.
Summary:
Implicit Conversion Sanitizer is *almost* feature complete.
There aren't *that* many unsanitized things left,
two major ones are increment/decrement (this patch) and bit fields.
As it was discussed in
[[ https://bugs.llvm.org/show_bug.cgi?id=39519 | PR39519 ]],
unlike `CompoundAssignOperator` (which is promoted internally),
or `BinaryOperator` (for which we always have promotion/demotion in AST)
or parts of `UnaryOperator` (we have promotion/demotion but only for
certain operations), for inc/dec, clang omits promotion/demotion
altogether, under as-if rule.
This is technically correct: https://rise4fun.com/Alive/zPgD
As it can be seen in `InstCombineCasts.cpp` `canEvaluateTruncated()`,
`add`/`sub`/`mul`/`and`/`or`/`xor` operators can all arbitrarily
be extended or truncated:
https://github.com/llvm/llvm-project/blob/901cd3b3f62d0c700e5d2c3f97eff97d634bec5e/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp#L1320-L1334
But that has serious implications:
1. Since we no longer model implicit casts, do we pessimise
their AST representation and everything that uses it?
2. There is no demotion, so lossy demotion sanitizer does not trigger :]
Now, I'm not going to argue about the first problem here,
but the second one **needs** to be addressed. As it was stated
in the report, this is done intentionally, so changing
this in all modes would be considered a penalization/regression.
Which means the sanitization-less codegen must not be altered.
It was also suggested not to change the sanitized codegen
to the one with demotion, but I quite strongly believe
that would not be the wise choice here:
1. One will need to re-engineer the check that the inc/dec was lossy
in terms of `@llvm.{u,s}{add,sub}.with.overflow` builtins
2. We will still need to compute the result we would lossily demote.
(i.e. the result of wide `add`ition/`sub`traction)
3. I suspect it would need to be done right here, in sanitization.
Which kinda defeats the point of
using `@llvm.{u,s}{add,sub}.with.overflow` builtins:
we'd have two `add`s with basically the same arguments,
one of which is used for the check+error-less codepath and the other one
for the error reporting. That seems worse than a single wide op+check.
4. OR, we would need to do that in the compiler-rt handler.
Which means we'll need a whole new handler.
But then what about the `CompoundAssignOperator`,
for which it would also be applicable?
So this also doesn't really seem like the right path to me.
5. At least X86 (but likely others) pessimizes all sub-`i32` operations
(due to partial register stalls), so even if we avoid promotion+demotion,
the computations will //likely// be performed in `i32` anyways.
So I'm not really seeing much benefit in
not doing the straightforward thing.
While looking into this, I have noticed a few more LLVM middle-end
missed canonicalizations, and filed
[[ https://bugs.llvm.org/show_bug.cgi?id=44100 | PR44100 ]],
[[ https://bugs.llvm.org/show_bug.cgi?id=44102 | PR44102 ]].
Those are not specific to inc/dec, we also have them for
`CompoundAssignOperator`, and it can happen for normal arithmetic, too.
But if we take some other path in the patch, it will not be applicable
here, and we will have most likely played ourselves.
TLDR: front-end should emit canonical, easy-to-optimize yet
un-optimized code. It is middle-end's job to make it optimal.
I'm really hoping reviewers agree with my personal assessment
of the path this patch should take.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44054 | PR44054 ]].
Reviewers: rjmccall, erichkeane, rsmith, vsk
Reviewed By: erichkeane
Subscribers: mehdi_amini, dexonsmith, cfe-commits, #sanitizers, llvm-commits, aaron.ballman, t.p.northover, efriedma, regehr
Tags: #llvm, #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D70539
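A minimal sketch of the code this lets the sanitizer catch (built with the -fsanitize=implicit-conversion group described in the report):
```
/* The increment is conceptually 'promote to int, add 1, truncate back
   to unsigned char'; at 255 that truncation is lossy, which the
   sanitizer can now diagnose at runtime. */
void wrap(void) {
    unsigned char c = 255;
    c++;
}
```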
Reapply "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there."
This reapplies: 8ff85ed905a7306977d07a5cd67ab4d5a56fafb4
Original commit message:
As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.
This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet;
this is just the initial start, done so that people could see it and we could
start tweaking after.
Test updates: outside of the newpm tests, most of the updates come from
optimization passes that are not run anymore (and without a compelling
argument to re-add them at the moment), which were largely used for
canonicalization in clang.
Original post:
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
This reverts commit c9ddb02659e3ece7a0d9d6b4dac7ceea4ae46e6d.
It is tricky to use replace_path_prefix correctly on Windows, which uses
backslashes as native path separators. Switch back to the old approach
(startswith, which is not ideal) to appease the build bots for now.
GCC 8 implements -fmacro-prefix-map. Like -fdebug-prefix-map, it replaces a string prefix for the __FILE__ macro.
-ffile-prefix-map is the union of -fdebug-prefix-map and -fmacro-prefix-map.
Reviewed By: rnk, Lekensteyn, maskray
Differential Revision: https://reviews.llvm.org/D49466
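A sketch of the effect (paths and invocation are hypothetical):
```
/* Compiled as:
 *   clang -fmacro-prefix-map=/home/user/project=. -c /home/user/project/src/hello.c
 * __FILE__ below expands to "./src/hello.c" instead of the absolute
 * path; -fdebug-prefix-map does the same for debug info only, and
 * -ffile-prefix-map applies both remappings at once. */
#include <stdio.h>

void where(void) {
    printf("%s\n", __FILE__);
}
```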
The modifier system used to mutate types on NEON intrinsic definitions had a
separate letter for all kinds of transformations that might be needed, and we
were quite quickly running out of letters to use. This patch converts to a much
smaller set of orthogonal modifiers that can be applied together to achieve the
desired effect.
When merging with downstream it is likely to cause a conflict with any local
modifications to the .td files. There is a new script in
utils/convert_arm_neon.py that was used to convert all .td definitions and I
would suggest running it on the last downstream version of those files before
this commit rather than resolving conflicts manually.
The original version broke vcreate_* because it became a macro and didn't
apply the normal integer promotion rules before bitcasting to a vector.
This adds a temporary.
Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there."
This reverts commit 8ff85ed905a7306977d07a5cd67ab4d5a56fafb4.
This commit introduced 9 new failures on lldb buildbot host at http://lab.llvm.org:8014/builders/lldb-aarch64-ubuntu
Following tests were failing:
lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq1/TestAmbiguousTailCallSeq1.py
lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq2/TestAmbiguousTailCallSeq2.py
lldb-api :: functionalities/tail_call_frames/disambiguate_call_site/TestDisambiguateCallSite.py
lldb-api :: functionalities/tail_call_frames/disambiguate_paths_to_common_sink/TestDisambiguatePathsToCommonSink.py
lldb-api :: functionalities/tail_call_frames/disambiguate_tail_call_seq/TestDisambiguateTailCallSeq.py
lldb-api :: functionalities/tail_call_frames/inlining_and_tail_calls/TestInliningAndTailCalls.py
lldb-api :: functionalities/tail_call_frames/sbapi_support/TestTailCallFrameSBAPI.py
lldb-api :: functionalities/tail_call_frames/thread_step_out_message/TestArtificialFrameStepOutMessage.py
lldb-api :: functionalities/tail_call_frames/thread_step_out_or_return/TestSteppingOutWithArtificialFrames.py
lldb-api :: functionalities/tail_call_frames/unambiguous_sequence/TestUnambiguousTailCalls.py
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.
This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet;
this is just the initial start, done so that people could see it and we could
start tweaking after.
Test updates: outside of the newpm tests, most of the updates come from
optimization passes that are not run anymore (and without a compelling
argument to re-add them at the moment), which were largely used for
canonicalization in clang.
Original post:
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
This has the main effect of causing target-cpu and target-features to be set
on __cfi_check_fail, causing the function to become ABI-compatible with other
functions in the case where these attributes affect ABI (e.g. reserve-x18).
Technically we only need to call SetLLVMFunctionAttributes to get the target-*
attributes set, but since we're creating a definition we probably ought to
call the ForDefinition function as well.
Fixes PR44094.
Differential Revision: https://reviews.llvm.org/D70692
Revert "…multiple modifiers."
This broke the vcreate_u64 intrinsic. Example:
```
$ cat /tmp/a.cc
#include <arm_neon.h>
void g() {
    auto v = vcreate_u64(0);
}
$ bin/clang -c /tmp/a.cc --target=arm-linux-androideabi16 -march=armv7-a
/tmp/a.cc:4:12: error: C-style cast from scalar 'int' to vector 'uint64x1_t' (vector of 1 'uint64_t' value) of different size
    auto v = vcreate_u64(0);
             ^~~~~~~~~~~~~~
/work/llvm.monorepo/build.release/lib/clang/10.0.0/include/arm_neon.h:4144:11: note: expanded from macro 'vcreate_u64'
    __ret = (uint64x1_t)(__p0); \
            ^~~~~~~~~~~~~~~~~~
```
Reverting until this can be investigated.
> The modifier system used to mutate types on NEON intrinsic definitions had a
> separate letter for all kinds of transformations that might be needed, and we
> were quite quickly running out of letters to use. This patch converts to a much
> smaller set of orthogonal modifiers that can be applied together to achieve the
> desired effect.
>
> When merging with downstream it is likely to cause a conflict with any local
> modifications to the .td files. There is a new script in
> utils/convert_arm_neon.py that was used to convert all .td definitions and I
> would suggest running it on the last downstream version of those files before
> this commit rather than resolving conflicts manually.
When the Dwarf Version metadata was initially added (r184276) there was
no support for Module::Max - though the comment suggested that was the
desired behavior. The original behavior was Module::Warn which would
warn and then pick whichever version came first - which is pretty
arbitrary/luck-based if the consumer has some need for one version or
the other.
Now that the functionality's been added (r303590) this change updates
the implementation to match the desired goal.
The general logic here is - if you compile /some/ of your program with a
more recent DWARF version, you must have a consumer that can handle it,
so might as well use it for /everything/.
The only place where this might fall down is if you need to use an old
tool (supporting only the older DWARF version) for some subset of your
program; in that case it'll now all be the higher version. That seems
pretty narrow (& the inverse could happen too - you specifically /need/
the higher DWARF version for some extra expressivity, etc., in some part
of the program).
We seem to have been gradually growing support for atomic min/max operations
(exposing longstanding IR atomicrmw instructions). But until now there have
been gaps in the expected intrinsics. This adds support for the C11-style
intrinsics (i.e. taking _Atomic, rather than being individually blessed by
the C11 standard), and the variants that return the new value instead of the original
one.
That way, people won't be misled by trying one form and it not working, and the
front-end is more friendly to people using _Atomic types, as we recommend.
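A sketch of the C11-style form (the builtin spelling is an assumption, following the existing __c11_atomic_* naming):
```
#include <stdatomic.h>

/* Like the other fetch operations, this returns the value the object
   held *before* the minimum was taken. */
int fetch_min(_Atomic int *p, int v) {
    return __c11_atomic_fetch_min(p, v, memory_order_relaxed);
}
```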
The modifier system used to mutate types on NEON intrinsic definitions had a
separate letter for all kinds of transformations that might be needed, and we
were quite quickly running out of letters to use. This patch converts to a much
smaller set of orthogonal modifiers that can be applied together to achieve the
desired effect.
When merging with downstream it is likely to cause a conflict with any local
modifications to the .td files. There is a new script in
utils/convert_arm_neon.py that was used to convert all .td definitions and I
would suggest running it on the last downstream version of those files before
this commit rather than resolving conflicts manually.
It turns out that the ExprMutationAnalyzer can be very slow when the AST
gets huge in some cases. The idea is to move this analysis to the LLVM
back-end level (more precisely, into the LiveDebugValues pass). The new
approach will remove the performance regression, simplify the
implementation and give us a front-end independent implementation.
Differential Revision: https://reviews.llvm.org/D68206
…(reland with fixes)
Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).
Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.
Pre-patch debug session with a cross-TU tail call:
```
* frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```
Post-patch (note that the tail-calling frame, "helper", is visible):
```
* frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```
This was reverted in 5b9a072c because it attached declaration
subprograms to inlinable builtin calls, which interacted badly with the
MergeICmps pass. The fix is to not attach declarations to builtins.
rdar://46577651
Differential Revision: https://reviews.llvm.org/D69743
The CUDA builtin library is apparently compiled in C++ mode, so the
assumption of convergent needs to be made in a typically non-SPMD
language. The functions in the library should still be assumed
convergent. Currently they are not, which is potentially incorrect;
it merely happens to work after the library is linked.
This fills in the small family of MVE intrinsics that have nothing to
do with vectors: they implement bit-shift operations on 32- or 64-bit
values held in one or two general-purpose registers. Most of these
shift operations saturate if shifting left, and round to nearest if
shifting right, although LSLL and ASRL behave like ordinary shifts.
When these instructions take a variable shift count in a register,
they pay attention to its sign, so that (for example) LSLL or UQRSHLL
will shift left if given a positive number but right if given a
negative one. That makes even LSLL and ASRL different enough from
standard LLVM IR shift semantics that I couldn't see any better
alternative than to simply model the whole family as a set of
MVE-specific IR intrinsics.
(The //immediate// forms of LSLL and ASRL, on the other hand, do
behave exactly like a standard IR shift of a 64-bit value. In fact,
those forms don't have ACLE intrinsics defined at all, because you can
just write an ordinary C shift operation if you want one of those.)
The 64-bit shifts have to be instruction-selected in C++, because they
deliver two output values. But the 32-bit ones are simple enough that
I could write a DAG isel pattern directly into each Instruction
record.
Reviewers: ostannard, MarkMurrayARM, dmgreen
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70319
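A sketch using an ACLE spelling from arm_mve.h (the spelling is an assumption based on the ACLE specification):
```
#include <arm_mve.h>

/* LSLL on a 64-bit value held in two GPRs: shifts left for a positive
   count, right for a negative one, per the description above. */
uint64_t shift_either_way(uint64_t v, int32_t count) {
    return lsll(v, count);
}
```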
Revert "…-ffp-model=, and -ffp-exception-behavior="
and a follow-up NFC rearrangement, as it's causing a crash on valid code. The testcase is on the original review thread.
This reverts commits af57dbf12e54f3a8ff48534bf1078f4de104c1cd and e6584b2b7b2de06f1e59aac41971760cac1e1b79.
This adds the `vcmp` family of ACLE MVE intrinsics: vector/vector,
vector/scalar, and the predicated forms of both. All are represented
using standard existing IR: vector/scalar comparisons are represented
by making a vector out of the scalar first, and predicated forms are
represented by taking the bitwise AND of the input predicate and the
output of the comparison. Existing LLVM-side tests demonstrate that
ISel will pattern-match all of that back down to single MVE VCMPs.
The idiom of handling a vector/scalar operation by generating IR to
expand the scalar into a second vector is going to be needed for a lot
of MVE intrinsics, so to make that easy, I've provided a helper
function that automatically works out the element count.
The comparison intrinsics are the first ones that have to //return// a
predicate, in the user-facing `mve_pred16_t` format. This means we
have to use the `arm_mve_pred_v2i` low-level intrinsic to convert it
back from the logical `<n x i1>` form used in IR. I've done that
explicitly in the code gen specification for the builtins, because it
happens much more rarely in the ACLE API than passing a Predicate as
input, so it didn't seem worth automating in MveEmitter.
Reviewers: ostannard, MarkMurrayARM, dmgreen
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70297
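A sketch of the vector/scalar form and its predicated variant (ACLE spellings assumed):
```
#include <arm_mve.h>

/* v > splat(x), lane-wise; the result is an mve_pred16_t predicate. */
mve_pred16_t greater(int32x4_t v, int32_t x) {
    return vcmpgtq_n_s32(v, x);
}

/* Predicated form: the comparison result is ANDed with p. */
mve_pred16_t greater_pred(int32x4_t v, int32_t x, mve_pred16_t p) {
    return vcmpgtq_m_n_s32(v, x, p);
}
```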
This patch implements `__attribute__((target("branch-protection=...")))`
in a manner compatible with the analogous GCC feature:
https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes
Differential Revision: https://reviews.llvm.org/D68711
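A sketch of the attribute syntax, mirroring the GCC documentation linked above:
```
/* Enable return-address signing (including leaf functions) for this
   one function; the rest of the TU keeps the command-line setting. */
__attribute__((target("branch-protection=pac-ret+leaf")))
void sensitive(void) {
}
```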
This reverts commit rG1643734741d2 due to an LLDB test failure.
It turns out that the ExprMutationAnalyzer can be very slow when the AST
gets huge in some cases. The idea is to move this analysis to the LLVM
back-end level (more precisely, into the LiveDebugValues pass). The new
approach will remove the performance regression, simplify the
implementation and give us a front-end independent implementation.
Differential Revision: https://reviews.llvm.org/D68206
This adds the `vgetq_lane` and `vsetq_lane` families, to copy between
a scalar and a specified lane of a vector.
One of the new `vgetq_lane` intrinsics returns a `float16_t`, which
causes a compile error if `%clang_cc1` doesn't get the option
`-fallow-half-arguments-and-returns`. The driver passes that option to
cc1 already, but I've had to edit all the explicit cc1 command lines
in the existing MVE intrinsics tests.
A couple of fixes are included for the code I wrote up front in
MveEmitter to support lane-index immediates (and which nothing has
tested until now): the type was wrong (`uint32_t` instead of `int`)
and the range was off by one.
I've also added a method of bypassing the default promotion to `i32`
that is done by the MveEmitter code generation: it's sensible to
promote short scalars like `i16` to `i32` if they're going to be
passed to custom IR intrinsics representing a machine instruction
operating on GPRs, but not if they're going to be passed to standard
IR operations like `insertelement` which expect the exact type.
Reviewers: ostannard, MarkMurrayARM, dmgreen
Reviewed By: dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70188
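A sketch of the new lane intrinsics (MVE vectors are always 128 bits, so float32x4_t has four lanes; the lane index must be an immediate):
```
#include <arm_mve.h>

float32x4_t swap_ends(float32x4_t v) {
    float lo = vgetq_lane_f32(v, 0);
    float hi = vgetq_lane_f32(v, 3);
    v = vsetq_lane_f32(hi, v, 0);
    return vsetq_lane_f32(lo, v, 3);
}
```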
This batch of intrinsics includes lots of things that move vector data
around or change its type without really affecting its value very
much. It includes the `vreinterpretq` family (cast one vector type to
another); `vuninitializedq` (create a vector of a given type with
don't-care contents); and `vcreateq` (make a 128-bit vector out of two
`uint64_t` halves).
These are all implemented using completely standard IR that's already
tested in existing LLVM unit tests, so I've just written a clang test
to check the IR is correct, and left it at that.
I've also added some richer infrastructure to the MveEmitter Tablegen
backend, to make it specify the exact integer type of integer
arguments passed to IR construction functions, and wrap those
arguments in a `static_cast` in the autogenerated C++. That was
necessary to prevent an overloading ambiguity when passing the integer
literal `0` to `IRBuilder::CreateInsertElement`, because otherwise, it
could mean either a null pointer `llvm::Value *` or a zero `uint64_t`.
Reviewers: ostannard, MarkMurrayARM, dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70133
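A sketch of the vcreateq and vreinterpretq families (ACLE spellings assumed):
```
#include <arm_mve.h>

uint32x4_t from_halves(uint64_t lo, uint64_t hi) {
    uint64x2_t v = vcreateq_u64(lo, hi); /* two 64-bit halves -> 128 bits */
    return vreinterpretq_u32_u64(v);     /* pure bit-cast, no value change */
}
```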
No instruction takes mxcsr as an operand, so we should always
treat it as an identifier name.
Depending on the cmake configuration, clang may generate different IR
names for slot variables. Use a regex instead of hard-coding the name.
I did the same for the other bpf-attr-preserve-access-index tests,
but somehow missed this one.
This is a resubmission of the previously reverted commit
943436040121 with the same subject. This commit fixed the
segfault issue and addressed additional review comments.
This patch introduced a new bpf specific attribute which can
be added to struct or union definition. For example,
struct s { ... } __attribute__((preserve_access_index));
union u { ... } __attribute__((preserve_access_index));
The goal is to simplify user code for cases
where preserving the access index is desired for certain structs/unions,
so the user does not need to use clang's __builtin_preserve_access_index
for every member access.
The attribute has no effect if -g is not specified.
When the attribute is specified and -g is specified, any member
access defined by that structure or union, including array subscript
access and inner records, will be preserved through
__builtin_preserve_{array,struct,union}_access_index()
IR intrinsics, which will enable relocation generation
in bpf backend.
The following is an example to illustrate the usage:
```
-bash-4.4$ cat t.c
#define __reloc__ __attribute__((preserve_access_index))
struct s1 {
    int c;
} __reloc__;
struct s2 {
    union {
        struct s1 b[3];
    };
} __reloc__;
struct s3 {
    struct s2 a;
} __reloc__;
int test(struct s3 *arg) {
    return arg->a.b[2].c;
}
-bash-4.4$ clang -target bpf -g -S -O2 t.c
```
A relocation with access string "0:0:0:0:2:0" will be generated
representing access offset of arg->a.b[2].c.
Forward declarations with the attribute are also handled properly, such
that the attribute is copied and populated in the real record definition.
Differential Revision: https://reviews.llvm.org/D69759
This patch adds the ACLE intrinsics for all the MVE load and store
instructions not already handled by D69791. These ones don't need new
IR intrinsics, because they can be implemented in terms of standard
LLVM IR constructions.
Some of the load and store instructions access less than 128 bits of
memory, sign/zero extending each value to a wider vector lane on load
or truncating it on store. These are represented in IR by a load of a
shorter vector followed by a zext/sext, and conversely, a trunc
followed by a short store. Existing ISel patterns already recognize
those combinations and turn them into the right MVE instructions.
The predicated forms of all these instructions are represented in the
same way, except that the ordinary load/store operation is replaced
with the existing intrinsics @llvm.masked.{load,store}. These are
currently only code-generated as predicated MVE load/store
instructions if you give LLVM the `-enable-arm-maskedldst` option; so
I've done that in the LLVM codegen test. When we make that the
default, that option can be removed.
In the Tablegen backend, I've had to add a handful of extra support
features:
* We need to be able to make clang::Address objects out of a
pointer and an alignment (previously we only needed these when the
user passed us an existing one).
* We can now specify vector types that aren't 128 bits wide (for use
in those intermediate values in IR), the parametrized type system
can make one starting from two existing vector types (using the lane
count of one and the element type of the other).
* I've added support for code generation of pointer casts, and for
specifying LLVM types as operands to IRBuilder operations (for zext
and sext, though I think they'll come in useful again).
* Now not all IR construction operations need to be specified as
Builder.CreateFoo; some don't involve a Builder at all, and one
passes it as a parameter to a tiny static helper function in
CGBuiltin.cpp.
Reviewers: ostannard, MarkMurrayARM, dmgreen
Subscribers: kristof.beyls, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70088
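A sketch of one widening load (intrinsic spelling assumed from the ACLE; per the description, it becomes a short vector load plus a sign-extend in IR):
```
#include <arm_mve.h>

/* Loads 8 bytes and sign-extends each one into a 16-bit lane. */
int16x8_t widen_8_to_16(const int8_t *p) {
    return vldrbq_s16(p);
}
```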
This reverts commit 4a5aa1a7bf8b1714b817ede8e09cd28c0784228a.
There are some other test failures. Investigate them first.
This patch introduced a new bpf specific attribute which can
be added to struct or union definition. For example,
struct s { ... } __attribute__((preserve_access_index));
union u { ... } __attribute__((preserve_access_index));
The goal is to simplify user code for cases
where preserving the access index is desired for certain structs/unions,
so the user does not need to use clang's __builtin_preserve_access_index
for every member access.
The attribute has no effect if -g is not specified.
When the attribute is specified and -g is specified, any member
access defined by that structure or union, including array subscript
access and inner records, will be preserved through
__builtin_preserve_{array,struct,union}_access_index()
IR intrinsics, which will enable relocation generation
in bpf backend.
The following is an example to illustrate the usage:
```
-bash-4.4$ cat t.c
#define __reloc__ __attribute__((preserve_access_index))
struct s1 {
    int c;
} __reloc__;
struct s2 {
    union {
        struct s1 b[3];
    };
} __reloc__;
struct s3 {
    struct s2 a;
} __reloc__;
int test(struct s3 *arg) {
    return arg->a.b[2].c;
}
-bash-4.4$ clang -target bpf -g -S -O2 t.c
```
A relocation with access string "0:0:0:0:2:0" will be generated
representing access offset of arg->a.b[2].c.
Forward declarations with the attribute are also handled properly, such
that the attribute is copied and populated in the real record definition.
Differential Revision: https://reviews.llvm.org/D69759
Differential Revision: https://reviews.llvm.org/D69648
As we currently have it implemented in altivec.h, the offsets for these two
intrinsics are element offsets. The documentation in the ABI (as well as the
implementation in both XL and GCC) states that these should be byte offsets.
Differential revision: https://reviews.llvm.org/D63636
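A sketch of the corrected semantics (VSX must be enabled for vec_xl):
```
#include <altivec.h>

/* After this fix the first argument is a byte offset, as the ABI
   specifies: this loads the vector that starts 16 bytes past p,
   not 16 elements past it. */
vector signed int second_vector(signed int *p) {
    return vec_xl(16, p);
}
```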
We currently emit a double precision comparison instruction for this, whereas we
need to emit the single precision version.
Differential revision: https://reviews.llvm.org/D64024
…-ffp-exception-behavior=
Add options to control floating point behavior: trapping and
exception behavior, rounding, and control of optimizations that affect
floating point calculations. More details in UsersManual.rst.
Reviewers: rjmccall
Differential Revision: https://reviews.llvm.org/D62731
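A sketch of typical invocations (values as described in UsersManual.rst; the source file is hypothetical):
```
/* Example invocations:
 *   clang -ffp-model=strict t.c                # strict exception semantics
 *   clang -ffp-model=fast t.c                  # allow fast-math optimizations
 *   clang -ffp-exception-behavior=maytrap t.c  # no spurious FP traps
 * Outside the fast model, the compiler must not reassociate this sum. */
double sum3(double a, double b, double c) {
    return (a + b) + c;
}
```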
Atomic compound expressions try to use atomicrmw if possible, but this
path doesn't set the Result variable, leaving it to crash in later code
if anything ever tries to use the result of the expression. This fixes
that issue by recalculating the new value from the atomically loaded
old one.
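A minimal sketch of the affected shape:
```
#include <stdatomic.h>

/* The += takes the atomicrmw fast path; before this fix, using the
   value of the compound expression could crash codegen. */
int bump(_Atomic int *p) {
    return (*p += 1);
}
```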
Revert "…is understood"
This caused Chromium builds to fail with "inlinable function call in a function
with debug info must have a !dbg location" errors. See
https://bugs.chromium.org/p/chromium/issues/detail?id=1022296#c1 for a
reproducer.
> Currently, clang emits subprograms for declared functions when the
> target debugger or DWARF standard is known to support entry values
> (DW_OP_entry_value & the GNU equivalent).
>
> Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
> tail calls.
>
> Pre-patch debug session with a cross-TU tail call:
>
> ```
> * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
> frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> Post-patch (note that the tail-calling frame, "helper", is visible):
>
> ```
> * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
> frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
> frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> rdar://46577651
>
> Differential Revision: https://reviews.llvm.org/D69743
'a' used to implement a splat in C++ code in NeonEmitter.cpp, but this
can be done directly from .td expansions now (and most ops already did so).
Removing it simplifies the overall code.
https://reviews.llvm.org/D69716
This patch adds two new families of intrinsics, both of which are
memory accesses taking a vector of locations to load from / store to.
The vldrq_gather_base / vstrq_scatter_base intrinsics take a vector of
base addresses, and an immediate offset to be added consistently to
each one. vldrq_gather_offset / vstrq_scatter_offset take a scalar
base address, and a vector of offsets to add to it. The
'shifted_offset' variants also multiply each offset by the element
size, so that the vector is effectively a vector of array indices.
At the IR level, these operations are represented by a single set of
four IR intrinsics: {gather,scatter} × {base,offset}. The other
details (signed/unsigned, shift, and memory element size as opposed to
vector element size) are all specified by IR intrinsic polymorphism
and immediate operands, because that made the selection job easier
than making a huge family of similarly named intrinsics.
I considered using the standard IR representations such as
llvm.masked.gather, but they're not a good fit. In order to use
llvm.masked.gather to represent a gather_offset load with element size
smaller than a pointer, you'd have to expand the <8 x i16> vector of
offsets into an <8 x i16*> vector of pointers, which would be split up
during legalization, so you'd spend most of your time undoing the mess
it had made. Also, ISel support for llvm.masked.gather would be easy
enough in a trivial way (you can expand it into a gather-base load
with a zero immediate offset), but instruction-selecting lots of
fiddly idioms back into all the _other_ MVE load instructions would be
much more work. So I think dedicated IR intrinsics are the more
sensible approach, at least for the moment.
On the clang tablegen side, I've added two new features to the
Tablegen source accepted by MveEmitter: a 'CopyKind' type node for
defining a type that varies with the parameter type (it lets you ask
for an unsigned integer type of the same width as the parameter), and
an 'unsignedflag' value node for passing an immediate IR operand which
is 0 for a signed integer type or 1 for an unsigned one. That lets me
write each kind of intrinsic just once and get all its subtypes and
immediate arguments generated automatically.
Also I've tweaked the handling of pointer-typed values in the code
generation part of MveEmitter: they're generated as Address rather
than Value (i.e. including an alignment) so that they can be given to
the ordinary IR load and store operations, but I'd omitted the code to
convert them back to Value when they're going to be used as an
argument to an IR intrinsic.
On the MC side, I've enhanced MVEVectorVTInfo so that it can tell you
not only the full assembly-language suffix for a given vector type
(like 's32' or 'u16') but also the numeric-only one used by store
instructions (just '32' or '16').
Reviewers: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69791
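A sketch of the two addressing styles (ACLE spellings assumed):
```
#include <arm_mve.h>

/* Scalar base plus a vector of offsets; the 'shifted' variant scales
   each offset by the element size, so idx holds array indices. */
int32x4_t gather(const int32_t *base, uint32x4_t idx) {
    return vldrwq_gather_shifted_offset_s32(base, idx);
}
```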
Recognize -mnop-mcount from the command line and add a function attribute
"mnop-mcount"="true" when passed.
When this option is used, a nop is added instead of a call to fentry. This
is used when building the Linux Kernel.
If this option is passed for any target other than SystemZ, an error is
generated.
Review: Ulrich Weigand
https://reviews.llvm.org/D67763
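A sketch of the intended use (the invocation is an assumption, pairing the option with the usual kernel -pg/-mfentry flags):
```
/* clang --target=s390x-linux-gnu -pg -mfentry -mnop-mcount -c f.c
   emits a patchable NOP at the entry of traced() instead of a call
   to fentry; on any non-SystemZ target the option is rejected. */
void traced(void) {
}
```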
Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).
Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.
Pre-patch debug session with a cross-TU tail call:
```
* frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```
Post-patch (note that the tail-calling frame, "helper", is visible):
```
* frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```
rdar://46577651
Differential Revision: https://reviews.llvm.org/D69743
This reverts commit 004ed2b0d1b86d424643ffc88fce20ad8bab6804.
Original commit hash 6d03890384517919a3ba7fe4c35535425f278f89
Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.
https://reviews.llvm.org/D67723
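A sketch, assuming the flag spelling from D67723:
```
/* clang -O2 -g -gno-inline-line-tables -c t.c
   After inlining, code from leaf() is attributed to the call site in
   caller() instead of carrying leaf()'s own line numbers. */
static inline int leaf(int x) { return x * 2; }

int caller(int x) { return leaf(x) + 1; }
```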
Summary:
This patch adds the "mayRaiseFPException" flag, and FPCW and FPSW usage,
for X87 instructions which could raise a floating-point exception.
Reviewers: pengfei, RKSimon, andrew.w.kaylor, uweigand, kpn, spatel, cameron.mcinally, craig.topper
Reviewed By: craig.topper
Subscribers: thakis, hiraditya, llvm-commits
Patch by LiuChen.
Differential Revision: https://reviews.llvm.org/D68854