| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Before this change, X86_32ABIInfo::classifyArgument would be called
twice on vector arguments to vectorcall functions. This function has
side effects to track GPR register usage, and this would lead to
incorrect GPR usage in some cases. The specific case I noticed is from
running out of XMM registers with mixed FP and vector arguments and no
aggregates of any kind. Consider this prototype:
void __vectorcall vectorcall_indirect_vec(
double xmm0, double xmm1, double xmm2, double xmm3, double xmm4,
__m128 xmm5,
__m128 ecx,
int edx,
__m128 mem);
classifyArgument has no effects when called on a plain FP type, but when
called on a vector type, it modifies FreeRegs to model GPR consumption.
However, this should not happen during the vector call first pass.
I refactored the code to unify vectorcall HVA logic with regcall HVA
logic. The conventions pass HVAs in registers differently (expanded vs.
not expanded), but if they do not fit in registers, they both pass them
indirectly by address.
Reviewers: erichkeane, craig.topper
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72110
|
| |
|
|
| |
(clang diagnostic handler for IR input files)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For IR input files, we currently use LLVM diagnostic handler even the
compilation is from clang. As a result, we are not able to use -Rpass
to get the transformation reports. Some warnings are not handled
properly either: We found many mysterious warnings in our ThinLTO backend
compilations in SamplePGO and CSPGO. An example of the warning:
"warning: net/proto2/public/metadata_lite.h:51:21: 0.02% (1 / 4999)"
This turns out to be a warning by Wmisexpect, which is supposed to be
filtered out by default. But since the filter is in clang's
diagnostic hander, we emit these incomplete warnings from LLVM's
diagnostic handler.
This patch uses clang diagnostic handler for IR input files. We create
a fake backendconsumer just to install the diagnostic handler.
With this change, we will have proper handling of all the warnings and we can
use -Rpass* options in IR input files compilation.
Also note that with is patch, LLVM's diagnostic options, like
"-mllvm -pass-remarks=*", are no longer be able to get optimization remarks.
Differential Revision: https://reviews.llvm.org/D72523
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Previously, since these aggregates are > 2*XLen, Clang would think they
were being returned indirectly and thus would decrease the number of
available GPRs available by 1. For long argument lists this could lead
to a struct argument incorrectly being passed indirectly.
Reviewers: asb, lenary
Reviewed By: asb, lenary
Subscribers: luismarques, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69590
|
| |
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2af97be8027a0823b88d4b6a07fc5eedb440bc1f.
After attempting to fix bot failures from matching issues (mostly due to
inconsistent printing of "llvm::" prefixes on objects, and
AnalysisManager objects being printed differntly, I am now seeing some
differences I don't understand (real differences in the passes being
printed). Giving up at this point to allow the bots to recover. Will
revisit later.
|
| |
|
|
|
|
|
| |
Along with the previous fix for bot failures from
2af97be8027a0823b88d4b6a07fc5eedb440bc1f, need to add a wildcard in a
couple of places where my local output did not print "llvm::" but the
bot is.
|
| |
|
|
|
|
|
|
|
| |
Hopefully final bot fix for last few failures from
2af97be8027a0823b88d4b6a07fc5eedb440bc1f.
Looks like sometimes the "llvm::" preceeding objects get printed in the
debug pass manager output and sometimes they don't. Replace with
wildcard matching.
|
| |
|
|
|
|
|
|
| |
Additional fixes for bot failures from 2af97be8027a0823b88d4b6a07fc5eedb440bc1f.
Remove more exact matching on AnalyisManagers, as they can vary.
Also allow different orders between LoopAnalysis and
BranchProbabilityAnalysis as that can vary due to both being accessed in
the parameter list of a call.
|
| |
|
|
|
|
|
|
|
| |
Should fix most of the buildbot failures from
2af97be8027a0823b88d4b6a07fc5eedb440bc1f, by loosening up the matching
on the AnalysisProxy output.
Added in --dump-input=fail on the one test that appears to be something
different, so I can hopefully debug it better.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I've added some more extensive ThinLTO pipeline testing with the new PM,
motivated by the bug fixed in D72386.
I beefed up llvm/test/Other/new-pm-pgo.ll a little so that it tests
ThinLTO pre and post link with PGO, similar to the testing for the
default pipelines with PGO.
Added new pre and post link PGO tests for both instrumentation and
sample PGO that exhaustively test the pipelines at different
optimization levels via opt.
Added a clang test to exhaustively test the post link pipeline invoked for
distributed builds. I am currently only testing O2 and O3 since these
are the most important for performance.
It would be nice to add similar exhaustive testing for full LTO, and for
the old PM, but I don't have the bandwidth now and this is a start to
cover some of the situations that are not currently default and were
under tested.
Reviewers: wmi
Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, jfb, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72538
|
| |
|
|
|
|
|
| |
All 130+ f_Group flags that take an argument allow it after a '=',
except for fdebug-complation-dir. Add a Joined<> alias so that
it behaves consistently with all the other f_Group flags.
(Keep the old Separate flag for backwards compat.)
|
| |
|
|
|
|
|
|
|
|
|
| |
In the backend, this feature is implemented with the function attribute
"patchable-function-entry". Both the attribute and XRay use
TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are
incompatible.
Reviewed By: ostannard, MaskRay
Differential Revision: https://reviews.llvm.org/D72222
|
| |
|
|
|
|
|
|
|
| |
This feature is generic. Make it applicable for AArch64 and X86 because
the backend has only implemented NOP insertion for AArch64 and X86.
Reviewed By: nickdesaulniers, aaron.ballman
Differential Revision: https://reviews.llvm.org/D72221
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a
new `"guard_nocf"` function attribute to indicate that checks should not be
added on indirect calls within that function. Add support for
`__declspec(guard(nocf))` following the same syntax as MSVC.
Reviewers: rnk, dmajor, pcc, hans, aaron.ballman
Reviewed By: aaron.ballman
Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72167
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update the IRBuilder to generate constrained FP comparisons in
CreateFCmp when IsFPConstrained is true, similar to the other
places in the IRBuilder.
Also, add a new CreateFCmpS to emit signaling FP comparisons,
and use it in clang where comparisons are supposed to be signaling
(currently, only when emitting code for the <, <=, >, >= operators).
Note that there is currently no way to add fast-math flags to a
constrained FP comparison, since this is implemented as an intrinsic
call that returns a boolean type, and FMF are only allowed for calls
returning a floating-point type. However, given the discussion around
https://bugs.llvm.org/show_bug.cgi?id=42179, it seems that FCmp itself
really shouldn't have any FMF either, so this is probably OK.
Reviewed by: craig.topper
Differential Revision: https://reviews.llvm.org/D71467
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A copy-paste error in `arm_mve.td` meant that the MVE `vqrshrun`
intrinsic family was generating the `vqshrun` machine instruction,
because in the IR intrinsic call, the rounding flag argument was set
to 0 rather than 1.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72496
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a system header provides an (inline) implementation of some of their
function, clang still matches on the function name and generate the appropriate
llvm builtin, e.g. memcpy. This behavior is in line with glibc recommendation «
users may not provide their own version of symbols » but doesn't account for the
fact that glibc itself can provide inline version of some functions.
It is the case for the memcpy function when -D_FORTIFY_SOURCE=1 is on. In that
case an inline version of memcpy calls __memcpy_chk, a function that performs
extra runtime checks. Clang currently ignores the inline version and thus
provides no runtime check.
This code fixes the issue by detecting functions whose name is a builtin name
but also have an inline implementation.
Differential Revision: https://reviews.llvm.org/D71082
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
down to pass builder in ltobackend.
Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang
are not passed down to pass builder in ltobackend when new pass manager is
used. This is inconsistent with the behavior when new pass manager is used
and thinlto is not used. Such inconsistency causes slp vectorization pass
not being enabled in ltobackend for O3 + thinlto right now. This patch
fixes that.
Differential Revision: https://reviews.llvm.org/D72386
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change introduces three new builtins (which work on both pointers
and integers) that can be used instead of common bitwise arithmetic:
__builtin_align_up(x, alignment), __builtin_align_down(x, alignment) and
__builtin_is_aligned(x, alignment).
I originally added these builtins to the CHERI fork of LLVM a few years ago
to handle the slightly different C semantics that we use for CHERI [1].
Until recently these builtins (or sequences of other builtins) were
required to generate correct code. I have since made changes to the default
C semantics so that they are no longer strictly necessary (but using them
does generate slightly more efficient code). However, based on our experience
using them in various projects over the past few years, I believe that adding
these builtins to clang would be useful.
These builtins have the following benefit over bit-manipulation and casts
via uintptr_t:
- The named builtins clearly convey the semantics of the operation. While
checking alignment using __builtin_is_aligned(x, 16) versus
((x & 15) == 0) is probably not a huge win in readably, I personally find
__builtin_align_up(x, N) a lot easier to read than (x+(N-1))&~(N-1).
- They preserve the type of the argument (including const qualifiers). When
using casts via uintptr_t, it is easy to cast to the wrong type or strip
qualifiers such as const.
- If the alignment argument is a constant value, clang can check that it is
a power-of-two and within the range of the type. Since the semantics of
these builtins is well defined compared to arbitrary bit-manipulation,
it is possible to add a UBSAN checker that the run-time value is a valid
power-of-two. I intend to add this as a follow-up to this change.
- The builtins avoids int-to-pointer casts both in C and LLVM IR.
In the future (i.e. once most optimizations handle it), we could use the new
llvm.ptrmask intrinsic to avoid the ptrtoint instruction that would normally
be generated.
- They can be used to round up/down to the next aligned value for both
integers and pointers without requiring two separate macros.
- In many projects the alignment operations are already wrapped in macros (e.g.
roundup2 and rounddown2 in FreeBSD), so by replacing the macro implementation
with a builtin call, we get improved diagnostics for many call-sites while
only having to change a few lines.
- Finally, the builtins also emit assume_aligned metadata when used on pointers.
This can improve code generation compared to the uintptr_t casts.
[1] In our CHERI compiler we have compilation mode where all pointers are
implemented as capabilities (essentially unforgeable 128-bit fat pointers).
In our original model, casts from uintptr_t (which is a 128-bit capability)
to an integer value returned the "offset" of the capability (i.e. the
difference between the virtual address and the base of the allocation).
This causes problems for cases such as checking the alignment: for example, the
expression `if ((uintptr_t)ptr & 63) == 0` is generally used to check if the
pointer is aligned to a multiple of 64 bytes. The problem with offsets is that
any pointer to the beginning of an allocation will have an offset of zero, so
this check always succeeds in that case (even if the address is not correctly
aligned). The same issues also exist when aligning up or down. Using the
alignment builtins ensures that the address is used instead of the offset. While
I have since changed the default C semantics to return the address instead of
the offset when casting, this offset compilation mode can still be used by
passing a command-line flag.
Reviewers: rsmith, aaron.ballman, theraven, fhahn, lebedev.ri, nlopes, aqjune
Reviewed By: aaron.ballman, lebedev.ri
Differential Revision: https://reviews.llvm.org/D71499
|
| |
|
|
| |
Fixes http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/2597
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
assembly.
Summary:
Extend D71677 to apply to all branch-target operands, rather than special-casing call instructions.
Also add a regression test for llvm.org/PR44272, since this finishes fixing it.
Reviewers: thakis, rnk
Reviewed By: thakis
Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72417
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A few of the ARM MVE builtins directly return a structure type. This
causes an assertion failure at code-gen time if you try to assign the
result of the builtin to a variable, because the `RValue` created in
`EmitBuiltinExpr` from the `llvm::Value` produced by codegen is always
made by `RValue::get()`, which creates a non-aggregate `RValue` that
will fail an assertion when `AggExprEmitter::withReturnValueSlot` calls
`Src.getAggregatePointer()`. A similar failure occurs if you try to use
the struct return value directly to extract one field, e.g.
`vld2q(address).val[0]`.
The existing code-gen tests for those MVE builtins pass the returned
structure type directly to the C `return` statement, which apparently
managed to avoid that particular code path, so we didn't notice the
crash.
Now `EmitBuiltinExpr` checks the evaluation kind of the builtin's return
value, and does the necessary handling for aggregate returns. I've added
two extra test cases, both of which crashed before this change.
Reviewers: dmgreen, rjmccall
Reviewed By: rjmccall
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72271
|
| |
|
|
|
|
|
| |
- Lower to the memcpy intrinsic
- Raise warnings when size/bounds are known
Differential Revision: https://reviews.llvm.org/D71374
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Architecturally, it's allowed to have MVE-I without an FPU, thus
-mfpu=none should not disable MVE-I, or moves to/from FP-registers.
This patch removes `+/-fpregs` from features unconditionally added to
target feature list, depending on FPU and moves the logic to Clang
driver, where the negative form (`-fpregs`) is conditionally added to
the target features list for the cases of `-mfloat-abi=soft`, or
`-mfpu=none` without either `+mve` or `+mve.fp`. Only the negative
form is added by the driver, the positive one is derived from other
features in the backend.
Differential Revision: https://reviews.llvm.org/D71843
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This batch of intrinsics fills in all the shift instructions that take
a variable shift distance in a register, instead of an immediate. Some
of these instructions take a single shift distance in a scalar
register and apply it to all lanes; others take a vector of per-lane
distances.
These instructions are all basically one family, varying in whether
they saturate out-of-range values, and whether they round when bits
are shifted off the bottom. I've implemented them at the IR level by a
much smaller family of IR intrinsics, which take flag parameters to
indicate saturating and/or rounding (along with the usual one to
specify signed/unsigned integers).
An oddity is that all of them are //left// shift instructions – but if
you pass a negative shift count, they'll shift right. So the vector
shift distances are always vectors of //signed// integers, regardless
of whether you're considering the other input vector to be of signed
or unsigned. Also, even the simplest `vshlq` instruction in this
family (neither saturating nor rounding) has to be implemented as an
IR intrinsic, because the ordinary LLVM IR `shl` operation would
consider an out-of-range shift count to be undefined behavior.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72329
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This batch of intrinsics covers two sets of immediate shift
instructions, which have in common that they only overwrite part of
their output register and so they need an extra input giving its
previous value.
The VSLI and VSRI instructions shift each lane of the input vector
left or right just as if they were normal immediate VSHL/VSHR, but
then they only overwrite the output bits that correspond to actual
shifted bits of the input. So VSLI will leave the low n bits of each
output lane unchanged, and VSRI the same with the top n bits.
The V[Q][R]SHR[U]N family are all narrowing shifts: they take an input
vector of 2n-bit integers, shift each lane right by a constant, and
then narrowing the shifted result to only n bits. So they only
overwrite half of the n-bit lanes in the output register, and the B/T
suffix indicates whether it's the bottom or top half of each 2n-bit
lane.
I've implemented the whole of the latter family using a single IR
intrinsic `vshrn`, which takes a lot of i32 parameters indicating
which instruction it expands to (by specifying signedness of the input
and output types, whether it saturates and/or rounds, etc).
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72328
|
| |
|
|
|
|
|
|
| |
An empty string literal in an asm label does not make a whole lot of sense. GCC
does not diagnose such a construct, but it also generates code that cannot be
assembled by gas should two symbols have an empty asm label within the same TU.
This does not affect an asm statement with an empty string literal, which is
still a useful construct.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Running an end-to-end test last week I noticed that a lot of the ACLE
intrinsics that operate differently on vectors of signed and unsigned
integers were ending up generating the signed version of the
instruction unconditionally. This is because the IR intrinsics had no
way to distinguish signed from unsigned: the LLVM type system just
calls them both `v8i16` (or whatever), so you need either separate
intrinsics for signed and unsigned, or a flag parameter that tells
ISel which one to choose.
This patch fixes all the problems of that kind that I've noticed, by
adding an i32 flag parameter to many of the IR intrinsics which is set
to 1 for unsigned (matching the existing practice in cases where we
got it right), and conditioning all the isel patterns on that flag. So
the fundamental change is in `IntrinsicsARM.td`, changing the
low-level IR intrinsics API; there are knock-on changes in
`arm_mve.td` (adjusting code gen for the ACLE intrinsics to use the
modified API) and in `ARMInstrMVE.td` (adjusting isel to expect the
new unsigned flags). The rest of this patch is boringly updating tests.
Reviewers: dmgreen, miyuki, MarkMurrayARM
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72270
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The ACLE intrinsics with `gather_base` or `scatter_base` in the name
are wrappers on the MVE load/store instructions that take a vector of
base addresses and an immediate offset. The immediate offset can be up
to 127 times the alignment unit, and it can be positive or negative.
At the MC layer, we got that right. But in the Sema error checking for
the wrapping intrinsics, the offset was erroneously constrained to be
positive.
To fix this I've adjusted the `imm_mem7bit` class in the Tablegen that
defines the intrinsics. But that causes integer literals like
`0xfffffffffffffe04` to appear in the autogenerated calls to
`SemaBuiltinConstantArgRange`, which provokes a compiler warning
because that's out of the non-overflowing range of an `int64_t`. So
I've also tweaked `MveEmitter` to emit that as `-0x1fc` instead.
Updated the tests of the Sema checks themselves, and also adjusted a
random sample of the CodeGen tests to actually use negative offsets
and prove they get all the way through code generation without causing
a crash.
Reviewers: dmgreen, miyuki, MarkMurrayARM
Reviewed By: dmgreen
Subscribers: kristof.beyls, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72268
|
| |
|
|
| |
The s390x builtins are still using FSub instead of FNeg. Correct that.
|
| |
|
|
|
|
|
| |
We already recognize the __builtin versions of these, might as well
recognize the libcall version.
Differential Revision: https://reviews.llvm.org/D72028
|
| |
|
|
|
|
| |
This replaces the fsub -0.0 idiom with an fneg instruction. We didn't see to have a test that showed the current codegen. Just some tests for constant folding and a test that was only checking the declare lines for libcalls. The latter just checked that we did not have a declare for @conj when using __builtin_conj.
Differential Revision: https://reviews.llvm.org/D72012
|
| |
|
|
|
|
| |
We have an fneg instruction now and should use it instead of the fsub -0.0 idiom. Looks like we had no test that showed that we handled the negation cases here so I've added new tests.
Differential Revision: https://reviews.llvm.org/D72010
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Amend MS offset operator implementation, to more closely fit with its MS counterpart:
1. InlineAsm: evaluate non-local source entities to their (address) location
2. Provide a mean with which one may acquire the address of an assembly label via MS syntax, rather than yielding a memory reference (i.e. "offset asm_label" and "$asm_label" should be synonymous
3. address PR32530
Based on http://llvm.org/D37461
Fix broken test where the break appears unrelated.
- Set up appropriate memory-input rewrites for variable references.
- Intel-dialect assembly printing now correctly handles addresses by adding "offset".
- Pass offsets as immediate operands (using "r" constraint for offsets of locals).
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D71436
|
| |
|
|
| |
as cleanups after D56351
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
and validateInputSize.
The validateOutputSize and validateInputSize need to check whether
AVX or AVX512 are enabled. But this can be affected by the
target attribute so we need to factor that in.
This patch moves some of the code from CodeGen to create an
appropriate feature map that we can pass to the function.
Differential Revision: https://reviews.llvm.org/D68627
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Commit d77ae1552fc21a9f3877f3ed7e13d631f517c825
("[DebugInfo] Support to emit debugInfo for extern variables")
added deebugInfo for extern variables for BPF target.
The commit is reverted by 891e25b02d760d0de18c7d46947913b3166047e7
as the committed tests using %clang instead of %clang_cc1 causing
test failed in certain scenarios as reported by Reid Kleckner.
This patch fixed the tests by using %clang_cc1.
Differential Revision: https://reviews.llvm.org/D71818
|
| |
|
|
|
|
|
| |
This reverts commit d77ae1552fc21a9f3877f3ed7e13d631f517c825.
The tests committed along with this change do not pass, and should be
changed to use %clang_cc1.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
assembly.
Summary:
This is documented as the appropriate template modifier for call operands.
Fixes PR44272, and adds a regression test.
Also adds support for operand modifiers in Intel-style inline assembly.
Reviewers: rnk
Reviewed By: rnk
Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71677
|
| |
|
|
|
|
|
|
|
|
|
| |
GCC's x86 and s390 ports support -mrecord-mcount. Other ports reject the
option.
aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mrecord-mcount’
Allowing this option can cause failures when building Linux kernel for
aarch64, powerpc64, etc, which will think the feature is available if
the clang command returns 0.
|
| |
|
|
|
|
|
|
|
|
| |
Recognize -mrecord-mcount from the command line and add a function attribute
"mrecord-mcount" when passed.
Only valid on SystemZ (when used with -mfentry).
Review: Ulrich Weigand
https://reviews.llvm.org/D71627
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The vector pattern `(a + b + 1) / 2` was previously selected to an
avgr_u instruction regardless of nuw flags, but this is incorrect in
the case where either addition may have an unsigned wrap. This CL
changes the existing pattern to require both adds to have nuw flags
and adds builtin functions and intrinsics for the avgr_u instructions
because the corrected pattern is not representable in C.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71648
|
| |
|
|
|
|
|
|
| |
Let the "mnop-mcount" function attribute simply be present or non-present.
Update SystemZ backend as well to use hasFnAttribute() instead.
Review: Ulrich Weigand
https://reviews.llvm.org/D71669
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds parsing of the qualifiers __ptr32, __ptr64, __sptr, and __uptr and
lowers them to the corresponding address space pointer for 32-bit and 64-bit pointers.
(32/64-bit pointers added in https://reviews.llvm.org/D69639)
A large part of this patch is making these pointers ignore the address space
when doing things like overloading and casting.
https://bugs.llvm.org/show_bug.cgi?id=42359
Reviewers: rnk, rsmith
Subscribers: jholewinski, jvesely, nhaehnle, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71039
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following intrinsics currently carry a rounding mode metadata argument:
llvm.experimental.constrained.minnum
llvm.experimental.constrained.maxnum
llvm.experimental.constrained.ceil
llvm.experimental.constrained.floor
llvm.experimental.constrained.round
llvm.experimental.constrained.trunc
This is not useful since the semantics of those intrinsics do not in any way
depend on the rounding mode. In similar cases, other constrained intrinsics
do not have the rounding mode argument. Remove it here as well.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D71218
|
| |
|
|
|
|
|
|
|
|
|
| |
Recognize -mpacked-stack from the command line and add a function attribute
"mpacked-stack" when passed. This is needed for building the Linux kernel.
If this option is passed for any other target than SystemZ, an error is
generated.
Review: Ulrich Weigand
https://reviews.llvm.org/D71441
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The FP-classification builtins (__builtin_isfinite, etc) use variadic
packs in the definition file to mean an overload set. Because of that,
floats were converted to doubles, which is incorrect. There WAS a patch
to remove the cast after the fact.
THis patch switches these builtins to just be custom type checking,
calls the implicit conversions for the integer members, and makes sure
the correct L->R casts are put into place, then does type checking like
normal.
A future direction (that wouldn't be NFC) would consider making
conversions for the floating point parameter legal.
Note: The initial patch for this missed that certain systems need to
still convert half to float, since they dont' support that type.
|
| |
|
|
|
|
|
| |
This change updates the clang front end to add symbols to llvm.used
when they have explicit export_name attribute.
Differential Revision: https://reviews.llvm.org/D71493
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The instructions were originally implemented via builtins and
intrinsics so users would have to explicitly opt-in to using
them. This was useful while were validating whether these instructions
should have been merged into the spec proposal. Now that they have
been, we can use normal codegen patterns, so the intrinsics and
builtins are no longer useful.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71500
|
| |
|
|
|
|
|
|
|
|
|
| |
As brought up in D71467, a group of floating point builtins
automatically promoted floats to doubles because they used the variadic
builtin tag to support an overload set. The result is that the
parameters were treated as a variadic pack, which always promots
float->double.
This resulted in the wrong answer being given in cases with certain
values of NaN.
|