| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
| |
This reverts commits r334980 and r334983 because they were causing build
timeouts on the x86_64-linux-ubsan bot.
llvm-svn: 335085
|
|
|
|
|
|
|
|
|
|
|
|
| |
insertOutlinerEpilogue
insertOutlinerPrologue was not used by any target, and prologue-esque code was
beginning to appear in insertOutlinerEpilogue. Refactor that into one function,
buildOutlinedFrame.
This just removes insertOutlinerPrologue and renames insertOutlinerEpilogue.
llvm-svn: 335076
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch uses the DiagnosticPredicate for SVE predicate patterns
to improve their diagnostics, now giving a 'invalid operand' diagnostic
if the type is not an immediate or one of the expected pattern
labels.
Reviewers: samparker, SjoerdMeijer, javed.absar, fhahn
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48220
llvm-svn: 334983
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variants added by this patch are:
- SQINC signed increment, e.g. sqinc x0, w0, all, mul #4
- SQDEC signed decrement, e.g. sqdec x0, w0, all, mul #4
- UQINC unsigned increment, e.g. uqinc w0, all, mul #4
- UQDEC unsigned decrement, e.g. uqdec w0, all, mul #4
This patch includes asmparser changes to parse a GPR64 as a GPR32 in
order to satisfy the constraint check:
x0 == GPR64(w0)
in:
sqinc x0, w0, all, mul #4
^___^ (must match)
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47716
llvm-svn: 334980
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The variants added by this patch are:
- SQINC (signed increment)
- UQINC (unsigned increment)
- SQDEC (signed decrement)
- UQDEC (unsigned decrement)
For example:
uqincw x0, all, mul #4
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Differential Revision: https://reviews.llvm.org/D47715
llvm-svn: 334948
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds instructions for comparing elements from two vectors, e.g.
cmpgt p0.s, p0/z, z0.s, z1.s
and also adds support for comparing to a 64-bit wide element vector, e.g.
cmpgt p0.s, p0/z, z0.s, z1.d
The patch also contains aliases for certain comparisons, e.g.:
cmple p0.s, p0/z, z0.s, z1.s => cmpge p0.s, p0/z, z1.s, z0.s
cmplo p0.s, p0/z, z0.s, z1.s => cmphi p0.s, p0/z, z1.s, z0.s
cmpls p0.s, p0/z, z0.s, z1.s => cmphs p0.s, p0/z, z1.s, z0.s
cmplt p0.s, p0/z, z0.s, z1.s => cmpgt p0.s, p0/z, z1.s, z0.s
llvm-svn: 334931
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for instructions performing bitwise operations
on predicate vectors, including AND, BIC, EOR, NAND, NOR, ORN, ORR, and
their status flag setting variants ANDS, BICS, EORS, NANDS, ORNS, ORRS.
This patch also adds several aliases:
orr p0.b, p1/z, p1.b, p1.b => mov p0.b, p1.b
orrs p0.b, p1/z, p1.b, p1.b => movs p0.b, p1.b
and p0.b, p1/z, p2.b, p2.b => mov p0.b, p1/z, p2.b
ands p0.b, p1/z, p2.b, p2.b => movs p0.b, p1/z, p2.b
eor p0.b, p1/z, p2.b, p1.b => not p0.b, p1/z, p2.b
eors p0.b, p1/z, p2.b, p1.b => nots p0.b, p1/z, p2.b
llvm-svn: 334906
|
|
|
|
|
|
|
|
| |
Support for SVE's predicated select instructions to select elements
from either vector, both in a data-vector and a predicate-vector
variant.
llvm-svn: 334905
|
|
|
|
|
|
|
| |
Predicated splat/copy of SIMD/FP register or general purpose
register to SVE vector, along with MOV-aliases.
llvm-svn: 334842
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Increment/decrement scalar register by (scaled) element count given by
predicate pattern, e.g. 'incw x0, all, mul #4'.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47713
llvm-svn: 334838
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: javed.absar
Differential Revision: https://reviews.llvm.org/D47712
llvm-svn: 334831
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some instructions require of a limited set of FP immediates as operands,
for example '#0.5 or #1.0' for SVE's FADD instruction.
This patch adds support for parsing and printing such FP immediates as
exact values (e.g. #0.499999 is not accepted for #0.5).
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47711
llvm-svn: 334826
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For targets I'm not familiar with, I've automatically made the "default to 1 for each resource" behaviour explicit in the td files.
For more obvious cases, I've ventured a fix.
Some notes:
- Exynos is especially fishy.
- AArch64SchedThunderX2T99.td had some truncated entries. If I understand correctly, the person who wrote that interpreted the ResourceCycle as a range. I made the decision to use the upper/lower bound for consistency with the 'Latency' value. I'm sure there is a better choice.
- The change to X86ScheduleBtVer2.td is an NFC, it just makes values more explicit.
Also see PR37310.
Reviewers: RKSimon, craig.topper, javed.absar
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D46356
llvm-svn: 334586
|
|
|
|
|
|
|
|
|
|
|
|
| |
Register x20 is a callee-saved register which may be used for other
purposes in certain contexts, for example to hold special variables
within the kernel. This change adds support for reserving this register
both to frontend and backend to make this register usable for these
purposes.
Differential Revision: https://reviews.llvm.org/D46552
llvm-svn: 334531
|
|
|
|
| |
llvm-svn: 334488
|
|
|
|
|
|
| |
This is part of https://reviews.llvm.org/D46356.
llvm-svn: 334391
|
|
|
|
|
|
| |
Create a separate feature set for Exynos M4 and add test cases.
llvm-svn: 334115
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On targets like Arm some relaxations may only be performed when certain
architectural features are available. As functions can be compiled with
differing levels of architectural support we must make a judgement on
whether we can relax based on the MCSubtargetInfo for the function. This
change passes through the MCSubtargetInfo for the function to
fixupNeedsRelaxation so that the decision on whether to relax can be made
per function. In this patch, only the ARM backend makes use of this
information. We must also pass the MCSubtargetInfo to applyFixup because
some fixups skip error checking on the assumption that relaxation has
occurred, to prevent code-generation errors applyFixup must see the same
MCSubtargetInfo as fixupNeedsRelaxation.
Differential Revision: https://reviews.llvm.org/D44928
llvm-svn: 334078
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is setting up to fix bug 37573 cleanly.
This moves data structures that are technically both used in some way by the
target and the general-purpose outlining algorithm into MachineOutliner.h. In
particular, the `Candidate` class is of importance.
Before, the outliner passed the locations of `Candidates` to the target, which
would then make some decisions about the prospective outlined function. This
change allows us to just pass `Candidates` along to the target. This will allow
the target to discard `Candidates` that would be considered unsafe before cost
calculation. Thus, we will be able to remove the unsafe candidates described in
the bug without resorting to torching the entire prospective function.
Also, as a side-effect, it makes the outliner a bit cleaner.
https://bugs.llvm.org/show_bug.cgi?id=37573
llvm-svn: 333952
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The new rules are straightforward. The main rules to keep in mind
are:
1. NAME is an implicit template argument of class and multiclass,
and will be substituted by the name of the instantiating def/defm.
2. The name of a def/defm in a multiclass must contain a reference
to NAME. If such a reference is not present, it is automatically
prepended.
And for some additional subtleties, consider these:
3. defm with no name generates a unique name but has no special
behavior otherwise.
4. def with no name generates an anonymous record, whose name is
unique but undefined. In particular, the name won't contain a
reference to NAME.
Keeping rules 1&2 in mind should allow a predictable behavior of
name resolution that is simple to follow.
The old "rules" were rather surprising: sometimes (but not always),
NAME would correspond to the name of the toplevel defm. They were
also plain bonkers when you pushed them to their limits, as the old
version of the TableGen test case shows.
Having NAME correspond to the name of the toplevel defm introduces
"spooky action at a distance" and breaks composability:
refactoring the upper layers of a hierarchy of nested multiclass
instantiations can cause unexpected breakage by changing the value
of NAME at a lower level of the hierarchy. The new rules don't
suffer from this problem.
Some existing .td files have to be adjusted because they ended up
depending on the details of the old implementation.
Change-Id: I694095231565b30f563e6fd0417b41ee01a12589
Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D47430
llvm-svn: 333900
|
|
|
|
| |
llvm-svn: 333879
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For immediates used in DUP instructions that have the range
-128 to 127, or a multiple of 256 in the range -32768 to 32512,
one could argue that when the result element size is 16bits (.h),
the value can be considered both signed and unsigned.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47619
llvm-svn: 333873
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Print the first indexed element as a FP register, for example:
mov z0.d, z1.d[0]
Is now printed as:
mov z0.d, d1
Next to printing, this patch also adds aliases to parse 'mov z0.d, d1'.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47571
llvm-svn: 333872
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unpredicated copy of indexed SVE element to SVE vector,
along with MOV-aliases.
For example:
dup z0.h, z1.h[0]
duplicates the first 16-bit element from z1 to all elements in
the result vector z0.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47570
llvm-svn: 333871
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Predicated copy of floating-point immediate value to SVE vector,
along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: javed.absar
Differential Revision: https://reviews.llvm.org/D47518
llvm-svn: 333869
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Predicated copy of possibly shifted immediate value into SVE
vector, along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47517
llvm-svn: 333868
|
|
|
|
|
|
|
|
|
|
|
| |
Before we were relying on the any extend of the s1 to s32, but
for AAPCS we need to zero-extend it to at least s8.
Fixes PR36719
Differential Revision: https://reviews.llvm.org/D47425
llvm-svn: 333747
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unpredicated copy of floating-point immediate value into SVE vector,
along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47482
llvm-svn: 333744
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unpredicated copy of repeating immediate pattern to SVE vector, along
with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47328
llvm-svn: 333731
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of asserting when using the def_cfa directive with a register
different from fp, fallback on DWARF.
Easily triggered with:
.cfi_def_cfa x1, 32;
rdar://40249694
Differential Revision: https://reviews.llvm.org/D47593
llvm-svn: 333667
|
|
|
|
| |
llvm-svn: 333634
|
|
|
|
|
|
|
|
|
|
|
|
| |
LegalizerInfo::verify(...)
Reviewers: aemerson, qcolombet
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D46339
llvm-svn: 333618
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LegalizerInfo::verify(...) call w/o fixing bugs
This is to make it clear what kind of bugs the LegalizerInfo::verifier
is able to catch and test its output
Reviewers: aemerson, qcolombet
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D46338
llvm-svn: 333597
|
|
|
|
|
|
|
| |
The immediate on an ADRP MCInst needs to be multiplied by 0x1000 to obtain the
actual PC-offset that will be calculated.
llvm-svn: 333525
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Floating point immediate combining a negative sign and
a hexadecimal number, e.g. #-0x0 caused the compiler to crash.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: javed.absar
Differential Revision: https://reviews.llvm.org/D47483
llvm-svn: 333524
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As suggested in https://bugs.llvm.org/show_bug.cgi?id=32384#c1, this change
makes the inlining of `memset()` and `memcpy()` more aggressive when
compiling for speed. The tuning remains the same when optimizing for size.
Patch by: Sebastian Pop <s.pop@samsung.com>
Evandro Menezes <e.menezes@samsung.com>
Differential revision: https://reviews.llvm.org/D45098
llvm-svn: 333429
|
|
|
|
|
|
| |
This reverts commit r333410 due to bot failures.
llvm-svn: 333427
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: rengolin, huntergr, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47365
llvm-svn: 333422
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch addresses the following variants:
- bitmask immediate, e.g. 'and z0.d, z0.d, #0x6'.
- unpredicated data vectors, e.g. 'and z0.d, z1.d, z2.d'.
- predicated data vectors, e.g. 'and z0.d, p0/m, z0.d, z1.d'.
And also several aliases, such as:
- ORN, alias of ORR.
- EON, alias of EOR.
- BIC, alias of AND (immediate variant)
- MOV, alias of ORR (if unpredicated and source register operands are the same)
Reviewers: rengolin, huntergr, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47363
llvm-svn: 333414
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Change-Id: I0df845749c7689dfc99150ba7c19c7d0dadbd705
Reviewers: javed.absar, SjoerdMeijer
Reviewed By: SjoerdMeijer
Subscribers: llvm-commits, SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D46311
llvm-svn: 333410
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds addsub_imm8_opt_lsl_(i8|i16|i32|i64) operands
that are unsigned values in the range 0 to 255. For element widths of
16 bits or higher it may also be a signed multiple of 256 in the
range 0 to 65280.
Note: This also does some refactoring to reuse convenience function
getShiftedVal<shift>(), and now allows AArch64 scalar 'ADD #-4096' to be
accepted to be mapped to SUB #4096.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47310
llvm-svn: 333408
|
|
|
|
| |
llvm-svn: 333270
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unpredicated copy of optionally-shifted immediate to SVE vector,
along with MOV-aliases.
This patch contains parsing and printing support for
cpy_imm8_opt_lsl_(i8|i16|i32|i64). This operand allows a signed value in
the range -128 to +127. For element widths of 16 bits or higher it may
also be a signed multiple of 256 in the range -32768 to +32512.
For element-width of 8 bits a range of -128 to 255 is accepted, since a copy
of a byte can be considered either signed/unsigned.
Note: This patch renames tryParseAddSubImm() -> tryParseImmWithOptionalShift()
and moves the behaviour of trying to shift a plain immediate by an allowed
shift-value to its addImmWithOptionalShiftOperands() method, so that the
parsing itself is generic and allows immediates from multiple shifted operands.
This is done because an immediate can be divisible by both shifted operands.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47309
llvm-svn: 333263
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing code has three different ways to try to lower a 64-bit
immediate to the sequence ORR+MOVK. The result is messy: it misses
some possible sequences, and the order of the checks means we sometimes
emit two MOVKs when we only need one.
Instead, just use a simple loop to try all possible two-instruction
ORR+MOVK sequences.
Differential Revision: https://reviews.llvm.org/D47176
llvm-svn: 333218
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Optimize code generated for variable shifts/rotates by taking advantage
of the implicit and/mod done on the variable shift amount register.
Resolves bug 27582 and bug 37421.
Reviewers: t.p.northover, qcolombet, MatzeB, javed.absar
Subscribers: rengolin, kristof.beyls, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D46844
llvm-svn: 333214
|
|
|
|
|
|
|
|
| |
Use RegUnits to track register aliases in AArch64RedundantCopyElimination.
Differential Revision: https://reviews.llvm.org/D47269
llvm-svn: 333107
|
|
|
|
|
|
|
|
|
|
| |
The AArch64 asm parser currently has custom parsing logic for .hword, .word,
and .xword. Rather than use this custom logic, we can just use
addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue.
Differential Revision: https://reviews.llvm.org/D47000
llvm-svn: 333077
|
|
|
|
|
|
|
| |
(The assertion suppressed the unused variable warning on
Release+Asserts builds, so I didn't notice.)
llvm-svn: 333018
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we're outlining a sequence that ends in a call, we can save up to
three instructions in the outlined function by turning the call into
a tail-call. I refer to this as thunk outlining because the resulting
outlined function looks like a thunk; suggestions welcome for a better
name.
In addition to making the outlined function shorter, thunk outlining
allows outlining calls which would otherwise be illegal to outline:
we don't need to save/restore LR, so we don't need to prove anything
about the stack access patterns of the callee.
To make this work effectively, I also added
MachineOutlinerInstrType::LegalTerminator to the generic MachineOutliner
code; this allows treating an arbitrary instruction as a terminator in
the suffix tree.
Differential Revision: https://reviews.llvm.org/D47173
llvm-svn: 333015
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This **appears** to be the last missing piece for the masked merge pattern handling in the backend.
This is [[ https://bugs.llvm.org/show_bug.cgi?id=37104 | PR37104 ]].
[[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]] will introduce an IR canonicalization that is likely bad for the end assembly.
Previously, `andps`+`andnps` / `bsl` would be generated. (see `@out`)
Now, they would no longer be generated (see `@in`), and we need to make sure that they are generated.
Differential Revision: https://reviews.llvm.org/D46528
llvm-svn: 332904
|