| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for the following instructions:
CLS (Count Leading Sign bits)
CLZ (Count Leading Zeros)
CNT (Count non-zero bits)
CNOT (Logically invert boolean condition in vector)
NOT (Bitwise invert vector)
FABS (Floating-point absolute value)
FNEG (Floating-point negate)
All operations are predicated and unary, e.g.
clz z0.s, p0/m, z1.s
- CLS, CLZ, CNT, CNOT and NOT have variants for 8, 16, 32
and 64 bit elements.
- FABS and FNEG have variants for 16, 32 and 64 bit elements.
llvm-svn: 336677
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for the following instructions:
CNTB CNTH - Determine the number of active elements implied by
CNTW CNTD the named predicate constant, multiplied by an
immediate, e.g.
cnth x0, vl8, #16
CNTP - Count active predicate elements, e.g.
cntp x0, p0, p1.b
counts the number of active elements in p1, predicated
by p0, and stores the result in x0.
llvm-svn: 336552
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch completes support for shifts, which include:
- LSL - Logical Shift Left
- LSLR - Logical Shift Left, Reversed form
- LSR - Logical Shift Right
- LSRR - Logical Shift Right, Reversed form
- ASR - Arithmetic Shift Right
- ASRR - Arithmetic Shift Right, Reversed form
- ASRD - Arithmetic Shift Right for Divide
In the following variants:
- Predicated shift by immediate - ASR, LSL, LSR, ASRD
e.g.
asr z0.h, p0/m, z0.h, #1
(active lanes of z0 shifted by #1)
- Unpredicated shift by immediate - ASR, LSL*, LSR*
e.g.
asr z0.h, z1.h, #1
(all lanes of z1 shifted by #1, stored in z0)
- Predicated shift by vector - ASR, LSL*, LSR*
e.g.
asr z0.h, p0/m, z0.h, z1.h
(active lanes of z0 shifted by z1, stored in z0)
- Predicated shift by vector, reversed form - ASRR, LSLR, LSRR
e.g.
lslr z0.h, p0/m, z0.h, z1.h
(active lanes of z1 shifted by z0, stored in z0)
- Predicated shift left/right by wide vector - ASR, LSL, LSR
e.g.
lsl z0.h, p0/m, z0.h, z1.d
(active lanes of z0 shifted by wide elements of vector z1)
- Unpredicated shift left/right by wide vector - ASR, LSL, LSR
e.g.
lsl z0.h, z1.h, z2.d
(all lanes of z1 shifted by wide elements of z2, stored in z0)
*Variants added in previous patches.
llvm-svn: 336547
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Related to http://reviews.llvm.org/D15772
Depends on http://reviews.llvm.org/D16889
Adds [D]REM[U] instructions.
Patch By: Srdjan Obucina
Contributions from: Simon Dardis
Differential Revision: https://reviews.llvm.org/D17036
llvm-svn: 336545
|
|
|
|
|
|
|
|
|
|
|
| |
Support for SVE's TBL instruction for programmable table
lookup/permute using vector of element indices, e.g.
tbl z0.d, { z1.d }, z2.d
stores elements from z1, indexed by elements from z2, into z0.
llvm-svn: 336544
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Supporting various addressing modes:
- adr z0.s, [z0.s, z0.s]
- adr z0.s, [z0.s, z0.s, lsl #<shift>]
- adr z0.d, [z0.d, z0.d]
- adr z0.d, [z0.d, z0.d, lsl #<shift>]
- adr z0.d, [z0.d, z0.d, uxtw #<shift>]
- adr z0.d, [z0.d, z0.d, sxtw #<shift>]
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D48870
llvm-svn: 336533
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for:
UZP1 Concatenate even elements from two vectors
UZP2 Concatenate odd elements from two vectors
TRN1 Interleave even elements from two vectors
TRN2 Interleave odd elements from two vectors
With variants for both data and predicate vectors, e.g.
uzp1 z0.b, z1.b, z2.b
trn2 p0.s, p1.s, p2.s
llvm-svn: 336531
|
|
|
|
|
|
|
|
| |
This adds:
- outer shareable TLB Maintenance instructions, and
- TLB range maintenance instructions.
llvm-svn: 336434
|
|
|
|
|
|
| |
Now with the asm operand definition included.
llvm-svn: 336432
|
|
|
|
|
|
| |
It's causing build errors.
llvm-svn: 336422
|
|
|
|
|
|
|
|
| |
These instructions are added to AArch64 only.
Differential Revision: https://reviews.llvm.org/D48926
llvm-svn: 336421
|
|
|
|
|
|
|
|
| |
This adds the Armv8.4-A Trace synchronization barrier (TSB) instruction.
Differential Revision: https://reviews.llvm.org/D48918
llvm-svn: 336418
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If LOCK prefix is not the first prefix in an instruction, LLVM
disassembler silently drops the prefix.
The fix is to select a proper instruction with a builtin LOCK prefix if
one exists.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49001
llvm-svn: 336400
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a deficiency in TableGen that has been addressed in r336334.
[AArch64][SVE] Asm: Support for predicated FP rounding instructions.
This patch also adds instructions for predicated FP square-root and
reciprocal exponent.
The added instructions are:
- FRINTI Round to integral value (current FPCR rounding mode)
- FRINTX Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA Round to integral value (to nearest, with ties away from zero)
- FRINTN Round to integral value (to nearest, with ties to even)
- FRINTZ Round to integral value (toward zero)
- FRINTM Round to integral value (toward minus Infinity)
- FRINTP Round to integral value (toward plus Infinity)
- FSQRT Floating-point square root
- FRECPX Floating-point reciprocal exponent
llvm-svn: 336387
|
|
|
|
|
|
|
| |
in TableGen, for which there is already a patch in Phabricator
(D48937) that needs to be committed first.
llvm-svn: 336324
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch also adds instructions for predicated FP square-root and
reciprocal exponent.
The added instructions are:
- FRINTI Round to integral value (current FPCR rounding mode)
- FRINTX Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA Round to integral value (to nearest, with ties away from zero)
- FRINTN Round to integral value (to nearest, with ties to even)
- FRINTZ Round to integral value (toward zero)
- FRINTM Round to integral value (toward minus Infinity)
- FRINTP Round to integral value (toward plus Infinity)
- FSQRT Floating-point square root
- FRECPX Floating-point reciprocal exponent
llvm-svn: 336322
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the following varieties:
- Unpredicated signed max, e.g. smax z0.h, z1.h, #-128
- Unpredicated signed min, e.g. smin z0.h, z1.h, #-128
- Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255
- Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255
- Predicated signed max, e.g. smax z0.h, p0/m, z0.h, z1.h
- Predicated signed min, e.g. smin z0.h, p0/m, z0.h, z1.h
- Predicated signed abd, e.g. sabd z0.h, p0/m, z0.h, z1.h
- Predicated unsigned max, e.g. umax z0.h, p0/m, z0.h, z1.h
- Predicated unsigned min, e.g. umin z0.h, p0/m, z0.h, z1.h
- Predicated unsigned abd, e.g. uabd z0.h, p0/m, z0.h, z1.h
llvm-svn: 336317
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support for negative immediates was implemented in
https://reviews.llvm.org/rL298380, however few instruction options were missing.
This change adds negative immediates support and respective tests
for the following:
ADD
ADDS
ADDS.W
AND.W
ANDS
BIC.W
BICS
BICS.W
SUB
SUBS
SUBS.W
Differential Revision: https://reviews.llvm.org/D48649
llvm-svn: 336286
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds both a vector and an immediate form, e.g.
- Vector form:
subr z0.h, p0/m, z0.h, z1.h
subtract active elements of z0 from z1, and store the result in z0.
- Immediate form:
subr z0.h, z0.h, #255
subtract elements of z0, and store the result in z0.
llvm-svn: 336274
|
|
|
|
| |
llvm-svn: 336268
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Includes instructions to read the First-Faulting Register (FFR):
- RDFFR (unpredicated)
rdffr p0.b
- RDFFR (predicated)
rdffr p0.b, p0/z
- RDFFRS (predicated, sets condition flags)
rdffr p0.b, p0/z
Includes instructions to set/write the FFR:
- SETFFR (no arguments, sets the FFR to all true)
setffr
- WRFFR (unpredicated)
wrffr p0.b
llvm-svn: 336267
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variants added are:
- fcvt (FP convert precision)
- scvtf (signed int -> FP)
- ucvtf (unsigned int -> FP)
- fcvtzs (FP -> signed int (round to zero))
- fcvtzu (FP -> unsigned int (round to zero))
For example:
fcvt z0.h, p0/m, z0.s (single- to half-precision FP)
scvtf z0.h, p0/m, z0.s (32-bit int to half-precision FP)
ucvtf z0.h, p0/m, z0.s (32-bit unsigned int to half-precision FP)
fcvtzs z0.s, p0/m, z0.h (half-precision FP to 32-bit int)
fcvtzu z0.s, p0/m, z0.h (half-precision FP to 32-bit unsigned int)
llvm-svn: 336265
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SVE overloads the AArch64 PSTATE condition flags and introduces
a set of condition code aliases for the assembler. The
details are described in section 2.2 of the architecture
reference manual supplement for SVE.
In short:
SVE alias => AArch64 name
--------------------------
NONE => EQ
ANY => NE
NLAST => HS
LAST => LO
FIRST => MI
NFRST => PL
PMORE => HI
PLAST => LS
TCONT => GE
TSTOP => LT
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48869
llvm-svn: 336245
|
|
|
|
|
|
|
|
| |
This might make the error message added in r335668 unneeded, but I'm not sure yet.
The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit.
llvm-svn: 336217
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variants added in this patch are:
- Predicated Complex floating point ADD with rotate, e.g.
fcadd z0.h, p0/m, z0.h, z1.h, #90
- Predicated Complex floating point MLA with rotate, e.g.
fcmla z0.h, p0/m, z1.h, z2.h, #180
- Unpredicated Complex floating point MLA with rotate (indexed operand), e.g.
fcmla z0.h, p0/m, z1.h, z2.h[0], #180
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48824
llvm-svn: 336210
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unpredicated FP-multiply of SVE vector with a vector-element given by
vector[index], for example:
fmul z0.s, z1.s, z2.s[0]
which performs an unpredicated FP-multiply of all 32-bit elements in
'z1' with the first element from 'z2'.
This patch adds restricted register classes for SVE vectors:
ZPR_3b (only z0..z7 are allowed) - for indexed vector of 16/32-bit elements.
ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48823
llvm-svn: 336205
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch includes support for the following instructions:
ABS z0.h, p0/m, z0.h
NEG z0.h, p0/m, z0.h
(S|U)XTB z0.h, p0/m, z0.h
(S|U)XTB z0.s, p0/m, z0.s
(S|U)XTB z0.d, p0/m, z0.d
(S|U)XTH z0.s, p0/m, z0.s
(S|U)XTH z0.d, p0/m, z0.d
(S|U)XTW z0.d, p0/m, z0.d
llvm-svn: 336204
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the following system registers:
- RAS registers,
- MPAM registers,
- Activitiy monitor registers,
- Trace Extension registers,
- Timing insensitivity of data processing instructions,
- Enhanced Support for Nested Virtualization.
Differential Revision: https://reviews.llvm.org/D48871
llvm-svn: 336193
|
|
|
|
|
|
|
|
|
|
| |
The variants added are:
signed Saturating ADD/SUB (immediate) e.g. sqadd z0.h, z0.h, #42
unsigned Saturating ADD/SUB (immediate) e.g. uqadd z0.h, z0.h, #42
signed Saturating ADD/SUB (vectors) e.g. sqadd z0.h, z0.h, z1.h
unsigned Saturating ADD/SUB (vectors) e.g. uqadd z0.h, z0.h, z1.h
llvm-svn: 336186
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Contains the following variants:
- Compare with (elements from) other vector
instructions: fcmeq, fcmgt, fcmge, fcmne, fcmuo.
aliases: fcmle, fcmlt.
e.g. fcmle p0.h, p0/z, z0.h, z1.h => fcmge p0.h, p0/z, z1.h, z0.h
- Compare absolute values with (absolute values from) other vector.
instructions: facge, facgt.
aliases: facle, faclt.
e.g. facle p0.h, p0/z, z0.h, z1.h => facge p0.h, p0/z, z1.h, z0.h
- Compare vector elements with #0.0
instructions: fcmeq, fcmgt, fcmge, fcmle, fcmlt, fcmne.
e.g. fcmle p0.h, p0/z, z0.h, #0.0
llvm-svn: 336182
|
|
|
|
|
|
| |
Similar to rLLD336129
llvm-svn: 336131
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On darwin, all virtual sections have zerofill type, and having a
.zerofill directive in a non-virtual section is not allowed. Instead of
asserting, show a nicer error.
In order to use the equivalent of .zerofill in a non-virtual section,
the usage of .zero of .space is required.
This patch replaces the assert with an error.
Differential Revision: https://reviews.llvm.org/D48517
llvm-svn: 336127
|
|
|
|
|
|
|
|
| |
This was my mistake for only running test/MC/X86 and test/CodeGen/X86.
Arguably .word should be removed from this test, as it is not supported
universally.
llvm-svn: 336107
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Increments/decrements the result with the number of active bits
from the predicate.
The inc/dec variants added are:
- incp x0, p0.h (scalar)
- incp z0.h, p0 (vector)
The unsigned saturating inc/dec variants added are:
- uqincp x0, p0.h (scalar)
- uqincp w0, p0.h (scalar, 32bit)
- uqincp z0.h, p0 (vector)
The signed saturating inc/dec variants added are:
- sqincp x0, p0.h (scalar)
- sqincp x0, p0.h, w0 (scalar, 32bit)
- sqincp z0.h, p0 (vector)
llvm-svn: 336091
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Increment/decrement vector by multiple of predicate constraint
element count.
The variants added by this patch are:
- INCH, INCW, INC
and (saturating):
- SQINCH, SQINCW, SQINCD
- UQINCH, UQINCW, UQINCW
- SQDECH, SQINCW, SQINCD
- UQDECH, UQINCW, UQINCW
For example:
incw z0.s, all, mul #4
llvm-svn: 336090
|
|
|
|
|
|
|
|
| |
Compare vector elements with a signed/unsigned immediate, e.g.
cmpgt p0.s, p0/z, z0.s, #-16
cmphi p0.s, p0/z, z0.s, #127
llvm-svn: 336081
|
|
|
|
|
|
|
|
|
| |
These patches were previously reverted as they led to
buildbot time-outs caused by large switch statement in
printAliasInstr when using UBSan and O3. The issue has
been addressed with a workaround (r335525).
llvm-svn: 336079
|
|
|
|
|
|
|
|
|
|
| |
section flags in the ELF assembler. This matches the defaults
given in the rest of MC.
Fixes PR37997 where we couldn't assemble our own assembly output
without warnings.
llvm-svn: 336072
|
|
|
|
|
|
|
|
| |
This adds the Secure EL2 extension.
Differential Revision: https://reviews.llvm.org/D48711
llvm-svn: 335962
|
|
|
|
|
|
|
|
|
|
| |
Mostly just adding checks for Thumb2 instructions which correspond to
ARM instructions which already had diagnostics. While I'm here, also fix
ARM-mode strd to check the input registers correctly.
Differential Revision: https://reviews.llvm.org/D48610
llvm-svn: 335909
|
|
|
|
|
|
|
|
| |
Frequency Info."
This reverts commits r335794 and r335797. Breaks ThinLTO+FDO selfhost.
llvm-svn: 335851
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
=== Generating the CG Profile ===
The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.
After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
```
!llvm.module.flags = !{!0}
!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
```
Differential Revision: https://reviews.llvm.org/D48105
llvm-svn: 335794
|
|
|
|
|
|
|
|
|
|
| |
SIB byte is present, but doesn't encode an index register and there was another shorter encoding that would achieve the same result.
The %eiz/%riz are dummy registers that force the encoder to emit a SIB byte when it normally wouldn't. By emitting them in the disassembly output we ensure that assembling the disassembler output would also produce a SIB byte.
This should match the behavior of objdump from binutils.
llvm-svn: 335768
|
|
|
|
| |
llvm-svn: 335635
|
|
|
|
|
|
|
|
|
| |
When the condition code for an IT instruction is "AL" we get strange "15"
predicates on subsequent instructions. These are dealt with for most
instructions by treating them as "ARMCC::AL", but VFP takes a different path
which didn't have this code.
llvm-svn: 335594
|
|
|
|
|
|
|
|
|
|
|
| |
IT instructions are allowed to have the 'AL' predicate, but it must never
result in an 'NV' predicated instruction. Essentially this means that all
branches must be 't' rather than 'e' if the predicate is 'AL'.
This patch adds a diagnostic for this during assembly (error because parsing
hits an assertion if allowed to continue) and an annotation during disassembly.
llvm-svn: 335593
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move expected-fail cases from directive-cpu.s to
directive-cpu-err.s. This allows us to remove the 'not' from the
llvm-mc invocation in directive-cpu.s so that this test will fail
in unexpected error cases. It also means that we are not relying
on all stderr coming before any stdout, which seems fragile.
Also make use of CHECK-NEXT to ensure that multiline error messages
really are occuring together.
And add a test to verify that .cpu with an arch version as extension
is rejected.
Differential Revision: https://reviews.llvm.org/D47873
llvm-svn: 335586
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These were specifying an architecture version with .cpu directive,
which is invalid. As the error for this case outputs the problem
instruction we were still matching the expectations of FileCheck.
This patch fixes up the LSE tests to do what they seem to intend. A
follow-up patch will tighten up the directive tests.
Differential Revision: https://reviews.llvm.org/D47872
llvm-svn: 335585
|
|
|
|
|
|
| |
with well defined semantics like .rodata.
llvm-svn: 335558
|
|
|
|
|
|
|
|
| |
Encoding for the wait instruction was wrong. Fix according to ISA 3.0.
Differential Revision: https://reviews.llvm.org/D48550
llvm-svn: 335514
|