| Commit message (Collapse) | Author | Age | Files | Lines |
| |
| |
This patch adds the following instructions:
RBIT reverse bits within each active element (predicated), e.g.
rbit z0.d, p0/m, z1.d
for 8, 16, 32 and 64 bit elements.
REV reverse order of elements in data/predicate vector
(unpredicated), e.g.
rev z0.d, z1.d
rev p0.d, p1.d
for 8, 16, 32 and 64 bit elements.
REVB reverse order of bytes within each active element, e.g.
revb z0.d, p0/m, z1.d
for 16, 32 and 64 bit elements.
REVH reverse order of 16-bit half-words within each active
element, e.g.
revh z0.d, p0/m, z1.d
for 32 and 64 bit elements.
REVW reverse order of 32-bit words within each active element,
e.g.
revw z0.d, p0/m, z1.d
for 64 bit elements.
llvm-svn: 337534
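The per-element effect of RBIT, REVB, REVH and REVW can be sketched in Python as follows. This is only an illustrative model, not LLVM or architecture code: the helper names `rbit`, `rev_chunks` and `predicated` are hypothetical, vectors are modelled as plain lists of integers, and `/m` is assumed to mean that inactive lanes keep the old destination value.

```python
def rbit(value: int, bits: int) -> int:
    """Reverse the bit order of one `bits`-wide element."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def rev_chunks(value: int, bits: int, chunk: int) -> int:
    """Reverse the order of `chunk`-bit groups inside one `bits`-wide element.

    chunk=8 models REVB, chunk=16 models REVH, chunk=32 models REVW.
    """
    mask = (1 << chunk) - 1
    result = 0
    for _ in range(bits // chunk):
        result = (result << chunk) | (value & mask)
        value >>= chunk
    return result


def predicated(op, pred, src, dst):
    """Merging predication (assumed): inactive lanes keep the value from dst."""
    return [op(s) if p else d for p, s, d in zip(pred, src, dst)]
```

For example, `rev_chunks(0x11223344, 32, 8)` swaps the four bytes of a 32-bit element, while `rev_chunks(0x11223344, 32, 16)` swaps its two half-words.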
|
| |
| |
Floating-point trigonometric multiply-add coefficient,
e.g.
ftmad z0.h, z0.h, z1.h, #7
with variants for 16, 32 and 64-bit elements.
llvm-svn: 337533
|
| |
| |
This patch adds support for the following unpredicated
floating-point instructions:
FADD Floating point add
FSUB Floating point subtract
FMUL Floating point multiplication
FTSMUL Floating point trigonometric starting value
FRECPS Floating point reciprocal step
FRSQRTS Floating point reciprocal square root step
The instructions have the following assembly format:
fadd z0.h, z1.h, z2.h
and have variants for 16, 32 and 64-bit FP elements.
llvm-svn: 337383
|
| |
| |
The signed/unsigned DOT instructions perform a dot-product on
quad-tuplets from two source vectors and accumulate the result in
the destination register. The instructions come in two forms:
Vector form, e.g.
sdot z0.s, z1.b, z2.b - signed dot product on four 8-bit quad-tuplets,
accumulating results in 32-bit elements.
udot z0.d, z1.h, z2.h - unsigned dot product on four 16-bit quad-tuplets,
accumulating results in 64-bit elements.
Indexed form, e.g.
sdot z0.s, z1.b, z2.b[3] - signed dot product on four 8-bit quad-tuplets
with specified quad-tuplet from second
source vector, accumulating results in 32-bit
elements.
udot z0.d, z1.h, z2.h[1] - unsigned dot product on four 16-bit quad-tuplets
with specified quad-tuplet from second
source vector, accumulating results in 64-bit
elements.
llvm-svn: 337372
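A rough Python model of the two SDOT forms may help to picture the data flow. It is a sketch under assumptions: the function names are hypothetical, source vectors are lists of already sign-interpreted 8-bit values with four values per 32-bit lane, and the indexed form is modelled for a single 128-bit segment only (four 32-bit lanes).

```python
def to_signed(value: int, bits: int) -> int:
    """Wrap an integer to a signed two's-complement value of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sdot(acc, zn, zm):
    """Vector form: each 32-bit lane accumulates the dot product of its quad-tuplets."""
    out = []
    for lane, a in enumerate(acc):
        quad_n = zn[4 * lane:4 * lane + 4]
        quad_m = zm[4 * lane:4 * lane + 4]
        total = a + sum(x * y for x, y in zip(quad_n, quad_m))
        out.append(to_signed(total, 32))
    return out


def sdot_indexed(acc, zn, zm, index):
    """Indexed form (one 128-bit segment assumed): every lane uses quad-tuplet `index` of zm."""
    quad_m = zm[4 * index:4 * index + 4]
    return sdot(acc, zn, quad_m * len(acc))
```

The unsigned UDOT variants would follow the same shape with unsigned inputs and unsigned wrapping.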
|
| |
| |
This patch adds the following predicated instructions:
UDIV Unsigned divide active elements
UDIVR Unsigned divide active elements, reverse form.
SDIV Signed divide active elements
SDIVR Signed divide active elements, reverse form.
e.g.
udiv z0.s, p0/m, z0.s, z1.s
(unsigned divide active elements in z0 by z1, store result in z0)
sdivr z0.s, p0/m, z0.s, z1.s
(signed divide active elements in z1 by z0, store result in z0)
llvm-svn: 337369
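The difference between the normal and the reverse form can be sketched in Python. This is an illustrative model only: the helper names are hypothetical, the result for division by zero (0) reflects my understanding of AArch64 integer division rather than anything stated in the commit, and the overflow case of the most negative value divided by -1 is not modelled.

```python
def sdiv_lane(a: int, b: int) -> int:
    """Signed divide of one lane, truncating toward zero; x / 0 is assumed to give 0."""
    if b == 0:
        return 0
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q


def predicated_sdiv(pred, zdn, zm, reverse=False):
    """Model of 'sdiv zdn, p/m, zdn, zm' (or sdivr when reverse=True).

    Inactive lanes keep the old value of zdn (merging predication).
    """
    out = []
    for p, x, y in zip(pred, zdn, zm):
        if not p:
            out.append(x)
        elif reverse:
            out.append(sdiv_lane(y, x))  # reverse form: zm / zdn
        else:
            out.append(sdiv_lane(x, y))  # normal form: zdn / zm
    return out
```

The reverse form exists because the destination is tied to the first source, so swapping the operands would otherwise need an extra move.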
|
| |
| |
This patch adds the following instructions:
MUL - multiply vectors, e.g.
mul z0.h, p0/m, z0.h, z1.h
- multiply with immediate, e.g.
mul z0.h, z0.h, #127
SMULH - signed multiply returning high half, e.g.
smulh z0.h, p0/m, z0.h, z1.h
UMULH - unsigned multiply returning high half, e.g.
umulh z0.h, p0/m, z0.h, z1.h
llvm-svn: 337358
|
| |
| |
This patch adds support for the following instructions:
MLA mul-add, writing addend (Zda = Zda + Zn * Zm)
MLS mul-sub, writing addend (Zda = Zda + -Zn * Zm)
MAD mul-add, writing multiplicand (Zdn = Za + Zdn * Zm)
MSB mul-sub, writing multiplicand (Zdn = Za + -Zdn * Zm)
llvm-svn: 337293
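The four formulas differ only in which operand is overwritten and in the sign of the product. A small Python sketch, assuming 16-bit elements by default, unsigned wrap-around on overflow, and ignoring predication (all function names are hypothetical):

```python
def wrap(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, as a fixed-width element would."""
    return value & ((1 << bits) - 1)


def mla(zda, zn, zm, bits=16):
    """Zda = Zda + Zn * Zm (result replaces the addend)."""
    return [wrap(a + n * m, bits) for a, n, m in zip(zda, zn, zm)]


def mls(zda, zn, zm, bits=16):
    """Zda = Zda - Zn * Zm (result replaces the addend)."""
    return [wrap(a - n * m, bits) for a, n, m in zip(zda, zn, zm)]


def mad(zdn, zm, za, bits=16):
    """Zdn = Za + Zdn * Zm (result replaces the multiplicand)."""
    return [wrap(a + d * m, bits) for d, m, a in zip(zdn, zm, za)]


def msb(zdn, zm, za, bits=16):
    """Zdn = Za - Zdn * Zm (result replaces the multiplicand)."""
    return [wrap(a - d * m, bits) for d, m, a in zip(zdn, zm, za)]
```

With the same three input values, `mla` and `mad` compute the same number; the choice between them only decides which input register is overwritten.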
|
| |
| |
This patch adds support for the following instructions:
FMLA mul-add, writing addend (Zda = Zda + Zn * Zm)
FNMLA negated mul-add, writing addend (Zda = -Zda + -Zn * Zm)
FMLS mul-sub, writing addend (Zda = Zda + -Zn * Zm)
FNMLS negated mul-sub, writing addend (Zda = -Zda + Zn * Zm)
FMAD mul-add, writing multiplicand (Zdn = Za + Zdn * Zm)
FNMAD negated mul-add, writing multiplicand (Zdn = -Za + -Zdn * Zm)
FMSB mul-sub, writing multiplicand (Zdn = Za + -Zdn * Zm)
FNMSB negated mul-sub, writing multiplicand (Zdn = -Za + Zdn * Zm)
llvm-svn: 337282
|
| |
| |
This patch completes support for the following floating point
instructions that take FP immediates:
FADD* (addition)
FSUB (subtract)
FSUBR (subtract reverse form)
FMUL* (multiplication)
FMAX* (maximum)
FMAXNM (maximum number)
FMIN (minimum)
FMINNM (minimum number)
All operations are predicated and take a FP immediate operand,
e.g.
fadd z0.h, p0/m, z0.h, #0.5
fmin z0.s, p0/m, z0.s, #1.0
^___________^ (tied)
* Instructions added in a previous patch.
llvm-svn: 337272
|
| |
| |
This patch adds support for the following floating point
instructions:
FABD (absolute difference)
FADD (addition)
FSUB (subtract)
FSUBR (subtract reverse form)
FDIV (divide)
FDIVR (divide reverse form)
FMAX (maximum)
FMAXNM (maximum number)
FMIN (minimum)
FMINNM (minimum number)
FSCALE (adjust exponent)
FMULX (multiply extended)
All operations are predicated and binary form, e.g.
fadd z0.h, p0/m, z0.h, z1.h
^___________^ (tied)
Supporting 16, 32 and 64-bit FP elements.
llvm-svn: 337259
|
| |
| |
The SPLICE instruction splices two vectors into one vector using a
predicate. It copies the active elements from the first vector, and
then fills the remaining elements with the low-numbered elements from
the second vector.
The instruction has the following form, e.g.
splice z0.b, p0, z0.b, z1.b
for 8-bit elements. It also supports 16, 32 and
64-bit elements.
llvm-svn: 337253
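A minimal Python sketch of the splice behaviour described above, with vectors as lists and the predicate as a list of booleans. The function name is hypothetical. One assumption goes beyond the commit text: the copied part is taken as the segment from the first active to the last active element, which is my reading of the architecture description; for a contiguous predicate this is the same as "the active elements".

```python
def splice(pred, zdn, zm):
    """Model of 'splice zdn, pred, zdn, zm' on list-based vectors."""
    active = [i for i, p in enumerate(pred) if p]
    # Assumed: copy the first-active..last-active segment of the first vector.
    head = zdn[active[0]:active[-1] + 1] if active else []
    # Fill the remaining lanes with the low-numbered elements of the second vector.
    return (head + zm)[:len(zdn)]
```

For instance, `splice([False, True, True, False], [10, 11, 12, 13], [20, 21, 22, 23])` gives `[11, 12, 20, 21]`.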
|
| |
| |
This patch adds an instruction that allows extracting
a vector from a pair of vectors, given an immediate index
that describes the element position to extract from.
The instruction has the following assembly:
ext z0.b, z0.b, z1.b, #imm
where #imm is an immediate between 0 and 255.
llvm-svn: 337251
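The extraction can be pictured as taking a vector-sized window out of the concatenation of the two sources. The Python sketch below uses a hypothetical `ext` helper on lists of bytes and assumes the immediate is smaller than the vector length in bytes; larger immediates are deliberately not modelled.

```python
def ext(zdn, zm, imm):
    """Model of 'ext zdn.b, zdn.b, zm.b, #imm' on lists of bytes."""
    vl = len(zdn)
    # Assumption of this sketch: the index stays inside the first vector.
    assert 0 <= imm < vl, "sketch only models imm < vector length in bytes"
    # Window of vl bytes starting at byte position imm of the pair (zdn, zm).
    return (zdn + zm)[imm:imm + vl]
```

With `imm = 0` the result is simply the first vector; each increment shifts one more byte in from the second vector.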
|
| |
| |
Summary:
[[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]]
As discussed in https://reviews.llvm.org/D49179#1158957 and later,
the IR for 'check for [no] signed truncation' pattern can be improved:
https://rise4fun.com/Alive/gBf
^ that pattern will be produced by Implicit Integer Truncation sanitizer,
https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530
in signed case, therefore it is probably a good idea to improve it.
But the IR-optimal pattern does not lower efficiently, so we want to undo it.
This handles the simple pattern.
There is a second pattern with predicate and constants inverted.
NOTE: we do not check uses here; we always do the transform.
Reviewers: spatel, craig.topper, RKSimon, javed.absar
Reviewed By: spatel
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D49266
llvm-svn: 337166
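The two equivalent shapes of a "does this value survive signed truncation" check can be compared in Python. This is a sketch under assumptions: the exact patterns are my reading of the linked review (an add-and-unsigned-compare form versus a sign-extend-and-compare form), the function names are hypothetical, and values are modelled as unsigned bit patterns of a 16-bit register truncated to 8 bits.

```python
def check_folded(x: int, bits: int = 16, kept: int = 8) -> bool:
    """IR-style form (assumed): (x + 2**(kept-1)) u< 2**kept, computed modulo 2**bits."""
    mask = (1 << bits) - 1
    return ((x + (1 << (kept - 1))) & mask) < (1 << kept)


def check_unfolded(x: int, bits: int = 16, kept: int = 8) -> bool:
    """Lowering-friendly form (assumed): sign-extend the low `kept` bits and compare with x."""
    mask = (1 << bits) - 1
    low = x & ((1 << kept) - 1)
    sext = low - (1 << kept) if low >> (kept - 1) else low
    return (sext & mask) == (x & mask)
```

Both functions return the same answer for every 16-bit input, which is the property that makes it legal to rewrite one form into the other.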
|
| |
| |
Follow up of rL336913: fix base class description. Thanks to Ahmed Bougacha
for pointing this out.
Differential Revision: https://reviews.llvm.org/D49284
llvm-svn: 337009
|
| |
| |
This patch adds support for AArch64 to cfi-verify.
This required three changes to cfi-verify. First, it generalizes checking if an instruction is a trap by adding a new isTrap flag to TableGen (and defining it for x86 and AArch64). Second, the code that ensures that the operand register is not clobbered between the CFI check and the indirect call needs to allow a single dereference (in x86 this happens as part of the jump instruction). Third, we needed to ensure that return instructions are not counted as indirect branches. Technically, returns are indirect branches and can be covered by CFI, but LLVM's forward-edge CFI does not protect them, and x86 does not consider them, so we keep that behavior.
In addition, we had to improve AArch64's code to evaluate the branch target of a MCInst to handle calls where the destination is not the first operand (which it often is not).
Differential Revision: https://reviews.llvm.org/D48836
llvm-svn: 337007
|
| |
| |
This patch adds support for the following unpack instructions:
- PUNPKLO, PUNPKHI Unpack elements from low/high half and
place into elements of twice their size.
e.g. punpklo p0.h, p0.b
- UUNPKLO, UUNPKHI Unpack elements from low/high half and
SUNPKLO, SUNPKHI place into elements of twice their size
after zero- or sign-extending the values.
e.g. uunpklo z0.h, z0.b
llvm-svn: 336982
|
| |
| |
Insert general purpose register into shifted vector, e.g.
insr z0.s, w0
insr z0.d, x0
Insert SIMD&FP scalar register into shifted vector, e.g.
insr z0.b, b0
insr z0.h, h0
insr z0.s, s0
insr z0.d, d0
llvm-svn: 336979
|
| |
| |
These instructions are added to AArch64 only.
llvm-svn: 336913
|
| |
| |
The compact instruction shuffles active elements of vector
into lowest numbered elements and sets remaining elements
to zero.
e.g.
compact z0.s, p0, z1.s
llvm-svn: 336789
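As a small illustration of the description above, here is a Python sketch with a hypothetical `compact` helper that models the vector as a list and the predicate as a list of booleans:

```python
def compact(pred, zn):
    """Move active elements to the lowest-numbered lanes and zero the rest."""
    kept = [value for p, value in zip(pred, zn) if p]
    return kept + [0] * (len(zn) - len(kept))
```

For example, `compact([False, True, False, True], [10, 11, 12, 13])` yields `[11, 13, 0, 0]`.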
|
| |
| |
The LASTB and LASTA instructions extract the last active element,
or element after the last active, from the source vector.
The added variants are:
Scalar:
last(a|b) w0, p0, z0.b
last(a|b) w0, p0, z0.h
last(a|b) w0, p0, z0.s
last(a|b) x0, p0, z0.d
SIMD & FP Scalar:
last(a|b) b0, p0, z0.b
last(a|b) h0, p0, z0.h
last(a|b) s0, p0, z0.s
last(a|b) d0, p0, z0.d
The CLASTB and CLASTA instructions conditionally extract the last active
element, or the element after the last active element, from the source vector.
The added variants are:
Scalar:
clast(a|b) w0, p0, w0, z0.b
clast(a|b) w0, p0, w0, z0.h
clast(a|b) w0, p0, w0, z0.s
clast(a|b) x0, p0, x0, z0.d
SIMD & FP Scalar:
clast(a|b) b0, p0, b0, z0.b
clast(a|b) h0, p0, h0, z0.h
clast(a|b) s0, p0, s0, z0.s
clast(a|b) d0, p0, d0, z0.d
Vector:
clast(a|b) z0.b, p0, z0.b, z1.b
clast(a|b) z0.h, p0, z0.h, z1.h
clast(a|b) z0.s, p0, z0.s, z1.s
clast(a|b) z0.d, p0, z0.d, z1.d
Please refer to the architecture specification for more details on
the semantics of the added instructions.
llvm-svn: 336783
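The scalar variants can be modelled roughly as below. This is a sketch, not a statement of the architecture: the function names are hypothetical, and the corner cases (no active element, or the last active element being the final lane) follow my recollection of the architecture pseudocode, so they should be checked against the specification the commit refers to.

```python
def last_active(pred):
    """Index of the highest-numbered active lane, or -1 if none is active."""
    for i in range(len(pred) - 1, -1, -1):
        if pred[i]:
            return i
    return -1


def lastb(pred, zn):
    """Last active element; assumed to fall back to the final lane if none is active."""
    i = last_active(pred)
    return zn[i if i >= 0 else len(zn) - 1]


def lasta(pred, zn):
    """Element after the last active one; assumed to wrap around to lane 0."""
    i = last_active(pred) + 1
    return zn[i if i < len(zn) else 0]


def clastb(pred, fallback, zn):
    """Conditional form: keep `fallback` (the tied scalar) when no lane is active."""
    i = last_active(pred)
    return zn[i] if i >= 0 else fallback


def clasta(pred, fallback, zn):
    """Conditional form of lasta, with the same assumed wrap-around."""
    i = last_active(pred)
    return fallback if i < 0 else zn[(i + 1) % len(zn)]
```

The conditional forms differ from the plain ones only in that the destination's old value survives when the predicate has no active element.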
|
| |
| |
This patch adds support for the following instructions:
CLS (Count Leading Sign bits)
CLZ (Count Leading Zeros)
CNT (Count non-zero bits)
CNOT (Logically invert boolean condition in vector)
NOT (Bitwise invert vector)
FABS (Floating-point absolute value)
FNEG (Floating-point negate)
All operations are predicated and unary, e.g.
clz z0.s, p0/m, z1.s
- CLS, CLZ, CNT, CNOT and NOT have variants for 8, 16, 32
and 64 bit elements.
- FABS and FNEG have variants for 16, 32 and 64 bit elements.
llvm-svn: 336677
|
| |
| |
This patch adds support for the following instructions:
CNTB CNTH - Determine the number of active elements implied by
CNTW CNTD the named predicate constant, multiplied by an
immediate, e.g.
cnth x0, vl8, #16
CNTP - Count active predicate elements, e.g.
cntp x0, p0, p1.b
counts the number of active elements in p1, predicated
by p0, and stores the result in x0.
llvm-svn: 336552
|
| |
| |
This patch completes support for shifts, which include:
- LSL - Logical Shift Left
- LSLR - Logical Shift Left, Reversed form
- LSR - Logical Shift Right
- LSRR - Logical Shift Right, Reversed form
- ASR - Arithmetic Shift Right
- ASRR - Arithmetic Shift Right, Reversed form
- ASRD - Arithmetic Shift Right for Divide
In the following variants:
- Predicated shift by immediate - ASR, LSL, LSR, ASRD
e.g.
asr z0.h, p0/m, z0.h, #1
(active lanes of z0 shifted by #1)
- Unpredicated shift by immediate - ASR, LSL*, LSR*
e.g.
asr z0.h, z1.h, #1
(all lanes of z1 shifted by #1, stored in z0)
- Predicated shift by vector - ASR, LSL*, LSR*
e.g.
asr z0.h, p0/m, z0.h, z1.h
(active lanes of z0 shifted by z1, stored in z0)
- Predicated shift by vector, reversed form - ASRR, LSLR, LSRR
e.g.
lslr z0.h, p0/m, z0.h, z1.h
(active lanes of z1 shifted by z0, stored in z0)
- Predicated shift left/right by wide vector - ASR, LSL, LSR
e.g.
lsl z0.h, p0/m, z0.h, z1.d
(active lanes of z0 shifted by wide elements of vector z1)
- Unpredicated shift left/right by wide vector - ASR, LSL, LSR
e.g.
lsl z0.h, z1.h, z2.d
(all lanes of z1 shifted by wide elements of z2, stored in z0)
*Variants added in previous patches.
llvm-svn: 336547
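Two points in the list above are easy to misread: what "for Divide" means in ASRD, and which operand is shifted in the reversed forms. The Python sketch below uses hypothetical helper names, models lanes as Python integers (signed for ASR/ASRD, 16-bit unsigned for LSL), and assumes that ASRD rounds toward zero like a signed division by a power of two.

```python
def asr(x: int, n: int) -> int:
    """Arithmetic shift right: rounds toward minus infinity."""
    return x >> n


def asrd(x: int, n: int) -> int:
    """Arithmetic shift right for divide: assumed to round toward zero, like x / 2**n."""
    bias = (1 << n) - 1 if x < 0 else 0
    return (x + bias) >> n


def lsl(x: int, n: int, bits: int = 16) -> int:
    """Logical shift left within a `bits`-wide lane."""
    return (x << n) & ((1 << bits) - 1)


def predicated_shift(op, pred, zdn, zm, reverse=False):
    """Model of 'op zdn, p/m, zdn, zm'; reverse=True swaps value and shift amount."""
    out = []
    for p, d, m in zip(pred, zdn, zm):
        if not p:
            out.append(d)  # inactive lane keeps the old destination value
        else:
            out.append(op(m, d) if reverse else op(d, m))
    return out
```

For instance, `asr(-7, 1)` gives -4 whereas `asrd(-7, 1)` gives -3, matching a C-style `-7 / 2`.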
|
| |
| |
Support for SVE's TBL instruction for programmable table
lookup/permute using vector of element indices, e.g.
tbl z0.d, { z1.d }, z2.d
stores elements from z1, indexed by elements from z2, into z0.
llvm-svn: 336544
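A short Python sketch of the lookup, using a hypothetical `tbl` helper on lists. That an out-of-range index produces zero is my understanding of the architecture and is not stated in the commit message.

```python
def tbl(table, indices):
    """Model of 'tbl zd, { zn }, zm': zd[i] = zn[zm[i]], with 0 for out-of-range indices."""
    size = len(table)
    return [table[i] if 0 <= i < size else 0 for i in indices]
```

For example, `tbl([10, 11, 12, 13], [3, 3, 0, 7])` returns `[13, 13, 10, 0]`.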
|
| |
| |
Supporting various addressing modes:
- adr z0.s, [z0.s, z0.s]
- adr z0.s, [z0.s, z0.s, lsl #<shift>]
- adr z0.d, [z0.d, z0.d]
- adr z0.d, [z0.d, z0.d, lsl #<shift>]
- adr z0.d, [z0.d, z0.d, uxtw #<shift>]
- adr z0.d, [z0.d, z0.d, sxtw #<shift>]
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D48870
llvm-svn: 336533
|
| |
| |
This patch adds support for:
UZP1 Concatenate even elements from two vectors
UZP2 Concatenate odd elements from two vectors
TRN1 Interleave even elements from two vectors
TRN2 Interleave odd elements from two vectors
With variants for both data and predicate vectors, e.g.
uzp1 z0.b, z1.b, z2.b
trn2 p0.s, p1.s, p2.s
llvm-svn: 336531
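The four permutes can be sketched on Python lists as follows; the function names mirror the mnemonics but are otherwise hypothetical, and both inputs are assumed to have the same even length.

```python
def uzp1(a, b):
    """Even-numbered elements of a followed by even-numbered elements of b."""
    return (a + b)[0::2]


def uzp2(a, b):
    """Odd-numbered elements of a followed by odd-numbered elements of b."""
    return (a + b)[1::2]


def trn1(a, b):
    """Interleave the even-numbered elements of a and b."""
    out = []
    for i in range(0, len(a), 2):
        out += [a[i], b[i]]
    return out


def trn2(a, b):
    """Interleave the odd-numbered elements of a and b."""
    out = []
    for i in range(1, len(a), 2):
        out += [a[i], b[i]]
    return out
```

With `a = [0, 1, 2, 3]` and `b = [4, 5, 6, 7]`, `uzp1` gives `[0, 2, 4, 6]` and `trn1` gives `[0, 4, 2, 6]`.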
|
| |
| |
The checking is done deeper inside MachineBasicBlock, but this will
hopefully help to find issues when porting the machine outliner to a
target where Liveness tracking is broken (like ARM).
Differential Revision: https://reviews.llvm.org/D49023
llvm-svn: 336481
|
| |
| |
This adds:
- outer shareable TLB Maintenance instructions, and
- TLB range maintenance instructions.
llvm-svn: 336434
|
| |
| |
Now with the asm operand definition included.
llvm-svn: 336432
|
| |
| |
It's causing build errors.
llvm-svn: 336422
|
| |
| |
These instructions are added to AArch64 only.
Differential Revision: https://reviews.llvm.org/D48926
llvm-svn: 336421
|
| |
| |
This adds the Armv8.4-A Trace synchronization barrier (TSB) instruction.
Differential Revision: https://reviews.llvm.org/D48918
llvm-svn: 336418
|
| |
| |
a deficiency in TableGen that has been addressed in r336334.
[AArch64][SVE] Asm: Support for predicated FP rounding instructions.
This patch also adds instructions for predicated FP square-root and
reciprocal exponent.
The added instructions are:
- FRINTI Round to integral value (current FPCR rounding mode)
- FRINTX Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA Round to integral value (to nearest, with ties away from zero)
- FRINTN Round to integral value (to nearest, with ties to even)
- FRINTZ Round to integral value (toward zero)
- FRINTM Round to integral value (toward minus Infinity)
- FRINTP Round to integral value (toward plus Infinity)
- FSQRT Floating-point square root
- FRECPX Floating-point reciprocal exponent
llvm-svn: 336387
|
| |
| |
llvm-svn: 336331
|
| |
| |
in TableGen, for which there is already a patch in Phabricator
(D48937) that needs to be committed first.
llvm-svn: 336324
|
| |
| |
This patch also adds instructions for predicated FP square-root and
reciprocal exponent.
The added instructions are:
- FRINTI Round to integral value (current FPCR rounding mode)
- FRINTX Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA Round to integral value (to nearest, with ties away from zero)
- FRINTN Round to integral value (to nearest, with ties to even)
- FRINTZ Round to integral value (toward zero)
- FRINTM Round to integral value (toward minus Infinity)
- FRINTP Round to integral value (toward plus Infinity)
- FSQRT Floating-point square root
- FRECPX Floating-point reciprocal exponent
llvm-svn: 336322
|
| |
| |
This patch implements the following varieties:
- Unpredicated signed max, e.g. smax z0.h, z1.h, #-128
- Unpredicated signed min, e.g. smin z0.h, z1.h, #-128
- Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255
- Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255
- Predicated signed max, e.g. smax z0.h, p0/m, z0.h, z1.h
- Predicated signed min, e.g. smin z0.h, p0/m, z0.h, z1.h
- Predicated signed abd, e.g. sabd z0.h, p0/m, z0.h, z1.h
- Predicated unsigned max, e.g. umax z0.h, p0/m, z0.h, z1.h
- Predicated unsigned min, e.g. umin z0.h, p0/m, z0.h, z1.h
- Predicated unsigned abd, e.g. uabd z0.h, p0/m, z0.h, z1.h
llvm-svn: 336317
|
| |
| |
getOutlininingCandidateInfo -> getOutliningCandidateInfo
Differential Revision: https://reviews.llvm.org/D48867
llvm-svn: 336285
|
| |
| |
This patch adds both a vector and an immediate form, e.g.
- Vector form:
subr z0.h, p0/m, z0.h, z1.h
subtract active elements of z0 from z1, and store the result in z0.
- Immediate form:
subr z0.h, z0.h, #255
subtract elements of z0 from #255, and store the result in z0.
llvm-svn: 336274
|
| |
| |
Includes instructions to read the First-Faulting Register (FFR):
- RDFFR (unpredicated)
rdffr p0.b
- RDFFR (predicated)
rdffr p0.b, p0/z
- RDFFRS (predicated, sets condition flags)
rdffrs p0.b, p0/z
Includes instructions to set/write the FFR:
- SETFFR (no arguments, sets the FFR to all true)
setffr
- WRFFR (unpredicated)
wrffr p0.b
llvm-svn: 336267
|
| |
| |
The variants added are:
- fcvt (FP convert precision)
- scvtf (signed int -> FP)
- ucvtf (unsigned int -> FP)
- fcvtzs (FP -> signed int (round to zero))
- fcvtzu (FP -> unsigned int (round to zero))
For example:
fcvt z0.h, p0/m, z0.s (single- to half-precision FP)
scvtf z0.h, p0/m, z0.s (32-bit int to half-precision FP)
ucvtf z0.h, p0/m, z0.s (32-bit unsigned int to half-precision FP)
fcvtzs z0.s, p0/m, z0.h (half-precision FP to 32-bit int)
fcvtzu z0.s, p0/m, z0.h (half-precision FP to 32-bit unsigned int)
llvm-svn: 336265
|
| |
| |
SVE overloads the AArch64 PSTATE condition flags and introduces
a set of condition code aliases for the assembler. The
details are described in section 2.2 of the architecture
reference manual supplement for SVE.
In short:
SVE alias => AArch64 name
--------------------------
NONE => EQ
ANY => NE
NLAST => HS
LAST => LO
FIRST => MI
NFRST => PL
PMORE => HI
PLAST => LS
TCONT => GE
TSTOP => LT
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48869
llvm-svn: 336245
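The alias table above is a plain one-to-one renaming, which a lookup like the following Python sketch can express. The dictionary contents come from the table; the names `SVE_CONDITION_ALIASES` and `canonical_condition` are hypothetical and do not correspond to identifiers in LLVM.

```python
# SVE condition-code alias -> AArch64 condition name, taken from the table above.
SVE_CONDITION_ALIASES = {
    "none": "eq",
    "any": "ne",
    "nlast": "hs",
    "last": "lo",
    "first": "mi",
    "nfrst": "pl",
    "pmore": "hi",
    "plast": "ls",
    "tcont": "ge",
    "tstop": "lt",
}


def canonical_condition(name: str) -> str:
    """Return the AArch64 condition name for an SVE alias; other names pass through."""
    lowered = name.lower()
    return SVE_CONDITION_ALIASES.get(lowered, lowered)
```

So a branch written as `b.first` would be treated like `b.mi` under this mapping.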
|
| |
| |
definitions
llvm-svn: 336222
|
| |
| |
The variants added in this patch are:
- Predicated Complex floating point ADD with rotate, e.g.
fcadd z0.h, p0/m, z0.h, z1.h, #90
- Predicated Complex floating point MLA with rotate, e.g.
fcmla z0.h, p0/m, z1.h, z2.h, #180
- Unpredicated Complex floating point MLA with rotate (indexed operand), e.g.
fcmla z0.h, z1.h, z2.h[0], #180
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48824
llvm-svn: 336210
|
| |
| |
unselectable stores.
r336120 resulted in falling back to SelectionDAG more often due to the G_STORE
MMOs not matching the vreg size. This fixes that by explicitly any-extending the
value.
llvm-svn: 336209
|
| |
| |
Unpredicated FP-multiply of SVE vector with a vector-element given by
vector[index], for example:
fmul z0.s, z1.s, z2.s[0]
which performs an unpredicated FP-multiply of all 32-bit elements in
'z1' with the first element from 'z2'.
This patch adds restricted register classes for SVE vectors:
ZPR_3b (only z0..z7 are allowed) - for indexed vector of 16/32-bit elements.
ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48823
llvm-svn: 336205
|
| |
| |
The patch includes support for the following instructions:
ABS z0.h, p0/m, z0.h
NEG z0.h, p0/m, z0.h
(S|U)XTB z0.h, p0/m, z0.h
(S|U)XTB z0.s, p0/m, z0.s
(S|U)XTB z0.d, p0/m, z0.d
(S|U)XTH z0.s, p0/m, z0.s
(S|U)XTH z0.d, p0/m, z0.d
(S|U)XTW z0.d, p0/m, z0.d
llvm-svn: 336204
|
| |
| |
This adds the following system registers:
- RAS registers,
- MPAM registers,
- Activity monitor registers,
- Trace Extension registers,
- Timing insensitivity of data processing instructions,
- Enhanced Support for Nested Virtualization.
Differential Revision: https://reviews.llvm.org/D48871
llvm-svn: 336193
|
| |
| |
The variants added are:
signed Saturating ADD/SUB (immediate) e.g. sqadd z0.h, z0.h, #42
unsigned Saturating ADD/SUB (immediate) e.g. uqadd z0.h, z0.h, #42
signed Saturating ADD/SUB (vectors) e.g. sqadd z0.h, z0.h, z1.h
unsigned Saturating ADD/SUB (vectors) e.g. uqadd z0.h, z0.h, z1.h
llvm-svn: 336186
|
| |
| |
Contains the following variants:
- Compare with (elements from) other vector
instructions: fcmeq, fcmgt, fcmge, fcmne, fcmuo.
aliases: fcmle, fcmlt.
e.g. fcmle p0.h, p0/z, z0.h, z1.h => fcmge p0.h, p0/z, z1.h, z0.h
- Compare absolute values with (absolute values from) other vector.
instructions: facge, facgt.
aliases: facle, faclt.
e.g. facle p0.h, p0/z, z0.h, z1.h => facge p0.h, p0/z, z1.h, z0.h
- Compare vector elements with #0.0
instructions: fcmeq, fcmgt, fcmge, fcmle, fcmlt, fcmne.
e.g. fcmle p0.h, p0/z, z0.h, #0.0
llvm-svn: 336182
|