summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64
Commit message (Collapse)AuthorAgeFilesLines
* [AArch64][SVE] Asm: Support for bit/byte reverse operations.Sander de Smalen2018-07-202-0/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the following instructions: RBIT reverse bits within each active elemnt (predicated), e.g. rbit z0.d, p0/m, z1.d for 8, 16, 32 and 64 bit elements. REV reverse order of elements in data/predicate vector (unpredicated), e.g. rev z0.d, z1.d rev p0.d, p1.d for 8, 16, 32 and 64 bit elements. REVB reverse order of bytes within each active element, e.g. revb z0.d, p0/m, z1.d for 16, 32 and 64 bit elements. REVH reverse order of 16-bit half-words within each active element, e.g. revh z0.d, p0/m, z1.d for 32 and 64 bit elements. REVW reverse order of 32-bit words within each active element, e.g. revw z0.d, p0/m, z1.d for 64 bit elements. llvm-svn: 337534
* [AArch64][SVE] Asm: Support for FTMAD instruction.Sander de Smalen2018-07-202-0/+27
| | | | | | | | | | | Floating-point trigonometric multiply-add coefficient, e.g. ftmad z0.h, z0.h, z1.h, #7 with variants for 16, 32 and 64-bit elements. llvm-svn: 337533
* [AArch64][SVE] Asm: Support for unpredicated FP operations.Sander de Smalen2018-07-182-0/+35
| | | | | | | | | | | | | | | | | | This patch adds support for the following unpredicated floating-point instructions: FADD Floating point add FSUB Floating point subtract FMUL Floating point multiplication FTSMUL Floating point trigonometric starting value FRECPS Floating point reciprocal step FRSQRTS Floating point reciprocal square root step The instructions have the following assembly format: fadd z0.h, z1.h, z2.h and have variants for 16, 32 and 64-bit FP elements. llvm-svn: 337383
* [AArch64][SVE] Asm: Support for UDOT/SDOT instructions.Sander de Smalen2018-07-182-0/+72
| | | | | | | | | | | | | | | | | | | | | | | | | The signed/unsigned DOT instructions perform a dot-product on quadtuplets from two source vectors and accumulate the result in the destination register. The instructions come in two forms: Vector form, e.g. sdot z0.s, z1.b, z2.b - signed dot product on four 8-bit quad-tuplets, accumulating results in 32-bit elements. udot z0.d, z1.h, z2.h - unsigned dot product on four 16-bit quad-tuplets, accumulating results in 64-bit elements. Indexed form, e.g. sdot z0.s, z1.b, z2.b[3] - signed dot product on four 8-bit quad-tuplets with specified quadtuplet from second source vector, accumulating results in 32-bit elements. udot z0.d, z1.h, z2.h[1] - dot product on four 16-bit quad-tuplets with specified quadtuplet from second source vector, accumulating results in 64-bit elements. llvm-svn: 337372
* [AArch64][SVE] Asm: Integer divide instructions.Sander de Smalen2018-07-182-0/+11
| | | | | | | | | | | | | | | | | | This patch adds the following predicated instructions: UDIV Unsigned divide active elements UDIVR Unsigned divide active elements, reverse form. SDIV Signed divide active elements SDIVR Signed divide active elements, reverse form. e.g. udiv z0.s, p0/m, z0.s, z1.s (unsigned divide active elements in z0 by z1, store result in z0) sdivr z0.s, p0/m, z0.s, z1.s (signed divide active elements in z1 by z0, store result in z0) llvm-svn: 337369
* [AArch64][SVE] Asm: Support for integer MUL instructions.Sander de Smalen2018-07-182-8/+26
| | | | | | | | | | | | | | | | This patch adds the following instructions: MUL - multiply vectors, e.g. mul z0.h, p0/m, z0.h, z1.h - multiply with immediate, e.g. mul z0.h, z0.h, #127 SMULH - signed multiply returning high half, e.g. smulh z0.h, p0/m, z0.h, z1.h UMULH - unsigned multiply returning high half, e.g. umulh z0.h, p0/m, z0.h, z1.h llvm-svn: 337358
* [AArch64][SVE]: Integer multiply-add/subtract instructions.Sander de Smalen2018-07-172-0/+69
| | | | | | | | | | This patch adds support for the following instructions: MLA mul-add, writing addend (Zda = Zda + Zn * Zm) MLS mul-sub, writing addend (Zda = Zda + -Zn * Zm) MAD mul-add, writing multiplicant (Zdn = Za + Zdn * Zm) MSB mul-sub, writing multiplicant (Zdn = Za + -Zdn * Zm) llvm-svn: 337293
* [AArch64][SVE] Asm: FP fused multiply-add/subtract instructions.Sander de Smalen2018-07-172-0/+119
| | | | | | | | | | | | | | | | | | This patch adds support for the following instructions: FMLA mul-add, writing addend (Zda = Zda + Zn * Zm) FNMLA negated mul-add, writing addend (Zda = -Zda + -Zn * Zm) FMLS mul-sub, writing addend (Zda = Zda + -Zn * Zm) FNMLS negated mul-sub, writing addend (Zda = -Zda + Zn * Zm) FMAD mul-add, writing multiplicant (Zdn = Za + Zdn * Zm) FNMAD negated mul-add, writing multiplicant (Zdn = -Za + -Zdn * Zm) FMSB mul-sub, writing multiplicant (Zdn = Za + -Zdn * Zm) FNMSB negated mul-sub, writing multiplicant (Zdn = -Za + Zdn * Zm) llvm-svn: 337282
* [AArch64][SVE] Asm: Support for predicated FP operations (FP immediate)Sander de Smalen2018-07-171-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | This patch completes support for the following floating point instructions that take FP immediates: FADD* (addition) FSUB (subtract) FSUBR (subtract reverse form) FMUL* (multiplication) FMAX* (maximum) FMAXNM (maximum number) FMIN (maximum) FMINNM (maximum number) All operations are predicated and take a FP immediate operand, e.g. fadd z0.h, p0/m, z0.h, #0.5 fmin z0.s, p0/m, z0.s, #1.0 ^___________^ (tied) * Instructions added in a previous patch. llvm-svn: 337272
* [AArch64][SVE] Asm: Support for predicated FP operations.Sander de Smalen2018-07-172-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for the following floating point instructions: FABD (absolute difference) FADD (addition) FSUB (subtract) FSUBR (subtract reverse form) FDIV (divide) FDIVR (divide reverse form) FMAX (maximum) FMAXNM (maximum number) FMIN (minimum) FMINNM (minimum number) FSCALE (adjust exponent) FMULX (multiply extended) All operations are predicated and binary form, e.g. fadd z0.h, p0/m, z0.h, z1.h ^___________^ (tied) Supporting 16, 32 and 64-bit FP elements. llvm-svn: 337259
* [AArch64][SVE] Asm: Support for SPLICE instruction.Sander de Smalen2018-07-172-0/+26
| | | | | | | | | | | | | | | | The SPLICE instruction splices two vectors into one vector using a predicate. It copies the active elements from the first vector, and then fills the remaining elements with the low-numbered elements from the second vector. The instruction has the following form, e.g. splice z0.b, p0, z0.b, z1.b for 8-bit elements. It also supports 16, 32 and 64-bit elements. llvm-svn: 337253
* [AArch64][SVE] Asm: Support for EXT instruction.Sander de Smalen2018-07-172-0/+21
| | | | | | | | | | | | | This patch adds an instruction that allows extracting a vector from a pair of vectors, given an immediate index that describes the element position to extract from. The instruction has the following assembly: ext z0.b, z0.b, z1.b, #imm where #imm is an immediate between 0 and 255. llvm-svn: 337251
* [X86][AArch64][DAGCombine] Unfold 'check for [no] signed truncation' patternRoman Lebedev2018-07-161-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]] As discussed in https://reviews.llvm.org/D49179#1158957 and later, the IR for 'check for [no] signed truncation' pattern can be improved: https://rise4fun.com/Alive/gBf ^ that pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in signed case, therefore it is probably a good idea to improve it. But the IR-optimal patter does not lower efficiently, so we want to undo it.. This handles the simple pattern. There is a second pattern with predicate and constants inverted. NOTE: we do not check uses here. we always do the transform. Reviewers: spatel, craig.topper, RKSimon, javed.absar Reviewed By: spatel Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49266 llvm-svn: 337166
* [AArch64] Armv8.4-A: LDAPR & STLR with immediate offset instructions (cont'd)Sjoerd Meijer2018-07-133-22/+35
| | | | | | | | | Follow up of rL336913: fix base class description. Thanks to Ahmed Bougacha for pointing this out. Differential Revision: https://reviews.llvm.org/D49284 llvm-svn: 337009
* [cfi-verify] Support AArch64.Joel Galenson2018-07-132-8/+14
| | | | | | | | | | | | This patch adds support for AArch64 to cfi-verify. This required three changes to cfi-verify. First, it generalizes checking if an instruction is a trap by adding a new isTrap flag to TableGen (and defining it for x86 and AArch64). Second, the code that ensures that the operand register is not clobbered between the CFI check and the indirect call needs to allow a single dereference (in x86 this happens as part of the jump instruction). Third, we needed to ensure that return instructions are not counted as indirect branches. Technically, returns are indirect branches and can be covered by CFI, but LLVM's forward-edge CFI does not protect them, and x86 does not consider them, so we keep that behavior. In addition, we had to improve AArch64's code to evaluate the branch target of a MCInst to handle calls where the destination is not the first operand (which it often is not). Differential Revision: https://reviews.llvm.org/D48836 llvm-svn: 337007
* [AArch64][SVE] Asm: Vector Unpack Low/High instructions.Sander de Smalen2018-07-132-0/+45
| | | | | | | | | | | | | | | | | This patch adds support for the following unpack instructions: - PUNPKLO, PUNPKHI Unpack elements from low/high half and place into elements of twice their size. e.g. punpklo p0.h, p0.b - UUNPKLO, UUNPKHI Unpack elements from low/high half and SUNPKLO, SUNPKHI place into elements of twice their size after zero- or sign-extending the values. e.g. uunpklo z0.h, z0.b llvm-svn: 336982
* [AArch64][SVE] Asm: Support for insert element (INSR) instructions.Sander de Smalen2018-07-132-0/+51
| | | | | | | | | | | | | | Insert general purpose register into shifted vector, e.g. insr z0.s, w0 insr z0.d, x0 Insert SIMD&FP scalar register into shifted vector, e.g. insr z0.b, b0 insr z0.h, h0 insr z0.s, s0 insr z0.d, d0 llvm-svn: 336979
* [AArch64] Armv8.4-A: LDAPR & STLR with immediate offset instructionsSjoerd Meijer2018-07-123-0/+43
| | | | | | These instructions are added to AArch64 only. llvm-svn: 336913
* [AArch64][SVE] Asm: Support for COMPACT instruction.Sander de Smalen2018-07-112-0/+24
| | | | | | | | | | | The compact instruction shuffles active elements of vector into lowest numbered elements and sets remaining elements to zero. e.g. compact z0.s, p0, z1.s llvm-svn: 336789
* [AArch64][SVE] Asm: Support for LAST(A|B) and CLAST(A|B) instructions.Sander de Smalen2018-07-112-0/+148
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The LASTB and LASTA instructions extract the last active element, or element after the last active, from the source vector. The added variants are: Scalar: last(a|b) w0, p0, z0.b last(a|b) w0, p0, z0.h last(a|b) w0, p0, z0.s last(a|b) x0, p0, z0.d SIMD & FP Scalar: last(a|b) b0, p0, z0.b last(a|b) h0, p0, z0.h last(a|b) s0, p0, z0.s last(a|b) d0, p0, z0.d The CLASTB and CLASTA conditionally extract the last or element after the last active element from the source vector. The added variants are: Scalar: clast(a|b) w0, p0, w0, z0.b clast(a|b) w0, p0, w0, z0.h clast(a|b) w0, p0, w0, z0.s clast(a|b) x0, p0, x0, z0.d SIMD & FP Scalar: clast(a|b) b0, p0, b0, z0.b clast(a|b) h0, p0, h0, z0.h clast(a|b) s0, p0, s0, z0.s clast(a|b) d0, p0, d0, z0.d Vector: clast(a|b) z0.b, p0, z0.b, z1.b clast(a|b) z0.h, p0, z0.h, z1.h clast(a|b) z0.s, p0, z0.s, z1.s clast(a|b) z0.d, p0, z0.d, z1.d Please refer to the architecture specification for more details on the semantics of the added instructions. llvm-svn: 336783
* [AArch64][SVE] Asm: Support for predicated unary operations.Sander de Smalen2018-07-102-14/+35
| | | | | | | | | | | | | | | | | | | | | This patch adds support for the following instructions: CLS (Count Leading Sign bits) CLZ (Count Leading Zeros) CNT (Count non-zero bits) CNOT (Logically invert boolean condition in vector) NOT (Bitwise invert vector) FABS (Floating-point absolute value) FNEG (Floating-point negate) All operations are predicated and unary, e.g. clz z0.s, p0/m, z1.s - CLS, CLZ, CNT, CNOT and NOT have variants for 8, 16, 32 and 64 bit elements. - FABS and FNEG have variants for 16, 32 and 64 bit elements. llvm-svn: 336677
* [AArch64][SVE] Asm: Support for CNT(B|H|W|D) and CNTP instructions.Sander de Smalen2018-07-092-13/+72
| | | | | | | | | | | | | | | | | | This patch adds support for the following instructions: CNTB CNTH - Determine the number of active elements implied by CNTW CNTD the named predicate constant, multiplied by an immediate, e.g. cnth x0, vl8, #16 CNTP - Count active predicate elements, e.g. cntp x0, p0, p1.b counts the number of active elements in p1, predicated by p0, and stores the result in x0. llvm-svn: 336552
* [AArch64][SVE] Asm: Support for remaining shift instructions.Sander de Smalen2018-07-092-26/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch completes support for shifts, which include: - LSL - Logical Shift Left - LSLR - Logical Shift Left, Reversed form - LSR - Logical Shift Right - LSRR - Logical Shift Right, Reversed form - ASR - Arithmetic Shift Right - ASRR - Arithmetic Shift Right, Reversed form - ASRD - Arithmetic Shift Right for Divide In the following variants: - Predicated shift by immediate - ASR, LSL, LSR, ASRD e.g. asr z0.h, p0/m, z0.h, #1 (active lanes of z0 shifted by #1) - Unpredicated shift by immediate - ASR, LSL*, LSR* e.g. asr z0.h, z1.h, #1 (all lanes of z1 shifted by #1, stored in z0) - Predicated shift by vector - ASR, LSL*, LSR* e.g. asr z0.h, p0/m, z0.h, z1.h (active lanes of z0 shifted by z1, stored in z0) - Predicated shift by vector, reversed form - ASRR, LSLR, LSRR e.g. lslr z0.h, p0/m, z0.h, z1.h (active lanes of z1 shifted by z0, stored in z0) - Predicated shift left/right by wide vector - ASR, LSL, LSR e.g. lsl z0.h, p0/m, z0.h, z1.d (active lanes of z0 shifted by wide elements of vector z1) - Unpredicated shift left/right by wide vector - ASR, LSL, LSR e.g. lsl z0.h, z1.h, z2.d (all lanes of z1 shifted by wide elements of z2, stored in z0) *Variants added in previous patches. llvm-svn: 336547
* [AArch64][SVE] Asm: Support for TBL instruction.Sander de Smalen2018-07-092-0/+35
| | | | | | | | | | | Support for SVE's TBL instruction for programmable table lookup/permute using vector of element indices, e.g. tbl z0.d, { z1.d }, z2.d stores elements from z1, indexed by elements from z2, into z0. llvm-svn: 336544
* [AArch64][SVE] Asm: Support for ADR instruction.Sander de Smalen2018-07-094-16/+88
| | | | | | | | | | | | | | | | | | Supporting various addressing modes: - adr z0.s, [z0.s, z0.s] - adr z0.s, [z0.s, z0.s, lsl #<shift>] - adr z0.d, [z0.d, z0.d] - adr z0.d, [z0.d, z0.d, lsl #<shift>] - adr z0.d, [z0.d, z0.d, uxtw #<shift>] - adr z0.d, [z0.d, z0.d, sxtw #<shift>] Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D48870 llvm-svn: 336533
* [AArch64][SVE] Asm: Support for UZP and TRN instructions.Sander de Smalen2018-07-091-0/+8
| | | | | | | | | | | | | | This patch adds support for: UZP1 Concatenate even elements from two vectors UZP2 Concatenate odd elements from two vectors TRN1 Interleave even elements from two vectors TRN2 Interleave odd elements from two vectors With variants for both data and predicate vectors, e.g. uzp1 z0.b, z1.b, z2.b trn2 p0.s, p1.s, p2.s llvm-svn: 336531
* [MachineOutliner] Assert that Liveness tracking is accurate (NFC)Yvan Roux2018-07-071-0/+2
| | | | | | | | | | The checking is done deeper inside MachineBasicBlock, but this will hopefully help to find issues when porting the machine outliner to a target where Liveness tracking is broken (like ARM). Differential Revision: https://reviews.llvm.org/D49023 llvm-svn: 336481
* [AArch64] Armv8.4-A: TLB supportSjoerd Meijer2018-07-062-0/+56
| | | | | | | | This adds: - outer shareable TLB Maintenance instructions, and - TLB range maintenance instructions. llvm-svn: 336434
* Recommit: [AArch64] Armv8.4-A: Flag manipulation instructionsSjoerd Meijer2018-07-063-0/+61
| | | | | | Now with the asm operand definition included. llvm-svn: 336432
* Revert [AArch64] Armv8.4-A: Flag manipulation instructionsSjoerd Meijer2018-07-062-35/+0
| | | | | | It's causing build errors. llvm-svn: 336422
* [AArch64] Armv8.4-A: Flag manipulation instructionsSjoerd Meijer2018-07-062-0/+35
| | | | | | | | These instructions are added to AArch64 only. Differential Revision: https://reviews.llvm.org/D48926 llvm-svn: 336421
* [AArch64][ARM] Armv8.4-A: Trace synchronization barrier instructionSjoerd Meijer2018-07-066-4/+66
| | | | | | | | This adds the Armv8.4-A Trace synchronization barrier (TSB) instruction. Differential Revision: https://reviews.llvm.org/D48918 llvm-svn: 336418
* This is a recommit of r336322, previously reverted in r336324 due toSander de Smalen2018-07-052-1/+16
| | | | | | | | | | | | | | | | | | | | | | a deficiency in TableGen that has been addressed in r336334. [AArch64][SVE] Asm: Support for predicated FP rounding instructions. This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336387
* Try to fix -Wimplicit-fallthrough warning. NFCI.Simon Pilgrim2018-07-051-0/+1
| | | | llvm-svn: 336331
* Reverting r336322 for now, as it causes an assert failureSander de Smalen2018-07-052-16/+1
| | | | | | | in TableGen, for which there is already a patch in Phabricator (D48937) that needs to be committed first. llvm-svn: 336324
* [AArch64][SVE] Asm: Support for predicated FP rounding instructions.Sander de Smalen2018-07-052-1/+16
| | | | | | | | | | | | | | | | | | This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336322
* [AArch64][SVE] Asm: Support for signed/unsigned MIN/MAX/ABDSander de Smalen2018-07-054-0/+53
| | | | | | | | | | | | | | | | | | | | This patch implements the following varieties: - Unpredicated signed max, e.g. smax z0.h, z1.h, #-128 - Unpredicated signed min, e.g. smin z0.h, z1.h, #-128 - Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255 - Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255 - Predicated signed max, e.g. smax z0.h, p0/m, z0.h, z1.h - Predicated signed min, e.g. smin z0.h, p0/m, z0.h, z1.h - Predicated signed abd, e.g. sabd z0.h, p0/m, z0.h, z1.h - Predicated unsigned max, e.g. umax z0.h, p0/m, z0.h, z1.h - Predicated unsigned min, e.g. umin z0.h, p0/m, z0.h, z1.h - Predicated unsigned abd, e.g. uabd z0.h, p0/m, z0.h, z1.h llvm-svn: 336317
* [MachineOutliner] Fix typo in getOutliningCandidateInfo function nameYvan Roux2018-07-042-2/+2
| | | | | | | | getOutlininingCandidateInfo -> getOutliningCandidateInfo Differential Revision: https://reviews.llvm.org/D48867 llvm-svn: 336285
* [AArch64][SVE] Asm: Support for reversed subtract (SUBR) instruction. Sander de Smalen2018-07-041-2/+4
| | | | | | | | | | | | | | | | | | This patch adds both a vector and an immediate form, e.g. - Vector form: subr z0.h, p0/m, z0.h, z1.h subtract active elements of z0 from z1, and store the result in z0. - Immediate form: subr z0.h, z0.h, #255 subtract elements of z0, and store the result in z0. llvm-svn: 336274
* [AArch64][SVE] Asm: Support for instructions to set/read FFR.Sander de Smalen2018-07-042-0/+61
| | | | | | | | | | | | | | | | | | Includes instructions to read the First-Faulting Register (FFR): - RDFFR (unpredicated) rdffr p0.b - RDFFR (predicated) rdffr p0.b, p0/z - RDFFRS (predicated, sets condition flags) rdffr p0.b, p0/z Includes instructions to set/write the FFR: - SETFFR (no arguments, sets the FFR to all true) setffr - WRFFR (unpredicated) wrffr p0.b llvm-svn: 336267
* [AArch64][SVE] Asm: Support for FP conversion instructions.Sander de Smalen2018-07-042-0/+61
| | | | | | | | | | | | | | | | | | | The variants added are: - fcvt (FP convert precision) - scvtf (signed int -> FP) - ucvtf (unsigned int -> FP) - fcvtzs (FP -> signed int (round to zero)) - fcvtzu (FP -> unsigned int (round to zero)) For example: fcvt z0.h, p0/m, z0.s (single- to half-precision FP) scvtf z0.h, p0/m, z0.s (32-bit int to half-precision FP) ucvtf z0.h, p0/m, z0.s (32-bit unsigned int to half-precision FP) fcvtzs z0.s, p0/m, z0.h (half-precision FP to 32-bit int) fcvtzu z0.s, p0/m, z0.h (half-precision FP to 32-bit unsigned int) llvm-svn: 336265
* [AArch64][SVE] Asm: Support for SVE condition code aliasesSander de Smalen2018-07-041-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SVE overloads the AArch64 PSTATE condition flags and introduces a set of condition code aliases for the assembler. The details are described in section 2.2 of the architecture reference manual supplement for SVE. In short: SVE alias => AArch64 name -------------------------- NONE => EQ ANY => NE NLAST => HS LAST => LO FIRST => MI NFRST => PL PMORE => HI PLAST => LS TCONT => GE TSTOP => LT Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48869 llvm-svn: 336245
* [AArch64] Make function parameter names in declarations match those of ↵Fangrui Song2018-07-031-8/+8
| | | | | | definitions llvm-svn: 336222
* [AArch64][SVE] Asm: Support for FP Complex ADD/MLA.Sander de Smalen2018-07-033-4/+114
| | | | | | | | | | | | | | | | | | | | | | | | The variants added in this patch are: - Predicated Complex floating point ADD with rotate, e.g. fcadd z0.h, p0/m, z0.h, z1.h, #90 - Predicated Complex floating point MLA with rotate, e.g. fcmla z0.h, p0/m, z1.h, z2.h, #180 - Unpredicated Complex floating point MLA with rotate (indexed operand), e.g. fcmla z0.h, p0/m, z1.h, z2.h[0], #180 Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48824 llvm-svn: 336210
* [AArch64][GlobalISel] Fix fallbacks introduced in r336120 due to ↵Amara Emerson2018-07-031-2/+5
| | | | | | | | | | unselectable stores. r336120 resulted in falling back to SelectionDAG more often due to the G_STORE MMOs not matching the vreg size. This fixes that by explicitly any-extending the value. llvm-svn: 336209
* [AArch64][SVE] Asm: Support for FMUL (indexed)Sander de Smalen2018-07-035-3/+125
| | | | | | | | | | | | | | | | | | | | | | Unpredicated FP-multiply of SVE vector with a vector-element given by vector[index], for example: fmul z0.s, z1.s, z2.s[0] which performs an unpredicated FP-multiply of all 32-bit elements in 'z1' with the first element from 'z2'. This patch adds restricted register classes for SVE vectors: ZPR_3b (only z0..z7 are allowed) - for indexed vector of 16/32-bit elements. ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48823 llvm-svn: 336205
* [AArch64][SVE] Asm: Support for predicated unary operations.Sander de Smalen2018-07-032-0/+57
| | | | | | | | | | | | | | | | | | The patch includes support for the following instructions: ABS z0.h, p0/m, z0.h NEG z0.h, p0/m, z0.h (S|U)XTB z0.h, p0/m, z0.h (S|U)XTB z0.s, p0/m, z0.s (S|U)XTB z0.d, p0/m, z0.d (S|U)XTH z0.s, p0/m, z0.s (S|U)XTH z0.d, p0/m, z0.d (S|U)XTW z0.d, p0/m, z0.d llvm-svn: 336204
* [AArch64] Armv8.4-A: system registersSjoerd Meijer2018-07-032-1/+102
| | | | | | | | | | | | | | This adds the following system registers: - RAS registers, - MPAM registers, - Activitiy monitor registers, - Trace Extension registers, - Timing insensitivity of data processing instructions, - Enhanced Support for Nested Virtualization. Differential Revision: https://reviews.llvm.org/D48871 llvm-svn: 336193
* [AArch64][SVE] Asm: Support for saturing ADD/SUB instructions.Sander de Smalen2018-07-031-0/+8
| | | | | | | | | | The variants added are: signed Saturating ADD/SUB (immediate) e.g. sqadd z0.h, z0.h, #42 unsigned Saturating ADD/SUB (immediate) e.g. uqadd z0.h, z0.h, #42 signed Saturating ADD/SUB (vectors) e.g. sqadd z0.h, z0.h, z1.h unsigned Saturating ADD/SUB (vectors) e.g. uqadd z0.h, z0.h, z1.h llvm-svn: 336186
* [AArch64][SVE] Asm: Support for vector element FP compare.Sander de Smalen2018-07-033-1/+109
| | | | | | | | | | | | | | | | | | | | | | | Contains the following variants: - Compare with (elements from) other vector instructions: fcmeq, fcmgt, fcmge, fcmne, fcmuo. aliases: fcmle, fcmlt. e.g. fcmle p0.h, p0/z, z0.h, z1.h => fcmge p0.h, p0/z, z1.h, z0.h - Compare absolute values with (absolute values from) other vector. instructions: facge, facgt. aliases: facle, faclt. e.g. facle p0.h, p0/z, z0.h, z1.h => facge p0.h, p0/z, z1.h, z0.h - Compare vector elements with #0.0 instructions: fcmeq, fcmgt, fcmge, fcmle, fcmlt, fcmne. e.g. fcmle p0.h, p0/z, z0.h, #0.0 llvm-svn: 336182
OpenPOWER on IntegriCloud