summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64
Commit message (Collapse)AuthorAgeFilesLines
* Merging r348444:Tom Stellard2018-12-071-93/+124
| | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r348444 | matze | 2018-12-05 17:40:23 -0800 (Wed, 05 Dec 2018) | 15 lines AArch64: Fix invalid CCMP emission The code emitting AND-subtrees used to check whether any of the operands was an OR in order to figure out if the result needs to be negated. However the OR could be hidden in further subtrees and not immediately visible. Change the code so that canEmitConjunction() determines whether the result of the generated subtree needs to be negated. Cleanup emission logic to use this. I also changed the code a bit to make all negation decisions early before we actually emit the subtrees. This fixes http://llvm.org/PR39550 Differential Revision: https://reviews.llvm.org/D54137 ------------------------------------------------------------------------ llvm-svn: 348642
* Merging r346203:Tom Stellard2018-12-071-29/+30
| | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r346203 | matze | 2018-11-05 19:15:22 -0800 (Mon, 05 Nov 2018) | 7 lines AArch64: Cleanup CCMP code; NFC Cleanup CCMP pattern matching code in preparation for review/bugfix: - Rename `isConjunctionDisjunctionTree()` to `canEmitConjunction()` (it won't accept arbitrary disjunctions and is really about whether we can transform the subtree into a conjunction that we can emit). - Rename `emitConjunctionDisjunctionTree()` to `emitConjunction()` ------------------------------------------------------------------------ llvm-svn: 348636
* Merging r340158:Hans Wennborg2018-08-211-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r340158 | s.desmalen | 2018-08-20 11:16:59 +0200 (Mon, 20 Aug 2018) | 16 lines [AArch64][SVE] Asm: Add SVE System registers This patch adds system registers for controlling aspects of SVE: - ZCR_EL1 (r/w) visible at EL1 and EL0. - ZCR_EL2 (r/w) visible at EL2 and Non-secure EL1 and EL0. - ZCR_EL3 (r/w) visible at all exception levels. and a system register identifying SVE: - ID_AA64ZFR0_EL1 (r) SVE Feature identifier. Reviewers: SjoerdMeijer, samparker, pbarrio, fhahn, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D50885 ------------------------------------------------------------------------ llvm-svn: 340355
* Merging r338554:Hans Wennborg2018-08-021-1/+3
| | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r338554 | bryanpkc | 2018-08-01 15:50:29 +0200 (Wed, 01 Aug 2018) | 11 lines [AArch64] Fix FCCMP with FP16 operands Summary: This patch adds support for FCCMP instruction with FP16 operands, avoiding an assertion during instruction selection. Reviewers: olista01, SjoerdMeijer, t.p.northover, javed.absar Reviewed By: SjoerdMeijer Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D50115 ------------------------------------------------------------------------ llvm-svn: 338692
* [AArch64] Disallow the MachO specific .loh directive for windowsMartin Storsjo2018-08-011-6/+6
| | | | | | | | Also add a test for it being unsupported for linux. Differential Revision: https://reviews.llvm.org/D49929 llvm-svn: 338493
* [AArch64] Support the .inst directive for MachO and COFF targetsMartin Storsjo2018-07-312-7/+18
| | | | | | | | | | Contrary to ELF, we don't add any markers that distinguish data generated with .long from normal instructions, so the .inst directive only adds compatibility with assembly that uses it. Differential Revision: https://reviews.llvm.org/D49935 llvm-svn: 338355
* [AArch64][GlobalISel] Add isel support for G_BLOCK_ADDR.Amara Emerson2018-07-311-31/+65
| | | | | | | | | | | Also refactors some existing code to materialize addresses for the large code model so it can be shared between G_GLOBAL_VALUE and G_BLOCK_ADDR. This implements PR36390. Differential Revision: https://reviews.llvm.org/D49903 llvm-svn: 338337
* [AArch64][GlobalISel] Make G_BLOCK_ADDR legal.Amara Emerson2018-07-311-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D49902 llvm-svn: 338336
* [DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to ↵Craig Topper2018-07-302-2/+2
| | | | | | | | BuildSDIV/BuildUDIV/etc. The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation. llvm-svn: 338329
* [DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to ↵Craig Topper2018-07-302-9/+6
| | | | | | BuildSDIVPow2. llvm-svn: 338303
* Remove trailing spaceFangrui Song2018-07-303-5/+5
| | | | | | sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h} llvm-svn: 338293
* [MachineOutliner][AArch64] Add support for saving LR to a registerJessica Paquette2018-07-302-84/+163
| | | | | | | | | | | | | | | | | | | | | | This teaches the outliner to save LR to a register rather than the stack when possible. This allows us to avoid bumping the stack in outlined functions in some cases. By doing this, in a later patch, we can teach the outliner to do something like this: f1: ... bl OUTLINED_FUNCTION ... f2: ... move LR's contents to a register bl OUTLINED_FUNCTION move the register's contents back instead of falling back to saving LR in both cases. llvm-svn: 338278
* [AArch64][SVE] Asm: Enable instructions to be prefixed.Sander de Smalen2018-07-302-48/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables instructions that are destructive on their destination- and first source operand, to be prefixed with a MOVPRFX instruction. This patch also adds a variety of tests: - positive tests for all instructions and forms that accept a movprfx for either or both predicated and unpredicated forms. - negative tests for all instructions and forms that do not accept an unpredicated or predicated movprfx. - negative tests for the diagnostics that get emitted when a MOVPRFX instruction is used incorrectly. This is patch [2/2] in a series to add MOVPRFX instructions: - Patch [1/2]: https://reviews.llvm.org/D49592 - Patch [2/2]: https://reviews.llvm.org/D49593 Reviewers: rengolin, SjoerdMeijer, samparker, fhahn, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D49593 llvm-svn: 338261
* [AArch64][SVE] Asm: Add MOVPRFX instructions.Sander de Smalen2018-07-306-30/+273
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds predicated and unpredicated MOVPRFX instructions, which can be prepended to SVE instructions that are destructive on their first source operand, to make them a constructive operation, e.g. add z1.s, p0/m, z1.s, z2.s <=> z1 = z1 + z2 can be made constructive: movprfx z0, z1 add z0.s, p0/m, z0.s, z2.s <=> z0 = z1 + z2 The predicated MOVPRFX instruction can additionally be used to zero inactive elements, e.g. movprfx z0.s, p0/z, z1.s add z0.s, p0/m, z0.s, z2.s Not all instructions can be prefixed with the MOVPRFX instruction which is why this patch also adds a mechanism to validate prefixed instructions. The exact rules when a MOVPRFX applies is detailed in the SVE supplement of the Architectural Reference Manual. This is patch [1/2] in a series to add MOVPRFX instructions: - Patch [1/2]: https://reviews.llvm.org/D49592 - Patch [2/2]: https://reviews.llvm.org/D49593 Reviewers: rengolin, SjoerdMeijer, samparker, fhahn, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D49592 llvm-svn: 338258
* [AArch64][SVE] Asm: Support for WHILE(LE|LO|LS|LT) instructions.Sander de Smalen2018-07-292-0/+45
| | | | | | | | | | | | | | | | | | | | The WHILE instructions generate a predicate that is true while the comparison of the first scalar operand (incremented for each predicate element) with the second scalar operand is true and false thereafter. WHILELE While incrementing signed scalar less than or equal to scalar WHILELO While incrementing unsigned scalar lower than scalar WHILELS While incrementing unsigned scalar lower than or same as scalar WHILELT While incrementing signed scalar less than scalar e.g. whilele p0.s, x0, x1 generates predicate p0 (for 32bit elements) by incrementing (signed) x0 and comparing that vector to splat(x1). llvm-svn: 338211
* [AArch64][SVE] Asm: Instructions to perform serialized operations.Sander de Smalen2018-07-292-0/+63
| | | | | | | | | | | | The instructions added in this patch permit active elements within a vector to be processed sequentially without unpacking the vector. PFIRST Set the first active element to true. PNEXT Find next active element in predicate. CTERMEQ Compare and terminate loop when equal. CTERMNE Compare and terminate loop when not equal. llvm-svn: 338210
* [AArch64][SVE] Asm: Support for PFALSE and PTEST instructions.Sander de Smalen2018-07-282-0/+45
| | | | | | | | This patch adds PFALSE (unconditionally sets all elements of the predicate to false) and PTEST (set the status flags for the predicate). llvm-svn: 338198
* [AArch64][SVE] Asm: Data-dependent loop predicate partitioning instructions.Sander de Smalen2018-07-282-0/+100
| | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for instructions that partition a predicate based on data-dependent termination conditions in a loop. BRKA Break after the first true condition BRKAS Break after the first true condition, setting condition flags BRKB Break before the first true condition BRKBS Break before the first true condition, setting condition flags BRKPA Break after the first true condition, propagating from the previous partition BRKPAS Break after the first true condition, propagating from the previous partition, setting condition flags BRKPB Break before the first true condition, propagating from the previous partition BRKPBS Break before the first true condition, propagating from the previous partition, setting condition flags BRKN Propagate break to next partition BKRNS Propagate break to next partition, setting condition flags llvm-svn: 338196
* DAG: Add calling convention argument to calling convention funcsMatt Arsenault2018-07-281-1/+1
| | | | | | | | This seems like a pretty glaring omission, and AMDGPU wants to treat kernels differently from other calling conventions. llvm-svn: 338194
* Recommit "Enable MachineOutliner by default under -Oz for AArch64"Jessica Paquette2018-07-273-0/+9
| | | | | | | | | | | | | | | | | | Fixed the ASAN failure from before in r338148, so recommiting. This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338160
* [MachineOutliner] Exit getOutliningCandidateInfo when we erase all candidatesJessica Paquette2018-07-271-0/+4
| | | | | | | | | | There was a missing check for if a candidate list was entirely deleted. This adds that check. This fixes an asan failure caused by running test/CodeGen/AArch64/addsub_ext.ll with the MachineOutliner enabled. llvm-svn: 338148
* Revert "Enable MachineOutliner by default under -Oz for AArch64"Jessica Paquette2018-07-273-9/+0
| | | | | | | | | | It failed an Asan test on a bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/21543/steps/check-llvm%20asan/logs/stdio Fixing that before recommitting. llvm-svn: 338136
* Enable MachineOutliner by default under -Oz for AArch64Jessica Paquette2018-07-273-0/+9
| | | | | | | | | | | | | | | | This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338133
* [AArch64][SVE] Asm: Predicated integer reductions.Sander de Smalen2018-07-272-0/+62
| | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for various integer reduction operations: SADDV signed add reduction to scalar UADDV unsigned add reduction to scalar SMAXV signed maximum reduction to scalar SMINV signed minimum reduction to scalar UMAXV unsigned maximum reduction to scalar UMINV unsigned minimum reduction to scalar ANDV logical AND reduction to scalar ORV logical OR reduction to scalar EORV logical EOR reduction to scalar The reduction is predicated, e.g. smaxv s0, p0, z1.s performs a signed maximum reduction on active elements in z1, and stores the (signed max value) result in s0. llvm-svn: 338126
* [AArch64][SVE] Asm: Predicated floating point reductions.Sander de Smalen2018-07-272-1/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for various floating-point reduction operations: FADDA strictly-ordered add reduction, accumulating in scalar FADDV recursive add reduction to scalar FMAXV recursive max reduction to scalar FMINV recursive min reduction to scalar FMAXNMV recursive max number reduction to scalar FMINNMV recursive min number reduction to scalar The reduction is predicated, e.g. fadda d0, p0, d0, z1.d performs the add-reduction in strict order on active elements in z1, accumulating into d0. faddv d0, p0, z1.d performs the add-reduction (not in strict order) on active elements in z1, storing the result in d0. llvm-svn: 338123
* [AArch64][SVE] Asm: Support for FEXPA and FTSSEL.Sander de Smalen2018-07-272-0/+51
| | | | | | | | This patch adds support for transcendental acceleration instructions 'FEXPA' (exponential accelerator) and 'FTSSEL' (trigonometric select coefficient). llvm-svn: 338121
* [AArch64][SVE] Asm: Support for FRECPE and FRSQRTE.Sander de Smalen2018-07-272-0/+30
| | | | | | | Support for floating-point instructions for reciprocal estimate (FRECPE) and reciprocal square root estimate (FRSQRTE). llvm-svn: 338120
* Enable some pointer authentication instructions for aarch64 v8a targetsLuke Cheeseman2018-07-261-24/+27
| | | | | | | | | | | - Some of the v8.3 pointer authentication instruction inhabit the Hint space - These instructions can be assembled to hint instructions which act as NOP instructions prior to v8.3 - This patch permits using the hint instructions for all v8a targets - Also, correct the RETA{A,B} instructions to match the instruction attributes of RET (set isTerminator and isBarrier) Differential Revision: https://reviews.llvm.org/D49786 llvm-svn: 338029
* [AArch64] Armv8.2-A: add the crypto extensionsSjoerd Meijer2018-07-263-5/+192
| | | | | | | | | This adds MC support for the crypto instructions that were made optional extensions in Armv8.2-A (AArch64 only). Differential Revision: https://reviews.llvm.org/D49370 llvm-svn: 338010
* [COFF] Hoist constant pool handling from X86AsmPrinter into AsmPrinterMartin Storsjo2018-07-251-3/+1
| | | | | | | | | | | | | | | | | | | In SVN r334523, the first half of comdat constant pool handling was hoisted from X86WindowsTargetObjectFile (which despite the name only was used for msvc targets) into the arch independent TargetLoweringObjectFileCOFF, but the other half of the handling was left behind in X86AsmPrinter::GetCPISymbol. With only half of the handling in place, inconsistent comdat sections/symbols are created, causing issues with both GNU binutils (avoided for X86 in SVN r335918) and with the MS linker, which would complain like this: fatal error LNK1143: invalid or corrupt file: no symbol for COMDAT section 0x4 Differential Revision: https://reviews.llvm.org/D49644 llvm-svn: 337950
* [MachineOutliner][NFC] Move target frame info into OutlinedFunctionJessica Paquette2018-07-242-12/+13
| | | | | | | | | | | | | | Just some gardening here. Similar to how we moved call information into Candidates, this moves outlined frame information into OutlinedFunction. This allows us to remove TargetCostInfo entirely. Anywhere where we returned a TargetCostInfo struct, we now return an OutlinedFunction. This establishes OutlinedFunctions as more of a general repeated sequence, and Candidates as occurrences of those repeated sequences. llvm-svn: 337848
* [MachineOutliner][NFC] Make Candidates own their call informationJessica Paquette2018-07-242-17/+26
| | | | | | | | | | | | | Before this, TCI contained all the call information for each Candidate. This moves that information onto the Candidates. As a result, each Candidate can now supply how it ought to be called. Thus, Candidates will be able to, say, call the same function in cheaper ways when possible. This also removes that information from TCI, since it's no longer used there. A follow-up patch for the AArch64 outliner will demonstrate this. llvm-svn: 337840
* [AArch64] Use MCAsmInfoMicrosoft and MCAsmInfoGNUCOFF as base classesMartin Storsjo2018-07-232-9/+14
| | | | | | | | | | | | | | | | This matches the structure used on X86 and ARM. This requires a little bit of duplication of the parts that are equal in both AArch64 COFF variants though. Before SVN r335286, these classes didn't add anything that MCAsmInfoCOFF didn't, but now they do. This makes AArch64 match X86 in how comdat is used for float constants for MinGW. Differential Revision: https://reviews.llvm.org/D49637 llvm-svn: 337755
* [AArch64][SVE] Asm: Support for bit/byte reverse operations.Sander de Smalen2018-07-202-0/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the following instructions: RBIT reverse bits within each active elemnt (predicated), e.g. rbit z0.d, p0/m, z1.d for 8, 16, 32 and 64 bit elements. REV reverse order of elements in data/predicate vector (unpredicated), e.g. rev z0.d, z1.d rev p0.d, p1.d for 8, 16, 32 and 64 bit elements. REVB reverse order of bytes within each active element, e.g. revb z0.d, p0/m, z1.d for 16, 32 and 64 bit elements. REVH reverse order of 16-bit half-words within each active element, e.g. revh z0.d, p0/m, z1.d for 32 and 64 bit elements. REVW reverse order of 32-bit words within each active element, e.g. revw z0.d, p0/m, z1.d for 64 bit elements. llvm-svn: 337534
* [AArch64][SVE] Asm: Support for FTMAD instruction.Sander de Smalen2018-07-202-0/+27
| | | | | | | | | | | Floating-point trigonometric multiply-add coefficient, e.g. ftmad z0.h, z0.h, z1.h, #7 with variants for 16, 32 and 64-bit elements. llvm-svn: 337533
* [AArch64][SVE] Asm: Support for unpredicated FP operations.Sander de Smalen2018-07-182-0/+35
| | | | | | | | | | | | | | | | | | This patch adds support for the following unpredicated floating-point instructions: FADD Floating point add FSUB Floating point subtract FMUL Floating point multiplication FTSMUL Floating point trigonometric starting value FRECPS Floating point reciprocal step FRSQRTS Floating point reciprocal square root step The instructions have the following assembly format: fadd z0.h, z1.h, z2.h and have variants for 16, 32 and 64-bit FP elements. llvm-svn: 337383
* [AArch64][SVE] Asm: Support for UDOT/SDOT instructions.Sander de Smalen2018-07-182-0/+72
| | | | | | | | | | | | | | | | | | | | | | | | | The signed/unsigned DOT instructions perform a dot-product on quadtuplets from two source vectors and accumulate the result in the destination register. The instructions come in two forms: Vector form, e.g. sdot z0.s, z1.b, z2.b - signed dot product on four 8-bit quad-tuplets, accumulating results in 32-bit elements. udot z0.d, z1.h, z2.h - unsigned dot product on four 16-bit quad-tuplets, accumulating results in 64-bit elements. Indexed form, e.g. sdot z0.s, z1.b, z2.b[3] - signed dot product on four 8-bit quad-tuplets with specified quadtuplet from second source vector, accumulating results in 32-bit elements. udot z0.d, z1.h, z2.h[1] - dot product on four 16-bit quad-tuplets with specified quadtuplet from second source vector, accumulating results in 64-bit elements. llvm-svn: 337372
* [AArch64][SVE] Asm: Integer divide instructions.Sander de Smalen2018-07-182-0/+11
| | | | | | | | | | | | | | | | | | This patch adds the following predicated instructions: UDIV Unsigned divide active elements UDIVR Unsigned divide active elements, reverse form. SDIV Signed divide active elements SDIVR Signed divide active elements, reverse form. e.g. udiv z0.s, p0/m, z0.s, z1.s (unsigned divide active elements in z0 by z1, store result in z0) sdivr z0.s, p0/m, z0.s, z1.s (signed divide active elements in z1 by z0, store result in z0) llvm-svn: 337369
* [AArch64][SVE] Asm: Support for integer MUL instructions.Sander de Smalen2018-07-182-8/+26
| | | | | | | | | | | | | | | | This patch adds the following instructions: MUL - multiply vectors, e.g. mul z0.h, p0/m, z0.h, z1.h - multiply with immediate, e.g. mul z0.h, z0.h, #127 SMULH - signed multiply returning high half, e.g. smulh z0.h, p0/m, z0.h, z1.h UMULH - unsigned multiply returning high half, e.g. umulh z0.h, p0/m, z0.h, z1.h llvm-svn: 337358
* [AArch64][SVE]: Integer multiply-add/subtract instructions.Sander de Smalen2018-07-172-0/+69
| | | | | | | | | | This patch adds support for the following instructions: MLA mul-add, writing addend (Zda = Zda + Zn * Zm) MLS mul-sub, writing addend (Zda = Zda + -Zn * Zm) MAD mul-add, writing multiplicant (Zdn = Za + Zdn * Zm) MSB mul-sub, writing multiplicant (Zdn = Za + -Zdn * Zm) llvm-svn: 337293
* [AArch64][SVE] Asm: FP fused multiply-add/subtract instructions.Sander de Smalen2018-07-172-0/+119
| | | | | | | | | | | | | | | | | | This patch adds support for the following instructions: FMLA mul-add, writing addend (Zda = Zda + Zn * Zm) FNMLA negated mul-add, writing addend (Zda = -Zda + -Zn * Zm) FMLS mul-sub, writing addend (Zda = Zda + -Zn * Zm) FNMLS negated mul-sub, writing addend (Zda = -Zda + Zn * Zm) FMAD mul-add, writing multiplicant (Zdn = Za + Zdn * Zm) FNMAD negated mul-add, writing multiplicant (Zdn = -Za + -Zdn * Zm) FMSB mul-sub, writing multiplicant (Zdn = Za + -Zdn * Zm) FNMSB negated mul-sub, writing multiplicant (Zdn = -Za + Zdn * Zm) llvm-svn: 337282
* [AArch64][SVE] Asm: Support for predicated FP operations (FP immediate)Sander de Smalen2018-07-171-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | This patch completes support for the following floating point instructions that take FP immediates: FADD* (addition) FSUB (subtract) FSUBR (subtract reverse form) FMUL* (multiplication) FMAX* (maximum) FMAXNM (maximum number) FMIN (maximum) FMINNM (maximum number) All operations are predicated and take a FP immediate operand, e.g. fadd z0.h, p0/m, z0.h, #0.5 fmin z0.s, p0/m, z0.s, #1.0 ^___________^ (tied) * Instructions added in a previous patch. llvm-svn: 337272
* [AArch64][SVE] Asm: Support for predicated FP operations.Sander de Smalen2018-07-172-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for the following floating point instructions: FABD (absolute difference) FADD (addition) FSUB (subtract) FSUBR (subtract reverse form) FDIV (divide) FDIVR (divide reverse form) FMAX (maximum) FMAXNM (maximum number) FMIN (minimum) FMINNM (minimum number) FSCALE (adjust exponent) FMULX (multiply extended) All operations are predicated and binary form, e.g. fadd z0.h, p0/m, z0.h, z1.h ^___________^ (tied) Supporting 16, 32 and 64-bit FP elements. llvm-svn: 337259
* [AArch64][SVE] Asm: Support for SPLICE instruction.Sander de Smalen2018-07-172-0/+26
| | | | | | | | | | | | | | | | The SPLICE instruction splices two vectors into one vector using a predicate. It copies the active elements from the first vector, and then fills the remaining elements with the low-numbered elements from the second vector. The instruction has the following form, e.g. splice z0.b, p0, z0.b, z1.b for 8-bit elements. It also supports 16, 32 and 64-bit elements. llvm-svn: 337253
* [AArch64][SVE] Asm: Support for EXT instruction.Sander de Smalen2018-07-172-0/+21
| | | | | | | | | | | | | This patch adds an instruction that allows extracting a vector from a pair of vectors, given an immediate index that describes the element position to extract from. The instruction has the following assembly: ext z0.b, z0.b, z1.b, #imm where #imm is an immediate between 0 and 255. llvm-svn: 337251
* [X86][AArch64][DAGCombine] Unfold 'check for [no] signed truncation' patternRoman Lebedev2018-07-161-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]] As discussed in https://reviews.llvm.org/D49179#1158957 and later, the IR for 'check for [no] signed truncation' pattern can be improved: https://rise4fun.com/Alive/gBf ^ that pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in signed case, therefore it is probably a good idea to improve it. But the IR-optimal patter does not lower efficiently, so we want to undo it.. This handles the simple pattern. There is a second pattern with predicate and constants inverted. NOTE: we do not check uses here. we always do the transform. Reviewers: spatel, craig.topper, RKSimon, javed.absar Reviewed By: spatel Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49266 llvm-svn: 337166
* [AArch64] Armv8.4-A: LDAPR & STLR with immediate offset instructions (cont'd)Sjoerd Meijer2018-07-133-22/+35
| | | | | | | | | Follow up of rL336913: fix base class description. Thanks to Ahmed Bougacha for pointing this out. Differential Revision: https://reviews.llvm.org/D49284 llvm-svn: 337009
* [cfi-verify] Support AArch64.Joel Galenson2018-07-132-8/+14
| | | | | | | | | | | | This patch adds support for AArch64 to cfi-verify. This required three changes to cfi-verify. First, it generalizes checking if an instruction is a trap by adding a new isTrap flag to TableGen (and defining it for x86 and AArch64). Second, the code that ensures that the operand register is not clobbered between the CFI check and the indirect call needs to allow a single dereference (in x86 this happens as part of the jump instruction). Third, we needed to ensure that return instructions are not counted as indirect branches. Technically, returns are indirect branches and can be covered by CFI, but LLVM's forward-edge CFI does not protect them, and x86 does not consider them, so we keep that behavior. In addition, we had to improve AArch64's code to evaluate the branch target of a MCInst to handle calls where the destination is not the first operand (which it often is not). Differential Revision: https://reviews.llvm.org/D48836 llvm-svn: 337007
* [AArch64][SVE] Asm: Vector Unpack Low/High instructions.Sander de Smalen2018-07-132-0/+45
| | | | | | | | | | | | | | | | | This patch adds support for the following unpack instructions: - PUNPKLO, PUNPKHI Unpack elements from low/high half and place into elements of twice their size. e.g. punpklo p0.h, p0.b - UUNPKLO, UUNPKHI Unpack elements from low/high half and SUNPKLO, SUNPKHI place into elements of twice their size after zero- or sign-extending the values. e.g. uunpklo z0.h, z0.b llvm-svn: 336982
* [AArch64][SVE] Asm: Support for insert element (INSR) instructions.Sander de Smalen2018-07-132-0/+51
| | | | | | | | | | | | | | Insert general purpose register into shifted vector, e.g. insr z0.s, w0 insr z0.d, x0 Insert SIMD&FP scalar register into shifted vector, e.g. insr z0.b, b0 insr z0.h, h0 insr z0.s, s0 insr z0.d, d0 llvm-svn: 336979
OpenPOWER on IntegriCloud