summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64
Commit message (Collapse)AuthorAgeFilesLines
...
* [AArch64][GlobalISel] Any-extend vararg parameters to stack slot size on Darwin.Amara Emerson2018-07-021-0/+3
| | | | | | | | | | | We currently don't any-extend vararg parameters before storing them to the stack locations on Darwin. However, SelectionDAG however does this, and so user code is in the wild which inadvertently relies on this extension. This can manifest in cases where the value stored is (int)0, but the actual parameter is interpreted by va_arg as a pointer, and so not extending to 64 bits causes the callee to load additional undefined bits. llvm-svn: 336120
* [AArch64][SVE] Asm: Support for (SQ)INCP/DECP (scalar, vector)Sander de Smalen2018-07-022-0/+94
| | | | | | | | | | | | | | | | | | | | | Increments/decrements the result with the number of active bits from the predicate. The inc/dec variants added are: - incp x0, p0.h (scalar) - incp z0.h, p0 (vector) The unsigned saturating inc/dec variants added are: - uqincp x0, p0.h (scalar) - uqincp w0, p0.h (scalar, 32bit) - uqincp z0.h, p0 (vector) The signed saturating inc/dec variants added are: - sqincp x0, p0.h (scalar) - sqincp x0, p0.h, w0 (scalar, 32bit) - sqincp z0.h, p0 (vector) llvm-svn: 336091
* [AArch64][SVE] Asm: Support for (saturating) vector INC/DEC instructions.Sander de Smalen2018-07-022-0/+49
| | | | | | | | | | | | | | | | | | | Increment/decrement vector by multiple of predicate constraint element count. The variants added by this patch are: - INCH, INCW, INC and (saturating): - SQINCH, SQINCW, SQINCD - UQINCH, UQINCW, UQINCW - SQDECH, SQINCW, SQINCD - UQDECH, UQINCW, UQINCW For example: incw z0.s, all, mul #4 llvm-svn: 336090
* [AArch64][SVE] Asm: Support for vector element compares (immediate).Sander de Smalen2018-07-022-0/+85
| | | | | | | | Compare vector elements with a signed/unsigned immediate, e.g. cmpgt p0.s, p0/z, z0.s, #-16 cmphi p0.s, p0/z, z0.s, #127 llvm-svn: 336081
* Reapply r334980 and r334983.Sander de Smalen2018-07-026-18/+154
| | | | | | | | | These patches were previously reverted as they led to buildbot time-outs caused by large switch statement in printAliasInstr when using UBSan and O3. The issue has been addressed with a workaround (r335525). llvm-svn: 336079
* [AArch64] Armv8.4-A: Virtualization system registersSjoerd Meijer2018-06-291-0/+22
| | | | | | | | This adds the Secure EL2 extension. Differential Revision: https://reviews.llvm.org/D48711 llvm-svn: 335962
* [ARM][AArch64] Armv8.4-A EnablementSjoerd Meijer2018-06-292-1/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initial patch adding assembly support for Armv8.4-A. Besides adding v8.4 as a supported architecture to the usual places, this also adds target features for the different crypto algorithms. Armv8.4-A introduced new crypto algorithms, made them optional, and allows different combinations: - none of the v8.4 crypto functions are supported, which is independent of the implementation of the Armv8.0 SHA1 and SHA2 instructions. - the v8.4 SHA512 and SHA3 support is implemented, in this case the Armv8.0 SHA1 and SHA2 instructions must also be implemented. - the v8.4 SM3 and SM4 support is implemented, which is independent of the implementation of the Armv8.0 SHA1 and SHA2 instructions. - all of the v8.4 crypto functions are supported, in this case the Armv8.0 SHA1 and SHA2 instructions must also be implemented. The v8.4 crypto instructions are added to AArch64 only, and not AArch32, and are made optional extensions to Armv8.2-A. The user-facing Clang options will map on these new target features, their naming will be compatible with GCC and added in follow-up patches. The Armv8.4-A instruction sets can be downloaded here: https://developer.arm.com/products/architecture/a-profile/exploration-tools Differential Revision: https://reviews.llvm.org/D48625 llvm-svn: 335953
* [MachineOutliner] Define MachineOutliner support in TargetOptionsJessica Paquette2018-06-282-2/+3
| | | | | | | | | | | | | | | Targets should be able to define whether or not they support the outliner without the outliner being added to the pass pipeline. Before this, the outliner pass would be added, and ask the target whether or not it supports the outliner. After this, it's possible to query the target in TargetPassConfig, before the outliner pass is created. This ensures that passing -enable-machine-outliner will not modify the pass pipeline of any target that does not support it. https://reviews.llvm.org/D48683 llvm-svn: 335887
* SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O)Matthias Braun2018-06-281-1/+3
| | | | | | | | | | | | | Add NoTrapAfterNoreturn target option which skips emission of traps behind noreturn calls even if TrapUnreachable is enabled. Enable the feature on Mach-O to save code size; Comments suggest it is not possible to enable it for the other users of TrapUnreachable. rdar://41530228 DifferentialRevision: https://reviews.llvm.org/D48674 llvm-svn: 335877
* [globalisel][legalizer] Add AtomicOrdering to LegalityQuery and use it in ↵Daniel Sanders2018-06-271-2/+6
| | | | | | | | | | | | | AArch64 Now that we have the ability to legalize based on MMO's. Add support for legalizing based on AtomicOrdering and use it to correct the legalization of the atomic instructions. Also extend all() to be a variadic template as this ruleset now requires 3 and 4 argument versions. llvm-svn: 335767
* [MachineOutliner] Don't outline sequences where x16/x17/nzcv are live acrossJessica Paquette2018-06-272-27/+41
| | | | | | | | | | | | | | It isn't safe to outline sequences of instructions where x16/x17/nzcv live across the sequence. This teaches the outliner to check whether or not a specific canidate has x16/x17/nzcv live across it and discard the candidate in the case that that is true. https://bugs.llvm.org/show_bug.cgi?id=37573 https://reviews.llvm.org/D47655 llvm-svn: 335758
* [AArch64] Reverting FP16 vcvth_n_s64_f16 to fixLuke Geeson2018-06-271-2/+0
| | | | llvm-svn: 335737
* [AArch64] Add custom lowering for v4i8 trunc storeAdhemerval Zanella2018-06-273-8/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a custom trunc store lowering for v4i8 vector types. Since there is not v.4b register, the v4i8 is promoted to v4i16 (v.4h) and default action for v4i8 is to extract each element and issue 4 byte stores. A better strategy would be to extended the promoted v4i16 to v8i16 (with undef elements) and extract and store the word lane which represents the v4i8 subvectores. The construction: define void @foo(<4 x i16> %x, i8* nocapture %p) { %0 = trunc <4 x i16> %x to <4 x i8> %1 = bitcast i8* %p to <4 x i8>* store <4 x i8> %0, <4 x i8>* %1, align 4, !tbaa !2 ret void } Can be optimized from: umov w8, v0.h[3] umov w9, v0.h[2] umov w10, v0.h[1] umov w11, v0.h[0] strb w8, [x0, #3] strb w9, [x0, #2] strb w10, [x0, #1] strb w11, [x0] ret To: xtn v0.8b, v0.8h str s0, [x0] ret The patch also adjust the memory cost for autovectorization, so the C code: void foo (const int *src, int width, unsigned char *dst) { for (int i = 0; i < width; i++) *dst++ = *src++; } can be vectorized to: .LBB0_4: // %vector.body // =>This Inner Loop Header: Depth=1 ldr q0, [x0], #16 subs x12, x12, #4 // =4 xtn v0.4h, v0.4s xtn v0.8b, v0.8h st1 { v0.s }[0], [x2], #4 b.ne .LBB0_4 Instead of byte operations. llvm-svn: 335735
* [AArch64] Remove Duplicate FP16 Patterns with same encoding, match on ↵Luke Geeson2018-06-274-35/+54
| | | | | | existing patterns llvm-svn: 335715
* [CostModel][AArch64] Add some initial costs for SK_Select and ↵Simon Pilgrim2018-06-221-17/+32
| | | | | | | | | | | | | | SK_PermuteSingleSrc AArch64 was only setting costs for SK_Transpose, which meant that many of the simpler shuffles (e.g. SK_Select and SK_PermuteSingleSrc for larger vector elements) was being severely overestimated by the default shuffle expansion. This patch adds costs to help improve SLP performance and avoid a regression in reductions introduced by D48174. I'm not very knowledgeable about AArch64 shuffle lowering so I've kept the extra costs to a minimum - someone who knows this code can add extra costs which should improve vectorization a lot more. Differential Revision: https://reviews.llvm.org/D48172 llvm-svn: 335329
* Revert "[AArch64] Coalesce Copy Zero during instruction selection"Sirish Pande2018-06-211-29/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit d8f57105010cc7e78026e511d5def873fc91e0e7. Original Commit: Author: Haicheng Wu <haicheng@codeaurora.org> Date: Sun Feb 18 13:51:33 2018 +0000 [AArch64] Coalesce Copy Zero during instruction selection Add special case for copy of zero to avoid a double copy. Differential Revision: https://reviews.llvm.org/D36104 Author's intention is to remove a BB that has one mov instruction. In order to do that, d8f571050 pessmizes MachineSinking by introducing a copy, such that mov instruction is NOT moved to the BB. Optimization downstream gets rid of the BB with only mov instruction. This works well if we have only one fall through branch as there is only one "extra" mov instruction. If we have multiple fall throughs, we will have a lot of redundant movs. In such a case, it's better to have this BB which has one mov instruction. This is causing degradation in jpeg, fft and other codebases. I believe if we want to remove a BB with only one branch instruction, we should not pessimize Machine Sinking at all, and find some other solution. llvm-svn: 335251
* [AArch64] Implement FLT_ROUNDS macro.Tim Northover2018-06-203-0/+28
| | | | | | | | | | Very similar to ARM implementation, just maps to an MRS. Should fix PR25191. Patch by Michael Brase. llvm-svn: 335118
* Revert r334980 and 334983Vlad Tsyrklevich2018-06-206-154/+18
| | | | | | | This reverts commits r334980 and r334983 because they were causing build timeouts on the x86_64-linux-ubsan bot. llvm-svn: 335085
* [MachineOutliner] NFC: Remove insertOutlinerPrologue, rename ↵Jessica Paquette2018-06-192-8/+2
| | | | | | | | | | | | insertOutlinerEpilogue insertOutlinerPrologue was not used by any target, and prologue-esque code was beginning to appear in insertOutlinerEpilogue. Refactor that into one function, buildOutlinedFrame. This just removes insertOutlinerPrologue and renames insertOutlinerEpilogue. llvm-svn: 335076
* [AArch64][SVE] Asm: Fix predicate pattern diagnostics.Sander de Smalen2018-06-181-4/+6
| | | | | | | | | | | | | | | This patch uses the DiagnosticPredicate for SVE predicate patterns to improve their diagnostics, now giving a 'invalid operand' diagnostic if the type is not an immediate or one of the expected pattern labels. Reviewers: samparker, SjoerdMeijer, javed.absar, fhahn Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48220 llvm-svn: 334983
* [AArch64][SVE] Asm: Support for saturating INC/DEC (32bit scalar) instructions.Sander de Smalen2018-06-186-14/+148
| | | | | | | | | | | | | | | | | | | | | | | The variants added by this patch are: - SQINC signed increment, e.g. sqinc x0, w0, all, mul #4 - SQDEC signed decrement, e.g. sqdec x0, w0, all, mul #4 - UQINC unsigned increment, e.g. uqinc w0, all, mul #4 - UQDEC unsigned decrement, e.g. uqdec w0, all, mul #4 This patch includes asmparser changes to parse a GPR64 as a GPR32 in order to satisfy the constraint check: x0 == GPR64(w0) in: sqinc x0, w0, all, mul #4 ^___^ (must match) Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47716 llvm-svn: 334980
* [AArch64][SVE] Asm: Support for saturating INC/DEC (64bit scalar) instructions.Sander de Smalen2018-06-182-0/+51
| | | | | | | | | | | | | | | | | | Summary: The variants added by this patch are: - SQINC (signed increment) - UQINC (unsigned increment) - SQDEC (signed decrement) - UQDEC (unsigned decrement) For example: uqincw x0, all, mul #4 Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Differential Revision: https://reviews.llvm.org/D47715 llvm-svn: 334948
* [AArch64][SVE] Asm: Support for vector element compares.Sander de Smalen2018-06-182-0/+102
| | | | | | | | | | | | | | | | This patch adds instructions for comparing elements from two vectors, e.g. cmpgt p0.s, p0/z, z0.s, z1.s and also adds support for comparing to a 64-bit wide element vector, e.g. cmpgt p0.s, p0/z, z0.s, z1.d The patch also contains aliases for certain comparisons, e.g.: cmple p0.s, p0/z, z0.s, z1.s => cmpge p0.s, p0/z, z1.s, z0.s cmplo p0.s, p0/z, z0.s, z1.s => cmphi p0.s, p0/z, z1.s, z0.s cmpls p0.s, p0/z, z0.s, z1.s => cmphs p0.s, p0/z, z1.s, z0.s cmplt p0.s, p0/z, z0.s, z1.s => cmpgt p0.s, p0/z, z1.s, z0.s llvm-svn: 334931
* [AArch64][SVE] Asm: Support for bitwise operations on predicate vectors.Sander de Smalen2018-06-171-0/+29
| | | | | | | | | | | | | | | | | | | This patch adds support for instructions performing bitwise operations on predicate vectors, including AND, BIC, EOR, NAND, NOR, ORN, ORR, and their status flag setting variants ANDS, BICS, EORS, NANDS, ORNS, ORRS. This patch also adds several aliases: orr p0.b, p1/z, p1.b, p1.b => mov p0.b, p1.b orrs p0.b, p1/z, p1.b, p1.b => movs p0.b, p1.b and p0.b, p1/z, p2.b, p2.b => mov p0.b, p1/z, p2.b ands p0.b, p1/z, p2.b, p2.b => movs p0.b, p1/z, p2.b eor p0.b, p1/z, p2.b, p1.b => not p0.b, p1/z, p2.b eors p0.b, p1/z, p2.b, p1.b => nots p0.b, p1/z, p2.b llvm-svn: 334906
* [AArch64][SVE] Asm: Support for SEL (vector/predicate) instructions.Sander de Smalen2018-06-172-0/+81
| | | | | | | | Support for SVE's predicated select instructions to select elements from either vector, both in a data-vector and a predicate-vector variant. llvm-svn: 334905
* [AArch64][SVE] Asm: Support for CPY SIMD/FP and GPR instructions.Sander de Smalen2018-06-152-0/+79
| | | | | | | Predicated splat/copy of SIMD/FP register or general purpose register to SVE vector, along with MOV-aliases. llvm-svn: 334842
* [AArch64][SVE] Asm: Support for INC/DEC (scalar) instructions.Sander de Smalen2018-06-155-12/+105
| | | | | | | | | | | | | Increment/decrement scalar register by (scaled) element count given by predicate pattern, e.g. 'incw x0, all, mul #4'. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47713 llvm-svn: 334838
* [AArch64][SVE] Asm: Support for FADD, FMUL and FMAX immediate instructions.Sander de Smalen2018-06-153-0/+57
| | | | | | | | | | Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: javed.absar Differential Revision: https://reviews.llvm.org/D47712 llvm-svn: 334831
* [AArch64][SVE] Asm: Add parsing/printing support for exact FP immediates.Sander de Smalen2018-06-158-46/+164
| | | | | | | | | | | | | | | | Some instructions require of a limited set of FP immediates as operands, for example '#0.5 or #1.0' for SVE's FADD instruction. This patch adds support for parsing and printing such FP immediates as exact values (e.g. #0.499999 is not accepted for #0.5). Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47711 llvm-svn: 334826
* [TableGen] Emit a fatal error on inconsistencies in resource units vs cycles.Clement Courbet2018-06-131-6/+6
| | | | | | | | | | | | | | | | | | | | | Summary: For targets I'm not familiar with, I've automatically made the "default to 1 for each resource" behaviour explicit in the td files. For more obvious cases, I've ventured a fix. Some notes: - Exynos is especially fishy. - AArch64SchedThunderX2T99.td had some truncated entries. If I understand correctly, the person who wrote that interpreted the ResourceCycle as a range. I made the decision to use the upper/lower bound for consistency with the 'Latency' value. I'm sure there is a better choice. - The change to X86ScheduleBtVer2.td is an NFC, it just makes values more explicit. Also see PR37310. Reviewers: RKSimon, craig.topper, javed.absar Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D46356 llvm-svn: 334586
* [AArch64] Support reserving x20 registerPetr Hosek2018-06-124-5/+25
| | | | | | | | | | | | Register x20 is a callee-saved register which may be used for other purposes in certain contexts, for example to hold special variables within the kernel. This change adds support for reserving this register both to frontend and backend to make this register usable for these purposes. Differential Revision: https://reviews.llvm.org/D46552 llvm-svn: 334531
* [AArch64] Audit on rL333879 to fix FP16 64bit bitpatternsLuke Geeson2018-06-121-2/+2
| | | | llvm-svn: 334488
* [ExynosM1][Sched] Fix resource usage in scheduling model.Clement Courbet2018-06-111-16/+16
| | | | | | This is part of https://reviews.llvm.org/D46356. llvm-svn: 334391
* [AArch64, ARM] Add support for Samsung Exynos M4Evandro Menezes2018-06-061-0/+1
| | | | | | Create a separate feature set for Exynos M4 and add test cases. llvm-svn: 334115
* [MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixupPeter Smith2018-06-061-4/+8
| | | | | | | | | | | | | | | | | | On targets like Arm some relaxations may only be performed when certain architectural features are available. As functions can be compiled with differing levels of architectural support we must make a judgement on whether we can relax based on the MCSubtargetInfo for the function. This change passes through the MCSubtargetInfo for the function to fixupNeedsRelaxation so that the decision on whether to relax can be made per function. In this patch, only the ARM backend makes use of this information. We must also pass the MCSubtargetInfo to applyFixup because some fixups skip error checking on the assumption that relaxation has occurred, to prevent code-generation errors applyFixup must see the same MCSubtargetInfo as fixupNeedsRelaxation. Differential Revision: https://reviews.llvm.org/D44928 llvm-svn: 334078
* [MachineOutliner] NFC - Move intermediate data structures to MachineOutliner.hJessica Paquette2018-06-042-56/+53
| | | | | | | | | | | | | | | | | | | | | This is setting up to fix bug 37573 cleanly. This moves data structures that are technically both used in some way by the target and the general-purpose outlining algorithm into MachineOutliner.h. In particular, the `Candidate` class is of importance. Before, the outliner passed the locations of `Candidates` to the target, which would then make some decisions about the prospective outlined function. This change allows us to just pass `Candidates` along to the target. This will allow the target to discard `Candidates` that would be considered unsafe before cost calculation. Thus, we will be able to remove the unsafe candidates described in the bug without resorting to torching the entire prospective function. Also, as a side-effect, it makes the outliner a bit cleaner. https://bugs.llvm.org/show_bug.cgi?id=37573 llvm-svn: 333952
* TableGen: Streamline the semantics of NAMENicolai Haehnle2018-06-041-72/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The new rules are straightforward. The main rules to keep in mind are: 1. NAME is an implicit template argument of class and multiclass, and will be substituted by the name of the instantiating def/defm. 2. The name of a def/defm in a multiclass must contain a reference to NAME. If such a reference is not present, it is automatically prepended. And for some additional subtleties, consider these: 3. defm with no name generates a unique name but has no special behavior otherwise. 4. def with no name generates an anonymous record, whose name is unique but undefined. In particular, the name won't contain a reference to NAME. Keeping rules 1&2 in mind should allow a predictable behavior of name resolution that is simple to follow. The old "rules" were rather surprising: sometimes (but not always), NAME would correspond to the name of the toplevel defm. They were also plain bonkers when you pushed them to their limits, as the old version of the TableGen test case shows. Having NAME correspond to the name of the toplevel defm introduces "spooky action at a distance" and breaks composability: refactoring the upper layers of a hierarchy of nested multiclass instantiations can cause unexpected breakage by changing the value of NAME at a lower level of the hierarchy. The new rules don't suffer from this problem. Some existing .td files have to be adjusted because they ended up depending on the details of the old implementation. Change-Id: I694095231565b30f563e6fd0417b41ee01a12589 Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D47430 llvm-svn: 333900
* [AArch64] Audit on rL333634 to fix FP16 Disasm BitPatternsLuke Geeson2018-06-042-2/+2
| | | | llvm-svn: 333879
* [AArch64][SVE] Fix range for DUP immediates (16bit elts)Sander de Smalen2018-06-042-3/+11
| | | | | | | | | | | | | | | For immediates used in DUP instructions that have the range -128 to 127, or a multiple of 256 in the range -32768 to 32512, one could argue that when the result element size is 16bits (.h), the value can be considered both signed and unsigned. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47619 llvm-svn: 333873
* [AArch64][SVE] Asm: Print indexed element 0 as FPR.Sander de Smalen2018-06-045-0/+67
| | | | | | | | | | | | | | | | | | | | Print the first indexed element as a FP register, for example: mov z0.d, z1.d[0] Is now printed as: mov z0.d, d1 Next to printing, this patch also adds aliases to parse 'mov z0.d, d1'. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47571 llvm-svn: 333872
* [AArch64][SVE] Asm: Support for indexed DUP instructions.Sander de Smalen2018-06-044-71/+127
| | | | | | | | | | | | | | | | | | | | Unpredicated copy of indexed SVE element to SVE vector, along with MOV-aliases. For example: dup z0.h, z1.h[0] duplicates the first 16-bit element from z1 to all elements in the result vector z0. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47570 llvm-svn: 333871
* [AArch64][SVE] Asm: Support for FCPY immediate instructions.Sander de Smalen2018-06-042-2/+43
| | | | | | | | | | | | | Predicated copy of floating-point immediate value to SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: javed.absar Differential Revision: https://reviews.llvm.org/D47518 llvm-svn: 333869
* [AArch64][SVE] Asm: Support for CPY immediate instructionsSander de Smalen2018-06-042-0/+62
| | | | | | | | | | | | | Predicated copy of possibly shifted immediate value into SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47517 llvm-svn: 333868
* [AArch64][GlobalISel] Zero-extend s1 values when returning.Amara Emerson2018-06-011-1/+6
| | | | | | | | | | | Before we were relying on the any extend of the s1 to s32, but for AAPCS we need to zero-extend it to at least s8. Fixes PR36719 Differential Revision: https://reviews.llvm.org/D47425 llvm-svn: 333747
* [AArch64][SVE] Asm: Support for FDUP_ZI (copy fp immediate) instruction.Sander de Smalen2018-06-012-0/+39
| | | | | | | | | | | | | Unpredicated copy of floating-point immediate value into SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47482 llvm-svn: 333744
* [AArch64][SVE] Asm: Support for DUPM (masked immediate) instruction.Sander de Smalen2018-06-017-1/+137
| | | | | | | | | | | | | Unpredicated copy of repeating immediate pattern to SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47328 llvm-svn: 333731
* [MC] Fallback on DWARF when generating compact unwind on AArch64Francis Visoiu Mistrih2018-05-311-3/+11
| | | | | | | | | | | | | | | Instead of asserting when using the def_cfa directive with a register different from fp, fallback on DWARF. Easily triggered with: .cfi_def_cfa x1, 32; rdar://40249694 Differential Revision: https://reviews.llvm.org/D47593 llvm-svn: 333667
* [AArch64] Reverted rL333427 fixing Clang UnitTest FailureLuke Geeson2018-05-312-5/+39
| | | | llvm-svn: 333634
* [GlobalISel][AArch64] LegalizerInfo verifier: Fixing bugs exposed by ↵Roman Tereshin2018-05-311-11/+4
| | | | | | | | | | | | LegalizerInfo::verify(...) Reviewers: aemerson, qcolombet Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D46339 llvm-svn: 333618
* [GlobalISel][AArch64] LegalizerInfo verifier: Adding ↵Roman Tereshin2018-05-301-0/+1
| | | | | | | | | | | | | | | LegalizerInfo::verify(...) call w/o fixing bugs This is to make it clear what kind of bugs the LegalizerInfo::verifier is able to catch and test its output Reviewers: aemerson, qcolombet Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D46338 llvm-svn: 333597
OpenPOWER on IntegriCloud