summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM
Commit message (Collapse)AuthorAgeFilesLines
* [ARM] Fix MVE_VQxDMLxDH instruction classMikhail Maltsev2019-07-011-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: According to the ARMARM, the VQDMLADH, VQRDMLADH, VQDMLSDH and VQRDMLSDH instructions handle their results as follows: "The base variant writes the results into the lower element of each pair of elements in the destination register, whereas the exchange variant writes to the upper element in each pair". I.e., the initial content of the output register affects the result, as usual, we model this with an additional input. Also, for 32-bit variants Qd is not allowed to be the same register as Qm and Qn, we use @earlyclobber to indicate this. This patch also changes vpred_r to vpred_n because the instructions don't have an explicit 'inactive' operand. Reviewers: dmgreen, ostannard, simon_tatham Reviewed By: simon_tatham Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64007 llvm-svn: 364796
* [ARM] MVE: support QQPRRegClass and QQQQPRRegClassMikhail Maltsev2019-07-011-2/+3
| | | | | | | | | | | | | | | | | | | Summary: QQPRRegClass and QQQQPRRegClass are used by the interleaving/deinterleaving loads/stores to represent sequences of consecutive SIMD registers. Reviewers: ostannard, simon_tatham, dmgreen Reviewed By: simon_tatham Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64009 llvm-svn: 364794
* [ARM] WLS/LE Code GenerationSam Parker2019-07-017-28/+162
| | | | | | | | | | | | | | | | | Backend changes to enable WLS/LE low-overhead loops for armv8.1-m: 1) Use TTI to communicate to the HardwareLoop pass that we should try to generate intrinsics that guard the loop entry, as well as setting the loop trip count. 2) Lower the BRCOND that uses said intrinsic to an Arm specific node: ARMWLS. 3) ISelDAGToDAG the node to a new pseudo instruction: t2WhileLoopStart. 4) Add support in ArmLowOverheadLoops to handle the new pseudo instruction. Differential Revision: https://reviews.llvm.org/D63816 llvm-svn: 364733
* [ARM] Add support for the MVE long shift instructionsSam Tebbs2019-06-284-7/+85
| | | | | | | | | | | | MVE adds the lsll, lsrl and asrl instructions, which perform a shift on a 64 bit value separated into two 32 bit registers. The Expand64BitShift function is modified to accept ISD::SHL, ISD::SRL and ISD::SRA and convert it into the appropriate opcode in ARMISD. An SHL is converted into an lsll, an SRL is converted into an lsrl for the immediate form and a negation and lsll for the register form, and SRA is converted into an asrl. test/CodeGen/ARM/shift_parts.ll is added to test the logic of emitting these instructions. Differential Revision: https://reviews.llvm.org/D63430 llvm-svn: 364654
* [ARM] Add MVE mul patternsDavid Green2019-06-281-0/+16
| | | | | | | | | This simply adds integer and floating point VMUL patterns for MVE, same as we have add and sub. Differential Revision: https://reviews.llvm.org/D63866 llvm-svn: 364643
* [ARM] Mark math routines as non-legal for MVEDavid Green2019-06-281-0/+9
| | | | | | | | | This adds handling and tests for a number of floating point math routines, which have no MVE instructions. Differential Revision: https://reviews.llvm.org/D63725 llvm-svn: 364641
* [ARM] MVE patterns for VABS and VNEGDavid Green2019-06-281-0/+14
| | | | | | | | This simply adds the required patterns for fp neg and abs. Differential Revision: https://reviews.llvm.org/D63861 llvm-svn: 364640
* [ARM] Widening loads and narrowing storesDavid Green2019-06-283-4/+58
| | | | | | | | | | | | MVE has instructions to widen as it loads, and narrow as it stores. This adds the required patterns and legalisation to make them work including specifying that they are legal, patterns to select them and test changes. Patch by David Sherwood. Differential Revision: https://reviews.llvm.org/D63839 llvm-svn: 364636
* [ARM] Fix integer UB in MVE load/store immediate handling.Simon Tatham2019-06-282-6/+9
| | | | llvm-svn: 364635
* [ARM] MVE loads and storesDavid Green2019-06-282-11/+52
| | | | | | | | | | | | | | | This fills in the gaps for basic MVE loads and stores, allowing unaligned access and adding far too many tests. These will become important as narrowing/expanding and pre/post inc are added. Big endian might still not be handled very well, because we have not yet added bitcasts (and I'm not sure how we want it to work yet). I've included the alignment code anyway which maps with our current patterns. We plan to return to that later. Code written by Simon Tatham, with additional tests from Me and Mikhail Maltsev. Differential Revision: https://reviews.llvm.org/D63838 llvm-svn: 364633
* [ARM] Mark div and rem as expand for MVEDavid Green2019-06-281-0/+12
| | | | | | | | | We don't have vector operations for these, so they need to be expanded for both integer and float. Differential Revision: https://reviews.llvm.org/D63595 llvm-svn: 364631
* [ARM] Select MVE fp add and subDavid Green2019-06-281-0/+14
| | | | | | | | | | | The same as integer arithmetic, we can add simple floating point MVE addition and subtraction patterns. Initial code by David Sherwood Differential Revision: https://reviews.llvm.org/D63257 llvm-svn: 364629
* [ARM] Select MVE add and subDavid Green2019-06-281-0/+18
| | | | | | | | | | | This adds the first few patterns for MVE code generation, adding simple integer add and sub patterns. Initial code by David Sherwood Differential Revision: https://reviews.llvm.org/D63255 llvm-svn: 364627
* [ARM] MVE vector shufflesDavid Green2019-06-285-183/+355
| | | | | | | | | | | | | | | | | | This patch adds necessary shuffle vector and buildvector support for ARM MVE. It essentially adds support for VDUP, VREVs and some VMOVs, which are often required by other code (like upcoming patches). This mostly uses the same code from Neon that already generated NEONvdup/NEONvduplane/NEONvrev's. These have been renamed to ARMvdup/etc and moved to ARMInstrInfo as they are common to both architectures. Most of the selection code seems to be applicable to both, but NEON does have some more instructions making some parts specific. Most code originally by David Sherwood. Differential Revision: https://reviews.llvm.org/D63567 llvm-svn: 364626
* [ARM] Fix formatting issue in ARMISelLowering.cppSam Tebbs2019-06-271-1/+2
| | | | | | | Fix a formatting error in ARMISelLowering.cpp::Expand64BitShift. My test commit after receiving write access. llvm-svn: 364560
* [ARM] Fix bogus assertions in copyPhysReg v8.1-M cases.Simon Tatham2019-06-271-4/+4
| | | | | | | | | | | | | | | | The code to generate register move instructions in and out of VPR and FPSCR_NZCV had assertions checking that the other register involved was a GPR _pair_, instead of a single GPR as it should have been. Reviewers: miyuki, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63865 llvm-svn: 364534
* [ARM] Fix handling of zero offsets in LOB instructions.Simon Tatham2019-06-272-16/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BF and WLS/WLSTP instructions have various branch-offset fields occupying different positions and lengths in the instruction encoding, and all of them were decoded at disassembly time by the function DecodeBFLabelOffset() which returned SoftFail if the offset was zero. In fact, it's perfectly fine and not even a SoftFail for most of those offset fields to be zero. The only one that can't be zero is the 4-bit field labelled `boff` in the architecture spec, occupying bits {26-23} of the BF instruction family. If that one is zero, the encoding overlaps other instructions (WLS, DLS, LETP, VCTP), so it ought to be a full Fail. Fixed by adding an extra template parameter to DecodeBFLabelOffset which controls whether a zero offset is accepted or rejected. Adjusted existing tests (only in error messages for bad disassemblies); added extra tests to demonstrate zero offsets being accepted in all the right places, and a few demonstrating rejection of zero `boff`. Reviewers: DavidSpickett, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63864 llvm-svn: 364533
* [ARM] Make coprocessor number restrictions consistent.Simon Tatham2019-06-273-10/+25
| | | | | | | | | | | | | | | | | | | | | | Different versions of the Arm architecture disallow the use of generic coprocessor instructions like MCR and CDP on different sets of coprocessors. This commit centralises the check of the coprocessor number so that it's consistent between assembly and disassembly, and also updates it for the new restrictions in Arm v8.1-M. New tests added that check all the coprocessor numbers; old tests updated, where they used a number that's now become illegal in the context in question. Reviewers: DavidSpickett, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63863 llvm-svn: 364532
* [ARM] Tighten restrictions on use of SP in v8.1-M CSEL.Simon Tatham2019-06-273-8/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | In the `CSEL Rd,Rm,Rn` instruction family (also including CSINC, CSINV and CSNEG), the architecture lists it as CONSTRAINED UNPREDICTABLE (i.e. SoftFail) to use SP in the Rd or Rm slot, but outright illegal to use it in the Rn slot, not least because some encodings of that form are used by MVE instructions such as UQRSHLL. MC was treating all three slots the same, as SoftFail. So the only reason UQRSHLL was disassembled correctly at all was because the MVE decode table is separate from the Thumb2 one and takes priority; if you turned off MVE, then encodings such as `[0x5f,0xea,0x0d,0x83]` would disassemble as spurious CSELs. Fixed by inventing another version of the `GPRwithZR` register class, which disallows SP completely instead of just SoftFailing it. Reviewers: DavidSpickett, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63862 llvm-svn: 364531
* [GlobalISel] Accept multiple vregs for lowerCall's argsDiana Picus2019-06-271-8/+3
| | | | | | | | | | | | | | | | | | | | | | | | Change the interface of CallLowering::lowerCall to accept several virtual registers for each argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63551 llvm-svn: 364512
* [GlobalISel] Accept multiple vregs for lowerCall's resultDiana Picus2019-06-272-23/+9
| | | | | | | | | | | | | | | | | | | | | | | | Change the interface of CallLowering::lowerCall to accept several virtual registers for the call result, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63550 llvm-svn: 364511
* [GlobalISel] Accept multiple vregs in lowerFormalArgsDiana Picus2019-06-272-17/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches. With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0. AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC. Mips doesn't support aggregates yet, so it's also NFC. x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument. Differential Revision: https://reviews.llvm.org/D63549 llvm-svn: 364510
* [GlobalISel] Allow multiple VRegs in ArgInfo. NFCDiana Picus2019-06-271-7/+15
| | | | | | | | | | | Allow CallLowering::ArgInfo to contain more than one virtual register. This is useful when passes split aggregates into several virtual registers, but need to also provide information about the original type to the call lowering. Used in follow-up patches. Differential Revision: https://reviews.llvm.org/D63548 llvm-svn: 364509
* [ARM] Don't reserve R12 on Thumb1 as an emergency spill slot.Eli Friedman2019-06-267-120/+217
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation of ThumbRegisterInfo::saveScavengerRegister is bad for two reasons: one, it's buggy, and two, it blocks using R12 for other optimizations. So this patch gets rid of it, and adds the necessary support for using an ordinary emergency spill slot on Thumb1. (Specifically, I think saveScavengerRegister was broken by r305625, and nobody noticed for two years because the codepath is almost never used. The new code will also probably not be used much, but it now has better tests, and if we fail to emit a necessary emergency spill slot we get a reasonable error message instead of a miscompile.) A rough outline of the changes in the patch: 1. Gets rid of ThumbRegisterInfo::saveScavengerRegister. 2. Modifies ARMFrameLowering::determineCalleeSaves to allocate an emergency spill slot for Thumb1. 3. Implements useFPForScavengingIndex, so the emergency spill slot isn't placed at a negative offset from FP on Thumb1. 4. Modifies the heuristics for allocating an emergency spill slot to support Thumb1. This includes fixing ExtraCSSpill so we don't try to use "lr" as a substitute for allocating an emergency spill slot. 5. Allocates a base pointer in more cases, so the emergency spill slot is always accessible. 6. Modifies ARMFrameLowering::ResolveFrameIndexReference to compute the right offset in the new cases where we're forcing a base pointer. 7. Ensures we never generate a load or store with an offset outside of its frame object. This makes the heuristics more straightforward. 8. Changes Thumb1 prologue and epilogue emission so it never uses register scavenging. Some of the changes to the emergency spill slot heuristics in determineCalleeSaves affect ARM/Thumb2; hopefully, they should allow the compiler to avoid allocating an emergency spill slot in cases where it isn't necessary. The rest of the changes should only affect Thumb1. Differential Revision: https://reviews.llvm.org/D63677 llvm-svn: 364490
* [ARM] Handle fixup_arm_pcrel_9 correctly on big-endian targetsMikhail Maltsev2019-06-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: The getFixupKindContainerSizeBytes function returns the size of the instruction containing a given fixup. Currently fixup_arm_pcrel_9 is not handled in this function, this causes an assertion failure in the debug build and incorrect codegen in the release build. This patch fixes the problem. Reviewers: ostannard, simon_tatham Reviewed By: ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63778 llvm-svn: 364404
* [ARM] Fix -Wimplicit-fallthrough after D60709/r364331Fangrui Song2019-06-261-4/+3
| | | | llvm-svn: 364376
* [ARM] Support inline assembler constraints for MVE.Simon Tatham2019-06-251-1/+22
| | | | | | | | | | | | | | | | | | | | | "To" selects an odd-numbered GPR, and "Te" an even one. There are some 8.1-M instructions that have one too few bits in their register fields and require registers of particular parity, without necessarily using a consecutive even/odd pair. Also, the constraint letter "t" should select an MVE q-register, when MVE is present. This didn't need any source changes, but some extra tests have been added. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60709 llvm-svn: 364331
* [ARM] Code-generation infrastructure for MVE.Simon Tatham2019-06-257-17/+306
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This provides the low-level support to start using MVE vector types in LLVM IR, loading and storing them, passing them to __asm__ statements containing hand-written MVE vector instructions, and *if* you have the hard-float ABI turned on, using them as function parameters. (In the soft-float ABI, vector types are passed in integer registers, and combining all those 32-bit integers into a q-reg requires support for selection DAG nodes like insert_vector_elt and build_vector which aren't implemented yet for MVE. In fact I've also had to add `arm_aapcs_vfpcc` to a couple of existing tests to avoid that problem.) Specifically, this commit adds support for: * spills, reloads and register moves for MVE vector registers * ditto for the VPT predication mask that lives in VPR.P0 * make all the MVE vector types legal in ISel, and provide selection DAG patterns for BITCAST, LOAD and STORE * make loads and stores of scalar FP types conditional on `hasFPRegs()` rather than `hasVFP2Base()`. As a result a few existing tests needed their llc command lines updating to use `-mattr=-fpregs` as their method of turning off all hardware FP support. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60708 llvm-svn: 364329
* [ARM] Fix for DLS/LE CodeGenSam Parker2019-06-251-8/+9
| | | | | | | | | The expensive buildbots highlighted the mir tests were broken, which I've now updated and added --verify-machineinstrs to them. This also uncovered a couple of bugs in the backend pass, so these have also been fixed. llvm-svn: 364323
* [ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D60692Fangrui Song2019-06-251-0/+1
| | | | llvm-svn: 364312
* [ARM] Fix buildbot failure due to -Werror.Simon Tatham2019-06-251-1/+0
| | | | | | | | Including both 'case ARM_AM::uxtw' and 'default' in the getShiftOp switch caused a buildbot to fail with error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default] llvm-svn: 364300
* [ARM] MVE VPT BlocksSjoerd Meijer2019-06-251-4/+12
| | | | | | | | | | | | A minor iteration on the MVE VPT Block pass to enable more efficient VPT Block code generation: consecutive VPT predicated statements, predicated on the same condition, will be placed within the same VPT Block. This essentially is also an exercise to write some more tests for the next step, which should be more generic also merging instructions when they are not consecutive. Differential Revision: https://reviews.llvm.org/D63711 llvm-svn: 364298
* [ARM] Explicit lowering of half <-> double conversions.Simon Tatham2019-06-252-15/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an FP_EXTEND or FP_ROUND isel dag node converts directly between f16 and f32 when the target CPU has no instruction to do it in one go, it has to be done in two steps instead, going via f32. Previously, this was done implicitly, because all such CPUs had the storage-only implementation of f16 (i.e. the only thing you can do with one at all is to convert it to/from f32). So isel would legalize the f16 into an f32 as soon as it saw it, by inserting an fp16_to_fp node (or vice versa), and then the fp_extend would already be f32->f64 rather than f16->f64. But that technique can't support a target CPU which has full f16 support but _not_ f64, such as some variants of Arm v8.1-M. So now we provide custom lowering for FP_EXTEND and FP_ROUND, which checks support for f16 and f64 and decides on the best thing to do given the combination of flags it gets back. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60692 llvm-svn: 364294
* [ARM] Add remaining miscellaneous MVE instructions.Simon Tatham2019-06-254-22/+164
| | | | | | | | | | | | | | | | | | This final batch includes the tail-predicated versions of the low-overhead loop instructions (LETP); the VPSEL instruction to select between two vector registers based on the predicate mask without having to open a VPT block; and VPNOT which complements the predicate mask in place. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62681 llvm-svn: 364292
* [ARM] Add MVE vector load/store instructions.Simon Tatham2019-06-2513-60/+1049
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the rest of the vector memory access instructions. It includes contiguous loads/stores, with an ordinary addressing mode such as [r0,#offset] (plus writeback variants); gather loads and scatter stores with a scalar base address register and a vector of offsets from it (written [r0,q1] or similar); and gather/scatters with a vector of base addresses (written [q0,#offset], again with writeback). Additionally, some of the loads can widen each loaded value into a larger vector lane, and the corresponding stores narrow them again. To implement these, we also have to add the addressing modes they need. Also, in AsmParser, the `isMem` query function now has subqueries `isGPRMem` and `isMVEMem`, according to which kind of base register is used by a given memory access operand. I've also had to add an extra check in `checkTargetMatchPredicate` in the AsmParser, without which our last-minute check of `rGPR` register operands against SP and PC was failing an assertion because Tablegen had inserted an immediate 0 in place of one of a pair of tied register operands. (This matches the way the corresponding check for `MCK_rGPR` in `validateTargetOperandClass` is guarded.) Apparently the MVE load instructions were the first to have ever triggered this assertion, but I think only because they were the first to have a combination of the usual Arm pre/post writeback system and the `rGPR` class in particular. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62680 llvm-svn: 364291
* [ARM] DLS/LE low-overhead loop code generationSam Parker2019-06-256-1/+349
| | | | | | | | | | | | | | | | | Introduce three pseudo instructions to be used during DAG ISel to represent v8.1-m low-overhead loops. One maps to set_loop_iterations while loop_decrement_reg is lowered to two, so that we can separate the decrement and branching operations. The pseudo instructions are expanded pre-emission, where we can still decide whether we actually want to generate a low-overhead loop, in a new pass: ARMLowOverheadLoops. The pass currently bails, reverting to an sub, icmp and br, in the cases where a call or stack spill/restore happens between the decrement and branching instructions, or if the loop is too large. Differential Revision: https://reviews.llvm.org/D63476 llvm-svn: 364288
* GlobalISel: Remove unsigned variant of SrcOpMatt Arsenault2019-06-242-14/+14
| | | | | | | | | Force using Register. One downside is the generated register enums require explicit conversion. llvm-svn: 364194
* CodeGen: Introduce a class for registersMatt Arsenault2019-06-244-18/+18
| | | | | | | | | Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191
* [ARM] Add MVE interleaving load/store family.Simon Tatham2019-06-247-33/+272
| | | | | | | | | | | | | | | | | | This adds the family of loads and stores with names like VLD20.8 and VST42.32, which load and store parts of multiple q-registers in such a way that executing both VLD20 and VLD21, or all four of VLD40..VLD43, will distribute 2 or 4 vectors' worth of memory data across the lanes of the same number of registers but in a transposed order. In addition to the Tablegen descriptions of the instructions themselves, this patch also adds encode and decode support for the QQPR and QQQQPR register classes (representing the range of loaded or stored vector registers), and tweaks to the parsing system for lists of vector registers to make it return the right format in this case (since, unlike NEON, MVE regards q-registers as primitive, and not just an alias for two d-registers). llvm-svn: 364172
* Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.Simon Pilgrim2019-06-211-1/+1
| | | | llvm-svn: 364068
* [ARM] Add MVE 64-bit GPR <-> vector move instructions.Simon Tatham2019-06-215-0/+216
| | | | | | | | | | | | | | | | | | | | | | | | | | | | These instructions let you load half a vector register at once from two general-purpose registers, or vice versa. The assembly syntax for these instructions mentions the vector register name twice. For the move _into_ a vector register, the MC operand list also has to mention the register name twice (once as the output, and once as an input to represent where the unchanged half of the output register comes from). So we can conveniently assign one of the two asm operands to be the output $Qd, and the other $QdSrc, which avoids confusing the auto-generated AsmMatcher too much. For the move _from_ a vector register, there's no way to get round the fact that both instances of that register name have to be inputs, so we need a custom AsmMatchConverter to avoid generating two separate output MC operands. (And even that wouldn't have worked if it hadn't been for D60695.) Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62679 llvm-svn: 364041
* [ARM] Add MVE vector instructions that take a scalar input.Simon Tatham2019-06-216-2/+440
| | | | | | | | | | | | | | | | | | | This adds the `MVE_qDest_rSrc` superclass and all its instances, plus a few other instructions that also take a scalar input register or two. I've also belatedly added custom diagnostic messages to the operand classes for odd- and even-numbered GPRs, which required matching changes in two of the existing MVE assembly test files. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62678 llvm-svn: 364040
* [ARM] Add a batch of similarly encoded MVE instructions.Simon Tatham2019-06-213-1/+345
| | | | | | | | | | | | | | | | | | | | | | | Summary: This adds the `MVE_qDest_qSrc` superclass and all instructions that inherit from it. It's not the complete class of _everything_ with a q-register as both destination and source; it's a subset of them that all have similar encodings (but it would have been hopelessly unwieldy to call it anything like MVE_111x11100). This category includes add/sub with carry; long multiplies; halving multiplies; multiply and accumulate, and some more complex instructions. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62677 llvm-svn: 364037
* [ARM] Fix -Wimplicit-fallthrough after D62675Fangrui Song2019-06-211-0/+2
| | | | llvm-svn: 364028
* [ARM] Add MVE vector compare instructions.Simon Tatham2019-06-213-6/+201
| | | | | | | | | | | | | | | | | | Summary: These take a pair of vector register to compare, and a comparison type (written in the form of an Arm condition suffix); they output a vector of booleans in the VPR register, where predication can conveniently use them. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62676 llvm-svn: 364027
* [ARM] Add a batch of MVE floating-point instructions.Simon Tatham2019-06-213-4/+456
| | | | | | | | | | | | | | | | | Summary: This includes floating-point basic arithmetic (add/sub/multiply), complex add/multiply, unary negation and absolute value, rounding to integer value, and conversion to/from integer formats. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62675 llvm-svn: 364013
* Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFCFangrui Song2019-06-212-8/+3
| | | | llvm-svn: 364006
* [ARM GlobalISel] Add support for s64 G_ADD and G_SUB.Eli Friedman2019-06-202-2/+19
| | | | | | | | | | | | | Teach RegisterBankInfo to use the correct register class, and tell the legalizer it's legal. Everything else just works. The one thing that's slightly weird about this compared to SelectionDAG isel is that legalization can't distinguish between i64 and <1 x i64>, so we might end up with more NEON instructions than the user expects. Differential Revision: https://reviews.llvm.org/D63585 llvm-svn: 363989
* [ARM] Add a batch of MVE integer instructions.Simon Tatham2019-06-203-1/+406
| | | | | | | | | | | | | | | | This includes integer arithmetic of various kinds (add/sub/multiply, saturating and not), and the immediate forms of VMOV and VMVN that load an immediate into all lanes of a vector. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62674 llvm-svn: 363936
* [llvm-objdump] Switch between ARM/Thumb based on mapping symbols.Eli Friedman2019-06-201-29/+28
| | | | | | | | | | | | | | | The ARMDisassembler changes allow changing between ARM and Thumb mode based on the MCSubtargetInfo, rather than the Target, which simplifies the other changes a bit. I'm not really happy with adding more target-specific logic to tools/llvm-objdump/, but there isn't any easy way around it: the logic in question specifically applies to disassembling an object file, and that code simply isn't located in lib/Target, at least at the moment. Differential Revision: https://reviews.llvm.org/D60927 llvm-svn: 363903
OpenPOWER on IntegriCloud