summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM
Commit message (Collapse)AuthorAgeFilesLines
...
* [IPRA][ARM] Disable no-CSR optimisation for ARMOliver Stannard2019-08-021-0/+5
| | | | | | | | | | | This optimisation isn't generally profitable for ARM, because we can save/restore many registers in the prologue and epilogue using the PUSH and POP instructions, but mostly use individual LDR/STR instructions for other spills. Differential revision: https://reviews.llvm.org/D64910 llvm-svn: 367670
* Fix and test inter-procedural register allocation for ARMOliver Stannard2019-08-023-2/+10
| | | | | | | | | | | | | | - Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves with a null RegScavenger. Simply not updating the register scavenger is fine because IPRA only cares about the SavedRegs vector, the acutal code of the function has already been generated at this point. - Add a new hook to TargetRegisterInfo to get the set of registers which can be clobbered inside a call, even if the compiler can see both sides, by linker-generated code. Differential revision: https://reviews.llvm.org/D64908 llvm-svn: 367669
* [NFC][ARM[ParallelDSP] Rename/remove/change typesSam Parker2019-08-021-13/+8
| | | | | | | Remove forward declaration, fold a couple of typedefs and change one to be more useful. llvm-svn: 367665
* [NFC][ARM][ParallelDSP] Remove ValueListSam Parker2019-08-021-10/+8
| | | | | | We only care about the first element in the list. llvm-svn: 367660
* Finish moving TargetRegisterInfo::isVirtualRegister() and friends to ↵Daniel Sanders2019-08-0112-59/+56
| | | | | | llvm::Register as started by r367614. NFC llvm-svn: 367633
* [ARM] Fix for MVE VREV64David Green2019-08-011-5/+5
| | | | | | | | | | | The VREV64 instruction is apparently unpredictable if Qd == Qm, due to the cross-beat nature of the instruction. This adds an earlyclobber to Qd, which seems to be the same way we deal with this on other instructions like the write-back on loads and stores. Differential Revision: https://reviews.llvm.org/D65502 llvm-svn: 367544
* [NFC][ARM][ParallelDSP] Getters and renamingSam Parker2019-08-011-16/+22
| | | | | | | Add a couple of getters for Reduction and do some renaming of variables around CreateSMLAD for clarity. llvm-svn: 367522
* [ARM] Lower "(x<<c) > 0x80000000U" to "lsls" on Thumb1.Eli Friedman2019-07-315-0/+34
| | | | | | | | | This is extremely specific, but saves three instructions when it's legal. I don't think the code can be usefully generalized. Differential Revision: https://reviews.llvm.org/D65351 llvm-svn: 367492
* [ARM] Transform compare of masked value to shift on Thumb1.Eli Friedman2019-07-311-0/+37
| | | | | | | | | | | | Thumb1 has very limited immediate modes, so turning an "and" into a shift can save multiple instructions. It's possible to simplify the generated code for test2 and test3 in cmp-and-fold.ll a little more, but I'll implement that as a followup. Differential Revision: https://reviews.llvm.org/D65175 llvm-svn: 367491
* [GISel] Address review feedback on passing MD_callees to lowerCall.Mark Lacey2019-07-311-1/+1
| | | | | | | Preserve the nullptr default for KnownCallees that appears in the base class. llvm-svn: 367477
* [GISel] Pass MD_callees metadata down in call lowering.Mark Lacey2019-07-312-2/+4
| | | | | | | | | | | | | | | | | | | | Summary: This will make it possible to improve IPRA by taking into account register usage in indirect calls. NFC yet; this is just laying the groundwork to start building up patches to take advantage of the information for improved register allocation. Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65488 llvm-svn: 367476
* [ARM] Reject CSEL instructions with invalid operandsMikhail Maltsev2019-07-311-1/+1
| | | | | | | | | | | | | | | | | | | | Summary: According to the Armv8.1-M manual CSEL, CSINC, CSINV and CSNEG are "constrained unpredictable" when SP is used as the source register Rn. The assembler should diagnose this case. Reviewers: momchil.velikov, dmgreen, ostannard, simon_tatham, t.p.northover Reviewed By: ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65505 llvm-svn: 367433
* [ARM] Generate MVE VFMAsOliver Cruickshank2019-07-311-0/+21
| | | | llvm-svn: 367408
* [NFC] Test CommitOliver Cruickshank2019-07-311-2/+2
| | | | llvm-svn: 367405
* [NFC][ARMCGP] Use switch in isSupportedValueSam Parker2019-07-311-47/+40
| | | | | | | | Use a switch instead of many isa<> while checking for supported values. Also be explicit about which cast instructions are supported; This allows the removal of SIToFP from GenerateSignBits. llvm-svn: 367402
* [ARM][ParallelDSP] Convert to function passSam Parker2019-07-311-73/+45
| | | | | | | | Run across a whole function, visiting each basic block one at a time. Differential Revision: https://reviews.llvm.org/D65324 llvm-svn: 367389
* [ARM][LowOverheadLoops] Enable by defaultSam Parker2019-07-301-1/+1
| | | | | | | | | The code is now in a good enough state to pass the bunch of tests that I have run (after fixing the bugs), so let's enable it by default. Differential Revision: https://reviews.llvm.org/D65277 llvm-svn: 367297
* [ARM][LowOverheadLoops] Revert non-header LE targetSam Parker2019-07-301-3/+9
| | | | | | | | | Revert the hardware loop upon finding a LoopEnd that doesn't target the loop header, instead of asserting a failure. Differential Revision: https://reviews.llvm.org/D65268 llvm-svn: 367296
* [NFC][ARM[ParallelDSP] Cleanup of BinOpChainSam Parker2019-07-291-81/+58
| | | | | | | | | | | | - Remove some unused typedefs. - Rename BinOpChain struct to MulCandidate. - Remove the size method of MulCandidate. - Store only the first input of the ValueList provided to MulCandidate, as it's the only value we care about. This means we don't have to perform any ugly (and unnecessary) iterations of the list later on. llvm-svn: 367208
* [NFC][ARM][ParallelDSP] Remove AreSymmetricalSam Parker2019-07-291-43/+0
| | | | | | | We explicitly search for a parallel mac and we only care about its inputs, checking for symmetry doesn't add anything here. llvm-svn: 367205
* [NFC][ARM][ParallelDSP] Remove PopulateLoadsSam Parker2019-07-291-9/+0
| | | | | | | | We no longer have to check what loads are used, all this is performed at the start of the transform, so it's not doing anything now. llvm-svn: 367204
* [ARM] MVE VPNOTDavid Green2019-07-282-3/+22
| | | | | | | | | | This adds the patterns required to transform xor P0, -1 to a VPNOT. The instruction operands have to change a little for this, adding an in and an out VCCR reg and using a custom DecodeMVEVPNOT for the decode. Differential Revision: https://reviews.llvm.org/D65133 llvm-svn: 367192
* [ARM] Better patterns for fp <> predicate vectorsDavid Green2019-07-282-4/+26
| | | | | | | | | | These are some better patterns for converting between predicates and floating points. Much like the extends, we select "1"/"-1" or "0" depending on the predicate value. Or we perform a compare against 0 to convert to a predicate. Differential Revision: https://reviews.llvm.org/D65103 llvm-svn: 367191
* [ARM][ParallelDSP] Combine structsSam Parker2019-07-261-19/+15
| | | | | | | Combine OpChain and BinOpChain structs as OpChain is a base class to BinOpChain that is never used. llvm-svn: 367114
* [NFC][ARM][ParallelDSP] Cleanup isNarrowSequenceSam Parker2019-07-261-26/+5
| | | | | | Remove unused logic. llvm-svn: 367099
* [ARM][LowOverheadLoops] Add CPSR defsSam Parker2019-07-261-2/+4
| | | | | | | | | | Both WhileLoopStart and LoopEnd may get turned into a cmp and br pair, so add an implicit def to these pseudo instructions in case that WLS and LE aren't generated. Differential Revision: https://reviews.llvm.org/D65275 llvm-svn: 367089
* [ARM][AArch64] Support for Cortex-A65 & A65AE, Neoverse E1 & N1Pablo Barrio2019-07-253-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add support for Cortex-A65, Cortex-A65AE, Neoverse E1 and Neoverse N1. Neoverse E1 and Cortex-A65(&AE) only implement the AArch64 state of the Arm architecture. Neoverse N1 implements both AArch32 and AArch64. Cortex-A65: https://developer.arm.com/ip-products/processors/cortex-a/cortex-a65 Cortex-A65AE: https://developer.arm.com/ip-products/processors/cortex-a/cortex-a65ae Neoverse E1: https://developer.arm.com/ip-products/processors/neoverse/neoverse-e1 Neoverse N1: https://developer.arm.com/ip-products/processors/neoverse/neoverse-n1 Patch by Diogo Sampaio and Pablo Barrio Reviewers: samparker, LukeCheeseman, sbaranga, ostannard Reviewed By: ostannard Subscribers: ostannard, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64406 llvm-svn: 367007
* [ARM] Remove dead code from ARMConstantIslands.Eli Friedman2019-07-241-5/+0
| | | | | | | tLDRHi is not a pc-relative load; it can't directly refer to a constant pool or jump table. llvm-svn: 366963
* [ARM] Rewrite how VCMP are lowered, using a single nodeDavid Green2019-07-245-248/+283
| | | | | | | | | | | | This removes the VCEQ/VCNE/VCGE/VCEQZ/etc nodes, just using two called VCMP and VCMPZ with an extra operand as the condition code. I believe this will make some combines simpler, allowing us to just look at these codes and not the operands. It also helps fill in a missing VCGTUZ MVE selection without adding extra nodes for it. Differential Revision: https://reviews.llvm.org/D65072 llvm-svn: 366934
* [ARM] Disable MVE fptosi and friendsDavid Green2019-07-241-0/+4
| | | | | | | | | | The prevents us from trying to convert an i1 predicate vector to a float, or vice-versa. Better patterns are possible, which will follow in a subsequent commit. For now we just expand them. Differential Revision: https://reviews.llvm.org/D65066 llvm-svn: 366931
* [ARM] More MVE compare vector splat combines for ANDsDavid Green2019-07-241-0/+12
| | | | | | | | Adds some extra r register compare combines, this time for ANDs. Differential Revision: https://reviews.llvm.org/D65062 llvm-svn: 366928
* [ARM] MVE compare vector splat combineDavid Green2019-07-241-0/+12
| | | | | | | | | MVE VCMP instructions can use a general purpose register as the second operand. This adds the combines for it, selecting from a compare of a vdup. Differential Revision: https://reviews.llvm.org/D65061 llvm-svn: 366924
* [ARM] Better OR's for MVE comparesDavid Green2019-07-244-8/+73
| | | | | | | | | | | This adds a DeMorgan combine for OR's of compares to turn them into AND's, helping prevent them from going into and out of gpr registers. It also fills in the VCLE and VCLT nodes that MVE can select, allowing it to invert more compares. Differential Revision: https://reviews.llvm.org/D65059 llvm-svn: 366920
* [ARM] Better AND's for MVE comparesDavid Green2019-07-241-0/+24
| | | | | | | | | | | | Add a number of folds to convert and(vcmp, vcmp) into a single VPT block, where the second vcmp becomes predicated on the first. The VCMP; VPST; VCMP will eventually be converted to VPT; VCMP in the VPTBlockPass. Differential Revision: https://reviews.llvm.org/D65058 llvm-svn: 366910
* [ARM] MVE floating point compares and selectsDavid Green2019-07-242-1/+53
| | | | | | | | | | | | | Much like integers, this adds MVE floating point compares and select. It requires a lot more buildvector/shuffle code because we may need to expand the compares without mve.fp, and requires support for and/or because of the way we lower llvm condition codes. Some original code by David Sherwood Differential Revision: https://reviews.llvm.org/D65054 llvm-svn: 366909
* [ARM] Basic And/Or/Xor handling for MVE predicatesDavid Green2019-07-241-0/+26
| | | | | | | | | | | This adds some basic, "worst case" handling for MVE predicate Or/And/Xor. It does this by going into and out of GPRs, doing the operation on scalars. Code by David Sherwood. Differential Revision: https://reviews.llvm.org/D65053 llvm-svn: 366907
* [ARM] Make sure that the constant pool does not keep in the middle of an IT ↵Simi Pallipurath2019-07-241-3/+26
| | | | | | | | | | | | | | | | | | block. This change make sure that llvm does not emit an invalid IT block by putting the constant pool in the middle of an IT block. We have code to try to avoid putting a constant island in the middle of an IT block, but it only works if we see an IT between the one currently referencing CPE and possible insertion point. If the first instruction we look at is the VLDRD after the IT , we never see the IT and does not realize that the instruction doing the load could be in an IT block itself. Differential Revision: https://reviews.llvm.org/D64621 Change-Id: I24cecb37cded75e8992870bd997f6226853bd920 llvm-svn: 366905
* Test commit. NFC.Sjoerd Meijer2019-07-241-1/+1
| | | | | | | Removed 2 trailing whitespaces in 2 files that used to be in different repos to test my new github monorepo workflow. llvm-svn: 366904
* [ARM] MVE predicate register supportDavid Green2019-07-243-13/+360
| | | | | | | | | | | | | | | | This adds support code for building and shuffling i1 predicate registers. It generally uses two basic principles, either converting the predicate into an scalar (through a PREDICATE_CAST) and doing scalar operations on it there, or by converting the register to an full vector register and back. Some of the code here is a not super efficient but will hopefully cover most cases of moving i1 vectors around and can be improved in subsequent patches. Some code by David Sherwood. Differential Revision: https://reviews.llvm.org/D65052 llvm-svn: 366890
* [ARM] MVE integer compares and selectsDavid Green2019-07-245-40/+132
| | | | | | | | | | | | | | | | | | | This adds the very basics for MVE vector predication, adding integer VCMP and VSEL instruction support. This is done through predicate registers (MVT::v16i1, MVT::v8i1, MVT::v4i1), but otherwise using same mechanics as NEON to custom lower setcc's through ARMISD::VCXX nodes (VCEQ, VCGT, VCEQZ, etc). An extra VCNE was added, as this can be handled sensibly by MVE's expanded number of VCMP condition codes. (There are also VCLE and VCLT which are added later). VPSEL is also added here, simply selecting on the vselect. Original code by David Sherwood. Differential Revision: https://reviews.llvm.org/D65051 llvm-svn: 366885
* [ARM][ParallelDSP] Fix pointer operand reorderingSam Parker2019-07-241-2/+2
| | | | | | | | | | | While combining two loads into a single load, we often need to reorder the pointer operands for the new load. This reordering was broken in the cases where there was a chain of values that built up the pointer. Differential Revision: https://reviews.llvm.org/D65193 llvm-svn: 366881
* [ARM] Add opt-bisect support to ARMParallelDSP.Eli Friedman2019-07-231-0/+3
| | | | llvm-svn: 366851
* [ARM][LowOverheadLoops] Fix branch target codegenSam Parker2019-07-234-37/+183
| | | | | | | | | | | | | | | | While lowering test.set.loop.iterations, it wasn't checked how the brcond was using the result and so the wls could branch to the loop preheader instead of not entering it. The same was true for loop.decrement.reg. So brcond and br_cc and now lowered manually when using the hwloop intrinsics. During this we now check whether the result has been negated and whether we're using SETEQ or SETNE and 0 or 1. We can then figure out which basic block the WLS and LE should be targeting. Differential Revision: https://reviews.llvm.org/D64616 llvm-svn: 366809
* [ARM] Rename NEONModImm to VMOVModImm. NFCDavid Green2019-07-238-46/+46
| | | | | | Rename NEONModImm to VMOVModImm as it is used in both NEON and MVE. llvm-svn: 366790
* [ARM][LowOverheadLoops] Revert remaining pseudosSam Parker2019-07-221-12/+56
| | | | | | | | | | | ARMLowOverheadLoops would assert a failure if it did not find all the pseudo instructions that comprise the hardware loop. Instead of doing this, iterate through all the instructions of the function and revert any remaining pseudo instructions that haven't been converted. Differential Revision: https://reviews.llvm.org/D65080 llvm-svn: 366691
* [ARM] Fix for MVE VPT block passDavid Green2019-07-221-3/+18
| | | | | | | | | We need to ensure that the number of T's is correct when adding multiple instructions into the same VPT block. Differential revision: https://reviews.llvm.org/D65049 llvm-svn: 366684
* [IPRA][ARM] Make use of the "returned" parameter attributeOliver Stannard2019-07-223-0/+19
| | | | | | | | | | | | ARM has code to recognise uses of the "returned" function parameter attribute which guarantee that the value passed to the function in r0 will be returned in r0 unmodified. IPRA replaces the regmask on call instructions, so needs to be told about this to avoid reverting the optimisation. Differential revision: https://reviews.llvm.org/D64986 llvm-svn: 366669
* [ARM] Add <saturate> operand to SQRSHRL and UQRSHLLMikhail Maltsev2019-07-196-11/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: According to the new Armv8-M specification https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf the instructions SQRSHRL and UQRSHLL now have an additional immediate operand <saturate>. The new assembly syntax is: SQRSHRL<c> RdaLo, RdaHi, #<saturate>, Rm UQRSHLL<c> RdaLo, RdaHi, #<saturate>, Rm where <saturate> can be either 64 (the existing behavior) or 48, in that case the result is saturated to 48 bits. The new operand is encoded as follows: #64 Encoded as sat = 0 #48 Encoded as sat = 1 sat is bit 7 of the instruction bit pattern. This patch adds a new assembler operand class MveSaturateOperand which implements parsing and encoding. Decoding is implemented in DecodeMVEOverlappingLongShift. Reviewers: ostannard, simon_tatham, t.p.northover, samparker, dmgreen, SjoerdMeijer Reviewed By: simon_tatham Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64810 llvm-svn: 366555
* [ARM][DAGCOMBINE][FIX] PerformVMOVRRDCombineDiogo N. Sampaio2019-07-181-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: PerformVMOVRRDCombine ommits adding a offset of 4 to the PointerInfo, when converting a f64 = load[M] to {i32, i32} = {load[M], load[M + 4]} Which would allow the machine scheduller to break dependencies with the second load. - pr42638 Reviewers: eli.friedman, dmgreen, ostannard Reviewed By: ostannard Subscribers: ostannard, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64870 llvm-svn: 366423
* [ARM GlobalISel] Cleanup CallLowering. NFCDiana Picus2019-07-172-71/+20
| | | | | | | | | | Migrate CallLowering::lowerReturnVal to use the same infrastructure as lowerCall/FormalArguments and remove the now obsolete code path from splitToValueTypes. Forgot to push this earlier. llvm-svn: 366308
OpenPOWER on IntegriCloud