summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [ARM][LowOverheadLoops] Allow all MVE instrs.Sam Parker2020-01-141-21/+18
| | | | | | | | | | | | | | We have a whitelist of instructions that we allow when tail predicating, since these are trivial ones that we've deemed need no special handling. Now change ARMLowOverheadLoops to allow the non-trivial instructions if they're contained within a valid VPT block. Since a valid block is one that is predicated upon the VCTP so we know that these non-trivial instructions will still behave as expected once the implicit predication is used instead. This also fixes a previous test failure. Differential Revision: https://reviews.llvm.org/D72509
* [ARM][LowOverheadLoops] Change predicate inspectionSam Parker2020-01-141-26/+27
| | | | | | | | | | Use the already provided helper function to get the operand type so that we can detect whether the vpr is being used as a predicate or not. Also use existing helpers to get the predicate indices when we converting the vpt blocks. This enables us to support both types of vpr predicate operand. Differential Revision: https://reviews.llvm.org/D72504
* [ARM][MVE] Disallow VPSEL for tail predicationSam Parker2020-01-141-3/+16
| | | | | | | | | | | | | | | | | | | Due to the current way that we collect predicated instructions, we can't easily handle vpsel in tail predicated loops. There are a couple of issues: 1) It will use the VPR as a predicate operand, but doesn't have to be instead a VPT block, which means we can assert while building up the VPT block because we don't find another VPST to being a new one. 2) VPSEL still requires a VPR operand even after tail predicating, which means we can't remove it unless there is another instruction, such as vcmp, that can provide the VPR def. The first issue should be a relatively simple fix in the logic of the LowOverheadLoops pass, whereas the second will require us to represent the 'implicit' tail predication with an explicit value. Differential Revision: https://reviews.llvm.org/D72629
* ARMLowOverheadLoops: return earlier to avoid printing irrelevant dbg msg. NFCSjoerd Meijer2020-01-131-0/+1
|
* ARMLowOverheadLoops: a few more dbg msgs to better trace rejected TP loops. NFC.Sjoerd Meijer2020-01-101-7/+16
|
* [NFC][ARM] LowOverheadLoop commentsSam Parker2020-01-091-0/+16
| | | | Add a comment describing the dependencies of the pass.
* Revert "[ARM][LowOverheadLoops] Update liveness info"Sam Parker2020-01-091-64/+0
| | | | | | | This reverts commit e93e0d413f3afa1df5c5f88df546bebcd1183155. There's some ordering problems on some on the buildbots which needs investigating.
* [ARM][LowOverheadLoops] Update liveness infoSam Parker2020-01-091-0/+64
| | | | | | | | After expanding the pseudo instructions, update the liveness info. We do this in a post-order traversal of the loop, including its exit blocks and preheader(s). Differential Revision: https://reviews.llvm.org/D72131
* [ARM][MVE] More MVETailPredication debug messages. NFC.Sjoerd Meijer2020-01-061-0/+4
| | | | | | | | | | I've added a few more debug messages to MVETailPredication because I wanted to trace better which instructions are added/removed. And while I was at it, I factored out one function which I thought was clearer, and have added some comments to describe better the flow between MVETailPredication and ARMLowOverheadLoops. Differential Revision: https://reviews.llvm.org/D71549
* [ARM][NFC] Move tail predication checksSam Parker2020-01-031-69/+76
| | | | | Extract the tail predication validation checks out into their own LowOverHeadLoop method.
* [ARM][MVE] Fixes for tail predication.Sam Parker2019-12-201-8/+51
| | | | | | | | | | | | | | | | | | 1) Fix an issue with the incorrect value being used for the number of elements being passed to [d|w]lstp. We were trying to check that the value was available at LoopStart, but this doesn't consider that the last instruction in the block could also define the register. Two helpers have been added to RDA for this. 2) Insert some code to now try to move the element count def or the insertion point so that we can perform more tail predication. 3) Related to (1), the same off-by-one could prevent us from generating a low-overhead loop when a mov lr could have been the last instruction in the block. 4) Fix up some instruction attributes so that not all the low-overhead loop instructions are labelled as branches and terminators - as this is not true for dls/dlstp. Differential Revision: https://reviews.llvm.org/D71609
* [ARM][MVE] Tail predicate in the presence of vcmpSam Parker2019-12-201-59/+235
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Record the discovered VPT blocks while checking for validity and, for now, only handle blocks that begin with VPST and not VPT. We're now allowing more than one instruction to define vpr, but each block must somehow be predicated using the vctp. This leaves us with several scenarios which need fixing up: 1) A VPT block with is only predicated by the vctp and has no internal vpr defs. 2) A VPT block which is only predicated by the vctp but has an internal vpr def. 3) A VPT block which is predicated upon the vctp as well as another vpr def. 4) A VPT block which is not predicated upon a vctp, but contains it and all instructions within the block are predicated upon in. The changes needed are, for: 1) The easy one, just remove the vpst and unpredicate the instructions in the block. 2) Remove the vpst and unpredicate the instructions up to the internal vpr def. Need insert a new vpst to predicate the remaining instructions. 3) No nothing. 4) The vctp will be inside a vpt and the instruction will be removed, so adjust the size of the mask on the vpst. Differential Revision: https://reviews.llvm.org/D71107
* [ARM] Move MVE opcode helper functions to ARMBaseInstrInfo. NFC.Sjoerd Meijer2019-12-161-37/+6
| | | | | | | | In ARMLowOverheadLoops.cpp, MVETailPredication.cpp, and MVEVPTBlock.cpp we have quite a few helper functions all looking at the opcodes of MVE instructions. This moves all these utility functions to ARMBaseInstrInfo. Diferential Revision: https://reviews.llvm.org/D71426
* [ARM][LowOverheadLoops] Remove dead loop update instructions.Sjoerd Meijer2019-12-111-2/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After creating a low-overhead loop, the loop update instruction was still lingering around hurting performance. This removes dead loop update instructions, which in our case are mostly SUBS instructions. To support this, some helper functions were added to MachineLoopUtils and ReachingDefAnalysis to analyse live-ins of loop exit blocks and find uses before a particular loop instruction, respectively. This is a first version that removes a SUBS instruction when there are no other uses inside and outside the loop block, but there are some more interesting cases in test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll which shows that there is room for improvement. For example, we can't handle this case yet: .. dlstp.32 lr, r2 .LBB0_1: mov r3, r2 subs r2, #4 vldrh.u32 q2, [r1], #8 vmov q1, q0 vmla.u32 q0, q2, r0 letp lr, .LBB0_1 @ %bb.2: vctp.32 r3 .. which is a lot more tricky because r2 is not only used by the subs, but also by the mov to r3, which is used outside the low-overhead loop by the vctp instruction, and that requires a bit of a different approach, and I will follow up on this. Differential Revision: https://reviews.llvm.org/D71007
* [ARM][ReachingDefs] Remove dead code in loloops.Sam Parker2019-11-261-49/+140
| | | | | | | | | | | | Add some more helper functions to ReachingDefs to query the uses of a given MachineInstr and also to query whether two MachineInstrs use the same def of a register. For Arm, while tail-predicating, these helpers are used in the low-overhead loops to remove the dead code that calculates the number of loop iterations. Differential Revision: https://reviews.llvm.org/D70240
* [ARM][ReachingDefs] RDA in LoLoopsSam Parker2019-11-261-93/+39
| | | | | | | | | | | | | | | | | Add several new methods to ReachingDefAnalysis: - getReachingMIDef, instead of returning an integer, return the MachineInstr that produces the def. - getInstFromId, return a MachineInstr for which the given integer corresponds to. - hasSameReachingDef, return whether two MachineInstr use the same def of a register. - isRegUsedAfter, return whether a register is used after a given MachineInstr. These methods have been used in ARMLowOverhead to replace searching for uses/defs. Differential Revision: https://reviews.llvm.org/D70009
* [ARM][MVE] Tail predication conversionSam Parker2019-11-191-134/+294
| | | | | | | | | | | | | | | This patch modifies ARMLowOverheadLoops to convert a predicated vector low-overhead loop into a tail-predicatd one. This is currently a very basic conversion, with the following restrictions: - Operates only on single block loops. - The loop can only contain a single vctp instruction. - No other instructions can write to the vpr. - We only allow a subset of the mve instructions in the loop. TODO: Pass the number of elements, not the number of iterations to dlstp/wlstp. Differential Revision: https://reviews.llvm.org/D69945
* [ARM][LowOverheadLoops] Use subs during revert.Sam Parker2019-09-231-16/+37
| | | | | | | | | | Check whether there are any uses or defs between the LoopDec and LoopEnd. If there's not, then we can use a subs to set the cpsr and skip generating a cmp. Differential Revision: https://reviews.llvm.org/D67801 llvm-svn: 372560
* [ARM][LowOverheadLoops] Use tBcc when revertingSam Parker2019-09-231-8/+11
| | | | | | | | | Check the branch target ranges and use a tBcc instead of t2Bcc when we can. Differential Revision: https://reviews.llvm.org/D67796 llvm-svn: 372557
* [ARM][LowOverheadLoops] Add LR def safety checkSam Parker2019-09-171-48/+139
| | | | | | | | | | | | | | | Converting the *LoopStart pseudo instructions into DLS/WLS results in LR being defined. These instructions were inserted on the assumption that LR would already contain the loop counter because a mov is introduced during ISel as the the consumers in the loop can only use LR. That assumption proved wrong! So perform a safety check, finding an appropriate place to insert the DLS/WLS instructions or revert if this isn't possible. Differential Revision: https://reviews.llvm.org/D67539 llvm-svn: 372111
* [ARM][LowOverheadLoops] Fix generated code for "revert".Eli Friedman2019-08-151-3/+3
| | | | | | | | | | | | | | | | | | Two issues: 1. t2CMPri shouldn't use CPSR if it isn't predicated. This doesn't really have any visible effect at the moment, but it might matter in the future. 2. The t2CMPri generated for t2WhileLoopStart might need to use a register that isn't LR. My team found this because we have a patch to track register liveness late in the pass pipeline. I'll look into upstreaming it to help catch issues like this earlier. Differential Revision: https://reviews.llvm.org/D66243 llvm-svn: 369069
* [ARM][LowOverheadLoops] Revert after read/writeSam Parker2019-08-071-13/+18
| | | | | | | | | | | | | | | | | Currently we check whether LR is stored/loaded to/from inbetween the loop decrement and loop end pseudo instructions. There's two problems here: - It relies on all load/store instructions being labelled as such in tablegen. - Actually any use of loop decrement is troublesome because the value doesn't exist! So we need to check for any read/write of LR that occurs between the two instructions and revert if we find anything. Differential Revision: https://reviews.llvm.org/D65792 llvm-svn: 368130
* [ARM][LowOverheadLoops] Revert non-header LE targetSam Parker2019-07-301-3/+9
| | | | | | | | | Revert the hardware loop upon finding a LoopEnd that doesn't target the loop header, instead of asserting a failure. Differential Revision: https://reviews.llvm.org/D65268 llvm-svn: 367296
* Test commit. NFC.Sjoerd Meijer2019-07-241-1/+1
| | | | | | | Removed 2 trailing whitespaces in 2 files that used to be in different repos to test my new github monorepo workflow. llvm-svn: 366904
* [ARM][LowOverheadLoops] Revert remaining pseudosSam Parker2019-07-221-12/+56
| | | | | | | | | | | ARMLowOverheadLoops would assert a failure if it did not find all the pseudo instructions that comprise the hardware loop. Instead of doing this, iterate through all the instructions of the function and revert any remaining pseudo instructions that haven't been converted. Differential Revision: https://reviews.llvm.org/D65080 llvm-svn: 366691
* [ARM][LowOverheadLoops] Correct offset checkingSam Parker2019-07-111-3/+13
| | | | | | | | | | | | | | | | This patch addresses a couple of problems: 1) The maximum supported offset of LE is -4094. 2) The offset of WLS also needs to be checked, this uses a maximum positive offset of 4094. The use of BasicBlockUtils has been changed because the block offsets weren't being initialised, but the isBBInRange checks both positive and negative offsets. ARMISelLowering has been tweaked because the test case presented another pattern that we weren't supporting. llvm-svn: 365749
* [NFC][ARM] Convert lambdas to static helpersSam Parker2019-07-101-57/+73
| | | | | | | Break up and convert some of the lambdas in ARMLowOverheadLoops into static functions. llvm-svn: 365623
* [ARM] WLS/LE Code GenerationSam Parker2019-07-011-27/+89
| | | | | | | | | | | | | | | | | Backend changes to enable WLS/LE low-overhead loops for armv8.1-m: 1) Use TTI to communicate to the HardwareLoop pass that we should try to generate intrinsics that guard the loop entry, as well as setting the loop trip count. 2) Lower the BRCOND that uses said intrinsic to an Arm specific node: ARMWLS. 3) ISelDAGToDAG the node to a new pseudo instruction: t2WhileLoopStart. 4) Add support in ArmLowOverheadLoops to handle the new pseudo instruction. Differential Revision: https://reviews.llvm.org/D63816 llvm-svn: 364733
* [ARM] Fix for DLS/LE CodeGenSam Parker2019-06-251-8/+9
| | | | | | | | | The expensive buildbots highlighted the mir tests were broken, which I've now updated and added --verify-machineinstrs to them. This also uncovered a couple of bugs in the backend pass, so these have also been fixed. llvm-svn: 364323
* [ARM] DLS/LE low-overhead loop code generationSam Parker2019-06-251-0/+295
Introduce three pseudo instructions to be used during DAG ISel to represent v8.1-m low-overhead loops. One maps to set_loop_iterations while loop_decrement_reg is lowered to two, so that we can separate the decrement and branching operations. The pseudo instructions are expanded pre-emission, where we can still decide whether we actually want to generate a low-overhead loop, in a new pass: ARMLowOverheadLoops. The pass currently bails, reverting to an sub, icmp and br, in the cases where a call or stack spill/restore happens between the decrement and branching instructions, or if the loop is too large. Differential Revision: https://reviews.llvm.org/D63476 llvm-svn: 364288
OpenPOWER on IntegriCloud