summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM
Commit message (Collapse)AuthorAgeFilesLines
...
* [IfCvt][ARM] Optimise diamond if-conversion for code sizeOliver Stannard2019-10-102-0/+36
| | | | | | | | | | | | | | | | | Currently, the heuristics the if-conversion pass uses for diamond if-conversion are based on execution time, with no consideration for code size. This adds a new set of heuristics to be used when optimising for code size. This is mostly target-independent, because the if-conversion pass can see the code size of the instructions which it is removing. For thumb, there are a few passes (insertion of IT instructions, selection of narrow branches, and selection of CBZ instructions) which are run after if conversion and affect these heuristics, so I've added target hooks to better predict the code-size effect of a proposed if-conversion. Differential revision: https://reviews.llvm.org/D67350 llvm-svn: 374301
* Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure ↵Jinsong Ji2019-10-081-2/+1
| | | | | | | | | | | | | | separately in loop-vectorize" Also Revert "[LoopVectorize] Fix non-debug builds after rL374017" This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a. This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04. The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone. llvm-svn: 374091
* [DebugInfo][If-Converter] Update call site info during the optimizationNikola Prica2019-10-081-2/+2
| | | | | | | | | | | | | | During the If-Converter optimization pay attention when copying or deleting call instructions in order to keep call site information in valid state. Reviewers: aprantl, vsk, efriedma Reviewed By: vsk, efriedma Differential Revision: https://reviews.llvm.org/D66955 llvm-svn: 374068
* [ISEL][ARM][AARCH64] Tracking simple parameter forwarding registersNikola Prica2019-10-082-2/+13
| | | | | | | | | | | | | | Support for tracking registers that forward function parameters into the following function frame. For now we only support cases when parameter is forwarded through single register. Reviewers: aprantl, vsk, t.p.northover Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D66953 llvm-svn: 374033
* [ARM] Generate vcmp instead of vcmpeKristof Beyls2019-10-085-69/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on the discussion in http://lists.llvm.org/pipermail/llvm-dev/2019-October/135574.html, the conclusion was reached that the ARM backend should produce vcmp instead of vcmpe instructions by default, i.e. not be producing an Invalid Operation exception when either arguments in a floating point compare are quiet NaNs. In the future, after constrained floating point intrinsics for floating point compare have been introduced, vcmpe instructions probably should be produced for those intrinsics - depending on the exact semantics they'll be defined to have. This patch logically consists of the following parts: - Revert http://llvm.org/viewvc/llvm-project?rev=294945&view=rev and http://llvm.org/viewvc/llvm-project?rev=294968&view=rev, which implemented fine-tuning for when to produce vcmpe (i.e. not do it for equality comparisons). The complexity introduced by those patches isn't needed anymore if we just always produce vcmp instead. Maybe these patches need to be reintroduced again once support is needed to map potential LLVM-IR constrained floating point compare intrinsics to the ARM instruction set. - Simply select vcmp, instead of vcmpe, see simple changes in lib/Target/ARM/ARMInstrVFP.td - Adapt lots of tests that tested for vcmpe (instead of vcmp). For all of these test, the intent of what is tested for isn't related to whether the vcmp should produce an Invalid Operation exception or not. Fixes PR43374. Differential Revision: https://reviews.llvm.org/D68463 llvm-svn: 374025
* [LoopVectorize][PowerPC] Estimate int and float register pressure separately ↵Zi Xuan Wu2019-10-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | in loop-vectorize In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance. So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled. It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374017
* ARM-Darwin: keep the frame register reserved even if not updated.Tim Northover2019-10-041-1/+1
| | | | | | | | Darwin platforms need the frame register to always point at a valid record even if it's not updated in a leaf function. Backtraces are more important than one extra GPR. llvm-svn: 373738
* Fix uninitialized variable warning. NFCISimon Pilgrim2019-10-031-1/+1
| | | | llvm-svn: 373582
* [ARM] Make helpers static. NFC.Benjamin Kramer2019-10-021-3/+5
| | | | llvm-svn: 373503
* [ARM] Identity shuffles are legalDavid Green2019-10-021-0/+1
| | | | | | | | | | | | | | | Identity shuffles, of the form (0, 1, 2, 3, ...) are perfectly OK under MVE (they essentially just become bitcasts). We were not catching that in the existing set of what we considered legal though. On NEON, they would be covered by vext's, but that is not generally available in MVE. This uses ShuffleVectorInst::isIdentityMask which is a little odd to use here but does what we want and prevents us from just rewriting what is the same function. Differential Revision: https://reviews.llvm.org/D68241 llvm-svn: 373446
* TLI: Remove DAG argument from getRegisterByNameMatt Arsenault2019-10-012-5/+5
| | | | | | | | | | | Replace with the MachineFunction. X86 is the only user, and only uses it for the function. This removes one obstacle from using this in GlobalISel. The other is the more tolerable EVT argument. The X86 use of the function seems questionable to me. It checks hasFP, before frame lowering. llvm-svn: 373292
* [ARM][MVE] Change VCTP operandSam Parker2019-09-301-3/+3
| | | | | | | | | | | | The VCTP instruction will calculate the predicate masked based upon the number of elements that need to be processed. I had inserted the sub before the vctp intrinsic and supplied it as the operand, but this is incorrect as the phi should directly feed the vctp. The sub is calculating the value for the next iteration. Differential Revision: https://reviews.llvm.org/D67921 llvm-svn: 373188
* [ARM][CGP] Allow signext argumentsSam Parker2019-09-301-5/+2
| | | | | | | | | | | | As we perform a zext on any arguments used in the promoted tree, it doesn't matter if they're marked as signext. The only permitted user(s) in the tree which would interpret the sign bits are signed icmps. For these instructions, their promoted operands are truncated before the icmp uses them. Differential Revision: https://reviews.llvm.org/D68019 llvm-svn: 373186
* [ARM] Cortex-M4 schedule additionsDavid Green2019-09-295-17/+40
| | | | | | | | | | | | | | | | | | | This is an attempt to fill in some of the missing instructions from the Cortex-M4 schedule, and make it easier to do the same for other ARM cpus. - Some instructions are marked as hasNoSchedulingInfo as they are pseudos or otherwise do not require scheduling info - A lot of features have been marked not supported - Some WriteRes's have been added for cvt instructions. - Some extra instruction latencies have been added, notably by relaxing the regex for dsp instruction to catch more cases, and some fp instructions. This goes a long way to get the CompleteModel working for this CPU. It does not go far enough as to get all scheduling info for all output operands correct. Differential Revision: https://reviews.llvm.org/D67957 llvm-svn: 373163
* [Alignment][NFC] Remove unneeded llvm:: scoping on Align typesGuillaume Chatelet2019-09-275-43/+41
| | | | llvm-svn: 373081
* [MC][ARM] vscclrm disassembles as vldmiaAlexandros Lamprineas2019-09-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Happens only when the mve.fp subtarget feature is enabled: $ llvm-mc -triple thumbv8.1m.main -mattr=+mve.fp,+8msecext -disassemble <<< "0x9f,0xec,0x08,0x0b" .text vldmia pc, {d0, d1, d2, d3} $ llvm-mc -triple thumbv8.1m.main -mattr=+8msecext -disassemble <<< "0x9f,0xec,0x08,0x0b" .text vscclrm {d0, d1, d2, d3, vpr} Assembling returns the correct encoding with or without mve.fp: $ llvm-mc -triple thumbv8.1m.main -mattr=+mve.fp,+8msecext -show-encoding <<< "vscclrm {d0-d3, vpr}" .text vscclrm {d0, d1, d2, d3, vpr} @ encoding: [0x9f,0xec,0x08,0x0b] $ llvm-mc -triple thumbv8.1m.main -mattr=+8msecext -show-encoding <<< "vscclrm {d0-d3, vpr}" .text vscclrm {d0, d1, d2, d3, vpr} @ encoding: [0x9f,0xec,0x08,0x0b] The problem seems to be in the TableGen description of VSCCLRMD. The least significant bit should be set to zero. Differential Revision: https://reviews.llvm.org/D68025 llvm-svn: 373052
* ARMBaseInstrInfo getOperandLatency - silence static analyzer dyn_cast<> null ↵Simon Pilgrim2019-09-261-2/+2
| | | | | | | | dereference warnings. NFCI. The static analyzer is warning about potential null dereferences, but we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 372992
* [ARM] Ensure we do not attempt to create lsll #0David Green2019-09-253-5/+6
| | | | | | | | | | | During legalisation we can end up with some pretty strange nodes, like shifts of 0. We need to make sure we don't try to make long shifts of these, ending up with invalid assembly instructions. A long shift with a zero immediate actually encodes a shift by 32. Differential Revision: https://reviews.llvm.org/D67664 llvm-svn: 372839
* [ARM] Split large widening MVE loadsDavid Green2019-09-241-3/+72
| | | | | | | | | | | | Similar to rL372717, we can force the splitting of extends of vector loads in MVE, in order to use the better widening loads as opposed to going through expensive extends. This adds a combine to early-on detect extends of loads and split the load in two, from where normal legalisation will kick in and we get a series of widening loads. Differential Revision: https://reviews.llvm.org/D67909 llvm-svn: 372721
* [ARM] Split large truncating MVE storesDavid Green2019-09-241-82/+148
| | | | | | | | | | | | | | | | | | | | | MVE does not have a simple sign extend instruction that can move elements across lanes. We currently often end up moving each lane into and out of a GPR, in order to get elements into the correct places. When we have a store of a trunc (or a extend of a load), we can instead just split the store/load in two, using the narrowing/widening load/store instructions from each half of the vector. This does that for stores. It happens very early in a store combine, so as to easily detect the truncates. (It would be possible to do this later, but that would involve looking through a buildvector of extract elements. Not impossible but this way seemed simpler). By enabling store combines we also get a vmovdrr combine for free, helping some other tests. Differential Revision: https://reviews.llvm.org/D67828 llvm-svn: 372717
* MCRegisterInfo: Merge getLLVMRegNum and getLLVMRegNumFromEHPavel Labath2019-09-241-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The functions different in two ways: - getLLVMRegNum could return both "eh" and "other" dwarf register numbers, while getLLVMRegNumFromEH only returned the "eh" number. - getLLVMRegNum asserted if the register was not found, while the second function returned -1. The second distinction was pretty important, but it was very hard to infer that from the function name. Aditionally, for the use case of dumping dwarf expressions, we needed a function which can work with both kinds of number, but does not assert. This patch solves both of these issues by merging the two functions into one, returning an Optional<unsigned> value. While the same thing could be achieved by adding an "IsEH" argument to the (renamed) getLLVMRegNumFromEH function, it seemed better to avoid the confusion of two functions and put the choice of asserting into the hands of the caller -- if he checks the Optional value, he can safely process "untrusted" input, and if he blindly dereferences the Optional, he gets the assertion. I've updated all call sites to the new API, choosing between the two options according to the function they were calling originally, except that I've updated the usage in DWARFExpression.cpp to use the "safe" method instead, and added a test case which would have previously triggered an assertion failure when processing (incorrect?) dwarf expressions. Reviewers: dsanders, arsenm, JDevlieghere Subscribers: wdng, aprantl, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67154 llvm-svn: 372710
* Cosmetic; don't use the magic constant 35 when HASH is more readable. This ↵Mark Murray2019-09-231-3/+3
| | | | | | | | | | | | | | | | matches other MCK__<THING>_* usage better. Summary: No functional change. This fixes a magic constant in MCK__*_... macros only. Reviewers: ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67840 llvm-svn: 372599
* [Alignment] Get DataLayout::StackAlignment as AlignGuillaume Chatelet2019-09-232-2/+3
| | | | | | | | | | | | | | | | | | | | Summary: Internally it is needed to know if StackAlignment is set but we can expose it as llvm::Align. This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67852 llvm-svn: 372585
* [ARM][MVE] Remove old tail predicatesSam Parker2019-09-232-9/+60
| | | | | | | | | | | Remove any predicate that we replace with a vctp intrinsic, and try to remove their operands too. Also look into the exit block to see if there's any duplicates of the predicates that we've replaced and clone the vctp to be used there instead. Differential Revision: https://reviews.llvm.org/D67709 llvm-svn: 372567
* [ARM][LowOverheadLoops] Use subs during revert.Sam Parker2019-09-231-16/+37
| | | | | | | | | | Check whether there are any uses or defs between the LoopDec and LoopEnd. If there's not, then we can use a subs to set the cpsr and skip generating a cmp. Differential Revision: https://reviews.llvm.org/D67801 llvm-svn: 372560
* [ARM][LowOverheadLoops] Use tBcc when revertingSam Parker2019-09-231-8/+11
| | | | | | | | | Check the branch target ranges and use a tBcc instead of t2Bcc when we can. Differential Revision: https://reviews.llvm.org/D67796 llvm-svn: 372557
* [ARM] Fix CTTZ not generating correct instructions MVEOliver Cruickshank2019-09-201-1/+1
| | | | | | CTTZ intrinsic should have been set to Custom, not Expand llvm-svn: 372401
* Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"Matt Arsenault2019-09-193-62/+62
| | | | | | | | | This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338
* Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"Hans Wennborg2019-09-193-62/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC*. > > Since now intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, simlar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_* instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372314
* [ARM] MVE i1 splatDavid Green2019-09-191-1/+13
| | | | | | | | | We needn't BFI each lane individually into a predicate register when each lane in the same. A simple sign extend and a vmsr will do. Differential Revision: https://reviews.llvm.org/D67653 llvm-svn: 372313
* Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.Simon Pilgrim2019-09-191-1/+1
| | | | llvm-svn: 372308
* GlobalISel: Don't materialize immarg arguments to intrinsicsMatt Arsenault2019-09-193-62/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Encode them directly as an imm argument to G_INTRINSIC*. Since now intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, simlar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372285
* [Alignment][NFC] Remove LogAlignment functionsGuillaume Chatelet2019-09-183-59/+55
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67620 llvm-svn: 372231
* Revert "[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize"Krasimir Georgiev2019-09-182-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This reverts commit r372204. This change causes build bot failures under msan: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35236/steps/check-llvm%20msan/logs/stdio: ``` FAIL: LLVM :: DebugInfo/AArch64/asan-stack-vars.mir (19531 of 33579) ******************** TEST 'LLVM :: DebugInfo/AArch64/asan-stack-vars.mir' FAILED ******************** Script: -- : 'RUN: at line 1'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc -O0 -start-before=livedebugvalues -filetype=obj -o - /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfdump -v - | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir -- Exit Code: 2 Command Output (stderr): -- ==62894==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0xdfcafb in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 #1 0xdfae8a in resolveFrameIndexReference /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1580:10 #2 0xdfae8a in llvm::AArch64FrameLowering::getFrameIndexReference(llvm::MachineFunction const&, int, unsigned int&) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1536 #3 0x46642c1 in (anonymous namespace)::LiveDebugValues::extractSpillBaseRegAndOffset(llvm::MachineInstr const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:582:21 #4 0x4647cb3 in transferSpillOrRestoreInst /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:883:11 #5 0x4647cb3 in process /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1079 #6 0x4647cb3 in (anonymous namespace)::LiveDebugValues::ExtendRanges(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1361 #7 0x463ac0e in (anonymous namespace)::LiveDebugValues::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1415:18 #8 0x4854ef0 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13 #9 0x53b0b01 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:27 #10 0x53b15f6 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:16 #11 0x53b298d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:27 #12 0x53b298d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863 #13 0x905f21 in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:601:8 #14 0x8fdc4e in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:355:22 #15 0x7f67673632e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #16 0x882369 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc+0x882369) MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const Exiting error: -: The file was not recognized as a valid object file FileCheck error: '-' is empty. FileCheck command line: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir ``` Reviewers: bkramer Reviewed By: bkramer Subscribers: sdardis, aprantl, kristof.beyls, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67710 llvm-svn: 372228
* [AArch64][DebugInfo] Do not recompute CalleeSavedStackSizeSander de Smalen2019-09-182-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug exposed by D65653 where a subsequent invocation of `determineCalleeSaves` ends up with a different size for the callee save area, leading to different frame-offsets in debug information. In the invocation by PEI, `determineCalleeSaves` tries to determine whether it needs to spill an extra callee-saved register to get an emergency spill slot. To do this, it calls 'estimateStackSize' and manually adds the size of the callee-saves to this. PEI then allocates the spill objects for the callee saves and the remaining frame layout is calculated accordingly. A second invocation in LiveDebugValues causes estimateStackSize to return the size of the stack frame including the callee-saves. Given that the size of the callee-saves is added to this, these callee-saves are counted twice, which leads `determineCalleeSaves` to believe the stack has become big enough to require spilling an extra callee-save as emergency spillslot. It then updates CalleeSavedStackSize with a larger value. Since CalleeSavedStackSize is used in the calculation of the frame offset in getFrameIndexReference, this leads to incorrect offsets for variables/locals when this information is recalculated after PEI. Reviewers: omjavaid, eli.friedman, thegameg, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D66935 llvm-svn: 372204
* [ARM] VFPv2 only supports 16 D registers.Eli Friedman2019-09-175-15/+23
| | | | | | | | | | | | | | | | | | | | r361845 changed the way we handle "D16" vs. "D32" targets; there used to be a negative "d16" which removed instructions from the instruction set, and now there's a "d32" feature which adds instructions to the instruction set. This is good, but there was an oversight in the implementation: the behavior of VFPv2 was changed. In particular, the "vfp2" feature was changed to imply "d32". This is wrong: VFPv2 only supports 16 D registers. In practice, this means if you specify -mfpu=vfpv2, the compiler will generate illegal instructions. This patch gets rid of "vfp2d16" and "vfp2d16sp", and fixes "vfp2" and "vfp2sp" so they don't imply "d32". Differential Revision: https://reviews.llvm.org/D67375 llvm-svn: 372186
* [ARM][AsmParser] Don't dereference a dyn_cast result. NFCI.Simon Pilgrim2019-09-171-50/+41
| | | | | | The static analyzer is warning about potential null dereferences of dyn_cast<> results - in these cases we can safely use cast<> directly as we know that these cases should all be the correct type, which is why its working atm and anyway cast<> will assert if they aren't. llvm-svn: 372145
* [ARM] Add a SelectTAddrModeImm7 for MVE narrow loads and storesDavid Green2019-09-172-10/+35
| | | | | | | | | | | | We were previously using the SelectT2AddrModeImm7 for both normal and narrowing MVE loads/stores. As the narrowing instructions do not accept sp as a register, it makes little sense to optimise a FrameIndex into the load, only to have to recover that later on. This adds a SelectTAddrModeImm7 which does not do that folding, and uses it for narrowing load/store patterns. Differential Revision: https://reviews.llvm.org/D67489 llvm-svn: 372134
* [ARM] Reserve an emergency spill slot for fp16 addressing modes that need itDavid Green2019-09-171-1/+14
| | | | | | | | | | | Similar to D67327, but this time for the FP16 VLDR and VSTR instructions that use the AddrMode5FP16 addressing mode. We need to reserve an emergency spill slot for instructions that will be out of range to use sp directly. AddrMode5FP16 is 8 bits with a scale of 2. Differential Revision: https://reviews.llvm.org/D67483 llvm-svn: 372132
* [ARM] Fix for buildbotsSam Parker2019-09-171-1/+0
| | | | | | | | Remove setPreservesCFG from ARMConstantIslandPass and add a couple of -verify-machine-dom-info instances into the existing codegen tests. llvm-svn: 372126
* [ARM] Fix for MVE load/store stack accessesDavid Green2019-09-171-7/+26
| | | | | | | | | | MVE loads and stores have a 7 bit immediate range, scaled by the length of the type. This needs to be taught to the stack estimation code to ensure that an emergency spill slot is reserved in case we run out of registers when materialising stack indices. Also the narrowing loads/stores can be created with frame indices even though they do not accept SP as a register. We need in those cases to make sure we have an emergency register to use as the frame base, as SP can never be used. Differential Revision: https://reviews.llvm.org/D67327 llvm-svn: 372114
* Hide implementation details in namespaces.Benjamin Kramer2019-09-171-4/+4
| | | | llvm-svn: 372113
* [ARM][LowOverheadLoops] Add LR def safety checkSam Parker2019-09-171-48/+139
| | | | | | | | | | | | | | | Converting the *LoopStart pseudo instructions into DLS/WLS results in LR being defined. These instructions were inserted on the assumption that LR would already contain the loop counter because a mov is introduced during ISel as the the consumers in the loop can only use LR. That assumption proved wrong! So perform a safety check, finding an appropriate place to insert the DLS/WLS instructions or revert if this isn't possible. Differential Revision: https://reviews.llvm.org/D67539 llvm-svn: 372111
* [SVE][MVT] Fixed-length vector MVT rangesGraham Hunter2019-09-171-4/+4
| | | | | | | | | | | | | | | | | * Reordered MVT simple types to group scalable vector types together. * New range functions in MachineValueType.h to only iterate over the fixed-length int/fp vector types. * Stopped backends which don't support scalable vector types from iterating over scalable types. Reviewers: sdesmalen, greened Reviewed By: greened Differential Revision: https://reviews.llvm.org/D66339 llvm-svn: 372099
* [ARM] LE support in ConstantIslandsSam Parker2019-09-171-26/+111
| | | | | | | | | | | | | The low-overhead branch extension provides a loop-end 'LE' instruction that performs no decrement nor compare, it just jumps backwards. This patch modifies the constant islands pass to try to insert LE instructions in place of a Thumb2 conditional branch, instead of shrinking it. This only happens if a cmp can be converted to a cbn/z and used to exit the loop. Differential Revision: https://reviews.llvm.org/D67404 llvm-svn: 372085
* [ARM][MVE] Add invalidForTailPredication to TSFlagsSam Parker2019-09-173-0/+14
| | | | | | | | | Set this bit for the MVE reduction instructions to prevent a loop from becoming tail predicated in their presence. Differential Revision: https://reviews.llvm.org/D67444 llvm-svn: 372076
* [ARM] A predicate cast of a predicate cast is a predicate castDavid Green2019-09-161-0/+20
| | | | | | | | | | | The adds some very basic folding of PREDICATE_CASTS, removing cases when they are chained together. These would already be removed eventually, as these are lowered to copies. This just allows it to happen earlier, which can help other simplifications. Differential Revision: https://reviews.llvm.org/D67591 llvm-svn: 372012
* [ARM] Add patterns for BSWAP intrinsic on MVEOliver Cruickshank2019-09-162-0/+8
| | | | | | | BSWAP can use the VREV instruction on MVE to produce better results than expanding. llvm-svn: 372002
* [ARM] Add patterns for bitreverse intrinsic on MVEOliver Cruickshank2019-09-162-0/+12
| | | | | | | BITREVERSE can use the VBRSR which will reverse and right shift. Shifting right by 0 will just reverse the bits. llvm-svn: 372001
* [ARM] Lower CTTZ on MVEOliver Cruickshank2019-09-161-2/+2
| | | | | | | Lower CTTZ on MVE using VBRSR and VCLS which will reverse the bits and count the leading zeros, equivalent to a count trailing zeros (CTTZ). llvm-svn: 372000
OpenPOWER on IntegriCloud