summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM
Commit message (Collapse)AuthorAgeFilesLines
* [ARM] ARMExpandPseudoInsts: add debug messagesSjoerd Meijer2019-05-241-2/+16
| | | | | | | | | | | This pass wasn't printing any messages at all, which I find really inconvenient while debugging/tracing things. It now dumps the before and after of expanded instructions. It doesn't do this yet for all instructions, but this is a good start I guess. Differential Revision: https://reviews.llvm.org/D62297 llvm-svn: 361604
* [ARM][CGP] Clear SafeWrap before each searchSam Parker2019-05-231-0/+1
| | | | | | | | | | | The previous patch added a member set to store instructions that we could allow to wrap. But this wasn't cleared between searches meaning that they could get promoted, incorrectly, during the promotion of a separate valid chain. Differential Revision: https://reviews.llvm.org/D62254 llvm-svn: 361462
* [ARM][CGP] Skip nuw in PrepareConstantsSam Parker2019-05-211-72/+52
| | | | | | | | | | | | | | | PrepareConstants step converts add/sub with 'negative' immediates to sub/add with a 'positive' imm to make promotion more simple. nuw already states that the add shouldn't cause an unsigned wrap, so it shouldn't need any tweaking. Plus, we also don't allow a sub with a 'negative' immediate to be safe wrap, so this functionality has been removed. The PrepareConstants step now just handles the add instructions that we've determined would be safe if they wrap around zero. Differential Revision: https://reviews.llvm.org/D62057 llvm-svn: 361227
* [ARM] Support .reloc *, R_ARM_NONE, *Fangrui Song2019-05-173-1/+21
| | | | | | | | | | | | | | | | R_ARM_NONE can be used to create references among sections. When --gc-sections is used, the referenced section will be retained if the origin section is retained. Add a generic MCFixupKind FK_NONE as this kind of no-op relocation is ubiquitous on ELF and COFF, and probably available on many other binary formats. See D62014. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D61992 llvm-svn: 360980
* [ARM] Don't use the Machine Scheduler for cortex-m at minsizeDavid Green2019-05-152-1/+8
| | | | | | | | | | | | | | | | | | The new cortex-m schedule in rL360768 helps performance, but can increase the amount of high-registers used. This, on average, ends up increasing the codesize by a fair amount (because less instructions are converted from T2 to T1). On cortex-m at -Oz, where we are quite size-paranoid, it is better to use the existing DAG scheduler with the RegPressure scheduling preference (at least until the issues around T2 vs T1 instructions can be improved). I have also made sure that the Sched::RegPressure dag scheduler is always chosen for MinSize. The test shows one case where we increase the number of registers used. Differential Revision: https://reviews.llvm.org/D61882 llvm-svn: 360769
* [ARM] Cortex-M4 scheduleDavid Green2019-05-156-65/+176
| | | | | | | | | | | | | | | | | | | | This patch adds a simple Cortex-M4 schedule, renaming the existing M3 schedule to M4 and filling in the latencies as-per the Cortex-M4 TRM: https://developer.arm.com/docs/ddi0439/latest Most of these are 1, with the important exception being loads taking 2 cycles. A few others are also higher, but I don't believe they make a large difference. I've repurposed the M3 schedule as the latencies are mostly the same between the two cores, with the M4 having more FP and DSP instructions. We also turn on MISched and UseAA for the cores that now use this. It also adds some schedule Write's to various instruction to make things simpler. Differential Revision: https://reviews.llvm.org/D54142 llvm-svn: 360768
* [ARM] Create a TargetInfo header. NFCRichard Trieu2019-05-148-6/+29
| | | | | | | | Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360718
* [ARM][ParallelDSP] Relax alias checksSam Parker2019-05-131-184/+174
| | | | | | | | | | | | | | | | | | | | | | | When deciding the safety of generating smlad, we checked for any writes within the block that may alias with any of the loads that need to be widened. This is overly conservative because it only matters when there's a potential aliasing write to a location accessed by a pair of loads. Now we check for aliasing writes only once, during setup. If two loads are found to have an aliasing write between them, we don't add these loads to LoadPairs. This means that later during the transform, we can safely widened a pair without worrying about aliasing. However, to maintain correctness, we also need to change the way that wide loads are inserted because the order is now important. The MatchSMLAD method has also been changed, absorbing MatchReductions and AddMACCandidate to hopefully improve readability. Differential Revision: https://reviews.llvm.org/D6102 llvm-svn: 360567
* [ARM] Move InstPrinter files to MCTargetDesc. NFCRichard Trieu2019-05-1112-36/+11
| | | | | | | | | For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360490
* [ARM][CGP] Guard against signext args and sitofpSam Parker2019-05-091-10/+12
| | | | | | | | | | Add an Argument that has the SExtAttr attached, as well as SIToFP instructions, as values that generate sign bits. SIToFP doesn't strictly do this and could be treated as a sink to be sign-extended. Differential Revision: https://reviews.llvm.org/D61381 llvm-svn: 360331
* [ARM GlobalISel] Map DBG_VALUE for types != s32Diana Picus2019-05-091-2/+8
| | | | | | | | ...and make sure we fail elegantly for unsupported values. s64 goes into DPR, anything <= 32 into GPR. llvm-svn: 360321
* ARM: disallow SP as Rn for Thumb2 TST & TEQ instructionsTim Northover2019-05-081-14/+14
| | | | | | | | Using SP in this position is unpredictable in ARMv7. CMP and CMN are not affected, and of course v8 relaxes this requirement, but that's handled elsewhere. llvm-svn: 360242
* [ARM GlobalISel] Widen G_SELECT operandsDiana Picus2019-05-071-2/+3
| | | | | | ...except for the condition operand. llvm-svn: 360135
* [ARM GlobalISel] Widen G_INTTOPTR/G_PTRTOINTDiana Picus2019-05-071-2/+6
| | | | | | | We actually have a couple of G_PTRTOINT to s8 when building clang, so we should do something about them. llvm-svn: 360130
* [ARM GlobalISel] Widen G_GEP index operandDiana Picus2019-05-071-1/+3
| | | | llvm-svn: 360127
* [ARM] Glue register copies to tail calls.Eli Friedman2019-05-061-26/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | This generally follows what other targets do. I don't completely understand why the special case for tail calls existed in the first place; even when the code was committed in r105413, call lowering didn't work in the way described in the comments. Stack protector lowering breaks if the register copies are not glued to a tail call: we have to insert the stack protector check before the tail call, and we choose the location based on the assumption that all physical register dependencies of a tail call are adjacent to the tail call. (See FindSplitPointForStackProtector.) This is sort of fragile, but I don't see any reason to break that assumption. I'm guessing nobody has seen this before just because it's hard to convince the scheduler to actually schedule the code in a way that breaks; even without the glue, the only computation that could actually be scheduled after the register copies is the computation of the call address, and the scheduler usually prefers to schedule that before the copies anyway. Fixes https://bugs.llvm.org/show_bug.cgi?id=41417 Differential Revision: https://reviews.llvm.org/D60427 llvm-svn: 360099
* [ARM GlobalISel] Fixup r359768Diana Picus2019-05-021-2/+1
| | | | | | Get rid of local variable used only in assertion. llvm-svn: 359772
* [ARM GlobalISel] Select extensions to < 32 bitsDiana Picus2019-05-021-5/+2
| | | | | | | | | Select G_SEXT and G_ZEXT with destination types smaller than 32 bits in the exact same way as 32 bits. This overwrites the higher bits, but that should be ok since all legal users of types smaller than 32 bits ignore those bits anyway. llvm-svn: 359768
* [ARM GlobalISel] Legalize extensions to < 32 bitsDiana Picus2019-05-021-1/+1
| | | | | | Make it legal to extend from e.g. s1 to s8 or s16. llvm-svn: 359766
* [ARM] Implement TTI::getMemcpyCostSjoerd Meijer2019-04-302-0/+37
| | | | | | | | | This implements TargetTransformInfo method getMemcpyCost, which estimates the number of instructions to which a memcpy instruction expands to. Differential Revision: https://reviews.llvm.org/D59787 llvm-svn: 359547
* [ARM GlobalISel] Widen small shift operandsDiana Picus2019-04-301-0/+1
| | | | | | | The legalizer was already widening the shift amount. Add tests for that behaviour, and also support widening the shifted value. llvm-svn: 359542
* [ARM GlobalISel] Be more careful about bailing outDiana Picus2019-04-301-2/+2
| | | | | | | | | Bail out on function arguments/returns with types aggregating an unsupported type. This fixes cases where we would happily and incorrectly lower functions taking e.g. [1 x i64] parameters, when we don't even support plain i64 yet. llvm-svn: 359540
* [TargetLowering] Change getOptimalMemOpType to take a function attribute listSjoerd Meijer2019-04-302-9/+6
| | | | | | | | | | | | The MachineFunction wasn't used in getOptimalMemOpType, but more importantly, this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType. This is the groundwork for the changes in D59766 and D59787, that allows implementation of TTI::getMemcpyCost. Differential Revision: https://reviews.llvm.org/D59785 llvm-svn: 359537
* [ARM] Add bitcast/extract_subvec. of fp16 vectorsDiogo N. Sampaio2019-04-291-91/+144
| | | | | | | | | | | | | | | | | | | | Summary: This patch adds some basic operations for fp16 vectors, such as bitcast from fp16 to i16, required to perform extract_subvector (also added here) and extract_element. Reviewers: SjoerdMeijer, DavidSpickett, t.p.northover, ostannard Reviewed By: ostannard Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60618 llvm-svn: 359433
* [ARM] Add v4f16 and v8f16 types to the CallingConvDiogo N. Sampaio2019-04-291-18/+18
| | | | | | | | | | | | | | | | | | | | | Summary: The Procedure Call Standard for the Arm Architecture states that float16x4_t and float16x8_t behave just as uint16x4_t and uint16x8_t for argument passing. This patch adds the fp16 vectors to the ARMCallingConv.td file. Reviewers: miyuki, ostannard Reviewed By: ostannard Subscribers: ostannard, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60720 llvm-svn: 359431
* [AsmPrinter] refactor to support %c w/ GlobalAddress'Nick Desaulniers2019-04-262-11/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when printing the address of a MachineOperand::MO_GlobalAddress. Move that handling into a new overriden method in each base class. A virtual method was added to the base class for handling the generic case. Refactors a few subclasses to support the target independent %a, %c, and %n. The patch also contains small cleanups for AVRAsmPrinter and SystemZAsmPrinter. It seems that NVPTXTargetLowering is possibly missing some logic to transform GlobalAddressSDNodes for TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended inline assembly asm constraints. Fixes: - https://bugs.llvm.org/show_bug.cgi?id=41402 - https://github.com/ClangBuiltLinux/linux/issues/449 Reviewers: echristo, void Reviewed By: void Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60887 llvm-svn: 359337
* ARM: disallow add/sub to sp unless Rn is also sp.Tim Northover2019-04-232-1/+27
| | | | | | | | The manual says that Thumb2 add/sub instructions are only allowed to modify sp if the first source is also sp. This is slightly different from the usual rGPR restriction since it's context-sensitive, so implement it in C++. llvm-svn: 358987
* [ARM] Update check for CBZ in IfcvtDavid Green2019-04-233-43/+59
| | | | | | | | | | | The check for creating CBZ in constant island pass recently obtained the ability to search backwards to find a Cmp instruction. The code in IfCvt should mirror this to allow more conversions to the smaller form. The common code has been pulled out into a separate function to be shared between the two places. Differential Revision: https://reviews.llvm.org/D60090 llvm-svn: 358977
* [ARM] Don't replicate instructions in Ifcvt at minsizeDavid Green2019-04-231-0/+9
| | | | | | | | | | Ifcvt can replicate instructions as it converts them to be predicated. This stops that from happening on thumb2 targets at minsize where an extra IT instruction is likely needed. Differential Revision: https://reviews.llvm.org/D60089 llvm-svn: 358974
* [ARM][FIX] Add missing f16.lane.vldN/vstN loweringDiogo N. Sampaio2019-04-231-0/+2
| | | | | | | | | | | | | | | | | | Summary: Add missing D and Q lane VLDSTLane lowering for fp16 elements. Reviewers: efriedma, kosarev, SjoerdMeijer, ostannard Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60874 llvm-svn: 358962
* [ARM] Rewrite isLegalT2AddressImmediateDavid Green2019-04-211-29/+24
| | | | | | | | | | | | | | | | | | | | | This does two main things, firstly adding some at least basic addressing modes for i64 types, and secondly treats floats and doubles sensibly when there is no fpu. The floating point change can help codesize in some cases, especially with D60294. Most backends seems to not consider the exact VT in isLegalAddressingMode, instead switching on type size. That is now what this does when the target does not have an fpu (as the float data will be loaded using LDR's). i64's currently use the address range of an LDRD (even though they may be legalised and loaded with an LDR). This is at least better than marking them all as illegal addressing modes. I have not attempted to do much with vectors yet. That will need changing once MVE is added. Differential Revision: https://reviews.llvm.org/D60677 llvm-svn: 358845
* [AsmPrinter] hoist %a output template to base class for ARM+Aarch64Nick Desaulniers2019-04-171-11/+0
| | | | | | | | | | | | | | | | | | | | | Summary: X86 is quite complicated; so I intend to leave it as is. ARM+Aarch64 do basically the same thing (Aarch64 did not correctly handle immediates, ARM has a test llvm/test/CodeGen/ARM/2009-04-06-AsmModifier.ll that uses %a with an immediate) for a flag that should be target independent anyways. Reviewers: echristo, peter.smith Reviewed By: echristo Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60841 llvm-svn: 358618
* [AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFCNick Desaulniers2019-04-171-6/+4
| | | | | | | | | | | | | | | | | | | Summary: None of these derived classes do anything that the base class cannot. If we remove these case statements, then the base class can handle them just fine. Reviewers: peter.smith, echristo Reviewed By: echristo Subscribers: nemanjai, javed.absar, eraman, kristof.beyls, hiraditya, kbarton, jsji, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60803 llvm-svn: 358603
* [TargetLowering] Rename preferShiftsToClearExtremeBits and ↵Simon Pilgrim2019-04-162-5/+4
| | | | | | | | | | | | shouldFoldShiftPairToMask (PR41359) As discussed on PR41359, this patch renames the pair of shift-mask target feature functions to make their purposes more obvious. shouldFoldShiftPairToMask -> shouldFoldConstantShiftPairToMask preferShiftsToClearExtremeBits -> shouldFoldMaskToVariableShiftPair llvm-svn: 358526
* [GlobalISel] Introduce a CSEConfigBase class to allow targets to define ↵Amara Emerson2019-04-151-0/+6
| | | | | | | | | | | | | | their own CSE configs. Because CodeGen can't depend on GlobalISel, we need a way to encapsulate the CSE configs that can be passed between TargetPassConfig and the targets' custom pass configs. This CSEConfigBase allows targets to create custom CSE configs which is then used by the GISel passes for the CSEMIRBuilder. This support will be used in a follow up commit to allow constant-only CSE for -O0 compiles in D60580. llvm-svn: 358368
* Test commit accessOliver Stannard2019-04-111-0/+1
| | | | llvm-svn: 358162
* [AsmPrinter] refactor to remove remove AsmVariant. NFCNick Desaulniers2019-04-102-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The InlineAsm::AsmDialect is only required for X86; no architecture makes use of it and as such it gets passed around between arch-specific and general code while being unused for all architectures but X86. Since the AsmDialect is queried from a MachineInstr, which we also pass around, remove the additional AsmDialect parameter and query for it deep in the X86AsmPrinter only when needed/as late as possible. This refactor should help later planned refactors to AsmPrinter, as this difference in the X86AsmPrinter makes it harder to make AsmPrinter more generic. Reviewers: craig.topper Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60488 llvm-svn: 358101
* [ARM] [FIX] Add missing f16 vector operations loweringDiogo N. Sampaio2019-04-102-1/+6
| | | | | | | | | | | | | | | | | | | | Summary: Add missing <8xhalf> shufflevectors pattern, when using concat_vector dag node. As well, allows <8xhalf> and <4xhalf> vldup1 operations. These instructions are required for v8.2a fp16 lowering of vmul_n_f16, vmulq_n_f16 and vmulq_lane_f16 intrinsics. Reviewers: olista01, pbarrio, LukeGeeson, efriedma Reviewed By: efriedma Subscribers: efriedma, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60319 llvm-svn: 358081
* Fixup r358063Diana Picus2019-04-101-2/+2
| | | | | | Fix warning/error about mixed signedness. llvm-svn: 358065
* [ARM GlobalISel] Add some asserts. NFC.Diana Picus2019-04-101-0/+2
| | | | | | Make sure some arm opcodes don't unintentionally sneak into thumb mode. llvm-svn: 358064
* [ARM GlobalISel] Select G_FCONSTANT for VFP3Diana Picus2019-04-102-10/+59
| | | | | | | | | | | | | | | | Make it possible to TableGen code for FCONSTS and FCONSTD. We need to make two changes to the TableGen descriptions of vfp_f32imm and vfp_f64imm respectively: * add GISelPredicateCode to check that the immediate fits in 8 bits; * extract the SDNodeXForms into separate definitions and create a GISDNodeXFormEquiv and a custom renderer function for each of them. There's a lot of boilerplate to get the actual value of the immediate, but it basically just boils down to calling ARM_AM::getFP32Imm or ARM_AM::getFP64Imm. llvm-svn: 358063
* [ARM GlobalISel] Select G_FCONSTANT into poolsDiana Picus2019-04-101-0/+21
| | | | | | | Put all floating point constants into constant pools and load their values from there. llvm-svn: 358062
* [ARM GlobalISel] Map G_FCONSTANTDiana Picus2019-04-101-0/+8
| | | | llvm-svn: 358061
* [GlobalISel][AArch64] Allow CallLowering to handle types which are normallyAmara Emerson2019-04-091-0/+2
| | | | | | | | | | | required to be passed as different register types. E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able truncate it back. Likewise, when dealing with returns of these types, they need to be widened in the appropriate way back. Differential Revision: https://reviews.llvm.org/D60425 llvm-svn: 358032
* [IR] Refactor attribute methods in Function class (NFC)Evandro Menezes2019-04-0411-20/+20
| | | | | | | | Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731
* [ARM GlobalISel] Support DBG_VALUEDiana Picus2019-04-041-0/+7
| | | | | | Make sure we can map and select DBG_VALUE. llvm-svn: 357681
* [IR] Create new method in `Function` class (NFC)Evandro Menezes2019-04-031-1/+1
| | | | | | | | | Create method `optForNone()` testing for the function level equivalent of `-O0` and refactor appropriately. Differential revision: https://reviews.llvm.org/D59852 llvm-svn: 357638
* [ARM] Optimize expressions like "return x != 0;" for Thumb1.Eli Friedman2019-04-021-11/+13
| | | | | | | | | | | | | There's an existing optimization for x != C, but somehow it was missing a special case for 0. While I'm here, also cleaned up the code/comments a bit: the second value produced by the MERGE_VALUES was actually dead, since a CMOV only produces one result. Differential Revision: https://reviews.llvm.org/D59616 llvm-svn: 357437
* [ARM] Don't try to create "push {r12, lr}" in Thumb1 at -Oz.Eli Friedman2019-04-011-0/+2
| | | | | | | | | | | | It's a little tricky to make this issue show up because prologue/epilogue emission normally likes to push at least two registers... but it doesn't when lr is force-spilled due to function length. Not sure if that really makes sense, but I decided not to touch it for now. Differential Revision: https://reviews.llvm.org/D59385 llvm-svn: 357436
* [ARM GlobalISel] Fix G_STORE with s1Diana Picus2019-03-281-0/+18
| | | | | | | G_STORE for 1-bit values uses a STRBi12, which stores the whole byte. Zero out the undefined bits before writing. llvm-svn: 357154
OpenPOWER on IntegriCloud