summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [ARM] Fix null pointer dereference in ↵Fangrui Song2019-07-081-6/+8
| | | | | | | | CodeGen/ARM/Windows/stack-protector-msvc.ll.test after D64292/r365283 CLI.CS may not be set. llvm-svn: 365299
* [ARM] Add support for MSVC stack cookie checkingMartin Storsjo2019-07-071-0/+30
| | | | | | | | Heavily based on the same for AArch64, from SVN r346469. Differential Revision: https://reviews.llvm.org/D64292 llvm-svn: 365283
* [ARM] MVE VMOV immediate handlingDavid Green2019-07-051-9/+18
| | | | | | | | | | | | | This adds some handling for VMOVimm, using the same method that NEON uses. We create VMOVIMM/VMVNIMM/VMOVFPIMM nodes based on the immediate, and select them using the now renamed ARMvmovImm/etc. There is also an extra 64bit immediate mode that I have not yet added here. Code by David Sherwood Differential Revision: https://reviews.llvm.org/D63884 llvm-svn: 365178
* [ARM] MVE fp to int conversionsDavid Green2019-07-051-0/+7
| | | | | | | | This adds the patterns needed for fptosi and sitofp. Differential Revision: https://reviews.llvm.org/D63729 llvm-svn: 365176
* [ARM] Favour PL/MI over GE/LT when possibleDavid Green2019-07-041-0/+19
| | | | | | | | | | | | | | | The arm condition codes for GE is N==V (and for LT is N!=V). If the source of flags cannot set V (overflow), such as a cmp against #0, then we can use the simpler PL and MI conditions that only check N. As these PL/MI conditions are simpler than GE/LT, other passes like the peephole optimiser can have a better time optimising away the redundant CMPs. The exception is the VSEL instruction, which cannot take the PL code, so there the transform favours GE. Differential Revision: https://reviews.llvm.org/D64160 llvm-svn: 365117
* [ARM] Fix for NDEBUG buildsSam Parker2019-07-031-4/+3
| | | | | | | | Fix unused variable warning as well as a nonsense assert. Differential Revision: https://reviews.llvm.org/D63816 llvm-svn: 365046
* [Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457)Roman Lebedev2019-07-031-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the backend part of [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]. In middle-end, we'd want to prefer the form with two adds - D63992, but as this diff shows, not every target will prefer that pattern. Out of 4 targets for which i added tests all seem to be ok with inc-of-add for scalars, but only X86 prefer that same pattern for vectors. Here i'm adding a new TLI hook, always defaulting to the inc-of-add, but adding AArch64,ARM,PowerPC overrides to prefer inc-of-add only for scalars. Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel Reviewed By: efriedma Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64090 llvm-svn: 365010
* [ARM] MVE: allow soft-float ABI to pass vector types.Simon Tatham2019-07-021-2/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | Passing a vector type over the soft-float ABI involves it being split into four GPRs, so the first thing that has to happen at the start of the function is to recombine those into a vector register. The ABI types all vectors as v2f64, so we need to support BUILD_VECTOR for that type, which I do in this patch by allowing it to be expanded in terms of INSERT_VECTOR_ELT, and writing an ISel pattern for that in turn. Similarly, I provide a rule for EXTRACT_VECTOR_ELT so that a returned vector can be marshalled back into GPRs. While I'm here, I've also added ISD::UNDEF to the list of operations we turn back on in `setAllExpand`, because I noticed that otherwise it gets expanded into a BUILD_VECTOR with explicit zero inputs, leading to pointless machine instructions to zero out a vector register that's about to have every lane overwritten of in any case. Reviewers: dmgreen, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63937 llvm-svn: 364910
* [ARM] Stop using scalar FP instructions in integer-only MVE mode.Simon Tatham2019-07-021-16/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If you compile with `-mattr=+mve` (enabling integer MVE instructions but not floating-point ones), then the scalar FP //registers// exist and it's legal to move things in and out of them, load and store them, but it's not legal to do arithmetic on them. In D60708, the calls to `addRegisterClass` in ARMISelLowering that enable use of the scalar FP registers became conditionalised on `Subtarget->hasFPRegs()` instead of `Subtarget->hasVFP2Base()`, so that loads, stores and moves of those registers would work. But I didn't realise that that would also enable all the operations on those types by default. Now, if the target doesn't have basic VFP, we follow up those `addRegisterClass` calls by turning back off all the nontrivial operations you can perform on f32 and f64. That causes several knock-on failures, which are fixed by allowing the `VMOVDcc` and `VMOVScc` instructions to be selected even if all you have is `HasFPRegs`, and adjusting several checks for 'is this a double in a single-precision-only world?' to the more general 'is this any FP type we can't do arithmetic on?'. Between those, the whole of the `float-ops.ll` and `fp16-instructions.ll` tests can now run in MVE-without-FP mode and generate correct-looking code. One odd side effect is that I had to relax the check lines in that test so that they permit test functions like `add_f` to be generated as tailcalls to software FP library functions, instead of ordinary calls. Doing that is entirely legal, but the mystery is why this is the first RUN line that's needed the relaxation: on the usual kind of non-FP target, no tailcalls ever seem to be generated. Going by the llc messages, I think `SoftenFloatResult` must be perturbing the code generation in some way, but that's as much as I can guess. Reviewers: dmgreen, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63938 llvm-svn: 364909
* [ARM] MVE: support QQPRRegClass and QQQQPRRegClassMikhail Maltsev2019-07-011-2/+3
| | | | | | | | | | | | | | | | | | | Summary: QQPRRegClass and QQQQPRRegClass are used by the interleaving/deinterleaving loads/stores to represent sequences of consecutive SIMD registers. Reviewers: ostannard, simon_tatham, dmgreen Reviewed By: simon_tatham Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64009 llvm-svn: 364794
* [ARM] WLS/LE Code GenerationSam Parker2019-07-011-0/+42
| | | | | | | | | | | | | | | | | Backend changes to enable WLS/LE low-overhead loops for armv8.1-m: 1) Use TTI to communicate to the HardwareLoop pass that we should try to generate intrinsics that guard the loop entry, as well as setting the loop trip count. 2) Lower the BRCOND that uses said intrinsic to an Arm specific node: ARMWLS. 3) ISelDAGToDAG the node to a new pseudo instruction: t2WhileLoopStart. 4) Add support in ArmLowOverheadLoops to handle the new pseudo instruction. Differential Revision: https://reviews.llvm.org/D63816 llvm-svn: 364733
* [ARM] Add support for the MVE long shift instructionsSam Tebbs2019-06-281-2/+55
| | | | | | | | | | | | MVE adds the lsll, lsrl and asrl instructions, which perform a shift on a 64 bit value separated into two 32 bit registers. The Expand64BitShift function is modified to accept ISD::SHL, ISD::SRL and ISD::SRA and convert it into the appropriate opcode in ARMISD. An SHL is converted into an lsll, an SRL is converted into an lsrl for the immediate form and a negation and lsll for the register form, and SRA is converted into an asrl. test/CodeGen/ARM/shift_parts.ll is added to test the logic of emitting these instructions. Differential Revision: https://reviews.llvm.org/D63430 llvm-svn: 364654
* [ARM] Mark math routines as non-legal for MVEDavid Green2019-06-281-0/+9
| | | | | | | | | This adds handling and tests for a number of floating point math routines, which have no MVE instructions. Differential Revision: https://reviews.llvm.org/D63725 llvm-svn: 364641
* [ARM] Widening loads and narrowing storesDavid Green2019-06-281-4/+21
| | | | | | | | | | | | MVE has instructions to widen as it loads, and narrow as it stores. This adds the required patterns and legalisation to make them work including specifying that they are legal, patterns to select them and test changes. Patch by David Sherwood. Differential Revision: https://reviews.llvm.org/D63839 llvm-svn: 364636
* [ARM] MVE loads and storesDavid Green2019-06-281-11/+49
| | | | | | | | | | | | | | | This fills in the gaps for basic MVE loads and stores, allowing unaligned access and adding far too many tests. These will become important as narrowing/expanding and pre/post inc are added. Big endian might still not be handled very well, because we have not yet added bitcasts (and I'm not sure how we want it to work yet). I've included the alignment code anyway which maps with our current patterns. We plan to return to that later. Code written by Simon Tatham, with additional tests from Me and Mikhail Maltsev. Differential Revision: https://reviews.llvm.org/D63838 llvm-svn: 364633
* [ARM] Mark div and rem as expand for MVEDavid Green2019-06-281-0/+12
| | | | | | | | | We don't have vector operations for these, so they need to be expanded for both integer and float. Differential Revision: https://reviews.llvm.org/D63595 llvm-svn: 364631
* [ARM] MVE vector shufflesDavid Green2019-06-281-61/+128
| | | | | | | | | | | | | | | | | | This patch adds necessary shuffle vector and buildvector support for ARM MVE. It essentially adds support for VDUP, VREVs and some VMOVs, which are often required by other code (like upcoming patches). This mostly uses the same code from Neon that already generated NEONvdup/NEONvduplane/NEONvrev's. These have been renamed to ARMvdup/etc and moved to ARMInstrInfo as they are common to both architectures. Most of the selection code seems to be applicable to both, but NEON does have some more instructions making some parts specific. Most code originally by David Sherwood. Differential Revision: https://reviews.llvm.org/D63567 llvm-svn: 364626
* [ARM] Fix formatting issue in ARMISelLowering.cppSam Tebbs2019-06-271-1/+2
| | | | | | | Fix a formatting error in ARMISelLowering.cpp::Expand64BitShift. My test commit after receiving write access. llvm-svn: 364560
* [ARM] Don't reserve R12 on Thumb1 as an emergency spill slot.Eli Friedman2019-06-261-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation of ThumbRegisterInfo::saveScavengerRegister is bad for two reasons: one, it's buggy, and two, it blocks using R12 for other optimizations. So this patch gets rid of it, and adds the necessary support for using an ordinary emergency spill slot on Thumb1. (Specifically, I think saveScavengerRegister was broken by r305625, and nobody noticed for two years because the codepath is almost never used. The new code will also probably not be used much, but it now has better tests, and if we fail to emit a necessary emergency spill slot we get a reasonable error message instead of a miscompile.) A rough outline of the changes in the patch: 1. Gets rid of ThumbRegisterInfo::saveScavengerRegister. 2. Modifies ARMFrameLowering::determineCalleeSaves to allocate an emergency spill slot for Thumb1. 3. Implements useFPForScavengingIndex, so the emergency spill slot isn't placed at a negative offset from FP on Thumb1. 4. Modifies the heuristics for allocating an emergency spill slot to support Thumb1. This includes fixing ExtraCSSpill so we don't try to use "lr" as a substitute for allocating an emergency spill slot. 5. Allocates a base pointer in more cases, so the emergency spill slot is always accessible. 6. Modifies ARMFrameLowering::ResolveFrameIndexReference to compute the right offset in the new cases where we're forcing a base pointer. 7. Ensures we never generate a load or store with an offset outside of its frame object. This makes the heuristics more straightforward. 8. Changes Thumb1 prologue and epilogue emission so it never uses register scavenging. Some of the changes to the emergency spill slot heuristics in determineCalleeSaves affect ARM/Thumb2; hopefully, they should allow the compiler to avoid allocating an emergency spill slot in cases where it isn't necessary. The rest of the changes should only affect Thumb1. Differential Revision: https://reviews.llvm.org/D63677 llvm-svn: 364490
* [ARM] Fix -Wimplicit-fallthrough after D60709/r364331Fangrui Song2019-06-261-4/+3
| | | | llvm-svn: 364376
* [ARM] Support inline assembler constraints for MVE.Simon Tatham2019-06-251-1/+22
| | | | | | | | | | | | | | | | | | | | | "To" selects an odd-numbered GPR, and "Te" an even one. There are some 8.1-M instructions that have one too few bits in their register fields and require registers of particular parity, without necessarily using a consecutive even/odd pair. Also, the constraint letter "t" should select an MVE q-register, when MVE is present. This didn't need any source changes, but some extra tests have been added. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60709 llvm-svn: 364331
* [ARM] Code-generation infrastructure for MVE.Simon Tatham2019-06-251-5/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This provides the low-level support to start using MVE vector types in LLVM IR, loading and storing them, passing them to __asm__ statements containing hand-written MVE vector instructions, and *if* you have the hard-float ABI turned on, using them as function parameters. (In the soft-float ABI, vector types are passed in integer registers, and combining all those 32-bit integers into a q-reg requires support for selection DAG nodes like insert_vector_elt and build_vector which aren't implemented yet for MVE. In fact I've also had to add `arm_aapcs_vfpcc` to a couple of existing tests to avoid that problem.) Specifically, this commit adds support for: * spills, reloads and register moves for MVE vector registers * ditto for the VPT predication mask that lives in VPR.P0 * make all the MVE vector types legal in ISel, and provide selection DAG patterns for BITCAST, LOAD and STORE * make loads and stores of scalar FP types conditional on `hasFPRegs()` rather than `hasVFP2Base()`. As a result a few existing tests needed their llc command lines updating to use `-mattr=-fpregs` as their method of turning off all hardware FP support. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60708 llvm-svn: 364329
* [ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D60692Fangrui Song2019-06-251-0/+1
| | | | llvm-svn: 364312
* [ARM] Explicit lowering of half <-> double conversions.Simon Tatham2019-06-251-11/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an FP_EXTEND or FP_ROUND isel dag node converts directly between f16 and f32 when the target CPU has no instruction to do it in one go, it has to be done in two steps instead, going via f32. Previously, this was done implicitly, because all such CPUs had the storage-only implementation of f16 (i.e. the only thing you can do with one at all is to convert it to/from f32). So isel would legalize the f16 into an f32 as soon as it saw it, by inserting an fp16_to_fp node (or vice versa), and then the fp_extend would already be f32->f64 rather than f16->f64. But that technique can't support a target CPU which has full f16 support but _not_ f64, such as some variants of Arm v8.1-M. So now we provide custom lowering for FP_EXTEND and FP_ROUND, which checks support for f16 and f64 and decides on the best thing to do given the combination of flags it gets back. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60692 llvm-svn: 364294
* [TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests ↵Simon Pilgrim2019-06-121-4/+7
| | | | | | | | | | | | | | (PR42123) As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space. This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them. If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores. Differential Revision: https://reviews.llvm.org/D63075 llvm-svn: 363179
* [ARM] Adjust isLegalT1AddressImmediate for non-legal typesDavid Green2019-06-081-2/+2
| | | | | | | | | | | Types such as float and i64's do not have legal loads in Thumb1, but will still be loaded with a LDR (or potentially multiple LDR's). As such we can treat the cost of addressing mode calculations the same as an i32 and get some optimisation benefits. Differential Revision: https://reviews.llvm.org/D62968 llvm-svn: 362874
* [ARM] Add MVE addressing to isLegalT2AddressImmediateDavid Green2019-06-081-1/+20
| | | | | | | | | | Now with MVE being added, we can add the vector addressing mode costs for it. These are generally imm7 multiplied by the size of the type being loaded / stored. Differential Revision: https://reviews.llvm.org/D62967 llvm-svn: 362873
* [ARM] Add fp16 addressing to isLegalT2AddressImmediateDavid Green2019-06-081-0/+3
| | | | | | | | | | The fp16 version of VLDR takes a imm8 multiplied by 2. This updates the costs to account for those, and adds extra testing. It is dependant upon hasFPRegs16 as this is what the load/store instructions require. Differential Revision: https://reviews.llvm.org/D62966 llvm-svn: 362872
* [ARM][FIX] Ran out of registers due tail recursionDiogo N. Sampaio2019-06-031-40/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - pr42062 When compiling for MinSize, ARMTargetLowering::LowerCall decides to indirect multiple calls to a same function. However, it disconsiders the limitation that thumb1 indirect calls require the callee to be in a register from r0 to r3 (llvm limiation). If all those registers are used by arguments, the compiler dies with "error: run out of registers during register allocation". This patch tells the function IsEligibleForTailCallOptimization if we intend to perform indirect calls, as to avoid tail call optimization. Reviewers: dmgreen, efriedma Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62683 llvm-svn: 362366
* [ARM] LowerVECTOR_SHUFFLE - fix uninitialized variable warnings. NFCI.Simon Pilgrim2019-05-301-4/+4
| | | | llvm-svn: 362094
* [ARM] Replace fp-only-sp and d16 with fp64 and d32.Simon Tatham2019-05-281-35/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits. Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options. A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list *before* rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60691 llvm-svn: 361845
* [AMDGPU] Divergence driven ISel. Assign register class for cross block ↵Alexander Timofeev2019-05-261-1/+3
| | | | | | | | | | | | | | | | | | values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 This commit was reverted because of the build failure. The reason was mlformed patch. Build failure fixed. llvm-svn: 361741
* [ARM] Select a number of fp16 rounding functionsDavid Green2019-05-261-0/+2
| | | | | | | | | This add patterns for fp16 round and ceil etc. Same as the float and double patterns. Differential Revision: https://reviews.llvm.org/D62326 llvm-svn: 361718
* [ARM] Promote various fp16 math intrinsicsDavid Green2019-05-261-0/+11
| | | | | | | | | Promote a number of fp16 math intrinsics to float, so that the relevant float math routines can be used. Copysign is expanded so as to be handled in-place. Differential Revision: https://reviews.llvm.org/D62325 llvm-svn: 361717
* [ARM] Promote fp16 fremDavid Green2019-05-261-0/+5
| | | | | | | | Promote fp16 frem operations on ARM to floats so they call fmodf. Differential Revision: https://reviews.llvm.org/D62321 llvm-svn: 361713
* Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for ↵Peter Collingbourne2019-05-251-3/+1
| | | | | | | | | | cross block values according to the divergence." Broke sanitizer bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio llvm-svn: 361688
* [AMDGPU] Divergence driven ISel. Assign register class for cross block ↵Alexander Timofeev2019-05-241-1/+3
| | | | | | | | | | | | | | values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 llvm-svn: 361644
* [ARM] Don't use the Machine Scheduler for cortex-m at minsizeDavid Green2019-05-151-1/+1
| | | | | | | | | | | | | | | | | | The new cortex-m schedule in rL360768 helps performance, but can increase the amount of high-registers used. This, on average, ends up increasing the codesize by a fair amount (because less instructions are converted from T2 to T1). On cortex-m at -Oz, where we are quite size-paranoid, it is better to use the existing DAG scheduler with the RegPressure scheduling preference (at least until the issues around T2 vs T1 instructions can be improved). I have also made sure that the Sched::RegPressure dag scheduler is always chosen for MinSize. The test shows one case where we increase the number of registers used. Differential Revision: https://reviews.llvm.org/D61882 llvm-svn: 360769
* [ARM] Glue register copies to tail calls.Eli Friedman2019-05-061-26/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | This generally follows what other targets do. I don't completely understand why the special case for tail calls existed in the first place; even when the code was committed in r105413, call lowering didn't work in the way described in the comments. Stack protector lowering breaks if the register copies are not glued to a tail call: we have to insert the stack protector check before the tail call, and we choose the location based on the assumption that all physical register dependencies of a tail call are adjacent to the tail call. (See FindSplitPointForStackProtector.) This is sort of fragile, but I don't see any reason to break that assumption. I'm guessing nobody has seen this before just because it's hard to convince the scheduler to actually schedule the code in a way that breaks; even without the glue, the only computation that could actually be scheduled after the register copies is the computation of the call address, and the scheduler usually prefers to schedule that before the copies anyway. Fixes https://bugs.llvm.org/show_bug.cgi?id=41417 Differential Revision: https://reviews.llvm.org/D60427 llvm-svn: 360099
* [TargetLowering] Change getOptimalMemOpType to take a function attribute listSjoerd Meijer2019-04-301-8/+5
| | | | | | | | | | | | The MachineFunction wasn't used in getOptimalMemOpType, but more importantly, this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType. This is the groundwork for the changes in D59766 and D59787, that allows implementation of TTI::getMemcpyCost. Differential Revision: https://reviews.llvm.org/D59785 llvm-svn: 359537
* [ARM] Rewrite isLegalT2AddressImmediateDavid Green2019-04-211-29/+24
| | | | | | | | | | | | | | | | | | | | | This does two main things, firstly adding some at least basic addressing modes for i64 types, and secondly treats floats and doubles sensibly when there is no fpu. The floating point change can help codesize in some cases, especially with D60294. Most backends seems to not consider the exact VT in isLegalAddressingMode, instead switching on type size. That is now what this does when the target does not have an fpu (as the float data will be loaded using LDR's). i64's currently use the address range of an LDRD (even though they may be legalised and loaded with an LDR). This is at least better than marking them all as illegal addressing modes. I have not attempted to do much with vectors yet. That will need changing once MVE is added. Differential Revision: https://reviews.llvm.org/D60677 llvm-svn: 358845
* [TargetLowering] Rename preferShiftsToClearExtremeBits and ↵Simon Pilgrim2019-04-161-3/+2
| | | | | | | | | | | | shouldFoldShiftPairToMask (PR41359) As discussed on PR41359, this patch renames the pair of shift-mask target feature functions to make their purposes more obvious. shouldFoldShiftPairToMask -> shouldFoldConstantShiftPairToMask preferShiftsToClearExtremeBits -> shouldFoldMaskToVariableShiftPair llvm-svn: 358526
* [IR] Refactor attribute methods in Function class (NFC)Evandro Menezes2019-04-041-4/+4
| | | | | | | | Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731
* [ARM] Optimize expressions like "return x != 0;" for Thumb1.Eli Friedman2019-04-021-11/+13
| | | | | | | | | | | | | There's an existing optimization for x != C, but somehow it was missing a special case for 0. While I'm here, also cleaned up the code/comments a bit: the second value produced by the MERGE_VALUES was actually dead, since a CMOV only produces one result. Differential Revision: https://reviews.llvm.org/D59616 llvm-svn: 357437
* [ARM] Add missing memory operands to a bunch of instructions.Eli Friedman2019-03-251-3/+9
| | | | | | | | | | | | | | This should hopefully lead to minor improvements in code generation, and more accurate spill/reload comments in assembly. Also fix isLoadFromStackSlotPostFE/isStoreToStackSlotPostFE so they don't lead to misleading assembly comments for merged memory operands; this is technically orthogonal, but in practice the relevant memory operand lists don't show up without this change. Differential Revision: https://reviews.llvm.org/D59713 llvm-svn: 356963
* [Thumb] Fix infinite loop in ABS expansion (PR41160)Simon Pilgrim2019-03-211-1/+4
| | | | | | Don't expand ISD::ABS node if its legal. llvm-svn: 356661
* Fix unused variable warning. NFCI.Simon Pilgrim2019-03-191-2/+1
| | | | llvm-svn: 356474
* [SelectionDAG] Handle unary SelectPatternFlavor for ABS case in ↵Simon Pilgrim2019-03-191-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SelectionDAGBuilder::visitSelect These changes are related to PR37743 and include: SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node. Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner. Add promoting the integer ABS node in the LegalizeIntegerType. Expand-based legalization of integer result for the ABS nodes. Expand-based legalization of ABS vector operations. Add some integer abs testcases for different typesizes for Thumb arch Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to: tmp = (SRA, Hi, 31) Lo = (UADDO tmp, Lo) Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1)) Lo = (XOR tmp, Lo) The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern: (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))). Change integer abs testcases for codegen with the ABS node support for AArch64. Indicate that the ABS is legal for the i64 type when the NEON is supported. Change the integer abs testcases to show changing of codegen. Add combine and legalization of ABS nodes for Thumb arch. Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743 Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D49837 llvm-svn: 356468
* [TargetLowering] Add code size information on isFPImmLegal. NFCAdhemerval Zanella2019-03-181-1/+2
| | | | | | | | | | | This allows better code size for aarch64 floating point materialization in a future patch. Reviewers: evandro Differential Revision: https://reviews.llvm.org/D58690 llvm-svn: 356389
* [ARM] Fixed an assumption of power-of-2 vector MVTTim Renouf2019-03-171-6/+6
| | | | | | | | | | | | I am about to introduce some non-power-of-2 width vector MVTs. This commit fixes a power-of-2 assumption that my forthcoming change would otherwise break, as shown by test/CodeGen/ARM/vcvt_combine.ll and vdiv_combine.ll. Differential Revision: https://reviews.llvm.org/D58927 Change-Id: I56a282e365d3874ab0621e5bdef98a612f702317 llvm-svn: 356341
OpenPOWER on IntegriCloud