summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM
Commit message (Collapse)AuthorAgeFilesLines
...
* [TargetLowering] Remove optional arguments passing to makeLibCallShiva Chen2019-08-221-5/+9
| | | | | | | | | | The patch introduces MakeLibCallOptions struct as suggested by @efriedma on D65497. The struct contain argument flags which will pass to makeLibCall function. The patch should not has any functionality changes. Differential Revision: https://reviews.llvm.org/D65795 llvm-svn: 369622
* Revert r367389 (and follow-up r368404); it caused PR43073.Nico Weber2019-08-211-50/+76
| | | | llvm-svn: 369567
* [ARM] Formatting for ARMInstrMVE.td. NFCDavid Green2019-08-211-89/+98
| | | | | | | This is just some formatting cleanup, prior to the masked load and store patch in D66534. llvm-svn: 369545
* [ARM] Select vaddvaSam Tebbs2019-08-201-0/+7
| | | | | | | | This patch adds vaddva selection. Differential revision: https://reviews.llvm.org/D66410 llvm-svn: 369404
* [ARM] Add support for MVE vaddvSam Tebbs2019-08-194-0/+41
| | | | | | | | | This patch adds vecreduce_add and the relevant instruction selection for vaddv. Differential revision: https://reviews.llvm.org/D66085 llvm-svn: 369245
* [ARM] MVE sext costsDavid Green2019-08-191-0/+25
| | | | | | | | | This adds some sext costs for MVE, taken from the length of assembly sequences that we currently generate. Differential Revision: https://reviews.llvm.org/D66010 llvm-svn: 369244
* Reland "[ARM] push LR before __gnu_mcount_nc"Jian Cai2019-08-165-0/+90
| | | | | | | | This relands r369147 with fixes to unit tests. https://reviews.llvm.org/D65019 llvm-svn: 369173
* [ARM] Preserve liveness in ARMConstantIslands.Eli Friedman2019-08-161-3/+18
| | | | | | | | | | We currently don't use liveness information after this point, but it can be useful to catch bugs using -verify-machineinstrs, and optimizations could potentially use this information in the future. Differential Revision: https://reviews.llvm.org/D66319 llvm-svn: 369162
* Revert "[ARM] push LR before __gnu_mcount_nc"Jian Cai2019-08-165-90/+0
| | | | | | This reverts commit f4cf3b959333f62b7a7b2d7771f7010c9d8da388. llvm-svn: 369149
* [ARM] push LR before __gnu_mcount_ncJian Cai2019-08-165-0/+90
| | | | | | | | | Push LR register before calling __gnu_mcount_nc as it expects the value of LR register to be the top value of the stack on ARM32. Differential Revision: https://reviews.llvm.org/D65019 llvm-svn: 369147
* [ARM] MVE sext of a load is freeDavid Green2019-08-161-0/+15
| | | | | | | | | MVE also has some sext of loads, which will be free just as scalar instructions are. Differential Revision: https://reviews.llvm.org/D66008 llvm-svn: 369118
* [ARM] Correct register for narrowing and widening MVE loads and stores.David Green2019-08-163-13/+42
| | | | | | | | | | | | | | | | The widening and narrowing MVE instructions like VLDRH.32 are only permitted to use low tGPR registers. This means that if they are used for a stack slot, where the register used is only decided during frame setup, we need to be able to correctly pick a thumb1 register over a normal GPR. This attempts to add the required logic into eliminateFrameIndex and rewriteT2FrameIndex, only picking the FrameReg if it is a valid register for the operands register class, and picking a valid scratch register for the register class. Differential Revision: https://reviews.llvm.org/D66285 llvm-svn: 369108
* [ARM] Don't pretend we know how to generate MVE VLDnDavid Green2019-08-162-1/+7
| | | | | | | | | | We don't yet know how to generate these instructions for MVE. And in the case of VLD3, we don't even have the instruction. For the moment don't tell the vectoriser that we have VLD4, just to end up serialising the results. Differential Revision: https://reviews.llvm.org/D66009 llvm-svn: 369101
* [ARM][LowOverheadLoops] Fix generated code for "revert".Eli Friedman2019-08-151-3/+3
| | | | | | | | | | | | | | | | | | Two issues: 1. t2CMPri shouldn't use CPSR if it isn't predicated. This doesn't really have any visible effect at the moment, but it might matter in the future. 2. The t2CMPri generated for t2WhileLoopStart might need to use a register that isn't LR. My team found this because we have a patch to track register liveness late in the pass pipeline. I'll look into upstreaming it to help catch issues like this earlier. Differential Revision: https://reviews.llvm.org/D66243 llvm-svn: 369069
* [SDAG] Minor code cleanup/standardization of atomic accessors [NFC]Philip Reames2019-08-151-1/+1
| | | | llvm-svn: 369057
* Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVMDaniel Sanders2019-08-1519-257/+257
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041
* [llvm] Migrate llvm::make_unique to std::make_uniqueJonas Devlieghere2019-08-156-39/+39
| | | | | | | | Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013
* [ARM] Fix alignment checks for BE VLDRHDavid Green2019-08-151-2/+2
| | | | | | | | | | | | We need to allow any alignment at least 2, not just exactly 2, so that the big endian loads and stores can be selected successfully. I've also added extra BE testing for the load and store tests. Thanks to Oliver for the report. Differential Revision: https://reviews.llvm.org/D66222 llvm-svn: 368996
* [ARM] MVE predicate store patternsDavid Green2019-08-151-0/+7
| | | | | | | | | Stack loads and stores were already working, but direct stores were not. This adds the patterns for them, same as predicate loads. Differential Revision: https://reviews.llvm.org/D66213 llvm-svn: 368988
* [ARM] MVE trunc to i1 vectorsDavid Green2019-08-151-0/+7
| | | | | | | | | This adds patterns for selecting trunc instructions from full vectors to i1's vectors. Differential Revision: https://reviews.llvm.org/D66201 llvm-svn: 368981
* [ARM] Add MVE beats vector cost modelDavid Green2019-08-134-21/+87
| | | | | | | | | | | | | | | | | | | | | | | | The MVE architecture has the idea of "beats", where a vector instruction can be executed over several ticks of the architecture. This adds a similar system into the Arm backend cost model, multiplying the cost of all vector instructions by a factor. This factor essentially becomes the expected difference between scalar code and vector code, on average. MVE Vector instructions can also overlap so the a true cost of them is often lower. But equally scalar instructions can in some situations be dual issued, or have other optimisations such as unrolling or make use of dsp instructions. The default is chosen as 2. This should not prevent vectorisation is a most cases (as the vector instructions will still be doing at least 4 times the work), but it will help prevent over vectorising in cases where the benefits are less likely. This adds things so far to the obvious places in ARMTargetTransformInfo, and updates a few related costs like not treating float instructions as cost 2 just because they are floats. Differential Revision: https://reviews.llvm.org/D66005 llvm-svn: 368733
* [ARM] Fix detection of duplicates when parsing reg list operandsMomchil Velikov2019-08-131-19/+43
| | | | | | Differential Revision: https://reviews.llvm.org/D65957 llvm-svn: 368712
* [ARM] Fix encoding of APSR in CLRM instructionMomchil Velikov2019-08-132-16/+7
| | | | | | | | | The APSR is encoded by setting bit 15 in the register list of the CLRM instruction (cf. https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf). Differential Revision: https://reviews.llvm.org/D65873 llvm-svn: 368711
* GlobalISel: Change representation of shuffle masksMatt Arsenault2019-08-131-0/+1
| | | | | | | | | | | | | | | | | | Currently shufflemasks get emitted as any other constant, and you end up with a bunch of virtual registers of G_CONSTANT with a G_BUILD_VECTOR. The AArch64 selector then asserts on anything that doesn't fit this pattern. This isn't an ideal representation, and should avoid legalization and have fewer opportunities for a representational error. Rather than invent a new shuffle mask operand type, similar to what ShuffleVectorSDNode does, just track the original IR Constant mask operand. I don't completely like the idea of adding another link to the IR, but MIR is already quite dependent on IR constants already, and this will allow sharing the shuffle mask utility functions with the IR. llvm-svn: 368704
* [GlobalISel] Make the InstructionSelector instance non-const, allowing state ↵Amara Emerson2019-08-133-6/+5
| | | | | | | | | | | | | | | | to be maintained. Currently we can't keep any state in the selector object that we get from subtarget. As a result we have to plumb through all our variables through multiple functions. This change makes it non-const and adds a virtual init() method to allow further state to be captured for each target. AArch64 makes use of this in this patch to cache a call to hasFnAttribute() which is expensive to call, and is used on each selection of G_BRCOND. Differential Revision: https://reviews.llvm.org/D65984 llvm-svn: 368652
* [ARM] sext of a load is freeDavid Green2019-08-121-0/+21
| | | | | | | | | This teaches the cost model that the sext or zext of a load is going to be free. Differential Revision: https://reviews.llvm.org/D66006 llvm-svn: 368593
* [ARM] MVE shuffle broadcast costsDavid Green2019-08-121-0/+17
| | | | | | | | | | | A VDUP will perform a vector broadcast in a single instruction. Update the cost model for MVE accordingly. Code originally by David Sherwood. Differential Revision: https://reviews.llvm.org/D63448 llvm-svn: 368589
* [ARM] Put some of the TTI costmodel behind hasNeon calls.David Green2019-08-121-74/+75
| | | | | | | | | This puts some of the calls in ARMTargetTransformInfo.cpp behind hasNeon() checks, now that we have MVE, and updates all the tests accordingly. Differential Revision: https://reviews.llvm.org/D63447 llvm-svn: 368587
* [MVE] Don't try to unroll vectorised MVE loopsDavid Green2019-08-111-0/+5
| | | | | | | | | | | | | | | Due to the nature of the beat system in the MVE architecture, along with tail predication and low-overhead loops, unrolling has less benefit compared to normal loops. You can not, for example, hide the latency of a load with other instructions as you can for scalar code. Preventing unrolling also makes the code easier to read and reason about. So if a loop contains vector code, don't enable the runtime unrolling. At least for the time being. Differential Revision: https://reviews.llvm.org/D65803 llvm-svn: 368530
* [ARM] Permit auto-vectorization using MVEDavid Green2019-08-111-2/+6
| | | | | | | | | | | | | With enough codegen complete, we can now correctly report the number and size of vector registers for MVE, allowing auto vectorisation. This also allows FP auto-vectorization for MVE without -Ofast/-ffast-math, due to support for IEEE FP arithmetic and parity between scalar and vector FP behaviour. Patch by David Sherwood. Differential Revision: https://reviews.llvm.org/D63728 llvm-svn: 368529
* [globalisel] Add G_SEXT_INREGDaniel Sanders2019-08-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Targets often have instructions that can sign-extend certain cases faster than the equivalent shift-left/arithmetic-shift-right. Such cases can be identified by matching a shift-left/shift-right pair but there are some issues with this in the context of combines. For example, suppose you can sign-extend 8-bit up to 32-bit with a target extend instruction. %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity) %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_ASHR %2:_(s32), i32 1 would reasonably combine to: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 25 which no longer matches the special case. If your shifts and extend are equal cost, this would break even as a pair of shifts but if your shift is more expensive than the extend then it's cheaper as: %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8 %3:_(s32) = G_ASHR %2:_(s32), i32 1 It's possible to match the shift-pair in ISel and emit an extend and ashr. However, this is far from the only way to break this shift pair and make it hard to match the extends. Another example is that with the right known-zeros, this: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_MUL %2:_(s32), i32 2 can become: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 23 All upstream targets have been configured to lower it to the current G_SHL,G_ASHR pair but will likely want to make it legal in some cases to handle their faster cases. To follow-up: Provide a way to legalize based on the constant. At the moment, I'm thinking that the best way to achieve this is to provide the MI in LegalityQuery but that opens the door to breaking core principles of the legalizer (legality is not context sensitive). That said, it's worth noting that looking at other instructions and acting on that information doesn't violate this principle in itself. It's only a violation if, at the end of legalization, a pass that checks legality without being able to see the context would say an instruction might not be legal. That's a fairly subtle distinction so to give a concrete example, saying %2 in: %1 = G_CONSTANT 16 %2 = G_SEXT_INREG %0, %1 is legal is in violation of that principle if the legality of %2 depends on %1 being constant and/or being 16. However, legalizing to either: %2 = G_SEXT_INREG %0, 16 or: %1 = G_CONSTANT 16 %2:_(s32) = G_SHL %0, %1 %3:_(s32) = G_ASHR %2, %1 depending on whether %1 is constant and 16 does not violate that principle since both outputs are genuinely legal. Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61289 llvm-svn: 368487
* GlobalISel: pack various parameters for lowerCall into a struct.Tim Northover2019-08-092-21/+14
| | | | | | | | | I've now needed to add an extra parameter to this call twice recently. Not only is the signature getting extremely unwieldy, but just updating all of the callsites and implementations is a pain. Putting the parameters in a struct sidesteps both issues. llvm-svn: 368408
* [ARM][ParallelDSP] Replace SExt usesSam Parker2019-08-091-3/+5
| | | | | | | | | | | As loads are combined and widened, we replaced their sext users operands whereas we should have been replacing the uses of the sext. I've added a load of tests, with only a few of them originally causing assertion failures, the rest improve pattern coverage. Differential Revision: https://reviews.llvm.org/D65740 llvm-svn: 368404
* [ARM] Add support for MVE pre and post inc loads and storesDavid Green2019-08-083-19/+273
| | | | | | | | | | | | This adds pre- and post- increment and decrements for MVE loads and stores. It uses the builtin pre and post load/store detection, unlike Neon. Loads are selected with the code in tryT2IndexedLoad, stores are selected with tablegen patterns. The immediates have a +/-7bit range, multiplied by the size of the element. Differential Revision: https://reviews.llvm.org/D63840 llvm-svn: 368305
* [ARM] MVE big endian loads/storesDavid Green2019-08-082-43/+47
| | | | | | | | | | | This adds some missing patterns for big endian loads/stores, allowing unaligned loads/stores to also be selected with an extra VREV, which produces better code than aligning through a stack. Also moves VLDR_P0 to not be LE only, and adjusts some of the tests to show all that working. Differential Revision: https://reviews.llvm.org/D65583 llvm-svn: 368304
* [ARM] Select VFMASam Tebbs2019-08-081-0/+7
| | | | llvm-svn: 368264
* [ARM] Tighten up VLDRH.32 with low alignmentsDavid Green2019-08-082-16/+35
| | | | | | | | | | | | VLDRH needs to have an alignment of at least 2, including the widening/narrowing versions. This tightens up the ISel patterns for it and alters allowsMisalignedMemoryAccesses so that unaligned accesses are expanded through the stack. It also fixed some incorrect shift amounts, which seemed to be passing a multiple not a shift. Differential Revision: https://reviews.llvm.org/D65580 llvm-svn: 368256
* [ARM] Expand CTPOP intrinsic for MVEOliver Cruickshank2019-08-071-0/+1
| | | | llvm-svn: 368180
* [ARM] Generate MVE VHADDs/VHSUBsOliver Cruickshank2019-08-071-0/+54
| | | | llvm-svn: 368146
* [ARM][LowOverheadLoops] Revert after read/writeSam Parker2019-08-071-13/+18
| | | | | | | | | | | | | | | | | Currently we check whether LR is stored/loaded to/from inbetween the loop decrement and loop end pseudo instructions. There's two problems here: - It relies on all load/store instructions being labelled as such in tablegen. - Actually any use of loop decrement is troublesome because the value doesn't exist! So we need to check for any read/write of LR that occurs between the two instructions and revert if we find anything. Differential Revision: https://reviews.llvm.org/D65792 llvm-svn: 368130
* CodeGen: Migration to using RegisterMatt Arsenault2019-08-061-22/+22
| | | | llvm-svn: 367974
* [GlobalISel][CallLowering] Rename isArgumentHandler() -> ↵Amara Emerson2019-08-051-1/+1
| | | | | | | | | isIncomingArgumentHandler() Previous name and comment incorrectly implied it was just for formal arg handlers, which is not true. llvm-svn: 367945
* AMDGPU: Correct behavior of f16 buffer loadsMatt Arsenault2019-08-051-2/+3
| | | | | | | Don't assume format loads for f16. Also fixes support for targets without i16. llvm-svn: 367879
* [LLVM][Alignment] Introduce Alignment TypeGuillaume Chatelet2019-08-051-8/+8
| | | | | | | | | | | | | | | | | | | Summary: This is patch is part of a serie to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jfb, jakehehrlich Reviewed By: jfb Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65514 llvm-svn: 367828
* Reland: Fix and test inter-procedural register allocation for ARMOliver Stannard2019-08-053-2/+10
| | | | | | | | | | | | | | | | | | | | Add an explicit construction of the ArrayRef, gcc 5 and earlier don't seem to select the ArrayRef constructor which takes a C array when the construction is implicit. Original commit message: - Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves with a null RegScavenger. Simply not updating the register scavenger is fine because IPRA only cares about the SavedRegs vector, the acutal code of the function has already been generated at this point. - Add a new hook to TargetRegisterInfo to get the set of registers which can be clobbered inside a call, even if the compiler can see both sides, by linker-generated code. Differential revision: https://reviews.llvm.org/D64908 llvm-svn: 367819
* [ARM] MVE big endian bitcastsDavid Green2019-08-041-0/+45
| | | | | | | | | | | | | | | | | | | | | | This adds big endian MVE patterns for bitcasts. They are defined in llvm as being the same as a store of the existing type and the load into the new. This means that they have to become a VREV between the two types, working in the same way that NEON works in big-endian. This also adds some example tests for bigendian, showing where code is and isn't different. The main difference, especially from a testing perspective is that vectors are passed as v2f64, and so are VREV into and out of call arguments, and the parameters are passed in a v2f64 format. Same happens for inline assembly where the register class is used, so it is VREV to a v16i8. So some of this is probably not correct yet, but it is (mostly) self-consistent and seems to be consistent with how llvm treats vectors. The rest we can hopefully fix later. More details about big endian neon can be found in https://llvm.org/docs/BigEndianNEON.html. Differential Revision: https://reviews.llvm.org/D65581 llvm-svn: 367780
* [Thumb] Fix invalid symbol redefinition due to duplicated jumptable (PR42760)Nikita Popov2019-08-031-1/+2
| | | | | | | | | | | | | | | Fix for https://bugs.llvm.org/show_bug.cgi?id=42760. A tBR_JTr instruction is duplicated by tail duplication, which results in the same jumptable with the same label being emitted twice. Fix this by marking tBR_JTr as not duplicable. The corresponding ARM/Thumb instructions are already marked as not duplicable. Additionally also mark tTBB_JT and tTBH_JT to be consistent with Thumb2, even though this shouldn't be strictly necessary. Differential Revision: https://reviews.llvm.org/D65606 llvm-svn: 367753
* Emit diagnostic if an inline asm constraint requires an immediateBill Wendling2019-08-031-5/+6
| | | | | | | | | | | | | | | | | | Summary: An inline asm call can result in an immediate after inlining. Therefore emit a diagnostic here if constraint requires an immediate but one isn't supplied. Reviewers: joerg, mgorny, efriedma, rsmith Reviewed By: joerg Subscribers: asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, MaskRay, jyknight, dylanmckay, javed.absar, fedor.sergeev, jrtc27, Jim, krytarowski, eraman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60942 llvm-svn: 367750
* Revert Fix and test inter-procedural register allocation for ARMDouglas Yung2019-08-023-10/+2
| | | | | | | | This reverts r367669 (git commit f6b00c279a5587a25876752a6ecd8da0bed959dc) This was breaking a build bot http://lab.llvm.org:8011/builders/netbsd-amd64/builds/21233 llvm-svn: 367731
* GlobalISel: support swiftself attributeTim Northover2019-08-021-0/+1
| | | | llvm-svn: 367683
OpenPOWER on IntegriCloud