summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Use tablegen pattern for sendmsg intrinsicsMatt Arsenault2019-08-013-17/+23
| | | | | | | Since this now emits a direct copy to m0, SIFixSGPRCopies has to handle a physical register. llvm-svn: 367593
* [WebAssembly] Assembler/InstPrinter: support call_indirect type index.Wouter van Oortmerssen2019-08-015-37/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | A TYPE_INDEX operand (as used by call_indirect) used to be represented by the InstPrinter as a symbol (e.g. .Ltype_index0@TYPE_INDEX) which was a bit of a mismatch with the WasmObjectWriter which expects an unnamed symbol, to receive the signature from and then turn into a reloc. There was really no good way to round-trip this information. An earlier version of this patch tried to attach the signature information using a .functype, but that ran into trouble when the symbol was re-emitted without a name. Removing the name was a giant hack also. The current version changes the assembly syntax to have an inline signature spec for TYPEINDEX operands that is always unnamed, which is much more elegant both in syntax and in implementation (as now the assembler is able to follow the same path as the regular backend) Reviewers: sbc100, dschuff, aheejin, jgravelle-google, sunfish, tlively Subscribers: arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64758 llvm-svn: 367590
* [X86][SSE] Add PEXTR*(PINSR*(v, s, c), c) -> s combine.Simon Pilgrim2019-08-011-4/+15
| | | | | | We should probably extend this to cover bitcasts as well to help other cases in promote-vec3.ll. llvm-svn: 367582
* [mips] Fix lowering load/store instruction in PIC caseSimon Atanasyan2019-08-011-1/+18
| | | | | | | | | | | | | | | | | | | | | If an operand of the `lw/sw` instructions is a symbol, these instructions incorrectly lowered using not-position-independent chain of commands. For PIC code we should use `lw/addiu` instructions with the `R_MIPS_GOT16` and `R_MIPS_LO16` relocations respectively. Instead of that LLVM generates position dependent code with the `R_MIPS_HI16` and `R_MIPS_LO16` relocations. This patch provides a fix for the bug by handling PIC case separately in the `MipsAsmParser::expandMemInst`. The main idea is to generate a chain of PIC instructions to load a symbol address into a register and then load the address content. The fix is not optimal and does not fix all PIC-related problems. This is a task for subsequent patches. Differential Revision: https://reviews.llvm.org/D65524 llvm-svn: 367580
* [X86][SSE] SimplifyMultipleUseDemandedBits - Add PEXTR/PINSR B+W handlingSimon Pilgrim2019-08-012-0/+31
| | | | | | This adds SimplifyMultipleUseDemandedBitsForTargetNode X86 support and uses it to allow us to peek through vector insertions to avoid dependencies on entire insertion chains. llvm-svn: 367570
* [X86] EltsFromConsecutiveLoads - don't attempt to merge volatile loads (PR42846)Simon Pilgrim2019-08-011-1/+4
| | | | llvm-svn: 367556
* [RISCV] Add Custom Parser for Atomic Memory OperandsSam Elliott2019-08-014-4/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: GCC Accepts both (reg) and 0(reg) for atomic instruction memory operands. These instructions do not allow for an offset in their encoding, so in the latter case, the 0 is silently dropped. Due to how we have structured the RISCVAsmParser, the easiest way to add support for parsing this offset is to add a custom AsmOperand and parser. This parser drops all the parens, and just keeps the register. This commit also adds a custom printer for these operands, which matches the GCC canonical printer, printing both `(a0)` and `0(a0)` as `(a0)`. Reviewers: asb, lewis-revill Reviewed By: asb Subscribers: s.egerton, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65205 llvm-svn: 367553
* [IR] Value: add replaceUsesWithIf() utilityRoman Lebedev2019-08-011-5/+2
| | | | | | | | | | | | | | | | | | | | | | Summary: While there is always a `Value::replaceAllUsesWith()`, sometimes the replacement needs to be conditional. I have only cleaned a few cases where `replaceUsesWithIf()` could be used, to both add test coverage, and show that it is actually useful. Reviewers: jdoerfert, spatel, RKSimon, craig.topper Reviewed By: jdoerfert Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, george.burgess.iv, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65528 llvm-svn: 367548
* [ARM] Fix for MVE VREV64David Green2019-08-011-5/+5
| | | | | | | | | | | The VREV64 instruction is apparently unpredictable if Qd == Qm, due to the cross-beat nature of the instruction. This adds an earlyclobber to Qd, which seems to be the same way we deal with this on other instructions like the write-back on loads and stores. Differential Revision: https://reviews.llvm.org/D65502 llvm-svn: 367544
* [AArch64] Do not allocate unnecessary emergency slot.Sander de Smalen2019-08-011-2/+2
| | | | | | | | | | | | | | Fix an issue where the compiler still allocates an emergency spill slot even though it already decided to spill an extra callee-save register to use as a scratch register. Reviewers: gberry, thegameg, mstorsjo, t.p.northover Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D65504 llvm-svn: 367540
* [MIPS GlobalISel] Fold load/store + G_GEP + G_CONSTANTPetar Avramovic2019-08-011-2/+23
| | | | | | | | | Fold load/store + G_GEP + G_CONSTANT when immediate in G_CONSTANT fits into 16 bit signed integer. Differential Revision: https://reviews.llvm.org/D65507 llvm-svn: 367535
* [NFC][ARM][ParallelDSP] Getters and renamingSam Parker2019-08-011-16/+22
| | | | | | | Add a couple of getters for Reduction and do some renaming of variables around CreateSMLAD for clarity. llvm-svn: 367522
* AMDGPU/SILoadStoreOptimizer: Make some functions constTom Stellard2019-08-011-6/+6
| | | | | | | | | | | | | | Reviewers: arsenm, pendingchaos, rampitec Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65316 llvm-svn: 367517
* recommit:[PowerPC] Eliminate loads/swap feeding swap/store for vector type ↵Zi Xuan Wu2019-08-013-1/+117
| | | | | | | | | | | | by using big-endian load/store In PowerPC, there is instruction to load vector in big endian element order when it's in little endian target. So we can combine vector load + reverse into big endian load to eliminate the swap instruction. Also combine vector reverse + store into big endian store. Differential Revision: https://reviews.llvm.org/D65063 llvm-svn: 367516
* AMDGPU/GlobalISel: Fix flat load/store of pointer typesMatt Arsenault2019-08-014-8/+13
| | | | llvm-svn: 367513
* AMDGPU/GlobalISel: Remove manual store select codeMatt Arsenault2019-08-012-58/+23
| | | | | | | This regresses the weird types that are newly treated as legal load types, but fixes incorrectly using flat instrucions on SI. llvm-svn: 367512
* AMDGPU/GlobalISel: Select local atomic cmpxchgMatt Arsenault2019-08-013-28/+13
| | | | llvm-svn: 367511
* AMDGPU/GlobalISel: Handle G_ATOMICRMW_FADDMatt Arsenault2019-08-014-0/+6
| | | | llvm-svn: 367509
* AMDGPU/GlobalISel: Allow selection of DS atomicrmwMatt Arsenault2019-08-014-6/+27
| | | | llvm-svn: 367507
* AMDGPU: Start redefining atomic PatFragsMatt Arsenault2019-08-016-189/+210
| | | | | | | | Start migrating to a form that will be compatible with the global isel emitter. Also should fix some overly lax checks on the memory type, which allowed mis-selecting some illegal atomics. llvm-svn: 367506
* AMDGPU: Correct FP atomic patternsMatt Arsenault2019-08-013-9/+10
| | | | | | | These need to use an fadd, not an add. Also make the noret part clear in the name. llvm-svn: 367505
* AMDGPU/GlobalISel: Select simple local storesMatt Arsenault2019-08-016-19/+52
| | | | llvm-svn: 367504
* GlobalISel: moreElementsVector for G_LOAD/G_STOREMatt Arsenault2019-08-011-0/+1
| | | | | | | AMDGPU change and test is a placeholder until a future patch with complete handling. llvm-svn: 367503
* Reapply "AMDGPU: Split block for si_end_cf"Matt Arsenault2019-08-015-25/+139
| | | | | | This reverts commit r359363, reapplying r357634 llvm-svn: 367500
* AMDGPU/GlobalISel: Select local loadsMatt Arsenault2019-08-015-9/+108
| | | | llvm-svn: 367498
* Revert "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG" andAmy Huang2019-07-311-11/+0
| | | | | | | | | | and partial fix. Causes windows buildbot errors. This reverts commit 6e65c34523963094acd0d6c94a5f5c64b32fe6aa and 53da7ca94343166ac68aef81db0398932fc258bb. llvm-svn: 367496
* [ARM] Lower "(x<<c) > 0x80000000U" to "lsls" on Thumb1.Eli Friedman2019-07-315-0/+34
| | | | | | | | | This is extremely specific, but saves three instructions when it's legal. I don't think the code can be usefully generalized. Differential Revision: https://reviews.llvm.org/D65351 llvm-svn: 367492
* [ARM] Transform compare of masked value to shift on Thumb1.Eli Friedman2019-07-311-0/+37
| | | | | | | | | | | | Thumb1 has very limited immediate modes, so turning an "and" into a shift can save multiple instructions. It's possible to simplify the generated code for test2 and test3 in cmp-and-fold.ll a little more, but I'll implement that as a followup. Differential Revision: https://reviews.llvm.org/D65175 llvm-svn: 367491
* [X86] Add DAG combine to fold any_extend_vector_inreg+truncstore to an ↵Craig Topper2019-07-311-0/+35
| | | | | | | | | | | | extractelement+store We have custom code that ignores the normal promoting type legalization on less than 128-bit vector types like v4i8 to emit pavgb, paddusb, psubusb since we don't have the equivalent instruction on a larger element type like v4i32. If this operation appears before a store, we can be left with an any_extend_vector_inreg followed by a truncstore after type legalization. When truncstore isn't legal, this will normally be decomposed into shuffles and a non-truncating store. This will then combine away the any_extend_vector_inreg and shuffle leaving just the store. On avx512, truncstore is legal so we don't decompose it and we had no combines to fix it. This patch adds a new DAG combine to detect this case and emit either an extract_store for 64-bit stoers or a extractelement+store for 32 and 16 bit stores. This makes the avx512 codegen match the avx2 codegen for these situations. I'm restricting to only when -x86-experimental-vector-widening-legalization is false. When we're widening we're not likely to create this any_extend_inreg+truncstore combination. This means we should be able to remove this code when we flip the default. I would like to flip the default soon, but I need to investigate some performance regressions its causing in our branch that I wasn't seeing on trunk. Differential Revision: https://reviews.llvm.org/D65538 llvm-svn: 367488
* [GISel] Address review feedback on passing MD_callees to lowerCall.Mark Lacey2019-07-314-5/+5
| | | | | | | Preserve the nullptr default for KnownCallees that appears in the base class. llvm-svn: 367477
* [GISel] Pass MD_callees metadata down in call lowering.Mark Lacey2019-07-318-11/+20
| | | | | | | | | | | | | | | | | | | | Summary: This will make it possible to improve IPRA by taking into account register usage in indirect calls. NFC yet; this is just laying the groundwork to start building up patches to take advantage of the information for improved register allocation. Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65488 llvm-svn: 367476
* AArch64: Add a tagged-globals backend feature.Peter Collingbourne2019-07-317-1/+42
| | | | | | | | | | | | | | | | | | | This feature instructs the backend to allow locally defined global variable addresses to contain a pointer tag in bits 56-63 that will be ignored by the hardware (i.e. TBI), but may be used by an instrumentation pass such as HWASAN. It works by adding a MOVK instruction to the regular ADRP/ADD sequence that sets bits 48-63 to the corresponding bits of the global, with the linker bounds check disabled on the ADRP instruction to prevent the tag from causing a link failure. This implementation of the feature omits the MOVK when loading from or storing to a global, which is sufficient for TBI. If the same approach is extended to MTE, assuming that 0 is not configured as a catch-all tag, we will most likely also need the MOVK in this case in order to avoid a tag mismatch. Differential Revision: https://reviews.llvm.org/D65364 llvm-svn: 367475
* SelectionDAG, MI, AArch64: Widen target flags fields/arguments from unsigned ↵Peter Collingbourne2019-07-316-13/+12
| | | | | | | | | | | | | char to unsigned. This makes the field wider than MachineOperand::SubReg_TargetFlags so that we don't end up silently truncating any higher bits. We should still catch any bits truncated from the MachineOperand field as a consequence of the assertion in MachineOperand::setTargetFlags(). Differential Revision: https://reviews.llvm.org/D65465 llvm-svn: 367474
* Reland "[DwarfDebug] Dump call site debug info"Djordje Todorovic2019-07-312-1/+90
| | | | | | | | | The build failure found after the rL365467 has been resolved. Differential Revision: https://reviews.llvm.org/D60716 llvm-svn: 367446
* [X86] Moved IsNOT helper earlier. NFCI.Simon Pilgrim2019-07-311-28/+28
| | | | | | Makes it available for more combines to use without adding declarations. llvm-svn: 367436
* [ARM] Reject CSEL instructions with invalid operandsMikhail Maltsev2019-07-311-1/+1
| | | | | | | | | | | | | | | | | | | | Summary: According to the Armv8.1-M manual CSEL, CSINC, CSINV and CSNEG are "constrained unpredictable" when SP is used as the source register Rn. The assembler should diagnose this case. Reviewers: momchil.velikov, dmgreen, ostannard, simon_tatham, t.p.northover Reviewed By: ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65505 llvm-svn: 367433
* [X86][AVX] Ensure chained subvector insertions are the same size (PR42833)Simon Pilgrim2019-07-311-0/+2
| | | | | | Before combining insert_subvector(insert_subvector(vec, sub0, c0), sub1, c1) patterns, ensure that the subvectors are all the same type. On AVX512 targets especially we might have a mixture of 128/256 subvector insertions. llvm-svn: 367429
* [AArch64] Add support for Transactional Memory Extension (TME)Momchil Velikov2019-07-314-12/+78
| | | | | | | | | | | | | | | | | | | | | | | | | Re-commit r366322 after some fixes TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Differential Revision: https://reviews.llvm.org/D64416 Patch by Javed Absar and Momchil Velikov llvm-svn: 367428
* [ARM] Generate MVE VFMAsOliver Cruickshank2019-07-311-0/+21
| | | | llvm-svn: 367408
* [NFC] Test CommitOliver Cruickshank2019-07-311-2/+2
| | | | llvm-svn: 367405
* [RISCV] Support 'f' Inline Assembly ConstraintSam Elliott2019-07-312-0/+22
| | | | | | | | | | | | | | | | | | | | | | | Summary: This adds the 'f' inline assembly constraint, as supported by GCC. An 'f'-constrained operand is passed in a floating point register. Exactly which kind of floating-point register (32-bit or 64-bit) is decided based on the operand type and the available standard extensions (-f and -d, respectively). This patch adds support in both the clang frontend, and LLVM itself. Reviewers: asb, lewis-revill Reviewed By: asb Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65500 llvm-svn: 367403
* [NFC][ARMCGP] Use switch in isSupportedValueSam Parker2019-07-311-47/+40
| | | | | | | | Use a switch instead of many isa<> while checking for supported values. Also be explicit about which cast instructions are supported; This allows the removal of SIToFP from GenerateSignBits. llvm-svn: 367402
* [AArch64][SVE2] Load/store instruction fixesCullen Rhodes2019-07-312-37/+30
| | | | | | | | | | | | | | | | Summary: * Loads and stores in SVE2 are gather/scatter not contiguous, fixed by renaming multiclasses to reflect this and also updated comments. * Remove aliases from load/store multiclasses that reflect the behaviour of the original form. * Fix bug in scatter store implementation, vector list should be used as input, not output. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D65392 llvm-svn: 367398
* [RISCV] Add support for lowering floating point inlineasm clobbersSimon Cook2019-07-311-0/+46
| | | | | | | | | | | | | This adds the required extension to RISC-V's getRegForInlineAsmConstraint in order to be able to correctly distringuish between the 32 and 64-bit floating point registers when the generic fX name appears in inlineasm clobber contraints. It also adds a check to validate that callee saved floating point registers are only saved in this case when a hard-float ABI is selected. Differential Revision: https://reviews.llvm.org/D64751 llvm-svn: 367397
* [AArch64][SVE2] Minor refactoring and cleanupCullen Rhodes2019-07-312-26/+26
| | | | | | | | | | | | | | | | | Summary: * Clarify comment with SVE2 for predicated shifts and move next to other shift instructions. * Clarify comments for various instructions. * Move FCVTX instruction next to other fp conversions. * Move FLOGB to next to other fp instructions and fix description. * Remove "cons" from non-constructive multiclass for bitwise shift-right and accumulate instructions. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D65390 llvm-svn: 367396
* [AArch64][SVE2] Use destination register as source registerCullen Rhodes2019-07-312-86/+207
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch fixes a bug in the following instructions that should have been implemented as destructive. A destructive instruction is an instruction where one of the source registers also acts as the destination register. Therefore, the contents of the source register, when the instruction begins execution, are replaced by the result of the instruction when the instruction completes execution [1]: * SRI/SLI * EORBT/EORTB * TBX * Narrowing top instructions * FP convert precision instructions These changes are non-functional from the assembler/diassembler point-of-view but are necessary for correct codegen. [1] https://static.docs.arm.com/ddi0584/ae/DDI0584A_e_SVE_supp_armv8A.pdf Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D65389 llvm-svn: 367394
* [ARM][ParallelDSP] Convert to function passSam Parker2019-07-311-73/+45
| | | | | | | | Run across a whole function, visiting each basic block one at a time. Differential Revision: https://reviews.llvm.org/D65324 llvm-svn: 367389
* revert r367382 because buildbot failureZi Xuan Wu2019-07-313-116/+1
| | | | llvm-svn: 367388
* [PowerPC] Eliminate loads/swap feeding swap/store for vector type by using ↵Zi Xuan Wu2019-07-313-1/+116
| | | | | | | | | | big-endian load/store In PowerPC, there is instruction to load vector in big endian element order when it's in little endian target. So we can combine vector load + reverse into big endian load to eliminate the swap instruction. Also combine vector reverse + store into big endian store. llvm-svn: 367382
* [AMDGPU] Fix high occupancy calculation and print itStanislav Mekhanoshin2019-07-315-10/+43
| | | | | | | | | | | We had couple places which still return 10 as a maximum occupancy. Fixed. Also print comment about occupancy as compiler see it. Differential Revision: https://reviews.llvm.org/D65423 llvm-svn: 367381
OpenPOWER on IntegriCloud