path: root/llvm/lib/Target
Commit message (Author, Age, Files, Lines)
* [AArch64] Place the first ldp at the end when ReverseCSRRestoreSeq is true (Francis Visoiu Mistrih, 2018-04-27, 1 file, -30/+45)
| | | | | | | | | | Put the first ldp at the end, so that the load-store optimizer can run and merge the ldp and the add into a post-index ldp. This didn't work in case no frame was needed and resulted in code size regressions. llvm-svn: 331044
* [SystemZ] Remove scheduling info from some Pseudo instructions (NFC). (Jonas Paulsson, 2018-04-27, 7 files, -133/+22)
| | | | | | | | If the MachineInstr uses a custom inserter and is then erased after instruction selection, there is no use for mapping it to a sched class. Review: Ulrich Weigand llvm-svn: 331040
* [AArch64] Codegen for v8.2A dot product intrinsics (Oliver Stannard, 2018-04-27, 2 files, -13/+39)
| | | | | | | | | This adds IR intrinsics for the AArch64 dot-product instructions introduced in v8.2-A. Differential Revision: https://reviews.llvm.org/D46107 llvm-svn: 331036
* [NVPTX] Turn on Loop/SLP vectorization (Benjamin Kramer, 2018-04-27, 1 file, -0/+12)
| | | | | | | | | | | | | | | | | | | | Since PTX has grown a <2 x half> datatype, vectorization has become more important. The late LoadStoreVectorizer intentionally only does loads and stores, but now arithmetic has to be vectorized for optimal throughput too. This is still very limited: SLP vectorization happily creates <2 x half> if it's a legal type, but there's still a lot of register moving happening to get that fed into a vectorized store. Overall it's a small performance win from reducing the number of arithmetic instructions. I haven't really checked what the loop vectorizer does to PTX code; the cost model there might need some more tweaks. I didn't see it causing harm though. Differential Revision: https://reviews.llvm.org/D46130 llvm-svn: 331035
* [X86] Replace some system instruction instregex single matches with instrs entry. NFCI. (Simon Pilgrim, 2018-04-27, 7 files, -85/+60)
| | | | | | llvm-svn: 331034
* [mips] Fix how compiler fuse instructions to fmadd/fmsub (Aleksandar Beserminji, 2018-04-27, 5 files, -9/+30)
| | | | | | | | | | With this patch, the compiler no longer fuses fmul and fadd/fsub into fmadd/fmsub by default. Instead, the -fp-contract=fast option can be used when such behavior is desired. Differential Revision: https://reviews.llvm.org/D46057 llvm-svn: 331033
* [ARM] Codegen for v8.2A dot product intrinsics (Oliver Stannard, 2018-04-27, 1 file, -26/+48)
| | | | | | | | | This adds IR intrinsics for the ARM dot-product instructions introduced in v8.2-A. Differential revision: https://reviews.llvm.org/D46106 llvm-svn: 331032
* [ARM] Enable misched for R52. (David Green, 2018-04-27, 1 file, -0/+1)
| | | | | | | | | Back when the R52 schedule was added in rL286949, there was no way to enable machine schedules in ARM for specific cores. Since then a target feature has been added. This enables the feature for R52, removing the need to manually specify compiler flags. llvm-svn: 331027
* [mips] Add support for Virtualization ASE (Petar Jovanovic, 2018-04-27, 13 files, -24/+357)
This includes

Instructions:
  tlbginv, tlbginvf, tlbgp, tlbgr, tlbgwi, tlbgwr, hypcall
  mfgc0, mtgc0, mfhgc0, mthgc0, dmfgc0, dmtgc0,

Assembler directives:
  .set virt, .set novirt, .module virt, .module novirt

Attribute:
  virt

.MIPS.abiflags:
  VZ (0x100)

Patch by Vladimir Stefanovic.

Differential Revision: https://reviews.llvm.org/D44905
llvm-svn: 331024
* [MachineOutliner] Don't outline from functions with a section marking. (Eli Friedman, 2018-04-27, 1 file, -0/+7)
| | | | | | | | | | | | | | The program might have unusual expectations for functions; for example, the Linux kernel's build system warns if it finds references from .text to .init.data. I'm not sure this is something we actually want to make any guarantees about (there isn't any explicit rule that would disallow outlining in this case), but we might want to be conservative anyway. Differential Revision: https://reviews.llvm.org/D46091 llvm-svn: 331007
* [x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics (Chandler Carruth, 2018-04-26, 2 files, -89/+40)
| | | | | | | | The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case. llvm-svn: 330997
* [mips] Accept 32-bit offsets for lb and lbu commands (Simon Atanasyan, 2018-04-26, 2 files, -2/+32)
The `lb` and `lbu` commands accept 16-bit signed offsets, but GAS accepts larger offsets for these commands. If an offset does not fit in the 16-bit range, the `lb` command is translated into a lui/lb or lui/addu/lb series. Interestingly, the LLVM assembler initially supported this feature, but it was later broken.

This patch restores support for 32-bit offsets. It replaces the `mem_simm16` operand in the `LB` and `LBu` definitions with the new `mem_simmptr` operand. This operand is intended to check that the offset fits in the same size as is used for pointers. Later we will be able to extend this rule and accept 64-bit offsets when possible.

Some issues remain:

- The regression also affects the LD, SD, LH and LHU commands. I'm going to fix them in a separate patch.

- GAS accepts any 32-bit value as an offset. Until now LLVM accepted signed 16-bit values, and this patch extends the range to signed 32-bit offsets. In other words, the following code is accepted by GAS but still triggers an error in LLVM:

```
lb $4, 0x80000004

# gas
lui a0, 0x8000
lb a0, 4(a0)
```

- In the case of 64-bit pointers, GAS accepts a 64-bit offset and translates it to the li/dsll/lb series of commands. LLVM still rejects it. This feature has probably never been implemented in LLVM. This issue is for a separate patch.

```
lb $4, 0x800000001

# gas
li a0, 0x8000
dsll a0, a0, 0x14
lb a0, 4(a0)
```

Differential Revision: https://reviews.llvm.org/D45020
llvm-svn: 330983
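The lui/lb expansion mentioned in this message depends on splitting the offset into an upper half for `lui` and a signed 16-bit lower half for the load. A minimal Python sketch of that arithmetic follows, assuming the usual MIPS hi/lo convention; `split_offset` is a hypothetical helper for illustration, not code from LLVM or GAS.

```python
def split_offset(offset):
    """Split a 32-bit offset into (hi, lo) with (hi << 16) + lo == offset (mod 2**32).

    lo is a signed 16-bit value, matching the immediate field of lb/lbu;
    hi is the 16-bit value that would be given to lui.
    """
    offset &= 0xFFFFFFFF
    # Sign-extend the low 16 bits.
    lo = ((offset & 0xFFFF) ^ 0x8000) - 0x8000
    # Compensate in the upper half when lo is negative.
    hi = ((offset - lo) >> 16) & 0xFFFF
    return hi, lo


hi, lo = split_offset(0x80000004)
print(hex(hi), lo)
```

For the offset 0x80000004 from the message, this yields the 0x8000 / 4 pair seen in the GAS output. When the low half has bit 15 set, lo becomes negative and hi is bumped by one to compensate.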
* [WebAssembly] Write DWARF data into wasm object file (Sam Clegg, 2018-04-26, 1 file, -1/+28)
| | | | | | | | | | | - Writes ".debug_XXX" into corresponding custom sections. - Writes relocation records into "reloc.debug_XXX" sections. Patch by Yury Delendik! Differential Revision: https://reviews.llvm.org/D44184 llvm-svn: 330982
* DAG: Fix not legalizing vector fcanonicalizes (Matt Arsenault, 2018-04-26, 1 file, -0/+1)
| | | | | | If an fcanonicalize was done on a vector type that was legal, llvm-svn: 330981
* AMDGPU: Extend extract_vector_elt fneg combine to fabs (Matt Arsenault, 2018-04-26, 1 file, -2/+3)
| | | | | | Fixes a regression in a future commit. llvm-svn: 330980
* AMDGPU: Consolidate SubtargetPredicate definitions (Matt Arsenault, 2018-04-26, 2 files, -7/+7)
| | | | llvm-svn: 330979
* [AArch64] Fix scavenged spill slot base when stack realignment required. (Geoff Berry, 2018-04-26, 1 file, -2/+10)
| | | | | | | | | | | | | | | | | | Summary: Use the FP for scavenged spill slot accesses to prevent corruption of the callee-save region when the SP is re-aligned. Based on problem and patch reported by @paulwalker-arm This is an alternative to solution proposed in D45770 Reviewers: t.p.northover, paulwalker-arm, thegameg, javed.absar Subscribers: qcolombet, mcrosier, paulwalker-arm, kristof.beyls, rengolin, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46063 llvm-svn: 330976
* [AMDGPU][Waitcnt] As of gfx7, VMEM operations do not increment the export counter and the input registers are available in the next instruction; update the waitcnt pass to take this into account. (Mark Searles, 2018-04-26, 2 files, -1/+5)
| | | | | | | | Differential Revision: https://reviews.llvm.org/D46067 llvm-svn: 330954
* [mips] Correct the definitions of some control instructions (Simon Dardis, 2018-04-26, 3 files, -39/+36)
| | | | | | | | | | | | | | Correct the definitions of ei, di, eret, deret, wait, syscall and break. Also provide microMIPS specific aliases to match the MIPS aliases. Additionally correct the definition of the wait instruction so that it is present in the instruction mapping tables. Reviewers: smaksimovic, abeserminji, atanasyan Differential Revision: https://reviews.llvm.org/D45939 llvm-svn: 330952
* [RISCV] Implement isLoadFromStackSlot and isStoreToStackSlot (Alex Bradbury, 2018-04-26, 2 files, -0/+54)
| | | | | | | | | This causes some slight shuffling but no meaningful codegen differences on the corpus I used for testing, but it has a larger impact when combined with e.g. rematerialisation. Regardless, it makes sense to report as accurate target-specific information as possible. llvm-svn: 330949
* [NVPTX] Make the legalizer expand shufflevector of <2 x half> (Benjamin Kramer, 2018-04-26, 1 file, -0/+1)
| | | | | | | | | | There's no direct instruction for this, but it's trivially implemented with two movs. Without this the code generator just dies when encountering a shufflevector. Differential Revision: https://reviews.llvm.org/D46116 llvm-svn: 330948
* [RISCV] Implement isZextFree (Alex Bradbury, 2018-04-26, 2 files, -0/+15)
| | | | | | | This returns true for 8-bit and 16-bit loads, allowing LBU/LHU to be selected and avoiding unnecessary masks. llvm-svn: 330943
* [TTI, AArch64] Add transpose shuffle kind (Matthew Simpson, 2018-04-26, 2 files, -0/+29)
| | | | | | | | | | | | | | This patch adds a new shuffle kind useful for transposing a 2xn matrix. These transpose shuffle masks read corresponding even- or odd-numbered vector elements from two n-dimensional source vectors and write each result into consecutive elements of an n-dimensional destination vector. The transpose shuffle kind is meant to model the TRN1 and TRN2 AArch64 instructions. As such, this patch also considers transpose shuffles in the AArch64 implementation of getShuffleCost. Differential Revision: https://reviews.llvm.org/D45982 llvm-svn: 330941
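The transpose shuffle described in this message can be illustrated with a short Python sketch. It assumes the shufflevector mask convention (indices 0..n-1 select from the first source vector, n..2n-1 from the second); `transpose_mask` and `shuffle` are hypothetical helpers, not LLVM APIs.

```python
def transpose_mask(n, odd):
    """Build a transpose mask for two n-element vectors.

    odd=False reads the even-numbered elements of both sources (TRN1-like),
    odd=True reads the odd-numbered elements (TRN2-like).
    """
    start = 1 if odd else 0
    mask = []
    for i in range(0, n, 2):
        mask.append(i + start)      # element from the first source
        mask.append(n + i + start)  # corresponding element from the second source
    return mask


def shuffle(a, b, mask):
    """Apply a shufflevector-style mask to the concatenation of a and b."""
    both = list(a) + list(b)
    return [both[m] for m in mask]


a = [0, 1, 2, 3]
b = [10, 11, 12, 13]
print(shuffle(a, b, transpose_mask(4, odd=False)))
print(shuffle(a, b, transpose_mask(4, odd=True)))
```

Each result places an element of `a` next to the element of `b` at the same position, which is why the two masks together transpose a 2xn matrix stored as two row vectors.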
* [RISCV] Implement isTruncateFree (Alex Bradbury, 2018-04-26, 2 files, -0/+22)
| | | | | | Adapted from ARM's implementation introduced in r313533 and r314280. llvm-svn: 330940
* [X86] Fix Update Kill Register in Avoid SFB Pass - Bug 37153 (Lama Saba, 2018-04-26, 1 file, -19/+29)
| | | | | | | Differential Revision: https://reviews.llvm.org/D45823 Change-Id: Icf6f34f6babc3cb2ff5292fde003472473037a71 llvm-svn: 330939
* [RISCV] Implement isLegalICmpImmediate (Alex Bradbury, 2018-04-26, 2 files, -0/+5)
| | | | | | | | I'm unable to construct a representative test case that demonstrates the advantage, but it seems sensible to report accurate target-specific information regardless. llvm-svn: 330938
* [RISCV] Implement isLegalAddImmediate (Alex Bradbury, 2018-04-26, 2 files, -0/+5)
| | | | | | | This causes a trivial improvement in the recently added lsr-legaladdimm.ll test case. llvm-svn: 330937
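The commit message does not spell out the legality rule. As a rough illustration, the Python sketch below assumes the hook accepts exactly the immediates that fit the signed 12-bit field of the RISC-V ADDI instruction. The helper names are hypothetical and the check is an assumption about the hook, not a copy of the patch.

```python
def is_int(bits, value):
    """Return True if value fits in a signed two's-complement field of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def is_legal_add_immediate(imm):
    # Assumption: legal means "encodable in ADDI's 12-bit signed immediate".
    return is_int(12, imm)


for imm in (-2049, -2048, 0, 2047, 2048):
    print(imm, is_legal_add_immediate(imm))
```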
* [AArch64][SVE] Enable DiagnosticPredicates for SVE LD1 instructions. (Sander de Smalen, 2018-04-26, 1 file, -14/+27)
| | | | | | | | | | | | | | | | | | | | This patch extends the PredicateMethod of AsmOperands used in SVE's LD1 instructions with a DiagnosticPredicate. This makes them 'context sensitive' to the operand that has been parsed and tells the user to use the right register (with expected shift/extend), rather than reporting that the immediate is out of range when the parser actually saw a register. Patch [2/2] in a series to improve assembler diagnostics for SVE: - Patch [1/2]: https://reviews.llvm.org/D45879 - Patch [2/2]: https://reviews.llvm.org/D45880 Reviewers: olista01, stoklund, craig.topper, mcrosier, rengolin, echristo, fhahn, SjoerdMeijer, evandro, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D45880 llvm-svn: 330934
* [NVPTX] Deduplicate code. No functionality change. (Benjamin Kramer, 2018-04-26, 1 file, -18/+6)
| | | | llvm-svn: 330933
* [RISCV] Implement isLegalAddressingMode for RISC-V (Alex Bradbury, 2018-04-26, 2 files, -0/+30)
| | | | | | | | | | | | This has no impact on codegen for the current RISC-V unit tests or my small benchmark set and very minor changes in a few programs in the GCC torture suite. Based on this, I haven't been able to produce a representative test program that demonstrates a benefit from isLegalAddressingMode. I'm committing the patch anyway, on the basis that presenting accurate information to the target-independent code is preferable to relying on incorrect generic assumptions. llvm-svn: 330932
* [AArch64][SVE] Asm: Support for gather LD1/LDFF1 (scalar + vector) load instructions. (Sander de Smalen, 2018-04-26, 2 files, -1/+212)
| | | | | | | | | | | | | | | | | | Patch [2/3] in series to add support for SVE's gather load instructions that use scalar+vector addressing modes: - Patch [1/3]: https://reviews.llvm.org/D45951 - Patch [2/3]: https://reviews.llvm.org/D46023 - Patch [3/3]: https://reviews.llvm.org/D45958 Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, t.p.northover, echristo, evandro, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D46023 llvm-svn: 330928
* [X86] Print 'tbyte ptr' instead of 'xword ptr' for f80mem in Intel syntax. (Craig Topper, 2018-04-26, 1 file, -1/+1)
| | | | | | This matches objdump. llvm-svn: 330922
* [X86] Remove alignment restriction on load folding of pcmp[ei]str* during isel too. (Craig Topper, 2018-04-26, 1 file, -4/+4)
| | | | | | | | This is a follow-up to the changes in r330896, which enabled folding after isel during peephole and register allocation. llvm-svn: 330897
* [x86] Allow folding unaligned memory operands into pcmp[ei]str* instructions. (Chandler Carruth, 2018-04-26, 1 file, -4/+4)
| | | | | | | | | | | These have special permission according to the x86 manual to read unaligned memory, and this folding is done by ICC and GCC as well. This corrects one of the issues identified in PR37246. llvm-svn: 330896
* [CostModel][X86] Remove hard coded SDIV/UDIV vector costs (Simon Pilgrim, 2018-04-25, 1 file, -37/+13)
| | | | | | Algorithmically compute the 'x20' SDIV/UDIV vector costs - this is necessary for PR36550 when DIV costs will be driven from the scheduler models. llvm-svn: 330870
* AMDGPU/R600: Move int_r600_store_stream_output to the public intrinsic file (Tom Stellard, 2018-04-25, 1 file, -4/+0)
| | | | | | | | | | | | | | | | Summary: The TableGen'd GlobalISel instruction selector assumes all intrinsics are in the public Intrinsic:: namespace. Reviewers: jvesely, nhaehnle Reviewed By: jvesely, nhaehnle Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45989 llvm-svn: 330866
* [AMDGPU] Waitcnt pass: add debug options (Mark Searles, 2018-04-25, 1 file, -11/+75)
- Add "amdgpu-waitcnt-forcezero" to force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
- Add debug counters to control force emit of s_waitcnt instrs; debug counters:
    si-insert-waitcnts-forceexp: force emit s_waitcnt expcnt(0) instrs
    si-insert-waitcnts-forcevm: force emit s_waitcnt vmcnt(0) instrs
    si-insert-waitcnts-forcelgkm: force emit s_waitcnt lgkmcnt(0) instrs
- Add some debug statements

Note that a variant of this patch was previously committed/reverted.

Differential Revision: https://reviews.llvm.org/D45888
llvm-svn: 330862
* [X86] Form MUL_IMM for multiplies with 3/5/9 to encourage LEA formation over load folding. (Craig Topper, 2018-04-25, 1 file, -2/+6)
| | | | | | | | | | Previously we only formed MUL_IMM when we split a constant. This blocked load folding on those cases. We should also form MUL_IMM for 3/5/9 to favor LEA over load folding. Differential Revision: https://reviews.llvm.org/D46040 llvm-svn: 330850
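The reason 3, 5 and 9 are special is that an x86 LEA computes base + index * scale with scale in {1, 2, 4, 8}, so using the same register as base and index gives x * (scale + 1). A minimal Python model of that arithmetic follows. `lea` and `mul_by_lea` are hypothetical names and model only the address calculation, not LLVM's selection logic.

```python
LEA_SCALES = (1, 2, 4, 8)


def lea(base, index, scale):
    """Model the base + index * scale part of an x86 LEA address computation."""
    if scale not in LEA_SCALES:
        raise ValueError("scale must be 1, 2, 4 or 8")
    return base + index * scale


def mul_by_lea(x, factor):
    """Multiply x by 3, 5 or 9 with one LEA, using x as both base and index."""
    if factor not in (3, 5, 9):
        raise ValueError("only 3, 5 and 9 map to a single LEA in this sketch")
    return lea(x, x, factor - 1)


for factor in (3, 5, 9):
    print(factor, mul_by_lea(7, factor))
```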
* [RISCV] Allow call pseudoinstruction to be used to call a function name that coincides with a register name (Alex Bradbury, 2018-04-25, 1 file, -9/+12)
| | | | | | | | | | Previously `call zero`, `call f0` etc. would fail. This led to compilation failures when building programs that define functions with those names and using -save-temps. llvm-svn: 330846
* [CostModel][X86] Recursive call for cost of imul for packed v16i16 constant shift left. (Simon Pilgrim, 2018-04-25, 1 file, -1/+3)
| | | | | | | | Don't just assume cost = 1. llvm-svn: 330834
* [AArch64][GlobalISel] Implement selection for the llvm.trap intrinsic. (Amara Emerson, 2018-04-25, 1 file, -0/+9)
| | | | | | rdar://38674040 llvm-svn: 330831
* [RISCV] Expand function call to "call" pseudoinstruction (Shiva Chen, 2018-04-25, 3 files, -10/+18)
To do this:

1. Change the GlobalAddress SDNode to TargetGlobalAddress to prevent the legalizer from splitting the symbol.
2. Change the ExternalSymbol SDNode to TargetExternalSymbol to prevent the legalizer from splitting the symbol.
3. Let PseudoCALL match a direct call with the target operands TargetGlobalAddress and TargetExternalSymbol.

Differential Revision: https://reviews.llvm.org/D44885
llvm-svn: 330827
* [RISCV] Support "call" pseudoinstruction in the MC layer (Shiva Chen, 2018-04-25, 7 files, -4/+110)
To do this:

1. Add PseudoCALLIndirect to match indirect function calls.
2. Add PseudoCALL to support parsing and printing the pseudo `call` in assembly.
3. Expand PseudoCALL to the following form, with the R_RISCV_CALL relocation type, while encoding:

    auipc ra, func
    jalr ra, ra, 0

If we expanded PseudoCALL before emitting assembly, we would see the auipc and jalr pair when compiling with -S. It is hard for the assembly parser to parse this pair, identify that its semantics are a function call, and then insert the R_RISCV_CALL relocation type. We could insert the R_RISCV_PCREL_HI20 and R_RISCV_PCREL_LO12_I relocation types instead of R_RISCV_CALL, but due to the RISC-V relocation design an auipc and jalr pair can only be relaxed to jal with the R_RISCV_CALL + R_RISCV_RELAX relocation types.

We expand PseudoCALL as late as encoding (RISCVMCCodeEmitter) rather than before emitting assembly (RISCVAsmPrinter) because we want to preserve the call pseudoinstruction in assembly code. It is more readable, and the assembly parser can identify the call and insert the R_RISCV_CALL relocation type.

Differential Revision: https://reviews.llvm.org/D45859
llvm-svn: 330826
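When the relocation on the auipc/jalr pair is resolved, the PC-relative distance to the callee has to be split into a 20-bit upper part for auipc and a signed 12-bit lower part for jalr. The Python sketch below shows that split under the standard hi20/lo12 rounding convention. `split_pcrel` is a hypothetical helper, and the sketch ignores range checks and relaxation.

```python
def split_pcrel(offset):
    """Split a PC-relative byte offset into (hi20, lo12).

    The result satisfies (hi20 << 12) + lo12 == offset, with lo12 in the
    signed 12-bit range expected by the jalr immediate.
    """
    # Adding 0x800 rounds to the nearest multiple of 4096, so the
    # remainder always fits in a signed 12-bit field.
    hi20 = (offset + 0x800) >> 12
    lo12 = offset - (hi20 << 12)
    return hi20, lo12


for offset in (0x12345, 0x1800, -4):
    hi20, lo12 = split_pcrel(offset)
    print(f"offset={offset:#x}: auipc ra, {hi20:#x}; jalr ra, ra, {lo12}")
```

The rounding step matters because jalr sign-extends its immediate: for an offset of 0x1800 the pair uses hi20 = 2 and lo12 = -2048 rather than hi20 = 1 and an unencodable lo12 of 2048.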
* [mips] Teach the delay slot filler to transform 'jal' for microMIPS (Simon Dardis, 2018-04-25, 1 file, -0/+1)
| | | | | | | | ISel is currently picking 'JAL' over 'JAL_MM' for calling a function when targeting microMIPS. A later patch will correct this behaviour. This patch extends the mechanism for transforming instructions into their short delay slot forms so that it recognises 'JAL_MM' and transforms it into 'JALS_MM'. llvm-svn: 330825
* [X86] Split WriteFMA into XMM, Scalar and YMM/ZMM scheduler classes (Simon Pilgrim, 2018-04-25, 12 files, -170/+201)
| | | | | | | | This removes all the FMA InstRW overrides. If we ever get PR36924, then we can remove many of these declarations from models. llvm-svn: 330820
* [AMDGPU] Revert b0efc4fd6 (https://reviews.llvm.org/D40556) (Alexander Timofeev, 2018-04-25, 1 file, -64/+15)
| | | | llvm-svn: 330818
* [X86][SKX] Setup WriteFAdd and remove unnecessary InstRW scheduler overrides. (Simon Pilgrim, 2018-04-25, 1 file, -140/+7)
| | | | llvm-svn: 330813
* [X86][SNB] Remove unnecessary WriteFBlendLd InstRW scheduler overrides. (Simon Pilgrim, 2018-04-25, 1 file, -4/+2)
| | | | llvm-svn: 330812
* [mips] Fix the definition of sync, synci (Simon Dardis, 2018-04-25, 5 files, -11/+36)
| | | | | | | | | | Also, fix the disassembly of synci for microMIPS. Reviewers: abeserminji, smaksimovic, atanasyan Differential Revision: https://reviews.llvm.org/D45870 llvm-svn: 330810
* [AArch64][SVE] Asm: Add AsmOperand classes for SVE gather/scatter addressing modes. (Sander de Smalen, 2018-04-25, 4 files, -8/+150)

This patch adds parsing support for 'vector + shift/extend' and corresponding asm operand classes, needed for implementing SVE's gather/scatter addressing modes.

The added combinations of vector (ZPR) and Shift/Extend are:

Unscaled:
  ZPR64ExtLSL8: signed 64-bit offsets (z0.d)
  ZPR32ExtUXTW8: unsigned 32-bit offsets (z0.s, uxtw)
  ZPR32ExtSXTW8: signed 32-bit offsets (z0.s, sxtw)

Unpacked and unscaled:
  ZPR64ExtUXTW8: unsigned 32-bit offsets (z0.d, uxtw)
  ZPR64ExtSXTW8: signed 32-bit offsets (z0.d, sxtw)

Unpacked and scaled:
  ZPR64ExtUXTW<scale>: unsigned 32-bit offsets (z0.d, uxtw #<shift>)
  ZPR64ExtSXTW<scale>: signed 32-bit offsets (z0.d, sxtw #<shift>)

Scaled:
  ZPR32ExtUXTW<scale>: unsigned 32-bit offsets (z0.s, uxtw #<shift>)
  ZPR32ExtSXTW<scale>: signed 32-bit offsets (z0.s, sxtw #<shift>)
  ZPR64ExtLSL<scale>: unsigned 64-bit offsets (z0.d, lsl #<shift>)
  ZPR64ExtLSL<scale>: signed 64-bit offsets (z0.d, lsl #<shift>)

Patch [1/3] in series to add support for SVE's gather load instructions that use scalar+vector addressing modes:
- Patch [1/3]: https://reviews.llvm.org/D45951
- Patch [2/3]: https://reviews.llvm.org/D46023
- Patch [3/3]: https://reviews.llvm.org/D45958

Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, t.p.northover, echristo, evandro, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D45951
llvm-svn: 330805
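To make the uxtw/sxtw and shift combinations listed in this message more concrete, the Python sketch below computes per-lane effective addresses for a scalar base plus a vector of 32-bit offsets. It is only a model of the arithmetic implied by the operand descriptions; `extend32` and `gather_addresses` are hypothetical helpers, and predication and element sizes are ignored.

```python
MASK64 = (1 << 64) - 1


def extend32(value, signed):
    """Zero-extend (uxtw) or sign-extend (sxtw) the low 32 bits of value."""
    low = value & 0xFFFFFFFF
    if signed and low & 0x80000000:
        low -= 1 << 32
    return low


def gather_addresses(base, offsets, signed, shift=0):
    """Return base + (extend(offset) << shift) for each lane, wrapped to 64 bits."""
    return [(base + (extend32(o, signed) << shift)) & MASK64 for o in offsets]


offsets = [0x3, 0xFFFFFFFF]
print([hex(a) for a in gather_addresses(0x1000, offsets, signed=False)])
print([hex(a) for a in gather_addresses(0x1000, offsets, signed=True)])
print([hex(a) for a in gather_addresses(0x1000, offsets, signed=False, shift=2)])
```

The same 32-bit lane value 0xFFFFFFFF addresses a location about 4 GiB above the base under uxtw and one byte below it under sxtw, which is why the operand classes keep the two extends distinct.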