summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [MIPS GlobalISel] Select integer to floating point conversionsPetar Avramovic2019-06-202-0/+21
| | | | | | | | Select G_SITOFP and G_UITOFP for MIPS32. Differential Revision: https://reviews.llvm.org/D63542 llvm-svn: 363912
* [MIPS GlobalISel] Select floating point to integer conversionsPetar Avramovic2019-06-203-0/+50
| | | | | | | | Select G_FPTOSI and G_FPTOUI for MIPS32. Differential Revision: https://reviews.llvm.org/D63541 llvm-svn: 363911
* [X86] Remove memory instructions form isUseDefConvertible.Craig Topper2019-06-201-15/+15
| | | | | | | | The caller of this is looking for comparisons of the input to these instructions with 0. But the memory instructions input is an addess not a value input in a register. llvm-svn: 363907
* [X86] Add v64i8/v32i16 to several places in X86CallingConv.td where they ↵Craig Topper2019-06-201-3/+4
| | | | | | seemed obviously missing. llvm-svn: 363906
* AMDGPU: Don't clobber VCC in MUBUF addr64 emulationMatt Arsenault2019-06-201-9/+16
| | | | | | | | | Introducing VCC defs during SIFixSGPRCopies is generally problematic. Avoid it by starting with the VOP3 form with the general condition register. This is the easiest to fix instance, but doesn't solve any specific problems I'm looking at. llvm-svn: 363904
* [llvm-objdump] Switch between ARM/Thumb based on mapping symbols.Eli Friedman2019-06-201-29/+28
| | | | | | | | | | | | | | | The ARMDisassembler changes allow changing between ARM and Thumb mode based on the MCSubtargetInfo, rather than the Target, which simplifies the other changes a bit. I'm not really happy with adding more target-specific logic to tools/llvm-objdump/, but there isn't any easy way around it: the logic in question specifically applies to disassembling an object file, and that code simply isn't located in lib/Target, at least at the moment. Differential Revision: https://reviews.llvm.org/D60927 llvm-svn: 363903
* AMDGPU: Consolidate some getGeneration checksMatt Arsenault2019-06-199-31/+82
| | | | | | | | This is incomplete, and ideally these would all be removed, but it's better to localize them to the subtarget first with comments about what they're for. llvm-svn: 363902
* AMDGPU: Undo sub x, c canonicalization for v2i16Matt Arsenault2019-06-193-26/+87
| | | | | | Should avoid regression from D62341 llvm-svn: 363899
* [mips] Mark the `lwupc` instruction as MIPS64 R6 onlySimon Atanasyan2019-06-192-3/+3
| | | | | | | | | | The "The MIPS64 Instruction Set Reference Manual" [1] states that the `lwupc` is MIPS64 Release 6 only. It should not be supported for 32-bit CPUs. [1] https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00087-2B-MIPS64BIS-AFP-6.06.pdf llvm-svn: 363886
* [mips] Add (GPR|PTR)_64 predicates to PseudoReturn64 and ↵Simon Atanasyan2019-06-191-2/+2
| | | | | | | | | PseudoIndirectHazardBranch64 This patch is one of a series of patches. The goal is to make P5600 scheduler model complete and turn on the `CompleteModel` flag. llvm-svn: 363885
* AMDGPU: Fix folding immediate into readfirstlane through reg_sequenceMatt Arsenault2019-06-192-2/+6
| | | | | | | | | | | | | The def instruction for the vreg may not match, because it may be folding through a reg_sequence. The assert was overly conservative and not necessary. It's not actually important if DefMI really defined the register, because the fold that will be done cares about the def of the value that will be folded. For some reason copies aren't making it through the reg_sequence, although they should. llvm-svn: 363876
* hwasan: Shrink outlined checks by 1 instruction.Peter Collingbourne2019-06-191-12/+7
| | | | | | | | | Turns out that we can save an instruction by folding the right shift into the compare. Differential Revision: https://reviews.llvm.org/D63568 llvm-svn: 363874
* Reapply "AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsics"Matt Arsenault2019-06-196-21/+155
| | | | | | | | This reapplies r363678, using the correct chain for the CopyToReg for v0. glueCopyToM0 counterintuitively changes the operands of the original node. llvm-svn: 363870
* [x86] avoid vector load narrowing with extracted store uses (PR42305)Sanjay Patel2019-06-191-0/+20
| | | | | | | | | | | | This is an exception to the rule that we should prefer xmm ops to ymm ops. As shown in PR42305: https://bugs.llvm.org/show_bug.cgi?id=42305 ...the store folding opportunity with vextractf128 may result in better perf by reducing the instruction count. Differential Revision: https://reviews.llvm.org/D63517 llvm-svn: 363853
* [X86][SSE] combineToExtendVectorInReg - add ANY_EXTEND support TODO. NFCI.Simon Pilgrim2019-06-191-0/+1
| | | | | | So I don't forget - there's a load of yak shaving to do first. llvm-svn: 363847
* [X86][SSE] Combine shuffles to ANY_EXTEND/ANY_EXTEND_VECTOR_INREG.Simon Pilgrim2019-06-191-10/+15
| | | | | | We already do this for ZERO_EXTEND/ZERO_EXTEND_VECTOR_INREG - this just extends the pattern matcher to recognize cases where we don't need the zeros in the extension. llvm-svn: 363841
* [ARM] Add MVE vector bit-operations (register inputs).Simon Tatham2019-06-198-25/+477
| | | | | | | | | | | | | | | | | | | | | | | | This includes all the obvious bitwise operations (AND, OR, BIC, ORN, MVN) in register-to-register forms, and the immediate forms of AND/OR/BIC/ORN; byte-order reverse instructions; and the VMOVs that access a single lane of a vector. Some of those VMOVs (specifically, the ones that access a 32-bit lane) share an encoding with existing instructions that were disassembled as accessing half of a d-register (e.g. `vmov.32 r0, d1[0]`), but in 8.1-M they're now written as accessing a quarter of a q-register (e.g. `vmov.32 r0, q0[2]`). The older syntax is still accepted by the assembler. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62673 llvm-svn: 363838
* [AVR] Change limit type to match the argument type (NFC)Evandro Menezes2019-06-191-1/+1
| | | | llvm-svn: 363832
* [Hexagon] Change limit type to match the argument type (NFC)Evandro Menezes2019-06-191-1/+1
| | | | llvm-svn: 363831
* [X86] getExtendInVec - take a ISD::*_EXTEND opcode instead of a IsSigned ↵Simon Pilgrim2019-06-191-15/+13
| | | | | | | | bool flag. NFCI. Prep work to support ANY_EXTEND/ANY_EXTEND_VECTOR_INREG without needing another flag. llvm-svn: 363818
* [X86] Add *_EXTEND -> *_EXTEND_VECTOR_INREG opcode conversion helper. NFCI.Simon Pilgrim2019-06-191-11/+19
| | | | | | Given a *_EXTEND or *_EXTEND_VECTOR_INREG opcode, convert it to *_EXTEND_VECTOR_INREG. llvm-svn: 363812
* [X86] Merge extract_subvector(*_EXTEND) and ↵Simon Pilgrim2019-06-191-12/+8
| | | | | | extract_subvector(*_EXTEND_VECTOR_INREG) handling. NFCI. llvm-svn: 363808
* [SystemZ] Support vector load/store alignment hintsUlrich Weigand2019-06-195-24/+99
| | | | | | | | | | | | | Vector load/store instructions support an optional alignment field that the compiler can use to provide known alignment info to the hardware. If the field is used (and the information is correct), the hardware may be able (on some models) to perform faster memory accesses than otherwise. This patch adds support for alignment hints in the assembler and disassembler, and fills in known alignment during codegen. llvm-svn: 363806
* Revert rL363678 : AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsicsSimon Pilgrim2019-06-196-156/+21
| | | | | | | | | There may or may not be additional work to handle this correctly on SI/CI. ........ Breaks EXPENSIVE_CHECKS buildbots - http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/78/ llvm-svn: 363797
* [RISCV] Allow parsing immediates that use tilde & exclaimLewis Revill2019-06-191-0/+4
| | | | | | | | | This patch allows immediates (and CSR alias immediates) which start with a tilde token or an exclaim (!) token to be parsed as intended. Differential Revision: https://reviews.llvm.org/D57320 llvm-svn: 363783
* [RISCV] Fix failure to parse parenthesized immediatesLewis Revill2019-06-191-3/+8
| | | | | | | | | | | | | | | Since the parser attempts to parse an operand as a register with parentheses before parsing it as an immediate, immediates in parentheses should not be parsed by parseRegister. However in the case where the immediate does not start with an identifier, the LParen is not unlexed and so the RParen causes an unexpected token error. This patch adds the missing UnLex, and modifies the existing UnLex to not use a buffered token, as it should always be unlexing an LParen. Differential Revision: https://reviews.llvm.org/D57319 llvm-svn: 363782
* [X86] Add missing properties on llvm.x86.sse.{st,ld}mxcsrClement Courbet2019-06-191-0/+4
| | | | | | | | | | | | | | | | Summary: llvm.x86.sse.stmxcsr only writes to memory. llvm.x86.sse.ldmxcsr only reads from memory, and might generate an FPE. Reviewers: craig.topper, RKSimon Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62896 llvm-svn: 363773
* [RISCV] Add lowering of global TLS addressesLewis Revill2019-06-196-0/+169
| | | | | | | | | | | | | This patch adds lowering for global TLS addresses for the TLS models of InitialExec, GlobalDynamic, LocalExec and LocalDynamic. LocalExec support required using a 4-operand add instruction, which uses the fourth operand to express a relocation on the symbol. The necessary fixup is emitted when the instruction is emitted. Differential Revision: https://reviews.llvm.org/D55305 llvm-svn: 363771
* [NFC] move some hardware loop checking code to a common place for other using.Chen Zheng2019-06-194-4/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D63478 llvm-svn: 363758
* Rename ExpandISelPseudo->FinalizeISel, delay register reservationMatt Arsenault2019-06-192-2/+3
| | | | | | | | | | | This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are calls or stack objects in the frame or not, which could have been introduced by legalization. Patch by Matthias Braun llvm-svn: 363757
* [WebAssembly] Optimize ISel for SIMD Boolean reductionsThomas Lively2019-06-191-0/+22
| | | | | | | | | | | | | | | | | | | | Summary: Converting the result *.{all,any}_true to a bool at the source level generates LLVM IR that compares the result to 0. This check is redundant since these instructions already return either 0 or 1 and therefore conform to the BooleanContents setting for WebAssembly. This CL adds patterns to detect and remove such redundant operations on the result of Boolean reductions. Reviewers: dschuff, aheejin Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63529 llvm-svn: 363756
* [ARM] Comply with rules on ARMv8-A thumb mode partial deprecation of IT.Huihui Zhang2019-06-182-11/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When identifing instructions that can be folded into a MOVCC instruction, checking for a predicate operand is not enough, also need to check for thumb2 function, with restrict-IT, is the machine instruction eligible for ARMv8 IT or not. Notes in ARMv8-A Architecture Reference Manual, section "Partial deprecation of IT" https://usermanual.wiki/Pdf/ARM20Architecture20Reference20ManualARMv8.1667877052.pdf "ARMv8-A deprecates some uses of the T32 IT instruction. All uses of IT that apply to instructions other than a single subsequent 16-bit instruction from a restricted set are deprecated, as are explicit references to the PC within that single 16-bit instruction. This permits the non-deprecated forms of IT and subsequent instructions to be treated as a single 32-bit conditional instruction." Reviewers: efriedma, lebedev.ri, t.p.northover, jmolloy, aemerson, compnerd, stoklund, ostannard Reviewed By: ostannard Subscribers: ostannard, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63474 llvm-svn: 363739
* [RISCV] Prevent re-ordering some adds after shiftsSam Elliott2019-06-184-4/+75
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: DAGCombine will normally turn a `(shl (add x, c1), c2)` into `(add (shl x, c2), c1 << c2)`, where `c1` and `c2` are constants. This can be prevented by a callback in TargetLowering. On RISC-V, materialising the constant `c1 << c2` can be more expensive than materialising `c1`, because materialising the former may take more instructions, and may use a register, where materialising the latter would not. This patch implements the hook in RISCVTargetLowering to prevent this transform, in the cases where: - `c1` fits into the immediate field in an `addi` instruction. - `c1` takes fewer instructions to materialise than `c1 << c2`. In future, DAGCombine could do the check to see whether `c1` fits into an add immediate, which might simplify more targets hooks than just RISC-V. Reviewers: asb, luismarques, efriedma Reviewed By: asb Subscribers: xbolva00, lebedev.ri, craig.topper, lewis-revill, Jim, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62857 llvm-svn: 363736
* [AMDGPU] gfx10 wave32 patternsStanislav Mekhanoshin2019-06-183-15/+86
| | | | | | Differential Revision: https://reviews.llvm.org/D63511 llvm-svn: 363729
* [AMDGPU] gfx1010 disassembler changes for wave32Stanislav Mekhanoshin2019-06-182-3/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D63506 llvm-svn: 363721
* [X86] Remove unnecessary line that makes v4f32 FP_ROUND Legal. NFCCraig Topper2019-06-181-1/+0
| | | | | | | | FP_ROUND defaults to Legal for all MVT types and nothing changes the v4f32 entry way from this default. If we needed this line we'd also need one for v8f32 with AVX512 which we don't have. llvm-svn: 363719
* [mips] Add more strict predicates to the RSQRT_S_MM and TAILCALL_MMSimon Atanasyan2019-06-182-2/+3
| | | | | | | This patch is one of a series of patches. The goal is to make P5600 scheduler model complete and turn on the `CompleteModel` flag. llvm-svn: 363703
* [mips] Add PTR_64 and GPR_64 predicates to some MIPS 64-bit instructionsSimon Atanasyan2019-06-182-15/+17
| | | | | | | | | | | | | | | Add `IsGP64bit` and `IsPTR64bit` to the list of `UnsupportedFeatures` of the P5600 scheduling definitions. Also mark some MIPS 64-bit instructions by PTR_64 and GPR_64 predicates. This reduces number of "No schedule information for" and "lacks information for" errors in case of marking this scheduler model as complete. This patch is one of a series of patches. The goal is to make P5600 scheduler model complete and turn on the `CompleteModel` flag. Differential Revision: https://reviews.llvm.org/D63237 llvm-svn: 363702
* [mips] Set the hasNoSchedulingInfo flag for the `MipsAsmPseudoInst`Simon Atanasyan2019-06-181-0/+1
| | | | | | | | | | | | | | Set the hasNoSchedulingInfo flag for the`MipsAsmPseudoInst`. These pseudo-instructions are never used by codegen. This flag allows to reduce number of "No schedule information for" and "lacks information for" errors in case of marking a scheduler model as complete. This patch is one of a series of patches. The goal is to make P5600 scheduler model complete and turn on the `CompleteModel` flag. Differential Revision: https://reviews.llvm.org/D63236 llvm-svn: 363701
* [ARM] Add MVE vector shift instructions.Simon Tatham2019-06-184-4/+655
| | | | | | | | | | | | | | | | | | | This includes saturating and non-saturating shifts, both with immediate shift count and with the shift counts given by another vector register; VSHLC (in which the bits shifted out of each active vector lane are shifted in to the next active lane); and also VMOVL, which is enough like an immediate shift that it didn't fit too badly in this category. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62672 llvm-svn: 363696
* [ARM] Add MVE integer vector min/max instructions.Simon Tatham2019-06-182-1/+28
| | | | | | | | | | | | | | | | | | | | | Summary: These form a small family of their own, to go with the floating-point VMINNM/VMAXNM instructions added in a previous commit. They introduce the first of many special cases in the mnemonic recognition code, because VMIN with the E suffix used by the VPT predication system needs to avoid being interpreted as the nonexistent instruction 'VMI' with an ordinary 'NE' condition suffix. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62671 llvm-svn: 363695
* [X86][AVX] extract_subvector(any_extend(x)) -> any_extend_vector_inreg(x)Simon Pilgrim2019-06-181-4/+10
| | | | | | Part of fixing the X86 regression noted in D63281 - I've split this into X86 and generic parts - the generic commit will be coming shortly and will fix the vector-reduce-mul-widen.ll regression introduced here. llvm-svn: 363693
* [ARM] Rename MVE instructions in Tablegen for consistency.Simon Tatham2019-06-185-243/+256
| | | | | | | | | | | | | | | | | | | Summary: Their names began with a mishmash of `MVE_`, `t2` and no prefix at all. Now they all start with `MVE_`, which seems like a reasonable choice on the grounds that (a) NEON is the thing they're most at risk of being confused with, and (b) MVE implies Thumb-2, so a prefix indicating MVE is strictly more specific than one indicating Thumb-2. Reviewers: ostannard, SjoerdMeijer, dmgreen Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63492 llvm-svn: 363690
* [RISCV] Lower calls through PLTLewis Revill2019-06-183-4/+18
| | | | | | | | | | This patch adds support for generating calls through the procedure linkage table where required for a given ExternalSymbol or GlobalAddress callee. Differential Revision: https://reviews.llvm.org/D55304 llvm-svn: 363686
* AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsicsMatt Arsenault2019-06-186-21/+156
| | | | | | | There may or may not be additional work to handle this correctly on SI/CI. llvm-svn: 363678
* AMDGPU: Change API for checking for exec modificationMatt Arsenault2019-06-184-27/+55
| | | | | | | | | | | | | | | | | | Invert the name and return value to better reflect the imprecise nature. Force passing in the DefMI, since it's known in the 2 users and could possibly fail for an arbitrary vreg. Allow specifying a specific user instruction. Scan through use instructions, instead of use operands. Add scan thresholds instead of searching infinitely. Stop using a set to track seen uses. I didn't understand this usage, or why it would not check the last use. I don't think the use list has any particular order. llvm-svn: 363675
* AMDGPU: Fold readlane from copy of SGPR or immMatt Arsenault2019-06-182-0/+42
| | | | | | These may be inserted to assert uniformity somewhere. llvm-svn: 363670
* AMDGPU: Remove unnecessary check for virtual registerMatt Arsenault2019-06-181-17/+4
| | | | | | | The copy was found by searching the uses of a virtual register, so it's already known to be virtual. llvm-svn: 363669
* AMDGPU: Fix iterator crash in AMDGPUPromoteAllocaMatt Arsenault2019-06-181-5/+9
| | | | | | The lifetime intrinsic was erased, which was the next iterator. llvm-svn: 363668
* AMDGPU/GlobalISel: RegBankSelect for amdgcn.div.scaleMatt Arsenault2019-06-181-0/+14
| | | | llvm-svn: 363667
OpenPOWER on IntegriCloud