summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/SOPInstructions.td
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Fix failure to select branch with optnoneMatt Arsenault2017-10-101-2/+1
| | | | | | | | opt-bisect/optnone disable the AMDGPUUniformAnnotateValues pass. The heuristic in the custom selector for brcond deferred the branch uniformity check to the pattern, which would fail. llvm-svn: 315360
* AMDGPU: Remove global isGCN predicatesMatt Arsenault2017-10-031-14/+10
| | | | | | | | | | | | | | These are problematic because they apply to everything, and can easily clobber whatever more specific predicate you are trying to add to a function. Currently instructions use SubtargetPredicate/PredicateControl to apply this to patterns applied to an instruction definition, but not to free standing Pats. Add a wrapper around Pat so the special PredicateControls requirements can be appended to the final predicate list like how Mips does it. llvm-svn: 314742
* AMDGPU: Start selecting s_xnor_{b32, b64}Konstantin Zhuravlyov2017-09-181-2/+8
| | | | | | Differential Revision: https://reviews.llvm.org/D37981 llvm-svn: 313565
* Resubmit r303859 with test fixed.Konstantin Zhuravlyov2017-05-261-1/+3
| | | | | | | | | | [AMDGPU] add intrinsic for s_getpc Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc. Patch by Tim Corringham llvm-svn: 304031
* Revert r303859, CodeGen/AMDGPU/llvm.amdgcn.s.getpc.ll fails on bots.Nico Weber2017-05-251-3/+1
| | | | llvm-svn: 303902
* [AMDGPU] add intrinsic for s_getpcTim Corringham2017-05-251-1/+3
| | | | | | | | | | | | | | Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D32862 llvm-svn: 303859
* AMDGPU: Start defining a calling conventionMatt Arsenault2017-05-171-3/+15
| | | | | | | | Partially implement callee-side for arguments and return values. byval doesn't work properly, and most likely sret or other on-stack return values most as well. llvm-svn: 303308
* [AMDGPU][MC] Added check for truncation of SOPK imm operandDmitry Preobrazhensky2017-04-261-17/+19
| | | | | | | | | | See bug 30827: https://bugs.llvm.org//show_bug.cgi?id=30827 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D32535 llvm-svn: 301418
* [AMDGPU][MC] Enabled constants for src operands of s_cbranch_g_forkDmitry Preobrazhensky2017-04-141-1/+1
| | | | | | | | | | Fixed bug 32619: https://bugs.llvm.org//show_bug.cgi?id=32619 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D31973 llvm-svn: 300318
* [AMDGPU][MC] Added support for several VI-specific opcodes (s_wakeup, etc)Dmitry Preobrazhensky2017-04-121-0/+28
| | | | | | | | | | | | | | | | | | | | | | Added support for VI: - s_endpgm_saved - s_wakeup - s_rfe_restore_b64 - v_perm_b32 Enabled for VI: - v_mov_fed_b32 - v_mov_fed_b32_e64 See bug 32593: https://bugs.llvm.org//show_bug.cgi?id=32593 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D31931 llvm-svn: 300076
* [AMDGPU][MC] Corrected src0 size for s_cbranch_joinDmitry Preobrazhensky2017-04-121-1/+7
| | | | | | | | | | Fix for bug 28159: https://bugs.llvm.org//show_bug.cgi?id=28159 Reviewers: vpykhtin, arsenm Differential Revision: https://reviews.llvm.org/D31595 llvm-svn: 300055
* [AMDGPU][MC] Fix for Bug 28158 + LIT testsDmitry Preobrazhensky2017-04-051-0/+20
| | | | | | | | | | | | | | | Added support of the following instructions: - s_cbranch_cdbgsys - s_cbranch_cdbgsys_and_user - s_cbranch_cdbgsys_or_user - s_cbranch_cdbguser - s_setkill Reviewers: vpykhtin Differential Revision: https://reviews.llvm.org/D31469 llvm-svn: 299567
* AMDGPU: Add VOP3P instruction formatMatt Arsenault2017-02-271-0/+8
| | | | | | | | Add a few non-VOP3P but instructions related to packed. Includes hack with dummy operands for the benefit of the assembler llvm-svn: 296368
* AMDGPU/SI: Implement sendmsghalt intrinsicJan Vesely2017-01-041-1/+4
| | | | | | | | v2: expose using amdgcn prefix Differential Revision: https://reviews.llvm.org/D23511 llvm-svn: 290977
* AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.Tom Stellard2016-12-071-1/+5
| | | | | | | | | | | | | | Patch By: Wei Ding Summary: This patch fixes the fdiv precision issues. Reviewers: b-sumner, cfang, wdng, arsenm Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D26424 llvm-svn: 288879
* AMDGPU: Add VI i16 supportTom Stellard2016-11-101-1/+36
| | | | | | | | Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 286464
* Revert "AMDGPU: Add VI i16 support"Tom Stellard2016-11-041-36/+1
| | | | | | This reverts commit r285939 and r285948. These broke some conformance tests. llvm-svn: 285995
* AMDGPU: Add VI i16 supportTom Stellard2016-11-031-1/+36
| | | | | | | | Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 285939
* AMDGPU: Workaround for instruction size with literalsMatt Arsenault2016-11-011-0/+1
| | | | | | | | | | Instructions with a 32-bit base encoding with an optional 32-bit literal encoded after them report their size as 4 for the disassembler. Consider these when computing the MachineInstr size. This fixes problems caused by size estimate consistency in BranchRelaxation. llvm-svn: 285743
* AMDGPU: Fix instruction flags for s_endpgmMatt Arsenault2016-10-281-2/+1
| | | | | | | Set isReturn, remove hasSideEffects. Also remove hasCtrlDep, I'm not really sure what that does. llvm-svn: 285476
* AMDGPU: Add instruction definitions for VGPR indexingMatt Arsenault2016-10-121-1/+51
| | | | | | | VI added a second method of indexing into VGPRs besides using v_movrel* llvm-svn: 284027
* BranchRelaxation: Support expanding unconditional branchesMatt Arsenault2016-10-061-0/+2
| | | | | | | AMDGPU needs to expand unconditional branches in a new block with an indirect branch. llvm-svn: 283464
* AMDGPU: Partially fix reported code size for some instructionsMatt Arsenault2016-10-061-2/+3
| | | | | | | | These ones need to have the size on the pseudo instruction set for getInstSizeInBytes to work correctly. These also have a statically known size. llvm-svn: 283437
* AMDGPU: Use unsigned compare for eq/neMatt Arsenault2016-09-301-2/+2
| | | | | | | | | | For some reason there are both of these available, except for scalar 64-bit compares which only has u64. I'm not sure why there are both (I'm guessing it's for the one bit inputs we don't use), but for consistency always using the unsigned one. llvm-svn: 282832
* AMDGPU: Use i64 scalar compare instructionsMatt Arsenault2016-09-171-0/+12
| | | | | | VI added eq/ne for i64, so use them. llvm-svn: 281800
* AMDGPU: Use SOPK compare instructionsMatt Arsenault2016-09-161-23/+37
| | | | llvm-svn: 281780
* Revert "AMDGPU: Use SOPK compare instructions"Matt Arsenault2016-09-141-37/+23
| | | | | | Accidentally committed llvm-svn: 281514
* AMDGPU: Use SOPK compare instructionsMatt Arsenault2016-09-141-23/+37
| | | | llvm-svn: 281513
* AMDGPU] Assembler: better support for immediate literals in assembler.Sam Kolton2016-09-091-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Prevously assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals. E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least. With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values either integer or floating-point. Then we convert parsed literals to correct form based on information about type of operand parsed (was literal floating or binary) and type of expected instruction operands (is this f32/64 or b32/64 instruction). Here are rules how we convert literals: - We parsed fp literal: - Instruction expects 64-bit operand: - If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5) - then we do nothing this literal - Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5) - report error - Else literal is not-inlinable but we can encode it as additional 32-bit literal constant - If instruction expect fp operand type (f64) - Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5) - If so then do nothing - Else (e.g. v_fract_f64 v[0:1], 3.1415) - report warning that low 32 bits will be set to zeroes and precision will be lost - set low 32 bits of literal to zeroes - Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5) - report error as it is unclear how to encode this literal - Instruction expects 32-bit operand: - Convert parsed 64 bit fp literal to 32 bit fp. Allow lose of precision but not overflow or underflow - Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5) - do nothing - Else report error - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0) - Parsed binary literal: - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35) - do nothing - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35) - report error - Else, literal is not-inlinable and we are not required to inline it - Are high 32 bit of literal zeroes or same as sign bit (32 bit) - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef) - Else - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0) For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types: ''' enum OperandType { OPERAND_REG_IMM32_INT, OPERAND_REG_IMM32_FP, OPERAND_REG_INLINE_C_INT, OPERAND_REG_INLINE_C_FP, } ''' This is not working yet: - Several tests are failing - Problems with predicate methods for inline immediates - LLVM generated assembler parts try to select e64 encoding before e32. More changes are required for several AsmOperands. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, artem.tamazov Differential Revision: https://reviews.llvm.org/D22922 llvm-svn: 281050
* AMDGPU: Make some scalar instructions commutableMatt Arsenault2016-09-071-2/+9
| | | | llvm-svn: 280784
* AMDGPU/SI: Teach SIInstrInfo::FoldImmediate() to fold immediates into copiesTom Stellard2016-09-061-1/+2
| | | | | | | | | | | | | | | | | | | | Summary: I put this code here, because I want to re-use it in a few other places. This supersedes some of the immediate folding code we have in SIFoldOperands. I think the peephole optimizers is probably a better place for folding immediates into copies, since it does some register coalescing in the same time. This will also make it easier to transition SIFoldOperands into a smarter pass, where it looks at all uses of instruction at once to determine the optimal way to fold operands. Right now, the pass just considers one operand at a time. Reviewers: arsenm Subscribers: wdng, nhaehnle, arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23402 llvm-svn: 280744
* [AMDGPU] Refactor SOP instructions TD files.Valery Pykhtin2016-08-301-0/+1103
Differential revision: https://reviews.llvm.org/D23617 llvm-svn: 280101
OpenPOWER on IntegriCloud