| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
The constant is now at source operand 1 (previously at 2).
This is also how it is in legacy AMD sp3 assembler.
Update tests.
Differential Revision: http://reviews.llvm.org/D17984
llvm-svn: 263212
|
| |
|
|
|
|
|
|
| |
Frontend authors are strongly encouraged to keep allocas
in the entry block, so don't bother visiting every instruction
in the other blocks of the function.
llvm-svn: 263206
|
| |
|
|
|
|
|
| |
Move a few functions only used by R600 to R600 specific code,
fix header macros to stop using R600, mark classes as final.
llvm-svn: 263204
|
| |
|
|
|
|
|
| |
If a constant is the same as the reverse of an inline immediate,
this is 4 bytes smaller than having to embed a 32-bit literal.
llvm-svn: 263201
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
They correspond to BUFFER_LOAD/STORE_FORMAT_XYZW and will be used by Mesa
to implement the GL_ARB_shader_image_load_store extension.
The intention is that for llvm.amdgcn.buffer.load.format, LLVM will decide
whether one of the _X/_XY/_XYZ opcodes can be used (similar to image sampling
and loads). However, this is not currently implemented.
For llvm.amdgcn.buffer.store, LLVM cannot decide to use one of the "smaller"
opcodes and therefore the intrinsic is overloaded. Currently, only the v4f32
is actually implemented since GLSL also only has a vec4 variant of the store
instructions, although it's conceivable that Mesa will want to be smarter
about this in the future.
BUFFER_LOAD_FORMAT_XYZW is already exposed via llvm.SI.vs.load.input, which
has a legacy name, pretends not to access memory, and does not capture the
full flexibility of the instruction.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D17277
llvm-svn: 263140
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Define s_getreg intrinsic to generate s_getreg instruction to read
hardware registers.
Reviewers: tstellarAMD, arsenm
Subscribers: llvm-commits, arsenm
Differential Revision: http://reviews.llvm.org/D17892
llvm-svn: 263124
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D17651
llvm-svn: 263108
|
| |
|
|
|
|
| |
http://reviews.llvm.org/D17967
llvm-svn: 263021
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Fixes error from building with clang:
/usr/local/google/home/tejohnson/llvm/llvm_15/lib/Target/AMDGPU/InstPrinter/AMDGPUInstPrinter.cpp:407:12:
error: comparison of unsigned expression >= 0 is always true
[-Werror,-Wtautological-compare]
if ((Imm >= 0x000) && (Imm <= 0x0ff)) {
~~~ ^ ~~~~~
llvm-svn: 263014
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Supprot DPP syntax as used in SP3 (except several operands syntax).
Added dpp-specific operands in td-files.
Added DPP flag to TSFlags to determine if instruction is dpp in InstPrinter.
Support for VOP2 DPP instructions in td-files.
Some tests for DPP instructions.
ToDo:
- VOP2bInst:
- vcc is considered as operand
- AsmMatcher doesn't apply mnemonic aliases when parsing operands
- v_mac_f32
- v_nop
- disable instructions with 64-bit operands
- change dpp_ctrl assembler representation to conform sp3
Review: http://reviews.llvm.org/D17804
llvm-svn: 263008
|
| |
|
|
|
|
|
|
|
| |
Support legacy SP3 abs(v1) syntax. InstPrinter still uses |v1|.
Add tests.
Differential Revision: http://reviews.llvm.org/D17887
llvm-svn: 263006
|
| |
|
|
|
|
|
|
| |
s_setpc_b64 has just one 64-bit source which is the address of instruction to jump to.
Differential Revision: http://reviews.llvm.org/D17888
llvm-svn: 263005
|
| |
|
|
| |
llvm-svn: 262864
|
| |
|
|
| |
llvm-svn: 262854
|
| |
|
|
| |
llvm-svn: 262853
|
| |
|
|
|
|
|
|
| |
Engages code from r262804.
Differential Revision: http://reviews.llvm.org/D17151
llvm-svn: 262808
|
| |
|
|
|
|
|
|
| |
error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move]
http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/930
llvm-svn: 262805
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D17150
llvm-svn: 262804
|
| |
|
|
|
|
|
|
|
| |
dst -> sdst
ssrcN -> srcN
Differential Revision: http://reviews.llvm.org/D17646
llvm-svn: 262801
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is necessary for when we run out of VGPRs and can no
longer use v_{read,write}_lane for spilling SGPRs.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D17592
llvm-svn: 262732
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This allows us to use virtual registers when we need extra registers
for inserting spill instructions in SIRegisterInfo:eliminateFrameIndex().
Once all the frame indices have been eliminated, the
PrologEpilogueInserter does an extra pass over the program to replace
all virtual registers with physical ones.
This allows us to make more efficient use of our emergency spill slots,
so we only need to create one.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D17591
llvm-svn: 262728
|
| |
|
|
| |
llvm-svn: 262714
|
| |
|
|
| |
llvm-svn: 262709
|
| |
|
|
|
|
|
|
|
|
|
| |
These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the
GL_ARB_shader_image_load_store extension.
Initial change by Nicolai H.hnle
Differential Revision: http://reviews.llvm.org/D17401
llvm-svn: 262701
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch by: Konstantin Zhuravlyov
Summary: Tools, such as debugger, need to pause execution based on user input (i.e. breakpoint). In order to do this, two S_NOP instructions are inserted for each high level source statement: one before first isa instruction of high level source statement, and one after last isa instruction of high level source statement. Further, debugger may replace S_NOP instructions with S_TRAP instructions based on user input.
Reviewers: tstellarAMD, arsenm
Subscribers: echristo, dblaikie, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D17454
llvm-svn: 262579
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When there were no free SGPRs, we were trying to move this value into
some of the reserved registers which was causing a segmentation fault.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D17590
llvm-svn: 262577
|
| |
|
|
|
|
| |
Patch by Richard Thomson
llvm-svn: 262536
|
| |
|
|
|
|
|
|
| |
fields"
Build failure with clang.
llvm-svn: 262477
|
| |
|
|
|
|
|
|
| |
assembler."
Build failure with clang.
llvm-svn: 262475
|
| |
|
|
|
|
|
|
|
|
| |
complementary patch to table-driven amd_kernel_code_t field parser/printer utility. lit tests passed.
Patch by: Valery Pykhtin
Differential Revision: http://reviews.llvm.org/D17151
llvm-svn: 262474
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is going to be used in .hsatext disassembler and can be used
in current assembler parser (lit tests passed on parsing).
Code using this helpers isn't included in this patch.
Benefits:
unified approach
fast field name lookup on parsing
Later I would like to enhance some of the field naming/syntax using this code.
Patch by: Valery Pykhtin
Differential Revision: http://reviews.llvm.org/D17150
llvm-svn: 262473
|
| |
|
|
|
|
|
|
| |
Fix checking the same instruction twice instead of the
second branch that uses vccz. I don't think this matters
currently because s_branch_vccnz is always used currently.
llvm-svn: 262457
|
| |
|
|
| |
llvm-svn: 262456
|
| |
|
|
| |
llvm-svn: 262455
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TableGen checks at compiletime that for scheduling models with
"CompleteModel = 1" one of the following holds:
- Is marked with the hasNoSchedulingInfo flag
- The instruction is a subclass of Sched
- There are InstRW definitions in the scheduling model
Typical steps necessary to complete a model:
- Ensure all pseudo instructions that are expanded before machine
scheduling (usually everything handled with EmitYYY() functions in
XXXTargetLowering).
- If a CPU does not support some instructions mark the corresponding
resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
- Add missing scheduling information.
Differential Revision: http://reviews.llvm.org/D17747
llvm-svn: 262384
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Intrinsics
Summary:
This patch impleemnts DS_PERMUTE/DS_BPERMUTE instruction definitions and intrinsics,
which are new since VI.
Reviewers: tstellarAMD, arsenm
Subscribers: llvm-commits, arsenm
Differential Revision: http://reviews.llvm.org/D17614
llvm-svn: 262356
|
| |
|
|
| |
llvm-svn: 262346
|
| |
|
|
| |
llvm-svn: 262338
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Idea behind this change is to make code shorter and as much common for all targets as possible. Let's even accept more code than is valid for a particular target, leaving it for the assembler to sort out.
64bit instructions decoding added.
Error\warning messages on unrecognized instructions operands added, InstPrinter allowed to print invalid operands helping to find invalid/unsupported code.
The change is massive and hard to compare with previous version, so it makes sense just to take a look on the new version. As a bonus, with a few TD changes following, it disassembles the majority of instructions. Currently it fully disassembles >300K binary source of some blas kernel.
Previous TODOs were saved whenever possible.
Patch by: Valery Pykhtin
Differential Revision: http://reviews.llvm.org/D17720
llvm-svn: 262332
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
it is not present
Previosy, if actual instruction have one of optional operands then other optional operands listed before this also should be presented.
For example instruction v_fract_f32 v0, v1, mul:2 have one optional operand - OMod and do not have optional operand clamp. Previously this was not allowed because clamp is listed before omod in AsmString:
string AsmString = "v_fract_f32$vdst, $src0_modifiers$clamp$omod";
Making this work required some hacks (both OMod and Clamp match classes have same PredicateMethod).
Now, if MatchInstructionImpl meets formal optional operand that is not presented in actual instruction it skips this formal operand and tries to match current actual operand with next formal.
Patch by: Sam Kolton
Review: http://reviews.llvm.org/D17568
[AMDGPU] Assembler: Check immediate types for several optional operands in predicate methods
With this change you should place optional operands in order specified by asm string:
clamp -> omod
offset -> glc -> slc -> tfe
Fixes for several tests.
Depends on D17568
Patch by: Sam Kolton
Review: http://reviews.llvm.org/D17644
llvm-svn: 262314
|
| |
|
|
|
|
|
|
| |
Technically you aren't supposed to emit these after type legalization
for some reason, and we use vector extracts of bitcasted integers
as the canonical way to do this.
llvm-svn: 262298
|
| |
|
|
| |
llvm-svn: 262297
|
| |
|
|
|
|
|
|
|
|
| |
This currently does not have the control over the bitwidth,
and there are missing optimizations to reduce the integer to
32-bit if it can be.
But in most situations we do want the sinking to occur.
llvm-svn: 262296
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The maximum private allocation for the whole GPU is 4G,
so the maximum possible index for a single workitem is the
maximum size divided by the smallest granularity for a dispatch.
This increases the number of known zero high bits, which
enables more offset folding. The maximum private size per
workitem with this is 128M but may be smaller still.
llvm-svn: 262153
|
| |
|
|
|
|
| |
These parameters aren't expected to be null, so take them by reference.
llvm-svn: 262151
|
| |
|
|
|
|
|
|
|
| |
In all but one case, change the DFAPacketizer API to take MachineInstr&
instead of MachineInstr*. In DFAPacketizer::endPacket(), take
MachineBasicBlock::iterator. Besides cleaning up the API, this is in
search of PR26753.
llvm-svn: 262142
|
| |
|
|
|
|
|
| |
This will be more useful for marking builtins acceptable for which
subtargets.
llvm-svn: 262121
|
| |
|
|
| |
llvm-svn: 262120
|
| |
|
|
|
|
|
|
|
|
| |
This matches the behavior of the HSAIL clock instruction.
s_realmemtime is used if the subtarget supports it, and falls
back to s_memtime if not.
Also introduces new intrinsics for each of s_memtime / s_memrealtime.
llvm-svn: 262119
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Take MachineInstr by reference instead of by pointer in SlotIndexes and
the SlotIndex wrappers in LiveIntervals. The MachineInstrs here are
never null, so this cleans up the API a bit. It also incidentally
removes a few implicit conversions from MachineInstrBundleIterator to
MachineInstr* (see PR26753).
At a couple of call sites it was convenient to convert to a range-based
for loop over MachineBasicBlock::instr_begin/instr_end, so I added
MachineBasicBlock::instrs.
llvm-svn: 262115
|