| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
compr is not currently parsed (or printed) correctly,
but that should probably be fixed along with
intrinsic changes.
llvm-svn: 288698
|
|
|
|
|
|
|
| |
This is an improvement over a long list of unreadable numbers.
A follow-up patch will try to match how sc formats these.
llvm-svn: 288697
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Structure the definitions a bit more like the other classes.
The main change here is to split EXP with the done bit set
to a separate opcode, so we can set mayLoad = 1 so that it won't
be reordered before the other exp stores, since this has the special
constraint that if the done bit is set then this should be the last
exp in the shader.
Previously all exp instructions were inferred to have unmodeled
side effects.
llvm-svn: 288695
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The sdata operand of the s_buffer_store_dword instructions was called sdst in the encoding. This caused the disassembler to fail.
Reviewers: tstellarAMD, vpykhtin, artem.tamazov
Subscribers: arsenm, nhaehnle, rampitec
Differential Revision: https://reviews.llvm.org/D27100
llvm-svn: 288657
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lib/Target/AMDGPU/SIRegisterInfo.cpp: In member function 'void llvm::SIRegisterInfo::spillSGPR(llvm::MachineBasicBlock::iterator, int, llvm::RegScavenger*) const':
lib/Target/AMDGPU/SIRegisterInfo.cpp:572:30: warning: variable 'SubRC' set but not used [-Wunused-but-set-variable]
const TargetRegisterClass *SubRC = nullptr;
^
lib/Target/AMDGPU/SIRegisterInfo.cpp: In member function 'void llvm::SIRegisterInfo::restoreSGPR(llvm::MachineBasicBlock::iterator, int, llvm::RegScavenger*) const':
lib/Target/AMDGPU/SIRegisterInfo.cpp:723:30: warning: variable 'SubRC' set but not used [-Wunused-but-set-variable]
const TargetRegisterClass *SubRC = nullptr;
^
The variable was assigned to, but never used. The functions called did not
mutate state. Simplify the logic and remove the variable. Identified by gcc
5.4.0.
llvm-svn: 288601
|
|
|
|
| |
llvm-svn: 288590
|
|
|
|
| |
llvm-svn: 288523
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the spill is for the whole wave, these
don't have the swizzling problems that vector stores do
and a single 4-byte allocation is enough to spill a 64 element
register. This should reduce the number of spill instructions and
put all the spills for a register in the same cacheline.
This should save allocated private size, but for now it doesn't.
The extra slots are allocated for each component, but never used
because the frame layout is essentially finalized before frame
indices are replaced. For always using the scalar store path,
this should probably be moved into processFunctionBeforeFrameFinalized.
llvm-svn: 288445
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is not in the list of valid inputs for the encoding.
When spilling, copies from exec can be folded directly
into the spill instruction which results in broken
stores.
This only fixes the operand constraints; more codegen
work is required to avoid emitting the invalid
spills.
This sort of breaks the dbg.value test. Because the
register class of the s_load_dwordx2 changes, there
is a copy to SReg_64, and the copy is the operand
of dbg_value. The copy is later dead, and removed
from the dbg_value.
llvm-svn: 288191
|
|
|
|
| |
llvm-svn: 288190
|
|
|
|
|
|
|
|
|
|
| |
Use vaddr/vdst for the same purposes.
This also fixes a bug in SIInsertWaits for the
operand check. The stored value operand is currently called
data0 in the single offset case, not data.
llvm-svn: 288188
|
|
|
|
| |
llvm-svn: 288187
|
|
|
|
|
|
|
|
|
|
|
| |
It isn't generally safe to fold the frame index
directly into the operand since it will possibly
not be an inline immediate after it is expanded.
This surprisingly seems to produce better code, since
the FI doesn't prevent folding other immediate operands.
llvm-svn: 288185
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the logic for when to fold immediates to
consider the destination operand rather than the
source of the materializing mov instruction.
No change yet, but this will allow for correctly handling
i16/f16 operands. Since 32-bit moves are used to materialize
constants for these, the same bitvalue will not be in the
register.
llvm-svn: 288184
|
|
|
|
|
|
|
|
|
|
|
|
| |
branches
Reviewers: arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D23417
llvm-svn: 288095
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the createGenericSchedLive() function that constructs the
default scheduler available for the public API. This should help when
you want to get a scheduler and the default list of DAG mutations.
This also shrinks the list of default DAG mutations:
{Load|Store}ClusterDAGMutation and MacroFusionDAGMutation are no longer
added by default. Targets can easily add them if they need them. It also
makes it easier for targets to add alternative/custom macrofusion or
clustering mutations while staying with the default
createGenericSchedLive(). It also saves the callback back and forth in
TargetInstrInfo::enableClusterLoads()/enableClusterStores().
Differential Revision: https://reviews.llvm.org/D26986
llvm-svn: 288057
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
copies
Codegen prepare sinks comparisons close to a user if we have only one register
for conditions. For AMDGPU we have many SGPRs capable of holding vector conditions.
Changed the BE to report that we have many condition registers. That way the IR LICM
pass will hoist an invariant comparison out of a loop and codegen prepare will not
sink it.
With that done, a condition is calculated in one block and used in another.
The current behavior is to store a workitem's condition in a VGPR using v_cndmask_b32
and then restore it with yet another v_cmp instruction from that v_cndmask's
result. To mitigate the issue, propagation of the source SGPR pair in place of the
v_cmp is implemented. An additional side effect is that we may consume fewer VGPRs
at the cost of more SGPRs when multiple conditions need to be held, and
that is a clear win in most cases.
Differential Revision: https://reviews.llvm.org/D26114
llvm-svn: 288053
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D26724
llvm-svn: 287962
|
|
|
|
|
|
| |
suggested as a better solution by Matt
llvm-svn: 287942
|
|
|
|
|
|
| |
This reverts commit 4404d0d6e354e80dd7f8f0a0e12d8ad809cf007e.
llvm-svn: 287936
|
|
|
|
|
|
| |
This reverts commit 79d4f8b8b1ce430c3d5dac4fc72a9eebaed24fe1.
llvm-svn: 287935
|
|
|
|
|
|
| |
This reverts commit e834ce5976567575621901fb967b8018b9916d71.
llvm-svn: 287934
|
|
|
|
|
|
| |
This reverts commit 057bbbe4ae170247ba37f08f2e70ef185267d1bb.
llvm-svn: 287933
|
|
|
|
|
|
| |
This reverts commit 124ad83dae04514f943902446520c859adee0e96.
llvm-svn: 287932
|
|
|
|
|
|
| |
This reverts commit f18de36554eb22416f8ba58e094e0272523a4301.
llvm-svn: 287931
|
|
|
|
|
|
| |
This reverts commit a5a179ffd94fd4136df461ec76fb30f04afa87ce.
llvm-svn: 287930
|
|
|
|
| |
llvm-svn: 287844
|
|
|
|
|
|
|
|
|
|
|
|
| |
The scavenger was not passed if requiresFrameIndexScavenging was
enabled. I need to be able to test for the availability of an
unallocatable register here, so I can't create a virtual register for
it.
It might be better to just always use the scavenger and stop
creating virtual registers.
llvm-svn: 287843
|
|
|
|
|
|
| |
Since m0 isn't allocatable it should never be spilled anymore.
llvm-svn: 287842
|
|
|
|
|
|
|
|
|
|
|
| |
m0 may need to be written for spill code, so
we don't want general code uses relying on the
value stored in it.
This introduces a few code quality regressions where copies
from m0 are not coalesced into copies of a copy of m0.
llvm-svn: 287841
|
|
|
|
|
|
|
| |
Move code down to use, reorder to avoid hard to follow
immediate folding logic.
llvm-svn: 287818
|
|
|
|
|
|
| |
The uint8_t was printed as a char which didn't really work.
llvm-svn: 287817
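
The pitfall described above can be reproduced outside LLVM. The following is a minimal sketch, assuming a standard C++ stream as a stand-in for whatever printer the commit touched; the helper names formatAsChar and formatAsNumber are hypothetical and do not come from the patch.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Naive formatting: on mainstream platforms uint8_t is an alias for
// unsigned char, so operator<< emits the byte as a character.
std::string formatAsChar(std::uint8_t Value) {
  std::ostringstream OS;
  OS << Value;
  return OS.str();
}

// Widening to unsigned first selects the integer overload, so the
// byte is emitted as a decimal number.
std::string formatAsNumber(std::uint8_t Value) {
  std::ostringstream OS;
  OS << static_cast<unsigned>(Value);
  return OS.str();
}
```

With these helpers, the value 65 formats as the letter A in the first case and as the digits 65 in the second, which is the kind of mismatch the commit message refers to.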
|
|
|
|
| |
llvm-svn: 287808
|
|
|
|
|
|
|
| |
In the scalar case, there's no reason to add an additional
def of the same register.
llvm-svn: 287807
|
|
|
|
|
|
|
|
|
|
| |
The size and offset were wrong. The size of the object was
being used for the size of the access, when here it is really
being split into 4-byte accesses. The underlying object size
is set in the MachinePointerInfo, which also didn't have the
offset set.
llvm-svn: 287806
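
As an illustration of the size/offset bookkeeping described above, here is a minimal sketch in plain C++. The Access struct and splitIntoDwordAccesses are hypothetical stand-ins, not LLVM's MachineMemOperand or MachinePointerInfo API, and the sketch assumes the object size is a multiple of 4 bytes.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the size/offset pair recorded per access.
struct Access {
  std::uint64_t Offset; // byte offset from the start of the object
  std::uint64_t Size;   // bytes touched by this single access
};

// Splits an object of ObjectSize bytes into consecutive 4-byte accesses.
// Each piece records its own offset and a size of 4, rather than
// reusing the size of the whole object.
// Assumption: ObjectSize is a multiple of 4; any remainder is ignored.
std::vector<Access> splitIntoDwordAccesses(std::uint64_t ObjectSize) {
  constexpr std::uint64_t EltSize = 4;
  std::vector<Access> Pieces;
  for (std::uint64_t Off = 0; Off + EltSize <= ObjectSize; Off += EltSize)
    Pieces.push_back({Off, EltSize});
  return Pieces;
}
```

For a 16-byte object this yields four accesses at offsets 0, 4, 8 and 12, each of size 4.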
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26939
llvm-svn: 287608
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARM, Mips, and X86.
Summary:
* ARM is omitted from this patch because this check appears to expose bugs in this target.
* Mips is omitted from this patch because this check either detects bugs or deliberate
emission of instructions that don't satisfy their predicates. One deliberate
use is the SYNC instruction where the version with an operand is correctly
defined as requiring MIPS32 while the version without an operand is defined
as an alias of 'SYNC 0' and requires MIPS2.
* X86 is omitted from this patch because it doesn't use the tablegen-erated
MCCodeEmitter infrastructure.
Patches for ARM and Mips will follow.
Depends on D25617
Reviewers: tstellarAMD, jmolloy
Subscribers: wdng, jmolloy, aemerson, rengolin, arsenm, jyknight, nemanjai, nhaehnle, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D25618
llvm-svn: 287439
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26862
llvm-svn: 287389
|
|
|
|
| |
llvm-svn: 287362
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The 32-bit instructions don't zero the high 16-bits like the 16-bit
instructions do.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D26828
llvm-svn: 287342
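
The difference noted in the summary can be modelled with ordinary integer arithmetic. This is only a sketch of the register contents, assuming a 16-bit value held in the low half of a 32-bit register; the function names are hypothetical and it does not reproduce the actual AMDGPU instruction semantics.

```cpp
#include <cstdint>

// Models a 16-bit add performed by a 32-bit instruction: the full
// 32-bit result is kept, so a carry out of bit 15 lands in the high half.
std::uint32_t add16Via32(std::uint32_t A, std::uint32_t B) {
  return A + B;
}

// Models a 16-bit instruction that zeroes the high 16 bits of the
// destination: only the low half of the sum survives.
std::uint32_t add16Native(std::uint32_t A, std::uint32_t B) {
  return (A + B) & 0xFFFFu;
}
```

Adding 1 to 0xFFFF gives 0x10000 in the first model and 0 in the second, so code that relies on the high half being zero cannot assume it after a 32-bit operation.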
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The addr64-based legalization is incorrect for MUBUF instructions with idxen
set as well as for BUFFER_LOAD/STORE_FORMAT_* instructions. This affects
e.g. shaders that access buffer textures.
Since we never actually need the addr64-legalization in shaders, this patch
takes the easy route and keys off the calling convention. If this ever
affects (non-OpenGL) compute, the type of legalization needs to be chosen
based on some TSFlag.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98664
Reviewers: arsenm, tstellarAMD
Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D26747
llvm-svn: 287339
|
|
|
|
|
|
| |
Identified by Pedro Giffuni in PR27636.
llvm-svn: 287333
|
|
|
|
| |
llvm-svn: 287311
|
|
|
|
|
|
|
| |
There are still crashes on non-MVT types in other
places.
llvm-svn: 287310
|
|
|
|
|
|
|
|
| |
This reverts commit r287146.
This breaks a few conformance tests.
llvm-svn: 287233
|
|
|
|
| |
llvm-svn: 287203
|
|
|
|
| |
llvm-svn: 287201
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26732
llvm-svn: 287199
|
|
|
|
|
|
|
| |
This fixes a probably unintended divergence from the default
scheduler behavior.
llvm-svn: 287146
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
1. Don't try to copy values to and from the same register class.
2. Replace copies of registers with immediate values with v_mov/s_mov
instructions.
The main purpose of this change is to make MachineSink do a better job of
determining when it is beneficial to split a critical edge, since the pass
assumes that copies will become move instructions.
This prevents a regression in uniform-cfg.ll if we enable critical edge
splitting for AMDGPU.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: https://reviews.llvm.org/D23408
llvm-svn: 287131
|