| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 274954
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These have been replaced with TableGen code (except for isConstantLoad,
which is still used for R600). The queries were broken for cases
where MemOperand was a PseudoSourceValue.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21684
llvm-svn: 274561
|
|
|
|
|
|
|
|
|
| |
This moves of the r600 logic out of isGlobalLoad() and into the
TableGen files.
Differential Revision: http://reviews.llvm.org/D21710
llvm-svn: 274527
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The isGlobalLoad() query was returning true for constant address space loads
with memory types less than 32-bits, which is wrong. This logic has been
replaced with PatFrag in the TableGen files, to provide the same functionality.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21696
llvm-svn: 274521
|
|
|
|
|
|
|
|
|
| |
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.
llvm-svn: 273652
|
|
|
|
|
|
|
| |
Mostly removing dead code. Apparently gcc's warning
for unused functions is better
llvm-svn: 273363
|
|
|
|
|
|
| |
Found by gcc 6.
llvm-svn: 273322
|
|
|
|
|
|
| |
Found by gcc 6.
llvm-svn: 273303
|
|
|
|
| |
llvm-svn: 273131
|
|
|
|
| |
llvm-svn: 273129
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes two related bugs. First, the generic optimization passes
unfortunately generate negative constant offsets but the hardware treats
SOffset as an unsigned value.
Second, there is a hardware bug on SI and CI, where address clamping in MUBUF
instructions does not work correctly when SOffset is larger than the buffer
size. This patch works around this bug by never using SOffset.
An alternative workaround would be to do the clamping manually when SOffset
is too large, but generating the required code sequence during instruction
selection would be rather involved, and in any case the resulting code would
probably be worse.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D21326
llvm-svn: 272761
|
|
|
|
|
|
|
|
| |
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
llvm-svn: 272512
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes a bug with ds_*permute instructions where if it was passed a
constant address, then the offset operand would get assigned a register
operand instead of an immediate.
Reviewers: scchan, arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19994
llvm-svn: 272349
|
|
|
|
|
|
|
|
| |
The flat atomics could already be selected, but only
when using flat instructions for global memory. Add
patterns for flat addresses.
llvm-svn: 272345
|
|
|
|
|
|
|
|
|
|
| |
This was using extract_subreg sub0 to extract the low register
of the result instead of sub0_sub1, producing an invalid copy.
There doesn't seem to be a way to use the compound subreg indices
in tablegen since those are generated, so manually select it.
llvm-svn: 272344
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19792
llvm-svn: 269479
|
|
|
|
|
|
|
|
|
|
|
| |
- Where we were returning a node before, call ReplaceNode instead.
- Where we would return null to fall back to another selector, rename
the method to try* and return a bool for success.
- Where we were calling SelectNodeTo, just return afterwards.
Part of llvm.org/pr26808.
llvm-svn: 269349
|
|
|
|
| |
llvm-svn: 268931
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.
We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.
Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.
llvm-svn: 268693
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that unaligned access expansion should not attempt
to produce i64 accesses, we can remove the hack in
PreprocessISelDAG where this is done.
This allows splitting i64 private accesses while
allowing the new add nodes indexing the vector components
can be folded with the base pointer arithmetic.
llvm-svn: 268293
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These instructions can add an immediate offset to the address, like other
ds instructions.
Reviewers: arsenm
Subscribers: arsenm, scchan
Differential Revision: http://reviews.llvm.org/D19233
llvm-svn: 268043
|
|
|
|
| |
llvm-svn: 267452
|
|
|
|
| |
llvm-svn: 267244
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fully solves the problem where the StructurizeCFG pass does not
consider the same branches as uniform as the SIAnnotateControlFlow pass.
The patch in D19013 helps with this problem, but is not sufficient
(and, interestingly, causes a "regression" with one of the existing
test cases).
No tests included here, because tests in D19013 already cover this.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19018
llvm-svn: 266346
|
|
|
|
|
|
|
| |
These are different than atomicrmw add 1 because they have
an additional input value to clamp the result.
llvm-svn: 266074
|
|
|
|
|
|
|
|
|
|
| |
Standard load/store instructions with GLC bit set.
Reviewers: tstellardAMD, arsenm
Differential Revision: http://reviews.llvm.org/D18760
llvm-svn: 265709
|
|
|
|
| |
llvm-svn: 264255
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Strengthen tests of storing frame indices.
Right now this just creates irrelevant scheduling changes.
We don't want to have multiple frame index operands
on an instruction. There seem to be various assumptions
that at least the same frame index will not appear twice
in the LocalStackSlotAllocation pass.
There's no reason to have this happen, and it just
makes it easy to introduce bugs where the immediate
offset is appplied to the storing instruction when it should
really be applied to the value being stored as a separate
add.
This might not be sufficient. It might still be problematic
to have an add fi, fi situation, but that's even less unlikely
to happen in real code.
llvm-svn: 264200
|
|
|
|
|
|
|
| |
These instructions do not have the same negative base
address problem that DS instructions do on SI.
llvm-svn: 263964
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We cannot easily deduce that an offset is in an SGPR, but the Mesa frontend
cannot easily make use of an explicit soffset parameter either. Furthermore,
it is likely that in the future, LLVM will be in a better position than the
frontend to choose an SGPR offset if possible.
Since there aren't any frontend uses of these intrinsics in upstream
repositories yet, I would like to take this opportunity to change the
intrinsic signatures to a single offset parameter, which is then selected
to immediate offsets or voffsets using a ComplexPattern.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18218
llvm-svn: 263790
|
|
|
|
|
|
| |
Patch by Richard Thomson
llvm-svn: 262536
|
|
|
|
|
|
|
| |
Don't do an expensive computeKnownBits call when we
can do the cheap check for legal offsets first.
llvm-svn: 261720
|
|
|
|
| |
llvm-svn: 260784
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D16603
llvm-svn: 260765
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-commit of r258951 after fixing layering violation.
The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.
There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.
llvm-svn: 259498
|
|
|
|
| |
llvm-svn: 259045
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-commit of r258951 after fixing layering violation.
The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.
In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.
Differential Revision: http://reviews.llvm.org/D16590
llvm-svn: 259035
|
|
|
|
|
|
|
|
|
|
|
| |
features"
It broke layering violation in LLVMIR.
clang r258950 "Add backend dignostic printer for unsupported features"
llvm r258951 "Refactor backend diagnostics for unsupported features"
llvm-svn: 259016
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.
There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.
The implementation of DiagnosticInfoUnsupported::print must be in
lib/Codegen rather than in the existing file in lib/IR/ to avoid
introducing a dependency from IR to CodeGen.
Differential Revision: http://reviews.llvm.org/D16590
llvm-svn: 258951
|
|
|
|
| |
llvm-svn: 258350
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For some reason doing executing an MUBUF instruction with the addr64
bit set and a zero base pointer in the resource descriptor causes
the memory operation to be dropped when the shader is executed using
the HSA runtime.
This kind of MUBUF instruction is commonly used when the pointer is
stored in VGPRs. The base pointer field in the resource descriptor
is set to zero and and the pointer is stored in the vaddr field.
This patch resolves the issue by only using flat instructions for
global memory operations when targeting HSA. This is an overly
conservative fix as all other configurations of MUBUF instructions
appear to work.
NOTE: re-commit by fixing a failure in Codegen/AMDGPU/llvm.dbg.value.ll
Reviewers: tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15543
llvm-svn: 256282
|
|
|
|
|
|
|
|
| |
This reverts commit r256273.
It broke CodeGen/AMDGPU/llvm.dbg.value.ll
llvm-svn: 256275
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For some reason doing executing an MUBUF instruction with the addr64
bit set and a zero base pointer in the resource descriptor causes
the memory operation to be dropped when the shader is executed using
the HSA runtime.
This kind of MUBUF instruction is commonly used when the pointer is
stored in VGPRs. The base pointer field in the resource descriptor
is set to zero and and the pointer is stored in the vaddr field.
This patch resolves the issue by only using flat instructions for
global memory operations when targeting HSA. This is an overly
conservative fix as all other configurations of MUBUF instructions
appear to work.
Reviewers: tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15543
llvm-svn: 256273
|
|
|
|
| |
llvm-svn: 254469
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15050
llvm-svn: 254427
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we know we have stack objects, we reserve the registers
that the private buffer resource and wave offset are passed
and use them directly.
If not, reserve the last 5 SGPRs just in case we need to spill.
After register allocation, try to pick the next available registers
instead of the last SGPRs, and then insert copies from the inputs
to the reserved registers in the progloue.
This also only selectively enables all of the input registers
which are really required instead of always enabling them.
llvm-svn: 254331
|
|
|
|
| |
llvm-svn: 254330
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It does not work because of emergency stack slots.
This pass was supposed to eliminate dummy registers for the
spill instructions, but the register scavenger can introduce
more during PrologEpilogInserter, so some would end up
left behind if they were needed.
The potential for spilling the scratch resource descriptor
and offset register makes doing something like this
overly complicated. Reserve registers to use for the resource
descriptor and use them directly in eliminateFrameIndex.
Also removes creating another scratch resource descriptor
when directly selecting scratch MUBUF instructions.
The choice of which registers are reserved is temporary.
For now it attempts to pick the next available registers
after the user and system SGPRs.
llvm-svn: 254329
|
|
|
|
| |
llvm-svn: 252675
|
|
|
|
| |
llvm-svn: 251994
|