| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D26010
llvm-svn: 294692
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix build when global-isel is disabled and fix a warning.
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D26730
llvm-svn: 293551
|
|
|
|
|
|
|
|
| |
This reverts commit r293503.
Revert while I investigate some of the buildbot failures.
llvm-svn: 293509
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D26730
llvm-svn: 293503
|
|
|
|
|
|
|
|
|
|
|
| |
Accomplishes what r292982 was supposed to, which ended up
only really making the necessary test changes.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
llvm-svn: 293310
|
|
|
|
|
|
|
|
|
|
|
| |
This switches to the workaround that HSA defaults to
for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
llvm-svn: 292982
|
|
|
|
|
|
|
| |
The same control register controls both, and are set to
the same defaults. Keep the old names around as aliases.
llvm-svn: 292837
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: vpykhtin, artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D28900
llvm-svn: 292596
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: CI doesn't have XNACK.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27175
llvm-svn: 289263
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This frees 2 additional scalar registers.
These are results from all of my 3 patches combined:
Polaris:
Spilled SGPRs: 2231 -> 1517 (-32.00 %)
Tonga:
Spilled SGPRs: 3829 -> 2608 (-31.89 %)
Spilled VGPRs: 100 -> 84 (-16.00 %)
Tonga even spills SGPRs via VGPRs to scratch. That's a compute shader
limited to 64 VGPRs.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27151
llvm-svn: 289262
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D25975
llvm-svn: 286753
|
|
|
|
|
|
|
|
| |
Patch By: Wei Ding
Differential Revision: https://reviews.llvm.org/D18049
llvm-svn: 286464
|
|
|
|
|
|
| |
This reverts commit r285939 and r285948. These broke some conformance tests.
llvm-svn: 285995
|
|
|
|
|
|
|
|
| |
Patch By: Wei Ding
Differential Revision: https://reviews.llvm.org/D18049
llvm-svn: 285939
|
|
|
|
| |
llvm-svn: 285659
|
|
|
|
|
|
| |
I'm guessing at how it is supposed to be printed
llvm-svn: 285490
|
|
|
|
|
|
|
|
|
|
| |
Also add glc bit to the scalar loads since they exist on VI
and change the caching behavior.
This currently has an assembler bug where the glc bit is incorrectly
accepted on SI/CI which do not have it.
llvm-svn: 285463
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add missing ISA versions 7.0.2/8.0.4/8.1.0. to backend.
Refactor processor definition to use ISA version features.
Fixed ISA version for stoney.
Based on Laurent Morichetti's patch.
Differential Revision: https://reviews.llvm.org/D25919
llvm-svn: 285210
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The hardware doesn't support this.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D25523
llvm-svn: 284257
|
|
|
|
|
|
|
| |
VI added a second method of indexing into VGPRs
besides using v_movrel*
llvm-svn: 284027
|
|
|
|
|
|
|
|
|
|
| |
Differential Revision:
http://reviews.llvm.org/D25454
Reviewers:
tstellarAMD
llvm-svn: 283893
|
|
|
|
|
|
|
|
|
|
| |
subtarget
This is a prerequisite for coming waitcnt changes
Differential Revision: https://reviews.llvm.org/D24939
llvm-svn: 282489
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
variant
Summary: This removes disabled instructions from match tables so we will not match them at all.
Reviewers: tstellarAMD, vpykhtin, artem.tamazov
Subscribers: wdng, nhaehnle, arsenm
Differential Revision: https://reviews.llvm.org/D24452
llvm-svn: 281216
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Split assembler match table in 4 tables with assembler variants:
Default - all instructions except VOP3, SDWA and DPP
- VOP3
- SDWA
- DPP
First match Default table then VOP3, SDWA and DPP.
Reviewers: tstellarAMD, artem.tamazov, vpykhtin
Subscribers: arsenm, wdng, nhaehnle, AMDGPU
Differential Revision: https://reviews.llvm.org/D24252
llvm-svn: 281023
|
|
|
|
| |
llvm-svn: 274398
|
|
|
|
| |
llvm-svn: 273937
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the kernel code header
Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.
Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
- offset 0: work group ID x
- offset 4: work group ID y
- offset 8: work group ID z
- offset 16: work item ID x
- offset 20: work item ID y
- offset 24: work item ID z
Set
- amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
- amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
- amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled
Differential Revision: http://reviews.llvm.org/D20335
llvm-svn: 273769
|
|
|
|
|
|
|
|
| |
The only real reason to use it is for testing, so replace
it with a command line option instead of a potentially function
dependent feature.
llvm-svn: 273653
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D20081
llvm-svn: 270594
|
|
|
|
|
|
|
|
|
| |
- Insert one nop for each high level statement instead of two
- Do not insert nop before prologue
Differential Revision: http://reviews.llvm.org/D20215
llvm-svn: 269452
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19235
llvm-svn: 267563
|
|
|
|
|
|
|
|
|
|
|
| |
Also,
- Skip pass if machine module does not have debug info
- Minor comment changes
- Added test
Differential Revision: http://reviews.llvm.org/D19079
llvm-svn: 266626
|
|
|
|
|
|
|
|
|
|
|
|
| |
The maximum private allocation for the whole GPU is 4G,
so the maximum possible index for a single workitem is the
maximum size divided by the smallest granularity for a dispatch.
This increases the number of known zero high bits, which
enables more offset folding. The maximum private size per
workitem with this is 128M but may be smaller still.
llvm-svn: 262153
|
|
|
|
|
|
|
| |
This will be more useful for marking builtins acceptable for which
subtargets.
llvm-svn: 262121
|
|
|
|
|
|
|
|
|
|
| |
This matches the behavior of the HSAIL clock instruction.
s_realmemtime is used if the subtarget supports it, and falls
back to s_memtime if not.
Also introduces new intrinsics for each of s_memtime / s_memrealtime.
llvm-svn: 262119
|
|
|
|
|
|
|
|
|
| |
Introduce a subtarget feature for this, and leave the default with
the current behavior which assumes up to 16-byte loads/stores can
be used. The field also seems to have the ability to be set to 2 bytes,
but I'm not sure what that would be used for.
llvm-svn: 260651
|
|
|
|
| |
llvm-svn: 259089
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a candidate for stable, along with all patches that add the "stoney"
processor.
Reviewers: tstellarAMD
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D16485
llvm-svn: 258922
|
|
|
|
|
|
|
|
|
|
| |
Make comments and indentation more consistent.
Rearrange a few things to be in a more consistent order,
such as organizing subtarget features from those describing
an actual device property, and those used as options.
llvm-svn: 258789
|
|
|
|
|
|
|
| |
This is a leftover from AMDIL that doesn't do anything
and doesn't belong here.
llvm-svn: 258606
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently the SI scheduler can be selected via command line option,
but it turned out it would be better if it was selectable via a Target Attribute.
This patch adds "si-scheduler" attribute to the backend.
Reviewers: tstellarAMD, echristo
Subscribers: echristo, arsenm
Differential Revision: http://reviews.llvm.org/D16192
llvm-svn: 258386
|
|
|
|
| |
llvm-svn: 258085
|
|
|
|
| |
llvm-svn: 257666
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Enabling this feature will account for the two SGPRs used by the hardware
to store the XNACK_MASK physically.
The hardware only requires this reservation when the XNACK feature is
explicitly enabled. At some point, HSA will probably want to do that, but
it does increase SGPR register pressure, so leave it disabled by default
for now (but do add a small test).
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15869
llvm-svn: 256794
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15735
llvm-svn: 256360
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For some reason doing executing an MUBUF instruction with the addr64
bit set and a zero base pointer in the resource descriptor causes
the memory operation to be dropped when the shader is executed using
the HSA runtime.
This kind of MUBUF instruction is commonly used when the pointer is
stored in VGPRs. The base pointer field in the resource descriptor
is set to zero and and the pointer is stored in the vaddr field.
This patch resolves the issue by only using flat instructions for
global memory operations when targeting HSA. This is an overly
conservative fix as all other configurations of MUBUF instructions
appear to work.
NOTE: re-commit by fixing a failure in Codegen/AMDGPU/llvm.dbg.value.ll
Reviewers: tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15543
llvm-svn: 256282
|
|
|
|
|
|
|
|
| |
This reverts commit r256273.
It broke CodeGen/AMDGPU/llvm.dbg.value.ll
llvm-svn: 256275
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For some reason doing executing an MUBUF instruction with the addr64
bit set and a zero base pointer in the resource descriptor causes
the memory operation to be dropped when the shader is executed using
the HSA runtime.
This kind of MUBUF instruction is commonly used when the pointer is
stored in VGPRs. The base pointer field in the resource descriptor
is set to zero and and the pointer is stored in the vaddr field.
This patch resolves the issue by only using flat instructions for
global memory operations when targeting HSA. This is an overly
conservative fix as all other configurations of MUBUF instructions
appear to work.
Reviewers: tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15543
llvm-svn: 256273
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We can safely assume that the high bit of scratch offsets will never
be set, because this would require at least 128 GB of GPU memory.
Reviewers: arsenm
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11225
llvm-svn: 242433
|
|
|
|
|
|
|
|
| |
We don't have a good way to detect most situations where
DS offsets are usable on SI, so add an option to force using
them even if unsafe for debugging performance problems.
llvm-svn: 241462
|