| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
This patch adds support for S_ANDN2, S_ORN2 32-bit and 64-bit instructions and adds splits to move them to the vector unit (for which there is no equivalent instruction). It modifies the way that the more complex scalar instructions are lowered to vector instructions by first breaking them down to sequences of simpler scalar instructions which are then lowered through the existing code paths. The pattern for S_XNOR has also been updated to apply inversion to one input rather than the output of the XOR as the result is equivalent and may allow leaving the NOT instruction on the scalar unit.
A new tests for NAND, NOR, ANDN2 and ORN2 have been added, and existing tests now hit the new instructions (and have been modified accordingly).
Differential: https://reviews.llvm.org/D54714
llvm-svn: 347877
|
|
|
|
|
|
| |
Leftover from before amdgcn/r600 split.
llvm-svn: 310277
|
|
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D34407
llvm-svn: 307097
|
|
|
|
|
|
|
|
|
| |
It broke a testcase.
Failing Tests (1):
LLVM :: CodeGen/AMDGPU/alignbit-pat.ll
llvm-svn: 307054
|
|
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D34407
llvm-svn: 307026
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.
Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).
llvm-svn: 298444
|
|
|
|
|
|
|
|
|
|
| |
COPY was lacking a scheduling class, define it to avoid regressions in
the upcoming change to the bidirectional MachineScheduler. Approved by
tstellar on IRC.
Differential Revision: http://reviews.llvm.org/D21540
llvm-svn: 273751
|
|
|
|
| |
llvm-svn: 266506
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch contains a few improvements to the model, including:
- Using a single resource with a defined buffers size for each memory unit.
- Setting the IssueWidth correctly.
- Fixing latency values for memory instructions.
shader-db stats:
16429 shaders in 3231 tests
Totals:
SGPRS: 318232 -> 312328 (-1.86 %)
VGPRS: 208996 -> 209346 (0.17 %)
Code Size: 7147044 -> 7166440 (0.27 %) bytes
LDS: 83 -> 83 (0.00 %) blocks
Scratch: 1862656 -> 1459200 (-21.66 %) bytes per wave
Max Waves: 49182 -> 49243 (0.12 %)
Wait states: 0 -> 0 (0.00 %)A
Differential Revision: http://reviews.llvm.org/D18453
llvm-svn: 264877
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r252565.
This also includes the revert of the commit mentioned below in order to
avoid breaking tests in AMDGPU:
Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64"
This reverts commit r252674.
llvm-svn: 252956
|
|
|
|
| |
llvm-svn: 252674
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This basic combine was surprisingly missing.
AMDGPU legalizes many operations in terms of 32-bit vector components,
so not doing this results in many extra copies and subregister extracts
that need to be cleaned up later.
InstCombine already does this for the hasOneUse case. The target hook
is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn
from a vector materialize repeated immediate instruction to a constant
vector load with more scalar copies from it.
llvm-svn: 250129
|
|
llvm-svn: 239657
|