| Commit message | Author | Age | Files | Lines |
| |
This switches to the workaround that HSA defaults to
for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
llvm-svn: 292982
|
Limit register coalescer by not allowing it to artificially increase
size of registers beyond dword. Such super-registers are in fact
register sequences and not distinct HW registers.
With more super-regs we would need to allocate adjacent registers
and constrain regalloc more than needed. Moreover, our super
registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2,
VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc., which complicates register
allocation even more, resulting in excessive spilling.
Differential Revision: https://reviews.llvm.org/D28782
llvm-svn: 292413
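The overlap described above can be illustrated with a small sketch. This is not code from the patch; `vgpr_tuples` and `overlaps` are hypothetical helpers that only mimic the naming scheme quoted in the message.

```python
def vgpr_tuples(num_regs, width):
    """List every run of `width` adjacent registers, named like VGPR0_VGPR1_VGPR2."""
    return [
        "_".join(f"VGPR{start + k}" for k in range(width))
        for start in range(num_regs - width + 1)
    ]


def overlaps(a, b):
    """Two tuples overlap when they share at least one 32-bit register."""
    return bool(set(a.split("_")) & set(b.split("_")))


# Hypothetical 5-register file with 3-register tuples.
tuples = vgpr_tuples(num_regs=5, width=3)
overlapping_pairs = [
    (a, b)
    for i, a in enumerate(tuples)
    for b in tuples[i + 1:]
    if overlaps(a, b)
]
```

With five registers there are only three tuples, yet every pair of them shares a register, so assigning one tuple rules out the other two.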
|
Differential Revision: https://reviews.llvm.org/D25975
llvm-svn: 286753
|
Patch By: Wei Ding
Differential Revision: https://reviews.llvm.org/D18049
llvm-svn: 286464
|
This reverts commit r285939 and r285948. These broke some conformance tests.
llvm-svn: 285995
|
Patch By: Wei Ding
Differential Revision: https://reviews.llvm.org/D18049
llvm-svn: 285939
|
If the literal is being folded into src0, it doesn't matter
if it's an SGPR because it's being replaced with the literal.
Also fixes initially selecting 32-bit versions of some instructions
which also confused commuting.
llvm-svn: 281117
|
There was a combine before to handle the simple copy case.
Split this into handling loads and stores separately.
We might want to change how this handles some of the vector
extloads, since this can result in large code size increases.
llvm-svn: 274394
|
Allocating larger register classes first should give better allocation
results (and more importantly for myself, make the lit tests more stable
with respect to scheduler changes).
Patch by Matthias Braun
llvm-svn: 270312
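A toy model of why allocation order can matter, assuming a first-fit allocator over a small register file with one reserved register. Everything here (`first_fit`, `allocate`, the 8-register file, the reserved register) is a hypothetical illustration, not LLVM's allocator.

```python
def first_fit(free, width):
    """Return the lowest start index of `width` adjacent free registers, or None."""
    for start in range(len(free) - width + 1):
        if all(free[start:start + width]):
            return start
    return None


def allocate(widths, num_regs, reserved=()):
    """Place each request in order; return the placements, or None if one fails."""
    free = [i not in reserved for i in range(num_regs)]
    placed = []
    for width in widths:
        start = first_fit(free, width)
        if start is None:
            return None  # in a real allocator this would mean a spill
        for i in range(start, start + width):
            free[i] = False
        placed.append((start, width))
    return placed


requests = [2, 2, 3]  # register-tuple widths to place
small_first = allocate(sorted(requests), num_regs=8, reserved={3})
large_first = allocate(sorted(requests, reverse=True), num_regs=8, reserved={3})
```

In this toy setup, placing the small requests first fragments the file so that the 3-wide request no longer fits, while placing the widest request first succeeds.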
|
Summary:
This includes a hazard recognizer implementation to replace some of
the hazard handling we had during frame index elimination.
Reviewers: arsenm
Subscribers: qcolombet, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18602
llvm-svn: 268143
|
Summary:
The goal is for each operand type to have its own parse function and
at the same time share common code for tracking state as different
instruction types share operand types (e.g. glc/glc_flat, etc).
Introduce parseAMDGPUOperand which can parse any optional operand.
DPP and Clamp/OMod have custom handling for now. Sam also suggested
having a class hierarchy for operand types instead of a table. This
can be done in a separate change.
Remove parseVOP3OptionalOps, parseDS*OptionalOps, parseFlatOptionalOps,
parseMubufOptionalOps, parseDPPOptionalOps.
Reduce the number of AsmOperand and MatchClass definitions by using a common base class.
Rename AsmMatcher/InstPrinter methods accordingly.
Print immediate type when printing parsed immediate operand.
Use 'off' if offset/index register is unused instead of skipping it to make it more readable (also agreed with SP3).
Update tests.
Reviewers: tstellarAMD, SamWot, artem.tamazov
Subscribers: qcolombet, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19584
llvm-svn: 268015
|
On AMDGPU, where i64 operations are often bitcast to v2i32
and back, this pattern shows up regularly and breaks some
expected combines on i64, such as load width reduction.
This fixes some test failures in a future commit when i64 loads
are changed to promote.
llvm-svn: 262397
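A minimal sketch of the value-level facts behind this, assuming a little-endian layout. The helper names are hypothetical and this models only the arithmetic, not the DAG combine itself: an i64 to v2i32 round trip is a no-op, and truncating a 64-bit load to its low half gives the same value as a 32-bit load from the same address.

```python
import struct


def bitcast_i64_to_v2i32(value):
    """Reinterpret 64 bits as two 32-bit lanes (lane 0 = low half, little-endian)."""
    return struct.unpack("<2I", struct.pack("<Q", value))


def bitcast_v2i32_to_i64(lanes):
    """Reinterpret two 32-bit lanes as one 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<2I", *lanes))[0]


def load_i64(memory, offset):
    return struct.unpack_from("<Q", memory, offset)[0]


def load_i32(memory, offset):
    return struct.unpack_from("<I", memory, offset)[0]


memory = struct.pack("<Q", 0x1122334455667788)
wide = load_i64(memory, 0)
round_trip = bitcast_v2i32_to_i64(bitcast_i64_to_v2i32(wide))
truncated = wide & 0xFFFFFFFF   # trunc of the 64-bit load
narrow = load_i32(memory, 0)    # the width-reduced load
```

Because the round trip changes nothing, a combine that looks only for `trunc (load i64)` can miss the same pattern when a bitcast pair sits in between.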
|
64-bit shifts are very slow on some subtargets.
llvm-svn: 258090
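The message does not say which transform the commit applies. As one illustration of how a 64-bit shift can be avoided, the sketch below (hypothetical helpers, constant shift amounts of 32..63 only) rebuilds the result from 32-bit halves.

```python
MASK32 = 0xFFFFFFFF


def shl64_by_const(lo, hi, amt):
    """Shift the 64-bit value hi:lo left by amt (32..63) with 32-bit ops.

    Returns (new_lo, new_hi). The old high half is shifted out entirely.
    """
    if not 32 <= amt <= 63:
        raise ValueError("sketch only covers shift amounts 32..63")
    return 0, (lo << (amt - 32)) & MASK32


def lshr64_by_const(lo, hi, amt):
    """Logical right shift of hi:lo by amt (32..63) with 32-bit ops."""
    if not 32 <= amt <= 63:
        raise ValueError("sketch only covers shift amounts 32..63")
    return hi >> (amt - 32), 0


def join(lo, hi):
    """Reassemble a 64-bit integer from its two 32-bit halves."""
    return (hi << 32) | lo
```

For these shift amounts, each 64-bit shift becomes a single 32-bit shift plus a constant zero for the other half.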
|
They can be loaded and stored, so count them as legal. This is
mostly to fix a number of common cases for load/store merging.
llvm-svn: 254086
|
The one regression in the builtin tests is in the read2 test which now
(again) has many extra copies, but this should be solved once the pass
is replaced with a DAG combine.
llvm-svn: 253974
|
Allow a target to do something other than search for copies
that will avoid cross register bank copies.
Implement for SI by only rewriting the most basic copies,
so it should look through anything like a subregister extract.
I'm not entirely satisfied with this because it seems like
eliminating a reg_sequence that isn't fully used should work
generically for all targets without them having to override
something. However, it seems to be tricky to have a simple
implementation of this without rewriting to invalid kinds
of subregister copies on some targets.
I'm not sure if there is currently a generic way to easily check
if a subregister index would be valid for the current use.
The current set of TargetRegisterInfo::get*Class functions don't
quite behave like I would expect (e.g. getSubClassWithSubReg
returns the maximal register class rather than the minimal), so
I'm not sure how to make the generic test keep searching if
SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making
the default implementation check for simple copies breaks
a variety of ARM and x86 tests by producing illegal subregister uses.
The ARM tests are not actually changed, since they should still be using
the same sharesSameRegisterFile implementation; this just relaxes
them to not check for specific registers.
llvm-svn: 248478
|
Currently this hits an assert that extload should
always be supported, which assumes integer extloads.
This moves a hack out of SI's argument lowering and
is covered by existing tests.
llvm-svn: 247113
|
llvm-svn: 239657
|