| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D28970
Reviewer: Matt
llvm-svn: 292966
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Regalloc creates COPY instructions which do not formally use VALU.
As a result, v_mov instructions can end up displaced after an exec mask modification.
One pass that does this is SIOptimizeExecMasking, but other passes could
potentially do it too.
This patch adds a pass immediately after regalloc that adds an implicit exec
use operand to all VGPR copy instructions.
Differential Revision: https://reviews.llvm.org/D28874
llvm-svn: 292956
|
| |
|
|
| |
llvm-svn: 292893
|
| |
|
|
|
|
| |
This avoids stack usage.
llvm-svn: 292846
|
| |
|
|
|
|
|
| |
Fixes turning a 32-bit scalar load into an extending vector load
for AMDGPU when dynamically indexing a vector.
llvm-svn: 292842
|
| |
|
|
|
|
|
| |
The same control register controls both, and both are set to
the same defaults. Keep the old names around as aliases.
llvm-svn: 292837
|
| |
|
|
| |
llvm-svn: 292814
|
| |
|
|
|
|
| |
The colon is important.
llvm-svn: 292761
|
| |
|
|
|
|
|
|
|
|
|
| |
Add DUMMY_CHAIN SDNode to denote stores of interest
Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=28915
Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=30411
Differential Revision: https://reviews.llvm.org/D27964
llvm-svn: 292651
|
| |
|
|
| |
llvm-svn: 292567
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The inline spiller can decide to move a spill as early as possible in the basic block.
It will skip phis and labels, but we also need to make sure it skips instructions
in the basic block prologue which restore the exec mask.
Added an isPositionLike callback in TargetInstrInfo to detect instructions which
shall be skipped in addition to the common phis, labels, etc.
Differential Revision: https://reviews.llvm.org/D27997
llvm-svn: 292554
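The insertion-point search described above can be pictured with a toy model. This is only a sketch under stated assumptions: instructions are plain strings, and `first_spill_position`, `is_phi_or_label`, and `restores_exec` are hypothetical names invented for illustration, not LLVM APIs.

```python
def first_spill_position(block, is_phi_or_label, is_prologue):
    """Return the index of the first instruction a spill may be placed before.

    Leading phis/labels are always skipped; `is_prologue` stands in for a
    target callback that marks extra instructions which must also be skipped.
    """
    position = 0
    while position < len(block) and (
        is_phi_or_label(block[position]) or is_prologue(block[position])
    ):
        position += 1
    return position


# Hypothetical block: the third instruction restores the exec mask.
block = ["PHI", "LABEL", "S_OR_B64 exec", "V_ADD", "V_MUL"]

def is_phi_or_label(instr):
    return instr in ("PHI", "LABEL")

def restores_exec(instr):
    return instr.endswith("exec")

without_callback = first_spill_position(block, is_phi_or_label, lambda instr: False)
with_callback = first_spill_position(block, is_phi_or_label, restores_exec)
print(without_callback, with_callback)
```

Without the extra predicate the spill would land in front of the exec restore; with it, the spill lands after the prologue.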
|
| |
|
|
|
|
|
|
|
|
|
|
| |
For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem it also
applies there.
fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.
llvm-svn: 292473
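The signed-zero difference described above can be reproduced with ordinary IEEE-754 doubles. This is a small illustration only, assuming the default round-to-nearest mode; `sign_of` is a helper introduced here, not part of the patch.

```python
import math


def sign_of(value):
    """Return +1.0 or -1.0 from the sign bit, so -0.0 and 0.0 can be told apart."""
    return math.copysign(1.0, value)


x = 1.0
y = -x  # the problematic case: x == -y

negated_sum = -(x + y)     # x + y is +0.0, so negating it gives -0.0
distributed = (-x) + (-y)  # -1.0 + 1.0 gives +0.0

print(sign_of(negated_sum), sign_of(distributed))
```

The two results compare equal with `==`, which is why the difference only shows up through the sign bit.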
|
| |
|
|
|
|
|
|
| |
They seem to produce nonsense results when used.
This should be applied to the release branch.
llvm-svn: 292472
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Limit the register coalescer by not allowing it to artificially increase
the size of registers beyond a dword. Such super-registers are in fact
register sequences and not distinct HW registers.
With more super-regs we would need to allocate adjacent registers
and constrain regalloc more than needed. Moreover, our super-registers
overlap. For instance we have VGPR0_VGPR1_VGPR2,
VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4, etc., which complicates register
allocation even more, resulting in excessive spilling.
Differential Revision: https://reviews.llvm.org/D28782
llvm-svn: 292413
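To make the overlap concrete, the sketch below enumerates three-wide register tuples over a small register file and lists which ones share a register with a given tuple. The helper names `register_tuples` and `overlapping` are hypothetical and only model the naming scheme quoted above; they are not how the backend defines its register classes.

```python
def register_tuples(num_vgprs, width):
    """Name every run of `width` consecutive VGPRs, e.g. VGPR0_VGPR1_VGPR2."""
    return [
        "_".join(f"VGPR{base + i}" for i in range(width))
        for base in range(num_vgprs - width + 1)
    ]


def overlapping(tuples, name):
    """Return the tuples that share at least one VGPR with `name`."""
    parts = set(name.split("_"))
    return [t for t in tuples if t != name and parts & set(t.split("_"))]


tuples = register_tuples(5, 3)
print(tuples)
print(overlapping(tuples, "VGPR0_VGPR1_VGPR2"))
```

Even with only five registers, every three-wide tuple overlaps every other one, which hints at how allocating one such tuple constrains the rest.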
|
| |
|
|
| |
llvm-svn: 292328
|
| |
|
|
| |
llvm-svn: 292205
|
| |
|
|
|
|
|
|
| |
_RTN versions will be a lot more complicated
Differential Revision: https://reviews.llvm.org/D28067
llvm-svn: 292162
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28496
llvm-svn: 291954
|
| |
|
|
| |
llvm-svn: 291792
|
| |
|
|
| |
llvm-svn: 291790
|
| |
|
|
| |
llvm-svn: 291784
|
| |
|
|
| |
llvm-svn: 291779
|
| |
|
|
| |
llvm-svn: 291778
|
| |
|
|
| |
llvm-svn: 291777
|
| |
|
|
|
|
| |
Patch mostly by Fiona Glaser
llvm-svn: 291733
|
| |
|
|
|
|
| |
Patch mostly by Fiona Glaser
llvm-svn: 291732
|
| |
|
|
|
|
| |
Patch mostly by Fiona Glaser
llvm-svn: 291731
|
| |
|
|
|
|
| |
Allows better source modifier usage.
llvm-svn: 291729
|
| |
|
|
|
|
| |
To shrink to VOP2 the input carry must also be VCC.
llvm-svn: 291720
|
| |
|
|
|
|
|
|
| |
This produces worse code when i16 is legal, mostly
due to combines getting confused by conversions inserted
for uniform 16-bit operations.
llvm-svn: 291717
|
| |
|
|
|
|
|
| |
This was shrinking the instruction even though the carry output
register was a virtual register, not known to be VCC.
llvm-svn: 291716
|
| |
|
|
|
|
|
| |
Whether it is legal or not needs to be checked against the instruction
it will be replaced with.
llvm-svn: 291711
|
| |
|
|
|
|
|
|
|
| |
This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded.
This needs a simple probability check because there are some cases where it is
not profitable.
llvm-svn: 291695
|
| |
|
|
|
|
|
|
| |
Even with aggressive fusion enabled, this requires duplicating
the fmul, or increases an fadd to another fma which is not an
improvement.
llvm-svn: 291642
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28164
llvm-svn: 291622
|
| |
|
|
|
|
| |
In future commits these patterns will appear after moveToVALU changes.
llvm-svn: 291615
|
| |
|
|
|
|
|
|
|
|
|
| |
When choosing the best successor for a block, ordinarily we would have preferred
a block that preserves the CFG unless there is a strong probability in the other
direction. For small blocks that can be duplicated we now skip that requirement
as well.
Differential revision: https://reviews.llvm.org/D27742
llvm-svn: 291609
|
| |
|
|
|
|
|
|
|
| |
If a vector index is out of bounds, the result is supposed to be
undefined but is not undefined behavior. Change the legalization
for indexing the vector on the stack so that an out of bounds
index does not create an out of bounds memory access.
llvm-svn: 291604
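One way to keep a dynamic index inside the stack slot is to clamp it before forming the address. The sketch below is an assumption about how such clamping could look (a mask for power-of-two lengths, a minimum otherwise); the commit message does not spell out the exact strategy, and `clamp_index` and `extract_element` are hypothetical names.

```python
def clamp_index(index, num_elements):
    """Map a non-negative index into the range [0, num_elements - 1]."""
    if num_elements <= 0 or index < 0:
        raise ValueError("expected a non-empty vector and an unsigned index")
    if num_elements & (num_elements - 1) == 0:
        # Power-of-two length: a bit mask is enough.
        return index & (num_elements - 1)
    # Otherwise fall back to an unsigned minimum with the last valid index.
    return min(index, num_elements - 1)


def extract_element(stack_slot, index):
    """Read from the spilled vector without ever leaving its storage."""
    return stack_slot[clamp_index(index, len(stack_slot))]


print(extract_element([10, 20, 30, 40], 6))
print(extract_element([10, 20, 30], 9))
```

The value returned for an out-of-bounds index is arbitrary, which matches the "undefined result, but not undefined behavior" requirement: only the memory access needs to stay in bounds.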
|
| |
|
|
|
|
| |
This was enabled without many specific tests or the comment.
llvm-svn: 291586
|
| |
|
|
|
|
|
|
|
|
|
| |
For i16 zeroext arguments when i16 was a legal type, the
known bits information from the truncate was lost. Insert
a zeroext so the known bits optimizations work with the 32-bit
loads.
Fixes code quality regressions vs. SI in min.ll test.
llvm-svn: 291461
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Originally
i64 = umax t8, Constant:i64<4>
was expanded into
i32,i32 = umax Constant:i32<0>, Constant:i32<0>
i32,i32 = umax t7, Constant:i32<4>
Now the two umax nodes produced return i32 instead of i32, i32.
Thanks to Jan Vesely for help with the test case.
Patch by mikael.holmen at ericsson.com
Reviewers: bogner, jvesely, tstellarAMD, arsenm
Subscribers: test, wdng, RKSimon, arsenm, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D28135
llvm-svn: 291441
|
| |
|
|
|
|
|
|
| |
This will make transition to SCRATCH_MEMORY easier
Differential Revision: https://reviews.llvm.org/D24746
llvm-svn: 291279
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27732
llvm-svn: 291245
|
| |
|
|
|
|
|
|
| |
v2: expose using amdgcn prefix
Differential Revision: https://reviews.llvm.org/D23511
llvm-svn: 290977
|
| |
|
|
|
|
|
|
|
|
|
| |
Canonicalize a select with a constant to the false side. This
enables more instruction shrinking opportunities since an
inline immediate can be used for the false side of v_cndmask_b32_e32.
This usually seems to be better, but it causes code size regressions
in some tests.
llvm-svn: 290372
|
| |
|
|
| |
llvm-svn: 290351
|
| |
|
|
| |
llvm-svn: 290348
|
| |
|
|
|
|
|
| |
FMA is canonicalized with the constant in the middle operand. Do
the same so fmad matches, avoiding an extra combine step.
llvm-svn: 290313
|
| |
|
|
| |
llvm-svn: 290312
|
| |
|
|
|
|
|
| |
Extend the existing fadd/fsub->fmad combines to produce
FMA if allowed.
llvm-svn: 290311
|