| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D29958
llvm-svn: 296186
|
|
|
|
|
|
|
|
| |
This reverts commit r296009. It broke one out-of-tree target and also
does not account for all partial lines added or removed when calculating
PressureDiff.
llvm-svn: 296182
|
|
|
|
|
|
|
| |
Clang issues a warning about a hidden overload. That was intended, so
add "using AMDGPUGenRegisterInfo::getRegUnitWeight;" to mute it.
llvm-svn: 296021
|
|
|
|
|
|
|
|
|
|
| |
If a subreg is used in an instruction, it counts as a whole superreg
for the purpose of register pressure calculation. This patch corrects
the improper register pressure calculation by examining the operand's
lane mask.
Differential Revision: https://reviews.llvm.org/D29835
llvm-svn: 296009
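As a rough illustration of the idea in this message (not the code from the patch), the sketch below weighs an operand by the number of lanes set in its lane mask instead of charging the whole super-register. The names `LaneMask`, `wholeRegWeight` and `pressureWeight`, and the one-bit-per-lane encoding, are assumptions made for the example.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical lane mask: one bit per lane of a super-register.
using LaneMask = std::uint32_t;

// Behaviour described as improper: any subreg use is charged as the
// whole super-register, regardless of which lanes are touched.
inline unsigned wholeRegWeight(unsigned NumLanes) { return NumLanes; }

// Lane-aware weight: count only the lanes the operand actually uses.
inline unsigned pressureWeight(LaneMask Used, unsigned NumLanes) {
  assert(NumLanes >= 1 && NumLanes <= 32 && "illustrative limit");
  // Mask covering the lanes that exist in this super-register.
  const LaneMask All =
      NumLanes == 32 ? ~LaneMask(0) : ((LaneMask(1) << NumLanes) - 1);
  return static_cast<unsigned>(std::bitset<32>(Used & All).count());
}
```

With a four-lane register, a use of a single lane contributes a weight of 1 under `pressureWeight`, where `wholeRegWeight` would have charged 4.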
|
|
|
|
|
|
|
|
| |
Hit on ASICs that support 16bit instructions.
Differential Revision: https://reviews.llvm.org/D30281
llvm-svn: 295990
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement isLegalToVectorizeLoadChain for AMDGPU to avoid
producing private address space accesses that will need to be
split up later. This was doing the wrong thing in the case
where the queried chain had an even number of elements.
A possible <4 x i32> store was being split into
  store <2 x i32>
  store i32
  store i32
rather than
  store <2 x i32>
  store <2 x i32>
when legal.
llvm-svn: 295933
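The intended split in the example above can be sketched as a greedy chunking of the chain. This is only an illustration, not the vectorizer's logic: `splitChain` and `MaxLegalElts` are hypothetical names, and the limit of two elements is taken from the example rather than from the target's actual rules.

```cpp
#include <cassert>
#include <vector>

// Split a chain of NumElts scalar accesses into pieces no wider than
// MaxLegalElts, always taking the widest piece that still fits.
inline std::vector<unsigned> splitChain(unsigned NumElts,
                                        unsigned MaxLegalElts) {
  assert(MaxLegalElts > 0 && "need a positive legal width");
  std::vector<unsigned> Pieces;
  while (NumElts > 0) {
    const unsigned Width = NumElts < MaxLegalElts ? NumElts : MaxLegalElts;
    Pieces.push_back(Width);
    NumElts -= Width;
  }
  return Pieces;
}
```

Under these assumptions, `splitChain(4, 2)` yields two pieces of width 2, matching the two `store <2 x i32>` operations in the message.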
|
|
|
|
|
|
|
| |
This is the pattern that falls out of the instruction's
definition if offset == 0.
llvm-svn: 295912
|
|
|
|
| |
llvm-svn: 295908
|
|
|
|
|
|
|
|
|
|
|
| |
The manual is unclear on the details of this. It's not
clear to me if denormals are not allowed with clamp,
or if that is only omod. Not allowing denorms for
fp16 or fp64 isn't useful so I also question if that
is really a restriction. Same with whether this is valid
without IEEE mode enabled.
llvm-svn: 295905
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D30232
llvm-svn: 295904
|
|
|
|
| |
llvm-svn: 295899
|
|
|
|
|
|
|
|
|
| |
This should avoid reporting any stack needs to be allocated in the
case where no stack is truly used. An unused stack slot is still
left around in other cases where there are real stack objects
but no spilling occurs.
llvm-svn: 295891
|
|
|
|
|
|
| |
Fixes not adjusting using new intrinsics with chains.
llvm-svn: 295878
|
|
|
|
|
|
|
|
|
| |
This allows us to ensure that 0 is never a valid pointer
to a user object, and ensures that the offset is always legal
without needing a register to access it. This comes at the cost
of usable offsets and wasted stack space.
llvm-svn: 295877
|
|
|
|
| |
llvm-svn: 295873
|
|
|
|
|
|
| |
This reverts commit r295867.
llvm-svn: 295871
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D30232
llvm-svn: 295867
|
|
|
|
|
|
| |
Convert llvm.SI.packf16 test uses
llvm-svn: 295797
|
|
|
|
| |
llvm-svn: 295789
|
|
|
|
|
|
|
|
|
|
|
| |
Change the implementation to use max instead of add.
min/max/med3 do not flush denormals regardless of the mode,
so it is OK to use them whether or not denormals are enabled.
Also allow using clamp with f16, and use knowledge
of dx10_clamp.
llvm-svn: 295788
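For readers unfamiliar with the operations named here, the sketch below shows how a clamp to [0, 1] can be expressed purely with min and max, via a median-of-three. It illustrates the arithmetic only and is not the backend's lowering; `med3` and `clamp01` are names chosen for the example.

```cpp
#include <cassert>
#include <cmath>

// Median of three values, built only from min and max.
inline float med3(float A, float B, float C) {
  return std::fmax(std::fmin(A, B), std::fmin(std::fmax(A, B), C));
}

// Clamp to [0, 1]: the median of X, 0 and 1.
inline float clamp01(float X) { return med3(X, 0.0f, 1.0f); }
```

Because the result is always one of the inputs, no arithmetic such as an add is performed on the value being clamped.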
|
|
|
|
| |
llvm-svn: 295783
|
|
|
|
| |
llvm-svn: 295754
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before frame offsets are calculated, try to eliminate the
frame indexes used by SGPR spills. Then we can delete them
after.
I think for now we can be sure that no other instruction
will be re-using the same frame indexes. It should be easy
to notice if this assumption ever breaks since everything
asserts if it tries to use a dead frame index later.
The unused emergency stack slot seems to still be left behind,
so an additional 4 bytes is still wasted.
llvm-svn: 295753
|
|
|
|
|
|
| |
This was accepting GFX9 instructions on VI.
llvm-svn: 295557
|
|
|
|
| |
llvm-svn: 295555
|
|
|
|
| |
llvm-svn: 295554
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D29792
llvm-svn: 295539
|
|
|
|
| |
llvm-svn: 295489
|
|
|
|
| |
llvm-svn: 295359
|
|
|
|
| |
llvm-svn: 295358
|
|
|
|
| |
llvm-svn: 295270
|
|
|
|
|
|
| |
Update test uses with expansion in terms of new intrinsics.
llvm-svn: 295269
|
|
|
|
| |
llvm-svn: 295247
|
|
|
|
| |
llvm-svn: 295244
|
|
|
|
|
|
| |
Also use a more refined condition.
llvm-svn: 295239
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch reverts a region's scheduling to the original untouched state
if the new schedule has decreased occupancy.
In addition, it switches to using the TargetRegisterInfo occupancy
callback for pressure limits instead of gradually increasing limits that
had just been exceeded. We are going to stay with the best schedule, so
we no longer need to tolerate worsened scheduling.
Differential Revision: https://reviews.llvm.org/D29971
llvm-svn: 295206
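The revert-on-regression policy described in this message can be sketched as follows. This is a minimal model under stated assumptions, not the scheduler's code: `Region`, `scheduleOrRevert` and the two callbacks are hypothetical stand-ins for the real region and occupancy queries.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in: a region is just an ordered list of instruction ids.
using Region = std::vector<int>;

// Runs Reschedule on R. If the occupancy reported afterwards is lower than
// before, restores the original order. Returns true if the new order is kept.
inline bool scheduleOrRevert(
    Region &R, const std::function<void(Region &)> &Reschedule,
    const std::function<unsigned(const Region &)> &Occupancy) {
  const Region Original = R; // snapshot of the untouched order
  const unsigned Before = Occupancy(R);
  Reschedule(R);
  if (Occupancy(R) < Before) {
    R = Original; // occupancy dropped: go back to the untouched state
    return false;
  }
  return true;
}
```

Keeping a snapshot makes the revert trivial, at the cost of copying the region once per scheduling attempt.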
|
|
|
|
|
|
|
|
|
|
| |
This patch corrects the maximum number of workgroups per CU when
workgroups are big (more than 128). This value feeds into the
occupancy calculation with respect to LDS size.
Differential Revision: https://reviews.llvm.org/D29974
llvm-svn: 295134
|
|
|
|
|
|
| |
This reverts commit ce06d9cb99298eb844b66e117f5108a06747c907.
llvm-svn: 295054
|
|
|
|
|
|
|
|
| |
minor fixes (NFC).
Same changes in files affected by reduced MC headers dependencies.
llvm-svn: 295009
|
|
|
|
|
|
| |
function returned true or undef.
llvm-svn: 294895
|
|
|
|
| |
llvm-svn: 294694
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D26010
llvm-svn: 294692
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change returns an empty PSet list for the M0 register. Otherwise its
PSet, as defined by tablegen, is SReg_32. This results in an incorrect
register pressure calculation every time an instruction uses M0.
Such uses count towards the SReg_32 PSet and needlessly increase pressure
on SGPRs.
Differential Revision: https://reviews.llvm.org/D29798
llvm-svn: 294691
|
|
|
|
| |
llvm-svn: 294635
|
|
|
|
|
|
|
|
| |
using switch statement
Differential Revision: https://reviews.llvm.org/D29741
llvm-svn: 294627
|
|
|
|
| |
llvm-svn: 294621
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
graph_children, inverse_graph_nodes, inverse_graph_children).
Summary:
Convert all obvious node_begin/node_end and child_begin/child_end
pairs to range based for.
Sending for review in case someone has a good idea how to make
graph_children able to be inferred. It looks like it would require
changing GraphTraits to be two argument or something. I presume
inference does not happen because it would have to check every
GraphTraits in the world to see if the noderef types matched.
Note: This change was 3-staged with clang as well, which uses
Dominators/etc from LLVM.
Reviewers: chandlerc, tstellarAMD, dblaikie, rsmith
Subscribers: arsenm, llvm-commits, nhaehnle
Differential Revision: https://reviews.llvm.org/D29767
llvm-svn: 294620
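The kind of conversion described in the summary can be illustrated with a toy graph. The sketch does not use LLVM's GraphTraits; `Node`, `Range` and `childrenOf` are hypothetical names that only mimic the shape of a begin/end pair wrapped into a range.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy graph node; not an LLVM type.
struct Node {
  int Id;
  std::vector<Node *> Succs;
};

// Minimal iterator-pair wrapper usable in a range-based for.
template <typename It> struct Range {
  It B, E;
  It begin() const { return B; }
  It end() const { return E; }
};

// Toy analogue of a helper returning the children of N as a range.
inline Range<std::vector<Node *>::const_iterator> childrenOf(const Node *N) {
  return {N->Succs.begin(), N->Succs.end()};
}

// Before: explicit begin/end iterator loop.
inline int sumChildIdsOld(const Node *N) {
  int Sum = 0;
  for (auto I = N->Succs.begin(), E = N->Succs.end(); I != E; ++I)
    Sum += (*I)->Id;
  return Sum;
}

// After: range-based for over the children range.
inline int sumChildIds(const Node *N) {
  int Sum = 0;
  for (const Node *C : childrenOf(N))
    Sum += C->Id;
  return Sum;
}
```

Both functions visit the same nodes in the same order; the range-based form removes the iterator bookkeeping.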
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement getRegPressureLimit and getRegPressureSetLimit callbacks in
SIRegisterInfo.
This makes the standard converge scheduler behave almost the same as
GCNScheduler, sometimes slightly better, sometimes a bit worse.
In general, it is also possible to switch GCNScheduler to use these
callbacks instead of getMaxWaves(), which also makes GCNScheduler
slightly better on some tests and slightly worse on others. The big
win is the behavior with the converge scheduler.
Note that these are used not only by scheduling, but also in places like
MachineLICM.
Differential Revision: https://reviews.llvm.org/D29700
llvm-svn: 294518
|
|
|
|
|
|
| |
lines
llvm-svn: 294454
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28760#fb670e28
llvm-svn: 294449
|