| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
If there is only one successor, and that successor only
has one predecessor the wait can obviously be delayed until
uses or the end of the next block. This avoids code quality
regressions when there are trivial fallthrough blocks inserted
for structurization.
llvm-svn: 297251
|
| |
|
|
|
|
|
| |
When doing arcp optimization with a constant denominator,
this was leaving behind rcps with constant inputs.
llvm-svn: 297248
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers:
arsenm
Differential Revision:
http://reviews.llvm.org/D22025
llvm-svn: 297243
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
object that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.
Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.
The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h.
Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar
Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30046
llvm-svn: 297241
|
| |
|
|
|
|
|
|
|
|
| |
More module problems. This time it only showed up in the stage 2 compile of
clang-x86_64-linux-selfhost-modules-2 but not the stage 1 compile.
Somehow, this change causes the build to need Attributes.gen before it's been
generated.
llvm-svn: 297188
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.
Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.
Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar
Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30046
llvm-svn: 297177
|
| |
|
|
|
|
|
|
| |
It breaks line tables because the patch is not complete, working on a complete one at the moment
This reverts commit r294031.
llvm-svn: 297118
|
| |
|
|
|
|
|
|
| |
also exit early on kill instead of redefinition.
Differential Revision: https://reviews.llvm.org/D30230
llvm-svn: 297060
|
| |
|
|
| |
llvm-svn: 296901
|
| |
|
|
|
|
|
|
| |
Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction).
Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code).
Added LIT tests.
llvm-svn: 296873
|
| |
|
|
| |
llvm-svn: 296842
|
| |
|
|
| |
llvm-svn: 296523
|
| |
|
|
|
|
|
|
| |
This is somewhat tricky because there are two
pairs of tied operands, and it isn't allowed to be
VOP3 encoded.
llvm-svn: 296519
|
| |
|
|
| |
llvm-svn: 296515
|
| |
|
|
| |
llvm-svn: 296513
|
| |
|
|
|
|
|
|
|
| |
It's not clear to me if this is always better than
doing ds_write2_b64 This adds the constraint of
a 128-bit register input instead of a pair of
64-bit.
llvm-svn: 296512
|
| |
|
|
|
|
|
|
|
|
|
| |
If during scheduling we have identified that we cannot keep optimistic
occupancy increase critical register pressure limit and try scheduling
of the whole function again. In this case blocks with smaller pressure
will have a chance for better scheduling.
Differential Revision: https://reviews.llvm.org/D30442
llvm-svn: 296506
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change introduces new method to estimate register pressure in
GCNScheduler. Standard RPTracker gives huge error due to the following
reasons:
1. It does not account for live-ins or live-outs if value is not used
in the region itself. That creates a huge error in a very common case
if there are a lot of live-thu registers.
2. It does not properly count subregs.
3. It assumes a register used as an input operand can be reused as an
output. This is not always possible by itself, this is not what RA
will finally do in many cases for various reasons not limited to RA's
inability to do so, and this is not so if the value is actually a
live-thu.
In addition we can now see clear separation between live-in pressure
which we cannot change with the scheduling and tentative pressure
which we can change.
Differential Revision: https://reviews.llvm.org/D30439
llvm-svn: 296491
|
| |
|
|
|
|
|
|
| |
- We do emit amd_kernel_code_t v1.1
Differential Revision: https://reviews.llvm.org/D30433
llvm-svn: 296489
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If two subregs of the same register are defined and we need to revert
schedule changing def order, we will end up with both instructions
having def,read-undef flags because adjustLaneLiveness() will only set
this flag but will not remove it.
Fix this by removing read-undef flags before calling adjustLaneLiveness.
Differential Revision: https://reviews.llvm.org/D30428
llvm-svn: 296484
|
| |
|
|
|
|
|
|
| |
subclass that knows how to generate it.
There's a circular dependency that's only revealed when LLVM_ENABLE_MODULES=1.
llvm-svn: 296478
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.
Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.
Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar
Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30046
llvm-svn: 296474
|
| |
|
|
| |
llvm-svn: 296401
|
| |
|
|
| |
llvm-svn: 296396
|
| |
|
|
| |
llvm-svn: 296382
|
| |
|
|
|
|
|
| |
Add packed types as legal so they may be used with inlineasm.
Keep all operations expanded for now.
llvm-svn: 296379
|
| |
|
|
|
|
|
| |
Doesn't fix any practical problems because clamp/omod
are currently folded after peephole optimizer.
llvm-svn: 296375
|
| |
|
|
| |
llvm-svn: 296372
|
| |
|
|
|
|
| |
Mostly useful for writing tests for f16 features.
llvm-svn: 296370
|
| |
|
|
|
|
|
|
| |
Add a few non-VOP3P but instructions related to packed.
Includes hack with dummy operands for the benefit of the assembler
llvm-svn: 296368
|
| |
|
|
|
|
|
|
|
|
|
| |
- Verify that runtime metadata is actually valid runtime metadata when assembling, otherwise we could accept the following when assembling, but ocl runtime will reject it:
.amdgpu_runtime_metadata
{ amd.MDVersion: [ 2, 1 ], amd.RandomUnknownKey, amd.IsaInfo: ...
- Make IsaInfo optional, and always emit it.
Differential Revision: https://reviews.llvm.org/D30349
llvm-svn: 296324
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D29958
llvm-svn: 296186
|
| |
|
|
|
|
|
|
| |
This reverts commit r296009. It broke one out of tree target and also
does not account for all partial lines added or removed when calculating
PressureDiff.
llvm-svn: 296182
|
| |
|
|
|
|
|
| |
Clang issues warning about hidden overload. That was intended, so
add "using AMDGPUGenRegisterInfo::getRegUnitWeight;" to mute it.
llvm-svn: 296021
|
| |
|
|
|
|
|
|
|
|
| |
If a subreg is used in an instruction it counts as a whole superreg
for the purpose of register pressure calculation. This patch corrects
improper register pressure calculation by examining operand's lane mask.
Differential Revision: https://reviews.llvm.org/D29835
llvm-svn: 296009
|
| |
|
|
|
|
|
|
| |
Hit on ASICs that support 16bit instructions.
Differential Revision: https://reviews.llvm.org/D30281
llvm-svn: 295990
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement isLegalToVectorizeLoadChain for AMDGPU to avoid
producing private address spaces accesses that will need to be
split up later. This was doing the wrong thing in the case
where the queried chain was an even number of elements.
A possible <4 x i32> store was being split into
store <2 x i32>
store i32
store i32
rather than
store <2 x i32>
store <2 x i32>
when legal.
llvm-svn: 295933
|
| |
|
|
|
|
|
| |
This is the pattern that falls out of the instruction's
definition if offset == 0.
llvm-svn: 295912
|
| |
|
|
| |
llvm-svn: 295908
|
| |
|
|
|
|
|
|
|
|
|
| |
The manual is unclear on the details of this. It's not
clear to me if denormals are not allowed with clamp,
or if that is only omod. Not allowing denorms for
fp16 or fp64 isn't useful so I also question if that
is really a restriction. Same with whether this is valid
without IEEE mode enabled.
llvm-svn: 295905
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D30232
llvm-svn: 295904
|
| |
|
|
| |
llvm-svn: 295899
|
| |
|
|
|
|
|
|
|
| |
This should avoid reporting any stack needs to be allocated in the
case where no stack is truly used. An unused stack slot is still
left around in other cases where there are real stack objects
but no spilling occurs.
llvm-svn: 295891
|
| |
|
|
|
|
| |
Fixes not adjusting using new intrinsics with chains.
llvm-svn: 295878
|
| |
|
|
|
|
|
|
|
| |
This allows us to ensure that 0 is never a valid pointer
to a user object, and ensures that the offset is always legal
without needing a register to access it. This comes at the cost
of usable offsets and wasted stack space.
llvm-svn: 295877
|
| |
|
|
| |
llvm-svn: 295873
|
| |
|
|
|
|
| |
This reverts commit r295867.
llvm-svn: 295871
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D30232
llvm-svn: 295867
|
| |
|
|
|
|
| |
Convert llvm.SI.packf16 test uses
llvm-svn: 295797
|
| |
|
|
| |
llvm-svn: 295789
|