| Commit message | Author | Age | Files | Lines |
| |
Summary: Added decoder methods and tests
Reviewers: vpykhtin, artem.tamazov, dp
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D33545
llvm-svn: 303999
| |
llvm-svn: 303902
| |
Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc.
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D32862
llvm-svn: 303859
| |
Various address spaces on the SI and R600 subtargets have stricter
limits on memory access size than other address spaces. Use
canMergeStoresTo predicate to prevent the DAGCombiner from creating
these stores as they will be split up during legalization.
llvm-svn: 303767
| |
patterns"
This reverts commit e065977c4b5f68ab845400b256f6a3822b1325fa.
It doesn't work. S_LOAD_DWORD_IMM_ci and friends aren't selected by any of
the patterns, so it was putting 32-bit literals into the 8-bit field.
llvm-svn: 303754
| |
This fixes 17 of the 41 -verify-machineinstrs test failures identified in PR33045
Differential Revision: https://reviews.llvm.org/D33451
llvm-svn: 303691
| |
convention checking in PromoteAlloca
Summary:
Promoting an alloca to a vector and promoting an alloca to LDS are two independent ways of handling an alloca and should not affect each other.
As a result, we should not give up promoting to vector if there is not enough LDS. This patch factors out the checking related to local memory usage
and places it after the calling convention checking.
Reviewer:
arsenm
Differential Revision:
http://reviews.llvm.org/D33139
llvm-svn: 303684
| |
Perform DAG combine:
and (srl x, c), mask => shl (bfe x, nb + c, mask >> nb), nb
Where nb is a number of trailing zeroes in mask.
It replaces two instructions with two and BFE is generally a more
expensive one. However this is only done if we are selecting a byte
or word at an aligned boundary which results in a proper SDWA
operand pattern. It is only done if SDWA is supported.
TODO: improve SDWA pass to actually convert this pattern. It is not
done now because we have an immediate in the instruction, which has
to be moved into a VGPR.
Differential Revision: https://reviews.llvm.org/D33455
llvm-svn: 303681
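The rewrite described in this commit message can be checked on plain integers. The following is a minimal sketch, not the LLVM implementation: `bfe`, `and_of_srl` and `shl_of_bfe` are hypothetical helper names, values are modelled as 32-bit unsigned integers, and the third BFE operand (written `mask >> nb` in the message) is assumed to be a run of contiguous low bits whose length is used as the field width.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers

def bfe(x, offset, width):
    """Unsigned bitfield extract: `width` bits of x starting at bit `offset`."""
    return (x >> offset) & ((1 << width) - 1)

def and_of_srl(x, c, mask):
    """Original pattern: and (srl x, c), mask."""
    return ((x & MASK32) >> c) & mask

def shl_of_bfe(x, c, mask):
    """Rewritten pattern: shl (bfe x, nb + c, width), nb.

    Assumes mask is non-zero and mask >> nb is a contiguous run of ones.
    """
    nb = (mask & -mask).bit_length() - 1   # number of trailing zeros in mask
    width = (mask >> nb).bit_length()      # field width taken from mask >> nb
    return (bfe(x & MASK32, nb + c, width) << nb) & MASK32

# Selecting byte 2 of x and placing it in byte 1 of the result.
print(hex(and_of_srl(0x12345678, 8, 0xFF00)))
print(hex(shl_of_bfe(0x12345678, 8, 0xFF00)))
```

Both forms agree whenever the mask selects one aligned byte or word, which is the only case the commit message says the combine is applied to.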
| |
This is just a cleanup. Also, it adds checking that ByteCount is aligned to 4.
Reviewers: arsenm, nhaehnle, tstellarAMD
Subscribers: kzhuravl, wdng, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D28994
llvm-svn: 303658
| |
shl (or|add x, c2), c1 => or|add (shl x, c1), (c2 << c1)
This allows folding a constant into an address in some cases, as
well as eliminating a second shift if the expression is used as
an address and the second shift is the result of a GEP.
Differential Revision: https://reviews.llvm.org/D33432
llvm-svn: 303641
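The identity behind this combine holds in wrapping 32-bit arithmetic, which a short check illustrates. This is a sketch on plain integers with hypothetical helper names (`shl32`, `shl_of_add`, `add_of_shl`, and the `or` counterparts); it does not model when the real combine considers the rewrite profitable.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit wrapping arithmetic

def shl32(x, s):
    """32-bit shift left that discards bits shifted out of the top."""
    return (x << s) & MASK32

def shl_of_add(x, c1, c2):
    """Original pattern: shl (add x, c2), c1."""
    return shl32((x + c2) & MASK32, c1)

def add_of_shl(x, c1, c2):
    """Rewritten pattern: add (shl x, c1), (c2 << c1)."""
    return (shl32(x, c1) + shl32(c2, c1)) & MASK32

def shl_of_or(x, c1, c2):
    """Original pattern: shl (or x, c2), c1."""
    return shl32(x | c2, c1)

def or_of_shl(x, c1, c2):
    """Rewritten pattern: or (shl x, c1), (c2 << c1)."""
    return shl32(x, c1) | shl32(c2, c1)

# x = 5, c2 = 3, c1 = 2: (5 + 3) << 2 and (5 << 2) + (3 << 2) are both 32.
print(shl_of_add(5, 2, 3), add_of_shl(5, 2, 3))
# (5 | 3) << 2 and (5 << 2) | (3 << 2) are both 28.
print(shl_of_or(5, 2, 3), or_of_shl(5, 2, 3))
```

After the rewrite, `c2 << c1` is a constant, which is what lets it fold into an addressing offset as the message describes.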
| |
Summary:
Added separate pseudo and real instruction for GFX9 SDWA instructions.
Currently supported only in the assembler.
Depends on D32493
Reviewers: vpykhtin, artem.tamazov
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D33132
llvm-svn: 303620
| |
Turn expensive 64 bit shift into 32 bit if shift does not overflow int:
shl (ext x) => zext (shl x)
Differential Revision: https://reviews.llvm.org/D33367
llvm-svn: 303569
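The condition "shift does not overflow int" can be made concrete with a small integer model. This sketch covers only the zero-extension case, and the helper names (`shl64_of_zext`, `zext_of_shl32`, `shift_fits_in_32`) are hypothetical, not taken from the patch.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def shl64_of_zext(x32, s):
    """Original pattern: 64-bit shl of a zero-extended 32-bit value."""
    return ((x32 & MASK32) << s) & MASK64

def zext_of_shl32(x32, s):
    """Rewritten pattern: 32-bit shl, then zero-extend the result."""
    return ((x32 & MASK32) << s) & MASK32

def shift_fits_in_32(x32, s):
    """True when no set bit is shifted past bit 31, so the rewrite is safe."""
    return ((x32 & MASK32) << s) <= MASK32

# Safe case: the shifted value still fits in 32 bits, both forms agree.
print(shift_fits_in_32(0x0000FFFF, 16),
      hex(shl64_of_zext(0x0000FFFF, 16)), hex(zext_of_shl32(0x0000FFFF, 16)))
# Unsafe case: bit 16 is shifted out of the 32-bit result, the forms differ.
print(shift_fits_in_32(0x0001FFFF, 16),
      hex(shl64_of_zext(0x0001FFFF, 16)), hex(zext_of_shl32(0x0001FFFF, 16)))
```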
| |
Differential revision: https://reviews.llvm.org/D33289
llvm-svn: 303548
| |
See bug 32922: https://bugs.llvm.org//show_bug.cgi?id=32922
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D32912
llvm-svn: 303428
| |
See Bugs 33019, 33056:
https://bugs.llvm.org//show_bug.cgi?id=33019
https://bugs.llvm.org//show_bug.cgi?id=33056
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D33288
llvm-svn: 303423
| |
This provides a new way to access the TargetMachine through
TargetPassConfig, as a dependency.
The patterns replaced here are:
* Passes handling a null TargetMachine call
`getAnalysisIfAvailable<TargetPassConfig>`.
* Passes not handling a null TargetMachine
`addRequired<TargetPassConfig>` and call
`getAnalysis<TargetPassConfig>`.
* MachineFunctionPasses now use MF.getTarget().
* Remove all the TargetMachine constructors.
* Remove INITIALIZE_TM_PASS.
This fixes a crash when running `llc -start-before prologepilog`.
PEI needs StackProtector, which gets constructed without a TargetMachine
by the pass manager. The StackProtector pass doesn't handle the case
where there is no TargetMachine, so it segfaults.
Related to PR30324.
Differential Revision: https://reviews.llvm.org/D33222
llvm-svn: 303360
| |
Summary:
There should be no intersection between SDWA operands and potential MIs. E.g.:
```
v_and_b32 v0, 0xff, v1 -> src:v1 sel:BYTE_0
v_and_b32 v2, 0xff, v0 -> src:v0 sel:BYTE_0
v_add_u32 v3, v4, v2
```
In that example it is possible that we would fold 2nd instruction into 3rd (v_add_u32_sdwa) and then try to fold 1st instruction into 2nd (that was already destroyed). So if SDWAOperand is also a potential MI then do not apply it.
Reviewers: vpykhtin, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D32804
llvm-svn: 303347
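The rule in this summary can be modelled as a filter over fold candidates. This is a toy sketch, not the pass itself: each candidate is a hypothetical `(parent, target)` pair meaning "fold instruction `parent` into instruction `target`", and instructions are represented by made-up string labels.

```python
def applicable_operands(operands, potential_mis):
    """Keep only fold candidates whose parent is not itself a fold target.

    operands:      list of (parent, target) pairs (hypothetical model).
    potential_mis: set of instructions that other operands fold into.
    """
    return [(parent, target) for parent, target in operands
            if parent not in potential_mis]

# The three-instruction example from the summary:
#   and_v0 would fold into and_v2, and and_v2 would fold into add_v3.
operands = [("and_v0", "and_v2"), ("and_v2", "add_v3")]
potential_mis = {target for _, target in operands}

# and_v2 is both an operand and a potential MI, so its fold is skipped.
applied = applicable_operands(operands, potential_mis)
print(applied)
```

With the filter in place, no candidate refers to an instruction that another fold has already rewritten or erased.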
| |
Partially implement callee-side for arguments and return values.
byval doesn't work properly, and most likely neither do sret or other
on-stack return values.
llvm-svn: 303308
| |
In order for an arbitrary callee to access an object
in a caller's stack frame, the 32-bit offset used as
the private pointer needs to be relative to the kernel's
scratch wave offset register.
Convert to this by finding the difference from the current
stack frame and scaling by the wavefront size.
llvm-svn: 303303
| |
Check the MachinePointerInfo for whether the access is
supposed to be relative to the stack pointer.
No tests because this is used in later commits implementing
calls.
llvm-svn: 303301
| |
Handle more general swizzles.
llvm-svn: 303296
| |
Avoids instructions to pack a vector when the source is really
a scalar being broadcast.
Also be smarter and look for per-component fneg.
Doesn't yet handle scalar from upper half of register
or other swizzles.
llvm-svn: 303291
| |
This needs to be the frame offset register, and not the global
scratch wave offset register. For kernels, these are the same.
llvm-svn: 303287
| |
Fix missing instruction definitions for min3/max3.
llvm-svn: 303284
| |
Differential Revision: https://reviews.llvm.org/D33244
llvm-svn: 303186
| |
Using LIS can be quite expensive, so caching of calculated region
live-ins and pressure is implemented. It does two things:
1. Caches the info for the second stage when we schedule with
decreased target occupancy.
2. Tracks the basic block from top to bottom thus eliminating the
need to scan whole register file liveness at every region split
in the middle of the block.
The scheduling is now done in 3 stages instead of two, with the first
one being really a no-op and only used to collect scheduling regions
as sent by the scheduler driver.
There is no functional change to the current behavior, only compilation
speed is affected. In general computeBlockPressure() could be simplified
if we switch to backward RP tracker, because scheduler sends regions
within a block starting from the last upward. We could use a natural
order of upward tracker to seamlessly change between regions of the same
block, since live reg set of a previous tracked region would become a
live-out of the next region. That however requires fixing upward tracker
to properly account defs and uses of the same instruction as both are
contributing to the current pressure. When we converge on the produced
pressure we should be able to switch between them back and forth. In
addition, backward tracker is less expensive as it uses LIS in recede
less often than forward uses it in advance.
At the moment the worst known case compilation time has improved from 26
minutes to 8.5.
Differential Revision: https://reviews.llvm.org/D33117
llvm-svn: 303184
| |
This factors register pressure estimation mechanism from the
GCNSchedStrategy into the forward tracker to unify interface
with other strategies and expose it to other interested phases.
Differential Revision: https://reviews.llvm.org/D33105
llvm-svn: 303179
| |
llvm-svn: 303137
| |
llvm-svn: 303122
| |
Differential Revision: https://reviews.llvm.org/D23209
llvm-svn: 303111
| |
llvm-svn: 303098
| |
Differential Revision: https://reviews.llvm.org/D23209
llvm-svn: 303091
| |
See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D33123
llvm-svn: 303070
| |
This instruction does not really exist
See Bug 33018: https://bugs.llvm.org//show_bug.cgi?id=33018
Reviewers: vpykhtin, artem.tamazov
Differential Revision: https://reviews.llvm.org/D33126
llvm-svn: 303055
| |
Summary:
We should not change volatile loads/stores in promoting alloca to vector.
Reviewers:
arsenm
Differential Revision:
http://reviews.llvm.org/D33107
llvm-svn: 302943
| |
possible
This patch adds min/max population count, leading/trailing zero/one bit counting methods.
The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.
Differential Revision: https://reviews.llvm.org/D32931
llvm-svn: 302925
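The difference between the min and max queries is easiest to see on a small value. Below is a minimal Python model, assuming known bits are stored as two disjoint masks (`zero` for bits known to be 0, `one` for bits known to be 1); the class and its snake_case method names are illustrative stand-ins, not the C++ interface added by the patch.

```python
class KnownBits:
    """Toy model: `zero` marks bits known to be 0, `one` bits known to be 1."""

    def __init__(self, width, zero=0, one=0):
        assert zero & one == 0, "a bit cannot be known 0 and known 1"
        self.width, self.zero, self.one = width, zero, one

    def _leading(self, value, bit):
        """Count consecutive bits equal to `bit`, starting from the top."""
        n = 0
        while n < self.width and ((value >> (self.width - 1 - n)) & 1) == bit:
            n += 1
        return n

    def _trailing(self, value, bit):
        """Count consecutive bits equal to `bit`, starting from bit 0."""
        n = 0
        while n < self.width and ((value >> n) & 1) == bit:
            n += 1
        return n

    # Min answers use only what is known; max answers let every unknown
    # bit take whichever value makes the count largest.
    def count_min_population(self):
        return bin(self.one).count("1")

    def count_max_population(self):
        return self.width - bin(self.zero).count("1")

    def count_min_trailing_zeros(self):
        return self._trailing(self.zero, 1)

    def count_max_trailing_zeros(self):
        return self._trailing(self.one, 0)

    def count_min_leading_zeros(self):
        return self._leading(self.zero, 1)

    def count_max_leading_zeros(self):
        return self._leading(self.one, 0)

# 8-bit value 00???100: bits 7, 6, 1, 0 known zero, bit 2 known one.
kb = KnownBits(8, zero=0b11000011, one=0b00000100)
print(kb.count_min_population(), kb.count_max_population())
print(kb.count_min_leading_zeros(), kb.count_max_leading_zeros())
```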
| |
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D33115
llvm-svn: 302919
| |
llvm-svn: 302821
| |
We don't use it and it was removed in gfx9, and the encoding
bit repurposed.
Additionally actually using it requires changing the output register
class, which wasn't done anyway.
llvm-svn: 302814
| |
This allows folding source modifiers in more f16 cases.
Makes it easier to select per-component packed neg modifiers.
llvm-svn: 302813
| |
Earlier fix D32572 introduced a bug where live-ins were calculated
for basic block instead of scheduling region. This change fixes it.
Differential Revision: https://reviews.llvm.org/D33086
llvm-svn: 302812
| |
llvm-svn: 302779
| |
VOP3P instructions can encode access to either
half of the register.
llvm-svn: 302730
| |
Flat instructions gain an immediate offset, and 2 new
sets of segment specific flat instructions are added.
llvm-svn: 302729
| |
disassembler output
See bug 32927: https://bugs.llvm.org//show_bug.cgi?id=32927
Reviewers: vpykhtin, artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D32913
llvm-svn: 302648
| |
VGRP -> VGPR, SGRP -> SGPR
llvm-svn: 302586
| |
This is a step toward having statically allocated instruction mapping.
We are going to tablegen them eventually, so let us reflect that in
the API.
NFC.
llvm-svn: 302316
| |
instead of getTopBlock to find the loop header.
Differential Revision: https://reviews.llvm.org/D32831
llvm-svn: 302290
| |
This field is populated by the CP
Differential Revision: https://reviews.llvm.org/D32619
llvm-svn: 302277
| |
underlying APInts in KnownBits.
This adds routines for resetting KnownBits to unknown, and for making the value all zeros or all ones. It also adds methods for querying whether the value is zero, all ones, or unknown.
Differential Revision: https://reviews.llvm.org/D32637
llvm-svn: 302262
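The set and query routines listed in this message can be modelled in a few lines. This is a sketch under the assumption that known bits are kept as two masks (`zero` and `one`); the Python class and method names are hypothetical stand-ins for the C++ routines, which are not reproduced here.

```python
class KnownBits:
    """Toy model: `zero` marks bits known to be 0, `one` bits known to be 1."""

    def __init__(self, width):
        self.width = width
        self.zero = 0
        self.one = 0

    def _all(self):
        return (1 << self.width) - 1

    def reset_all(self):
        """Forget everything: no bit is known."""
        self.zero = 0
        self.one = 0

    def set_all_zero(self):
        """Every bit is known to be 0."""
        self.zero = self._all()
        self.one = 0

    def set_all_ones(self):
        """Every bit is known to be 1."""
        self.zero = 0
        self.one = self._all()

    def is_unknown(self):
        return self.zero == 0 and self.one == 0

    def is_zero(self):
        return self.zero == self._all()

    def is_all_ones(self):
        return self.one == self._all()

kb = KnownBits(16)
print(kb.is_unknown())      # nothing known yet
kb.set_all_zero()
print(kb.is_zero())         # value is known to be exactly 0
kb.set_all_ones()
print(kb.is_all_ones())     # value is known to be exactly 0xFFFF
kb.reset_all()
print(kb.is_unknown())      # back to fully unknown
```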