| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D31437
llvm-svn: 304812
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D33927
llvm-svn: 304805
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D33890
llvm-svn: 304797
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: craig.topper, arsenm, mehdi_amini
Reviewed By: mehdi_amini
Subscribers: mehdi_amini, wdng, nhaehnle, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33924
llvm-svn: 304767
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D33689
llvm-svn: 304737
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D33907
llvm-svn: 304729
|
| |
|
|
|
|
|
|
|
| |
Fixes bug #33302. Pass did not account that Src1 of max instruction
can be an immediate.
Differential Revision: https://reviews.llvm.org/D33884
llvm-svn: 304696
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Remove dependency of SDWA pass on SIShrinkInstructions.
The goal is to move SDWA even higher in the stack to avoid second run
of MachineLICM, MachineCSE and SIFoldOperands.
Also added handling to preserve original src modifiers.
Differential Revision: https://reviews.llvm.org/D33860
llvm-svn: 304665
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These are mostly legal, but will probably need special lowering for some
cases.
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D33791
llvm-svn: 304628
|
| |
|
|
|
|
|
|
|
| |
SIFoldOperands can commute operands even if no folding was done.
This change is to preserve IR is no folding was done.
Differential Revision: https://reviews.llvm.org/D33802
llvm-svn: 304625
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D33808
llvm-svn: 304619
|
| |
|
|
| |
llvm-svn: 304574
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D33793
llvm-svn: 304571
|
| |
|
|
| |
llvm-svn: 304554
|
| |
|
|
|
|
|
|
|
| |
-enable-si-insert-waitcnts=1 becomes the default
-enable-si-insert-waitcnts=0 to use old pass
Differential Revision: https://reviews.llvm.org/D33730
llvm-svn: 304551
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D33307
llvm-svn: 304482
|
| |
|
|
|
|
|
| |
Partial revert of r301938 which is making it harder
to split patches up.
llvm-svn: 304418
|
| |
|
|
| |
llvm-svn: 304416
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.
This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!
Differential Revision: https://reviews.llvm.org/D33696
llvm-svn: 304320
|
| |
|
|
|
|
|
|
|
|
|
| |
- new waitcnt pass remains off by default; -enable-si-insert-waitcnts=1 to enable it
- fix handling of PERMUTE ops
- fix insertion of waitcnt instrs at function begin/end ( port of analogous code that was added to old waitcnt pass )
- add new test
Differential Revision: https://reviews.llvm.org/D33114
llvm-svn: 304311
|
| |
|
|
|
|
|
|
|
|
| |
See Bug 28601: https://bugs.llvm.org//show_bug.cgi?id=28601
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D33542
llvm-svn: 304309
|
| |
|
|
|
|
|
|
|
|
|
| |
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.
While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.
llvm-svn: 304247
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An encoding does not allow to use SDWA in an instruction with
scalar operands, either literals or SGPRs. That is however possible
to copy these operands into a VGPR first.
Several copies of the value are produced if multiple SDWA conversions
were done. To cleanup MachineLICM (to hoist copies out of loops),
MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace
SGPR to VGPR copy with immediate copy right to the VGPR) runs are added
after the SDWA pass.
Differential Revision: https://reviews.llvm.org/D33583
llvm-svn: 304219
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D33576
llvm-svn: 304217
|
| |
|
|
|
|
|
|
|
|
| |
[AMDGPU] add intrinsic for s_getpc
Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc.
Patch by Tim Corringham
llvm-svn: 304031
|
| |
|
|
| |
llvm-svn: 304029
|
| |
|
|
|
|
|
|
|
|
| |
See bug 33171: https://bugs.llvm.org/show_bug.cgi?id=33171
Reviewers: Sam Kolton
Differential Revision: https://reviews.llvm.org/D33553
llvm-svn: 304015
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D33212
llvm-svn: 304003
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Added decoder methods and tests
Reviewers: vpykhtin, artem.tamazov, dp
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D33545
llvm-svn: 303999
|
| |
|
|
| |
llvm-svn: 303902
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc.
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D32862
llvm-svn: 303859
|
| |
|
|
|
|
|
|
|
| |
Various address spaces on the SI and R600 subtargets have stricter
limits on memory access size that other address spaces. Use
canMergeStoresTo predicate to prevent the DAGCombiner from creating
these stores as they will be split up during legalization.
llvm-svn: 303767
|
| |
|
|
|
|
|
|
|
|
|
| |
patterns"
This reverts commit e065977c4b5f68ab845400b256f6a3822b1325fa.
It doesn't work. S_LOAD_DWORD_IMM_ci and friends aren't selected by any of
the patterns, so it was putting 32-bit literals into the 8-bit field.
llvm-svn: 303754
|
| |
|
|
|
|
|
|
| |
This fixes 17 of the 41 -verify-machineinstrs test failures identified in PR33045
Differential Revision: https://reviews.llvm.org/D33451
llvm-svn: 303691
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
convention checking in PromoteAlloca
Summary:
Promoting Alloca to Vector and Promoting Alloca to LDS are two independent handling of Alloca and should not affect each other.
As a result, we should not give up promoting to vector if there is not enough LDS. This patch factors out the local memory usage
related checking out and replace it after the calling convention checking.
Reviewer:
arsenm
Differential Revision:
http://reviews.llvm.org/D33139
llvm-svn: 303684
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Perform DAG combine:
and (srl x, c), mask => shl (bfe x, nb + c, mask >> nb), nb
Where nb is a number of trailing zeroes in mask.
It replaces two instructions with two and BFE is generally a more
expensive one. However this is only done if we are selecting a byte
or word at an aligned boundary which results in a proper SDWA
operand pattern. It is only done if SDWA is supported.
TODO: improve SDWA pass to actually convert this pattern. It is not
done now because we have an immediate in the instruction, which has
be moved into a VGPR.
Differential Revision: https://reviews.llvm.org/D33455
llvm-svn: 303681
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This is just a cleanup. Also, it adds checking that ByteCount is aligned to 4.
Reviewers: arsenm, nhaehnle, tstellarAMD
Subscribers: kzhuravl, wdng, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D28994
llvm-svn: 303658
|
| |
|
|
|
|
|
|
|
|
|
| |
shl (or|add x, c2), c1 => or|add (shl x, c1), (c2 << c1)
This allows to fold a constant into an address in some cases as
well as to eliminate second shift if the expression is used as
an address and second shift is a result of a GEP.
Differential Revision: https://reviews.llvm.org/D33432
llvm-svn: 303641
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added separate pseudo and real instruction for GFX9 SDWA instructions.
Currently supports only in assembler.
Depends D32493
Reviewers: vpykhtin, artem.tamazov
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D33132
llvm-svn: 303620
|
| |
|
|
|
|
|
|
|
| |
Turn expensive 64 bit shift into 32 bit if shift does not overflow int:
shl (ext x) => zext (shl x)
Differential Revision: https://reviews.llvm.org/D33367
llvm-svn: 303569
|
| |
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D33289
llvm-svn: 303548
|
| |
|
|
|
|
|
|
|
|
| |
See bug 32922: https://bugs.llvm.org//show_bug.cgi?id=32922
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D32912
llvm-svn: 303428
|
| |
|
|
|
|
|
|
|
|
|
|
| |
See Bugs 33019, 33056:
https://bugs.llvm.org//show_bug.cgi?id=33019
https://bugs.llvm.org//show_bug.cgi?id=33056
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D33288
llvm-svn: 303423
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This provides a new way to access the TargetMachine through
TargetPassConfig, as a dependency.
The patterns replaced here are:
* Passes handling a null TargetMachine call
`getAnalysisIfAvailable<TargetPassConfig>`.
* Passes not handling a null TargetMachine
`addRequired<TargetPassConfig>` and call
`getAnalysis<TargetPassConfig>`.
* MachineFunctionPasses now use MF.getTarget().
* Remove all the TargetMachine constructors.
* Remove INITIALIZE_TM_PASS.
This fixes a crash when running `llc -start-before prologepilog`.
PEI needs StackProtector, which gets constructed without a TargetMachine
by the pass manager. The StackProtector pass doesn't handle the case
where there is no TargetMachine, so it segfaults.
Related to PR30324.
Differential Revision: https://reviews.llvm.org/D33222
llvm-svn: 303360
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There should be no intesection between SDWA operands and potential MIs. E.g.:
```
v_and_b32 v0, 0xff, v1 -> src:v1 sel:BYTE_0
v_and_b32 v2, 0xff, v0 -> src:v0 sel:BYTE_0
v_add_u32 v3, v4, v2
```
In that example it is possible that we would fold 2nd instruction into 3rd (v_add_u32_sdwa) and then try to fold 1st instruction into 2nd (that was already destroyed). So if SDWAOperand is also a potential MI then do not apply it.
Reviewers: vpykhtin, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D32804
llvm-svn: 303347
|
| |
|
|
|
|
|
|
| |
Partially implement callee-side for arguments and return values.
byval doesn't work properly, and most likely sret or other on-stack
return values most as well.
llvm-svn: 303308
|
| |
|
|
|
|
|
|
|
|
|
|
| |
In order for an arbitrary callee to access an object
in a caller's stack frame, the 32-bit offset used as
the private pointer needs to be relative to the kernel's
scratch wave offset register.
Convert to this by finding the difference from the current
stack frame and scaling by the wavefront size.
llvm-svn: 303303
|
| |
|
|
|
|
|
|
|
|
| |
Check the MachinePointerInfo for whether the access is
supposed to be relative to the stack pointer.
No tests because this is used in later commits implementing
calls.
llvm-svn: 303301
|
| |
|
|
|
|
| |
Handle more general swizzles.
llvm-svn: 303296
|