| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
| |
There should probably be nonconvergent versions, but my guess is it
doesn't matter in practice.
llvm-svn: 361331
|
|
|
|
| |
llvm-svn: 361330
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unfortunately the way SIInsertSkips works is backwards, and is
required for correctness. r338235 added handling of some special cases
where skipping is mandatory to avoid side effects if no lanes are
active. It conservatively handled asm correctly, but the same logic
needs to apply to calls.
Usually the call sequence code is larger than the skip threshold,
although the way the count is computed is really broken, so I'm not
sure if anything was likely to really hit this.
llvm-svn: 361202
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
chains. NFC
A std::array is implemented as a template with an array
inside a struct. Older versions of clang, like 3.6,
require an extra set of curly braces around std::array
initializations to avoid warnings.
The C++ language was changed regarding this by CWG 1270.
So more modern tool chains does not complaing even if
leaving out one level of braces.
llvm-svn: 361171
|
|
|
|
| |
llvm-svn: 361167
|
|
|
|
| |
llvm-svn: 361134
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Avoid introducing hazard mitigation when lgkmcnt is reduced to 0.
Clarify code comments to explain assumptions made for this hazard
mitigation. Expand and correct test cases to cover variants of
s_waitcnt.
Reviewers: nhaehnle, rampitec
Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62058
llvm-svn: 361124
|
|
|
|
| |
llvm-svn: 361082
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is ported from the custom AMDGPU DAG implementation. I think this
is a better default expansion than what the DAG currently uses, at
least if the target has CTLZ.
This implements the signed version in terms of the unsigned
conversion, which is implemented with bit operations. SelectionDAG has
several other implementations that should eventually be ported
depending on what instructions are legal.
llvm-svn: 361081
|
|
|
|
|
|
|
|
|
|
| |
See bug 41298: https://bugs.llvm.org/show_bug.cgi?id=41298
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D61009
llvm-svn: 361045
|
|
|
|
|
|
|
|
|
|
| |
See https://bugs.llvm.org/show_bug.cgi?id=41888
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D62016
llvm-svn: 361040
|
|
|
|
|
|
|
|
|
|
| |
See bug 40873: https://bugs.llvm.org/show_bug.cgi?id=40873
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D60768
llvm-svn: 361031
|
|
|
|
| |
llvm-svn: 361030
|
|
|
|
| |
llvm-svn: 361028
|
|
|
|
| |
llvm-svn: 361027
|
|
|
|
| |
llvm-svn: 361026
|
|
|
|
| |
llvm-svn: 361025
|
|
|
|
| |
llvm-svn: 361023
|
|
|
|
|
|
|
| |
This saves instructions and extra steps, but I'm not sure about
introducing subregister indexes at this point.
llvm-svn: 361022
|
|
|
|
|
|
|
| |
This adds support for more complex waterfall loops that need to handle
operands > 32-bits, and multiple operands.
llvm-svn: 361021
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In order to combine memory operations efficiently, the load/store
optimizer might move some instructions around. It's usually safe
to move instructions down past the merged instruction because the
pass checks if memory operations can be re-ordered.
Though, the current logic doesn't handle Write-after-Write hazards.
This fixes a reflection issue with Monster Hunter World and DXVK.
v2: - rebased on top of master
- clean up the test case
- handle WaW hazards correctly
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=40130
Original patch by Samuel Pitoiset.
Reviewers: tpr, arsenm, nhaehnle
Reviewed By: nhaehnle
Subscribers: ronlieb, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D61313
llvm-svn: 361008
|
|
|
|
|
|
|
| |
The call was missing chain dependencies on the pre-call copies. I
don't think this was causing any real issues however.
llvm-svn: 360906
|
|
|
|
|
|
|
|
|
|
|
| |
This is the conservatively correct default. It is always safe to
assume xnack is enabled, but not the converse.
Introduce a feature to blacklist targets where xnack can never be
meaningfully enabled. I'm not sure the targets this is applied to is
100% correct.
llvm-svn: 360903
|
|
|
|
|
|
| |
Bool values should use the scc/vcc regbank since r350611.
llvm-svn: 360877
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
SGPR in CC can be either hw initialized or set by other chained shaders
and so this increases the SGPR count availalbe to CC to 105.
Change-Id: I3dfadc750fe4a3e2bd07117a2899fd13f3e2fef3
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61261
llvm-svn: 360778
|
|
|
|
|
|
|
|
| |
Move the declarations of getThe<Name>Target() functions into a new header in
TargetInfo and make users of these functions include this new header.
This fixes a layering problem.
llvm-svn: 360713
|
|
|
|
|
|
|
|
| |
Reviewers: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D61905
llvm-svn: 360702
|
|
|
|
|
|
|
|
| |
This bug was exposed by the rL360395.
Differential Revision: https://reviews.llvm.org/D61812
llvm-svn: 360689
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The +DumpCode attribute is a horrible hack in AMDGPU to embed the
disassembly of the generated code into the elf file. It is used by LLPC
to implement an extension that allows the application to read back the
disassembly of the code. Longer term, we should re-implement that by
using the LLVM disassembler from the Vulkan driver.
Recent LLVM changes broke +DumpCode. With -filetype=asm it crashed, and
with -filetype=obj I think it did not include any instructions, only the
labels. Fixed with this commit: now it has no effect with -filetype=asm,
and works as intended with -filetype=obj.
Differential Revision: https://reviews.llvm.org/D60682
Change-Id: I6436d86fe2ea220d74a643a85e64753747c9366b
llvm-svn: 360688
|
|
|
|
| |
llvm-svn: 360609
|
|
|
|
| |
llvm-svn: 360608
|
|
|
|
|
|
|
|
|
| |
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
llvm-svn: 360487
|
|
|
|
|
|
|
|
|
| |
This also allows three op patterns to use increased constant bus
limit of GFX10.
Differential Revision: https://reviews.llvm.org/D61763
llvm-svn: 360395
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61703
llvm-svn: 360364
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61704
llvm-svn: 360353
|
|
|
|
| |
llvm-svn: 360294
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VOP3 form should always be the preferred selection, to be shrunk
later. This should only be an optimization issue, but this partially
works around a problem from clobbering VCC when SIFixSGPRCopies
rewrites an SCC defining operation directly to VCC.
3 of the testcases are regressions from failing to fold the immediate
in cases it should. These can be avoided by improving the VCC liveness
handling in SIFoldOperands. Simply increasing the threshold to
computeRegisterLiveness works, although this is common enough that VCC
liveness should probably be tracked throughout the pass. The hack of
leaving behind an implicit_def instruction to avoid breaking iterator
wastes instruction count, which inhibits finding the VCC def in long
chains of adds. Doing this however exposes different, worse looking
regressions from poor scheduling behavior. This could probably be
avoided around by forcing the shrink of the addc here, but the
scheduler should probably be fixed.
The r600 add test needs to be split out because it asserts on the
arguments in the new test during the calling convention lowering.
llvm-svn: 360293
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61701
llvm-svn: 360287
|
|
|
|
|
|
|
| |
Differential Revision:
https://reviews.llvm.org/D61430
llvm-svn: 360283
|
|
|
|
|
|
| |
This was committed in rL358887 but reverted in rL360066 due to a x86 regression, really it should be have been pre-committed instead of being part of the SimplifyDemandedBits bitcast patch.
llvm-svn: 360263
|
|
|
|
|
|
| |
Fixes static analyzer undefined value warning.
llvm-svn: 360239
|
|
|
|
|
|
|
|
| |
As noted in https://www.viva64.com/en/b/0629/ (Snippet No. 36) and the scan-build CI reports (https://llvm.org/reports/scan-build/report-SIModeRegister.cpp-Status-1-1.html#EndPath), rL348754 introduced a typo in the Status constructor due to argument variable names shadowing the member variable names.
Differential Revision: https://reviews.llvm.org/D61595
llvm-svn: 360236
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: GCNHazardRecognizer fails to identify hazards that are in and around bundles. This patch allows the hazard recognizer to consider bundled instructions in both scheduler and hazard recognizer mode. We ignore “bundledness” for the purpose of detecting hazards and examine the instructions individually.
Reviewers: arsenm, msearles, rampitec
Reviewed By: rampitec
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61564
llvm-svn: 360199
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
No test case because I don't know of a way to trigger this, but I
accidentally caused this to fail while working on a different change.
Change-Id: I8015aa447fe27163cc4e4902205a203bd44bf7e3
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61490
llvm-svn: 360123
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61521
llvm-svn: 360095
|
|
|
|
|
|
|
|
|
| |
GFX10 deprecates v_mul_lo_i32 instruction, so choose u32 form for
all targets.
Differential Revision: https://reviews.llvm.org/D61525
llvm-svn: 360094
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61535
llvm-svn: 360087
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reverts "[X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result MOVD/MOVQ and COPY_TO_REGCLASS instead"
Reverts "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling"
Eric Christopher and Jorge Gorbe Moya reported some issues with these patches to me off list.
Removing the CodeGenOnly instructions has changed how fneg is handled during fast-isel with sse/sse2. We're now emitting fsub -0.0, x instead
moving to the integer domain(in a GPR), xoring the sign bit, and then moving back to xmm. This is because the fast isel table no longer
contains an entry for (f32/f64 bitcast (i32/i64)) so the target independent fneg code fails. The use of fsub changes the behavior of nan with
respect to -O2 codegen which will always use a pxor. NOTE: We still have a difference with double with -m32 since the move to GPR doesn't work
there. I'll file a separate PR for that and add test cases.
Since removing the CodeGenOnly instructions was fixing PR41619, I'm reverting r358887 which exposed that PR. Though I wouldn't be surprised
if that bug can still be hit independent of that.
This should hopefully get Google back to green. I'll work with Simon and other X86 folks to figure out how to move forward again.
llvm-svn: 360066
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61046
llvm-svn: 360044
|
|
|
|
| |
llvm-svn: 359963
|