| Commit message | Author | Age | Files | Lines |
| |
Currently it returns an incorrect operand size for a target-independent
node such as COPY if the operand is a register with a subreg. Instead of
the correct subreg size it returns the size of the whole superreg.
Differential Revision: https://reviews.llvm.org/D52736
llvm-svn: 343508
| |
Summary: This change enables VOP3 shifts to be selected explicitly,
depending on divergence.
Differential Revision: https://reviews.llvm.org/D52559
Reviewers: rampitec
llvm-svn: 343455
| |
llvm-svn: 343369
| |
llvm-svn: 343264
| |
llvm-svn: 343259
| |
llvm-svn: 343254
| |
This reduces the number of VGPRs used in some cases.
Differential Revision: https://reviews.llvm.org/D52577
llvm-svn: 343249
| |
Summary: The convenience wrapper in STLExtras has been available since rL342102.
Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb
Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits
Differential Revision: https://reviews.llvm.org/D52573
llvm-svn: 343163
| |
Summary:
This is essentially NFC, because the complex pattern used for these patterns
will fail on non-CI, but this makes the pattern consistent with other CI
smrd patterns. It is also a performance improvement, because the pattern
will now fail earlier on non-CI.
Reviewers: arsenm, nhaehnle
Reviewed By: arsenm
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D52469
llvm-svn: 343125
| |
Differential Revision: https://reviews.llvm.org/D52522
llvm-svn: 343047
| |
Summary:
We generate s_xor to lower add of i1s in general cases, and s_not to
lower add with a one-bit imm of -1 (true).
Reviewers: rampitec
Differential Revision: https://reviews.llvm.org/D52518
llvm-svn: 343030
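The identity behind this lowering can be checked exhaustively. The sketch below is a plain Python model, not LLVM code; `add_i1` and `I1_MINUS_ONE` are hypothetical names used only to model one-bit wrapping addition.

```python
def add_i1(a: int, b: int) -> int:
    """Model of an i1 add: the sum wraps modulo 2."""
    return (a + b) & 1


# An i1 immediate of -1 has the single bit pattern 1 (true).
I1_MINUS_ONE = -1 & 1

for a in (0, 1):
    # Adding -1 (true) flips the bit, which is what a bitwise NOT does.
    assert add_i1(a, I1_MINUS_ONE) == (~a & 1)
    for b in (0, 1):
        # In the general case a one-bit add is the same as XOR.
        assert add_i1(a, b) == (a ^ b)
```

Because both identities hold for every input, XOR and NOT are exact replacements for the add in the one-bit case.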
| |
[AMDGPU] lower-switch in preISel as a workaround for legacy DA
Summary:
The default target of the switch instruction may sometimes be an
"unreachable" block, when it is guaranteed that one of the cases is
always taken. The dominator tree concludes that such a switch
instruction does not have an immediate post dominator. This confuses
divergence analysis, which is unable to propagate sync dependence to
the targets of the switch instruction.
As a workaround, the AMDGPU target now invokes lower-switch as a
preISel pass. LowerSwitch is designed to handle the unreachable
default target correctly, allowing the divergence analysis to locate
the correct immediate dominator of the now-lowered switch.
llvm-svn: 342956
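The post-dominator problem described above can be reproduced on a toy control-flow graph. The sketch below is a simplified model and not LLVM's PostDominatorTree: block names, the virtual `<exit>` node, and both helper functions are assumptions made for illustration, and dropping the default edge in the second graph is only a stand-in for the lowered form.

```python
EXIT = "<exit>"  # hypothetical virtual exit joining all terminating blocks


def post_dominators(succs):
    """Iteratively compute the post-dominator set of every block."""
    # Blocks without successors (return, unreachable) flow to the virtual exit.
    graph = {n: (list(s) if s else [EXIT]) for n, s in succs.items()}
    graph[EXIT] = []
    pdom = {n: set(graph) for n in graph}
    pdom[EXIT] = {EXIT}
    changed = True
    while changed:
        changed = False
        for n in succs:
            new = set.intersection(*(pdom[s] for s in graph[n])) | {n}
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def immediate_post_dominator(pdom, node):
    """Return the closest strict post-dominator, or None if it is the virtual exit."""
    strict = pdom[node] - {node}
    for candidate in strict:
        if pdom[candidate] == strict:
            return None if candidate == EXIT else candidate
    return None


# Switch whose default target is an "unreachable" block.
with_default = {
    "entry": ["switch"],
    "switch": ["case_a", "case_b", "default"],
    "case_a": ["join"],
    "case_b": ["join"],
    "default": [],  # unreachable: terminates without reaching "join"
    "join": [],     # returns
}

# Same graph with the unreachable default edge removed.
without_default = {
    "entry": ["switch"],
    "switch": ["case_a", "case_b"],
    "case_a": ["join"],
    "case_b": ["join"],
    "join": [],
}

print(immediate_post_dominator(post_dominators(with_default), "switch"))
print(immediate_post_dominator(post_dominators(without_default), "switch"))
```

In the first graph the only common post-dominator of the switch targets is the virtual exit, so no real block qualifies; in the second graph the join block does.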
| |
If the alignment is at least 4, this should report true.
Something still seems off with how < 4-byte types are
handled here though.
Fixing this seems to change how some combines get
to where they get, but somehow isn't changing the net
result.
llvm-svn: 342879
| |
"[AMDGPU] lower-switch in preISel as a workaround for legacy DA"
This broke regression tests. The first breakage was noticed here:
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/23549
llvm-svn: 342743
| |
Summary:
The default target of the switch instruction may sometimes be an
"unreachable" block, when it is guaranteed that one of the cases is
always taken. The dominator tree concludes that such a switch
instruction does not have an immediate post dominator. This confuses
divergence analysis, which is unable to propagate sync dependence to
the targets of the switch instruction.
As a workaround, the AMDGPU target now invokes lower-switch as a
preISel pass. LowerSwitch is designed to handle the unreachable
default target correctly, allowing the divergence analysis to locate
the correct immediate dominator of the now-lowered switch.
Reviewers: arsenm, nhaehnle
Reviewed By: nhaehnle
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits, simoll
Differential Revision: https://reviews.llvm.org/D52221
llvm-svn: 342722
| |
Summary: This change is the first part of the AMDGPU target description
change. Its aim is to split the vector and scalar flows effectively at
the selection stage. Selection uses predicate functions based on the
framework implemented earlier in https://reviews.llvm.org/D35267
Differential revision: https://reviews.llvm.org/D52019
Reviewers: rampitec
llvm-svn: 342719
| |
Summary:
This is required for GPUs with 16 bit instructions where f16 is a
legal register type and hence int_to_fp i1 to f16 is not lowered
by legalizing.
Reviewers: arsenm, nhaehnle
Reviewed By: nhaehnle
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D52018
Change-Id: Ie4c0fd6ced7cf10ad612023c6879724d9ded5851
llvm-svn: 342558
| |
- Instead of having both `SUnit::dump(ScheduleDAG*)` and
`ScheduleDAG::dumpNode(ScheduleDAG*)`, just keep the latter around.
- Add `ScheduleDAG::dump()` and avoid code duplication in several
places. Implement it for different ScheduleDAG variants.
- Add `ScheduleDAG::dumpNodeName()` in favor of the `SUnit::print()`
functions. They were only ever used for debug dumping and putting the
function into ScheduleDAG is consistent with the `dumpNode()` change.
llvm-svn: 342520
| |
Summary: D.u32 = S0.u4[0] * S1.u4[0] +
S0.u4[1] * S1.u4[1] +
S0.u4[2] * S1.u4[2] +
S0.u4[3] * S1.u4[3] +
S0.u4[4] * S1.u4[4] +
S0.u4[5] * S1.u4[5] +
S0.u4[6] * S1.u4[6] +
S0.u4[7] * S1.u4[7] +
S2.u32
Author: FarhanaAleen
Reviewed By: arsenm, nhaehnle
Differential Revision: https://reviews.llvm.org/D51947
llvm-svn: 342497
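The formula in the summary can be read as a dot product over eight packed unsigned 4-bit lanes plus a 32-bit accumulator. The following is a minimal reference sketch of that arithmetic; the function name `dot8_u32_u4` is hypothetical, and wrapping the result to 32 bits is an assumption, since the summary does not state the overflow behaviour.

```python
def dot8_u32_u4(s0: int, s1: int, s2: int) -> int:
    """D.u32 = sum(S0.u4[i] * S1.u4[i] for i in 0..7) + S2.u32."""
    acc = s2 & 0xFFFFFFFF
    for i in range(8):
        a = (s0 >> (4 * i)) & 0xF  # S0.u4[i]
        b = (s1 >> (4 * i)) & 0xF  # S1.u4[i]
        acc += a * b
    # Assumption: the result wraps to 32 bits rather than saturating.
    return acc & 0xFFFFFFFF


# Lanes 0 and 1 hold (1, 2) in s0 and (3, 4) in s1: 1*3 + 2*4 + 0 = 11.
print(dot8_u32_u4(0x00000021, 0x00000043, 0))
```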
| |
If there is a single use constant, it can be folded into the
min/max, but not into med3.
llvm-svn: 342443
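For context, a min/max pair and med3 compute the same clamp when the lower bound does not exceed the upper bound, which is why either form is a candidate in the first place. The sketch below checks that equivalence in plain Python; `med3` and `clamp_min_max` are hypothetical helper names, and the assumption that the commit concerns this clamp pattern comes from the message text only.

```python
def med3(a, b, c):
    """Median of three values."""
    return sorted((a, b, c))[1]


def clamp_min_max(x, lo, hi):
    """Clamp expressed as a max followed by a min."""
    return min(max(x, lo), hi)


# With lo <= hi the two forms agree for every input.
lo, hi = 0, 7
for x in range(-10, 20):
    assert med3(x, lo, hi) == clamp_min_max(x, lo, hi)
```

Since the results are identical, the choice between the two forms can be made on encoding grounds such as whether a constant operand can be folded.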
| |
llvm-svn: 342439
| |
I need to use it in the GCN codegen.
Differential Revision: https://reviews.llvm.org/D52123
llvm-svn: 342400
| |
Change by Tony Tye
llvm-svn: 342270
| |
Summary:
GFX9 and above support sin/cos instructions with a greater range and thus don't
require a fract instruction prior to invocation.
Added a subtarget feature to reflect this and added code to take advantage of
the expanded range on GFX9+.
Also updated the tests to check correct behaviour.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D51933
Change-Id: I1c1f1d3726a5ae32116646ca5cfa1ab4ef69e5b0
llvm-svn: 342222
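The role of the fract instruction can be illustrated numerically. The sketch below assumes a hardware sine that takes its argument in revolutions (the input pre-multiplied by 1/(2*pi)); `hw_sin`, `sin_with_fract`, and `sin_without_fract` are hypothetical models, and Python's `math.sin` stands in for the hardware, so it has no range limit of its own.

```python
import math


def fract(x: float) -> float:
    """Fractional part, x - floor(x), always in [0, 1)."""
    return x - math.floor(x)


def hw_sin(turns: float) -> float:
    """Model of a sine unit whose input is measured in revolutions."""
    return math.sin(2.0 * math.pi * turns)


def sin_with_fract(x: float) -> float:
    """Reduce the scaled argument into [0, 1) before the sine unit."""
    return hw_sin(fract(x * (0.5 / math.pi)))


def sin_without_fract(x: float) -> float:
    """Feed the scaled argument directly, relying on a wider input range."""
    return hw_sin(x * (0.5 / math.pi))


for x in (-100.0, -1.5, 0.0, 2.0, 100.0):
    # Sine is periodic in whole revolutions, so the reduction is value-preserving.
    assert abs(sin_with_fract(x) - sin_without_fract(x)) < 1e-9
```

The reduction only matters when the sine unit cannot accept the unreduced argument, which is the limitation the new subtarget feature describes as lifted.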
| |
Summary:
I accidentally left this behind in D50306, and it causes a build warning
when I build with gcc7.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D52022
Change-Id: I30f7a47047e9d9d841f652da66d2fea19e74842c
llvm-svn: 342189
| |
If an argument was passed on the stack, this
was using the default alignment.
I'm not sure there's an observable change from this. This
was observable due to bugs in expansion of unaligned
loads and stores, but since that is fixed I don't think
this matters much.
llvm-svn: 342133
| |
Differential revision: https://reviews.llvm.org/D51931
Reviewers: rampitec
llvm-svn: 342120
| |
Load offset inlining pattern changed.
Differential revision: https://reviews.llvm.org/D51975
Reviewers: rampitec
llvm-svn: 342115
| |
default values)
Change by Tony Tye
Differential Revision: https://reviews.llvm.org/D51954
llvm-svn: 342077
| |
Move ISA version determination into TargetParser.
Also switch away from target features to the CPU string when
determining the ISA version. This fixes an issue where we
output the wrong ISA version in the object code when features
of a particular CPU are altered (e.g. gfx902 w/o xnack
used to result in gfx900).
llvm-svn: 342069
| |
TargetParser."
This reverts commit r341982.
The change introduced a layering violation. Reverting to unbreak
our integrate.
llvm-svn: 342023
| |
into TargetParser.
Also switch away from target features to the CPU string when
determining the ISA version. This fixes an issue where we
output the wrong ISA version in the object code when features
of a particular CPU are altered (e.g. gfx902 w/o xnack
used to result in gfx900).
Differential Revision: https://reviews.llvm.org/D51890
llvm-svn: 341982
| |
Immediate selection predicate changed
Differential revision: https://reviews.llvm.org/D51734
Reviewers: rampitec
llvm-svn: 341928
| |
llvm-svn: 341895
| |
Inline immediate move to V_MADAK_F32.
Differential revision: https://reviews.llvm.org/D51586
Reviewer: rampitec
llvm-svn: 341843
| |
Now the pointer size should always be correct and
we don't need to improperly inspect the pointee type.
llvm-svn: 341806
| |
This will require something to cast. Previously this would eliminate
the cast, which would result in copies of $noreg.
llvm-svn: 341803
| |
This already worked if only one register piece was used,
but didn't if a type was split into multiple, unequal
sized pieces.
Fixes not splitting v3i16/v3f16 into two registers for
AMDGPU.
This will also allow fixing the ABI for 16-bit vectors
in a future commit so that it's the same for all subtargets.
llvm-svn: 341801
| |
GCNHazardRecognizer wait state counting
Summary:
This fixes a bug where a large number of implicit def instructions can fill the GCNHazardRecognizer lookahead buffer, causing required NOPs to not be inserted.
Reviewers: nhaehnle, arsenm
Reviewed By: arsenm
Subscribers: sheredom, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D51726
Change-Id: Ie75338f94de704ee5816b05afd0c922c6748a95b
llvm-svn: 341798
| |
llvm-svn: 341768
| |
llvm-svn: 341767
| |
immediate SMRD offset.
Differential revision: https://reviews.llvm.org/D51610
Reviewer: rampitec
llvm-svn: 341636
| |
Causes a regression in expensive checks.
llvm-svn: 341589
| |
llvm-svn: 341567
| |
Emit a waterfall loop in the general case for a potentially-divergent Rsrc
operand. When practical, avoid this by using Addr64 instructions.
Differential Revision: https://reviews.llvm.org/D50982
llvm-svn: 341413
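The general idea of a waterfall loop can be sketched as a lane-level simulation: pick the value held by the first remaining lane, run the operation for every lane that holds the same value, and repeat until no lanes remain. This is a conceptual Python model only; `waterfall`, its parameters, and the lane representation are assumptions for illustration and say nothing about the instructions the backend actually emits.

```python
def waterfall(rsrc, active_lanes, op):
    """Run op once per distinct rsrc value, covering all active lanes."""
    results = {}
    remaining = sorted(active_lanes)
    iterations = 0
    while remaining:
        # Take the value from the first remaining lane and treat it as uniform.
        uniform = rsrc[remaining[0]]
        # Execute the operation only for lanes that hold that same value.
        for lane in remaining:
            if rsrc[lane] == uniform:
                results[lane] = op(uniform, lane)
        # Lanes with a different value wait for a later iteration.
        remaining = [lane for lane in remaining if rsrc[lane] != uniform]
        iterations += 1
    return results, iterations


# Three distinct descriptors across four lanes need three iterations.
results, iterations = waterfall(["A", "B", "A", "C"], [0, 1, 2, 3],
                                lambda r, lane: f"{r}{lane}")
print(results, iterations)
```

When every lane holds the same value the loop runs once, so the cost of the general case scales with the number of distinct values rather than the number of lanes.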
| |
Match behavior in DAG of r340343
llvm-svn: 341393
| |
llvm-svn: 341303
| |
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D49737
llvm-svn: 341271
| |
Differential Revision: https://reviews.llvm.org/D51555
llvm-svn: 341266
| |
The intention is to enable the extract_vector_elt load combine,
and doing this for other operations interferes with more
useful optimizations on vectors.
Handle any type of load since in principle we should do the
same combine for the various load intrinsics.
llvm-svn: 341219
|