<feed xmlns='http://www.w3.org/2005/Atom'>
<title>bcm5719-llvm/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp, branch meklort-10.0.1</title>
<subtitle>Project Ortega BCM5719 LLVM</subtitle>
<id>https://git.raptorcs.com/git/bcm5719-llvm/atom?h=meklort-10.0.1</id>
<link rel='self' href='https://git.raptorcs.com/git/bcm5719-llvm/atom?h=meklort-10.0.1'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/'/>
<updated>2020-01-14T09:18:59+00:00</updated>
<entry>
<title>[AMDGPU] Model distance to instruction in bundle</title>
<updated>2020-01-14T09:18:59+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2020-01-14T01:01:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=ad741853c38880dff99cd5b5035b8965c5a73011'/>
<id>urn:sha1:ad741853c38880dff99cd5b5035b8965c5a73011</id>
<content type='text'>
This change allows modeling the height of an instruction
within a bundle for latency adjustment purposes.

Differential Revision: https://reviews.llvm.org/D72669
</content>
</entry>
<entry>
<title>Let targets adjust operand latency of bundles</title>
<updated>2020-01-10T22:56:53+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2020-01-10T20:28:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=987bf8b6c14613da907fa78330415e266b97a036'/>
<id>urn:sha1:987bf8b6c14613da907fa78330415e266b97a036</id>
<content type='text'>
This reverts the AMDGPU DAG mutation implemented in D72487 and gives
a more general way of adjusting BUNDLE operand latency.

It also replaces FixBundleLatencyMutation with the adjustSchedDependency
callback in AMDGPU, fixing not only successor latencies but
predecessors' as well.

Differential Revision: https://reviews.llvm.org/D72535
</content>
</entry>
<entry>
<title>AMDGPU/GlobalISel: Select G_EXTRACT_VECTOR_ELT</title>
<updated>2020-01-10T00:52:24+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2020-01-02T21:45:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=35c3d101aee240f6c034f25ff6800fda22a89987'/>
<id>urn:sha1:35c3d101aee240f6c034f25ff6800fda22a89987</id>
<content type='text'>
Doesn't try to do the fold into the base register of an add of a
constant in the index like the DAG path does.
</content>
</entry>
<entry>
<title>[AMDGPU] Fix bundle scheduling</title>
<updated>2020-01-09T23:56:36+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2020-01-09T22:28:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=cd69e4c74c174101817c9f6b7c02374ac6a7476f'/>
<id>urn:sha1:cd69e4c74c174101817c9f6b7c02374ac6a7476f</id>
<content type='text'>
Bundles coming to the scheduler were considered free, i.e. zero latency.
Fixed.

Differential Revision: https://reviews.llvm.org/D72487
</content>
</entry>
<entry>
<title>AMDGPU: Switch backend default max workgroup size to 1024</title>
<updated>2019-11-13T01:41:02+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2019-08-27T16:34:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=4b472139513ba460595804f8113497844b41fbcc'/>
<id>urn:sha1:4b472139513ba460595804f8113497844b41fbcc</id>
<content type='text'>
Previously this would default to 256, not the maximum supported size
of 1024. Using a maximum lower than the hardware maximum requires
language runtimes to enforce this limit for correctness, which no
language has correctly done. Switch the default to the conservatively
correct maximum, and force frontends to opt-in to the more optimal 256
default maximum.

I don't really understand why the changes in occupancy-levels.ll
increased the computed occupancy, which I expected to decrease. I'm
not sure if these tests should be forcing the old maximum.
</content>
</entry>
<entry>
<title>[AMDGPU] Fix mfma scheduling crash</title>
<updated>2019-10-24T18:01:52+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2019-10-24T17:34:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=3c8e055187d8adf1834cdc735d82df5529fbbd86'/>
<id>urn:sha1:3c8e055187d8adf1834cdc735d82df5529fbbd86</id>
<content type='text'>
An SUnit can be neither an instruction nor an SDNode. It is all
null if it represents a nop. Fixed a crash on using SU-&gt;getInstr().

Differential Revision: https://reviews.llvm.org/D69395
</content>
</entry>
<entry>
<title>[Alignment] Migrate Attribute::getWith(Stack)Alignment</title>
<updated>2019-10-15T12:56:24+00:00</updated>
<author>
<name>Guillaume Chatelet</name>
<email>gchatelet@google.com</email>
</author>
<published>2019-10-15T12:56:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=b65fa483058f1b4049c7201525779b4f49cceb80'/>
<id>urn:sha1:b65fa483058f1b4049c7201525779b4f49cceb80</id>
<content type='text'>
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jdoerfert

Reviewed By: courbet

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68792

llvm-svn: 374884
</content>
</entry>
<entry>
<title>[AMDGPU] fixed underflow in getOccupancyWithNumVGPRs</title>
<updated>2019-09-19T20:09:04+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2019-09-19T20:09:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=d487d6401d98ba7a759cd069b61ae67b286a3014'/>
<id>urn:sha1:d487d6401d98ba7a759cd069b61ae67b286a3014</id>
<content type='text'>
The function could return zero if an extreme number of
registers were used. Minimal possible occupancy is 1.

Differential Revision: https://reviews.llvm.org/D67771

llvm-svn: 372350
</content>
</entry>
<entry>
<title>Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"</title>
<updated>2019-09-19T16:26:14+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2019-09-19T16:26:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=3ecab8e4555aee0b4aa10c413696a67f55948c39'/>
<id>urn:sha1:3ecab8e4555aee0b4aa10c413696a67f55948c39</id>
<content type='text'>
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)

This was missing one switch to getTargetConstant in an untested case.

llvm-svn: 372338
</content>
</entry>
<entry>
<title>Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"</title>
<updated>2019-09-19T12:33:07+00:00</updated>
<author>
<name>Hans Wennborg</name>
<email>hans@hanshq.net</email>
</author>
<published>2019-09-19T12:33:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=13bdae8541c3fc5acf6ee7de78ec5ab8446848e4'/>
<id>urn:sha1:13bdae8541c3fc5acf6ee7de78ec5ab8446848e4</id>
<content type='text'>
This broke the Chromium build, causing it to fail with e.g.

  fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8&lt;15&gt;

See llvm-commits thread of r372285 for details.

This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.

&gt; Encode them directly as an imm argument to G_INTRINSIC*.
&gt;
&gt; Since intrinsics can now define what parameters are required to be
&gt; immediates, avoid using registers for them. Intrinsics could
&gt; potentially want a constant that isn't a legal register type. Also,
&gt; since G_CONSTANT is subject to CSE and legalization, transforms could
&gt; potentially obscure the value (and create extra work for the
&gt; selector). The register bank of a G_CONSTANT is also meaningful, so
&gt; this could throw off future folding and legalization logic for AMDGPU.
&gt;
&gt; This will be much more convenient to work with than needing to call
&gt; getConstantVRegVal and checking if it may have failed for every
&gt; constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with
&gt; immarg operands, many of which need inspection during lowering. Having
&gt; to find the value in a register is going to add a lot of boilerplate
&gt; and waste compile time.
&gt;
&gt; SelectionDAG has always provided TargetConstant for constants which
&gt; should not be legalized or materialized in a register. The distinction
&gt; between Constant and TargetConstant was somewhat fuzzy, and there was
&gt; no automatic way to force usage of TargetConstant for certain
&gt; intrinsic parameters. They were both ultimately ConstantSDNode, and it
&gt; was inconsistently used. It was quite easy to mis-select an
&gt; instruction requiring an immediate. For SelectionDAG, start emitting
&gt; TargetConstant for these arguments, and using timm to match them.
&gt;
&gt; Most of the work here is to cleanup target handling of constants. Some
&gt; targets process intrinsics through intermediate custom nodes, which
&gt; need to preserve TargetConstant usage to match the intrinsic
&gt; expectation. Pattern inputs now need to distinguish whether a constant
&gt; is merely compatible with an operand or whether it is mandatory.
&gt;
&gt; The GlobalISelEmitter needs to treat timm as a special case of a leaf
&gt; node, similar to MachineBasicBlock operands. This should also enable
&gt; handling of patterns for some G_* instructions with immediates, like
&gt; G_FENCE or G_EXTRACT.
&gt;
&gt; This does include a workaround for a crash in GlobalISelEmitter when
&gt; ARM tries to use "imm" in an output with a "timm" pattern source.

llvm-svn: 372314
</content>
</entry>
</feed>
