| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current implementation assumes there is an instruction associated
with the transform, but this is not the case for
timm/TargetConstant/immarg values. These transforms should directly
operate on a specific MachineOperand in the source
instruction. TableGen would assert if you attempted to define an
equivalent GISDNodeXFormEquiv using timm when it failed to find the
instruction matcher.
Specially recognize SDNodeXForms on timm, and pass the operand index
to the render function.
Ideally this would be a separate render function type that looks like
void renderFoo(MachineInstrBuilder, const MachineOperand&), but this
proved to be somewhat mechanically painful. Add an optional operand
index which will only be passed if the transform should only look at
the one source operand.
Theoretically it would also be possible to only ever pass the
MachineOperand, and the existing renderers would check the parent. I
think that would be somewhat ugly for the standard usage which may
want to inspect other operands, and I also think MachineOperand should
eventually not carry a pointer to the parent instruction.
Use it in one sample pattern. This isn't a great example, since the
transform exists to satisfy DAG type constraints. This could also be
avoided by just changing the MachineInstr's arbitrary choice of
operand type from i16 to i32. Other patterns have nontrivial uses, but
this serves as the simplest example.
One flaw this still has is if you try to use an SDNodeXForm defined
for imm, but the source pattern uses timm, you still see the "Failed
to lookup instruction" assert. However, there is now a way to avoid
it.
|
| |
|
|
|
|
| |
Reduces diff for a future patch.
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D69867
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We are duplicating predicates if several parts of the combined
predicate list contain the same condition. Added code to deduplicate
the list.
We have AssemblerPredicates and AssemblerPredicate in the
PredicateControl, but we never use AssemblerPredicates with an
actual list, so this one is dropped.
This addresses the first part of the llvm bug 43886:
https://bugs.llvm.org/show_bug.cgi?id=43886
Differential Revision: https://reviews.llvm.org/D69815
|
|
|
|
|
|
|
|
|
|
|
|
| |
ds_[read/write]_addtid_b32
See https://bugs.llvm.org/show_bug.cgi?id=37941
Reviewers: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D68787
llvm-svn: 374561
|
|
|
|
|
|
|
|
|
| |
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)
This was missing one switch to getTargetConstant in an untested case.
llvm-svn: 372338
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This broke the Chromium build, causing it to fail with e.g.
fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>
See llvm-commits thread of r372285 for details.
This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.
> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372314
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Encode them directly as an imm argument to G_INTRINSIC*.
Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.
This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.
SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.
Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.
The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.
This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372285
|
|
|
|
|
|
|
| |
Fixes 8-byte, 8-byte aligned LDS loads. 16-byte case still broken due
to not be reported as legal.
llvm-svn: 371413
|
|
|
|
| |
llvm-svn: 371156
|
|
|
|
|
|
| |
There should probably be a size only matcher.
llvm-svn: 371155
|
|
|
|
| |
llvm-svn: 367511
|
|
|
|
| |
llvm-svn: 367507
|
|
|
|
|
|
|
|
| |
Start migrating to a form that will be compatible with the global isel
emitter. Also should fix some overly lax checks on the memory type,
which allowed mis-selecting some illegal atomics.
llvm-svn: 367506
|
|
|
|
| |
llvm-svn: 367504
|
|
|
|
|
|
|
|
|
|
|
| |
The xform has no real valuewhen it's using out of a complex pattern
output. The complex pattern was already creating TargetConstants with
i16, so this was just unnecessary machinery.
This allows global isel to import the simple cases once the complex
pattern is implemented.
llvm-svn: 366743
|
|
|
|
|
|
|
| |
This is apparently required to be the immediately following
instruction, so force it into a bundle with a waitcnt.
llvm-svn: 366607
|
|
|
|
|
|
|
|
| |
NFC at the momemnt, needed for future commit.
Differential Revision: https://reviews.llvm.org/D64761
llvm-svn: 366092
|
|
|
|
|
|
|
|
|
|
| |
See bug 42599: https://bugs.llvm.org/show_bug.cgi?id=42599
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D64716
llvm-svn: 366067
|
|
|
|
|
|
|
| |
This will help removing the custom load predicates, allowing the
global isel emitter to handle them.
llvm-svn: 365398
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Original patch by Marek Olšák
Change-Id: Ia97d5d685a63a377d86e82942436d1fe6e429bab
Reviewers: mareko, arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63452
llvm-svn: 364814
|
|
|
|
| |
llvm-svn: 363983
|
|
|
|
|
|
|
| |
It is necessary to emit this loop around GWS operations in case the
wave is preempted pre-GFX9.
llvm-svn: 363979
|
|
|
|
|
|
|
|
| |
This reapplies r363678, using the correct chain for the CopyToReg for
v0. glueCopyToM0 counterintuitively changes the operands of the
original node.
llvm-svn: 363870
|
|
|
|
|
|
|
|
|
| |
There may or may not be additional work to handle this correctly on
SI/CI.
........
Breaks EXPENSIVE_CHECKS buildbots - http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/78/
llvm-svn: 363797
|
|
|
|
|
|
|
| |
There may or may not be additional work to handle this correctly on
SI/CI.
llvm-svn: 363678
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61332
llvm-svn: 359696
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D60346
llvm-svn: 357835
|
|
|
|
|
|
|
|
|
| |
We have done some predicate and feature refactoring lately but
did not upstream it. This is to sync.
Differential revision: https://reviews.llvm.org/D60292
llvm-svn: 357791
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When matching half of the build_vector to a load, there could still be
a hidden dependency on the other half of the build_vector the pattern
wouldn't detect. If there was an additional chain dependency on the
other value, a cycle could be introduced.
I don't think a tablegen pattern is capable of matching the necessary
conditions, so move this into PreprocessISelDAG. Check isPredecessorOf
for the other value to avoid a cycle. This has a warning that it's
expensive, so this should probably be moved into an MI pass eventually
that will have more freedom to reorder instructions to help match
this. That is currently complicated by the lack of a computeKnownBits
type mechanism for the selected function.
llvm-svn: 355731
|
|
|
|
|
|
|
|
|
|
|
|
| |
These were not recognized as potential atomics by memory legalizer.
The test was working not because legalizer did a right thing, but
because it has skipped all these instructions. When I have fixed
DS desciption test started to fail because region address has
changed from 4 to 2 a while ago.
Differential Revision: https://reviews.llvm.org/D58802
llvm-svn: 355179
|
|
|
|
|
|
|
| |
It breaks one of our downstream merges, so revert it
temporarily while investigating failures downstream
llvm-svn: 354700
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D58522
llvm-svn: 354620
|
|
|
|
|
|
|
| |
These are no longer necessary since the R600 tablegen files are split
out now.
llvm-svn: 353548
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D52944
llvm-svn: 351351
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
To workaround a hardware issue in the (base + offset) calculation
when base is negative. The impact on code quality should be limited
since SILoadStoreOptimizer still runs afterwards and is able to
combine loads/stores based on known sign information.
This fixes visible corruption in Hitman on SI (easily reproducible
by running benchmark mode).
Change-Id: Ia178d207a5e2ac38ae7cd98b532ea2ae74704e5f
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99923
Reviewers: arsenm, mareko
Subscribers: jholewinski, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D53160
llvm-svn: 344698
|
|
|
|
|
|
|
|
| |
Not sure why the 32/64 split is needed in the atomic_load
store hierarchies. The regular PatFrags do this, but we don't
do it for the existing handling for global.
llvm-svn: 335325
|
|
|
|
|
|
|
|
|
| |
- Predicate D16 patterns on this new feature
- Added this new feature to gfx900/2/4
Differential Revision: https://reviews.llvm.org/D46366
llvm-svn: 331551
|
|
|
|
|
|
|
|
|
| |
See bug 36833: https://bugs.llvm.org/show_bug.cgi?id=36833
Differential Revision: https://reviews.llvm.org/D44779
Reviewers: arsenm, artem.tamazov, timcorringham
llvm-svn: 328713
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This is a follow-on patch of https://reviews.llvm.org/D44210
Author: FarhanaAleen
Reviewed By: msearles
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44319
llvm-svn: 327726
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
address-space.
Summary: Starting from GCN 2nd generation, ISA supports ds_read_b128 on top of ds_read_b64.
This patch supports ds_read_b128 instruction pattern and generation of this instruction.
In the vectorizer, this patch also widen the vector length so that vectorizer generates
128 bit loads for local address-space which gets translated to ds_read_b128.
Since the performance benefit is not clear; compiler generates ds_read_b128 under -amdgpu-ds128.
Author: FarhanaAleen
Reviewed By: rampitec, arsenm
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44210
llvm-svn: 327153
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
.NAME is a bit of an odd duck, in that we should really treat it like
a template argument, but we currently don't, and so when and where
NAME is initialized and how is pretty inconsistent. Best to just avoid
using it as a field of already instantiated records, and use cast to
string instead.
Change-Id: I5a0c202401cede3d5c3827ab9c7858ea48b29108
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D43551
llvm-svn: 325794
|
|
|
|
|
|
|
|
|
|
|
|
| |
added llvm.amdgcn.atomic.{add|min|max}.f32 intrinsics
to allow generate ds_{add|min|max}[_rtn]_f32 instructions
needed for OpenCL float atomics in LDS
Reviewed by: arsenm
Differential Revision: https://reviews.llvm.org/D37985
llvm-svn: 322656
|
|
|
|
|
|
|
|
|
| |
GFX9 stopped using m0 for most DS instructions. Select
a different instruction without the use. I think this will
be less error prone than trying to manually maintain m0
uses as needed.
llvm-svn: 319270
|
|
|
|
| |
llvm-svn: 318246
|
|
|
|
| |
llvm-svn: 318005
|
|
|
|
| |
llvm-svn: 316349
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are problematic because they apply to everything,
and can easily clobber whatever more specific predicate
you are trying to add to a function.
Currently instructions use SubtargetPredicate/PredicateControl
to apply this to patterns applied to an instruction definition,
but not to free standing Pats. Add a wrapper around Pat
so the special PredicateControls requirements can be appended
to the final predicate list like how Mips does it.
llvm-svn: 314742
|