| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
The imm folding optimization pattern failed to import. The instruction
pattern was already working, but failing to fail on SGPR inputs.
|
|
|
|
|
|
|
|
| |
See https://bugs.llvm.org/show_bug.cgi?id=43712
Reviewers: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D70170
|
|
|
|
|
|
|
|
| |
See https://bugs.llvm.org/show_bug.cgi?id=40903
Reviewers: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D69888
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We are duplicating predicates if several parts of the combined
predicate list contain the same condition. Added code to deduplicate
the list.
We have AssemblerPredicates and AssemblerPredicate in the
PredicateControl, but we never use AssemblerPredicates with an
actual list, so this one is dropped.
This addresses the first part of the llvm bug 43886:
https://bugs.llvm.org/show_bug.cgi?id=43886
Differential Revision: https://reviews.llvm.org/D69815
|
|
|
|
|
|
|
|
|
| |
Do not generate non-existing sdwa instructions. It reduces the
number of generated instructions by 185.
Differential Revision: https://reviews.llvm.org/D69010
llvm-svn: 375016
|
|
|
|
|
|
|
|
|
|
|
|
| |
This defaults to zero fi operand, but we do not expose it
anyway. Should we expose it later it needs to be added to
the pseudo.
This enables dpp combining on gfx10.
Differential Revision: https://reviews.llvm.org/D68888
llvm-svn: 374604
|
|
|
|
|
|
|
|
|
|
|
| |
Inhibit generation of unused real dpp instructions on gfx10 just
like it is done on other subtargets. This does not change anything
because these are illegal anyway and not accepted, but it does
reduce the number of instruction definitions generated.
Differential Revision: https://reviews.llvm.org/D68607
llvm-svn: 374083
|
|
|
|
| |
llvm-svn: 373944
|
|
|
|
|
|
|
|
|
| |
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)
This was missing one switch to getTargetConstant in an untested case.
llvm-svn: 372338
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This broke the Chromium build, causing it to fail with e.g.
fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>
See llvm-commits thread of r372285 for details.
This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.
> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372314
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Encode them directly as an imm argument to G_INTRINSIC*.
Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.
This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.
SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.
Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.
The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.
This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372285
|
|
|
|
| |
llvm-svn: 371538
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unfortunately MnemonicAlias defines a "Predicates" field just like an
instruction or pattern, with a somewhat different interpretation.
This ends up overriding the intended Predicates set by
PredicateControl on the pseudoinstruction defintions with an empty
list. This allowed incorrectly selecting instructions that should have
been rejected due to the SubtargetPredicate from patterns on the
instruction definition.
This does remove the divergent predicate from the 64-bit shift
patterns, which were already not used for the 32-bit shift, so I'm not
sure what the point was. This also removes a second, redundant copy of
the 64-bit divergent patterns.
llvm-svn: 371427
|
|
|
|
| |
llvm-svn: 370980
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The LLVM disassembler assumes that the unused src0 operand of v_nop is
zero. Other tools can put another value in that field, which is still
valid. This commit fixes the LLVM disassembler to recognize such an
encoding as v_nop, in the same way as we already do for s_getpc.
Differential Revision: https://reviews.llvm.org/D63724
Change-Id: Iaf0363eae26ff92fc4ebc716216476adbff37a6f
llvm-svn: 364208
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D63203
llvm-svn: 363186
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61156
llvm-svn: 359316
|
|
|
|
|
|
|
| |
Revert DecoderNamespace in one place for now. It will need more
changes to properly work.
llvm-svn: 359239
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61099
llvm-svn: 359225
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D60346
llvm-svn: 357835
|
|
|
|
|
|
|
|
|
| |
We have done some predicate and feature refactoring lately but
did not upstream it. This is to sync.
Differential revision: https://reviews.llvm.org/D60292
llvm-svn: 357791
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Fix a bug in the scheduling model where V_CVT_F32_UBYTE{0,1,2,3} are incorrectly marked as quarter rate instructions.
Reviewers: arsenm, rampitec
Reviewed By: rampitec
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59091
llvm-svn: 355671
|
|
|
|
|
|
|
|
|
|
|
|
| |
v_readlane_b32 and v_writelane_b32
See bug 40662: https://bugs.llvm.org/show_bug.cgi?id=40662
Reviewers: artem.tamazov, arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D58713
llvm-svn: 355312
|
|
|
|
|
|
|
| |
It breaks one of our downstream merges, so revert it
temporarily while investigating failures downstream
llvm-svn: 354700
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D58522
llvm-svn: 354620
|
|
|
|
|
|
|
| |
These are no longer necessary since the R600 tablegen files are split
out now.
llvm-svn: 353548
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A new pass to manage the Mode register.
Currently this just manages the floating point double precision
rounding requirements, but is intended to be easily extended to
encompass all Mode register settings.
The immediate motivation comes from the requirement to use the
round-to-zero rounding mode for the 16 bit interpolation
instructions, where the rounding mode setting is shared between
16 and 64 bit operations.
llvm-svn: 348754
|
|
|
|
|
|
|
|
| |
Introduces DPP pseudo instructions and the pass that combines DPP mov with subsequent uses.
Differential revision: https://reviews.llvm.org/D53762
llvm-svn: 347993
|
|
|
|
| |
llvm-svn: 343264
|
|
|
|
|
|
|
|
|
|
|
| |
If a source of rcp instruction is a result of any conversion from
an integer convert it into rcp_iflag instruction. No FP exception
can ever happen except division by zero if a single precision rcp
argument is a representation of an integral number.
Differential Revision: https://reviews.llvm.org/D48569
llvm-svn: 335742
|
|
|
|
|
|
|
|
|
| |
See bug 36845: https://bugs.llvm.org/show_bug.cgi?id=36845
Differential Revision: https://reviews.llvm.org/D45443
Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329801
|
|
|
|
|
|
|
|
|
| |
See bug 36847: https://bugs.llvm.org/show_bug.cgi?id=36847
Differential Revision: https://reviews.llvm.org/D45097
Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328988
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D45073
llvm-svn: 328874
|
|
|
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D44820
Change-Id: I732979e2964006aa15d78a333d8886e6855f319a
llvm-svn: 328496
|
|
|
|
|
|
|
|
|
|
| |
In some cases we do not copy implicit defs from pseudo to real
VOP instructions. It has no visible impact at the moment thus no
tests are affected or added.
Differential Revision: https://reviews.llvm.org/D41783
llvm-svn: 322496
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are problematic because they apply to everything,
and can easily clobber whatever more specific predicate
you are trying to add to a function.
Currently instructions use SubtargetPredicate/PredicateControl
to apply this to patterns applied to an instruction definition,
but not to free standing Pats. Add a wrapper around Pat
so the special PredicateControls requirements can be appended
to the final predicate list like how Mips does it.
llvm-svn: 314742
|
|
|
|
|
|
|
|
|
|
| |
See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152
Reviewers: SamWot, artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D36674
llvm-svn: 311006
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Now that we've made all the necessary backend changes, we can add a new
intrinsic which exposes the new capabilities to IR producers. Since
llvm.amdgpu.update.dpp is a strict superset of llvm.amdgpu.mov.dpp, we
should deprecate the former. We also add tests for all the functionality
that was added in previous changes, now that we can access it via an IR
construct.
Reviewers: tstellar, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34718
llvm-svn: 310399
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
All instructions with the DPP modifier may not write to certain lanes of
the output if bound_ctrl=1 is set or any bits in bank_mask or row_mask
aren't set, so the destination register may be both defined and modified.
The right way to handle this is to add a constraint that the destination
register is the same as one of the inputs. We could tie the destination
to the first source, but that would be too restrictive for some use-cases
where we want the destination to be some other value before the
instruction executes. Instead, add a fake "old" source and tie it to the
destination. Effectively, the "old" source defines what value unwritten
lanes will get. We'll expose this functionality to users with a new
intrinsic later.
Also, we want to use DPP instructions for computing derivatives, which
means we need to set WQM for them. We also need to enable the entire
wavefront when using DPP intrinsics to implement nonuniform subgroup
reductions, since otherwise we'll get incorrect results in some cases.
To accomodate this, add a new operand to all DPP instructions which will
be interpreted by the SI WQM pass. This will be exposed with a new
intrinsic later. We'll also add support for Whole Wavefront Mode later.
I also fixed llvm.amdgcn.mov.dpp to overwrite the source and fixed up
the test. However, I could also keep the old behavior (where lanes that
aren't written are undefined) if people want it.
Reviewers: tstellar, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34716
llvm-svn: 310283
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Previously there were two separate pseudo instruction for SDWA on VI and on GFX9. Created one pseudo instruction that is union of both of them. Added verifier to check that operands conform either VI or GFX9.
Reviewers: dp, arsenm, vpykhtin
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov
Differential Revision: https://reviews.llvm.org/D34026
llvm-svn: 305886
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added separate pseudo and real instruction for GFX9 SDWA instructions.
Currently supports only in assembler.
Depends D32493
Reviewers: vpykhtin, artem.tamazov
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D33132
llvm-svn: 303620
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added support for VI:
- s_endpgm_saved
- s_wakeup
- s_rfe_restore_b64
- v_perm_b32
Enabled for VI:
- v_mov_fed_b32
- v_mov_fed_b32_e64
See bug 32593: https://bugs.llvm.org//show_bug.cgi?id=32593
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D31931
llvm-svn: 300076
|
|
|
|
|
|
|
|
|
|
| |
Enabled clamp and omod for v_cvt_* opcodes which have src0 of an integer type
Reviewers: vpykhtin, arsenm
Differential Revision: https://reviews.llvm.org/D31327
llvm-svn: 298852
|
|
|
|
|
|
|
|
|
| |
computeKnownBits didn't handle fp_to_fp16 to report
the high bits as 0. ARM maps the generic node to an instruction
that does not modify the high bits of the register, so introduce
a target node where the high bits are known 0.
llvm-svn: 297873
|
|
|
|
|
|
|
|
| |
Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction).
Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code).
Added LIT tests.
llvm-svn: 296873
|
|
|
|
|
|
|
|
| |
This is somewhat tricky because there are two
pairs of tied operands, and it isn't allowed to be
VOP3 encoded.
llvm-svn: 296519
|
|
|
|
|
|
|
|
| |
Add a few non-VOP3P but instructions related to packed.
Includes hack with dummy operands for the benefit of the assembler
llvm-svn: 296368
|
|
|
|
|
|
|
|
| |
Hit on ASICs that support 16bit instructions.
Differential Revision: https://reviews.llvm.org/D30281
llvm-svn: 295990
|
|
|
|
| |
llvm-svn: 294694
|