| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
| |
We have a number of useful lowering strategies for VBROADCAST instructions (both from memory and register element 0) which the 128-bit form of the MOVDDUP instruction can make use of.
This patch tweaks lowerVectorShuffleAsBroadcast to enable it to broadcast 2f64 args using MOVDDUP as well.
It does require a slight tweak to the lowerVectorShuffleAsBroadcast mechanism as the existing MOVDDUP lowering uses isShuffleEquivalent which can match binary shuffles that can lower to (unary) broadcasts.
Differential Revision: http://reviews.llvm.org/D17680
llvm-svn: 262478
|
| |
|
|
|
|
|
|
| |
fields"
Build failure with clang.
llvm-svn: 262477
|
| |
|
|
|
|
|
|
| |
assembler."
Build failure with clang.
llvm-svn: 262475
|
| |
|
|
|
|
|
|
|
|
| |
complementary patch to table-driven amd_kernel_code_t field parser/printer utility. lit tests passed.
Patch by: Valery Pykhtin
Differential Revision: http://reviews.llvm.org/D17151
llvm-svn: 262474
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is going to be used in .hsatext disassembler and can be used
in current assembler parser (lit tests passed on parsing).
Code using this helpers isn't included in this patch.
Benefits:
unified approach
fast field name lookup on parsing
Later I would like to enhance some of the field naming/syntax using this code.
Patch by: Valery Pykhtin
Differential Revision: http://reviews.llvm.org/D17150
llvm-svn: 262473
|
| |
|
|
|
|
| |
VEX prefix. The operand is always a register. NFC
llvm-svn: 262468
|
| |
|
|
|
|
| |
how VEX prefix handling does.
llvm-svn: 262467
|
| |
|
|
|
|
|
|
|
|
|
| |
We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS.
While technically correct, this is not be desirable for folks who want
to examine aspects of the FLAGS register which are not related to
computation like whether or not CPUID is a valid instruction.
Differential Revision: http://reviews.llvm.org/D17782
llvm-svn: 262465
|
| |
|
|
| |
llvm-svn: 262464
|
| |
|
|
|
|
|
|
|
|
| |
encoded in bits 7:4 of the immediate.
For some instructions the register is not the last operand and the immediate handling had to detect this and hardcode the index to find it. It also required CurOp to be pointing at the last operand handled in the Form switch whereas for any instruction it would be pointing at the next operand.
Now we just capture the value in the Form switch when we know exactly where it is and the CurOp pointer can behave normally.
llvm-svn: 262462
|
| |
|
|
|
|
| |
respectively should reduce size tiny bit. NFC
llvm-svn: 262458
|
| |
|
|
|
|
|
|
| |
Fix checking the same instruction twice instead of the
second branch that uses vccz. I don't think this matters
currently because s_branch_vccnz is always used currently.
llvm-svn: 262457
|
| |
|
|
| |
llvm-svn: 262456
|
| |
|
|
| |
llvm-svn: 262455
|
| |
|
|
| |
llvm-svn: 262411
|
| |
|
|
|
|
| |
The fixes in r262393 completed them as well.
llvm-svn: 262408
|
| |
|
|
|
|
| |
not the integrated assembler.
llvm-svn: 262400
|
| |
|
|
|
|
|
|
|
| |
This adds some missing generic schedule info definitions, enables
completeness checking for cyclone and fixes a typo uncovered by that.
Differential Revision: http://reviews.llvm.org/D17748
llvm-svn: 262393
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change implements the following vector operations:
- Vector Compare Not Equal
- vcmpneb(.) vcmpneh(.) vcmpnew(.)
- vcmpnezb(.) vcmpnezh(.) vcmpnezw(.)
- Vector Extract Unsigned
- vextractub vextractuh vextractuw vextractd
- vextublx vextubrx vextuhlx vextuhrx vextuwlx vextuwrx
- Vector Insert
- vinsertb vinserth vinsertw vinsertd
26 instructions.
Phabricator: http://reviews.llvm.org/D15916
llvm-svn: 262392
|
| |
|
|
|
|
|
|
| |
This isn't quite NFC because some of the SDLocs may change which could
cause scheduling differences. But no regression tests are affected and
there is no functional change intended.
llvm-svn: 262391
|
| |
|
|
|
|
|
|
|
|
| |
negative values."
Revert r262248 in an attempt to fix the clang-native-aarch64-full
bot and to investigate a performance regression in
SingleSource/Benchmarks/CoyoteBench/huffbench
llvm-svn: 262388
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r262316.
It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.
llvm-svn: 262387
|
| |
|
|
| |
llvm-svn: 262386
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TableGen checks at compiletime that for scheduling models with
"CompleteModel = 1" one of the following holds:
- Is marked with the hasNoSchedulingInfo flag
- The instruction is a subclass of Sched
- There are InstRW definitions in the scheduling model
Typical steps necessary to complete a model:
- Ensure all pseudo instructions that are expanded before machine
scheduling (usually everything handled with EmitYYY() functions in
XXXTargetLowering).
- If a CPU does not support some instructions mark the corresponding
resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
- Add missing scheduling information.
Differential Revision: http://reviews.llvm.org/D17747
llvm-svn: 262384
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Tablegen was unable to determine that param loads/stores were actually
reading or writing from memory. I think this isn't a problem in
practice for param stores, because those occur in a block right before
we make our call. But param loads don't have to at the very beginning
of a function, so should be annotated as mayLoad so we don't incorrectly
optimize them.
Reviewers: jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D17471
llvm-svn: 262381
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Looks like this was caused by a typo.
Reviewers: jholewinski
Subscribers: jholewinski, llvm-commits, tra
Differential Revision: http://reviews.llvm.org/D17357
llvm-svn: 262380
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Calls sometimes need to be convergent. This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.
Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs. But this is Hard, and would affect
optimizations in the SDNs -- right now only SDNs with two operands have
any flags at all.
Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.
Reviewers: jholewinski
Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits
Differential Revision: http://reviews.llvm.org/D17423
llvm-svn: 262373
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Tablegen understands backslash as an escape char; that's sufficient.
Reviewers: jholewinski
Subscribers: llvm-commits, tra, jholewinski
Differential Revision: http://reviews.llvm.org/D17432
llvm-svn: 262372
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Also simplify some of the embedded C++ logic.
No functional changes.
Reviewers: jholewinski
Subscribers: llvm-commits, tra, jholewinski
Differential Revision: http://reviews.llvm.org/D17354
llvm-svn: 262371
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The _chkstk function is called by the compiler to probe the stack in an
order consistent with Windows' expectations. However, it is possible to
elide the call to _chkstk and manually adjust the stack pointer if we
can prove that the allocation is fixed size and smaller than the probe
size.
This shrinks chrome.dll, chrome_child.dll and chrome.exe by a
cummulative ~133 KB.
Differential Revision: http://reviews.llvm.org/D17679
llvm-svn: 262370
|
| |
|
|
| |
llvm-svn: 262367
|
| |
|
|
|
|
|
| |
This reduces the number of bitcast nodes and generally cleans up the
DAG when bitcasting between integers and vectors everywhere.
llvm-svn: 262358
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Intrinsics
Summary:
This patch impleemnts DS_PERMUTE/DS_BPERMUTE instruction definitions and intrinsics,
which are new since VI.
Reviewers: tstellarAMD, arsenm
Subscribers: llvm-commits, arsenm
Differential Revision: http://reviews.llvm.org/D17614
llvm-svn: 262356
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D17713
llvm-svn: 262353
|
| |
|
|
|
|
|
|
|
|
|
|
| |
In the code below on 32-bit targets, x would previously get forwarded to g()
without sign-extension to 32 bits as required by the parameter attribute.
void g(signed short);
void f(unsigned short x) {
g(x);
}
llvm-svn: 262352
|
| |
|
|
| |
llvm-svn: 262351
|
| |
|
|
| |
llvm-svn: 262347
|
| |
|
|
| |
llvm-svn: 262346
|
| |
|
|
| |
llvm-svn: 262338
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Idea behind this change is to make code shorter and as much common for all targets as possible. Let's even accept more code than is valid for a particular target, leaving it for the assembler to sort out.
64bit instructions decoding added.
Error\warning messages on unrecognized instructions operands added, InstPrinter allowed to print invalid operands helping to find invalid/unsupported code.
The change is massive and hard to compare with previous version, so it makes sense just to take a look on the new version. As a bonus, with a few TD changes following, it disassembles the majority of instructions. Currently it fully disassembles >300K binary source of some blas kernel.
Previous TODOs were saved whenever possible.
Patch by: Valery Pykhtin
Differential Revision: http://reviews.llvm.org/D17720
llvm-svn: 262332
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D17692
llvm-svn: 262320
|
| |
|
|
|
|
|
|
|
|
| |
handler function.
This resolves https://llvm.org/bugs/show_bug.cgi?id=26412
Differential Revision: http://reviews.llvm.org/D17542
llvm-svn: 262319
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch modifies the existing comparison, branch, conditional-move
and select patterns, and adds new ones where needed. Also, the updated
SLT{u,i,iu} set of instructions generate a GPR width result.
The majority of the code changes in the Mips back-end fix the wrong
assumption that the result of SETCC nodes always produce an i32 value.
The changes in the common code path account for the fact that in 64-bit
MIPS targets, i1 is promoted to i32 instead of i64.
Reviewers: dsanders
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D10970
llvm-svn: 262316
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
it is not present
Previosy, if actual instruction have one of optional operands then other optional operands listed before this also should be presented.
For example instruction v_fract_f32 v0, v1, mul:2 have one optional operand - OMod and do not have optional operand clamp. Previously this was not allowed because clamp is listed before omod in AsmString:
string AsmString = "v_fract_f32$vdst, $src0_modifiers$clamp$omod";
Making this work required some hacks (both OMod and Clamp match classes have same PredicateMethod).
Now, if MatchInstructionImpl meets formal optional operand that is not presented in actual instruction it skips this formal operand and tries to match current actual operand with next formal.
Patch by: Sam Kolton
Review: http://reviews.llvm.org/D17568
[AMDGPU] Assembler: Check immediate types for several optional operands in predicate methods
With this change you should place optional operands in order specified by asm string:
clamp -> omod
offset -> glc -> slc -> tfe
Fixes for several tests.
Depends on D17568
Patch by: Sam Kolton
Review: http://reviews.llvm.org/D17644
llvm-svn: 262314
|
| |
|
|
|
|
| |
earlier so we can stop masking in multiple places. NFC
llvm-svn: 262312
|
| |
|
|
| |
llvm-svn: 262310
|
| |
|
|
| |
llvm-svn: 262309
|
| |
|
|
|
|
| |
comments. NFC
llvm-svn: 262301
|
| |
|
|
|
|
|
|
| |
Technically you aren't supposed to emit these after type legalization
for some reason, and we use vector extracts of bitcasted integers
as the canonical way to do this.
llvm-svn: 262298
|
| |
|
|
| |
llvm-svn: 262297
|