| Commit message (Collapse) | Author | Age | Files | Lines |
| |
We have a number of useful lowering strategies for VBROADCAST instructions (both from memory and register element 0) which the 128-bit form of the MOVDDUP instruction can make use of.
This patch tweaks lowerVectorShuffleAsBroadcast to enable it to broadcast 2f64 args using MOVDDUP as well.
It does require a slight tweak to the lowerVectorShuffleAsBroadcast mechanism, as the existing MOVDDUP lowering uses isShuffleEquivalent, which can match binary shuffles that can lower to (unary) broadcasts.
Differential Revision: http://reviews.llvm.org/D17680
llvm-svn: 262478
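The unary-broadcast condition mentioned above can be pictured with a standalone sketch. It does not use LLVM's API: `isUnaryBroadcastMask` is a hypothetical helper, and the mask convention (indices 0..N-1 select from the first input, N..2N-1 from the second, -1 is undefined) is an assumption modeled on LLVM shuffle masks.

```cpp
#include <vector>

// Hypothetical helper, not LLVM's lowerVectorShuffleAsBroadcast.
// Assumed mask convention for N-element inputs: indices 0..N-1 select from
// the first input, N..2N-1 from the second, and -1 marks an undefined lane.
// Returns true only if every defined lane reads element 0 of the first input.
bool isUnaryBroadcastMask(const std::vector<int> &Mask) {
  bool SawDefinedLane = false;
  for (int M : Mask) {
    if (M < 0)
      continue;          // undefined lanes place no constraint
    if (M != 0)
      return false;      // reads another element or the second input
    SawDefinedLane = true;
  }
  return SawDefinedLane; // an all-undef mask is not treated as a broadcast
}
```

For a two-element (v2f64-style) mask, `{0, 0}` and `{0, -1}` qualify under this sketch, while `{0, 2}` does not because it reads from both inputs.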
|
fields"
Build failure with clang.
llvm-svn: 262477
|
assembler."
Build failure with clang.
llvm-svn: 262475
|
complementary patch to table-driven amd_kernel_code_t field parser/printer utility. lit tests passed.
Patch by: Valery Pykhtin
Differential Revision: http://reviews.llvm.org/D17151
llvm-svn: 262474
|
This is going to be used in .hsatext disassembler and can be used
in current assembler parser (lit tests passed on parsing).
Code using these helpers isn't included in this patch.
Benefits:
- unified approach
- fast field name lookup on parsing
Later I would like to enhance some of the field naming/syntax using this code.
Patch by: Valery Pykhtin
Differential Revision: http://reviews.llvm.org/D17150
llvm-svn: 262473
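A minimal sketch of what a table-driven field parser/printer could look like follows. It is not the code from the patch: the struct `KernelCode`, its field names, and the helpers `findField` and `parseField` are hypothetical, and binary search over a sorted table is only one assumed way to get fast name lookup.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <iterator>

// Hypothetical record and field names, for illustration only.
struct KernelCode {
  uint32_t version_major = 0;
  uint32_t version_minor = 0;
  uint64_t entry_offset = 0;
};

// One table row per field; the same row serves the parser and the printer.
struct FieldEntry {
  const char *Name;
  void (*Set)(KernelCode &, uint64_t);
  uint64_t (*Get)(const KernelCode &);
};

// Rows must stay sorted by Name so that binary search works.
static const FieldEntry FieldTable[] = {
    {"entry_offset",
     [](KernelCode &C, uint64_t V) { C.entry_offset = V; },
     [](const KernelCode &C) -> uint64_t { return C.entry_offset; }},
    {"version_major",
     [](KernelCode &C, uint64_t V) { C.version_major = static_cast<uint32_t>(V); },
     [](const KernelCode &C) -> uint64_t { return C.version_major; }},
    {"version_minor",
     [](KernelCode &C, uint64_t V) { C.version_minor = static_cast<uint32_t>(V); },
     [](const KernelCode &C) -> uint64_t { return C.version_minor; }},
};

// Binary search by field name; returns nullptr for unknown names.
const FieldEntry *findField(const char *Name) {
  const FieldEntry *End = std::end(FieldTable);
  const FieldEntry *It = std::lower_bound(
      std::begin(FieldTable), End, Name,
      [](const FieldEntry &E, const char *N) {
        return std::strcmp(E.Name, N) < 0;
      });
  return (It != End && std::strcmp(It->Name, Name) == 0) ? It : nullptr;
}

// Parser side: store Value into the named field, or report failure.
bool parseField(KernelCode &C, const char *Name, uint64_t Value) {
  const FieldEntry *E = findField(Name);
  if (!E)
    return false;
  E->Set(C, Value);
  return true;
}
```

A printer would walk `FieldTable` and call `Get` on each row, so adding a field means adding one row rather than touching parser and printer separately.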
|
- unused sigaction/setitimer result (used in assert)
- unchecked fscanf return value
- signed/unsigned comparison
llvm-svn: 262472
|
VEX prefix. The operand is always a register. NFC
llvm-svn: 262468
|
how VEX prefix handling does.
llvm-svn: 262467
|
We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS.
While technically correct, this is not desirable for folks who want
to examine aspects of the FLAGS register which are not related to
computation like whether or not CPUID is a valid instruction.
Differential Revision: http://reviews.llvm.org/D17782
llvm-svn: 262465
|
llvm-svn: 262464
|
encoded in bits 7:4 of the immediate.
For some instructions the register is not the last operand and the immediate handling had to detect this and hardcode the index to find it. It also required CurOp to be pointing at the last operand handled in the Form switch whereas for any other instruction it would be pointing at the next operand.
Now we just capture the value in the Form switch when we know exactly where it is and the CurOp pointer can behave normally.
llvm-svn: 262462
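As a rough illustration of a register carried in bits 7:4 of an immediate byte, here is a standalone sketch. The function names are hypothetical, and treating bits 3:0 as a separate payload is an assumption made for the example, not something the message states.

```cpp
#include <cassert>
#include <cstdint>

// Sketch only: place a 4-bit register number in bits 7:4 of an 8-bit
// immediate, keeping an assumed payload in bits 3:0.
uint8_t encodeRegInImm(unsigned RegNum, uint8_t LowBits) {
  assert(RegNum < 16 && "register number must fit in four bits");
  assert(LowBits < 16 && "payload must fit in bits 3:0");
  return static_cast<uint8_t>((RegNum << 4) | LowBits);
}

// Recover the register number from bits 7:4.
unsigned decodeRegFromImm(uint8_t Imm) { return Imm >> 4; }
```

Capturing the register number once, at the point where the operand position is known, means later code only needs the captured value and not the operand index.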
|
llvm-svn: 262459
|
respectively should reduce size tiny bit. NFC
llvm-svn: 262458
|
Fix checking the same instruction twice instead of the
second branch that uses vccz. I don't think this matters
at present because s_branch_vccnz is always used.
llvm-svn: 262457
|
llvm-svn: 262456
|
llvm-svn: 262455
|
llvm-svn: 262454
|
llvm-svn: 262453
|
llvm-svn: 262452
|
For some reason MSVC seems to think I'm calling getConstant() from a
static context. Try to avoid this issue by explicitly specifying
'this->' (though I'm not confident that this will actually work).
llvm-svn: 262451
|
llvm-svn: 262449
|
llvm-svn: 262448
|
llvm-svn: 262446
|
asm output
that is broken by this change
llvm-svn: 262440
|
Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}"
by factoring out "C" and computing RangeOf({A,+,P}) union RangeOf({B,+,Q})
instead.
The latter can be easier to compute precisely in cases like
"{C?0:N,+,C?1:-1}", where N is the backedge-taken count of the loop, since in
such cases the latter form simplifies to [0,N+1) union [0,N+1).
llvm-svn: 262438
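The range arithmetic described above can be checked with a small standalone model. This is not ScalarEvolution code: `Range` is a simplified stand-in for LLVM's ConstantRange (plain 64-bit integers, no wrapping or bit widths), the function names are hypothetical, and the sketch assumes the same condition C selects both the start and the step.

```cpp
#include <algorithm>
#include <cstdint>

// Half-open integer interval [Lo, Hi); a simplified stand-in for
// LLVM's ConstantRange, without wrapping or bit widths.
struct Range {
  int64_t Lo, Hi;
  bool operator==(const Range &O) const { return Lo == O.Lo && Hi == O.Hi; }
};

// Range of {Start,+,Step} over iterations 0..BackedgeTakenCount inclusive.
Range rangeOfAddRec(int64_t Start, int64_t Step, int64_t BackedgeTakenCount) {
  int64_t Last = Start + Step * BackedgeTakenCount;
  return {std::min(Start, Last), std::max(Start, Last) + 1};
}

// Smallest single interval that covers both inputs.
Range unionHull(Range A, Range B) {
  return {std::min(A.Lo, B.Lo), std::max(A.Hi, B.Hi)};
}

// {C?A:B,+,C?P:Q} with C factored out: union of {A,+,P} and {B,+,Q}.
Range rangeOfSelectAddRec(int64_t A, int64_t P, int64_t B, int64_t Q,
                          int64_t BackedgeTakenCount) {
  return unionHull(rangeOfAddRec(A, P, BackedgeTakenCount),
                   rangeOfAddRec(B, Q, BackedgeTakenCount));
}
```

With N = 10, `rangeOfSelectAddRec(0, 1, 10, -1, 10)` combines {0,+,1} and {10,+,-1}, each of which covers [0, 11), so the union is [0, N+1) as the message says.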
|
Pure code-motion change. Will be used later in making getRange more clever.
llvm-svn: 262437
|
shifts (PR26701)
As noted in the code comment, I don't think we can apply the transform that we do for
*scalar* integer comparisons to *vector* integer comparisons, because it might pessimize
the general case.
Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT
for integer vectors.
But we should now recognize all the variants of this construct and produce the optimal code
for the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26701
llvm-svn: 262424
|
Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17742
llvm-svn: 262419
|
llvm-svn: 262417
|
least something if ASan is not handling the signals for us. Remove abort_on_timeout flag.
llvm-svn: 262415
|
llvm-svn: 262411
|
The fixes in r262393 completed them as well.
llvm-svn: 262408
|
not the integrated assembler.
llvm-svn: 262400
|
On AMDGPU, where i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.
This fixes some test failures in a future commit when i64 loads
are changed to promote.
llvm-svn: 262397
|
Should fix the DBUILD_SHARED_LIBS bots.
llvm-svn: 262396
|
Add ELF enum value and relocations for Lanai backend.
General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html).
Differential Revision: http://reviews.llvm.org/D17008
llvm-svn: 262394
|
This adds some missing generic schedule info definitions, enables
completeness checking for cyclone and fixes a typo uncovered by that.
Differential Revision: http://reviews.llvm.org/D17748
llvm-svn: 262393
|
This change implements the following vector operations:
- Vector Compare Not Equal
  - vcmpneb(.) vcmpneh(.) vcmpnew(.)
  - vcmpnezb(.) vcmpnezh(.) vcmpnezw(.)
- Vector Extract Unsigned
  - vextractub vextractuh vextractuw vextractd
  - vextublx vextubrx vextuhlx vextuhrx vextuwlx vextuwrx
- Vector Insert
  - vinsertb vinserth vinsertw vinsertd
26 instructions.
Phabricator: http://reviews.llvm.org/D15916
llvm-svn: 262392
|
This isn't quite NFC because some of the SDLocs may change which could
cause scheduling differences. But no regression tests are affected and
there is no functional change intended.
llvm-svn: 262391
|
This is an alternate fix to 262378 and a fix to a pessimizing-move
warning.
llvm-svn: 262390
|
negative values."
Revert r262248 in an attempt to fix the clang-native-aarch64-full
bot and to investigate a performance regression in
SingleSource/Benchmarks/CoyoteBench/huffbench
llvm-svn: 262388
|
This reverts commit r262316.
It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.
llvm-svn: 262387
|
llvm-svn: 262386
|
TableGen checks at compile time that, for scheduling models with
"CompleteModel = 1", one of the following holds for each instruction:
- Is marked with the hasNoSchedulingInfo flag
- The instruction is a subclass of Sched
- There are InstRW definitions in the scheduling model
Typical steps necessary to complete a model:
- Ensure all pseudo instructions that are expanded before machine
scheduling (usually everything handled with EmitYYY() functions in
XXXTargetLowering) are marked with hasNoSchedulingInfo.
- If a CPU does not support some instructions mark the corresponding
resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
- Add missing scheduling information.
Differential Revision: http://reviews.llvm.org/D17747
llvm-svn: 262384
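The three-way condition listed above can be modeled as a simple predicate. This is only a sketch of the rule as the message states it, not TableGen's implementation: `InstrInfo`, its fields, and `findIncomplete` are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical summary of what is known about one instruction.
struct InstrInfo {
  std::string Name;
  bool HasNoSchedulingInfo; // marked with the hasNoSchedulingInfo flag
  bool IsSchedSubclass;     // the instruction is a subclass of Sched
  bool HasInstRW;           // an InstRW definition covers it in the model
};

// Returns the names of instructions that satisfy none of the three
// conditions, i.e. the ones a complete model would still have to cover.
std::vector<std::string> findIncomplete(const std::vector<InstrInfo> &Instrs) {
  std::vector<std::string> Missing;
  for (const InstrInfo &I : Instrs)
    if (!I.HasNoSchedulingInfo && !I.IsSchedSubclass && !I.HasInstRW)
      Missing.push_back(I.Name);
  return Missing;
}
```

An empty result corresponds to a model that passes the completeness check under this simplified reading.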
|
Summary:
Tablegen was unable to determine that param loads/stores were actually
reading or writing from memory. I think this isn't a problem in
practice for param stores, because those occur in a block right before
we make our call. But param loads don't have to be at the very beginning
of a function, so should be annotated as mayLoad so we don't incorrectly
optimize them.
Reviewers: jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D17471
llvm-svn: 262381
|
Summary: Looks like this was caused by a typo.
Reviewers: jholewinski
Subscribers: jholewinski, llvm-commits, tra
Differential Revision: http://reviews.llvm.org/D17357
llvm-svn: 262380
|
llvm-svn: 262378
|
Most portions of InstCombine properly propagate fast math flags, but
apparently the vector scalarization section was overlooked.
llvm-svn: 262376
|
llvm-svn: 262374
|
Summary:
Calls sometimes need to be convergent. This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.
Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs. But this is Hard, and would affect
optimizations in the SDNs -- right now only SDNs with two operands have
any flags at all.
Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.
Reviewers: jholewinski
Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits
Differential Revision: http://reviews.llvm.org/D17423
llvm-svn: 262373
|