| Commit message (Collapse) | Author | Age | Files | Lines |
llvm-svn: 270746
These operations tend to get promoted away to v4i32 so
this doesn't happen often.
llvm-svn: 270740
f32 vectors would use a sequence of BFI instructions instead
of unrolled cmp + select. This was better in the case of a VALU
select with SGPR inputs, but we don't have a way of dealing with that
in the DAG.
llvm-svn: 270731
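For readers unfamiliar with the BFI form mentioned above, here is a minimal scalar sketch of how a bitfield insert can act as a select. It is illustrative only: the helper names `bfi` and `selectViaBfi` are hypothetical, and the sketch models the bit-level operation, not the actual DAG lowering.

```cpp
#include <cstdint>
#include <cstring>

// Bitfield insert: take the bits of `a` where `mask` is 1 and the bits of
// `b` everywhere else.
inline uint32_t bfi(uint32_t mask, uint32_t a, uint32_t b) {
  return (mask & a) | (~mask & b);
}

// Select between two f32 values by expanding the condition to an
// all-ones or all-zeros mask and applying bfi to the raw bit patterns.
inline float selectViaBfi(bool cond, float a, float b) {
  uint32_t ua, ub;
  std::memcpy(&ua, &a, sizeof ua);
  std::memcpy(&ub, &b, sizeof ub);
  uint32_t mask = cond ? 0xFFFFFFFFu : 0u;
  uint32_t ur = bfi(mask, ua, ub);
  float r;
  std::memcpy(&r, &ur, sizeof r);
  return r;
}
```

With a full mask the result is `a`, with an empty mask it is `b`, which is why one BFI per element can replace a cmp + select pair.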
By making pointer extraction from a vector more expensive in the cost model,
we avoid the vectorization of a loop that is very likely to be memory-bound:
https://llvm.org/bugs/show_bug.cgi?id=27826
There are still bugs related to this, so we may need a more general solution
to avoid vectorizing obviously memory-bound loops when we don't have HW gather
support.
Differential Revision: http://reviews.llvm.org/D20601
llvm-svn: 270729
VZeroUpperInserter pass (PR27823)
As noted in the review, there are still problems, so this doesn't fix the bug completely.
Differential Revision: http://reviews.llvm.org/D20529
llvm-svn: 270718
intrinsics with generic IR
Follow-up to the D20528 clang patch: this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) LLVM intrinsics and auto-upgrades them to sitofp/fpext instead.
Differential Revision: http://reviews.llvm.org/D20568
llvm-svn: 270678
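As a rough illustration of why generic IR can stand in for these intrinsics, the sketch below models the 128-bit forms in scalar C++. The function names and the use of `std::array` are assumptions made for the example; the point is only that each lane is an ordinary signed-integer-to-double conversion (sitofp) or float-to-double extension (fpext) of the low two source elements.

```cpp
#include <array>
#include <cstdint>

// Models CVTDQ2PD (128-bit): convert the low two i32 lanes to f64,
// i.e. a per-lane sitofp.
inline std::array<double, 2> cvtdq2pdModel(const std::array<int32_t, 4>& v) {
  return {{static_cast<double>(v[0]), static_cast<double>(v[1])}};
}

// Models CVTPS2PD (128-bit): extend the low two f32 lanes to f64,
// i.e. a per-lane fpext.
inline std::array<double, 2> cvtps2pdModel(const std::array<float, 4>& v) {
  return {{static_cast<double>(v[0]), static_cast<double>(v[1])}};
}
```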
long time.
llvm-svn: 270677
[AMDGPU] emitPrologue looks for an unused, unallocated SGPR that is not
the scratch descriptor. Continue the search if the unused register found
fails the other requirements.
Reviewers: arsenm, tstellarAMD, nhaehnle
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D20526
llvm-svn: 270646
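The search behaviour described above can be sketched roughly as follows. This is not the actual emitPrologue code: the `SgprInfo` struct, its fields, and `findSpareSgpr` are hypothetical stand-ins. The only point is that a candidate failing a later check is skipped and the loop moves on, instead of ending the search.

```cpp
#include <vector>

// Hypothetical description of a candidate register.
struct SgprInfo {
  unsigned id;
  bool used;
  bool allocated;
  bool meetsOtherRequirements;  // placeholder for any further checks
};

// Returns the id of the first suitable register, or -1 if none qualifies.
inline int findSpareSgpr(const std::vector<SgprInfo>& candidates,
                         unsigned scratchDescriptorReg) {
  for (const SgprInfo& r : candidates) {
    if (r.used || r.allocated)
      continue;  // must be unused and unallocated
    if (r.id == scratchDescriptorReg)
      continue;  // must not be the scratch descriptor
    if (!r.meetsOtherRequirements)
      continue;  // keep searching instead of giving up here
    return static_cast<int>(r.id);
  }
  return -1;
}
```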
Instead of this:
i32.const $push10=, __stack_pointer
i32.load $push11=, 0($pop10)
Emit this:
i32.const $push10=, 0
i32.load $push11=, __stack_pointer($pop10)
It's not currently clear which is better, though there's a chance the second
form may be better at overall compression. We can revisit this when we have
more data; for now it makes sense to make PEI consistent with isel.
Differential Revision: http://reviews.llvm.org/D20411
llvm-svn: 270635
Differential Revision: http://reviews.llvm.org/D20081
llvm-svn: 270594
Summary:
Change how optional operands are parsed: all optional operands now use the same parsing method, parseOptionalOperand().
No default values are added to OperandsVector.
Get rid of WORKAROUND_USE_DUMMY_OPERANDS_INSTEAD_MUTIPLE_DEFAULT_OPERANDS.
Reviewers: tstellarAMD, vpykhtin, artem.tamazov, nhaustov
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D20527
llvm-svn: 270556
Differential Revision: http://reviews.llvm.org/D20476
llvm-svn: 270552
second argument to the builtin function, but it is the first instruction operand.
Differential Revision: http://reviews.llvm.org/D20515
llvm-svn: 270548
Patch by Nitesh Jain.
Summary: The type of Imm in MipsDisassembler.cpp was incorrect, since SignExtend64 returns an int64_t. As per the MIPSr6 documentation, the offset is added to the address of the instruction following the branch (not the branch itself) to form a PC-relative effective target address, hence "4" is added to the offset. The offsets in some test cases are updated to reflect the "+ 4" change, and new test cases for negative offsets are added.
Reviewers: dsanders, vkalintiris
Differential Revision: http://reviews.llvm.org/D17540
llvm-svn: 270542
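The offset arithmetic described above can be sketched as below. This is a standalone illustration, not the MipsDisassembler.cpp code: `signExtend64` is a local helper written for the example, `decodeBranchOffset` is a hypothetical name, and the word scaling (times 4) and the field width passed in are assumptions that vary by instruction.

```cpp
#include <cstdint>

// Sign-extend the low `bits` bits of `x` (1 <= bits <= 64) to a signed
// 64-bit value. The result is kept in an int64_t so negative offsets survive.
inline int64_t signExtend64(uint64_t x, unsigned bits) {
  uint64_t mask = (bits == 64) ? ~0ull : ((1ull << bits) - 1);
  uint64_t signBit = 1ull << (bits - 1);
  uint64_t v = x & mask;
  return static_cast<int64_t>((v ^ signBit) - signBit);
}

// Byte offset relative to the branch instruction itself: the encoded field
// counts words from the instruction *following* the branch, hence the + 4.
inline int64_t decodeBranchOffset(uint32_t field, unsigned bits) {
  return signExtend64(field, bits) * 4 + 4;
}
```

Storing the result in an unsigned or narrower type would lose the sign of backward branches, which is the kind of error the type fix addresses.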
Now that we have a nice fast VPPERM solution. Added framework for future intrinsic costs as well.
llvm-svn: 270537
llvm-svn: 270508
They were accidentally using the 32-bit load/store instruction for
8/16-bit operations, due to incorrect patterns.
(8/16-bit cmpxchg and atomicrmw will be fixed in subsequent changes.)
llvm-svn: 270486
llvm-svn: 270469
llvm-svn: 270467
Use the more specific LiveInterval::removeSegment instead of
LiveInterval::shrinkToUses when we know the specific range that's
being removed.
llvm-svn: 270463
llvm-svn: 270459
llvm-svn: 270444
The exit-on-error flag on the many_args1.ll test is needed to avoid an
unreachable in BPFTargetLowering::LowerCall. We can also avoid it by ignoring
any superfluous arguments to the call (i.e. any arguments after the first 5).
Fixes PR27766.
Differential Revision: http://reviews.llvm.org/D20471
v2 of r270419
llvm-svn: 270440
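The alternative mentioned above, ignoring arguments beyond the first five, could look roughly like this. It is a generic sketch, not the BPFTargetLowering::LowerCall code: `kMaxCallArgs` and `keepSupportedArgs` are hypothetical names, and a plain `std::vector` stands in for the real argument list.

```cpp
#include <cstddef>
#include <vector>

// Limit taken from the commit message: only the first 5 arguments are kept.
constexpr std::size_t kMaxCallArgs = 5;

// Returns a copy of `args` truncated to the supported number of arguments,
// so later lowering code never sees a sixth argument.
template <typename Arg>
std::vector<Arg> keepSupportedArgs(const std::vector<Arg>& args) {
  std::size_t n = args.size() < kMaxCallArgs ? args.size() : kMaxCallArgs;
  return std::vector<Arg>(args.begin(),
                          args.begin() + static_cast<std::ptrdiff_t>(n));
}
```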
This patch reverts r270419 because it broke a lot of buildbots,
mostly Windows. We'd like help in investigating the issues, but
for now, it should stay out.
llvm-svn: 270433
The exit-on-error flag on the many_args1.ll test is needed to avoid an
unreachable in BPFTargetLowering::LowerCall. We can also avoid it by ignoring
any superfluous arguments to the call (i.e. any arguments after the first 5).
Fixes PR27766
llvm-svn: 270419
This code should have been part of the previous check-in (r270417). It prevents the DelaySlotFiller pass from being used in functions where the erratum fix has been applied, as this would break the run-time code.
llvm-svn: 270418
Due to an erratum in some versions of LEON, we must insert a NOP after any LD or LDF instruction to ensure the processor has time to load the value correctly before using it. This pass will implement that erratum fix.
The code will have no effect on other, non-LEON Sparc processors.
Differential Review: http://reviews.llvm.org/D20353
llvm-svn: 270417
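The transformation described above can be sketched on a toy instruction list. The `Inst` struct and `insertNopAfterLoads` are hypothetical and operate on mnemonic strings; the real pass works on machine instructions and is not reproduced here.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a machine instruction; only the mnemonic matters here.
struct Inst {
  std::string opcode;
};

// True for the load opcodes named in the commit message.
inline bool isErratumLoad(const Inst& i) {
  return i.opcode == "LD" || i.opcode == "LDF";
}

// Returns a copy of `in` with a NOP placed directly after every LD or LDF.
inline std::vector<Inst> insertNopAfterLoads(const std::vector<Inst>& in) {
  std::vector<Inst> out;
  out.reserve(in.size());
  for (const Inst& i : in) {
    out.push_back(i);
    if (isErratumLoad(i))
      out.push_back(Inst{"NOP"});
  }
  return out;
}
```

Because the NOP must stay immediately after the load, a later pass that moves instructions into that position would undo the fix.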
modifiers for imms.
Reviewers: nhaustov, tstellarAMD
Subscribers: kzhuravl, arsenm
Differential Revision: http://reviews.llvm.org/D20166
llvm-svn: 270415
llvm-svn: 270414
optimizing moves to use 2 byte VEX prefix.
llvm-svn: 270394
subvectors using XMM or YMM stores instead of the vector extract instructions.
The same is already done for AVX; it was lost in the move to AVX512VL.
llvm-svn: 270383
(PR27823)
This isn't the complete fix, but it handles the trivial examples of duplicate vzero* ops in PR27823:
https://llvm.org/bugs/show_bug.cgi?id=27823
...and amusingly, the bogus cases already exist as regression tests, so let's take this baby step.
We'll need to do more in the general case where there's legitimate AVX usage in the function + there's
already a vzero in the code.
Differential Revision: http://reviews.llvm.org/D20477
llvm-svn: 270378
Differential Revision: http://reviews.llvm.org/D20513
llvm-svn: 270357
the source is 512-bits. The 256-bit source patterns were redundant with AVX.
llvm-svn: 270356
index 0 patterns. This gives them higher priority than the memory patterns. This matches AVX1/2.
llvm-svn: 270355
equivalents. This helps group them close together in the isel tables and enable table compression.
llvm-svn: 270354
inversions could appear in a row.
llvm-svn: 270344
llvm-svn: 270343
for integer types when only AVX1 is supported.
llvm-svn: 270335
llvm-svn: 270334
used to indicate the zero-masking behavior, which is not the case here. NFC
llvm-svn: 270333
reflect the fact that memory is the destination.
llvm-svn: 270332
llvm-svn: 270331
Differential Revision: http://reviews.llvm.org/D20438
llvm-svn: 270322
Differential Revision: http://reviews.llvm.org/D20324
llvm-svn: 270321
AVX2 versions of vector extract when AVX512VL is enabled.
llvm-svn: 270318
AVX512VL is enabled. Also add shuffle comment printing for AVX512VL VPERMPD/VPERMQ to keep some tests that now use these instructions instead of the AVX2 ones.
llvm-svn: 270317
is enabled.
llvm-svn: 270316
Allocating larger register classes first should give better allocation
results (and more importantly for myself, make the lit tests more stable
with respect to scheduler changes).
Patch by Matthias Braun
llvm-svn: 270312
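The ordering idea above can be sketched as a simple sort. This is a toy model, not the register allocator: `VReg`, its `sizeInBits` field, and `allocationOrder` are hypothetical, and class size is used here as the only priority key.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy virtual register: a name plus the size of its register class.
struct VReg {
  std::string name;
  unsigned sizeInBits;
};

// Orders registers so that larger classes are allocated first. The sort is
// stable, so registers of equal size keep their original relative order.
inline std::vector<VReg> allocationOrder(std::vector<VReg> regs) {
  std::stable_sort(regs.begin(), regs.end(),
                   [](const VReg& a, const VReg& b) {
                     return a.sizeInBits > b.sizeInBits;
                   });
  return regs;
}
```

Keeping the sort stable is one way to get the test stability the message mentions: the order among same-sized registers does not depend on incidental details of the sort.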
AVX512VL/AVX512BWI equivalents are available.
llvm-svn: 270311