| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
| |
Remove broken patterns matching it. This was matching the
unsafe math pattern and expanding the fix for the buggy instruction
from the pattern. The problems are also on CI. Remove the workarounds
and only use fract with unsafe math or from the intrinsic.
llvm-svn: 271078
|
| |
|
|
|
|
| |
Given where this is used it should be a nop.
llvm-svn: 271066
|
| |
|
|
| |
llvm-svn: 271057
|
| |
|
|
|
|
|
| |
DynamicNoPIC was only ever used on Darwin. This maps it to static on
ELF. It matches what is done on X86.
llvm-svn: 271052
|
| |
|
|
| |
llvm-svn: 271045
|
| |
|
|
|
|
|
| |
When running MIR tests, a pass created in that constructor would not be
freed, leading to memory leaks.
llvm-svn: 271043
|
| |
|
|
|
|
|
|
| |
This recommits r267649 with a fix for PR27539.
Differential Revision: http://reviews.llvm.org/D20598
llvm-svn: 271033
|
| |
|
|
|
|
| |
Also guard against v32i8 users.
llvm-svn: 271024
|
| |
|
|
| |
llvm-svn: 271023
|
| |
|
|
|
|
| |
It's faster and easier to read.
llvm-svn: 271018
|
| |
|
|
|
|
|
|
| |
"ld".
PR27904.
llvm-svn: 271016
|
| |
|
|
|
|
| |
No functionality change intended, maybe a tiny performance improvement.
llvm-svn: 270997
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The isMemWithSimmOffset predicate rejects relocations, which is incorrect
behaviour. Linkers and other tools should handle, warn or error when the
field overflows.
Reviewers: dsanders, vkalintiris
Differential Revision: http://reviews.llvm.org/D20727
llvm-svn: 270995
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Register numbers may be specified as assembly-time expressions.
This feature can be useful in macros and the like. However, expressions
are supported within square brackets only.
Square brackets were initially intended to support specifying multiple
(pairs/quads...) registers. Syntax like v[8:8], which specifies a single register,
is also supported. That allows expressions but looks a bit unnatural.
This change supports syntax REG[EXPR].
Tests added.
Differential Revision: http://reviews.llvm.org/D20588
llvm-svn: 270990
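A rough sketch of how the two bracket forms relate, not the actual AsmParser code. The names `eval_int_expr` and `parse_reg`, the symbol table, and the restriction to `+`, `-`, `*` are all assumptions made for illustration; the real assembler's expression grammar is not described in this message.

```python
import ast
import operator
import re

# Hypothetical subset of assembly-time arithmetic supported by this sketch.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def eval_int_expr(text, symbols=None):
    """Evaluate an integer expression over constants and known symbols."""
    symbols = symbols or {}

    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.Name) and node.id in symbols:
            return symbols[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {text!r}")

    return walk(ast.parse(text.strip(), mode="eval").body)

# REG[EXPR] or REG[EXPR:EXPR]; the register-kind prefix is an assumption.
_REG = re.compile(r"^([a-z]+)\[([^:\]]+)(?::([^\]]+))?\]$")

def parse_reg(text, symbols=None):
    """Return (kind, first, last) for a bracketed register reference."""
    m = _REG.match(text.strip())
    if not m:
        raise ValueError(f"not a bracketed register: {text!r}")
    kind, first, last = m.groups()
    lo = eval_int_expr(first, symbols)
    # REG[EXPR] is treated like REG[EXPR:EXPR], i.e. a single register.
    hi = lo if last is None else eval_int_expr(last, symbols)
    if hi < lo:
        raise ValueError(f"empty register range: {text!r}")
    return kind, lo, hi
```

In this model `v[2+3]` and `v[5:5]` name the same single register, which is the equivalence the new syntax relies on.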
|
| |
|
|
|
|
|
| |
clang-tidy's performance-unnecessary-copy-initialization with some manual
fixes. No functional changes intended.
llvm-svn: 270988
|
| |
|
|
|
|
|
| |
Also fold conditions into assert(0) where it makes sense. No functional
change intended.
llvm-svn: 270982
|
| |
|
|
|
|
| |
No functionality change.
llvm-svn: 270981
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 270980
|
| |
|
|
|
|
| |
Also give them library visibility while there.
llvm-svn: 270979
|
| |
|
|
|
|
| |
extension intrinsics with generic IR (llvm)
llvm-svn: 270976
|
| |
|
|
|
|
|
|
|
|
|
|
| |
generic IR (llvm)
This patch removes the llvm VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX some time ago, so much of that implementation can be reused.
A companion patch (D20684) removes/auto-upgrades the clang intrinsics.
Differential Revision: http://reviews.llvm.org/D20686
llvm-svn: 270973
|
| |
|
|
|
|
|
|
|
| |
The aggressive anti-dependency breaker can rename the restored
callee-saved registers. To prevent this, mark these registers as live on all
paths to the return/tail-call instructions, and add implicit use operands
for them to these instructions.
llvm-svn: 270898
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Canonicalize (srl (bswap i32 x), 16) to (rotr (bswap i32 x), 16), if the high
16-bits of x are zero. Similarly, canonicalize (srl (bswap i64 x), 32) to
(rotr (bswap i64 x), 32), if the high 32-bits of x are zero.
test_rev_w_srl16:              test_rev_w_srl16:
  and w8, w0, #0xffff            and w8, w0, #0xffff
  rev w8, w8            --->     rev16 w0, w8
  lsr w0, w8, #16
test_rev_x_srl32:              test_rev_x_srl32:
  rev x8, x8            --->     rev32 x0, x8
  lsr x0, x8, #32
llvm-svn: 270896
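The i32 identity behind this canonicalization can be checked with a small model. This is only a sketch: the helper names are made up here, and `rev16_w` is a plain-Python model of the W-register REV16 behaviour, not generated code.

```python
MASK32 = 0xFFFFFFFF

def bswap32(x):
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")

def srl32(x, n):
    """Logical shift right of a 32-bit value."""
    return (x & MASK32) >> n

def rotr32(x, n):
    """Rotate a 32-bit value right by n bits."""
    n %= 32
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rev16_w(x):
    """Swap the two bytes inside each 16-bit half of a 32-bit value."""
    def swap(h):
        return ((h & 0xFF) << 8) | (h >> 8)
    return (swap((x >> 16) & 0xFFFF) << 16) | swap(x & 0xFFFF)

x = 0x1234  # high 16 bits are zero, so the shift and the rotate agree
shifted = srl32(bswap32(x), 16)
rotated = rotr32(bswap32(x), 16)

y = 0x12345678  # high 16 bits are set, so the rewrite would not be valid
differs = srl32(bswap32(y), 16) != rotr32(bswap32(y), 16)
```

The rotate form is the useful one because rotr(bswap x, 16) is what a single rev16 computes for any 32-bit input; the known-zero high half is only needed to justify replacing the shift with the rotate.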
|
| |
|
|
|
|
|
|
|
|
| |
Summary: Enable load-store-opt by default, and update LIT tests.
Reviewers: arsenm
Differential Revision: http://reviews.llvm.org/D20694
llvm-svn: 270894
|
| |
|
|
|
|
|
| |
Fixes a build error on Windows, where MSVC does not
support list initialization inside a member initializer list.
llvm-svn: 270877
|
| |
|
|
|
|
|
|
|
|
|
|
| |
NVVMIntrRange adds !range metadata to calls of NVVM intrinsics
that return values within known limited range.
This allows LLVM to generate optimal code for indexing arrays
based on tid/ctaid which is a frequently used pattern in CUDA code.
Differential Revision: http://reviews.llvm.org/D20644
llvm-svn: 270872
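One plausible way a known range helps indexing, sketched under assumptions: if a 32-bit index is known to be non-negative, sign extension and zero extension to 64 bits give the same result, so the cheaper form can be used. The bound 1024 below is an illustrative value chosen for this sketch, not a limit taken from the pass.

```python
MASK64 = (1 << 64) - 1

def sext_i32_to_i64(v):
    """Sign-extend a 32-bit pattern to a 64-bit pattern."""
    v &= 0xFFFFFFFF
    if v & 0x80000000:
        v -= 1 << 32
    return v & MASK64

def zext_i32_to_i64(v):
    """Zero-extend a 32-bit pattern to a 64-bit pattern."""
    return v & 0xFFFFFFFF

def sext_equals_zext_on_range(lo, hi):
    """True if both extensions agree for every value in [lo, hi)."""
    return all(sext_i32_to_i64(v) == zext_i32_to_i64(v) for v in range(lo, hi))

# Assumed illustrative range for a thread index: [0, 1024).
known_range_ok = sext_equals_zext_on_range(0, 1024)
```

Without range information the two extensions differ for any value with the top bit set, for example 0xFFFFFFFF.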
|
| |
|
|
|
|
|
|
|
|
|
| |
Hwreg(...) syntax implementation unified with sendmsg(...).
Common strings moved to Utils.
MathExtras.h functionality utilized.
Added missing build dependency in Disassembler.
Differential Revision: http://reviews.llvm.org/D20381
llvm-svn: 270871
|
| |
|
|
|
|
| |
support for TTMP/TBA/TMA registers."
llvm-svn: 270859
|
| |
|
|
|
|
|
|
| |
vector to the lower 128-bit subvector.
More often than not this is what it started out as, the extraction is zero-cost on AVX, and the PMOVZX/PMOVSX folding logic is based around 128-bit loads.
llvm-svn: 270858
|
| |
|
|
| |
llvm-svn: 270857
|
| |
|
|
|
|
|
|
|
|
| |
Similar to r269948, but for argument lowering.
Fixes PR27762
Differential Revision: http://reviews.llvm.org/D20430
llvm-svn: 270856
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The exit-on-error flag is needed to avoid an assert where
llvm::SelectionDAGISel::LowerArguments doesn't create enough arguments. Fill up
with zeroes to reach the right number of args.
Fixes PR27767.
Differential Revision: http://reviews.llvm.org/D20571
llvm-svn: 270855
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If and only if the value being inserted sets only known zero bits.
This combine transforms things like
and w8, w0, #0xfffffff0
movz w9, #5
orr w0, w8, w9
into
movz w8, #5
bfxil w0, w8, #0, #4
The combine is tuned to make sure we always reduce the number of instructions.
We avoid churning code for what is expected to be performance neutral changes
(e.g., converted AND+OR to OR+BFI).
Differential Revision: http://reviews.llvm.org/D20387
llvm-svn: 270846
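The equivalence in the example above can be modelled in a few lines. This is a sketch with made-up helper names; `bfxil` is a plain-Python model of the instruction's effect on a 32-bit register, and the legality check mirrors the "sets only known zero bits" condition.

```python
MASK32 = 0xFFFFFFFF

def and_or(x, keep_mask, value):
    """The original sequence: clear bits with AND, then OR in a constant."""
    return ((x & keep_mask) | value) & MASK32

def bfxil(dst, src, lsb, width):
    """Copy bits [lsb, lsb+width) of src into the low `width` bits of dst."""
    low_mask = (1 << width) - 1
    field = (src >> lsb) & low_mask
    return ((dst & ~low_mask) | field) & MASK32

def only_sets_known_zero_bits(keep_mask, value):
    """The inserted value may only touch bits that the AND has cleared."""
    return (value & keep_mask & MASK32) == 0

x = 0xDEADBEEF
before = and_or(x, 0xFFFFFFF0, 5)   # and + movz + orr
after = bfxil(x, 5, 0, 4)           # movz + bfxil
```

A constant such as 0x15 would fail the check with the same mask, since bit 4 is not known to be zero after the AND.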
|
| |
|
|
|
|
| |
This reduces code duplication and now AArch64 also handles PIE.
llvm-svn: 270844
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D20615
llvm-svn: 270843
|
| |
|
|
|
|
| |
Allows display of floating-point registers and display of assembler meta-data output.
llvm-svn: 270829
|
| |
|
|
|
|
|
|
|
|
|
|
| |
analyses.
Reviewers: tra
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D20585
llvm-svn: 270790
|
| |
|
|
| |
llvm-svn: 270785
|
| |
|
|
| |
llvm-svn: 270769
|
| |
|
|
| |
llvm-svn: 270753
|
| |
|
|
| |
llvm-svn: 270746
|
| |
|
|
|
|
|
| |
These operations tend to get promoted away to v4i32 so
this doesn't happen often.
llvm-svn: 270740
|
| |
|
|
|
|
|
|
|
| |
f32 vectors would use a sequence of BFI instructions instead
of unrolled cmp + select. This was better in the case of a VALU
select with SGPR inputs, but we don't have a way of dealing with that
in the DAG.
llvm-svn: 270731
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
By making pointer extraction from a vector more expensive in the cost model,
we avoid the vectorization of a loop that is very likely to be memory-bound:
https://llvm.org/bugs/show_bug.cgi?id=27826
There are still bugs related to this, so we may need a more general solution
to avoid vectorizing obviously memory-bound loops when we don't have HW gather
support.
Differential Revision: http://reviews.llvm.org/D20601
llvm-svn: 270729
|
| |
|
|
|
|
|
|
|
|
| |
VZeroUpperInserter pass (PR27823)
As noted in the review, there are still problems, so this doesn't fix the bug completely.
Differential Revision: http://reviews.llvm.org/D20529
llvm-svn: 270718
|
| |
|
|
|
|
|
|
|
|
| |
intrinsics with generic IR
Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead.
Differential Revision: http://reviews.llvm.org/D20568
llvm-svn: 270678
|
| |
|
|
|
|
| |
long time.
llvm-svn: 270677
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
[AMDGPU] emitPrologue looks for an unused, unallocated SGPR that is not
the scratch descriptor. Continue the search if the unused register found fails
other requirements.
Reviewers: arsenm, tstellarAMD, nhaehnle
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D20526
llvm-svn: 270646
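The control-flow change described here amounts to continuing the loop instead of stopping at the first unused candidate. A minimal sketch, with every name hypothetical (this is not the emitPrologue code, and the "other requirements" predicate is left abstract):

```python
def find_free_sgpr(candidates, is_used, scratch_descriptor, meets_other_requirements):
    """Return the first candidate that passes every check, or None."""
    for reg in candidates:
        if is_used(reg):
            continue
        if reg in scratch_descriptor:
            continue
        if not meets_other_requirements(reg):
            # Keep searching rather than giving up on the first unused register.
            continue
        return reg
    return None

# Hypothetical data: s0 is in use, s1 belongs to the scratch descriptor,
# s2 is unused but fails some other requirement, s3 is acceptable.
picked = find_free_sgpr(
    ["s0", "s1", "s2", "s3"],
    is_used=lambda r: r == "s0",
    scratch_descriptor={"s1"},
    meets_other_requirements=lambda r: r != "s2",
)
```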
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of this:
i32.const $push10=, __stack_pointer
i32.load $push11=, 0($pop10)
Emit this:
i32.const $push10=, 0
i32.load $push11=, __stack_pointer($pop10)
It's not currently clear which is better, though there's a chance the second
form may be better at overall compression. We can revisit this when we have
more data; for now it makes sense to make PEI consistent with isel.
Differential Revision: http://reviews.llvm.org/D20411
llvm-svn: 270635
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D20081
llvm-svn: 270594
|