Commit messages

llvm-svn: 206784

Summary:
The INSERTPS pattern fragment was called insrtps (missing 'e'), which
would make it harder to grep for the patterns related to this instruction.
Renaming it to use the proper instruction name.
Reviewers: nadav
CC: llvm-commits
Differential Revision: http://reviews.llvm.org/D3443
llvm-svn: 206779

cpp file rather than in the header and then again in the cpp file.
llvm-svn: 206778

instruction
llvm-svn: 206774

It can be reverted a few days later, after X86Disassembler.d is updated not to contain "X86Disassembler.c".
llvm-svn: 206758

llvm-svn: 206749

break the API.
No functionality change.
llvm-svn: 206740

Generating BZHI in the variable mask case, i.e. (and X, (sub (shl 1, N), 1)),
was already supported, but we were missing the constant-mask case. This patch
fixes that.
<rdar://problem/15480077>
llvm-svn: 206738
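For illustration only (not part of the commit message), a minimal C++ sketch of the two mask shapes involved; whether BZHI is actually emitted still depends on the target having BMI2 and on the surrounding code:

  #include <cstdint>

  // Variable-mask form: x & ((1 << n) - 1). Selecting BZHI for this shape was
  // already supported before the change described above.
  uint32_t low_bits_variable(uint32_t x, unsigned n) {
      return x & ((1u << n) - 1u);
  }

  // Constant-mask form: the mask is the compile-time constant (1 << 14) - 1.
  // This is the case the change adds.
  uint32_t low_bits_constant(uint32_t x) {
      return x & ((1u << 14) - 1u);
  }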
Original commit message:
Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN,
safe.srem.iN, safe.urem.iN (iN = i8, i16, i32, or i64).
llvm-svn: 206735

safe.urem.iN (iN = i8, i16, i32, or i64).
llvm-svn: 206732

llvm-svn: 206723

llvm-svn: 206722

llvm-svn: 206721

and ContextDecision in different source files (depending on #define magic).
llvm-svn: 206720

different source files.
llvm-svn: 206719

they need to...
llvm-svn: 206718

reason to expose a global symbol 'decodeInstruction' nor to pollute the global
scope with a bunch of external linkage entities (some of which conflict with
others elsewhere in LLVM).
This is just the initial transition to C++; more cleanups to follow.
llvm-svn: 206717

Cleanup only.
llvm-svn: 206710

Win64 stack unwinder gets confused when execution flow "falls through" after
a call to a 'noreturn' function. This fixes the "missing epilogue" problem by
emitting a trap instruction for IR 'unreachable' on x86_64-pc-windows.
A secondary use for it would be for anyone wanting to make double-sure that
'noreturn' functions, indeed, do not return.
llvm-svn: 206684
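As an illustration (function names here are hypothetical, not from the commit), C++ in which a call to a 'noreturn' function ends a code path; the IR 'unreachable' that the front end places after such a call is what now becomes a trap on x86_64-pc-windows:

  // Hypothetical noreturn callee, declared only for the sketch.
  [[noreturn]] void fatal_error(const char *msg);

  int checked_divide(int a, int b) {
      if (b == 0)
          fatal_error("division by zero");  // never returns; 'unreachable' follows in IR
      return a / b;
  }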
expressions for mov instructions instead of silently truncating by default.
For the ARM assembler, we want to avoid misleadingly allowing something
like "mov r0, <symbol>" especially when we turn it into a movw and the
expression <symbol> does not have a :lower16: or :upper16: as part of the
expression. We don't want the behavior of silently truncating, which can be
unexpected and lead to bugs that are difficult to find since this is an easy
mistake to make.
This does change the previous behavior of llvm, but it matches an older
gnu assembler, which also would not allow this but printed less useful errors
like “invalid constant (0x927c0) after fixup” and “unsupported relocation on
symbol foo”. The error from llvm is "immediate expression for mov requires
:lower16: or :upper16" with correct location information on the operand
as shown in the added test cases.
rdar://12342160
llvm-svn: 206669
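For reference, a small sketch (not from the commit) of the form that is accepted: an explicit movw/movt pair using the :lower16:/:upper16: operators. It is only meaningful when targeting 32-bit ARM, and 'counter' is a hypothetical external symbol:

  extern "C" int counter;  // hypothetical symbol used for illustration

  // Load the 32-bit address of 'counter' with a movw/movt pair; each half of
  // the symbol's address is selected explicitly, so nothing is truncated.
  int *counter_address() {
      int *p;
      asm("movw %0, :lower16:counter\n\t"
          "movt %0, :upper16:counter"
          : "=r"(p));
      return p;
  }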
Summary:
This port includes the rudimentary latencies that were provided for
the Cortex-A53 Machine Model in the AArch64 backend. It also changes
the SchedAlias for COPY in the Cyclone model to an explicit
WriteRes mapping to avoid conflicts in other subtargets.
Differential Revision: http://reviews.llvm.org/D3427
Patch by Dave Estes <cestes@codeaurora.org>!
llvm-svn: 206652

For a 256-bit BUILD_VECTOR consisting mostly of shuffles of 256-bit vectors,
both the BUILD_VECTOR and its operands may need to be legalized in multiple
steps. Consider:
  (v8f32 (BUILD_VECTOR (extract_vector_elt (v8f32 %vreg0), Constant<1>),
                       (extract_vector_elt %vreg0, Constant<2>),
                       (extract_vector_elt %vreg0, Constant<3>),
                       (extract_vector_elt %vreg0, Constant<4>),
                       (extract_vector_elt %vreg0, Constant<5>),
                       (extract_vector_elt %vreg0, Constant<6>),
                       (extract_vector_elt %vreg0, Constant<7>),
                       %vreg1))
a. We can't build a 256-bit vector efficiently, so we need to split it into
   two 128-bit vecs and combine them with VINSERTX128.
b. Operands like (extract_vector_elt (v8f32 %vreg0), Constant<7>) need to be
   split into a VEXTRACTX128 and a further extract_vector_elt from the
   resulting 128-bit vector.
c. The extract_vector_elt from b. is lowered into a shuffle to the first
   element and a movss.
Depending on the order in which we legalize the BUILD_VECTOR and its
operands[1], buildFromShuffleMostly may be faced with:
  (v4f32 (BUILD_VECTOR (extract_vector_elt
                          (vector_shuffle<1,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                          Constant<0>),
                       (extract_vector_elt
                          (vector_shuffle<2,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                          Constant<0>),
                       (extract_vector_elt
                          (vector_shuffle<3,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                          Constant<0>),
                       %vreg1))
In order to figure out the underlying vectors and their identity we need to see
through the shuffles.
[1] Note that the order in which operations and their operands are legalized is
only guaranteed in the first iteration of LegalizeDAG.
Fixes <rdar://problem/16296956>
llvm-svn: 206634

Part of PR19455.
llvm-svn: 206611

Part of PR19455.
llvm-svn: 206610

llvm-svn: 206609

llvm-svn: 206591

We couldn't cope if the first mask element was UNDEF before, which
isn't ideal.
llvm-svn: 206588

vcvtph2ps only reads the lower 64 bits of the memory at the address passed to the
intrinsic.
llvm-svn: 206579
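As an aside (not part of the commit), a small C++ sketch of the corresponding intrinsic usage, assuming F16C support (e.g. -mf16c); it simply makes the 64-bit read explicit:

  #include <immintrin.h>

  // _mm_cvtph_ps (VCVTPH2PS) converts four half-precision values, so only the
  // low 64 bits of the source matter; an 8-byte load is enough.
  __m128 four_halfs_to_floats(const unsigned short *p) {
      __m128i halfs = _mm_loadl_epi64(reinterpret_cast<const __m128i *>(p));
      return _mm_cvtph_ps(halfs);
  }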
Code mostly copied from AArch64, just tidied up a trifle and plumbed
into the ARM64 way of doing things.
This also enables the AArch64 tests which inspired the previous
untested commits.
llvm-svn: 206574

A vector extract followed by a dup can become a single instruction even if the
types don't match. AArch64 handled this in ISelLowering, but a few reasonably
simple patterns can take care of it in TableGen, so that's where I've put it.
llvm-svn: 206573

Tests will be coming very shortly when all the optimisations needed to
support AArch64's neon-copy.ll file are committed.
llvm-svn: 206572

Tests will be committed shortly when all optimisations needed to
support AArch64's neon-copy.ll file are supported.
llvm-svn: 206571

ARM64 was scalarizing some vector comparisons which don't quite map to
AArch64's compare and mask instructions. AArch64's approach of sacrificing a
little efficiency to emulate them with the limited set available was better, so
I ported it across.
More "inspired by" than copy/paste since the backend's internal expectations
were a bit different, but the tests were invaluable.
llvm-svn: 206570

I enhanced it a little in the process. The decision shouldn't really be based
on whether a BUILD_VECTOR is a splat: any set of constants will do the job
provided they're related in the correct way.
Also, the BUILD_VECTOR could be any operand of the incoming AND nodes, so it's
best to check for all 4 possibilities rather than assuming it'll be the RHS.
llvm-svn: 206569

It's not actually used to handle C or C++ ABI rules on ARM64, but could well be
emitted by other language front-ends, so it's as well to have a sensible
implementation.
llvm-svn: 206568

A new test case is also added for ARM64.
Patched by Z.Zheng
llvm-svn: 206563

Fix indentation, better line wrapping, unused includes.
llvm-svn: 206562

performance gap between these two back ends. The test case newly added for AArch64 already exists in ARM64.
Patched by Z.Zheng
llvm-svn: 206559

Use scalar BFE with constant shift and offset when possible.
This is complicated by the fact that the scalar version packs
the two operands of the vector version into one.
llvm-svn: 206558
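For context, reference semantics of an unsigned bit-field extract in C++ (an illustration, not the backend code); the scalar instruction packs offset and width into a single operand, and the exact packed bit positions are a hardware detail not shown here:

  #include <cstdint>

  // Extract 'width' bits of 'x' starting at bit 'offset' (assumes offset < 32).
  uint32_t bfe_u32(uint32_t x, uint32_t offset, uint32_t width) {
      if (width == 0)
          return 0;
      uint32_t mask = (width >= 32) ? ~0u : ((1u << width) - 1u);
      return (x >> offset) & mask;
  }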
back end. This should boost vectorized code performance.
Patched by Z. Zheng
llvm-svn: 206557

llvm-svn: 206547

llvm-svn: 206541

llvm-svn: 206539

llvm-svn: 206505

llvm-svn: 206501

Having i128 as a legal type complicates the legalization phase. v4i32
is already a legal type, so we will use that instead.
This fixes several piglit tests.
llvm-svn: 206500

SIFixSGPRCopies is smart enough to handle this now.
llvm-svn: 206499

llvm-svn: 206498

Otherwise we may not legalize some illegal REG_SEQUENCE instructions.
llvm-svn: 206497

This patch improves the performance of vector creation in cases where
several of the lanes in the vector are a constant floating point value. It
also includes new patterns to fold together some of the instructions when the
value is 0.0f. Test cases included.
rdar://16349427
llvm-svn: 206496
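To illustrate the shape of code affected (a NEON sketch for ARM64, not taken from the commit): a vector whose lanes are mostly the constant 0.0f, with one variable lane.

  #include <arm_neon.h>

  // Three lanes hold the constant 0.0f and one lane is variable; this is the
  // kind of vector construction the new patterns target.
  float32x4_t mostly_zero(float x) {
      float32x4_t v = vdupq_n_f32(0.0f);  // all four lanes = 0.0f
      return vsetq_lane_f32(x, v, 3);     // replace lane 3 with x
  }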