| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
It has already been reverted on the 3.4 branch in r196521.
llvm-svn: 206311
|
|
|
|
| |
llvm-svn: 206284
|
|
|
|
|
|
|
|
|
|
|
| |
In rare cases the dead definition elimination pass code can cause illegal cmn
instructions when it replaces dead registers on instructions that use
unmaterialized frame indexes. This patch disables the dead definition
optimization for instructions which include frame index operands.
rdar://16438284
llvm-svn: 206208
|
|
|
|
|
|
|
| |
This patch adds a -arm64-dead-def-elimination flag so that it is possible to
disable dead definition elimination. Includes test case.
llvm-svn: 206207
|
|
|
|
|
|
| |
The 32-bit pattern is still valid: 0123 -> 3210 -> 1032.
llvm-svn: 206172
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current memory-instruction optimization logic in CGP, which sinks parts of
the address computation that can be adsorbed by the addressing mode, does this
by explicitly converting the relevant part of the address computation into
IR-level integer operations (making use of ptrtoint and inttoptr). For most
targets this is currently not a problem, but for targets wishing to make use of
IR-level aliasing analysis during CodeGen, the use of ptrtoint/inttoptr is a
problem for two reasons:
1. BasicAA becomes less powerful in the face of the ptrtoint/inttoptr
2. In cases where type-punning was used, and BasicAA was used
to override TBAA, BasicAA may no longer do so. (this had forced us to disable
all use of TBAA in CodeGen; something which we can now enable again)
This (use of GEPs instead of ptrtoint/inttoptr) is not currently enabled by
default (except for those targets that use AA during CodeGen), and so aside
from some PowerPC subtargets and SystemZ, there should be no change in
behavior. We may be able to switch completely away from the ptrtoint/inttoptr
sinking on all targets, but further testing is required.
I've doubled-up on a number of existing tests that are sensitive to the
address sinking behavior (including some store-merging tests that are
sensitive to the order of the resulting ADD operations at the SDAG level).
llvm-svn: 206092
|
|
|
|
|
|
|
|
|
| |
This patch adds patterns to generate the cls instruction ARM64. Includes tests
for 64 bit and 32 bit operands.
rdar://15611957
llvm-svn: 206079
|
|
|
|
|
|
|
|
|
|
|
| |
sign/zero/any extensions. However a few places were not checking properly the
property of the load and were turning an indexed load into a regular extended
load. Therefore the indexed value was lost during the process and this was
triggering an assertion.
<rdar://problem/16389332>
llvm-svn: 205923
|
|
|
|
| |
llvm-svn: 205899
|
|
|
|
| |
llvm-svn: 205885
|
|
|
|
| |
llvm-svn: 205884
|
|
|
|
|
|
| |
This is the second part of fixing PR19367.
llvm-svn: 205836
|
|
|
|
|
|
| |
This should fix PR19367.
llvm-svn: 205835
|
|
|
|
|
|
|
|
| |
This implements the target-hooks for ARM64 to enable constant hoisting.
This fixes <rdar://problem/14774662> and <rdar://problem/16381500>.
llvm-svn: 205791
|
|
|
|
|
|
|
|
|
|
| |
Confusingly, the NEON fmla instructions put the accumulator first but the
scalar versions put it at the end (like the fma lib function & LLVM's
intrinsic).
This should fix PR19345, assuming there's only one issue.
llvm-svn: 205758
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When LLVM sees something like (v1iN (vselect v1i1, v1iN, v1iN)) it can
decide that the result is OK (v1i64 is legal on AArch64, for example)
but it still need scalarising because of that v1i1. There was no code
to do this though.
AArch64 and ARM64 have DAG combines to produce efficient code and
prevent that occuring in *most* such situations, but there are edge
cases that they miss. This adds a legalization to cope with that.
llvm-svn: 205626
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were several overlapping problems here, and this solution is
closely inspired by the one adopted in AArch64 in r201381.
Firstly, scalarisation of v1i1 setcc operations simply fails if the
input types are legal. This is fixed in LegalizeVectorTypes.cpp this
time, and allows AArch64 code to be simplified slightly.
Second, vselect with such a setcc feeding into it ends up in
ScalarizeVectorOperand, where it's not handled. I experimented with an
implementation, but found that whatever DAG came out was rather
horrific. I think Hao's DAG combine approach is a good one for
quality, though there are edge cases it won't catch (to be fixed
separately).
Should fix PR19335.
llvm-svn: 205625
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous patterns directly inserted FMOV or INS instructions into
the DAG for scalar_to_vector & bitconvert patterns. This is horribly
inefficient and can generated lots more GPR <-> FPR register traffic
than necessary.
It's much better to emit instructions the register allocator
understands so it can coalesce the copies when appropriate.
It led to at least one ISelLowering hack to avoid the problems, which
was incorrect for v1i64 (FPR64 has no dsub). It can now be removed
entirely.
This should also fix PR19331.
llvm-svn: 205616
|
|
|
|
|
|
|
|
|
|
|
| |
Without this change, the llvm_unreachable kicked in. The code pattern
being spotted is rather non-canonical for 128-bit MLAs, but it can
happen and there's no point in generating sub-optimal code for it just
because it looks odd.
Should fix PR19332.
llvm-svn: 205615
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When rematerializing through truncates, the coalescer may produce instructions
with dead defs, but live implicit-defs of subregs:
E.g.
%X1<def,dead> = MOVi64imm 2, %W1<imp-def>; %X1:GPR64, %W1:GPR32
These instructions are live, and their definitions should not be rewritten.
Fixes <rdar://problem/16492408>
llvm-svn: 205565
|
|
|
|
| |
llvm-svn: 205520
|
|
|
|
|
|
| |
This should fix PR19314.
llvm-svn: 205514
|
|
|
|
|
|
|
|
|
|
| |
Weak symbols cannot use the small code model's usual ADRP sequences since the
instruction simply may not be able to encode a value of 0.
This redirects them to use the GOT, which hopefully linkers are able to cope
with even in the static relocation model.
llvm-svn: 205426
|
|
|
|
|
|
|
| |
We were creating libcall nodes that returned an MVT::f128, when these
particular operations actually return an int of some stripe.
llvm-svn: 205425
|
|
|
|
|
|
|
|
| |
Again, coalescing and other optimisations swiftly made the MachineInstrs
consistent again, but when compiled at -O0 a bad INSERT_SUBREGISTER was
produced.
llvm-svn: 205423
|
|
|
|
|
|
|
|
| |
The previous attempt was fine with optimisations, but was actually rather
cavalier with its types. When compiled at -O0, it produced invalid COPY
MachineInstrs.
llvm-svn: 205422
|
|
|
|
| |
llvm-svn: 205302
|
|
|
|
| |
llvm-svn: 205294
|
|
|
|
| |
llvm-svn: 205293
|
|
|
|
|
|
|
|
|
|
|
|
| |
function address and properly align all entries.
This commit updates the stackmap format to version 1 to indicate the
reorganizaion of several fields. This was done in order to align stackmap
entries to their natural alignment and to minimize padding.
Fixes <rdar://problem/16005902>
llvm-svn: 205254
|
|
|
|
| |
llvm-svn: 205209
|
|
|
|
| |
llvm-svn: 205208
|
|
|
|
| |
llvm-svn: 205207
|
|
|
|
| |
llvm-svn: 205206
|
|
|
|
| |
llvm-svn: 205205
|
|
|
|
| |
llvm-svn: 205204
|
|
|
|
| |
llvm-svn: 205203
|
|
|
|
|
|
| |
This will be used by the Clang front-end code for vabsd_s64.
llvm-svn: 205202
|
|
|
|
|
|
|
|
|
| |
is not a pattern to lower this with clever instructions that zero the
register, so restrict the zero immediate legality special case to f64
and f32 (the only two sizes which fmov seems to directly support). Fixes
backend errors when building code such as libxml.
llvm-svn: 205161
|
|
|
|
|
| |
FIXME: Could we support them?
llvm-svn: 205126
|
|
This adds a second implementation of the AArch64 architecture to LLVM,
accessible in parallel via the "arm64" triple. The plan over the
coming weeks & months is to merge the two into a single backend,
during which time thorough code review should naturally occur.
Everything will be easier with the target in-tree though, hence this
commit.
llvm-svn: 205090
|