| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
| |
Doing this before register allocation reduces register pressure as we do
not even have to allocate a register for those dead definitions.
Differential Revision: https://reviews.llvm.org/D26111
llvm-svn: 287076
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26673
llvm-svn: 287036
|
| |
|
|
|
|
|
|
|
|
|
| |
Lower a = b * C where C = (2^n + 1) * 2^m to
add w0, w0, w0, lsl n
lsl w0, w0, m
Differential Revision: https://reviews.llvm.org/D229245
llvm-svn: 287019
|
| |
|
|
|
|
|
|
|
|
| |
Implement the Newton series for square root, its reciprocal and reciprocal
natively using the specialized instructions in AArch64 to perform each
series iteration.
Differential revision: https://reviews.llvm.org/D26518
llvm-svn: 286907
|
| |
|
|
|
|
| |
Follow-up change to r286875.
llvm-svn: 286879
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Replace a splat of zeros to a vector store by scalar stores of WZR/XZR.
The load store optimizer pass will merge them to store pair stores.
This should be better than a movi to create the vector zero followed by
a vector store if the zero constant is not re-used, since one
instructions and one register live range will be removed.
For example, the final generated code should be:
stp xzr, xzr, [x0]
instead of:
movi v0.2d, #0
str q0, [x0]
Reviewers: t.p.northover, mcrosier, MatzeB, jmolloy
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D26561
llvm-svn: 286875
|
| |
|
|
| |
llvm-svn: 286874
|
| |
|
|
| |
llvm-svn: 286808
|
| |
|
|
| |
llvm-svn: 286625
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fix off-by-one indexing error in loop checking that inserted value was a
splat vector.
Add code to check that INSERT_VECTOR_ELT nodes constructing the splat
vector have the expected constant index values.
Reviewers: t.p.northover, jmolloy, mcrosier
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D26409
llvm-svn: 286616
|
| |
|
|
| |
llvm-svn: 286606
|
| |
|
|
| |
llvm-svn: 286601
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This optimization merges adjacent zero stores into a wider store.
e.g.,
strh wzr, [x0]
strh wzr, [x0, #2]
; becomes
str wzr, [x0]
e.g.,
str wzr, [x0]
str wzr, [x0, #4]
; becomes
str xzr, [x0]
Previously, this was only enabled for Kryo and Cortex-A57.
Differential Revision: https://reviews.llvm.org/D26396
llvm-svn: 286592
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The generic infrastructure to compute the Newton series for reciprocal and
reciprocal square root was conceived to allow a target to compute the series
itself. However, the original code did not properly consider this condition
if returned by a target. This patch addresses the issues to allow a target
to compute the series on its own.
Differential revision: https://reviews.llvm.org/D22975
llvm-svn: 286523
|
| |
|
|
|
|
|
| |
Pretty bare-bones support for exception handling (no weird MSVC stuff, no SjLj
etc), but it should get things going.
llvm-svn: 286407
|
| |
|
|
|
|
| |
Fix a bug in the calculation of the changed flag introduced in r285488.
llvm-svn: 286293
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: t.p.northover, rengolin
Subscribers: llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D26309
llvm-svn: 286265
|
| |
|
|
| |
llvm-svn: 286253
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64
transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to
"a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have
the same type which can lead to a wrong CSEL node that crashes later
due to nonsensical copies.
Differential Revision: https://reviews.llvm.org/D26394
llvm-svn: 286231
|
| |
|
|
| |
llvm-svn: 286185
|
| |
|
|
|
|
|
|
|
|
| |
Self-referencing PHI nodes need their destination operands to be constrained
because nothing else is likely to do so. For now we just pick a register class
naively.
Patch mostly by Ahmed again.
llvm-svn: 286183
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Some vector loads and stores generated from AArch64 intrinsics alias each other
unnecessarily, preventing better scheduling. We just need to transfer memory
operands during lowering.
Reviewers: mcrosier, t.p.northover, jmolloy
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D26313
llvm-svn: 286168
|
| |
|
|
| |
llvm-svn: 286137
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
asm register selection on AArch64.
Without this patch, register allocation for the example below fails.
define half @test(half %a1, half %a2) #0 {
entry:
%0 = tail call half asm "sqrshl ${0:h}, ${1:h}, ${2:h}", "=w,w,w" (half %a1, half %a2) #1
ret half %0
}
Patch by Florian Hahn.
Differential Revision: https://reviews.llvm.org/D25080
llvm-svn: 286111
|
| |
|
|
|
|
|
|
| |
This feature has been disabled for some time now, so remove cruft.
Differential Revision: https://reviews.llvm.org/D26248
llvm-svn: 286110
|
| |
|
|
|
|
|
|
| |
These interfaces are no longer used.
Differential Revision: https://reviews.llvm.org/D26222
llvm-svn: 285774
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As it stands, the OperandMatchResultTy is only included in the generated
header if there is custom operand parsing. However, almost all backends
make use of MatchOperand_Success and friends from OperandMatchResultTy for
e.g. parseRegister. This is a pain when starting an AsmParser for a new
backend that doesn't yet have custom operand parsing. Move the enum to
MCTargetAsmParser.h.
This patch is a prerequisite for D23563
Differential Revision: https://reviews.llvm.org/D23496
llvm-svn: 285705
|
| |
|
|
| |
llvm-svn: 285615
|
| |
|
|
| |
llvm-svn: 285614
|
| |
|
|
|
|
|
|
|
|
|
| |
- Fix doxygen file comment
- reduce indentation in loop
- Factor out some common subexpressions
- Move independent helper function out of class
- Fix Changed flag (this is not strictly NFC but a bugfix, but the flag
seems ignored anyway)
llvm-svn: 285488
|
| |
|
|
|
|
|
|
| |
Since Exynos-M2 improved the FP square root unit a bit over the one in
Exynos-M1, it does not benefit from using the Newton series for such
operations.
llvm-svn: 285246
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of
cmp w0, #1
orr w8, wzr, #0x1
cneg w0, w8, ne
we now generate
cmp w0, #1
csinv w0, w0, wzr, eq
PR28965
llvm-svn: 285217
|
| |
|
|
|
|
| |
Modify the maximum jump table size.
llvm-svn: 285106
|
| |
|
|
|
|
|
|
|
| |
Add support for estimating the square root or its reciprocal and division or
reciprocal using the combiner generic Newton series.
Differential revision: https://reviews.llvm.org/D25291
llvm-svn: 284986
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add relocations for AArch64 ILP32. Includes:
- Addition of definitions for R_AARCH32_*
- Definition of new -target-abi: ilp32
- Definition of data layout string
- Tests for added relocations. Not comprehensive, but matches
existing tests for 64-bit. Renames "CHECK-OBJ" to "CHECK-OBJ-LP64".
- Tests for llvm-readobj
Reviewers: zatrazz, peter.smith, echristo, t.p.northover
Subscribers: aemerson, rengolin, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25159
llvm-svn: 284973
|
| |
|
|
| |
llvm-svn: 284839
|
| |
|
|
| |
llvm-svn: 284832
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The spill size was incorrectly set to 196 bits,
which isn't a multiple of 8. This problem was detected when
experimenting with asserts that the spill size should be a
multiple of the byte size.
New corrected value for the spill size is set to 192 bits.
Note that tablegen (RegisterInfoEmitter) will divide the
size set in the RegisterClass definition by 8. So this
change should not have any impact on the tablegen output
(trunc(192/8) == trunc(196/8) == 24 bytes).
Reviewers: t.p.northover
Subscribers: llvm-commits, aemerson, rengolin
Differential Revision: https://reviews.llvm.org/D25818
llvm-svn: 284814
|
| |
|
|
|
|
|
|
|
|
| |
All of these existed because MSVC 2013 was unable to synthesize default
move ctors. We recently dropped support for it so all that error-prone
boilerplate can go.
No functionality change intended.
llvm-svn: 284721
|
| |
|
|
|
|
|
|
|
|
|
| |
Transform `a == 0.0 ? 0.0 : x` to `a == 0.0 ? a : x` and `a != 0.0 ? x : 0.0`
to `a != 0.0 ? x : a` to avoid materializing 0.0 for FCSEL, since it does not
have to be materialized beforehand for FCMP, as it has a form that has 0.0
as an implicit operand.
Differential Revision: https://reviews.llvm.org/D24808
llvm-svn: 284531
|
| |
|
|
|
|
|
|
| |
AArch64 actually supports many 8-bit operations under the definition used by
GlobalISel: the designated information-carrying bits of a GPR32 get the right
value if you just use the normal 32-bit instruction.
llvm-svn: 284526
|
| |
|
|
|
|
| |
Patch from Ahmed Bougacha.
llvm-svn: 284523
|
| |
|
|
| |
llvm-svn: 284406
|
| |
|
|
|
|
|
|
|
|
| |
The previous names were both misleading (the MachineLegalizer actually
contained the info tables) and inconsistent with the selector & translator (in
having a "Machine") prefix. This should make everything sensible again.
The only functional change is the name of a couple of command-line options.
llvm-svn: 284287
|
| |
|
|
|
|
| |
NFC.
llvm-svn: 284146
|
| |
|
|
|
|
|
|
| |
This allows RegBankSelect in greedy mode to get rid some of the cross
register bank copies when loads are involved in the chain of
computation.
llvm-svn: 284097
|
| |
|
|
|
|
|
| |
Thanks to this patch, RegBankSelect is able to get rid of some register
bank copies as demonstrated in the test case.
llvm-svn: 284094
|
| |
|
|
|
|
| |
NFC.
llvm-svn: 284091
|
| |
|
|
|
|
| |
NFC.
llvm-svn: 284090
|
| |
|
|
|
|
|
| |
Basically any vector types that fits in a 32-bit register is also valid
as far as copies are concerned.
llvm-svn: 284089
|