For big-endian targets, when we merge two halfword loads into a word load, the
order of the halfwords in the loaded value is reversed compared to
little-endian, so the load-store optimiser needs to swap the destination
registers.
This does not affect merging of two word loads, as we use ldp, which treats the
memory as two separate 32-bit words.
llvm-svn: 252597
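A hand-worked big-endian sketch (registers and the base address are arbitrary; this is not output copied from the pass): because the first halfword ends up in the upper 16 bits of the merged word load, the destinations of the extract and the mask swap relative to the little-endian sequence shown under r251438 below:
ldrh w0, [x2]
ldrh w1, [x2, #2]
becomes, on big-endian:
ldr w0, [x2]
and w1, w0, #0xffff
ubfx w0, w0, #16, #16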
The benefit from converting narrow loads into a wider load (r251438) could be
micro-architecturally dependent, as it assumes that a single load with two
bitfield extracts is cheaper than two narrow loads. Currently, this conversion
is enabled only on Cortex-A57, on which the performance benefits were verified.
llvm-svn: 252316
This recommits r250719, which was reverted because an incorrect insert point
for the new wider load caused a failure in SPEC2000.gcc.
Convert two halfword loads into a single 32-bit word load with bitfield extract
instructions. For example:
ldrh w0, [x2]
ldrh w1, [x2, #2]
becomes
ldr w0, [x2]
ubfx w1, w0, #16, #16
and w0, w0, #0xffff
llvm-svn: 251438
This reverts commit r250719, which introduced a codegen fault in SPEC2000.gcc
when compiled for Cortex-A53.
llvm-svn: 251108
Convert two halfword loads into a single 32-bit word load with bitfield extract
instructions. For example:
ldrh w0, [x2]
ldrh w1, [x2, #2]
becomes
ldr w0, [x2]
ubfx w1, w0, #16, #16
and w0, w0, #0xffff
llvm-svn: 250719
Support for pairing unscaled loads and stores has been enabled since the
original ARM64 port. This feature is no longer experimental, AFAICT.
llvm-svn: 249049
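For illustration (hypothetical registers and offsets), pairing two unscaled
loads means rewriting
ldur w0, [x2, #4]
ldur w1, [x2, #8]
as
ldp w0, w1, [x2, #4]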
llvm-svn: 249011

llvm-svn: 249008

llvm-svn: 249007
Previously, the index was constrained to the size of the memory operation for
no apparent reason. This change removes that constraint so that we can form
pre-index instructions with any valid offset.
llvm-svn: 248931
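A hypothetical example, with the #32 offset chosen only because it differs from
the 4-byte access size that used to be the limit:
add x0, x0, #32
ldr w1, [x0]
can now be folded into the pre-indexed form
ldr w1, [x0, #32]!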
llvm-svn: 248914

llvm-svn: 248825

llvm-svn: 248817
The immediate in the load/store should be scaled by the size of the memory
operation, not the size of the register being loaded/stored. This change gets
us one step closer to forming LDPSW instructions. This change also enables
pre- and post-indexing for halfword and byte loads and stores.
llvm-svn: 248804
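A sketch of the newly enabled halfword post-indexing (registers are
illustrative, not taken from the commit):
strh w0, [x1]
add x1, x1, #2
becomes
strh w0, [x1], #2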
llvm-svn: 248800

llvm-svn: 248593
In this context, MI is an add/sub instruction, not a load/store.
llvm-svn: 248540
function. NFC.
llvm-svn: 248377
This reverts commit r246769.
This appears to have broken Multisource/Benchmarks/tramp3d-v4.
llvm-svn: 246782
This patch allows the mixing of scaled and unscaled load/stores to form
load/store pairs.
PR24465
http://reviews.llvm.org/D12116
Many thanks to Ahmed and Michael for fixes and code review.
llvm-svn: 246769
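An illustrative case (arbitrary registers): a scaled and an unscaled access to
adjacent words can now be combined, e.g.
ldr w0, [x2]
ldur w1, [x2, #4]
becomes
ldp w0, w1, [x2]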
llvm-svn: 246767
The call to mergePairedInsns() deletes MI, so the later use by isUnscaledLdSt()
is referencing freed memory.
llvm-svn: 246033
This reverts commit r245443, as it broke AArch64 test-suite tramp3d with the
assertion: Reg && "Null register has no regunits".
llvm-svn: 245455
llvm-svn: 245443

llvm-svn: 245307
NFC.
llvm-svn: 244479
llvm-svn: 244465
At this point the given Opc must be valid; otherwise, we should not be looking
for a matching pair to form a paired load or store.
Thanks to Chad for pointing out this piece of code!
llvm-svn: 244366
llvm-svn: 244233

llvm-svn: 244222
Summary: Among other things, this allows -print-after-all/-print-before-all to
dump IR around this pass.
This is the AArch64 version of r243052.
llvm-svn: 244041
llvm-svn: 244038
Also converted a cast<> to dyn_cast while I was working on the same line of
code.
llvm-svn: 243894
llvm-svn: 242922

llvm-svn: 242812
This is setup for future work planned for the AArch64 Load/Store Opt pass.
llvm-svn: 242810
Store instructions do not modify register values and therefore it's safe
to form a store pair even if the source register has been read in between
the two store instructions.
Previously, the read of w1 (see below) prevented the formation of a stp.
str w0, [x2]
ldr w8, [x2, #8]
add w0, w8, w1
str w1, [x2, #4]
ret
We now generate the following code.
stp w0, w1, [x2]
ldr w8, [x2, #8]
add w0, w8, w1
ret
All correctness tests with -Ofast on A57 with SPEC200x and EEMBC pass.
Performance results for SPEC2K were within noise.
llvm-svn: 239432
Phabricator: http://reviews.llvm.org/D9863
llvm-svn: 237963
This was previously returning int. However, there are no negative opcode
numbers and, more importantly, this was needlessly different from
MCInstrDesc::getOpcode() (which is in fact the value returned here) and
SDValue::getOpcode()/SDNode::getOpcode().
llvm-svn: 237611
Teach the load store optimizer how to sign-extend the result of a load pair
when it helps create more pairs.
The rationale is that loads are more expensive than sign extensions, so
gathering several loads into one instruction is a win.
<rdar://problem/20072968>
llvm-svn: 231527
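One plausible shape of this transformation (the registers are illustrative and
the exact pattern the pass matches may differ):
ldr w0, [x2]
ldrsw x1, [x2, #4]
becomes
ldp w0, w1, [x2]
sxtw x1, w1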
llvm-svn: 227293
This patch adds the missing LD[U]RSW variants to the load store optimizer, so
that we generate LDPSW when possible.
<rdar://problem/19583480>
llvm-svn: 226978
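For illustration (hypothetical registers), adjacent sign-extending word loads
such as
ldrsw x0, [x2]
ldrsw x1, [x2, #4]
can now be paired into
ldpsw x0, x1, [x2]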
llvm-svn: 215402
information and update all callers. No functional change.
llvm-svn: 214781
No functionality change.
llvm-svn: 213938
No change in functionality.
llvm-svn: 210182
Variable names should start with an upper case letter.
No change in functionality.
llvm-svn: 210181
llvm-svn: 210114
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.
"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.
This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.
llvm-svn: 209577