| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change promotes load instructions which directly read from stores by
replacing them with mov instructions. If the store is wider than the load,
the load will be replaced with a bitfield extract.
For example:
STRWui %W1, %X0, 1
%W0 = LDRHHui %X0, 3
becomes
STRWui %W1, %X0, 1
%W0 = UBFMWri %W1, 16, 31
llvm-svn: 256004
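The equivalence in the example above can be checked with a small model. This is a Python sketch, not LLVM code. It assumes a little-endian target, and the helper names `ubfm_w` and `load_after_store` are hypothetical. The store writes a 32-bit value at byte offset 4 (`STRWui` immediate 1, scaled by 4), the load reads a halfword at byte offset 6 (`LDRHHui` immediate 3, scaled by 2), and the result is compared with bits 16 to 31 of the stored register.

```python
import struct

def ubfm_w(value, immr, imms):
    """Model UBFM Wd, Wn, #immr, #imms for the imms >= immr case (a bitfield extract)."""
    assert 0 <= immr <= imms <= 31
    width = imms - immr + 1
    return (value >> immr) & ((1 << width) - 1)

def load_after_store(w1):
    """Store w1 as a word at byte 4, then load a halfword from byte 6 (little-endian)."""
    mem = bytearray(16)
    struct.pack_into("<I", mem, 1 * 4, w1)          # STRWui %W1, %X0, 1
    return struct.unpack_from("<H", mem, 3 * 2)[0]  # %W0 = LDRHHui %X0, 3
```

Under these assumptions the load and the extract agree for any stored value, which is why the load can be dropped in favor of the extract.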
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change merges adjacent zero stores into a wider single store.
For example:
strh wzr, [x0]
strh wzr, [x0, #2]
becomes
str wzr, [x0]
This will fix PR25410.
llvm-svn: 253711
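As an illustration of why the merge is safe, the following Python sketch models memory as a byte array and compares the two adjacent halfword zero stores with the single word zero store. The function names are hypothetical and this is not the pass's implementation.

```python
import struct

def two_halfword_zero_stores(mem, base):
    struct.pack_into("<H", mem, base, 0)      # strh wzr, [x0]
    struct.pack_into("<H", mem, base + 2, 0)  # strh wzr, [x0, #2]

def one_word_zero_store(mem, base):
    struct.pack_into("<I", mem, base, 0)      # str wzr, [x0]
```

Because the stored value is zero, the result does not depend on byte order; both sequences clear the same four bytes and leave the rest untouched.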
|
|
|
|
|
|
|
|
|
| |
Summary:
* Rename isSmallTypeLdMerge() to isNarrowLoad().
* Rename NumSmallTypeMerged to NumNarrowTypePromoted.
* Use Subtarget defined as a member variable.
llvm-svn: 253587
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change extends r251438 to handle more narrow load promotions
including byte type, unscaled, and signed. For example, this change will
convert:
ldursh w1, [x0, #-2]
ldurh w2, [x0, #-4]
into
ldur w2, [x0, #-4]
asr w1, w2, #16
and w2, w2, #0xffff
llvm-svn: 253577
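The signed, unscaled example above can be modeled in a few lines. This Python sketch assumes a little-endian target and uses hypothetical helper names. It checks that the wide load followed by `asr` and `and` yields the same register values as the two narrow loads.

```python
import struct

MASK32 = 0xFFFFFFFF

def asr32(value, shift):
    """Arithmetic shift right of a 32-bit register value, kept as unsigned 32-bit."""
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> shift) & MASK32

def narrow_loads(mem, x0):
    w1 = struct.unpack_from("<h", mem, x0 - 2)[0] & MASK32  # ldursh w1, [x0, #-2]
    w2 = struct.unpack_from("<H", mem, x0 - 4)[0]           # ldurh  w2, [x0, #-4]
    return w1, w2

def promoted_load(mem, x0):
    w2 = struct.unpack_from("<I", mem, x0 - 4)[0]  # ldur w2, [x0, #-4]
    w1 = asr32(w2, 16)                             # asr  w1, w2, #16
    w2 = w2 & 0xFFFF                               # and  w2, w2, #0xffff
    return w1, w2
```

Note the ordering in the promoted sequence: the `asr` must read `w2` before the `and` clears its upper half.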
|
|
|
|
|
|
|
|
|
|
|
|
| |
For big-endian targets, when we merge two halfword loads into a word load, the
order of the halfwords in the loaded value is reversed compared to
little-endian, so the load-store optimiser needs to swap the destination
registers.
This does not affect merging of two word loads, as we use ldp, which treats the
memory as two separate 32-bit words.
llvm-svn: 252597
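The halfword reversal described above can be seen in a small model. This Python sketch uses hypothetical function names and treats memory as four bytes. It loads one 32-bit word, splits it into low and high halves, and compares those with the two halfwords loaded separately.

```python
import struct

def merged_halfword_load(mem, big_endian):
    """Load one 32-bit word, then split it into (low 16 bits, high 16 bits)."""
    word = struct.unpack_from(">I" if big_endian else "<I", mem, 0)[0]
    return word & 0xFFFF, word >> 16

def separate_halfword_loads(mem, big_endian):
    fmt = ">H" if big_endian else "<H"
    first = struct.unpack_from(fmt, mem, 0)[0]   # halfword at the lower address
    second = struct.unpack_from(fmt, mem, 2)[0]  # halfword at the higher address
    return first, second
```

On little-endian the low half of the word is the halfword at the lower address; on big-endian it is the halfword at the higher address, so the destination registers have to be exchanged.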
|
|
|
|
|
|
|
|
|
| |
The benefit of converting narrow loads into a wider load (r251438) may be
microarchitecture-dependent, as it assumes that a single load with two bitfield
extracts is cheaper than two narrow loads. Currently, this conversion is
enabled only on Cortex-A57, where the performance benefits were verified.
llvm-svn: 252316
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This recommits r250719, which caused a failure in SPEC2000.gcc
because of the incorrect insert point for the new wider load.
Convert two halfword loads into a single 32-bit word load with bitfield extract
instructions. For example:
ldrh w0, [x2]
ldrh w1, [x2, #2]
becomes
ldr w0, [x2]
ubfx w1, w0, #16, #16
and w0, w0, #0xffff
llvm-svn: 251438
|
|
|
|
|
|
| |
This reverts commit r250719, which introduced a codegen fault in SPEC2000.gcc when compiled for Cortex-A53.
llvm-svn: 251108
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Convert two halfword loads into a single 32-bit word load with bitfield extract
instructions. For example:
ldrh w0, [x2]
ldrh w1, [x2, #2]
becomes
ldr w0, [x2]
ubfx w1, w0, #16, #16
and w0, w0, #0xffff
llvm-svn: 250719
|
|
|
|
|
|
|
| |
Support for pairing unscaled loads and stores has been enabled since the
original ARM64 port. This feature is no longer experimental, AFAICT.
llvm-svn: 249049
|
|
|
|
| |
llvm-svn: 249011
|
|
|
|
| |
llvm-svn: 249008
|
|
|
|
| |
llvm-svn: 249007
|
|
|
|
|
|
|
|
| |
Previously, the index was constrained to the size of the memory operation for
no apparent reason. This change removes that constraint so that we can form
pre-index instructions with any valid offset.
llvm-svn: 248931
|
|
|
|
| |
llvm-svn: 248914
|
|
|
|
| |
llvm-svn: 248825
|
|
|
|
| |
llvm-svn: 248817
|
|
|
|
|
|
|
|
|
| |
The immediate in the load/store should be scaled by the size of the memory
operation, not the size of the register being loaded/stored. This change gets
us one step closer to forming LDPSW instructions. This change also enables
pre- and post-indexing for halfword and byte loads and stores.
llvm-svn: 248804
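To make the scaling rule concrete, here is a minimal Python sketch. The table and function names are hypothetical and only illustrate the arithmetic. The point is that an instruction such as LDRSW reads 4 bytes from memory but writes a 64-bit register, so its scaled immediate counts in units of 4, not 8.

```python
def byte_offset(scaled_imm, mem_size_bytes):
    """Byte offset encoded by a scaled unsigned-immediate load/store."""
    return scaled_imm * mem_size_bytes

# Hypothetical table: (memory access size, destination register size) in bytes.
ACCESS = {
    "ldrsw": (4, 8),  # loads a 32-bit word, sign-extends into a 64-bit X register
    "ldrh": (2, 4),   # loads a halfword into a 32-bit W register
    "ldr_x": (8, 8),  # loads a doubleword into a 64-bit X register
}

def offset_for(op, scaled_imm):
    mem_size, _reg_size = ACCESS[op]
    return byte_offset(scaled_imm, mem_size)  # scale by memory size, not register size
```

Scaling by the register size would give the wrong address whenever the two sizes differ, which is the case for the sign-extending and narrow loads.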
|
|
|
|
| |
llvm-svn: 248800
|
|
|
|
| |
llvm-svn: 248593
|
|
|
|
|
|
| |
In this context, MI is an add/sub instruction, not a load/store.
llvm-svn: 248540
|
|
|
|
|
|
| |
function. NFC.
llvm-svn: 248377
|
|
|
|
|
|
|
|
| |
This reverts commit r246769.
This appears to have broken Multisource/Benchmarks/tramp3d-v4.
llvm-svn: 246782
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows the mixing of scaled and unscaled load/stores to form
load/store pairs.
PR24465
http://reviews.llvm.org/D12116
Many thanks to Ahmed and Michael for fixes and code review.
llvm-svn: 246769
|
|
|
|
| |
llvm-svn: 246767
|
|
|
|
|
|
|
| |
The call to mergePairedInsns() deletes MI, so the later use by isUnscaledLdSt()
is referencing freed memory.
llvm-svn: 246033
|
|
|
|
|
|
|
| |
This reverts commit r245443, as it broke AArch64 test-suite tramp3d
with the assertion 'Reg && "Null register has no regunits"'.
llvm-svn: 245455
|
|
|
|
| |
llvm-svn: 245443
|
|
|
|
| |
llvm-svn: 245307
|
|
|
|
|
|
| |
NFC.
llvm-svn: 244479
|
|
|
|
| |
llvm-svn: 244465
|
|
|
|
|
|
|
|
|
| |
At this point the given Opc must be valid, otherwise we should
not look for a matching pair to form paired load or store.
Thanks to Chad for pointing out this piece of code!
llvm-svn: 244366
|
|
|
|
| |
llvm-svn: 244233
|
|
|
|
| |
llvm-svn: 244222
|
|
|
|
|
|
|
|
|
| |
Summary: Among other things, this allows -print-after-all/-print-before-all to
dump IR around this pass.
This is the AArch64 version of r243052.
llvm-svn: 244041
|
|
|
|
| |
llvm-svn: 244038
|
|
|
|
|
|
|
| |
Also converted a cast<> to dyn_cast while I was working on the same
line of code.
llvm-svn: 243894
|
|
|
|
| |
llvm-svn: 242922
|
|
|
|
| |
llvm-svn: 242812
|
|
|
|
|
|
| |
This is setup for future work planned for the AArch64 Load/Store Opt pass.
llvm-svn: 242810
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Store instructions do not modify register values and therefore it's safe
to form a store pair even if the source register has been read in between
the two store instructions.
Previously, the read of w1 (see below) prevented the formation of a stp.
str w0, [x2]
ldr w8, [x2, #8]
add w0, w8, w1
str w1, [x2, #4]
ret
We now generate the following code.
stp w0, w1, [x2]
ldr w8, [x2, #8]
add w0, w8, w1
ret
All correctness tests with -Ofast on A57 with SPEC200x and EEMBC pass.
Performance results for SPEC2K were within noise.
llvm-svn: 239432
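The argument above can be checked with a toy interpreter. This Python sketch is a model under stated assumptions, not the pass itself: little-endian memory, `x2` used directly as an index into a byte array, and hypothetical function names. It runs the original and the paired sequence from the same starting state and compares the results.

```python
import struct

def run_original(regs, mem):
    r, m = dict(regs), bytearray(mem)
    struct.pack_into("<I", m, r["x2"], r["w0"])            # str w0, [x2]
    r["w8"] = struct.unpack_from("<I", m, r["x2"] + 8)[0]  # ldr w8, [x2, #8]
    r["w0"] = (r["w8"] + r["w1"]) & 0xFFFFFFFF             # add w0, w8, w1 (reads w1)
    struct.pack_into("<I", m, r["x2"] + 4, r["w1"])        # str w1, [x2, #4]
    return r, m

def run_paired(regs, mem):
    r, m = dict(regs), bytearray(mem)
    struct.pack_into("<II", m, r["x2"], r["w0"], r["w1"])  # stp w0, w1, [x2]
    r["w8"] = struct.unpack_from("<I", m, r["x2"] + 8)[0]  # ldr w8, [x2, #8]
    r["w0"] = (r["w8"] + r["w1"]) & 0xFFFFFFFF             # add w0, w8, w1
    return r, m
```

The intervening `add` only reads `w1`, so the value stored by the second `str` is the same whether the store happens before or after it.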
|
|
|
|
|
| |
Phabricator: http://reviews.llvm.org/D9863
llvm-svn: 237963
|
|
|
|
|
|
|
|
|
| |
This was previously returning int. However, there are no negative opcode
numbers and, more importantly, this was needlessly different from
MCInstrDesc::getOpcode() (which is in fact the value returned here) and
SDValue::getOpcode()/SDNode::getOpcode().
llvm-svn: 237611
|
|
|
|
|
|
|
|
|
|
|
| |
Teach the load store optimizer how to sign extend the result of a load pair when
doing so helps create more pairs.
The rationale is that loads are more expensive than sign extensions, so
combining several loads into one instruction is a win.
<rdar://problem/20072968>
llvm-svn: 231527
|
|
|
|
| |
llvm-svn: 227293
|
|
|
|
|
|
|
|
|
| |
This patch adds the missing LD[U]RSW variants to the load store optimizer, so
that we generate LDPSW when possible.
<rdar://problem/19583480>
llvm-svn: 226978
|
|
|
|
| |
llvm-svn: 215402
|
|
|
|
|
|
| |
information and update all callers. No functional change.
llvm-svn: 214781
|
|
|
|
|
|
| |
No functionality change.
llvm-svn: 213938
|
|
|
|
|
|
| |
No change in functionality.
llvm-svn: 210182
|