|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| | other minor fixes (NFC).
llvm-svn: 292996 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Kill flags need to be updated correctly when moving stores up/down to
form store pair instructions.
Those invalid flags have been ignored before but as of r290014 they are
recognized when using -mllvm -verify-machineinstrs.
Also simplifies test/CodeGen/AArch64/ldst-opt-dbg-limit.mir, renames it
to ldst-opt.mir test and adds a new tests for this change.
Differential Revision: https://reviews.llvm.org/D28875
llvm-svn: 292625 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.
See https://reviews.llvm.org/D28057 for the whole discussion.
Differential Revision: https://reviews.llvm.org/D28556
llvm-svn: 291891 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Fix early-exit analysis for memory operation pairing when operations are
not emitted in ascending order.
Reviewers: mcrosier, t.p.northover
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28251
llvm-svn: 291008 | 
| | 
| 
| 
| 
| 
| | Differential Revision: https://reviews.llvm.org/D27559
llvm-svn: 290014 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
When searching for load/store instructions to pair/merge don't treat
writes to WZR/XZR as clobbers since they don't change the value read
from WZR/XZR (which is always 0).
Reviewers: mcrosier, junbuml, jmolloy, t.p.northover
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D26921
llvm-svn: 287592 | 
| | 
| 
| 
| | llvm-svn: 286625 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This optimization merges adjacent zero stores into a wider store.
e.g.,
strh wzr, [x0]
strh wzr, [x0, #2]
; becomes
str wzr, [x0]
e.g.,
str wzr, [x0]
str wzr, [x0, #4]
; becomes
str xzr, [x0]
Previously, this was only enabled for Kryo and Cortex-A57.
Differential Revision: https://reviews.llvm.org/D26396
llvm-svn: 286592 | 
| | 
| 
| 
| | llvm-svn: 286137 | 
| | 
| 
| 
| 
| 
| 
| 
| | This feature has been disabled for some time now, so remove cruft.
Differential Revision: https://reviews.llvm.org/D26248
llvm-svn: 286110 | 
| | 
| 
| 
| | llvm-svn: 283004 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | compute it
Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of
running after register and simply describes that no vregs are used in
a machine function. With that we can simply compute the property and do
not need to dump/parse it in .mir files.
Differential Revision: http://reviews.llvm.org/D23850
llvm-svn: 279698 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | loads/stores.
The existing code accidentally skipped the aliasing check in edge cases.
Differential revision: https://reviews.llvm.org/D23372
llvm-svn: 278562 | 
| | 
| 
| 
| 
| 
| 
| 
| | Trunk would try to create something like "stp x9, x8, [x0], #512", which isn't actually a valid instruction.
Differential revision: https://reviews.llvm.org/D23368
llvm-svn: 278559 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This re-factoring could cause the following slight changes in generated
code, though none were observed during testing:
- MachineScheduler could decide not to cluster some loads/stores if
  there are other load/stores with non-pairable opcodes that have the
  same base register and offset as a pairable set of load/stores.  One
  case of different MachineScheduler pairing did show up in my testing,
  but it wasn't due to this issue, but due
  BaseMemOpClusterMutation::clusterNeighboringMemOps() being unstable
  w.r.t. the order it considers memory operations.  See PR28942.
- The ImplicitNullChecks optimization could be done for more load/store
  opcodes.  This optimization isn't done for C/C++ code, so it didn't
  show up in my testing.
Reviewers: mcrosier, t.p.northover
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D23365
llvm-svn: 278515 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | limits.
Summary:
This change also changes findMatchingInsn and
findMatchingUpdateInsnForward to take DBG_VALUE opcodes into account
when tracking register defs and uses, which could potentially inhibit
these optimizations in the presence of debug information.
Reviewers: mcrosier
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D22582
llvm-svn: 276293 | 
| | 
| 
| 
| 
| 
| | -run-pass. NFCI.
llvm-svn: 276193 | 
| | 
| 
| 
| 
| 
| 
| 
| | Avoid implicit conversions from MachineInstrBundleInstr to MachineInstr*
in the AArch64 backend, mainly by preferring MachineInstr& over
MachineInstr* when a pointer isn't nullable.
llvm-svn: 274924 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr.  This is a
general API improvement.
Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other.  Instead I've done everything as a block and just
updated what was necessary.
This is mostly mechanical fixes: adding and removing `*` and `&`
operators.  The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency.  Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy.  I couldn't run tests
for AVR since llc doesn't link with it turned on.
llvm-svn: 274189 | 
| | 
| 
| 
| | llvm-svn: 273129 | 
| | 
| 
| 
| | llvm-svn: 272430 | 
| | 
| 
| 
| | llvm-svn: 272429 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Testing for specific CPUs has a number of problems, better use subtarget
features:
- When some tweak is added for a specific CPU it is often desirable for
  the next version of that CPU as well, yet we often forget to add it.
- It is hard to keep track of checks scattered around the target code;
  Declaring all target specifics together with the CPU in the tablegen
  file is a clear representation.
- Subtarget features can be tweaked from the command line.
To discourage people from using CPU checks in the future I removed the
isCortexXX(), isCyclone(), ... functions. I added an getProcFamily()
function for exceptional circumstances but made it clear in the comment
that usage is discouraged.
Reformat feature list in AArch64.td to have 1 feature per line in
alphabetical order to simplify merging and sorting for out of tree
tweaks.
No functional change intended.
Differential Revision: http://reviews.llvm.org/D20762
llvm-svn: 271555 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
As this optimization converts two loads into one load with two shift instructions,
it could potentially hurt performance if a loop is arithmetic operation intensive.
Reviewers: t.p.northover, mcrosier, jmolloy
Subscribers: evandro, jmolloy, aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20172
llvm-svn: 270251 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary: This change refactors to decouple the zero store promotion from the narrow ld merge and add a flag (enable-narrow-ld-merge=true) to control the narrow ld merge optimization.
Reviewers: jmolloy, t.p.northover, mcrosier
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D19885
llvm-svn: 268744 | 
| | 
| 
| 
| 
| 
| | Differential Revision: http://reviews.llvm.org/D19394
llvm-svn: 267479 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This adds the same checks that were added in r264593 to all
target-specific passes that run after register allocation.
Reviewers: qcolombet
Subscribers: jyknight, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18525
llvm-svn: 265313 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This change will handle missing store pair opportunity where the first store
instruction stores zero followed by the non-zero store. For example, this change
will convert :
  str wzr, [x8]
  str w1, [x8, #4]
into:
  stp wzr, w1, [x8]
Reviewers: jmolloy, t.p.northover, mcrosier
Subscribers: flyingforyou, aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D18570
llvm-svn: 265021 | 
| | 
| 
| 
| | llvm-svn: 264882 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch adds unscaled loads and sign-extend loads to the TII
getMemOpBaseRegImmOfs API, which is used to control clustering in the MI
scheduler. This is done to create more opportunities for load pairing.  I've
also added the scaled LDRSWui instruction, which was missing from the scaled
instructions. Finally, I've added support in shouldClusterLoads for clustering
adjacent sext and zext loads that too can be paired by the load/store optimizer.
Differential Revision: http://reviews.llvm.org/D18048
llvm-svn: 263819 | 
| | 
| 
| 
| | llvm-svn: 263032 | 
| | 
| 
| 
| 
| 
| 
| | Test to be committed in follow up commit, per discussion in D17097.
http://reviews.llvm.org/D17097
llvm-svn: 262942 | 
| | 
| 
| 
| 
| 
| | Machine model description by Dave Estes <cestes@codeaurora.org>.
llvm-svn: 260686 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This change merges adjacent 32 bit zero stores into a 64 bit zero store.
e.g.,
  str wzr, [x0]
  str wzr, [x0, #4]
becomes
  str xzr, [x0]
Therefore, four adjacent 32 bit zero stores will be a single stp.
e.g.,
  str wzr, [x0]
  str wzr, [x0, #4]
  str wzr, [x0, #8]
  str wzr, [x0, #12]
becomes
  stp xzr, xzr, [x0]
Reviewers: mcrosier, jmolloy, gberry, t.p.northover
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D16933
llvm-svn: 260682 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary: This change makes findMatchingStore() follow the same coding style introduced in r260275.
Reviewers: gberry, junbuml
Subscribers: aemerson, rengolin, haicheng, bmakam, mssimpso
Differential Revision: http://reviews.llvm.org/D17083
llvm-svn: 260534 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch allows the mixing of scaled and unscaled load/stores to form
load/store pairs.
This is a reapplication of r259812, which had an incorrect assert.  The
test_stur_str_no_assert() test is a reduced version of the issue hit in
the AArch64 self-host.
PR24465
llvm-svn: 260523 | 
| | 
| 
| 
| | llvm-svn: 260419 | 
| | 
| 
| 
| | llvm-svn: 260406 | 
| | 
| 
| 
| | llvm-svn: 260383 | 
| | 
| 
| 
| | llvm-svn: 260283 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Fix case where a pre-inc/dec load/store would not be formed if the
add/sub that forms the inc/dec part of the operation was the first
instruction in the block being examined.
Reviewers: mcrosier, jmolloy, t.p.northover, junbuml
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D16785
llvm-svn: 260275 | 
| | 
| 
| 
| 
| 
| | NFC.
llvm-svn: 260274 | 
| | 
| 
| 
| | llvm-svn: 260273 | 
| | 
| 
| 
| | llvm-svn: 260272 | 
| | 
| 
| 
| | llvm-svn: 260264 | 
| | 
| 
| 
| | llvm-svn: 260260 | 
| | 
| 
| 
| | llvm-svn: 260257 | 
| | 
| 
| 
| | llvm-svn: 260256 | 
| | 
| 
| 
| | llvm-svn: 260249 | 
| | 
| 
| 
| 
| 
| 
| 
| | The logic to pair instructions and merge narrow instructions has become cloogy
and error prone.  This patch beings to unravel these two similar, but distinct
optimizations.
llvm-svn: 260242 |