path: root/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
Commit message | Author | Age | Files | Lines
* Add optimization bisect opt-in calls for AArch64 passes | Andrew Kaylor | 2016-04-25 | 1 | -0/+3
    Differential Revision: http://reviews.llvm.org/D19394
    llvm-svn: 267479
* Add MachineFunctionProperty checks for AllVRegsAllocated for target passes | Derek Schuff | 2016-04-04 | 1 | -0/+5
    Summary:
    This adds the same checks that were added in r264593 to all target-specific passes that run after register allocation.

    Reviewers: qcolombet
    Subscribers: jyknight, dsanders, llvm-commits
    Differential Revision: http://reviews.llvm.org/D18525
    llvm-svn: 265313
* [AArch64] Handle missing store pair opportunity | Jun Bum Lim | 2016-03-31 | 1 | -22/+23
    Summary:
    This change handles a missed store-pair opportunity where the first store instruction stores zero and is followed by a non-zero store. For example, this change will convert:

        str wzr, [x8]
        str w1, [x8, #4]

    into:

        stp wzr, w1, [x8]

    Reviewers: jmolloy, t.p.northover, mcrosier
    Subscribers: flyingforyou, aemerson, rengolin, mcrosier, llvm-commits
    Differential Revision: http://reviews.llvm.org/D18570
    llvm-svn: 265021
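The pairing rule described above can be sketched as a toy model. This is not the pass's code: `Store32` and `pairStores` are hypothetical names, and the model only checks the base register, adjacency, and the immediate range of a 32-bit `stp`. The point it illustrates is that `wzr` is treated like any other source register.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical model of a 32-bit store: source register, base register,
// and a byte offset from the base.
struct Store32 {
  std::string Src;
  std::string Base;
  int Offset;
};

// Returns the paired instruction text, or std::nullopt if the two stores
// cannot be combined under this simplified model.
std::optional<std::string> pairStores(const Store32 &First,
                                      const Store32 &Second) {
  if (First.Base != Second.Base)
    return std::nullopt;
  // The second store must write the word immediately after the first.
  if (Second.Offset != First.Offset + 4)
    return std::nullopt;
  // A 32-bit STP encodes a signed 7-bit immediate scaled by 4.
  if (First.Offset % 4 != 0 || First.Offset < -256 || First.Offset > 252)
    return std::nullopt;
  std::string Addr = "[" + First.Base;
  if (First.Offset != 0)
    Addr += ", #" + std::to_string(First.Offset);
  Addr += "]";
  // The zero register needs no special handling here.
  return "stp " + First.Src + ", " + Second.Src + ", " + Addr;
}
```

Calling `pairStores` with `str wzr, [x8]` followed by `str w1, [x8, #4]` yields the `stp` form from the commit message; stores that are not adjacent yield no pair.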
* [AArch64] Fix warnings pointed out by Hal. | Chad Rosier | 2016-03-30 | 1 | -1/+5
    llvm-svn: 264882
* [AArch64] Enable more load clustering in the MI Scheduler. | Chad Rosier | 2016-03-18 | 1 | -29/+2
    This patch adds unscaled loads and sign-extend loads to the TII getMemOpBaseRegImmOfs API, which is used to control clustering in the MI scheduler. This is done to create more opportunities for load pairing. I've also added the scaled LDRSWui instruction, which was missing from the scaled instructions. Finally, I've added support in shouldClusterLoads for clustering adjacent sext and zext loads that can also be paired by the load/store optimizer.

    Differential Revision: http://reviews.llvm.org/D18048
    llvm-svn: 263819
* [AArch64] Move helper functions into TII, so they can be reused elsewhere. NFC. | Chad Rosier | 2016-03-09 | 1 | -47/+21
    llvm-svn: 263032
* [AArch64] Add MMOs to unscaled pairs. | Chad Rosier | 2016-03-08 | 1 | -3/+2
    Test to be committed in follow up commit, per discussion in D17097.

    http://reviews.llvm.org/D17097
    llvm-svn: 262942
* [AArch64] Add support for Qualcomm Kryo CPU. | Chad Rosier | 2016-02-12 | 1 | -1/+1
    Machine model description by Dave Estes <cestes@codeaurora.org>.

    llvm-svn: 260686
* [AArch64] Merge two adjacent str WZR into str XZR | Jun Bum Lim | 2016-02-12 | 1 | -15/+30
    Summary:
    This change merges adjacent 32 bit zero stores into a 64 bit zero store. e.g.,

        str wzr, [x0]
        str wzr, [x0, #4]

    becomes

        str xzr, [x0]

    Therefore, four adjacent 32 bit zero stores will be a single stp. e.g.,

        str wzr, [x0]
        str wzr, [x0, #4]
        str wzr, [x0, #8]
        str wzr, [x0, #12]

    becomes

        stp xzr, xzr, [x0]

    Reviewers: mcrosier, jmolloy, gberry, t.p.northover
    Subscribers: aemerson, rengolin, mcrosier, llvm-commits
    Differential Revision: http://reviews.llvm.org/D16933
    llvm-svn: 260682
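The two-step effect described above (widen zero stores first, then pair the widened stores) can be sketched as follows. This is a simplified illustration under stated assumptions: `ZeroStore` and `mergeZeroStores` are hypothetical names, the base register is fixed to `x0`, the input offsets are sorted, and alignment and immediate-range limits are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record of a zero store: byte offset from x0 and size in bytes.
struct ZeroStore {
  int Off;
  int Size;
};

static std::string addr(int Off) {
  return Off == 0 ? "[x0]" : "[x0, #" + std::to_string(Off) + "]";
}

// Offs holds the sorted byte offsets of `str wzr, [x0, #off]` instructions.
std::vector<std::string> mergeZeroStores(const std::vector<int> &Offs) {
  // Step 1: fold each adjacent pair of 32-bit zero stores into one
  // 64-bit zero store.
  std::vector<ZeroStore> Stores;
  for (std::size_t I = 0; I < Offs.size(); ++I) {
    if (I + 1 < Offs.size() && Offs[I + 1] == Offs[I] + 4) {
      Stores.push_back({Offs[I], 8});
      ++I; // the next store was absorbed
    } else {
      Stores.push_back({Offs[I], 4});
    }
  }
  // Step 2: pair adjacent 64-bit zero stores into a single stp.
  std::vector<std::string> Out;
  for (std::size_t I = 0; I < Stores.size(); ++I) {
    const ZeroStore &S = Stores[I];
    if (S.Size == 8 && I + 1 < Stores.size() && Stores[I + 1].Size == 8 &&
        Stores[I + 1].Off == S.Off + 8) {
      Out.push_back("stp xzr, xzr, " + addr(S.Off));
      ++I; // the next store was absorbed
    } else {
      Out.push_back(std::string("str ") + (S.Size == 8 ? "xzr" : "wzr") +
                    ", " + addr(S.Off));
    }
  }
  return Out;
}
```

With offsets 0 and 4 the sketch produces one `str xzr`; with offsets 0, 4, 8, and 12 it produces one `stp xzr, xzr`, matching the two examples in the commit message.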
* [AArch64] Refactoring findMatchingStore() in aarch64-ldst-opt; NFC | Jun Bum Lim | 2016-02-11 | 1 | -11/+13
    Summary:
    This change makes findMatchingStore() follow the same coding style introduced in r260275.

    Reviewers: gberry, junbuml
    Subscribers: aemerson, rengolin, haicheng, bmakam, mssimpso
    Differential Revision: http://reviews.llvm.org/D17083
    llvm-svn: 260534
* [AArch64] Improve load/store optimizer to handle LDUR + LDR. | Chad Rosier | 2016-02-11 | 1 | -11/+68
    This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs.

    This is a reapplication of r259812, which had an incorrect assert. The test_stur_str_no_assert() test is a reduced version of the issue hit in the AArch64 self-host.

    PR24465
    llvm-svn: 260523
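Mixing scaled and unscaled forms comes down to comparing offsets in a common unit. The sketch below is a hypothetical model, not the pass's implementation: `MemOp`, `byteOffset`, and `canPair` are illustrative names. It assumes that a scaled immediate counts in units of the access size, that an unscaled immediate counts in bytes, and that the paired form takes a signed 7-bit immediate scaled by the access size.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical description of a load or store off a common base register.
struct MemOp {
  bool Scaled; // true: Imm counts in units of Size; false: Imm is in bytes
  int Imm;
  int Size;    // access size in bytes
};

// Normalize the immediate to a byte offset.
int byteOffset(const MemOp &M) { return M.Scaled ? M.Imm * M.Size : M.Imm; }

// True if the two accesses could form a pair under this simplified model.
bool canPair(const MemOp &A, const MemOp &B) {
  if (A.Size != B.Size)
    return false;
  int OffA = byteOffset(A), OffB = byteOffset(B);
  int Lo = std::min(OffA, OffB), Hi = std::max(OffA, OffB);
  // The accesses must be exactly adjacent in memory.
  if (Hi - Lo != A.Size)
    return false;
  // The pair's offset must be a multiple of the access size...
  if (Lo % A.Size != 0)
    return false;
  // ...and fit in a signed 7-bit scaled immediate.
  int ScaledLo = Lo / A.Size;
  return ScaledLo >= -64 && ScaledLo <= 63;
}
```

For example, an unscaled 8-byte access at byte offset -8 and a scaled 8-byte access at immediate 0 are adjacent and pass the check, while unscaled accesses at -4 and 4 are adjacent but not aligned to the pair's scale.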
* [AArch64] Refactor is logic into a helper function. NFC. | Chad Rosier | 2016-02-10 | 1 | -12/+22
    llvm-svn: 260419
* [AArch64] Update comment to match reality. NFC. | Chad Rosier | 2016-02-10 | 1 | -2/+2
    llvm-svn: 260406
* [AArch64] This bit of logic is specific to pairing. NFC. | Chad Rosier | 2016-02-10 | 1 | -8/+10
    llvm-svn: 260383
* [AArch64] This check is specific to merging instructions. NFC. | Chad Rosier | 2016-02-09 | 1 | -4/+4
    llvm-svn: 260283
* [AArch64] AArch64LoadStoreOptimizer: fix bug in pre-inc check iterator | Geoff Berry | 2016-02-09 | 1 | -8/+9
    Summary:
    Fix case where a pre-inc/dec load/store would not be formed if the add/sub that forms the inc/dec part of the operation was the first instruction in the block being examined.

    Reviewers: mcrosier, jmolloy, t.p.northover, junbuml
    Subscribers: aemerson, rengolin, mcrosier, llvm-commits
    Differential Revision: http://reviews.llvm.org/D16785
    llvm-svn: 260275
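The class of bug described above is easy to reproduce in a backward scan. The sketch below is a hypothetical illustration, not the code from the patch: `findBackward` is an invented helper, and instructions are modeled as strings. A loop of the shape `for (--I; I != 0; --I)` stops before testing index 0, so an update instruction at the start of the block is never seen; the do-while form below examines every preceding instruction, including the first.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the index of the nearest instruction before position From that
// equals Wanted, or -1 if there is none. Index 0 is examined as well.
int findBackward(const std::vector<std::string> &Block, int From,
                 const std::string &Wanted) {
  if (From == 0)
    return -1; // nothing precedes the first instruction
  int I = From;
  do {
    --I;
    if (Block[I] == Wanted)
      return I;
  } while (I != 0);
  return -1;
}
```

With a two-instruction block whose first instruction is the add and whose second is the load, a scan starting at the load finds the add at index 0.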
* [AArch64] Bail even earlier if the instruction modifies the base register. NFC. | Chad Rosier | 2016-02-09 | 1 | -5/+6
    llvm-svn: 260274
* [AArch64] Simplify. NFC. | Chad Rosier | 2016-02-09 | 1 | -3/+1
    llvm-svn: 260273
* [AArch64] Add an assert to ensure we don't scale an offset that can't be scaled. | Chad Rosier | 2016-02-09 | 1 | -1/+3
    llvm-svn: 260272
* [AArch64] Add a FIXME about invalid KILL markers after the ld/st opt pass. | Chad Rosier | 2016-02-09 | 1 | -0/+5
    llvm-svn: 260264
* [AArch64] Remove redundant calls and clang format. NFC. | Chad Rosier | 2016-02-09 | 1 | -42/+40
    llvm-svn: 260260
* [AArch64] Hoist now common logic. NFC. | Chad Rosier | 2016-02-09 | 1 | -13/+9
    llvm-svn: 260257
* [AArch64] Rename variable to make it clear we're merging here, not pairing. | Chad Rosier | 2016-02-09 | 1 | -19/+19
    llvm-svn: 260256
* [AArch64] Separate the codegen logic for widening vs. pairing. NFC. | Chad Rosier | 2016-02-09 | 1 | -38/+94
    llvm-svn: 260249
* [AArch64] Cleanup to simplify logic when widening vs. pairing loads/stores. NFC. | Chad Rosier | 2016-02-09 | 1 | -13/+50
    The logic to pair instructions and merge narrow instructions has become kludgy and error prone. This patch begins to unravel these two similar, but distinct optimizations.

    llvm-svn: 260242
* [AArch64] Rename variable to improve readability. NFC. | Chad Rosier | 2016-02-09 | 1 | -5/+5
    llvm-svn: 260228
* [AArch64] Remove stale comment. | Chad Rosier | 2016-02-09 | 1 | -3/+0
    llvm-svn: 260226
* [AArch64] Refactoring aarch64-ldst-opt. NFC. | Jun Bum Lim | 2016-02-05 | 1 | -25/+38
    Remove narrow load / store instructions from getMatchingPairOpcode(), and add getMatchingWideOpcode().

    llvm-svn: 259914
* Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR (take 3)." | Renato Golin | 2016-02-05 | 1 | -76/+21
    This reverts commit r259812 as it broke AArch64 self-hosting.

    llvm-svn: 259881
* [AArch64] Bound the number of instructions we scan when searching for updates. | Chad Rosier | 2016-02-04 | 1 | -14/+26
    This only impacts the creation of pre-/post-index instructions. The bound was set high enough such that it did not change code generation for SPEC200X.

    llvm-svn: 259828
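A bounded scan of the kind described above can be sketched as follows. This is an assumption-laden illustration: `findForwardBounded` and its `Limit` parameter are hypothetical, instructions are modeled as strings, and the actual bound used by the pass is not stated in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Scans forward from the instruction after From and returns the index of
// the first instruction equal to Wanted. Gives up and returns -1 once more
// than Limit instructions have been examined.
int findForwardBounded(const std::vector<std::string> &Block, std::size_t From,
                       const std::string &Wanted, unsigned Limit) {
  unsigned Count = 0;
  for (std::size_t I = From + 1; I < Block.size(); ++I) {
    if (++Count > Limit)
      return -1; // scan budget exhausted
    if (Block[I] == Wanted)
      return static_cast<int>(I);
  }
  return -1;
}
```

The bound trades a possibly missed pre-/post-index opportunity in very long blocks for a predictable scan cost per candidate.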
* [AArch64] Improve load/store optimizer to handle LDUR + LDR (take 3). | Chad Rosier | 2016-02-04 | 1 | -21/+76
    This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs.

    PR24465
    http://reviews.llvm.org/D12116
    Many thanks to Ahmed and Michael for fixes and code review.

    This is a reapplication of r246769 and r259790. The tramp3d failure was caused by an incorrect refactoring in the patch. Specifically, we weren't always properly clearing the SExtIdx flag.

    llvm-svn: 259812
* Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR." | Chad Rosier | 2016-02-04 | 1 | -77/+22
    This reverts commit r259790. tramp3d-v4 is still having problems.

    llvm-svn: 259795
* [AArch64] Improve load/store optimizer to handle LDUR + LDR. | Chad Rosier | 2016-02-04 | 1 | -22/+77
    This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs.

    PR24465
    http://reviews.llvm.org/D12116
    Many thanks to Ahmed and Michael for fixes and code review.

    This is a reapplication of r246769, which was reverted in r246782 due to a test-suite failure. I'm unable to reproduce the issue at this time.

    llvm-svn: 259790
* [AArch64] Add a FIXME comment. | Chad Rosier | 2016-02-02 | 1 | -0/+2
    llvm-svn: 259515
* [AArch64] Allocate the modified and used regs only once per function. | Chad Rosier | 2016-02-02 | 1 | -12/+17
    llvm-svn: 259510
* Move comments a bit closer to associated code. NFC. | Chad Rosier | 2016-02-01 | 1 | -29/+25
    llvm-svn: 259411
* [AArch64] Set MMOs on pre- and post-index instructions. | Chad Rosier | 2016-01-28 | 1 | -2/+4
    Without the MMOs the MI scheduler is unable to reason about the dependencies of these instructions.

    llvm-svn: 259052
* [AArch64] Remove a bunch of useless FIXME comments. | Chad Rosier | 2016-01-19 | 1 | -4/+0
    llvm-svn: 258193
* [AArch64] Remove more dead code after r258093. | Chad Rosier | 2016-01-19 | 1 | -12/+4
    llvm-svn: 258191
* [AArch64] Remove unused arguments. NFC. | Chad Rosier | 2016-01-18 | 1 | -7/+7
    AFAICT, these have been unused since the initial backend import.

    llvm-svn: 258093
* Update to use new name alignTo(). | Rui Ueyama | 2016-01-14 | 1 | -1/+1
    llvm-svn: 257804
* Extract helper function to merge MemoryOperand lists [NFC] | Philip Reames | 2016-01-06 | 1 | -22/+4
    In the discussion on http://reviews.llvm.org/D15730, Andy pointed out we had a utility function for merging MMO lists. Since it turned out we actually had two copies and there's another review in progress (http://reviews.llvm.org/D15230) which needs the same, extract it into a utility function and clean up the interfaces to make it easier to use with a MachineInstBuilder.

    I introduced a pair here to track size and allocation together. I think we should probably move in the direction of the MachineOperandsRef helper class, but I'm leaving that for further work. I want to get the poison state introduced before I make major changes to the interface.

    Differential Revision: http://reviews.llvm.org/D15757
    llvm-svn: 256909
* [AArch64] Promote loads from stores | Jun Bum Lim | 2015-12-22 | 1 | -3/+280
    This is a recommit of r256004 which was reverted in r256160. The issue was the incorrect promotion for half and byte loads transformed into mov instructions. This fix will replace half and byte type loads only with bit field extracts.

    Original commit message:

    This change promotes load instructions which directly read from stores by replacing them with mov instructions. If the store is wider than the load, the load will be replaced with a bitfield extract. For example:

        STRWui %W1, %X0, 1
        %W0 = LDRHHui %X0, 3

    becomes

        STRWui %W1, %X0, 1
        %W0 = UBFMWri %W1, 16, 31

    llvm-svn: 256249
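The `16, 31` operands in the example above follow from the byte offsets of the two accesses. The sketch below works that arithmetic out under stated assumptions: a little-endian target, scaled immediates (so `STRWui ..., 1` is byte offset 4 and `LDRHHui ..., 3` is byte offset 6), and a load that lies entirely inside the stored range. `Bitfield`, `bitfieldForLoad`, and `ubfmExtract` are hypothetical helper names, not functions from the pass.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Operands of a bitfield extract: first and last bit of the field.
struct Bitfield {
  unsigned Immr;
  unsigned Imms;
};

// Offsets and sizes are in bytes. Returns std::nullopt if the load is not
// fully covered by the store.
std::optional<Bitfield> bitfieldForLoad(unsigned StoreOff, unsigned StoreSize,
                                        unsigned LoadOff, unsigned LoadSize) {
  if (LoadOff < StoreOff || LoadOff + LoadSize > StoreOff + StoreSize)
    return std::nullopt;
  unsigned Lsb = (LoadOff - StoreOff) * 8; // first bit of the loaded field
  return Bitfield{Lsb, Lsb + LoadSize * 8 - 1};
}

// Models UBFM in its extract form (Imms >= Immr) on a 32-bit register:
// bits Immr..Imms of Value are moved to the low bits, zero-extended.
uint32_t ubfmExtract(uint32_t Value, unsigned Immr, unsigned Imms) {
  unsigned Width = Imms - Immr + 1;
  uint64_t Mask = (uint64_t{1} << Width) - 1;
  return static_cast<uint32_t>((Value >> Immr) & Mask);
}
```

For the example in the commit message, `bitfieldForLoad(4, 4, 6, 2)` gives bits 16 through 31, so the halfword load is replaced by an extract of the upper half of the stored register.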
* Revert "[AArch64] Promote loads from stores" | Jun Bum Lim | 2015-12-21 | 1 | -280/+3
    This reverts commit r256004 due to a failure in cortex-a53.

    llvm-svn: 256160
* [AArch64] Promote loads from stores | Jun Bum Lim | 2015-12-18 | 1 | -3/+280
    This change promotes load instructions which directly read from stores by replacing them with mov instructions. If the store is wider than the load, the load will be replaced with a bitfield extract. For example:

        STRWui %W1, %X0, 1
        %W0 = LDRHHui %X0, 3

    becomes

        STRWui %W1, %X0, 1
        %W0 = UBFMWri %W1, 16, 31

    llvm-svn: 256004
* [AArch64] Merge narrow zero stores to a wider store | Jun Bum Lim | 2015-11-20 | 1 | -16/+80
    This change merges adjacent zero stores into a wider single store. For example:

        strh wzr, [x0]
        strh wzr, [x0, #2]

    becomes

        str wzr, [x0]

    This will fix PR25410.

    llvm-svn: 253711
* [AArch64] Refactoring aarch64-ldst-opt. NFC. | Jun Bum Lim | 2015-11-19 | 1 | -16/+13
    Summary:
    * Rename isSmallTypeLdMerge() to isNarrowLoad().
    * Rename NumSmallTypeMerged to NumNarrowTypePromoted.
    * Use Subtarget defined as a member variable.

    llvm-svn: 253587
* [AArch64] Extend merging narrow loads into a wider load | Jun Bum Lim | 2015-11-19 | 1 | -26/+107
    This change extends r251438 to handle more narrow load promotions including byte type, unscaled, and signed. For example, this change will convert:

        ldursh w1, [x0, #-2]
        ldurh w2, [x0, #-4]

    into

        ldur w2, [x0, #-4]
        asr w1, w2, #16
        and w2, w2, #0xffff

    llvm-svn: 253577
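The equivalence of the two instruction sequences above can be checked with a small model. This is a sketch under stated assumptions: little-endian memory, a pointer `P` standing in for the address `x0 - 4`, and hypothetical helper names (`Regs`, `narrowLoads`, `mergedLoad`). Sign extension is written out by hand so the model does not depend on how the host shifts signed values.

```cpp
#include <cassert>
#include <cstdint>

static uint32_t load16LE(const uint8_t *P) {
  return uint32_t(P[0]) | (uint32_t(P[1]) << 8);
}

static uint32_t load32LE(const uint8_t *P) {
  return load16LE(P) | (load16LE(P + 2) << 16);
}

// Sign-extends the low 16 bits of V to 32 bits.
static int32_t signExtend16(uint32_t V) {
  return int32_t((V & 0xffffu) ^ 0x8000u) - 0x8000;
}

struct Regs {
  int32_t W1;  // signed halfword result
  uint32_t W2; // unsigned halfword result
};

// The original pair of narrow loads. P models the address x0 - 4.
Regs narrowLoads(const uint8_t *P) {
  Regs R;
  R.W1 = signExtend16(load16LE(P + 2)); // ldursh w1, [x0, #-2]
  R.W2 = load16LE(P);                   // ldurh  w2, [x0, #-4]
  return R;
}

// The merged form: one word load followed by two extracts.
Regs mergedLoad(const uint8_t *P) {
  uint32_t W = load32LE(P);     // ldur w2, [x0, #-4]
  Regs R;
  R.W1 = signExtend16(W >> 16); // asr w1, w2, #16
  R.W2 = W & 0xffffu;           // and w2, w2, #0xffff
  return R;
}
```

Both functions return the same register values for any four bytes of memory, which is the property the transformation relies on.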
* [AArch64] Fix halfword load merging for big-endian targets | Oliver Stannard | 2015-11-10 | 1 | -3/+9
    For big-endian targets, when we merge two halfword loads into a word load, the order of the halfwords in the loaded value is reversed compared to little-endian, so the load-store optimiser needs to swap the destination registers.

    This does not affect merging of two word loads, as we use ldp, which treats the memory as two separate 32-bit words.

    llvm-svn: 252597
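The reversal described above can be seen by loading the same four bytes under both byte orders. The sketch below is illustrative only; `load16`, `load32`, and `lowAddressHalfIsHigh` are hypothetical helpers that model memory as a byte array rather than anything taken from the pass.

```cpp
#include <cassert>
#include <cstdint>

static uint32_t load16(const uint8_t *P, bool BigEndian) {
  uint32_t B0 = P[0], B1 = P[1];
  return BigEndian ? (B0 << 8) | B1 : B0 | (B1 << 8);
}

static uint32_t load32(const uint8_t *P, bool BigEndian) {
  uint32_t B0 = P[0], B1 = P[1], B2 = P[2], B3 = P[3];
  return BigEndian ? (B0 << 24) | (B1 << 16) | (B2 << 8) | B3
                   : B0 | (B1 << 8) | (B2 << 16) | (B3 << 24);
}

// True if the halfword at the lower address ends up in the high 16 bits of
// the merged word, which is when the destination registers must be swapped
// relative to the little-endian case.
bool lowAddressHalfIsHigh(const uint8_t *P, bool BigEndian) {
  return (load32(P, BigEndian) >> 16) == load16(P, BigEndian);
}
```

For bytes `11 22 33 44`, the big-endian word is `0x11223344`, so the lower-address halfword `0x1122` sits in the high half; the little-endian word is `0x44332211`, where the lower-address halfword `0x2211` sits in the low half.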
* [AArch64] Enable the narrow ld promotion only on profitable microarchitectures | Jun Bum Lim | 2015-11-06 | 1 | -8/+22
    The benefit from converting narrow loads into a wider load (r251438) could be micro-architecturally dependent, as it assumes that a single load with two bitfield extracts is cheaper than two narrow loads. Currently, this conversion is enabled only in cortex-a57 on which performance benefits were verified.

    llvm-svn: 252316