path: root/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
Commit message | Author | Age | Files | Lines
...
* Untabify.NAKAMURA Takumi2016-06-201-1/+1
| | | | llvm-svn: 273129
* [AArch64] Move comments closer to relevant check. NFC.Chad Rosier2016-06-101-6/+4
| | | | llvm-svn: 272430
* [AArch64] Refactor a check earlier. NFC.Chad Rosier2016-06-101-12/+18
| | | | llvm-svn: 272429
* AArch64: Do not test for CPUs, use SubtargetFeaturesMatthias Braun2016-06-021-14/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Testing for specific CPUs has a number of problems; it is better to use subtarget features: - When some tweak is added for a specific CPU it is often desirable for the next version of that CPU as well, yet we often forget to add it. - It is hard to keep track of checks scattered around the target code; declaring all target specifics together with the CPU in the tablegen file is a clear representation. - Subtarget features can be tweaked from the command line. To discourage people from using CPU checks in the future I removed the isCortexXX(), isCyclone(), ... functions. I added a getProcFamily() function for exceptional circumstances but made it clear in the comment that usage is discouraged. Reformat feature list in AArch64.td to have 1 feature per line in alphabetical order to simplify merging and sorting for out of tree tweaks. No functional change intended. Differential Revision: http://reviews.llvm.org/D20762 llvm-svn: 271555
* [AArch64] Disable narrow load merge by defaultJun Bum Lim2016-05-201-1/+1
| | | | | | | | | | | | | | Summary: As this optimization converts two loads into one load with two shift instructions, it could potentially hurt performance if a loop is arithmetic operation intensive. Reviewers: t.p.northover, mcrosier, jmolloy Subscribers: evandro, jmolloy, aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20172 llvm-svn: 270251
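The trade-off in the summary above can be sketched in a few lines. This is only an illustrative model, not code from the pass: the helper names (load16, load32, lowHalf, highHalf, sampleMemory) are hypothetical, and a little-endian memory layout is assumed. It shows that two adjacent 16-bit loads yield the same values as one 32-bit load followed by two extra bit operations, which is where the additional arithmetic pressure comes from.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Memory = std::array<std::uint8_t, 8>;

// Hypothetical model of a 16-bit load at a byte offset (little-endian assumed).
std::uint16_t load16(const Memory &mem, std::size_t off) {
  assert(off + 2 <= mem.size());
  return static_cast<std::uint16_t>(mem[off] | (mem[off + 1] << 8));
}

// Hypothetical model of a 32-bit load at a byte offset (little-endian assumed).
std::uint32_t load32(const Memory &mem, std::size_t off) {
  assert(off + 4 <= mem.size());
  return static_cast<std::uint32_t>(mem[off]) |
         (static_cast<std::uint32_t>(mem[off + 1]) << 8) |
         (static_cast<std::uint32_t>(mem[off + 2]) << 16) |
         (static_cast<std::uint32_t>(mem[off + 3]) << 24);
}

// After merging: one wide load, then one extra operation per original load
// to recover each narrow value.
std::uint16_t lowHalf(std::uint32_t wide) {
  return static_cast<std::uint16_t>(wide & 0xFFFFu);
}
std::uint16_t highHalf(std::uint32_t wide) {
  return static_cast<std::uint16_t>(wide >> 16);
}

// Arbitrary sample bytes, used only for illustration.
Memory sampleMemory() {
  return Memory{{0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88}};
}
```

With this model, lowHalf(load32(m, 0)) matches load16(m, 0) and highHalf(load32(m, 0)) matches load16(m, 2), so the merged form saves a load at the cost of two extract operations.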
* [AArch64] Decouple zero store promotion from narrow ld merge. NFC.Jun Bum Lim2016-05-061-28/+16
| | | | | | | | | | | | Summary: This change refactors to decouple the zero store promotion from the narrow ld merge and add a flag (enable-narrow-ld-merge=true) to control the narrow ld merge optimization. Reviewers: jmolloy, t.p.northover, mcrosier Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19885 llvm-svn: 268744
* Add optimization bisect opt-in calls for AArch64 passesAndrew Kaylor2016-04-251-0/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D19394 llvm-svn: 267479
* Add MachineFunctionProperty checks for AllVRegsAllocated for target passesDerek Schuff2016-04-041-0/+5
| | | | | | | | | | | | | | Summary: This adds the same checks that were added in r264593 to all target-specific passes that run after register allocation. Reviewers: qcolombet Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18525 llvm-svn: 265313
* [AArch64] Handle missing store pair opportunityJun Bum Lim2016-03-311-22/+23
| | | | | | | | | | | | | | | | | | | | Summary: This change handles a missed store pair opportunity where the first store instruction stores zero and is followed by a non-zero store. For example, this change will convert: str wzr, [x8]; str w1, [x8, #4] into: stp wzr, w1, [x8] Reviewers: jmolloy, t.p.northover, mcrosier Subscribers: flyingforyou, aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18570 llvm-svn: 265021
* [AArch64] Fix warnings pointed out by Hal.Chad Rosier2016-03-301-1/+5
| | | | llvm-svn: 264882
* [AArch64] Enable more load clustering in the MI Scheduler.Chad Rosier2016-03-181-29/+2
| | | | | | | | | | | | | This patch adds unscaled loads and sign-extend loads to the TII getMemOpBaseRegImmOfs API, which is used to control clustering in the MI scheduler. This is done to create more opportunities for load pairing. I've also added the scaled LDRSWui instruction, which was missing from the scaled instructions. Finally, I've added support in shouldClusterLoads for clustering adjacent sext and zext loads that too can be paired by the load/store optimizer. Differential Revision: http://reviews.llvm.org/D18048 llvm-svn: 263819
* [AArch64] Move helper functions into TII, so they can be reused elsewhere. NFC.Chad Rosier2016-03-091-47/+21
| | | | llvm-svn: 263032
* [AArch64] Add MMOs to unscaled pairs.Chad Rosier2016-03-081-3/+2
| | | | | | | Test to be committed in follow up commit, per discussion in D17097. http://reviews.llvm.org/D17097 llvm-svn: 262942
* [AArch64] Add support for Qualcomm Kryo CPU.Chad Rosier2016-02-121-1/+1
| | | | | | Machine model description by Dave Estes <cestes@codeaurora.org>. llvm-svn: 260686
* [AArch64] Merge two adjacent str WZR into str XZRJun Bum Lim2016-02-121-15/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change merges adjacent 32 bit zero stores into a 64 bit zero store. e.g., str wzr, [x0]; str wzr, [x0, #4] becomes str xzr, [x0]. Therefore, four adjacent 32 bit zero stores will be a single stp. e.g., str wzr, [x0]; str wzr, [x0, #4]; str wzr, [x0, #8]; str wzr, [x0, #12] becomes stp xzr, xzr, [x0] Reviewers: mcrosier, jmolloy, gberry, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16933 llvm-svn: 260682
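Why the merge above is safe can be checked with a small memory model. The sketch below is not taken from the pass; storeZero32, storeZero64, fourNarrowStores and onePairedStore are hypothetical stand-ins for "str wzr", "str xzr", the four-store sequence and the final "stp xzr, xzr". It only demonstrates that both sequences leave the same bytes in memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Buffer = std::array<std::uint8_t, 16>;

// Hypothetical stand-in for "str wzr, [x0, #off]".
void storeZero32(Buffer &mem, std::size_t off) {
  assert(off + 4 <= mem.size());
  const std::uint32_t zero = 0;
  std::memcpy(mem.data() + off, &zero, sizeof zero);
}

// Hypothetical stand-in for "str xzr, [x0, #off]".
void storeZero64(Buffer &mem, std::size_t off) {
  assert(off + 8 <= mem.size());
  const std::uint64_t zero = 0;
  std::memcpy(mem.data() + off, &zero, sizeof zero);
}

// Original sequence: four adjacent 32-bit zero stores at offsets 0, 4, 8, 12.
Buffer fourNarrowStores() {
  Buffer mem;
  mem.fill(0xFF); // poison so that untouched bytes would be visible
  for (std::size_t off = 0; off < 16; off += 4)
    storeZero32(mem, off);
  return mem;
}

// Merged sequence: "stp xzr, xzr, [x0]" modelled as two 64-bit zero stores.
Buffer onePairedStore() {
  Buffer mem;
  mem.fill(0xFF);
  storeZero64(mem, 0);
  storeZero64(mem, 8);
  return mem;
}
```

Comparing fourNarrowStores() with onePairedStore() shows identical buffers, which is the property the merge relies on.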
* [AArch64] Refactoring findMatchingStore() in aarch64-ldst-opt; NFCJun Bum Lim2016-02-111-11/+13
| | | | | | | | | | | | Summary: This change makes findMatchingStore() follow the same coding style introduced in r260275. Reviewers: gberry, junbuml Subscribers: aemerson, rengolin, haicheng, bmakam, mssimpso Differential Revision: http://reviews.llvm.org/D17083 llvm-svn: 260534
* [AArch64] Improve load/store optimizer to handle LDUR + LDR.Chad Rosier2016-02-111-11/+68
| | | | | | | | | | | | | This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs. This is a reapplication of r259812, which had an incorrect assert. The test_stur_str_no_assert() test is a reduced version of the issue hit in the AArch64 self-host. PR24465 llvm-svn: 260523
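The idea of mixing scaled and unscaled forms can be sketched as follows. This is a simplified model under stated assumptions, not the pass's implementation: MemOp, scaledOp, unscaledOp, byteOffset and areAdjacent are hypothetical names. It assumes the scaled form (LDR with unsigned immediate) encodes its offset in units of the access size, while the unscaled form (LDUR) encodes a plain byte offset, so both have to be brought to a common unit before adjacency can be tested.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one load/store off a common base register.
struct MemOp {
  bool scaled;       // true: LDR-style scaled immediate; false: LDUR-style byte offset
  std::int64_t imm;  // immediate as encoded in the instruction
  std::int64_t size; // access size in bytes
};

MemOp scaledOp(std::int64_t imm, std::int64_t size) { return MemOp{true, imm, size}; }
MemOp unscaledOp(std::int64_t imm, std::int64_t size) { return MemOp{false, imm, size}; }

// Normalise both encodings to a byte offset.
std::int64_t byteOffset(const MemOp &op) {
  assert(op.size > 0);
  return op.scaled ? op.imm * op.size : op.imm;
}

// Two accesses of the same size are pairing candidates when their byte
// offsets differ by exactly one access size, in either order.
bool areAdjacent(const MemOp &a, const MemOp &b) {
  if (a.size != b.size)
    return false;
  const std::int64_t diff = byteOffset(a) - byteOffset(b);
  return diff == a.size || diff == -a.size;
}
```

For example, a scaled 4-byte load with immediate 1 (byte offset 4) and an unscaled 4-byte load with immediate 8 are adjacent in this model, whereas an unscaled immediate of 6 is not.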
* [AArch64] Refactor is logic into a helper function. NFC.Chad Rosier2016-02-101-12/+22
| | | | llvm-svn: 260419
* [AArch64] Update comment to match reality. NFC.Chad Rosier2016-02-101-2/+2
| | | | llvm-svn: 260406
* [AArch64] This bit of logic is specific to pairing. NFC.Chad Rosier2016-02-101-8/+10
| | | | llvm-svn: 260383
* [AArch64] This check is specific to merging instructions. NFC.Chad Rosier2016-02-091-4/+4
| | | | llvm-svn: 260283
* [AArch64] AArch64LoadStoreOptimizer: fix bug in pre-inc check iteratorGeoff Berry2016-02-091-8/+9
| | | | | | | | | | | | | | | Summary: Fix case where a pre-inc/dec load/store would not be formed if the add/sub that forms the inc/dec part of the operation was the first instruction in the block being examined. Reviewers: mcrosier, jmolloy, t.p.northover, junbuml Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16785 llvm-svn: 260275
* [AArch64] Bail even earlier if the instruction modifies the base register. NFC.Chad Rosier2016-02-091-5/+6
| | | | llvm-svn: 260274
* [AArch64] Simplify. NFC.Chad Rosier2016-02-091-3/+1
| | | | llvm-svn: 260273
* [AArch64] Add an assert to ensure we don't scale an offset that can't be scaled.Chad Rosier2016-02-091-1/+3
| | | | llvm-svn: 260272
* [AArch64] Add a FIXME about invalid KILL markers after the ld/st opt pass.Chad Rosier2016-02-091-0/+5
| | | | llvm-svn: 260264
* [AArch64] Remove redundant calls and clang format. NFC.Chad Rosier2016-02-091-42/+40
| | | | llvm-svn: 260260
* [AArch64] Hoist now common logic. NFC.Chad Rosier2016-02-091-13/+9
| | | | llvm-svn: 260257
* [AArch64] Rename variable to make it clear we're merging here, not pairing.Chad Rosier2016-02-091-19/+19
| | | | llvm-svn: 260256
* [AArch64] Separate the codegen logic for widening vs. pairing. NFC.Chad Rosier2016-02-091-38/+94
| | | | llvm-svn: 260249
* [AArch64] Cleanup to simplify logic when widening vs. pairing loads/stores. NFC.Chad Rosier2016-02-091-13/+50
| | | | | | | | The logic to pair instructions and merge narrow instructions has become kludgy and error prone. This patch begins to unravel these two similar, but distinct optimizations. llvm-svn: 260242
* [AArch64] Rename variable to improve readability. NFC.Chad Rosier2016-02-091-5/+5
| | | | llvm-svn: 260228
* [AArch64] Remove stale comment.Chad Rosier2016-02-091-3/+0
| | | | llvm-svn: 260226
* [AArch64] Refactoring aarch64-ldst-opt. NFC.Jun Bum Lim2016-02-051-25/+38
| | | | | | | Remove narrow load / store instructions from getMatchingPairOpcode(), and add getMatchingWideOpcode(). llvm-svn: 259914
* Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR (take 3)."Renato Golin2016-02-051-76/+21
| | | | | | This reverts commit r259812 as it broke AArch64 self-hosting. llvm-svn: 259881
* [AArch64] Bound the number of instructions we scan when searching for updates.Chad Rosier2016-02-041-14/+26
| | | | | | | This only impacts the creation of pre-/post-index instructions. The bound was set high enough such that it did not change code generation for SPEC200X. llvm-svn: 259828
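A bounded scan of this kind can be sketched as below. The names (Kind, findUpdateWithin, exampleBlock) and the way the limit is counted are assumptions made for illustration; the commit message does not state the actual bound or how the pass counts instructions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction classification: only whether an instruction is a
// base-register update (an add/sub that could fold into a pre-/post-index form).
enum class Kind { Other, BaseUpdate };

// Scan forward from the instruction after `start`, examining at most `limit`
// instructions. Returns the index of the first update found, or -1 if the
// budget or the block is exhausted first.
int findUpdateWithin(const std::vector<Kind> &block, int start, int limit) {
  assert(start >= 0 && limit >= 0);
  int scanned = 0;
  for (int i = start + 1;
       i < static_cast<int>(block.size()) && scanned < limit; ++i, ++scanned) {
    if (block[i] == Kind::BaseUpdate)
      return i;
  }
  return -1;
}

// Illustrative block: the update sits three instructions after index 0.
std::vector<Kind> exampleBlock() {
  return {Kind::Other, Kind::Other, Kind::Other, Kind::BaseUpdate};
}
```

With a limit of 3 the update at index 3 is found; with a limit of 2 the scan gives up first. Bounding the scan this way trades a few missed opportunities in very long blocks for predictable compile time.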
* [AArch64] Improve load/store optimizer to handle LDUR + LDR (take 3).Chad Rosier2016-02-041-21/+76
| | | | | | | | | | | | | | | This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs. PR24465 http://reviews.llvm.org/D12116 Many thanks to Ahmed and Michael for fixes and code review. This is a reapplication of r246769 and r259790. The tramp3d failure was caused by an incorrect refactoring in the patch. Specifically, we weren't always properly clearing the SExtIdx flag. llvm-svn: 259812
* Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR."Chad Rosier2016-02-041-77/+22
| | | | | | This reverts commit r259790. tramp3d-v4 is still having problems. llvm-svn: 259795
* [AArch64] Improve load/store optimizer to handle LDUR + LDR.Chad Rosier2016-02-041-22/+77
| | | | | | | | | | | | | | This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs. PR24465 http://reviews.llvm.org/D12116 Many thanks to Ahmed and Michael for fixes and code review. This is a reapplication of r246769, which was reverted in r246782 due to a test-suite failure. I'm unable to reproduce the issue at this time. llvm-svn: 259790
* [AArch64] Add a FIXME comment.Chad Rosier2016-02-021-0/+2
| | | | llvm-svn: 259515
* [AArch64] Allocate the modified and used regs only once per function.Chad Rosier2016-02-021-12/+17
| | | | llvm-svn: 259510
* Move comments a bit closer to associated code. NFC.Chad Rosier2016-02-011-29/+25
| | | | llvm-svn: 259411
* [AArch64] Set MMOs on pre- and post-index instructions.Chad Rosier2016-01-281-2/+4
| | | | | | | Without the MMOs the MI scheduler is unable to reason about the dependencies of these instructions. llvm-svn: 259052
* [AArch64] Remove a bunch of useless FIXME comments.Chad Rosier2016-01-191-4/+0
| | | | llvm-svn: 258193
* [AArch64] Remove more dead code after r258093.Chad Rosier2016-01-191-12/+4
| | | | llvm-svn: 258191
* [AArch64] Remove unused arguments. NFC.Chad Rosier2016-01-181-7/+7
| | | | | | AFAICT, these have been unused since the initial backend import. llvm-svn: 258093
* Update to use new name alignTo().Rui Ueyama2016-01-141-1/+1
| | | | llvm-svn: 257804
* Extract helper function to merge MemoryOperand lists [NFC]Philip Reames2016-01-061-22/+4
| | | | | | | | | | In the discussion on http://reviews.llvm.org/D15730, Andy pointed out we had a utility function for merging MMO lists. Since it turned out we actually had two copies and there's another review in progress (http://reviews.llvm.org/D15230) which needs the same, extract it into a utility function and clean up the interfaces to make it easier to use with a MachineInstBuilder. I introduced a pair here to track size and allocation together. I think we should probably move in the direction of the MachineOperandsRef helper class, but I'm leaving that for further work. I want to get the poison state introduced before I make major changes to the interface. Differential Revision: http://reviews.llvm.org/D15757 llvm-svn: 256909
* [AArch64] Promote loads from storesJun Bum Lim2015-12-221-3/+280
| | | | | | | | | | | | | | | | | | | | This is a recommit of r256004 which was reverted in r256160. The issue was the incorrect promotion for half and byte loads transformed into mov instructions. This fix will replace half and byte type loads only with bit field extracts. Original commit message: This change promotes load instructions that read directly from stores by replacing them with mov instructions. If the store is wider than the load, the load will be replaced with a bitfield extract. For example: STRWui %W1, %X0, 1; %W0 = LDRHHui %X0, 3 becomes STRWui %W1, %X0, 1; %W0 = UBFMWri %W1, 16, 31 llvm-svn: 256249
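The immediates in the example above can be derived by hand. STRWui with immediate 1 is scaled by 4, so the store covers bytes 4 to 7; LDRHHui with immediate 3 is scaled by 2, so the load reads bytes 6 and 7. Assuming little-endian layout, those are bits 16 to 31 of the stored register, hence UBFMWri %W1, 16, 31. The sketch below reproduces that arithmetic; extractForLoad and ubfm32 are hypothetical helpers, not functions from the pass, and ubfm32 only models the case where imms is at least immr (the plain bitfield-extract case).

```cpp
#include <cassert>
#include <cstdint>

// Immediates of a bitfield extract: bits immr..imms of the source register.
struct BitfieldExtract {
  unsigned immr;
  unsigned imms;
};

// Given byte offsets and sizes of a store and a load off the same base,
// compute which bits of the stored register the load would read
// (little-endian assumed; the load must lie entirely within the store).
BitfieldExtract extractForLoad(unsigned storeOff, unsigned storeSize,
                               unsigned loadOff, unsigned loadSize) {
  assert(loadOff >= storeOff && loadOff + loadSize <= storeOff + storeSize);
  const unsigned lsb = (loadOff - storeOff) * 8;
  const unsigned width = loadSize * 8;
  return BitfieldExtract{lsb, lsb + width - 1};
}

// Model of a 32-bit UBFM for imms >= immr: extract bits immr..imms into the
// low bits of the result and zero-extend.
std::uint32_t ubfm32(std::uint32_t value, unsigned immr, unsigned imms) {
  assert(immr <= imms && imms < 32);
  const unsigned width = imms - immr + 1;
  const std::uint64_t mask = (std::uint64_t{1} << width) - 1;
  return static_cast<std::uint32_t>((value >> immr) & mask);
}
```

For the commit's example, extractForLoad(4, 4, 6, 2) gives immr 16 and imms 31, and applying ubfm32 with those immediates to a stored value of 0xAABBCCDD yields 0xAABB, the upper halfword that the 16-bit load would have read.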
* Revert "[AArch64] Promote loads from stores"Jun Bum Lim2015-12-211-280/+3
| | | | | | This reverts commit r256004 due to a failure in cortex-a53. llvm-svn: 256160