path: root/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
Commit message | Author | Age | Files | Lines
...
* [AArch64] Promote loads from stores | Jun Bum Lim | 2015-12-18 | 1 | -3/+280
  This change promotes load instructions which directly read from stores by
  replacing them with mov instructions. If the store is wider than the load,
  the load will be replaced with a bitfield extract.
  For example:
    STRWui %W1, %X0, 1
    %W0 = LDRHHui %X0, 3
  becomes
    STRWui %W1, %X0, 1
    %W0 = UBFMWri %W1, 16, 31
  llvm-svn: 256004
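The example in the commit above can be checked with a small model. This is a minimal sketch, not the pass itself: it assumes a little-endian target, models memory as a byte array, and the helper names `ubfm_extract` and `store_then_load_halfword` are hypothetical. `STRWui ..., 1` writes at byte offset 4 (immediate scaled by 4) and `LDRHHui ..., 3` reads at byte offset 6 (immediate scaled by 2), so the load sees the upper half of the stored word.

```python
import struct


def ubfm_extract(value, immr, imms):
    """Model UBFM for the imms >= immr case only: move bits immr..imms to bit 0."""
    assert 0 <= immr <= imms <= 31
    width = imms - immr + 1
    return (value >> immr) & ((1 << width) - 1)


def store_then_load_halfword(w1):
    """Run the original store/load pair against a scratch little-endian memory."""
    mem = bytearray(16)
    struct.pack_into("<I", mem, 1 * 4, w1)          # STRWui %W1, %X0, 1
    return struct.unpack_from("<H", mem, 3 * 2)[0]  # %W0 = LDRHHui %X0, 3


if __name__ == "__main__":
    w1 = 0xDEADBEEF
    print(hex(store_then_load_halfword(w1)), hex(ubfm_extract(w1, 16, 31)))
```

Both paths yield the same value for any 32-bit input, which is why the load can be replaced by the extract in this configuration.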
* [AArch64] Merge narrow zero stores to a wider store | Jun Bum Lim | 2015-11-20 | 1 | -16/+80
  This change merges adjacent zero stores into a wider single store.
  For example:
    strh wzr, [x0]
    strh wzr, [x0, #2]
  becomes
    str wzr, [x0]
  This will fix PR25410.
  llvm-svn: 253711
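The merging rule described above can be sketched as follows. This is an illustration under assumptions: stores are modelled as hypothetical `(byte_offset, size)` tuples off the same base register, and the alignment, opcode, and dependency checks that the real pass performs are left out.

```python
def merge_zero_stores(first, second):
    """Return one wider (offset, size) zero store, or None if the two cannot merge."""
    (off_a, size_a), (off_b, size_b) = sorted([first, second])
    # Only equal-sized stores that touch back-to-back bytes are merged here.
    if size_a != size_b or off_a + size_a != off_b:
        return None
    return (off_a, size_a * 2)


def apply_zero_stores(mem, stores):
    """Write zeros for each (offset, size) store and return the resulting memory."""
    out = bytearray(mem)
    for offset, size in stores:
        out[offset:offset + size] = bytes(size)
    return bytes(out)


if __name__ == "__main__":
    narrow = [(0, 2), (2, 2)]                  # strh wzr, [x0] ; strh wzr, [x0, #2]
    wide = merge_zero_stores(*narrow)          # str wzr, [x0]
    mem = bytes([0xFF] * 8)
    print(wide, apply_zero_stores(mem, narrow) == apply_zero_stores(mem, [wide]))
```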
* [AArch64] Refactoring aarch64-ldst-opt. NFC. | Jun Bum Lim | 2015-11-19 | 1 | -16/+13
  Summary:
  * Rename isSmallTypeLdMerge() to isNarrowLoad().
  * Rename NumSmallTypeMerged to NumNarrowTypePromoted.
  * Use Subtarget defined as a member variable.
  llvm-svn: 253587
* [AArch64] Extend merging narrow loads into a wider load | Jun Bum Lim | 2015-11-19 | 1 | -26/+107
  This change extends r251438 to handle more narrow load promotions including
  byte type, unscaled, and signed.
  For example, this change will convert:
    ldursh w1, [x0, #-2]
    ldurh w2, [x0, #-4]
  into
    ldur w2, [x0, #-4]
    asr w1, w2, #16
    and w2, w2, #0xffff
  llvm-svn: 253577
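The signed case in the example above can be checked with a small little-endian model. This is a sketch under assumptions: memory is a byte buffer, `base` stands in for the value of `x0`, and the function names are hypothetical. It shows only that `asr` on the wide value reproduces the sign-extending halfword load.

```python
import struct

MASK32 = 0xFFFFFFFF


def asr32(value, shift):
    """Arithmetic shift right of a 32-bit register value."""
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> shift) & MASK32


def narrow_loads(mem, base):
    """ldursh w1, [x0, #-2] ; ldurh w2, [x0, #-4]"""
    w1 = struct.unpack_from("<h", mem, base - 2)[0] & MASK32  # sign-extended
    w2 = struct.unpack_from("<H", mem, base - 4)[0]           # zero-extended
    return w1, w2


def wide_load(mem, base):
    """ldur w2, [x0, #-4] ; asr w1, w2, #16 ; and w2, w2, #0xffff"""
    w2 = struct.unpack_from("<I", mem, base - 4)[0]
    w1 = asr32(w2, 16)
    w2 &= 0xFFFF
    return w1, w2


if __name__ == "__main__":
    mem = bytes([0x34, 0x12, 0x00, 0x80])
    print(narrow_loads(mem, 4) == wide_load(mem, 4))
```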
* [AArch64] Fix halfword load merging for big-endian targets | Oliver Stannard | 2015-11-10 | 1 | -3/+9
  For big-endian targets, when we merge two halfword loads into a word load,
  the order of the halfwords in the loaded value is reversed compared to
  little-endian, so the load-store optimiser needs to swap the destination
  registers.
  This does not affect merging of two word loads, as we use ldp, which treats
  the memory as two separate 32-bit words.
  llvm-svn: 252597
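Why the destination registers must be swapped can be seen in a short model. This is a minimal sketch with hypothetical function names, assuming memory is a plain byte buffer: a 32-bit load places the halfword at the lower address in the low bits on little-endian, but in the high bits on big-endian.

```python
import struct


def two_halfword_loads(mem, offset, big_endian):
    """Reference result: (halfword at offset, halfword at offset + 2)."""
    fmt = ">H" if big_endian else "<H"
    first = struct.unpack_from(fmt, mem, offset)[0]
    second = struct.unpack_from(fmt, mem, offset + 2)[0]
    return first, second


def split_merged_word(mem, offset, big_endian):
    """One word load followed by extracts, with the big-endian swap applied."""
    fmt = ">I" if big_endian else "<I"
    word = struct.unpack_from(fmt, mem, offset)[0]
    low_bits, high_bits = word & 0xFFFF, word >> 16
    if big_endian:
        # The lower-addressed halfword sits in the high bits, so swap.
        return high_bits, low_bits
    return low_bits, high_bits


if __name__ == "__main__":
    mem = bytes([0x11, 0x22, 0x33, 0x44])
    for be in (False, True):
        print(be, two_halfword_loads(mem, 0, be) == split_merged_word(mem, 0, be))
```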
* [AArch64] Enable the narrow ld promotion only on profitable microarchitectures | Jun Bum Lim | 2015-11-06 | 1 | -8/+22
  The benefit from converting narrow loads into a wider load (r251438) could be
  micro-architecturally dependent, as it assumes that a single load with two
  bitfield extracts is cheaper than two narrow loads. Currently, this
  conversion is enabled only on Cortex-A57, on which performance benefits were
  verified.
  llvm-svn: 252316
* [AArch64] Merge halfword loads into a 32-bit load | Jun Bum Lim | 2015-10-27 | 1 | -45/+216
  This recommits r250719, which caused a failure in SPEC2000.gcc because of the
  incorrect insert point for the new wider load.
  Convert two halfword loads into a single 32-bit word load with bitfield
  extract instructions.
  For example:
    ldrh w0, [x2]
    ldrh w1, [x2, #2]
  becomes
    ldr w0, [x2]
    ubfx w1, w0, #16, #16
    and w0, w0, #0xffff
  llvm-svn: 251438
* Revert "[AArch64]Merge halfword loads into a 32-bit load" | James Molloy | 2015-10-23 | 1 | -215/+45
  This reverts commit r250719. This introduced a codegen fault in SPEC2000.gcc,
  when compiled for Cortex-A53.
  llvm-svn: 251108
* [AArch64] Merge halfword loads into a 32-bit load | Jun Bum Lim | 2015-10-19 | 1 | -45/+215
  Convert two halfword loads into a single 32-bit word load with bitfield
  extract instructions.
  For example:
    ldrh w0, [x2]
    ldrh w1, [x2, #2]
  becomes
    ldr w0, [x2]
    ubfx w1, w0, #16, #16
    and w0, w0, #0xffff
  llvm-svn: 250719
* [AArch64] Deprecate a command-line option used for testing. | Chad Rosier | 2015-10-01 | 1 | -12/+4
  Support for pairing unscaled loads and stores has been enabled since the
  original ARM64 port. This feature is no longer experimental, AFAICT.
  llvm-svn: 249049
* [AArch64] Hoist commonly failing check. NFC. | Chad Rosier | 2015-10-01 | 1 | -6/+6
  llvm-svn: 249011
* [AArch64] Rename variable to improve readability. NFC. | Chad Rosier | 2015-10-01 | 1 | -10/+10
  llvm-svn: 249008
* [AArch64] Update comment to reflect reality. | Chad Rosier | 2015-10-01 | 1 | -2/+2
  llvm-svn: 249007
* [AArch64] Remove an unnecessary restriction on pre-index instructions. | Chad Rosier | 2015-09-30 | 1 | -2/+1
  Previously, the index was constrained to the size of the memory operation for
  no apparent reason. This change removes that constraint so that we can form
  pre-index instructions with any valid offset.
  llvm-svn: 248931
* [AArch64] Use helper function to improve readability. NFC. | Chad Rosier | 2015-09-30 | 1 | -2/+1
  llvm-svn: 248914
* [AArch64] Add support for pre- and post-index LDPSWs. | Chad Rosier | 2015-09-29 | 1 | -5/+7
  llvm-svn: 248825
* [AArch64] Add integer pre- and post-index halfword/byte loads and stores. | Chad Rosier | 2015-09-29 | 1 | -1/+27
  llvm-svn: 248817
* [AArch64] Scale offsets by the size of the memory operation. NFC. | Chad Rosier | 2015-09-29 | 1 | -17/+21
  The immediate in the load/store should be scaled by the size of the memory
  operation, not the size of the register being loaded/stored. This change gets
  us one step closer to forming LDPSW instructions. This change also enables
  pre- and post-indexing for halfword and byte loads and stores.
  llvm-svn: 248804
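The distinction drawn above between memory-operation size and register size matters for sign-extending loads such as LDRSW, which read 4 bytes into a 64-bit register. A minimal sketch follows; the table and the `byte_offset` helper are illustrative assumptions, not the pass's actual data structures, and only a handful of scaled opcodes are listed.

```python
# (bytes accessed in memory, bytes in the destination register) per opcode.
# Illustrative subset only.
ACCESS_INFO = {
    "LDRXui": (8, 8),
    "LDRWui": (4, 4),
    "LDRSWui": (4, 8),   # 32-bit access, sign-extended into an X register
    "LDRHHui": (2, 4),
    "LDRBBui": (1, 4),
}


def byte_offset(opcode, imm):
    """Scale the encoded immediate by the memory access size, not the register size."""
    mem_size, _reg_size = ACCESS_INFO[opcode]
    return imm * mem_size


if __name__ == "__main__":
    for opc in ACCESS_INFO:
        print(opc, byte_offset(opc, 3))
```

Scaling LDRSWui by the register size (8) would give byte offset 24 for immediate 3 instead of 12, which is the kind of mismatch the commit message describes.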
* [AArch64] Remove some redundant cases. NFC. | Chad Rosier | 2015-09-29 | 1 | -23/+15
  llvm-svn: 248800
* [AArch64] Add support for generating pre- and post-index load/store pairs. | Chad Rosier | 2015-09-25 | 1 | -43/+173
  llvm-svn: 248593
* [AArch64] Improve the readability of the ld/st optimization pass. NFC. | Chad Rosier | 2015-09-24 | 1 | -4/+4
  In this context, MI is an add/sub instruction, not a load/store.
  llvm-svn: 248540
* [AArch64] Refactor pre- and post-index merge functions into a single function. NFC. | Chad Rosier | 2015-09-23 | 1 | -59/+16
  llvm-svn: 248377
* Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR." | Chad Rosier | 2015-09-03 | 1 | -77/+21
  This reverts commit r246769. This appears to have broken
  Multisource/Benchmarks/tramp3d-v4.
  llvm-svn: 246782
* [AArch64] Improve load/store optimizer to handle LDUR + LDR. | Chad Rosier | 2015-09-03 | 1 | -21/+77
  This patch allows the mixing of scaled and unscaled load/stores to form
  load/store pairs.
  PR24465
  http://reviews.llvm.org/D12116
  Many thanks to Ahmed and Michael for fixes and code review.
  llvm-svn: 246769
* [AArch64] Reuse MayLoad. NFC. | Chad Rosier | 2015-09-03 | 1 | -1/+1
  llvm-svn: 246767
* [AArch64] Remove a use-after-free when collecting stats. | Chad Rosier | 2015-08-26 | 1 | -4/+4
  The call to mergePairedInsns() deletes MI, so the later use by
  isUnscaledLdSt() is referencing freed memory.
  llvm-svn: 246033
* Revert "[AArch64] Simplify/refactor code to ease code review. NFC." | Renato Golin | 2015-08-19 | 1 | -32/+18
  This reverts commit r245443, as it broke AArch64 test-suite tramp3d with an
  assert: Reg && "Null register has no regunits".
  llvm-svn: 245455
* [AArch64] Simplify/refactor code to ease code review. NFC. | Chad Rosier | 2015-08-19 | 1 | -18/+32
  llvm-svn: 245443
* [AArch64] Simplify the logic for computing in bounds offset. NFC. | Chad Rosier | 2015-08-18 | 1 | -10/+6
  llvm-svn: 245307
* [AArch64] Convert a conditional check that will always be true to an assert. NFC. | Chad Rosier | 2015-08-10 | 1 | -6/+4
  llvm-svn: 244479
* Typo. Move comment closer to relevant code. NFC. | Chad Rosier | 2015-08-10 | 1 | -3/+4
  llvm-svn: 244465
* [AArch64][LoadStoreOptimizer] Turn a test into an assert. NFC. | Quentin Colombet | 2015-08-07 | 1 | -2/+2
  At this point the given Opc must be valid, otherwise we should not look for a
  matching pair to form paired load or store.
  Thanks to Chad for pointing out this piece of code!
  llvm-svn: 244366
* [AArch64] Use a static function and other minor cleanup for readability. NFC. | Chad Rosier | 2015-08-06 | 1 | -11/+12
  llvm-svn: 244233
* [AArch64] Improve the readability of the ld/st optimization pass. NFC. | Chad Rosier | 2015-08-06 | 1 | -37/+48
  llvm-svn: 244222
* [AArch64] Register (existing) AArch64LoadStoreOpt pass with LLVM pass manager. | Chad Rosier | 2015-08-05 | 1 | -2/+13
  Summary: Among other things, this allows -print-after-all/-print-before-all
  to dump IR around this pass. This is the AArch64 version of r243052.
  llvm-svn: 244041
* Update comment. NFC. | Chad Rosier | 2015-08-05 | 1 | -2/+2
  llvm-svn: 244038
* Convert some AArch64 code to foreach loops. NFC. | Pete Cooper | 2015-08-03 | 1 | -3/+2
  Also converted a cast<> to dyn_cast while I was working on the same line of
  code.
  llvm-svn: 243894
* Simplify switch as all cases other than default return true. NFC. | Chad Rosier | 2015-07-22 | 1 | -10/+0
  llvm-svn: 242922
* Follow up to r242810. NFC. | Chad Rosier | 2015-07-21 | 1 | -1/+1
  llvm-svn: 242812
* [AArch64] Simplify the passing of arguments. NFC. | Chad Rosier | 2015-07-21 | 1 | -23/+37
  This is setup for future work planned for the AArch64 Load/Store Opt pass.
  llvm-svn: 242810
* [AArch64] Remove an overly conservative check when generating store pairs. | Chad Rosier | 2015-06-09 | 1 | -2/+3
  Store instructions do not modify register values and therefore it's safe to
  form a store pair even if the source register has been read in between the
  two store instructions.
  Previously, the read of w1 (see below) prevented the formation of a stp.
    str w0, [x2]
    ldr w8, [x2, #8]
    add w0, w8, w1
    str w1, [x2, #4]
    ret
  We now generate the following code.
    stp w0, w1, [x2]
    ldr w8, [x2, #8]
    add w0, w8, w1
    ret
  All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass.
  Performance results for SPEC2K were within noise.
  llvm-svn: 239432
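The equivalence of the two sequences above can be checked with a tiny interpreter. This is a sketch under assumptions: little-endian memory as a byte buffer, `base` standing in for `x2`, registers as a dictionary, and hypothetical function names. It models only this one example, not the pass's legality analysis.

```python
import struct

MASK32 = 0xFFFFFFFF


def run_separate_stores(regs, mem, base):
    """str w0,[x2] ; ldr w8,[x2,#8] ; add w0,w8,w1 ; str w1,[x2,#4]"""
    regs, mem = dict(regs), bytearray(mem)
    struct.pack_into("<I", mem, base, regs["w0"])
    regs["w8"] = struct.unpack_from("<I", mem, base + 8)[0]
    regs["w0"] = (regs["w8"] + regs["w1"]) & MASK32
    struct.pack_into("<I", mem, base + 4, regs["w1"])
    return regs, bytes(mem)


def run_store_pair(regs, mem, base):
    """stp w0,w1,[x2] ; ldr w8,[x2,#8] ; add w0,w8,w1"""
    regs, mem = dict(regs), bytearray(mem)
    struct.pack_into("<II", mem, base, regs["w0"], regs["w1"])
    regs["w8"] = struct.unpack_from("<I", mem, base + 8)[0]
    regs["w0"] = (regs["w8"] + regs["w1"]) & MASK32
    return regs, bytes(mem)


if __name__ == "__main__":
    regs = {"w0": 1, "w1": 2, "w8": 0}
    mem = bytes(range(16))
    print(run_separate_stores(regs, mem, 0) == run_store_pair(regs, mem, 0))
```

The read of w1 by the add does not change w1, so moving the second store up to join the first leaves both the register state and memory identical in this model.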
* [AArch64] Enhance the load/store optimizer with target-specific alias analysis. | Chad Rosier | 2015-05-21 | 1 | -20/+51
  Phabricator: http://reviews.llvm.org/D9863
  llvm-svn: 237963
* MachineInstr: Change return value of getOpcode() to unsigned. | Matthias Braun | 2015-05-18 | 1 | -2/+2
  This was previously returning int. However there are no negative opcode
  numbers and more importantly this was needlessly different from
  MCInstrDesc::getOpcode() (which even is the value returned here) and
  SDValue::getOpcode()/SDNode::getOpcode().
  llvm-svn: 237611
* [AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of LD[U]R + LD[U]RSW. | Quentin Colombet | 2015-03-06 | 1 | -11/+116
  Teach the load store optimizer how to sign extend a result of a load pair
  when it helps creating more pairs.
  The rationale is that loads are more expensive than sign extensions, so if we
  gather some in one instruction this is better!
  <rdar://problem/20072968>
  llvm-svn: 231527
* Migrate AArch64 except for TTI and AsmPrinter away from getSubtargetImpl. | Eric Christopher | 2015-01-28 | 1 | -4/+2
  llvm-svn: 227293
* [AArch64][LoadStoreOptimizer] Form LDPSW when possible. | Quentin Colombet | 2015-01-24 | 1 | -1/+15
  This patch adds the missing LD[U]RSW variants to the load store optimizer, so
  that we generate LDPSW when possible.
  <rdar://problem/19583480>
  llvm-svn: 226978
* Add missing closing namespace comment. | Jim Grosbach | 2014-08-11 | 1 | -1/+1
  llvm-svn: 215402
* Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. No functional change. | Eric Christopher | 2014-08-04 | 1 | -2/+4
  llvm-svn: 214781
* Run sort_includes.py on the AArch64 backend. | Benjamin Kramer | 2014-07-25 | 1 | -4/+4
  No functionality change.
  llvm-svn: 213938
* [AArch64] clang-format the load/store optimizer. | Tilmann Scheller | 2014-06-04 | 1 | -16/+25
  No change in functionality.
  llvm-svn: 210182