path: root/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
Commit log (each entry: commit message, author, date, files changed, lines -deleted/+added):
* [AArch64] Fix halfword load merging for big-endian targets (Oliver Stannard, 2015-11-10, 1 file, -3/+9)
    For big-endian targets, when we merge two halfword loads into a word load, the order
    of the halfwords in the loaded value is reversed compared to little-endian, so the
    load-store optimiser needs to swap the destination registers. This does not affect
    merging of two word loads, as we use ldp, which treats the memory as two separate
    32-bit words.
    llvm-svn: 252597
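    For illustration only (a sketch, not part of the commit message; registers and
    offsets are made up), using the halfword-merge pattern from r251438 below:
      ldrh w0, [x2]
      ldrh w1, [x2, #2]
    is rewritten on little-endian as
      ldr  w0, [x2]
      ubfx w1, w0, #16, #16
      and  w0, w0, #0xffff
    whereas on big-endian the halfword at the lower address lands in bits [31:16] of the
    loaded word, so the destinations of the extracts must be swapped:
      ldr  w1, [x2]
      ubfx w0, w1, #16, #16
      and  w1, w1, #0xffff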
* [AArch64] Enable the narrow ld promotion only on profitable microarchitectures (Jun Bum Lim, 2015-11-06, 1 file, -8/+22)
    The benefit from converting narrow loads into a wider load (r251438) could be
    micro-architecturally dependent, as it assumes that a single load with two bitfield
    extracts is cheaper than two narrow loads. Currently, this conversion is enabled only
    on Cortex-A57, on which the performance benefits were verified.
    llvm-svn: 252316
* [AArch64] Merge halfword loads into a 32-bit load (Jun Bum Lim, 2015-10-27, 1 file, -45/+216)
    This recommits r250719, which was reverted because an incorrect insert point for the
    new wider load caused a failure in SPEC2000.gcc.
    Convert two halfword loads into a single 32-bit word load with bitfield extract
    instructions. For example:
      ldrh w0, [x2]
      ldrh w1, [x2, #2]
    becomes
      ldr  w0, [x2]
      ubfx w1, w0, #16, #16
      and  w0, w0, #0xffff
    llvm-svn: 251438
* Revert "[AArch64]Merge halfword loads into a 32-bit load"James Molloy2015-10-231-215/+45
| | | | | | This reverts commit r250719. This introduced a codegen fault in SPEC2000.gcc, when compiled for Cortex-A53. llvm-svn: 251108
* [AArch64]Merge halfword loads into a 32-bit loadJun Bum Lim2015-10-191-45/+215
| | | | | | | | | | | | | Convert two halfword loads into a single 32-bit word load with bitfield extract instructions. For example : ldrh w0, [x2] ldrh w1, [x2, #2] becomes ldr w0, [x2] ubfx w1, w0, #16, #16 and w0, w0, #ffff llvm-svn: 250719
* [AArch64] Deprecate a command-line option used for testing. (Chad Rosier, 2015-10-01, 1 file, -12/+4)
    Support for pairing unscaled loads and stores has been enabled since the original
    ARM64 port. This feature is no longer experimental, AFAICT.
    llvm-svn: 249049
* [AArch64] Hoist commonly failing check. NFC. (Chad Rosier, 2015-10-01, 1 file, -6/+6)
    llvm-svn: 249011
* [AArch64] Rename variable to improve readability. NFC. (Chad Rosier, 2015-10-01, 1 file, -10/+10)
    llvm-svn: 249008
* [AArch64] Update comment to reflect reality. NFC. (Chad Rosier, 2015-10-01, 1 file, -2/+2)
    llvm-svn: 249007
* [AArch64] Remove an unnecessary restriction on pre-index instructions. (Chad Rosier, 2015-09-30, 1 file, -2/+1)
    Previously, the index was constrained to the size of the memory operation for no
    apparent reason. This change removes that constraint so that we can form pre-index
    instructions with any valid offset.
    llvm-svn: 248931
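    For illustration only (a sketch, not part of the commit message; registers and
    offsets are made up), a base update followed by a word load such as
      add x2, x2, #32
      ldr w0, [x2]
    can now be folded into a pre-index load even though the offset (#32) is larger than
    the 4-byte access size:
      ldr w0, [x2, #32]!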
* [AArch64] Use helper function to improve readability. NFC. (Chad Rosier, 2015-09-30, 1 file, -2/+1)
    llvm-svn: 248914
* [AArch64] Add support for pre- and post-index LDPSWs. (Chad Rosier, 2015-09-29, 1 file, -5/+7)
    llvm-svn: 248825
* [AArch64] Add integer pre- and post-index halfword/byte loads and stores. (Chad Rosier, 2015-09-29, 1 file, -1/+27)
    llvm-svn: 248817
* [AArch64] Scale offsets by the size of the memory operation. NFC. (Chad Rosier, 2015-09-29, 1 file, -17/+21)
    The immediate in the load/store should be scaled by the size of the memory operation,
    not the size of the register being loaded/stored. This change gets us one step closer
    to forming LDPSW instructions. This change also enables pre- and post-indexing for
    halfword and byte loads and stores.
    llvm-svn: 248804
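    For illustration only (a sketch, not part of the commit message): LDRSW writes a
    64-bit register but performs a 4-byte memory access, so its immediate is scaled by 4,
    not by 8. For example,
      ldrsw x0, [x1, #12]
    encodes a scaled offset of 3 (12 / 4); scaling by the 8-byte register width instead
    would mis-handle otherwise valid offsets.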
* [AArch64] Remove some redundant cases. NFC. (Chad Rosier, 2015-09-29, 1 file, -23/+15)
    llvm-svn: 248800
* [AArch64] Add support for generating pre- and post-index load/store pairs. (Chad Rosier, 2015-09-25, 1 file, -43/+173)
    llvm-svn: 248593
* [AArch64] Improve the readability of the ld/st optimization pass. NFC. (Chad Rosier, 2015-09-24, 1 file, -4/+4)
    In this context, MI is an add/sub instruction, not a load/store.
    llvm-svn: 248540
* [AArch64] Refactor pre- and post-index merge functions into a single function. NFC. (Chad Rosier, 2015-09-23, 1 file, -59/+16)
    llvm-svn: 248377
* Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR."Chad Rosier2015-09-031-77/+21
| | | | | | | | This reverts commit r246769. This appears to have broken Multisource/Benchmarks/tramp3d-v4. llvm-svn: 246782
* [AArch64] Improve load/store optimizer to handle LDUR + LDR.Chad Rosier2015-09-031-21/+77
| | | | | | | | | | | This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs. PR24465 http://reviews.llvm.org/D12116 Many thanks to Ahmed and Michael for fixes and code review. llvm-svn: 246769
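    For illustration only (a sketch, not part of the commit message; registers and
    offsets are made up): a scaled LDR cannot encode a negative immediate, so adjacent
    accesses can end up as one unscaled and one scaled load, e.g.
      ldur w0, [x3, #-4]
      ldr  w1, [x3]
    With this change the two can still be combined into a pair:
      ldp  w0, w1, [x3, #-4]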
* [AArch64] Reuse MayLoad. NFC. (Chad Rosier, 2015-09-03, 1 file, -1/+1)
    llvm-svn: 246767
* [AArch64] Remove a use-after-free when collecting stats. (Chad Rosier, 2015-08-26, 1 file, -4/+4)
    The call to mergePairedInsns() deletes MI, so the later use by isUnscaledLdSt() is
    referencing freed memory.
    llvm-svn: 246033
* Revert "[AArch64] Simplify/refactor code to ease code review. NFC."Renato Golin2015-08-191-32/+18
| | | | | | | This reverts commit r245443, as it broke AArch64 test-suite tramp3d with an assert "Reg && "Null register has no regunits". llvm-svn: 245455
* [AArch64] Simplify/refactor code to ease code review. NFC.Chad Rosier2015-08-191-18/+32
| | | | llvm-svn: 245443
* [AArch64] Simplify the logic for computing in bounds offset. NFC.Chad Rosier2015-08-181-10/+6
| | | | llvm-svn: 245307
* [AArch64] Convert a conditional check that will always be true to an assert. NFC. (Chad Rosier, 2015-08-10, 1 file, -6/+4)
    llvm-svn: 244479
* Typo. Move comment closer to relevant code. NFC. (Chad Rosier, 2015-08-10, 1 file, -3/+4)
    llvm-svn: 244465
* [AArch64][LoadStoreOptimizer] Turn a test into an assert. NFC. (Quentin Colombet, 2015-08-07, 1 file, -2/+2)
    At this point the given Opc must be valid; otherwise we should not look for a
    matching pair to form a paired load or store.
    Thanks to Chad for pointing out this piece of code!
    llvm-svn: 244366
* [AArch64] Use a static function and other minor cleanup for readability. NFC. (Chad Rosier, 2015-08-06, 1 file, -11/+12)
    llvm-svn: 244233
* [AArch64] Improve the readability of the ld/st optimization pass. NFC. (Chad Rosier, 2015-08-06, 1 file, -37/+48)
    llvm-svn: 244222
* [AArch64] Register (existing) AArch64LoadStoreOpt pass with LLVM pass manager. (Chad Rosier, 2015-08-05, 1 file, -2/+13)
    Among other things, this allows -print-after-all/-print-before-all to dump IR around
    this pass. This is the AArch64 version of r243052.
    llvm-svn: 244041
* Update comment. NFC. (Chad Rosier, 2015-08-05, 1 file, -2/+2)
    llvm-svn: 244038
* Convert some AArch64 code to foreach loops. NFC. (Pete Cooper, 2015-08-03, 1 file, -3/+2)
    Also converted a cast<> to dyn_cast while I was working on the same line of code.
    llvm-svn: 243894
* Simplify switch as all cases other than default return true. NFC. (Chad Rosier, 2015-07-22, 1 file, -10/+0)
    llvm-svn: 242922
* Follow up to r242810. NFC. (Chad Rosier, 2015-07-21, 1 file, -1/+1)
    llvm-svn: 242812
* [AArch64] Simplify the passing of arguments. NFC. (Chad Rosier, 2015-07-21, 1 file, -23/+37)
    This is setup for future work planned for the AArch64 Load/Store Opt pass.
    llvm-svn: 242810
* [AArch64] Remove an overly conservative check when generating store pairs. (Chad Rosier, 2015-06-09, 1 file, -2/+3)
    Store instructions do not modify register values and therefore it's safe to form a
    store pair even if the source register has been read in between the two store
    instructions. Previously, the read of w1 (see below) prevented the formation of a stp.
      str w0, [x2]
      ldr w8, [x2, #8]
      add w0, w8, w1
      str w1, [x2, #4]
      ret
    We now generate the following code.
      stp w0, w1, [x2]
      ldr w8, [x2, #8]
      add w0, w8, w1
      ret
    All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass. Performance
    results for SPEC2K were within noise.
    llvm-svn: 239432
* [AArch64] Enhance the load/store optimizer with target-specific alias analysis. (Chad Rosier, 2015-05-21, 1 file, -20/+51)
    Phabricator: http://reviews.llvm.org/D9863
    llvm-svn: 237963
* MachineInstr: Change return value of getOpcode() to unsigned. (Matthias Braun, 2015-05-18, 1 file, -2/+2)
    This was previously returning int. However, there are no negative opcode numbers,
    and, more importantly, this was needlessly different from MCInstrDesc::getOpcode()
    (which is in fact the value returned here) and SDValue::getOpcode()/SDNode::getOpcode().
    llvm-svn: 237611
* [AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of LD[U]R + LD[U]RSW. (Quentin Colombet, 2015-03-06, 1 file, -11/+116)
    Teach the load/store optimizer how to sign-extend the result of a load pair when it
    helps create more pairs. The rationale is that loads are more expensive than sign
    extensions, so if we gather some in one instruction this is better!
    <rdar://problem/20072968>
    llvm-svn: 231527
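    For illustration only (a sketch, not part of the commit message; registers and
    offsets are made up), a sequence such as
      ldr   w0, [x2]
      ldrsw x1, [x2, #4]
    can be turned into a pair plus an explicit sign extension:
      ldp  w0, w1, [x2]
      sxtw x1, w1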
* Migrate AArch64 except for TTI and AsmPrinter away from getSubtargetImpl. (Eric Christopher, 2015-01-28, 1 file, -4/+2)
    llvm-svn: 227293
* [AArch64][LoadStoreOptimizer] Form LDPSW when possible. (Quentin Colombet, 2015-01-24, 1 file, -1/+15)
    This patch adds the missing LD[U]RSW variants to the load store optimizer, so that we
    generate LDPSW when possible.
    <rdar://problem/19583480>
    llvm-svn: 226978
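    For illustration only (a sketch, not part of the commit message; registers and
    offsets are made up), two adjacent sign-extending word loads such as
      ldrsw x0, [x2]
      ldrsw x1, [x2, #4]
    can now be combined into
      ldpsw x0, x1, [x2]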
* Add missing closing namespace comment. (Jim Grosbach, 2014-08-11, 1 file, -1/+1)
    llvm-svn: 215402
* Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. (Eric Christopher, 2014-08-04, 1 file, -2/+4)
    No functional change.
    llvm-svn: 214781
* Run sort_includes.py on the AArch64 backend. (Benjamin Kramer, 2014-07-25, 1 file, -4/+4)
    No functionality change.
    llvm-svn: 213938
* [AArch64] clang-format the load/store optimizer. (Tilmann Scheller, 2014-06-04, 1 file, -16/+25)
    No change in functionality.
    llvm-svn: 210182
* [AArch64] Fix some LLVM Coding Standards violations in the load/store optimizer. (Tilmann Scheller, 2014-06-04, 1 file, -19/+19)
    Variable names should start with an upper case letter. No change in functionality.
    llvm-svn: 210181
* [AArch64] Fix typo in load/store optimizer. (Tilmann Scheller, 2014-06-03, 1 file, -1/+1)
    llvm-svn: 210114
* AArch64/ARM64: move ARM64 into AArch64's place (Tim Northover, 2014-05-24, 1 file, -0/+942)
    This commit starts with a "git mv ARM64 AArch64" and continues out from there,
    renaming the C++ classes, intrinsics, and other target-local objects for consistency.
    "ARM64" test directories are also moved, and tests that began their life in ARM64 use
    an arm64 triple, while those from AArch64 use an aarch64 triple. Both should be
    equivalent, though.
    This finishes the AArch64 merge, and everyone should feel free to continue committing
    as normal now.
    llvm-svn: 209577