summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
* AArch64: Avoid implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-081-7/+7
| | | | | | | | Avoid implicit conversions from MachineInstrBundleInstr to MachineInstr* in the AArch64 backend, mainly by preferring MachineInstr& over MachineInstr* when a pointer isn't nullable. llvm-svn: 274924
* AArch64: Replace a RegScavenger instance with LivePhysRegsMatthias Braun2016-07-061-14/+14
| | | | | | | | | | | | | | findScratchNonCalleeSaveRegister() just needs a simple liveness analysis, use LivePhysRegs for that as it is simpler and does not depend on the kill flags. This commit adds a convenience function available() to LivePhysRegs: This function returns true if the given register is not reserved and neither the register nor any of its aliases are alive. Differential Revision: http://reviews.llvm.org/D21865 llvm-svn: 274685
* Pass DebugLoc and SDLoc by const ref.Benjamin Kramer2016-06-121-2/+2
| | | | | | | | This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
* [PEI, AArch64] Use empty spaces in stack area for local stack slot allocation.Geoff Berry2016-06-021-1/+8
| | | | | | | | | | | | | | | | | Summary: If the target requests it, use emptry spaces in the fixed and callee-save stack area to allocate local stack objects. AArch64: Change last callee-save reg stack object alignment instead of size to leave a gap to take advantage of above change. Reviewers: t.p.northover, qcolombet, MatzeB Subscribers: rengolin, mcrosier, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D20220 llvm-svn: 271527
* [AArch64] Fix bug in large stack spill slot handling (PR27717)Geoff Berry2016-05-161-1/+3
| | | | | | | | | | | | | | | | | Summary: Fix bug in MachO path where a frame index offset would not be reserved for handling large frames when an extra non-used callee-save register was saved. In the case where the extra register is reserved or not a GPR (e.g. %FP in the MachO case), this would lead to the register scavenger later failing when called from PrologEpilogInserter. Reviewers: t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20185 llvm-svn: 269697
* [AArch64] Combine callee-save and local stack SP adjustment instructions.Geoff Berry2016-05-061-80/+191
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If a function needs to allocate both callee-save stack memory and local stack memory, we currently decrement/increment the SP in two steps: first for the callee-save area, and then for the local stack area. This changes the code to allocate them both at once at the very beginning/end of the function. This has two benefits: 1) there is one fewer sub/add micro-op in the prologue/epilogue 2) the stack adjustment instructions act as a scheduling barrier, so moving them to the very beginning/end of the function increases post-RA scheduler's ability to move instructions (that only depend on argument registers) before any of the callee-save stores This change can cause an increase in instructions if the original local stack SP decrement could be folded into the first store to the stack. This occurs when the first local stack store is to stack offset 0. In this case we are trading off one more sub instruction for one fewer sub micro-op (along with benefits (2) and (3) above). Reviewers: t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18619 llvm-svn: 268746
* [AArch64] Add MMOs to callee-save load/store instructions.Geoff Berry2016-04-151-2/+15
| | | | | | | | | | | | | | Summary: Without MMOs, the callee-save load/store instructions were treated as volatile by the MI post-RA scheduler and AArch64LoadStoreOptimizer. Reviewers: t.p.northover, mcrosier Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17661 llvm-svn: 266439
* AArch64: Use a callee save registers for swiftself parametersMatthias Braun2016-04-131-7/+7
| | | | | | | | | | | | | | | | | | It is very likely that the swiftself parameter is alive throughout most functions function so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callees parameter. Differential Revision: http://reviews.llvm.org/D19007 llvm-svn: 266251
* Swift Calling Convention: swifterror target support.Manman Ren2016-04-111-5/+13
| | | | | | Differential Revision: http://reviews.llvm.org/D18716 llvm-svn: 265997
* AArch64: Fix compile errorMatthias Braun2016-04-061-1/+1
| | | | | | | Fixed to adapt a use of enterBasicBlock() in my last commit (because I had follow on patches in my repository that change the code). llvm-svn: 265513
* Change eliminateCallFramePseudoInstr() to return an iteratorHans Wennborg2016-03-311-2/+2
| | | | | | | | | | | | | | | | | | | | | This will become necessary in a subsequent change to make this method merge adjacent stack adjustments, i.e. it might erase the previous and/or next instruction. It also greatly simplifies the calls to this function from Prolog- EpilogInserter. Previously, that had a bunch of logic to resume iteration after the call; now it just continues with the returned iterator. Note that this changes the behaviour of PEI a little. Previously, it attempted to re-visit the new instruction created by eliminateCallFramePseudoInstr(). That code was added in r36625, but I can't see any reason for it: the new instructions will obviously not be pseudo instructions, they will not have FrameIndex operands, and we have already accounted for the stack adjustment. Differential Revision: http://reviews.llvm.org/D18627 llvm-svn: 265036
* [AArch64] Refactor AArch64FrameLowering::emitPrologue. NFC.Chad Rosier2016-03-141-52/+50
| | | | | | | http://reviews.llvm.org/D18125 Patch by Aditya Kumar. llvm-svn: 263461
* [AArch64] Break the dependency between FP and SP when possible.Chad Rosier2016-03-141-1/+5
| | | | | | | | | | | | | When the SP in not changed because of realignment/VLAs etc., we restore the SP by using the previous value of SP and not the FP. Breaking the dependency will help in cases when the epilog of a callee is close to the epilog of the caller; for then "sub sp, fp, #" depends on the load restoring the FP in the epilog of the callee. http://reviews.llvm.org/D18060 Patch by Aditya Kumar and Evandro Menezes. llvm-svn: 263458
* Add support for a preserve_most calling convention to the AArch64 backend.Roman Levenstein2016-03-101-0/+4
| | | | | | | | | | This change adds a support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64. There is also a subsequent patch on top of this one to add a tail-calls support for this calling convention. Differential Revision: http://reviews.llvm.org/D18016 llvm-svn: 263092
* [AArch64] Enable non-leaf frame pointer elimination.Geoff Berry2016-03-021-6/+6
| | | | | | | | | | | | | | | | | | | | Summary: This change enables frame pointer elimination in non-leaf functions. The -fomit-frame-pointer option still needs to be used when compiling via clang (or an equivalent method of not setting the 'no-frame-pointer-elim*' function attributes if generating llvm IR via some other method) to take advantage of this optimization. This change should be NFC when compiling via clang without -fomit-frame-pointer. Reviewers: t.p.northover Subscribers: aemerson, rengolin, tberghammer, qcolombet, llvm-commits, danalbert, mcrosier, srhines Differential Revision: http://reviews.llvm.org/D17730 llvm-svn: 262495
* Simplify some boolean conditional return statements in AArch64.Eric Christopher2016-02-291-3/+1
| | | | | | | | http://reviews.llvm.org/D9979 Patch by Richard Thomson (and some conflict resolution by me). llvm-svn: 262266
* [AArch64] Clean up callee-save CFI emission. NFC.Geoff Berry2016-02-251-44/+8
| | | | | | | | | | | | | | | | | | | Summary: Avoid special case for FP, LR CFI emission and just allow general AArch64FrameLowering::emitCalleeSavedFrameMoves() to handle them. Also, stop recalculating the stack offsets in emitCalleeSavedFrameMoves() since we can just reuse the previously calculated offset stored in the MachineFrameInfo. Depends on D17000 Reviewers: t.p.northover, rengolin, mcrosier, jmolloy Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17004 llvm-svn: 261885
* [AArch64] Fix fastcc -tailcallopt epilog code generation.Geoff Berry2016-02-231-6/+23
| | | | | | | | | | | | | | Summary: Fix a bug in epilog generation where the incoming stack arguments were not being popped for fastcc functions when -tailcallopt was passed. Reviewers: t.p.northover, mcrosier, jmolloy, rengolin Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16894 llvm-svn: 261650
* [AArch64][ShrinkWrap] Fix bug in prolog clobbering live reg when shrink ↵Geoff Berry2016-02-191-9/+61
| | | | | | | | | | | | | | wrapping. Summary: See bug https://llvm.org/bugs/show_bug.cgi?id=26642 Reviewers: qcolombet, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17350 llvm-svn: 261349
* [AArch64] Reduce number of callee-save save/restores.Geoff Berry2016-02-121-126/+160
| | | | | | | | | | | | | | | | | | | | | Summary: Before this change, callee-save registers would be rounded up to even pairs of GPRs and FPRs. This change eliminates these extra padding load/stores, though it does keep the stack allocation the same size unless both the GPR and FPR sets have an odd size, in which case one full pair stack slot (16 bytes) is saved. This optimization cannot currently be done for MachO targets since they rely on a fast-path .debug_frame equivalent that can only encode callee-save registers as pairs. Reviewers: t.p.northover, rengolin, mcrosier, jmolloy Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17000 llvm-svn: 260689
* [AArch64] Simplify prolog/epilog callee save/restore. NFC.Geoff Berry2016-02-011-61/+87
| | | | | | | | | | | | | | | | | Summary: Factor out common code for callee-save register pair calculation. This is intended to simplify follow-on changes that reduce the number of registers saved/restored. Depends on D16732 Reviewers: mcrosier, jmolloy, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16734 llvm-svn: 259384
* [AArch64] Simplify callee-save register save/restore. NFC.Geoff Berry2016-02-011-68/+17
| | | | | | | | | | | | | | | | | | | Summary: Simplify callee-save register save/restore code generation by remembering the size of the callee-save area when it is computed so we don't have to scan the prologue/epilogue instructions again later to reconstruct it. This is intended to simplify follow-on changes that reduce the number of registers saved/restored. Reviewers: mcrosier, jmolloy, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16732 llvm-svn: 259365
* Typo.Chad Rosier2016-01-191-1/+1
| | | | llvm-svn: 258137
* Update to use new name alignTo().Rui Ueyama2016-01-141-1/+1
| | | | llvm-svn: 257804
* AArch64: Simplify emitEpilogue() and related code; NFCMatthias Braun2015-12-171-24/+25
| | | | | | This is in preparation to an upcoming patch. llvm-svn: 255872
* [AArch64] Simplify some TRI/TII getters. NFC.Ahmed Bougacha2015-12-161-7/+6
| | | | | | We don't need static_casts when we use the right Subtarget. llvm-svn: 255836
* Remove windows line endings introduced by r252177. NFC.Tim Northover2015-11-051-16/+16
| | | | llvm-svn: 252217
* [DebugInfo] Fix ARM/AArch64 prologue_end position. Related to D11268.Oleg Ranevskyy2015-11-051-16/+16
| | | | | | | | | | | | | | | | | | | Summary: This review is related to another review request http://reviews.llvm.org/D11268, does the same and merely fixes a couple of issues with it. D11268 is quite old and has merge conflicts against the current trunk. This request - rebases D11268 onto the new trunk; - resolves the merge conflicts; - fixes the prologue_end tests, which do not pass due to the subprogram definitions not marked as distinct. Reviewers: echristo, rengolin, kubabrecka Subscribers: aemerson, rengolin, jyknight, dsanders, llvm-commits, asl Differential Revision: http://reviews.llvm.org/D14338 llvm-svn: 252177
* Remove redundant TargetFrameLowering::getFrameIndexOffset virtualJames Y Knight2015-08-151-8/+0
| | | | | | | | | | | function. This was the same as getFrameIndexReference, but without the FrameReg output. Differential Revision: http://reviews.llvm.org/D12042 llvm-svn: 245148
* Fix some comment typos.Benjamin Kramer2015-08-081-2/+2
| | | | llvm-svn: 244402
* Move most user of TargetMachine::getDataLayout to the Module oneMehdi Amini2015-07-161-4/+4
| | | | | | | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. This patch is quite boring overall, except for some uglyness in ASMPrinter which has a getDataLayout function but has some clients that use it without a Module (llmv-dsymutil, llvm-dwarfdump), so some methods are taking a DataLayout as parameter. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11090 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 242386
* WebAssembly: fix build breakage.JF Bastien2015-07-141-1/+1
| | | | | | | | | | | | | | | Summary: processFunctionBeforeCalleeSavedScan was renamed to determineCalleeSaves and now takes a BitVector parameter as of rL242165, reviewed in http://reviews.llvm.org/D10909 WebAssembly is still marked as experimental and therefore doesn't build by default. It does, however, grep by default! I notice that processFunctionBeforeCalleeSavedScan is still mentioned in a few comments and error messages, which I also fixed. Reviewers: qcolombet, sunfish Subscribers: jfb, dsanders, hfinkel, MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D11199 llvm-svn: 242242
* MachineRegisterInfo: Remove UsedPhysReg infrastructureMatthias Braun2015-07-141-1/+0
| | | | | | | | | | | | | We have a detailed def/use lists for every physical register in MachineRegisterInfo anyway, so there is little use in maintaining an additional bitset of which ones are used. Removing it frees us from extra book keeping. This simplifies VirtRegMap. Differential Revision: http://reviews.llvm.org/D10911 llvm-svn: 242173
* PrologEpilogInserter: Rewrite API to determine callee save regsiters.Matthias Braun2015-07-141-11/+17
| | | | | | | | | | | | | | | | This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan(): - Rename the function to determineCalleeSaves() - Pass a bitset of callee saved registers by reference, thus avoiding the function-global PhysRegUsed bitset in MachineRegisterInfo. - Without PhysRegUsed the implementation is fine tuned to not save physcial registers which are only read but never modified. Related to rdar://21539507 Differential Revision: http://reviews.llvm.org/D10909 llvm-svn: 242165
* Fix AArch64 prologue for empty frame with dynamic allocas.Evgeniy Stepanov2015-07-101-8/+4
| | | | | | | | Fixes PR23804: assertion failure in emitPrologue in the case of a function with an empty frame and a dynamic alloca that needs stack realignment. This is a typical case for AddressSanitizer. llvm-svn: 241943
* MachineInstr: Change return value of getOpcode() to unsigned.Matthias Braun2015-05-181-1/+1
| | | | | | | | | This was previously returning int. However there are no negative opcode numbers and more importantly this was needlessly different from MCInstrDesc::getOpcode() (which even is the value returned here) and SDValue::getOpcode()/SDNode::getOpcode(). llvm-svn: 237611
* MC: Clean up method names in MCContext.Jim Grosbach2015-05-181-1/+1
| | | | | | | The naming was a mish-mash of old and new style. Update to be consistent with the new. NFC. llvm-svn: 237594
* [ShrinkWrap] Add (a simplified version) of shrink-wrapping.Quentin Colombet2015-05-051-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exits blocks. As an example and to avoid regressions to be introduce, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64. ** Context ** Currently we insert the prologue and epilogue of the method/function in the entry and exits blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places. ** Motivating example ** Let us consider the following function that perform a call only in one branch of a if: define i32 @f(i32 %a, i32 %b) { %tmp = alloca i32, align 4 %tmp2 = icmp slt i32 %a, %b br i1 %tmp2, label %true, label %false true: store i32 %a, i32* %tmp, align 4 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp) br label %false false: %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ] ret i32 %tmp.0 } On AArch64 this code generates (removing the cfi directives to ease readabilities): _f: ; @f ; BB#0: stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething LBB0_2: ; %false mov sp, x29 ldp x29, x30, [sp], #16 ret With shrink-wrapping we could generate: _f: ; @f ; BB#0: cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething add sp, x29, #16 ; =16 ldp x29, x30, [sp], #16 LBB0_2: ; %false ret Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call. ** Proposed Solution ** This patch introduces a new machine pass that perform the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore point into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI. Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need hack to properly avoid frequently executed point. Instead, it relies on dominance and loop properties. The pass is off by default and each target can opt-in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overwritten on the command line by using -enable-shrink-wrap. Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not necessarily the entry block. ** Design Decisions ** 1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file. 2. Right now, we only support one save point and one restore point. At some point we can expand this to several save point and restore point, the impacted component would then be: - The pass itself: New algorithm needed. - MachineFrameInfo: Hold a list or set of Save/Restore point instead of one pointer. - PEI: Should loop over the save point and restore point. Anyhow, at least for this first iteration, I do not believe this is interesting to support the complex cases. We should revisit that when we motivating examples. Differential Revision: http://reviews.llvm.org/D9210 <rdar://problem/3201744> llvm-svn: 236507
* [AArch64] Refactor out codes that depend on specific CS save sequence.Manman Ren2015-04-291-19/+28
| | | | | | No functionality change. llvm-svn: 236143
* [AArch64] Strengthen the code for the prologue insertion.Quentin Colombet2015-04-101-0/+2
| | | | | | | | | The spilled registers are pristine and thus, correctly handled by the register scavenger and so on, but the liveness information is strictly speaking wrong at this point. Fix that. llvm-svn: 234664
* [AArch64] Add support for dynamic stack alignmentKristof Beyls2015-04-091-42/+138
| | | | | | Differential Revision: http://reviews.llvm.org/D8876 llvm-svn: 234471
* AArch64: Canonicalize access to function attributes, NFCDuncan P. N. Exon Smith2015-02-141-2/+1
| | | | | | | | | | | | Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) llvm-svn: 229218
* Move DataLayout back to the TargetMachine from TargetSubtargetInfoEric Christopher2015-01-261-2/+2
| | | | | | | | | | | | | | | | | | | derived classes. Since global data alignment, layout, and mangling is often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be layed out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets(*) have had subtarget dependent code moved out and onto the TargetMachine. *One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features. llvm-svn: 227113
* [AArch64] Implement GHC calling conventionGreg Fitzgerald2015-01-191-0/+10
| | | | | | | | | | Original patch by Luke Iannini. Minor improvements and test added by Erik de Castro Lopo. Differential Revision: http://reviews.llvm.org/D6877 From: Erik de Castro Lopo <erikd@mega-nerd.com> llvm-svn: 226473
* ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions.Adrian Prantl2014-12-161-6/+12
| | | | | | | | | | Debug info marks the first instruction without the FrameSetup flag as being the end of the function prologue. Any CFI instructions in the middle of the function prologue would cause debug info to end the prologue too early and worse, attach the line number of the CFI instruction, which incidentally is often 0. llvm-svn: 224294
* [Stackmaps] Make ithe frame-pointer required for stackmaps.Juergen Ributzka2014-10-021-1/+2
| | | | | | | | | Do not eliminate the frame pointer if there is a stackmap or patchpoint in the function. All stackmap references should be FP relative. This fixes PR21107. llvm-svn: 218920
* Fix typos:Sylvestre Ledru2014-08-111-1/+1
| | | | | | | * libaries => libraries * avaiable => available llvm-svn: 215366
* Have MachineFunction cache a pointer to the subtarget to make lookupsEric Christopher2014-08-051-22/+16
| | | | | | | | | | | shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838
* Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher2014-08-041-16/+22
| | | | | | information and update all callers. No functional change. llvm-svn: 214781
* Run sort_includes.py on the AArch64 backend.Benjamin Kramer2014-07-251-3/+3
| | | | | | No functionality change. llvm-svn: 213938
OpenPOWER on IntegriCloud