summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* Resynchronize isLoadFromStackSlot with LoadRegFromStackSlot (and stores) in ↵Hal Finkel2013-03-271-0/+18
| | | | | | | | | | | | | PPCInstrInfo These functions should have the same list of load/store instructions. Now that all load/store forms have been normalized (to single instructions or pseudos) they can be resynchronized. Found by inspection, although hopefully this will improve optimization. I've also added some comments. llvm-svn: 178180
* Fix typo (common to both X86 and PPC)Hal Finkel2013-03-271-1/+1
| | | | | | Thanks to Bill Schmidt for pointing this out during code review! llvm-svn: 178170
* Remove more dead LR-as-GPR PPC codeHal Finkel2013-03-271-16/+4
| | | | | | I had removed similar code a few days ago, but somehow missed this. llvm-svn: 178169
* Remove "gpr0 allocation" from the PPC README TODO listHal Finkel2013-03-271-1/+0
| | | | | | As Chris pointed out, post r178123, this is now done! llvm-svn: 178165
* Print PPC ZERO as 0 (not r0) even on DarwinHal Finkel2013-03-271-2/+2
| | | | | | | It seems that the Darwin PPC assembler requires r0 to be written as 0 when it means 0 (at least in lwarx/stwcx.). Fixes PR15605. llvm-svn: 178142
* Allocate r0 on PPCHal Finkel2013-03-271-2/+0
| | | | | | | The R0 register can now be allocated because instructions that cannot use R0 as a GPR have been appropriately marked. llvm-svn: 178123
* Use the PPC no-r0 class on the TOC LD pseudosHal Finkel2013-03-271-2/+2
| | | | | | | | | | The register parameter in these instructions becomes the base register in an r+i ld instruction (and, thus, cannot be r0). This is not yet testable because we don't yet allocate r0 (and even then any test would be very fragile). llvm-svn: 178121
* Apply the no-r0 register class to the PPC SELECT_CC_I[4|8] pseudosHal Finkel2013-03-271-2/+7
| | | | | | | | | | Either operand of these pseudo instructions can be transformed into the first operand of an isel instruction (and this operand cannot be r0). This is not yet testable because we don't yet allocate r0 (and even when we do, any test would be very fragile). llvm-svn: 178119
* Apply the no-r0 class to PPC TOC ADDI[S] pseudo instructionsHal Finkel2013-03-271-9/+9
| | | | | | | | | | | | Like the addi/addis instructions themselves, these pseudo instructions also cannot have r0 as their register parameter (because it will be interpreted as the value 0). This is not yet testable because we don't yet allocate r0 (and even when we do, any regression test would be very fragile because it would depend on the register allocator heuristics). llvm-svn: 178118
* Remove the link register from the GPR classes on PowerPC.Bill Schmidt2013-03-271-2/+2
| | | | | | | | | | | | | | | Some implementation detail in the forgotten past required the link register to be placed in the GPRC and G8RC register classes. This is just wrong on the face of it, and causes several extra intersection register classes to be generated. I found this was having evil effects on instruction scheduling, by causing the wrong register class to be consulted for register pressure decisions. No code generation changes are expected, other than some minor changes in instruction order. Seven tests in the test bucket required minor tweaks to adjust to the new normal. llvm-svn: 178114
* Don't spill PPC VRSAVE on non-Darwin (even in SjLj)Hal Finkel2013-03-273-1/+8
| | | | | | | | | | | | | As Bill Schmidt pointed out to me, only on Darwin do we need to spill/restore VRSAVE in the SjLj code. For non-Darwin, don't spill/restore VRSAVE (and I've added some asserts to make sure that we're not). As it turns out, we're not currently handling the Darwin case correctly (I've added a FIXME in the test case). I've tried adding various implied register definitions/uses to force the spill without success, so I'll need to address this later. llvm-svn: 178096
* Restore real bit lengths on PPC register numbersHal Finkel2013-03-261-12/+12
| | | | | | | | | | As suggested by Bill Schmidt (in reviewing r178067), use the real register number bit lengths (which is self-documenting, and prevents using illegal numbers), and set only the relevant bits in HWEncoding (which defaults to 0). No functionality change intended. llvm-svn: 178077
* PPC: Use HWEncoding and TRI->getEncodingValueHal Finkel2013-03-268-100/+31
| | | | | | | | | | | As pointed out by Jakob, we don't need to maintain a separate register-numbering table. Instead we should let TableGen generate the table for us from the information (already present) in PPCRegisterInfo.td. TRI->getEncodingValue is now used to access register-encoding values. No functionality change intended. llvm-svn: 178067
* Use multiple virtual registers in PPC CR spillingHal Finkel2013-03-262-25/+35
| | | | | | | | | | | | Now that the register scavenger can support multiple spill slots, and PEI can use virtual-register-based scavenging for multiple simultaneous registers, we can use a virtual register for the transfer register in the CR spilling code. This should eliminate the last place (outside of the prologue/epilogue) where we depend on the unconditional availability of the r0 register. We will soon be able to allocate it (in a somewhat restricted sense) as a GPR. llvm-svn: 178060
* Update PPCRegisterInfo's use of virtual registers to be SSAHal Finkel2013-03-261-3/+5
| | | | | | | | | | | PPC's use of PEI's virtual-register-based scavenging functionality had redefined the virtual registers (it was non-SSA). Now that PEI supports dealing with instructions with multiple virtual registers, this can be cleanup up to use multiple virtual registers and keep SSA form. No functionality change intended. llvm-svn: 178059
* Remove default case from fully covered switch.Benjamin Kramer2013-03-261-1/+1
| | | | llvm-svn: 178025
* PowerPC: Mark patterns as isCodeGenOnly.Ulrich Weigand2013-03-264-7/+21
| | | | | | | | | | | There remain a number of patterns that cannot (and should not) be handled by the asm parser, in particular all the Pseudo patterns. This commit marks those patterns as isCodeGenOnly. No change in generated code. llvm-svn: 178008
* PowerPC: Simplify handling of fixups.Ulrich Weigand2013-03-264-70/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | MCTargetDesc/PPCMCCodeEmitter.cpp current has code like: if (isSVR4ABI() && is64BitMode()) Fixups.push_back(MCFixup::Create(0, MO.getExpr(), (MCFixupKind)PPC::fixup_ppc_toc16)); else Fixups.push_back(MCFixup::Create(0, MO.getExpr(), (MCFixupKind)PPC::fixup_ppc_lo16)); This is a problem for the asm parser, since it requires knowledge of the ABI / 64-bit mode to be set up. However, more fundamentally, at this point we shouldn't make such distinctions anyway; in an assembler file, it always ought to be possible to e.g. generate TOC relocations even when the main ABI is one that doesn't use TOC. Fortunately, this is actually completely unnecessary; that code was added to decide whether to generate TOC relocations, but that information is in fact already encoded in the VariantKind of the underlying symbol. This commit therefore merges those fixup types into one, and then decides which relocation to use based on the VariantKind. No changes in generated code. llvm-svn: 178007
* PowerPC: Simplify FADD in round-to-zero mode.Ulrich Weigand2013-03-264-101/+50
| | | | | | | | | | | | | | | | | | | | | | | As part of the the sequence generated to implement long double -> int conversions, we need to perform an FADD in round-to-zero mode. This is problematical since the FPSCR is not at all modeled at the SelectionDAG level, and thus there is a risk of getting floating point instructions generated out of sequence with the instructions to modify FPSCR. The current code handles this by somewhat "special" patterns that in part have dummy operands, and/or duplicate existing instructions, making them awkward to handle in the asm parser. This commit changes this by leaving the "FADD in round-to-zero mode" as an atomic operation on the SelectionDAG level, and only split it up into real instructions at the MI level (via custom inserter). Since at *this* level the FPSCR *is* modeled (via the "RM" hard register), much of the "special" stuff can just go away, and the resulting patterns can be used by the asm parser. No significant change in generated code expected. llvm-svn: 178006
* PowerPC: Remove LDrs pattern.Ulrich Weigand2013-03-263-16/+9
| | | | | | | | | | | | | | | | | | | | The LDrs pattern is a duplicate of LD, except that it accepts memory addresses where the displacement is a symbolLo64. An operand type "memrs" is defined for just that purpose. However, this wouldn't be necessary if the default "memrix" operand type were to simply accept 64-bit symbolic addresses directly. The only problem with that is that it uses "symbolLo", which is hardcoded to 32-bit. To fix this, this commit changes "memri" and "memrix" to use new operand types for the memory displacement, which allow iPTR instead of i32. This will also make address parsing easier to implment in the asm parser. No change in generated code. llvm-svn: 178005
* PowerPC: Remove ADDIL patterns.Ulrich Weigand2013-03-264-20/+13
| | | | | | | | | | | | | | The ADDI/ADDI8 patterns are currently duplicated into ADDIL/ADDI8L, which describe the same instruction, except that they accept a symbolLo[64] operand instead of a s16imm[64] operand. This duplication confuses the asm parser, and it actually not really needed, since symbolLo[64] already accepts immediate operands anyway. So this commit removes the duplicate patterns. No change in generated code. llvm-svn: 178004
* PowerPC: Use CCBITRC operand for ISEL patterns.Ulrich Weigand2013-03-265-46/+19
| | | | | | | | | | | | This commit changes the ISEL patterns to use a CCBITRC operand instead of a "pred" operand. This matches the actual instruction text more directly, and simplifies use of ISEL with the asm parser. In addition, this change allows some simplification of handling the "pred" operand, as this is now only used by BCC. No change in generated code. llvm-svn: 178003
* PowerPC: Simplify BLR pattern.Ulrich Weigand2013-03-263-12/+6
| | | | | | | | | | | | | | | | | The BLR pattern cannot be recognized by the asm parser in its current form. This complexity is due to an apparent attempt to enable conditional BLR variants. However, none of those can ever be generated by current code; the pattern is only ever created using the default "pred" operand. To simplify the pattern and allow it to be recognized by the parser, this commit removes those attempts at conditional BLR support. When we later come back to actually add real conditional BLR, this should probably be done via a fully generic conditional branch pattern. No change in generated code. llvm-svn: 178002
* PowerPC: Move some 64-bit branch patterns.Ulrich Weigand2013-03-261-17/+18
| | | | | | | | | | | | In PPCInstr64Bit.td, some branch patterns appear in a different sequence than the corresponding 32-bit patterns in PPCInstrInfo.td. To simplify future changes that affect both files, this commit moves those patterns to rearrange them into a similar sequence. No effect on generated code. llvm-svn: 178001
* Use direct types in PowerPC instruction patterns.Ulrich Weigand2013-03-252-270/+262
| | | | | | | | | | This commit updates the PowerPC back-end (PPCInstrInfo.td and PPCInstr64Bit.td) to use types instead of register classes in instruction patterns, along the lines of Jakob Stoklund Olesen's changes in r177835 for Sparc. llvm-svn: 177890
* Use direct types in PowerPC Pat patterns.Ulrich Weigand2013-03-252-113/+113
| | | | | | | | | This commit updates the PowerPC back-end (PPCInstrInfo.td and PPCInstr64Bit.td) to use types instead of register classes in Pat patterns, along the lines of Jakob Stoklund Olesen's changes in r177829 for Sparc. llvm-svn: 177889
* PPC ZERO register needs a register number of 0.Hal Finkel2013-03-231-0/+1
| | | | | | | | | In order for the new ZERO register to be used with MC, etc. we need to specify its register number (0). Thanks to Kai for reporting the problem! llvm-svn: 177833
* Note in PPCFunctionInfo VRSAVE spillsHal Finkel2013-03-234-14/+34
| | | | | | | | | | | | In preparation for using the new register scavenger capability for providing more than one register simultaneously, specifically note functions that have spilled VRSAVE (currently, this can happen only in functions that use the setjmp intrinsic). As with CR spilling, such functions will need to provide two emergency spill slots to the scavenger. No functionality change intended. llvm-svn: 177832
* MCize the bcl instruction in PPCAsmPrinterHal Finkel2013-03-231-4/+5
| | | | | | | | | I recently added a BCL instruction definition as part of implementing SjLj support. This can also be used to MCize bcl emission in the asm printer. No functionality change intended. llvm-svn: 177830
* Cleanup some unused reg. scavenger parameters in PPCRegisterInfoHal Finkel2013-03-232-33/+19
| | | | | | | | | | | These spilling functions will eventually make use of the register scavenger, however, they'll do so by taking advantage of PEI's virtual-register-based delayed scavenging mechanism. As a result, these function parameters will not be used, and can be removed. No functionality change intended. llvm-svn: 177827
* Remove dead PPC LR spilling codeHal Finkel2013-03-231-30/+8
| | | | | | | | | | The LR register is unconditionally reserved, and its spilling and restoration is handled by the prologue/epilogue code. As a result, it is never explicitly spilled by the register allocator. No functionality change intended. llvm-svn: 177823
* Allow the register scavenger to spill multiple registersHal Finkel2013-03-221-1/+1
| | | | | | | | | | | | | | | | | | This patch lets the register scavenger make use of multiple spill slots in order to guarantee that it will be able to provide multiple registers simultaneously. To support this, the RS's API has changed slightly: setScavengingFrameIndex / getScavengingFrameIndex have been replaced by addScavengingFrameIndex / isScavengingFrameIndex / getScavengingFrameIndices. In forthcoming commits, the PowerPC backend will use this capability in order to implement the spilling of condition registers, and some special-purpose registers, without relying on r0 being reserved. In some cases, spilling these registers requires two GPRs: one for addressing and one to hold the value being transferred. llvm-svn: 177774
* Remove ABI-duplicated call instruction patterns.Ulrich Weigand2013-03-228-148/+71
| | | | | | | | | | | | | | | | | | We currently have a duplicated set of call instruction patterns depending on the ABI to be followed (Darwin vs. Linux). This is a bit odd; while the different ABIs will result in different instruction sequences, the actual instructions themselves ought to be independent of the ABI. And in fact it turns out that the only nontrivial difference between the two sets of patterns is that in the PPC64 Linux ABI, the instruction used for indirect calls is marked to take X11 as extra input register (which is indeed used only with that ABI to hold an incoming environment pointer for nested functions). However, this does not need to be hard-coded at the .td pattern level; instead, the C++ code expanding calls can simply add that use, just like it adds uses for argument registers anyway. No change in generated code expected. llvm-svn: 177735
* Rename memrr ptrreg and offreg components.Ulrich Weigand2013-03-222-22/+22
| | | | | | | | | | | Currently, the sub-operand of a memrr address that corresponds to what hardware considers the base register is called "offreg", while the sub-operand that corresponds to the offset is called "ptrreg". To avoid confusion, this patch simply swaps the named of those two sub-operands and updates all uses. No functional change is intended. llvm-svn: 177734
* Fix swapped BasePtr and Offset in pre-inc memory addresses.Ulrich Weigand2013-03-224-20/+40
| | | | | | | | | | | | | | | | | | | | PPCTargetLowering::getPreIndexedAddressParts currently provides the base part of a memory address in the offset result, and the offset part in the base result. That swap is then undone again when an MI instruction is generated (in PPCDAGToDAGISel::Select for loads, and using .md Pat patterns for stores). This patch reverts this double swap, to make common code and back-end be in sync as to which part of the address is base and which is offset. To avoid performance regressions in certain cases, target code now checks whether the choice of base register would be rejected for pre-inc accesses by common code, and attempts to swap base and offset again in such cases. (Overall, this means that now pre-ice accesses are generated *more* frequently than before.) llvm-svn: 177733
* Tighten iaddroff ComplexPattern.Ulrich Weigand2013-03-221-4/+4
| | | | | | | | | | | | | | | | | The iaddroff ComplexPattern is supposed to recognize displacement expressions that have been processed by a SelectAddressRegImm, which means it needs to accept TargetConstant and TargetGlobalAddress nodes. Currently, it erroneously also accepts some other nodes, in particular Constant and PPCISD::Lo. While this problem is currently latent, it would cause wrong-code bugs with a follow-on patch I'm about to commit, so this patch tightens the ComplexPattern. The equivalent change is made in PPCDAGToDAGISel::Select, where pre-inc load patterns are handled (as opposed to store patterns, the loads are handled in C++ code without making use of the .td ComplexPattern). llvm-svn: 177732
* Remove the xaddroff ComplexPattern.Ulrich Weigand2013-03-223-31/+18
| | | | | | | | | The xaddroff pattern is currently (mistakenly) used to recognize the *base* register in pre-inc store patterns. This patch replaces those uses by ptr_rc_nor0 (as is elsewhere done to match the base register of an address), and removes the now unused ComplexPattern. llvm-svn: 177731
* Remove the G8RC_NOX0_and_GPRC_NOR0 PPC register classHal Finkel2013-03-213-7/+12
| | | | | | | | | | | | As Jakob pointed out in his review of r177423, having a shared ZERO register between the 32- and 64-bit register classes causes this odd G8RC_NOX0_and_GPRC_NOR0 class to be created. As recommended, this adds a ZERO8 register which differentiates the 32- and 64-bit zeros. No functionality change intended. llvm-svn: 177683
* Fix a register-class comparison bug in PPCCTRLoopsHal Finkel2013-03-213-19/+1
| | | | | | | | | Thanks to Jakob for isolating the underlying problem from the test case in r177423. The original commit had introduced asymmetric copy operations, but these turned out to be a work-around to the real problem (the use of == instead of hasSubClassEq in PPCCTRLoops). llvm-svn: 177679
* Implement builtin_{setjmp/longjmp} on PPCHal Finkel2013-03-219-0/+367
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements SJLJ lowering on PPC, making the Clang functions __builtin_{setjmp/longjmp} functional on PPC platforms. The implementation strategy is similar to that on X86, with the exception that a branch-and-link variant is used to get the right jump address. Credit goes to Bill Schmidt for suggesting the use of the unconditional bcl form (instead of the regular bl instruction) to limit return-address-cache pollution. Benchmarking the speed at -O3 of: static jmp_buf env_sigill; void foo() { __builtin_longjmp(env_sigill,1); } main() { ... for (int i = 0; i < c; ++i) { if (__builtin_setjmp(env_sigill)) { goto done; } else { foo(); } done:; } ... } vs. the same code using the libc setjmp/longjmp functions on a P7 shows that this builtin implementation is ~4x faster with Altivec enabled and ~7.25x faster with Altivec disabled. This comparison is somewhat unfair because the libc version must also save/restore the VSX registers which we don't yet support. llvm-svn: 177666
* Add support for spilling VRSAVE on PPCHal Finkel2013-03-214-1/+106
| | | | | | | | | | Although there is only one Altivec VRSAVE register, it is a member of a register class, and we need the ability to spill it. Because this register is normally callee-preserved and handled by special code this has never before been necessary. However, this capability will be required by a forthcoming commit adding SjLj support. llvm-svn: 177654
* Correct PPC FRAMEADDR lowering using a pseudo-registerHal Finkel2013-03-215-9/+50
| | | | | | | | | | | | | | | | The old code used to lower FRAMEADDR tried to replicate the logic in the real frame-lowering code that determines whether or not the frame pointer (r31) will be used. When it seemed as through the frame pointer would not be used, the stack pointer (r1) was used instead. Unfortunately, because the stack size is not yet known, this does not work. Instead, this change introduces new always-reserved pseudo-registers (FP and FP8) that are replaced during prologue insertion with the real frame-pointer register (either r1 or r31). It is important that this intrinsic always return a valid frame address because it is used by Clang to store the frame address as part of code generation for __builtin_setjmp. llvm-svn: 177653
* Add missing mayLoad flag to LHAUX8 and LWAUX.Ulrich Weigand2013-03-191-1/+2
| | | | | | | | | All pre-increment load patterns need to set the mayLoad flag (since they don't provide a DAG pattern). This was missing for LHAUX8 and LWAUX, which is added by this patch. llvm-svn: 177431
* Rewrite LHAU8 pattern to use standard memory operand.Ulrich Weigand2013-03-191-4/+4
| | | | | | | | | | | | As opposed to to pre-increment store patterns, the pre-increment load patterns were already using standard memory operands, with the sole exception of LHAU8. As there's no real reason why LHAU8 should be different here, this patch simply rewrites the pattern to also use a memri operand, just like all the other patterns. llvm-svn: 177430
* Rewrite pre-increment store patterns to use standard memory operands.Ulrich Weigand2013-03-192-151/+121
| | | | | | | | | | | | | | | | | | | | | | Currently, pre-increment store patterns are written to use two separate operands to represent address base and displacement: stwu $rS, $ptroff($ptrreg) This causes problems when implementing the assembler parser, so this commit changes the patterns to use standard (complex) memory operands like in all other memory access instruction patterns: stwu $rS, $dst To still match those instructions against the appropriate pre_store SelectionDAG nodes, the patch uses the new feature that allows a Pat to match multiple DAG operands against a single (complex) instruction operand. Approved by Hal Finkel. llvm-svn: 177429
* Fix sub-operand size mismatch in tocentry operands.Ulrich Weigand2013-03-191-1/+1
| | | | | | | | | | | The tocentry operand class refers to 64-bit values (it is only used in 64-bit, where iPTR is a 64-bit type), but its sole suboperand is designated as 32-bit type. This causes a mismatch to be detected at compile-time with the TableGen patch I'll check in shortly. To fix this, this commit changes the suboperand to a 64-bit type as well. llvm-svn: 177427
* Prepare to make r0 an allocatable register on PPCHal Finkel2013-03-196-112/+156
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the PPC r0 register is unconditionally reserved. There are two reasons for this: 1. r0 is treated specially (as the constant 0) by certain instructions, and so cannot be used with those instructions as a regular register. 2. r0 is used as a temporary register in the CR-register spilling process (where, under some circumstances, we require two GPRs). This change addresses the first reason by introducing a restricted register class (without r0) for use by those instructions that treat r0 specially. These register classes have a new pseudo-register, ZERO, which represents the r0-as-0 use. This has the side benefit of making the existing target code simpler (and easier to understand), and will make it clear to the register allocator that uses of r0 as 0 don't conflict will real uses of the r0 register. Once the CR spilling code is improved, we'll be able to allocate r0. Adding these extra register classes, for some reason unclear to me, causes requests to the target to copy 32-bit registers to 64-bit registers. The resulting code seems correct (and causes no test-suite failures), and the new test case covers this new kind of asymmetric copy. As r0 is still reserved, no functionality change intended. llvm-svn: 177423
* Cleanup PPC64 unaligned i64 load/storeHal Finkel2013-03-191-4/+0
| | | | | | | | | Remove an accidentally-added instruction definition and add a comment in the test case. This is in response to a post-commit review by Bill Schmidt. No functionality change intended. llvm-svn: 177404
* Don't reserve R31 on PPC64 unless the frame pointer is neededHal Finkel2013-03-191-4/+3
| | | | llvm-svn: 177379
* Fix a sign-extension bug in PPCCTRLoopsHal Finkel2013-03-181-1/+1
| | | | | | | Don't sign extend the immediate value from the OR instruction in an LIS/OR pair. llvm-svn: 177361
OpenPOWER on IntegriCloud