summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* Cleanup PPCInstrInfo::optimizeCompareInstrHal Finkel2013-05-071-14/+10
| | | | | | | Implement suggestions by Bill Schmidt in post-commit review. No functionality change intended. llvm-svn: 181338
* Move PPC getSwappedPredicate for reuseHal Finkel2013-04-201-17/+1
| | | | | | | | | | | The getSwappedPredicate function can be used in other places (such as in improvements to the PPCCTRLoops pass). Instead of trapping it as a static function in PPCInstrInfo, move it into PPCPredicates with other predicate-related things. No functionality change intended. llvm-svn: 179926
* Fix PPC optimizeCompareInstr swapped-sub argument handlingHal Finkel2013-04-191-16/+40
| | | | | | | | | | | | | When matching a compare with a subtract where the arguments of the compare are swapped w.r.t. the arguments of the subtract, we need to negate the predicates (or CR bit indices) of the users. This, however, is not the same as inverting the predicate (negating LT -> GT, but inverting LT -> GE, for example). The ARM backend seems to do this correctly, but when I adapted the code for the PPC backend, I introduced an error in this logic. Comparison optimization is now enabled again by default. llvm-svn: 179899
* Disable PPC comparison optimization by defaultHal Finkel2013-04-181-0/+6
| | | | | | | This seems to cause a stage-2 LLVM compile failure (by crashing TableGen); do I'm disabling this for now. llvm-svn: 179807
* Implement optimizeCompareInstr for PPCHal Finkel2013-04-181-0/+300
| | | | | | | | | | | | | | | Many PPC instructions have a so-called 'record form' which stores to a specific condition register the result of comparing the result of the instruction with zero (always as a signed comparison). For integer operations on PPC64, this is always a 64-bit comparison. This implementation is derived from the implementation in the ARM backend; there are some differences because PPC condition registers are allocatable virtual registers (although the record forms always use a specific one), and we look for a matching subtraction instruction after the compare (but before the first use) in addition to before it. llvm-svn: 179802
* Add PPC instruction record forms and associated query functionsHal Finkel2013-04-121-1/+2
| | | | | | | | | | | | | | | | | | This is prep. work for the implementation of optimizeCompare. Many PPC instructions have 'record' forms (in almost all cases, this means that the RC bit is set) that cause the result of the instruction to be compared with zero, and the result of that comparison saved in a predefined condition register. In order to add the record forms of the instructions without too much copy-and-paste, the relevant functions have been refactored into multiclasses which define both the record and normal forms. Also, two TableGen-generated mapping functions have been added which allow querying the instruction code for the record form given the normal form (and vice versa). No functionality change intended. llvm-svn: 179356
* Make PPCInstrInfo::isPredicated always return falseHal Finkel2013-04-111-16/+8
| | | | | | | Because of how predication in implemented on PPC (only for branches), I think that this is the right thing to do. No functionality change intended. llvm-svn: 179252
* PPC: Don't predicate a diamond with two counter decrementsHal Finkel2013-04-101-0/+23
| | | | | | | | | | I've not seen this happen in practice, and probably can't until we start allowing decrement-counter-based conditional branches to be double predicated, but just in case, don't allow predication of a diamond in which both sides have ctr-defining branches. Even though the branching behavior of these can be predicated, the counter-decrementing behavior cannot be. llvm-svn: 179199
* Cleanup PPCInstrInfo::DefinesPredicateHal Finkel2013-04-101-5/+10
| | | | | | Implement suggestions made by Bill Schmidt in post-commit review. Thanks! llvm-svn: 179162
* PPC: Prep for if conversion of bctr[l]Hal Finkel2013-04-101-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds in-principle support for if-converting the bctr[l] instructions. These instructions are used for indirect branching. It seems, however, that the current if converter will never actually predicate these. To do so, it would need the ability to hoist a few setup insts. out of the conditionally-executed block. For example, code like this: void foo(int a, int (*bar)()) { if (a != 0) bar(); } becomes: ... beq 0, .LBB0_2 std 2, 40(1) mr 12, 4 ld 3, 0(4) ld 11, 16(4) ld 2, 8(4) mtctr 3 bctrl ld 2, 40(1) .LBB0_2: ... and it would be safe to do all of this unconditionally with a predicated beqctrl instruction. llvm-svn: 179156
* Allow PPC B and BLR to be if-converted into some predicated formsHal Finkel2013-04-091-0/+137
| | | | | | | | | | | | | | This enables us to form predicated branches (which are the same conditional branches we had before) and also a larger set of predicated returns (including instructions like bdnzlr which is a conditional return and loop-counter decrement all in one). At the moment, if conversion does not capture all possible opportunities. A simple example is provided in early-ret2.ll, where if conversion forms one predicated return, and then the PPCEarlyReturn pass picks up the other one. So, at least for now, we'll keep both mechanisms. llvm-svn: 179134
* Cleanup PPCEarlyReturnHal Finkel2013-04-091-28/+31
| | | | | | | | | | Some general cleanup and only scan the end of a BB for branches (once we're done with the terminators and debug values, then there should not be any other branches). These address post-commit review suggestions by Bill Schmidt. No functionality change intended. llvm-svn: 179112
* Generate PPC early conditional returnsHal Finkel2013-04-081-0/+148
| | | | | | | | | | | | | PowerPC has a conditional branch to the link register (return) instruction: BCLR. This should be used any time when we'd otherwise have a conditional branch to a return. This adds a small pass, PPCEarlyReturn, which runs just prior to the branch selection pass (and, importantly, after block placement) to generate these conditional returns when possible. It will also eliminate unconditional branches to returns (these happen rarely; most of the time these have already been tail duplicated by the time PPCEarlyReturn is invoked). This is a nice optimization for small functions that do not maintain a stack frame. llvm-svn: 179026
* Implement PPCInstrInfo::FoldImmediateHal Finkel2013-04-061-0/+68
| | | | | | | | There are certain PPC instructions into which we can fold a zero immediate operand. We can detect such cases by looking at the register class required by the using operand (so long as it is not otherwise constrained). llvm-svn: 178961
* Enable early if conversion on PPCHal Finkel2013-04-051-0/+99
| | | | | | | | | | | | | On cores for which we know the misprediction penalty, and we have the isel instruction, we can profitably perform early if conversion. This enables us to replace some small branch sequences with selects and avoid the potential stalls from mispredicting the branches. Enabling this feature required implementing canInsertSelect and insertSelect in PPCInstrInfo; isel code in PPCISelLowering was refactored to use these functions as well. llvm-svn: 178926
* Resynchronize isLoadFromStackSlot with LoadRegFromStackSlot (and stores) in ↵Hal Finkel2013-03-271-0/+18
| | | | | | | | | | | | | PPCInstrInfo These functions should have the same list of load/store instructions. Now that all load/store forms have been normalized (to single instructions or pseudos) they can be resynchronized. Found by inspection, although hopefully this will improve optimization. I've also added some comments. llvm-svn: 178180
* Remove more dead LR-as-GPR PPC codeHal Finkel2013-03-271-16/+4
| | | | | | I had removed similar code a few days ago, but somehow missed this. llvm-svn: 178169
* Don't spill PPC VRSAVE on non-Darwin (even in SjLj)Hal Finkel2013-03-271-0/+4
| | | | | | | | | | | | | As Bill Schmidt pointed out to me, only on Darwin do we need to spill/restore VRSAVE in the SjLj code. For non-Darwin, don't spill/restore VRSAVE (and I've added some asserts to make sure that we're not). As it turns out, we're not currently handling the Darwin case correctly (I've added a FIXME in the test case). I've tried adding various implied register definitions/uses to force the spill without success, so I'll need to address this later. llvm-svn: 178096
* Note in PPCFunctionInfo VRSAVE spillsHal Finkel2013-03-231-10/+18
| | | | | | | | | | | | In preparation for using the new register scavenger capability for providing more than one register simultaneously, specifically note functions that have spilled VRSAVE (currently, this can happen only in functions that use the setjmp intrinsic). As with CR spilling, such functions will need to provide two emergency spill slots to the scavenger. No functionality change intended. llvm-svn: 177832
* Remove dead PPC LR spilling codeHal Finkel2013-03-231-30/+8
| | | | | | | | | | The LR register is unconditionally reserved, and its spilling and restoration is handled by the prologue/epilogue code. As a result, it is never explicitly spilled by the register allocator. No functionality change intended. llvm-svn: 177823
* Remove ABI-duplicated call instruction patterns.Ulrich Weigand2013-03-221-2/+2
| | | | | | | | | | | | | | | | | | We currently have a duplicated set of call instruction patterns depending on the ABI to be followed (Darwin vs. Linux). This is a bit odd; while the different ABIs will result in different instruction sequences, the actual instructions themselves ought to be independent of the ABI. And in fact it turns out that the only nontrivial difference between the two sets of patterns is that in the PPC64 Linux ABI, the instruction used for indirect calls is marked to take X11 as extra input register (which is indeed used only with that ABI to hold an incoming environment pointer for nested functions). However, this does not need to be hard-coded at the .td pattern level; instead, the C++ code expanding calls can simply add that use, just like it adds uses for argument registers anyway. No change in generated code expected. llvm-svn: 177735
* Fix a register-class comparison bug in PPCCTRLoopsHal Finkel2013-03-211-9/+0
| | | | | | | | | Thanks to Jakob for isolating the underlying problem from the test case in r177423. The original commit had introduced asymmetric copy operations, but these turned out to be a work-around to the real problem (the use of == instead of hasSubClassEq in PPCCTRLoops). llvm-svn: 177679
* Add support for spilling VRSAVE on PPCHal Finkel2013-03-211-0/+12
| | | | | | | | | | Although there is only one Altivec VRSAVE register, it is a member of a register class, and we need the ability to spill it. Because this register is normally callee-preserved and handled by special code this has never before been necessary. However, this capability will be required by a forthcoming commit adding SjLj support. llvm-svn: 177654
* Prepare to make r0 an allocatable register on PPCHal Finkel2013-03-191-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the PPC r0 register is unconditionally reserved. There are two reasons for this: 1. r0 is treated specially (as the constant 0) by certain instructions, and so cannot be used with those instructions as a regular register. 2. r0 is used as a temporary register in the CR-register spilling process (where, under some circumstances, we require two GPRs). This change addresses the first reason by introducing a restricted register class (without r0) for use by those instructions that treat r0 specially. These register classes have a new pseudo-register, ZERO, which represents the r0-as-0 use. This has the side benefit of making the existing target code simpler (and easier to understand), and will make it clear to the register allocator that uses of r0 as 0 don't conflict will real uses of the r0 register. Once the CR spilling code is improved, we'll be able to allocate r0. Adding these extra register classes, for some reason unclear to me, causes requests to the target to copy 32-bit registers to 64-bit registers. The resulting code seems correct (and causes no test-suite failures), and the new test case covers this new kind of asymmetric copy. As r0 is still reserved, no functionality change intended. llvm-svn: 177423
* Improve PPC VR (Altivec) register spillingHal Finkel2013-03-171-34/+29
| | | | | | | | | | | | | | | | This change cleans up two issues with Altivec register spilling: 1. The spilling code was inefficient (using two instructions, and add and a load, when just one would do) 2. The code assumed that r0 would always be available (true for now, but this will change) The new code handles VR spilling just like GPR spills but forced into r+r mode. As a result, when any VR spills are present, we must now always allocate the register-scavenger spill slot. llvm-svn: 177231
* Allocate the RS spill slot for any PPC function with spills and a large ↵Hal Finkel2013-03-151-3/+4
| | | | | | | | | | | | | | | | stack frame For spills into a large stack frame, the FI-elimination code uses the register scavenger to obtain a free GPR for use with an r+r-addressed load or store. When there are no available GPRs, the scavenger gets one by using its spill slot. Previously, we were not always allocating that spill slot and the RS would assert when the spill slot was needed. I don't currently have a small test that triggered the assert, but I've created a small regression test that verifies that the spill slot is now added when the stack frame is sufficiently large. llvm-svn: 177140
* PPC should always use the register scavenger for CR spillingHal Finkel2013-03-121-77/+9
| | | | | | | | This removes the -disable-ppc[32|64]-regscavenger options; the code that uses the register scavenger has been working well (and has been the default) for some time, and we don't need options to enable the old (broken) CR spilling code. llvm-svn: 176865
* Use the new script to sort the includes of every file under lib.Chandler Carruth2012-12-031-3/+3
| | | | | | | | | | | | | | | | | Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
* Remove all references to TargetInstrInfoImpl.Jakob Stoklund Olesen2012-11-281-2/+2
| | | | | | This class has been merged into its super-class TargetInstrInfo. llvm-svn: 168760
* When generating spill and reload code for vector registers on PowerPC,Bill Schmidt2012-10-101-6/+12
| | | | | | | | | | | | the compiler makes use of GPR0. However, there are two flavors of GPR0 defined by the target: the 32-bit GPR0 (R0) and the 64-bit GPR0 (X0). The spill/reload code makes use of R0 regardless of whether we are generating 32- or 64-bit code. This patch corrects the problem in the obvious manner, using X0 and ADDI8 for 64-bit and R0 and ADDI for 32-bit. llvm-svn: 165658
* Add PPC Freescale e500mc and e5500 subtargets.Hal Finkel2012-08-281-2/+4
| | | | | | | | | Add subtargets for Freescale e500mc (32-bit) and e5500 (64-bit) to the PowerPC backend. Patch by Tobias von Koch. llvm-svn: 162764
* Implement PPCInstrInfo::isCoalescableExtInstr().Jakob Stoklund Olesen2012-06-191-0/+16
| | | | | | | | | | | | | The PPC::EXTSW instruction preserves the low 32 bits of its input, just like some of the x86 instructions. Use it to reduce register pressure when the low 32 bits have multiple uses. This requires a small change to PeepholeOptimizer since EXTSW takes a 64-bit input register. This is related to PR5997. llvm-svn: 158743
* Enable PPC CTR loop formation by default.Hal Finkel2012-06-081-6/+6
| | | | | | | | | | | | | | | | | | | | | | Thanks to Jakob's help, this now causes no new test suite failures! Over the entire test suite, this gives an average 1% speedup. The largest speedups are: SingleSource/Benchmarks/Misc/pi - 108% SingleSource/Benchmarks/CoyoteBench/lpbench - 54% MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50% SingleSource/Benchmarks/Shootout/ary3 - 32% SingleSource/Benchmarks/Shootout-C++/matrix - 30% The largest slowdowns are: MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30% MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25% MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22% MultiSource/Applications/d/make_dparser - -14% SingleSource/Benchmarks/Shootout-C++/ary - -13% In light of these slowdowns, additional profiling work is obviously needed! llvm-svn: 158223
* Disable the PPC CTR-Loops pass by default.Hal Finkel2012-06-081-0/+12
| | | | | | | | | | The pass itself works well, but the something in the Machine* infrastructure does not understand terminators which define registers. Without the ability to use the block-placement pass, etc. this causes performance regressions (and so is turned off by default). Turning off the analysis turns off the problems with the Machine* infrastructure. llvm-svn: 158206
* Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form ↵Hal Finkel2012-06-081-6/+71
| | | | | | | | | | CTR-based loop branching code. This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are no longer otherwise used. Also, invalid preheader DebugLoc is not used. llvm-svn: 158204
* Convert some uses of XXXRegisterClass to &XXXRegClass. No functional change ↵Craig Topper2012-04-201-16/+16
| | | | | | since they are equivalent. llvm-svn: 155186
* Add instruction itinerary for the PPC64 A2 core.Hal Finkel2012-04-011-3/+4
| | | | | | | This adds a full itinerary for IBM's PPC64 A2 embedded core. These cores form the basis for the CPUs in the new IBM BG/Q supercomputer. llvm-svn: 153842
* Fix dynamic linking on PPC64.Hal Finkel2012-03-311-1/+4
| | | | | | | | | | | | | | | | | | Dynamic linking on PPC64 has had problems since we had to move the top-down hazard-detection logic post-ra. For dynamic linking to work there needs to be a nop placed after every call. It turns out that it is really hard to guarantee that nothing will be placed in between the call (bl) and the nop during post-ra scheduling. Previous attempts at fixing this by placing logic inside the hazard detector only partially worked. This is now fixed in a different way: call+nop codegen-only instructions. As far as CodeGen is concerned the pair is now a single instruction and cannot be split. This solution works much better than previous attempts. The scoreboard hazard detector is also renamed to be more generic, there is currently no cpu-specific logic in it. llvm-svn: 153816
* Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, ↵Jia Liu2012-02-181-1/+1
| | | | | | MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878
* make CR spill and restore 64-bit clean (no functional change), and fix some ↵Hal Finkel2011-12-071-5/+11
| | | | | | other problems found with -verify-machineinstrs llvm-svn: 146024
* 64-bit LR8 load should use X11 not R11Hal Finkel2011-12-071-3/+3
| | | | llvm-svn: 146021
* add RESTORE_CR and support CR unspillsHal Finkel2011-12-061-23/+36
| | | | llvm-svn: 145961
* enable PPC register scavenging by default (update tests and remove some FIXMEs)Hal Finkel2011-12-051-5/+5
| | | | llvm-svn: 145819
* update PPC 940 hazard rec. to function in postRA modeHal Finkel2011-12-021-12/+19
| | | | llvm-svn: 145676
* add basic PPC register-pressure feedback; adjust the vaarg test to match the ↵Hal Finkel2011-11-221-5/+2
| | | | | | new register-allocation pattern llvm-svn: 145065
* Make use of MachinePointerInfo::getFixedStack. This removes all mentionJay Foad2011-11-151-5/+2
| | | | | | of PseudoSourceValue from lib/Target/. llvm-svn: 144632
* PPCInstrInfo.cpp: Fix one "unused" warning.NAKAMURA Takumi2011-11-081-0/+1
| | | | llvm-svn: 144071
* Fix unused variable warning.Richard Smith2011-10-211-1/+1
| | | | llvm-svn: 142630
* Disable the PPC hazard recognizer. It currently only supportsDan Gohman2011-10-201-2/+8
| | | | | | top-down scheduling and top-down scheduling is going away. llvm-svn: 142621
* Add PPC 440 scheduler and some associated testsHal Finkel2011-10-171-1/+9
| | | | llvm-svn: 142170
OpenPOWER on IntegriCloud