path: root/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
Commit message | Author | Age | Files | Lines
...
* [PowerPC] Initial support for the VSX instruction set (Hal Finkel, 2014-03-13; 1 file, -0/+199)

  VSX is an ISA extension supported on the POWER7 and later cores that
  enhances floating-point vector and scalar capabilities. Among other
  things, this adds <2 x double> support and generally helps to reduce
  register pressure.

  The interesting part of this ISA feature is the register configuration:
  there are 64 new 128-bit vector registers, the first 32 of which are
  super-registers of the existing 32 scalar floating-point registers, and
  the second 32 of which overlap with the 32 Altivec vector registers.
  This makes things like vector insertion and extraction tricky: this can
  be free but only if we force a restriction to the right register
  subclass when needed. A new "minipass" PPCVSXCopy takes care of this
  (although it could do a more-optimal job of it; see the comment about
  unnecessary copies below).

  Please note that, currently, VSX is not enabled by default when
  targeting anything because it is not yet ready for that. The assembler
  and disassembler are fully implemented and tested. However:

  - CodeGen support causes miscompiles; test-suite runtime failures:
      MultiSource/Benchmarks/FreeBench/distray/distray
      MultiSource/Benchmarks/McCat/08-main/main
      MultiSource/Benchmarks/Olden/voronoi/voronoi
      MultiSource/Benchmarks/mafft/pairlocalalign
      MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4
      SingleSource/Benchmarks/CoyoteBench/almabench
      SingleSource/Benchmarks/Misc/matmul_f64_4x4
  - The lowering currently falls back to using Altivec instructions far
    more than it should. Worse, there are some things that are scalarized
    through the stack that shouldn't be.
  - A lot of unnecessary copies make it past the optimizers, and this
    needs to be fixed.
  - Many more regression tests are needed.

  Normally, I'd fix these things prior to committing, but there are some
  students and other contributors who would like to work on this, and so
  it makes sense to move this development process upstream where it can
  be subject to the regular code-review procedures.

  llvm-svn: 203768
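  The register overlap described above can be pictured with a small
  standalone sketch (illustrative only; the loop just prints the aliasing
  the message describes, it is not LLVM code):

    #include <cstdio>

    int main() {
      // 64 VSX registers: the first 32 extend the scalar FPRs, the
      // second 32 overlap the Altivec vector registers.
      for (int VSR = 0; VSR < 64; ++VSR) {
        if (VSR < 32)
          std::printf("VSR%d is a super-register of FPR%d\n", VSR, VSR);
        else
          std::printf("VSR%d overlaps Altivec VR%d\n", VSR, VSR - 32);
      }
      return 0;
    }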
* [C++11] Replace llvm::next and llvm::prior with std::next and std::prev. (Benjamin Kramer, 2014-03-02; 1 file, -1/+1)

  Remove the old functions.

  llvm-svn: 202636
* Add CR-bit tracking to the PowerPC backend for i1 values (Hal Finkel, 2014-02-28; 1 file, -70/+121)

  This change enables tracking i1 values in the PowerPC backend using the
  condition register bits. These bits can be treated on PowerPC as
  separate registers; individual bit operations (and, or, xor, etc.) are
  supported. Tracking booleans in CR bits has several advantages:

  - Reduction in register pressure (because we no longer need GPRs to
    store boolean values).
  - Logical operations on booleans can be handled more efficiently; we
    used to have to move all results from comparisons into GPRs, perform
    promoted logical operations in GPRs, and then move the result back
    into condition register bits to be used by conditional branches. This
    can be very inefficient, because these CR <-> GPR moves have high
    latency and low throughput (especially when other associated
    instructions are accounted for).
  - On the POWER7 and similar cores, we can increase total throughput by
    using the CR bits. CR bit operations have a dedicated functional
    unit.

  Most of this is more-or-less mechanical: adjustments were needed in the
  calling-convention code, support was added for spilling/restoring
  individual condition-register bits, and conditional branch instruction
  definitions taking specific CR bits were added (plus patterns and code
  for generating bit-level operations).

  This is enabled by default when running at -O2 and higher. For -O0 and
  -O1, where the ability to debug is more important, this feature is
  disabled by default. Individual CR bits do not have assigned DWARF
  register numbers, and storing values in CR bits makes them invisible to
  the debugger.

  It is critical, however, that we don't move i1 values that have been
  promoted to larger values (such as those passed as function arguments)
  into bit registers only to quickly turn around and move the values back
  into GPRs (such as happens when values are returned by functions). A
  pair of target-specific DAG combines are added to remove the
  trunc/extends in:

    trunc(binary-ops(binary-ops(zext(x), zext(y)), ...))

  and:

    zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...))

  In short, we only want to use CR bits where some of the i1 values come
  from comparisons or are used by conditional branches or selects. To put
  it another way, if we can do the entire i1 computation in GPRs, then we
  probably should (on the POWER7, the GPR-operation throughput is higher,
  and for all cores, the CR <-> GPR moves are expensive).

  POWER7 test-suite performance results (from 10 runs in each
  configuration):

    SingleSource/Benchmarks/Misc/mandel-2: 35% speedup
    MultiSource/Benchmarks/Prolangs-C++/city/city: 21% speedup
    MultiSource/Benchmarks/MiBench/automotive-susan: 23% speedup
    SingleSource/Benchmarks/CoyoteBench/huffbench: 13% speedup
    SingleSource/Benchmarks/Misc-C++/Large/sphereflake: 13% speedup
    SingleSource/Benchmarks/Misc-C++/mandel-text: 10% speedup

    SingleSource/Benchmarks/Misc-C++-EH/spirit: 10% slowdown
    MultiSource/Applications/lemon/lemon: 8% slowdown

  llvm-svn: 202451
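  To make the target pattern concrete, here is an illustrative source
  fragment (a hypothetical example, not from the patch) where both i1
  values come from comparisons and feed a conditional branch, the case
  this change wants to keep in CR bits end to end:

    // With CR-bit tracking, each comparison result can live in a single
    // condition-register bit, and the && can become a crand feeding the
    // branch directly, with no CR <-> GPR round trip.
    int f(int a, int b) {
      if (a > 0 && b > 0)   // two CR bits, one bit-level AND
        return 1;
      return 2;
    }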
* Replace PPC instruction-size code with MCInstrDesc getSize (Hal Finkel, 2014-02-02; 1 file, -13/+6)

  As part of the cleanup done to enable the disassembler, the PPC
  instructions now have a valid Size description field. This can now be
  used to replace some custom logic in a few places to compute
  instruction sizes.

  Patch by David Wiberg!

  llvm-svn: 200623
* Handle spilling the PPC GPRC_NOR0 register class (Hal Finkel, 2014-01-28; 1 file, -4/+8)

  GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO
  pseudo-register). As a result, we also need to check for it in the
  spilling code.

  llvm-svn: 200288
* Re-sort all of the includes with ./utils/sort_includes.py so that (Chandler Carruth, 2014-01-07; 1 file, -1/+1)

  subsequent changes are easier to review. About to fix some layering
  issues, and wanted to separate out the necessary churn. Also comment
  and sink the include of "Windows.h" in three .inc files to match the
  usage in Memory.inc.

  llvm-svn: 198685
* Allow MachineCSE to coalesce trivial subregister copies the same way that it coalesces normal copies. (Andrew Trick, 2013-12-17; 1 file, -2/+8)

  Without this, MachineCSE is powerless to handle redundant operations
  with truncated source operands.

  This required fixing the 2-addr pass to handle tied subregisters. It
  isn't clear what combinations of subregisters can legally be tied, but
  the simple case of truncated source operands is now safely handled:

    %vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1
    %vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2
    %vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def>

  Test case: cse-add-with-overflow.ll.

  This exposed an existing bug in PPCInstrInfo::commuteInstruction.
  Thanks to Rafael for the test case: PowerPC/crash.ll.

  llvm-svn: 197465
* whitespace (Andrew Trick, 2013-12-17; 1 file, -3/+3)

  llvm-svn: 197464
* Improve instruction scheduling for the PPC POWER7 (Hal Finkel, 2013-12-12; 1 file, -2/+64)

  Aside from a few minor latency corrections, the major change here is a
  new hazard recognizer which focuses on better dispatch-group formation
  on the POWER7. As with the PPC970's hazard recognizer, the most
  important thing it does is avoid load-after-store hazards within the
  same dispatch group. It uses the POWER7's special
  dispatch-group-terminating nop instruction (instead of inserting
  multiple regular nop instructions). This new hazard recognizer makes
  use of the scheduling dependency graph itself, built using AA
  information, to robustly detect the possibility of load-after-store
  hazards.

  Significant test-suite performance changes (the error bars are 99.5%
  confidence intervals based on 5 test-suite runs both with and without
  the change -- speedups are negative):

  speedups:
    MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 -0.55171% +/- 0.333168%
    MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl -17.5576% +/- 14.598%
    MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl -29.5708% +/- 7.09058%
    MultiSource/Benchmarks/TSVC/Reductions-flt/Reductions-flt -34.9471% +/- 11.4391%
    SingleSource/Benchmarks/BenchmarkGame/puzzle -25.1347% +/- 11.0104%
    SingleSource/Benchmarks/Misc/flops-8 -17.7297% +/- 9.79061%
    SingleSource/Benchmarks/Shootout-C++/ary3 -35.5018% +/- 23.9458%
    SingleSource/Regression/C/uint64_to_float -56.3165% +/- 25.4234%
    SingleSource/UnitTests/Vectorizer/gcc-loops -18.5309% +/- 6.8496%

  regressions:
    MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000 18.351% +/- 12.156%
    SingleSource/Benchmarks/Shootout-C++/methcall 27.3086% +/- 14.4733%

  llvm-svn: 197099
* Fix the PPC subsumes-predicate check (Hal Finkel, 2013-12-11; 1 file, -0/+4)

  For one predicate to subsume another, they must both check the same
  condition register. Failure to check this prerequisite was causing
  miscompiles.

  Fixes PR18003.

  llvm-svn: 197089
* Remove PPCScoreboardHazardRecognizer (Hal Finkel, 2013-12-02; 1 file, -2/+2)

  PPCScoreboardHazardRecognizer was a subclass of
  ScoreboardHazardRecognizer which did only one thing: filtered out nodes
  in EmitInstruction for which DAG->getInstrDesc(SU) returned NULL. This
  used to be the case for PPC pseudo instructions. As far as I can tell,
  this is no longer true, and so we can use ScoreboardHazardRecognizer
  directly.

  llvm-svn: 196171
* [weak vtables] Remove a bunch of weak vtables (Juergen Ributzka, 2013-11-19; 1 file, -1/+4)

  This patch removes most of the trivial cases of weak vtables by pinning
  them to a single object file. The memory leaks in this version have
  been fixed. Thanks Alexey for pointing them out.

  Differential Revision: http://llvm-reviews.chandlerc.com/D2068

  Reviewed by Andy

  llvm-svn: 195064
* Revert r194865 and r194874. (Alexey Samsonov, 2013-11-18; 1 file, -4/+1)

  This change is incorrect. If you delete the virtual destructor of both
  a base class and a subclass, then the following code:

    Base *foo = new Child();
    delete foo;

  will not invoke the destructors for the members of the Child class. As
  a result, I observe plenty of memory leaks. Notable examples I
  investigated are: ObjectBuffer and ObjectBufferStream, AttributeImpl
  and StringSAttributeImpl.

  llvm-svn: 194997
* [weak vtables] Remove a bunch of weak vtables (Juergen Ributzka, 2013-11-15; 1 file, -1/+4)

  This patch removes most of the trivial cases of weak vtables by pinning
  them to a single object file.

  Differential Revision: http://llvm-reviews.chandlerc.com/D2068

  Reviewed by Andy

  llvm-svn: 194865
* Fix register subclass handling in PPCInstrInfo::insertSelect (Hal Finkel, 2013-07-15; 1 file, -5/+10)

  PPCInstrInfo::insertSelect and PPCInstrInfo::canInsertSelect were
  computing the common subclass of the true and false inputs, and then
  selecting either the 32-bit or the 64-bit isel variant based on the
  result of calling PPC::GPRCRegClass.hasSubClassEq(RC) and
  PPC::G8RCRegClass.hasSubClassEq(RC) (where RC is the common subclass).

  Unfortunately, this is not quite right: if we have something like this:

    %vreg8<def> = SELECT_CC_I8 %vreg4<kill>, %vreg7<kill>, %vreg6<kill>, 76;
      G8RC_and_G8RC_NOX0:%vreg8 CRRC:%vreg4 G8RC_NOX0:%vreg7,%vreg6

  then the common subclass of G8RC_and_G8RC_NOX0 and G8RC_NOX0 is
  G8RC_NOX0, and G8RC_NOX0 is not a subclass of G8RC (because it also
  contains the ZERO8 pseudo-register). As a result, we also need to check
  the common subclass against GPRC_NOR0 and G8RC_NOX0 explicitly.

  This had not been a problem for clients of insertSelect that called
  canInsertSelect first (because it had a compensating mistake), but
  insertSelect is also used by the PPC pseudo-instruction expander, and
  this error was causing a problem in that context. This problem was
  found by csmith.

  llvm-svn: 186343
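  A sketch of the shape of the corrected classification (illustrative,
  not the verbatim patch; RI, MRI, TrueReg, and FalseReg are assumed
  surrounding context, while the register-class objects and the
  hasSubClassEq/getCommonSubClass APIs are the ones named above):

    // RC is the common subclass of the true/false operands' classes.
    const TargetRegisterClass *RC = RI.getCommonSubClass(
        MRI.getRegClass(TrueReg), MRI.getRegClass(FalseReg));
    // Check the zero-excluding classes too: G8RC_NOX0 is not a subclass
    // of G8RC (it contains ZERO8), so the plain check alone misses it.
    bool Is64Bit = PPC::G8RCRegClass.hasSubClassEq(RC) ||
                   PPC::G8RC_NOX0RegClass.hasSubClassEq(RC);
    bool Is32Bit = PPC::GPRCRegClass.hasSubClassEq(RC) ||
                   PPC::GPRC_NOR0RegClass.hasSubClassEq(RC);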
* DebugInfo: remove target-specific Frame Index handling for DBG_VALUE MachineInstrs (David Blaikie, 2013-06-16; 1 file, -10/+0)

  Frame index handling is now target-agnostic, so delete the target hooks
  for creation & asm printing of target-specific addressing in DBG_VALUEs
  and any related functions.

  llvm-svn: 184067
* Fold variable that's only used in assert into the assert. (Benjamin Kramer, 2013-06-07; 1 file, -2/+1)

  Avoids unused variable warnings in Release builds.

  llvm-svn: 183512
* Don't cache the instruction and register info from the TargetMachine, because (Bill Wendling, 2013-06-07; 1 file, -2/+2)

  the internals of TargetMachine could change. No functionality change
  intended.

  llvm-svn: 183494
* PPCInstrInfo::optimizeCompareInstr should not optimize FP compares (Hal Finkel, 2013-05-08; 1 file, -18/+11)

  The floating-point record forms on PPC don't set the condition register
  bits based on a comparison with zero (like the integer record forms
  do), but rather based on the exception status bits.

  llvm-svn: 181423
* Cleanup PPCInstrInfo::optimizeCompareInstr (Hal Finkel, 2013-05-07; 1 file, -14/+10)

  Implement suggestions by Bill Schmidt in post-commit review. No
  functionality change intended.

  llvm-svn: 181338
* Move PPC getSwappedPredicate for reuse (Hal Finkel, 2013-04-20; 1 file, -17/+1)

  The getSwappedPredicate function can be used in other places (such as
  in improvements to the PPCCTRLoops pass). Instead of trapping it as a
  static function in PPCInstrInfo, move it into PPCPredicates with other
  predicate-related things.

  No functionality change intended.

  llvm-svn: 179926
* Fix PPC optimizeCompareInstr swapped-sub argument handling (Hal Finkel, 2013-04-19; 1 file, -16/+40)

  When matching a compare with a subtract where the arguments of the
  compare are swapped w.r.t. the arguments of the subtract, we need to
  negate the predicates (or CR bit indices) of the users. This, however,
  is not the same as inverting the predicate (negating LT -> GT, but
  inverting LT -> GE, for example). The ARM backend seems to do this
  correctly, but when I adapted the code for the PPC backend, I
  introduced an error in this logic.

  Comparison optimization is now enabled again by default.

  llvm-svn: 179899
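  The negate-vs-invert distinction can be checked with a small
  standalone example (illustrative; not from the patch):

    #include <cassert>

    int main() {
      int a = 1, b = 2;
      // Swapping comparison operands mirrors the predicate: LT -> GT.
      assert((a < b) == (b > a));
      // Inverting negates the result instead:            LT -> GE.
      assert(!(a < b) == (a >= b));
      return 0;
    }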
* Disable PPC comparison optimization by default (Hal Finkel, 2013-04-18; 1 file, -0/+6)

  This seems to cause a stage-2 LLVM compile failure (by crashing
  TableGen); so I'm disabling this for now.

  llvm-svn: 179807
* Implement optimizeCompareInstr for PPC (Hal Finkel, 2013-04-18; 1 file, -0/+300)

  Many PPC instructions have a so-called 'record form' which stores to a
  specific condition register the result of comparing the result of the
  instruction with zero (always as a signed comparison). For integer
  operations on PPC64, this is always a 64-bit comparison.

  This implementation is derived from the implementation in the ARM
  backend; there are some differences because PPC condition registers are
  allocatable virtual registers (although the record forms always use a
  specific one), and we look for a matching subtraction instruction after
  the compare (but before the first use) in addition to before it.

  llvm-svn: 179802
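  An illustrative source pattern that this optimization targets (a
  hypothetical example, not from the patch):

    // The subtract's record form (e.g. "subf.") already compares its
    // result with zero and stores the outcome in a condition register,
    // so the separate compare feeding the branch becomes redundant.
    int f(int a, int b) {
      int d = a - b;
      if (d < 0)      // compare-with-zero: foldable into the record form
        return 1;
      return d;
    }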
* Add PPC instruction record forms and associated query functions (Hal Finkel, 2013-04-12; 1 file, -1/+2)

  This is prep. work for the implementation of optimizeCompare. Many PPC
  instructions have 'record' forms (in almost all cases, this means that
  the RC bit is set) that cause the result of the instruction to be
  compared with zero, and the result of that comparison saved in a
  predefined condition register.

  In order to add the record forms of the instructions without too much
  copy-and-paste, the relevant functions have been refactored into
  multiclasses which define both the record and normal forms.

  Also, two TableGen-generated mapping functions have been added which
  allow querying the instruction code for the record form given the
  normal form (and vice versa).

  No functionality change intended.

  llvm-svn: 179356
* Make PPCInstrInfo::isPredicated always return false (Hal Finkel, 2013-04-11; 1 file, -16/+8)

  Because of how predication is implemented on PPC (only for branches), I
  think that this is the right thing to do. No functionality change
  intended.

  llvm-svn: 179252
* PPC: Don't predicate a diamond with two counter decrements (Hal Finkel, 2013-04-10; 1 file, -0/+23)

  I've not seen this happen in practice, and probably can't until we
  start allowing decrement-counter-based conditional branches to be
  double predicated, but just in case, don't allow predication of a
  diamond in which both sides have ctr-defining branches. Even though the
  branching behavior of these can be predicated, the counter-decrementing
  behavior cannot be.

  llvm-svn: 179199
* Cleanup PPCInstrInfo::DefinesPredicate (Hal Finkel, 2013-04-10; 1 file, -5/+10)

  Implement suggestions made by Bill Schmidt in post-commit review.
  Thanks!

  llvm-svn: 179162
* PPC: Prep for if conversion of bctr[l] (Hal Finkel, 2013-04-10; 1 file, -0/+21)

  This adds in-principle support for if-converting the bctr[l]
  instructions. These instructions are used for indirect branching. It
  seems, however, that the current if converter will never actually
  predicate these. To do so, it would need the ability to hoist a few
  setup insts. out of the conditionally-executed block. For example, code
  like this:

    void foo(int a, int (*bar)()) {
      if (a != 0) bar();
    }

  becomes:

      ...
      beq 0, .LBB0_2
      std 2, 40(1)
      mr 12, 4
      ld 3, 0(4)
      ld 11, 16(4)
      ld 2, 8(4)
      mtctr 3
      bctrl
      ld 2, 40(1)
    .LBB0_2:
      ...

  and it would be safe to do all of this unconditionally with a
  predicated beqctrl instruction.

  llvm-svn: 179156
* Allow PPC B and BLR to be if-converted into some predicated forms (Hal Finkel, 2013-04-09; 1 file, -0/+137)

  This enables us to form predicated branches (which are the same
  conditional branches we had before) and also a larger set of predicated
  returns (including instructions like bdnzlr which is a conditional
  return and loop-counter decrement all in one).

  At the moment, if conversion does not capture all possible
  opportunities. A simple example is provided in early-ret2.ll, where if
  conversion forms one predicated return, and then the PPCEarlyReturn
  pass picks up the other one. So, at least for now, we'll keep both
  mechanisms.

  llvm-svn: 179134
* Cleanup PPCEarlyReturn (Hal Finkel, 2013-04-09; 1 file, -28/+31)

  Some general cleanup and only scan the end of a BB for branches (once
  we're done with the terminators and debug values, then there should not
  be any other branches). These address post-commit review suggestions by
  Bill Schmidt.

  No functionality change intended.

  llvm-svn: 179112
* Generate PPC early conditional returns (Hal Finkel, 2013-04-08; 1 file, -0/+148)

  PowerPC has a conditional branch to the link register (return)
  instruction: BCLR. This should be used any time when we'd otherwise
  have a conditional branch to a return. This adds a small pass,
  PPCEarlyReturn, which runs just prior to the branch selection pass
  (and, importantly, after block placement) to generate these conditional
  returns when possible. It will also eliminate unconditional branches to
  returns (these happen rarely; most of the time these have already been
  tail duplicated by the time PPCEarlyReturn is invoked). This is a nice
  optimization for small functions that do not maintain a stack frame.

  llvm-svn: 179026
* Implement PPCInstrInfo::FoldImmediate (Hal Finkel, 2013-04-06; 1 file, -0/+68)

  There are certain PPC instructions into which we can fold a zero
  immediate operand. We can detect such cases by looking at the register
  class required by the using operand (so long as it is not otherwise
  constrained).

  llvm-svn: 178961
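  A sketch of the idea (an assumed shape built from the register classes
  described elsewhere in this log, not the verbatim patch; DefMI, UseMI,
  UseIdx, TII, and TRI are assumed surrounding context):

    // If DefMI materializes the constant 0 (e.g. "li rX, 0") and the
    // use operand's required register class admits the r0-as-0 encoding
    // (it contains the ZERO pseudo-register), rewrite the operand to
    // use ZERO, making the load-immediate dead.
    if (DefMI->getOpcode() == PPC::LI &&
        DefMI->getOperand(1).getImm() == 0) {
      const TargetRegisterClass *RC =
          UseMI->getRegClassConstraint(UseIdx, TII, TRI);
      if (RC == &PPC::GPRC_NOR0RegClass)
        UseMI->getOperand(UseIdx).setReg(PPC::ZERO);
    }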
* Enable early if conversion on PPC (Hal Finkel, 2013-04-05; 1 file, -0/+99)

  On cores for which we know the misprediction penalty, and we have the
  isel instruction, we can profitably perform early if conversion. This
  enables us to replace some small branch sequences with selects and
  avoid the potential stalls from mispredicting the branches.

  Enabling this feature required implementing canInsertSelect and
  insertSelect in PPCInstrInfo; isel code in PPCISelLowering was
  refactored to use these functions as well.

  llvm-svn: 178926
* Resynchronize isLoadFromStackSlot with LoadRegFromStackSlot (and stores) in PPCInstrInfo (Hal Finkel, 2013-03-27; 1 file, -0/+18)

  These functions should have the same list of load/store instructions.
  Now that all load/store forms have been normalized (to single
  instructions or pseudos) they can be resynchronized.

  Found by inspection, although hopefully this will improve optimization.
  I've also added some comments.

  llvm-svn: 178180
* Remove more dead LR-as-GPR PPC code (Hal Finkel, 2013-03-27; 1 file, -16/+4)

  I had removed similar code a few days ago, but somehow missed this.

  llvm-svn: 178169
* Don't spill PPC VRSAVE on non-Darwin (even in SjLj) (Hal Finkel, 2013-03-27; 1 file, -0/+4)

  As Bill Schmidt pointed out to me, only on Darwin do we need to
  spill/restore VRSAVE in the SjLj code. For non-Darwin, don't
  spill/restore VRSAVE (and I've added some asserts to make sure that
  we're not).

  As it turns out, we're not currently handling the Darwin case correctly
  (I've added a FIXME in the test case). I've tried adding various
  implied register definitions/uses to force the spill without success,
  so I'll need to address this later.

  llvm-svn: 178096
* Note in PPCFunctionInfo VRSAVE spills (Hal Finkel, 2013-03-23; 1 file, -10/+18)

  In preparation for using the new register scavenger capability for
  providing more than one register simultaneously, specifically note
  functions that have spilled VRSAVE (currently, this can happen only in
  functions that use the setjmp intrinsic). As with CR spilling, such
  functions will need to provide two emergency spill slots to the
  scavenger.

  No functionality change intended.

  llvm-svn: 177832
* Remove dead PPC LR spilling code (Hal Finkel, 2013-03-23; 1 file, -30/+8)

  The LR register is unconditionally reserved, and its spilling and
  restoration is handled by the prologue/epilogue code. As a result, it
  is never explicitly spilled by the register allocator.

  No functionality change intended.

  llvm-svn: 177823
* Remove ABI-duplicated call instruction patterns. (Ulrich Weigand, 2013-03-22; 1 file, -2/+2)

  We currently have a duplicated set of call instruction patterns
  depending on the ABI to be followed (Darwin vs. Linux). This is a bit
  odd; while the different ABIs will result in different instruction
  sequences, the actual instructions themselves ought to be independent
  of the ABI. And in fact it turns out that the only nontrivial
  difference between the two sets of patterns is that in the PPC64 Linux
  ABI, the instruction used for indirect calls is marked to take X11 as
  extra input register (which is indeed used only with that ABI to hold
  an incoming environment pointer for nested functions). However, this
  does not need to be hard-coded at the .td pattern level; instead, the
  C++ code expanding calls can simply add that use, just like it adds
  uses for argument registers anyway.

  No change in generated code expected.

  llvm-svn: 177735
* Fix a register-class comparison bug in PPCCTRLoops (Hal Finkel, 2013-03-21; 1 file, -9/+0)

  Thanks to Jakob for isolating the underlying problem from the test case
  in r177423. The original commit had introduced asymmetric copy
  operations, but these turned out to be a work-around to the real
  problem (the use of == instead of hasSubClassEq in PPCCTRLoops).

  llvm-svn: 177679
* Add support for spilling VRSAVE on PPC (Hal Finkel, 2013-03-21; 1 file, -0/+12)

  Although there is only one Altivec VRSAVE register, it is a member of a
  register class, and we need the ability to spill it. Because this
  register is normally callee-preserved and handled by special code this
  has never before been necessary. However, this capability will be
  required by a forthcoming commit adding SjLj support.

  llvm-svn: 177654
* Prepare to make r0 an allocatable register on PPC (Hal Finkel, 2013-03-19; 1 file, -0/+9)

  Currently the PPC r0 register is unconditionally reserved. There are
  two reasons for this:

  1. r0 is treated specially (as the constant 0) by certain instructions,
     and so cannot be used with those instructions as a regular register.
  2. r0 is used as a temporary register in the CR-register spilling
     process (where, under some circumstances, we require two GPRs).

  This change addresses the first reason by introducing a restricted
  register class (without r0) for use by those instructions that treat r0
  specially. These register classes have a new pseudo-register, ZERO,
  which represents the r0-as-0 use. This has the side benefit of making
  the existing target code simpler (and easier to understand), and will
  make it clear to the register allocator that uses of r0 as 0 don't
  conflict with real uses of the r0 register.

  Once the CR spilling code is improved, we'll be able to allocate r0.

  Adding these extra register classes, for some reason unclear to me,
  causes requests to the target to copy 32-bit registers to 64-bit
  registers. The resulting code seems correct (and causes no test-suite
  failures), and the new test case covers this new kind of asymmetric
  copy.

  As r0 is still reserved, no functionality change intended.

  llvm-svn: 177423
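  A small standalone model of the r0-as-0 rule (illustrative only; the
  helper name and parameters are hypothetical, not LLVM code):

    #include <cstdint>

    // In d-form addressing such as "lwz rD, d(rA)", an rA field of 0
    // means "base of 0", not "the contents of r0"; so instructions with
    // such operands must draw rA from a class that excludes r0, with
    // the ZERO pseudo-register standing in for the r0-as-0 encoding.
    uint64_t effectiveAddress(unsigned BaseRegField, uint64_t BaseRegValue,
                              int16_t Displacement) {
      uint64_t Base = (BaseRegField == 0) ? 0 : BaseRegValue;
      return Base + (int64_t)Displacement;
    }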
* Improve PPC VR (Altivec) register spilling (Hal Finkel, 2013-03-17; 1 file, -34/+29)

  This change cleans up two issues with Altivec register spilling:

  1. The spilling code was inefficient (using two instructions, an add
     and a load, when just one would do).
  2. The code assumed that r0 would always be available (true for now,
     but this will change).

  The new code handles VR spilling just like GPR spills but forced into
  r+r mode. As a result, when any VR spills are present, we must now
  always allocate the register-scavenger spill slot.

  llvm-svn: 177231
* Allocate the RS spill slot for any PPC function with spills and a large stack frame (Hal Finkel, 2013-03-15; 1 file, -3/+4)

  For spills into a large stack frame, the FI-elimination code uses the
  register scavenger to obtain a free GPR for use with an r+r-addressed
  load or store. When there are no available GPRs, the scavenger gets one
  by using its spill slot. Previously, we were not always allocating that
  spill slot and the RS would assert when the spill slot was needed.

  I don't currently have a small test that triggered the assert, but I've
  created a small regression test that verifies that the spill slot is
  now added when the stack frame is sufficiently large.

  llvm-svn: 177140
* PPC should always use the register scavenger for CR spilling (Hal Finkel, 2013-03-12; 1 file, -77/+9)

  This removes the -disable-ppc[32|64]-regscavenger options; the code
  that uses the register scavenger has been working well (and has been
  the default) for some time, and we don't need options to enable the old
  (broken) CR spilling code.

  llvm-svn: 176865
* Use the new script to sort the includes of every file under lib. (Chandler Carruth, 2012-12-03; 1 file, -3/+3)

  Sooooo many of these had incorrect or strange main module includes. I
  have manually inspected all of these, and fixed the main module include
  to be the nearest plausible thing I could find. If you own or care
  about any of these source files, I encourage you to take some time and
  check that these edits were sensible. I can't have broken anything (I
  strictly added headers, and reordered them, never removed), but they
  may not be the headers you'd really like to identify as containing the
  API being implemented.

  Many forward declarations and missing includes were added to header
  files to allow them to parse cleanly when included first. The main
  module rule does in fact have its merits. =]

  llvm-svn: 169131
* Remove all references to TargetInstrInfoImpl. (Jakob Stoklund Olesen, 2012-11-28; 1 file, -2/+2)

  This class has been merged into its super-class TargetInstrInfo.

  llvm-svn: 168760
* When generating spill and reload code for vector registers on PowerPC, (Bill Schmidt, 2012-10-10; 1 file, -6/+12)

  the compiler makes use of GPR0. However, there are two flavors of GPR0
  defined by the target: the 32-bit GPR0 (R0) and the 64-bit GPR0 (X0).
  The spill/reload code makes use of R0 regardless of whether we are
  generating 32- or 64-bit code. This patch corrects the problem in the
  obvious manner, using X0 and ADDI8 for 64-bit and R0 and ADDI for
  32-bit.

  llvm-svn: 165658
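  A sketch of the fix's shape (assuming the opcode and register names
  the message gives, with Subtarget as assumed surrounding context;
  illustrative, not the verbatim patch):

    // Select the width-appropriate scratch register and add-immediate.
    bool Is64Bit = Subtarget.isPPC64();
    unsigned ScratchReg = Is64Bit ? PPC::X0 : PPC::R0;
    unsigned AddOpcode  = Is64Bit ? PPC::ADDI8 : PPC::ADDI;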
* Add PPC Freescale e500mc and e5500 subtargets. (Hal Finkel, 2012-08-28; 1 file, -2/+4)

  Add subtargets for Freescale e500mc (32-bit) and e5500 (64-bit) to the
  PowerPC backend.

  Patch by Tobias von Koch.

  llvm-svn: 162764