bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[PowerPC] Add rldcr/rldic instructions	Ulrich Weigand	2013-06-25	1	-0/+8
\| \| \| \| \| \| \| \|	This adds pattern for the rldcr and rldic instructions (the last instruction from the rotate/shift family that were missing). They are currently used only by the asm parser. llvm-svn: 184833
*	[PowerPC] Add predicted forms of branches	Ulrich Weigand	2013-06-24	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds support for the predicted forms of branches (+/-). There are three cases to consider: - Branches using a PPC::Predicate code For these, I've added new PPC::Predicate codes corresponding to the BO values for predicted branch forms, and updated insn printing to print them correctly. I've also added new aliases for the asm parser matching the new forms. - bt/bf I've added new aliases matching to gBC etc. - bd(n)z variants I've added new instruction patterns for the predicted forms. In all cases, the new patterns are used for the asm parser only. (The new infrastructure ought to be sufficient to allow use by the compiler too at some point.) llvm-svn: 184754
*	[PowerPC] Support absolute branches	Ulrich Weigand	2013-06-24	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is currently only limited support for the "absolute" variants of branch instructions. This patch adds support for the absolute variants of all branches that are currently otherwise supported. This requires adding new fixup types so that the correct variant of relocation type can be selected by the object writer. While the compiler will continue to usually choose the relative branch variants, this will allow the asm parser to fully support the absolute branches, with either immediate (numerical) or symbolic target addresses. No change in code generation intended. llvm-svn: 184721
*	[PowerPC] Remove symbolLo/symbolHi instruction operand types	Ulrich Weigand	2013-05-23	1	-22/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that there is no longer any distinction between symbolLo and symbolHi operands in either printing, encoding, or parsing, the operand types can be removed in favor of simply using s16imm. This completes the patch series to decouple lo/hi operand part processing from the particular instruction whose operand it is. No change in code generation expected from this patch. llvm-svn: 182618
*	[PowerPC] Clean up generation of ha16() / lo16() markers	Ulrich Weigand	2013-05-23	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When targeting the Darwin assembler, we need to generate markers ha16() and lo16() to designate the high and low parts of a (symbolic) immediate. This is necessary not just for plain symbols, but also for certain symbolic expression, typically along the lines of ha16(A - B). The latter doesn't work when simply using VariantKind flags on the symbol reference. This is why the current back-end uses hacks (explicitly called out as such via multiple FIXMEs) in the symbolLo/symbolHi print methods. This patch uses target-defined MCExpr codes to represent the Darwin ha16/lo16 constructs, following along the lines of the equivalent solution used by the ARM back end to handle their :upper16: / :lower16: markers. This allows us to get rid of special handling both in the symbolLo/symbolHi print method and in the common code MCExpr::print routine. Instead, the ha16 / lo16 markers are printed simply in a custom print routine for the target MCExpr types. (As a result, the symbolLo/symbolHi print methods can now replaced by a single printS16ImmOperand routine that also handles symbolic operands.) The patch also provides a EvaluateAsRelocatableImpl routine to handle ha16/lo16 constructs. This is not actually used at the moment by any in-tree code, but is provided as it makes merging into David Fang's out-of-tree Mach-O object writer simpler. Since there is no longer any need to treat VK_PPC_GAS_HA16 and VK_PPC_DARWIN_HA16 differently, they are merged into a single VK_PPC_ADDR16_HA (and likewise for the _LO16 types). llvm-svn: 182616
*	Change some PowerPC PatLeaf definitions to ImmLeaf for fast-isel.	Bill Schmidt	2013-05-22	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using PatLeaf rather than ImmLeaf when defining immediate predicates prevents simple patterns using those predicates from being recognized for fast instruction selection. This patch replaces the immSExt16 PatLeaf predicate with two ImmLeaf predicates, imm32SExt16 and imm64SExt16, allowing a few more patterns to be recognized (ADDI, ADDIC, MULLI, ADDI8, and ADDIC8). Using the new predicates does not help for LI, LI8, SUBFIC, and SUBFIC8 because these are rejected for other reasons, but I see no reason to retain the PatLeaf predicate. No functional change intended, and thus no test cases yet. This is preliminary work for enabling fast-isel support for PowerPC. When that support is ready, we'll be able to test this function. llvm-svn: 182510
*	Rename PPC MTCTRse to MTCTRloop	Hal Finkel	2013-05-20	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	As the pairing of this instruction form with the bdnz/bdz branches is now enforced by the verification pass, make it clear from the name that these are used only for counter-based loops. No functionality change intended. llvm-svn: 182296
*	[PowerPC] Fix hi/lo encoding in old-style code emitter	Ulrich Weigand	2013-05-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements the equivalent change to r182091/r182092 in the old-style code emitter. Instead of having two separate 16-bit immediate encoding routines depending on the instruction, this patch introduces a single encoder that checks the machine operand flags to decide whether the low or high half of a symbol address is required. Since now both encoders make no further distinction between "symbolLo" and "symbolHi", the .td operand can now use a single getS16ImmEncoding method. Tested by running the old-style JIT tests on 32-bit Linux. llvm-svn: 182097
*	Implement PPC counter loops as a late IR-level pass	Hal Finkel	2013-05-15	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old PPCCTRLoops pass, like the Hexagon pass version from which it was derived, could only handle some simple loops in canonical form. We cannot directly adapt the new Hexagon hardware loops pass, however, because the Hexagon pass contains a fundamental assumption that non-constant-trip-count loops will contain a guard, and this is not always true (the result being that incorrect negative counts can be generated). With this commit, we replace the pass with a late IR-level pass which makes use of SE to calculate the backedge-taken counts and safely generate the loop-count expressions (including any necessary max() parts). This IR level pass inserts custom intrinsics that are lowered into the desired decrement-and-branch instructions. The most fragile part of this new implementation is that interfering uses of the counter register must be detected on the IR level (and, on PPC, this also includes any indirect branches in addition to function calls). Also, to make all of this work, we need a variant of the mtctr instruction that is marked as having side effects. Without this, machine-code level CSE, DCE, etc. illegally transform the resulting code. Hopefully, this can be improved in the future. This new pass is smaller than the original (and much smaller than the new Hexagon hardware loops pass), and can handle many additional cases correctly. In addition, the preheader-creation code has been copied from LoopSimplify, and after we decide on where it belongs, this code will be refactored so that it can be explicitly shared (making this implementation even smaller). The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for the new Hexagon pass. There are a few classes of loops that this pass does not transform (noted by FIXMEs in the files), but these deficiencies can be addressed within the SE infrastructure (thus helping many other passes as well). llvm-svn: 181927
*	[PowerPC] Add assembler parser	Ulrich Weigand	2013-05-03	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds assembler parser support to the PowerPC back end. The parser will run for any powerpc-- and powerpc64-- triples, but was tested only on 64-bit Linux. The supported syntax is intended to be compatible with the GNU assembler. The parser does not yet support all PowerPC instructions, but it does support anything that is generated by LLVM itself. There is no support for testing restricted instruction sets yet, i.e. the parser will always accept any instructions it knows, no matter what feature flags are given. Instruction operands will be checked for validity and errors generated. (Error handling in general could still be improved.) The patch adds a number of test cases to verify instruction and operand encodings. The tests currently cover all instructions from the following PowerPC ISA v2.06 Book I facilities: Branch, Fixed-point, Floating-Point, and Vector. Note that a number of these instructions are not yet supported by the back end; they are marked with FIXME. A number of follow-on check-ins will add extra features. When they are all included, LLVM passes all tests (including bootstrap) when using clang -cc1as as the system assembler. llvm-svn: 181050
*	PowerPC: Use RegisterOperand instead of RegisterClass operands	Ulrich Weigand	2013-04-26	1	-145/+145
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the default PowerPC assembler syntax, registers are specified simply by number, so they cannot be distinguished from immediate values (without looking at the opcode). This means that the default operand matching logic for the asm parser does not work, and we need to specify custom matchers. Since those can only be specified with RegisterOperand classes and not directly on the RegisterClass, all instructions patterns used by the asm parser need to use a RegisterOperand (instead of a RegisterClass) for all their register operands. This patch adds one RegisterOperand for each RegisterClass, using the same name as the class, just in lower case, and updates all instruction patterns to use RegisterOperand instead of RegisterClass operands. llvm-svn: 180611
*	PowerPC: Fix encoding of rldimi and rldcl instructions	Ulrich Weigand	2013-04-26	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	When testing the asm parser, I noticed wrong encodings for the above instructions (wrong operand name in rldimi, wrong form and sub-opcode for rldcl). Tests will be added together with the asm parser. llvm-svn: 180606
*	PowerPC: Mark some more patterns as isCodeGenOnly.	Ulrich Weigand	2013-04-17	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	A couple of recently introduced conditional branch patterns also need to be marked as isCodeGenOnly since they cannot be handled by the asm parser. No change in generated code. llvm-svn: 179690
*	Mark all PPC comparison instructions as not having side effects	Hal Finkel	2013-04-15	1	-8/+10
\| \| \| \| \| \| \| \| \| \|	Now that the CR spilling issues have been resolved, we can remove the unmodeled-side-effect attributes from the comparison instructions (and also mark them as isCompare). By allowing these, by default, to have unmodeled side effects, we were hiding problems with CR spilling; but everything seems much happier now. llvm-svn: 179502
*	Mark all PPC CR registers to be spilled as live-in and tag MFCR appropriately	Hal Finkel	2013-04-13	1	-5/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Leaving MFCR has having unmodeled side effects is not enough to prevent unwanted instruction reordering post-RA. We could probably apply a stronger barrier attribute, but there is a better way: Add all (not just the first) CR to be spilled as live-in to the entry block, and add all CRs to the MFCR instruction as implicitly killed. Unfortunately, I don't have a small test case. llvm-svn: 179465
*	PPC: Remove (broken) nested implicit definition lists	Hal Finkel	2013-04-12	1	-35/+32
\| \| \| \| \| \| \| \| \| \|	TableGen will not combine nested list 'let' bindings into a single list, and instead uses only the inner scope. As a result, several instruction definitions were missing implicit register defs that were in outer scopes. This de-nests these scopes and makes all instructions have only one let binding which sets implicit register definitions. llvm-svn: 179392
*	Add PPC instruction record forms and associated query functions	Hal Finkel	2013-04-12	1	-163/+191
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is prep. work for the implementation of optimizeCompare. Many PPC instructions have 'record' forms (in almost all cases, this means that the RC bit is set) that cause the result of the instruction to be compared with zero, and the result of that comparison saved in a predefined condition register. In order to add the record forms of the instructions without too much copy-and-paste, the relevant functions have been refactored into multiclasses which define both the record and normal forms. Also, two TableGen-generated mapping functions have been added which allow querying the instruction code for the record form given the normal form (and vice versa). No functionality change intended. llvm-svn: 179356
*	PPC: Prep for if conversion of bctr[l]	Hal Finkel	2013-04-10	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds in-principle support for if-converting the bctr[l] instructions. These instructions are used for indirect branching. It seems, however, that the current if converter will never actually predicate these. To do so, it would need the ability to hoist a few setup insts. out of the conditionally-executed block. For example, code like this: void foo(int a, int (*bar)()) { if (a != 0) bar(); } becomes: ... beq 0, .LBB0_2 std 2, 40(1) mr 12, 4 ld 3, 0(4) ld 11, 16(4) ld 2, 8(4) mtctr 3 bctrl ld 2, 40(1) .LBB0_2: ... and it would be safe to do all of this unconditionally with a predicated beqctrl instruction. llvm-svn: 179156
*	Allow PPC B and BLR to be if-converted into some predicated forms	Hal Finkel	2013-04-09	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This enables us to form predicated branches (which are the same conditional branches we had before) and also a larger set of predicated returns (including instructions like bdnzlr which is a conditional return and loop-counter decrement all in one). At the moment, if conversion does not capture all possible opportunities. A simple example is provided in early-ret2.ll, where if conversion forms one predicated return, and then the PPCEarlyReturn pass picks up the other one. So, at least for now, we'll keep both mechanisms. llvm-svn: 179134
*	PPC rotate instructions don't have unmodeled side effcts	Hal Finkel	2013-04-07	1	-1/+3
\| \| \| \|	llvm-svn: 178982
*	Most PPC M[TF]CR instructions do not have side effects	Hal Finkel	2013-04-07	1	-1/+8
\| \| \| \|	llvm-svn: 178978
*	PPC pre-increment load instructions do not have side effects	Hal Finkel	2013-04-07	1	-2/+3
\| \| \| \| \| \|	A few were missed in r178972. llvm-svn: 178973
*	PPC pre-increment load instructions do not have side effects	Hal Finkel	2013-04-07	1	-2/+2
\| \| \| \|	llvm-svn: 178972
*	PPC ISEL is a select and never has side effects	Hal Finkel	2013-04-06	1	-0/+1
\| \| \| \|	llvm-svn: 178960
*	Add more PPC floating-point conversion instructions	Hal Finkel	2013-04-01	1	-0/+16
\| \| \| \| \| \| \| \| \|	The P7 and A2 have additional floating-point conversion instructions which allow a direct two-instruction sequence (plus load/store) to convert from all combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores, only some combinations were directly available). llvm-svn: 178480
*	Add the PPC popcntw instruction	Hal Finkel	2013-04-01	1	-0/+7
\| \| \| \| \| \| \| \| \|	The popcntw instruction is available whenever the popcntd instruction is available, and performs a separate popcnt on the lower and upper 32-bits. Ignoring the high-order count, this can be used for the 32-bit input case (saving on the explicit zero extension otherwise required to use popcntd). llvm-svn: 178470
*	Cleanup PPC(64) i32 -> float/double conversion	Hal Finkel	2013-03-31	1	-13/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing SINT_TO_FP code for i32 -> float/double conversion was disabled because it relied on broken EXTSW_32/STD_32 instruction definitions. The original intent had been to enable these 64-bit instructions to be used on CPUs that support them even in 32-bit mode. Unfortunately, this form of lying to the infrastructure was buggy (as explained in the FIXME comment) and had therefore been disabled. This re-enables this functionality, using regular DAG nodes, but only when compiling in 64-bit mode. The old STD_32/EXTSW_32 definitions (which were dead) are removed. llvm-svn: 178438
*	Add the PPC64 ldbrx/stdbrx instructions	Hal Finkel	2013-03-28	1	-1/+9
\| \| \| \| \| \| \| \|	These are 64-bit load/store with byte-swap, and available on the P7 and the A2. Like the similar instructions for 16- and 32-bit words, these are matched in the target DAG-combine phase against load/store-bswap pairs. llvm-svn: 178276
*	Add the PPC64 popcntd instruction	Hal Finkel	2013-03-28	1	-0/+3
\| \| \| \| \| \| \|	PPC ISA 2.06 (P7, A2, etc.) has a popcntd instruction. Add this instruction and tell TTI about it so that popcount-loop recognition will know about it. llvm-svn: 178233
*	Fix typo in PPCInstr64Bit	Hal Finkel	2013-03-28	1	-1/+1
\| \| \| \|	llvm-svn: 178219
*	Use the PPC no-r0 class on the TOC LD pseudos	Hal Finkel	2013-03-27	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	The register parameter in these instructions becomes the base register in an r+i ld instruction (and, thus, cannot be r0). This is not yet testable because we don't yet allocate r0 (and even then any test would be very fragile). llvm-svn: 178121
*	Apply the no-r0 class to PPC TOC ADDI[S] pseudo instructions	Hal Finkel	2013-03-27	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Like the addi/addis instructions themselves, these pseudo instructions also cannot have r0 as their register parameter (because it will be interpreted as the value 0). This is not yet testable because we don't yet allocate r0 (and even when we do, any regression test would be very fragile because it would depend on the register allocator heuristics). llvm-svn: 178118
*	PowerPC: Mark patterns as isCodeGenOnly.	Ulrich Weigand	2013-03-26	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \|	There remain a number of patterns that cannot (and should not) be handled by the asm parser, in particular all the Pseudo patterns. This commit marks those patterns as isCodeGenOnly. No change in generated code. llvm-svn: 178008
*	PowerPC: Remove LDrs pattern.	Ulrich Weigand	2013-03-26	1	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The LDrs pattern is a duplicate of LD, except that it accepts memory addresses where the displacement is a symbolLo64. An operand type "memrs" is defined for just that purpose. However, this wouldn't be necessary if the default "memrix" operand type were to simply accept 64-bit symbolic addresses directly. The only problem with that is that it uses "symbolLo", which is hardcoded to 32-bit. To fix this, this commit changes "memri" and "memrix" to use new operand types for the memory displacement, which allow iPTR instead of i32. This will also make address parsing easier to implment in the asm parser. No change in generated code. llvm-svn: 178005
*	PowerPC: Remove ADDIL patterns.	Ulrich Weigand	2013-03-26	1	-5/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ADDI/ADDI8 patterns are currently duplicated into ADDIL/ADDI8L, which describe the same instruction, except that they accept a symbolLo[64] operand instead of a s16imm[64] operand. This duplication confuses the asm parser, and it actually not really needed, since symbolLo[64] already accepts immediate operands anyway. So this commit removes the duplicate patterns. No change in generated code. llvm-svn: 178004
*	PowerPC: Use CCBITRC operand for ISEL patterns.	Ulrich Weigand	2013-03-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This commit changes the ISEL patterns to use a CCBITRC operand instead of a "pred" operand. This matches the actual instruction text more directly, and simplifies use of ISEL with the asm parser. In addition, this change allows some simplification of handling the "pred" operand, as this is now only used by BCC. No change in generated code. llvm-svn: 178003
*	PowerPC: Move some 64-bit branch patterns.	Ulrich Weigand	2013-03-26	1	-17/+18
\| \| \| \| \| \| \| \| \| \| \| \|	In PPCInstr64Bit.td, some branch patterns appear in a different sequence than the corresponding 32-bit patterns in PPCInstrInfo.td. To simplify future changes that affect both files, this commit moves those patterns to rearrange them into a similar sequence. No effect on generated code. llvm-svn: 178001
*	Use direct types in PowerPC instruction patterns.	Ulrich Weigand	2013-03-25	1	-122/+120
\| \| \| \| \| \| \| \| \| \|	This commit updates the PowerPC back-end (PPCInstrInfo.td and PPCInstr64Bit.td) to use types instead of register classes in instruction patterns, along the lines of Jakob Stoklund Olesen's changes in r177835 for Sparc. llvm-svn: 177890
*	Use direct types in PowerPC Pat patterns.	Ulrich Weigand	2013-03-25	1	-53/+53
\| \| \| \| \| \| \| \| \|	This commit updates the PowerPC back-end (PPCInstrInfo.td and PPCInstr64Bit.td) to use types instead of register classes in Pat patterns, along the lines of Jakob Stoklund Olesen's changes in r177829 for Sparc. llvm-svn: 177889
*	Remove ABI-duplicated call instruction patterns.	Ulrich Weigand	2013-03-22	1	-59/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently have a duplicated set of call instruction patterns depending on the ABI to be followed (Darwin vs. Linux). This is a bit odd; while the different ABIs will result in different instruction sequences, the actual instructions themselves ought to be independent of the ABI. And in fact it turns out that the only nontrivial difference between the two sets of patterns is that in the PPC64 Linux ABI, the instruction used for indirect calls is marked to take X11 as extra input register (which is indeed used only with that ABI to hold an incoming environment pointer for nested functions). However, this does not need to be hard-coded at the .td pattern level; instead, the C++ code expanding calls can simply add that use, just like it adds uses for argument registers anyway. No change in generated code expected. llvm-svn: 177735
*	Rename memrr ptrreg and offreg components.	Ulrich Weigand	2013-03-22	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \|	Currently, the sub-operand of a memrr address that corresponds to what hardware considers the base register is called "offreg", while the sub-operand that corresponds to the offset is called "ptrreg". To avoid confusion, this patch simply swaps the named of those two sub-operands and updates all uses. No functional change is intended. llvm-svn: 177734
*	Fix swapped BasePtr and Offset in pre-inc memory addresses.	Ulrich Weigand	2013-03-22	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PPCTargetLowering::getPreIndexedAddressParts currently provides the base part of a memory address in the offset result, and the offset part in the base result. That swap is then undone again when an MI instruction is generated (in PPCDAGToDAGISel::Select for loads, and using .md Pat patterns for stores). This patch reverts this double swap, to make common code and back-end be in sync as to which part of the address is base and which is offset. To avoid performance regressions in certain cases, target code now checks whether the choice of base register would be rejected for pre-inc accesses by common code, and attempts to swap base and offset again in such cases. (Overall, this means that now pre-ice accesses are generated more frequently than before.) llvm-svn: 177733
*	Remove the xaddroff ComplexPattern.	Ulrich Weigand	2013-03-22	1	-8/+8
\| \| \| \| \| \| \| \| \|	The xaddroff pattern is currently (mistakenly) used to recognize the base register in pre-inc store patterns. This patch replaces those uses by ptr_rc_nor0 (as is elsewhere done to match the base register of an address), and removes the now unused ComplexPattern. llvm-svn: 177731
*	Fix a register-class comparison bug in PPCCTRLoops	Hal Finkel	2013-03-21	1	-9/+0
\| \| \| \| \| \| \| \| \|	Thanks to Jakob for isolating the underlying problem from the test case in r177423. The original commit had introduced asymmetric copy operations, but these turned out to be a work-around to the real problem (the use of == instead of hasSubClassEq in PPCCTRLoops). llvm-svn: 177679
*	Implement builtin_{setjmp/longjmp} on PPC	Hal Finkel	2013-03-21	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This implements SJLJ lowering on PPC, making the Clang functions __builtin_{setjmp/longjmp} functional on PPC platforms. The implementation strategy is similar to that on X86, with the exception that a branch-and-link variant is used to get the right jump address. Credit goes to Bill Schmidt for suggesting the use of the unconditional bcl form (instead of the regular bl instruction) to limit return-address-cache pollution. Benchmarking the speed at -O3 of: static jmp_buf env_sigill; void foo() { __builtin_longjmp(env_sigill,1); } main() { ... for (int i = 0; i < c; ++i) { if (__builtin_setjmp(env_sigill)) { goto done; } else { foo(); } done:; } ... } vs. the same code using the libc setjmp/longjmp functions on a P7 shows that this builtin implementation is ~4x faster with Altivec enabled and ~7.25x faster with Altivec disabled. This comparison is somewhat unfair because the libc version must also save/restore the VSX registers which we don't yet support. llvm-svn: 177666
*	Add missing mayLoad flag to LHAUX8 and LWAUX.	Ulrich Weigand	2013-03-19	1	-1/+2
\| \| \| \| \| \| \| \| \|	All pre-increment load patterns need to set the mayLoad flag (since they don't provide a DAG pattern). This was missing for LHAUX8 and LWAUX, which is added by this patch. llvm-svn: 177431
*	Rewrite LHAU8 pattern to use standard memory operand.	Ulrich Weigand	2013-03-19	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \|	As opposed to to pre-increment store patterns, the pre-increment load patterns were already using standard memory operands, with the sole exception of LHAU8. As there's no real reason why LHAU8 should be different here, this patch simply rewrites the pattern to also use a memri operand, just like all the other patterns. llvm-svn: 177430
*	Rewrite pre-increment store patterns to use standard memory operands.	Ulrich Weigand	2013-03-19	1	-72/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, pre-increment store patterns are written to use two separate operands to represent address base and displacement: stwu $rS, $ptroff($ptrreg) This causes problems when implementing the assembler parser, so this commit changes the patterns to use standard (complex) memory operands like in all other memory access instruction patterns: stwu $rS, $dst To still match those instructions against the appropriate pre_store SelectionDAG nodes, the patch uses the new feature that allows a Pat to match multiple DAG operands against a single (complex) instruction operand. Approved by Hal Finkel. llvm-svn: 177429
*	Fix sub-operand size mismatch in tocentry operands.	Ulrich Weigand	2013-03-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	The tocentry operand class refers to 64-bit values (it is only used in 64-bit, where iPTR is a 64-bit type), but its sole suboperand is designated as 32-bit type. This causes a mismatch to be detected at compile-time with the TableGen patch I'll check in shortly. To fix this, this commit changes the suboperand to a 64-bit type as well. llvm-svn: 177427
*	Prepare to make r0 an allocatable register on PPC	Hal Finkel	2013-03-19	1	-48/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the PPC r0 register is unconditionally reserved. There are two reasons for this: 1. r0 is treated specially (as the constant 0) by certain instructions, and so cannot be used with those instructions as a regular register. 2. r0 is used as a temporary register in the CR-register spilling process (where, under some circumstances, we require two GPRs). This change addresses the first reason by introducing a restricted register class (without r0) for use by those instructions that treat r0 specially. These register classes have a new pseudo-register, ZERO, which represents the r0-as-0 use. This has the side benefit of making the existing target code simpler (and easier to understand), and will make it clear to the register allocator that uses of r0 as 0 don't conflict will real uses of the r0 register. Once the CR spilling code is improved, we'll be able to allocate r0. Adding these extra register classes, for some reason unclear to me, causes requests to the target to copy 32-bit registers to 64-bit registers. The resulting code seems correct (and causes no test-suite failures), and the new test case covers this new kind of asymmetric copy. As r0 is still reserved, no functionality change intended. llvm-svn: 177423