bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[PowerPC] [Peephole] fold frame offset by using index form to save add.	czhengsz	2019-10-25	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	renamable $x6 = ADDI8 $x1, -80 ;;; 0 is replaced with -80 renamable $x6 = ADD8 killed renamable $x6, renamable $x5 STW killed renamable $r3, 4, killed renamable $x6 :: (store 4 into %ir.14, !tbaa !2) After PEI there is a peephole opt opportunity to combine above -80 in ADDI8 with 4 in the STW to eliminate unnecessary ADD8. Expected result: renamable $x6 = ADDI8 $x1, -76 STWX killed renamable $r3, renamable $x5, killed renamable $x6 :: (store 4 into %ir.6, !tbaa !2) Reviewed by: stefanp Differential Revision: https://reviews.llvm.org/D66329
*	CodeGen: Introduce a class for registers	Matt Arsenault	2019-06-24	1	-2/+2
\| \| \| \| \| \| \| \| \|	Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191
*	Include what you use in PPCRegisterInfo.h	Dmitri Gribenko	2019-06-04	1	-1/+2
\| \| \| \|	llvm-svn: 362475
*	[PowerPC] Move the stack pointer update instruction later in the prologue ↵	Stefan Pintilie	2019-02-28	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	and earlier in the epilogue. Move the stdu instruction in the prologue and epilogue. This should provide a small performance boost in functions that are able to do this. I've kept this change rather conservative at the moment and functions with frame pointers or base pointers will not try to move the stack pointer update. Differential Revision: https://reviews.llvm.org/D42590 llvm-svn: 355085
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[TargetRegisterInfo] Remove temporary hook enableMultipleCopyHints()	Jonas Paulsson	2018-10-05	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Finally all targets are enabling multiple regalloc hints, so the hook to disable this can now be removed. NFC. Review: Simon Pilgrim https://reviews.llvm.org/D52316 llvm-svn: 343851
*	[PowerPC] [NFC] Refactor code for printing register operands	Nemanja Ivanovic	2018-09-27	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have an unfortunate situation in our back end where we have to keep pairs of functions synchronized. Needless to say that this is not an ideal situation as it is very difficult to enforce. Even without bugs, it's annoying to have to do the same thing in two places. This patch just refactors the code so that the two pairs of those functions that pertain to printing register operands are unified: - stripRegisterPrefix() - this just removes the letter prefixes from registers for the InstrPrinter and AsmPrinter. This patch provides this as a static member of PPCRegisterInfo - Handling of PPCII::UseVSXReg - there are 3 places where we do something special for instructions with that flag set. Each of those places does its own checking of this flag and implements code customization. Any changes to how we print/encode VSX/VMX registers require modifying all 3 places. This patch unifies this into a static function in PPCInstrInfo that returns the register number adjusted as needed. Differential revision: https://reviews.llvm.org/D52467 llvm-svn: 343195
*	[PowerPC] Return true in enableMultipleCopyHints().	Jonas Paulsson	2018-01-31	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Nemanja Ivanovic llvm-svn: 323858
*	[MachineLICM] Hoist TOC-based address instructions	Lei Huang	2017-06-15	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add condition for MachineLICM to safely hoist instructions that utilize non constant registers that are reserved. On PPC, global variable access is done through the table of contents (TOC) which is always in register X2. The ABI reserves this register in any functions that have calls or access global variables. A call through a function pointer involves saving, changing and restoring this register around the call and thus MachineLICM does not consider it to be invariant. We can however guarantee the register is preserved across the call and thus is invariant. Differential Revision: https://reviews.llvm.org/D33562 llvm-svn: 305490
*	Target: Remove unused entities.	Peter Collingbourne	2016-10-09	1	-1/+1
\| \| \| \|	llvm-svn: 283690
*	CXX_FAST_TLS calling convention: performance improvement for PPC64	Chuang-Yu Cheng	2016-04-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the same change on PPC64 as r255821 on AArch64. I have even borrowed his commit message. The access function has a short entry and a short exit, the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by register allcoator. Register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, it will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D17533 llvm-svn: 265781
*	Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics.	Yury Gribov	2015-12-01	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset from native stack pointer to the address of the most recent dynamic alloca on the caller's stack. These intrinsics are intendend for use in combination with @llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning routines. Patch by Max Ostapenko. Differential Revision: http://reviews.llvm.org/D14983 llvm-svn: 254404
*	[WinEH] Mark funclet entries and exits as clobbering all registers	Reid Kleckner	2015-11-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI. Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer Subscribers: MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D14407 llvm-svn: 252318
*	Untabify.	NAKAMURA Takumi	2015-09-22	1	-3/+3
\| \| \| \|	llvm-svn: 248264
*	Reformat blank lines.	NAKAMURA Takumi	2015-09-22	1	-2/+2
\| \| \| \|	llvm-svn: 248263
*	Targets: commonize some stack realignment code	JF Bastien	2015-07-20	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch does the following: * Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`. * Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute. Multiple targets duplicated the same `needsStackRealignment` code: - Aarch64. - ARM. - Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has. - PowerPC. - WebAssembly. - x86 almost: has an extra `-force-align-stack` option, which the default implementation now has. The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects: - AMDGPU - BPF - CppBackend - MSP430 - NVPTX - Sparc - SystemZ - XCore - Out-of-tree targets This is a breaking change! `make check` passes. The only implementation of the `virtual` function (besides the slight different in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation. `needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone. Reviewers: sunfish Subscribers: aemerson, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11160 llvm-svn: 242727
*	Add Hardware Transactional Memory (HTM) Support	Kit Barton	2015-03-25	1	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07 (POWER8). The intrinsic support is based on GCC one [1], but currently only the 'PowerPC HTM Low Level Built-in Function' are implemented. The HTM instructions follows the RC ones and the transaction initiation result is set on RC0 (with exception of tcheck). Currently approach is to create a register copy from CR0 to GPR and comapring. Although this is suboptimal, since the branch could be taken directly by comparing the CR0 value, it generates code correctly on both test and branch and just return value. A possible future optimization could be elimitate the MFCR instruction to branch directly. The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on powerpc64 and powerpc64le. This is send along a clang patch to enabled the builtins and option switch. [1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html Phabricator Review: http://reviews.llvm.org/D8247 llvm-svn: 233204
*	[ARM] Fix handling of thumb1 out-of-range frame offsets	John Brawn	2015-03-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LocalStackSlotPass assumes that isFrameOffsetLegal doesn't change its answer when the base register changes. Unfortunately this isn't true in thumb1, where SP-based loads allow a larger offset than non-SP-based loads, and this causes the base register reuse code to generate instructions that are unencodable, causing an assertion failure. Solve this by adding a BaseReg parameter to isFrameOffsetLegal, which ARMBaseRegisterInfo can then make use of to give the correct answer. Differential Revision: http://reviews.llvm.org/D8419 llvm-svn: 232825
*	Remove some unnecessary forward declarations and put a couple more	Eric Christopher	2015-03-12	1	-4/+0
\| \| \| \| \| \|	where they're supposed to reside. llvm-svn: 232014
*	Remove the need to cache the subtarget in the PowerPC TargetRegisterInfo	Eric Christopher	2015-03-12	1	-2/+2
\| \| \| \| \| \| \|	classes. Replace it with a cache to the TargetMachine and use that where applicable at the moment. llvm-svn: 232002
*	Have getCallPreservedMask and getThisCallPreservedMask take a	Eric Christopher	2015-03-11	1	-1/+2
\| \| \| \| \| \| \|	MachineFunction argument so that we can grab subtarget specific features off of it. llvm-svn: 231979
*	One more getCalleeSavedRegs prototype with nullptr.	Eric Christopher	2015-03-11	1	-2/+1
\| \| \| \|	llvm-svn: 231977
*	Have TargetRegisterInfo::getLargestLegalSuperClass take a	Eric Christopher	2015-03-10	1	-2/+3
\| \| \| \| \| \| \|	MachineFunction argument so that it can look up the subtarget rather than using a cached one in some Targets. llvm-svn: 231888
*	Revert "r225811 - Revert "r225808 - [PowerPC] Add StackMap/PatchPoint support""	Hal Finkel	2015-01-14	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This re-applies r225808, fixed to avoid problems with SDAG dependencies along with the preceding fix to ScheduleDAGSDNodes::RegDefIter::InitNodeNumDefs. These problems caused the original regression tests to assert/segfault on many (but not all) systems. Original commit message: This commit does two things: 1. Refactors PPCFastISel to use more of the common infrastructure for call lowering (this lets us take advantage of this common code for lowering some common intrinsics, stackmap/patchpoint among them). 2. Adds support for stackmap/patchpoint lowering. For the most part, this is very similar to the support in the AArch64 target, with the obvious differences (different registers, NOP instructions, etc.). The test cases are adapted from the AArch64 test cases. One difference of note is that the patchpoint call sequence takes 24 bytes, so you can't use less than that (on AArch64 you can go down to 16). Also, as noted in the docs, we take the patchpoint address to be the actual code address (assuming the call is local in the TOC-sharing sense), which should yield higher performance than generating the full cross-DSO indirect-call sequence and is likely just as useful for JITed code (if not, we'll change it). StackMaps and Patchpoints are still marked as experimental, and so this support is doubly experimental. So go ahead and experiment! llvm-svn: 225909
*	Revert "r225808 - [PowerPC] Add StackMap/PatchPoint support"	Hal Finkel	2015-01-13	1	-2/+0
\| \| \| \| \| \| \|	Reverting this while I investiage buildbot failures (segfaulting in GetCostForDef at ScheduleDAGRRList.cpp:314). llvm-svn: 225811
*	[PowerPC] Add StackMap/PatchPoint support	Hal Finkel	2015-01-13	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit does two things: 1. Refactors PPCFastISel to use more of the common infrastructure for call lowering (this lets us take advantage of this common code for lowering some common intrinsics, stackmap/patchpoint among them). 2. Adds support for stackmap/patchpoint lowering. For the most part, this is very similar to the support in the AArch64 target, with the obvious differences (different registers, NOP instructions, etc.). The test cases are adapted from the AArch64 test cases. One difference of note is that the patchpoint call sequence takes 24 bytes, so you can't use less than that (on AArch64 you can go down to 16). Also, as noted in the docs, we take the patchpoint address to be the actual code address (assuming the call is local in the TOC-sharing sense), which should yield higher performance than generating the full cross-DSO indirect-call sequence and is likely just as useful for JITed code (if not, we'll change it). StackMaps and Patchpoints are still marked as experimental, and so this support is doubly experimental. So go ahead and experiment! llvm-svn: 225808
*	Canonicalize header guards into a common format.	Benjamin Kramer	2014-08-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
*	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add ↵	Craig Topper	2014-04-29	1	-19/+21
\| \| \| \| \| \|	'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. PowerPC edition llvm-svn: 207504
*	[C++] Use 'nullptr'.	Craig Topper	2014-04-28	1	-2/+2
\| \| \| \|	llvm-svn: 207394
*	Make consistent use of MCPhysReg instead of uint16_t throughout the tree.	Craig Topper	2014-04-04	1	-1/+1
\| \| \| \|	llvm-svn: 205610
*	Simplify resolveFrameIndex() signature.	Jim Grosbach	2014-04-02	1	-2/+2
\| \| \| \| \| \| \| \|	Just pass a MachineInstr reference rather than an MBB iterator. Creating a MachineInstr& is the first thing every implementation did anyway. llvm-svn: 205453
*	[PowerPC] Add subregister classes for f64 VSX values	Hal Finkel	2014-03-29	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	We had stored both f64 values and v2f64, etc. values in the VSX registers. This worked, but was suboptimal because we would always spill 16-byte values even through we almost always had scalar 8-byte values. This resulted in an increase in stack-size use, extra memory bandwidth, etc. To fix this, I've added 64-bit subregisters of the Altivec registers, and combined those with the existing scalar floating-point registers to form a class of VSX scalar floating-point registers. The ABI code has also been enhanced to use this register class and some other necessary improvements have been made. llvm-svn: 205075
*	Add CR-bit tracking to the PowerPC backend for i1 values	Hal Finkel	2014-02-28	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change enables tracking i1 values in the PowerPC backend using the condition register bits. These bits can be treated on PowerPC as separate registers; individual bit operations (and, or, xor, etc.) are supported. Tracking booleans in CR bits has several advantages: - Reduction in register pressure (because we no longer need GPRs to store boolean values). - Logical operations on booleans can be handled more efficiently; we used to have to move all results from comparisons into GPRs, perform promoted logical operations in GPRs, and then move the result back into condition register bits to be used by conditional branches. This can be very inefficient, because the throughput of these CR <-> GPR moves have high latency and low throughput (especially when other associated instructions are accounted for). - On the POWER7 and similar cores, we can increase total throughput by using the CR bits. CR bit operations have a dedicated functional unit. Most of this is more-or-less mechanical: Adjustments were needed in the calling-convention code, support was added for spilling/restoring individual condition-register bits, and conditional branch instruction definitions taking specific CR bits were added (plus patterns and code for generating bit-level operations). This is enabled by default when running at -O2 and higher. For -O0 and -O1, where the ability to debug is more important, this feature is disabled by default. Individual CR bits do not have assigned DWARF register numbers, and storing values in CR bits makes them invisible to the debugger. It is critical, however, that we don't move i1 values that have been promoted to larger values (such as those passed as function arguments) into bit registers only to quickly turn around and move the values back into GPRs (such as happens when values are returned by functions). A pair of target-specific DAG combines are added to remove the trunc/extends in: trunc(binary-ops(binary-ops(zext(x), zext(y)), ...) and: zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...) In short, we only want to use CR bits where some of the i1 values come from comparisons or are used by conditional branches or selects. To put it another way, if we can do the entire i1 computation in GPRs, then we probably should (on the POWER7, the GPR-operation throughput is higher, and for all cores, the CR <-> GPR moves are expensive). POWER7 test-suite performance results (from 10 runs in each configuration): SingleSource/Benchmarks/Misc/mandel-2: 35% speedup MultiSource/Benchmarks/Prolangs-C++/city/city: 21% speedup MultiSource/Benchmarks/MiBench/automotive-susan: 23% speedup SingleSource/Benchmarks/CoyoteBench/huffbench: 13% speedup SingleSource/Benchmarks/Misc-C++/Large/sphereflake: 13% speedup SingleSource/Benchmarks/Misc-C++/mandel-text: 10% speedup SingleSource/Benchmarks/Misc-C++-EH/spirit: 10% slowdown MultiSource/Applications/lemon/lemon: 8% slowdown llvm-svn: 202451
*	Re-sort all of the includes with ./utils/sort_includes.py so that	Chandler Carruth	2014-01-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	subsequent changes are easier to review. About to fix some layering issues, and wanted to separate out the necessary churn. Also comment and sink the include of "Windows.h" in three .inc files to match the usage in Memory.inc. llvm-svn: 198685
*	Remove getEHExceptionRegister and getEHHandlerRegister.	Rafael Espindola	2013-10-07	1	-4/+0
\| \| \| \| \| \|	They haven't been used for a long time. Patch by MathOnNapkins. llvm-svn: 192099
*	PPC: Implement base pointer and stack realignment	Hal Finkel	2013-07-17	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This builds on some frame-lowering code that has existed since 2005 (r24224) but was disabled in 2008 (r48188) because it needed base pointer support to function correctly. This implementation follows the strategy suggested by Dale Johannesen in r48188 where the following comment was added: This does not currently work, because the delta between old and new stack pointers is added to offsets that reference incoming parameters after the prolog is generated, and the code that does that doesn't handle a variable delta. You don't want to do that anyway; a better approach is to reserve another register that retains to the incoming stack pointer, and reference parameters relative to that. And now we do exactly that. If we don't need a frame pointer, then we use r31 as a base pointer. If we do need a frame pointer, then we use r30 as a base pointer. The base pointer retains the value of the stack pointer before it was decremented in the prologue. We then use the base pointer to resolve all negative frame indicies. The basic scheme follows that for base pointers in the X86 backend. We use a base pointer when we need to dynamically realign the incoming stack pointer. This currently applies only to static objects (dynamic allocas with large alignments, and base-pointer support in SjLj lowering will come in future commits). llvm-svn: 186478
*	Don't cache the instruction and register info from the TargetMachine, because	Bill Wendling	2013-06-07	1	-2/+1
\| \| \| \| \| \| \| \|	the internals of TargetMachine could change. No functionality change intended. llvm-svn: 183494
*	Use virtual base registers on PPC	Hal Finkel	2013-04-09	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \|	On PowerPC, non-vector loads and stores have r+i forms; however, in functions with large stack frames these were not being used to access slots far from the stack pointer because such slots were out of range for the signed 16-bit immediate offset field. This increases register pressure because we need a separate register for each offset (when the r+r form is used). By enabling virtual base registers, we can deal with large stack frames without unduly increasing register pressure. llvm-svn: 179105
*	Cleanup ImmToIdxMap and noImmForm in PPCRegisterInfo	Hal Finkel	2013-03-31	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	ImmToIdxMap should be a DenseMap (not a std::map) because there is no ordering requirement. Also, we don't need a separate list of instructions for noImmForm in eliminateFrameIndex, because this list is essentially the complement of the keys in ImmToIdxMap. No functionality change intended. llvm-svn: 178450
*	Cleanup some unused reg. scavenger parameters in PPCRegisterInfo	Hal Finkel	2013-03-23	1	-10/+9
\| \| \| \| \| \| \| \| \| \| \|	These spilling functions will eventually make use of the register scavenger, however, they'll do so by taking advantage of PEI's virtual-register-based delayed scavenging mechanism. As a result, these function parameters will not be used, and can be removed. No functionality change intended. llvm-svn: 177827
*	Implement builtin_{setjmp/longjmp} on PPC	Hal Finkel	2013-03-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This implements SJLJ lowering on PPC, making the Clang functions __builtin_{setjmp/longjmp} functional on PPC platforms. The implementation strategy is similar to that on X86, with the exception that a branch-and-link variant is used to get the right jump address. Credit goes to Bill Schmidt for suggesting the use of the unconditional bcl form (instead of the regular bl instruction) to limit return-address-cache pollution. Benchmarking the speed at -O3 of: static jmp_buf env_sigill; void foo() { __builtin_longjmp(env_sigill,1); } main() { ... for (int i = 0; i < c; ++i) { if (__builtin_setjmp(env_sigill)) { goto done; } else { foo(); } done:; } ... } vs. the same code using the libc setjmp/longjmp functions on a P7 shows that this builtin implementation is ~4x faster with Altivec enabled and ~7.25x faster with Altivec disabled. This comparison is somewhat unfair because the libc version must also save/restore the VSX registers which we don't yet support. llvm-svn: 177666
*	Add support for spilling VRSAVE on PPC	Hal Finkel	2013-03-21	1	-0/+5
\| \| \| \| \| \| \| \| \| \|	Although there is only one Altivec VRSAVE register, it is a member of a register class, and we need the ability to spill it. Because this register is normally callee-preserved and handled by special code this has never before been necessary. However, this capability will be required by a forthcoming commit adding SjLj support. llvm-svn: 177654
*	Remove PPC avoidWriteAfterWrite callback	Hal Finkel	2013-03-16	1	-2/+0
\| \| \| \| \| \| \| \| \|	As a follow-up to r158719, remove PPCRegisterInfo::avoidWriteAfterWrite. Jakob pointed out in response to r158719 that this callback is currently unused and so this has no effect (and the speedups that I thought that I had observed as a result of implementing this function must have been noise). llvm-svn: 177228
*	Use frame-index scavenging for PPC register spilling	Hal Finkel	2013-03-14	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make requiresFrameIndexScavenging return true, and create virtual registers in the spilling code instead of using the register scavenger directly. This makes the target-level code simpler, and importantly, delays the scavenging until after callee-saved register processing (which will be important for later changes). Also cleans up trackLivenessAfterRegAlloc (makes it inline in the header with the other related functions). This makes it clear that it always returns true. No functionality change intended. llvm-svn: 177107
*	PPC should always use the register scavenger for CR spilling	Hal Finkel	2013-03-12	1	-2/+3
\| \| \| \| \| \| \| \|	This removes the -disable-ppc[32\|64]-regscavenger options; the code that uses the register scavenger has been working well (and has been the default) for some time, and we don't need options to enable the old (broken) CR spilling code. llvm-svn: 176865
*	Fix PR14364.	Bill Schmidt	2013-02-24	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \|	This removes a const_cast hack from PPCRegisterInfo::hasReservedSpillSlot(). The proper place to save the frame index for the CR spill slot is in the PPCFunctionInfo object, not the PPCRegisterInfo object. No new test cases, as this just reimplements existing function. Existing tests such as test/CodeGen/PowerPC/crsave.ll are sufficient. llvm-svn: 175998
*	Move the eliminateCallFramePseudoInstr method from TargetRegisterInfo	Eli Bendersky	2013-02-21	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to TargetFrameLowering, where it belongs. Incidentally, this allows us to delete some duplicated (and slightly different!) code in TRI. There are potentially other layering problems that can be cleaned up as a result, or in a similar manner. The refactoring was OK'd by Anton Korobeynikov on llvmdev. Note: this touches the target interfaces, so out-of-tree targets may be affected. llvm-svn: 175788
*	[PEI] Pass the frame index operand number to the eliminateFrameIndex function.	Chad Rosier	2013-01-31	1	-1/+2
\| \| \| \| \| \| \|	Each target implementation was needlessly recomputing the index. Part of rdar://13076458 llvm-svn: 174083
*	Change unsigned to uint32_t to match base class declaration and other targets.	Craig Topper	2012-09-16	1	-1/+1
\| \| \| \|	llvm-svn: 164001
*	This patch corrects logic in PPCFrameLowering for save and restore of ↵	Roman Divacky	2012-09-12	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nonvolatile condition register fields across calls under the SVR4 ABIs. * With the 64-bit ABI, the save location is at a fixed offset of 8 from the stack pointer. The frame pointer cannot be used to access this portion of the stack frame since the distance from the frame pointer may change with alloca calls. * With the 32-bit ABI, the save location is just below the general register save area, and is accessed via the frame pointer like the rest of the save areas. This is an optional slot, so it must only be created if any of CR2, CR3, and CR4 were modified. * For both ABIs, save/restore logic is generated only if one of the nonvolatile CR fields were modified. I also took this opportunity to clean up an extra FIXME in PPCFrameLowering.h. Save area offsets for 32-bit GPRs are meaningless for the 64-bit ABI, so I removed them for correctness and efficiency. Fixes PR13708 and partially also PR13623. It lets us enable exception handling on PPC64. Patch by William J. Schmidt! llvm-svn: 163713