summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
* Fix PR14364.Bill Schmidt2013-02-244-20/+27
| | | | | | | | | | | This removes a const_cast hack from PPCRegisterInfo::hasReservedSpillSlot(). The proper place to save the frame index for the CR spill slot is in the PPCFunctionInfo object, not the PPCRegisterInfo object. No new test cases, as this just reimplements existing function. Existing tests such as test/CodeGen/PowerPC/crsave.ll are sufficient. llvm-svn: 175998
* Move the eliminateCallFramePseudoInstr method from TargetRegisterInfoEli Bendersky2013-02-215-44/+46
| | | | | | | | | | | | | | | to TargetFrameLowering, where it belongs. Incidentally, this allows us to delete some duplicated (and slightly different!) code in TRI. There are potentially other layering problems that can be cleaned up as a result, or in a similar manner. The refactoring was OK'd by Anton Korobeynikov on llvmdev. Note: this touches the target interfaces, so out-of-tree targets may be affected. llvm-svn: 175788
* Trivial cleanupBill Schmidt2013-02-211-2/+2
| | | | llvm-svn: 175771
* Large code model support for PowerPC.Bill Schmidt2013-02-215-15/+20
| | | | | | | | | | | Large code model is identical to medium code model except that the addis/addi sequence for "local" accesses is never used. All accesses use the addis/ld sequence. The coding changes are straightforward; most of the patch is taken up with creating variants of the medium model tests for large model. llvm-svn: 175767
* Code review cleanup for r175697Bill Schmidt2013-02-211-11/+7
| | | | llvm-svn: 175739
* PPCDAGToDAGISel::PostprocessISelDAG()Bill Schmidt2013-02-211-0/+155
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the PPCDAGToDAGISel::PostprocessISelDAG virtual method to perform post-selection peephole optimizations on the DAG representation. One optimization is implemented here: folds to clean up complex addressing expressions for thread-local storage and medium code model. It will also be useful for large code model sequences when those are added later. I originally thought about doing this on the MI representation prior to register assignment, but it's difficult to do effective global dead code elimination at that point. DCE is trivial on the DAG representation. A typical example of a candidate code sequence in assembly: addis 3, 2, globalvar@toc@ha addi 3, 3, globalvar@toc@l lwz 5, 0(3) When the final instruction is a load or store with an immediate offset of zero, the offset from the add-immediate can replace the zero, provided the relocation information is carried along: addis 3, 2, globalvar@toc@ha lwz 5, globalvar@toc@l(3) Since the addi can in general have multiple uses, we need to only delete the instruction when the last use is removed. llvm-svn: 175697
* Relocation enablement for PPC DAG postprocessing passBill Schmidt2013-02-214-11/+37
| | | | llvm-svn: 175693
* Update TargetLowering ivars for name policy.Jim Grosbach2013-02-201-7/+7
| | | | | | | | | | | http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly ivars should be camel-case and start with an upper-case letter. A few in TargetLowering were starting with a lower-case letter. No functional change intended. llvm-svn: 175667
* Additional fixes for bug 15155.Bill Schmidt2013-02-203-35/+64
| | | | | | | | This handles the cases where the 6-bit splat element is odd, converting to a three-instruction sequence to add or subtract two splats. With this fix, the XFAIL in test/CodeGen/PowerPC/vec_constants.ll is removed. llvm-svn: 175663
* Fix bug 14779 for passing anonymous aggregates [patch by Kai Nacke].Bill Schmidt2013-02-201-1/+7
| | | | | | | | The PPC backend doesn't handle these correctly. This patch uses logic similar to that in the X86 and ARM backends to track these arguments properly. llvm-svn: 175635
* Fix PR15155: lost vadd/vsplat optimization.Bill Schmidt2013-02-203-8/+43
| | | | | | | | | | | | | | During lowering of a BUILD_VECTOR, we look for opportunities to use a vector splat. When the splatted value fits in 5 signed bits, a single splat does the job. When it doesn't fit in 5 bits but does fit in 6, and is an even value, we can splat on half the value and add the result to itself. This last optimization hasn't been working recently because of improved constant folding. To circumvent this, create a pseudo VADD_SPLAT that can be expanded during instruction selection. llvm-svn: 175632
* Add missing #include.Jakub Staszak2013-02-201-0/+1
| | | | llvm-svn: 175583
* Make the visibility of LLVMPPCCompilationCallback work with GCC.Benjamin Kramer2013-02-171-1/+1
| | | | | | | | | | | | GCC warns about the attribute being ignored if it occurs after void*. There seems to be some kind of incompatibility between clang and gcc here, but I can't fathom who's right. void* LLVM_LIBRARY_VISIBILITY foo(); // clang: hidden, gcc: default LLVM_LIBRARY_VISIBILITY void *bar(); // clang: hidden, gcc: hidden void LLVM_LIBRARY_VISIBILITY qux(); // clang: hidden, gcc: hidden llvm-svn: 175394
* Give these callbacks hidden visibility. It is better to not export them moreRafael Espindola2013-02-151-3/+4
| | | | | | | than we need to and some ELF linkers complain about directly accessing symbols with default visibility. llvm-svn: 175268
* Don't make assumptions about the mangling of static functions in extern "C"Rafael Espindola2013-02-151-7/+7
| | | | | | | | blocks. We still don't have consensus if we should try to change clang or the standard, but llvm should work with compilers that implement the current standard and mangle those functions. llvm-svn: 175267
* Revert r175120 and r175121. Clang is producing the expected asm names again.Rafael Espindola2013-02-141-1/+1
| | | | llvm-svn: 175133
* Don't asume that a static function in an extern "C" block will not be mangled.Rafael Espindola2013-02-141-1/+1
| | | | | | | | | | | | | | Since functions with internal linkage don't have language linkage, it is valid to overload them: extern "C" { static int foo(); static int foo(int); } So we mangle them. llvm-svn: 175120
* Add registration for PPC-specific passes to allow the IR to be dumpedKrzysztof Parzyszek2013-02-133-3/+41
| | | | | | via -print-after-all. llvm-svn: 175058
* Refine fix to bug 15041.Bill Schmidt2013-02-081-18/+17
| | | | | | | | | Thanks to help from Nadav and Hal, I have a more reasonable (and even correct!) approach. This specifically penalizes the insertelement and extractelement operations for the performance hit that will occur on PowerPC processors. llvm-svn: 174725
* Constrain PowerPC autovectorization to fix bug 15041.Bill Schmidt2013-02-071-0/+19
| | | | | | | | | | | | | Certain vector operations don't vectorize well with the current PowerPC implementation. Element insert/extract performs poorly without VSX support because Altivec requires going through memory. SREM, UREM, and VSELECT all produce bad scalar code. There's a lot of work to do for the cost model before autovectorization will be tuned well, and this is not an attempt to address the larger problem. llvm-svn: 174660
* PPC calling convention cleanup.Bill Schmidt2013-02-062-75/+46
| | | | | | | | Most of PPCCallingConv.td is used only by the 32-bit SVR4 ABI. Rename things to clarify this. Also delete some code that's been commented out for a long time. llvm-svn: 174526
* Move MRI liveouts to PowerPC return instructions.Jakob Stoklund Olesen2013-02-051-21/+9
| | | | llvm-svn: 174409
* Avoid using MRI::liveout_iterator for computing VRSAVEs.Jakob Stoklund Olesen2013-02-051-6/+15
| | | | | | | | | | The liveout lists are about to be removed from MRI, this is the only place they were used after register allocation. Get the live out V registers directly from the return instructions instead. llvm-svn: 174399
* Disable a couple more vector splat optimizations on PPC.Benjamin Kramer2013-02-041-2/+4
| | | | | | | I didn't see those because the test case used "not grep". FileCheck the test and XFAIL it, preserving the old optimization, so this can be fixed eventually. llvm-svn: 174330
* SelectionDAG: Teach FoldConstantArithmetic how to deal with vectors.Benjamin Kramer2013-02-041-0/+5
| | | | | | | | | | | | | | | | | This required disabling a PowerPC optimization that did the following: input: x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16> lowered to: tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8> x = ADD tmp, tmp The add now gets folded immediately and we're back at the BUILD_VECTOR we started from. I don't see a way to fix this currently so I left it disabled for now. Fix some trivially foldable X86 tests too. llvm-svn: 174325
* PPCDarwinAsmPrinter::EmitStartOfAsmFile(): Add checking range in ↵NAKAMURA Takumi2013-02-041-1/+4
| | | | | | CPUDirectives[]. llvm-svn: 174298
* PPCDarwinAsmPrinter::EmitStartOfAsmFile(): Add possible elements in ↵NAKAMURA Takumi2013-02-041-0/+5
| | | | | | CPUDirectives[]. llvm-svn: 174297
* Add notes about future PowerPC featuresBill Schmidt2013-02-011-0/+17
| | | | llvm-svn: 174232
* LLVM enablement for some older PowerPC CPUsBill Schmidt2013-02-012-0/+25
| | | | llvm-svn: 174230
* [PEI] Pass the frame index operand number to the eliminateFrameIndex function.Chad Rosier2013-01-312-14/+9
| | | | | | | Each target implementation was needlessly recomputing the index. Part of rdar://13076458 llvm-svn: 174083
* PPC QPX requires a 32-byte aligned stackHal Finkel2013-01-302-1/+8
| | | | | | | On systems which support the QPX vector instructions, the stack must be 32-byte aligned. llvm-svn: 173993
* Initialize hasQPX in PPCSubtargetHal Finkel2013-01-301-0/+1
| | | | | | This should have gone in with r173973. llvm-svn: 173984
* Add definitions for the PPC a2q core marked as having QPX availableHal Finkel2013-01-302-0/+9
| | | | | | | | This is the first commit of a large series which will add support for the QPX vector instruction set to the PowerPC backend. This instruction set is used on the IBM Blue Gene/Q supercomputers. llvm-svn: 173973
* Teach SDISel to combine fsin / fcos into a fsincos node if the followingEvan Cheng2013-01-291-0/+2
| | | | | | | | | | | | | | | | | | conditions are met: 1. They share the same operand and are in the same BB. 2. Both outputs are used. 3. The target has a native instruction that maps to ISD::FSINCOS node or the target provides a sincos library call. Implemented the generic optimization in sdisel and enabled it for Mac OSX. Also added an additional optimization for x86_64 Mac OSX by using an alternative entry point __sincos_stret which returns the two results in xmm0 / xmm1. rdar://13087969 PR13204 llvm-svn: 173755
* Add isBGQ method to PPCSubtargetHal Finkel2013-01-291-0/+2
| | | | | | This function will be used in future commits. llvm-svn: 173729
* Remove unused variables, silences -Wunused-variableDmitri Gribenko2013-01-251-4/+2
| | | | llvm-svn: 173526
* Initial implementation of PPCTargetTransformInfoHal Finkel2013-01-255-0/+237
| | | | | | | | | | This provides a place to add customized operation cost information and control some other target-specific IR-level transformations. The only non-trivial logic in this checkin assigns a higher cost to unaligned loads and stores (covered by the included test case). llvm-svn: 173520
* More cleanup of PPC register definitions.Hal Finkel2013-01-251-64/+8
| | | | | | | Uses the new !add TableGen operator to do more cleanup of the PPC register definitions. llvm-svn: 173446
* Start cleanup of PPC register definitions using foreach loops.Hal Finkel2013-01-241-65/+7
| | | | | | | | | | | No functionality change intended. This captures the first two cases GPR32/64. For the others, we need an addition operator (if we have one, I've not yet found it). Based on a suggestion made by Tom Stellard in the AArch64 review! llvm-svn: 173366
* Fix powerpc test failure - forgot to initialize stack slot size for ↵Eli Bendersky2013-01-231-2/+3
| | | | | | PPCLinuxMCAsmInfo llvm-svn: 173275
* Clean up assignment of CalleeSaveStackSlotSize: get rid of the default and ↵Eli Bendersky2013-01-231-2/+3
| | | | | | explicitly set this in every target that needs to change it from the default. llvm-svn: 173270
* Sort all of the includes. Several files got checked in with mis-sortedChandler Carruth2013-01-191-1/+1
| | | | | | includes. llvm-svn: 172891
* This patch fixes PR13626 by providing i128 support in the returnBill Schmidt2013-01-171-0/+1
| | | | | | | calling convention. 128-bit integers are now properly returned in GPR3 and GPR4 on PowerPC. llvm-svn: 172745
* This patch fixes the PPC calling convention to handle returns ofBill Schmidt2013-01-171-2/+2
| | | | | | | | | _Complex float and _Complex long double, by simply increasing the number of floating point registers available for return values. The test case verifies that the correct registers are loaded. llvm-svn: 172733
* PowerPC: EH adjustmentsAdhemerval Zanella2013-01-091-0/+19
| | | | | | | | | This patch adjust the r171506 to make all DWARF enconding pc-relative for PPC64. It also adds the R_PPC64_REL32 relocation handling in MCJIT (since the eh_frame will not generate PIC-relative relocation) and also adds the emission of stubs created by the TTypeEncoding. llvm-svn: 171979
* These functions have default arguments of 0 for the last arg. UseEric Christopher2013-01-091-6/+6
| | | | | | them. llvm-svn: 171933
* Renamed MCInstFragment to MCRelaxableFragment and added some comments.Eli Bendersky2013-01-081-1/+1
| | | | | | No change in functionality. llvm-svn: 171822
* This patch addresses bug 14678 by fixing two problems in medium code modelBill Schmidt2013-01-072-12/+29
| | | | | | | | | code generation. Variables addressed through a GlobalAlias were not being handled, and variables with available_externally linkage were treated incorrectly. The patch contains two new tests to verify the correct code generation for these cases. llvm-svn: 171778
* Switch TargetTransformInfo from an immutable analysis pass that requiresChandler Carruth2013-01-072-11/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does. This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it. The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTranformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least on implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis *pass*, allowing all but tho NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API. The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator. The third step of the conversion adds support to all TargetMachines to register custom analysis passes. This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes. The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations. The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, a helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract. The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them. Now that TTI is a relatively normal analysis group, two things become straightforward. First, we can sink it into lib/Analysis which is a more natural layer for it to live. Second, clients of this interface can depend on it *always* being available which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits, this one is clearly big enough. Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks. Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681
* PowerPC: Fix eh_frame relocation for PIC Adhemerval Zanella2013-01-041-0/+5
| | | | | | | | | This patch fixes the PPC eh_frame definitions for the personality and frame unwinding for PIC objects. It makes PIC build correctly creates relative relocations in the '.rela.eh_frame' segments and thus avoiding a text relocation that generates a DT_TEXTREL segments in link phase. llvm-svn: 171506
OpenPOWER on IntegriCloud