summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC/PPC.h
Commit message (Collapse)AuthorAgeFilesLines
...
* [PowerPC] Initial support for the VSX instruction setHal Finkel2014-03-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VSX is an ISA extension supported on the POWER7 and later cores that enhances floating-point vector and scalar capabilities. Among other things, this adds <2 x double> support and generally helps to reduce register pressure. The interesting part of this ISA feature is the register configuration: there are 64 new 128-bit vector registers, the 32 of which are super-registers of the existing 32 scalar floating-point registers, and the second 32 of which overlap with the 32 Altivec vector registers. This makes things like vector insertion and extraction tricky: this can be free but only if we force a restriction to the right register subclass when needed. A new "minipass" PPCVSXCopy takes care of this (although it could do a more-optimal job of it; see the comment about unnecessary copies below). Please note that, currently, VSX is not enabled by default when targeting anything because it is not yet ready for that. The assembler and disassembler are fully implemented and tested. However: - CodeGen support causes miscompiles; test-suite runtime failures: MultiSource/Benchmarks/FreeBench/distray/distray MultiSource/Benchmarks/McCat/08-main/main MultiSource/Benchmarks/Olden/voronoi/voronoi MultiSource/Benchmarks/mafft/pairlocalalign MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4 SingleSource/Benchmarks/CoyoteBench/almabench SingleSource/Benchmarks/Misc/matmul_f64_4x4 - The lowering currently falls back to using Altivec instructions far more than it should. Worse, there are some things that are scalarized through the stack that shouldn't be. - A lot of unnecessary copies make it past the optimizers, and this needs to be fixed. - Many more regression tests are needed. Normally, I'd fix these things prior to committing, but there are some students and other contributors who would like to work this, and so it makes sense to move this development process upstream where it can be subject to the regular code-review procedures. llvm-svn: 203768
* [PowerPC] Always use "assembler dialect" 1Ulrich Weigand2013-07-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A setting in MCAsmInfo defines the "assembler dialect" to use. This is used by common code to choose between alternatives in a multi-alternative GNU inline asm statement like the following: __asm__ ("{sfe|subfe} %0,%1,%2" : "=r" (out) : "r" (in1), "r" (in2)); The meaning of these dialects is platform specific, and GCC defines those for PowerPC to use dialect 0 for old-style (POWER) mnemonics and 1 for new-style (PowerPC) mnemonics, like in the example above. To be compatible with inline asm used with GCC, LLVM ought to do the same. Specifically, this means we should always use assembler dialect 1 since old-style mnemonics really aren't supported on any current platform. However, the current LLVM back-end uses: AssemblerDialect = 1; // New-Style mnemonics. in PPCMCAsmInfoDarwin, and AssemblerDialect = 0; // Old-Style mnemonics. in PPCLinuxMCAsmInfo. The Linux setting really isn't correct, we should be using new-style mnemonics everywhere. This is changed by this commit. Unfortunately, the setting of this variable is overloaded in the back-end to decide whether or not we are on a Darwin target. This is done in PPCInstPrinter (the "SyntaxVariant" is initialized from the MCAsmInfo AssemblerDialect setting), and also in PPCMCExpr. Setting AssemblerDialect to 1 for both Darwin and Linux no longer allows us to make this distinction. Instead, this patch uses the MCSubtargetInfo passed to createPPCMCInstPrinter to distinguish Darwin targets, and ignores the SyntaxVariant parameter. As to PPCMCExpr, this patch adds an explicit isDarwin argument that needs to be passed in by the caller when creating a target MCExpr. (To do so this patch implicitly also reverts commit 184441.) llvm-svn: 185858
* [PowerPC] Support @tls in the asm parserUlrich Weigand2013-07-051-1/+4
| | | | | | | | | | | | | | | | | | This adds support for the last missing construct to parse TLS-related assembler code: add 3, 4, symbol@tls The ADD8TLS currently hard-codes the @tls into the assembler string. This cannot be handled by the asm parser, since @tls is parsed as a symbol variant. This patch changes ADD8TLS to have the @tls suffix printed as symbol variant on output too, which allows us to remove the isCodeGenOnly marker from ADD8TLS. This in turn means that we can add a AsmOperand to accept @tls marked symbols on input. As a side effect, this means that the fixup_ppc_tlsreg fixup type is no longer necessary and can be merged into fixup_ppc_nofixup. llvm-svn: 185692
* [PowerPC] Rename some more VK_PPC_ enumsUlrich Weigand2013-06-211-8/+8
| | | | | | | | | | | | | This renames more VK_PPC_ enums, to make them more closely reflect the @modifier string they represent. This also prepares for adding a bunch of new VK_PPC_ enums in upcoming patches. For consistency, some MO_ flags related to VK_PPC_ enums are likewise renamed. No change in behaviour. llvm-svn: 184547
* [PowerPC] Remove unused parameterUlrich Weigand2013-06-201-1/+1
| | | | | | | The isDarwin parameter to the llvm::LowerPPCMachineInstrToMCInst routine is now no longer needed; remove it. llvm-svn: 184441
* Add a PPCCTRLoops verification passHal Finkel2013-05-201-0/+3
| | | | | | | | | | | | | | | | | | When asserts are enabled, this adds a verification pass for PPC counter-loop formation. Unfortunately, without sacrificing code quality, there is no better way of forming counter-based loops except at the (late) IR level. This means that we need to recognize, at the IR level, anything which might turn into a function call (or indirect branch). Because this is currently a finite set of things, and because SelectionDAG lowering is basic-block local, this can be done. Nevertheless, it is fragile, and failure results in a miscompile. This verification pass checks that all (reachable) counter-based branches are dominated by a loop mtctr instruction, and that no instructions in between clobber the counter register. If these conditions are not satisfied, then an ICE will be triggered. In short, this is to help us sleep better at night. llvm-svn: 182295
* Implement PPC counter loops as a late IR-level passHal Finkel2013-05-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The old PPCCTRLoops pass, like the Hexagon pass version from which it was derived, could only handle some simple loops in canonical form. We cannot directly adapt the new Hexagon hardware loops pass, however, because the Hexagon pass contains a fundamental assumption that non-constant-trip-count loops will contain a guard, and this is not always true (the result being that incorrect negative counts can be generated). With this commit, we replace the pass with a late IR-level pass which makes use of SE to calculate the backedge-taken counts and safely generate the loop-count expressions (including any necessary max() parts). This IR level pass inserts custom intrinsics that are lowered into the desired decrement-and-branch instructions. The most fragile part of this new implementation is that interfering uses of the counter register must be detected on the IR level (and, on PPC, this also includes any indirect branches in addition to function calls). Also, to make all of this work, we need a variant of the mtctr instruction that is marked as having side effects. Without this, machine-code level CSE, DCE, etc. illegally transform the resulting code. Hopefully, this can be improved in the future. This new pass is smaller than the original (and much smaller than the new Hexagon hardware loops pass), and can handle many additional cases correctly. In addition, the preheader-creation code has been copied from LoopSimplify, and after we decide on where it belongs, this code will be refactored so that it can be explicitly shared (making this implementation even smaller). The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for the new Hexagon pass. There are a few classes of loops that this pass does not transform (noted by FIXMEs in the files), but these deficiencies can be addressed within the SE infrastructure (thus helping many other passes as well). llvm-svn: 181927
* Generate PPC early conditional returnsHal Finkel2013-04-081-1/+2
| | | | | | | | | | | | | PowerPC has a conditional branch to the link register (return) instruction: BCLR. This should be used any time when we'd otherwise have a conditional branch to a return. This adds a small pass, PPCEarlyReturn, which runs just prior to the branch selection pass (and, importantly, after block placement) to generate these conditional returns when possible. It will also eliminate unconditional branches to returns (these happen rarely; most of the time these have already been tail duplicated by the time PPCEarlyReturn is invoked). This is a nice optimization for small functions that do not maintain a stack frame. llvm-svn: 179026
* PPC: Use HWEncoding and TRI->getEncodingValueHal Finkel2013-03-261-1/+0
| | | | | | | | | | | As pointed out by Jakob, we don't need to maintain a separate register-numbering table. Instead we should let TableGen generate the table for us from the information (already present) in PPCRegisterInfo.td. TRI->getEncodingValue is now used to access register-encoding values. No functionality change intended. llvm-svn: 178067
* Relocation enablement for PPC DAG postprocessing passBill Schmidt2013-02-211-8/+14
| | | | llvm-svn: 175693
* Initial implementation of PPCTargetTransformInfoHal Finkel2013-01-251-0/+4
| | | | | | | | | | This provides a place to add customized operation cost information and control some other target-specific IR-level transformations. The only non-trivial logic in this checkin assigns a higher cost to unaligned loads and stores (covered by the included test case). llvm-svn: 173520
* This is just a clean-up patch that simplifies the initial-exec TLS logic byBill Schmidt2012-12-131-3/+1
| | | | | | | avoiding use of machine operand flags. No change in observable behavior, so no new test cases. llvm-svn: 170141
* This patch introduces initial-exec model support for thread-local storageBill Schmidt2012-12-041-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | on 64-bit PowerPC ELF. The patch includes code to handle external assembly and MC output with the integrated assembler. It intentionally does not support the "old" JIT. For the initial-exec TLS model, the ABI requires the following to calculate the address of external thread-local variable x: Code sequence Relocation Symbol ld 9,x@got@tprel(2) R_PPC64_GOT_TPREL16_DS x add 9,9,x@tls R_PPC64_TLS x The register 9 is arbitrary here. The linker will replace x@got@tprel with the offset relative to the thread pointer to the generated GOT entry for symbol x. It will replace x@tls with the thread-pointer register (13). The two test cases verify correct assembly output and relocation output as just described. PowerPC-specific selection node variants are added for the two instructions above: LD_GOT_TPREL and ADD_TLS. These are inserted when an initial-exec global variable is encountered by PPCTargetLowering::LowerGlobalTLSAddress(), and later lowered to machine instructions LDgotTPREL and ADD8TLS. LDgotTPREL is a pseudo that uses the same LDrs support added for medium code model's LDtocL, with a different relocation type. The rest of the processing is straightforward. llvm-svn: 169281
* Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form ↵Hal Finkel2012-06-081-0/+1
| | | | | | | | | | CTR-based loop branching code. This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are no longer otherwise used. Also, invalid preheader DebugLoc is not used. llvm-svn: 158204
* Implement local-exec TLS on PowerPC.Roman Divacky2012-06-041-6/+12
| | | | llvm-svn: 157935
* Reorder includes in Target backends to following coding standards. Remove ↵Craig Topper2012-03-171-4/+1
| | | | | | some superfluous forward declarations. llvm-svn: 152997
* Code clean up.Evan Cheng2011-07-251-5/+0
| | | | llvm-svn: 135954
* Refactor PPC target to separate MC routines from Target routines.Evan Cheng2011-07-251-5/+1
| | | | llvm-svn: 135942
* Next round of MC refactoring. This patch factor MC table instantiations, MCEvan Cheng2011-07-141-14/+1
| | | | | | registeration and creation code into XXXMCDesc libraries. llvm-svn: 135184
* - Eliminate MCCodeEmitter's dependency on TargetMachine. It now uses MCInstrInfoEvan Cheng2011-07-111-1/+4
| | | | | | | | | | | | and MCSubtargetInfo. - Added methods to update subtarget features (used when targets automatically detect subtarget features or switch modes). - Teach X86Subtarget to update MCSubtargetInfo features bits since the MCSubtargetInfo layer can be shared with other modules. - These fixes .code 16 / .code 32 support since mode switch is updated in MCSubtargetInfo so MC code emitter can do the right thing. llvm-svn: 134884
* Merge XXXGenRegisterNames.inc into XXXGenRegisterInfo.incEvan Cheng2011-06-281-1/+2
| | | | llvm-svn: 134024
* Merge XXXGenRegisterDesc.inc XXXGenRegisterNames.inc XXXGenRegisterInfo.h.incEvan Cheng2011-06-271-1/+2
| | | | | | into XXXGenRegisterInfo.inc. llvm-svn: 133922
* Fix emission of PPC64 assembler on non-darwin platforms by splittingRoman Divacky2011-06-091-1/+1
| | | | | | | | VK_PPC_{HA,LO}16 into darwin and gas variants. Darwin wants {ha,lo}16(symbol) while gnu as wants symbol@{ha,l}. llvm-svn: 132802
* Wire up primitive support in the assembler backend for writing .o filesChris Lattner2010-11-151-0/+4
| | | | | | | | | | | | | | directly on the mac. This is very early, doesn't support relocations and has a terrible hack to avoid .machine from being printed, but despite that it generates an bitwise-identical-to-cctools .o file for stuff like this: define i32 @test() nounwind { ret i32 42 } I don't plan to continue pushing this forward, but if anyone else was interested in doing it, it should be really straight-forward. llvm-svn: 119136
* Implement a basic MCCodeEmitter for PPC. This doesn't handleChris Lattner2010-11-151-1/+6
| | | | | | | | | | | | | | | | fixups yet, and doesn't handle actually encoding operand values, but this is enough for llc -show-mc-encoding to show the base instruction encoding information, e.g.: mflr r0 ; encoding: [0x7c,0x08,0x02,0xa6] stw r0, 8(r1) ; encoding: [0x90,0x00,0x00,0x00] stwu r1, -64(r1) ; encoding: [0x94,0x00,0x00,0x00] Ltmp0: lhz r4, 4(r3) ; encoding: [0xa0,0x00,0x00,0x00] cmplwi cr0, r4, 8 ; encoding: [0x28,0x00,0x00,0x00] beq cr0, LBB0_2 ; encoding: [0x40,0x00,0x00,0x00] llvm-svn: 119116
* convert the operand bits into bitfields since they are all combinable inChris Lattner2010-11-151-25/+25
| | | | | | | | | different ways. Add $non_lazy_ptr support, and proper lowering for global values. Now all the ppc regression tests pass with the new instruction printer. llvm-svn: 119106
* add targetoperand flags for jump tables, constant pool and block addressChris Lattner2010-11-151-1/+16
| | | | | | | | | | | | | | nodes to indicate when ha16/lo16 modifiers should be used. This lets us pass PowerPC/indirectbr.ll. The one annoying thing about this patch is that the MCSymbolExpr isn't expressive enough to represent ha16(label1-label2) which we need on PowerPC. I have a terrible hack in the meantime, but this will have to be revisited at some point. Last major conversion item left is global variable references. llvm-svn: 119105
* implement support for the MO_DARWIN_STUB TargetOperand flag,Chris Lattner2010-11-141-0/+16
| | | | | | | | and have isel apply to to call operands as required. This allows us to get $stub suffixes on label references on ppc/tiger with the new instprinter, fixing two tests. Only 2 to go. llvm-svn: 119093
* switch PPC to a simplified MCInstLowering model.Chris Lattner2010-11-141-0/+6
| | | | llvm-svn: 119074
* fix PPC.h to not pull in TargetMachine.hChris Lattner2010-11-141-2/+2
| | | | llvm-svn: 119072
* tidy some targets.Chris Lattner2010-02-021-2/+0
| | | | llvm-svn: 95146
* remove dead code.Chris Lattner2010-02-021-4/+0
| | | | llvm-svn: 95141
* Add new helpers for registering targets.Daniel Dunbar2009-07-251-3/+0
| | | | | | - Less boilerplate == good. llvm-svn: 77052
* Put Target definitions inside Target specific header, and llvm namespace.Daniel Dunbar2009-07-181-0/+4
| | | | llvm-svn: 76344
* Reapply TargetRegistry refactoring commits.Daniel Dunbar2009-07-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | --- Reverse-merging r75799 into '.': U test/Analysis/PointerTracking U include/llvm/Target/TargetMachineRegistry.h U include/llvm/Target/TargetMachine.h U include/llvm/Target/TargetRegistry.h U include/llvm/Target/TargetSelect.h U tools/lto/LTOCodeGenerator.cpp U tools/lto/LTOModule.cpp U tools/llc/llc.cpp U lib/Target/PowerPC/PPCTargetMachine.h U lib/Target/PowerPC/AsmPrinter/PPCAsmPrinter.cpp U lib/Target/PowerPC/PPCTargetMachine.cpp U lib/Target/PowerPC/PPC.h U lib/Target/ARM/ARMTargetMachine.cpp U lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp U lib/Target/ARM/ARMTargetMachine.h U lib/Target/ARM/ARM.h U lib/Target/XCore/XCoreTargetMachine.cpp U lib/Target/XCore/XCoreTargetMachine.h U lib/Target/PIC16/PIC16TargetMachine.cpp U lib/Target/PIC16/PIC16TargetMachine.h U lib/Target/Alpha/AsmPrinter/AlphaAsmPrinter.cpp U lib/Target/Alpha/AlphaTargetMachine.cpp U lib/Target/Alpha/AlphaTargetMachine.h U lib/Target/X86/X86TargetMachine.h U lib/Target/X86/X86.h U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h U lib/Target/X86/AsmPrinter/X86AsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h U lib/Target/X86/X86TargetMachine.cpp U lib/Target/MSP430/MSP430TargetMachine.cpp U lib/Target/MSP430/MSP430TargetMachine.h U lib/Target/CppBackend/CPPTargetMachine.h U lib/Target/CppBackend/CPPBackend.cpp U lib/Target/CBackend/CTargetMachine.h U lib/Target/CBackend/CBackend.cpp U lib/Target/TargetMachine.cpp U lib/Target/IA64/IA64TargetMachine.cpp U lib/Target/IA64/AsmPrinter/IA64AsmPrinter.cpp U lib/Target/IA64/IA64TargetMachine.h U lib/Target/IA64/IA64.h U lib/Target/MSIL/MSILWriter.cpp U lib/Target/CellSPU/SPUTargetMachine.h U lib/Target/CellSPU/SPU.h U lib/Target/CellSPU/AsmPrinter/SPUAsmPrinter.cpp U lib/Target/CellSPU/SPUTargetMachine.cpp U lib/Target/Mips/AsmPrinter/MipsAsmPrinter.cpp U lib/Target/Mips/MipsTargetMachine.cpp U lib/Target/Mips/MipsTargetMachine.h U lib/Target/Mips/Mips.h U lib/Target/Sparc/AsmPrinter/SparcAsmPrinter.cpp U lib/Target/Sparc/SparcTargetMachine.cpp U lib/Target/Sparc/SparcTargetMachine.h U lib/ExecutionEngine/JIT/TargetSelect.cpp U lib/Support/TargetRegistry.cpp llvm-svn: 75820
* Revert 75762, 75763, 75766..75769, 75772..75775, 75778, 75780, 75782 to ↵Stuart Hastings2009-07-151-1/+1
| | | | | | | | repair broken LLVM-GCC build. Will revert 75770 in the llvm-gcc trunk. llvm-svn: 75799
* Register Target's TargetMachine and AsmPrinter in the new registry.Daniel Dunbar2009-07-151-1/+1
| | | | | | | - This abuses TargetMachineRegistry's constructor for now, this will get cleaned up in time. llvm-svn: 75762
* Have asm printers use formatted_raw_ostream directly to avoid aDavid Greene2009-07-141-2/+3
| | | | | | dynamic_cast<>. llvm-svn: 75670
* Add the Object Code Emitter class. Original patch by Aaron Gray, I did someBruno Cardoso Lopes2009-07-061-1/+4
| | | | | | cleanup, removed some #includes and moved Object Code Emitter out-of-line. llvm-svn: 74813
* Remove unused AsmPrinter OptLevel argument, and propogate.Daniel Dunbar2009-07-011-3/+2
| | | | | | | - This more or less amounts to a revert of r65379. I'm curious to know what happened that caused this variable to become unused. llvm-svn: 74579
* First patch in the direction of splitting MachineCodeEmitter in two subclasses:Bruno Cardoso Lopes2009-05-301-0/+2
| | | | | | JITCodeEmitter and ObjectCodeEmitter. No functional changes yet. Patch by Aaron Gray llvm-svn: 72631
* Instead of passing in an unsigned value for the optimization level, use an enum,Bill Wendling2009-04-291-1/+3
| | | | | | | which better identifies what the optimization is doing. And is more flexible for future uses. llvm-svn: 70440
* Second attempt:Bill Wendling2009-04-291-1/+1
| | | | | | | | | | | | Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to use the old behavior, the flag is -O0. This change allows for finer-grained control over which optimizations are run at different -O levels. Most of this work was pretty mechanical. The majority of the fixes came from verifying that a "fast" variable wasn't used anymore. The JIT still uses a "Fast" flag. I'll change the JIT with a follow-up patch. llvm-svn: 70343
* r70270 isn't ready yet. Back this out. Sorry for the noise.Bill Wendling2009-04-281-1/+1
| | | | llvm-svn: 70275
* Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want toBill Wendling2009-04-281-1/+1
| | | | | | | | | | | use the old behavior, the flag is -O0. This change allows for finer-grained control over which optimizations are run at different -O levels. Most of this work was pretty mechanical. The majority of the fixes came from verifying that a "fast" variable wasn't used anymore. The JIT still uses a "Fast" flag. I'm not 100% sure if it's necessary to change it there... llvm-svn: 70270
* CodeGen still defaults to non-verbose asm, but llc now overrides it and ↵Evan Cheng2009-03-251-1/+1
| | | | | | default to verbose. llvm-svn: 67668
* Overhaul my earlier submission due to feedback. It's a large patch, but most ofBill Wendling2009-02-241-1/+2
| | | | | | | | | | | | them are generic changes. - Use the "fast" flag that's already being passed into the asm printers instead of shoving it into the DwarfWriter. - Instead of calling "MI->getParent()->getParent()" for every MI, set the machine function when calling "runOnMachineFunction" in the asm printers. llvm-svn: 65379
* Tidy up #includes, deleting a bunch of unnecessary #includes.Dan Gohman2009-01-051-3/+0
| | | | llvm-svn: 61715
* Use raw_ostream throughout the AsmPrinter.Owen Anderson2008-08-211-1/+2
| | | | llvm-svn: 55092
* Use PassManagerBase instead of FunctionPassManager for functionsDan Gohman2008-03-111-1/+0
| | | | | | | | that merely add passes. This allows them to be used with either FunctionPassManager or PassManager, or even with a custom new kind of pass manager. llvm-svn: 48256
OpenPOWER on IntegriCloud