summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC/PPCSubtarget.h
Commit message (Collapse)AuthorAgeFilesLines
...
* [PowerPC] 32-bit ELF PIC supportHal Finkel2014-07-181-0/+3
| | | | | | | | | | This adds initial support for PPC32 ELF PIC (Position Independent Code; the -fPIC variety), thus rectifying a long-standing deficiency in the PowerPC backend. Patch by Justin Hibbits! llvm-svn: 213427
* Move Post RA Scheduling flag bit into SchedMachineModelSanjay Patel2014-07-151-5/+5
| | | | | | | | | | | | | | | | | | | | | Refactoring; no functional changes intended Removed PostRAScheduler bits from subtargets (X86, ARM). Added PostRAScheduler bit to MCSchedModel class. This bit is set by a CPU's scheduling model (if it exists). Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses. Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!). Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling. Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values. Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. c. PPC overrides the CPU's postRA settings by enabling postRA for everything. d. X86 is the only target that actually has postRA specified via sched model info. Differential Revision: http://reviews.llvm.org/D4217 llvm-svn: 213101
* add ppc64/pwr8 as targetWill Schmidt2014-06-261-0/+1
| | | | | | | includes handling DIR_PWR8 where appropriate The P7Model Itinerary is currently tied in for use under the P8Model, and will be updated later. llvm-svn: 211779
* Fix typo.Eric Christopher2014-06-131-1/+1
| | | | llvm-svn: 210947
* Move the PPCSelectionDAGInfo off the TargetMachine and onto theEric Christopher2014-06-121-0/+3
| | | | | | subtarget. llvm-svn: 210854
* Move PPCTargetLowering off of the TargetMachine and onto the subtarget.Eric Christopher2014-06-121-1/+4
| | | | llvm-svn: 210852
* Move PPCJITInfo off of the TargetMachine and onto the subtarget.Eric Christopher2014-06-121-1/+4
| | | | | | | Needed to migrate a few functions around to avoid circular header dependencies. llvm-svn: 210845
* Move PPCInstrInfo off of the target machine and onto the subtarget.Eric Christopher2014-06-121-0/+3
| | | | llvm-svn: 210839
* Move DataLayout from the PPCTargetMachine to the subtarget.Eric Christopher2014-06-121-0/+4
| | | | llvm-svn: 210824
* Move PPCFrameLowering into PPCSubtarget from PPCTargetMachine. UseEric Christopher2014-06-121-0/+8
| | | | | | | | the initializeSubtargetDependencies code to obtain an initialized subtarget and migrate a couple of subtarget using functions to the .cpp file to avoid circular includes. llvm-svn: 210822
* Make early if conversion dependent upon the subtarget and addEric Christopher2014-05-211-0/+2
| | | | | | | a subtarget hook to enable. Unconditionally add to the pass pipeline for targets that might want to use it. No functional change. llvm-svn: 209340
* Save the optimization level the subtarget was created with in aEric Christopher2014-05-131-0/+3
| | | | | | | | | | member variable and sink the initialization of crbits into the subtarget feature reset code. No functional change, but this refactor will be used in a future commit. llvm-svn: 208726
* [C++11] Add 'override' keywords and remove 'virtual'. Additionally add ↵Craig Topper2014-04-291-5/+5
| | | | | | 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. PowerPC edition llvm-svn: 207504
* [PowerPC] Initial support for the VSX instruction setHal Finkel2014-03-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VSX is an ISA extension supported on the POWER7 and later cores that enhances floating-point vector and scalar capabilities. Among other things, this adds <2 x double> support and generally helps to reduce register pressure. The interesting part of this ISA feature is the register configuration: there are 64 new 128-bit vector registers, the 32 of which are super-registers of the existing 32 scalar floating-point registers, and the second 32 of which overlap with the 32 Altivec vector registers. This makes things like vector insertion and extraction tricky: this can be free but only if we force a restriction to the right register subclass when needed. A new "minipass" PPCVSXCopy takes care of this (although it could do a more-optimal job of it; see the comment about unnecessary copies below). Please note that, currently, VSX is not enabled by default when targeting anything because it is not yet ready for that. The assembler and disassembler are fully implemented and tested. However: - CodeGen support causes miscompiles; test-suite runtime failures: MultiSource/Benchmarks/FreeBench/distray/distray MultiSource/Benchmarks/McCat/08-main/main MultiSource/Benchmarks/Olden/voronoi/voronoi MultiSource/Benchmarks/mafft/pairlocalalign MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4 SingleSource/Benchmarks/CoyoteBench/almabench SingleSource/Benchmarks/Misc/matmul_f64_4x4 - The lowering currently falls back to using Altivec instructions far more than it should. Worse, there are some things that are scalarized through the stack that shouldn't be. - A lot of unnecessary copies make it past the optimizers, and this needs to be fixed. - Many more regression tests are needed. Normally, I'd fix these things prior to committing, but there are some students and other contributors who would like to work this, and so it makes sense to move this development process upstream where it can be subject to the regular code-review procedures. llvm-svn: 203768
* Don't avoid cfi instructions on the bg/p.Rafael Espindola2014-03-071-2/+0
| | | | | | | The integrated assembler now works for ppc. Since this was the last use of the bg/p predicate and Hal says that it is now dead, drop the predicate too. llvm-svn: 203269
* Add CR-bit tracking to the PowerPC backend for i1 valuesHal Finkel2014-02-281-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change enables tracking i1 values in the PowerPC backend using the condition register bits. These bits can be treated on PowerPC as separate registers; individual bit operations (and, or, xor, etc.) are supported. Tracking booleans in CR bits has several advantages: - Reduction in register pressure (because we no longer need GPRs to store boolean values). - Logical operations on booleans can be handled more efficiently; we used to have to move all results from comparisons into GPRs, perform promoted logical operations in GPRs, and then move the result back into condition register bits to be used by conditional branches. This can be very inefficient, because the throughput of these CR <-> GPR moves have high latency and low throughput (especially when other associated instructions are accounted for). - On the POWER7 and similar cores, we can increase total throughput by using the CR bits. CR bit operations have a dedicated functional unit. Most of this is more-or-less mechanical: Adjustments were needed in the calling-convention code, support was added for spilling/restoring individual condition-register bits, and conditional branch instruction definitions taking specific CR bits were added (plus patterns and code for generating bit-level operations). This is enabled by default when running at -O2 and higher. For -O0 and -O1, where the ability to debug is more important, this feature is disabled by default. Individual CR bits do not have assigned DWARF register numbers, and storing values in CR bits makes them invisible to the debugger. It is critical, however, that we don't move i1 values that have been promoted to larger values (such as those passed as function arguments) into bit registers only to quickly turn around and move the values back into GPRs (such as happens when values are returned by functions). A pair of target-specific DAG combines are added to remove the trunc/extends in: trunc(binary-ops(binary-ops(zext(x), zext(y)), ...) and: zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...) In short, we only want to use CR bits where some of the i1 values come from comparisons or are used by conditional branches or selects. To put it another way, if we can do the entire i1 computation in GPRs, then we probably should (on the POWER7, the GPR-operation throughput is higher, and for all cores, the CR <-> GPR moves are expensive). POWER7 test-suite performance results (from 10 runs in each configuration): SingleSource/Benchmarks/Misc/mandel-2: 35% speedup MultiSource/Benchmarks/Prolangs-C++/city/city: 21% speedup MultiSource/Benchmarks/MiBench/automotive-susan: 23% speedup SingleSource/Benchmarks/CoyoteBench/huffbench: 13% speedup SingleSource/Benchmarks/Misc-C++/Large/sphereflake: 13% speedup SingleSource/Benchmarks/Misc-C++/mandel-text: 10% speedup SingleSource/Benchmarks/Misc-C++-EH/spirit: 10% slowdown MultiSource/Applications/lemon/lemon: 8% slowdown llvm-svn: 202451
* Move PPC's getDataLayoutString out of line and document it better.Rafael Espindola2013-12-111-16/+0
| | | | llvm-svn: 196987
* Add support for the VSX target attribute. No functional changeEric Christopher2013-10-161-0/+1
| | | | | | as we don't actually use it to emit any code yet. llvm-svn: 192837
* Mark PPC MFTB and DST (and friends) as deprecatedHal Finkel2013-09-121-0/+4
| | | | | | | | Use the new instruction deprecation feature to mark mftb (now replaced with mfspr) and dst (along with the other Altivec cache control instructions) as deprecated when targeting cores supporting at least ISA v2.03. llvm-svn: 190605
* Enable MI scheduling (and CodeGen AA) by default for embedded PPC coresHal Finkel2013-09-111-0/+8
| | | | | | | For embedded PPC cores (especially the A2 core), using the MI scheduler with AA is far superior to the other scheduling options. llvm-svn: 190558
* Add the PPC fcpsgn instructionHal Finkel2013-08-191-0/+2
| | | | | | | | | Modern PPC cores support a floating-point copysign instruction, and we can use this to lower the FCOPYSIGN node (which is created from calls to the libm copysign function). A couple of extra patterns are necessary because the operand types of FCOPYSIGN need not agree. llvm-svn: 188653
* [PowerPC] Support powerpc64le as a syntax-checking target.Bill Schmidt2013-07-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | This patch provides basic support for powerpc64le as an LLVM target. However, use of this target will not actually generate little-endian code. Instead, use of the target will cause the correct little-endian built-in defines to be generated, so that code that tests for __LITTLE_ENDIAN__, for example, will be correctly parsed for syntax-only testing. Code generation will otherwise be the same as powerpc64 (big-endian), for now. The patch leaves open the possibility of creating a little-endian PowerPC64 back end, but there is no immediate intent to create such a thing. The LLVM portions of this patch simply add ppc64le coverage everywhere that ppc64 coverage currently exists. There is nothing of any import worth testing until such time as little-endian code generation is implemented. In the corresponding Clang patch, there is a new test case variant to ensure that correct built-in defines for little-endian code are generated. llvm-svn: 187179
* PPC: Refactoring to support subtarget feature changingHal Finkel2013-07-151-0/+7
| | | | | | | | | This change mirrors the changes that were made to the X86 and ARM targets to support subtarget feature changing. As indicated in r182899, the mechanism is still undergoing revision, and so as with the X86 and ARM targets, there is no test case yet (there is no effective functionality change). llvm-svn: 186357
* [PowerPC] FreeBSD does not require f128 in its data layout string.Bill Schmidt2013-07-031-1/+1
| | | | | | Long double is 64 bits on FreeBSD PPC, so the f128 entry is superfluous. llvm-svn: 185583
* Use PPC reciprocal estimates with Newton iteration in fast-math modeHal Finkel2013-04-031-0/+7
| | | | | | | | | | | | | | | | | | | When unsafe FP math operations are enabled, we can use the fre[s] and frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together with some Newton iteration, in order to quickly generate floating-point division and sqrt results. All of these instructions are separately optional, and so each has its own feature flag (except for the Altivec instructions, which are covered under the existing Altivec flag). Doing this is not only faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these computations to be pipelined with other computations in order to hide their overall latency. I've also added a couple of missing fnmsub patterns which turned out to be missing (but are necessary for good code generation of the Newton iterations). Altivec needs a similar fix, but that will probably be more complicated because fneg is expanded for Altivec's v4f32. llvm-svn: 178617
* Add more PPC floating-point conversion instructionsHal Finkel2013-04-011-0/+2
| | | | | | | | | The P7 and A2 have additional floating-point conversion instructions which allow a direct two-instruction sequence (plus load/store) to convert from all combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores, only some combinations were directly available). llvm-svn: 178480
* Add the PPC lfiwax instructionHal Finkel2013-03-311-0/+2
| | | | | | | | | This instruction is available on modern PPC64 CPUs, and is now used to improve the SINT_TO_FP lowering (by eliminating the need for the separate sign extension instruction and decreasing the amount of needed stack space). llvm-svn: 178446
* Add PPC FP rounding instructions fri[mnpz]Hal Finkel2013-03-291-0/+2
| | | | | | | | | These instructions are available on the P5x (and later) and on the A2. They implement the standard floating-point rounding operations (floor, trunc, etc.). One caveat: frin (round to nearest) does not implement "ties to even", and so is only enabled in fast-math mode. llvm-svn: 178337
* Add the PPC64 ldbrx/stdbrx instructionsHal Finkel2013-03-281-0/+2
| | | | | | | | These are 64-bit load/store with byte-swap, and available on the P7 and the A2. Like the similar instructions for 16- and 32-bit words, these are matched in the target DAG-combine phase against load/store-bswap pairs. llvm-svn: 178276
* Add the PPC64 popcntd instructionHal Finkel2013-03-281-0/+2
| | | | | | | PPC ISA 2.06 (P7, A2, etc.) has a popcntd instruction. Add this instruction and tell TTI about it so that popcount-loop recognition will know about it. llvm-svn: 178233
* LLVM enablement for some older PowerPC CPUsBill Schmidt2013-02-011-0/+5
| | | | llvm-svn: 174230
* Add definitions for the PPC a2q core marked as having QPX availableHal Finkel2013-01-301-0/+2
| | | | | | | | This is the first commit of a large series which will add support for the QPX vector instruction set to the PowerPC backend. This instruction set is used on the IBM Blue Gene/Q supercomputers. llvm-svn: 173973
* Add isBGQ method to PPCSubtargetHal Finkel2013-01-291-0/+2
| | | | | | This function will be used in future commits. llvm-svn: 173729
* Sort includes for all of the .h files under the 'lib' tree. These wereChandler Carruth2012-12-041-2/+2
| | | | | | | | | | missed in the first pass because the script didn't yet handle include guards. Note that the script is now able to handle all of these headers without manual edits. =] llvm-svn: 169224
* PPCSubtarget.h: Add explicit braces.NAKAMURA Takumi2012-10-291-1/+2
| | | | llvm-svn: 166932
* PPCSubtarget.h: Whitespace.NAKAMURA Takumi2012-10-291-22/+22
| | | | llvm-svn: 166931
* This patch adds alignment information for long double to the 64-bit PowerPCBill Schmidt2012-10-291-0/+6
| | | | | | | | | | | | | | | | | ELF subtarget. The existing logic is used as a fallback to avoid any changes to the Darwin ABI. PPC64 ELF now has two possible data layout strings: one for FreeBSD, which requires 8-byte alignment, and a default string that requires 16-byte alignment. I've added a test for PPC64 Linux to verify the 16-byte alignment. If somebody wants to add a separate test for FreeBSD, that would be great. Note that there is a companion patch to update the alignment information in Clang, which I am committing now as well. llvm-svn: 166928
* Move TargetData to DataLayout.Micah Villmow2012-10-081-2/+2
| | | | llvm-svn: 165402
* Add PPC Freescale e500mc and e5500 subtargets.Hal Finkel2012-08-281-0/+2
| | | | | | | | | Add subtargets for Freescale e500mc (32-bit) and e5500 (64-bit) to the PowerPC backend. Patch by Tobias von Koch. llvm-svn: 162764
* Add support for the PPC isel instruction.Hal Finkel2012-06-221-0/+2
| | | | | | | The isel (integer select) instruction is supported on the 440 and A2 embedded cores and on the POWER7. llvm-svn: 159045
* Rename the PPC target feature gpul to mfocrf.Hal Finkel2012-06-111-2/+2
| | | | | | | | | | | The PPC target feature gpul (IsGigaProcessor) was only used for one thing: To enable the generation of the MFOCRF instruction. Furthermore, this instruction is available on other PPC cores outside of the G5 line. This feature now corresponds to the HasMFOCRF flag. No functionality change. llvm-svn: 158323
* Add POWER6 and POWER7 CPU types to the PPC backend.Hal Finkel2012-06-111-0/+2
| | | | | | No functional change; these will be used by upcoming scheduler enhancements. llvm-svn: 158313
* The binutils for the IBM BG/P are too old to support CFI.Hal Finkel2012-04-021-0/+2
| | | | llvm-svn: 153886
* Add instruction itinerary for the PPC64 A2 core.Hal Finkel2012-04-011-0/+1
| | | | | | | This adds a full itinerary for IBM's PPC64 A2 embedded core. These cores form the basis for the CPUs in the new IBM BG/Q supercomputer. llvm-svn: 153842
* Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, ↵Jia Liu2012-02-181-1/+1
| | | | | | MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878
* update PPC 940 hazard rec. to function in postRA modeHal Finkel2011-12-021-0/+4
| | | | llvm-svn: 145676
* Add PPC 440 scheduler and some associated testsHal Finkel2011-10-171-0/+3
| | | | llvm-svn: 142170
* Compute feature bits at time of MCSubtargetInfo initialization.Evan Cheng2011-07-071-2/+2
| | | | llvm-svn: 134606
* Make the i64 and f64 be 64bit ABI aligned in the target description.Roman Divacky2011-07-031-1/+1
| | | | | | This is what both the ABI and clang says. llvm-svn: 134367
* Rename XXXGenSubtarget.inc to XXXGenSubtargetInfo.inc for consistency.Evan Cheng2011-07-011-1/+1
| | | | llvm-svn: 134281
OpenPOWER on IntegriCloud