path: root/llvm/lib/Target/PowerPC/PPCSubtarget.h
Commit message | Author | Age | Files | Lines
* Introduce codegen for the Signal Processing Engine | Justin Hibbits | 2018-07-18 | 1 | -0/+2
  Summary: The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1, e500v2, and several e200 cores. This adds support targeting the e500v2, as this is more common than the e500v1, and is in SoCs still on the market.

  This patch is very intrusive because the SPE is binary incompatible with the traditional FPU. After discussing with others, the cleanest solution was to make both SPE and FPU features on top of a base PowerPC subset, so all FPU instructions are now wrapped with HasFPU predicates.

  Supported by this are:
  * Code generation following the SPE ABI at the LLVM IR level (calling conventions)
  * Single- and Double-precision math at the level supported by the APU.

  Still to do:
  * Vector operations
  * SPE intrinsics

  As this changes the Callee-saved register list order, one test, which tests the precise generated code, was updated to account for the new register order.

  Reviewed by: nemanjai
  Differential Revision: https://reviews.llvm.org/D44830
  llvm-svn: 337347
* Add PowerPC e500(v2) core scheduler and directives. | Justin Hibbits | 2018-07-18 | 1 | -0/+1
  Differential Revision: https://reviews.llvm.org/D44828
  llvm-svn: 337345
* [PowerPC] Secure PLT support | Strahinja Petrovic | 2018-03-27 | 1 | -0/+2
  This patch supports secure PLT mode for PowerPC 32 architecture.
  Differential Revision: https://reviews.llvm.org/D42112
  llvm-svn: 328617
* Fix a bunch more layering of CodeGen headers that are in Target | David Blaikie | 2017-11-17 | 1 | -1/+1
  All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around).
  llvm-svn: 318490
* [PPC] Fix two bugs in frame lowering. | Tony Jiang | 2017-07-11 | 1 | -0/+7
  1. The available program storage region of the red zone to compilers is 288 bytes rather than 244 bytes.
  2. The formula for negative number alignment calculation should be y = x & ~(n-1) rather than y = (x + (n-1)) & ~(n-1).
  Differential Revision: https://reviews.llvm.org/D34337
  llvm-svn: 307672
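The second fix above can be illustrated with a small sketch. It assumes a two's-complement representation and a power-of-two alignment; the helper names `alignDown` and `alignUp` are illustrative and are not taken from the LLVM sources. For a negative offset, `x & ~(n-1)` rounds toward negative infinity (a larger magnitude), whereas the replaced formula rounds toward zero and can under-allocate.

```cpp
#include <cassert>
#include <cstdint>

// Round x toward negative infinity to a multiple of n.
// Assumes n is a power of two and two's-complement integers.
inline std::int64_t alignDown(std::int64_t x, std::int64_t n) {
  assert(n > 0 && (n & (n - 1)) == 0 && "n must be a power of two");
  return x & ~(n - 1);
}

// The formula the commit message says is wrong for negative numbers:
// it rounds toward positive infinity.
inline std::int64_t alignUp(std::int64_t x, std::int64_t n) {
  assert(n > 0 && (n & (n - 1)) == 0 && "n must be a power of two");
  return (x + (n - 1)) & ~(n - 1);
}
```

With a negative offset of -20 and 16-byte alignment, `alignDown` yields -32 (the aligned region still covers all 20 bytes), while `alignUp` yields -16, which is too small.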
* [XRay] Implement powerpc64le xray. | Tim Shen | 2017-02-10 | 1 | -0/+2
  Summary: powerpc64 big-endian is not supported, but I believe that most logic can be shared, except for xray_powerpc64.cc.

  Also add a function InvalidateInstructionCache to xray_util.h, which is copied from llvm/Support/Memory.cpp. I'm not sure if I need to add a unittest, and I don't know how.

  Reviewers: dberris, echristo, iteratee, kbarton, hfinkel
  Subscribers: mehdi_amini, nemanjai, mgorny, llvm-commits
  Differential Revision: https://reviews.llvm.org/D29742
  llvm-svn: 294781
* [PowerPC] Expand ISEL instruction into if-then-else sequence. | Tony Jiang | 2017-01-16 | 1 | -1/+3
  Generally, ISEL is expanded into an if-then-else sequence. In some cases (such as when the destination register is the same as the true or false value register), it may be expanded into just the if or just the else sequence.
  llvm-svn: 292154
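A rough model of that expansion, written as plain C++ rather than machine code. The register file type `Regs` and both function names are hypothetical, chosen only for this sketch; the real pass operates on machine instructions and basic blocks.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical four-entry register file used only for illustration.
using Regs = std::array<std::int64_t, 4>;

// Reference semantics of a select: dst = cond ? regs[t] : regs[f].
inline void iselReference(Regs &r, std::size_t dst, std::size_t t,
                          std::size_t f, bool cond) {
  r[dst] = cond ? r[t] : r[f];
}

// Branch-based expansion. When dst already holds one of the two inputs,
// only the other arm needs a copy, so half of the if-then-else disappears.
inline void iselExpanded(Regs &r, std::size_t dst, std::size_t t,
                         std::size_t f, bool cond) {
  if (dst == t) {
    if (!cond) r[dst] = r[f];  // only the "else" sequence is emitted
  } else if (dst == f) {
    if (cond) r[dst] = r[t];   // only the "if" sequence is emitted
  } else {
    if (cond) r[dst] = r[t];   // full if-then-else
    else      r[dst] = r[f];
  }
}
```

Both functions leave the register file in the same state for every combination of operands; the expanded form just makes the degenerate cases explicit.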
* Revert "[PowerPC] Expand ISEL instruction into if-then-else sequence." | Tony Jiang | 2017-01-16 | 1 | -3/+1
  This reverts commit 1d0e0374438ca6e153844c683826ba9b82486bb1.
  llvm-svn: 292131
* [PowerPC] Expand ISEL instruction into if-then-else sequence. | Tony Jiang | 2017-01-16 | 1 | -1/+3
  Generally, ISEL is expanded into an if-then-else sequence. In some cases (such as when the destination register is the same as the true or false value register), it may be expanded into just the if or just the else sequence.
  llvm-svn: 292128
* [PowerPC] Refactor soft-float support, and enable PPC64 soft float | Hal Finkel | 2016-10-02 | 1 | -2/+2
  This change enables soft-float for PowerPC64, and also makes soft-float disable all vector instruction sets for both 32-bit and 64-bit modes. This latter part is necessary because the PPC backend canonicalizes many Altivec vector types to floating-point types, and so soft-float breaks scalarization support for many operations.

  Both for embedded targets and for operating-system kernels desiring soft-float support, it seems reasonable that disabling hardware floating-point also disables vector instructions (embedded targets without hardware floating point support are unlikely to have Altivec, etc. and operating system kernels desiring not to use floating-point registers to lower syscall cost are unlikely to want to use vector registers either). If someone needs this to work, we'll need to change the fact that we promote many Altivec operations to act on v4f32.

  To make it possible to disable Altivec when soft-float is enabled, hardware floating-point support needs to be expressed as a positive feature, like the others, and not a negative feature, because target features cannot have dependencies on the disabling of some other feature. So +soft-float has now become -hard-float.

  Fixes PR26970.
  llvm-svn: 283060
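The reasoning about positive features can be sketched with a toy dependency table. Everything here (the `Feature` bits, `kFeatures`, `disableFeature`) is a hypothetical model for illustration, not LLVM's subtarget-feature machinery; the chain VSX → Altivec → hard-float is assumed for the example. Because each feature only lists what it requires, turning off hard-float can propagate to everything that depends on it, which would not be expressible if the feature were the negative "soft-float".

```cpp
#include <cstdint>

// Hypothetical feature bits for this sketch.
enum Feature : unsigned {
  HardFloat = 1u << 0,
  Altivec   = 1u << 1,
  VSX       = 1u << 2
};

// Each feature records the features it requires (positive dependencies only).
struct FeatureInfo {
  unsigned bit;
  unsigned needs;
};

static const FeatureInfo kFeatures[] = {
    {HardFloat, 0u},
    {Altivec, HardFloat},  // assumed: Altivec requires hard-float
    {VSX, Altivec},        // assumed: VSX requires Altivec
};

// Clear `bit` and, transitively, every enabled feature that requires it.
inline unsigned disableFeature(unsigned enabled, unsigned bit) {
  enabled &= ~bit;
  for (const FeatureInfo &f : kFeatures)
    if ((enabled & f.bit) != 0 && (f.needs & bit) != 0)
      enabled = disableFeature(enabled, f.bit);
  return enabled;
}
```

In this model, disabling `HardFloat` from a fully enabled set leaves nothing enabled, while disabling only `VSX` leaves `HardFloat | Altivec` intact.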
* [Power9] Add exploitation of non-permuting memory ops | Nemanja Ivanovic | 2016-09-22 | 1 | -0/+3
  This patch corresponds to review: https://reviews.llvm.org/D19825

  The new lxvx/stxvx instructions do not require the swaps to line the elements up correctly. In order to select them over the lxvd2x/lxvw4x instructions which require swaps, the patterns for the old instruction have a predicate that ensures they won't be selected on Power9 and newer CPUs.
  llvm-svn: 282143
* [PowerPC] Add support for -mlongcall | Hal Finkel | 2016-08-30 | 1 | -0/+2
  The "long call" option forces the use of the indirect calling sequence for all calls (even those that don't really need it). GCC provides this option. It is helpful, under certain circumstances, for building very large binaries, and for some other specialized use cases.

  Fixes PR19098.
  llvm-svn: 280040
* Target: Remove unused arguments from overrideSchedPolicy, NFC | Duncan P. N. Exon Smith | 2016-07-01 | 1 | -2/+0
  TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr* arguments (begin and end) that invite implicit conversions from MachineInstrBundleIterator. One option would be to change their type to an iterator, but since they don't seem to have been used since the API was added in 2010, I'm deleting the dead code.
  llvm-svn: 274304
* [Power9] Add support for -mcpu=pwr9 in the back end | Nemanja Ivanovic | 2016-05-09 | 1 | -0/+1
  This patch corresponds to review: http://reviews.llvm.org/D19683

  Simply adds the bits for being able to specify -mcpu=pwr9 to the back end.
  llvm-svn: 268950
* [PPC, SSP] Support PowerPC Linux stack protection. | Tim Shen | 2016-04-19 | 1 | -0/+1
  llvm-svn: 266809
* [PowerPC] Basic support for P9 atomic loads and stores | Nemanja Ivanovic | 2016-03-31 | 1 | -0/+2
  This patch corresponds to review: http://reviews.llvm.org/D18032

  This patch provides asm implementation for the following instructions: lwat, ldat, stwat, stdat, ldmx, mcrxrx
  llvm-svn: 265022
* [PowerPC] Refactor popcnt[dw] target features | Hal Finkel | 2016-03-29 | 1 | -4/+11
  Instead of using two feature bits, one to indicate the availability of the popcnt[dw] instructions, and another to indicate whether or not they're fast, use a single enum. This allows more consistent control via target attribute strings, and via Clang's command line.
  llvm-svn: 264690
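A minimal sketch of the single-enum idea, assuming three states (unavailable, slow, fast). The names `PopcntKind`, `useHardwarePopcnt`, and `softwarePopcount` are illustrative and may not match the identifiers in PPCSubtarget.h; the software loop is one common emulation, not necessarily the one the backend emits.

```cpp
#include <cstdint>

// One enum replaces two booleans ("has popcnt" and "popcnt is slow"),
// so the contradictory state "slow but unavailable" cannot be expressed.
enum class PopcntKind { Unavailable, Slow, Fast };

// Only a fast hardware popcnt is worth selecting; a slow (microcoded)
// one loses to software emulation, and an unavailable one cannot be used.
inline bool useHardwarePopcnt(PopcntKind kind) {
  return kind == PopcntKind::Fast;
}

// Portable software population count: clears the lowest set bit each step.
inline unsigned softwarePopcount(std::uint64_t v) {
  unsigned count = 0;
  while (v != 0) {
    v &= v - 1;
    ++count;
  }
  return count;
}
```

A caller would consult `useHardwarePopcnt` once per subtarget and fall back to something like `softwarePopcount` otherwise.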
* [PowerPC] On the A2, popcnt[dw] are very slow | Hal Finkel | 2016-03-28 | 1 | -0/+2
  The A2 cores support the popcntw/popcntd instructions, but they're microcoded, and slower than our default software emulation. Specifically, popcnt[dw] take approximately 74 cycles, whereas our software emulation takes only 24-28 cycles.

  I've added a new target feature to indicate a slow popcnt[dw], instead of just removing the existing target feature from the a2/a2q processor models, because:
  1. This allows us to return more accurate information via the TTI interface (I recognize that this currently makes no practical difference)
  2. Is hopefully easier to understand (it allows the core's features to match its manual while still having the desired effect).
  llvm-svn: 264600
* [Power9] Implement new vsx instructions: compare and conversion | Kit Barton | 2016-02-26 | 1 | -0/+4
  This change implements the following vsx instructions:

  Quad/Double-Precision Compare:
    xscmpoqp xscmpuqp xscmpexpdp xscmpexpqp xscmpeqdp xscmpgedp xscmpgtdp xscmpnedp xvcmpnedp(.) xvcmpnesp(.)

  Quad-Precision Floating-Point Conversion:
    xscvqpdp(o) xscvdpqp xscvqpsdz xscvqpswz xscvqpudz xscvqpuwz xscvsdqp xscvudqp xscvdphp xscvhpdp xvcvhpsp xvcvsphp xsrqpi xsrqpix xsrqpxp

  28 instructions

  Phabricator: http://reviews.llvm.org/D16709
  llvm-svn: 262068
* Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/ | Benjamin Kramer | 2016-01-27 | 1 | -3/+3
  It's a SelectionDAG thing, not a Target thing.
  llvm-svn: 258939
* Define a feature for __float128 support in the PPC back end | Nemanja Ivanovic | 2015-12-15 | 1 | -0/+2
  This patch corresponds to review: http://reviews.llvm.org/D15117

  In preparation for supporting IEEE Quad precision floating point, this patch simply defines a feature to specify the target supports this. For now, nothing is done with the target feature, we just don't want warnings from the Clang FE when a user specifies -mfloat128. Calling convention and other related work will add to this patch in the near future.
  llvm-svn: 255642
* [Power PC] llvm soft float support for ppc32 | Petar Jovanovic | 2015-12-14 | 1 | -0/+3
  This is the second in a set of patches for soft float support for ppc32; it enables soft float operations.

  Patch by Strahinja Petrovic.
  Differential Revision: http://reviews.llvm.org/D13700
  llvm-svn: 255516
* Power8 and later support fusing addis/addi and addis/ld instruction | Eric Christopher | 2015-11-20 | 1 | -0/+2
  pairs that use the same register to execute as a single instruction.

  No Functional Change

  Patch by Kyle Butt!
  llvm-svn: 253724
* Weak non-function symbols were being accessed directly, which is | Eric Christopher | 2015-11-20 | 1 | -0/+4
  incorrect, as the chosen representative of the weak symbol may not live with the code in question. Always indirect the access through the TOC instead.

  Patch by Kyle Butt!
  llvm-svn: 253708
* Remove getDataLayout() from TargetSelectionDAGInfo (had no users) | Mehdi Amini | 2015-07-09 | 1 | -3/+3
  Summary: Remove empty subclass in the process.

  This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module.

  Reviewers: echristo
  Subscribers: jholewinski, llvm-commits, rafael, yaron.keren, ted
  Differential Revision: http://reviews.llvm.org/D11045
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 241780
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) | Alexander Kornienko | 2015-06-23 | 1 | -2/+2
  Apparently, the style needs to be agreed upon first.
  llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFC | Alexander Kornienko | 2015-06-19 | 1 | -2/+2
  The patch is generated using this command:

    tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
      -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
      llvm/lib/

  Thanks to Eugene Kosov for the original patch!
  llvm-svn: 240137
* Properly handle the mftb instruction. | Kit Barton | 2015-06-16 | 1 | -2/+2
  The mftb instruction was incorrectly marked as deprecated in the PPC Backend. Instead, it should not be treated as deprecated, but rather be implemented using the mfspr instruction. A similar patch was put into GCC last year. Details can be found at: https://sourceware.org/ml/binutils/2014-11/msg00383.html.

  This change will replace instances of the mftb instruction with the mfspr instruction for all CPUs except 601 and pwr3. This will also be the default behaviour.

  Additional details can be found in: https://llvm.org/bugs/show_bug.cgi?id=23680

  Phabricator review: http://reviews.llvm.org/D10419
  llvm-svn: 239827
* Rename TargetSubtargetInfo::enablePostMachineScheduler() to enablePostRAScheduler() | Matthias Braun | 2015-06-13 | 1 | -1/+1
  r213101 changed the behaviour of this method to not only affect the PostMachineScheduler scheduler but also the PostRAScheduler scheduler, renaming should make this fact clear.

  Also document that the preferred way is to specify this in the scheduling model instead of overriding this method.

  Differential Revision: http://reviews.llvm.org/D10427
  llvm-svn: 239659
* Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and create*MCSubtargetInfo(). NFC. | Daniel Sanders | 2015-06-10 | 1 | -2/+2
  Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036.

  Reviewers: rafael
  Reviewed By: rafael
  Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski
  Differential Revision: http://reviews.llvm.org/D10311
  llvm-svn: 239467
* Add direct moves to/from VSR and exploit them for FP/INT conversions | Nemanja Ivanovic | 2015-04-11 | 1 | -0/+2
  This patch corresponds to review: http://reviews.llvm.org/D8928

  It adds direct move instructions between VSX registers and GPRs. These are exploited for FP <-> INT conversions.
  llvm-svn: 234682
* Add LLVM support for remaining integer divide and permute instructions from ISA 2.06 | Nemanja Ivanovic | 2015-04-09 | 1 | -0/+4
  This is the patch corresponding to review: http://reviews.llvm.org/D8406

  It adds some missing instructions from ISA 2.06 to the PPC back end.
  llvm-svn: 234546
* Add Hardware Transactional Memory (HTM) Support | Kit Barton | 2015-03-25 | 1 | -0/+2
  This patch adds Hardware Transactional Memory (HTM) support, as provided by ISA 2.07 (POWER8). The intrinsic support is based on the GCC one [1], but currently only the 'PowerPC HTM Low Level Built-in Function' set is implemented.

  The HTM instructions follow the RC ones and the transaction initiation result is set on RC0 (with the exception of tcheck). The current approach is to create a register copy from CR0 to a GPR and compare it. Although this is suboptimal, since the branch could be taken directly by comparing the CR0 value, it generates correct code both for test-and-branch and for simply returning the value. A possible future optimization would be to eliminate the MFCR instruction and branch directly.

  HTM usage requires a reasonably recent kernel with PPC HTM enabled. Tested on powerpc64 and powerpc64le.

  This is sent along with a clang patch to enable the builtins and the option switch.

  [1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html

  Phabricator Review: http://reviews.llvm.org/D8247
  llvm-svn: 233204
* Add support for part-word atomics for PPC | Nemanja Ivanovic | 2015-03-10 | 1 | -0/+2
  http://reviews.llvm.org/D8090#inline-67337
  llvm-svn: 231843
* Add LLVM support for PPC cryptography builtins | Nemanja Ivanovic | 2015-03-04 | 1 | -0/+2
  Review: http://reviews.llvm.org/D7955
  llvm-svn: 231285
* [PowerPC] Add support for the QPX vector instruction set | Hal Finkel | 2015-02-25 | 1 | -0/+13
  This adds support for the QPX vector instruction set, which is used by the enhanced A2 cores on the IBM BG/Q supercomputers. QPX vectors are 256 bits wide, holding 4 double-precision floating-point values. Boolean values, modeled here as <4 x i1>, are actually also represented as floating-point values (essentially { -1, 1 } for { false, true }). QPX shares many features with Altivec and VSX, but is distinct from both of them. One major difference is that, instead of adding completely-separate vector registers, QPX vector registers are extensions of the scalar floating-point registers (lane 0 is the corresponding scalar floating-point value). The operations supported on QPX vectors mirror those supported on the scalar floating-point values (with some additional ones for permutations and logical/comparison operations).

  I've been maintaining this support out-of-tree, as part of the bgclang project, for several years. This is not the entire bgclang patch set, but is most of the subset that can be cleanly integrated into LLVM proper at this time. Adding this to the LLVM backend is part of my efforts to rebase bgclang to the current LLVM trunk, but is independently useful (especially for codes that use LLVM as a JIT in library form).

  The assembler/disassembler test coverage is complete. The CodeGen test coverage is not, but I've included some tests, and more will be added as follow-up work.
  llvm-svn: 230413
* Move ABI handling and 64-bitness to the PowerPC target machine. | Eric Christopher | 2015-02-17 | 1 | -9/+4
  This required changing how the computation of the ABI is handled and how some of the checks for ABI/target are done.
  llvm-svn: 229471
* Move the target machine variable so that it's initialized early | Eric Christopher | 2015-02-13 | 1 | -2/+1
  enough we can use it to initialize frame lowering.
  llvm-svn: 229168
* Stash the TargetMachine on the subtarget so we can access it later. | Eric Christopher | 2015-02-13 | 1 | -2/+3
  Clean up a subtarget function that has it passed in while we're at it.
  llvm-svn: 229164
* [PowerPC] Implement the vpopcnt instructions for POWER8 | Bill Schmidt | 2015-02-03 | 1 | -0/+2
  Patch by Kit Barton.

  Add the vector population count instructions for byte, halfword, word, and doubleword sizes. There are two major changes here:

  PPCISelLowering.cpp: Make CTPOP legal for vector types.
  PPCRegisterInfo.td: Added v2i64 to the VRRC register definition. This is needed for the doubleword variations of the integer ops that were added in P8.

  Test Plan

  Test the instruction vpopcnt* encoding/decoding in ppc64-encoding-vmx.s

  Test the generation of the vpopcnt instructions for various vector data types. When adding the v2i64 type to the Vector Register set, I also needed to add the appropriate bit conversion patterns between v2i64 and the existing vector types. Tests for these conversions were also added in the test case by passing a different vector type as a parameter into the test functions. There is also a run step that will ensure the vpopcnt instructions are generated when the vsx feature is disabled.
  llvm-svn: 228046
* Move DataLayout back to the TargetMachine from TargetSubtargetInfo | Eric Christopher | 2015-01-26 | 1 | -4/+0
  derived classes.

  Since global data alignment, layout, and mangling is often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be laid out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets(*) have had subtarget dependent code moved out and onto the TargetMachine.

  *One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features.
  llvm-svn: 227113
* [PowerPC] Loosen ELFv1 PPC64 func descriptor loads for indirect calls | Hal Finkel | 2015-01-15 | 1 | -0/+4
  Function pointers under PPC64 ELFv1 (which is used on PPC64/Linux on the POWER7, A2 and earlier cores) are really pointers to a function descriptor, a structure with three pointers: the actual pointer to the code to which to jump, the pointer to the TOC needed by the callee, and an environment pointer. We used to chain these loads, and make them opaque to the rest of the optimizer, so that they'd always occur directly before the call. This is not necessary, and in fact, highly suboptimal on embedded cores. Once the function pointer is known, the loads can be performed ahead of time; in fact, they can be hoisted out of loops.

  Now these function descriptors are almost always generated by the linker, and thus the contents of the descriptors are invariant. As a result, by default, we'll mark the associated loads as invariant (allowing them to be hoisted out of loops). I've added a target feature to turn this off, however, just in case someone needs that option (constructing an on-stack descriptor, casting it to a function pointer, and then calling it cannot be well-defined C/C++ code, but I can imagine some JIT-compilation system doing so).

  Consider this simple test:

    $ cat call.c
    typedef void (*fp)();
    void bar(fp x) {
      for (int i = 0; i < 1600000000; ++i)
        x();
    }

    $ cat main.c
    typedef void (*fp)();
    void bar(fp x);
    void foo() {}
    int main() {
      bar(foo);
    }

  On the PPC A2 (the BG/Q supercomputer), marking the function-descriptor loads as invariant brings the execution time down to ~8 seconds from ~32 seconds with the loads in the loop.

  The difference on the POWER7 is smaller. Compiling with:

    gcc -std=c99 -O3 -mcpu=native call.c main.c : ~6 seconds [this is 4.8.2]
    clang -O3 -mcpu=native call.c main.c : ~5.3 seconds
    clang -O3 -mcpu=native call.c main.c -mno-invariant-function-descriptors : ~4 seconds

  (looks like we'd benefit from additional loop unrolling here, as a first guess, because this is faster with the extra loads)

  The -mno-invariant-function-descriptors will be added to Clang shortly.
  llvm-svn: 226207
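The descriptor layout and the effect of hoisting can be modeled in portable C++. This is only a sketch: `FunctionDescriptor`, its field names, and `callRepeatedly` are hypothetical, and a real ELFv1 descriptor is produced by the linker and consumed by the call sequence rather than by user code.

```cpp
#include <cstdint>

// Model of the three-pointer descriptor described above.
struct FunctionDescriptor {
  void (*entry)();  // address of the code to jump to
  void *toc;        // TOC pointer needed by the callee
  void *env;        // environment pointer
};

static int g_calls = 0;
static void callee() { ++g_calls; }

// Because the descriptor contents are treated as invariant, the load of
// `entry` can be done once before the loop instead of on every iteration.
static void callRepeatedly(const FunctionDescriptor &fd, int n) {
  void (*entry)() = fd.entry;  // hoisted, loop-invariant load
  for (int i = 0; i < n; ++i)
    entry();
}
```

Calling `callRepeatedly` with a descriptor whose `entry` is `callee` invokes the callee `n` times while reading the descriptor only once.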
* [PPC64] Add support for the ICBT instruction on POWER8. | Bill Schmidt | 2015-01-14 | 1 | -0/+2
  Patch by Kit Barton.

  Support for the ICBT instruction is currently present, but limited to embedded processors. This change adds a new FeatureICBT that can be used to identify whether the ICBT instruction is available on a specific processor.

  Two new tests are added:
  * Positive test to ensure the icbt instruction is present when using -mcpu=pwr8
  * Negative test to ensure the icbt instruction is not generated when using -mcpu=pwr7

  Both test cases use the Prefetch opcode in LLVM. They are based on the ppc64-prefetch.ll test case.
  llvm-svn: 226033
* [cleanup] Re-sort all the #include lines in LLVM using | Chandler Carruth | 2015-01-14 | 1 | -1/+1
  utils/sort_includes.py.

  I clearly haven't done this in a while, so more changed than usual. This even uncovered a missing include from the InstrProf library that I've added. No functionality changed here, just mechanical cleanup of the include order.
  llvm-svn: 225974
* [PowerPC] Add a flag for experimenting with subreg liveness tracking | Hal Finkel | 2015-01-09 | 1 | -0/+2
  This cannot yet be enabled by default because it causes ~50 miscompiles in the test suite.
  llvm-svn: 225497
* [PowerPC] Add support for the CMPB instruction | Hal Finkel | 2015-01-03 | 1 | -0/+2
  Newer POWER cores, and the A2, support the cmpb instruction. This instruction compares its operands, treating each of the 8 bytes in the GPRs separately, returning a 'mask' result of 0 (for false) or -1 (for true) in each byte.

  Code generation support is added, in the form of a PPCISelDAGToDAG DAG-preprocessing routine, that recognizes patterns close to what the instruction computes (either exactly, or related by a constant masking operation), and generates the cmpb instruction (along with any necessary constant masking operation). This can be expanded if use cases arise.
  llvm-svn: 225106
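The byte-wise behaviour described above can be emulated in a few lines. This is a software model for illustration only; the function name `cmpbModel` is made up here, and it says nothing about how the backend pattern-matches the instruction.

```cpp
#include <cstdint>

// For each of the 8 bytes: 0xFF (that is, -1 as a byte) where the operands
// agree, 0x00 where they differ.
inline std::uint64_t cmpbModel(std::uint64_t rs, std::uint64_t rb) {
  std::uint64_t result = 0;
  for (int byte = 0; byte < 8; ++byte) {
    const int shift = byte * 8;
    const std::uint64_t a = (rs >> shift) & 0xFF;
    const std::uint64_t b = (rb >> shift) & 0xFF;
    if (a == b)
      result |= std::uint64_t{0xFF} << shift;
  }
  return result;
}
```

Comparing a value against zero with this model gives a mask of its zero bytes, which is the kind of pattern (for example in string scanning) that a byte-compare instruction is useful for.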
* [PowerPC] Reduce names from Power8Vector to P8Vector | Bill Schmidt | 2014-10-10 | 1 | -2/+2
  Per Hal Finkel's review, improving typability of some variable names.
  llvm-svn: 219514
* [PowerPC] Add feature for Power8 vector extensions | Bill Schmidt | 2014-10-10 | 1 | -0/+2
  The current VSX feature for PowerPC specifies availability of the VSX instructions added with the 2.06 architecture version. With 2.07, the architecture adds new instructions to both the Category:Vector and Category:VSX instruction sets. Additionally, unaligned vector storage operations have improved performance.

  This patch adds a feature to provide access to the new instructions and performance capabilities of Power8. For compatibility with GCC, the feature is controlled via a new -mpower8-vector switch, and the feature causes the __POWER8_VECTOR__ builtin define to be generated by the preprocessor.

  There is a companion patch for cfe being committed at the same time.
  llvm-svn: 219501
* [PowerPC] Modern Book-E cores support sync | Hal Finkel | 2014-10-02 | 1 | -0/+2
  Older Book-E cores, such as the PPC 440, support only msync (which has the same encoding as sync 0), but not any of the other sync forms. Newer Book-E cores, however, do support sync, and for performance reasons we should allow the use of the more-general form.

  This refactors msync use into its own feature group so that it applies by default only to older Book-E cores (of the relevant cores, we only have definitions for the PPC440/450 currently).
  llvm-svn: 218923
* constify the TargetMachine argument used in the subtarget and | Eric Christopher | 2014-10-01 | 1 | -1/+1
  lowering constructors.
  llvm-svn: 218832