summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* [PowerPC] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-01-1310-255/+433
| | | | | | other minor fixes (NFC). llvm-svn: 291872
* [X86] updating TTI costs for arithmetic instructions on X86\SLM arch.Mohammed Agabaria2017-01-112-2/+3
| | | | | | | | | | | | updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. In case if the real operands bitwidth <= 16. Differential Revision: https://reviews.llvm.org/D28104 llvm-svn: 291657
* [PowerPC] Implement missing ISA 2.06 instructions.Tony Jiang2017-01-054-1/+18
| | | | | | | Instructions: fctidu[.], fctiwu[.], ftdiv, ftsqrt are not implemented. Implement them and add corresponding test cases in this patch. llvm-svn: 291116
* [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility)Hal Finkel2017-01-041-40/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change aims to unify and correct our logic for when we need to allow for the possibility of the linker adding a TOC restoration instruction after a call. This comes up in two contexts: 1. When determining tail-call eligibility. If we make a tail call (i.e. directly branch to a function) then there is no place for the linker to add a TOC restoration. 2. When determining when we need to add a nop instruction after a call. Likewise, if there is a possibility that the linker might need to add a TOC restoration after a call, then we need to put a nop after the call (the bl instruction). First problem: We were using similar, but different, logic to decide (1) and (2). This is just wrong. Both the resideInSameModule function (used when determining tail-call eligibility) and the isLocalCall function (used when deciding if the post-call nop is needed) were supposed to be determining the same underlying fact (i.e. might a TOC restoration be needed after the call). The same logic should be used in both places. Second problem: The logic in both places was wrong. We only know that two functions will share the same TOC when both functions come from the same section of the same object. Otherwise the linker might cause the functions to use different TOC base addresses (unless the multi-TOC linker option is disabled, in which case only shared-library boundaries are relevant). There are a number of factors that can cause functions to be placed in different sections or come from different objects (-ffunction-sections, explicitly-specified section names, COMDAT, weak linkage, etc.). All of these need to be checked. The existing logic only checked properties of the callee, but the properties of the caller must also be checked (for example, calling from a function in a COMDAT section means calling between sections). There was a conceptual error in the resideInSameModule function in that it allowed tail calls to functions with weak linkage and protected/hidden visibility. While protected/hidden visibility does prevent the function implementation from being replaced at runtime (via interposition), it does not prevent the linker from using an alternate implementation at link time (i.e. using some strong definition to replace the provided weak one during linking). If this happens, then we're still potentially looking at a required TOC restoration upon return. Otherwise, in general, the post-call nop is needed wherever ELF interposition needs to be supported. We don't currently support ELF interposition at the IR level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html for more information), and I don't think we should try to make it appear to work in the backend in spite of that fact. Unfortunately, because of the way that the ABI works, we need to generate code as if we supported interposition whenever the linker might insert stubs for the purpose of supporting it. Differential Revision: https://reviews.llvm.org/D27231 llvm-svn: 291003
* [Power9] Processor Model for SchedulingEhsan Amiri2016-12-194-3/+1145
| | | | | | | | PWR9 processor model for instruction scheduling. A subsequent patch will migrate PWR9 to Post RA MIScheduler. https://reviews.llvm.org/D24525 llvm-svn: 290102
* Revert r289638: [PowerPC] Fix logic dealing with nop after calls (and ↵Chandler Carruth2016-12-161-25/+40
| | | | | | | | | | | | | tail-call eligibility) This patch appears to result in trampolines in vtables being miscompiled when they in turn tail call a method. I've posted some preliminary details about the failure on the thread for this commit and talked to Hal. He was comfortable going ahead and reverting until we sort out what is wrong. llvm-svn: 289928
* [Power9] Allow AnyExt immediates for XXSPLTIBNemanja Ivanovic2016-12-152-7/+7
| | | | | | | | | | In some situations, the BUILD_VECTOR node that builds a v18i8 vector by a splat of an i8 constant will end up with signed 8-bit values and other situations, it'll end up with unsigned ones. Handle both situations. Fixes PR31340. llvm-svn: 289804
* Use PIC relocation model as default for PowerPC64 ELF.Joerg Sonnenberger2016-12-151-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of the PowerPC64 code generation for the ELF ABI is already PIC. There are four main exceptions: (1) Constant pointer arrays etc. should in writeable sections. (2) The TOC restoration NOP after a call is needed for all global symbols. While GNU ld has a workaround for questionable GCC self-calls, we trigger the checks for calls from COMDAT sections as they cross input sections and are therefore not considered self-calls. The current decision is questionable and suboptimal, but outside the scope of the change. (3) TLS access can not use the initial-exec model. (4) Jump tables should use relative addresses. Note that the current encoding doesn't work for the large code model, but it is more compact than the default for any non-trivial jump table. Improving this is again beyond the scope of this change. At least (1) and (3) are assumptions made in target-independent code and introducing additional hooks is a bit messy. Testing with clang shows that a -fPIC binary is 600KB smaller than the corresponding -fno-pic build. Separate testing from improved jump table encodings would explain only about 100KB or so. The rest is expected to be a result of more aggressive immediate forming for -fno-pic, where the -fPIC binary just uses TOC entries. This change brings the LLVM output in line with the GCC output, other PPC64 compilers like XLC on AIX are known to produce PIC by default as well. The relocation model can still be provided explicitly, i.e. when using MCJIT. One test case for case (1) is included, other test cases with relocation mode sensitive behavior are wired to static for now. They will be reviewed and adjusted separately. Differential Revision: https://reviews.llvm.org/D26566 llvm-svn: 289743
* [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility)Hal Finkel2016-12-141-40/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change aims to unify and correct our logic for when we need to allow for the possibility of the linker adding a TOC restoration instruction after a call. This comes up in two contexts: 1. When determining tail-call eligibility. If we make a tail call (i.e. directly branch to a function) then there is no place for the linker to add a TOC restoration. 2. When determining when we need to add a nop instruction after a call. Likewise, if there is a possibility that the linker might need to add a TOC restoration after a call, then we need to put a nop after the call (the bl instruction). First problem: We were using similar, but different, logic to decide (1) and (2). This is just wrong. Both the resideInSameModule function (used when determining tail-call eligibility) and the isLocalCall function (used when deciding if the post-call nop is needed) were supposed to be determining the same underlying fact (i.e. might a TOC restoration be needed after the call). The same logic should be used in both places. Second problem: The logic in both places was wrong. We only know that two functions will share the same TOC when both functions come from the same section of the same object. Otherwise the linker might cause the functions to use different TOC base addresses (unless the multi-TOC linker option is disabled, in which case only shared-library boundaries are relevant). There are a number of factors that can cause functions to be placed in different sections or come from different objects (-ffunction-sections, explicitly-specified section names, COMDAT, weak linkage, etc.). All of these need to be checked. The existing logic only checked properties of the callee, but the properties of the caller must also be checked (for example, calling from a function in a COMDAT section means calling between sections). There was a conceptual error in the resideInSameModule function in that it allowed tail calls to functions with weak linkage and protected/hidden visibility. While protected/hidden visibility does prevent the function implementation from being replaced at runtime (via interposition), it does not prevent the linker from using an alternate implementation at link time (i.e. using some strong definition to replace the provided weak one during linking). If this happens, then we're still potentially looking at a required TOC restoration upon return. Otherwise, in general, the post-call nop is needed wherever ELF interposition needs to be supported. We don't currently support ELF interposition at the IR level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html for more information), and I don't think we should try to make it appear to work in the backend in spite of that fact. This will yield subtle bugs if interposition is attempted. As a result, regardless of whether we're in PIC mode, we don't assume that we need to add the nop to support the possibility of ELF interposition. However, the necessary check is in place (i.e. calling GV->isInterposable and TM.shouldAssumeDSOLocal) so when we have functions for which interposition is allowed at the IR level, we'll add the nop as necessary. In the mean time, we'll generate more tail calls and fewer nops when compiling position-independent code. Differential Revision: https://reviews.llvm.org/D27231 llvm-svn: 289638
* [AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What ↵Eugene Zelenko2016-12-122-40/+65
| | | | | | You Use warnings; other minor fixes (NFC). llvm-svn: 289475
* [PPC] Prefer direct move on power8 if load 1 or 2 bytes to VSRGuozhi Wei2016-12-122-1/+9
| | | | | | | | | Power8 has MTVSRWZ but no LXSIBZX/LXSIHZX, so move 1 or 2 bytes to VSR through MTVSRWZ is much faster than store the extended value into stack and load it with LXSIWZX. This patch fixes pr31144. Differential Revision: https://reviews.llvm.org/D27287 llvm-svn: 289473
* [AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What ↵Eugene Zelenko2016-12-092-16/+28
| | | | | | You Use warnings; other minor fixes (NFC). llvm-svn: 289282
* [PPC] Add intrinsics for vector extract word and vector insert word.Sean Fertile2016-12-091-0/+9
| | | | | Revision: https://reviews.llvm.org/D26547 llvm-svn: 289227
* [PowerPC] Improvements for BUILD_VECTOR Vol. 4Nemanja Ivanovic2016-12-062-27/+126
| | | | | | | | | | | | This is the final patch in the series of patches that improves BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations to remove redundant instructions. It also adds a large test case which encompasses a large set of code patterns that build vectors - this test case was the motivator for this series of patches. Differential Revision: https://reviews.llvm.org/D26066 llvm-svn: 288800
* [PPC] Slightly Improve Assembly Parsing errors and add EOL commentNirav Dave2016-12-051-194/+119
| | | | | | | | parsing tests. NFC intended. llvm-svn: 288667
* [ppc] Correctly compute the cost of loading 32/64 bit memory into VSRGuozhi Wei2016-12-031-5/+14
| | | | | | | | VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory cost model, so the vectorization of the test case in pr30990 is beneficial. Differential Revision: https://reviews.llvm.org/D26713 llvm-svn: 288560
* IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵Peter Collingbourne2016-12-021-1/+1
| | | | | | | | | | | | | type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
* Move FrameInstructions from MachineModuleInfo to MachineFunctionMatthias Braun2016-11-301-9/+9
| | | | | | | | | | | This is per function data so it is better kept at the function instead of the module. This is a necessary step to have machine module passes work properly. Differential Revision: https://reviews.llvm.org/D27185 llvm-svn: 288291
* [PowerPC] Preserve machine dominator tree in PPCVSXFMAMutateKrzysztof Parzyszek2016-11-301-0/+4
| | | | | | It is needed by LiveIntervalAnalysis. llvm-svn: 288243
* [PowerPC] Improvements for BUILD_VECTOR Vol. 2Nemanja Ivanovic2016-11-291-4/+98
| | | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D26023 This patch adds support for converting a vector of loads into a single load if the loads are consecutive (in either direction). llvm-svn: 288219
* [PowerPC] Improvements for BUILD_VECTOR Vol. 2Nemanja Ivanovic2016-11-292-1/+98
| | | | | | | | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D25980 This is the 2nd patch in a series of 4 that improve the lowering and combining for BUILD_VECTOR nodes on PowerPC. This particular patch combines a build vector of fp-to-int conversions into an fp-to-int conversion of a build vector of fp values. For example: Converts (build_vector (fp_to_[su]i $A), (fp_to_[su]i $B), ...) Into (fp_to_[su]i (build_vector $A, $B, ...))). Which is a natural match for much cleaner code. llvm-svn: 288218
* Revert https://reviews.llvm.org/rL287679Nemanja Ivanovic2016-11-292-23/+6
| | | | | | | This commit caused some miscompiles that did not show up on any of the bots. Reverting until we can investigate the cause of those failures. llvm-svn: 288214
* [PowerPC] Improvements for BUILD_VECTOR Vol. 1Nemanja Ivanovic2016-11-293-58/+361
| | | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D25912 This is the first patch in a series of 4 that improve the lowering and combining for BUILD_VECTOR nodes on PowerPC. llvm-svn: 288152
* [PowerPC] Remove InstAlias definitions that cause incorrect assemblyNemanja Ivanovic2016-11-231-11/+15
| | | | | | | | | | | | | In rL283190, I added some InstAlias definitions to generate extended mnemonics for some uses of the XXPERMDI instruction. However, when the assembler matches these extended mnemonics, it matches the new instruction in situations where it should match the old one. This patch removes these definitions and accomplishes that by defining these mnemonics with additional instructions that are isCodeGenOnly. Fixes PR31127. llvm-svn: 287765
* [PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LENemanja Ivanovic2016-11-222-6/+23
| | | | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D26861 It also fixes PR30730. Committing on behalf of Lei Huang. llvm-svn: 287679
* Fix comment typos. NFC.Simon Pilgrim2016-11-201-2/+2
| | | | | | Identified by Pedro Giffuni in PR27636. llvm-svn: 287486
* Check that emitted instructions meet their predicates on all targets except ↵Daniel Sanders2016-11-191-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | ARM, Mips, and X86. Summary: * ARM is omitted from this patch because this check appears to expose bugs in this target. * Mips is omitted from this patch because this check either detects bugs or deliberate emission of instructions that don't satisfy their predicates. One deliberate use is the SYNC instruction where the version with an operand is correctly defined as requiring MIPS32 while the version without an operand is defined as an alias of 'SYNC 0' and requires MIPS2. * X86 is omitted from this patch because it doesn't use the tablegen-erated MCCodeEmitter infrastructure. Patches for ARM and Mips will follow. Depends on D25617 Reviewers: tstellarAMD, jmolloy Subscribers: wdng, jmolloy, aemerson, rengolin, arsenm, jyknight, nemanjai, nhaehnle, tstellarAMD, llvm-commits Differential Revision: https://reviews.llvm.org/D25618 llvm-svn: 287439
* [PPC] limit line width to 80 charactersEhsan Amiri2016-11-181-1/+2
| | | | | | NFC. Forgot to fix this in the original commit. llvm-svn: 287350
* [Power9] Add patterns for vnegd, vnegwEhsan Amiri2016-11-181-2/+7
| | | | | | | Exploit new instructions by adding patterns to .td file. https://reviews.llvm.org/D26551 llvm-svn: 287334
* [PPC][DAGCombine] Convert SETCC to subtract when the result is zero extendedEhsan Amiri2016-11-182-1/+88
| | | | | | | | | | | | | | | | | When we see a SETCC whose only users are zero extend operations, we can replace it with a subtraction. This results in doing all calculations in GPRs and avoids CR use. Currently we do this only for ULT, ULE, UGT and UGE condition codes. There are ways that this can be extended. For example for signed condition codes. In that case we will be introducing additional sign extend instructions, so more careful profitability analysis may be required. Another direction to extend this is for equal, not equal conditions. Also when users of SETCC are any_ext or sign_ext, we might be able to do something similar. llvm-svn: 287329
* Always use relative jump table encodings on PowerPC64.Joerg Sonnenberger2016-11-162-0/+59
| | | | | | | | | | | | | | | | | For the default, small and medium code model, use the existing difference from the jump table towards the label. For all other code models, setup the picbase and use the difference between the picbase and the block address. Overall, this results in smaller data tables at the expensive of one or two more arithmetic operation at the jump site. Given that we only create jump tables with a lot more than two entries, it is a net win in size. For larger code models the assumption remains that individual functions are no larger than 2GB. Differential Revision: https://reviews.llvm.org/D26336 llvm-svn: 287059
* vector load store with length (left justified) llvm portionZaara Syeda2016-11-151-4/+16
| | | | llvm-svn: 286993
* [PowerPC] Implement BE VSX load/store builtins - llvm portion.Tony Jiang2016-11-152-0/+15
| | | | | | | | | | | | This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE, they behaves exactly the same with vec_xl and vec_xst, therefore they are simply implemented by defining a matching macro. On LE, they are implemented by defining new builtins and intrinsics. For int/float/long long/double, it is just a load (lxvw4x/lxvd2x) or store(stxvw4x/stxvd2x). For char/char/short, we also need some extra shuffling before or after call the builtins to get the desired BE order. For int128, simply call vec_xl or vec_xst. llvm-svn: 286967
* [PPC] Add intrinsic mapping to the xscvhpsp instructionSean Fertile2016-11-141-0/+9
| | | | | | | | | add an intrinsic to expose the 'VSX Scalar Convert Half-Precision to Single-Precision' instruction. Differential review: https://reviews.llvm.org/D26536 llvm-svn: 286862
* [PPC] add intrinsics for vec extract exp/significand and vec test data class.Sean Fertile2016-11-141-6/+18
| | | | | | Differential Revision: https://reviews.llvm.org/D26272 llvm-svn: 286829
* [PowerPC] Add remaining vector permute builtins in altivec.h - LLVM portionNemanja Ivanovic2016-11-112-5/+23
| | | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D26480 Adds all the intrinsics used for various permute builtins that will be added to altivec.h. llvm-svn: 286638
* [PowerPC] Add vector conversion builtins to altivec.h - LLVM portionNemanja Ivanovic2016-11-111-8/+16
| | | | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D26307 Adds all the intrinsics used for various conversion builtins that will be added to altivec.h. These are type conversions between various types of vectors. llvm-svn: 286596
* Add a blank line for a test commit.Sean Fertile2016-11-111-0/+1
| | | | llvm-svn: 286550
* [DAG Combiner] Fix the native computation of the Newton series for reciprocalsEvandro Menezes2016-11-102-6/+7
| | | | | | | | | | | | The generic infrastructure to compute the Newton series for reciprocal and reciprocal square root was conceived to allow a target to compute the series itself. However, the original code did not properly consider this condition if returned by a target. This patch addresses the issues to allow a target to compute the series on its own. Differential revision: https://reviews.llvm.org/D22975 llvm-svn: 286523
* Sink all of the code relying on the MachO MachineModuleInfo to liveChandler Carruth2016-11-031-47/+51
| | | | | | | | | | | | | | | behind the test that the MachineModuleInfo analysis was actually available and can be used. While the MachO bits may well be reasonable to assume in the darwin assembly printer, the analysis isn't constructively guaranteed anywhere I could find so it seems safest to avoid crashing here. This issue was found with PVS-Studio. Pretty sure the Clang Static Anaylzer flags similar issues but we've probably never pointed it at this code effectively. llvm-svn: 285972
* NFC - Test commit.Tony Jiang2016-11-031-1/+0
| | | | | | Delete an empty line at the end of README.txt file. llvm-svn: 285964
* Create the virtual register for the global base in the intersection ofJoerg Sonnenberger2016-11-021-2/+2
| | | | | | | | GPRC and GPRC_NOR0 (or the 64bit equivalent) and not just the latter. GPRC_NOR0 contains ZERO as alternative meaning of r0 and is therefore not a true subclass of GPRC. llvm-svn: 285813
* [PowerPC] Implement vector shift builtins - llvm portionNemanja Ivanovic2016-11-011-2/+4
| | | | | | | This patch corresponds to review https://reviews.llvm.org/D26095. Committing on behalf of Tony Jiang. llvm-svn: 285681
* [PPC] add absolute difference altivec instructions and matching intrinsicsNemanja Ivanovic2016-10-311-0/+11
| | | | | | | This patch corresponds to review https://reviews.llvm.org/D26072. Committing on behalf of Sean Fertile. llvm-svn: 285627
* Implement vector count leading/trailing bytes with zero lsb and vector parityNemanja Ivanovic2016-10-281-7/+14
| | | | | | | | | builtins - llvm portion This patch corresponds to review https://reviews.llvm.org/D26003. Committing on behalf of Zaara Syeda. llvm-svn: 285434
* [PowerPC] - No SExt/ZExt needed for count trailing zerosNemanja Ivanovic2016-10-271-2/+4
| | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D25896 It just eliminates the redundant ZExt after a count trailing zeros instruction. llvm-svn: 285267
* [PowerPC] Implement vec_insert_exp builtins - llvm portionNemanja Ivanovic2016-10-261-2/+2
| | | | | | | This revision corresponds to review: https://reviews.llvm.org/D25957. Committing on behalf of Zaara Syeda. llvm-svn: 285225
* Target: Change various section classifiers in TargetLoweringObjectFile to ↵Peter Collingbourne2016-10-242-4/+4
| | | | | | | | | | | | | | | | take a GlobalObject. These functions are about classifying a global which will actually be emitted, so it does not make sense for them to take a GlobalValue which may for example be an alias. Change the Mach-O object writer and the Hexagon, Lanai and MIPS backends to look through aliases before using TargetLoweringObjectFile interfaces. These are functional changes but all appear to be bug fixes. Differential Revision: https://reviews.llvm.org/D25917 llvm-svn: 285006
* [PPC] Generate positive FP zero using xor insn instead of loading from ↵Ehsan Amiri2016-10-245-0/+43
| | | | | | | | | | | constant area https://reviews.llvm.org/D23614 Currently we load +0.0 from constant area. That can change to be generated using XOR instruction. llvm-svn: 284995
* [PPC] Better codegen for AND, ANY_EXT, SRL sequenceEhsan Amiri2016-10-242-0/+23
| | | | | | | | https://reviews.llvm.org/D24924 This improves the code generated for a sequence of AND, ANY_EXT, SRL instructions. This is a targetted fix for this special pattern. The pattern is generated by target independet dag combiner and so a more general fix may not be necessary. If we come across other similar cases, some ideas for handling it are discussed on the code review. llvm-svn: 284983
OpenPOWER on IntegriCloud