summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* [PowerPC] Reverting sequence of patches for elimination of comparison ↵Nemanja Ivanovic2017-09-261-1061/+0
| | | | | | | | | | | | | | | | | | | | | | instructions In the past while, I've committed a number of patches in the PowerPC back end aimed at eliminating comparison instructions. However, this causes some failures in proprietary source and these issues are not observed in SPEC or any open source packages I've been able to run. As a result, I'm pulling the entire series and will refactor it to: - Have a single entry point for easy control - Have fine-grained control over which patterns we transform A side-effect of this is that test cases for these patches (and modified by them) are XFAIL-ed. This is a temporary measure as it is counter-productive to remove/modify these test cases and then have to modify them again when the refactored patch is recommitted. The failure will be investigated in parallel to the refactoring effort and the recommit will either have a fix for it or will leave this transformation off by default until the problem is resolved. llvm-svn: 314244
* [PowerPC] Eliminate compares - add i64 sext/zext handling for SETLT/SETGTNemanja Ivanovic2017-09-251-2/+76
| | | | | | | | As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential review. llvm-svn: 314106
* [CodeGenPrepare][NFC] Rename TargetTransformInfo::expandMemCmp -> ↵Clement Courbet2017-09-252-2/+2
| | | | | | | | | | | | | | | | TargetTransformInfo::enableMemCmpExpansion. Summary: Right now there are two functions with the same name, one does the work and the other one returns true if expansion is needed. Rename TargetTransformInfo::expandMemCmp to make it more consistent with other members of TargetTransformInfo. Remove the unused Instruction* parameter. Differential Revision: https://reviews.llvm.org/D38165 llvm-svn: 314096
* [PowerPC] Eliminate compares - add i64 sext/zext handling for SETLE/SETGENemanja Ivanovic2017-09-241-0/+96
| | | | | | | | As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential review. llvm-svn: 314073
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETULT/SETUGTNemanja Ivanovic2017-09-231-3/+34
| | | | | | | | As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314062
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETULE/SETUGENemanja Ivanovic2017-09-232-1/+74
| | | | | | | | As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314060
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLT/SETGTNemanja Ivanovic2017-09-231-0/+101
| | | | | | | | As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314055
* [PowerPC] Mark P9 scheduling model completeStefan Pintilie2017-09-224-266/+503
| | | | | | | | | | | | This patch just adds the missing information to the P9 scheduling model to allow the model to be marked as complete. The model has been verified against P9 documentation. The model was verified with utils/schedcover.py. Differential Revision: https://reviews.llvm.org/D35695 llvm-svn: 314026
* [XRay] support conditional return on PPC.Tim Shen2017-09-223-56/+110
| | | | | | | | | | | | Summary: Conditional returns were not taken into consideration at all. Implement them by turning them into jumps and normal returns. This means there is a slightly higher performance penalty for conditional returns, but this is the best we can do, and it still disturbs little of the rest. Reviewers: dberris, echristo Subscribers: sanjoy, nemanjai, hiraditya, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D38102 llvm-svn: 314005
* Remove the default clause from a fully-covering switchNemanja Ivanovic2017-09-221-4/+10
| | | | | | | to appease bots that use a compiler that warns about this and use -Werror. llvm-svn: 313980
* Recommit r310809 with a fix for the spill problemNemanja Ivanovic2017-09-221-51/+150
| | | | | | | | | | This patch re-commits the patch that was pulled out due to a problem it caused, but with a fix for the problem. The fix was reviewed separately by Eric Christopher and Hal Finkel. Differential Revision: https://reviews.llvm.org/D38054 llvm-svn: 313978
* [Power9] Spill gprs to vector registers rather than stackZaara Syeda2017-09-214-1/+121
| | | | | | | | | | This patch updates register allocation to enable spilling gprs to volatile vector registers rather than the stack. It can be enabled for Power9 with option -ppc-enable-gpr-to-vsr-spills. Differential Revision: https://reviews.llvm.org/D34815 llvm-svn: 313886
* [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD.Tony Jiang2017-09-191-0/+116
| | | | | | | | | | Two blocks prior to the join each perform an li and the the join block has an add using the initialized register. Optimize each predecessor block to instead use addi and delete the li's and add. Differential Revision: https://reviews.llvm.org/D36734 llvm-svn: 313639
* [Power9] Add missing Power9 instructions.Tony Jiang2017-09-195-442/+67
| | | | | | | The following 8 instructions are implemented in this patch. addpcis(subpcis, lnia), darn, maddhd, maddhdu, maddld, setb llvm-svn: 313636
* [Power9] Add missing instructions: extswsli, popcntbStefan Pintilie2017-09-132-0/+22
| | | | | | | | Added the following P9 instructions: extswsli, extswsli., popcntb Differential Revision: https://reviews.llvm.org/D37342 llvm-svn: 313147
* Update branch coalescing to be a PowerPC specific passLei Huang2017-09-124-0/+794
| | | | | | | | | | | | Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets. Pass is currently off by default. Enabled via -enable-ppc-branch-coalesce. Differential Revision : https: // reviews.llvm.org/D32776 llvm-svn: 313061
* PPC: Don't select lxv/stxv for insufficiently aligned stack slots.Kyle Butt2017-09-091-1/+11
| | | | | | | | | | | | | | The lxv/stxv instructions require an offset that is 0 % 16. Previously we were selecting lxv/stxv for loads and stores to the stack where the offset from the slot was a multiple of 16, but the stack slot was not 16 or more byte aligned. When the frame gets lowered these transform to r(1|31) + slot + offset. If slot is not aligned, slot + offset may not be 0 % 16. Now we require 16 byte or more alignment for select lxv/stxv to stack slots. Includes a testcase that shows both sufficiently and insufficiently aligned stack slots. llvm-svn: 312843
* [XRay][CodeGen][PowerPC] Fix tail exit codegen for XRay in PPCDean Michael Berris2017-09-081-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes code-gen for XRay in PPC. The regression wasn't caught by codegen tests which we add in this change. What happened was the following: - For tail exits, we used to unconditionally prepend the returns/exits with a pseudo-instruction that gets lowered to the instrumentation sled (and leave the actual return/exit instruction as-is). - Changes to the XRay instrumentation pass caused the tail exits to suddenly also emit the tail exit pseudo-instruction, since the check for whether a return instruction was also a call instruction meant it was a tail exit instruction. - None of the tests caught the regression either due to non-existent tests, or the tests being disabled/removed for continuous breakage. This change re-introduces some of the basic tests and verifies that we're back to a state that allows the back-end to generate appropriate XRay instrumented binaries for PPC in the presence of tail exits. Reviewers: echristo, timshen Subscribers: nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D37570 llvm-svn: 312772
* [PowerPC] Don't use xscvdpspn on the P7Hal Finkel2017-09-061-3/+6
| | | | | | | xscvdpspn was not introduced until the P8, so don't use it on the P7. Fixes a regression introduced in r288152. llvm-svn: 312612
* [PPC][NFC] Renaming things with 'xxinsert' moniker to 'vecinsert' to make it ↵Tony Jiang2017-09-054-9/+9
| | | | | | | | more general. Commit on behalf of Graham Yiu (gyiu@ca.ibm.com) llvm-svn: 312547
* [PowerPC] eliminate redundant compare instructionHiroshi Inoue2017-09-051-0/+299
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If multiple conditional branches are executed based on the same comparison, we can execute multiple conditional branches based on the result of one comparison on PPC. For example, if (a == 0) { ... } else if (a < 0) { ... } can be executed by one compare and two conditional branches instead of two pairs of a compare and a conditional branch. This patch identifies a code sequence of the two pairs of a compare and a conditional branch and merge the compares if possible. To maximize the opportunity, we do canonicalization of code sequence before merging compares. For the above example, the input for this pass looks like: cmplwi r3, 0 beq 0, .LBB0_3 cmpwi r3, -1 bgt 0, .LBB0_4 So, before merging two compares, we canonicalize it as cmpwi r3, 0 ; cmplwi and cmpwi yield same result for beq beq 0, .LBB0_3 cmpwi r3, 0 ; greather than -1 means greater or equal to 0 bge 0, .LBB0_4 The generated code should be cmpwi r3, 0 beq 0, .LBB0_3 bge 0, .LBB0_4 Differential Revision: https://reviews.llvm.org/D37211 llvm-svn: 312514
* Temporarily revert "Update branch coalescing to be a PowerPC specific pass"Eric Christopher2017-08-314-785/+0
| | | | | | | | From comments and code review it wasn't intended to be enabled by default yet. This reverts commit r311588. llvm-svn: 312214
* [Power9] Add new instructions for floating point status and control registers.Stefan Pintilie2017-08-282-0/+91
| | | | | | | | | Added the following P9 instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl Differential Revision: https://reviews.llvm.org/D37167 llvm-svn: 311903
* Untabify.NAKAMURA Takumi2017-08-281-2/+2
| | | | llvm-svn: 311875
* [DAG] convert vector select-of-constants to logic/mathSanjay Patel2017-08-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | This goes back to a discussion about IR canonicalization. We'd like to preserve and convert more IR to 'select' than we currently do because that's likely the best choice in IR: http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html ...but that's often not true for codegen, so we need to account for this pattern coming in to the backend and transform it to better DAG ops. Steps in this patch: 1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely enable this transform. Other targets will probably want that anyway to distinguish scalars from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary. 2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the vector ext is often free. Implementing a more general fold using xor+and can be a follow-up for targets that don't have a legal vselect. It's also possible that we can remove the TLI hook for the special case fold implemented here because we're eliminating a constant, but it needs to be tested on other targets. Differential Revision: https://reviews.llvm.org/D36840 llvm-svn: 311731
* Update branch coalescing to be a PowerPC specific passLei Huang2017-08-234-0/+785
| | | | | | | | | | Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets. Differential Revision : https: // reviews.llvm.org/D32776 llvm-svn: 311588
* [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediateHiroshi Inoue2017-08-231-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | - recommitting after fixing a test failure on MacOS On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris. But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR). This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate. e.g. (x | 0xFFFFFFFF) should be ori 3, 3, 65535 oris 3, 3, 65535 but LLVM generates without this patch li 4, 0 oris 4, 4, 65535 ori 4, 4, 65535 or 3, 3, 4 Differential Revision: https://reviews.llvm.org/D34757 llvm-svn: 311538
* Revert rL311526: [PowerPC] better instruction selection for OR (XOR) with a ↵Hiroshi Inoue2017-08-231-51/+0
| | | | | | | | 32-bit immediate This reverts commit rL311526 due to failures in some buildbot. llvm-svn: 311530
* [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediateHiroshi Inoue2017-08-231-0/+51
| | | | | | | | | | | | | | | | | | | | | | | On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris. But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR). This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate. e.g. (x | 0xFFFFFFFF) should be ori 3, 3, 65535 oris 3, 3, 65535 but LLVM generates without this patch li 4, 0 oris 4, 4, 65535 ori 4, 4, 65535 or 3, 3, 4 Differential Revision: https://reviews.llvm.org/D34757 llvm-svn: 311526
* [PPC] Refine checks for emiting TOC restore nop and tail-call eligibility.Sean Fertile2017-08-211-6/+17
| | | | | | | | | For the medium and large code models we only need to check if a call crosses dso-boundaries when considering tail-call elgibility. Differential Revision: https://reviews.llvm.org/D34245 llvm-svn: 311353
* [PowerPC] Check if the pre-increment PHI Node already existsStefan Pintilie2017-08-211-0/+65
| | | | | | | | | | | Preparations to use the per-increment are sometimes done in the target independent pass Loop Strength Reduction. We try to detect them in the PowerPC specific pass so that they are not done twice and so that we do not add PHIs that are not required. Differential Revision: https://reviews.llvm.org/D36736 llvm-svn: 311332
* [PowerPC] Add codegen for VSX word extract convert to FPLei Huang2017-08-141-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add codegen for VSX word extract conversion from signed/unsigned to single/double precision. For UINT_TO_FP: Extract word unsigned and convert to float was implemented in https://reviews.llvm.org/D20239. Here we will add the missing extract integer and conversion to double. This utilizes the new P9 instruction xxextractuw to extracting an integer element when the result will be converted to double thereby saving 2 direct moves (VSR <-> GPR). For SINT_TO_FP: We will implement the following sequence which will also reduce the number of instructions by saving 2 direct moves. v4i32->f32: xxspltw xvcvsxwsp xscvspdpn v4i32->f64: xxspltw xvcvsxwdp Differential Revision: https://reviews.llvm.org/D35859 llvm-svn: 310866
* [PowerPC] Revert r310346 (and followups r310356 & r310424) whichChandler Carruth2017-08-141-132/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | introduce a miscompile bug. There appears to be a bug where the generated code to extract the sign bit doesn't work correctly for 32-bit inputs. I've replied to the original commit pointing out the problem. I think I see by inspection (and reading the manual for PPC) how to fix this, but I can't be 100% confident and I also don't know what the best way to test this is. Currently it seems nearly impossible to get the backend to hit this code path, but the patch autohr is likely in a better position to craft such test cases than I am, and based on where the bug is it should be easily done. Original commit message for r310346: """ [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 """ llvm-svn: 310809
* Add "Restored" flag to CalleeSavedInfoKrzysztof Parzyszek2017-08-102-2/+2
| | | | | | | | | | | The liveness-tracking code assumes that the registers that were saved in the function's prolog are live outside of the function. Specifically, that registers that were saved are also live-on-exit from the function. This isn't always the case as illustrated by the LR register on ARM. Differential Revision: https://reviews.llvm.org/D36160 llvm-svn: 310619
* My commit r310346 introduced some valid warnings. This cleans them up.Nemanja Ivanovic2017-08-081-6/+9
| | | | llvm-svn: 310424
* [PowerPC] Don't crash on larger splats achieved through 1-byte splatsNemanja Ivanovic2017-08-081-0/+9
| | | | | | | | | | We've implemented a 1-byte splat using XXSPLTISB on P9. However, LLVM will produce a 1-byte splat even for wider element BUILD_VECTOR nodes. This patch prevents crashing in that situation. Differential Revision: https://reviews.llvm.org/D35650 llvm-svn: 310358
* Appease compilers that have the -Wcovered-switch-default switch.Nemanja Ivanovic2017-08-081-2/+5
| | | | llvm-svn: 310356
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGENemanja Ivanovic2017-08-081-0/+126
| | | | | | | | | Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 llvm-svn: 310346
* Fix the ppc jit tests.Rafael Espindola2017-08-031-3/+4
| | | | llvm-svn: 309921
* Delete Default and JITDefault code modelsRafael Espindola2017-08-035-20/+18
| | | | | | | | | | | | | | | IMHO it is an antipattern to have a enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911
* [Power9] Exploit vector absolute difference instructions on Power 9Stefan Pintilie2017-08-022-1/+52
| | | | | | | | | Power 9 has instructions to do absolute difference (VABSDUB, VABSDUH, VABSDUW) for byte, halfword and word. We should take advantage of these. Differential Revision: https://reviews.llvm.org/D34684 llvm-svn: 309876
* [PowerPC] Change method names; NFCHiroshi Inoue2017-07-311-24/+25
| | | | | | | | Changed method names based on the discussion in https://reviews.llvm.org/D34986: getInt64 -> selectI64Imm, getInt64Count -> selectI64ImmInstrCount. llvm-svn: 309541
* [PowerPC] enable optimizeCompareInstr for branch with static branch hintHiroshi Inoue2017-07-272-7/+31
| | | | | | | | | | In optimizeCompareInstr, a compare instruction is eliminated by using a record form instruction if possible. If the branch instruction that uses the result of the compare has a static branch hint, the optimization does not happen. This patch makes this optimization happen regardless of the branch hint by splitting branch hint and branch condition before checking the predicate to identify the possible optimizations. Differential Revision: https://reviews.llvm.org/D35801 llvm-svn: 309255
* Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. ↵Peter Collingbourne2017-07-262-19/+19
| | | | | | | | NFCI. This was a use-after-free waiting to happen. llvm-svn: 309159
* [NFC] test commit.Stefan Pintilie2017-07-261-0/+8
| | | | | | Added a comment to explain how to add a PPCISD node. llvm-svn: 309114
* Update the comments on default subtargets based on feedback.Eric Christopher2017-07-251-2/+3
| | | | llvm-svn: 309041
* [PowerPC] Pretty-print CR bits the way the binutils disassembler doesNemanja Ivanovic2017-07-251-11/+26
| | | | | | | | | This patch just adds printing of CR bit registers in a more human-readable form akin to that used by the GNU binutils. Differential Revision: https://reviews.llvm.org/D31494 llvm-svn: 309001
* [PowerPC] - Recommit r304907 now that the issue has been fixedNemanja Ivanovic2017-07-251-0/+26
| | | | | | | This is just a recommit since the issue that the commit exposed is now resolved. llvm-svn: 308995
* [PPC] Add Defs = [CARRY] to MIR SRADI_32Guozhi Wei2017-07-211-1/+1
| | | | | | | | | | MIR SRADI uses instruction template XSForm_1rc which declares Defs = [CARRY]. But MIR SRADI_32 uses instruction template XSForm_1, and it doesn't declare such implicit definition. With patch D33720 it causes wrong code generation for perl. This patch adds the implicit definition. Differential Revision: https://reviews.llvm.org/D35699 llvm-svn: 308780
* [SystemZ, LoopStrengthReduce]Jonas Paulsson2017-07-212-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729
OpenPOWER on IntegriCloud