summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* Update branch coalescing to be a PowerPC specific passLei Huang2017-09-124-0/+794
| | | | | | | | | | | | Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets. Pass is currently off by default. Enabled via -enable-ppc-branch-coalesce. Differential Revision : https: // reviews.llvm.org/D32776 llvm-svn: 313061
* PPC: Don't select lxv/stxv for insufficiently aligned stack slots.Kyle Butt2017-09-091-1/+11
| | | | | | | | | | | | | | The lxv/stxv instructions require an offset that is 0 % 16. Previously we were selecting lxv/stxv for loads and stores to the stack where the offset from the slot was a multiple of 16, but the stack slot was not 16 or more byte aligned. When the frame gets lowered these transform to r(1|31) + slot + offset. If slot is not aligned, slot + offset may not be 0 % 16. Now we require 16 byte or more alignment for select lxv/stxv to stack slots. Includes a testcase that shows both sufficiently and insufficiently aligned stack slots. llvm-svn: 312843
* [XRay][CodeGen][PowerPC] Fix tail exit codegen for XRay in PPCDean Michael Berris2017-09-081-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes code-gen for XRay in PPC. The regression wasn't caught by codegen tests which we add in this change. What happened was the following: - For tail exits, we used to unconditionally prepend the returns/exits with a pseudo-instruction that gets lowered to the instrumentation sled (and leave the actual return/exit instruction as-is). - Changes to the XRay instrumentation pass caused the tail exits to suddenly also emit the tail exit pseudo-instruction, since the check for whether a return instruction was also a call instruction meant it was a tail exit instruction. - None of the tests caught the regression either due to non-existent tests, or the tests being disabled/removed for continuous breakage. This change re-introduces some of the basic tests and verifies that we're back to a state that allows the back-end to generate appropriate XRay instrumented binaries for PPC in the presence of tail exits. Reviewers: echristo, timshen Subscribers: nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D37570 llvm-svn: 312772
* [PowerPC] Don't use xscvdpspn on the P7Hal Finkel2017-09-061-3/+6
| | | | | | | xscvdpspn was not introduced until the P8, so don't use it on the P7. Fixes a regression introduced in r288152. llvm-svn: 312612
* [PPC][NFC] Renaming things with 'xxinsert' moniker to 'vecinsert' to make it ↵Tony Jiang2017-09-054-9/+9
| | | | | | | | more general. Commit on behalf of Graham Yiu (gyiu@ca.ibm.com) llvm-svn: 312547
* [PowerPC] eliminate redundant compare instructionHiroshi Inoue2017-09-051-0/+299
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If multiple conditional branches are executed based on the same comparison, we can execute multiple conditional branches based on the result of one comparison on PPC. For example, if (a == 0) { ... } else if (a < 0) { ... } can be executed by one compare and two conditional branches instead of two pairs of a compare and a conditional branch. This patch identifies a code sequence of the two pairs of a compare and a conditional branch and merge the compares if possible. To maximize the opportunity, we do canonicalization of code sequence before merging compares. For the above example, the input for this pass looks like: cmplwi r3, 0 beq 0, .LBB0_3 cmpwi r3, -1 bgt 0, .LBB0_4 So, before merging two compares, we canonicalize it as cmpwi r3, 0 ; cmplwi and cmpwi yield same result for beq beq 0, .LBB0_3 cmpwi r3, 0 ; greather than -1 means greater or equal to 0 bge 0, .LBB0_4 The generated code should be cmpwi r3, 0 beq 0, .LBB0_3 bge 0, .LBB0_4 Differential Revision: https://reviews.llvm.org/D37211 llvm-svn: 312514
* Temporarily revert "Update branch coalescing to be a PowerPC specific pass"Eric Christopher2017-08-314-785/+0
| | | | | | | | From comments and code review it wasn't intended to be enabled by default yet. This reverts commit r311588. llvm-svn: 312214
* [Power9] Add new instructions for floating point status and control registers.Stefan Pintilie2017-08-282-0/+91
| | | | | | | | | Added the following P9 instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl Differential Revision: https://reviews.llvm.org/D37167 llvm-svn: 311903
* Untabify.NAKAMURA Takumi2017-08-281-2/+2
| | | | llvm-svn: 311875
* [DAG] convert vector select-of-constants to logic/mathSanjay Patel2017-08-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | This goes back to a discussion about IR canonicalization. We'd like to preserve and convert more IR to 'select' than we currently do because that's likely the best choice in IR: http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html ...but that's often not true for codegen, so we need to account for this pattern coming in to the backend and transform it to better DAG ops. Steps in this patch: 1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely enable this transform. Other targets will probably want that anyway to distinguish scalars from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary. 2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the vector ext is often free. Implementing a more general fold using xor+and can be a follow-up for targets that don't have a legal vselect. It's also possible that we can remove the TLI hook for the special case fold implemented here because we're eliminating a constant, but it needs to be tested on other targets. Differential Revision: https://reviews.llvm.org/D36840 llvm-svn: 311731
* Update branch coalescing to be a PowerPC specific passLei Huang2017-08-234-0/+785
| | | | | | | | | | Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets. Differential Revision : https: // reviews.llvm.org/D32776 llvm-svn: 311588
* [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediateHiroshi Inoue2017-08-231-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | - recommitting after fixing a test failure on MacOS On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris. But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR). This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate. e.g. (x | 0xFFFFFFFF) should be ori 3, 3, 65535 oris 3, 3, 65535 but LLVM generates without this patch li 4, 0 oris 4, 4, 65535 ori 4, 4, 65535 or 3, 3, 4 Differential Revision: https://reviews.llvm.org/D34757 llvm-svn: 311538
* Revert rL311526: [PowerPC] better instruction selection for OR (XOR) with a ↵Hiroshi Inoue2017-08-231-51/+0
| | | | | | | | 32-bit immediate This reverts commit rL311526 due to failures in some buildbot. llvm-svn: 311530
* [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediateHiroshi Inoue2017-08-231-0/+51
| | | | | | | | | | | | | | | | | | | | | | | On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris. But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR). This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate. e.g. (x | 0xFFFFFFFF) should be ori 3, 3, 65535 oris 3, 3, 65535 but LLVM generates without this patch li 4, 0 oris 4, 4, 65535 ori 4, 4, 65535 or 3, 3, 4 Differential Revision: https://reviews.llvm.org/D34757 llvm-svn: 311526
* [PPC] Refine checks for emiting TOC restore nop and tail-call eligibility.Sean Fertile2017-08-211-6/+17
| | | | | | | | | For the medium and large code models we only need to check if a call crosses dso-boundaries when considering tail-call elgibility. Differential Revision: https://reviews.llvm.org/D34245 llvm-svn: 311353
* [PowerPC] Check if the pre-increment PHI Node already existsStefan Pintilie2017-08-211-0/+65
| | | | | | | | | | | Preparations to use the per-increment are sometimes done in the target independent pass Loop Strength Reduction. We try to detect them in the PowerPC specific pass so that they are not done twice and so that we do not add PHIs that are not required. Differential Revision: https://reviews.llvm.org/D36736 llvm-svn: 311332
* [PowerPC] Add codegen for VSX word extract convert to FPLei Huang2017-08-141-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add codegen for VSX word extract conversion from signed/unsigned to single/double precision. For UINT_TO_FP: Extract word unsigned and convert to float was implemented in https://reviews.llvm.org/D20239. Here we will add the missing extract integer and conversion to double. This utilizes the new P9 instruction xxextractuw to extracting an integer element when the result will be converted to double thereby saving 2 direct moves (VSR <-> GPR). For SINT_TO_FP: We will implement the following sequence which will also reduce the number of instructions by saving 2 direct moves. v4i32->f32: xxspltw xvcvsxwsp xscvspdpn v4i32->f64: xxspltw xvcvsxwdp Differential Revision: https://reviews.llvm.org/D35859 llvm-svn: 310866
* [PowerPC] Revert r310346 (and followups r310356 & r310424) whichChandler Carruth2017-08-141-132/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | introduce a miscompile bug. There appears to be a bug where the generated code to extract the sign bit doesn't work correctly for 32-bit inputs. I've replied to the original commit pointing out the problem. I think I see by inspection (and reading the manual for PPC) how to fix this, but I can't be 100% confident and I also don't know what the best way to test this is. Currently it seems nearly impossible to get the backend to hit this code path, but the patch autohr is likely in a better position to craft such test cases than I am, and based on where the bug is it should be easily done. Original commit message for r310346: """ [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 """ llvm-svn: 310809
* Add "Restored" flag to CalleeSavedInfoKrzysztof Parzyszek2017-08-102-2/+2
| | | | | | | | | | | The liveness-tracking code assumes that the registers that were saved in the function's prolog are live outside of the function. Specifically, that registers that were saved are also live-on-exit from the function. This isn't always the case as illustrated by the LR register on ARM. Differential Revision: https://reviews.llvm.org/D36160 llvm-svn: 310619
* My commit r310346 introduced some valid warnings. This cleans them up.Nemanja Ivanovic2017-08-081-6/+9
| | | | llvm-svn: 310424
* [PowerPC] Don't crash on larger splats achieved through 1-byte splatsNemanja Ivanovic2017-08-081-0/+9
| | | | | | | | | | We've implemented a 1-byte splat using XXSPLTISB on P9. However, LLVM will produce a 1-byte splat even for wider element BUILD_VECTOR nodes. This patch prevents crashing in that situation. Differential Revision: https://reviews.llvm.org/D35650 llvm-svn: 310358
* Appease compilers that have the -Wcovered-switch-default switch.Nemanja Ivanovic2017-08-081-2/+5
| | | | llvm-svn: 310356
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGENemanja Ivanovic2017-08-081-0/+126
| | | | | | | | | Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 llvm-svn: 310346
* Fix the ppc jit tests.Rafael Espindola2017-08-031-3/+4
| | | | llvm-svn: 309921
* Delete Default and JITDefault code modelsRafael Espindola2017-08-035-20/+18
| | | | | | | | | | | | | | | IMHO it is an antipattern to have a enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911
* [Power9] Exploit vector absolute difference instructions on Power 9Stefan Pintilie2017-08-022-1/+52
| | | | | | | | | Power 9 has instructions to do absolute difference (VABSDUB, VABSDUH, VABSDUW) for byte, halfword and word. We should take advantage of these. Differential Revision: https://reviews.llvm.org/D34684 llvm-svn: 309876
* [PowerPC] Change method names; NFCHiroshi Inoue2017-07-311-24/+25
| | | | | | | | Changed method names based on the discussion in https://reviews.llvm.org/D34986: getInt64 -> selectI64Imm, getInt64Count -> selectI64ImmInstrCount. llvm-svn: 309541
* [PowerPC] enable optimizeCompareInstr for branch with static branch hintHiroshi Inoue2017-07-272-7/+31
| | | | | | | | | | In optimizeCompareInstr, a compare instruction is eliminated by using a record form instruction if possible. If the branch instruction that uses the result of the compare has a static branch hint, the optimization does not happen. This patch makes this optimization happen regardless of the branch hint by splitting branch hint and branch condition before checking the predicate to identify the possible optimizations. Differential Revision: https://reviews.llvm.org/D35801 llvm-svn: 309255
* Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. ↵Peter Collingbourne2017-07-262-19/+19
| | | | | | | | NFCI. This was a use-after-free waiting to happen. llvm-svn: 309159
* [NFC] test commit.Stefan Pintilie2017-07-261-0/+8
| | | | | | Added a comment to explain how to add a PPCISD node. llvm-svn: 309114
* Update the comments on default subtargets based on feedback.Eric Christopher2017-07-251-2/+3
| | | | llvm-svn: 309041
* [PowerPC] Pretty-print CR bits the way the binutils disassembler doesNemanja Ivanovic2017-07-251-11/+26
| | | | | | | | | This patch just adds printing of CR bit registers in a more human-readable form akin to that used by the GNU binutils. Differential Revision: https://reviews.llvm.org/D31494 llvm-svn: 309001
* [PowerPC] - Recommit r304907 now that the issue has been fixedNemanja Ivanovic2017-07-251-0/+26
| | | | | | | This is just a recommit since the issue that the commit exposed is now resolved. llvm-svn: 308995
* [PPC] Add Defs = [CARRY] to MIR SRADI_32Guozhi Wei2017-07-211-1/+1
| | | | | | | | | | MIR SRADI uses instruction template XSForm_1rc which declares Defs = [CARRY]. But MIR SRADI_32 uses instruction template XSForm_1, and it doesn't declare such implicit definition. With patch D33720 it causes wrong code generation for perl. This patch adds the implicit definition. Differential Revision: https://reviews.llvm.org/D35699 llvm-svn: 308780
* [SystemZ, LoopStrengthReduce]Jonas Paulsson2017-07-212-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729
* fix formatting issue; NFCHiroshi Inoue2017-07-181-4/+6
| | | | llvm-svn: 308305
* Add a set of comments explaining why getSubtargetImpl() is deleted on these ↵Eric Christopher2017-07-141-0/+2
| | | | | | targets. llvm-svn: 307999
* [PowerPC] Ensure displacements for DQ-Form instructions are multiples of 16Nemanja Ivanovic2017-07-137-67/+126
| | | | | | | | | | | | | As outlined in the PR, we didn't ensure that displacements for DQ-Form instructions are multiples of 16. Since the instruction encoding encodes a quad-word displacement, a sub-16 byte displacement is meaningless and ends up being encoded incorrectly. Fixes https://bugs.llvm.org/show_bug.cgi?id=33671. Differential Revision: https://reviews.llvm.org/D35007 llvm-svn: 307934
* Fully fix the movw/movt addend.Rafael Espindola2017-07-111-1/+1
| | | | | | | | | | The issue is not if the value is pcrel. It is whether we have a relocation or not. If we have a relocation, the static linker will select the upper bits. If we don't have a relocation, we have to do it. llvm-svn: 307730
* [PPC] Fix two bugs in frame lowering.Tony Jiang2017-07-112-17/+26
| | | | | | | | | | | 1. The available program storage region of the red zone to compilers is 288 bytes rather than 244 bytes. 2. The formula for negative number alignment calculation should be y = x & ~(n-1) rather than y = (x + (n-1)) & ~(n-1). Differential Revision: https://reviews.llvm.org/D34337 llvm-svn: 307672
* fix formatting; NFCHiroshi Inoue2017-07-111-2/+2
| | | | llvm-svn: 307662
* [PowerPC] fix latency for simple integer instructions in POWER9 schedulerHiroshi Inoue2017-07-111-1/+1
| | | | | | | | | In the POWER9 instruction scheduler, SchedWriteRes for the simple integer instructions are misconfigured to use that of (costly) DFU instructions. This results in surprisingly long instruction latency estimation and causes misbehavior in some optimizers such as if-conversion. Differential Revision: https://reviews.llvm.org/D34869 llvm-svn: 307624
* [PowerPC] avoid redundant analysis while lowering an immediate; NFCHiroshi Inoue2017-07-111-2/+8
| | | | | | | | | | This patch reduces compilation time by avoiding redundant analysis while selecting instructions to create an immediate. If the instruction count required to create the input number without rotate is 2, we do not need further analysis to find a shorter instruction sequence with rotate; rotate + load constant cannot be done by 1 instruction (i.e. getInt64CountDirectnever return 0). This patch should not change functionality. Differential Revision: https://reviews.llvm.org/D34986 llvm-svn: 307623
* [PPC CodeGen] Expand the bitreverse.i64 intrinsic.Tony Jiang2017-07-102-0/+120
| | | | | | | Differential Revision: https://reviews.llvm.org/D34908 Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093 llvm-svn: 307563
* [PowerPC] Reduce register pressure by not materializing a constant just for ↵Lei Huang2017-07-101-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | use as an index register for X-Form loads/stores. For this example: float test (int *arr) { return arr[2]; } We currently generate the following code: li r4, 8 lxsiwax f0, r3, r4 xscvsxdsp f1, f0 With this patch, we will now generate: addi r3, r3, 8 lxsiwax f0, 0, r3 xscvsxdsp f1, f0 Originally reported in: https://bugs.llvm.org/show_bug.cgi?id=27204 Differential Revision: https://reviews.llvm.org/D35027 llvm-svn: 307553
* fix typos in comments and error messages; NFCHiroshi Inoue2017-07-101-3/+3
| | | | llvm-svn: 307533
* fix formatting; NFCHiroshi Inoue2017-07-101-2/+2
| | | | llvm-svn: 307523
* [PowerPC] NFC : Common up definitions of isIntS16Immediate and update ↵Lei Huang2017-07-073-30/+14
| | | | | | parameter to int16_t llvm-svn: 307442
* [PPC CodeGen] Expand the bitreverse.i32 intrinsic.Tony Jiang2017-07-072-0/+71
| | | | | | | Differential Revision: https://reviews.llvm.org/D33572 Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093 llvm-svn: 307413
* [PowerPC] Fix -Wimplicit-fallthrough warnings. NFCI.Simon Pilgrim2017-07-071-0/+4
| | | | llvm-svn: 307382
OpenPOWER on IntegriCloud