summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
* [PowerPC] Add profitablilty check for conversion to mtctr loopsLei Huang2017-10-122-5/+158
| | | | | | | | | | | | | | | Add profitability checks for modifying counted loops to use the mtctr instruction. The latency of mtctr is only justified if there are more than 4 comparisons that will be removed as a result. Usually counted loops are formed relatively early and before unrolling, so most low trip count loops often don't survive. However we want to ensure that if they do, we do not mistakenly update them to mtctr loops. Use CodeMetrics to ensure we are only doing this for small loops with small trip counts. Differential Revision: https://reviews.llvm.org/D38212 llvm-svn: 315592
* [PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex ↵Lei Huang2017-10-114-238/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | elimination to only use `lis/addi` if necessary. Currently we produce a bunch of unnecessary code when emitting the prologue/epilogue for spills/restores. Namely, if the load from stack slot/store to stack slot instruction is an X-Form instruction, we will always produce an LIS/ORI sequence for the stack offset. Furthermore, we have not exploited the P9 vector D-Form loads/stores for this purpose. This patch address both issues. Specifying the D-Form load as the instruction to use for stack spills/reloads should be safe because: 1. The stack should be aligned according to the ABI 2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will check for the offset being a multiple of 16 and will convert it to an X-Form instruction if it isn't. Differential Revision : https://reviews.llvm.org/D38758 llvm-svn: 315500
* [NFC] update test case so checks are not order dependent when not neededLei Huang2017-10-111-5/+5
| | | | llvm-svn: 315482
* Fix for PR34888.Nemanja Ivanovic2017-10-101-0/+121
| | | | | | | | | | The issue is that we assume operand zero of the input to the add instruction is a register. In this case, the input comes from inline assembly and operand zero is not a register thereby causing a crash. The code will bail anyway if the input instruction doesn't have the right opcode. So do that check first and let short-circuiting prevent the crash. llvm-svn: 315285
* [MC] Suppress .Lcfi labels when emitting textual assemblyReid Kleckner2017-10-102-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This suppresses the generation of .Lcfi labels in our textual assembler. It was annoying that this generated cascading .Lcfi labels: llc foo.ll -o - | llvm-mc | llvm-mc After three trips through MCAsmStreamer, we'd have three labels in the output when none are necessary. We should only bother creating the labels and frame data when making a real object file. This supercedes D38605, which moved the entire .seh_ implementation into MCObjectStreamer. This has the advantage that we do more checking when emitting textual assembly, as a minor efficiency cost. Outputting textual assembly is not performance critical, so this shouldn't matter. Reviewers: majnemer, MatzeB Subscribers: qcolombet, nemanjai, javed.absar, eraman, hiraditya, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D38638 llvm-svn: 315259
* Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source ↵Geoff Berry2017-10-035-6/+5
| | | | | | | | | | forwarding"" This reverts commit r314729. Another bug has been encountered in an out-of-tree target reported by Quentin. llvm-svn: 314814
* [DebugInfo] Handle endianness when moving debug info for split integer ↵Bjorn Pettersson2017-10-031-0/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | values (reapplied) Summary: Take the target's endianness into account when splitting the debug information in DAGTypeLegalizer::SetExpandedInteger. This patch fixes so that, for big-endian targets, the fragment expression corresponding to the high part of a split integer value is placed at offset 0, in order to correctly represent the memory address order. I have attached a PPC32 reproducer where the resulting DWARF pieces for a 64-bit integer were incorrectly reversed. Original patch was reverted due to using -stop-after=isel in the test case (but that is only working when AMDGPU target is included in the llc build). The test case has now been updated to use -stop-before=expand-isel-pseudos instead. Patch by: dstenb Reviewers: JDevlieghere, aprantl, dblaikie Reviewed By: JDevlieghere, aprantl, dblaikie Subscribers: nemanjai Differential Revision: https://reviews.llvm.org/D38172 llvm-svn: 314781
* [PowerPC] Revert r314666.Tim Shen2017-10-021-68/+0
| | | | | | | | | See https://reviews.llvm.org/D38172. I tried to XFAIL it, but sometimes XPASS triggers the bot. Simply revert it. llvm-svn: 314739
* [PowerPC] Temporarily disable the test introduced by r314666Tim Shen2017-10-021-0/+2
| | | | | | See https://reviews.llvm.org/D38172 for details. llvm-svn: 314732
* Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"Geoff Berry2017-10-025-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issues addressed since original review: - Avoid bug in regalloc greedy/machine verifier when forwarding to use in an instruction that re-defines the same virtual register. - Fixed bug when forwarding to use in EarlyClobber instruction slot. - Fixed incorrect forwarding to register definitions that showed up in explicit_uses() iterator (e.g. in INLINEASM). - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator. - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 314729
* [Debug info] Handle endianness when moving debug info for split integer valuesBjorn Pettersson2017-10-021-0/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Take the target's endianness into account when splitting the debug information in DAGTypeLegalizer::SetExpandedInteger. This patch fixes so that, for big-endian targets, the fragment expression corresponding to the high part of a split integer value is placed at offset 0, in order to correctly represent the memory address order. I have attached a PPC32 reproducer where the resulting DWARF pieces for a 64-bit integer were incorrectly reversed. Patch by: dstenb Reviewers: JDevlieghere, aprantl, dblaikie Reviewed By: JDevlieghere, aprantl, dblaikie Subscribers: nemanjai Differential Revision: https://reviews.llvm.org/D38172 llvm-svn: 314666
* [PowerPC] support ZERO_EXTEND in tryBitPermutationHiroshi Inoue2017-10-021-0/+23
| | | | | | | | | | | | | | | | | | | This patch add a support of ISD::ZERO_EXTEND in PPCDAGToDAGISel::tryBitPermutation to increase the opportunity to use rotate-and-mask by reordering ZEXT and ANDI. Since tryBitPermutation stops analyzing nodes if it hits a ZEXT node while traversing SDNodes, we want to avoid ZEXT between two nodes that can be folded into a rotate-and-mask instruction. For example, we allow these nodes t9: i32 = add t7, Constant:i32<1> t11: i32 = and t9, Constant:i32<255> t12: i64 = zero_extend t11 t14: i64 = shl t12, Constant:i64<2> to be folded into a rotate-and-mask instruction. Such case often happens in array accesses with logical AND operation in the index, e.g. array[i & 0xFF]; Differential Revision: https://reviews.llvm.org/D37514 llvm-svn: 314655
* [PowerPC] eliminate partially redundant compare instructionHiroshi Inoue2017-09-281-0/+34
| | | | | | | | | | | | | | | | | | This is a follow-on of D37211. D37211 eliminates a compare instruction if two conditional branches can be made based on the one compare instruction, e.g. if (a == 0) { ... } else if (a < 0) { ... } This patch extends this optimization to support partially redundant cases, which often happen in while loops. For example, one compare instruction is moved from the loop body into the preheader by this optimization in the following example. do { if (a == 0) dummy1(); a = func(a); } while (a > 0); Differential Revision: https://reviews.llvm.org/D38236 llvm-svn: 314390
* [PowerPC] eliminate unconditional branch to the next instructionHiroshi Inoue2017-09-273-26/+2
| | | | | | | | | This patch makes analyzeBranch eliminate unconditional branch to the next instruction. After basic blocks are re-organized by optimizers, such as machine block placement, a BB may end with an unconditional branch to the next (fallthrough) BB. This patch removes such redundant branch instruction. Differential Revision: https://reviews.llvm.org/D37730 llvm-svn: 314297
* [PowerPC] Reverting sequence of patches for elimination of comparison ↵Nemanja Ivanovic2017-09-2688-0/+88
| | | | | | | | | | | | | | | | | | | | | | instructions In the past while, I've committed a number of patches in the PowerPC back end aimed at eliminating comparison instructions. However, this causes some failures in proprietary source and these issues are not observed in SPEC or any open source packages I've been able to run. As a result, I'm pulling the entire series and will refactor it to: - Have a single entry point for easy control - Have fine-grained control over which patterns we transform A side-effect of this is that test cases for these patches (and modified by them) are XFAIL-ed. This is a temporary measure as it is counter-productive to remove/modify these test cases and then have to modify them again when the refactored patch is recommitted. The failure will be investigated in parallel to the refactoring effort and the recommit will either have a fix for it or will leave this transformation off by default until the problem is resolved. llvm-svn: 314244
* [PowerPC] Eliminate compares - add i64 sext/zext handling for SETLT/SETGTNemanja Ivanovic2017-09-255-1/+467
| | | | | | | | As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential review. llvm-svn: 314106
* [PowerPC] Eliminate compares - add i64 sext/zext handling for SETLE/SETGENemanja Ivanovic2017-09-244-0/+524
| | | | | | | | As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential review. llvm-svn: 314073
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETULT/SETUGTNemanja Ivanovic2017-09-2314-0/+1200
| | | | | | | | As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314062
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETULE/SETUGENemanja Ivanovic2017-09-2317-26/+1428
| | | | | | | | As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314060
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLT/SETGTNemanja Ivanovic2017-09-238-7/+610
| | | | | | | | As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314055
* [XRay] support conditional return on PPC.Tim Shen2017-09-222-0/+111
| | | | | | | | | | | | Summary: Conditional returns were not taken into consideration at all. Implement them by turning them into jumps and normal returns. This means there is a slightly higher performance penalty for conditional returns, but this is the best we can do, and it still disturbs little of the rest. Reviewers: dberris, echristo Subscribers: sanjoy, nemanjai, hiraditya, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D38102 llvm-svn: 314005
* Recommit r310809 with a fix for the spill problemNemanja Ivanovic2017-09-2225-96/+971
| | | | | | | | | | This patch re-commits the patch that was pulled out due to a problem it caused, but with a fix for the problem. The fix was reviewed separately by Eric Christopher and Hal Finkel. Differential Revision: https://reviews.llvm.org/D38054 llvm-svn: 313978
* [SelectionDAG] Pick correct frame index in LowerArgumentsBjorn Pettersson2017-09-211-0/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: SelectionDAGISel::LowerArguments is associating arguments with frame indices (FuncInfo->setArgumentFrameIndex). That information is later on used by EmitFuncArgumentDbgValue to create DBG_VALUE instructions that denotes that a variable can be found on the stack. I discovered that for our (big endian) out-of-tree target the association created by SelectionDAGISel::LowerArguments sometimes is wrong. I've seen this happen when a 64-bit value is passed on the stack. The argument will occupy two stack slots (frame index X, and frame index X+1). The fault is that a call to setArgumentFrameIndex is associating the 64-bit argument with frame index X+1. The effect is that the debug information (DBG_VALUE) will point at the least significant part of the arguement on the stack. When printing the argument in a debugger I will get the wrong value. I managed to create a test case for PowerPC that seems to show the same kind of problem. The bugfix will look at the datalayout, taking endianness into account when examining a BUILD_PAIR node, assuming that the least significant part is in the first operand of the BUILD_PAIR. For big endian targets we should use the frame index from the second operand, as the most significant part will be stored at the lower address (using the highest frame index). Reviewers: bogner, rnk, hfinkel, sdardis, aprantl Reviewed By: aprantl Subscribers: nemanjai, aprantl, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D37740 llvm-svn: 313901
* Fix buildbot failures, add mtriple to gpr-vsr-spill.llZaara Syeda2017-09-211-1/+1
| | | | llvm-svn: 313890
* [Power9] Spill gprs to vector registers rather than stackZaara Syeda2017-09-211-0/+24
| | | | | | | | | | This patch updates register allocation to enable spilling gprs to volatile vector registers rather than the stack. It can be enabled for Power9 with option -ppc-enable-gpr-to-vsr-spills. Differential Revision: https://reviews.llvm.org/D34815 llvm-svn: 313886
* [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD.Tony Jiang2017-09-191-0/+60
| | | | | | | | | | Two blocks prior to the join each perform an li and the the join block has an add using the initialized register. Optimize each predecessor block to instead use addi and delete the li's and add. Differential Revision: https://reviews.llvm.org/D36734 llvm-svn: 313639
* [XRay][CodeGen] Use the current function symbol as the associated symbol for ↵Dean Michael Berris2017-09-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the instrumentation map Summary: XRay had been assuming that the previous section is the "text" section of the function when lowering the instrumentation map. Unfortunately this is not a safe assumption, because we may be coming from lowering debug type information for the function being lowered. This fixes an issue with combining -gsplit-dwarf, -generate-type-units, -debug-compile and -fxray-instrument for sole member functions. When the split dwarf section is stripped, we're left with references from the xray_instr_map to the debug section. The change now uses the function's symbol instead of the previous section's start symbol. We found the bug while attempting to strip the split debug sections off an XRay-instrumented object file, which had a peculiar edge-case for single-function classes where the single function is being lowered. Because XRay had assocaited the instrumentation map for a function to the debug types section instead of the function's section, the objcopy call will fail due to the misplaced reference from the xray_instr_map section. Reviewers: pcc, dblaikie, echristo Subscribers: llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D37791 llvm-svn: 313233
* Update branch coalescing to be a PowerPC specific passLei Huang2017-09-121-14/+43
| | | | | | | | | | | | Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets. Pass is currently off by default. Enabled via -enable-ppc-branch-coalesce. Differential Revision : https: // reviews.llvm.org/D32776 llvm-svn: 313061
* PPC: Don't select lxv/stxv for insufficiently aligned stack slots.Kyle Butt2017-09-091-0/+46
| | | | | | | | | | | | | | The lxv/stxv instructions require an offset that is 0 % 16. Previously we were selecting lxv/stxv for loads and stores to the stack where the offset from the slot was a multiple of 16, but the stack slot was not 16 or more byte aligned. When the frame gets lowered these transform to r(1|31) + slot + offset. If slot is not aligned, slot + offset may not be 0 % 16. Now we require 16 byte or more alignment for select lxv/stxv to stack slots. Includes a testcase that shows both sufficiently and insufficiently aligned stack slots. llvm-svn: 312843
* [XRay][CodeGen][PowerPC] Fix tail exit codegen for XRay in PPCDean Michael Berris2017-09-083-0/+114
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes code-gen for XRay in PPC. The regression wasn't caught by codegen tests which we add in this change. What happened was the following: - For tail exits, we used to unconditionally prepend the returns/exits with a pseudo-instruction that gets lowered to the instrumentation sled (and leave the actual return/exit instruction as-is). - Changes to the XRay instrumentation pass caused the tail exits to suddenly also emit the tail exit pseudo-instruction, since the check for whether a return instruction was also a call instruction meant it was a tail exit instruction. - None of the tests caught the regression either due to non-existent tests, or the tests being disabled/removed for continuous breakage. This change re-introduces some of the basic tests and verifies that we're back to a state that allows the back-end to generate appropriate XRay instrumented binaries for PPC in the presence of tail exits. Reviewers: echristo, timshen Subscribers: nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D37570 llvm-svn: 312772
* [PowerPC] Don't use xscvdpspn on the P7Hal Finkel2017-09-061-0/+27
| | | | | | | xscvdpspn was not introduced until the P8, so don't use it on the P7. Fixes a regression introduced in r288152. llvm-svn: 312612
* [PowerPC] eliminate redundant compare instructionHiroshi Inoue2017-09-051-0/+722
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If multiple conditional branches are executed based on the same comparison, we can execute multiple conditional branches based on the result of one comparison on PPC. For example, if (a == 0) { ... } else if (a < 0) { ... } can be executed by one compare and two conditional branches instead of two pairs of a compare and a conditional branch. This patch identifies a code sequence of the two pairs of a compare and a conditional branch and merge the compares if possible. To maximize the opportunity, we do canonicalization of code sequence before merging compares. For the above example, the input for this pass looks like: cmplwi r3, 0 beq 0, .LBB0_3 cmpwi r3, -1 bgt 0, .LBB0_4 So, before merging two compares, we canonicalize it as cmpwi r3, 0 ; cmplwi and cmpwi yield same result for beq beq 0, .LBB0_3 cmpwi r3, 0 ; greather than -1 means greater or equal to 0 bge 0, .LBB0_4 The generated code should be cmpwi r3, 0 beq 0, .LBB0_3 bge 0, .LBB0_4 Differential Revision: https://reviews.llvm.org/D37211 llvm-svn: 312514
* Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source ↵Sam McCall2017-09-043-4/+3
| | | | | | | | | | forwarding"" This crashes on boringSSL on PPC (will send reduced testcase) This reverts commit r312328. llvm-svn: 312490
* Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"Geoff Berry2017-09-013-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Issues addressed since original review: - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator. - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 312328
* Temporarily revert "Update branch coalescing to be a PowerPC specific pass"Eric Christopher2017-08-312-44/+21
| | | | | | | | From comments and code review it wasn't intended to be enabled by default yet. This reverts commit r311588. llvm-svn: 312214
* Revert r312154 "Re-enable "[MachineCopyPropagation] Extend pass to do COPY ↵Hans Wennborg2017-08-303-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | source forwarding"" It caused PR34387: Assertion failed: (RegNo < NumRegs && "Attempting to access record for invalid register number!") > Issues identified by buildbots addressed since original review: > - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. > - The pass no longer forwards COPYs to physical register uses, since > doing so can break code that implicitly relies on the physical > register number of the use. > - The pass no longer forwards COPYs to undef uses, since doing so > can break the machine verifier by creating LiveRanges that don't > end on a use (since the undef operand is not considered a use). > > [MachineCopyPropagation] Extend pass to do COPY source forwarding > > This change extends MachineCopyPropagation to do COPY source forwarding. > > This change also extends the MachineCopyPropagation pass to be able to > be run during register allocation, after physical registers have been > assigned, but before the virtual registers have been re-written, which > allows it to remove virtual register COPY LiveIntervals that become dead > through the forwarding of all of their uses. llvm-svn: 312178
* Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"Geoff Berry2017-08-303-3/+4
| | | | | | | | | | | | | | | | | | | | | | | Issues identified by buildbots addressed since original review: - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 312154
* Canonicalize the representation of empty an expression in ↵Adrian Prantl2017-08-302-116/+116
| | | | | | | | | | | | | | | | DIGlobalVariableExpression This change simplifies code that has to deal with DIGlobalVariableExpression and mirrors how we treat DIExpressions in debug info intrinsics. Before this change there were two ways of representing empty expressions on globals, a nullptr and an empty !DIExpression(). If someone needs to upgrade out-of-tree testcases: perl -pi -e 's/(!DIGlobalVariableExpression\(var: ![0-9]*)\)/\1, expr: !DIExpression())/g' <MYTEST.ll> will catch 95%. llvm-svn: 312144
* [DAG] convert vector select-of-constants to logic/mathSanjay Patel2017-08-241-60/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | This goes back to a discussion about IR canonicalization. We'd like to preserve and convert more IR to 'select' than we currently do because that's likely the best choice in IR: http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html ...but that's often not true for codegen, so we need to account for this pattern coming in to the backend and transform it to better DAG ops. Steps in this patch: 1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely enable this transform. Other targets will probably want that anyway to distinguish scalars from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary. 2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the vector ext is often free. Implementing a more general fold using xor+and can be a follow-up for targets that don't have a legal vselect. It's also possible that we can remove the TLI hook for the special case fold implemented here because we're eliminating a constant, but it needs to be tested on other targets. Differential Revision: https://reviews.llvm.org/D36840 llvm-svn: 311731
* Update branch coalescing to be a PowerPC specific passLei Huang2017-08-232-21/+44
| | | | | | | | | | Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets. Differential Revision : https: // reviews.llvm.org/D32776 llvm-svn: 311588
* [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediateHiroshi Inoue2017-08-231-0/+96
| | | | | | | | | | | | | | | | | | | | | | | | | - recommitting after fixing a test failure on MacOS On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris. But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR). This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate. e.g. (x | 0xFFFFFFFF) should be ori 3, 3, 65535 oris 3, 3, 65535 but LLVM generates without this patch li 4, 0 oris 4, 4, 65535 ori 4, 4, 65535 or 3, 3, 4 Differential Revision: https://reviews.llvm.org/D34757 llvm-svn: 311538
* Revert rL311526: [PowerPC] better instruction selection for OR (XOR) with a ↵Hiroshi Inoue2017-08-231-96/+0
| | | | | | | | 32-bit immediate This reverts commit rL311526 due to failures in some buildbot. llvm-svn: 311530
* [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediateHiroshi Inoue2017-08-231-0/+96
| | | | | | | | | | | | | | | | | | | | | | | On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris. But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR). This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate. e.g. (x | 0xFFFFFFFF) should be ori 3, 3, 65535 oris 3, 3, 65535 but LLVM generates without this patch li 4, 0 oris 4, 4, 65535 ori 4, 4, 65535 or 3, 3, 4 Differential Revision: https://reviews.llvm.org/D34757 llvm-svn: 311526
* [PPC] Refine checks for emiting TOC restore nop and tail-call eligibility.Sean Fertile2017-08-213-15/+63
| | | | | | | | | For the medium and large code models we only need to check if a call crosses dso-boundaries when considering tail-call elgibility. Differential Revision: https://reviews.llvm.org/D34245 llvm-svn: 311353
* [PowerPC] Check if the pre-increment PHI Node already existsStefan Pintilie2017-08-211-0/+30
| | | | | | | | | | | Preparations to use the per-increment are sometimes done in the target independent pass Loop Strength Reduction. We try to detect them in the PowerPC specific pass so that they are not done twice and so that we do not add PHIs that are not required. Differential Revision: https://reviews.llvm.org/D36736 llvm-svn: 311332
* Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" ↵Geoff Berry2017-08-183-4/+3
| | | | | | | | | | | round 2 This reverts commit r311135. sanitizer-x86_64-linux-android buildbot is timing out with just this patch applied. llvm-svn: 311142
* Re-enable "[MachineCopyPropagation] Extend pass to do COPY source ↵Geoff Berry2017-08-173-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | forwarding" Two issues identified by buildbots were addressed: - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311135
* [PowerPC] add tests for vector select-of-constants; NFCSanjay Patel2017-08-171-0/+236
| | | | | | | | We've discussed canonicalizing to this form in IR, so the backend should be prepared to lower these in ways better than what we see here. llvm-svn: 311099
* Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"Geoff Berry2017-08-173-4/+3
| | | | | | | | | | This reverts commit r311038. Several buildbots are breaking, and at least one appears to be due to the forwarding of physical regs enabled by this change. Reverting while I investigate further. llvm-svn: 311062
* [MachineCopyPropagation] Extend pass to do COPY source forwardingGeoff Berry2017-08-163-3/+4
| | | | | | | | | | | | | | | | | | This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311038
OpenPOWER on IntegriCloud