bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[PPC] Use xxbrd to speed up bswap64	Guozhi Wei	2017-11-06	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Power doesn't have bswap instructions, so llvm generates following code sequence for bswap64. rotldi 5, 3, 16 rotldi 4, 3, 8 rotldi 9, 3, 24 rotldi 10, 3, 32 rotldi 11, 3, 48 rotldi 12, 3, 56 rldimi 4, 5, 8, 48 rldimi 4, 9, 16, 40 rldimi 4, 10, 24, 32 rldimi 4, 11, 40, 16 rldimi 4, 12, 48, 8 rldimi 4, 3, 56, 0 But Power9 has vector bswap instructions, they can also be used to speed up scalar bswap intrinsic. With this patch, bswap64 can be translated to: mtvsrdd 34, 3, 3 xxbrd 34, 34 mfvsrld 3, 34 Differential Revision: https://reviews.llvm.org/D39510 llvm-svn: 317499
*	[LICM] sink through non-trivially replicable PHI	Jun Bum Lim	2017-11-03	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The current LICM allows sinking an instruction only when it is exposed to exit blocks through a trivially replacable PHI of which all incoming values are the same instruction. This change enhance LICM to sink a sinkable instruction through non-trivially replacable PHIs by spliting predecessors of loop exits. Reviewers: hfinkel, majnemer, davidxl, bmakam, mcrosier, danielcdh, efriedma, jtony Reviewed By: efriedma Subscribers: nemanjai, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D37163 llvm-svn: 317335
*	Adds code to PPC ISEL lowering to recognize half-word inserts from ↵	Graham Yiu	2017-11-01	1	-0/+300
\| \| \| \| \| \| \| \|	vector_shuffles, and use P9 shift and vector insert instructions instead of vperm. Differential Revision: https://reviews.llvm.org/D34160 llvm-svn: 317111
*	Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat"	Stefan Pintilie	2017-10-30	2	-136/+2
\| \| \| \| \| \| \| \| \| \|	Revert r316478. A test case has failed. Will recommit this change once we find and fix the failure. This reverts commit 7c330fabaedaba3d02c58bc3cc1198896c895f34. llvm-svn: 316952
*	[PPC CodeGen] Fix the bitreverse.i64 intrinsic.	Fangrui Song	2017-10-30	2	-32/+32
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: The two 32-bit words were swapped. Update a test omitted in reverted r316270. Reviewers: jtony, aaron.ballman Subscribers: nemanjai, kbarton Differential Revision: https://reviews.llvm.org/D39163 llvm-svn: 316916
*	[DAGCombine] Don't combine sext with extload if sextload is not supported ↵	Guozhi Wei	2017-10-27	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and extload has multi users In function DAGCombiner::visitSIGN_EXTEND_INREG, sext can be combined with extload even if sextload is not supported by target, then if sext is the only user of extload, there is no big difference, no harm no benefit. if extload has more than one user, the combined sextload may block extload from combining with other zext, causes extra zext instructions generated. As demonstrated by the attached test case. This patch add the constraint that when sextload is not supported by target, sext can only be combined with extload if it is the only user of extload. Differential Revision: https://reviews.llvm.org/D39108 llvm-svn: 316802
*	Add subclass data to the FoldingSetNode for MemIntrinsicSDNodes.	Sean Fertile	2017-10-27	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \|	Not having the subclass data on an MemIntrinsicSDNodes means it was possible to try to fold 2 nodes with the same operands but differing MMO flags. This would trip an assertion when trying to refine the alignment between the 2 MachineMemOperands. Differential Revision: https://reviews.llvm.org/D38898 llvm-svn: 316737
*	Represent runtime preemption in the IR.	Sean Fertile	2017-10-26	1	-0/+301
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we do not represent runtime preemption in the IR, which has several drawbacks: 1) The semantics of GlobalValues differ depending on the object file format you are targeting (as well as the relocation-model and -fPIE value). 2) We have no way of disabling inlining of run time interposable functions, since in the IR we only know if a function is link-time interposable. Because of this llvm cannot support elf-interposition semantics. 3) In LTO builds of executables we will have extra knowledge that a symbol resolved to a local definition and can't be preemptable, but have no way to propagate that knowledge through the compiler. This patch adds preemptability specifiers to the IR with the following meaning: dso_local --> means the compiler may assume the symbol will resolve to a definition within the current linkage unit and the symbol may be accessed directly even if the definition is not within this compilation unit. dso_preemptable --> means that the compiler must assume the GlobalValue may be replaced with a definition from outside the current linkage unit at runtime. To ease transitioning dso_preemptable is treated as a 'default' in that low-level codegen will still do the same checks it did previously to see if a symbol should be accessed indirectly. Eventually when IR producers emit the specifiers on all Globalvalues we can change dso_preemptable to mean 'always access indirectly', and remove the current logic. Differential Revision: https://reviews.llvm.org/D20217 llvm-svn: 316668
*	[PowerPC] Use record-form instruction for Less-or-Equal -1 and ↵	Hiroshi Inoue	2017-10-26	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \|	Greater-or-Equal 1 Currently a record-form instruction is used for comparison of "greater than -1" and "less than 1" by modifying the predicate (e.g. LT 1 into LE 0) in addition to the naive case of comparison against 0. This patch also enables emitting a record-form instruction for "less than or equal to -1" (i.e. "less than 0") and "greater than or equal to 1" (i.e. "greater than 0") to increase the optimization opportunities. Differential Revision: https://reviews.llvm.org/D38941 llvm-svn: 316647
*	MIR: Print the register class or bank in vreg defs	Justin Bogner	2017-10-24	3	-19/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This updates the MIRPrinter to include the regclass when printing virtual register defs, which is already valid syntax for the parser. That is, given 64 bit %0 and %1 in a "gpr" regbank, %1(s64) = COPY %0(s64) would now be written as %1:gpr(s64) = COPY %0(s64) While this change alone introduces a bit of redundancy with the registers block, it allows us to update the tests to be more concise and understandable and brings us closer to being able to remove the registers block completely. Note: We generally only print the class in defs, but there is one exception. If there are uses without any defs whatsoever, we'll print the class on all uses. I'm not completely convinced this comes up in meaningful machine IR, but for now the MIRParser and MachineVerifier both accept that kind of stuff, so we don't want to have a situation where we can print something we can't parse. llvm-svn: 316479
*	[PowerPC] Try to simplify a Swap if it feeds a Splat	Stefan Pintilie	2017-10-24	2	-2/+136
\| \| \| \| \| \| \| \| \| \| \| \|	If we have the situation where a Swap feeds a Splat we can sometimes change the index on the Splat and then remove the Swap instruction. Fixed the test case that was failing and recommit after pulling the original commit. Original revision is here: https://reviews.llvm.org/D39009 llvm-svn: 316478
*	Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat"	Stefan Pintilie	2017-10-23	1	-134/+0
\| \| \| \| \| \| \| \| \|	Revert commit r316366. Previous commit causes p8-scalar_vector_conversions.ll to fail. This reverts commit 990e764ad8a2eec206ce5dda6aefab059ccd4e92. llvm-svn: 316371
*	[PowerPC] Try to simplify a Swap if it feeds a Splat	Stefan Pintilie	2017-10-23	1	-0/+134
\| \| \| \| \| \| \| \| \|	If we have the situation where a Swap feeds a Splat we can sometimes change the index on the Splat and then remove the Swap instruction. Differential Revision: https://reviews.llvm.org/D39009 llvm-svn: 316366
*	Reverting r316270 due to failing build bots.	Aaron Ballman	2017-10-21	1	-16/+16
\| \| \| \| \| \| \|	http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/12899 http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7951 llvm-svn: 316276
*	[PPC CodeGen] Fix the bitreverse.i64 intrinsic.	Fangrui Song	2017-10-21	1	-16/+16
\| \| \| \| \| \| \| \| \| \|	Summary: The two 32-bit words were swapped. Subscribers: nemanjai, kbarton Differential Revision: https://reviews.llvm.org/D38705 llvm-svn: 316270
*	Disabling the transformation introduced in r315888	Nemanja Ivanovic	2017-10-20	1	-1/+1
\| \| \| \| \| \| \|	The commit at https://reviews.llvm.org/rL315888 is causing some failures with internal testing. Disabling this code until we can resolve the issues. llvm-svn: 316199
*	[PowerPC] Use helper functions to check sign-/zero-extended value	Hiroshi Inoue	2017-10-18	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \|	Helper functions to identify sign- and zero-extending machine instruction is introduced in rL315888. This patch makes PPCInstrInfo::optimizeCompareInstr use the helper functions. It simplifies the code and also makes possible more optimizations since the helper can do more analysis than the original check code; I observed about 5000 more compare instructions are eliminated while building LLVM. Also, this patch fixes a bug in helpers on ANDIo instruction handling due to the order of checks. This bug causes a failure in an existing test case for optimizeCompareInstr. Differential Revision: https://reviews.llvm.org/D38988 llvm-svn: 316071
*	[PowerPC] Eliminate sign- and zero-extensions if already sign- or zero-extended	Hiroshi Inoue	2017-10-16	4	-62/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch enables redundant sign- and zero-extension elimination in PowerPC MI Peephole pass. If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated. One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in PPC ELF ABI. For example of the following simple code, two extsw instructions are generated before the invocation of int_func and before the return. With this patch, both extsw are eliminated. void int_func(int); void ii_test(int a) { if (a & 1) return int_func(a); } Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling the LLVM+CLANG. Differential Revision: https://reviews.llvm.org/D31319 llvm-svn: 315888
*	[PowerPC] Add profitablilty check for conversion to mtctr loops	Lei Huang	2017-10-12	2	-5/+158
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add profitability checks for modifying counted loops to use the mtctr instruction. The latency of mtctr is only justified if there are more than 4 comparisons that will be removed as a result. Usually counted loops are formed relatively early and before unrolling, so most low trip count loops often don't survive. However we want to ensure that if they do, we do not mistakenly update them to mtctr loops. Use CodeMetrics to ensure we are only doing this for small loops with small trip counts. Differential Revision: https://reviews.llvm.org/D38212 llvm-svn: 315592
*	[PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex ↵	Lei Huang	2017-10-11	4	-238/+210
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	elimination to only use `lis/addi` if necessary. Currently we produce a bunch of unnecessary code when emitting the prologue/epilogue for spills/restores. Namely, if the load from stack slot/store to stack slot instruction is an X-Form instruction, we will always produce an LIS/ORI sequence for the stack offset. Furthermore, we have not exploited the P9 vector D-Form loads/stores for this purpose. This patch address both issues. Specifying the D-Form load as the instruction to use for stack spills/reloads should be safe because: 1. The stack should be aligned according to the ABI 2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will check for the offset being a multiple of 16 and will convert it to an X-Form instruction if it isn't. Differential Revision : https://reviews.llvm.org/D38758 llvm-svn: 315500
*	[NFC] update test case so checks are not order dependent when not needed	Lei Huang	2017-10-11	1	-5/+5
\| \| \| \|	llvm-svn: 315482
*	Fix for PR34888.	Nemanja Ivanovic	2017-10-10	1	-0/+121
\| \| \| \| \| \| \| \| \| \|	The issue is that we assume operand zero of the input to the add instruction is a register. In this case, the input comes from inline assembly and operand zero is not a register thereby causing a crash. The code will bail anyway if the input instruction doesn't have the right opcode. So do that check first and let short-circuiting prevent the crash. llvm-svn: 315285
*	[MC] Suppress .Lcfi labels when emitting textual assembly	Reid Kleckner	2017-10-10	2	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This suppresses the generation of .Lcfi labels in our textual assembler. It was annoying that this generated cascading .Lcfi labels: llc foo.ll -o - \| llvm-mc \| llvm-mc After three trips through MCAsmStreamer, we'd have three labels in the output when none are necessary. We should only bother creating the labels and frame data when making a real object file. This supercedes D38605, which moved the entire .seh_ implementation into MCObjectStreamer. This has the advantage that we do more checking when emitting textual assembly, as a minor efficiency cost. Outputting textual assembly is not performance critical, so this shouldn't matter. Reviewers: majnemer, MatzeB Subscribers: qcolombet, nemanjai, javed.absar, eraman, hiraditya, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D38638 llvm-svn: 315259
*	Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source ↵	Geoff Berry	2017-10-03	5	-6/+5
\| \| \| \| \| \| \| \| \| \|	forwarding"" This reverts commit r314729. Another bug has been encountered in an out-of-tree target reported by Quentin. llvm-svn: 314814
*	[DebugInfo] Handle endianness when moving debug info for split integer ↵	Bjorn Pettersson	2017-10-03	1	-0/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	values (reapplied) Summary: Take the target's endianness into account when splitting the debug information in DAGTypeLegalizer::SetExpandedInteger. This patch fixes so that, for big-endian targets, the fragment expression corresponding to the high part of a split integer value is placed at offset 0, in order to correctly represent the memory address order. I have attached a PPC32 reproducer where the resulting DWARF pieces for a 64-bit integer were incorrectly reversed. Original patch was reverted due to using -stop-after=isel in the test case (but that is only working when AMDGPU target is included in the llc build). The test case has now been updated to use -stop-before=expand-isel-pseudos instead. Patch by: dstenb Reviewers: JDevlieghere, aprantl, dblaikie Reviewed By: JDevlieghere, aprantl, dblaikie Subscribers: nemanjai Differential Revision: https://reviews.llvm.org/D38172 llvm-svn: 314781
*	[PowerPC] Revert r314666.	Tim Shen	2017-10-02	1	-68/+0
\| \| \| \| \| \| \| \| \|	See https://reviews.llvm.org/D38172. I tried to XFAIL it, but sometimes XPASS triggers the bot. Simply revert it. llvm-svn: 314739
*	[PowerPC] Temporarily disable the test introduced by r314666	Tim Shen	2017-10-02	1	-0/+2
\| \| \| \| \| \|	See https://reviews.llvm.org/D38172 for details. llvm-svn: 314732
*	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"	Geoff Berry	2017-10-02	5	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issues addressed since original review: - Avoid bug in regalloc greedy/machine verifier when forwarding to use in an instruction that re-defines the same virtual register. - Fixed bug when forwarding to use in EarlyClobber instruction slot. - Fixed incorrect forwarding to register definitions that showed up in explicit_uses() iterator (e.g. in INLINEASM). - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator. - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 314729
*	[Debug info] Handle endianness when moving debug info for split integer values	Bjorn Pettersson	2017-10-02	1	-0/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Take the target's endianness into account when splitting the debug information in DAGTypeLegalizer::SetExpandedInteger. This patch fixes so that, for big-endian targets, the fragment expression corresponding to the high part of a split integer value is placed at offset 0, in order to correctly represent the memory address order. I have attached a PPC32 reproducer where the resulting DWARF pieces for a 64-bit integer were incorrectly reversed. Patch by: dstenb Reviewers: JDevlieghere, aprantl, dblaikie Reviewed By: JDevlieghere, aprantl, dblaikie Subscribers: nemanjai Differential Revision: https://reviews.llvm.org/D38172 llvm-svn: 314666
*	[PowerPC] support ZERO_EXTEND in tryBitPermutation	Hiroshi Inoue	2017-10-02	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch add a support of ISD::ZERO_EXTEND in PPCDAGToDAGISel::tryBitPermutation to increase the opportunity to use rotate-and-mask by reordering ZEXT and ANDI. Since tryBitPermutation stops analyzing nodes if it hits a ZEXT node while traversing SDNodes, we want to avoid ZEXT between two nodes that can be folded into a rotate-and-mask instruction. For example, we allow these nodes t9: i32 = add t7, Constant:i32<1> t11: i32 = and t9, Constant:i32<255> t12: i64 = zero_extend t11 t14: i64 = shl t12, Constant:i64<2> to be folded into a rotate-and-mask instruction. Such case often happens in array accesses with logical AND operation in the index, e.g. array[i & 0xFF]; Differential Revision: https://reviews.llvm.org/D37514 llvm-svn: 314655
*	[PowerPC] eliminate partially redundant compare instruction	Hiroshi Inoue	2017-09-28	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a follow-on of D37211. D37211 eliminates a compare instruction if two conditional branches can be made based on the one compare instruction, e.g. if (a == 0) { ... } else if (a < 0) { ... } This patch extends this optimization to support partially redundant cases, which often happen in while loops. For example, one compare instruction is moved from the loop body into the preheader by this optimization in the following example. do { if (a == 0) dummy1(); a = func(a); } while (a > 0); Differential Revision: https://reviews.llvm.org/D38236 llvm-svn: 314390
*	[PowerPC] eliminate unconditional branch to the next instruction	Hiroshi Inoue	2017-09-27	3	-26/+2
\| \| \| \| \| \| \| \| \|	This patch makes analyzeBranch eliminate unconditional branch to the next instruction. After basic blocks are re-organized by optimizers, such as machine block placement, a BB may end with an unconditional branch to the next (fallthrough) BB. This patch removes such redundant branch instruction. Differential Revision: https://reviews.llvm.org/D37730 llvm-svn: 314297
*	[PowerPC] Reverting sequence of patches for elimination of comparison ↵	Nemanja Ivanovic	2017-09-26	88	-0/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instructions In the past while, I've committed a number of patches in the PowerPC back end aimed at eliminating comparison instructions. However, this causes some failures in proprietary source and these issues are not observed in SPEC or any open source packages I've been able to run. As a result, I'm pulling the entire series and will refactor it to: - Have a single entry point for easy control - Have fine-grained control over which patterns we transform A side-effect of this is that test cases for these patches (and modified by them) are XFAIL-ed. This is a temporary measure as it is counter-productive to remove/modify these test cases and then have to modify them again when the refactored patch is recommitted. The failure will be investigated in parallel to the refactoring effort and the recommit will either have a fix for it or will leave this transformation off by default until the problem is resolved. llvm-svn: 314244
*	[PowerPC] Eliminate compares - add i64 sext/zext handling for SETLT/SETGT	Nemanja Ivanovic	2017-09-25	5	-1/+467
\| \| \| \| \| \| \| \|	As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential review. llvm-svn: 314106
*	[PowerPC] Eliminate compares - add i64 sext/zext handling for SETLE/SETGE	Nemanja Ivanovic	2017-09-24	4	-0/+524
\| \| \| \| \| \| \| \|	As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential review. llvm-svn: 314073
*	[PowerPC] Eliminate compares - add i32 sext/zext handling for SETULT/SETUGT	Nemanja Ivanovic	2017-09-23	14	-0/+1200
\| \| \| \| \| \| \| \|	As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314062
*	[PowerPC] Eliminate compares - add i32 sext/zext handling for SETULE/SETUGE	Nemanja Ivanovic	2017-09-23	17	-26/+1428
\| \| \| \| \| \| \| \|	As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314060
*	[PowerPC] Eliminate compares - add i32 sext/zext handling for SETLT/SETGT	Nemanja Ivanovic	2017-09-23	8	-7/+610
\| \| \| \| \| \| \| \|	As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314055
*	[XRay] support conditional return on PPC.	Tim Shen	2017-09-22	2	-0/+111
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Conditional returns were not taken into consideration at all. Implement them by turning them into jumps and normal returns. This means there is a slightly higher performance penalty for conditional returns, but this is the best we can do, and it still disturbs little of the rest. Reviewers: dberris, echristo Subscribers: sanjoy, nemanjai, hiraditya, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D38102 llvm-svn: 314005
*	Recommit r310809 with a fix for the spill problem	Nemanja Ivanovic	2017-09-22	25	-96/+971
\| \| \| \| \| \| \| \| \| \|	This patch re-commits the patch that was pulled out due to a problem it caused, but with a fix for the problem. The fix was reviewed separately by Eric Christopher and Hal Finkel. Differential Revision: https://reviews.llvm.org/D38054 llvm-svn: 313978
*	[SelectionDAG] Pick correct frame index in LowerArguments	Bjorn Pettersson	2017-09-21	1	-0/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: SelectionDAGISel::LowerArguments is associating arguments with frame indices (FuncInfo->setArgumentFrameIndex). That information is later on used by EmitFuncArgumentDbgValue to create DBG_VALUE instructions that denotes that a variable can be found on the stack. I discovered that for our (big endian) out-of-tree target the association created by SelectionDAGISel::LowerArguments sometimes is wrong. I've seen this happen when a 64-bit value is passed on the stack. The argument will occupy two stack slots (frame index X, and frame index X+1). The fault is that a call to setArgumentFrameIndex is associating the 64-bit argument with frame index X+1. The effect is that the debug information (DBG_VALUE) will point at the least significant part of the arguement on the stack. When printing the argument in a debugger I will get the wrong value. I managed to create a test case for PowerPC that seems to show the same kind of problem. The bugfix will look at the datalayout, taking endianness into account when examining a BUILD_PAIR node, assuming that the least significant part is in the first operand of the BUILD_PAIR. For big endian targets we should use the frame index from the second operand, as the most significant part will be stored at the lower address (using the highest frame index). Reviewers: bogner, rnk, hfinkel, sdardis, aprantl Reviewed By: aprantl Subscribers: nemanjai, aprantl, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D37740 llvm-svn: 313901
*	Fix buildbot failures, add mtriple to gpr-vsr-spill.ll	Zaara Syeda	2017-09-21	1	-1/+1
\| \| \| \|	llvm-svn: 313890
*	[Power9] Spill gprs to vector registers rather than stack	Zaara Syeda	2017-09-21	1	-0/+24
\| \| \| \| \| \| \| \| \| \|	This patch updates register allocation to enable spilling gprs to volatile vector registers rather than the stack. It can be enabled for Power9 with option -ppc-enable-gpr-to-vsr-spills. Differential Revision: https://reviews.llvm.org/D34815 llvm-svn: 313886
*	[PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD.	Tony Jiang	2017-09-19	1	-0/+60
\| \| \| \| \| \| \| \| \| \|	Two blocks prior to the join each perform an li and the the join block has an add using the initialized register. Optimize each predecessor block to instead use addi and delete the li's and add. Differential Revision: https://reviews.llvm.org/D36734 llvm-svn: 313639
*	[XRay][CodeGen] Use the current function symbol as the associated symbol for ↵	Dean Michael Berris	2017-09-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the instrumentation map Summary: XRay had been assuming that the previous section is the "text" section of the function when lowering the instrumentation map. Unfortunately this is not a safe assumption, because we may be coming from lowering debug type information for the function being lowered. This fixes an issue with combining -gsplit-dwarf, -generate-type-units, -debug-compile and -fxray-instrument for sole member functions. When the split dwarf section is stripped, we're left with references from the xray_instr_map to the debug section. The change now uses the function's symbol instead of the previous section's start symbol. We found the bug while attempting to strip the split debug sections off an XRay-instrumented object file, which had a peculiar edge-case for single-function classes where the single function is being lowered. Because XRay had assocaited the instrumentation map for a function to the debug types section instead of the function's section, the objcopy call will fail due to the misplaced reference from the xray_instr_map section. Reviewers: pcc, dblaikie, echristo Subscribers: llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D37791 llvm-svn: 313233
*	Update branch coalescing to be a PowerPC specific pass	Lei Huang	2017-09-12	1	-14/+43
\| \| \| \| \| \| \| \| \| \| \| \|	Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets. Pass is currently off by default. Enabled via -enable-ppc-branch-coalesce. Differential Revision : https: // reviews.llvm.org/D32776 llvm-svn: 313061
*	PPC: Don't select lxv/stxv for insufficiently aligned stack slots.	Kyle Butt	2017-09-09	1	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The lxv/stxv instructions require an offset that is 0 % 16. Previously we were selecting lxv/stxv for loads and stores to the stack where the offset from the slot was a multiple of 16, but the stack slot was not 16 or more byte aligned. When the frame gets lowered these transform to r(1\|31) + slot + offset. If slot is not aligned, slot + offset may not be 0 % 16. Now we require 16 byte or more alignment for select lxv/stxv to stack slots. Includes a testcase that shows both sufficiently and insufficiently aligned stack slots. llvm-svn: 312843
*	[XRay][CodeGen][PowerPC] Fix tail exit codegen for XRay in PPC	Dean Michael Berris	2017-09-08	3	-0/+114
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes code-gen for XRay in PPC. The regression wasn't caught by codegen tests which we add in this change. What happened was the following: - For tail exits, we used to unconditionally prepend the returns/exits with a pseudo-instruction that gets lowered to the instrumentation sled (and leave the actual return/exit instruction as-is). - Changes to the XRay instrumentation pass caused the tail exits to suddenly also emit the tail exit pseudo-instruction, since the check for whether a return instruction was also a call instruction meant it was a tail exit instruction. - None of the tests caught the regression either due to non-existent tests, or the tests being disabled/removed for continuous breakage. This change re-introduces some of the basic tests and verifies that we're back to a state that allows the back-end to generate appropriate XRay instrumented binaries for PPC in the presence of tail exits. Reviewers: echristo, timshen Subscribers: nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D37570 llvm-svn: 312772
*	[PowerPC] Don't use xscvdpspn on the P7	Hal Finkel	2017-09-06	1	-0/+27
\| \| \| \| \| \| \|	xscvdpspn was not introduced until the P8, so don't use it on the P7. Fixes a regression introduced in r288152. llvm-svn: 312612
*	[PowerPC] eliminate redundant compare instruction	Hiroshi Inoue	2017-09-05	1	-0/+722
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If multiple conditional branches are executed based on the same comparison, we can execute multiple conditional branches based on the result of one comparison on PPC. For example, if (a == 0) { ... } else if (a < 0) { ... } can be executed by one compare and two conditional branches instead of two pairs of a compare and a conditional branch. This patch identifies a code sequence of the two pairs of a compare and a conditional branch and merge the compares if possible. To maximize the opportunity, we do canonicalization of code sequence before merging compares. For the above example, the input for this pass looks like: cmplwi r3, 0 beq 0, .LBB0_3 cmpwi r3, -1 bgt 0, .LBB0_4 So, before merging two compares, we canonicalize it as cmpwi r3, 0 ; cmplwi and cmpwi yield same result for beq beq 0, .LBB0_3 cmpwi r3, 0 ; greather than -1 means greater or equal to 0 bge 0, .LBB0_4 The generated code should be cmpwi r3, 0 beq 0, .LBB0_3 bge 0, .LBB0_4 Differential Revision: https://reviews.llvm.org/D37211 llvm-svn: 312514