bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[PowerPC] Remove allow-deprecated-dag-overlap and fix broken tests	Jinsong Ji	2019-11-12	1	-4/+4
	Summary: This was found during review of https://reviews.llvm.org/D67088. CHECK-DAG has been non-overlapping since https://reviews.llvm.org/D47106. -allow-deprecated-dag-overlap was introduced to temporarily accept the old behavior, but it actually hides some broken tests, e.g. `test/CodeGen/PowerPC/swaps-le-1.ll`: the codegen has changed, but the CHECK-DAG directives still pass because overlap is allowed. This patch removes the deprecated option and fixes the broken tests. Reviewers: #powerpc, hfinkel, nemanjai, steven.zhang, shchenz Reviewed By: shchenz Subscribers: shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69733
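The difference between overlapping and non-overlapping CHECK-DAG matching can be illustrated with a deliberately simplified model. The sketch below is not FileCheck's implementation: the function name `match_dag` is hypothetical, patterns are plain substrings rather than regexes, and the search is a simple greedy scan. It only shows why two identical directives can both "pass" against a single occurrence when overlap is allowed.

```python
def match_dag(patterns, text, allow_overlap=False):
    """Simplified model of a CHECK-DAG group (hypothetical helper, not FileCheck code).

    Each pattern must be found somewhere in `text`, in any order.
    Unless `allow_overlap` is set, a match may not share characters
    with the match of an earlier pattern in the same group.
    """
    taken = []  # (start, end) ranges already claimed by earlier patterns
    for pat in patterns:
        pos = 0
        while True:
            start = text.find(pat, pos)
            if start == -1:
                return False  # this pattern has no acceptable match left
            end = start + len(pat)
            disjoint = all(end <= s or start >= e for s, e in taken)
            if allow_overlap or disjoint:
                taken.append((start, end))
                break
            pos = start + 1  # skip the overlapping candidate and keep searching
    return True


one_swap = "lxvd2x 0, 0, 3\nxxswapd 34, 0\n"
two_swaps = one_swap + "xxswapd 35, 1\n"
checks = ["xxswapd", "xxswapd"]  # the test expects two swaps

print(match_dag(checks, one_swap, allow_overlap=True))   # old behavior
print(match_dag(checks, one_swap, allow_overlap=False))  # non-overlapping behavior
print(match_dag(checks, two_swaps, allow_overlap=False))
```

With overlap allowed, both directives match the same `xxswapd`, so a test expecting two swaps still passes after codegen stops emitting one of them.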
*	[PowerPC][NFC] Remove deprecated Function Attrs comments	Jinsong Ji	2019-10-22	1	-129/+2
*	[PowerPC] Exploit single instruction load-and-splat for word and doubleword	Nemanja Ivanovic	2019-09-17	1	-12/+4
	We currently produce a load, followed by (possibly a move for integers and) a splat as separate instructions. VSX has always had a splatting load for doublewords, but as of Power9, we have it for words as well. This patch just exploits these instructions. Differential revision: https://reviews.llvm.org/D63624 llvm-svn: 372139
*	[PowerPC][NFC] Remove duplicate tests in build-vector-test.ll	Jinsong Ji	2019-08-14	1	-341/+221
	AllOnes has been split into build-vector-allones.ll. llvm-svn: 368900
*	recommit:[PowerPC] Eliminate loads/swap feeding swap/store for vector type by using big-endian load/store	Zi Xuan Wu	2019-08-01	1	-30/+18
	PowerPC has an instruction that loads a vector in big-endian element order on a little-endian target, so we can combine vector load + reverse into a big-endian load to eliminate the swap instruction. Likewise, combine vector reverse + store into a big-endian store. Differential Revision: https://reviews.llvm.org/D65063 llvm-svn: 367516
*	revert r367382 because buildbot failure	Zi Xuan Wu	2019-07-31	1	-20/+42
	llvm-svn: 367388
*	[PowerPC] Eliminate loads/swap feeding swap/store for vector type by using big-endian load/store	Zi Xuan Wu	2019-07-31	1	-42/+20
	PowerPC has an instruction that loads a vector in big-endian element order on a little-endian target, so we can combine vector load + reverse into a big-endian load to eliminate the swap instruction. Likewise, combine vector reverse + store into a big-endian store. llvm-svn: 367382
*	[MachineScheduler] checkResourceLimit boundary condition update	Jinsong Ji	2019-06-07	1	-2/+2
	When we call checkResourceLimit in bumpCycle or bumpNode and the resource count has just reached the limit (the two sides of the comparison are equal), we should return true to mark that we are resource limited for the next schedule. Otherwise we might continue to schedule in favor of latency for one more step and create a schedule that actually overbooks the resource. When we call checkResourceLimit to estimate the resource limit before scheduling, we don't need to return true even if the two sides are equal, as that shouldn't limit the schedule. Differential Revision: https://reviews.llvm.org/D62345 llvm-svn: 362805
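The boundary condition described in this commit can be sketched as follows. This is a Python rendering for illustration only, not the LLVM source; the parameter names (`l_factor`, `count`, `max_executed_res_count`, `after_sched_node`) are assumptions chosen to mirror the description, and the only point being shown is that the comparison is inclusive after scheduling a node and strict when estimating beforehand.

```python
def check_resource_limit(l_factor, count, max_executed_res_count, after_sched_node):
    """Sketch of the boundary rule (names are illustrative assumptions).

    Returns True when the zone should be treated as resource limited.
    """
    res_cnt_factor = count - max_executed_res_count * l_factor
    if after_sched_node:
        # Called from bumpCycle/bumpNode: hitting the limit exactly already
        # means the next pick must respect resources, so equality counts.
        return res_cnt_factor >= l_factor
    # Estimating before scheduling: equality alone should not limit the schedule.
    return res_cnt_factor > l_factor


# Exactly at the limit: 10 - 4 * 2 == 2 == l_factor
print(check_resource_limit(2, 10, 4, after_sched_node=True))
print(check_resource_limit(2, 10, 4, after_sched_node=False))
```

The two calls differ only in `after_sched_node`, which is the case the commit changes.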
*	[PowerPC] P9 Scheduling Model: dispatching rule fixes	Jinsong Ji	2019-06-04	1	-4/+4
	This addresses some of the problems in the existing P9 resource modeling, especially the dispatching rules. Instead of using a hypothetical DISPATCHER, we use the number of actual dispatch slots, define SchedWriteRes to model the dispatch rules, and then update the instruction classes according to those rules. All dispatch rule and instruction class updates follow the POWER9 User Manual. Differential Revision: https://reviews.llvm.org/D61873 llvm-svn: 362509
*	[PowerPC] [ISEL] select x-form instruction for unaligned offset	Chen Zheng	2019-05-22	1	-16/+16
	Differential Revision: https://reviews.llvm.org/D62173 llvm-svn: 361346
*	[PowerPC][NFC] Update build-vector-tests.ll using utils/update_llc_test_checks.py	Jinsong Ji	2019-05-07	1	-2460/+4053
	build-vector-tests.ll is a huge test case and is hard to maintain: any fundamental change might require updating hundreds of lines. We should use the script to maintain it. This patch simply runs utils/update_llc_test_checks.py on it. There should be no missing test points. llvm-svn: 360175
*	[Power9] Enable the Out-of-Order scheduling model for P9 hw	QingShan Zhang	2019-01-03	1	-38/+38
	When we switched to the MI scheduler for P9, the hardware was modeled as out of order. However, inside the MI scheduler algorithm we still use the in-order scheduling model, because MicroOpBufferSize isn't set. The MI scheduler then assumes the hardware cannot buffer ops, so a pending instruction can be scheduled only after all available instructions have issued. That is not true for our P9 hardware. This patch enables the out-of-order scheduling model. The buffer size of 44 is taken from the P9 hardware spec, and performance tests indicate that this value doesn't hurt cpu2017. With this patch, three benchmarks improve by more than 3% and one degrades by more than 3%. The details are as follows:
	x264_r: +6.95%
	cactuBSSN_r: +6.94%
	lbm_r: +4.11%
	xz_r: -3.85%
	The GEOMEAN for all the C/C++ benchmarks in spec2017 improves by about 0.18%. Reviewer: Nemanjai Differential Revision: https://reviews.llvm.org/D55810 llvm-svn: 350285
*	[PowerPC] Improve BUILD_VECTOR of 4 i32s	Lei Huang	2018-10-26	1	-104/+84
	Currently, for this node:
	  vector int test(int a, int b, int c, int d) { return (vector int) { a, b, c, d }; }
	we get this on Power9:
	  mtvsrdd 34, 5, 3
	  mtvsrdd 35, 6, 4
	  vmrgow 2, 3, 2
	and this on Power8:
	  mtvsrwz 0, 3
	  mtvsrwz 1, 5
	  mtvsrwz 2, 4
	  mtvsrwz 3, 6
	  xxmrghd 34, 1, 0
	  xxmrghd 35, 3, 2
	  vmrgow 2, 3, 2
	This can be improved to this on LE Power9:
	  rldimi 3, 4, 32, 0
	  rldimi 5, 6, 32, 0
	  mtvsrdd 34, 5, 3
	and this on LE Power8:
	  rldimi 3, 4, 32, 0
	  rldimi 5, 6, 32, 0
	  mtvsrd 34, 3
	  mtvsrd 35, 5
	  xxpermdi 34, 35, 34, 0
	This patch updates the TD pattern to generate the optimized sequence for both Power8 and Power9 on LE and BE. Differential Revision: https://reviews.llvm.org/D53494 llvm-svn: 345414
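To see why the improved LE Power9 sequence needs no vector merge, it helps to model the two GPR packing steps on plain integers. The sketch below is an illustration written for this log, not code from the patch: the helper names are hypothetical, it assumes the four `int` arguments arrive in r3 through r6, and it models `mtvsrdd` simply as producing a (high doubleword, low doubleword) pair without making any claim about element numbering.

```python
MASK32 = 0xFFFFFFFF


def rldimi_32_0(ra, rs):
    """Model of `rldimi RA, RS, 32, 0` (hypothetical helper).

    The low 32 bits of RS are inserted into the high 32 bits of RA;
    the low 32 bits of RA are left unchanged.
    """
    return ((rs & MASK32) << 32) | (ra & MASK32)


def mtvsrdd(ra, rb):
    """Model of `mtvsrdd XT, RA, RB` as a (high doubleword, low doubleword) pair."""
    return (ra, rb)


def build_vector_le_p9(a, b, c, d):
    # Assumption: a, b, c, d are passed in r3, r4, r5, r6 respectively.
    r3, r4, r5, r6 = a, b, c, d
    r3 = rldimi_32_0(r3, r4)   # rldimi 3, 4, 32, 0
    r5 = rldimi_32_0(r5, r6)   # rldimi 5, 6, 32, 0
    return mtvsrdd(r5, r3)     # mtvsrdd 34, 5, 3


high, low = build_vector_le_p9(1, 2, 3, 4)
print(hex(high), hex(low))
```

Each 64-bit register ends up holding two of the four words, so a single `mtvsrdd` assembles the whole 128-bit value.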
*	[PowerPC] Improve codegen for vector loads using scalar_to_vector	Zaara Syeda	2018-08-08	1	-16/+28
	This patch aims to improve the codegen for vector loads involving the scalar_to_vector (load X) sequence. Initially, ld->mv instructions were used for scalar_to_vector (load X), so this patch allows scalar_to_vector (load X) to utilize:
	- LXSD and LXSDX for i64 and f64
	- LXSIWAX for i32 (sign extension to i64)
	- LXSIWZX for i32 and f64
	Committing on behalf of Amy Kwan. Differential Revision: https://reviews.llvm.org/D48950 llvm-svn: 339260
*	[PowerPC] Do not round values prior to converting to integer	Nemanja Ivanovic	2018-08-02	1	-204/+153
	Adding the FP_ROUND nodes when combining FP_TO_[SU]INT of elements feeding a BUILD_VECTOR into an FP_TO_[SU]INT of the built vector loses precision. This patch removes the code that adds these nodes to true f64 operands. It also adds patterns required to ensure the code is still vectorized rather than converting individual elements and inserting into a vector. Fixes https://bugs.llvm.org/show_bug.cgi?id=38342 Differential Revision: https://reviews.llvm.org/D50121 llvm-svn: 338658
*	[FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests	Joel E. Denny	2018-07-11	1	-4/+4
	See https://reviews.llvm.org/D47106 for details. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D47171 This commit drops that patch's changes to: llvm/test/CodeGen/NVPTX/f16x2-instructions.ll llvm/test/CodeGen/NVPTX/param-load-store.ll For some reason, the DOS line endings there prevent me from committing via the monorepo. A follow-up commit (not via the monorepo) will finish the patch. llvm-svn: 336843
*	[PowerPC] Remove the match pattern in the definition of LXSDX/STXSDX	Lei Huang	2018-05-24	1	-40/+40
	The match pattern in the definition of LXSDX is xoaddr, so the Pseudo instruction XFLOADf64 never gets selected. XFLOADf64 expands to LXSDX/LFDX post RA based on the register pressure. To avoid ambiguity, we need to remove the select pattern for LXSDX, same as what was done for LXSD. STXSDX has the same issue. Patch by Qing Shan Zhang (steven.zhang). Differential Revision: https://reviews.llvm.org/D47178 llvm-svn: 333150
*	[PowerPC] Convert r+r instructions to r+i (pre and post RA)	Nemanja Ivanovic	2017-12-15	1	-28/+28
	This patch adds the necessary infrastructure to convert instructions that take two register operands to those that take a register and immediate if the necessary operand is produced by a load-immediate. Furthermore, it uses this infrastructure to perform such conversions twice - first at MachineSSA and then pre-emit. There are a number of reasons we may end up with opportunities for this transformation, including but not limited to:
	- X-Form instructions chosen since the exact offset isn't available at ISEL time
	- Atomic instructions with constant operands (we will add patterns for this in the future)
	- Tail duplication may duplicate code where one block contains this redundancy
	- When emitting compare-free code in PPCDAGToDAGISel, we don't handle constant comparands specially
	Furthermore, this patch moves the initialization of PPCMIPeepholePass so that it can be used for MIR tests. llvm-svn: 320791
*	[PPC] Heuristic to choose between a X-Form VSX ld/st vs a X-Form FP ld/st.	Tony Jiang	2017-11-20	1	-36/+36
	The VSX versions have the advantage of a full 64-register target whereas the FP ones have the advantage of lower latency and higher throughput. So what we’re after is using the faster instructions in low register pressure situations and using the larger register file in high register pressure situations. The heuristic chooses between the following 7 pairs of instructions:
	PPC::LXSSPX vs PPC::LFSX
	PPC::LXSDX vs PPC::LFDX
	PPC::STXSSPX vs PPC::STFSX
	PPC::STXSDX vs PPC::STFDX
	PPC::LXSIWAX vs PPC::LFIWAX
	PPC::LXSIWZX vs PPC::LFIWZX
	PPC::STXSIWX vs PPC::STFIWX
	Differential Revision: https://reviews.llvm.org/D38486 llvm-svn: 318651
*	[PowerPC] Pretty-print CR bits the way the binutils disassembler does	Nemanja Ivanovic	2017-07-25	1	-8/+12
	This patch just adds printing of CR bit registers in a more human-readable form akin to that used by the GNU binutils. Differential Revision: https://reviews.llvm.org/D31494 llvm-svn: 309001
*	[PowerPC] Ensure displacements for DQ-Form instructions are multiples of 16	Nemanja Ivanovic	2017-07-13	1	-19/+21
	As outlined in the PR, we didn't ensure that displacements for DQ-Form instructions are multiples of 16. Since the instruction encoding encodes a quad-word displacement, a sub-16 byte displacement is meaningless and ends up being encoded incorrectly. Fixes https://bugs.llvm.org/show_bug.cgi?id=33671. Differential Revision: https://reviews.llvm.org/D35007 llvm-svn: 307934
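The constraint on DQ-Form displacements can be made concrete with a small encoder. This is a sketch for illustration, not LLVM code: the function names are hypothetical, and it assumes the usual DQ-Form layout in which a 12-bit field holds the byte displacement divided by 16 and is sign-extended after four zero bits are appended.

```python
def encode_dq_field(disp):
    """Return the 12-bit DQ field for a byte displacement (hypothetical helper).

    Rejects displacements that the quad-word encoding cannot represent.
    """
    if disp % 16 != 0:
        raise ValueError(f"displacement {disp} is not a multiple of 16")
    if not -32768 <= disp <= 32752:
        raise ValueError(f"displacement {disp} does not fit in the DQ field")
    return (disp >> 4) & 0xFFF


def decode_dq_field(dq):
    """Recover the byte displacement: append four zero bits, then sign-extend."""
    value = (dq & 0xFFF) << 4
    if value & 0x8000:
        value -= 0x10000
    return value


print(encode_dq_field(32))                    # quad-word index 2
print(decode_dq_field(encode_dq_field(-16)))  # round-trips to -16
```

A displacement such as 8 has no representation at all, which is why it must be rejected (or the addressing form changed) rather than silently truncated.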
*	[PowerPC] Reduce register pressure by not materializing a constant just for use as an index register for X-Form loads/stores.	Lei Huang	2017-07-10	1	-2/+2
	For this example:
	  float test (int *arr) { return arr[2]; }
	We currently generate the following code:
	  li r4, 8
	  lxsiwax f0, r3, r4
	  xscvsxdsp f1, f0
	With this patch, we will now generate:
	  addi r3, r3, 8
	  lxsiwax f0, 0, r3
	  xscvsxdsp f1, f0
	Originally reported in: https://bugs.llvm.org/show_bug.cgi?id=27204 Differential Revision: https://reviews.llvm.org/D35027 llvm-svn: 307553
*	P9: D-form vector load/store. Differential Revision: https://reviews.llvm.org/D33248	Zaara Syeda	2017-05-24	1	-108/+108
	llvm-svn: 303780
*	[PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE	Nemanja Ivanovic	2017-05-02	1	-26/+20
	Fixes PR30730. This is a re-commit of a pulled commit. The commit was pulled because some software projects contained uses of Altivec vectors that violated alignment requirements. Known issues have now been fixed. Committing on behalf of Lei Huang. Differential Revision: https://reviews.llvm.org/D26861 llvm-svn: 301892
*	[PPC] corrections in two testcases	Ehsan Amiri	2016-12-16	1	-14/+14
	Removing sensitivity to scheduling (by using CHECK-DAG instead of CHECK) and some other minor corrections, in preparation for committing the Power9 processor model. llvm-svn: 289900
*	[PowerPC] Improvements for BUILD_VECTOR Vol. 4	Nemanja Ivanovic	2016-12-06	1	-0/+4858
	This is the final patch in the series of patches that improves BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations to remove redundant instructions. It also adds a large test case which encompasses a large set of code patterns that build vectors - this test case was the motivator for this series of patches. Differential Revision: https://reviews.llvm.org/D26066 llvm-svn: 288800