summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/PowerPC/build-vector-tests.ll
Commit message (Collapse)AuthorAgeFilesLines
* [PowerPC] Remove allow-deprecated-dag-overlap and fix broken testsJinsong Ji2019-11-121-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This is found during review of https://reviews.llvm.org/D67088. CHECK-DAG is non-overlapping after https://reviews.llvm.org/D47106. -allow-deprecated-dag-overlap was introduced to temporary accept old behavior. But it actually hide some broken tests, eg: `test/CodeGen/PowerPC/swaps-le-1.ll` The codegen has changed, but the CHECK-DAG still PASS due to allowing `overlap`. This patch remove the deprecated options, and fix the broken tests. Reviewers: #powerpc, hfinkel, nemanjai, steven.zhang, shchenz Reviewed By: shchenz Subscribers: shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69733
* [PowerPC][NFC] Remove deprecated Function Attrs commentsJinsong Ji2019-10-221-129/+2
|
* [PowerPC] Exploit single instruction load-and-splat for word and doublewordNemanja Ivanovic2019-09-171-12/+4
| | | | | | | | | | | We currently produce a load, followed by (possibly a move for integers and) a splat as separate instructions. VSX has always had a splatting load for doublewords, but as of Power9, we have it for words as well. This patch just exploits these instructions. Differential revision: https://reviews.llvm.org/D63624 llvm-svn: 372139
* [PowerPC][NFC] Remove duplicate tests in build-vector-test.llJinsong Ji2019-08-141-341/+221
| | | | | | AllOnes has been split into build-vector-allones.ll. llvm-svn: 368900
* recommit:[PowerPC] Eliminate loads/swap feeding swap/store for vector type ↵Zi Xuan Wu2019-08-011-30/+18
| | | | | | | | | | | | by using big-endian load/store In PowerPC, there is instruction to load vector in big endian element order when it's in little endian target. So we can combine vector load + reverse into big endian load to eliminate the swap instruction. Also combine vector reverse + store into big endian store. Differential Revision: https://reviews.llvm.org/D65063 llvm-svn: 367516
* revert r367382 because buildbot failureZi Xuan Wu2019-07-311-20/+42
| | | | llvm-svn: 367388
* [PowerPC] Eliminate loads/swap feeding swap/store for vector type by using ↵Zi Xuan Wu2019-07-311-42/+20
| | | | | | | | | | big-endian load/store In PowerPC, there is instruction to load vector in big endian element order when it's in little endian target. So we can combine vector load + reverse into big endian load to eliminate the swap instruction. Also combine vector reverse + store into big endian store. llvm-svn: 367382
* [MachineScheduler] checkResourceLimit boundary condition updateJinsong Ji2019-06-071-2/+2
| | | | | | | | | | | | | | | | | When we call checkResourceLimit in bumpCycle or bumpNode, and we know the resource count has just reached the limit (the equations are equal). We should return true to mark that we are resource limited for next schedule, or else we might continue to schedule in favor of latency for 1 more schedule and create a schedule that actually overbook the resource. When we call checkResourceLimit to estimate the resource limite before scheduling, we don't need to return true even if the equations are equal, as it shouldn't limit the schedule for it . Differential Revision: https://reviews.llvm.org/D62345 llvm-svn: 362805
* [PowerPC] P9 Scheduling Model: dispatching rule fixesJinsong Ji2019-06-041-4/+4
| | | | | | | | | | | | | | | | This is to address some of the problems in existing P9 resource modeling, especially about the dispatching rules. Instead of using a hypothetical DISPATCHER , we try to use the number of actual dispatch slots, and define SchedWriteRes to model dispatch rules, then update instruction classes according to dispatch rules. All the dispatch rules and instruction classes update are made according to POWER9 User Manual. Differential Revision: https://reviews.llvm.org/D61873 llvm-svn: 362509
* [PowerPC] [ISEL] select x-form instruction for unaligned offsetChen Zheng2019-05-221-16/+16
| | | | | | Differential Revision: https://reviews.llvm.org/D62173 llvm-svn: 361346
* [PowerPC][NFC] Update build-vector-tests.ll using ↵Jinsong Ji2019-05-071-2460/+4053
| | | | | | | | | | | | | utils/update_llc_test_checks.py build-vector-tests.ll is a huge testcase, it is hard to maintain: eg: any fundamental changes might need to update hundreds of lines. We should leverage the script to maintain it. This patch simply run utils/update_llc_test_checks.py on it. There should be no missing test points. llvm-svn: 360175
* [Power9] Enable the Out-of-Order scheduling model for P9 hwQingShan Zhang2019-01-031-38/+38
| | | | | | | | | | | | | | | | | | | | | | | | | When switched to the MI scheduler for P9, the hardware is modeled as out of order. However, inside the MI Scheduler algorithm, we still use the in-order scheduling model as the MicroOpBufferSize isn't set. The MI scheduler take it as the hw cannot buffer the op. So, only when all the available instructions issued, the pending instruction could be scheduled. That is not true for our P9 hw in fact. This patch is trying to enable the Out-of-Order scheduling model. The buffer size 44 is picked from the P9 hw spec, and the perf test indicate that, its value won't hurt the cpu2017. With this patch, there are 3 specs improved over 3% and 1 spec deg over 3%. The detail is as follows: x264_r: +6.95% cactuBSSN_r: +6.94% lbm_r: +4.11% xz_r: -3.85% And the GEOMEAN for all the C/C++ spec in spec2017 is about 0.18% improved. Reviewer: Nemanjai Differential Revision: https://reviews.llvm.org/D55810 llvm-svn: 350285
* [PowerPC] Improve BUILD_VECTOR of 4 i32sLei Huang2018-10-261-104/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, for this node: vector int test(int a, int b, int c, int d) { return (vector int) { a, b, c, d }; } we get this on Power9: mtvsrdd 34, 5, 3 mtvsrdd 35, 6, 4 vmrgow 2, 3, 2 and this on Power8: mtvsrwz 0, 3 mtvsrwz 1, 5 mtvsrwz 2, 4 mtvsrwz 3, 6 xxmrghd 34, 1, 0 xxmrghd 35, 3, 2 vmrgow 2, 3, 2 This can be improved to this on LE Power9: rldimi 3, 4, 32, 0 rldimi 5, 6, 32, 0 mtvsrdd 34, 5, 3 and this on LE Power8 rldimi 3, 4, 32, 0 rldimi 5, 6, 32, 0 mtvsrd 34, 3 mtvsrd 35, 5 xxpermdi 34, 35, 34, 0 This patch updates the TD pattern to generate the optimized sequence for both Power8 and Power9 on LE and BE. Differential Revision: https://reviews.llvm.org/D53494 llvm-svn: 345414
* [PowerPC] Improve codegen for vector loads using scalar_to_vectorZaara Syeda2018-08-081-16/+28
| | | | | | | | | | | | | | | | This patch aims to improve the codegen for vector loads involving the scalar_to_vector (load X) sequence. Initially, ld->mv instructions were used for scalar_to_vector (load X), so this patch allows scalar_to_vector (load X) to utilize: LXSD and LXSDX for i64 and f64 LXSIWAX for i32 (sign extension to i64) LXSIWZX for i32 and f64 Committing on behalf of Amy Kwan. Differential Revision: https://reviews.llvm.org/D48950 llvm-svn: 339260
* [PowerPC] Do not round values prior to converting to integerNemanja Ivanovic2018-08-021-204/+153
| | | | | | | | | | | | | | | Adding the FP_ROUND nodes when combining FP_TO_[SU]INT of elements feeding a BUILD_VECTOR into an FP_TO_[SU]INT of the built vector loses precision. This patch removes the code that adds these nodes to true f64 operands. It also adds patterns required to ensure the code is still vectorized rather than converting individual elements and inserting into a vector. Fixes https://bugs.llvm.org/show_bug.cgi?id=38342 Differential Revision: https://reviews.llvm.org/D50121 llvm-svn: 338658
* [FileCheck] Add -allow-deprecated-dag-overlap to failing llvm testsJoel E. Denny2018-07-111-4/+4
| | | | | | | | | | | | | | | | | | | See https://reviews.llvm.org/D47106 for details. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D47171 This commit drops that patch's changes to: llvm/test/CodeGen/NVPTX/f16x2-instructions.ll llvm/test/CodeGen/NVPTX/param-load-store.ll For some reason, the dos line endings there prevent me from commiting via the monorepo. A follow-up commit (not via the monorepo) will finish the patch. llvm-svn: 336843
* [PowerPC] Remove the match pattern in the definition of LXSDX/STXSDXLei Huang2018-05-241-40/+40
| | | | | | | | | | | | | | The match pattern in the definition of LXSDX is xoaddr, so the Pseudo instruction XFLOADf64 never gets selected. XFLOADf64 expands to LXSDX/LFDX post RA based on the register pressure. To avoid ambiguity, we need to remove the select pattern for LXSDX, same as what was done for LXSD. STXSDX also have the same issue. Patch by Qing Shan Zhang (steven.zhang). Differential Revision: https://reviews.llvm.org/D47178 llvm-svn: 333150
* [PowerPC] Convert r+r instructions to r+i (pre and post RA)Nemanja Ivanovic2017-12-151-28/+28
| | | | | | | | | | | | | | | | | | | | | | This patch adds the necessary infrastructure to convert instructions that take two register operands to those that take a register and immediate if the necessary operand is produced by a load-immediate. Furthermore, it uses this infrastructure to perform such conversions twice - first at MachineSSA and then pre-emit. There are a number of reasons we may end up with opportunities for this transformation, including but not limited to: - X-Form instructions chosen since the exact offset isn't available at ISEL time - Atomic instructions with constant operands (we will add patterns for this in the future) - Tail duplication may duplicate code where one block contains this redundancy - When emitting compare-free code in PPCDAGToDAGISel, we don't handle constant comparands specially Furthermore, this patch moves the initialization of PPCMIPeepholePass so that it can be used for MIR tests. llvm-svn: 320791
* [PPC] Heuristic to choose between a X-Form VSX ld/st vs a X-Form FP ld/st.Tony Jiang2017-11-201-36/+36
| | | | | | | | | | | | | | | | | | | | The VSX versions have the advantage of a full 64-register target whereas the FP ones have the advantage of lower latency and higher throughput. So what we’re after is using the faster instructions in low register pressure situations and using the larger register file in high register pressure situations. The heuristic chooses between the following 7 pairs of instructions. PPC::LXSSPX vs PPC::LFSX PPC::LXSDX vs PPC::LFDX PPC::STXSSPX vs PPC::STFSX PPC::STXSDX vs PPC::STFDX PPC::LXSIWAX vs PPC::LFIWAX PPC::LXSIWZX vs PPC::LFIWZX PPC::STXSIWX vs PPC::STFIWX Differential Revision: https://reviews.llvm.org/D38486 llvm-svn: 318651
* [PowerPC] Pretty-print CR bits the way the binutils disassembler doesNemanja Ivanovic2017-07-251-8/+12
| | | | | | | | | This patch just adds printing of CR bit registers in a more human-readable form akin to that used by the GNU binutils. Differential Revision: https://reviews.llvm.org/D31494 llvm-svn: 309001
* [PowerPC] Ensure displacements for DQ-Form instructions are multiples of 16Nemanja Ivanovic2017-07-131-19/+21
| | | | | | | | | | | | | As outlined in the PR, we didn't ensure that displacements for DQ-Form instructions are multiples of 16. Since the instruction encoding encodes a quad-word displacement, a sub-16 byte displacement is meaningless and ends up being encoded incorrectly. Fixes https://bugs.llvm.org/show_bug.cgi?id=33671. Differential Revision: https://reviews.llvm.org/D35007 llvm-svn: 307934
* [PowerPC] Reduce register pressure by not materializing a constant just for ↵Lei Huang2017-07-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | use as an index register for X-Form loads/stores. For this example: float test (int *arr) { return arr[2]; } We currently generate the following code: li r4, 8 lxsiwax f0, r3, r4 xscvsxdsp f1, f0 With this patch, we will now generate: addi r3, r3, 8 lxsiwax f0, 0, r3 xscvsxdsp f1, f0 Originally reported in: https://bugs.llvm.org/show_bug.cgi?id=27204 Differential Revision: https://reviews.llvm.org/D35027 llvm-svn: 307553
* P9: D-form vector load/store. Differential Revision: ↵Zaara Syeda2017-05-241-108/+108
| | | | | | https://reviews.llvm.org/D33248 llvm-svn: 303780
* [PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LENemanja Ivanovic2017-05-021-26/+20
| | | | | | | | | | | | | Fixes PR30730. This is a re-commit of a pulled commit. The commit was pulled because some software projects contained uses of Altivec vectors that violated alignment requirements. Known issues have now been fixed. Committing on behalf of Lei Huang. Differential Revision: https://reviews.llvm.org/D26861 llvm-svn: 301892
* [PPC] corrections in two testcasesEhsan Amiri2016-12-161-14/+14
| | | | | | | | | Removing sensitivity to scheduling (by using CHECK-DAG instead of CHECK) and some other minor corrections. In preparation to commit Power9 processor model. llvm-svn: 289900
* [PowerPC] Improvements for BUILD_VECTOR Vol. 4Nemanja Ivanovic2016-12-061-0/+4858
This is the final patch in the series of patches that improves BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations to remove redundant instructions. It also adds a large test case which encompasses a large set of code patterns that build vectors - this test case was the motivator for this series of patches. Differential Revision: https://reviews.llvm.org/D26066 llvm-svn: 288800
OpenPOWER on IntegriCloud