path: root/llvm/test/CodeGen/PowerPC
Commit message | Author | Age | Files | Lines
...
* [AIX] Implement function descriptor on SDAG | Jason Liu | 2019-06-06 | 2 | -21/+21
  Summary:
  (1) Function descriptor on AIX
  On AIX, a called routine may have 2 distinct symbols associated with it:
    * A function descriptor (Name)
    * A function entry point (.Name)

  The descriptor structure on AIX is the same as that in the ELF V1 ABI:
    * The address of the entry point of the function.
    * The TOC base address for the function.
    * The environment pointer.

  The descriptor symbol uses the same name as the source-level function in C. The function entry point is analogous to the symbol we would generate for a function in a non-descriptor-based ABI, except that it is renamed by prepending a ".".

  Which symbol gets referenced depends on the context:
    * Taking the address of the function references the descriptor symbol.
    * Calling the function references the entry point symbol.

  (2) Implementation on AIX
  For a direct function call target, we create the proper MCSymbol SDNode (e.g. ".foo") while constructing the SDAG, to replace the original TargetGlobalAddress SDNode. Further down the path, we can take advantage of this MCSymbol.

  Patch by: Xiangling_L
  Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara
  Differential Revision: https://reviews.llvm.org/D62532
  llvm-svn: 362735
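The three-word descriptor layout described above can be pictured with a small C++ sketch. The names `FunctionDescriptor`, `addressOf` and `callTarget` are hypothetical and exist only for illustration; they are not part of the patch or of LLVM.

```cpp
#include <cstdint>

// Hypothetical mirror of the three-word descriptor described above.
struct FunctionDescriptor {
  std::uintptr_t entryPoint;   // address of the function's entry point (".Name")
  std::uintptr_t tocBase;      // TOC base address for the function
  std::uintptr_t environment;  // environment pointer
};

// Models "taking the address of the function": the result refers to the
// descriptor (the symbol "Name"), not to the code itself.
inline const FunctionDescriptor* addressOf(const FunctionDescriptor& d) {
  return &d;
}

// Models "calling the function": control goes to the entry point (".Name"),
// which is the first word recorded in the descriptor.
inline std::uintptr_t callTarget(const FunctionDescriptor& d) {
  return d.entryPoint;
}
```

In this picture a function pointer is the address of the descriptor, while a direct call can name the entry point symbol without going through the descriptor at all.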
* [AIX] Implement call lowering with parameters could pass onto GPRs | Jason Liu | 2019-06-06 | 2 | -4/+203
  Summary:
  This patch implements SDAG call lowering on AIX for functions which only have parameters that could fit into GPRs.

  Reviewers: hubert.reinterpretcast, syzaara
  Differential Revision: https://reviews.llvm.org/D62823
  llvm-svn: 362708
* [PowerPC] Collapse RLDICL/RLDICR into RLDIC when possible | Nemanja Ivanovic | 2019-06-05 | 2 | -0/+187
  Generally speaking, we lower to an optimal rotate sequence for nodes visible in the SDAG. However, there are instances where the two rotates are not visible at ISEL time, most notably those in a very common sequence when lowering switch statements to jump tables.

  A common situation is a switch on a 32-bit integer. This value has to have the upper 32 bits cleared and, because jump table offsets are word offsets, the value needs to be shifted left by 2 bits. We currently emit the clear and the left shift as two separate instructions, but this is not needed as we can lower it to a single RLDIC. This patch just cleans that up.

  Differential revision: https://reviews.llvm.org/D60402
  llvm-svn: 362576
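The clear-then-shift sequence and its single-instruction replacement can be checked with a small C++ model of the rotate-and-mask instructions. This is a sketch for illustration only: the helper names are hypothetical, and the operand values (clear with MB=32, shift with SH=2, combined SH=2 and MB=30) are this sketch's reading of the jump-table case described above.

```cpp
#include <cstdint>

// Rotate left by sh (taken modulo 64 so that sh == 0 is well defined).
constexpr std::uint64_t rotl64(std::uint64_t x, unsigned sh) {
  return (sh & 63) == 0 ? x : (x << (sh & 63)) | (x >> (64 - (sh & 63)));
}

// Bits mb..me set, in PowerPC numbering where bit 0 is the most significant.
// Requires mb <= me <= 63.
constexpr std::uint64_t mask64(unsigned mb, unsigned me) {
  return (~0ULL >> mb) & (~0ULL << (63 - me));
}

// rldicl: rotate left, then clear the bits to the left of mb.
constexpr std::uint64_t rldicl(std::uint64_t x, unsigned sh, unsigned mb) {
  return rotl64(x, sh) & mask64(mb, 63);
}

// rldicr: rotate left, then clear the bits to the right of me.
constexpr std::uint64_t rldicr(std::uint64_t x, unsigned sh, unsigned me) {
  return rotl64(x, sh) & mask64(0, me);
}

// rldic: rotate left, then keep only bits mb..(63 - sh).
constexpr std::uint64_t rldic(std::uint64_t x, unsigned sh, unsigned mb) {
  return rotl64(x, sh) & mask64(mb, 63 - sh);
}

// Two instructions: clear the upper 32 bits, then shift left by 2.
constexpr std::uint64_t clearThenShift(std::uint64_t x) {
  return rldicr(rldicl(x, 0, 32), 2, 61);
}

// One instruction computing the same value.
constexpr std::uint64_t collapsed(std::uint64_t x) {
  return rldic(x, 2, 30);
}
```

Both forms compute `(x & 0xFFFFFFFF) << 2`, which is the word offset into the jump table.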
* [PowerPC][NFC] Add codegen test for consecutive stores of vector elements | Nemanja Ivanovic | 2019-06-05 | 1 | -0/+535
  NFC commit of a test case in order for the subsequent review to show differences in codegen.

  Differential revision: https://reviews.llvm.org/D62843
  llvm-svn: 362573
* Revert r362472 as it is breaking PPC build bots | Nemanja Ivanovic | 2019-06-04 | 1 | -49/+266
  The patch https://reviews.llvm.org/rL362472 broke PPC LNT buildbots. Reverting it to bring the bots back to green.

  llvm-svn: 362539
* [NFC][Codegen][PowerPC] Autogenerate shift-cmp.ll test | Roman Lebedev | 2019-06-04 | 1 | -23/+23
  Being affected by upcoming patch.

  llvm-svn: 362529
* [PowerPC] P9 Scheduling Model: dispatching rule fixes | Jinsong Ji | 2019-06-04 | 3 | -204/+204
  This is to address some of the problems in the existing P9 resource modeling, especially the dispatching rules.

  Instead of using a hypothetical DISPATCHER, we try to use the number of actual dispatch slots, define SchedWriteRes to model the dispatch rules, and then update the instruction classes according to those rules.

  All the dispatch rule and instruction class updates are made according to the POWER9 User Manual.

  Differential Revision: https://reviews.llvm.org/D61873
  llvm-svn: 362509
* [DAGCombine] Match a pattern where a wide type scalar value is stored by several narrow stores | QingShan Zhang | 2019-06-04 | 1 | -266/+49
  This opportunity was found in SPEC 2017 557.xz_r, and it is used by the SHA encrypt/decrypt code. See sha-2/sha512.c:

    static void store64(u64 x, unsigned char* y)
    {
        for(int i = 0; i != 8; ++i)
            y[i] = (x >> ((7-i) * 8)) & 255;
    }

    static u64 load64(const unsigned char* y)
    {
        u64 res = 0;
        for(int i = 0; i != 8; ++i)
            res |= (u64)(y[i]) << ((7-i) * 8);
        return res;
    }

  The load64 pattern has been implemented by https://reviews.llvm.org/D26149. This patch implements the store pattern.

  Match a pattern where a wide type scalar value is stored by several narrow stores. Fold it into a single store, or a BSWAP and a store, if the target supports it.

  Assuming a little-endian target:

    i8 *p = ...
    i32 val = ...
    p[0] = (val >> 0) & 0xFF;
    p[1] = (val >> 8) & 0xFF;
    p[2] = (val >> 16) & 0xFF;
    p[3] = (val >> 24) & 0xFF;
    =>
    *((i32)p) = val;

    i8 *p = ...
    i32 val = ...
    p[0] = (val >> 24) & 0xFF;
    p[1] = (val >> 16) & 0xFF;
    p[2] = (val >> 8) & 0xFF;
    p[3] = (val >> 0) & 0xFF;
    =>
    *((i32)p) = BSWAP(val);

  Differential Revision: https://reviews.llvm.org/D61843
  llvm-svn: 362472
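The equivalence behind this fold can be checked at source level. The C++ sketch below compares the byte-wise big-endian store from the message with a single wide store, preceded by a byte swap on a little-endian host. All function names are hypothetical, and the sketch illustrates only the equivalence, not the DAG combine itself.

```cpp
#include <cstdint>
#include <cstring>

// The byte-wise pattern from the commit message: most significant byte first.
void store64Bytewise(std::uint64_t x, unsigned char* y) {
  for (int i = 0; i != 8; ++i)
    y[i] = static_cast<unsigned char>((x >> ((7 - i) * 8)) & 255);
}

// Portable byte reversal, standing in for a BSWAP node.
std::uint64_t byteSwap64(std::uint64_t x) {
  std::uint64_t r = 0;
  for (int i = 0; i != 8; ++i) {
    r = (r << 8) | (x & 255);
    x >>= 8;
  }
  return r;
}

// Detects the host byte order at run time.
bool hostIsLittleEndian() {
  const std::uint16_t one = 1;
  unsigned char first;
  std::memcpy(&first, &one, 1);
  return first == 1;
}

// The folded form: one 8-byte store, with a byte swap only when the host
// byte order differs from the order the pattern writes.
void store64Folded(std::uint64_t x, unsigned char* y) {
  const std::uint64_t v = hostIsLittleEndian() ? byteSwap64(x) : x;
  std::memcpy(y, &v, sizeof v);
}
```

Both functions leave the same eight bytes in memory, which is why eight narrow stores can be replaced by one wide store plus, at most, a swap.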
* [PowerPC] add testcases for reordering LSR and PPCCTRLoops - NFC | Chen Zheng | 2019-06-04 | 1 | -0/+217
  llvm-svn: 362468
* Propagate fmf for setcc/select folds | Michael Berg | 2019-06-03 | 1 | -2/+2
  Summary:
  This change facilitates propagating fmf which was placed on setcc from fcmp through folds with selects, so that back ends can model this path for arithmetic folds on selects in SDAG.

  Reviewers: qcolombet, spatel
  Reviewed By: qcolombet
  Subscribers: nemanjai, jsji
  Differential Revision: https://reviews.llvm.org/D62552
  llvm-svn: 362439
* [PowerPC] Look through copies for compare elimination | Nemanja Ivanovic | 2019-06-03 | 1 | -0/+29
  We currently miss opportunities for optimizing comparisons in the peephole optimizer if the input is the result of a COPY, since we look for record-form versions of the producing instruction. This patch simply lets the optimization peek through copies.

  Differential revision: https://reviews.llvm.org/D59633
  llvm-svn: 362438
* [PPC] Correctly adjust branch probability in PPCReduceCRLogicals | Guozhi Wei | 2019-05-31 | 2 | -1/+89
  In PPCReduceCRLogicals, after splitting the original MBB into 2, the 2 impacted branches still use the original branch probability. This is unreasonable. Suppose we have the following code, and the probability of each successor is 50%:

    condc = conda || condb
    br condc, label %target, label %fallthrough

  It can be transformed to the following:

    br conda, label %target, label %newbb
    newbb:
    br condb, label %target, label %fallthrough

  Since each branch has a probability of 50% to each successor, the total probability to %fallthrough is now 25%, and the total probability to %target is 75%. This actually changes the original profiling data. A more reasonable probability would be 70% to the false side for each branch instruction, so that the total probability to %fallthrough is close to 50%.

  This patch assumes that the branch target with two incoming edges has the same edge frequency on both edges, computes a new probability for each target, and keeps the total probability to the original targets unchanged.

  Differential Revision: https://reviews.llvm.org/D62430
  llvm-svn: 362237
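The arithmetic in this message can be worked through in a few lines of C++. The sketch below shows one way to satisfy the equal-edge-frequency assumption for the `conda || condb` case; it is not taken from the patch, and `SplitProbabilities`, `splitOrBranch` and `totalFallthrough` are hypothetical names.

```cpp
// Probabilities of the two branches produced by splitting "conda || condb".
struct SplitProbabilities {
  double firstTaken;   // P(first branch goes to %target)
  double secondTaken;  // P(second branch goes to %target), given %newbb was reached
};

// Assumption of this sketch: the two edges into %target carry equal frequency,
// so each edge gets half of the original probability of reaching %target.
// Expects 0 <= pTarget <= 1.
inline SplitProbabilities splitOrBranch(double pTarget) {
  const double edge = pTarget / 2.0;
  // The second branch only executes when the first one was not taken, so its
  // taken-probability is conditional on reaching %newbb.
  return {edge, edge / (1.0 - edge)};
}

// Overall probability of reaching %fallthrough after the split.
inline double totalFallthrough(const SplitProbabilities& p) {
  return (1.0 - p.firstTaken) * (1.0 - p.secondTaken);
}
```

With `pTarget = 0.5` this gives 25% and about 33% for the two taken edges, and the overall fallthrough probability stays at 50%, whereas reusing 50% for both branches yields 25%. The 70% figure quoted above corresponds to giving both branches the same false-side probability, since 0.707 squared is about 0.5.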
* Partial revert of revert of r361827: Add constrained intrinsic tests for powerpc64le. | Kevin P. Neal | 2019-05-29 | 1 | -0/+7528
  The powerpc64-"nonle" tests are removed. They fail because of a bug that Drew is currently working on that affects multiple targets.

  Submitted by: Drew Wock <drew.wock@sas.com>
  Reviewed by: Hal Finkel, Kevin P. Neal
  Approved by: Hal Finkel
  Differential Revision: http://reviews.llvm.org/D62388
  llvm-svn: 361985
* [CodeGen] Add lrint/llrint builtins | Adhemerval Zanella | 2019-05-28 | 2 | -0/+112
  This patch adds ISD::LRINT and ISD::LLRINT along with new intrinsics. The changes are as straightforward as for the other floating-point rounding functions, with just some adjustments required to handle the return value being an integer.

  The idea is to optimize lrint/llrint generation for AArch64 in a subsequent patch. The current semantic is just to route it to the libm symbol.

  Reviewed By: craig.topper
  Differential Revision: https://reviews.llvm.org/D62017
  llvm-svn: 361875
* Revert 361827. It broke the bots. | Kevin P. Neal | 2019-05-28 | 1 | -10810/+0
  llvm-svn: 361831
* Add constrained intrinsic tests for powerpc64 and powerpc64le. | Kevin P. Neal | 2019-05-28 | 1 | -0/+10810
  Submitted by: Drew Wock
  Reviewed by: Hal Finkel
  Approved by: Hal Finkel
  Differential Revision: https://reviews.llvm.org/D62388
  llvm-svn: 361827
* [SelectionDAG] soften assertion when legalizing narrow vector FP ops | Sanjay Patel | 2019-05-25 | 1 | -0/+24
  The test based on PR42010:
  https://bugs.llvm.org/show_bug.cgi?id=42010
  ...may show an inaccuracy for PPC's target defs, but we should not be so aggressive with an assert here. There's no telling what out-of-tree targets look like.

  llvm-svn: 361696
* Implement call lowering without parameters on AIX | Jason Liu | 2019-05-24 | 1 | -0/+40
  Summary:
  This patch implements call lowering for calls without parameters on AIX as initial support.

  Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma
  Differential Revision: https://reviews.llvm.org/D61948
  llvm-svn: 361669
* [PowerPC] Remove CRBits Copy Of Unset/set CBit | Stefan Pintilie | 2019-05-24 | 2 | -4/+188
  For the situation where we generate the following code:

            crxor 8, 8, 8
            < Some instructions>
    .LBB0_1:
            < Some instructions>
            cror 1, 8, 8

  cror (COPY of CRbit) depends on the result of the crxor instruction. CR8 is known to be zero, as crxor is equivalent to CRUNSET. We can simply use crxor 1, 1, 1 instead to zero out CR1, which does not have any dependency on any previous instruction.

  This patch will optimize it to:

            < Some instructions>
    .LBB0_1:
            < Some instructions>
            crxor 1, 1, 1

  Patch By: Victor Huang (NeHuang)
  Differential Revision: https://reviews.llvm.org/D62044
  llvm-svn: 361632
* [Power9] Add a specific heuristic to schedule the addi before the load | QingShan Zhang | 2019-05-24 | 1 | -1/+18
  When we are scheduling the load and addi, if none of the other heuristics took effect, we will try to schedule the addi before the load, to hide the latency and avoid the true dependency added by RA. This only takes effect for Power9.

  Differential Revision: https://reviews.llvm.org/D61930
  llvm-svn: 361600
* UpdateTestChecks: ppc32 triple support | Roman Lebedev | 2019-05-23 | 1 | -51/+241
  Summary:
  Appears identical to powerpc64{,le}. Regenerate test that is being affected by upcoming patch.

  Reviewers: RKSimon
  Reviewed By: RKSimon
  Subscribers: nemanjai, jsji, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62339
  llvm-svn: 361543
* [NFC][PPC] Autogenerate vec_add_sub_quadword.ll test | Roman Lebedev | 2019-05-23 | 1 | -90/+140
  Being affected by (sub %x, C) -> add %X, (sub 0, C) 'for vectors' patch.

  llvm-svn: 361525
* [NFC][PPC] Autogenerate vec_add_sub_doubleword.ll test | Roman Lebedev | 2019-05-23 | 1 | -42/+98
  Being affected by (sub %x, C) -> add %X, (sub 0, C) 'for vectors' patch.

  llvm-svn: 361524
* [TargetLowering] Extend bool args to inline-asm according to getBooleanType | Kees Cook | 2019-05-22 | 1 | -0/+14
  Summary:
  This extends Krzysztof Parzyszek's X86-specific solution (https://reviews.llvm.org/D60208) to the generic code pointed out by James Y Knight.

  Reviewers: kparzysz, craig.topper, nickdesaulniers
  Subscribers: efriedma, sdardis, nemanjai, javed.absar, eraman, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, srhines, void, nickdesaulniers, jyknight
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60224
  llvm-svn: 361404
* [PPC64] Parse -elfv1 -elfv2 when specified on target triple | Fangrui Song | 2019-05-22 | 1 | -1/+6
  Summary:
  For big-endian powerpc64, the default ABI is ELFv1. OpenPower ABI ELFv2 is supported when -mabi=elfv2 is specified. FreeBSD support for PowerPC64 ELFv2 ABI with LLVM is in progress [1].

  This patch adds an alternative way to specify ELFv2 ABI on target triple [2].

  The following results are expected:

  ELFv1 when using:
    -target powerpc64-unknown-freebsd12.0
    -target powerpc64-unknown-freebsd12.0 -mabi=elfv1
    -target powerpc64-unknown-freebsd12.0-elfv1

  ELFv2 when using:
    -target powerpc64-unknown-freebsd12.0 -mabi=elfv2
    -target powerpc64-unknown-freebsd12.0-elfv2

  [1] https://wiki.freebsd.org/powerpc/llvm-elfv2
  [2] https://clang.llvm.org/docs/CrossCompilation.html

  Patch by Alfredo Dal'Ava Júnior!
  Differential Revision: https://reviews.llvm.org/D61950
  llvm-svn: 361355
* [PowerPC] [ISEL] select x-form instruction for unaligned offset | Chen Zheng | 2019-05-22 | 2 | -25/+26
  Differential Revision: https://reviews.llvm.org/D62173
  llvm-svn: 361346
* Move csr-save-restore-order.ll to the right place | Yi-Hong Lyu | 2019-05-21 | 1 | -0/+168
  llvm-svn: 361306
* [NFC][PowerPC] Add a test to verify if the scheduler schedules the addi before the load. | QingShan Zhang | 2019-05-21 | 1 | -0/+106
  llvm-svn: 361221
* [PowerPC] test cases for selecting x-form instruction for unaligned offset - NFC | Chen Zheng | 2019-05-21 | 1 | -0/+113
  llvm-svn: 361219
* Re-land r360859: "[MergeICmps] Simplify the code." | Clement Courbet | 2019-05-17 | 1 | -1/+1
  With a fix for PR41917: The predecessor list was changing under our feet.

    -  for (BasicBlock *Pred : predecessors(EntryBlock_)) {
    +  while (!pred_empty(EntryBlock_)) {
    +    BasicBlock* const Pred = *pred_begin(EntryBlock_);

  llvm-svn: 361009
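The shape of this fix, draining a list that shrinks as a side effect of the loop body instead of iterating over it, can be reproduced with a toy graph. The sketch below uses standard containers only; `Block`, `redirectEdge` and `redirectAllPreds` are hypothetical stand-ins, not LLVM APIs.

```cpp
#include <algorithm>
#include <vector>

// Toy control-flow node with explicit predecessor and successor lists.
struct Block {
  std::vector<Block*> preds;
  std::vector<Block*> succs;
};

// Redirects pred's edge(s) from 'from' to 'to'. As a side effect, one entry
// for pred is removed from from.preds, which is what invalidates a range-for
// loop over that list. Requires pred to be present in from.preds.
void redirectEdge(Block& pred, Block& from, Block& to) {
  std::replace(pred.succs.begin(), pred.succs.end(), &from, &to);
  from.preds.erase(std::find(from.preds.begin(), from.preds.end(), &pred));
  to.preds.push_back(&pred);
}

// Same shape as the fixed loop: re-read the first remaining predecessor on
// every iteration and stop when the list has been drained.
void redirectAllPreds(Block& from, Block& to) {
  while (!from.preds.empty()) {
    Block* const pred = from.preds.front();
    redirectEdge(*pred, from, to);
  }
}
```

A range-based `for` over `from.preds` would keep using iterators into a vector that `redirectEdge` has just modified; the `while` form never holds an iterator across the mutation.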
* Revert r360859: "Reland r360771 "[MergeICmps] Simplify the code."" | Nico Weber | 2019-05-17 | 1 | -1/+1
  It caused PR41917.

  llvm-svn: 360963
* [CodeGen] Add lround/llround builtins | Adhemerval Zanella | 2019-05-16 | 2 | -0/+112
  This patch adds ISD::LROUND and ISD::LLROUND along with new intrinsics. The changes are as straightforward as for the other floating-point rounding functions, with just some adjustments required to handle the return value being an integer.

  The idea is to optimize lround/llround generation for AArch64 in a subsequent patch. The current semantic is just to route it to the libm symbol.

  llvm-svn: 360889
* RegAllocFast: Improve hinting heuristic | Matt Arsenault | 2019-05-16 | 1 | -9/+10
  Trace through multiple COPYs when looking for a physreg source. Add hinting for vregs that will be copied into physregs (we only hinted for vregs getting copied to a physreg previously). Give a hinted register a bonus when deciding which value to spill.

  This is part of my rewrite regallocfast series. In fact this one doesn't even have an effect unless you also flip the allocation to happen from back to front of a basic block. Nonetheless it helps to split this up to ease review of D52010.

  Patch by Matthias Braun
  llvm-svn: 360887
* Reland r360771 "[MergeICmps] Simplify the code." | Clement Courbet | 2019-05-16 | 1 | -1/+1
  This revision does not seem to be the culprit.

  llvm-svn: 360859
* RegAlloc: try to fail more gracefully when out of registers | Nicolai Haehnle | 2019-05-15 | 1 | -2/+2
  Summary:
  The emitError path allows the program to continue, unlike report_fatal_error. This is friendlier to use cases where LLVM is embedded in a larger program, because the caller may be able to deal with the error somewhat gracefully.

  Change the number of requested NOP bytes in the AArch64 and PowerPC test cases to avoid triggering an unrelated assertion. The compilation still fails, as verified by the test.

  Change-Id: Iafb9ca341002a597b82e59ddc7a1f13c78758e3d
  Reviewers: arsenm, MatzeB
  Subscribers: qcolombet, nemanjai, wdng, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61489
  llvm-svn: 360786
* Revert r360771 "[MergeICmps] Simplify the code." | Clement Courbet | 2019-05-15 | 1 | -1/+1
  Breaks a bunch of buildbots.

  llvm-svn: 360776
* [MergeICmps] Simplify the code. | Clement Courbet | 2019-05-15 | 1 | -1/+1
  Instead of patching the original blocks, we now generate new blocks and delete the old blocks. This results in simpler code with a less twisted control flow (see the change in `entry-block-shuffled.ll`).

  This will make https://reviews.llvm.org/D60318 simpler by making it more obvious where control flow is created and deleted.

  Reviewers: gchatelet
  Subscribers: hiraditya, llvm-commits, spatel
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61736
  llvm-svn: 360771
* [IR] Disallow llvm.global_ctors and llvm.global_dtors of the 2-field form in textual format | Fangrui Song | 2019-05-15 | 1 | -1/+1
  The 3-field form was introduced by D3499 in 2014 and the legacy 2-field form was planned to be removed in LLVM 4.0.

  For the textual format, this patch migrates the existing 2-field form to use the 3-field form and deletes the compatibility code. test/Verifier/global-ctors-2.ll checks we have a friendly error message.

  For bitcode, lib/IR/AutoUpgrade UpgradeGlobalVariables will upgrade the 2-field form (add i8* null as the third field).

  Reviewed By: rnk, dexonsmith
  Differential Revision: https://reviews.llvm.org/D61547
  llvm-svn: 360742
* [PowerPC] Custom lower known CR bit spills | Lei Huang | 2019-05-14 | 1 | -0/+131
  For known CRBit spills, CRSET/CRUNSET, it is more efficient to load and spill the known value instead of extracting the bit.

  e.g. This sequence is currently used to spill a CRUNSET:

    crclr 4*cr5+lt
    mfocrf r3,4
    rlwinm r3,r3,20,0,0
    stw r3,132(r1)

  This patch custom lowers it to:

    li r3,0
    stw r3,132(r1)

  Differential Revision: https://reviews.llvm.org/D61754
  llvm-svn: 360677
* [PowerPC][NFC] Fix typos in triples | Jinsong Ji | 2019-05-14 | 4 | -7/+7
  Found by bzEq (Kai Luo).

  llvm-svn: 360643
* [TargetLowering] Handle multi depth GEPs w/ inline asm constraints | Nick Desaulniers | 2019-05-13 | 1 | -0/+12
  Summary:
  X86TargetLowering::LowerAsmOperandForConstraint had better support than TargetLowering::LowerAsmOperandForConstraint for arbitrary depth getelementpointers for "i", "n", and "s" extended inline assembly constraints. Hoist its support from the derived class into the base class.

  Link: https://github.com/ClangBuiltLinux/linux/issues/469

  Reviewers: echristo, t.p.northover
  Reviewed By: t.p.northover
  Subscribers: t.p.northover, E5ten, kees, jyknight, nemanjai, javed.absar, eraman, hiraditya, jsji, llvm-commits, void, craig.topper, nathanchance, srhines
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61560
  llvm-svn: 360604
* [PowerPC] custom lower `v2f64 fpext v2f32` | Lei Huang | 2019-05-10 | 1 | -0/+77
  Reduces scalarization overhead via custom lowering of v2f64 fpext v2f32.

  e.g. For the following IR:

    %0 = load <2 x float>, <2 x float>* %Ptr, align 8
    %1 = fpext <2 x float> %0 to <2 x double>
    ret <2 x double> %1

  Pre custom lowering:

    ld r3, 0(r3)
    mtvsrd f0, r3
    xxswapd vs34, vs0
    xscvspdpn f0, vs0
    xxsldwi vs1, vs34, vs34, 3
    xscvspdpn f1, vs1
    xxmrghd vs34, vs0, vs1

  After custom lowering:

    lfd f0, 0(r3)
    xxmrghw vs0, vs0, vs0
    xvcvspdp vs34, vs0

  Differential Revision: https://reviews.llvm.org/D57857
  llvm-svn: 360429
* [PowerPC][NFC] Add test for D60506 to show differences in code-gen | Nemanja Ivanovic | 2019-05-09 | 1 | -0/+576
  Differential revision: https://reviews.llvm.org/D61723
  llvm-svn: 360338
* [NFC][PowerPC] Add test for store combine optimization. | QingShan Zhang | 2019-05-08 | 1 | -0/+793
  llvm-svn: 360229
* [CodeGenPrepare] Don't split the store if it is volatile | QingShan Zhang | 2019-05-08 | 1 | -0/+17
  We shouldn't split the store when it is volatile.

  Differential Revision: https://reviews.llvm.org/D61169
  llvm-svn: 360228
* [PowerPC][NFC] Update build-vector-tests.ll using utils/update_llc_test_checks.py | Jinsong Ji | 2019-05-07 | 1 | -2460/+4053
  build-vector-tests.ll is a huge testcase and it is hard to maintain; e.g. any fundamental change might require updating hundreds of lines. We should leverage the script to maintain it.

  This patch simply runs utils/update_llc_test_checks.py on it. There should be no missing test points.

  llvm-svn: 360175
* [PowerPC] Use the two-constant NR algorithm for refining estimates | Nemanja Ivanovic | 2019-05-07 | 4 | -47/+43
  The single-constant algorithm produces infinities on a lot of denormal values. The precision of the two-constant algorithm is actually sufficient across the range of denormals. We will switch to that algorithm for now to avoid the infinities on denormals. In the future, we will re-evaluate the algorithm to find the optimal one for PowerPC.

  Differential revision: https://reviews.llvm.org/D60037
  llvm-svn: 360144
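One plausible mechanism for the infinities can be sketched in C++ for a reciprocal square root refinement step. The order of operations below is an assumption of this sketch rather than something stated in the commit message: the single-constant form squares the estimate first, while the two-constant form multiplies by the argument first. The function names are hypothetical.

```cpp
#include <cmath>

// Single-constant step: est * (1.5 - (0.5 * a) * (est * est)).
// 0.5 * a is derived from the one constant 1.5 as (1.5 * a - a).
// The estimate is squared before being scaled by the tiny argument.
inline float refineOneConst(float a, float est) {
  const float halfA = 1.5f * a - a;
  float t = est * est;  // overflows when est is very large
  t = halfA * t;
  t = 1.5f - t;
  return est * t;
}

// Two-constant step: (-0.5 * est) * ((a * est) * est + (-3.0)).
// Multiplying by a first keeps every intermediate value in range.
inline float refineTwoConst(float a, float est) {
  const float ae = a * est;
  const float aee = ae * est;
  return (-0.5f * est) * (aee + -3.0f);
}
```

For a denormal input such as `a = 1e-40f`, the reciprocal square root is about `1e20f`. Squaring that exceeds the single-precision range, so the first form returns a non-finite value, while the second form stays near `1e20f`. This assumes single-precision evaluation without flush-to-zero.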
* [PowerPC] Fix erroneous condition for converting uint-to-fp vector conversion | Nemanja Ivanovic | 2019-05-06 | 1 | -0/+134
  A condition for exiting the legalization of v4i32 conversion to v2f64 through extract/convert/build erroneously checks for the extract having type i32. This is not adequate as smaller extracts are actually legalized to i32 as well. Furthermore, an early exit is missing which means that we only check that both extracts are from the same vector if that check fails.

  As a result, both cases in the included test case fail: the first gets a select error and the second generates incorrect code.

  The culprit commit is r274535.

  llvm-svn: 360043
* Reapply r359906, "RegAllocFast: Add heuristic to detect values not live-out of a block" | Matt Arsenault | 2019-05-03 | 2 | -13/+0
  This reverts commit r359912. This should pass now, since the clang test was made less fragile in r359918.

  llvm-svn: 359919
* Revert r359906, "RegAllocFast: Add heuristic to detect values not live-out of a block" | Nico Weber | 2019-05-03 | 2 | -0/+13
  Makes clang/test/Misc/backend-stack-frame-diagnostics-fallback.cpp fail.

  llvm-svn: 359912