path: root/llvm/test/CodeGen/PowerPC
Commit message | Author | Age | Files | Lines
...
* [DAGCombiner] fold shift-trunc-shift to shift-mask-trunc | Sanjay Patel | 2019-12-12 | 1 | -2/+1
  This fold is done in IR by instcombine, and we have a special form of it already here in DAGCombiner, but we want the more general transform too:
  https://rise4fun.com/Alive/3jZm

    Name: general
    Pre: (C1 + zext(C2) < 64)
    %s = lshr i64 %x, C1
    %t = trunc i64 %s to i16
    %r = lshr i16 %t, C2
    =>
    %s2 = lshr i64 %x, C1 + zext(C2)
    %a = and i64 %s2, zext((1 << (16 - C2)) - 1)
    %r = trunc %a to i16

    Name: special
    Pre: C1 == 48
    %s = lshr i64 %x, C1
    %t = trunc i64 %s to i16
    %r = lshr i16 %t, C2
    =>
    %s2 = lshr i64 %x, C1 + zext(C2)
    %r = trunc %s2 to i16

  ...because D58017 exposes a regression without this fold.
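  The two rewrites above can be spot-checked outside of Alive. The following is a minimal sketch, not part of the commit: it models `lshr`, `trunc` and `and` on plain Python integers, and the function names (`original`, `folded_general`, `folded_special`, `check`) are hypothetical helpers introduced only for illustration.

```python
import random

MASK64 = (1 << 64) - 1
MASK16 = (1 << 16) - 1


def original(x, c1, c2):
    """lshr i64 by c1, trunc to i16, then lshr i16 by c2."""
    s = (x & MASK64) >> c1
    t = s & MASK16
    return t >> c2


def folded_general(x, c1, c2):
    """lshr i64 by c1 + c2, mask off the bits the i16 shift would drop, trunc."""
    s2 = (x & MASK64) >> (c1 + c2)
    a = s2 & ((1 << (16 - c2)) - 1)
    return a & MASK16


def folded_special(x, c2):
    """Special case C1 == 48: the mask is redundant, so only shift and trunc."""
    s2 = (x & MASK64) >> (48 + c2)
    return s2 & MASK16


def check(trials=2000, seed=0):
    """Compare the original and folded forms on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(64)
        c2 = rng.randrange(16)
        c1 = rng.randrange(64 - c2)  # keeps the precondition c1 + c2 < 64
        if original(x, c1, c2) != folded_general(x, c1, c2):
            return False
        if original(x, 48, c2) != folded_special(x, c2):
            return False
    return True
```

  Random testing of this kind only samples the input space; the Alive link in the commit message is what actually proves the transform.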
* [AArch64][PowerPC] add tests for shift sandwich; NFC | Sanjay Patel | 2019-12-12 | 1 | -3/+20
* [ValueTracking] Pointer is known nonnull after load/store | Danila Kutenin | 2019-12-11 | 1 | -4/+3
  If the pointer was loaded/stored before the null check, the check is redundant and can be removed. For now the optimizers do not remove the nullptr check, see https://gcc.godbolt.org/z/H2r5GG.
  The patch allows more nonnull constraints to be used. It also exposed one more optimization in a PowerPC test.
  This is my first LLVM review, and I am open to any comments.
  Differential Revision: https://reviews.llvm.org/D71177
* [PowerPC][NFC] add test case for lwa - loop ds form prep | czhengsz | 2019-12-11 | 1 | -0/+73
* [PowerPC] Exploitate the Vector Integer Average Instructions | QingShan Zhang | 2019-12-11 | 1 | -72/+123
  PowerPC has instructions that implement the semantics of this piece of code:

    vector int foo(vector int m, vector int n) {
      return (m + n + 1) >> 1;
    }

  This patch adds the match rule to select them.
  Differential Revision: https://reviews.llvm.org/D71002
* [BUG-FIX][XCOFF] fixed a bug of XCOFFObjectFile.cpp when there is padding at the last csect of a sections | diggerlin | 2019-12-10 | 1 | -0/+34
  SUMMARY: Fixed a bug in XCOFFObjectFile.cpp that occurs when there is padding after the last csect of a section. When a section has tail padding but the value of CurrentAddressLocation is not increased by the padding size, the following assert is hit:

    assert(CurrentAddressLocation == Section->Address && "We should have no padding between sections.");

  Reviewers: daltenty, hubert.reinterpretcast
  Differential Revision: https://reviews.llvm.org/D70859
* [PowerPC][NFC] Rename ANDI(S)o8 to ANDI(S)8o | Jinsong Ji | 2019-12-09 | 6 | -18/+18
  Summary: This was found during https://reviews.llvm.org/D70758. All the other record forms have the suffix o at the end; ANDIo8 and ANDISo8 are the only two that put the o before the 8. This patch renames them to be consistent with the others.
  Reviewers: #powerpc, hfinkel, nemanjai, lei, steven.zhang, echristo, jhibbits, joerg
  Reviewed By: jhibbits
  Subscribers: wuzish, hiraditya, kbarton, shchenz, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D70928
* [PowerPC] Automatically generate store-constant.ll . NFC | Amaury Séchet | 2019-12-09 | 1 | -58/+145
* [PowerPC] Fix MI peephole optimization for splats | Kai Luo | 2019-12-07 | 2 | -44/+116
  Summary: This patch fixes an issue where the PPC MI peephole optimization pass incorrectly removes a vector swap.
  Specifically, the pass can combine a splat/swap to a splat/copy. It uses `TargetRegisterInfo::lookThruCopyLike` to determine that the operands to the splat are the same. However, the current logic only compares the operands based on register numbers. In the case where the splat operands are ultimately fed from the same physical register, the pass can incorrectly remove a swap if the feed register for one of the operands has been clobbered.
  This patch adds a check to ensure that the feeding registers are both virtual registers, or that the operands to the splat or swap are both the same register.
  Here is an example in pseudo-MIR of what happens in the test case added in this patch.
  Before PPC MI peephole optimization:
  ```
  %arg = XVADDDP %0, %1
  $f1 = COPY %arg.sub_64
  call double rint(double)
  %res.first = COPY $f1
  %vec.res.first = SUBREG_TO_REG 1, %res.first, %subreg.sub_64
  %arg.swapped = XXPERMDI %arg, %arg, 2
  $f1 = COPY %arg.swapped.sub_64
  call double rint(double)
  %res.second = COPY $f1
  %vec.res.second = SUBREG_TO_REG 1, %res.second, %subreg.sub_64
  %vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
  %vec.res = XXPERMDI %vec.res.splat, %vec.res.splat, 2
  ; %vec.res == [ %vec.res.second[0], %vec.res.first[0] ]
  ```
  After optimization:
  ```
  ; ...
  %vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
  ; lookThruCopyLike(%vec.res.first) == lookThruCopyLike(%vec.res.second) == $f1
  ; so the pass replaces the swap with a copy:
  %vec.res = COPY %vec.res.splat
  ; %vec.res == [ %vec.res.first[0], %vec.res.second[0] ]
  ```
  As best as I can tell, this has occurred since r288152, which added support for lowering certain vector operations to direct moves in the form of a splat.
  Committed for vddvss (Colin Samples). Thanks Colin for the patch!
  Differential Revision: https://reviews.llvm.org/D69497
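  To make the lane ordering in the pseudo-MIR above concrete, here is a small illustrative model, not taken from the patch. It assumes two-doubleword vectors represented as Python lists, and `xxpermdi` is a hypothetical helper that mirrors the instruction's selection rule: the high bit of the immediate picks the doubleword taken from the first operand, the low bit picks the one taken from the second.

```python
def xxpermdi(xa, xb, dm):
    """Toy model of XXPERMDI on [dw0, dw1] lists.

    Result dw0 comes from xa (chosen by the high bit of dm),
    result dw1 comes from xb (chosen by the low bit of dm).
    """
    return [xa[(dm >> 1) & 1], xb[dm & 1]]


# Symbolic lane contents standing in for the two rint() results.
vec_res_first = ["first[0]", "first[1]"]
vec_res_second = ["second[0]", "second[1]"]

# %vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
splat = xxpermdi(vec_res_first, vec_res_second, 0)

# Correct code: %vec.res = XXPERMDI %vec.res.splat, %vec.res.splat, 2 (a swap)
correct = xxpermdi(splat, splat, 2)

# Miscompiled code: the swap was replaced by a plain COPY of the splat.
miscompiled = list(splat)
```

  In this model `correct` is `[second[0], first[0]]` while `miscompiled` keeps the unswapped order, which matches the two trailing comments in the commit message.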
* [MBP] Avoid tail duplication if it can't bring benefit | Guozhi Wei | 2019-12-06 | 4 | -11/+100
  Current tail duplication integrated in bb layout is designed to increase the fallthrough from a BB's predecessor to its successor, but we have observed cases where duplication doesn't increase fallthrough, or where it brings too much size overhead.
  To overcome these two issues, I add two checks in function canTailDuplicateUnplacedPreds:
    - make sure there is at least one duplication in the current work set.
    - the number of duplications should not exceed the number of successors.
  The modification in hasBetterLayoutPredecessor fixes a bug that potential predecessor must be at the bottom of a chain.
  Differential Revision: https://reviews.llvm.org/D64376
* [AIX][XCOFF] created a test case to verify the raw text section of xcoffobject file | diggerlin | 2019-12-06 | 1 | -0/+22
  SUMMARY: In the patch https://reviews.llvm.org/D66969 we needed a test case to verify that the text section output of the xcoffobject file is correct, but we had no LLVM disassembly tool to dump the xcoffobject file. Since the patch https://reviews.llvm.org/D70255 was committed, we have tools for it, so this patch creates the test case.
  Reviewers: daltenty, hubert.reinterpretcast
  Differential Revision: https://reviews.llvm.org/D70719
* [AIX] Make sure to use QualNames for external global objects | David Tenty | 2019-12-05 | 1 | -8/+16
  Summary: Previously we only handled the case where the csect hadn't been set up yet, so we'd hit an assert later on.
  Reviewers: jasonliu, DiggerLin, stevewan
  Reviewed By: jasonliu
  Subscribers: hubert.reinterpretcast, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D71032
* Reland [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation. | Kai Luo | 2019-12-05 | 2 | -25/+39
  Fix assertion error
  ```
  bool llvm::MachineOperand::isRenamable() const: Assertion `Register::isPhysicalRegister(getReg()) && "isRenamable should only be checked on physical registers"' failed.
  ```
  by checking if the register is 0 before invoking `isRenamable`.
* Revert "[MachineCopyPropagation] Extend MCP to do trivial copy backward propagation" | Kai Luo | 2019-12-05 | 2 | -39/+25
  This reverts commit 75b3a1c318ccad0f96c38689279bc5db63e2ad05, since it breaks bootstrap build.
* [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation | Kai Luo | 2019-12-05 | 2 | -25/+39
  Summary: This patch mainly does the following transformation. Given
  ```
  $R0 = OP ...
  ... // No read/clobber of $R0 and $R1
  $R1 = COPY $R0 // $R0 is killed
  ```
  replace $R0 with $R1 and remove the COPY, so that we have
  ```
  $R1 = OP ...
  ```
  This transformation can also expose more opportunities for existing copy elimination in MCP.
  Differential Revision: https://reviews.llvm.org/D67794
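  The transformation described above can be illustrated on a toy instruction list. This is a simplified sketch under stated assumptions, not the LLVM implementation: `Inst`, `find_producer` and `backward_propagate` are hypothetical names, registers are plain strings, and the model ignores everything the real pass must handle (sub-registers, renamable flags, register classes, implicit operands).

```python
from dataclasses import dataclass, field


@dataclass
class Inst:
    opcode: str
    defs: list
    uses: list = field(default_factory=list)
    kills: set = field(default_factory=set)  # registers whose last use is here


def find_producer(insts, copy_idx, src, dst):
    """Walk backward from the COPY to the instruction defining src.

    Returns its index, or None if src or dst is read or clobbered in between.
    """
    for j in range(copy_idx - 1, -1, -1):
        inst = insts[j]
        if src in inst.defs:
            return None if dst in inst.defs else j
        if src in inst.uses or dst in inst.uses or dst in inst.defs:
            return None
    return None


def backward_propagate(block):
    """Fold `dst = COPY src` (src killed) into the producer of src."""
    out = [Inst(i.opcode, list(i.defs), list(i.uses), set(i.kills)) for i in block]
    idx = 0
    while idx < len(out):
        inst = out[idx]
        if inst.opcode == "COPY" and inst.uses[0] in inst.kills:
            src, dst = inst.uses[0], inst.defs[0]
            producer_idx = find_producer(out, idx, src, dst)
            if producer_idx is not None:
                producer = out[producer_idx]
                producer.defs = [dst if r == src else r for r in producer.defs]
                del out[idx]  # the COPY is now redundant
                continue      # re-examine whatever slid into position idx
        idx += 1
    return out


example = [
    Inst("OP", ["R0"], ["R5"]),
    Inst("OTHER", ["R7"], ["R8"]),          # touches neither R0 nor R1
    Inst("COPY", ["R1"], ["R0"], {"R0"}),   # R0 is killed here
]
blocked = [
    Inst("OP", ["R0"], ["R5"]),
    Inst("OTHER", ["R7"], ["R0"]),          # reads R0, so the fold is illegal
    Inst("COPY", ["R1"], ["R0"], {"R0"}),
]
```

  Running `backward_propagate(example)` leaves `R1 = OP ...` followed by the unrelated instruction, while `blocked` is returned unchanged because `R0` is read between its definition and the COPY.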
* [XCOFF][AIX] Emit TOC entries for object file generation | jasonliu | 2019-12-04 | 3 | -2/+175
  Summary: Implement emitTCEntry for PPCTargetXCOFFStreamer. Add TC csects to TOCCsects for object file writing.
  Note:
    1. I did not include any raw data testing for this object file generation because TC entries raw data will all be 0 without relocation implemented. I will add raw data testing as part of relocation testing later.
    2. I removed "Symbol->setFragment(F);" for common symbols because we don't need it, and if we have it then we would hit the assertion below:
       Assertion `(SymbolContents == SymContentsUnset || SymbolContents == SymContentsOffset) && "Cannot get offset for a common/variable symbol"' failed.
    3. Fixed incorrect TOC-base alignment.
  Differential Revision: https://reviews.llvm.org/D70798
* [PowerPC] folding rlwinm + rlwinm to rlwinm | czhengsz | 2019-12-03 | 2 | -6/+145
  For example:

    x3 = rlwinm x3, 27, 5, 31
    x3 = rlwinm x3, 19, 0, 12

  can be combined to

    x3 = rlwinm x3, 14, 0, 12

  Reviewed by: steven.zhang, lkail
  Differential Revision: https://reviews.llvm.org/D70374
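  The example above can be checked with a small emulation of `rlwinm` (rotate left word immediate, then AND with mask). This is an illustrative sketch rather than code from the patch; `rotl32`, `mask32` and `rlwinm` are hypothetical helper names, and the mask uses PowerPC bit numbering, where bit 0 is the most significant bit of the 32-bit word.

```python
import random

MASK32 = 0xFFFFFFFF


def rotl32(x, sh):
    """Rotate a 32-bit value left by sh bits."""
    sh &= 31
    return ((x << sh) | (x >> (32 - sh))) & MASK32


def mask32(mb, me):
    """Ones from bit mb through bit me (bit 0 is the MSB); wraps if mb > me."""
    head = MASK32 >> mb                    # bits mb..31
    tail = (MASK32 << (31 - me)) & MASK32  # bits 0..me
    return head & tail if mb <= me else head | tail


def rlwinm(rs, sh, mb, me):
    """Rotate left by sh, then keep only the bits selected by the mask."""
    return rotl32(rs & MASK32, sh) & mask32(mb, me)


def two_step(x):
    return rlwinm(rlwinm(x, 27, 5, 31), 19, 0, 12)


def combined(x):
    return rlwinm(x, 14, 0, 12)


def check(trials=2000, seed=0):
    """Compare the two-instruction and folded forms on random inputs."""
    rng = random.Random(seed)
    return all(two_step(v) == combined(v)
               for v in (rng.getrandbits(32) for _ in range(trials)))
```

  The rotate amounts add up modulo 32 (27 + 19 = 46, which is 14), and rotating the first mask by 19 leaves every bit kept by the second mask set, so the second mask alone describes the result.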
* [FPEnv] [PowerPC] Lowering ppc_fp128 StrictFP Nodes to libcalls | Craig Topper | 2019-12-03 | 1 | -0/+1569
  This is an alternative to D64662 that shares more code between strict and non-strict nodes. It's modeled after the implementation that I did for softening.
  Differential Revision: https://reviews.llvm.org/D70867
* Reland "b19ec1eb3d0c [BPI] Improve unreachable/ColdCall heurstics to handle loops." | Taewook Oh | 2019-12-02 | 2 | -4/+2
  Summary: b19ec1eb3d0c has been reverted because of the test failures with PowerPC targets. This patch addresses the issues from the previous commit.
  Test Plan: ninja check-all. Confirmed that CodeGen/PowerPC/pr36292.ll and CodeGen/PowerPC/sms-cpy-1.ll pass
  Subscribers: llvm-commits
* [PowerPC] Fix crash in peephole optimization | Nemanja Ivanovic | 2019-12-02 | 1 | -0/+56
  When converting reg+reg shifts to reg+imm rotates, we neglect to consider the CodeGenOnly versions of the 32-bit shift mnemonics. This means we produce a rotate with missing operands which causes a crash.
  Committing this fix without review since it is non-controversial that the list of mnemonics to consider should include the 64-bit aliases for the exact mnemonics.
  Fixes PR44183.
* [PowerPC][AIX] Add support for lowering int/float/double formal arguments. | Sean Fertile | 2019-11-29 | 3 | -349/+614
  This patch adds LowerFormalArguments_AIX. Support is added for lowering int, float, and double formal arguments into general purpose and floating point registers only.
  The AIX calling convention test cases have been redone to test caller and callee functionality in the same lit test.
  Patch by Zarko Todorovski!
  Differential Revision: https://reviews.llvm.org/D69578
* [AIX] Emit TOC entries for ASM printing | David Tenty | 2019-11-27 | 7 | -7/+67
  Summary: Emit the correct .toc pseudo op when we change to the TOC and emit TC entries. Make sure TOC pseudos get the right symbols by overriding getMCSymbolForTOCPseudoMO on AIX. Add a test for TOC assembly writing and update tests to include TOC entries.
  Also make sure external globals have a csect set and handle external function descriptors (originally authored by Jason Liu) so we can emit TOC entries for them.
  Reviewers: DiggerLin, sfertile, Xiangling_L, jasonliu, hubert.reinterpretcast
  Reviewed By: jasonliu
  Subscribers: arphaman, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D70461
* [PowerPC] Add new Future CPU for PowerPC in LLVM | Stefan Pintilie | 2019-11-27 | 1 | -0/+11
  This is a continuation of D70262. The previous patch as listed above added the future CPU in clang. This patch adds the future CPU in the PowerPC backend. At this point the patch simply assumes that a future CPU will have the same characteristics as pwr9. Those characteristics may change with later patches.
  Differential Revision: https://reviews.llvm.org/D70333
* [PowerPC] [NFC] change PPCLoopPreIncPrep class name after D67088. | czhengsz | 2019-11-26 | 2 | -11/+11
  After https://reviews.llvm.org/D67088, the PPCLoopPreIncPrep pass can prepare more instruction forms than just the pre-inc form, such as the DS/DQ forms. This patch is a follow-up to https://reviews.llvm.org/D67088 that renames the pass.
  Reviewed by: jsji
  Differential Revision: https://reviews.llvm.org/D70371
* [XCOFF][AIX] Check linkage on the function, and two fixes for comments | jasonliu | 2019-11-26 | 1 | -0/+30
  This is a follow-up commit to address post-commit comments in D70443.
  Differential revision: https://reviews.llvm.org/D70443
* [PowerPC] Fix VSX clobbers of CSR registers | Nemanja Ivanovic | 2019-11-25 | 2 | -7/+63
  If an inline asm statement clobbers a VSX register that overlaps with a callee-saved Altivec register or FPR, we will not record the clobber and will therefore violate the ABI. This is clearly a bug so this patch fixes it.
  Differential revision: https://reviews.llvm.org/D68576
* [AIX][XCOFF] Generate undefined symbol in symbol table for external function call | jasonliu | 2019-11-25 | 1 | -0/+29
  Summary: This patch sets up the infrastructure to
    1. Associate MCSymbolXCOFF with an MCSectionXCOFF when it could not get implicitly associated.
    2. Generate undefined symbols.
  The patch itself generates undefined symbols for external function calls only. Generating undefined symbols for external global variables and external function descriptors will be handled in separate patch(es) after this has landed.
  Differential Revision: https://reviews.llvm.org/D70443
* [NFC][Test] Adding the test for bswap + logic op for PowerPC | QingShan Zhang | 2019-11-25 | 1 | -0/+18
* Revert "[PowerPC] combine rlwinm+rlwinm to rlwinm" | czhengsz | 2019-11-24 | 2 | -115/+6
  This reverts commit 29f6f9b2b2bfecccf903738e2f5a0cd0a70fce31.
* [PowerPC] Spill CR LT bits on P9 using setb | Amy Kwan | 2019-11-24 | 1 | -0/+56
  This patch aims to spill CR[0-7]LT bits on POWER9 using the setb instruction. The sequence on P9 to spill these bits will be:

    setb %reg, %CRREG
    stw %reg, $FI

  Instead of the typical sequence:

    mfocrf %reg, %CRREG
    rlwinm %reg1, %reg, $SH, 0, 0
    stw %reg1, $FI

  Differential Revision: https://reviews.llvm.org/D68443
* [XCOFF][AIX] Read-only data section object file generation | jasonliu | 2019-11-22 | 1 | -0/+272
  Summary: This patch is a follow up on read-only assembly patch D70182. It intends to enable object file generation for the read-only data section on AIX.
  Reviewers: DiggerLin, daltenty
  Differential Revision: https://reviews.llvm.org/D70455
* [PowerPC] Implement the vector extend sign instruction pattern match | QingShan Zhang | 2019-11-22 | 1 | -0/+178
  Power9 has instructions that implement the semantics of SIGN_EXTEND_INREG for vector types. Mark it as legal and add the match pattern.
  Differential Revision: https://reviews.llvm.org/D69601
* [PowerPC] combine rlwinm+rlwinm to rlwinm | czhengsz | 2019-11-22 | 2 | -6/+115
  combine

    x3 = rlwinm x3, 27, 5, 31
    x3 = rlwinm x3, 19, 0, 12

  to

    x3 = rlwinm x3, 14, 0, 12

  Reviewed by: steven.zhang
  Differential Revision: https://reviews.llvm.org/D70374
* [AIX][XCOFF] Add support for generating assembly code for one-byte mergable strings | Xing Xue | 2019-11-20 | 1 | -0/+28
  This patch adds support for generating assembly code for one-byte mergeable strings.
  Generating assembly code for multi-byte mergeable strings and the `XCOFF` object code for mergeable strings will be supported later.
  Reviewers: hubert.reinterpretcast, jasonliu, daltenty, sfertile, DiggerLin, Xiangling_L
  Reviewed by: daltenty
  Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D70310
* [AIX] Lowering jump table, constant pool and block address in asm | Xiangling Liao | 2019-11-20 | 3 | -3/+218
  This patch lowers jump table, constant pool and block address in assembly.
    1. On AIX, the jump table index is always relative;
    2. Put CPI and JTI into ReadOnlySection until we support unique data sections;
    3. Create the temp symbol for the block address symbol;
    4. Update MIR test cases and add the related assembly part.
  Differential Revision: https://reviews.llvm.org/D70243
* [AIX][XCOFF] Write Function descriptors and TOC base to data section | jasonliu | 2019-11-19 | 1 | -0/+112
  This patch implements writing function descriptors and the TOC base into the data section, and also adds function descriptor (both csect and label) and TOC base symbols to the symbol table.
* [PowerPC] Regenerate vsx_insert_extract_le.ll tests | Simon Pilgrim | 2019-11-19 | 1 | -12/+17
* Adding a test case for read-only data assembly writing for aix | diggerlin | 2019-11-18 | 1 | -0/+50
  SUMMARY: Adding a test case for read-only data assembly writing for aix
  Reviewers: daltenty, Xiangling_Liao
  Subscribers: rupprecht, seiyai, hiraditya
  Differential Revision: https://reviews.llvm.org/D70182
* [PowerPC] Improve float vector gather codegen | Stefan Pintilie | 2019-11-18 | 1 | -20/+15
  This patch aims to improve the code generation for float vector gather on POWER9. Patterns have been implemented to utilize instructions that deliver improved performance.
  Patch by: Kamau Bridgeman
  Differential Revision: https://reviews.llvm.org/D62908
* [PowerPC] Test case for vector float gather on ppc64le and ppc64 | Stefan Pintilie | 2019-11-18 | 1 | -0/+53
  Test case to verify that the expected code is generated for a vector float gather based on the patterns in tablegen for big and little endian cases.
  Patch by: Kamau Bridgeman
  Differential Revision: https://reviews.llvm.org/D69443
* [PowerPC] [NFC] add IR testcases for folding rlwinma. | czhengsz | 2019-11-18 | 1 | -0/+45
* [NFC][Test] Add the vavg test for PowerPC | QingShan Zhang | 2019-11-18 | 1 | -0/+189
* [PowerPC] extend PPCPreIncPrep Pass for ds/dq form | czhengsz | 2019-11-17 | 2 | -100/+103
  Currently, the PPCPreIncPrep pass changes a loop to update form and updates all loads/stores with the same base accordingly. We can do more for loads/stores with the same base, for example, convert them to ds/dq form.
  Reviewed by: jsji
  Differential Revision: https://reviews.llvm.org/D67088
* [NFC] Add one test for PowerPC to verify the sext_inreg for vector type. | QingShan Zhang | 2019-11-14 | 1 | -0/+25
* [PowerPC] Remove allow-deprecated-dag-overlap and fix broken tests | Jinsong Ji | 2019-11-12 | 5 | -31/+31
  Summary: This was found during review of https://reviews.llvm.org/D67088.
  CHECK-DAG is non-overlapping after https://reviews.llvm.org/D47106. -allow-deprecated-dag-overlap was introduced to temporarily accept the old behavior, but it actually hides some broken tests, e.g. `test/CodeGen/PowerPC/swaps-le-1.ll`. The codegen has changed, but the CHECK-DAG lines still pass because `overlap` is allowed.
  This patch removes the deprecated option and fixes the broken tests.
  Reviewers: #powerpc, hfinkel, nemanjai, steven.zhang, shchenz
  Reviewed By: shchenz
  Subscribers: shchenz, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D69733
* [NFC] Fix test case after edab7dd426249bd40059b49b255ba9cc5b784753 | Nemanja Ivanovic | 2019-11-11 | 2 | -9/+37
  The author of the patch forgot to add -verify-machineinstrs to the RUN lines which would have made the issue appear on all bots. Added that as well as a fix for the undefined register issue (after the hoisting).
* [PowerPC][XCOFF] Add support for zero initialized global values. | Sean Fertile | 2019-11-11 | 1 | -27/+108
  For XCOFF, globals mapped into the .bss section are linked as COMMON definitions. This behaviour is incorrect for zero initialized data, so emit those to the .data section instead.
  Differential Revision: https://reviews.llvm.org/D69528
* Fixing PowerPC llc test cases for Disable hoisting MI to hotter basic blocks by adding powerpc triple | Victor Huang | 2019-11-11 | 2 | -7/+7
* Disable hoisting MI to hotter basic blocks | Victor Huang | 2019-11-11 | 2 | -0/+427
  The current Hoist() function of the machine LICM pass does not check the frequencies of the source and destination basic blocks that an instruction is hoisted from/to, so an instruction may be hoisted from a cold basic block to a hot one. This patch adds options to disable machine instruction hoisting if the destination block is hotter.
  Differential Revision: https://reviews.llvm.org/D63676
* [CGP] Make ICMP_EQ use CR result of ICMP_S(L|G)T dominators | Yi-Hong Lyu | 2019-11-11 | 1 | -32/+20
  For example:

    long long test(long long a, long long b) {
      if (a << b > 0)
        return b;
      if (a << b < 0)
        return a;
      return a*b;
    }

  Produces:

            sld. 5, 3, 4
            ble 0, .LBB0_2
            mr 3, 4
            blr
    .LBB0_2:                # %if.end
            cmpldi 5, 0
            li 5, 1
            isel 4, 4, 5, 2
            mulld 3, 4, 3
            blr

  But the compare (cmpldi 5, 0) is redundant and can be removed (CR0 already contains the result of that comparison).
  The root cause of this is that LLVM converts signed comparisons into equality comparison based on dominance. Equality comparisons are unsigned by default, so we get either a record-form or cmp (without the l for logical) feeding a cmpl. That is the situation we want to avoid here.
  Differential Revision: https://reviews.llvm.org/D60506