path: root/llvm/test/CodeGen/PowerPC
Commit message | Author | Age | Files | Lines
...
* RegAllocFast: Add heuristic to detect values not live-out of a block | Matt Arsenault | 2019-05-03 | 2 | -13/+0
  Add an improved/new heuristic to catch more cases when values are not live out of a basic block.
  Patch by Matthias Braun
  llvm-svn: 359906
* [PowerPC] add test that could infinite loop with reordered transforms; NFC | Sanjay Patel | 2019-05-01 | 1 | -0/+27
  This is a slightly reduced version of the test from D61384. Adding this as a preliminary step, so I can update D61149 with the proposed fix.
  llvm-svn: 359709
* [llvm-readobj] Change -t to --symbols in tests. NFC | Fangrui Song | 2019-05-01 | 1 | -1/+1
  -t is --symbols in llvm-readobj but --section-details (unimplemented) in readelf. The confusing option should not be used, since we aim to improve compatibility.
  Keep just one llvm-readobj -t use case in test/tools/llvm-readobj/symbols.test
  llvm-svn: 359661
* [NFC][PowerPC] Use -check-prefixes to simplify the check in code-align.ll | Kang Zhang | 2019-04-30 | 1 | -51/+25
  Summary:
  When several prefixes check the same output, we can use `-check-prefixes` to simplify the checks.
  For example, suppose we want to check the output below.
  ```
  ; GENERIC-LABEL: .globl foo
  ; BASIC-LABEL: .globl foo
  ; PWR-LABEL: .globl foo
  ; GENERIC: .p2align 2
  ; BASIC: .p2align 4
  ; PWR: .p2align 4
  ; GENERIC: @foo
  ; BASIC: @foo
  ; PWR: @foo
  ```
  If we use `-check-prefixes`
  ```
  ... -check-prefixes=CHECK,GENERIC
  ... -check-prefixes=CHECK,BASIC
  ... -check-prefixes=CHECK,PWR
  ```
  the checks above can be simplified to:
  ```
  ; CHECK-LABEL: .globl foo
  ; GENERIC: .p2align 2
  ; BASIC: .p2align 4
  ; PWR: .p2align 4
  ; CHECK: @foo
  ```
  Reviewed By: hfinkel
  Differential Revision: https://reviews.llvm.org/D61227
  llvm-svn: 359533
* [DAGCombiner] Do not generate ISD::ADDE node if adde is not legal for the target when combining ISD::TRUNC node | Zi Xuan Wu | 2019-04-30 | 2 | -3/+38
  Do not combine (trunc adde(X, Y, Carry)) into (adde trunc(X), trunc(Y), Carry) if adde is not legal for the target, even during the type-legalization phase, because adde is special and will not be legalized later in the operation-legalization phase.
  This fixes PR40922: https://bugs.llvm.org/show_bug.cgi?id=40922
  Differential Revision: https://reviews.llvm.org//D60854
  llvm-svn: 359532
* Add __builtin_dcbf support for PPC | Ahsan Saghir | 2019-04-29 | 1 | -0/+15
  Summary:
  This patch adds support for __builtin_dcbf for PPC. __builtin_dcbf copies the contents of a modified block from the data cache to main memory and flushes the copy from the data cache.
  Differential revision: https://reviews.llvm.org/D59843
  llvm-svn: 359517
* [PowerPC] Try harder to avoid load/move-to VSR for partial vector loads | Roland Froese | 2019-04-29 | 1 | -22/+136
  Change the PPCISelLowering.cpp function that decides to avoid update form in favor of partial vector loads to know about newer load types and to not be confused by the chain operand.
  Differential Revision: https://reviews.llvm.org/D60102
  llvm-svn: 359504
* [AsmPrinter] refactor to support %c w/ GlobalAddress' | Nick Desaulniers | 2019-04-26 | 1 | -2/+11
  Summary:
  Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when printing the address of a MachineOperand::MO_GlobalAddress. Move that handling into a new overridden method in each base class. A virtual method was added to the base class for handling the generic case.
  Refactors a few subclasses to support the target independent %a, %c, and %n.
  The patch also contains small cleanups for AVRAsmPrinter and SystemZAsmPrinter.
  It seems that NVPTXTargetLowering is possibly missing some logic to transform GlobalAddressSDNodes for TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended inline assembly asm constraints.
  Fixes:
  - https://bugs.llvm.org/show_bug.cgi?id=41402
  - https://github.com/ClangBuiltLinux/linux/issues/449
  Reviewers: echristo, void
  Reviewed By: void
  Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60887
  llvm-svn: 359337
* [PowerPC] Allow using initial-exec TLS with PIC | Joerg Sonnenberger | 2019-04-24 | 1 | -1/+9
  Using initial-exec TLS variables is a reasonable performance optimisation for system libraries. Use the correct PIC mechanism to get hold of the GOT to avoid text relocations.
  Differential Revision: https://reviews.llvm.org/D61026
  llvm-svn: 359146
* [PowerPC] Fix wrong ElemSize when calling isConsecutiveLS() | Kang Zhang | 2019-04-18 | 1 | -0/+12
  Summary:
  This issue comes from Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41177
  When the two operands of a BUILD_VECTOR are the same, we hit an assertion failure:
    llvm::SDValue combineBVOfConsecutiveLoads(llvm::SDNode*, llvm::SelectionDAG&): Assertion `!(InputsAreConsecutiveLoads && InputsAreReverseConsecutive) && "The loads cannot be both consecutive and reverse consecutive."' failed.
  The error is caused by passing the wrong ElemSize to isConsecutiveLS(). We should use `getScalarType().getStoreSize();` to get the ElemSize instead of `getScalarSizeInBits() / 8`.
  Reviewed By: jsji
  Differential Revision: https://reviews.llvm.org/D60811
  llvm-svn: 358644
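The two ElemSize expressions named in the message above can disagree for scalar types that are not a whole number of bytes wide. The Python sketch below models that arithmetic only. It assumes the store size is the bit width rounded up to whole bytes; the helper names are hypothetical, and the message does not say which element type triggered PR41177.

```python
def size_by_bit_division(scalar_bits: int) -> int:
    """Model of `getScalarSizeInBits() / 8` using truncating integer division."""
    return scalar_bits // 8


def store_size(scalar_bits: int) -> int:
    """Model of a store size in bytes, assumed to round the bit width up."""
    return (scalar_bits + 7) // 8


if __name__ == "__main__":
    # Illustrative widths only; 1 stands for a sub-byte type such as i1.
    for bits in (1, 8, 32, 64):
        print(bits, size_by_bit_division(bits), store_size(bits))
```

For byte-multiple widths both helpers agree; for a 1-bit scalar the division form yields 0 bytes while the rounded-up form yields 1.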
* [AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFC | Nick Desaulniers | 2019-04-17 | 1 | -0/+17
  Summary:
  None of these derived classes do anything that the base class cannot. If we remove these case statements, then the base class can handle them just fine.
  Reviewers: peter.smith, echristo
  Reviewed By: echristo
  Subscribers: nemanjai, javed.absar, eraman, kristof.beyls, hiraditya, kbarton, jsji, llvm-commits, srhines
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60803
  llvm-svn: 358603
* [PowerPC] Add initialization for some ppc passes | Kang Zhang | 2019-04-12 | 2 | -0/+142
  Summary:
  Some llc debug options take a pass name as a parameter, but if we use the pass name ppc-early-ret we get the error below:
    llc test.ll -stop-after ppc-early-ret
    LLVM ERROR: "ppc-early-ret" pass is not registered.
  The following pass names all fail with the "pass is not registered" error:
    ppc-ctr-loops
    ppc-ctr-loops-verify
    ppc-loop-preinc-prep
    ppc-toc-reg-deps
    ppc-vsx-copy
    ppc-early-ret
    ppc-vsx-fma-mutate
    ppc-vsx-swaps
    ppc-reduce-cr-ops
    ppc-qpx-load-splat
    ppc-branch-coalescing
    ppc-branch-select
  Reviewed By: jsji
  Differential Revision: https://reviews.llvm.org/D60248
  llvm-svn: 358271
* Revert "[PowerPC] Add initialization for some ppc passes" | Eric Christopher | 2019-04-12 | 2 | -142/+0
  This reverts commit 6f8f98ce8de7c0e4ebd7fa2e1fd9507fe8d1c317 as it is breaking nearly every bot.
  llvm-svn: 358260
* [PowerPC] Add initialization for some ppc passes | Kang Zhang | 2019-04-12 | 2 | -0/+142
  Summary:
  Some llc debug options take a pass name as a parameter, but if we use the pass name ppc-early-ret we get the error below:
    llc test.ll -stop-after ppc-early-ret
    LLVM ERROR: "ppc-early-ret" pass is not registered.
  The following pass names all fail with the "pass is not registered" error:
    ppc-ctr-loops
    ppc-ctr-loops-verify
    ppc-loop-preinc-prep
    ppc-toc-reg-deps
    ppc-vsx-copy
    ppc-early-ret
    ppc-vsx-fma-mutate
    ppc-vsx-swaps
    ppc-reduce-cr-ops
    ppc-qpx-load-splat
    ppc-branch-coalescing
    ppc-branch-select
  Reviewed By: jsji
  Differential Revision: https://reviews.llvm.org/D60248
  llvm-svn: 358256
* [PowerPC] More precise exploitation of P9 maddld instruction when operands are constant | Zi Xuan Wu | 2019-04-12 | 1 | -51/+175
  maddld has 3 operands, (add (mul %1, %2), %3), and sometimes they are constant. If an operand is constant, it takes an extra li to materialize the operand, and one more register too. So in that case it is not profitable to use maddld to optimize the mul-add pattern.
  Differential Revision: https://reviews.llvm.org/D60181
  llvm-svn: 358253
* Revert rL357745: [SelectionDAG] Compute known bits of CopyFromReg | David Green | 2019-04-10 | 1 | -7/+9
  Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not seeing through to the constant in other blocks. Revert this patch while we come up with a better way to handle that.
  I will try to follow this up with some better tests.
  llvm-svn: 358113
* [PowerPC] initialize SchedModel according to platform. | Chen Zheng | 2019-04-09 | 1 | -1/+1
  Differential Revision: https://reviews.llvm.org/D60177
  llvm-svn: 357962
* [SelectionDAG] Compute known bits of CopyFromReg | Piotr Sobczak | 2019-04-05 | 1 | -9/+7
  Summary:
  Teach SelectionDAG how to compute known bits of ISD::CopyFromReg if the virtual reg used has one def only.
  This can be particularly useful when calling isBaseWithConstantOffset() with the ISD::CopyFromReg argument, as more optimizations may get enabled in the result.
  Also add a missing truncation on X86, found by testing of this patch.
  Change-Id: Id1c9fceec862d118c54a5b53adf72ada5d6daefa
  Reviewers: bogner, craig.topper, RKSimon
  Reviewed By: RKSimon
  Subscribers: lebedev.ri, nemanjai, jvesely, nhaehnle, javed.absar, jsji, jdoerfert, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D59535
  llvm-svn: 357745
* [PowerPC] add testcase for ppcctrloops pass shortloop check | Chen Zheng | 2019-04-03 | 1 | -2/+51
  llvm-svn: 357560
* [DAGCombine] Prune unused nodes. | Nirav Dave | 2019-03-29 | 2 | -4/+4
  Summary:
  Nodes that have no uses are eventually pruned when they are selected from the worklist. Record nodes newly added to the worklist or DAG and perform pruning after every combine attempt.
  Reviewers: efriedma, RKSimon, craig.topper, spatel, jyknight
  Reviewed By: jyknight
  Subscribers: jdoerfert, jyknight, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, jsji, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D58070
  llvm-svn: 357283
* [PowerPC] Add the support for __builtin_setrnd() | Kang Zhang | 2019-03-29 | 1 | -0/+46
  Summary:
  PowerPC64/PowerPC64le supports the builtin function __builtin_setrnd to set the floating point rounding mode. This function uses the least significant two bits of its integer argument to set the floating point rounding mode.
    double __builtin_setrnd(int mode);
  The effective values for mode are:
    0 - round to nearest
    1 - round to zero
    2 - round to +infinity
    3 - round to -infinity
  Note that the mode argument is taken modulo 4, so if the int argument is greater than 3, only its least significant two bits are used. For example, __builtin_setrnd(102) is equivalent to __builtin_setrnd(2).
  Reviewed By: jsji
  Differential Revision: https://reviews.llvm.org/D59405
  llvm-svn: 357241
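The mode selection described above can be modeled in a few lines. This Python sketch only mirrors the "least significant two bits" rule and the mode table from the message; it does not call the builtin, and `effective_mode` and `describe` are hypothetical names used for illustration.

```python
# Mode table as listed in the commit message.
ROUNDING_MODES = {
    0: "round to nearest",
    1: "round to zero",
    2: "round to +infinity",
    3: "round to -infinity",
}


def effective_mode(mode: int) -> int:
    """Keep only the least significant two bits of the argument."""
    return mode & 0b11


def describe(mode: int) -> str:
    """Name the rounding mode that a given argument would select."""
    return ROUNDING_MODES[effective_mode(mode)]


if __name__ == "__main__":
    # 102 is 0b1100110, so its low two bits are 0b10.
    print(effective_mode(102), describe(102))
```

Under this model an argument of 102 selects the same mode as 2, matching the example in the message.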
* [PowerPC] Strength reduction of multiply by a constant by shift and add/sub in place | Zi Xuan Wu | 2019-03-29 | 5 | -24/+553
  A shift and add/sub sequence is faster than a multiply by a constant. Because the cycle count or latency of a multiply is not huge, we only consider the following worthwhile patterns.
  ```
  (mul x, 2^N + 1) => (add (shl x, N), x)
  (mul x, -(2^N + 1)) => -(add (shl x, N), x)
  (mul x, 2^N - 1) => (sub (shl x, N), x)
  (mul x, -(2^N - 1)) => (sub x, (shl x, N))
  ```
  The cycle count or latency is subtarget-dependent, so we need to consider the subtarget when deciding whether to do this transformation. The data type is also considered, since multiply cycles or latency differ by type.
  Differential Revision: https://reviews.llvm.org/D58950
  llvm-svn: 357233
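The four rewrite patterns above are plain integer identities, which the following Python sketch spells out. It uses unbounded Python integers rather than fixed-width registers, and the function names are hypothetical; it says nothing about when the PowerPC backend considers the rewrite profitable.

```python
def mul_pow2_plus_1(x: int, n: int) -> int:
    """(mul x, 2^N + 1) => (add (shl x, N), x)"""
    return (x << n) + x


def mul_neg_pow2_plus_1(x: int, n: int) -> int:
    """(mul x, -(2^N + 1)) => -(add (shl x, N), x)"""
    return -((x << n) + x)


def mul_pow2_minus_1(x: int, n: int) -> int:
    """(mul x, 2^N - 1) => (sub (shl x, N), x)"""
    return (x << n) - x


def mul_neg_pow2_minus_1(x: int, n: int) -> int:
    """(mul x, -(2^N - 1)) => (sub x, (shl x, N))"""
    return x - (x << n)


if __name__ == "__main__":
    x, n = 7, 3
    print(mul_pow2_plus_1(x, n), x * (2**n + 1))
    print(mul_pow2_minus_1(x, n), x * (2**n - 1))
```

Each helper replaces one multiply with one shift plus one add, subtract, or negate, which is the trade the message weighs against multiply latency.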
* Revert r356996 "[DAG] Avoid smart constructor-based dangling nodes." | Nirav Dave | 2019-03-27 | 3 | -66/+66
  This patch appears to trigger very large compile time increases in halide builds.
  llvm-svn: 357116
* [PowerPC] Remove UseVSXReg | Stefan Pintilie | 2019-03-26 | 1 | -16/+16
  The UseVSXReg flag can be safely removed and the code cleaned up.
  Patch By: Yi-Hong Liu
  Differential Revision: https://reviews.llvm.org/D58685
  llvm-svn: 357028
* [DAG] Avoid smart constructor-based dangling nodes. | Nirav Dave | 2019-03-26 | 3 | -66/+66
  Various SelectionDAG non-combine operations (e.g. the getNode smart constructor and legalization) may leave dangling nodes by applying optimizations or not fully pruning unused result values. This can result in nodes that are never added to the worklist and therefore can not be pruned.
  Add a node inserter as the current node deleter to make sure such nodes have the chance of being pruned.
  Many minor changes, mostly positive.
  llvm-svn: 356996
* RegAllocFast: Remove early selection loop, the spill calculation will report cost 0 anyway for free regs | Matt Arsenault | 2019-03-19 | 4 | -639/+1978
  The 2nd loop calculates spill costs but reports free registers as cost 0 anyway, so there is little benefit from having a separate early loop.
  Surprisingly this is not NFC, as many registers are marked regDisabled, so the first loop often picks up later registers unnecessarily instead of the first one available in the allocation order...
  Patch by Matthias Braun
  llvm-svn: 356499
* [DAGCombiner] If a TokenFactor would be merged into its user, consider the user later. | Nirav Dave | 2019-03-13 | 2 | -44/+44
  Summary:
  A number of optimizations are inhibited by single-use TokenFactors not being merged into the TokenFactor using it. This makes us consider whether we can do the merge immediately.
  Most test changes here are due to the change in visitation causing minor reorderings and associated reassociation of paired memory operations.
  CodeGen tests with non-reordering changes:
    X86/aligned-variadic.ll -- memory-based add folded into stored leaq value.
    X86/constant-combiners.ll -- Optimizes out overlap between stores.
    X86/pr40631_deadstore_elision -- folds constant byte store into preceding quad word constant store.
  Reviewers: RKSimon, craig.topper, spatel, efriedma, courbet
  Reviewed By: courbet
  Subscribers: dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, eraman, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D59260
  llvm-svn: 356068
* Set useful flags for vector imm setting instructions | Jinsong Ji | 2019-03-12 | 3 | -26/+17
  Vector imm setting instructions like XXLXORz/XXLXORspz/XXLXORdpz should behave like LI8. We should set the corresponding flags to allow rematerialization and other opts in LICM, RA, Scheduling etc.
  Differential Revision: https://reviews.llvm.org/D58645
  llvm-svn: 355948
* [NFC][PowerPC] Update testcases using utils/update_llc_test_checks.py | Jinsong Ji | 2019-03-12 | 1 | -5/+18
  llvm-svn: 355945
* [PowerPC] Use real pointers instead of undef | Simon Pilgrim | 2019-03-06 | 1 | -13/+17
  The reduced test removed the pointer arguments, but to better survive D58017 and D58070 we need them back.
  llvm-svn: 355532
* [PPC] Adjust the computed branch offset for the possible shorter distance | Guozhi Wei | 2019-03-06 | 1 | -0/+45
  In PPCBranchSelector.cpp we tend to overestimate code size because of large alignment and inline assembly. Usually this makes the computed branch offset larger than the actual one, which is not a big problem. But sometimes it can also make the computed branch offset smaller than the actual branch offset. If the offset is close to the encoding limit, this may cause a problem at run time. The following is a simplified example.
                    actual     estimated
                    address    address
     ...
     bne Far        100        10c
     .p2align 4
     Near:          110        110
     ...
     Far:           8108       8108
  Actual offset: 0x8108 - 0x100 = 0x8008
  Computed offset: 0x8108 - 0x10c = 0x7ffc
  The computed offset is at most ((1 << alignment) - 4) bytes smaller than the actual offset, so we add this number to the offset for safety.
  Differential Revision: https://reviews.llvm.org/D57718
  llvm-svn: 355529
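The arithmetic in the example above can be replayed directly. The Python sketch below uses the addresses from the message and assumes, as a labeled assumption, that a conditional branch reaches forward at most 0x7ffc bytes (a signed 16-bit, word-aligned displacement). The function and constant names are hypothetical and do not come from PPCBranchSelector.cpp.

```python
# Assumed reach of a PowerPC conditional branch displacement (not stated in the message).
MIN_COND_BRANCH_OFFSET = -0x8000
MAX_COND_BRANCH_OFFSET = 0x7FFC


def fits_cond_branch(offset: int) -> bool:
    """True if the offset is encodable under the assumed displacement range."""
    return MIN_COND_BRANCH_OFFSET <= offset <= MAX_COND_BRANCH_OFFSET


def adjusted_offset(computed: int, alignment_log2: int) -> int:
    """Add the worst-case underestimate, ((1 << alignment) - 4), for safety."""
    return computed + ((1 << alignment_log2) - 4)


# Addresses taken from the example in the commit message.
actual = 0x8108 - 0x100      # 0x8008
computed = 0x8108 - 0x10C    # 0x7ffc
adjusted = adjusted_offset(computed, 4)  # .p2align 4 adds at most 12 bytes

if __name__ == "__main__":
    for name, off in (("actual", actual), ("computed", computed), ("adjusted", adjusted)):
        print(name, hex(off), fits_cond_branch(off))
```

Under that assumption the unadjusted estimate looks encodable while the real distance is not; after the adjustment the estimate is no smaller than the real distance, so the out-of-range case is no longer missed.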
* [PowerPC] Add secure plt support for TLS symbols | Strahinja Petrovic | 2019-03-06 | 1 | -0/+18
  This patch supports secure plt mode for TLS symbols.
  Differential Revision: https://reviews.llvm.org/D45520
  llvm-svn: 355513
* [PowerPC] fix killed/dead flag after convert x-form to d-form transformation. | Chen Zheng | 2019-03-05 | 2 | -1/+184
  Differential Revision: https://reviews.llvm.org/D58428
  llvm-svn: 355378
* [PowerPC] Move the stack pointer update instruction later in the prologue and earlier in the epilogue. | Stefan Pintilie | 2019-02-28 | 5 | -84/+84
  Move the stdu instruction in the prologue and epilogue. This should provide a small performance boost in functions that are able to do this. I've kept this change rather conservative at the moment and functions with frame pointers or base pointers will not try to move the stack pointer update.
  Differential Revision: https://reviews.llvm.org/D42590
  llvm-svn: 355085
* Fixed a typo in the test s/CEHCK/CHECK/ | Dmitri Gribenko | 2019-02-28 | 1 | -2/+5
  Summary:
  Turns out the test was not correct, I had to adjust the test to work. I also added CHECK-LABELs for better error messages from FileCheck while I'm here.
  Reviewers: jsji
  Subscribers: nemanjai, eraman, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D58614
  llvm-svn: 355079
* Default to Secure PLT on PPC for NetBSD and OpenBSD. | Joerg Sonnenberger | 2019-02-27 | 1 | -0/+4
  This matches the default settings of clang.
  llvm-svn: 355038
* Fixed typos in tests: s/CHEKC/CHECK/ | Dmitri Gribenko | 2019-02-25 | 1 | -1/+1
  Reviewers: ilya-biryukov
  Subscribers: nemanjai, javed.absar, jsji, cfe-commits, llvm-commits
  Tags: #clang, #llvm
  Differential Revision: https://reviews.llvm.org/D58611
  llvm-svn: 354785
* [PowerPC] Enhance the fast selection of fptoi & fptrunc instruction and clean up related asserts | Kang Zhang | 2019-02-25 | 2 | -6/+32
  Summary:
  Fast selection of the LLVM fptoi & fptrunc instructions does not handle VSX instruction support well.
  We should use the VSX float-to-integer conversion instruction instead of the non-VSX one if the operand register class is VSSRC or VSFRC, because i32 and i64 are mapped to VSSRC and VSFRC respectively if the VSX feature is enabled.
  For the float truncate instruction, we do similar work to try to use the VSX instruction.
  Reviewed By: jsji
  Differential Revision: https://reviews.llvm.org/D58430
  llvm-svn: 354762
* Disable big-endian constant store merges from rL354676. | Nirav Dave | 2019-02-22 | 1 | -5/+7
  llvm-svn: 354677
* [DAGCombine] Fold overlapping constant stores | Nirav Dave | 2019-02-22 | 1 | -12/+6
  Fold a smaller constant store into larger constant stores immediately preceding it.
  Reviewers: rnk, courbet
  Subscribers: javed.absar, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D58468
  llvm-svn: 354676
* [PPC] Add store merging testcase. | Nirav Dave | 2019-02-21 | 1 | -0/+51
  llvm-svn: 354595
* [PowerPC] exploit P9 instruction maddld. | Chen Zheng | 2019-02-20 | 1 | -0/+115
  Differential Revision: https://reviews.llvm.org/D58364
  llvm-svn: 354427
* [PowerPC][NFC] Added tests for prologue and epilogue code gen. | Stefan Pintilie | 2019-02-13 | 4 | -0/+494
  Added four test files to check the existing behaviour of prologue and epilogue code generation. This patch was done as a setup for the upcoming patch listed on Phabricator that will change how the prologue and epilogue work.
  The upcoming patch is: https://reviews.llvm.org/D42590
  llvm-svn: 353994
* [DAGCombiner] convert logic-of-setcc into bit magic (PR40611) | Sanjay Patel | 2019-02-12 | 1 | -10/+10
  If we're comparing some value for equality against 2 constants and those constants have an absolute difference of just 1 bit, then we can offset and mask off that 1 bit and reduce to a single compare against zero:
    and/or (setcc X, C0, ne), (setcc X, C1, ne/eq) -->
    setcc (and (add X, -C1), ~(C0 - C1)), 0, ne/eq
  https://rise4fun.com/Alive/XslKj
  This transform is disabled by default using a TLI hook ("convertSetCCLogicToBitwiseLogic()"). That should be overridden for AArch64, MIPS, Sparc and possibly others based on the asm shown in:
  https://bugs.llvm.org/show_bug.cgi?id=40611
  llvm-svn: 353859
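The "offset and mask" rewrite above can be checked exhaustively on a small bit width. The Python sketch below models 8-bit wraparound arithmetic and reads the masking step as a bitwise AND of (X - C1) with ~(C0 - C1). It assumes C0 - C1 is a power of two, and the function names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1  # model 8-bit wraparound arithmetic


def two_compares(x: int, c0: int, c1: int) -> bool:
    """Original form: and (setcc X, C0, ne), (setcc X, C1, ne)."""
    return x != c0 and x != c1


def single_compare(x: int, c0: int, c1: int) -> bool:
    """Rewritten form: offset by C1, mask off the differing bit, compare with 0."""
    diff = (c0 - c1) & MASK        # assumed to be a power of two
    offset = (x - c1) & MASK       # add X, -C1
    return (offset & ~diff & MASK) != 0


if __name__ == "__main__":
    # C0 = 12 and C1 = 8 differ by 4, a single bit.
    for x in (7, 8, 12, 13):
        print(x, two_compares(x, 12, 8), single_compare(x, 12, 8))
```

After subtracting C1, the two rejected values map to 0 and to the single differing bit, so clearing that bit leaves zero for exactly those two inputs.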
* [PowerPC] Regenerate test | Simon Pilgrim | 2019-02-12 | 1 | -153/+170
  llvm-svn: 353851
* [PowerPC] add tests for logic of setcc (PR40611); NFC | Sanjay Patel | 2019-02-12 | 1 | -0/+30
  llvm-svn: 353788
* [PowerPC] Avoid scalarization of vector truncate | Roland Froese | 2019-02-11 | 1 | -258/+35
  The PowerPC code generator currently scalarizes vector truncates that would fit in a vector register, resulting in vector extracts, scalar operations, and vector merges. This patch custom lowers a vector truncate that would fit in a register to a vector shuffle instead.
  Differential Revision: https://reviews.llvm.org/D56507
  llvm-svn: 353724
* [DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X)) | Nemanja Ivanovic | 2019-02-08 | 1 | -0/+48
  The sqrt case is faster and we already do this for the case where the exponent is 0.25. This adds the 0.75 case which is also not sensitive to signed zeros.
  Patch by Whitney Tsang (Whitney)
  Differential revision: https://reviews.llvm.org/D57434
  llvm-svn: 353557
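The identity behind this fold is X^0.75 = X^0.5 * X^0.25. The Python sketch below compares the two forms numerically for a few non-negative inputs; it illustrates the arithmetic only, not the DAGCombiner code or the conditions under which the fold is applied, and `pow_three_quarters` is a hypothetical name.

```python
import math


def pow_three_quarters(x: float) -> float:
    """Compute x**0.75 as sqrt(x) * sqrt(sqrt(x)) for non-negative x."""
    root = math.sqrt(x)
    return root * math.sqrt(root)


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.0, 16.0, 1e10):
        print(x, pow_three_quarters(x), x ** 0.75)
```

For an input of -0.0 the product of the two negative-zero square roots is +0.0, the same sign that pow gives, which is consistent with the message's remark about signed zeros.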
* [DAGCombiner] fold add/sub with bool operand based on target's boolean contents | Sanjay Patel | 2019-02-07 | 2 | -10/+6
  I noticed that we are missing this canonicalization in IR:
  rL352515
  ...and then realized that we don't get this right in SDAG either, so this has to be fixed first regardless of what we choose to do in IR.
  The existing fold was limited to scalars and used the wrong predicate to guard the transform. We have a boolean contents TLI query that can be used to decide which direction to fold.
  This may eventually lead back to the problems/question in:
  https://bugs.llvm.org/show_bug.cgi?id=40486
  ...but it makes no difference to that yet.
  Differential Revision: https://reviews.llvm.org/D57401
  llvm-svn: 353433
* [PowerPC] Add vector truncate test to prep for D56507 NFC | Roland Froese | 2019-02-06 | 1 | -0/+420
  llvm-svn: 353344