summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* [PowerPC][LegalizeFloatTypes] Move the PPC hacks for (i32 ↵Craig Topper2018-03-201-22/+44
| | | | | | | | | | | | | | fp_to_sint/fp_to_uint (ppcf128 X)) out of LegalizeFloatTypes and into PPC specific code I'm not entirely sure these hacks are still needed. If you remove the hacks completely, the name of the library call that gets generated doesn't match the grep the test previously had. So the test wasn't really checking anything. If the hack is still needed it belongs in PPC specific code. I believe the FP_TO_SINT code here is the only place in the tree where a FP_ROUND_INREG node is created today. And I don't think its even being used correctly because the legalization returned a BUILD_PAIR with the same value twice. That doesn't seem right to me. By moving the code entirely to PPC we can avoid creating the FP_ROUND_INREG at all. I replaced the grep in the existing test with full checks generated by hacking update_llc_test_check.py to support ppc32 just long enough to generate it. Differential Revision: https://reviews.llvm.org/D44061 llvm-svn: 328017
* [Power9]Legalize and emit code for quad-precision copySign/abs/nabs/neg/sqrtLei Huang2018-03-191-5/+11
| | | | | | | | | | | | | | Legalize and emit code for quad-precision floating point operations: * xscpsgnqp * xsabsqp * xsnabsqp * xsnegqp * xssqrtqp Differential Revision: https://reviews.llvm.org/D44530 llvm-svn: 327889
* [PowerPC][Power9]Legalize and emit code for quad-precision add/div/mul/subLei Huang2018-03-192-5/+27
| | | | | | | | | | | | | Legalize and emit code for quad-precision floating point operations: * xsaddqp * xssubqp * xsdivqp * xsmulqp Differential Revision: https://reviews.llvm.org/D44506 llvm-svn: 327878
* [PowerPC] Make AddrSpaceCast noopNemanja Ivanovic2018-03-191-0/+5
| | | | | | | | | | | PowerPC targets do not use address spaces. As a result, we can get selection failures with address space casts. This patch makes those casts noops. Patch by Valentin Churavy. Differential revision: https://reviews.llvm.org/D43781 llvm-svn: 327877
* Revert [MachineLICM] This reverts commit rL327856Zaara Syeda2018-03-194-33/+44
| | | | | | Failing build bots. Revert the commit now. llvm-svn: 327864
* [MachineLICM] Add functions to MachineLICM to hoist invariant storesZaara Syeda2018-03-194-44/+33
| | | | | | | | | | | | | | | | This patch adds functions to allow MachineLICM to hoist invariant stores. Currently, MachineLICM does not hoist any store instructions, however when storing the same value to a constant spot on the stack, the store instruction should be considered invariant and be hoisted. The function isInvariantStore iterates each operand of the store instruction and checks that each register operand satisfies isCallerPreservedPhysReg. The store may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore. This patch also adds the PowerPC changes needed to consider the stack register as caller preserved. Differential Revision: https://reviews.llvm.org/D40196 llvm-svn: 327856
* TableGen: Explicitly test some cases of self-references and !cast errorsNicolai Haehnle2018-03-191-6/+6
| | | | | | | | | | | | | | | | | | | | | | | Summary: These are cases of self-references that exist today in practice. Let's add tests for them to avoid regressions. The self-references in PPCInstrInfo.td can be expressed in a simpler way. Allowing this type of self-reference while at the same time consistently doing late-resolve even for self-references is problematic because there are references to fields that aren't in any class. Since there's no need for this type of self-reference anyway, let's just remove it. Change-Id: I914e0b3e1ae7adae33855fac409b536879bc3f62 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: nemanjai, wdng, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D44474 llvm-svn: 327848
* [PPC] Avoid non-simple MVT in STBRX optimizationGuozhi Wei2018-03-151-1/+5
| | | | | | | | | | PR35402 triggered this case. It bswap and stores a 48bit value, current STBRX optimization transforms it into STBRX. Unfortunately 48bit is not a simple MVT, there is no PPC instruction to support it, and it can't be automatically expanded by llvm, so caused a crash. This patch detects the non-simple MVT and returns early. Differential Revision: https://reviews.llvm.org/D44500 llvm-svn: 327651
* [PowerPC] Optimize TLS initial-exec sequence to use X-Form loads/storesZaara Syeda2018-03-152-2/+155
| | | | | | | | | This patch adds new load/store instructions for integer scalar types which can be used for X-Form when fed by add with an @tls relocation. Differential Revision: https://reviews.llvm.org/D43315 llvm-svn: 327635
* [PowerPC][NFC] formatting-only fixLei Huang2018-03-151-16/+15
| | | | llvm-svn: 327599
* test commit: fix formatting of a commentZaara Syeda2018-03-131-3/+3
| | | | | | This is a simple change to do the test commit. llvm-svn: 327412
* [PowerPC][NFC] Explicitly state types on FP SDAG patterns in anticipation of ↵Lei Huang2018-03-123-132/+158
| | | | | | adding the f128 type llvm-svn: 327319
* [Power9] Code Cleaup and adding Comments for Power 9 SchedulerStefan Pintilie2018-03-092-341/+179
| | | | | | | | | Did some code cleanup up removing ItinRW that are not needed and resource types that are no longer used. Also added more comments to the td files related to the Power 9 sheduler model. llvm-svn: 327174
* Revert "[PowerPC] LSR tunings for PowerPC"Stefan Pintilie2018-03-092-16/+0
| | | | | | | | Revert the rest of the LST tune commit. It seems that the LSR tune commit breaks internal tests. Reverting the commit. llvm-svn: 327143
* [Power9] Add more missing instructions to the Power 9 schedulerStefan Pintilie2018-03-082-40/+276
| | | | | | With this patch we should be able to mark the Power 9 model as complete. llvm-svn: 327021
* [PowerPC] LSR tunings for PowerPCStefan Pintilie2018-03-072-0/+16
| | | | | | | | | The purpose of this patch is to have LSR generate better code on Power. This is done by overriding isLSRCostLess. Differential Revision: https://reviews.llvm.org/D40855 llvm-svn: 326906
* [TargetLowering] Rename DAGCombinerInfo::isAfterLegalizeVectorOps to ↵Craig Topper2018-03-061-1/+1
| | | | | | | | | | | | DAGCombiner::isAfterLegalizeDAG since that's what it checks. NFC The code checks Level == AfterLegalizeDAG which is the fourth and last of the possible DAG combine stages that we have. There is a Level called AfterLegalVectorOps, but that's the third DAG combine and it doesn't always run. A function called isAfterLegalVectorOps should imply it returns true in either of the DAG combines that runs after the legalize vector ops stage, but that's not what this function does. llvm-svn: 326832
* [PowerPC] Do not emit record-form rotates when record-form andi sufficesNemanja Ivanovic2018-03-051-0/+27
| | | | | | | | | | | | | | | | Up until Power9, the performance profile for rlwinm., rldicl. and andi. looked more or less equivalent. However with Power9, the rotates are still 2-way cracked whereas the and-immediate is not. This patch just ensures that we don't emit record-form rotates when an andi. is adequate. As first pointed out by Carrot in https://bugs.llvm.org/show_bug.cgi?id=30833 (this patch is a fix for that PR). Differential Revision: https://reviews.llvm.org/D43977 llvm-svn: 326736
* [Power9] Add more missing instructions to the Power 9 schedulerStefan Pintilie2018-03-052-38/+150
| | | | | | | Adding more instructions using InstRW so that we can move away from ItinRW and ultimately have a complete Power 9 scheduler. llvm-svn: 326701
* [Power9] Add missing instructions to the Power 9 schedulerStefan Pintilie2018-03-021-8/+59
| | | | | | | Adding more instructions using InstRW so that we can move away from ItinRW and ultimately have a complete Power 9 scheduler. llvm-svn: 326578
* [Power9] Add missing instructions to the Power 9 schedulerStefan Pintilie2018-03-012-42/+160
| | | | | | | | | Adding more instructions using InstRW so that we can move away from ItinRW and ultimately have a complete Power 9 scheduler. Differential Revision: https://reviews.llvm.org/D43899 llvm-svn: 326447
* [TLS] use emulated TLS if the target supports only this modeChih-Hung Hsieh2018-02-281-1/+1
| | | | | | | | | | | | | | | Emulated TLS is enabled by llc flag -emulated-tls, which is passed by clang driver. When llc is called explicitly or from other drivers like LTO, missing -emulated-tls flag would generate wrong TLS code for targets that supports only this mode. Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether emulated TLS code should be generated. Unit tests are modified to run with and without the -emulated-tls flag. Differential Revision: https://reviews.llvm.org/D42999 llvm-svn: 326341
* [PowerPC] Disable shrink-wrapping when getting PC address through the LRNemanja Ivanovic2018-02-233-0/+23
| | | | | | | | | | | | | | The instruction sequence used to get the address of the PC into a GPR requires that we clobber the link register. Doing so without having first saved it in the prologue leaves the function unable to return. Currently, this sequence is emitted into the entry block. To ensure the prologue is inserted before this sequence, disable shrink-wrapping. This fixes PR33547. Differential Revision: https://reviews.llvm.org/D43677 llvm-svn: 325972
* [Power9] Add missing instructions to the Power 9 schedulerStefan Pintilie2018-02-231-47/+97
| | | | | | | | | | This is the first in a series of patches that will define more instructions using InstRW so that we can move away from ItinRW and ultimately have a complete Power 9 scheduler. Differential Revision: https://reviews.llvm.org/D43635 llvm-svn: 325956
* [MachineOperand][Target] MachineOperand::isRenamable semantics changesGeoff Berry2018-02-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a target option AllowRegisterRenaming that is used to opt in to post-register-allocation renaming of registers. This is set to 0 by default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq fields of all opcodes to be set to 1, causing MachineOperand::isRenamable to always return false. Set the AllowRegisterRenaming flag to 1 for all in-tree targets that have lit tests that were effected by enabling COPY forwarding in MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC, RISCV, Sparc, SystemZ and X86). Add some more comments describing the semantics of the MachineOperand::isRenamable function and how it is set and maintained. Change isRenamable to check the operand's opcode hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of relying on it being consistently reflected in the IsRenamable bit setting. Clear the IsRenamable bit when changing an operand's register value. Remove target code that was clearing the IsRenamable bit when changing registers/opcodes now that this is done conservatively by default. Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in one place covering all opcodes that have constant pipe read limit restrictions. Reviewers: qcolombet, MatzeB Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D43042 llvm-svn: 325931
* [PowerPC] Code cleanup. Remove instructions that were withdrawn from Power 9.Stefan Pintilie2018-02-231-21/+0
| | | | | | | | | | | | | | The following set of instructions was originally planned to be added for Power 9 and so code was added to support them. However, a decision was made later on to withdraw support for these instructions in the hardware. xscmpnedp xvcmpnesp xvcmpnedp This patch removes support for the instructions that were not added. Differential Revision: https://reviews.llvm.org/D43641 llvm-svn: 325918
* [PowerPC] Do not produce invalid CTR loop with an FRemNemanja Ivanovic2018-02-221-1/+4
| | | | | | | | | | | An FRem instruction inside a loop should prevent the loop from being converted into a CTR loop since this is not an operation that is legal on any PPC subtarget. This will always be a call to a library function which means the loop will be invalid if this instruction is in the body. Fixes PR36292. llvm-svn: 325739
* [PowerPC] Reduce stack frame for fastcc functions by only allocating ↵Lei Huang2018-02-201-2/+11
| | | | | | | | | | | | parameter save area when needed Current implementation always allocates the parameter save area conservatively for fastcc functions. There is no reason to allocate the parameter save area if all the parameters can be passed via registers. Differential Revision: https://reviews.llvm.org/D42602 llvm-svn: 325581
* [PowerPC] Fix transform in table gen file causing UBNemanja Ivanovic2018-02-161-1/+1
| | | | | | | | Running a bootstrap build with UBSan produces a number of instances where we have signed integer overflow due to this transform. Change the type to long to prevent this UB on 64-bit build machines. llvm-svn: 325347
* [CodeGen] Unify the syntax of MBB liveins in MIR and -debug outputFrancis Visoiu Mistrih2018-02-091-2/+2
| | | | | | | | | | | Instead of: Live Ins: %r0 %r1 print: liveins: %r0, %r1 llvm-svn: 324694
* [PowerPC] fix up in rL324229, NFCHiroshi Inoue2018-02-061-1/+1
| | | | | | This patch fixes up my previous commit (add initialization of local variables). llvm-svn: 324336
* [PowerPC] Check hot loop exit edge in PPCCTRLoopsHiroshi Inoue2018-02-051-0/+21
| | | | | | | | | | | | | | | | PPCCTRLoops transform loops using mtctr/bdnz instructions if loop trip count is known and big enough to compensate for the cost of mtctr. But if there is a loop exit edge which is known to be frequently taken (by builtin_expect or by PGO), we should not transform the loop to avoid the cost of mtctr instruction. Here is an example of a loop with hot exit edge: for (unsigned i = 0; i < TripCount; i++) { // do something if (__builtin_expect(check(), 1)) break; // do something } Differential Revision: https://reviews.llvm.org/D42637 llvm-svn: 324229
* Remove non-modular header containing static utility functionsDavid Blaikie2018-02-022-203/+178
| | | | | | | | | | The one place that uses these functions isn't particularly long/complicated, so it's easier to just have these inline at that location than trying to split it out into a true header. (in part also because of the use of the DEBUG macros, which make this not really a standalone header even if the static functions were made inline instead) llvm-svn: 324044
* [PowerPC] Tell VSX swap removal that scalar conversions are lane-sensitiveNemanja Ivanovic2018-02-011-0/+2
| | | | | | | | This is a rather non-controversial change. We were missing these instructions from the list of instructions that are lane-sensitive. These two put the result into lane 0 (BE) or 3 (LE) regardless of the input. This patch fixes PR36068. llvm-svn: 324005
* [PowerPC] Return true in enableMultipleCopyHints().Jonas Paulsson2018-01-311-0/+2
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Nemanja Ivanovic llvm-svn: 323858
* Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass ↵Zaara Syeda2018-01-307-6/+113
| | | | | | | | | | | | | | | | | | | | | to mark candidates with coldcc attribute. This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778
* [NFC] fix trivial typos in comments and documentsHiroshi Inoue2018-01-291-1/+1
| | | | | | "to to" -> "to" llvm-svn: 323628
* [PPC] Avoid incorrect fp-i128-fp lowering.Tim Shen2018-01-231-2/+4
| | | | | | | | | | | | | | | Summary: Fix an issue that's similar to what D41411 fixed: float(__int128(float_var)) shouldn't be optimized to xscvdpsxds + xscvsxdsp, as they mean (float)(int64_t)float_var. Reviewers: jtony, hfinkel, echristo Subscribers: sanjoy, nemanjai, hiraditya, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D42400 llvm-svn: 323270
* Revert [PowerPC] This reverts commit rL322721Zaara Syeda2018-01-177-113/+6
| | | | | | Failing build bots. Revert the commit now. llvm-svn: 322748
* [PowerPC] Add handling for ColdCC calling convention and a pass to markZaara Syeda2018-01-177-6/+113
| | | | | | | | | | | | | | | | candidates with coldcc attribute. This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 322721
* [PPC] Add a new register XER aliased to CARRYGuozhi Wei2018-01-161-2/+6
| | | | | | | | | | When "xer" is specified as clobbered register in inline assembler, clang can accept it, but llvm simply ignore it when lowered to machine instructions. It may cause problems later in scheduler. This patch adds a new register XER aliased to CARRY, and adds it to register class CARRYRC. Now PPCTargetLowering::getRegForInlineAsmConstraint can return correct register number for inline asm constraint "{xer}", and scheduler behave correctly. Differential Revision: https://reviews.llvm.org/D41967 llvm-svn: 322591
* [PowerPC] Don't miscompile rotate+mask into an ANDIo if it can't recreate ↵Benjamin Kramer2018-01-121-0/+4
| | | | | | | | | the immediate I'm not even sure if this transform is ever worth it, but this at least stops the bleeding. llvm-svn: 322373
* [PowerPC] Zero-extend the compare operand for ATOMIC_CMP_SWAPNemanja Ivanovic2018-01-123-0/+61
| | | | | | | | | | | | Part of the fix for https://bugs.llvm.org/show_bug.cgi?id=35812. This patch ensures that the compare operand for the atomic compare and swap is properly zero-extended to 32 bits if applicable. A follow-up commit will fix the extension for the SETCC node generated when expanding an ATOMIC_CMP_SWAP_WITH_SUCCESS. That will complete the bug fix. Differential Revision: https://reviews.llvm.org/D41856 llvm-svn: 322372
* Revert "[PowerPC] Manually schedule the prologue and epilogue"Stefan Pintilie2018-01-121-62/+6
| | | | | | | This reverts commit r322124 since some tests were broken by that patch. Will recommmit once the patch is fixed. llvm-svn: 322369
* [PowerPC] Manually schedule the prologue and epilogueStefan Pintilie2018-01-091-6/+62
| | | | | | | | | | | | | | | | | | | | | This patch makes the following changes to the schedule of instructions in the prologue and epilogue. The stack pointer update is moved down in the prologue so that the callee saves do not have to wait for the update to happen. Saving the lr is moved down in the prologue to hide the latency of the mflr. The stack pointer is moved up in the epilogue so that restoring of the lr can happen sooner. The mtlr is moved up in the epilogue so that it is away form the blr at the end of the epilogue. The latency of the mtlr can now be hidden by the loads of the callee saved registers. This commit is almost identical to this one: r322036 except that two warnings that broke build bots have been fixed. The revision number is D41737 as before. llvm-svn: 322124
* [PowerPC] Can not assume an intrinsic argument is a simple type.Sean Fertile2018-01-091-6/+7
| | | | | | | | | | | | | The CTRLoop pass performs checks on the argument of certain libcalls/intrinsics, and assumes the arguments must be of a simple type. This isn't always the case though. For example if we unroll and vectorize a loop we may end up with vectors larger then the largest legal type, along with intrinsics that operate on those wider types. This happened in the ffmpeg build, where we unrolled a loop and ended up with a sqrt intrinsic that operated on V16f64, triggering an assertion. Differential Revision: https://reviews.llvm.org/D41758 llvm-svn: 322055
* Revert "[PowerPC] Manually schedule the prologue and epilogue"Stefan Pintilie2018-01-091-65/+6
| | | | | | | | [PowerPC] This reverts commit r322036. Failing build bots. Revert the commit now. llvm-svn: 322051
* [PowerPC] Manually schedule the prologue and epilogueStefan Pintilie2018-01-081-6/+65
| | | | | | | | | | | | | | | | | | This patch makes the following changes to the schedule of instructions in the prologue and epilogue. The stack pointer update is moved down in the prologue so that the callee saves do not have to wait for the update to happen. Saving the lr is moved down in the prologue to hide the latency of the mflr. The stack pointer is moved up in the epilogue so that restoring of the lr can happen sooner. The mtlr is moved up in the epilogue so that it is away form the blr at the end of the epilogue. The latency of the mtlr can now be hidden by the loads of the callee saved registers. Differential Revision: https://reviews.llvm.org/D41737 llvm-svn: 322036
* [PowerPC] Add an ISD::TRUNCATE to the legalization for ↵Craig Topper2018-01-071-1/+1
| | | | | | | | | | | | | | | | | | | | | ppc_is_decremented_ctr_nonzero Summary: I believe legalization is really expecting that ReplaceNodeResults will return something with the same type as the thing that's being legalized. Ultimately, it uses the output to replace the uses in the DAG so the type should match to make that work. There are two relevant cases here. When crbits are enabled, then i1 is a legal type and getSetCCResultType should return i1. In this case, the truncate will be between i1 and i1 and should be removed (SelectionDAG::getNode does this). Otherwise, getSetCCResultType will be i32 and the legalizer will promote the truncate to be i32 -> i32 which will be similarly removed. With this fixed we can remove some code from PromoteIntRes_SETCC that seemed to only exist to deal with the intrinsic being replaced with a larger type without changing the other operand. With the truncate being used for connectivity this doesn't happen anymore. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: nemanjai, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D41654 llvm-svn: 321959
* Thread MCSubtargetInfo through Target::createMCAsmBackendAlex Bradbury2018-01-032-3/+6
| | | | | | | | | | | | | | | | | | | | | Currently it's not possible to access MCSubtargetInfo from a TgtMCAsmBackend. D20830 threaded an MCSubtargetInfo reference through MCAsmBackend::relaxInstruction, but this isn't the only function that would benefit from access. This patch removes the Triple and CPUString arguments from createMCAsmBackend and replaces them with MCSubtargetInfo. This patch just changes the interface without making any intentional functional changes. Once in, several cleanups are possible: * Get rid of the awkward MCSubtargetInfo handling in ARMAsmBackend * Support 16-bit instructions when valid in MipsAsmBackend::writeNopData * Get rid of the CPU string parsing in X86AsmBackend and just use a SubtargetFeature for HasNopl * Emit 16-bit nops in RISCVAsmBackend::writeNopData if the compressed instruction set extension is enabled (see D41221) This change initially exposed PR35686, which has since been resolved in r321026. Differential Revision: https://reviews.llvm.org/D41349 llvm-svn: 321692
OpenPOWER on IntegriCloud