summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
* [PowerPC] Reduce stack frame for fastcc functions by only allocating ↵Lei Huang2018-02-201-2/+11
| | | | | | | | | | | | parameter save area when needed Current implementation always allocates the parameter save area conservatively for fastcc functions. There is no reason to allocate the parameter save area if all the parameters can be passed via registers. Differential Revision: https://reviews.llvm.org/D42602 llvm-svn: 325581
* [PowerPC] Fix transform in table gen file causing UBNemanja Ivanovic2018-02-161-1/+1
| | | | | | | | Running a bootstrap build with UBSan produces a number of instances where we have signed integer overflow due to this transform. Change the type to long to prevent this UB on 64-bit build machines. llvm-svn: 325347
* [CodeGen] Unify the syntax of MBB liveins in MIR and -debug outputFrancis Visoiu Mistrih2018-02-091-2/+2
| | | | | | | | | | | Instead of: Live Ins: %r0 %r1 print: liveins: %r0, %r1 llvm-svn: 324694
* [PowerPC] fix up in rL324229, NFCHiroshi Inoue2018-02-061-1/+1
| | | | | | This patch fixes up my previous commit (add initialization of local variables). llvm-svn: 324336
* [PowerPC] Check hot loop exit edge in PPCCTRLoopsHiroshi Inoue2018-02-051-0/+21
| | | | | | | | | | | | | | | | PPCCTRLoops transform loops using mtctr/bdnz instructions if loop trip count is known and big enough to compensate for the cost of mtctr. But if there is a loop exit edge which is known to be frequently taken (by builtin_expect or by PGO), we should not transform the loop to avoid the cost of mtctr instruction. Here is an example of a loop with hot exit edge: for (unsigned i = 0; i < TripCount; i++) { // do something if (__builtin_expect(check(), 1)) break; // do something } Differential Revision: https://reviews.llvm.org/D42637 llvm-svn: 324229
* Remove non-modular header containing static utility functionsDavid Blaikie2018-02-022-203/+178
| | | | | | | | | | The one place that uses these functions isn't particularly long/complicated, so it's easier to just have these inline at that location than trying to split it out into a true header. (in part also because of the use of the DEBUG macros, which make this not really a standalone header even if the static functions were made inline instead) llvm-svn: 324044
* [PowerPC] Tell VSX swap removal that scalar conversions are lane-sensitiveNemanja Ivanovic2018-02-011-0/+2
| | | | | | | | This is a rather non-controversial change. We were missing these instructions from the list of instructions that are lane-sensitive. These two put the result into lane 0 (BE) or 3 (LE) regardless of the input. This patch fixes PR36068. llvm-svn: 324005
* [PowerPC] Return true in enableMultipleCopyHints().Jonas Paulsson2018-01-311-0/+2
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Nemanja Ivanovic llvm-svn: 323858
* Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass ↵Zaara Syeda2018-01-307-6/+113
| | | | | | | | | | | | | | | | | | | | | to mark candidates with coldcc attribute. This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778
* [NFC] fix trivial typos in comments and documentsHiroshi Inoue2018-01-291-1/+1
| | | | | | "to to" -> "to" llvm-svn: 323628
* [PPC] Avoid incorrect fp-i128-fp lowering.Tim Shen2018-01-231-2/+4
| | | | | | | | | | | | | | | Summary: Fix an issue that's similar to what D41411 fixed: float(__int128(float_var)) shouldn't be optimized to xscvdpsxds + xscvsxdsp, as they mean (float)(int64_t)float_var. Reviewers: jtony, hfinkel, echristo Subscribers: sanjoy, nemanjai, hiraditya, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D42400 llvm-svn: 323270
* Revert [PowerPC] This reverts commit rL322721Zaara Syeda2018-01-177-113/+6
| | | | | | Failing build bots. Revert the commit now. llvm-svn: 322748
* [PowerPC] Add handling for ColdCC calling convention and a pass to markZaara Syeda2018-01-177-6/+113
| | | | | | | | | | | | | | | | candidates with coldcc attribute. This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 322721
* [PPC] Add a new register XER aliased to CARRYGuozhi Wei2018-01-161-2/+6
| | | | | | | | | | When "xer" is specified as clobbered register in inline assembler, clang can accept it, but llvm simply ignore it when lowered to machine instructions. It may cause problems later in scheduler. This patch adds a new register XER aliased to CARRY, and adds it to register class CARRYRC. Now PPCTargetLowering::getRegForInlineAsmConstraint can return correct register number for inline asm constraint "{xer}", and scheduler behave correctly. Differential Revision: https://reviews.llvm.org/D41967 llvm-svn: 322591
* [PowerPC] Don't miscompile rotate+mask into an ANDIo if it can't recreate ↵Benjamin Kramer2018-01-121-0/+4
| | | | | | | | | the immediate I'm not even sure if this transform is ever worth it, but this at least stops the bleeding. llvm-svn: 322373
* [PowerPC] Zero-extend the compare operand for ATOMIC_CMP_SWAPNemanja Ivanovic2018-01-123-0/+61
| | | | | | | | | | | | Part of the fix for https://bugs.llvm.org/show_bug.cgi?id=35812. This patch ensures that the compare operand for the atomic compare and swap is properly zero-extended to 32 bits if applicable. A follow-up commit will fix the extension for the SETCC node generated when expanding an ATOMIC_CMP_SWAP_WITH_SUCCESS. That will complete the bug fix. Differential Revision: https://reviews.llvm.org/D41856 llvm-svn: 322372
* Revert "[PowerPC] Manually schedule the prologue and epilogue"Stefan Pintilie2018-01-121-62/+6
| | | | | | | This reverts commit r322124 since some tests were broken by that patch. Will recommmit once the patch is fixed. llvm-svn: 322369
* [PowerPC] Manually schedule the prologue and epilogueStefan Pintilie2018-01-091-6/+62
| | | | | | | | | | | | | | | | | | | | | This patch makes the following changes to the schedule of instructions in the prologue and epilogue. The stack pointer update is moved down in the prologue so that the callee saves do not have to wait for the update to happen. Saving the lr is moved down in the prologue to hide the latency of the mflr. The stack pointer is moved up in the epilogue so that restoring of the lr can happen sooner. The mtlr is moved up in the epilogue so that it is away form the blr at the end of the epilogue. The latency of the mtlr can now be hidden by the loads of the callee saved registers. This commit is almost identical to this one: r322036 except that two warnings that broke build bots have been fixed. The revision number is D41737 as before. llvm-svn: 322124
* [PowerPC] Can not assume an intrinsic argument is a simple type.Sean Fertile2018-01-091-6/+7
| | | | | | | | | | | | | The CTRLoop pass performs checks on the argument of certain libcalls/intrinsics, and assumes the arguments must be of a simple type. This isn't always the case though. For example if we unroll and vectorize a loop we may end up with vectors larger then the largest legal type, along with intrinsics that operate on those wider types. This happened in the ffmpeg build, where we unrolled a loop and ended up with a sqrt intrinsic that operated on V16f64, triggering an assertion. Differential Revision: https://reviews.llvm.org/D41758 llvm-svn: 322055
* Revert "[PowerPC] Manually schedule the prologue and epilogue"Stefan Pintilie2018-01-091-65/+6
| | | | | | | | [PowerPC] This reverts commit r322036. Failing build bots. Revert the commit now. llvm-svn: 322051
* [PowerPC] Manually schedule the prologue and epilogueStefan Pintilie2018-01-081-6/+65
| | | | | | | | | | | | | | | | | | This patch makes the following changes to the schedule of instructions in the prologue and epilogue. The stack pointer update is moved down in the prologue so that the callee saves do not have to wait for the update to happen. Saving the lr is moved down in the prologue to hide the latency of the mflr. The stack pointer is moved up in the epilogue so that restoring of the lr can happen sooner. The mtlr is moved up in the epilogue so that it is away form the blr at the end of the epilogue. The latency of the mtlr can now be hidden by the loads of the callee saved registers. Differential Revision: https://reviews.llvm.org/D41737 llvm-svn: 322036
* [PowerPC] Add an ISD::TRUNCATE to the legalization for ↵Craig Topper2018-01-071-1/+1
| | | | | | | | | | | | | | | | | | | | | ppc_is_decremented_ctr_nonzero Summary: I believe legalization is really expecting that ReplaceNodeResults will return something with the same type as the thing that's being legalized. Ultimately, it uses the output to replace the uses in the DAG so the type should match to make that work. There are two relevant cases here. When crbits are enabled, then i1 is a legal type and getSetCCResultType should return i1. In this case, the truncate will be between i1 and i1 and should be removed (SelectionDAG::getNode does this). Otherwise, getSetCCResultType will be i32 and the legalizer will promote the truncate to be i32 -> i32 which will be similarly removed. With this fixed we can remove some code from PromoteIntRes_SETCC that seemed to only exist to deal with the intrinsic being replaced with a larger type without changing the other operand. With the truncate being used for connectivity this doesn't happen anymore. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: nemanjai, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D41654 llvm-svn: 321959
* Thread MCSubtargetInfo through Target::createMCAsmBackendAlex Bradbury2018-01-032-3/+6
| | | | | | | | | | | | | | | | | | | | | Currently it's not possible to access MCSubtargetInfo from a TgtMCAsmBackend. D20830 threaded an MCSubtargetInfo reference through MCAsmBackend::relaxInstruction, but this isn't the only function that would benefit from access. This patch removes the Triple and CPUString arguments from createMCAsmBackend and replaces them with MCSubtargetInfo. This patch just changes the interface without making any intentional functional changes. Once in, several cleanups are possible: * Get rid of the awkward MCSubtargetInfo handling in ARMAsmBackend * Support 16-bit instructions when valid in MipsAsmBackend::writeNopData * Get rid of the CPU string parsing in X86AsmBackend and just use a SubtargetFeature for HasNopl * Emit 16-bit nops in RISCVAsmBackend::writeNopData if the compressed instruction set extension is enabled (see D41221) This change initially exposed PR35686, which has since been resolved in r321026. Differential Revision: https://reviews.llvm.org/D41349 llvm-svn: 321692
* [PowerPC] fix a bug in TCO eligibility checkHiroshi Inoue2017-12-301-6/+29
| | | | | | | | | | If the callee and caller use different calling convensions, we cannot apply TCO if the callee requires arguments on stack; e.g. C calling convention and Fast CC use the same registers for parameter passing, but the stack offset is not necessarily same. This patch also recommit r319218 "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." by @sfertile since the problem reported in r320106 should be fixed. Differential Revision: https://reviews.llvm.org/D40893 llvm-svn: 321579
* [PowerPC] Fix for PR35688 - handle out-of-range values for r+r to r+i conversionNemanja Ivanovic2017-12-294-23/+69
| | | | | | | | | | | | | | | Revision 320791 introduced a pass that transforms reg+reg instructions to reg+imm if they're fed by "load immediate". However, it didn't handle out-of-range shifts correctly as reported in PR35688. This patch fixes that and therefore the PR. Furthermore, there was undefined behaviour in the patch where the RHS of an initialization expression was 32 bits and constant `1` was shifted left 32 bits. This was fixed by ensuring the RHS is 64 bits just like the LHS. Differential Revision: https://reviews.llvm.org/D41369 llvm-svn: 321551
* (Re-landing) Expose a TargetMachine::getTargetTransformInfo functionSanjoy Das2017-12-222-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-land r321234. It had to be reverted because it broke the shared library build. The shared library build broke because there was a missing LLVMBuild dependency from lib/Passes (which calls TargetMachine::getTargetIRAnalysis) to lib/Target. As far as I can tell, this problem was always there but was somehow masked before (perhaps because TargetMachine::getTargetIRAnalysis was a virtual function). Original commit message: This makes the TargetMachine interface a bit simpler. We still need the std::function in TargetIRAnalysis to avoid having to add a dependency from Analysis to Target. See discussion: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html I avoided adding all of the backend owners to this review since the change is simple, but let me know if you feel differently about this. Reviewers: echristo, MatzeB, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D41464 llvm-svn: 321375
* [PowerPC] Fix parest build failure in SPEC2017.Tony Jiang2017-12-211-5/+6
| | | | | | | | | | | | | | | | The build failure was caused by an assertion in pre-legalization DAGCombine: Combining: t6: ppcf128 = uint_to_fp t5 ... into: t20: f32 = PPCISD::FCFIDUS t19 which is clearly wrong since ppcf128 are definitely different type with f32 and we cannot change the node value type when do DAGCombine. The fix is don't handle ppc_fp128 or i1 conversions in PPCTargetLowering::combineFPToIntToFP and leave it to downstream to legalize it and expand it to small legal types. Differential Revision: https://reviews.llvm.org/D41411 llvm-svn: 321276
* Revert "Expose a TargetMachine::getTargetTransformInfo function"Sanjoy Das2017-12-212-4/+5
| | | | | | This reverts commit r321234. It breaks the -DBUILD_SHARED_LIBS=ON build. llvm-svn: 321243
* Expose a TargetMachine::getTargetTransformInfo functionSanjoy Das2017-12-212-5/+4
| | | | | | | | | | | | | | | | | | | | | | | Summary: This makes the TargetMachine interface a bit simpler. We still need the std::function in TargetIRAnalysis to avoid having to add a dependency from Analysis to Target. See discussion: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html I avoided adding all of the backend owners to this review since the change is simple, but let me know if you feel differently about this. Reviewers: echristo, MatzeB, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D41464 llvm-svn: 321234
* [PowerPC] Added an assert to make sure that the MBBI iterator is valid.Stefan Pintilie2017-12-201-3/+3
| | | | | | | | | | The function createTailCallBranchInstr assumes that the iterator MBBI is valid. However, only one use of MBBI is guarded in the function. Fix this by adding an assert. Differential Revision: https://reviews.llvm.org/D41358 llvm-svn: 321205
* [PowerPC] fix a bug in redundant compare eliminationHiroshi Inoue2017-12-201-5/+13
| | | | | | | | | | This patch fixes a bug in the redundant compare elimination reported in https://reviews.llvm.org/rL320786 and re-enables the optimization. The redundant compare elimination assumes that we can replace signed comparison with unsigned comparison for the equality check. But due to the difference in the sign extension behavior we cannot change the opcode if the comparison is against an immediate and the most significant bit of the immediate is one. Differential Revision: https://reviews.llvm.org/D41385 llvm-svn: 321147
* [PPC] Also disable the pre-emit version of reg+reg to reg+imm transformation.Benjamin Kramer2017-12-181-1/+1
| | | | | | This has the same issue as the early pass disabled in r321010. llvm-svn: 321013
* [PPC] Disable reg+reg to reg+imm transformation.Benjamin Kramer2017-12-181-1/+1
| | | | | | It creates invalid instructions. PR35688. llvm-svn: 321010
* [PowerPC, AsmParser] Enable the mnemonic spell correctorHal Finkel2017-12-161-2/+15
| | | | | | | | | | | r307148 added an assembly mnemonic spelling correction support and enabled it on ARM. This enables that support on PowerPC as well. Patch by Dmitry Venikov, thanks! Differential Revision: https://reviews.llvm.org/D40552 llvm-svn: 320911
* MachineFunction: Return reference from getFunction(); NFCMatthias Braun2017-12-1515-36/+36
| | | | | | The Function can never be nullptr so we can return a reference. llvm-svn: 320884
* Fix the second build bot break introduced by r320791.Nemanja Ivanovic2017-12-151-0/+7
| | | | llvm-svn: 320811
* Fix code causing fallthrough warnings in the PPC back end.Nemanja Ivanovic2017-12-154-1/+7
| | | | llvm-svn: 320806
* Fix the build bot break introduced by r320791.Nemanja Ivanovic2017-12-151-1/+6
| | | | llvm-svn: 320798
* [PowerPC] Convert r+r instructions to r+i (pre and post RA)Nemanja Ivanovic2017-12-159-48/+1041
| | | | | | | | | | | | | | | | | | | | | | This patch adds the necessary infrastructure to convert instructions that take two register operands to those that take a register and immediate if the necessary operand is produced by a load-immediate. Furthermore, it uses this infrastructure to perform such conversions twice - first at MachineSSA and then pre-emit. There are a number of reasons we may end up with opportunities for this transformation, including but not limited to: - X-Form instructions chosen since the exact offset isn't available at ISEL time - Atomic instructions with constant operands (we will add patterns for this in the future) - Tail duplication may duplicate code where one block contains this redundancy - When emitting compare-free code in PPCDAGToDAGISel, we don't handle constant comparands specially Furthermore, this patch moves the initialization of PPCMIPeepholePass so that it can be used for MIR tests. llvm-svn: 320791
* Disabling r312514 as it causes miscompiles that show up on bootstrapNemanja Ivanovic2017-12-151-1/+1
| | | | | | | | | | | | The compare elimination peephole introduced in https://reviews.llvm.org/rL312514 causes a miscompile in AMDGPUInstrInfo.cpp which in turn causes some AMDGPU test case failures in stage2 bootstrap testing. This miscompile didn't cause any test case failures until https://reviews.llvm.org/rL320614, so it appeared as if that patch caused these failures. Disabling this transformation for now to bring the build bots back to green and the author of the patch will investigate the miscompile. llvm-svn: 320786
* TLI: Allow using PSV for intrinsic mem operandsMatt Arsenault2017-12-142-0/+2
| | | | llvm-svn: 320756
* DAG: Expose all MMO flags in getTgtMemIntrinsicMatt Arsenault2017-12-141-14/+6
| | | | | | | | | | | | | | Rather than adding more bits to express every MMO flag you could want, just directly use the MMO flags. Also fixes using a bunch of bool arguments to getMemIntrinsicNode. On AMDGPU, buffer and image intrinsics should always have MODereferencable set, but currently there is no way to do that directly during the initial intrinsic lowering. llvm-svn: 320746
* [CodeGen] Print global addresses as @foo in both MIR and debug outputFrancis Visoiu Mistrih2017-12-143-24/+24
| | | | | | | | | | | | Work towards the unification of MIR and debug output by printing `@foo` instead of `<ga:@foo>`. Also print target flags in the MIR format since most of them are used on global address operands. Only debug syntax is affected. llvm-svn: 320682
* Fix link failure on one build bot introduced by r320584.Nemanja Ivanovic2017-12-131-1/+3
| | | | llvm-svn: 320589
* [PowerPC] MachineSSA pass to reduce the number of CR-logical operationsNemanja Ivanovic2017-12-135-0/+740
| | | | | | | | | | | | | | The initial implementation of an MI SSA pass to reduce cr-logical operations. Currently, the only operations handled by the pass are binary operations where both CR-inputs come from the same block and the single use is a conditional branch (also in the same block). Committing this off by default to allow for a period of field testing. Will enable it by default in a follow-up patch soon. Differential Revision: https://reviews.llvm.org/D30431 llvm-svn: 320584
* [Targets] Don't automatically include the scheduler class enum from ↵Craig Topper2017-12-131-0/+1
| | | | | | | | | | *GenInstrInfo.inc with GET_INSTRINFO_ENUM. Make targets request is separately. Most of the targets don't need the scheduler class enum. I have an X86 scheduler model change that causes some names in the enum to become about 18000 characters long. This is because using instregex in scheduler models causes the scheduler class to get named with every instruction that matches the regex concatenated together. MSVC has a limit of 4096 characters for an identifier name. Rather than trying to come up with way to reduce the name length, I'm just going to sidestep the problem by not including the enum in X86. llvm-svn: 320552
* Rename LiveIntervalAnalysis.h to LiveIntervals.hMatthias Braun2017-12-133-3/+3
| | | | | | | | | | Headers/Implementation files should be named after the class they declare/define. Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in favor of `class LiveIntarvals;` llvm-svn: 320546
* [PowerPC] Add branch flag on asm parser-only branch instructionsNemanja Ivanovic2017-12-121-1/+1
| | | | | | | | | | | This flag was missing but it wasn't an issue as nothing depended on it for these asm parser-only instructions. Now that LLDB support is slowly landing, it is important to get this right. Committing on behalf of Leonardo Bianconi. Differential revision: https://reviews.llvm.org/D40846 llvm-svn: 320475
* [PowerPC] Follow-up to r318436 to get the missed CSE opportunitiesNemanja Ivanovic2017-12-121-1/+65
| | | | | | | | | | | | | | | | | | The last of the three patches that https://reviews.llvm.org/D40348 was broken up into. Canonicalize the materialization of constants so that they are more likely to be CSE'd regardless of the bit-width of the use. If a constant can be materialized using PPC::LI, materialize it the same way always. For example: li 4, -1 li 4, 255 li 4, 65535 are equivalent if the uses only use the low byte. Canonicalize it to the first form. Differential Revision: https://reviews.llvm.org/D40348 llvm-svn: 320473
* [PowerPC] Partially enable the ISEL expansion pass.Tony Jiang2017-12-111-21/+64
| | | | | | | | | | | The pass to expand ISEL instructions into if-then-else sequences in patch D23630 is currently disabled. This patch partially enable it by always removing the unnecessary ISELs (all registers used by the ISELs are the same one) and folding the ISELs which have the same input registers into unconditional copies. Differential Revision: https://reviews.llvm.org/D40497 llvm-svn: 320414
OpenPOWER on IntegriCloud