path: root/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
Commit message | Author | Age | Files | Lines
* [PowerPC] fix a bug in TCO eligibility check | Hiroshi Inoue | 2017-12-30 | 1 file | -6/+29
    If the callee and caller use different calling conventions, we cannot apply TCO if the callee requires arguments on the stack; e.g. the C calling convention and Fast CC use the same registers for parameter passing, but the stack offsets are not necessarily the same.

    This patch also recommits r319218 "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." by @sfertile, since the problem reported in r320106 should be fixed.

    Differential Revision: https://reviews.llvm.org/D40893
    llvm-svn: 321579
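The rule described in this commit message can be sketched as a small predicate. This is only an illustration under stated assumptions: `CallConv`, `CallInfo`, and `mayTailCall` are hypothetical names, not LLVM's actual types or functions, and the real eligibility check in PPCISelLowering.cpp considers many more conditions.

```cpp
#include <cassert>

// Hypothetical, simplified model of a call site; not LLVM's data structures.
enum class CallConv { C, Fast };

struct CallInfo {
  CallConv CallerCC;
  CallConv CalleeCC;
  bool CalleeNeedsStackArgs; // true if any argument is passed in memory
};

// Models only the rule from the commit message: with differing calling
// conventions, a callee that needs stack arguments is not eligible, because
// the stack offsets of the two conventions are not guaranteed to agree.
// All other eligibility conditions are deliberately left out of this sketch.
constexpr bool mayTailCall(const CallInfo &CI) {
  if (CI.CallerCC == CI.CalleeCC)
    return true;
  return !CI.CalleeNeedsStackArgs;
}
```

Under this model, a ccc caller may tail-call a fastcc callee only while all of the callee's arguments fit in registers.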
* [PowerPC] Fix parest build failure in SPEC2017. | Tony Jiang | 2017-12-21 | 1 file | -5/+6
    The build failure was caused by an assertion in pre-legalization DAGCombine:

        Combining: t6: ppcf128 = uint_to_fp t5
        ... into: t20: f32 = PPCISD::FCFIDUS t19

    This is clearly wrong, since ppcf128 is a different type from f32 and we cannot change a node's value type during DAGCombine.

    The fix is to not handle ppc_fp128 or i1 conversions in PPCTargetLowering::combineFPToIntToFP and to leave it to later legalization to expand them into smaller legal types.

    Differential Revision: https://reviews.llvm.org/D41411
    llvm-svn: 321276
* MachineFunction: Return reference from getFunction(); NFC | Matthias Braun | 2017-12-15 | 1 file | -15/+15
    The Function can never be nullptr so we can return a reference.

    llvm-svn: 320884
* Fix code causing fallthrough warnings in the PPC back end. | Nemanja Ivanovic | 2017-12-15 | 1 file | -0/+1
    llvm-svn: 320806
* TLI: Allow using PSV for intrinsic mem operands | Matt Arsenault | 2017-12-14 | 1 file | -0/+1
    llvm-svn: 320756
* DAG: Expose all MMO flags in getTgtMemIntrinsic | Matt Arsenault | 2017-12-14 | 1 file | -14/+6
    Rather than adding more bits to express every MMO flag you could want, just directly use the MMO flags. Also fixes using a bunch of bool arguments to getMemIntrinsicNode.

    On AMDGPU, buffer and image intrinsics should always have MODereferencable set, but currently there is no way to do that directly during the initial intrinsic lowering.

    llvm-svn: 320746
* [PowerPC] Sign-extend negative constant stores | Nemanja Ivanovic | 2017-12-11 | 1 file | -2/+6
    Second part of https://reviews.llvm.org/D40348.

    Revision r318436 extended all constants feeding a store to 64 bits to allow for CSE on the SDAG. However, negative constants were zero-extended, which made the constant being loaded appear to be a positive value larger than 16 bits. This resulted in long sequences to materialize such constants rather than simply a "load immediate".

    This patch just sign-extends those updated constants so that they remain 16-bit signed immediates if they started out that way.

    llvm-svn: 320368
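The effect described above can be reproduced with plain integer arithmetic. The following is a minimal sketch; `fitsInS16`, `zeroExtend32`, and `signExtend32` are hypothetical helper names chosen for illustration and are not the functions used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// True if V can be encoded as a 16-bit signed immediate.
constexpr bool fitsInS16(int64_t V) {
  return V >= INT16_MIN && V <= INT16_MAX;
}

// Widen a 32-bit constant to 64 bits by zero-extension.
constexpr int64_t zeroExtend32(int32_t V) {
  return static_cast<int64_t>(static_cast<uint32_t>(V));
}

// Widen a 32-bit constant to 64 bits by sign-extension.
constexpr int64_t signExtend32(int32_t V) {
  return static_cast<int64_t>(V);
}
```

For the constant -1, zero-extension yields 4294967295, which no longer fits a 16-bit signed immediate, while sign-extension keeps the value -1, which does.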
* Temporarily revert "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." | Eric Christopher | 2017-12-07 | 1 file | -10/+5
    It is causing sanitizer failures on llvm tests in a bootstrapped compiler. No bot link since it's currently down, but following up to get the bot up.

    This reverts commit r319218.

    llvm-svn: 320106
* [PowerPC] Allow tail calls of fastcc functions from C CallingConv functions. | Sean Fertile | 2017-11-28 | 1 file | -5/+10
    Allow fastcc callees to be tail-called from ccc callers.

    Differential Revision: https://reviews.llvm.org/D40355
    llvm-svn: 319218
* Fix a bunch more layering of CodeGen headers that are in Target | David Blaikie | 2017-11-17 | 1 file | -2/+2
    All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around).

    llvm-svn: 318490
* [PPC] Change i32 constant in store instruction to i64 | Guozhi Wei | 2017-11-16 | 1 file | -1/+16
    This patch changes every i32 constant in a store instruction to i64 with truncation, to increase the chance that the referenced constant can be shared with other i64 constants.

    Differential Revision: https://reviews.llvm.org/D39352
    llvm-svn: 318436
* [PowerPC] Implement mayBeEmittedAsTailCall for PPC | Sean Fertile | 2017-11-15 | 1 file | -0/+35
    Implements the TargetLowering callback 'mayBeEmittedAsTailCall', which enables CodeGenPrepare to duplicate returns when they might enable a tail-call.

    Differential Revision: https://reviews.llvm.org/D39777
    llvm-svn: 318321
* [PowerPC] Split out the tailcall calling convention checks. NFC. | Sean Fertile | 2017-11-15 | 1 file | -11/+19
    Move the calling convention checks for tail-call eligibility for the 64-bit SysV ABI into a separate function. This is so that it can be shared with 'mayBeEmittedAsTailCall' in a subsequent change.

    llvm-svn: 318305
* Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering | David Blaikie | 2017-11-08 | 1 file | -1/+1
    This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the layering of its implementation.

    llvm-svn: 317647
* Use new vector insert half-word and byte instructions when we see insertelement on '8 x i16' and '16 x i8' types. | Graham Yiu | 2017-11-07 | 1 file | -3/+26
    Also extended existing lit testcase to cover these cases.

    Differential Revision: https://reviews.llvm.org/D34630
    llvm-svn: 317613
* Fix buildbot breakages from r317503. Add parentheses to assignment when using result as a condition. | Graham Yiu | 2017-11-06 | 1 file | -2/+2
    llvm-svn: 317508
* Adds code to PPC ISEL lowering to recognize byte inserts from vector_shuffles, and use P9 shift and vector insert byte instructions instead of vperm. | Graham Yiu | 2017-11-06 | 1 file | -2/+106
    Extends tests from vector insert half-word.

    Differential Revision: https://reviews.llvm.org/D34497
    llvm-svn: 317503
* [PPC] Use xxbrd to speed up bswap64 | Guozhi Wei | 2017-11-06 | 1 file | -2/+23
    Power doesn't have bswap instructions, so llvm generates the following code sequence for bswap64:

        rotldi 5, 3, 16
        rotldi 4, 3, 8
        rotldi 9, 3, 24
        rotldi 10, 3, 32
        rotldi 11, 3, 48
        rotldi 12, 3, 56
        rldimi 4, 5, 8, 48
        rldimi 4, 9, 16, 40
        rldimi 4, 10, 24, 32
        rldimi 4, 11, 40, 16
        rldimi 4, 12, 48, 8
        rldimi 4, 3, 56, 0

    But Power9 has vector bswap instructions, and they can also be used to speed up the scalar bswap intrinsic. With this patch, bswap64 can be translated to:

        mtvsrdd 34, 3, 3
        xxbrd 34, 34
        mfvsrld 3, 34

    Differential Revision: https://reviews.llvm.org/D39510
    llvm-svn: 317499
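For reference, the operation both instruction sequences compute is a byte reversal of a 64-bit value. The sketch below is a portable scalar model with the hypothetical name `bswap64Ref`; it is meant to show the semantics only and says nothing about how LLVM selects the instructions.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the eight bytes of V by peeling off the lowest byte of V and
// shifting it into the result, once per byte.
inline uint64_t bswap64Ref(uint64_t V) {
  uint64_t R = 0;
  for (int I = 0; I < 8; ++I) {
    R = (R << 8) | (V & 0xFFu);
    V >>= 8;
  }
  return R;
}
```

Applying the function twice returns the original value, which is a convenient sanity check for any byte-swap lowering.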
* Adds code to PPC ISEL lowering to recognize half-word inserts from vector_shuffles, and use P9 shift and vector insert instructions instead of vperm. | Graham Yiu | 2017-11-01 | 1 file | -0/+119
    Differential Revision: https://reviews.llvm.org/D34160
    llvm-svn: 317111
* [PowerPC] Eliminate sign- and zero-extensions if already sign- or zero-extended | Hiroshi Inoue | 2017-10-16 | 1 file | -0/+4
    This patch enables redundant sign- and zero-extension elimination in the PowerPC MI Peephole pass. If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated.

    One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in the PPC ELF ABI. For example, for the following simple code, two extsw instructions are generated, one before the invocation of int_func and one before the return. With this patch, both extsw are eliminated.

        void int_func(int);
        void ii_test(int a) {
          if (a & 1)
            return int_func(a);
        }

    Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling the LLVM+CLANG.

    Differential Revision: https://reviews.llvm.org/D31319
    llvm-svn: 315888
* DAG: Add opcode and source type to isFPExtFree | Matt Arsenault | 2017-10-13 | 1 file | -2/+3
    This is only currently used for mad/fma transforms. This is the only case where it should be used for AMDGPU, so add an opcode to be sure.

    llvm-svn: 315740
* [PowerPC] Don't use xscvdpspn on the P7 | Hal Finkel | 2017-09-06 | 1 file | -3/+6
    xscvdpspn was not introduced until the P8, so don't use it on the P7. Fixes a regression introduced in r288152.

    llvm-svn: 312612
* [PPC][NFC] Renaming things with 'xxinsert' moniker to 'vecinsert' to make it more general. | Tony Jiang | 2017-09-05 | 1 file | -4/+4
    Commit on behalf of Graham Yiu (gyiu@ca.ibm.com)

    llvm-svn: 312547
* [PPC] Refine checks for emitting TOC restore nop and tail-call eligibility. | Sean Fertile | 2017-08-21 | 1 file | -6/+17
    For the medium and large code models we only need to check if a call crosses dso-boundaries when considering tail-call eligibility.

    Differential Revision: https://reviews.llvm.org/D34245
    llvm-svn: 311353
* [PowerPC] Don't crash on larger splats achieved through 1-byte splats | Nemanja Ivanovic | 2017-08-08 | 1 file | -0/+9
    We've implemented a 1-byte splat using XXSPLTISB on P9. However, LLVM will produce a 1-byte splat even for wider element BUILD_VECTOR nodes. This patch prevents crashing in that situation.

    Differential Revision: https://reviews.llvm.org/D35650
    llvm-svn: 310358
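To see why a wider-element splat can still be a 1-byte splat, consider a 32-bit element whose four bytes are identical: splatting that element across a vector fills every byte with the same value. A minimal sketch of such a test follows; `isByteSplat32` is a hypothetical helper for illustration, not the check used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if all four bytes of the 32-bit element V are identical, in
// which case splatting V is equivalent to splatting its low byte.
inline bool isByteSplat32(uint32_t V) {
  const uint32_t Byte = V & 0xFFu;
  // Multiplying by 0x01010101 replicates the byte into all four positions.
  return V == Byte * 0x01010101u;
}
```

For example, a word splat of 0x01010101 is also a byte splat of 0x01, whereas a word splat of 0x01020102 is not a byte splat.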
* Delete Default and JITDefault code models | Rafael Espindola | 2017-08-03 | 1 file | -2/+0
    IMHO it is an antipattern to have an enum value that is Default.

    At any given piece of code it is not clear if we have to handle Default or if it has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done.

    This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified.

    llvm-svn: 309911
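The pattern argued for here can be sketched with the standard library. Note the assumptions: this uses C++17 `std::optional` as a stand-in for LLVM's `Optional`, the enum below is a local illustration rather than LLVM's `CodeModel`, and `resolveCodeModel` is a hypothetical name.

```cpp
#include <cassert>
#include <optional>

// Local illustration: only concrete code models, no Default value.
enum class CodeModel { Small, Kernel, Medium, Large };

// The "unspecified" state lives in the optional, so it has to be resolved
// explicitly, here against a default that the target supplies.
inline CodeModel resolveCodeModel(std::optional<CodeModel> Requested,
                                  CodeModel TargetDefault) {
  return Requested.value_or(TargetDefault);
}
```

Once resolved, downstream code only ever sees a concrete `CodeModel`, so no switch has to account for a Default case.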
* [Power9] Exploit vector absolute difference instructions on Power 9 | Stefan Pintilie | 2017-08-02 | 1 file | -1/+37
    Power 9 has instructions to do absolute difference (VABSDUB, VABSDUH, VABSDUW) for byte, halfword and word. We should take advantage of these.

    Differential Revision: https://reviews.llvm.org/D34684
    llvm-svn: 309876
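As an illustration of what a byte-wise unsigned absolute difference computes, here is a scalar reference model. The names `V16U8`, `absDiffU8`, and `absDiffBytes` are hypothetical, and the model assumes 16 unsigned byte lanes per vector; it does not describe how the patch matches the pattern in the DAG.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V16U8 = std::array<uint8_t, 16>;

// Unsigned absolute difference of two bytes: |A - B| without wrapping.
inline uint8_t absDiffU8(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(A > B ? A - B : B - A);
}

// Lane-wise absolute difference over 16 unsigned bytes.
inline V16U8 absDiffBytes(const V16U8 &A, const V16U8 &B) {
  V16U8 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = absDiffU8(A[I], B[I]);
  return R;
}
```

The halfword and word variants would follow the same shape with wider lanes.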
* Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI. | Peter Collingbourne | 2017-07-26 | 1 file | -14/+14
    This was a use-after-free waiting to happen.

    llvm-svn: 309159
* [SystemZ, LoopStrengthReduce] | Jonas Paulsson | 2017-07-21 | 1 file | -1/+1
    This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory.

    In order to achieve this, the following common code changes were made:

      * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers.
      * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores.

    SystemZ changes:

      * isLSRCostLess() implemented with Insns first, and without ImmCost.
      * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers.

    Review: Ulrich Weigand, Quentin Colombet
    https://reviews.llvm.org/D35262
    https://reviews.llvm.org/D35049
    llvm-svn: 308729
* [PowerPC] Ensure displacements for DQ-Form instructions are multiples of 16 | Nemanja Ivanovic | 2017-07-13 | 1 file | -9/+9
    As outlined in the PR, we didn't ensure that displacements for DQ-Form instructions are multiples of 16. Since the instruction encoding encodes a quad-word displacement, a sub-16 byte displacement is meaningless and ends up being encoded incorrectly.

    Fixes https://bugs.llvm.org/show_bug.cgi?id=33671.

    Differential Revision: https://reviews.llvm.org/D35007
    llvm-svn: 307934
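The constraint can be sketched as a simple predicate. This is an illustration, not the check added by the patch: `isLegalDQFormDisp` is a hypothetical name, and the 16-bit signed range is an assumption about the overall displacement width that is not stated in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// A displacement is usable for a DQ-Form access only if it is a multiple of
// 16, because the encoding stores the displacement in quad-word units.
// Assumption: the byte displacement must also fit in a signed 16-bit value.
inline bool isLegalDQFormDisp(int64_t Disp) {
  const bool FitsS16 = Disp >= INT16_MIN && Disp <= INT16_MAX;
  return FitsS16 && (Disp % 16 == 0);
}
```

A displacement of 32 passes, while 8 or 40 would have to be handled some other way, for instance by materializing the address in a register.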
* [PPC CodeGen] Expand the bitreverse.i64 intrinsic. | Tony Jiang | 2017-07-10 | 1 file | -0/+1
    Differential Revision: https://reviews.llvm.org/D34908
    Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093
    llvm-svn: 307563
* [PowerPC] Reduce register pressure by not materializing a constant just for use as an index register for X-Form loads/stores. | Lei Huang | 2017-07-10 | 1 file | -4/+9
    For this example:

        float test (int *arr) {
          return arr[2];
        }

    We currently generate the following code:

        li r4, 8
        lxsiwax f0, r3, r4
        xscvsxdsp f1, f0

    With this patch, we will now generate:

        addi r3, r3, 8
        lxsiwax f0, 0, r3
        xscvsxdsp f1, f0

    Originally reported in: https://bugs.llvm.org/show_bug.cgi?id=27204

    Differential Revision: https://reviews.llvm.org/D35027
    llvm-svn: 307553
* fix typos in comments and error messages; NFC | Hiroshi Inoue | 2017-07-10 | 1 file | -3/+3
    llvm-svn: 307533
* [PowerPC] NFC : Common up definitions of isIntS16Immediate and update parameter to int16_t | Lei Huang | 2017-07-07 | 1 file | -7/+7
    llvm-svn: 307442
* [PPC CodeGen] Expand the bitreverse.i32 intrinsic. | Tony Jiang | 2017-07-07 | 1 file | -0/+3
    Differential Revision: https://reviews.llvm.org/D33572
    Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093
    llvm-svn: 307413
* [PowerPC] Fix -Wimplicit-fallthrough warnings. NFCI. | Simon Pilgrim | 2017-07-07 | 1 file | -0/+4
    llvm-svn: 307382
* [Power9] Exploit vector integer extend instructions when indices aren't correct. | Tony Jiang | 2017-07-05 | 1 file | -0/+136
    This patch adds on to the exploitation added by https://reviews.llvm.org/D33510. This now catches build vector nodes where the inputs are coming from sign extended vector extract elements where the indices used by the vector extract are not correct. We can still use the new hardware instructions by adding a shuffle to move the elements to the correct indices. I introduced a new PPCISD node here because adding a vector_shuffle and changing the elements of the vector_extracts was getting undone by another DAG combine.

    Commit on behalf of Zaara Syeda (syzaara@ca.ibm.com)

    Differential Revision: https://reviews.llvm.org/D34009
    llvm-svn: 307169
* Tidy up some calls to getRegister for readability. | Eric Christopher | 2017-06-17 | 1 file | -5/+6
    llvm-svn: 305626
* Test commit - NFC. | Lei Huang | 2017-06-14 | 1 file | -1/+1
    Modified a comment to confirm commit access functionality.

    llvm-svn: 305402
* Test commit - NFC. | Kit Barton | 2017-06-13 | 1 file | -1/+1
    Modified a comment to confirm commit access functionality.

    llvm-svn: 305309
* PPCISelLowering.cpp: Fix warnings in r305214. [-Wdocumentation] | NAKAMURA Takumi | 2017-06-13 | 1 file | -3/+3
    llvm-svn: 305277
* [PowerPC] Match vec_revb builtins to P9 instructions. | Tony Jiang | 2017-06-12 | 1 file | -7/+69
    Power9 has instructions that will reverse the bytes within an element for all sizes (half-word, word, double-word and quad-word). These can be used for the vec_revb builtins in altivec.h. However, we implement these to match vector shuffle nodes as that will cover both the builtins and vector shuffles that occur in the SDAG through other means.

    Differential Revision: https://reviews.llvm.org/D33690
    llvm-svn: 305214
* [Power9] Added support for the modsw, moduw, modsd, modud hardware instructions. | Tony Jiang | 2017-06-12 | 1 file | -5/+32
    Note that if we need the result of both the divide and the modulo then we compute the modulo based on the result of the divide and not using the new hardware instruction.

    Commit on behalf of STEFAN PINTILIE.

    Differential Revision: https://reviews.llvm.org/D33940
    llvm-svn: 305210
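The note about needing both results relies on the identity rem = a - (a / b) * b, which lets the remainder be derived from a quotient that has already been computed. A minimal sketch, using the hypothetical names `DivRem` and `divRem`:

```cpp
#include <cassert>
#include <cstdint>

struct DivRem {
  int32_t Quot;
  int32_t Rem;
};

// Compute both quotient and remainder with a single division. The remainder
// is derived from the quotient by a multiply and a subtract.
inline DivRem divRem(int32_t A, int32_t B) {
  // Division by zero and INT32_MIN / -1 are undefined in C++.
  assert(B != 0 && !(A == INT32_MIN && B == -1));
  const int32_t Q = A / B;
  const int32_t R = A - Q * B;
  return {Q, R};
}
```

Since C++ division truncates toward zero, the derived remainder takes the sign of the dividend, matching the `%` operator.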
* [DAG] add helper to bind memop chains; NFCI | Sanjay Patel | 2017-06-12 | 1 file | -0/+1
    This step is just intended to reduce code duplication rather than change any functionality.

    A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper.

    Differential Revision: https://reviews.llvm.org/D33649
    llvm-svn: 305192
* Sort the remaining #include lines in include/... and lib/.... | Chandler Carruth | 2017-06-06 | 1 file | -5/+5
    I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.

    I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch.

    This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.

    Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).

    llvm-svn: 304787
* [PPC] Inline expansion of memcmp | Zaara Syeda | 2017-05-31 | 1 file | -0/+4
    This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality.

    Differential Revision: https://reviews.llvm.org/D28637
    llvm-svn: 304313
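The shape of such an expansion can be sketched in source form for one fixed case: a 16-byte compare with an assumed maximum load size of 8 bytes. The names `loadBE64` and `memcmp16` are hypothetical, and the real expansion operates on IR in CodeGenPrepare rather than on C++ source. The loads are assembled in big-endian order so that an integer comparison agrees with the byte-wise ordering memcmp defines.

```cpp
#include <cassert>
#include <cstdint>

// Assemble 8 bytes into an integer with the first byte most significant, so
// unsigned integer order matches lexicographic byte order.
inline uint64_t loadBE64(const unsigned char *P) {
  uint64_t V = 0;
  for (int I = 0; I < 8; ++I)
    V = (V << 8) | P[I];
  return V;
}

// Straight-line equivalent of memcmp(A, B, 16): two 8-byte chunks, with an
// early exit as soon as a chunk differs.
inline int memcmp16(const void *A, const void *B) {
  const auto *PA = static_cast<const unsigned char *>(A);
  const auto *PB = static_cast<const unsigned char *>(B);

  const uint64_t A0 = loadBE64(PA), B0 = loadBE64(PB);
  if (A0 != B0)
    return A0 < B0 ? -1 : 1;

  const uint64_t A1 = loadBE64(PA + 8), B1 = loadBE64(PB + 8);
  if (A1 != B1)
    return A1 < B1 ? -1 : 1;

  return 0;
}
```

When the result is only compared against zero, the ordering no longer matters, which is what makes the equality special case mentioned in the commit message cheaper.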
* [PowerPC] Fix a performance bug for PPC::XXPERMDI. | Tony Jiang | 2017-05-31 | 1 file | -12/+94
    Some VectorShuffle nodes in the SDAG can be selected to the XXPERMDI instruction. This patch recognizes them and does the selection to improve PPC performance.

    Differential Revision: https://reviews.llvm.org/D33404
    llvm-svn: 304298
* [SelectionDAG] Set ISD::FPOWI to Expand by default | Craig Topper | 2017-05-30 | 1 file | -3/+0
    Summary: Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie".

    This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default.

    Reviewers: spatel, RKSimon, efriedma
    Reviewed By: RKSimon
    Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits

    Differential Revision: https://reviews.llvm.org/D33530
    llvm-svn: 304215
* [PPC] Add text for assert. | Tim Shen | 2017-05-25 | 1 file | -1/+1
    llvm-svn: 303940
* [PPC] Fix atomics lowering in DAG lowering. | Tim Shen | 2017-05-25 | 1 file | -1/+3
    I forgot to forward the chain, causing some missing instruction dependencies. The test crashes the compiler without this patch.

    Inspired by the test case, D33519 also tries to remove the extra sync.

    Differential Revision: https://reviews.llvm.org/D33573
    llvm-svn: 303931