summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* [PPC] Heuristic to choose between a X-Form VSX ld/st vs a X-Form FP ld/st.Tony Jiang2017-11-205-46/+158
| | | | | | | | | | | | | | | | | | | | The VSX versions have the advantage of a full 64-register target whereas the FP ones have the advantage of lower latency and higher throughput. So what we’re after is using the faster instructions in low register pressure situations and using the larger register file in high register pressure situations. The heuristic chooses between the following 7 pairs of instructions. PPC::LXSSPX vs PPC::LFSX PPC::LXSDX vs PPC::LFDX PPC::STXSSPX vs PPC::STFSX PPC::STXSDX vs PPC::STFDX PPC::LXSIWAX vs PPC::LFIWAX PPC::LXSIWZX vs PPC::LFIWZX PPC::STXSIWX vs PPC::STFIWX Differential Revision: https://reviews.llvm.org/D38486 llvm-svn: 318651
* Fix a bunch more layering of CodeGen headers that are in TargetDavid Blaikie2017-11-1715-18/+18
| | | | | | | | All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490
* [PPC] Change i32 constant in store instruction to i64Guozhi Wei2017-11-161-1/+16
| | | | | | | | This patch changes all i32 constant in store instruction to i64 with truncation, to increase the chance that the referenced constant can be shared with other i64 constant. Differential Revision: https://reviews.llvm.org/D39352 llvm-svn: 318436
* Add backend name to Target to enable runtime info to be fed back into TableGenDaniel Sanders2017-11-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | Summary: Make it possible to feed runtime information back to tablegen to enable profile-guided tablegen-eration, detection of untested tablegen definitions, etc. Being a cross-compiler by nature, LLVM will potentially collect data for multiple architectures (e.g. when running 'ninja check'). We therefore need a way for TableGen to figure out what data applies to the backend it is generating at the time. This patch achieves that by including the name of the 'def X : Target ...' for the backend in the TargetRegistry. Reviewers: qcolombet Reviewed By: qcolombet Subscribers: jholewinski, arsenm, jyknight, aditya_nandakumar, sdardis, nemanjai, ab, nhaehnle, t.p.northover, javed.absar, qcolombet, llvm-commits, fedor.sergeev Differential Revision: https://reviews.llvm.org/D39742 llvm-svn: 318352
* [PowerPC] Implement mayBeEmittedAsTailCall for PPCSean Fertile2017-11-152-0/+39
| | | | | | | | | Implements TargetLowering callback 'mayBeEmittedAsTailCall' that enables CodeGenPrepare to duplicate returns when they might enable a tail-call. Differential Revision: https://reviews.llvm.org/D39777 llvm-svn: 318321
* [PowerPC] Split out the tailcall calling convention checks. NFC.Sean Fertile2017-11-151-11/+19
| | | | | | | | Move the calling convention checks for tail-call eligibility for the 64-bit SysV ABI into a separate function. This is so that it can be shared with 'mayBeEmittedAsTailCall' in a subsequent change. llvm-svn: 318305
* [PowerPC] fix up in redundant compare eliminationHiroshi Inoue2017-11-151-2/+6
| | | | | | This patch fixes a potential problem in my previous commit (https://reviews.llvm.org/rL312514) by introducing an additional check. llvm-svn: 318266
* Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layeringDavid Blaikie2017-11-085-7/+7
| | | | | | | | This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the layering of its implementation. llvm-svn: 317647
* Use new vector insert half-word and byte instructions when we see ↵Graham Yiu2017-11-072-3/+33
| | | | | | | | insertelement on '8 x i16' and '16 x i8' types. Also extended existing lit testcase to cover these cases. Differential Revision: https://reviews.llvm.org/D34630 llvm-svn: 317613
* Fix buildbot breakages from r317503. Add parentheses to assignment when ↵Graham Yiu2017-11-061-2/+2
| | | | | | using result as a condition. llvm-svn: 317508
* Adds code to PPC ISEL lowering to recognize byte inserts from ↵Graham Yiu2017-11-063-3/+117
| | | | | | | | vector_shuffles, and use P9 shift and vector insert byte instructions instead of vperm. Extends tests from vector insert half-word. Differential Revision: https://reviews.llvm.org/D34497 llvm-svn: 317503
* [PPC] Use xxbrd to speed up bswap64Guozhi Wei2017-11-062-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | Power doesn't have bswap instructions, so llvm generates following code sequence for bswap64. rotldi 5, 3, 16 rotldi 4, 3, 8 rotldi 9, 3, 24 rotldi 10, 3, 32 rotldi 11, 3, 48 rotldi 12, 3, 56 rldimi 4, 5, 8, 48 rldimi 4, 9, 16, 40 rldimi 4, 10, 24, 32 rldimi 4, 11, 40, 16 rldimi 4, 12, 48, 8 rldimi 4, 3, 56, 0 But Power9 has vector bswap instructions, they can also be used to speed up scalar bswap intrinsic. With this patch, bswap64 can be translated to: mtvsrdd 34, 3, 3 xxbrd 34, 34 mfvsrld 3, 34 Differential Revision: https://reviews.llvm.org/D39510 llvm-svn: 317499
* Move TargetFrameLowering.h to CodeGen where it's implementedDavid Blaikie2017-11-033-3/+3
| | | | | | | | | | | This header already includes a CodeGen header and is implemented in lib/CodeGen, so move the header there to match. This fixes a link error with modular codegeneration builds - where a header and its implementation are circularly dependent and so need to be in the same library, not split between two like this. llvm-svn: 317379
* Adds code to PPC ISEL lowering to recognize half-word inserts from ↵Graham Yiu2017-11-013-5/+139
| | | | | | | | vector_shuffles, and use P9 shift and vector insert instructions instead of vperm. Differential Revision: https://reviews.llvm.org/D34160 llvm-svn: 317111
* Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat"Stefan Pintilie2017-10-301-47/+0
| | | | | | | | | | Revert r316478. A test case has failed. Will recommit this change once we find and fix the failure. This reverts commit 7c330fabaedaba3d02c58bc3cc1198896c895f34. llvm-svn: 316952
* [PPC CodeGen] Fix the bitreverse.i64 intrinsic.Fangrui Song2017-10-301-71/+34
| | | | | | | | | | | | Summary: The two 32-bit words were swapped. Update a test omitted in reverted r316270. Reviewers: jtony, aaron.ballman Subscribers: nemanjai, kbarton Differential Revision: https://reviews.llvm.org/D39163 llvm-svn: 316916
* [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).Clement Courbet2017-10-302-4/+13
| | | | | | | | | | | | - Targets that want to support memcmp expansions now return the list of supported load sizes. - Expansion codegen does not assume that all power-of-two load sizes smaller than the max load size are valid. For examples, this is not the case for x86(32bit)+sse2. Fixes PR34887. llvm-svn: 316905
* [PowerPC] Use record-form instruction for Less-or-Equal -1 and ↵Hiroshi Inoue2017-10-261-30/+39
| | | | | | | | | | | Greater-or-Equal 1 Currently a record-form instruction is used for comparison of "greater than -1" and "less than 1" by modifying the predicate (e.g. LT 1 into LE 0) in addition to the naive case of comparison against 0. This patch also enables emitting a record-form instruction for "less than or equal to -1" (i.e. "less than 0") and "greater than or equal to 1" (i.e. "greater than 0") to increase the optimization opportunities. Differential Revision: https://reviews.llvm.org/D38941 llvm-svn: 316647
* [PowerPC] Try to simplify a Swap if it feeds a SplatStefan Pintilie2017-10-241-0/+47
| | | | | | | | | | | | If we have the situation where a Swap feeds a Splat we can sometimes change the index on the Splat and then remove the Swap instruction. Fixed the test case that was failing and recommit after pulling the original commit. Original revision is here: https://reviews.llvm.org/D39009 llvm-svn: 316478
* PowerPC: support the separator character in the IASSaleem Abdulrasool2017-10-241-0/+1
| | | | | | | PowerPC uses ; as a comment leader and the @ as a separator character. Support this properly. llvm-svn: 316454
* Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat"Stefan Pintilie2017-10-231-47/+0
| | | | | | | | | Revert commit r316366. Previous commit causes p8-scalar_vector_conversions.ll to fail. This reverts commit 990e764ad8a2eec206ce5dda6aefab059ccd4e92. llvm-svn: 316371
* [PowerPC] Try to simplify a Swap if it feeds a SplatStefan Pintilie2017-10-231-0/+47
| | | | | | | | | If we have the situation where a Swap feeds a Splat we can sometimes change the index on the Splat and then remove the Swap instruction. Differential Revision: https://reviews.llvm.org/D39009 llvm-svn: 316366
* Reverting r316270 due to failing build bots.Aaron Ballman2017-10-211-8/+6
| | | | | | | http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/12899 http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7951 llvm-svn: 316276
* [PPC CodeGen] Fix the bitreverse.i64 intrinsic.Fangrui Song2017-10-211-6/+8
| | | | | | | | | | Summary: The two 32-bit words were swapped. Subscribers: nemanjai, kbarton Differential Revision: https://reviews.llvm.org/D38705 llvm-svn: 316270
* Disabling the transformation introduced in r315888Nemanja Ivanovic2017-10-201-2/+2
| | | | | | | The commit at https://reviews.llvm.org/rL315888 is causing some failures with internal testing. Disabling this code until we can resolve the issues. llvm-svn: 316199
* The cost of splitting a large vector instruction is not being taken into ↵Graham Yiu2017-10-192-0/+13
| | | | | | | | | | account by the getUserCost function. This was leading to some loops being over unrolled. The cost of a vector instruction is now being multiplied by the cost of the type legalization. This will return a more accurate cost. Committing on behalf on Brad Nemanich (brad.nemanich@ibm.com) Differential Revision: https://reviews.llvm.org/D38961 llvm-svn: 316174
* [PowerPC] Use helper functions to check sign-/zero-extended valueHiroshi Inoue2017-10-181-23/+11
| | | | | | | | | | | Helper functions to identify sign- and zero-extending machine instruction is introduced in rL315888. This patch makes PPCInstrInfo::optimizeCompareInstr use the helper functions. It simplifies the code and also makes possible more optimizations since the helper can do more analysis than the original check code; I observed about 5000 more compare instructions are eliminated while building LLVM. Also, this patch fixes a bug in helpers on ANDIo instruction handling due to the order of checks. This bug causes a failure in an existing test case for optimizeCompareInstr. Differential Revision: https://reviews.llvm.org/D38988 llvm-svn: 316071
* Add iterator range MachineRegisterInfo::liveins(), adopt users, NFCKrzysztof Parzyszek2017-10-161-5/+3
| | | | llvm-svn: 315927
* [PowerPC] fix up in sign-/zero-extension eliminationHiroshi Inoue2017-10-161-0/+2
| | | | | | This patch fixes a potential problem in my previous commit (https://reviews.llvm.org/rL315888) by adding a null check. llvm-svn: 315900
* [PowerPC] Eliminate sign- and zero-extensions if already sign- or zero-extendedHiroshi Inoue2017-10-166-0/+506
| | | | | | | | | | | | | | | | | | This patch enables redundant sign- and zero-extension elimination in PowerPC MI Peephole pass. If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated. One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in PPC ELF ABI. For example of the following simple code, two extsw instructions are generated before the invocation of int_func and before the return. With this patch, both extsw are eliminated. void int_func(int); void ii_test(int a) { if (a & 1) return int_func(a); } Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling the LLVM+CLANG. Differential Revision: https://reviews.llvm.org/D31319 llvm-svn: 315888
* Reverting r315590; it did not include changes for llvm-tblgen, which is ↵Aaron Ballman2017-10-151-1/+1
| | | | | | | | causing link errors for several people. Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1 llvm-svn: 315854
* DAG: Add opcode and source type to isFPExtFreeMatt Arsenault2017-10-132-3/+4
| | | | | | | | This is only currently used for mad/fma transforms. This is the only case where it should be used for AMDGPU, so add an opcode to be sure. llvm-svn: 315740
* Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"Matthias Braun2017-10-122-6/+6
| | | | | | | | | | Reverting to investigate layering effects of MCJIT not linking libCodeGen but using TargetMachine::getNameWithPrefix() breaking the lldb bots. This reverts commit r315633. llvm-svn: 315637
* TargetMachine: Merge TargetMachine and LLVMTargetMachineMatthias Braun2017-10-122-6/+6
| | | | | | | | | | | | | | | Merge LLVMTargetMachine into TargetMachine. - There is no in-tree target anymore that just implements TargetMachine but not LLVMTargetMachine. - It should still be possible to stub out all the various functions in case a target does not want to use lib/CodeGen - This simplifies the code and avoids methods ending up in the wrong interface. Differential Revision: https://reviews.llvm.org/D38489 llvm-svn: 315633
* [PowerPC] Add profitablilty check for conversion to mtctr loopsLei Huang2017-10-121-1/+32
| | | | | | | | | | | | | | | Add profitability checks for modifying counted loops to use the mtctr instruction. The latency of mtctr is only justified if there are more than 4 comparisons that will be removed as a result. Usually counted loops are formed relatively early and before unrolling, so most low trip count loops often don't survive. However we want to ensure that if they do, we do not mistakenly update them to mtctr loops. Use CodeMetrics to ensure we are only doing this for small loops with small trip counts. Differential Revision: https://reviews.llvm.org/D38212 llvm-svn: 315592
* [dump] Remove NDEBUG from test to enable dump methods [NFC]Don Hinton2017-10-121-1/+1
| | | | | | | | | | | | | | | Summary: Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP. Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods. Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so it'll be picked up by public headers. Differential Revision: https://reviews.llvm.org/D38406 llvm-svn: 315590
* [PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex ↵Lei Huang2017-10-112-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | elimination to only use `lis/addi` if necessary. Currently we produce a bunch of unnecessary code when emitting the prologue/epilogue for spills/restores. Namely, if the load from stack slot/store to stack slot instruction is an X-Form instruction, we will always produce an LIS/ORI sequence for the stack offset. Furthermore, we have not exploited the P9 vector D-Form loads/stores for this purpose. This patch address both issues. Specifying the D-Form load as the instruction to use for stack spills/reloads should be safe because: 1. The stack should be aligned according to the ABI 2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will check for the offset being a multiple of 16 and will convert it to an X-Form instruction if it isn't. Differential Revision : https://reviews.llvm.org/D38758 llvm-svn: 315500
* [Asm] Add debug tracing in table-generated assembly matcherOliver Stannard2017-10-111-2/+1
| | | | | | | | | | | | | This adds debug tracing to the table-generated assembly instruction matcher, enabled by the -debug-only=asm-matcher option. The changes in the target AsmParsers are to add an MCInstrInfo reference under a consistent name, so that we can use it from table-generated code. This was already being used this way for targets that use deprecation warnings, but 5 targets did not have it, and Hexagon had it under a different name to the other backends. llvm-svn: 315445
* [MC] Add a missing <memory> include left out of r315327.Lang Hames2017-10-101-0/+1
| | | | llvm-svn: 315331
* [MC] Thread unique_ptr<MCObjectWriter> through the create.*ObjectWriterLang Hames2017-10-104-14/+19
| | | | | | | | | | functions. This makes the ownership of the resulting MCObjectWriter clear, and allows us to remove one instance of MCObjectStreamer's bizarre "holding ownership via someone else's reference" trick. llvm-svn: 315327
* [PowerPC] Add missing record form instructions to the P9 Scheduling ModelStefan Pintilie2017-10-102-1/+32
| | | | | | | | | A number of record form instructions were missing from the P9 scheduling model. Added those instructions and marked the P9 model as complete. Differential Revision: https://reviews.llvm.org/D38560 llvm-svn: 315313
* Fix for PR34888.Nemanja Ivanovic2017-10-101-3/+4
| | | | | | | | | | The issue is that we assume operand zero of the input to the add instruction is a register. In this case, the input comes from inline assembly and operand zero is not a register thereby causing a crash. The code will bail anyway if the input instruction doesn't have the right opcode. So do that check first and let short-circuiting prevent the crash. llvm-svn: 315285
* [MC] Plumb unique_ptr<MCELFObjectTargetWriter> through createELFObjectWriter toLang Hames2017-10-091-2/+2
| | | | | | | | | | ELFObjectWriter's constructor. Fixes the same ownership issue for ELF that r315245 did for MachO: ELFObjectWriter takes ownership of its MCELFObjectTargetWriter, so we want to pass this through to the constructor via a unique_ptr, rather than a raw ptr. llvm-svn: 315254
* [MC] Plumb unique_ptr<MCMachObjectTargetWriter> through createMachObjectWriterLang Hames2017-10-091-1/+1
| | | | | | | | | | | to MCObjectWriter's constructor. MCObjectWriter takes ownership of its MCMachObjectTargetWriter argument -- this patch plumbs that ownership relationship through the constructor (which previously took raw MCMachObjectTargetWriter*) and the createMachObjectWriter function. llvm-svn: 315245
* Remove unused variables. No functionality change.Benjamin Kramer2017-10-081-2/+0
| | | | llvm-svn: 315185
* [PowerPC] Revert P9 scheduling model to incompleteStefan Pintilie2017-10-031-1/+1
| | | | | | | Partially revert a previous change from commit: https://llvm.org/svn/llvm-project/llvm/trunk@314026 The previous change caused regressions on Power 9. llvm-svn: 314835
* [trivial] fix format, NFCHiroshi Inoue2017-10-031-1/+1
| | | | llvm-svn: 314769
* [PowerPC] support ZERO_EXTEND in tryBitPermutationHiroshi Inoue2017-10-021-17/+64
| | | | | | | | | | | | | | | | | | | This patch add a support of ISD::ZERO_EXTEND in PPCDAGToDAGISel::tryBitPermutation to increase the opportunity to use rotate-and-mask by reordering ZEXT and ANDI. Since tryBitPermutation stops analyzing nodes if it hits a ZEXT node while traversing SDNodes, we want to avoid ZEXT between two nodes that can be folded into a rotate-and-mask instruction. For example, we allow these nodes t9: i32 = add t7, Constant:i32<1> t11: i32 = and t9, Constant:i32<255> t12: i64 = zero_extend t11 t14: i64 = shl t12, Constant:i64<2> to be folded into a rotate-and-mask instruction. Such case often happens in array accesses with logical AND operation in the index, e.g. array[i & 0xFF]; Differential Revision: https://reviews.llvm.org/D37514 llvm-svn: 314655
* [PowerPC] eliminate partially redundant compare instructionHiroshi Inoue2017-09-281-14/+180
| | | | | | | | | | | | | | | | | | This is a follow-on of D37211. D37211 eliminates a compare instruction if two conditional branches can be made based on the one compare instruction, e.g. if (a == 0) { ... } else if (a < 0) { ... } This patch extends this optimization to support partially redundant cases, which often happen in while loops. For example, one compare instruction is moved from the loop body into the preheader by this optimization in the following example. do { if (a == 0) dummy1(); a = func(a); } while (a > 0); Differential Revision: https://reviews.llvm.org/D38236 llvm-svn: 314390
* [PowerPC] eliminate unconditional branch to the next instructionHiroshi Inoue2017-09-271-0/+14
| | | | | | | | | This patch makes analyzeBranch eliminate unconditional branch to the next instruction. After basic blocks are re-organized by optimizers, such as machine block placement, a BB may end with an unconditional branch to the next (fallthrough) BB. This patch removes such redundant branch instruction. Differential Revision: https://reviews.llvm.org/D37730 llvm-svn: 314297
OpenPOWER on IntegriCloud