path: root/llvm/lib/Target/PowerPC/PPCInstrAltivec.td
Commit message | Author | Age | Files | Lines
* [PowerPC] Swap arguments and adjust shift count for vsldoi on little endian | Bill Schmidt | 2014-08-05 | 1 | -7/+20
  Commits r213915 and r214718 fix recognition of shuffle masks for vmrg* and vpku*um instructions for a little-endian target, by swapping the input arguments. The vsldoi instruction requires similar treatment, and also needs its shift count adjusted for little endian.
  Reviewed by Ulrich Weigand.
  This is a bug fix candidate for release 3.5 (and hopefully the last of those for PowerPC).
  llvm-svn: 214923
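  The combination of swapped inputs and an adjusted shift count can be checked with a small model. This is only a sketch under stated assumptions: registers are modelled as 16-element lists in big-endian byte order, the helper names are hypothetical, and the `16 - s` adjustment is derived from this model rather than quoted from the patch.

```python
def vsldoi(a, b, sh):
    """Model of vsldoi in big-endian byte order: bytes sh..sh+15 of a||b."""
    assert len(a) == len(b) == 16 and 0 <= sh <= 15
    return (a + b)[sh:sh + 16]


def le_shuffle_reference(v1, v2, s):
    """Shuffle with mask [s, s+1, ..., s+15] in little-endian element numbering.

    Registers are lists in big-endian byte order, so little-endian
    element i is found at list index 15 - i.
    """
    concat_le = v1[::-1] + v2[::-1]
    return concat_le[s:s + 16][::-1]


def le_shuffle_via_vsldoi(v1, v2, s):
    """Same shuffle expressed with vsldoi: swapped inputs, shift count 16 - s."""
    return vsldoi(v2, v1, 16 - s)


v1 = list(range(0, 16))
v2 = list(range(16, 32))
for s in range(1, 16):
    assert le_shuffle_via_vsldoi(v1, v2, s) == le_shuffle_reference(v1, v2, s)
```

  In this model the unswapped form `vsldoi(v1, v2, s)` does not reproduce the little-endian result for any non-zero shift, which is consistent with the commit's statement that both the operand order and the shift count need adjusting.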
* [PowerPC] Swap arguments to vpkuhum/vpkuwum on little-endian | Ulrich Weigand | 2014-08-04 | 1 | -8/+22
  In commit r213915, Bill fixed little-endian usage of vmrgh* and vmrgl* by swapping the input arguments. As it turns out, the exact same fix is also required for the vpkuhum/vpkuwum patterns.
  This fixes another regression in llvmpipe when vector support is enabled.
  Reviewed by Bill Schmidt.
  llvm-svn: 214718
* Don't use additional arguments for dss and friends to satisfy DSS_Form, | Joerg Sonnenberger | 2014-08-02 | 1 | -61/+53
  when let can do the same thing. Keep the 64bit variants as codegen-only. While they have a different register class, the encoding is the same for 32bit and 64bit mode. Having both present would otherwise confuse the disassembler.
  llvm-svn: 214636
* [PATCH][PPC64LE] Correct little-endian usage of vmrgh* and vmrgl*. | Bill Schmidt | 2014-07-25 | 1 | -24/+56
  Because the PowerPC vmrgh* and vmrgl* instructions have a built-in big-endian bias, it is necessary to swap their inputs in little-endian mode when using them to implement a vector shuffle. This was previously missed in the vector LE implementation.
  There was already logic to distinguish between unary and "normal" vmrg* vector shuffles, so this patch extends that logic to use a third option: "swapped" vmrg* vector shuffles that are used for little endian in place of the "normal" ones.
  I've updated the vec-shuffle-le.ll test to check for the expected register ordering on the generated instructions.
  This bug was discovered when testing the LE and ELFv2 patches for safety if they were backported to 3.4. A different vectorization decision was made in 3.4 than on mainline trunk, and that exposed the problem. I've verified this fix takes care of that issue.
  llvm-svn: 213915
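  The big-endian bias of the merge instructions can be illustrated with a byte-level model. This is a hedged sketch, not the selection logic from the patch: registers are 16-element lists in big-endian byte order, `le_merge_reference` is a hypothetical helper, and only the byte variants are modelled.

```python
def vmrghb(a, b):
    """Big-endian model of vmrghb: interleave bytes 0..7 of a and b."""
    out = []
    for i in range(8):
        out += [a[i], b[i]]
    return out


def vmrglb(a, b):
    """Big-endian model of vmrglb: interleave bytes 8..15 of a and b."""
    out = []
    for i in range(8, 16):
        out += [a[i], b[i]]
    return out


def le_merge_reference(v1, v2, first):
    """Interleave little-endian elements first..first+7 of v1 and v2.

    Inputs and result are lists in big-endian byte order, so little-endian
    element i is found at list index 15 - i.
    """
    le1, le2 = v1[::-1], v2[::-1]
    out_le = []
    for i in range(first, first + 8):
        out_le += [le1[i], le2[i]]
    return out_le[::-1]


v1 = list(range(0, 16))
v2 = list(range(16, 32))
# With the inputs swapped, the instructions reproduce the little-endian merges.
assert le_merge_reference(v1, v2, 0) == vmrglb(v2, v1)
assert le_merge_reference(v1, v2, 8) == vmrghb(v2, v1)
# Without the swap, the interleaving order is wrong.
assert le_merge_reference(v1, v2, 0) != vmrglb(v1, v2)
```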
* [PPC64LE] Recognize shufflevector patterns for little endian | Bill Schmidt | 2014-06-10 | 1 | -23/+39
  Various masks on shufflevector instructions are recognizable as specific PowerPC instructions (vector pack, vector merge, etc.). There is existing code in PPCISelLowering.cpp to recognize the correct patterns for big endian code. The masks for these instructions are different for little endian code due to the big-endian numbering employed by these instructions. This patch adds the recognition code for little endian.
  I've added a new test case test/CodeGen/PowerPC/vec_shuffle_le.ll for this. The existing recognizer test (vec_shuffle.ll) is unnecessarily verbose and difficult to read, so I felt it was better to add a new test rather than modify the old one.
  llvm-svn: 210536
* Reset the subtarget for DAGToDAG on every iteration of runOnMachineFunction. | Eric Christopher | 2014-05-22 | 1 | -1/+1
  This required updating the generated functions and TD file accordingly to be pointers rather than const references.
  llvm-svn: 209375
* [PowerPC] Mark many instructions as commutative | Hal Finkel | 2014-03-24 | 1 | -3/+13
  I'm under the impression that we used to infer the isCommutable flag from the instruction-associated pattern. Regardless, we don't seem to do this (at least by default) any more. I've gone through all of our instruction definitions, and marked as commutative all of those that should be trivial to commute (by exchanging the first two operands). There has been special code for the RL* instructions, and that's not changed.
  Before this change, we had the following commutative instructions:
    RLDIMI RLDIMIo RLWIMI RLWIMI8 RLWIMI8o RLWIMIo XSADDDP XSMULDP XVADDDP XVADDSP XVMULDP XVMULSP
  After:
    ADD4 ADD4o ADD8 ADD8o ADDC ADDC8 ADDC8o ADDCo ADDE ADDE8 ADDE8o ADDEo AND AND8 AND8o ANDo CRAND CREQV CRNAND CRNOR CROR CRXOR EQV EQV8 EQV8o EQVo FADD FADDS FADDSo FADDo FMADD FMADDS FMADDSo FMADDo FMSUB FMSUBS FMSUBSo FMSUBo FMUL FMULS FMULSo FMULo FNMADD FNMADDS FNMADDSo FNMADDo FNMSUB FNMSUBS FNMSUBSo FNMSUBo MULHD MULHDU MULHDUo MULHDo MULHW MULHWU MULHWUo MULHWo MULLD MULLDo MULLW MULLWo NAND NAND8 NAND8o NANDo NOR NOR8 NOR8o NORo OR OR8 OR8o ORo RLDIMI RLDIMIo RLWIMI RLWIMI8 RLWIMI8o RLWIMIo VADDCUW VADDFP VADDSBS VADDSHS VADDSWS VADDUBM VADDUBS VADDUHM VADDUHS VADDUWM VADDUWS VAND VAVGSB VAVGSH VAVGSW VAVGUB VAVGUH VAVGUW VMADDFP VMAXFP VMAXSB VMAXSH VMAXSW VMAXUB VMAXUH VMAXUW VMHADDSHS VMHRADDSHS VMINFP VMINSB VMINSH VMINSW VMINUB VMINUH VMINUW VMLADDUHM VMULESB VMULESH VMULEUB VMULEUH VMULOSB VMULOSH VMULOUB VMULOUH VNMSUBFP VOR VXOR XOR XOR8 XOR8o XORo XSADDDP XSMADDADP XSMAXDP XSMINDP XSMSUBADP XSMULDP XSNMADDADP XSNMSUBADP XVADDDP XVADDSP XVMADDADP XVMADDASP XVMAXDP XVMAXSP XVMINDP XVMINSP XVMSUBADP XVMSUBASP XVMULDP XVMULSP XVNMADDADP XVNMADDASP XVNMSUBADP XVNMSUBASP XXLAND XXLNOR XXLOR XXLXOR
  This is a by-inspection change, and I'm not sure how to write a reliable test case. I would like advice on this, however.
  llvm-svn: 204609
* Add IIC_ prefix to PPC instruction-class names | Hal Finkel | 2013-11-27 | 1 | -78/+80
  This adds the IIC_ prefix to the instruction itinerary class names, giving the PPC backend a naming convention for itinerary classes that is more consistent with that used by the X86 and ARM backends.
  Instruction scheduling in the PPC backend needs a bunch of cleanup and improvement (especially for the ooo cores). This is just a preliminary step.
  No functionality change intended.
  llvm-svn: 195890
* Mark PPC MFTB and DST (and friends) as deprecated | Hal Finkel | 2013-09-12 | 1 | -10/+20
  Use the new instruction deprecation feature to mark mftb (now replaced with mfspr) and dst (along with the other Altivec cache control instructions) as deprecated when targeting cores supporting at least ISA v2.03.
  llvm-svn: 190605
* PPC: Add some missing V_SET0 patterns | Hal Finkel | 2013-07-11 | 1 | -2/+15
  We had patterns to match v4i32 immAllZerosV -> V_SET0, but not patterns for v8i16 (which occurs in the test case) or v16i8. The same was true for V_SETALLONES (so I added the associated patterns for those as well).
  Another bug found by llvm-stress.
  llvm-svn: 186108
* [PowerPC] Make specialized AltiVec patterns isCodeGenOnly | Ulrich Weigand | 2013-07-03 | 1 | -2/+3
  A couple of AltiVec patterns are just specialized forms of the generic instruction pattern, and should therefore be marked isCodeGenOnly to avoid confusing the asm parser: VCFSX_0, VCTUXS_0, VCFUX_0, VCTSXS_0, and V_SETALLONES.
  Noticed by inspection of the generated PPCGenAsmMatcher.inc.
  llvm-svn: 185533
* PowerPC: Use RegisterOperand instead of RegisterClass operands | Ulrich Weigand | 2013-04-26 | 1 | -72/+72
  In the default PowerPC assembler syntax, registers are specified simply by number, so they cannot be distinguished from immediate values (without looking at the opcode). This means that the default operand matching logic for the asm parser does not work, and we need to specify custom matchers. Since those can only be specified with RegisterOperand classes and not directly on the RegisterClass, all instruction patterns used by the asm parser need to use a RegisterOperand (instead of a RegisterClass) for all their register operands.
  This patch adds one RegisterOperand for each RegisterClass, using the same name as the class, just in lower case, and updates all instruction patterns to use RegisterOperand instead of RegisterClass operands.
  llvm-svn: 180611
* PowerPC: Fix encoding of vsubcuw and vsum4sbs instructions | Ulrich Weigand | 2013-04-26 | 1 | -2/+2
  When testing the asm parser, I noticed wrong encodings for the above instructions (wrong sub-opcodes). Tests will be added together with the asm parser.
  llvm-svn: 180608
* PPC: Add a FIXME regarding the non-working fma+fneg Altivec pattern | Hal Finkel | 2013-04-03 | 1 | -0/+2
  llvm-svn: 178658
* More direct types in PowerPC AltiVec intrinsics. | Ulrich Weigand | 2013-04-03 | 1 | -47/+29
  This patch follows up on work done by Bill Schmidt in r178277, and replaces most of the remaining uses of VRRC in ISEL DAG patterns.
  The resulting .inc files are identical except for comments, so no change in code generation is expected.
  llvm-svn: 178656
* Use PPC reciprocal estimates with Newton iteration in fast-math mode | Hal Finkel | 2013-04-03 | 1 | -0/+3
  When unsafe FP math operations are enabled, we can use the fre[s] and frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together with some Newton iteration, in order to quickly generate floating-point division and sqrt results. All of these instructions are separately optional, and so each has its own feature flag (except for the Altivec instructions, which are covered under the existing Altivec flag). Doing this is not only faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these computations to be pipelined with other computations in order to hide their overall latency.
  I've also added a couple of fnmsub patterns which turned out to be missing (but are necessary for good code generation of the Newton iterations). Altivec needs a similar fix, but that will probably be more complicated because fneg is expanded for Altivec's v4f32.
  llvm-svn: 178617
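  The refinement step can be sketched numerically. This is an illustration only: `rough_reciprocal` is a hypothetical stand-in for a hardware estimate such as fre (its 8-bit accuracy is an assumption chosen for the demo, not a property of the instruction), and the loop is the textbook Newton-Raphson update for 1/a rather than the exact sequence the backend emits.

```python
def rough_reciprocal(a):
    """Hypothetical stand-in for a hardware reciprocal estimate.

    Deliberately perturbs the exact value by a relative error of 2**-8.
    """
    return (1.0 / a) * (1.0 + 2.0 ** -8)


def refine_reciprocal(a, x, steps=3):
    """Newton-Raphson for 1/a: x <- x * (2 - a * x).

    If x = (1 + e) / a, one step yields (1 - e*e) / a, so the relative
    error is squared on every iteration.
    """
    for _ in range(steps):
        x = x * (2.0 - a * x)
    return x


def fast_divide(n, d):
    """Approximate n / d as n * (refined estimate of 1 / d)."""
    return n * refine_reciprocal(d, rough_reciprocal(d))


quotient = fast_divide(1.0, 3.0)
assert abs(quotient - 1.0 / 3.0) < 1e-12
```

  Each update is a multiply and a fused multiply-subtract shape, which is why patterns such as fnmsub matter for generating good code for the iteration.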
* Use direct types in most PowerPC Altivec instructions and patterns. | Bill Schmidt | 2013-03-28 | 1 | -236/+333
  This follows up Ulrich Weigand's work in PPCInstrInfo.td and PPCInstr64Bit.td by doing the corresponding work for most of the Altivec patterns. I have not been able to do anything for the following classes of instructions:
  (1) Vector logicals. These don't have corresponding intrinsics and don't have a single obvious vector type. So far as I can tell I need to leave these as VRRC. Affected instructions are: VAND, VANDC, VNOR, VOR, VXOR, V_SET0.
  (2) Instructions that make use of vector shuffle. The selection code promotes all shuffles to v16i8, so any pattern that matches on a shuffle is constrained. I haven't found any way to make the patterns match on their natural types, so I plan to leave these as VRRC. Affected instructions are: VMRG*, VSPLTB, VSPLTH, VSPLTW, VPKUHUM, VPKUWUM.
  No change in behavior is anticipated.
  llvm-svn: 178277
* PowerPC: Mark patterns as isCodeGenOnly. | Ulrich Weigand | 2013-03-26 | 1 | -0/+3
  There remain a number of patterns that cannot (and should not) be handled by the asm parser, in particular all the Pseudo patterns. This commit marks those patterns as isCodeGenOnly.
  No change in generated code.
  llvm-svn: 178008
* Protect PPC Altivec patterns with a predicate | Hal Finkel | 2013-03-15 | 1 | -0/+6
  In preparation for the addition of other SIMD ISA extensions (such as QPX) we need to make sure that all Altivec patterns are properly predicated on having Altivec support.
  No functionality change intended (one test case needed to be updated b/c it assumed that Altivec intrinsics would be supported without enabling Altivec support).
  llvm-svn: 177152
* This patch fixes the Altivec addend construction for the fused multiply-add | Adhemerval Zanella | 2012-11-30 | 1 | -5/+7
  instruction (vmaddfp) to conform with IEEE and ensure the correct sign of a zero result when the resulting product is -0.0. The -0.0 vector addend to vmaddfp is generated by creating a vector with all bits set and then shifting each element left by 31 bits, resulting in a vector of 0x80000000 (or -0.0 as float).
  The 'buildvec_canonicalize.ll' test was adjusted to reflect this change, and 'vec_mul.ll' was extended with a float vector multiplication test.
  llvm-svn: 168998
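  Both halves of this fix, the bit pattern and the reason a -0.0 addend is needed, can be checked with scalar arithmetic. A minimal sketch, assuming one 32-bit lane modelled in Python; `bits_to_float` and `sign_of` are hypothetical helper names, and Python floats are doubles, which follow the same IEEE sign rules for zero.

```python
import math
import struct


def bits_to_float(bits):
    """Reinterpret a 32-bit integer as an IEEE single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def sign_of(x):
    """Return +1.0 or -1.0, distinguishing +0.0 from -0.0."""
    return math.copysign(1.0, x)


# One lane: all bits set, then shifted left by 31 within 32 bits.
all_ones = 0xFFFFFFFF
addend_bits = (all_ones << 31) & 0xFFFFFFFF
neg_zero = bits_to_float(addend_bits)

assert addend_bits == 0x80000000
assert neg_zero == 0.0 and sign_of(neg_zero) == -1.0

# A multiply implemented as product + addend must preserve a -0.0 product.
product = -0.0
assert sign_of(product + 0.0) == 1.0        # +0.0 addend loses the sign
assert sign_of(product + neg_zero) == -1.0  # -0.0 addend keeps it
```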
* PowerPC: Lowering floor intrinsic for Altivec | Adhemerval Zanella | 2012-11-15 | 1 | -0/+10
  This patch lowers the llvm.floor, llvm.ceil, llvm.trunc, and llvm.nearbyint intrinsics to Altivec instructions when operating on vectors of four single-precision floats.
  llvm-svn: 168086
* Add floating-point to and from integer conversion | Adhemerval Zanella | 2012-10-08 | 1 | -0/+32
  This patch adds Altivec support for v4i32 to v4f32 and for v4f32 to v4i32 vector rounding conversion.
  llvm-svn: 165409
* Convert the PPC backend to use the new FMA infrastructure. | Hal Finkel | 2012-06-22 | 1 | -7/+3
  The existing contraction patterns are replaced with fma/fneg. Overall functionality should be the same.
  llvm-svn: 158955
* Split the LdStGeneral PPC itin. class into LdStLoad and LdStStore. | Hal Finkel | 2012-04-01 | 1 | -24/+24
  Loads and stores can have different pipeline behavior, especially on embedded chips. This change allows those differences to be expressed. Except for the 440 scheduler, there are no functionality changes. On the 440, the latency adjustment is only by one cycle, and so this probably does not affect much. Nevertheless, it will make a larger difference in the future and this removes a FIXME from the 440 itin.
  llvm-svn: 153821
* Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. | Jia Liu | 2012-02-18 | 1 | -3/+3
  llvm-svn: 150878
* fix up vnot matching, eliminating a dead pattern, correcting a couple of | Chris Lattner | 2010-03-28 | 1 | -6/+11
  patterns that would never match because of bitcast, and eliminating use of vnot_conv.
  llvm-svn: 99753
* Fix a bunch of ambiguous patterns which tblgen happens to infer types | Chris Lattner | 2010-03-08 | 1 | -7/+7
  for, due to a bug.
  llvm-svn: 97953
* PR3628: Add patterns to match SHL/SRL/SRA to the corresponding Altivec | Eli Friedman | 2009-06-07 | 1 | -0/+22
  instructions.
  llvm-svn: 73009
* 2nd attempt, fixing SSE4.1 issues and implementing feedback from duncan. | Nate Begeman | 2009-04-27 | 1 | -82/+97
  PR2957
  ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes as the shuffle mask. A value of -1 represents UNDEF.
  In addition to eliminating the creation of illegal BUILD_VECTORS just to represent shuffle masks, we are better about canonicalizing the shuffle mask, resulting in substantially better code for some classes of shuffles.
  llvm-svn: 70225
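  The mask representation described here, a plain array of integers with -1 standing for UNDEF, can be mimicked in a few lines. This is only an illustrative model with hypothetical names (`UNDEF`, `apply_shuffle`), not LLVM's actual node API; undefined result lanes are shown as `None`.

```python
UNDEF = -1  # sentinel for a "don't care" lane, as in the commit message


def apply_shuffle(v1, v2, mask):
    """Apply an integer shuffle mask to two equally sized vectors.

    Indices 0..n-1 select from v1 and n..2n-1 select from v2.
    Lanes whose mask entry is UNDEF are returned as None.
    """
    assert len(v1) == len(v2)
    concat = v1 + v2
    return [None if m == UNDEF else concat[m] for m in mask]


result = apply_shuffle([10, 11, 12, 13], [20, 21, 22, 23], [0, 4, UNDEF, 7])
assert result == [10, 20, None, 23]
```

  Because the mask is ordinary integer data, checks such as "does this mask only reference the first operand" become simple scans over the list, which is the kind of canonicalization the message alludes to.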
* Revert 69952. Causes testsuite failures on linux x86-64. | Rafael Espindola | 2009-04-24 | 1 | -97/+82
  llvm-svn: 69967
* PR2957 | Nate Begeman | 2009-04-24 | 1 | -82/+97
  ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes as the shuffle mask. A value of -1 represents UNDEF.
  In addition to eliminating the creation of illegal BUILD_VECTORS just to represent shuffle masks, we are better about canonicalizing the shuffle mask, resulting in substantially better code for some classes of shuffles.
  A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next.
  llvm-svn: 69952
* Rename isSimpleLoad to canFoldAsLoad, to better reflect its meaning. | Dan Gohman | 2008-12-03 | 1 | -1/+1
  llvm-svn: 60487
* erect abstraction boundaries for accessing SDValue members, rename Val -> Node to reflect semantics | Gabor Greif | 2008-08-28 | 1 | -3/+3
  llvm-svn: 55504
* Replace all target specific implicit def instructions with a target independent one: TargetInstrInfo::IMPLICIT_DEF. | Evan Cheng | 2008-03-15 | 1 | -8/+0
  llvm-svn: 48380
* rename isLoad -> isSimpleLoad due to evan's desire to have such a predicate. | Chris Lattner | 2008-01-06 | 1 | -1/+1
  llvm-svn: 45667
* remove some isStore flags that are now inferred automatically. | Chris Lattner | 2008-01-06 | 1 | -1/+1
  llvm-svn: 45652
* Remove attribution from file headers, per discussion on llvmdev. | Chris Lattner | 2007-12-29 | 1 | -2/+2
  llvm-svn: 45418
* Add the 64-bit versions of the DS* Altivec instructions. | Bill Wendling | 2007-09-05 | 1 | -14/+45
  llvm-svn: 41717
* Fix arguments for some Altivec instructions. From SWB. | Dale Johannesen | 2007-08-09 | 1 | -9/+15
  llvm-svn: 40957
* Fix spelling of mtvscr and mfvscr. | Dale Johannesen | 2007-08-07 | 1 | -2/+2
  llvm-svn: 40908
* Vector fneg must be expanded into fsub -0.0, X. | Evan Cheng | 2007-07-30 | 1 | -2/+6
  llvm-svn: 40586
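  The choice of -0.0 rather than +0.0 as the minuend matters only for zero inputs, which a scalar check makes visible. A small sketch with hypothetical helper names, using Python doubles as a stand-in for one vector lane (the IEEE sign rules for zero are the same).

```python
import math


def sign_of(x):
    """Return +1.0 or -1.0, distinguishing +0.0 from -0.0."""
    return math.copysign(1.0, x)


def fneg_via_fsub(x):
    """Negate x as -0.0 - x, the expansion named in the commit title."""
    return -0.0 - x


def wrong_fneg(x):
    """Negate x as +0.0 - x, which gets the sign of zero wrong."""
    return 0.0 - x


assert fneg_via_fsub(2.5) == -2.5
assert sign_of(fneg_via_fsub(0.0)) == -1.0   # -(+0.0) is -0.0
assert sign_of(fneg_via_fsub(-0.0)) == 1.0   # -(-0.0) is +0.0
assert sign_of(wrong_fneg(0.0)) == 1.0       # should have been -0.0
```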
* No more noResults. | Evan Cheng | 2007-07-21 | 1 | -3/+1
  llvm-svn: 40132
* Change instruction description to split OperandList into OutOperandList and | Evan Cheng | 2007-07-19 | 1 | -58/+58
  InOperandList. This gives one piece of important information: # of results produced by an instruction.
  An example of the change:
    def ADD32rr : I<0x01, MRMDestReg, (ops GR32:$dst, GR32:$src1, GR32:$src2),
                    "add{l} {$src2, $dst|$dst, $src2}",
                    [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
  =>
    def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
                    "add{l} {$src2, $dst|$dst, $src2}",
                    [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
  llvm-svn: 40033
* fix incorrect encoding of vminsw. | Chris Lattner | 2007-02-16 | 1 | -1/+1
  llvm-svn: 34351
* Make the implicit def instructions look like other instrs. | Chris Lattner | 2006-07-18 | 1 | -1/+1
  llvm-svn: 29174
* Remove some now-unneeded casts from instruction patterns. With the casts | Chris Lattner | 2006-06-20 | 1 | -11/+11
  removed, tblgen produces identical output to with them in.
  llvm-svn: 28867
* Fix the CodeGen/PowerPC/buildvec_canonicalize.ll regression last night. | Chris Lattner | 2006-04-20 | 1 | -1/+1
  llvm-svn: 27908
* Make sure that the new instructions selected have the right type. This fixes | Chris Lattner | 2006-04-20 | 1 | -5/+5
  CodeGen/PowerPC/2006-04-19-vmaddfp-crash.ll
  llvm-svn: 27868
* Implement a TODO: have the legalizer canonicalize a bunch of operations to | Chris Lattner | 2006-04-16 | 1 | -41/+6
  one type (v4i32) so that we don't have to write patterns for each type, and so that more CSE opportunities are exposed.
  llvm-svn: 27731
* Add patterns for matching vnots with bit converted inputs. Most of these will | Chris Lattner | 2006-04-15 | 1 | -0/+17
  go away when I start using evan's binop type canonicalizer
  llvm-svn: 27725