path: root/llvm/lib/Target/ARM/ARMInstrNEON.td
Commit message (Author, Date, Files changed, Lines -/+)
...
* [ARM] Add ARMv8.2-A FP16 vector instructions (Oliver Stannard, 2015-12-16, 1 file, -28/+356)
  ARMv8.2-A adds 16-bit floating point versions of all existing SIMD floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. Note that VFP without SIMD is not a valid combination for any version of ARMv8-A, but I have ensured that these instructions all depend on both FeatureNEON and FeatureFullFP16 for consistency.
  Differential Revision: http://reviews.llvm.org/D15039
  llvm-svn: 255764
* Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics (Hal Finkel, 2015-12-11, 1 file, -8/+8)
  After much discussion, ending here:
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html
  it has been decided that, instead of having the vectorizer directly generate special absdiff and horizontal-add intrinsics, we'll recognize the relevant reduction patterns during CodeGen. Accordingly, these intrinsics are not needed (the operations they represent can be pattern matched, as is already done in some backends). Thus, we're backing these out in favor of the current development work.
  r248483 - Codegen: Fix llvm.*absdiff semantic.
  r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
  r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
  r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation
  llvm-svn: 255387
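  For illustration only (not part of the commit), a minimal LLVM IR sketch of an absolute-difference pattern of the kind such recognition could target; the exact shape CodeGen ends up matching is an assumption here, but on ARM this corresponds to vabd.u8:

    define <16 x i8> @uabd(<16 x i8> %a, <16 x i8> %b) {
      ; |a - b| computed without an intrinsic: pick whichever subtraction
      ; did not wrap, based on an unsigned comparison of the inputs.
      %cmp = icmp ugt <16 x i8> %a, %b
      %ab  = sub <16 x i8> %a, %b
      %ba  = sub <16 x i8> %b, %a
      %sel = select <16 x i1> %cmp, <16 x i8> %ab, <16 x i8> %ba
      ret <16 x i8> %sel
    }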
* [ARM] Match VABDL from log2 shuffles. (Charlie Turner, 2015-11-17, 1 file, -0/+23)
  Differential Revision: http://reviews.llvm.org/D14664
  llvm-svn: 253334
* [ARM] Add instruction selection patterns for vmin/vmax (Silviu Baranga, 2015-08-19, 1 file, -4/+4)
  Summary: The mid-end was generating vector smin/smax/umin/umax nodes, but we were using vbsl to generate the code. This adds the vmin/vmax patterns and a test to check that we are now generating vmin/vmax instructions.
  Reviewers: rengolin, jmolloy
  Subscribers: aemerson, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D12105
  llvm-svn: 245439
* [ARM] Allow vmin/vmax of scalars to be emitted without UseNEONForFP. (James Molloy, 2015-08-13, 1 file, -2/+2)
  This overrides the default to more closely resemble the hand-crafted matching logic in ISelLowering. It makes sense, as there is no VFP equivalent of vmin or vmax, to use them when they're available even if in general VFP ops should be preferred. This should be NFC.
  llvm-svn: 244915
* [ARM] Match fminnan/fmaxnan for vector vmin/vmax instead of an intrinsic (James Molloy, 2015-08-11, 1 file, -4/+4)
  Lower Intrinsic::arm_neon_vmins/vmaxs to fminnan/fmaxnan and match that instead. This is important because SDAG will soon be able to select FMINNAN itself, so we need a unified lowering path for intrinsics and SDAG. NFCI.
  llvm-svn: 244593
* [ARM] Match fminnum/fmaxnum for vector vminnm/vmaxnm instead of an intrinsic (James Molloy, 2015-08-11, 1 file, -4/+4)
  Lower the intrinsic to a FMINNUM/FMAXNUM node and select that instead. This is important because soon SDAG will be able to select FMINNUM/FMAXNUM itself, so we need an integrated lowering path between SDAG and intrinsics. NFCI.
  llvm-svn: 244592
* [ARM] Replace ARMISD::FMIN/FMAX with the shiny new ISD::FMINNAN/FMAXNAN. (James Molloy, 2015-08-11, 1 file, -7/+2)
  NFCI. This removes a custom ISDNode.
  llvm-svn: 244590
* [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA (James Molloy, 2015-07-17, 1 file, -8/+8)
  No functional change, but it preps codegen for the future when SABSDIFF will start getting generated in anger.
  llvm-svn: 242546
* Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" (Sergey Dmitrouk, 2015-04-28, 1 file, -8/+13)
  [DebugInfo] Add debug locations to constant SD nodes
  This adds debug locations to constant nodes of the Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases the choice is obvious, so most of them should be. At least all tests pass.
  Tests for these changes do not cover everything; instead they just check it for SDNodes, ARM and AArch64, where it's easy to get incorrect locations on constants. This is not a complete fix, as FastISel contains a workaround for wrong debug locations, which drops locations from instructions when processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit of a different issue, not directly related to these changes.
  Differential Revision: http://reviews.llvm.org/D9084
  llvm-svn: 235989
* Revert "[DebugInfo] Add debug locations to constant SD nodes"Daniel Jasper2015-04-281-13/+8
| | | | | | | This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987
* [DebugInfo] Add debug locations to constant SD nodes (Sergey Dmitrouk, 2015-04-28, 1 file, -8/+13)
  This adds debug locations to constant nodes of the Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases the choice is obvious, so most of them should be. At least all tests pass.
  Tests for these changes do not cover everything; instead they just check it for SDNodes, ARM and AArch64, where it's easy to get incorrect locations on constants. This is not a complete fix, as FastISel contains a workaround for wrong debug locations, which drops locations from instructions when processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit of a different issue, not directly related to these changes.
  Differential Revision: http://reviews.llvm.org/D9084
  llvm-svn: 235977
* [ARM] Add v8.1a "Rounding Double Multiply Add/Subtract" extension (Vladimir Sukharev, 2015-03-26, 1 file, -14/+166)
  Reviewers: t.p.northover
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8503
  llvm-svn: 233301
* [ARM] Remove target-specific ITOFP/FPTOI nodes (James Molloy, 2015-03-23, 1 file, -4/+31)
  Anton tried this 5 years ago but it was reverted due to extra VMOVs being emitted. This can be easily fixed with a liberal application of patterns - matching loads/stores and extractelts.
  llvm-svn: 232958
* Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files. (Craig Topper, 2014-11-26, 1 file, -10/+10)
  llvm-svn: 222801
* [ARM] NEON 32-bit scalar moves are also available in VFPv2 (Oliver Stannard, 2014-10-21, 1 file, -2/+3)
  The 32-bit variants of the NEON scalar<->GPR move instructions are also available in VFPv2. The 8- and 16-bit variants do require NEON.
  Note that the checks in the test file are all -DAG because they are checking a mixture of stdout and stderr, and the ordering is not guaranteed.
  llvm-svn: 220288
* Add aliases for VAND imm to VBIC ~imm (Renato Golin, 2014-09-25, 1 file, -0/+18)
  On ARM NEON, VAND with an immediate (16/32 bits) is an alias for VBIC ~imm with the same type size. This adds that logic to the parser, so VBIC instructions are generated from VAND assembly. This patch also fixes the validation routines for NEON splat immediates, which were wrong.
  Fixes PR20702.
  llvm-svn: 218450
* [ARM] Mark VSETLNi32 with the InsertSubreg property and implement the related target hook. (Quentin Colombet, 2014-08-21, 1 file, -0/+3)
  This patch teaches the compiler that:
    dX = VSETLNi32 dY, rZ, imm
  is the same as:
    dX = INSERT_SUBREG dY, rZ, translateImmToSubIdx(imm)
  <rdar://problem/12702965>
  llvm-svn: 216143
* Fix a whole bunch of binary literals which were the wrong size. (Pete Cooper, 2014-08-07, 1 file, -1/+1)
  All were being silently zero extended to the correct width. The commit after this changes { } and 0bxx literals to be of type bits<n> and not int. This means we need to write exactly the right number of bits, and not rely on the values being silently zero extended for us.
  llvm-svn: 215082
* ARMEB: Vector extend operations (Christian Pirker, 2014-06-23, 1 file, -20/+145)
  Reviewed at http://reviews.llvm.org/D4043
  llvm-svn: 211520
* ARM: Implement big endian bit-conversion for NEON types (Christian Pirker, 2014-05-12, 1 file, -54/+132)
  llvm-svn: 208538
* ARM: stop passing unused values up the TableGen hierarchy. (Tim Northover, 2014-04-28, 1 file, -9/+6)
  It's bad enough that I have to look up 5 different levels of TableGen class definitions to work out what bits go where in a simple NEON instruction anyway, without having to keep track of umpteen unused parameters.
  llvm-svn: 207420
* Fix for PR18921, "vmov" part. (Stepan Dyatkovskiy, 2014-04-24, 1 file, -0/+72)
  Added support for the byte-replication feature, for GAS compatibility. E.g. the instructions below:
    "vmov.i32 d0, 0xffffffff"
    "vmvn.i32 d0, 0xabababab"
    "vmov.i32 d0, 0xabababab"
    "vmov.i16 d0, 0xabab"
  are not valid encodings as written, but we can now deal with such cases. For the first one we should emit:
    "vmov.i8 d0, 0xff"
  For the second one ("vmvn"):
    "vmov.i8 d0, 0x54"
  For the last two instructions it should emit:
    "vmov.i8 d0, 0xab"
  P.S.: In ARMAsmParser.cpp I have also fixed a few nearby style issues in old code, just to keep the method bodies in harmony with themselves.
  llvm-svn: 207080
* For the ARM integrated assembler add checking of the alignments on vld/vst instructions (Kevin Enderby, 2014-04-10, 1 file, -428/+603)
  Report errors for alignments that are not supported. While this is a large diff and a big test case, the changes are very straightforward, but pretty much every vld/vst instruction had to be touched, changing its addrmode to one of the new ones that were added, which will do the proper checking for the specific instruction.
  FYI, re-committing this with a tweak so MemoryOp's default constructor is trivial and will work with MSVC 2012. Thanks to Reid Kleckner and Jim Grosbach for help with the tweak.
  rdar://11312406
  llvm-svn: 205986
* Revert "For the ARM integrated assembler add checking of the alignments on ↵Reid Kleckner2014-04-101-603/+428
| | | | | | | | | | | | | vld/vst instructions. And report errors for alignments that are not supported." It doesn't build with MSVC 2012, because MSVC doesn't allow union members that have non-trivial default constructors. This change added 'SMLoc AlignmentLoc' to MemoryOp, which made MemoryOp's default ctor non-trivial. This reverts commit r205930. llvm-svn: 205944
* For the ARM integrated assembler add checking of the alignments on vld/vst instructions (Kevin Enderby, 2014-04-09, 1 file, -428/+603)
  Report errors for alignments that are not supported. While this is a large diff and a big test case, the changes are very straightforward, but pretty much every vld/vst instruction had to be touched, changing its addrmode to one of the new ones that were added, which will do the proper checking for the specific instruction.
  rdar://11312406
  llvm-svn: 205930
* Tidy up. Trailing whitespace. (Jim Grosbach, 2014-04-03, 1 file, -11/+11)
  llvm-svn: 205583
* ARM: add cyclone CPU with ZeroCycleZeroing feature. (Tim Northover, 2014-04-01, 1 file, -0/+20)
  The Cyclone CPU is similar to Swift for most LLVM purposes, but does have two preferred instructions for zeroing a VFP register. This teaches LLVM about them.
  llvm-svn: 205309
* ARM: remove floating-point patterns for @llvm.arm.neon.vabs (Tim Northover, 2014-02-13, 1 file, -3/+0)
  The front-end is now generating the generic @llvm.fabs for this operation, so the extra patterns are no longer needed.
  llvm-svn: 201314
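  For example (an illustrative sketch, assuming a NEON-enabled target), the generic intrinsic now used is llvm.fabs, which the backend can select straight to vabs.f32:

    declare <2 x float> @llvm.fabs.v2f32(<2 x float>)

    define <2 x float> @vector_fabs(<2 x float> %x) {
      ; Generic absolute value on a <2 x float> vector; expected to select
      ; to a single NEON vabs.f32 when NEON floating point is enabled.
      %r = call <2 x float> @llvm.fabs.v2f32(<2 x float> %x)
      ret <2 x float> %r
    }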
* ARM: use natural LLVM IR for vshll instructions (Tim Northover, 2014-02-10, 1 file, -14/+27)
  Similarly to the vshrn instructions, these are simple zext/sext + shift operations. Using normal LLVM IR should allow for better code, and more sharing with the AArch64 backend.
  llvm-svn: 201093
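  A minimal sketch of what that "natural" IR looks like (illustrative; the precise patterns added live in the diff, and the lane count and shift amount here are arbitrary):

    define <8 x i16> @shll_s8(<8 x i8> %a) {
      ; Sign-extend each lane, then shift left by the immediate;
      ; this is the shape a vshll.s8 #3 pattern can match.
      %ext = sext <8 x i8> %a to <8 x i16>
      %res = shl <8 x i16> %ext,
                 <i16 3, i16 3, i16 3, i16 3, i16 3, i16 3, i16 3, i16 3>
      ret <8 x i16> %res
    }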
* ARM: use LLVM IR to represent the vshrn operation (Tim Northover, 2014-02-10, 1 file, -4/+13)
  vshrn is just the combination of a right shift and a truncate (and the limits on the immediate value actually mean the signedness of the shift doesn't matter). Using that representation allows us to get rid of an ARM-specific intrinsic, share more code with AArch64 and hopefully get better code out of the mid-end optimisers.
  llvm-svn: 201085
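  Illustratively, the shift-then-truncate shape described above looks like this in IR (a sketch; mapping it onto a single vshrn is the job of the new patterns):

    define <8 x i8> @shrn_i16(<8 x i16> %a) {
      ; Right shift by the immediate, then narrow each lane;
      ; together these correspond to a single vshrn.i16 #5.
      %sh = lshr <8 x i16> %a,
                 <i16 5, i16 5, i16 5, i16 5, i16 5, i16 5, i16 5, i16 5>
      %r  = trunc <8 x i16> %sh to <8 x i8>
      ret <8 x i8> %r
    }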
* ARM & AArch64: merge NEON absolute compare intrinsicsTim Northover2014-02-041-4/+4
| | | | | | | | There was an extremely confusing proliferation of LLVM intrinsics to implement the vacge & vacgt instructions. This combines them all into two polymorphic intrinsics, shared across both backends. llvm-svn: 200768
* AArch64 & ARM: refactor crypto intrinsics to take scalarsTim Northover2014-02-031-5/+33
| | | | | | | | | | | | Some of the SHA instructions take a scalar i32 as one argument (largely because they work on 160-bit hash fragments). This wasn't reflected in the IR previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x i32> respectively) which was ugly. This makes all the affected intrinsics take a uniform "i32", allowing them to become non-polymorphic at the same time. llvm-svn: 200706
* Remove the useless pseudo instructions VDUPfdf and VDUPfqf, replacing them with patterns to match VDUPLN. (James Molloy, 2014-01-20, 1 file, -4/+6)
  llvm-svn: 199675
* For ARM, fix assertion failures for some ld/st 3/4 instructions with writeback. (Jiangning Liu, 2014-01-16, 1 file, -2/+6)
  llvm-svn: 199369
* ARM: add a couple more NEON predicates. (Tim Northover, 2013-10-24, 1 file, -4/+4)
  The fused multiply instructions were added in VFPv4 but are still NEON instructions; in particular they shouldn't be available on a Cortex-M4, no matter how floaty it is.
  llvm-svn: 193342
* ARM: mark various aliases with their architecture requirements. (Tim Northover, 2013-10-24, 1 file, -4/+4)
  If an alias inherits directly from InstAlias then it doesn't get any default "Requires" values, so llvm-mc will allow it even on architectures that don't support the underlying instruction. This tidies up the obvious VFP and NEON cases I found.
  llvm-svn: 193340
* [ARMv8] Add support for the v8 cryptography extensions. (Amara Emerson, 2013-09-19, 1 file, -9/+102)
  llvm-svn: 190996
* Revert "Revert "ARM: Improve pattern for isel mul of vector by scalar.""Jim Grosbach2013-09-031-0/+11
| | | | | | | | | This reverts commit r189648. Fixes for the previously failing clang-side arm_neon_intrinsics test cases will be checked in separately. llvm-svn: 189841
* Revert "ARM: Improve pattern for isel mul of vector by scalar."Michael Gottesman2013-08-301-11/+0
| | | | | | | | This reverts commit r189619. The commit was breaking the arm_neon_intrinsic test. llvm-svn: 189648
* ARM: Improve pattern for isel mul of vector by scalar. (Jim Grosbach, 2013-08-29, 1 file, -0/+11)
  In addition to recognizing when the multiply's second argument is coming from an explicit VDUPLANE, also look for a plain scalar f32 reference and reference it via the corresponding vector lane.
  rdar://14870054
  llvm-svn: 189619
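  As a rough illustration (not taken from the commit), a vector-by-scalar multiply has this general IR shape; the goal of this kind of pattern work is to select a lane-indexed vmul.f32 rather than a separate vdup plus vmul, and which path fires depends on how the scalar operand is produced:

    define <2 x float> @mul_by_scalar(<2 x float> %v, float %s) {
      ; Splat the scalar into a vector, then multiply lane-wise.
      %ins = insertelement <2 x float> undef, float %s, i32 0
      %dup = shufflevector <2 x float> %ins, <2 x float> undef,
                           <2 x i32> zeroinitializer
      %res = fmul <2 x float> %v, %dup
      ret <2 x float> %res
    }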
* ARM: remove unused v(add|sub)hn and vqdml[as]l intrinsics. (Tim Northover, 2013-08-28, 1 file, -8/+6)
  Clang is now generating cleaner IR, so this removes the old variants which should be completely unused.
  llvm-svn: 189481
* ARM: add patterns for vqdmlal with separate vqdmull and vqadds (Tim Northover, 2013-08-28, 1 file, -0/+38)
  The vqdmlal and vqdmlsl instructions are really just a fused pair consisting of a vqdmull.sN and a vqadd.sN. This adds patterns to LLVM so that we can switch Clang's CodeGen over to generating these instead of the special vqdmlal intrinsics.
  llvm-svn: 189480
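  A sketch of the separate-intrinsic form this refers to (intrinsic names as they appear in existing ARM NEON IR; treat the exact type mangling as an assumption), which the new patterns can fuse into a single vqdmlal.s16:

    declare <4 x i32> @llvm.arm.neon.vqdmull.v4i32(<4 x i16>, <4 x i16>)
    declare <4 x i32> @llvm.arm.neon.vqadds.v4i32(<4 x i32>, <4 x i32>)

    define <4 x i32> @qdmlal(<4 x i32> %acc, <4 x i16> %a, <4 x i16> %b) {
      ; Saturating doubling multiply-long, then saturating accumulate:
      ; the pair the backend can now match as one vqdmlal instruction.
      %mul = call <4 x i32> @llvm.arm.neon.vqdmull.v4i32(<4 x i16> %a, <4 x i16> %b)
      %res = call <4 x i32> @llvm.arm.neon.vqadds.v4i32(<4 x i32> %acc, <4 x i32> %mul)
      ret <4 x i32> %res
    }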
* [ARMv8] Add some negative tests for the recent VFP/NEON instructions. (Joey Gouly, 2013-08-27, 1 file, -2/+2)
  Fix two issues I found while writing these tests.
  llvm-svn: 189341
* ARM: add natural patterns for vaddhn and vsubhn. (Tim Northover, 2013-08-27, 1 file, -0/+14)
  These instructions aren't particularly complicated and it's well worth having patterns for some reasonably useful LLVM IR that will match them. Soon we should be able to switch Clang over to producing this natural version.
  llvm-svn: 189335
* Fix ARM vcvt encoding when the number of fractional bits is zero. (Mihai Popa, 2013-08-22, 1 file, -0/+19)
  The instruction to convert between floating point and fixed point representations takes an immediate operand for the number of fractional bits of the fixed point value. ARMARM specifies that when that number of bits is zero, the assembler should encode floating point/integer conversion instructions. This patch adds the necessary instruction aliases to achieve this behaviour.
  llvm-svn: 189009
* ARM: remove now unneeded custom Asm converters (Tim Northover, 2013-07-22, 1 file, -28/+0)
  After Ulrich's r180677 (thanks!) TableGen is intelligent enough to handle tied constraints involving complex operands properly, so virtually all of the ARM custom converters are now unnecessary.
  llvm-svn: 186810
* [ARMv8] Implement the NEON instructions VRINT{N, X, A, Z, M, P}. (Joey Gouly, 2013-07-19, 1 file, -0/+28)
  llvm-svn: 186688
* Change 'n' to 'N' to keep consistent with other instructions. (Joey Gouly, 2013-07-18, 1 file, -4/+4)
  llvm-svn: 186576
* [ARMv8] Add NEON instructions VCVT{A, N, P, M}. (Joey Gouly, 2013-07-18, 1 file, -0/+35)
  llvm-svn: 186574