summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64/AArch64Subtarget.h
Commit message (Collapse)AuthorAgeFilesLines
...
* [ARM/AArch64] Support FP16 +fp16fml instructionsBernard Ogden2018-08-171-0/+2
| | | | | | | | | | | | | | | | | | Add +fp16fml feature for new FP16 instructions, which are a mandatory part of FP16 from v8.4-A and an optional part of FP16 from v8.2-A. It doesn't seem to be possible to model this in LLVM, but the relationship between the options is handled by the related clang patch. In keeping with what I think is the usual practice, the fp16fml extension is accepted regardless of base architecture version. Builds on/replaces Sjoerd Meijer's patch to add these instructions at https://reviews.llvm.org/D49839. Differential Revision: https://reviews.llvm.org/D50228 llvm-svn: 340013
* [ARM][AArch64] Armv8.4-A EnablementSjoerd Meijer2018-06-291-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initial patch adding assembly support for Armv8.4-A. Besides adding v8.4 as a supported architecture to the usual places, this also adds target features for the different crypto algorithms. Armv8.4-A introduced new crypto algorithms, made them optional, and allows different combinations: - none of the v8.4 crypto functions are supported, which is independent of the implementation of the Armv8.0 SHA1 and SHA2 instructions. - the v8.4 SHA512 and SHA3 support is implemented, in this case the Armv8.0 SHA1 and SHA2 instructions must also be implemented. - the v8.4 SM3 and SM4 support is implemented, which is independent of the implementation of the Armv8.0 SHA1 and SHA2 instructions. - all of the v8.4 crypto functions are supported, in this case the Armv8.0 SHA1 and SHA2 instructions must also be implemented. The v8.4 crypto instructions are added to AArch64 only, and not AArch32, and are made optional extensions to Armv8.2-A. The user-facing Clang options will map on these new target features, their naming will be compatible with GCC and added in follow-up patches. The Armv8.4-A instruction sets can be downloaded here: https://developer.arm.com/products/architecture/a-profile/exploration-tools Differential Revision: https://reviews.llvm.org/D48625 llvm-svn: 335953
* [AArch64] Support reserving x20 registerPetr Hosek2018-06-121-0/+4
| | | | | | | | | | | | Register x20 is a callee-saved register which may be used for other purposes in certain contexts, for example to hold special variables within the kernel. This change adds support for reserving this register both to frontend and backend to make this register usable for these purposes. Differential Revision: https://reviews.llvm.org/D46552 llvm-svn: 334531
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-011-1/+1
| | | | | | | | | | | | | | | | We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
* [PATCH] [AArch64] Add new target feature to fuse conditional selectEvandro Menezes2018-02-231-1/+3
| | | | | | | | | This feature enables the fusion of the comparison and the conditional select instructions together. Differential revision: https://reviews.llvm.org/D42392 llvm-svn: 325939
* [AArch64] Add new target feature to fuse address generation with load or storeEvandro Menezes2018-01-301-0/+2
| | | | | | | | | This feature enables the fusion of the address generation and a corresponding load or store together. Differential revision: https://reviews.llvm.org/D42393 llvm-svn: 323782
* [AArch64] Add new target feature to handle cheap as move for ExynosEvandro Menezes2018-01-301-0/+2
| | | | | | | | | This feature enables special handling of cheap as move in the existing custom handling specifically for Exynos processors. Differential revision: https://reviews.llvm.org/D42387 llvm-svn: 323774
* [AArch64] Add pipeline model for Exynos M3Evandro Menezes2018-01-301-0/+1
| | | | | | | | Add the scheduling and cost model for Exynos M3. Differential revision: https://reviews.llvm.org/D42387 llvm-svn: 323773
* [AArch64] Enable aggressive FMA on T99 and provide AArch64 options for others.Joel Jones2018-01-251-0/+2
| | | | | | | | | | | | | | This patch enables aggressive FMA by default on T99, and provides a -mllvm option to enable the same on other AArch64 micro-arch's (-mllvm -aarch64-enable-aggressive-fma). Test case demonstrating the effects on T99 is included. Patch by: steleman (Stefan Teleman) Differential Revision: https://reviews.llvm.org/D40696 llvm-svn: 323474
* AArch64: Fix emergency spillslot being out of reach for large callframesMatthias Braun2018-01-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-commit of r322200: The testcase shouldn't hit machineverifiers anymore with r322917 in place. Large callframes (calls with several hundreds or thousands or parameters) could lead to situations in which the emergency spillslot is out of range to be addressed relative to the stack pointer. This commit forces the use of a frame pointer in the presence of large callframes. This commit does several things: - Compute max callframe size at the end of instruction selection. - Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file. - Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes. - Always place the emergency spillslot close to FP if we have a frame pointer. - Note that `useFPForScavengingIndex()` would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early. Differential Revision: https://reviews.llvm.org/D40876 llvm-svn: 322919
* Revert "AArch64: Fix emergency spillslot being out of reach for large ↵Matthias Braun2018-01-101-2/+0
| | | | | | | | | | | | callframes" Revert for now as the testcase is hitting a pre-existing verifier error that manifest as a failure when expensive checks are enabled (or -verify-machineinstrs) is used. This reverts commit r322200. llvm-svn: 322231
* AArch64: Fix emergency spillslot being out of reach for large callframesMatthias Braun2018-01-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Large callframes (calls with several hundreds or thousands or parameters) could lead to situations in which the emergency spillslot is out of range to be addressed relative to the stack pointer. This commit forces the use of a frame pointer in the presence of large callframes. This commit does several things: - Compute max callframe size at the end of instruction selection. - Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file. - Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes. - Always place the emergency spillslot close to FP if we have a frame pointer. - Note that `useFPForScavengingIndex()` would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early. Differential Revision: https://reviews.llvm.org/D40876 llvm-svn: 322200
* AArch64/X86: Factor out common bzero logic; NFCMatthias Braun2017-12-181-7/+0
| | | | llvm-svn: 321035
* AArch64: work around how Cyclone handles "movi.2d vD, #0".Tim Northover2017-12-181-0/+5
| | | | | | | | | | | For Cylone, the instruction "movi.2d vD, #0" is executed incorrectly in some rare circumstances. Work around the issue conservatively by avoiding the instruction entirely. This patch changes CodeGen so that problematic instructions are never generated, and the AsmParser so that an equivalent instruction is used (with a warning). llvm-svn: 320965
* Fix a bunch more layering of CodeGen headers that are in TargetDavid Blaikie2017-11-171-1/+1
| | | | | | | | All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490
* [AArch64] Add basic support for Qualcomm's Saphira CPU.Chad Rosier2017-09-251-0/+1
| | | | llvm-svn: 314105
* [AArch64][Falkor] Avoid generating STRQro* instructionsGeoff Berry2017-08-281-0/+2
| | | | | | | | | | | | | | | Summary: STRQro* instructions are slower than the alternative ADD/STRQui expanded instructions on Falkor, so avoid generating them unless we're optimizing for code size. Reviewers: t.p.northover, mcrosier Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D37020 llvm-svn: 311931
* [ARM][AArch64] Cortex-A75 and Cortex-A55 supportSam Parker2017-08-211-0/+2
| | | | | | | | | | | | | | | | | | This patch introduces support for Cortex-A75 and Cortex-A55, Arm's latest big.LITTLE A-class cores. They implement the ARMv8.2-A architecture, including the cryptography and RAS extensions, plus the optional dot product extension. They also implement the RCpc AArch64 extension from ARMv8.3-A. Cortex-A75: https://developer.arm.com/products/processors/cortex-a/cortex-a75 Cortex-A55: https://developer.arm.com/products/processors/cortex-a/cortex-a55 Differential Revision: https://reviews.llvm.org/D36667 llvm-svn: 311316
* Reapply "[GlobalISel] Remove the GISelAccessor API."Quentin Colombet2017-08-151-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit r310425, thus reapplying r310335 with a fix for link issue of the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON. Original commit message: [GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism all together. NFC. ---- The fix for the link issue consists in adding the GlobalISel library in the list of dependencies for the AArch64 unittests. This dependency comes from the use of AArch64Subtarget that needs to know how to destruct the GISel related APIs when being detroyed. Thanks to Bill Seurer and Ahmed Bougacha for helping me reproducing and understand the problem. llvm-svn: 310969
* [AArch64] Assembler support for v8.3 RCpcSam Parker2017-08-101-0/+2
| | | | | | | | | | Added assembler and disassembler support for the new Release Consistent processor consistent instructions, introduced with ARM v8.3-A for AArch64. Differential Revision: https://reviews.llvm.org/D36522 llvm-svn: 310575
* [ARM][AArch64] ARMv8.3-A enablementSam Parker2017-08-101-0/+2
| | | | | | | | | | | | | | | | | The beta ARMv8.3 ISA specifications have been released for AArch64 and AArch32, these can be found at: https://developer.arm.com/products/architecture/a-profile/exploration-tools An introduction to this architecture update can be found at: https://community.arm.com/processors/b/blog/posts/armv8-a-architecture-2016-additions This patch is the first in a series which will add ARM v8.3-A support in LLVM and Clang. It adds the necessary changes that create targets for both the ARM and AArch64 backends. Differential Revision: https://reviews.llvm.org/D36514 llvm-svn: 310561
* [AArch64] Assembler support for the ARMv8.2a dot product instructionsSjoerd Meijer2017-08-091-0/+2
| | | | | | | | | | | Dot product is an optional ARMv8.2a extension, see also the public architecture specification here: https://developer.arm.com/products/architecture/a-profile/exploration-tools. This patch adds AArch64 assembler support for these dot product instructions. Differential Revision: https://reviews.llvm.org/D36515 llvm-svn: 310480
* Revert "[GlobalISel] Remove the GISelAccessor API."Quentin Colombet2017-08-081-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit r310115. It causes a linker failure for the one of the unittests of AArch64 on one of the linux bot: http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429 : && /home/fedora/gcc/install/gcc-7.1.0/bin/g++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O2 -L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined -Wl,-O3 -Wl,--gc-sections unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o -o unittests/Target/AArch64/AArch64Tests lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread -Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib && : unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0): undefined reference to `vtable for llvm::LegalizerInfo' unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8): undefined reference to `vtable for llvm::RegisterBankInfo' The particularity of this bot is that it is built with BUILD_SHARED_LIBS=ON However, I was not able to reproduce the problem so far. Reverting to unblock the bot. llvm-svn: 310425
* [GlobalISel] Remove the GISelAccessor API.Quentin Colombet2017-08-041-10/+10
| | | | | | | | | | Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism all together. NFC. llvm-svn: 310115
* [AArch64] Extend CallingConv::X86_64_Win64 to AArch64 as wellMartin Storsjo2017-07-171-0/+11
| | | | | | | | | | | | Rename the enum value from X86_64_Win64 to plain Win64. The symbol exposed in the textual IR is changed from 'x86_64_win64cc' to 'win64cc', but the numeric value is kept, keeping support for old bitcode. Differential Revision: https://reviews.llvm.org/D34474 llvm-svn: 308208
* [AArch64] Add an SVE target feature to the backend and TargetParser.Amara Emerson2017-07-131-0/+2
| | | | | | | The feature will be used properly once assembler/disassembler support begins to land. llvm-svn: 307917
* [AArch64] Add AArch64Subtarget::isFusion function.Florian Hahn2017-07-121-0/+7
| | | | | | | | | | | | | | | | | | | | | | Summary: isFusion returns true if the subtarget supports any kind of instruction fusion, similar to ARMSubtarget::isFusion. This was suggested in D34142. This changes the current behavior slightly, because the macro fusion mutation is now added to the PostRA MachineScheduler in case the subtarget supports any kind of fusion. I think that makes sense because if the PostRA MachineScheduler is run, there is potential that instructions scheduled back to back are re-scheduled. Reviewers: evandro, t.p.northover, joelkevinjones, joel_k_jones, steleman Reviewed By: joelkevinjones Subscribers: joel_k_jones, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34958 llvm-svn: 307842
* [globalisel][tablegen] Demote OptForSize/OptForMinSize/ForCodeSize to ↵Daniel Sanders2017-05-191-5/+1
| | | | | | | | | | | | | | | | | | per-function predicates. Summary: This causes them to be re-computed more often than necessary but resolves objections that were raised post-commit on r301750. Reviewers: qcolombet, ab, t.p.northover, rovka, kristof.beyls Reviewed By: qcolombet Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32861 llvm-svn: 303418
* [SLP] Enable 64-bit wide vectorization on AArch64Adam Nemet2017-05-151-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ARM Neon has native support for half-sized vector registers (64 bits). This is beneficial for example for 2D and 3D graphics. This patch adds the option to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer. *** Performance Analysis This change was motivated by some internal benchmarks but it is also beneficial on SPEC and the LLVM testsuite. The results are with -O3 and PGO. A negative percentage is an improvement. The testsuite was run with a sample size of 4. ** SPEC * CFP2006/482.sphinx3 -3.34% A pretty hot loop is SLP vectorized resulting in nice instruction reduction. This used to be a +22% regression before rL299482. * CFP2000/177.mesa -3.34% * CINT2000/256.bzip2 +6.97% My current plan is to extend the fix in rL299482 to i16 which brings the regression down to +2.5%. There are also other problems with the codegen in this loop so there is further room for improvement. ** LLVM testsuite * SingleSource/Benchmarks/Misc/ReedSolomon -10.75% There are multiple small SLP vectorizations outside the hot code. It's a bit surprising that it adds up to 10%. Some of this may be code-layout noise. * MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40% The opt-viewer screenshot can be seen at F3218284. We start at a colder store but the tree leads us into the hottest loop. * MultiSource/Applications/lambda-0.1.3/lambda -2.68% * MultiSource/Benchmarks/Bullet/bullet -2.18% This is using 3D vectors. * SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67% Noise, binary is unchanged. * MultiSource/Benchmarks/Ptrdist/anagram/anagram +4.90% There is an additional SLP in the cold code. The test runs for ~1sec and prints out over 2000 lines. This is most likely noise. * MultiSource/Applications/aha/aha +1.63% * MultiSource/Applications/JM/lencod/lencod +1.41% * SingleSource/Benchmarks/Misc/richards_benchmark +1.15% Differential Revision: https://reviews.llvm.org/D31965 llvm-svn: 303116
* [AArch64] Consider widening instructions in cost calculationsMatthew Simpson2017-05-091-0/+3
| | | | | | | | | | | | | | | The AArch64 instruction set has a few "widening" instructions (e.g., uaddl, saddl, uaddw, etc.) that take one or more doubleword operands and produce quadword results. The operands are automatically sign- or zero-extended as appropriate. However, in LLVM IR, these extends are explicit. This patch updates TTI to consider these widening instructions as single operations whose cost is attached to the arithmetic instruction. It marks extends that are part of a widening operation "free" and applies a sub-target specified overhead (zero by default) to the arithmetic instructions. Differential Revision: https://reviews.llvm.org/D32706 llvm-svn: 302582
* [globalisel][tablegen] Compute available feature bits correctly.Daniel Sanders2017-04-291-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Predicate<> now has a field to indicate how often it must be recomputed. Currently, there are two frequencies, per-module (RecomputePerFunction==0) and per-function (RecomputePerFunction==1). Per-function predicates are currently recomputed more frequently than necessary since the only predicate in this category is cheap to test. Per-module predicates are now computed in getSubtargetImpl() while per-function predicates are computed in selectImpl(). Tablegen now manages the PredicateBitset internally. It should only be necessary to add the required includes. Also fixed a problem revealed by the test case where constrainSelectedInstRegOperands() would attempt to tie operands that BuildMI had already tied. Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32491 llvm-svn: 301750
* AArch64: support nonlazybindTim Northover2017-04-171-0/+3
| | | | | | | | It's almost certainly not a good idea to actually use it in most cases (there's a pretty large code size overhead on AArch64), but we can't do those experiments until it's supported. llvm-svn: 300462
* [AArch64][Fuchsia] Allow -mcmodel=kernel for --target=aarch64-fuchsiaPetr Hosek2017-04-041-0/+12
| | | | | | | | | | | This mode is just like -mcmodel=small except that it moves the thread pointer from TPIDR_EL0 to TPIDR_EL1. Patch by Roland McGrath. Differential Revision: https://reviews.llvm.org/D31624 llvm-svn: 299462
* [AArch64] Add new subtarget feature to fold LSL into address mode.Balaram Makam2017-03-311-0/+2
| | | | | | | | | | | | | Summary: This feature enables folding of logical shift operations of up to 3 places into addressing mode on Kryo and Falkor that have a fastpath LSL. Reviewers: mcrosier, rengolin, t.p.northover Subscribers: junbuml, gberry, llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D31113 llvm-svn: 299240
* [AArch64] [Assembler] option to disable negative immediate conversionsSanne Wouda2017-03-281-0/+4
| | | | | | | | | | | | | | | | | Summary: Similar to the ARM target in https://reviews.llvm.org/rL298380, this patch adds identical infrastructure for disabling negative immediate conversions, and converts the existing aliases to the new infrastucture. Reviewers: rengolin, javed.absar, olista01, SjoerdMeijer, samparker Reviewed By: samparker Subscribers: samparker, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D31243 llvm-svn: 298908
* [Target] Remove some code probably copy/pasted from another backend.Davide Italiano2017-03-261-4/+0
| | | | llvm-svn: 298825
* [AArch64] Vulcan is now ThunderXT99Joel Jones2017-03-071-1/+1
| | | | | | | | | | | | | | | | | Broadcom Vulcan is now Cavium ThunderX2T99. LLVM Bugzilla: http://bugs.llvm.org/show_bug.cgi?id=32113 Minor fixes for the alignments of loops and functions for ThunderX T81/T83/T88 (better performance). Patch was tested with SpecCPU2006. Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D30510 llvm-svn: 297190
* [Fuchsia] Use thread-pointer ABI slots for stack-protector and safe-stackPetr Hosek2017-02-241-0/+1
| | | | | | | | | | | | The Fuchsia ABI defines slots from the thread pointer where the stack-guard value for stack-protector, and the unsafe stack pointer for safe-stack, are stored. This parallels the Android ABI support. Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D30237 llvm-svn: 296081
* [AArch64] Add Cavium ThunderX supportJoel Jones2017-02-171-1/+5
| | | | | | | | | | | | | | This set of patches adds support for Cavium ThunderX ARM64 processors: * ThunderX * ThunderX T81 * ThunderX T83 * ThunderX T88 Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D28891 llvm-svn: 295475
* [AArch64] Add new target feature to fuse literal generationEvandro Menezes2017-02-011-0/+2
| | | | | | | | | This feature enables the fusion of such operations on Cortex A57, as recommended in its Software Optimisation Guide, sections 4.14 and 4.15. Differential revision: https://reviews.llvm.org/D28698 llvm-svn: 293739
* [AArch64] Add new subtarget feature to fuse AES crypto operationsEvandro Menezes2017-02-011-0/+2
| | | | | | | | | | This feature enables the fusion of such operations on Cortex A57, as recommended in its Software Optimisation Guide, section 4.13, and on Exynos M1. Differential revision: https://reviews.llvm.org/D28491 llvm-svn: 293738
* [AArch64] Rename 'no-quad-ldst-pairs' to 'slow-paired-128'Evandro Menezes2017-01-241-2/+2
| | | | | | | In order to follow the pattern of the existing 'slow-misaligned-128store' option, rename the option 'no-quad-ldst-pairs' to 'slow-paired-128'. llvm-svn: 292954
* [AArch64] Falkor supports Rounding Double Multiply Add/Subtract instructions.Chad Rosier2017-01-161-0/+2
| | | | | | | | | | Falkor only partially implements the ARMv8.1a extensions, so this patch refactors the support for the SQRDML[A|S]H instruction into a separate feature. Differential Revision: https://reviews.llvm.org/D28681 llvm-svn: 292142
* [AArch64] Refactor LSE support as feature separate from V8.1a support.Joel Jones2016-11-301-0/+2
| | | | | | | | | | | | | | | | | | | Summary: This is preparation for ThunderX processors that have Large System Extension (LSE) atomic instructions, but not the other instructions introduced by V8.1a. This will mimic changes to GCC as described here: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00388.html LSE instructions are: LD/ST<op>, CAS*, SWP Reviewers: t.p.northover, echristo, jmolloy, rengolin Subscribers: aemerson, mehdi_amini Differential Revision: https://reviews.llvm.org/D26621 llvm-svn: 288279
* [XRay] Support AArch64 in LLVMDean Michael Berris2016-11-171-0/+2
| | | | | | | | | | | | | | | | | | This patch adds XRay support in LLVM for AArch64 targets. This patch is one of a series: Clang: https://reviews.llvm.org/D26415 compiler-rt: https://reviews.llvm.org/D26413 Author: rSerge Reviewers: rengolin, dberris Subscribers: amehsan, aemerson, llvm-commits, iid_iunknown Differential Revision: https://reviews.llvm.org/D26412 llvm-svn: 287209
* [AArch64] Add support for Qualcomm's Falkor CPU.Chad Rosier2016-11-151-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D26673 llvm-svn: 287036
* [AArch64] Enable merging of adjacent zero stores for all subtargets.Chad Rosier2016-11-111-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | This optimization merges adjacent zero stores into a wider store. e.g., strh wzr, [x0] strh wzr, [x0, #2] ; becomes str wzr, [x0] e.g., str wzr, [x0] str wzr, [x0, #4] ; becomes str xzr, [x0] Previously, this was only enabled for Kryo and Cortex-A57. Differential Revision: https://reviews.llvm.org/D26396 llvm-svn: 286592
* [AArch64] Removed the narrow load merging code in the ld/st optimizer.Chad Rosier2016-11-071-2/+2
| | | | | | | | This feature has been disabled for some time now, so remove cruft. Differential Revision: https://reviews.llvm.org/D26248 llvm-svn: 286110
* [AArch64] Optionally use the Newton series for reciprocal estimationEvandro Menezes2016-10-241-0/+2
| | | | | | | | | Add support for estimating the square root or its reciprocal and division or reciprocal using the combiner generic Newton series. Differential revision: https://reviews.llvm.org/D25291 llvm-svn: 284986
* GlobalISel: rename legalizer components to match others.Tim Northover2016-10-141-1/+1
| | | | | | | | | | The previous names were both misleading (the MachineLegalizer actually contained the info tables) and inconsistent with the selector & translator (in having a "Machine") prefix. This should make everything sensible again. The only functional change is the name of a couple of command-line options. llvm-svn: 284287
OpenPOWER on IntegriCloud