summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM
Commit message (Collapse)AuthorAgeFilesLines
...
* [Thumb] Fix assembler error 'cannot honor width suffix pop {lr}'Artyom Skrobov2015-12-282-50/+61
| | | | | | | | | | | | | Summary: * avoid generating POP {LR} in Thumb1 epilogues * combine MOV LR, Rx + BX LR -> BX Rx in a peephole optimization pass * combine POP {LR} + B + BX LR -> POP {PC} on v5T+ Test cases by Ana Pazos Differential Revision: http://reviews.llvm.org/D15707 llvm-svn: 256523
* Support clrex instruction on ARMv6k. Patch by Andrew Turner.Roman Divacky2015-12-281-1/+1
| | | | llvm-svn: 256505
* Remove extra forward declarations and scrub includes for all in tree ↵Craig Topper2015-12-252-3/+1
| | | | | | InstPrinters. NFC llvm-svn: 256427
* [MC, COFF] Support link /incremental conditionallyDavid Majnemer2015-12-212-7/+8
| | | | | | | | | | | | | | | | Today, we always take into account the possibility that object files produced by MC may be consumed by an incremental linker. This results in us initialing fields which vary with time (TimeDateStamp) which harms hermetic builds (e.g. verifying a self-host went well) and produces sub-optimal code because we cannot assume anything about the relative position of functions within a section (call sites can get redirected through incremental linker thunks). Let's provide an MCTargetOption which controls this behavior so that we can disable this functionality if we know a-priori that the build will not rely on /incremental. llvm-svn: 256203
* Teach ARMLoadStoreOptimizer to ignore DBG_VALUE instructions when mergingAdrian Prantl2015-12-211-1/+5
| | | | | | | | | instructions. As noted in PR24563. rdar://problem/23963293 llvm-svn: 256183
* Remove extra whitespace. NFC.Chad Rosier2015-12-211-3/+3
| | | | llvm-svn: 256173
* Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions for ↵Weiming Zhao2015-12-201-2/+2
| | | | | | | | | | | | | | | | | Thumb2 Summary: r250697 fixed the mapping for ARM mode. We have to do the same for Thumb2 otherwise the same llvm.arm.ssat() will generate different saturating amount for ARM and Thumb. r250697: http://reviews.llvm.org/rL250697 Reviewers: rmaprath Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15653 llvm-svn: 256115
* Revert "[ARM] Add ARMv8.2-A FP16 scalar instructions"Reid Kleckner2015-12-1612-784/+6
| | | | | | This reverts commit r255762. llvm-svn: 255806
* [ARM] Add ARMv8.2-A FP16 vector instructionsOliver Stannard2015-12-164-42/+430
| | | | | | | | | | | | | | ARMv8.2-A adds 16-bit floating point versions of all existing SIMD floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. Note that VFP without SIMD is not a valid combination for any version of ARMv8-A, but I have ensured that these instructions all depend on both FeatureNEON and FeatureFullFP16 for consistency. Differential Revision: http://reviews.llvm.org/D15039 llvm-svn: 255764
* [ARM] Add ARMv8.2-A FP16 scalar instructionsOliver Stannard2015-12-1612-6/+784
| | | | | | | | | | | | | | | | | | | | | | | | | | | ARMv8.2-A adds 16-bit floating point versions of all existing VFP floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. The assembly for these instructions uses S registers (AArch32 does not have H registers), but the instructions have ".f16" type specifiers rather than ".f32" or ".f64". The top 16 bits of each source register are ignored, and the top 16 bits of the destination register are set to zero. These instructions are mostly the same as the 32- and 64-bit versions, but they use coprocessor 9 rather than 10 and 11. Two new instructions, VMOVX and VINS, have been added to allow packing and extracting two 16-bit floats stored in the top and bottom halves of an S register. New fixup kinds have been added for the PC-relative load and store instructions, but no ELF relocations have been added as they have a range of 512 bytes. Differential Revision: http://reviews.llvm.org/D15038 llvm-svn: 255762
* Normalize MBB's successors' probabilities in several locations.Cong Hou2015-12-131-0/+1
| | | | | | | | | | | | This patch adds some missing calls to MBB::normalizeSuccProbs() in several locations where it should be called. Those places are found by checking if the sum of successors' probabilities is approximate one in MachineBlockPlacement pass with some instrumented code (not in this patch). Differential revision: http://reviews.llvm.org/D15259 llvm-svn: 255455
* ARM: only emit EABI attributes on EABI targetsSaleem Abdulrasool2015-12-131-1/+2
| | | | | | | EABI attributes should only be emitted on EABI targets. This prevents the emission of the optimization goals EABI attribute on Windows ARM. llvm-svn: 255448
* Revert r248483, r242546, r242545, and r242409 - absdiff intrinsicsHal Finkel2015-12-112-22/+8
| | | | | | | | | | | | | | | | | | | | After much discussion, ending here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html it has been decided that, instead of having the vectorizer directly generate special absdiff and horizontal-add intrinsics, we'll recognize the relevant reduction patterns during CodeGen. Accordingly, these intrinsics are not needed (the operations they represent can be pattern matched, as is already done in some backends). Thus, we're backing these out in favor of the current development work. r248483 - Codegen: Fix llvm.*absdiff semantic. r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation llvm-svn: 255387
* CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness()Matthias Braun2015-12-111-13/+2
| | | | | | | | | | | | | | | | | | | | computeRegisterLiveness() was broken in that it reported dead for a register even if a subregister was alive. I assume this was because the results of analayzePhysRegs() are hard to understand with respect to subregisters. This commit: Changes the results of analyzePhysRegs (=struct PhysRegInfo) to be clearly understandable, also renames the fields to avoid silent breakage of third-party code (and improve the grammar). Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling it and removing workarounds for the bug. This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033 Differential Revision: http://reviews.llvm.org/D15320 llvm-svn: 255362
* ARM: don't use a deleted node as the BaseReg in complex pattern.Tim Northover2015-12-091-1/+4
| | | | | | | | | | We mutated the DAG, which invalidated the node we were trying to use as a base register. Sometimes we got away with it, but other times the node really did get deleted before it was finished with. Should fix PR25733 llvm-svn: 255120
* [AArch64][ARM] Don't base interleaved op legality on type alloc size.Ahmed Bougacha2015-12-092-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise, we think that most types that look like they'd fit in a legal vector type are legal (so, basically, *any* vector type with a size between 33 and 128 bits, I think, since we use pow2 alignment; e.g., v2i25, v3f32, ...). DataLayout::getTypeAllocSize rounds up based on alignment. When checking for target intrinsic legality, that's not what we want: if rounding makes a difference, the type isn't legal, and the target intrinsics shouldn't be used, as they are always assumed legal. One could make the argument that alloc size is ultimately the most relevant here, since we're dealing with LD/ST intrinsics. That's only true if we did legalize them though; that's a problem for another day. Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits. Type::getSizeInBits can't be used because that'd gratuitously break pointer vector support. Some of these uses are currently fine, because we only hit them when the type is already known legal (e.g., r114454). Update them for consistency. It's faster to avoid the rounding anyway! llvm-svn: 255089
* Fix ARMv4T (Thumb1) epilogue generationArtyom Skrobov2015-12-081-8/+33
| | | | | | | | | | | | | | Summary: Before ARMv5T, Thumb1 code could not pop PC, as described at D14357 and D14986; so we need the special fixup in the epilogue. Reviewers: jroelofs, qcolombet Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15126 llvm-svn: 255047
* [ARM] Allowing SP/PC for AND/BIC mod_imm_notRenato Golin2015-12-081-4/+4
| | | | | | | AND/BIC instructions do accept SP/PC, so the register class should be more generic (rGPR -> GPR) to cope with that case. Adding more tests. llvm-svn: 255034
* [ARM] Generate ABI_optimization_goals build attribute, as described in the ↵Artyom Skrobov2015-12-073-7/+49
| | | | | | | | | | | | | | ARM ARM. Summary: This reverts r254234, and adds a simple fix for the annoying case of use-after-free. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15236 llvm-svn: 254912
* [ARM] Flag vcvt{t,b} with an f16 type specifier as part of the FP16 extensionBradley Smith2015-12-072-4/+9
| | | | | | Additionally correct the Cortex-R7 definition to allow the FP16 feature. llvm-svn: 254900
* Replace uint16_t with the MCPhysReg typedef in many places. A lot of ↵Craig Topper2015-12-053-11/+11
| | | | | | physical register arrays already use this typedef. llvm-svn: 254843
* [ARM] When a bitcast is about to be turned into a VMOVDRR, try to combine itQuentin Colombet2015-12-041-0/+55
| | | | | | | | | | | with its source instead of forcing the values on GPRs. This improves the lowering of vector code when such bitcasts happen in the middle of vector computations. rdar://problem/23691584 llvm-svn: 254684
* AArch64: use ldxp/stxp pair to implement 128-bit atomic loads.Tim Northover2015-12-021-1/+1
| | | | | | | | The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic if there has been a corresponding successful stxp. It's less clear for AArch32, so I'm leaving that alone for now. llvm-svn: 254524
* [AArch64]: Add support for Cortex-A35Christof Douma2015-12-022-2/+11
| | | | | | Adds support for the new Cortex-A35 ARMv8-A core. llvm-svn: 254503
* ARM: Change ArchCheck field to uint64_tMatthias Braun2015-12-011-1/+1
| | | | | | | | The values in this field are compared against getAvailableFeatures() which returns an uint64_t. This was causing problems in an internal branch. llvm-svn: 254462
* Fix Thumb1 epilogue generationArtyom Skrobov2015-12-011-12/+55
| | | | | | | | | | | | | | Summary: This had been broken for a very long time, but nobody noticed until D14357 enabled shrink-wrapping by default. Reviewers: jroelofs, qcolombet Subscribers: tyomitch, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14986 llvm-svn: 254444
* [ARM] Add ARMv8.2-A to TargetParserOliver Stannard2015-12-014-1/+15
| | | | | | | | | | | | Add ARMv8.2-A to TargetParser, so that it can be used by the clang command-line options and the .arch directive. Most testing of this will be done in clang, checking that the command-line options that this enables work. Differential Revision: http://reviews.llvm.org/D15037 llvm-svn: 254400
* [ARM] Add subtarget features for ARMv8.2-AOliver Stannard2015-12-014-3/+20
| | | | | | | | | | | | | | This adds subtarget features for ARMv8.2-A, which builds on (and requires the features from) ARMv8.1-A. Most assembler-visible features of ARMv8.2-A are system instructions, and are all required parts of the architecture, so just depend on the HasV8_2aOps subtarget feature. There is also one large, optional feature, which adds 16-bit floating point versions of all existing floating-point instructions (VFP and SIMD), this is represented by the FeatureFullFP16 subtarget feature. Differential Revision: http://reviews.llvm.org/D15036 llvm-svn: 254399
* [ARM] Use range-based for loops to avoid the need for calculating an array ↵Craig Topper2015-12-011-6/+5
| | | | | | size that I would have otherwise cconverted to array_lengthof. NFC llvm-svn: 254381
* Replace all weight-based interfaces in MBB with probability-based ↵Cong Hou2015-12-012-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | interfaces, and update all uses of old interfaces. (This is the second attempt to submit this patch. The first caused two assertion failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687) The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254377
* Revert r254348: "Replace all weight-based interfaces in MBB with ↵Hans Wennborg2015-12-012-2/+3
| | | | | | | | | | probability-based interfaces, and update all uses of old interfaces." and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction." Asserts were firing in Chromium builds. See PR25687. llvm-svn: 254366
* Replace all weight-based interfaces in MBB with probability-based ↵Cong Hou2015-12-012-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | interfaces, and update all uses of old interfaces. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254348
* [ARM] For old thumb ISA like v4t, we cannot use PC directly in pop.Quentin Colombet2015-11-301-18/+5
| | | | | | Fix the epilogue emission to account for that. llvm-svn: 254325
* Revert "[ARM] Generate ABI_optimization_goals build attribute, as described ↵Renato Golin2015-11-282-46/+4
| | | | | | | | | in the ARM ARM." This reverts commit r254201 and r254202, as it broke test-suite, self-hosting and sanitizer tests on ARM buildbots. llvm-svn: 254234
* Follow-up fix for r254201Artyom Skrobov2015-11-271-1/+1
| | | | llvm-svn: 254202
* [ARM] Generate ABI_optimization_goals build attribute, as described in the ↵Artyom Skrobov2015-11-272-4/+46
| | | | | | | | | | | | | | | | | | | ARM ARM. Summary: Since this build attribute corresponds to a whole module, and different functions in a module may differ in the optimizations enabled for them, this attribute is emitted after all functions, and only in the case that the optimization goals for all functions match. Reviewers: logan, hans Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D14934 llvm-svn: 254201
* ARM: address WOA unsigned division overflow crashMartell Malone2015-11-262-20/+19
| | | | | | | | | Building on r253865 the crash is not limited to signed overflows. Disable custom handling of unsigned 32-bit and 64-bit integer divide. Add test cases for both 32-bit and 64-bit unsigned integer overflow. llvm-svn: 254158
* Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)Artyom Skrobov2015-11-251-20/+8
| | | | | | | | | | | | | | Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 llvm-svn: 254085
* ARM: address WoA division overflow crashMartell Malone2015-11-232-23/+14
| | | | | | | Disable custom handling of signed 32-bit and 64-bit integer divide. Add test cases for both 32-bit and 64-bit integer overflow crashes. llvm-svn: 253865
* ARMLoadStoreOptimizer: Cleanup isMemoryOp(); NFCMatthias Braun2015-11-211-33/+33
| | | | llvm-svn: 253757
* Test commitVinicius Tinti2015-11-201-4/+4
| | | | llvm-svn: 253737
* Handle ARMv6-J as an alias, instead of fake architectureArtyom Skrobov2015-11-201-1/+0
| | | | | | | | | | | | | | | | | | | | | | Summary: This follows D14577 to treat ARMv6-J as an alias for ARMv6, instead of an architecture in its own right. The functional change is that the default CPU when targeting ARMv6-J changes from arm1136j-s to arm1136jf-s, which is currently used as the default CPU for ARMv6; both are, in fact, ARMv6-J CPUs. The J-bit (Jazelle support) is irrelevant to LLVM, and it doesn't affect code generation, attributes, optimizations, or anything else, apart from selecting the default CPU. Reviewers: rengolin, logan, compnerd Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14755 llvm-svn: 253675
* Revert "Change memcpy/memset/memmove to have dest and source alignments."Pete Cooper2015-11-191-4/+3
| | | | | | | | | | This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543
* Change memcpy/memset/memmove to have dest and source alignments.Pete Cooper2015-11-181-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511
* ARM: make sure backend is consistent about exception handling method.Tim Northover2015-11-182-4/+13
| | | | | | | | | | | | It turns out we decide whether to use SjLj exceptions or some alternative in two separate places in the backend, and they disagreed with each other. This led to inconsistent code and is generally a terrible idea. So make them consistent and add an assert that they *do* match (unfortunately MCAsmInfo isn't available in opt, so it can't be used to initialise the CodeGen version directly). llvm-svn: 253502
* Stop producing .data.rel sections.Rafael Espindola2015-11-181-9/+4
| | | | | | | | | | | | | | | | | If a section is rw, it is irrelevant if the dynamic linker will write to it or not. It looks like llvm implemented this because gcc was doing it. It looks like gcc implemented this in the hope that it would put all the relocated items close together and speed up the dynamic linker. There are two problem with this: * It doesn't work. Both bfd and gold will map .data.rel to .data and concatenate the input sections in the order they are seen. * If we want a feature like that, it can be implemented directly in the linker since it knowns where the dynamic relocations are. llvm-svn: 253436
* [ARM] Enable shrink-wrapping by default.Quentin Colombet2015-11-181-0/+5
| | | | | | | | Differential Revision: http://reviews.llvm.org/D14357 rdar://problem/21942589 llvm-svn: 253411
* [ARM] Don't pessimize i32 vselect.Charlie Turner2015-11-171-3/+0
| | | | | | | | | | | | | | | | | | The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad scalarization that is still happening there. I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements. From my benchmarks, I saw these improvements in A57 (T32) spec.cpu2000.ref.177_mesa 5.95% lnt.SingleSource/Benchmarks/Shootout/strcat 12.93% lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 11.89% I also measured A57 A32, A53 T32 and A9 T32 and found no performance regressions. I see much bigger wins in third-party benchmarks with this change Differential Revision: http://reviews.llvm.org/D14743 llvm-svn: 253349
* [ARM] Default to ARMv4t in favour of adding Other to ARMArchBradley Smith2015-11-172-2/+2
| | | | llvm-svn: 253335
* [ARM] Match VABDL from log2 shuffles.Charlie Turner2015-11-171-0/+23
| | | | | | Differential Revision: http://reviews.llvm.org/D14664 llvm-svn: 253334
OpenPOWER on IntegriCloud