summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM
Commit message (Collapse)AuthorAgeFilesLines
...
* revert 344472 due to failures.Dorit Nuzman2018-10-142-6/+4
| | | | llvm-svn: 344473
* [IAI,LV] Add support for vectorizing predicated strided accesses using maskedDorit Nuzman2018-10-142-4/+6
| | | | | | | | | | | | | | | | | | | | | | | interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472
* Replace most users of UnknownSize with LocationSize::unknown(); NFCGeorge Burgess IV2018-10-101-1/+1
| | | | | | | | | | | | Moving away from UnknownSize is part of the effort to migrate us to LocationSizes (e.g. the cleanup promised in D44748). This doesn't entirely remove all of the uses of UnknownSize; some uses require tweaks to assume that UnknownSize isn't just some kind of int. This patch is intended to just be a trivial replacement for all places where LocationSize::unknown() will Just Work. llvm-svn: 344186
* [ARM] Account for implicit IT when calculating inline asm sizePeter Smith2018-10-082-3/+18
| | | | | | | | | | | | | | | | | | | | | When deciding if it is safe to optimize a conditional branch to a CBZ or CBNZ the offsets of the BasicBlocks from the start of the function are estimated. For inline assembly the generic getInlineAsmLength() function is used to get a worst case estimate of the inline assembly by multiplying the number of instructions by the max instruction size of 4 bytes. This unfortunately doesn't take into account the generation of Thumb implicit IT instructions. In edge cases such as when all the instructions in the block are 4-bytes in size and there is an implicit IT then the size is underestimated. This can cause an out of range CBZ or CBNZ to be generated. The patch takes a conservative approach and assumes that every instruction in the inline assembly block may have an implicit IT. Fixes pr31805 Differential Revision: https://reviews.llvm.org/D52834 llvm-svn: 343960
* X86, AArch64, ARM: Do not attach debug location to spill/reload instructionsMatthias Braun2018-10-051-15/+15
| | | | | | | | | | | | | | This rebases and recommits r343520. hwasan should be fixed now and this shouldn't break the tests anymore. Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343895
* [TargetRegisterInfo] Remove temporary hook enableMultipleCopyHints()Jonas Paulsson2018-10-051-1/+0
| | | | | | | | | | | | Finally all targets are enabling multiple regalloc hints, so the hook to disable this can now be removed. NFC. Review: Simon Pilgrim https://reviews.llvm.org/D52316 llvm-svn: 343851
* Revert "X86, AArch64, ARM: Do not attach debug location to spill/reload ↵Matt Morehouse2018-10-021-15/+15
| | | | | | | | instructions" This reverts r343520 due to breakage of HWASan tests on Android. llvm-svn: 343616
* [ARM] Emmit data symbol for constant pool dataDiogo N. Sampaio2018-10-021-0/+5
| | | | | | | | | | | The ARM elf emitter would omit printing data symbol when constant data. This patch overrides the emitFill method as to enforce that the symbol is correctly printed. Differential revision: https://reviews.llvm.org/D52737 llvm-svn: 343594
* X86, AArch64, ARM: Do not attach debug location to spill/reload instructionsMatthias Braun2018-10-011-15/+15
| | | | | | | | | | | Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343520
* [ARM] Fix correctness checks in promoteToConstantPool.Eli Friedman2018-09-281-46/+15
| | | | | | | | | | | | | | | | | Correctly check for relocations in the constant to promote. And don't allow promoting a constant multiple times. This partially fixes https://bugs.llvm.org//show_bug.cgi?id=32780 ; it's not a complete fix because we also need to prevent ARMConstantIslands from cloning the constant. (-arm-promote-constant is currently off by default, and it stays off with this patch. I'll look into turning it on again when all the known issues are fixed.) Differential Revision: https://reviews.llvm.org/D51472 llvm-svn: 343361
* [ARM] Use preferred alignment for constants in promoteToConstantPool.Eli Friedman2018-09-281-1/+1
| | | | | | | | | | | | | | | This mostly affects IR generated by non-clang frontends because clang generally sets the alignment of globals explicitly. Fixes https://bugs.llvm.org//show_bug.cgi?id=32394 . (-arm-promote-constant is currently off by default, and it stays off with this patch. I'll look into turning it on again when all the known issues are fixed.) Differential Revision: https://reviews.llvm.org/D51469 llvm-svn: 343359
* [ARM] Allow execute only code on Cortex-m23David Spickett2018-09-281-2/+4
| | | | | | | | | | | The NoMovt feature prevents the use of MOVW/MOVT instructions on Cortex-M23 for performance reasons. These instructions are required for execute only code so NoMovt should be disabled when that option is enabled. Differential Revision: https://reviews.llvm.org/D52551 llvm-svn: 343302
* [ARM][v8.5A] Add speculation barriers SSBB and PSSBBOliver Stannard2018-09-284-1/+45
| | | | | | | | | | | This adds two new barrier instructions which can be used to restrict speculative execution of load instructions. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52484 llvm-svn: 343300
* [ARM][v8.5A] Add speculation barrier to ARM & Thumb instruction setsOliver Stannard2018-09-275-2/+32
| | | | | | | | | | | This is a new barrier which limits speculative execution of the instructions following it. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52477 llvm-svn: 343213
* llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)Fangrui Song2018-09-274-15/+12
| | | | | | | | | | | | Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163
* [ARM/AArch64][v8.5A] Add Armv8.5-A targetOliver Stannard2018-09-264-0/+24
| | | | | | | | | | | | | This patch allows targeting Armv8.5-A, adding the architecture to tablegen and setting the options to be identical to Armv8.4-A for the time being. Subsequent patches will add support for the different features included in the Armv8.5-A Reference Manual. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52470 llvm-svn: 343102
* [ARM] Fix for PR39060Sam Parker2018-09-261-28/+103
| | | | | | | | | | | | | When calculating whether a value can safely overflow for use by an icmp, we weren't checking that the value couldn't wrap around. To do this we need the icmp to be using a constant, as well as the incoming add or sub. bugzilla report: https://bugs.llvm.org/show_bug.cgi?id=39060 Differential Revision: https://reviews.llvm.org/D52463 llvm-svn: 343092
* Revert r342870 "[ARM] bottom-top mul support ARMParallelDSP"Hans Wennborg2018-09-261-154/+27
| | | | | | | | | | | | | | | | | | | | This broke Chromium's Android build (https://crbug.com/889390) and the polly-aosp buildbot (http://lab.llvm.org:8011/builders/aosp-O3-polly-before-vectorizer-unprofitable). > Originally committed in rL342210 but was reverted in rL342260 because > it was causing issues in vectorized code, because I had forgotten to > ensure that we're operating on scalar values. > > Original commit message: > > On failing to find sequences that can be converted into dual macs, > try to find sequential 16-bit loads that are used by muls which we > can then use smultb, smulbt, smultt with a wide load. > > Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 343082
* [ARM] Share predecessor bookkeeping in CombineBaseUpdate. NFCI.Nirav Dave2018-09-251-2/+9
| | | | llvm-svn: 342987
* [ARM] Adjust the cost model for ExynosEvandro Menezes2018-09-241-2/+2
| | | | | | | Tune `MaxInterleaveFactor` and `LdStMultipleTiming`and remove `PartialUpdateClearance` for the Exynos processors. llvm-svn: 342900
* [ARM] Adjust the feature set for ExynosEvandro Menezes2018-09-241-0/+2
| | | | | | Enable crypto and literals fusion for the Exynos processors. llvm-svn: 342899
* [Thumb1] Any imm8 should have cost of 1Zhaoshi Zheng2018-09-241-2/+2
| | | | | | | | | A simple MOVS rd, imm8 can materialize [-128, 127] in signed i8 type or [0, 255] in unsigned i8 type on Thumb1. Differential Revision: https://reviews.llvm.org/D52257 llvm-svn: 342898
* [Arm][AsmParser] Restrict register list size for VSTM/VLDMLuke Cheeseman2018-09-241-0/+9
| | | | | | | | | | - The assembler accepts VSTM/VLDM with register lists (specifically double registers lists) with more than 16 registers specified - The Arm architecture reference manual says this instruction must not contain more than 16 registers when the registers are doubleword registers - This addresses one of the concerns in https://bugs.llvm.org/show_bug.cgi?id=38389 Differential Revision: https://reviews.llvm.org/D52082 llvm-svn: 342891
* [ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33Sjoerd Meijer2018-09-242-3/+5
| | | | | | | | | | | | A sequence of VMUL and VADD instructions always give the same or better performance than a fused VMLA instruction on the Cortex-M4 and Cortex-M33. Executing the VMUL and VADD back-to-back requires the same cycles, but having separate instructions allows scheduling to avoid the hazard between these 2 instructions. Differential Revision: https://reviews.llvm.org/D52289 llvm-svn: 342874
* Revert r341932 "[ARM] Enable ARMCodeGenPrepare by default"Hans Wennborg2018-09-241-1/+1
| | | | | | | | | | | This caused miscompilation of WebRTC for Android: PR39060. > We've had the pass enabled downstream for a couple of weeks and it > seems to be okay, so enable it by default. > > Differential Revision: https://reviews.llvm.org/D51920 llvm-svn: 342873
* [ARM][ARMLoadStoreOptimizer]Luke Cheeseman2018-09-241-0/+14
| | | | | | | | | | | - The load store optimizer is currently merging multiple loads/stores into VLDM/VSTM with more than 16 doubleword registers - This is an UNPREDICTABLE instruction and shouldn't be done - It looks like the Limit for how many registers included in a merge got dropped at some point so I am reintroducing it in this patch - This fixes https://bugs.llvm.org/show_bug.cgi?id=38389 Differential Revision: https://reviews.llvm.org/D52085 llvm-svn: 342872
* [ARM] bottom-top mul support ARMParallelDSPSam Parker2018-09-241-27/+154
| | | | | | | | | | | | | | | | Originally committed in rL342210 but was reverted in rL342260 because it was causing issues in vectorized code, because I had forgotten to ensure that we're operating on scalar values. Original commit message: On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 342870
* Fix for bug 34002 - label generated before it block is finalized. ↵Maya Madhavan2018-09-201-1/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D52258 llvm-svn: 342615
* [ARM] Adjust the feature set for ExynosEvandro Menezes2018-09-191-0/+6
| | | | | | Fine tune the cost model for all Exynos processors. llvm-svn: 342585
* [ARM] Refactor Exynos feature set (NFC)Evandro Menezes2018-09-193-71/+23
| | | | | | | Since all Exynos processors share the same feature set, fold them in the implied fatures list for the subtarget. llvm-svn: 342583
* [AtomicExpandPass]: Add a hook for custom cmpxchg expansion in IRAlex Bradbury2018-09-192-5/+8
| | | | | | | | | | | | | | | | | This involves changing the shouldExpandAtomicCmpXchgInIR interface, but I have updated the in-tree backends using this hook (ARM, AArch64, Hexagon) so they will see no functional change. Previously this hook returned bool, but it now returns AtomicExpansionKind. This hook allows targets to select how a given cmpxchg is to be expanded. D48131 uses this to expand part-word cmpxchg to a target-specific intrinsic. See my associated RFC for more info on the motivation for this change <http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>. Differential Revision: https://reviews.llvm.org/D48130 llvm-svn: 342550
* [ARM] Fix unwind information for floating point registersOliver Stannard2018-09-191-3/+7
| | | | | | | | | | | | Fixes the unwind information generated for floating-point registers. Previously, all padding registers were assumed to be four bytes wide. Now, the width of the register is used to specify the amount of padding. Patch by Jackson Woodruff! Differential revision: https://reviews.llvm.org/D51494 llvm-svn: 342545
* Revert "[ARM] Cleanup ARM CGP isSupportedValue"Volodymyr Sapsai2018-09-181-19/+42
| | | | | | | | | | | | | | | This reverts r342395 as it caused error > Argument value type does not match pointer operand type! > %0 = atomicrmw volatile xchg i8* %_Value1, i32 1 monotonic, !dbg !25 > i8in function atomic_flag_test_and_set > fatal error: error in backend: Broken function found, compilation aborted! on bot http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/ More details are available at https://reviews.llvm.org/D52080 llvm-svn: 342431
* [ARM] Cleanup ARM CGP isSupportedValueSam Parker2018-09-171-42/+19
| | | | | | | | | | | | isSupportedValue explicitly checked and accepted many types of value, primarily for debugging reasons. Remove most of these checks and do a bit of refactoring now that the pass is more stable. This also enables ZExts to be sources, but this has very little practical benefit at the moment extend instructions will still be introduced. Differential Revision: https://reviews.llvm.org/D52080 llvm-svn: 342395
* [ARM] Disallow icmp with negative imm and overflowSam Parker2018-09-171-0/+11
| | | | | | | | | | We allow overflowing instructions if they're decreasing and only used by an unsigned compare. Add the extra condition that the icmp cannot be using a negative immediate. Differential Revision: https://reviews.llvm.org/D52102 llvm-svn: 342392
* Revert r342210 "[ARM] bottom-top mul support in ARMParallelDSP"Reid Kleckner2018-09-141-152/+27
| | | | | | | | | | It causes assertion failures while building Skia for Android in Chromium: https://ci.chromium.org/buildbot/chromium.clang/ToTAndroid/4550 Reduction forthcoming. llvm-svn: 342260
* [ARM] bottom-top mul support in ARMParallelDSPSam Parker2018-09-141-27/+152
| | | | | | | | | | On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 342210
* [ARM] Allow truncs as sources in ARM CGPSam Parker2018-09-131-19/+23
| | | | | | | | | | We previously only allowed truncs as sinks, but now allow them as sources too. We do this by checking that the result type is the narrow type that we're trying to optimise for. Differential Revision: https://reviews.llvm.org/D51978 llvm-svn: 342141
* [ARM] Fix FixConst for ARMCodeGenPrepareSam Parker2018-09-131-20/+3
| | | | | | | | | | Part of FixConsts wrongly assumes either a 8- or 16-bit constant which can result in the wrong constants being generated during promotion. Differential Revision: https://reviews.llvm.org/D52032 llvm-svn: 342140
* ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4.Tim Northover2018-09-134-0/+22
| | | | | | | | | | | | The Technical Reference Manuals for these two CPUs state that branching to an unaligned 32-bit instruction incurs an extra pipeline reload penalty. That's bad. This also enables the optimization at -Os since it costs on average one byte per loop in return for 1 cycle per iteration, which is pretty good going. llvm-svn: 342127
* ARM: correct the relocation type for `bl` on WoASaleem Abdulrasool2018-09-131-1/+1
| | | | | | | | | | The `IMAGE_REL_ARM_BRANCH20T` applies only to a `b.w` instruction. A thumb-2 `bl` should be relocated using a `IMAGE_REL_ARM_BRANCH24T`. Correct the relocation that we emit in such a case. Resolves PR38620! Based on the patch by Jordan Rhee! llvm-svn: 342109
* [ARM] Tighten f64<->f16 conversion requirementsDiogo N. Sampaio2018-09-121-4/+8
| | | | | | | | | | | | | | Fix missing Requires fields. Patch by Bernard Ogden (bogden) Reviewers: SjoerdMeijer, javed.absar, t.p.northover Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D51631 llvm-svn: 342061
* [ARM] Follow-up to rL342033Sam Parker2018-09-121-1/+1
| | | | | | Fixed typo which can cause segfault. llvm-svn: 342040
* [ARM] Exchange MAC operands in ARMParallelDSPSam Parker2018-09-121-115/+154
| | | | | | | | | | | | | | | | SMLAD and SMLALD instructions also come in the form of SMLADX and SMLALDX which perform an exchange on their second operand. To support this, more of the loads in the MAC candidates are compared for sequential access and a boolean value has been added to BinOpChain. AddMACCandiate has been refactored into a small pattern matching state machine to reduce the amount of duplicated code, but also to enable the matching to be more flexible. CreateParallelMACPairs now iterates through all the candidates to find parallel ones. Differential Revision: https://reviews.llvm.org/D51424 llvm-svn: 342033
* [ARM] Allow bitcasts in ARMCodeGenPrepareSam Parker2018-09-121-5/+4
| | | | | | | | Allow bitcasts in the use-def chains, treating them as sources. Differential Revision: https://reviews.llvm.org/D50758 llvm-svn: 342032
* [ARM] Add smlald support in ARMParallelDSPSam Parker2018-09-111-13/+41
| | | | | | | | | Search from i64 reducing phis, as well as i32, to allow the generation of smlald instructions. Differential Revision: https://reviews.llvm.org/D51101 llvm-svn: 341941
* [ARM] Enable ARMCodeGenPrepare by defaultSam Parker2018-09-111-1/+1
| | | | | | | | | We've had the pass enabled downstream for a couple of weeks and it seems to be okay, so enable it by default. Differential Revision: https://reviews.llvm.org/D51920 llvm-svn: 341932
* [Target] Untangle disassemblersBenjamin Kramer2018-09-101-1/+1
| | | | | | | Disassemblers cannot depend on main target headers. The same is true for MCTargetDesc, but there's a lot more cleanup needed for that. llvm-svn: 341822
* Fix typo in previous commitJF Bastien2018-09-081-1/+1
| | | | llvm-svn: 341742
* ADT: add <bit> header, implement C++20 bit_cast, useJF Bastien2018-09-081-13/+9
| | | | | | | | | | | | | | Summary: I saw a few places that were punning through a union of FP and integer, and that made me sad. Luckily, C++20 adds bit_cast for exactly that purpose. Implement our own version in ADT (without constexpr, leaving us a bit sad), and use it in the few places my grep-fu found silly union punning. This was originally committed as r341728 and reverted in r341730. Reviewers: javed.absar, steven_wu, srhines Subscribers: dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51693 llvm-svn: 341741
OpenPOWER on IntegriCloud