path: root/llvm/lib/Target/AArch64
Commit log (each entry: subject, author, date, files changed, lines -/+):
* Make each target map all inline assembly memory constraints to
  InlineAsm::Constraint_m. NFC. (Daniel Sanders, 2015-03-16; 1 file, -0/+6)

  Summary: This is done here instead of in target-independent code, and is
  the last non-functional change before targets begin to distinguish between
  different memory constraints when selecting code for the ISD::INLINEASM
  node. Next, each target will individually move away from the idea that all
  memory constraints behave like 'm'.

  Subscribers: jholewinski, llvm-commits

  Differential Revision: http://reviews.llvm.org/D8173

  llvm-svn: 232373

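  For orientation, a minimal sketch of the per-target hook this series
  revolves around; the override signature shown is an assumption, not the
  verbatim in-tree code:

      unsigned AArch64TargetLowering::getInlineAsmMemConstraint(
          const std::string &ConstraintCode) const {
        // For now every memory constraint behaves like 'm'; later patches
        // return a distinct InlineAsm::Constraint_* value per constraint code.
        return InlineAsm::Constraint_m;
      }
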
* unique_ptrs are unique already, no need to unique them any further.
  (Benjamin Kramer, 2015-03-13; 1 file, -8/+7)

  llvm-svn: 232178

* Recommit r232027 with PR22883 fixed: Add infrastructure for support of
  multiple memory constraints. (Daniel Sanders, 2015-03-13; 1 file, -3/+4)

  The operand flag word for ISD::INLINEASM nodes now contains a 15-bit memory
  constraint ID when the operand kind is Kind_Mem. This constraint ID is a
  numeric equivalent to the constraint code string and is converted with a
  target-specific hook in TargetLowering.

  This patch maps all memory constraints to InlineAsm::Constraint_m so there
  is no functional change at this point. It just proves that using these
  previously unused bits in the encoding of the flag word doesn't break
  anything.

  The next patch will make each target preserve the current mapping of
  everything to Constraint_m for itself while changing the target-independent
  implementation of the hook to return Constraint_Unknown appropriately. Each
  target will then be adapted in separate patches to use appropriate
  Constraint_* values.

  PR22883 was caused by the matching operands copying the whole of the
  operand flags for the matched operand. This included the constraint ID,
  which needed to be replaced with the operand number. This has been fixed
  with a conversion function. Following on from this, matching operands also
  used the operand number as the constraint ID. This has been fixed by
  looking up the matched operand and taking it from there.

  llvm-svn: 232165

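  A hedged illustration of the flag-word layout described above, using the
  InlineAsm helpers this infrastructure is built around (treat the exact
  helper names and signatures as assumptions):

      // Build a Kind_Mem operand flag word, stash the 15-bit memory
      // constraint ID in it, and read the ID back out.
      unsigned Flag = InlineAsm::getFlagWord(InlineAsm::Kind_Mem, /*NumOps=*/1);
      Flag = InlineAsm::getFlagWordForMem(Flag, InlineAsm::Constraint_m);
      assert(InlineAsm::getMemoryConstraintID(Flag) == InlineAsm::Constraint_m);
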
* Migrate the AArch64 TargetRegisterInfo to its TargetMachine implementation.
  (Eric Christopher, 2015-03-12; 8 files, -48/+52)

  This requires a bit of scaffolding and a few fixups that'll go away once
  all of the ports have been migrated.

  llvm-svn: 232103

* Revert "r232027 - Add infrastructure for support of multiple memory constraints"Hal Finkel2015-03-121-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | This (r232027) has caused PR22883; so it seems those bits might be used by something else after all. Reverting until we can figure out what else to do. Original commit message: The operand flag word for ISD::INLINEASM nodes now contains a 15-bit memory constraint ID when the operand kind is Kind_Mem. This constraint ID is a numeric equivalent to the constraint code string and is converted with a target specific hook in TargetLowering. This patch maps all memory constraints to InlineAsm::Constraint_m so there is no functional change at this point. It just proves that using these previously unused bits in the encoding of the flag word doesn't break anything. The next patch will make each target preserve the current mapping of everything to Constraint_m for itself while changing the target independent implementation of the hook to return Constraint_Unknown appropriately. Each target will then be adapted in separate patches to use appropriate Constraint_* values. llvm-svn: 232093
* Fix comment formatting. (Eric Christopher, 2015-03-12; 1 file, -2/+1)

  llvm-svn: 232076

* Add infrastructure for support of multiple memory constraints.
  (Daniel Sanders, 2015-03-12; 1 file, -3/+4)

  Summary: The operand flag word for ISD::INLINEASM nodes now contains a
  15-bit memory constraint ID when the operand kind is Kind_Mem. This
  constraint ID is a numeric equivalent to the constraint code string and is
  converted with a target-specific hook in TargetLowering.

  This patch maps all memory constraints to InlineAsm::Constraint_m so there
  is no functional change at this point. It just proves that using these
  previously unused bits in the encoding of the flag word doesn't break
  anything.

  The next patch will make each target preserve the current mapping of
  everything to Constraint_m for itself while changing the target-independent
  implementation of the hook to return Constraint_Unknown appropriately. Each
  target will then be adapted in separate patches to use appropriate
  Constraint_* values.

  Reviewers: hfinkel

  Reviewed By: hfinkel

  Subscribers: hfinkel, jholewinski, llvm-commits

  Differential Revision: http://reviews.llvm.org/D8171

  llvm-svn: 232027

* Remove the need to cache the subtarget in the AArch64 TargetRegisterInfo
  classes. (Eric Christopher, 2015-03-12; 4 files, -21/+26)

  Replace it with a cache of the Triple and use that where applicable at the
  moment.

  llvm-svn: 232005

* Move the DataLayout to the generic TargetMachine, making it mandatory.
  (Mehdi Amini, 2015-03-12; 2 files, -11/+17)

  Summary: I don't know why every single backend had to redeclare its own
  DataLayout. There was a virtual getDataLayout() on the common base
  TargetMachine; the default implementation returned nullptr. It was not
  clear from this that we could assume at the call site that a DataLayout
  would be available for each Target.

  Now getDataLayout() is no longer virtual and returns a pointer to the
  DataLayout member of the common base TargetMachine. I plan to turn it into
  a reference in a future patch.

  The only backend that didn't have a DataLayout previously was the
  CPPBackend. It now initializes the default DataLayout. This commit is NFC
  for all the other backends.

  Test Plan: clang+llvm ninja check-all

  Reviewers: echristo

  Subscribers: jfb, jholewinski, llvm-commits

  Differential Revision: http://reviews.llvm.org/D8243

  From: Mehdi Amini <mehdi.amini@apple.com>

  llvm-svn: 231987

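  In sketch form, the accessor this describes reduces to something like the
  following (the member name is an assumption):

      // Non-virtual now: every TargetMachine owns a DataLayout member.
      const DataLayout *TargetMachine::getDataLayout() const { return &DL; }
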
* Have getCallPreservedMask and getThisCallPreservedMask take a
  MachineFunction argument so that we can grab subtarget-specific features
  off of it. (Eric Christopher, 2015-03-11; 4 files, -8/+12)

  llvm-svn: 231979

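  The resulting hook shape, sketched for AArch64 (the parameter list is an
  assumption):

      // The MachineFunction argument lets the implementation query
      // per-function subtarget features for the preserved-register mask.
      const uint32_t *
      AArch64RegisterInfo::getCallPreservedMask(const MachineFunction &MF,
                                                CallingConv::ID CC) const;
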
* Have getCalleeSavedRegs take a non-null MachineFunction all the time.
  (Eric Christopher, 2015-03-11; 1 file, -2/+1)

  The target-independent code was passing one in all the time and targets
  weren't checking validity before using it. Update a few calls to pass in a
  MachineFunction where necessary.

  llvm-svn: 231970

* Constify AArch64CollectLOH.cpp. NFC (Pete Cooper, 2015-03-11; 1 file,
  -7/+7)

  llvm-svn: 231969

* Remove the use of the subtarget in MCCodeEmitter creation and update all
  ports accordingly. (Eric Christopher, 2015-03-10; 2 files, -8/+4)

  Required a couple of small rewrites in handling subtarget features during
  creation in PPC.

  llvm-svn: 231861

* [AArch64] Avoid going through GPRs for across-vector instructions.
  (Ahmed Bougacha, 2015-03-10; 3 files, -119/+161)

  This adds new node types for each intrinsic. For instance, for addv, we
  have AArch64ISD::UADDV, such that:

      (v4i32 (uaddv ...))

  is the same as

      (v4i32 (scalar_to_vector (i32 (int_aarch64_neon_uaddv ...))))

  that is,

      (v4i32 (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)),
                            (i32 (int_aarch64_neon_uaddv ...)), ssub))

  In a combine, we transform all such across-vector-lanes intrinsics to:

      (i32 (extract_vector_elt (uaddv ...), 0))

  This has one big advantage: by making the extract_element explicit, we
  enable the existing patterns for lane-aware instructions to fire. This lets
  us avoid needlessly going through the GPRs. Consider:

      uint32x4_t test_mul(uint32x4_t a, uint32x4_t b) {
        return vmulq_n_u32(a, vaddvq_u32(b));
      }

  We now generate:

      addv.4s s1, v1
      mul.4s  v0, v0, v1[0]

  instead of the previous:

      addv.4s s1, v1
      fmov    w8, s1
      dup.4s  v1, w8
      mul.4s  v0, v1, v0

  rdar://20044838

  llvm-svn: 231840

* [AArch64] Remove integer INSvi*lane patterns. NFCI.
  (Ahmed Bougacha, 2015-03-10; 1 file, -4/+0)

  Most are redundant, and they never seem to fire.

  The V128 integer patterns already exist in the INS multiclass. The
  duplicates only fire when the vector index type isn't i64, because they
  accept "imm" instead of an explicit "i64", as the instruction definition
  patterns do. TLI::getVectorIdxTy is i64 on AArch64, so this should never
  happen.

  Also, one of them had a typo: for i64, INSvi32lane was used. I noticed
  because I mistakenly used an explicit i32 as the idx type, and got ins.s
  for an i64 vector_insert.

  The V64 patterns also don't seem to ever fire, as V64 vector extract/insert
  are legalized to V128.

  The equivalent float patterns are unique and useful, so keep them.

  No functional change intended; none exhibited on the LIT and LNT tests.

  llvm-svn: 231838

* [AArch64] Enable partial & runtime unrolling on cortex-a57.
  (Kevin Qin, 2015-03-09; 1 file, -0/+10)

  The inner loop of a nest is more likely to be hot, and its runtime check
  can be promoted out (per patch 0001), so the overhead is lower; we can try
  a doubled threshold there to unroll more loops.

  llvm-svn: 231632

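  The TTI hook involved, as a hedged sketch; the field uses and the doubling
  are illustrative, not the exact in-tree heuristic:

      void AArch64TTIImpl::getUnrollingPreferences(
          Loop *L, TTI::UnrollingPreferences &UP) {
        // Enable partial and runtime unrolling for this subtarget.
        UP.Partial = true;
        UP.Runtime = true;
        // Allow a larger partial-unroll threshold for likely-hot inner loops.
        UP.PartialThreshold *= 2;
      }
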
* Make static variables const if possible. Makes them go into a read-only
  section. (Benjamin Kramer, 2015-03-08; 2 files, -35/+20)

  Or fold them into an initializer list, which has the same effect. NFC.

  llvm-svn: 231598

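  A tiny illustration of the point (the table itself is made up):

      // 'static const' data can be placed in a read-only section and shared,
      // instead of being guarded and initialized at run time.
      static const unsigned ShiftCosts[] = {1, 1, 2, 4};
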
* Make constant arrays that are passed to functions const.
  (Benjamin Kramer, 2015-03-07; 1 file, -10/+10)

  In theory this allows the compiler to skip materializing the array on the
  stack. In practice clang often fails to do that, but that's a different
  story. NFC.

  llvm-svn: 231571

* Typo. (Eric Christopher, 2015-03-07; 1 file, -1/+1)

  llvm-svn: 231547

* [AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of
  LD[U]R + LD[U]RSW. (Quentin Colombet, 2015-03-06; 1 file, -11/+116)

  Teach the load/store optimizer how to sign-extend the result of a load
  pair when that helps create more pairs. The rationale is that loads are
  more expensive than sign extensions, so if we gather some into one
  instruction, this is better!

  <rdar://problem/20072968>

  llvm-svn: 231527

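  By way of illustration, in the style of the other codegen examples in this
  log (register choices are hypothetical), the kind of rewrite this enables:

      -- before
      ldr   w8, [x0]
      ldrsw x9, [x0, #4]

      -- after
      ldp   w8, w9, [x0]
      sxtw  x9, w9
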
* [AsmPrinter][TLOF] 32-bit MachO support for replacing GOT equivalents.
  (Bruno Cardoso Lopes, 2015-03-06; 2 files, -5/+9)

  Add MachO 32-bit (i.e. arm and x86) support for replacing global GOT
  equivalent symbol accesses. Unlike 64-bit targets, there's no GOTPCREL
  relocation, and access through a non_lazy_symbol_pointers section is used
  instead.

  -- before

      _extgotequiv:
              .long _extfoo

      _delta:
              .long _extgotequiv-_delta

  -- after

      _delta:
              .long L_extfoo$non_lazy_ptr-_delta

              .section __IMPORT,__pointers,non_lazy_symbol_pointers
      L_extfoo$non_lazy_ptr:
              .indirect_symbol _extfoo
              .long 0

  llvm-svn: 231475

* [AsmPrinter][TLOF] ARM64 MachO support for replacing GOT equivalents.
  (Bruno Cardoso Lopes, 2015-03-06; 2 files, -0/+24)

  Follow up on r230264 and add ARM64 support for replacing global GOT
  equivalent symbol accesses by references to the GOT entry for the final
  symbol instead. Example:

  -- before

              .globl _foo
      _foo:
              .long 42

              .globl _gotequivalent
      _gotequivalent:
              .quad _foo

              .globl _delta
      _delta:
              .long _gotequivalent-_delta

  -- after

              .globl _foo
      _foo:
              .long 42

              .globl _delta
      Ltmp3:
              .long _foo@GOT-Ltmp3

  llvm-svn: 231474

* [AArch64] Teach AsmPrinter about GlobalAddress operands.
  (Ahmed Bougacha, 2015-03-05; 1 file, -0/+12)

  Fixes PR22761, rdar://20024866.

  Differential Revision: http://reviews.llvm.org/D8042

  llvm-svn: 231400

* Mutate TargetLowering::shouldExpandAtomicRMWInIR to specifically dictate
  how AtomicRMWInsts are expanded. (JF Bastien, 2015-03-04; 2 files, -3/+6)

  Summary: In PNaCl, most atomic instructions have their own
  @llvm.nacl.atomic.* function; each one, with a few exceptions, represents
  a consistent behaviour across all NaCl-supported targets. Unfortunately,
  the atomic RMW operations nand, [u]min, and [u]max aren't directly
  represented by any such @llvm.nacl.atomic.* function.

  This patch refines shouldExpandAtomicRMWInIR in TargetLowering so that a
  future Le32TargetLowering class can selectively inform the caller how the
  target desires the atomic RMW instruction to be expanded (i.e. via
  load-linked/store-conditional for ARM/AArch64, via cmpxchg for
  X86/others?, or not at all for Mips), if at all.

  This does not represent a behavioural change and as such no tests were
  added.

  Patch by: Richard Diamond.

  Reviewers: jfb

  Reviewed By: jfb

  Subscribers: jfb, aemerson, t.p.northover, llvm-commits

  Differential Revision: http://reviews.llvm.org/D7713

  llvm-svn: 231250

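  The reshaped hook, sketched for AArch64 (the enum spelling and the
  simplification to always-LLSC are assumptions):

      // Instead of a bool, the target now names its expansion strategy.
      TargetLoweringBase::AtomicRMWExpansionKind
      AArch64TargetLowering::shouldExpandAtomicRMWInIR(AtomicRMWInst *AI) const {
        // AArch64 expands atomicrmw via load-linked/store-conditional.
        return AtomicRMWExpansionKind::LLSC;
      }
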
* Fix PR22408 - LLVM producing AArch64 TLS relocations that GNU linkers
  cannot handle yet. (Kristof Beyls, 2015-03-04; 7 files, -132/+164)

  As described at http://llvm.org/bugs/show_bug.cgi?id=22408, the GNU linkers
  ld.bfd and ld.gold currently only support a subset of the whole range of
  AArch64 ELF TLS relocations. Furthermore, they assume that some of the code
  sequences to access thread-local variables are produced in a very specific
  sequence. When the sequence is not as the linker expects, it can silently
  mis-relax/mis-optimize the instructions. Even if that weren't the case,
  it's good to produce the exact expected sequence, as that ensures linkers
  can perform optimizing relaxations.

  This patch:

  * implements support for a 16MiB TLS area size instead of a 4GiB TLS area
    size. Ideally clang would grow an -mtls-size option to allow support for
    both, but that's not part of this patch.

  * by default doesn't produce local dynamic access patterns, as even modern
    ld.bfd and ld.gold linkers do not support the associated relocations. An
    option (-aarch64-elf-ldtls-generation) is added to enable generation of
    the local dynamic code sequence, but it is off by default.

  * makes sure that the exact expected code sequence for local dynamic and
    general dynamic accesses is produced, by making use of a new pseudo
    instruction.

  The patch also removes two pre-existing AArch64-specific pseudo SDNode
  instructions (AArch64ISD::TLSDESC_BLR, AArch64ISD::TLSDESC_CALL) that are
  superseded by the new one (TLSDESC_CALLSEQ).

  llvm-svn: 231227

* Remove MCStreamer.h include from MCContext.h and explicitly include it
  where necessary. NFC (Pete Cooper, 2015-03-04; 1 file, -0/+1)

  llvm-svn: 231193

* Remove subtarget dependence in pass pipeline setup for AArch64.
  (Eric Christopher, 2015-03-03; 2 files, -4/+6)

  llvm-svn: 231165

* Avoid copying LiveInterval; this could lead to a double-delete.
  (David Blaikie, 2015-03-03; 1 file, -1/+1)

  llvm-svn: 231154

* Revert "Remove the explicit SDNodeIterator::operator= in favor of the ↵David Blaikie2015-03-031-1/+1
| | | | | | | | | | | implicit default" Accidentally committed a few more of these cleanup changes than intended. Still breaking these out & tidying them up. This reverts commit r231135. llvm-svn: 231136
* Remove the explicit SDNodeIterator::operator= in favor of the implicit
  default. (David Blaikie, 2015-03-03; 1 file, -1/+1)

  There doesn't seem to be any need to assert that iterator assignment is
  between iterators over the same node - if you want to reuse an iterator
  variable to iterate another node, that's perfectly acceptable. Just don't
  mix comparisons between iterators into disjoint sequences, as usual.

  llvm-svn: 231135

* [AArch64] When combining a constant mul of -3, prefer (sub x, (shl x, N)).
  (Chad Rosier, 2015-03-03; 1 file, -9/+9)

  This change only affects codegen when the constant is -3.

  llvm-svn: 231085

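  A quick sanity check of the arithmetic, with N = 2:

      x - (x << 2) = x - 4*x = -3*x
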
* [AArch64] Fix an invalid-iterator-use bug.
  (Sanjoy Das, 2015-03-02; 1 file, -2/+4)

  Summary: In AArch64PromoteConstant::appendAndTransferDominatedUses,
  InsertPts[NewPt] invalidates IPI. Therefore,
  InsertPts[NewPt] = std::move(IPI->second) is not legal. This was caught by
  running make check with http://reviews.llvm.org/D7931.

  Reviewers: t.p.northover, grosbach, bkramer

  Reviewed By: bkramer

  Subscribers: aemerson, llvm-commits

  Differential Revision: http://reviews.llvm.org/D7988

  llvm-svn: 230923

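  A generic illustration of the hazard and the shape of the fix (the names
  here are hypothetical, not the actual ones in AArch64PromoteConstant):

      // DenseMap::operator[] may grow the map and invalidate iterators
      // into it, so move the mapped value out before inserting.
      auto IPI = InsertPts.find(OldPt);
      auto Uses = std::move(IPI->second); // Move the value out first...
      InsertPts[NewPt] = std::move(Uses); // ...operator[] may invalidate IPI.
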
* Make some non-constant static variables non-static or fully const.
  (Benjamin Kramer, 2015-03-01; 1 file, -1/+1)

  Otherwise we have to emit thread-safe initialization for them. NFC.

  llvm-svn: 230894

* Convert push_back loops into append calls.
  (Benjamin Kramer, 2015-02-28; 1 file, -8/+5)

  No functionality change intended.

  llvm-svn: 230849

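  The shape of the conversion, as a sketch (the containers are made up):

      // Before: a growth check on every iteration.
      //   for (const MachineOperand &MO : MI->operands())
      //     Ops.push_back(MO);
      // After: one range append; growth is hoisted out of the loop.
      Ops.append(MI->operands_begin(), MI->operands_end());
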
* ArrayRefize memory operand folding. NFC.
  (Benjamin Kramer, 2015-02-28; 2 files, -8/+7)

  llvm-svn: 230846

* Rewrite MachineOperand::print and MachineInstr::print to avoid uses of
  TM->getSubtargetImpl and propagate to all calls.
  (Eric Christopher, 2015-02-27; 1 file, -3/+3)

  This could be a debugging regression in places where we had a
  TargetMachine and/or MachineFunction but don't have it as part of the
  MachineInstr. Fixing this would require passing a MachineFunction/Function
  down through the print operator, but none of the existing uses in tree
  seem to do this.

  llvm-svn: 230710

* getRegForInlineAsmConstraint wants to use TargetRegisterInfo for a lookup;
  pass that in rather than use a naked call to getSubtargetImpl.
  (Eric Christopher, 2015-02-26; 2 files, -3/+5)

  This involved passing down and around either a TargetMachine or
  TargetRegisterInfo. Update all callers/definitions around the targets and
  SelectionDAG.

  llvm-svn: 230699

* Remove an argument-less call to getSubtargetImpl from TargetLoweringBase.
  (Eric Christopher, 2015-02-26; 1 file, -1/+1)

  This required plumbing a TargetRegisterInfo through
  computeRegisterProperties and into findRepresentativeClass, which uses it
  for register class iteration. This required passing a subtarget into a few
  target-specific initializations of TargetLowering.

  llvm-svn: 230583

* AArch64: Add debug message for large shift constants.
  (Matthias Braun, 2015-02-25; 1 file, -2/+8)

  As requested in code review.

  llvm-svn: 230517

* AArch64: Relax assert about large shift sizes.
  (Matthias Braun, 2015-02-24; 1 file, -3/+9)

  The reason why these large shift sizes happen is that OpaqueConstants
  currently inhibit a lot of DAG combining, but that has to be addressed in
  another commit (like the proposal in D6946).

  Differential Revision: http://reviews.llvm.org/D6940

  llvm-svn: 230355

* Rewrite the global merge pass to be subprogram agnostic for now.
  (Eric Christopher, 2015-02-23; 3 files, -12/+4)

  It was previously using the subtarget to get values for the global offset
  without actually checking each function as it was generating code. Go
  ahead and solidify the current behavior and make the existing FIXMEs more
  prominent.

  As a note, the ARM backend previously had a thumb1 and non-thumb1 set of
  defaults. Only the former was tested, so I've changed the behavior to only
  use that for now.

  llvm-svn: 230245

* Prevent hoisting fmul from THEN/ELSE to IF when there is an fmsub/fmadd
  opportunity. (Chad Rosier, 2015-02-23; 2 files, -0/+31)

  This patch adds the isProfitableToHoist API. For AArch64, we want to
  prevent an fmul from being hoisted in cases where it is more profitable to
  form an fmsub/fmadd.

  Phabricator Review: http://reviews.llvm.org/D7299

  Patch by Lawrence Hu <lawrence@codeaurora.org>

  llvm-svn: 230241

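  A hedged sketch of what such an override can look like (simplified; the
  real code also has to check fast-math/FMA legality):

      bool AArch64TargetLowering::isProfitableToHoist(Instruction *I) const {
        // Keep an fmul next to its single fadd/fsub user so instruction
        // selection can fuse them into fmadd/fmsub.
        if (I->getOpcode() != Instruction::FMul || !I->hasOneUse())
          return true;
        const Instruction *User = cast<Instruction>(*I->user_begin());
        unsigned Opc = User->getOpcode();
        return Opc != Instruction::FAdd && Opc != Instruction::FSub;
      }
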
* CodeGen: convert CCState interface to using ArrayRefs.
  (Tim Northover, 2015-02-21; 1 file, -4/+2)

  Everyone except R600 was manually passing the length of a static array at
  each call site, calculated in a variety of interesting ways. Far easier to
  let ArrayRef handle that.

  There should be no functional change, but out-of-tree targets may have to
  tweak their calls as with these examples.

  llvm-svn: 230118

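  The flavor of the change, sketched on one of the touched methods (the
  particular names are assumptions):

      // Before: explicit length alongside the array.
      //   unsigned I = CCInfo.getFirstUnallocated(GPRArgRegs,
      //                                           array_lengthof(GPRArgRegs));
      // After: ArrayRef carries its own size.
      unsigned I = CCInfo.getFirstUnallocated(GPRArgRegs);
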
* Get the cached subtarget off the MachineFunction rather than inquiring for
  a new one from the TargetMachine. (Eric Christopher, 2015-02-20; 1 file,
  -5/+4)

  llvm-svn: 230000

* [CodeGen] Use ArrayRef instead of std::vector&. NFC.
  (Ahmed Bougacha, 2015-02-19; 1 file, -1/+1)

  The former lets us use SmallVectors. Do so in ARM and AArch64.

  llvm-svn: 229925

* Demote vectors to arrays. No functionality change.
  (Benjamin Kramer, 2015-02-19; 1 file, -73/+44)

  llvm-svn: 229861

* Reverting r229831 due to multiple ARM/PPC/MIPS build-bot failures.
  (Michael Kuperstein, 2015-02-19; 2 files, -11/+8)

  llvm-svn: 229841

* Use std::bitset for SubtargetFeatures.
  (Michael Kuperstein, 2015-02-19; 2 files, -8/+11)

  Previously, subtarget features were a bitfield with the underlying type
  uint64_t. Since several targets (X86 and ARM, in particular) have hit or
  come very close to hitting this bound, switch the features to use a
  bitset.

  No functional change.

  Differential Revision: http://reviews.llvm.org/D7065

  llvm-svn: 229831

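  In sketch form (the exact width constant is an assumption):

      #include <bitset>

      // A fixed-size bitset lifts the 64-feature ceiling of the old
      // uint64_t mask while keeping cheap test/set semantics.
      const unsigned MAX_SUBTARGET_FEATURES = 96;
      typedef std::bitset<MAX_SUBTARGET_FEATURES> FeatureBitset;
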
* Prefer SmallVector::append/insert over push_back loops.
  (Benjamin Kramer, 2015-02-17; 1 file, -8/+2)

  Same functionality, but hoists the vector growth out of the loop.

  llvm-svn: 229500

* AArch64: Safely handle the incoming sret call argument.
  (Andrew Trick, 2015-02-16; 1 file, -11/+12)

  This adds a safe interface to the machine-independent InputArg struct for
  accessing the index of the original (IR-level) argument. When a non-native
  return type is lowered, we generate the hidden machine-level sret argument
  on the fly. Before this fix, we were representing this argument as
  OrigArgIndex == 0, which is an outright lie. In particular, this crashed
  in the AArch64 backend, where we actually try to access the type of the
  original argument.

  Now we use a sentinel value for machine arguments that have no original
  argument index. AArch64, ARM, Mips, and PPC now check for this case before
  accessing the original argument.

  Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering

  llvm-svn: 229413

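  The sentinel interface, sketched from the description above (the member
  names are assumptions):

      struct InputArg {
        // UINT_MAX marks a machine-level argument with no IR-level origin.
        static const unsigned NoArgIndex = UINT_MAX;
        unsigned OrigArgIndex;

        bool isOrigArg() const { return OrigArgIndex != NoArgIndex; }
        unsigned getOrigArgIndex() const {
          assert(isOrigArg() && "Implicit machine-level argument");
          return OrigArgIndex;
        }
      };
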