summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64
Commit message (Collapse)AuthorAgeFilesLines
* Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and ↵Daniel Sanders2015-09-159-37/+33
| | | | | | | | related. NFC. Eric has replied and has demanded the patch be reverted. llvm-svn: 247702
* Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* ↵Daniel Sanders2015-09-159-33/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247692
* Revert r247684 - Replace Triple with a new TargetTuple ...Daniel Sanders2015-09-159-37/+33
| | | | | | LLDB needs to be updated in the same commit. llvm-svn: 247686
* Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.Daniel Sanders2015-09-159-33/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247683
* Improve ISel using across lane min/max reductionJun Bum Lim2015-09-141-53/+190
| | | | | | | | | | | | | | | | | | | | In vectorized integer min/max reduction code, the final "reduce" step is sub-optimal. In AArch64, this change wll combine : %svn0 = vector_shuffle %0, undef<2,3,u,u> %smax0 = smax %0, svn0 %svn3 = vector_shuffle %smax0, undef<1,u,u,u> %sc = setcc %smax0, %svn3, gt %n0 = extract_vector_elt %sc, #0 %n1 = extract_vector_elt %smax0, #0 %n2 = extract_vector_elt $smax0, #1 %result = select %n0, %n1, n2 becomes : %1 = smaxv %0 %result = extract_vector_elt %1, 0 This change extends r246790. llvm-svn: 247575
* [CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit.Ahmed Bougacha2015-09-112-6/+10
| | | | | | | | | | | | | | | We used to have this magic "hasLoadLinkedStoreConditional()" callback, which really meant two things: - expand cmpxchg (to ll/sc). - expand atomic loads using ll/sc (rather than cmpxchg). Remove it, and, instead, introduce explicit callbacks: - bool shouldExpandAtomicCmpXchgInIR(inst) - AtomicExpansionKind shouldExpandAtomicLoadInIR(inst) Differential Revision: http://reviews.llvm.org/D12557 llvm-svn: 247429
* [CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind.Ahmed Bougacha2015-09-112-4/+3
| | | | | | This lets us generalize its usage to the other atomic instructions. llvm-svn: 247428
* Fix an undefined behavior introduces in r247234Steven Wu2015-09-101-1/+1
| | | | llvm-svn: 247296
* [ADT] Switch a bunch of places in LLVM that were doing single-characterChandler Carruth2015-09-101-1/+1
| | | | | | | splits to actually use the single character split routine which does less work, and in a debug build is *substantially* faster. llvm-svn: 247245
* [AArch64] Match FI+offset in STNP addressing mode.Ahmed Bougacha2015-09-102-0/+26
| | | | | | | | | | | | | | | First, we need to teach isFrameOffsetLegal about STNP. It already knew about the STP/LDP variants, but those were probably never exercised, because it's only the load/store optimizer that generates STP/LDP, and the only user of the method is frame lowering, which runs earlier. The STP/LDP cases were wrong: they didn't take into account the fact that they return two results, not one, so the immediate offset will be the 4th operand, not the 3rd. Follow-up to r247234. llvm-svn: 247236
* [AArch64] Match base+offset in STNP addressing mode.Ahmed Bougacha2015-09-101-0/+16
| | | | | | Followup to r247231. llvm-svn: 247234
* [AArch64] Support selecting STNP.Ahmed Bougacha2015-09-103-0/+78
| | | | | | | | | | | | | | | | | | We could go through the load/store optimizer and match STNP where we would have matched a nontemporal-annotated STP, but that's not reliable enough, as an opportunistic optimization. Insetad, we can guarantee emitting STNP, by matching them at ISel. Since there are no single-input nontemporal stores, we have to resort to some high-bits-extracting trickery to generate an STNP from a plain store. Also, we need to support another, LDP/STP-specific addressing mode, base + signed scaled 7-bit immediate offset. For now, only match the base. Let's make it smart separately. Part of PR24086. llvm-svn: 247231
* [CostModel][AArch64] Remove amortization factor for some of the vector ↵Silviu Baranga2015-09-091-4/+5
| | | | | | | | | | | | | | | | | select instructions Summary: We are not scalarizing the wide selects in codegen for i16 and i32 and therefore we can remove the amortization factor. We still have issues with i64 vectors in codegen though. Reviewers: mcrosier Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12724 llvm-svn: 247156
* Remove white space (test commit)Jun Bum Lim2015-09-081-1/+1
| | | | llvm-svn: 247021
* [AArch64] Improve ISel using across lane addition reduction.Chad Rosier2015-09-031-0/+99
| | | | | | | | | | | | | | | | | | | | In vectorized add reduction code, the final "reduce" step is sub-optimal. This change wll combine : ext v1.16b, v0.16b, v0.16b, #8 add v0.4s, v1.4s, v0.4s dup v1.4s, v0.s[1] add v0.4s, v1.4s, v0.4s into addv s0, v0.4s PR21371 http://reviews.llvm.org/D12325 Patch by Jun Bum Lim <junbuml@codeaurora.org>! llvm-svn: 246790
* Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR."Chad Rosier2015-09-031-77/+21
| | | | | | | | This reverts commit r246769. This appears to have broken Multisource/Benchmarks/tramp3d-v4. llvm-svn: 246782
* [AArch64] Improve load/store optimizer to handle LDUR + LDR.Chad Rosier2015-09-031-21/+77
| | | | | | | | | | | This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs. PR24465 http://reviews.llvm.org/D12116 Many thanks to Ahmed and Michael for fixes and code review. llvm-svn: 246769
* [AArch64] Reuse MayLoad. NFC.Chad Rosier2015-09-031-1/+1
| | | | llvm-svn: 246767
* [AArch64] More consistently separate asm opc and operands with '\t'.Ahmed Bougacha2015-09-021-30/+30
| | | | | | Somehow missed these in r246686. llvm-svn: 246687
* [AArch64] Consistently separate asm opc and operands with '\t'.Ahmed Bougacha2015-09-021-17/+17
| | | | | | | | Some of the instructions use ' ', which drives OCD-me nuts. Let's put an end to this. NFC-ish: hopefully nobody cares about whitespace. llvm-svn: 246686
* [AArch64] Lower READCYCLECOUNTER using MRS PMCCTNR_EL0.Ahmed Bougacha2015-09-015-6/+25
| | | | | | | | | | This matches the ARM behavior. In both cases, the register is part of the optional Performance Monitors extension, so, add the feature, and enable it for the A-class processors we support. Differential Revision: http://reviews.llvm.org/D12425 llvm-svn: 246555
* [AArch64] Turn on by default interleaved access vectorizationSilviu Baranga2015-09-011-0/+2
| | | | | | | | | | | | | | | | | Summary: This change turns on by default interleaved access vectorization for AArch64. We also clean up some tests which were spedifically enabling this behaviour. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12149 llvm-svn: 246542
* [AArch64][CollectLOH] Remove an invalid assertion and add a test case ↵Quentin Colombet2015-08-311-3/+11
| | | | | | | | exposing it. rdar://problem/22491525 llvm-svn: 246472
* AArch64: Fix loads to lower NEON vector lanes using GPR registersMatthias Braun2015-08-311-1/+4
| | | | | | | | | | | | | The ISelLowering code turned insertion turned the element for the lowest lane of a BUILD_VECTOR into an INSERT_SUBREG, this prohibited the patterns for SCALAR_TO_VECTOR(Load) to match later. Restrict this to cases without a load argument. Reported in rdar://22223823 Differential Revision: http://reviews.llvm.org/D12467 llvm-svn: 246462
* [MIR Serialization] static -> static const in ↵Hal Finkel2015-08-301-2/+2
| | | | | | | | | getSerializable*MachineOperandTargetFlags Make the arrays 'static const' instead of just 'static'. Post-commit review comment from Roman Divacky on IRC. NFC. llvm-svn: 246376
* [AArch64][CollectLOH] Fix a regression that prevented us to detect chains ofQuentin Colombet2015-08-271-2/+6
| | | | | | | | | | | more than 2 instructions. I introduced this regression a while back and did not noticed it because I somehow forgot to push the initial test cases for the pass! Fix that as well! llvm-svn: 246239
* [AArch64] Remove a use-after-free when collecting stats.Chad Rosier2015-08-261-4/+4
| | | | | | | The call to mergePairedInsns() deletes MI, so the later use by isUnscaledLdSt() is referencing freed memory. llvm-svn: 246033
* [AArch64] Unify the integer min/max vector selection patterns with the ↵Silviu Baranga2015-08-262-52/+16
| | | | | | | | | | | | | | | | | | | | intrinsic ones Summary: This change lowers the aarch64 integer vector min/max intrinsic nodes to generic min/max nodes and replaces the intrinsic selection patterns with the generic ones. There should already be testing in place for this, so no further tests were added. Reviewers: jmolloy Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12276 llvm-svn: 246030
* FastISel: Factor out common code; NFC intendedMatthias Braun2015-08-261-40/+5
| | | | | | | | | This should be no functional change but for the record: For three cases in X86FastISel this will change the order in which the FalseMBB and TrueMBB of a conditional branch is addedd to the successor/predecessor lists. llvm-svn: 245997
* Add missing break in AArch64DAGToDAGISel::Select() switch caseMehdi Amini2015-08-231-0/+1
| | | | | | | Reported by coverity. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245800
* AArch64: Fix cmp;ccmp orderingMatthias Braun2015-08-201-3/+10
| | | | | | | | | | | | When producing conditional compare sequences for or operations we need to negate the operands and the finally tested flags. The thing is if we negate the finally tested flags this equals a logical negation of all previously emitted expressions. There was a case missing where we have to order OR expressions so they get emitted first. This fixes http://llvm.org/PR24459 llvm-svn: 245641
* AArch64: Do not create CCMP on multiple users.Matthias Braun2015-08-201-1/+1
| | | | | | | | | Create CMP;CCMP sequences from and/or trees does not gain us anything if the and/or tree is materialized to a GP register anyway. While most of the code already checked for hasOneUse() there was one important case missing. llvm-svn: 245640
* [AArch64][FastISel] Don't fold shifts with UB.Juergen Ributzka2015-08-191-13/+38
| | | | | | | | | | We are already falling back to SelectionDAG when encountering an shift with UB. This adds the same checks for shifts with UB that get folded into arithmetic or logical operations. This fixes rdar://problem/22345295. llvm-svn: 245499
* [AArch64] Improve short-form diags on long-form Match_InvalidOperand.Ahmed Bougacha2015-08-191-10/+18
| | | | | | | | | Since r244955, we try to use the short-form ErrorInfo when both tries failed, and the long-form match failed on a suffix operand. However, this means we sometimes mix ErrorInfo and MatchResult (one manifestation of this being PR24498). Instead, restore both. llvm-svn: 245469
* Revert "[AArch64] Simplify/refactor code to ease code review. NFC."Renato Golin2015-08-191-32/+18
| | | | | | | This reverts commit r245443, as it broke AArch64 test-suite tramp3d with an assert "Reg && "Null register has no regunits". llvm-svn: 245455
* [AArch64] Simplify/refactor code to ease code review. NFC.Chad Rosier2015-08-191-18/+32
| | | | llvm-svn: 245443
* MIR Serialization: Serialize the operand's bit mask target flags.Alex Lorenz2015-08-182-0/+39
| | | | | | | | | This commit adds support for bit mask target flag serialization to the MIR printer and the MIR parser. It also adds support for the machine operand's target flag serialization to the AArch64 target. Reviewers: Duncan P. N. Exon Smith llvm-svn: 245383
* [AArch64] Simplify the logic for computing in bounds offset. NFC.Chad Rosier2015-08-181-10/+6
| | | | llvm-svn: 245307
* [CostModel][AArch64] Increase cost of vector insert element and add missing ↵Silviu Baranga2015-08-171-1/+33
| | | | | | | | | | | | | | | | | | | | | | cast costs Summary: Increase the estimated costs for insert/extract element operations on AArch64. This is motivated by results from benchmarking interleaved accesses. Add missing costs for zext/sext/trunc instructions and some integer to floating point conversions. These costs were previously calculated by scalarizing these operation and were affected by the cost increase of the insert/extract element operations. Reviewers: rengolin Subscribers: mcrosier, aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D11939 llvm-svn: 245226
* Remove hand-rolled matching for fmin and fmax.James Molloy2015-08-171-98/+2
| | | | | | SDAGBuilder now does this all for us. llvm-svn: 245198
* Remove redundant TargetFrameLowering::getFrameIndexOffset virtualJames Y Knight2015-08-152-9/+0
| | | | | | | | | | | function. This was the same as getFrameIndexReference, but without the FrameReg output. Differential Revision: http://reviews.llvm.org/D12042 llvm-svn: 245148
* [AArch64] Fix FMLS scalar-indexed-from-2s-after-neg patterns.Ahmed Bougacha2015-08-141-1/+3
| | | | | | | | | We canonicalize V64 vectors to V128 through insert_subvector: the other FMLA/FMLS/FMUL/FMULX patterns match that already, but this one doesn't, so we'd fail to match fmls and generate fneg+fmla instead. The vector equivalents are already tested and functional. llvm-svn: 245107
* Revert "Centralize the information about which object format we are using."Rafael Espindola2015-08-141-4/+4
| | | | | | | | | | | | | | | | | | | | | This reverts commit r245047. It was failing on the darwin bots. The problem was that when running ./bin/llc -march=msp430 llc gets to if (TheTriple.getTriple().empty()) TheTriple.setTriple(sys::getDefaultTargetTriple()); Which means that we go with an arch of msp430 but a triple of x86_64-apple-darwin14.4.0 which fails badly. That code has to be updated to select a triple based on the value of march, but that is not a trivial fix. llvm-svn: 245062
* Centralize the information about which object format we are using.Rafael Espindola2015-08-141-4/+4
| | | | | | | | | | | Other than some places that were handling unknown as ELF, this should have no change. The test updates are because we were detecting arm-coff or x86_64-win64-coff as ELF targets before. It is not clear if the enum should live on the Triple. At least now it lives in a single location and should be easier to move somewhere else. llvm-svn: 245047
* [AArch64] FMINNAN/FMAXNAN on f16 is not legal.James Molloy2015-08-141-2/+4
| | | | | | | | Spotted by Ahmed - in r244594 I inadvertently marked f16 min/max as legal. I've reverted it here, and marked min/max on scalar f16's as promote. I've also added a testcase. The test just checks that the compiler doesn't fall over - it doesn't create fmin nodes for f16 yet. llvm-svn: 245035
* [AArch64] Provide "too few operands" diags on short-form NEON also.Ahmed Bougacha2015-08-131-0/+10
| | | | | | | | | | | We used to just say "invalid type suffix for instruction", which is misleading. This is because we fallback to the long-form matcher if the short-form matcher failed, losing the error information on the way. Save it, so that we can provide a little better diagnostics when the long-form matcher thinks a suffix is the cause of the error. llvm-svn: 244955
* [AArch64] Also custom-lowering mismatched vector/f16 FCOPYSIGN.Ahmed Bougacha2015-08-131-11/+5
| | | | | | | | | We can lower them using our cool tricks if we fpext/fptrunc the second input, like we do for f32/f64. Follow-up to r243924, r243926, and r244858. llvm-svn: 244860
* PseudoSourceValue: Replace global manager with a manager in a machine function.Alex Lorenz2015-08-113-24/+31
| | | | | | | | | | | | | | | | | | | | | | This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka llvm-svn: 244693
* [AArch64] Match fminnum/fmaxnum for vector fminnm/fmaxnm instead of an ↵James Molloy2015-08-112-8/+17
| | | | | | | | | | | intrinsic. Lower Intrinsic::aarch64_neon_fmin/fmax to fminnum/fmannum and match that instead. Minimal functional change: - Extra tests added because coverage of scalar fminnm/fmaxnm instructions was nonexistant. - f16 test updated because now we actually generate scalar fminnm/fmaxnm we no longer need to bail out to a libcall! llvm-svn: 244595
* [AArch64] Replace the custom AArch64ISD::FMIN/MAX nodes with ISD::FMINNAN/MAXNANJames Molloy2015-08-113-19/+15
| | | | | | NFCI. This just removes custom ISDNodes that are no longer needed. llvm-svn: 244594
OpenPOWER on IntegriCloud