summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64
Commit message (Collapse)AuthorAgeFilesLines
* [COFF, ARM64] Add support for Windows ARM64 COFF formatMandeep Singh Grang2017-06-2714-5/+218
| | | | | | | | | | | | | | | | Summary: This is the llvm part of the initial implementation to support Windows ARM64 COFF format. I will gradually add more functionality in subsequent patches. Reviewers: ruiu, rnk, t.p.northover, compnerd Reviewed By: ruiu, compnerd Subscribers: aemerson, mgorny, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34705 llvm-svn: 306490
* [AArch64] Inline callee if its target-features are a subset of the callerFlorian Hahn2017-06-272-0/+17
| | | | | | | | | | | | | | | | | | | Summary: Similar to X86, it should be safe to inline callees if their target-features are a subset of the caller. This change matches GCC's inlining behavior with respect to attributes [1]. [1] https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes Reviewers: kristof.beyls, javed.absar, rengolin, t.p.northover Reviewed By: t.p.northover Subscribers: aemerson, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D34698 llvm-svn: 306478
* clang-format a file.Rafael Espindola2017-06-271-59/+64
| | | | | | | It had a few inconsistent indentations that made a followup patch hard to read. llvm-svn: 306474
* [AArch64] Performance enhancements for Cavium ThunderX2 T99Joel Jones2017-06-272-166/+1059
| | | | | | | | | | | | | | This patch enables significant performance enhancements to the Cavium ThunderX2T99 LLVM backend, as observed by running SPEC2K6, by adding more detailed scheduling information. Related Bugzilla bug: http://bugs.llvm.org/show_bug.cgi?id=32562 Patch by: steleman Differential Revision: https://reviews.llvm.org/D31801 llvm-svn: 306462
* [AArch64] Update successor probabilities after ccmp-conversionMatthew Simpson2017-06-271-4/+44
| | | | | | | | | | | | | This patch modifies the conditional compares pass so that it keeps successor probabilities up-to-date after the conversion. Previously, successor probabilities were being normalized to a uniform distribution, even though they may have been heavily biased prior to the conversion (e.g., if one of the edges was the back edge of a loop). This loss of information affected passes later in the pipeline. Differential Revision: https://reviews.llvm.org/D34109 llvm-svn: 306412
* [globalisel][tablegen] Add support for EXTRACT_SUBREG.Daniel Sanders2017-06-271-2/+7
| | | | | | | | | | | | | | | | Summary: After this patch, we finally have test cases that require multiple instruction emission. Depends on D33590 Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls Subscribers: javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D33596 llvm-svn: 306388
* Fix the bug when handling shufflevector for aarch64.Dehao Chen2017-06-261-2/+3
| | | | | | | | | | | | | | Summary: This Fixes https://bugs.llvm.org/show_bug.cgi?id=33600 Reviewers: mssimpso, davidxl, Carrot Reviewed By: mssimpso Subscribers: aemerson, rengolin, sanjoy, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34641 llvm-svn: 306334
* AArch64: legalize G_EXTRACT operations.Tim Northover2017-06-261-0/+6
| | | | | | | This is the dual problem to legalizing G_INSERTs so most of the code and testing was cribbed from there. llvm-svn: 306328
* AArch64: remove all kill flags when extending register liveness.Tim Northover2017-06-261-1/+7
| | | | | | | | | | | | When we forward a stored value to a load and eliminate it entirely we need to make sure the liveness of the register is maintained all the way to its use. Previously we only cleared liveness on the store doing the forwarding, but there could be other killing uses in between. We already do the right thing when the load has to be converted into something else, it was just this one path that skipped it. llvm-svn: 306318
* fix trivial typos in comment, NFCHiroshi Inoue2017-06-241-1/+1
| | | | llvm-svn: 306211
* Simplify the processFixupValue interface. NFC.Rafael Espindola2017-06-241-8/+6
| | | | llvm-svn: 306202
* Remove redundant argument.Rafael Espindola2017-06-241-2/+3
| | | | llvm-svn: 306189
* ARM: move some logic from processFixupValue to applyFixup.Rafael Espindola2017-06-231-2/+4
| | | | | | | | | | | | processFixupValue is called on every relaxation iteration. applyFixup is only called once at the very end. applyFixup is then the correct place to do last minute changes and value checks. While here, do proper range checks again for fixup_arm_thumb_bl. We used to do it, but dropped because of thumb2. We now do it again, but use the thumb2 range. llvm-svn: 306177
* [AArch64][Falkor] Remove some non-existent opcodes from sched detail ↵Geoff Berry2017-06-231-12/+12
| | | | | | regexes. NFC. llvm-svn: 306170
* [AArch64] Prefer Bcc to CBZ/CBNZ/TBZ/TBNZ when NZCV flags can be set for "free".Chad Rosier2017-06-236-3/+387
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch contains a pass that transforms CBZ/CBNZ/TBZ/TBNZ instructions into a conditional branch (Bcc), when the NZCV flags can be set for "free". This is preferred on targets that have more flexibility when scheduling Bcc instructions as compared to CBZ/CBNZ/TBZ/TBNZ (assuming all other variables are equal). This can reduce register pressure and is also the default behavior for GCC. A few examples: add w8, w0, w1 -> cmn w0, w1 ; CMN is an alias of ADDS. cbz w8, .LBB_2 -> b.eq .LBB0_2 ; single def/use of w8 removed. add w8, w0, w1 -> adds w8, w0, w1 ; w8 has multiple uses. cbz w8, .LBB1_2 -> b.eq .LBB1_2 sub w8, w0, w1 -> subs w8, w0, w1 ; w8 has multiple uses. tbz w8, #31, .LBB6_2 -> b.ge .LBB6_2 In looking at all current sub-target machine descriptions, this transformation appears to be either positive or neutral. Differential Revision: https://reviews.llvm.org/D34220. llvm-svn: 306144
* GlobalISel: remove G_SEQUENCE instruction.Tim Northover2017-06-231-4/+0
| | | | | | | | It was trying to do too many things. The basic lumping together of values for legalization purposes is now handled by G_MERGE_VALUES. More complex things involving gaps and odd sizes are handled by G_INSERT sequences. llvm-svn: 306120
* Use a MutableArrayRef. NFC.Rafael Espindola2017-06-211-5/+5
| | | | llvm-svn: 305968
* [AARCH64][LSE] Preliminary support for ARMv8.1 LSE Atomics.Christof Douma2017-06-214-5/+114
| | | | | | | | | | | | | | | | | | | | | | Implemented support to AArch64 codegen for ARMv8.1 Large System Extensions atomic instructions. Where supported, these instructions can provide atomic operations with higher performance. Currently supported operations include: fetch_add, fetch_or, fetch_xor, fetch_smin, fetch_min/max (signed and unsigned), swap, and compare_exchange. This implementation implies sequential-consistency ordering, more relaxed ordering is under development. Subtarget->hasLSE is currently supported for Cavium ThunderX2T99. Patch by Ananth Jasty. Differential Revision: https://reviews.llvm.org/D33586 Change-Id: I82f6d3d64255622791ceb0715b7ab9f4dc4d4b2c llvm-svn: 305893
* [AArch64] Add early exit to promoteLoadFromStore.Florian Hahn2017-06-211-1/+4
| | | | | | | | There should be at most a single kill flag for the promoted operand between the store/load pair. Discussed in https://reviews.llvm.org/D34402. llvm-svn: 305889
* [AArch64] Preserve register flags when promoting a load from store.Florian Hahn2017-06-211-3/+4
| | | | | | | | | | | | | | | | | | | | | Summary: This patch updates promoteLoadFromStore to use the store MachineOperand as the source operand of the of the new instruction instead of creating a new register MachineOperand. This way, the existing register flags are preserved. This fixes PR33468 (https://bugs.llvm.org/show_bug.cgi?id=33468). Reviewers: MatzeB, t.p.northover, junbuml Reviewed By: MatzeB Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34402 llvm-svn: 305885
* clang-format a region.Rafael Espindola2017-06-201-20/+19
| | | | | | It will make a followup patch easier to read. llvm-svn: 305865
* [AArch64][Falkor] Fix MOVZ sched predicate to not assert on non-imm operands ↵Geoff Berry2017-06-191-1/+2
| | | | | | (e.g. blockaddress). llvm-svn: 305752
* [AArch64][Kryo] Add missing write latency for LDAXP, LDXP second destination.Geoff Berry2017-06-191-2/+4
| | | | | | Fixes PR33491 and PR33512. llvm-svn: 305751
* [AArch64][Falkor] Refine load/store increment latencies.Geoff Berry2017-06-191-164/+242
| | | | | | Also fix LDXP & LDAXP write latency to avoid similar assert as PR33491 and PR33512. llvm-svn: 305750
* [AArch64] Fix order of checks in shouldScheduleAdjacent.Florian Hahn2017-06-191-2/+2
| | | | | | | We need to check the opcode of FirstMI before accessing the operands. This caused a buildbot failure during bootstrapping on AArch64. llvm-svn: 305694
* Recommit rL305677: [CodeGen] Add generic MacroFusion passFlorian Hahn2017-06-192-160/+34
| | | | | | | | | | | | | Use llvm::make_unique to avoid ambiguity with MSVC. This patch adds a generic MacroFusion pass, that is used on X86 and AArch64, which both define target-specific shouldScheduleAdjacent functions. This generic pass should make it easier for other targets to implement macro fusion and I intend to add macro fusion for ARM shortly. Differential Revision: https://reviews.llvm.org/D34144 llvm-svn: 305690
* Revert r305677 [CodeGen] Add generic MacroFusion pass.Florian Hahn2017-06-192-34/+160
| | | | | | This causes Windows buildbot failures do an ambiguous call. llvm-svn: 305681
* [CodeGen] Add generic MacroFusion pass.Florian Hahn2017-06-192-160/+34
| | | | | | | | | | | | | | | | | | Summary: This patch adds a generic MacroFusion pass, that is used on X86 and AArch64, which both define target-specific shouldScheduleAdjacent functions. This generic pass should make it easier for other targets to implement macro fusion and I intend to add macro fusion for ARM shortly. Reviewers: craig.topper, evandro, t.p.northover, atrick, MatzeB Reviewed By: MatzeB Subscribers: atrick, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34144 llvm-svn: 305677
* [AArch64] Add indexed check to splitStores. NFC.Nirav Dave2017-06-151-1/+1
| | | | | | | Add explicit check for unhandled cases in preparation for delaying splitStores to post-legalization. llvm-svn: 305471
* [AArch64] Enable FeatureFuseAES for the generic processor model.Florian Hahn2017-06-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: Scheduling AESE/AESMC and AESD/AESIMC instruction pairs back-to-back gives a double digit speedup on benchmarks using those instructions on Cortex-A processors. In GCC, this optimization is part of the generic processor model as well. This change should not have a major performance impact on processors that do not optimize AES instruction pairs, although I only had access to Cortex-A processors for benchmarking. Reviewers: rengolin, kristof.beyls, javed.absar, evandro, silviu.baranga, MatzeB, mcrosier, joelkevinjones, joel_k_jones, bmakam, t.p.northover Reviewed By: evandro Subscribers: sbaranga, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D33836 llvm-svn: 305457
* [AArch64][Falkor] Fix sched details for FDIV, FSQRT, SDIV, UDIVGeoff Berry2017-06-131-11/+50
| | | | llvm-svn: 305310
* AArch64: don't try to emit an add (shifted reg) for SP.Tim Northover2017-06-121-0/+8
| | | | | | | | | | The "Add/sub (shifted reg)" instructions use the 31 encoding for xzr and wzr rather than the SP, so we need to use different variants. Situations where this actually comes up are rare enough (see test-case) that I think falling back to DAG is fine. llvm-svn: 305230
* [Falkor] Enable SW Prefetch.Haicheng Wu2017-06-121-0/+4
| | | | | | | | SW prefetch is good for Falkor. Differential Revision: http://reviews.llvm.org/D34084 llvm-svn: 305199
* Const correctness for TTI::getRegisterBitWidthDaniel Neilson2017-06-121-1/+1
| | | | | | | | | | | | | | Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation. Reviewers: chandlerc, rnk, reames Reviewed By: reames Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D33903 llvm-svn: 305189
* [AArch64] Add fallback in FastISel fp16 conversionsI-Jui (Ray) Sung2017-06-091-1/+5
| | | | | | | | | | | | | | | | | Summary: - Fix assertion failures on F16 to/from int types in FastISel by falling back to regular ISel - Add a testcase of various conversion cases with FastISel (-O0) Reviewers: kristof.beyls, jmolloy, SjoerdMeijer Reviewed By: SjoerdMeijer Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls Differential Revision: https://reviews.llvm.org/D33734 llvm-svn: 305127
* Move Object format code to lib/BinaryFormat.Zachary Turner2017-06-075-5/+5
| | | | | | | | | | | | This creates a new library called BinaryFormat that has all of the headers from llvm/Support containing structure and layout definitions for various types of binary formats like dwarf, coff, elf, etc as well as the code for identifying a file from its magic. Differential Revision: https://reviews.llvm.org/D33843 llvm-svn: 304864
* Sort the remaining #include lines in include/... and lib/....Chandler Carruth2017-06-0612-18/+18
| | | | | | | | | | | | | | | | | | | | | | | | | I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
* [llvm] Remove double semicolonsMandeep Singh Grang2017-06-061-1/+1
| | | | | | | | | | | | Reviewers: craig.topper, arsenm, mehdi_amini Reviewed By: mehdi_amini Subscribers: mehdi_amini, wdng, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33924 llvm-svn: 304767
* [AArch64][Falkor] Model immediate forwarding.Geoff Berry2017-06-021-13/+28
| | | | llvm-svn: 304552
* [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use ↵Eugene Zelenko2017-06-011-2/+5
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 304495
* [AArch64] Enable FeatureFuseAES on Cortex-A53.Florian Hahn2017-05-311-0/+1
| | | | | | It improves performance on Cortex-A53. llvm-svn: 304307
* [AArch64] Enable FeatureFuseAES on Cortex-A73.Florian Hahn2017-05-311-0/+1
| | | | | | It improves performance on Cortex-A73. llvm-svn: 304304
* Add latency info for Exynos interleaved Load/Store instructions.Abderrazek Zaafrani2017-05-311-5/+335
| | | | llvm-svn: 304259
* TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFCMatthias Braun2017-05-301-3/+3
| | | | | | | | | | | TargetPassConfig is not useful for targets that do not use the CodeGen library, so we may just as well store a pointer to an LLVMTargetMachine instead of just to a TargetMachine. While at it, also change the constructor to take a reference instead of a pointer as the TM must not be nullptr. llvm-svn: 304247
* [SelectionDAG] Set ISD::FPOWI to Expand by defaultCraig Topper2017-05-301-3/+0
| | | | | | | | | | | | | | | | | Summary: Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie". This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default. Reviewers: spatel, RKSimon, efriedma Reviewed By: RKSimon Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits Differential Revision: https://reviews.llvm.org/D33530 llvm-svn: 304215
* Fix PR33031: correct the estimate of maximum offset for instructions ↵Kristof Beyls2017-05-301-5/+30
| | | | | | spilling/filling the stack. llvm-svn: 304196
* [AArch64][Falkor] Combine sched details files into one. NFC.Geoff Berry2017-05-282-514/+503
| | | | llvm-svn: 304109
* [AArch64][Falkor] Fix some sched details.Geoff Berry2017-05-284-294/+461
| | | | | | | | | | | | | | | | | | | | - Remove all uses of base sched model entries and set them all to Unsupported so all the opcodes are described in AArch64SchedFalkorDetails.td. - Remove entries for unsupported half-float opcodes. - Remove entries for unsupported LSE extension opcodes. - Add entry for MOVbaseTLS (and set Sched in base td file entry to WriteSys) and a few other pseudo ops. - Fix a few FP load/store with reg offset entries to use the LSLfast predicates. - Add Q size BIF/BIT/BSL entries. - Fix swapped Q/D sized CLS/CLZ/CNT/RBIT entires. - Fix pre/post increment address register latency (this operand is always dest 0). - Fix swapped FCVTHD/FCVTHS/FCVTDH/FCVTDS entries. - Fix XYZ resource over usage on LD[1-4] opcodes. llvm-svn: 304108
* AArch64/PEI: Do not add reserved regs to liveinsMatthias Braun2017-05-271-2/+5
| | | | | | | We do not track liveness for reserved registers. It is unnecessary to add them to block livein lists. llvm-svn: 304059
* [AArch64][GlobalISel] Add the Localizer pass for the O0 pipelineQuentin Colombet2017-05-271-1/+9
| | | | | | | This should fix most of the issue we have right now with constants being spilled all over the place. llvm-svn: 304052
OpenPOWER on IntegriCloud