summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [GlobalISel] Drop workaround for Legalizer member/class sharing a name. NFC.Ahmed Bougacha2016-12-151-1/+1
| | | | | | | | MachineLegalizer used to be the name of both the class and the member, causing GCC errors. r276522 fixed that by renaming the member to just 'Legalizer'. The 'class' workaround isn't necessary anymore; drop it. llvm-svn: 289848
* MachineScheduler: Export function to construct "default" scheduler.Matthias Braun2016-11-281-0/+10
| | | | | | | | | | | | | | | | | | This makes the createGenericSchedLive() function that constructs the default scheduler available for the public API. This should help when you want to get a scheduler and the default list of DAG mutations. This also shrinks the list of default DAG mutations: {Load|Store}ClusterDAGMutation and MacroFusionDAGMutation are no longer added by default. Targets can easily add them if they need them. It also makes it easier for targets to add alternative/custom macrofusion or clustering mutations while staying with the default createGenericSchedLive(). It also saves the callback back and forth in TargetInstrInfo::enableClusterLoads()/enableClusterStores(). Differential Revision: https://reviews.llvm.org/D26986 llvm-svn: 288057
* AArch64: Use DeadRegisterDefinitionsPass before regalloc.Matthias Braun2016-11-161-3/+4
| | | | | | | | | Doing this before register allocation reduces register pressure as we do not even have to allocate a register for those dead definitions. Differential Revision: https://reviews.llvm.org/D26111 llvm-svn: 287076
* GlobalISel: Fix indentation. NFCDiana Picus2016-11-141-1/+1
| | | | llvm-svn: 286808
* AArch64 ILP32 relocations for assembly and ELFJoel Jones2016-10-241-3/+9
| | | | | | | | | | | | | | | | | | | Summary: Add relocations for AArch64 ILP32. Includes: - Addition of definitions for R_AARCH32_* - Definition of new -target-abi: ilp32 - Definition of data layout string - Tests for added relocations. Not comprehensive, but matches existing tests for 64-bit. Renames "CHECK-OBJ" to "CHECK-OBJ-LP64". - Tests for llvm-readobj Reviewers: zatrazz, peter.smith, echristo, t.p.northover Subscribers: aemerson, rengolin, mehdi_amini Differential Revision: https://reviews.llvm.org/D25159 llvm-svn: 284973
* GlobalISel: rename legalizer components to match others.Tim Northover2016-10-141-6/+6
| | | | | | | | | | The previous names were both misleading (the MachineLegalizer actually contained the info tables) and inconsistent with the selector & translator (in having a "Machine") prefix. This should make everything sensible again. The only functional change is the name of a couple of command-line options. llvm-svn: 284287
* GlobalISel: select G_GLOBAL_VALUE uses on AArch64.Tim Northover2016-10-101-1/+1
| | | | llvm-svn: 283809
* Move the global variables representing each Target behind accessor functionMehdi Amini2016-10-091-3/+3
| | | | | | | | This avoids "static initialization order fiasco" Differential Revision: https://reviews.llvm.org/D25412 llvm-svn: 283702
* [AArch64] Avoid generating indexed vector instructions for ExynosSebastian Pop2016-10-081-0/+2
| | | | | | | | | | | | | | | | | | Avoid generating indexed vector instructions for Exynos. This is needed for fmla/fmls/fmul/fmulx. For example, the instruction fmla v0.4s, v1.4s, v2.s[1] is less efficient than the instructions dup v2.4s, v2.s[1] fmla v0.4s, v1.4s, v2.4s Patch written by Abderrazek Zaafrani. Differential Revision: https://reviews.llvm.org/D21571 llvm-svn: 283663
* Move AArch64BranchRelaxation to generic codeMatt Arsenault2016-10-061-2/+2
| | | | llvm-svn: 283459
* Revert "[AArch64] Use the reciprocal estimation machinery"Evandro Menezes2016-09-201-26/+2
| | | | | | | This reverts commit b7d42b0048f65346e9fa37fb65defeea7ce8c337 per request by Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW). llvm-svn: 282000
* [AArch64] Register passes so they can be run by llcDiana Picus2016-08-011-48/+74
| | | | | | | | | | | | | | Initialize all AArch64-specific passes in the TargetMachine so they can be run by llc. This can lead to conflicts in opt with some command line options that share the same name as the pass, so I took this opportunity to do some cleanups: * rename all relevant command line options from "aarch64-blah" to "aarch64-enable-blah" and update the tests accordingly * run clang-format on their declarations * move all these declarations to a common place (the TargetMachine) as opposed to having them scattered around (AArch64BranchRelaxation and AArch64AddressTypePromotion were the only offenders) llvm-svn: 277322
* [GlobalISel] Introduce an instruction selector.Ahmed Bougacha2016-07-271-2/+20
| | | | | | | | And implement it for AArch64, supporting x/w ADD/OR. Differential Revision: https://reviews.llvm.org/D22373 llvm-svn: 276875
* Fix a GCC error due to this member name also being a type name. ThisChandler Carruth2016-07-231-3/+3
| | | | | | | should fix the build with GCC 4.9 at least. Not sure if this is the right name or fix, but I've followed up on the original commit. llvm-svn: 276522
* GlobalISel: implement legalization pass, with just one transformation.Tim Northover2016-07-221-0/+12
| | | | | | | | | This adds the actual MachineLegalizeHelper to do the work and a trivial pass wrapper that legalizes all instructions in a MachineFunction. Currently the only transformation supported is splitting up a vector G_ADD into one acting on smaller vectors. llvm-svn: 276461
* [AArch64] Register AArch64LoadStoreOptimizer so it can be run by llc ↵Geoff Berry2016-07-201-0/+1
| | | | | | -run-pass. NFCI. llvm-svn: 276193
* [AArch64] Change the preferred alignment for char and short to word alignment.Chad Rosier2016-07-071-2/+2
| | | | | | | | | | The commit reinstates r273279, which was informally approved. Original Review: http://reviews.llvm.org/D21414 This reverts commit ca632c91aaa7cafc50942f890c49f727a046ace1. llvm-svn: 274790
* Revert "[AArch64] Change the preferred alignment for char and short to word ↵Chad Rosier2016-07-071-2/+2
| | | | | | | | alignment" This reverts commit r273279 as the change was not properly approved. llvm-svn: 274768
* fix documentation comment. NFC.Junmo Park2016-07-061-2/+1
| | | | llvm-svn: 274704
* [AArch64] Change the preferred alignment for char and short to word alignmentEvandro Menezes2016-06-211-2/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D21414 llvm-svn: 273279
* AArch64: Do not test for CPUs, use SubtargetFeaturesMatthias Braun2016-06-021-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Testing for specific CPUs has a number of problems, better use subtarget features: - When some tweak is added for a specific CPU it is often desirable for the next version of that CPU as well, yet we often forget to add it. - It is hard to keep track of checks scattered around the target code; Declaring all target specifics together with the CPU in the tablegen file is a clear representation. - Subtarget features can be tweaked from the command line. To discourage people from using CPU checks in the future I removed the isCortexXX(), isCyclone(), ... functions. I added an getProcFamily() function for exceptional circumstances but made it clear in the comment that usage is discouraged. Reformat feature list in AArch64.td to have 1 feature per line in alphabetical order to simplify merging and sorting for out of tree tweaks. No functional change intended. Differential Revision: http://reviews.llvm.org/D20762 llvm-svn: 271555
* Delete Reloc::Default.Rafael Espindola2016-05-181-11/+22
| | | | | | | | | | | | Having an enum member named Default is quite confusing: Is it distinct from the others? This patch removes that member and instead uses Optional<Reloc> in places where we have a user input that still hasn't been maped to the default value, which is now clear has no be one of the remaining 3 options. llvm-svn: 269988
* Trivial cleanups.Rafael Espindola2016-05-181-1/+1
| | | | | | | This just clang formats and cleans comments in an area I am about to post a patch for review. llvm-svn: 269946
* CodeGen: Move TargetPassConfig from Passes.h to an own header; NFCMatthias Braun2016-05-101-0/+1
| | | | | | | | Many files include Passes.h but only a fraction needs to know about the TargetPassConfig class. Move it into an own header. Also rename Passes.cpp to TargetPassConfig.cpp while we are at it. llvm-svn: 269011
* [AArch64] Use the reciprocal estimation machineryEvandro Menezes2016-05-041-2/+27
| | | | | | | This patch adds support for estimating the square root, its reciprocal and division or reciprocal using the combiner generic reciprocal machinery. llvm-svn: 268539
* [GlobalISel] Move GISelAccessor class into public headersTom Stellard2016-04-141-6/+6
| | | | | | | | | | Reviewers: qcolombet Subscribers: joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19120 llvm-svn: 266348
* [AArch64] Get rid of some GlobalISel ifdefs.Quentin Colombet2016-04-071-3/+1
| | | | llvm-svn: 265725
* [GlobalISel] Add RegBankSelect hooks into the pass pipeline.Quentin Colombet2016-04-071-0/+6
| | | | | | | Now, RegBankSelect will happen after the IRTranslation and the target may optionally add additional passes in between. llvm-svn: 265716
* [AArch64] Teach the subtarget how to get to the RegisterBankInfo.Quentin Colombet2016-04-061-0/+28
| | | | | | | | | | | Rework the access to GlobalISel APIs to contain how much of the APIs we need to access for the final executable to build when GlobalISel is not built. This prevents massive usage of ifdefs in various places. Now, all the GlobalISel ifdefs will be happing only in AArch64TargetMachine.cpp. llvm-svn: 265567
* AArch64: avoid clobbering SP for dead MOVimm pseudos.Tim Northover2016-04-011-1/+3
| | | | | | | | We were producing ORR, which actually defines a GPR32sp rather than a GPR32. Should fix PR23209. llvm-svn: 265198
* [Aarch64] Turn on the LoopDataPrefetch pass for CycloneAdam Nemet2016-03-301-1/+1
| | | | llvm-svn: 264811
* [Aarch64] Add pass LoopDataPrefetch for CycloneAdam Nemet2016-03-181-0/+13
| | | | | | | | | | | | | | | | | | | | Summary: This wires up the pass for Cyclone but keeps it off for now because we need a few more TTIs. The getPrefetchMinStride value is not very well tuned right now but it works well with CFP2006/433.milc which motivated this. Tests will be added as part of the upcoming large-stride prefetching patch. Reviewers: t.p.northover Subscribers: llvm-commits, aemerson, hfinkel, rengolin Differential Revision: http://reviews.llvm.org/D17943 llvm-svn: 263770
* [AArch64] Initialize GlobalISel as part of the target initialization.Quentin Colombet2016-03-081-0/+2
| | | | llvm-svn: 262897
* [AArch64] Add pass to remove redundant copy after RAJun Bum Lim2016-02-161-0/+9
| | | | | | | | | | | | | | | | | | | | | Summary: This change will add a pass to remove unnecessary zero copies in target blocks of cbz/cbnz instructions. E.g., the copy instruction in the code below can be removed because the cbz jumps to BB1 when x0 is zero : BB0: cbz x0, .BB1 BB1: mov x0, xzr Jun Reviewers: gberry, jmolloy, HaoLiu, MatzeB, mcrosier Subscribers: mcrosier, mssimpso, haicheng, bmakam, llvm-commits, aemerson, rengolin Differential Revision: http://reviews.llvm.org/D16203 llvm-svn: 261004
* [AArch64] Plug the beginning of the GlobalISel pipeline.Quentin Colombet2016-02-111-0/+13
| | | | llvm-svn: 260569
* constify the Function parameter to the TTI creation callback andEric Christopher2015-09-161-1/+1
| | | | | | propagate to all callers/users/etc. llvm-svn: 247864
* [AArch64] Lower interleaved memory accesses to ldN/stN intrinsics. This ↵Hao Liu2015-06-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | patch also adds a function to calculate the cost of interleaved memory accesses. E.g. Lower an interleaved load: %wide.vec = load <8 x i32>, <8 x i32>* %ptr %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6> %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7> into: %ld2 = { <4 x i32>, <4 x i32> } call llvm.aarch64.neon.ld2(%ptr) %vec0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0 %vec1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1 E.g. Lower an interleaved store: %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11> store <12 x i32> %i.vec, <12 x i32>* %ptr into: %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3> %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7> %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11> call void llvm.aarch64.neon.st3(%sub.v0, %sub.v1, %sub.v2, %ptr) Differential Revision: http://reviews.llvm.org/D10533 llvm-svn: 240754
* Clean up redundant copies of Triple objects. NFCDaniel Sanders2015-06-161-7/+7
| | | | | | | | | | | | | | Summary: Reviewers: rengolin Reviewed By: rengolin Subscribers: llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10382 llvm-svn: 239823
* [AArch64] Revert r239711 again. We need to discuss how to share code between ↵Hao Liu2015-06-151-8/+0
| | | | | | AArch64 and ARM backend. llvm-svn: 239713
* [AArch64] Match interleaved memory accesses into ldN/stN instructions.Hao Liu2015-06-151-0/+8
| | | | | | | Re-commit after adding "-aarch64-neon-syntax=generic" to fix the failure on OS X. This patch was firstly committed in r239514, then reverted in r239544 because of a syntax incompatible failure on OS X. llvm-svn: 239711
* Replace string GNU Triples with llvm::Triple in TargetMachine. NFC.Daniel Sanders2015-06-111-13/+11
| | | | | | | | | | | | | | | | | | Summary: For the moment, TargetMachine::getTargetTriple() still returns a StringRef. This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rengolin Reviewed By: rengolin Subscribers: ted, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10362 llvm-svn: 239554
* This reverts commit r239529 and r239514.Rafael Espindola2015-06-111-8/+0
| | | | | | | | | Revert "[AArch64] Match interleaved memory accesses into ldN/stN instructions." Revert "Fixing MSVC 2013 build error." The test/CodeGen/AArch64/aarch64-interleaved-accesses.ll test was failing on OS X. llvm-svn: 239544
* Replace string GNU Triples with llvm::Triple in computeDataLayout(). NFC.Daniel Sanders2015-06-111-5/+4
| | | | | | | | | | | | | | | | Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rengolin Reviewed By: rengolin Subscribers: llvm-commits, jfb, rengolin Differential Revision: http://reviews.llvm.org/D10361 llvm-svn: 239538
* [AArch64] Match interleaved memory accesses into ldN/stN instructions.Hao Liu2015-06-111-0/+8
| | | | | | | | | | | | | | | | | | | | | | | Add a pass AArch64InterleavedAccess to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsic. As Loop Vectorizor disables optimization on interleaved accesses by default, this optimization is also disabled by default. To enable it by "-aarch64-interleaved-access-opt=true" E.g. Transform an interleaved load (Factor = 2): %wide.vec = load <8 x i32>, <8 x i32>* %ptr %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6> ; Extract even elements %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7> ; Extract odd elements Into: %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr) %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0 %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1 E.g. Transform an interleaved store (Factor = 2): %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7> ; Interleaved vec store <8 x i32> %i.vec, <8 x i32>* %ptr Into: %v0 = shuffle %i.vec, undef, <0, 1, 2, 3> %v1 = shuffle %i.vec, undef, <4, 5, 6, 7> call void aarch64.neon.st2(%v0, %v1, %ptr) llvm-svn: 239514
* Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and ↵Daniel Sanders2015-06-101-1/+2
| | | | | | | | | | | | | | | | | | create*MCSubtargetInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10311 llvm-svn: 239467
* [GlobalMerge] Take into account minsize on Global users' parents.Ahmed Bougacha2015-06-041-3/+7
| | | | | | | | | | Now that we can look at users, we can trivially do this: when we would have otherwise disabled GlobalMerge (currently -O<3), we can just run it for minsize functions, as it's usually a codesize win. Differential Revision: http://reviews.llvm.org/D10054 llvm-svn: 239087
* [AArch64] Disable complex GEP optimization by default.James Molloy2015-04-221-1/+1
| | | | | | | | Enough concerns were raised that this optimization is pessimising some code patterns. The obvious fix, to add a Reassociate run afterwards, causes even more pessimisation in some cases due to fewer complex addressing modes being matched. As there isn't a trivial fix for this, backing this out by default until someone gets a chance to fix the addressing mode matcher. llvm-svn: 235491
* [CodeGen] Split -enable-global-merge into ARM and AArch64 options.Ahmed Bougacha2015-04-111-1/+8
| | | | | | | | | | | | | Currently, there's a single flag, checked by the pass itself. It can't force-enable the pass (and is on by default), because it might not even have been created, as that's the targets decision. Instead, have separate explicit flags, so that the decision is consistently made in the target. Keep the flag as a last-resort "force-disable GlobalMerge" for now, for backwards compatibility. llvm-svn: 234666
* [AArch64, ARM] Enable GlobalMerge with -O3 rather than -O1.Ahmed Bougacha2015-03-231-1/+1
| | | | | | | | | | | | | | | | | | | | The pass used to be enabled by default with CodeGenOpt::Less (-O1). This is too aggressive, considering the pass indiscriminately merges all globals together. Currently, performance doesn't always improve, and, on code that uses few globals (e.g., the odd file- or function- static), more often than not is degraded by the optimization. Lengthy discussion can be found on llvmdev (AArch64-focused; ARM has similar problems): http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-February/082800.html Also, it makes tooling and debuggers less useful when dealing with globals and data sections. GlobalMerge needs to better identify those cases that benefit, and this will be done separately. In the meantime, move the pass to run with -O3 rather than -O1, on both ARM and AArch64. llvm-svn: 233024
* Remove the bare getSubtargetImpl call from the AArch64 port. As partEric Christopher2015-03-211-1/+0
| | | | | | | of this add a test that shows we can generate code for functions that specifically enable a subtarget feature. llvm-svn: 232884
OpenPOWER on IntegriCloud