summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64
Commit message (Collapse)AuthorAgeFilesLines
...
* [MachineOutliner] Add `useMachineOutliner` target hookJessica Paquette2018-04-041-0/+3
| | | | | | | | | | | | | The MachineOutliner has a bunch of target hooks that will call llvm_unreachable if the target doesn't implement them. Therefore, if you enable the outliner on such a target, it'll just crash. It'd be much better if it'd just *not* run the outliner at all in this case. This commit adds a hook to TargetInstrInfo that returns false by default. Targets that implement the hook make it return true. The outliner checks the return value of this hook to decide whether or not to continue. llvm-svn: 329220
* [AArch64] Change std::sort to llvm::sort in response to r327219Mandeep Singh Grang2018-04-041-4/+4
| | | | | | | | | | | | | | | | | | | | | Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: t.p.northover, jmolloy, RKSimon, rengolin Reviewed By: rengolin Subscribers: dexonsmith, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44853 llvm-svn: 329216
* Sort targetgen calls in lib/Target/*/CMakeLists.Nico Weber2018-04-041-9/+9
| | | | | | | | | | | Makes it easier to see mistakes such as the one fixed in r329178 and makes the different target CMakeLists more consistent. Also remove some stale-looking comments from the Nios2 target cmakefile. No intended behavior change. llvm-svn: 329181
* [AArch64] Add patterns matching (fabs (fsub x y)) to (fabd x y)John Brawn2018-04-041-0/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D44573 llvm-svn: 329163
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-04-031-12/+12
| | | | | | Fix typo and simplify matching expression. llvm-svn: 329130
* [MachineOutliner] Keep track of fns that use a redzone in AArch64FunctionInfoJessica Paquette2018-04-033-7/+27
| | | | | | | | | | | | This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It returns true if the function is known to use a redzone, false if it is known to not use a redzone, and no value otherwise. This removes the requirement to pass -mno-red-zone when outlining for AArch64. https://reviews.llvm.org/D45189 llvm-svn: 329120
* [AArch64] Reserve x18 register on FuchsiaPetr Hosek2018-04-011-2/+2
| | | | | | | | This register is reserved as a platform register on Fuchsia. Differential Revision: https://reviews.llvm.org/D45105 llvm-svn: 328950
* [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to ↵Craig Topper2018-03-294-4/+4
| | | | | | | | | | | | CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806
* Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header ↵David Blaikie2018-03-281-1/+1
| | | | | | | | dependency Thanks to echristo for the pointers on direction. llvm-svn: 328737
* [MachineOutliner] Simplify call outlining + require valid callee save info ↵Jessica Paquette2018-03-281-31/+18
| | | | | | | | | | for call outlining This commit simplifies the call outlining logic by removing references to the Function associated with the callee. To do this, it requires that valid callee save info is available to the outliner. llvm-svn: 328719
* [MachineOutliner] AArch64: Don't outline ADRPs with un-outlinable operandsJessica Paquette2018-03-271-11/+7
| | | | | | | | | If an ADRP appears with, say, a CPI operand, we shouldn't outline it. This moves the check for unsafe operands so that it occurs before the special-case for ADRPs. Also add a test for outlining ADRPs. llvm-svn: 328674
* [AArch64] Decorate AArch64 instrs with OPERAND_PCRELRafael Auler2018-03-272-1/+27
| | | | | | | | | | | | | | Summary: This is a canonical way to teach objdump to print the target symbols for branches when disassembling AArch64 code. Reviewers: evandro, t.p.northover, espindola Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D44851 llvm-svn: 328638
* Fix layering by moving ValueTypes.h from CodeGen to IRDavid Blaikie2018-03-234-4/+4
| | | | | | ValueTypes.h is implemented in IR already. llvm-svn: 328397
* Fix layering of MachineValueType.h by moving it from CodeGen to SupportDavid Blaikie2018-03-233-3/+3
| | | | | | | | | This is used by llvm tblgen as well as by LLVM Targets, so the only common place is Support for now. (maybe we need another target for these sorts of things - but for now I'm at least making them correct & we can make them better if/when people have strong feelings) llvm-svn: 328395
* Move TargetLoweringObjectFile from CodeGen to Target to fix layeringDavid Blaikie2018-03-233-3/+3
| | | | | | | It's implemented in Target & include from other Target headers, so the header should be in Target. llvm-svn: 328392
* [AArch64] Don't reduce the width of loads if it prevents combining a shiftJohn Brawn2018-03-232-0/+30
| | | | | | | | | | | | | | | Loads and stores can only shift the offset register by the size of the value being loaded, but currently the DAGCombiner will reduce the width of the load if it's followed by a trunc making it impossible to later combine the shift. Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and make it prevent the width reduction if this is what would happen, though do allow it if reducing the load width will let us eliminate a later sign or zero extend. Differential Revision: https://reviews.llvm.org/D44794 llvm-svn: 328321
* [AArch64] Clean-up a few over-eager regexps in models.Florian Hahn2018-03-232-26/+26
| | | | | | | | | Patch by Simon Pilgrim <llvm-dev@redking.me.uk> That is a slightly modified version of the AArch64 changes from Simon's D44687 . llvm-svn: 328303
* State that CFG is preserved in 'Falkor HW Prefetch Fix Late Phase'.Michael Zolotukhin2018-03-221-0/+1
| | | | | | That removes some redundant recomputations from the passes pipeline. llvm-svn: 328272
* [CodeGen] Add a new pass for PostRA sinkJun Bum Lim2018-03-221-36/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This pass sinks COPY instructions into a successor block, if the COPY is not used in the current block and the COPY is live-in to a single successor (i.e., doesn't require the COPY to be duplicated). This avoids executing the the copy on paths where their results aren't needed. This also exposes additional opportunites for dead copy elimination and shrink wrapping. These copies were either not handled by or are inserted after the MachineSink pass. As an example of the former case, the MachineSink pass cannot sink COPY instructions with allocatable source registers; for AArch64 these type of copy instructions are frequently used to move function parameters (PhyReg) into virtual registers in the entry block.. For the machine IR below, this pass will sink %w19 in the entry into its successor (%bb.1) because %w19 is only live-in in %bb.1. ``` %bb.0: %wzr = SUBSWri %w1, 1 %w19 = COPY %w0 Bcc 11, %bb.2 %bb.1: Live Ins: %w19 BL @fun %w0 = ADDWrr %w0, %w19 RET %w0 %bb.2: %w0 = COPY %wzr RET %w0 ``` As we sink %w19 (CSR in AArch64) into %bb.1, the shrink-wrapping pass will be able to see %bb.0 as a candidate. With this change I observed 12% more shrink-wrapping candidate and 13% more dead copies deleted in spec2000/2006/2017 on AArch64. Reviewers: qcolombet, MatzeB, thegameg, mcrosier, gberry, hfinkel, john.brawn, twoh, RKSimon, sebpop, kparzysz Reviewed By: sebpop Subscribers: evandro, sebpop, sfertile, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41463 llvm-svn: 328237
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-03-201-1/+1
| | | | | | Fix typo in the number of integer dividers. llvm-svn: 328027
* [AArch64][Falkor] Correct load/store increment scheduling detailsGeoff Berry2018-03-201-46/+50
| | | | llvm-svn: 327982
* [MachineOutliner] AArch64: Emit CFI instructions when outlining callsJessica Paquette2018-03-191-0/+19
| | | | | | | | | | | When outlining calls, the outliner needs to update CFI to ensure that, say, exception handling works. This commit adds that functionality and adds a test just for call outlining. Call outlining stuff in machine-outliner.mir should be moved into machine-outliner-calls.mir in a later commit. llvm-svn: 327917
* [ARM, AArch64] Check the no-stack-arg-probe attribute for dynamic stack probesMartin Storsjo2018-03-191-0/+13
| | | | | | | | | | | | | This extends the use of this attribute on ARM and AArch64 from SVN r325900 (where it was only checked for fixed stack allocations on ARM/AArch64, but for all stack allocations on X86). This also adds a testcase for the existing use of disabling the fixed stack probe with the attribute on ARM and AArch64. Differential Revision: https://reviews.llvm.org/D44291 llvm-svn: 327897
* TableGen: Check the dynamic type of !cast<Rec>(string)Nicolai Haehnle2018-03-191-60/+60
| | | | | | | | | | | | | | | | | | | | | Summary: The docs already claim that this happens, but so far it hasn't. As a consequence, existing TableGen files get this wrong a lot, but luckily the fixes are all reasonably straightforward. To make this work with all the existing forms of self-references (since the true type of a record is only built up over time), the lookup of self-references in !cast is delayed until the final resolving step. Change-Id: If5923a72a252ba2fbc81a889d59775df0ef31164 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D44475 llvm-svn: 327849
* [AArch64] Fix a few InstRWs in the A53 scheduler model and enable ↵Craig Topper2018-03-181-7/+4
| | | | | | | | FullInstRWOverlapCheck. This fixes the errors found by the new check added in r327808. llvm-svn: 327812
* [TableGen] When trying to reuse a scheduler class for instructions from an ↵Craig Topper2018-03-186-0/+18
| | | | | | | | | | | | InstRW, make sure we haven't already seen another InstRW containing this instruction on this CPU. This is similar to the check later when we remap some of the instructions from one class to a new one. But if we reuse the class we don't get to do that check. So many CPUs have violations of this check that I had to add a flag to the SchedMachineModel to allow it to be disabled. Hopefully we can get those cleaned up quickly and remove this flag. A lot of the violations are due to overlapping regular expressions, but that's not the only kind of issue it found. llvm-svn: 327808
* [AArch64] Skip an unnecessary getCopyToReg in DYNAMIC_STACKALLOCMartin Storsjo2018-03-171-5/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D44586 llvm-svn: 327779
* [MachineOutliner] Make KILLs invisibleJessica Paquette2018-03-161-0/+5
| | | | | | | | | At the point the outliner runs, KILLs don't impact anything, but they're still considered unique instructions. This commit makes them invisible like DebugValues so that they can still be outlined without impacting outlining decisions. llvm-svn: 327760
* [AArch64] Implement getArithmeticReductionCostMatthew Simpson2018-03-162-0/+31
| | | | | | | | | | This patch provides an implementation of getArithmeticReductionCost for AArch64. We can specialize the cost of add reductions since they are computed using the 'addv' instruction. Differential Revision: https://reviews.llvm.org/D44490 llvm-svn: 327702
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-03-151-2/+2
| | | | | | Fix typo. llvm-svn: 327663
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-03-151-0/+7
| | | | | | Add special case for rotate right. llvm-svn: 327662
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-03-152-12/+59
| | | | | | Increase the number of cheap as move cases of register reset. llvm-svn: 327661
* [AArch64] Emit CSR loads in the same order as storesFrancis Visoiu Mistrih2018-03-141-14/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Optionally allow the order of restoring the callee-saved registers in the epilogue to be reversed. The flag -reverse-csr-restore-seq generates the following code: ``` stp x26, x25, [sp, #-64]! stp x24, x23, [sp, #16] stp x22, x21, [sp, #32] stp x20, x19, [sp, #48] ; [..] ldp x24, x23, [sp, #16] ldp x22, x21, [sp, #32] ldp x20, x19, [sp, #48] ldp x26, x25, [sp], #64 ret ``` Note how the CSRs are restored in the same order as they are saved. One exception to this rule is the last `ldp`, which allows us to merge the stack adjustment and the ldp into a post-index ldp. This is done by first generating: ldp x26, x27, [sp] add sp, sp, #64 which gets merged by the arm64 load store optimizer into ldp x26, x25, [sp], #64 The flag is disabled by default. llvm-svn: 327569
* [AArch64] Keep track of MIFlags in the LoadStoreOptimizerFrancis Visoiu Mistrih2018-03-141-7/+14
| | | | | | | | | | | | | | | | Merging: * $x26, $x25 = frame-setup LDPXi $sp, 0 * $sp = frame-destroy ADDXri $sp, 64, 0 into an LDPXpost should preserve the flags from both instructions as following: * frame-setup frame-destroy LDPXpost Differential Revision: https://reviews.llvm.org/D44446 llvm-svn: 327533
* [AArch64] Don't produce R_AARCH64_TLSLE_LDST32_TPREL_LO12_NCMartin Storsjo2018-03-141-2/+7
| | | | | | | | | Support for this relocation is missing in both LLD and GNU binutils at the moment. This reverts the ELF parts of SVN r327316. llvm-svn: 327503
* [AArch64] Fold adds with tprel_lo12_nc and secrel_lo12 into a following ldr/strMartin Storsjo2018-03-123-10/+22
| | | | | | Differential Revision: https://reviews.llvm.org/D44355 llvm-svn: 327316
* [AArch64] Implement native TLS for WindowsMartin Storsjo2018-03-104-2/+76
| | | | | | Differential Revision: https://reviews.llvm.org/D43971 llvm-svn: 327220
* [AArch64] Fix UB about shift amount exceeds data bit-widthWeiming Zhao2018-03-081-1/+1
| | | | | | | | | | | | | | | | Summary: Fixes an UB caught by sanitizer. The shift amount might be larger than 32 so the operand should be 1ULL. In this patch, we replace the original expression with existing API with uint64_t type. Reviewers: eli.friedman, rengolin Reviewed By: rengolin Subscribers: rengolin, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D44234 llvm-svn: 326969
* Delete code that is probably dead since r249303.Rafael Espindola2018-03-081-40/+5
| | | | | | | With r249303 the expression evaluation should expand variables that are not in sections (and so don't have an atom). llvm-svn: 326966
* [AArch64] Adjust the cost of integer vector divisionEvandro Menezes2018-03-071-22/+38
| | | | | | | | | | Since there is no instruction for integer vector division, factor in the cost of singling out each element to be used with the scalar division instruction. Differential revision: https://reviews.llvm.org/D43974 llvm-svn: 326955
* [AArch64] add missing pattern for insert_subvector undefSebastian Pop2018-03-071-14/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | The attached testcase started failing after the patch to define isExtractSubvectorCheap with the following pattern mismatch: ISEL: Starting pattern match Initial Opcode index to 85068 Match failed at index 85076 LLVM ERROR: Cannot select: t47: v8i16 = insert_subvector undef:v8i16, t43, Constant:i64<0> The code generated from llvm/lib/Target/AArch64/AArch64InstrInfo.td def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i32 0)), (INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>; is in ninja/lib/Target/AArch64/AArch64GenDAGISel.inc At the location of the error it is: /* 85076*/ OPC_CheckChild2Type, MVT::i32, And it failed to match the type of operand 2. Adding another def-pat for i64 fixes the failed def-pat error: def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i64 0)), (INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>; llvm-svn: 326949
* [AArch64] define isExtractSubvectorCheapSebastian Pop2018-03-062-0/+13
| | | | | | | | | | | | | | | | | | Following the ARM-neon backend, define isExtractSubvectorCheap to return true when extracting low and high part of a neon register. The patch disables a test in llvm/test/CodeGen/AArch64/arm64-ext.ll This testcase is fragile in the sense that it requires a BUILD_VECTOR to "survive" all DAG transforms until ISelLowering. The testcase is supposed to check that AArch64TargetLowering::ReconstructShuffle() works, and for that we need a BUILD_VECTOR in ISelLowering. As we now transform the BUILD_VECTOR earlier into an VEXT + vector_shuffle, we don't have the BUILD_VECTOR pattern when we get to ISelLowering. As there is no way to disable the combiner to only exercise the code in ISelLowering, the patch disables the testcase. Differential revision: https://reviews.llvm.org/D43973 llvm-svn: 326811
* fix PR36582Sebastian Pop2018-03-051-3/+9
| | | | | | | | | The error occurs when reading i16 elements (as in the testcase) from a v8i8 with a pattern of <0,2,4,6>. As all the data in the vector is accessed, the operation is not a VUZP. The patch stops the pattern recognition of VUZP when EXTRACT_VECTOR_ELT has a different element type than BUILD_VECTOR. llvm-svn: 326722
* [AArch64] Improve code generation of constant vectorsEvandro Menezes2018-03-052-35/+44
| | | | | | | | | | | | | Use the whole gammut of constant immediates available to set up a vector. Instead of using, for example, `mov w0, #0xffff; dup v0.4s, w0`, which transfers between register files, use the more efficient `movi v0.4s, #-1` instead. Not limited to just a few values, but any immediate value that can be encoded by all the variants of `FMOV`, `MOVI`, `MVNI`, thus eliminating the need to there be patterns to optimize special cases. Differential revision: https://reviews.llvm.org/D42133 llvm-svn: 326718
* [AArch64] Clean up code (NFC)Evandro Menezes2018-03-011-16/+13
| | | | | | | Clean up a couple of functions in `AArch64TargetLowering` by removing redundant statements. llvm-svn: 326486
* [AArch64] Add support for secrel add/load/store relocations for COFFMartin Storsjo2018-03-015-4/+32
| | | | | | Differential Revision: https://reviews.llvm.org/D43288 llvm-svn: 326480
* [AArch64] generate vuzp instead of movSebastian Pop2018-03-011-0/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when a BUILD_VECTOR is created out of a sequence of EXTRACT_VECTOR_ELT with a specific pattern sequence, either <0, 2, 4, ...> or <1, 3, 5, ...>, replace the BUILD_VECTOR with either vuzp1 or vuzp2. With this patch LLVM generates the following code for the first function fun1 in the testcase: adrp x8, .LCPI0_0 ldr q0, [x8, :lo12:.LCPI0_0] tbl v0.16b, { v0.16b }, v0.16b ext v1.16b, v0.16b, v0.16b, #8 uzp1 v0.8b, v0.8b, v1.8b str d0, [x8] ret Without this patch LLVM currently generates this code: adrp x8, .LCPI0_0 ldr q0, [x8, :lo12:.LCPI0_0] tbl v0.16b, { v0.16b }, v0.16b mov v1.16b, v0.16b mov v1.b[1], v0.b[2] mov v1.b[2], v0.b[4] mov v1.b[3], v0.b[6] mov v1.b[4], v0.b[8] mov v1.b[5], v0.b[10] mov v1.b[6], v0.b[12] mov v1.b[7], v0.b[14] str d1, [x8] ret llvm-svn: 326443
* [TLS] use emulated TLS if the target supports only this modeChih-Hung Hsieh2018-02-281-1/+1
| | | | | | | | | | | | | | | Emulated TLS is enabled by llc flag -emulated-tls, which is passed by clang driver. When llc is called explicitly or from other drivers like LTO, missing -emulated-tls flag would generate wrong TLS code for targets that supports only this mode. Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether emulated TLS code should be generated. Unit tests are modified to run with and without the -emulated-tls flag. Differential Revision: https://reviews.llvm.org/D42999 llvm-svn: 326341
* [GISel]: Don't assert when constraining RegisterOperands which are uses.Aditya Nandakumar2018-02-261-2/+1
| | | | | | | | | | | | | Currently we assert that only non target specific opcodes can have missing RegisterClass constraints in the MCDesc. The backend can have instructions with register operands but don't have RegisterClass constraints (say using unknown_class) in which case the instruction defining the register will constrain it. Change the assert to only fire if a def has no regclass. https://reviews.llvm.org/D43409 llvm-svn: 326142
* [PATCH] [AArch64] Add new target feature to fuse conditional selectEvandro Menezes2018-02-233-22/+73
| | | | | | | | | This feature enables the fusion of the comparison and the conditional select instructions together. Differential revision: https://reviews.llvm.org/D42392 llvm-svn: 325939
OpenPOWER on IntegriCloud