path: root/llvm/test/CodeGen/AArch64
Commit message | Author | Age | Files | Lines
* [AArch64] Harden test case | Evandro Menezes | 2018-03-05 | 1 | -151/+158
  NFC
  llvm-svn: 326724
* fix PR36582 | Sebastian Pop | 2018-03-05 | 1 | -2/+15
  The error occurs when reading i16 elements (as in the testcase) from a v8i8 with a pattern of <0,2,4,6>. As all the data in the vector is accessed, the operation is not a VUZP. The patch stops the pattern recognition of VUZP when EXTRACT_VECTOR_ELT has a different element type than BUILD_VECTOR.
  llvm-svn: 326722
* [AArch64] Improve code generation of constant vectors | Evandro Menezes | 2018-03-05 | 3 | -66/+163
  Use the whole gamut of constant immediates available to set up a vector. Instead of using, for example, `mov w0, #0xffff; dup v0.4s, w0`, which transfers between register files, use the more efficient `movi v0.4s, #-1`. This is not limited to just a few values, but applies to any immediate value that can be encoded by any of the variants of `FMOV`, `MOVI`, and `MVNI`, thus eliminating the need for patterns to optimize special cases.
  Differential revision: https://reviews.llvm.org/D42133
  llvm-svn: 326718
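The encodability test mentioned above can be illustrated with a small model. This is only a sketch, not the code from the patch: it covers just the 32-bit `MOVI`/`MVNI` forms that take an 8-bit immediate shifted left by 0, 8, 16 or 24 bits, and the helper name `simd_imm32` is hypothetical. The other variants (16-bit, 8-bit, 64-bit byte mask, shifting-ones and `FMOV` forms) are deliberately left out.

```python
def simd_imm32(value):
    """Return (mnemonic, imm8, shift) if a 32-bit lane value fits the
    shifted-immediate form of MOVI or MVNI, else None.

    Hypothetical helper for illustration; only the LSL #0/8/16/24 forms
    are modelled.
    """
    value &= 0xFFFFFFFF
    # MOVI encodes the value itself, MVNI encodes its bitwise complement.
    candidates = (("MOVI", value), ("MVNI", ~value & 0xFFFFFFFF))
    for mnemonic, candidate in candidates:
        for shift in (0, 8, 16, 24):
            # Encodable if every set bit lies inside one aligned byte.
            if candidate & ~(0xFF << shift) == 0:
                return (mnemonic, candidate >> shift, shift)
    return None
```

A value such as `0x12345678` has set bits in several bytes, so this model returns `None` for it and a different materialization would be needed.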
* [DAGCombiner] When combining zero_extend of a truncate, only mask before extending for vectors. | Craig Topper | 2018-03-01 | 3 | -10/+9
  Masking first prevents the extend from being combined with loads. It's also interfering with some vXi1 extraction code.
  Differential Revision: https://reviews.llvm.org/D42679
  llvm-svn: 326500
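The combine relies on the fact that zero-extending a truncated value back to its original width is the same as masking off the high bits. A minimal sketch of that identity on plain integers follows; the function names are illustrative and are not DAGCombiner APIs.

```python
def zext_of_trunc(x, narrow_bits):
    """Truncate x to narrow_bits, then zero-extend back (value unchanged)."""
    truncated = x % (1 << narrow_bits)  # keep only the low narrow_bits
    return truncated                    # zero extension adds no set bits


def mask_low_bits(x, narrow_bits):
    """Equivalent single AND with a low-bit mask."""
    return x & ((1 << narrow_bits) - 1)
```

Both forms give the same value; the commit is about which form is chosen and when, since an early mask can hide the extend from later load combines.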
* [AArch64] generate vuzp instead of mov | Sebastian Pop | 2018-03-01 | 1 | -0/+51
  When a BUILD_VECTOR is created out of a sequence of EXTRACT_VECTOR_ELT with a specific pattern sequence, either <0, 2, 4, ...> or <1, 3, 5, ...>, replace the BUILD_VECTOR with either vuzp1 or vuzp2.
  With this patch LLVM generates the following code for the first function fun1 in the testcase:
      adrp x8, .LCPI0_0
      ldr q0, [x8, :lo12:.LCPI0_0]
      tbl v0.16b, { v0.16b }, v0.16b
      ext v1.16b, v0.16b, v0.16b, #8
      uzp1 v0.8b, v0.8b, v1.8b
      str d0, [x8]
      ret
  Without this patch LLVM currently generates this code:
      adrp x8, .LCPI0_0
      ldr q0, [x8, :lo12:.LCPI0_0]
      tbl v0.16b, { v0.16b }, v0.16b
      mov v1.16b, v0.16b
      mov v1.b[1], v0.b[2]
      mov v1.b[2], v0.b[4]
      mov v1.b[3], v0.b[6]
      mov v1.b[4], v0.b[8]
      mov v1.b[5], v0.b[10]
      mov v1.b[6], v0.b[12]
      mov v1.b[7], v0.b[14]
      str d1, [x8]
      ret
  llvm-svn: 326443
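The index pattern described above can be recognized with a few lines of logic. The following is a rough sketch on plain Python lists, not the SelectionDAG code from the patch; `classify_uzp_mask` is a hypothetical name, and the element-type check added later for PR36582 is not modelled.

```python
def classify_uzp_mask(indices):
    """Return 'uzp1' for <0, 2, 4, ...>, 'uzp2' for <1, 3, 5, ...>, else None."""
    indices = list(indices)
    if not indices or indices[0] not in (0, 1):
        return None
    start = indices[0]
    # Every extracted lane must advance by exactly two from the start lane.
    expected = list(range(start, start + 2 * len(indices), 2))
    if indices != expected:
        return None
    return "uzp1" if start == 0 else "uzp2"
```

In the assembly above, the byte lanes 0, 2, 4, ..., 14 correspond to the `uzp1` case.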
* [GlobalISel][AArch64] Adding -disable-gisel-legality-check CL option | Roman Tereshin | 2018-03-01 | 1 | -0/+4543
  Currently it's impossible to test InstructionSelect pass with MIR which is considered illegal by the Legalizer in Assert builds. In early stages of porting an existing backend from SelectionDAG ISel to GlobalISel, however, we would have very basic CallLowering, Legalizer, and RegBankSelect implementations, but rather functional Instruction Select with quite a few patterns selectable due to the semi-automatic porting process borrowing them from SelectionDAG ISel.
  As we are trying to define legality as a property of being selectable by the instruction selector, it would be nice to be able to easily check what the selector can do in its current state w/o the legality check provided by the Legalizer getting in the way.
  It also seems beneficial to have a regression testing set up that would not allow the selector to silently regress in its support of the MIR not supported yet by the previous passes in the GlobalISel pipeline.
  This commit adds -disable-gisel-legality-check command line option to llc that disables those legality checks in RegBankSelect and InstructionSelect passes.
  It also adds quite a few MIR test cases for AArch64's Instruction Selector. Every one of them would fail on the legality check at the moment, but will select just fine if the check is disabled. Every test MachineFunction is intended to exercise a specific selection rule and that rule only, encoded in the MachineFunction's name by the rule's number, ID, and index of its GIM_Try opcode in TableGen'erated MatchTable (-optimize-match-table=false).
  Reviewers: ab, dsanders, qcolombet, rovka
  Reviewed By: bogner
  Subscribers: kristof.beyls, volkan, aditya_nandakumar, aemerson, rengolin, t.p.northover, javed.absar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D42886
  llvm-svn: 326396
* [TLS] use emulated TLS if the target supports only this mode | Chih-Hung Hsieh | 2018-02-28 | 2 | -0/+14
  Emulated TLS is enabled by llc flag -emulated-tls, which is passed by clang driver. When llc is called explicitly or from other drivers like LTO, a missing -emulated-tls flag would generate wrong TLS code for targets that support only this mode.
  Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether emulated TLS code should be generated. Unit tests are modified to run with and without the -emulated-tls flag.
  Differential Revision: https://reviews.llvm.org/D42999
  llvm-svn: 326341
* Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" | Geoff Berry | 2018-02-27 | 11 | -28/+131
  Re-enable commit r323991 now that r325931 has been committed to make MachineOperand::isRenamable() check more conservative w.r.t. code changes and opt-in on a per-target basis.
  llvm-svn: 326208
* [AArch64] Harden test cases | Evandro Menezes | 2018-02-26 | 2 | -2/+5
  NFC
  llvm-svn: 326147
* [CodeGen] Don't omit any redundant information in -debug output | Francis Visoiu Mistrih | 2018-02-26 | 1 | -1/+1
  In r322867, we introduced IsStandalone when printing MIR in -debug output.
  The default behaviour for that was:
  1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any redundant information.
  2) When -debug-printing a MF entirely, don't print any redundant information.
  3) When printing MIR, don't print any redundant information.
  I'd like to change 2) to:
  2) When -debug-printing a MF entirely, don't omit any redundant information.
  Differential Revision: https://reviews.llvm.org/D43337
  llvm-svn: 326094
* [PATCH] [AArch64] Add new target feature to fuse conditional select | Evandro Menezes | 2018-02-23 | 1 | -0/+30
  This feature enables the fusion of the comparison and the conditional select instructions together.
  Differential revision: https://reviews.llvm.org/D42392
  llvm-svn: 325939
* [AArch64] Improve macro fusion test case | Evandro Menezes | 2018-02-22 | 1 | -3/+5
  Improve a vector in the test case for the fusion of address generation and loads or stores. Otherwise, NFC.
  llvm-svn: 325839
* Revert "[DebugInfo][FastISel] Fix dropping dbg.value()" | Sander de Smalen | 2018-02-22 | 1 | -49/+0
  This patch reverts r325440 and r325438 because they trigger an assertion in SelectionDAGBuilder.cpp. Also, having debug enabled may unintentionally affect code-gen. The patch is reverted until we find a better solution.
  llvm-svn: 325825
* [AArch64] Refactor instructions using SIMD immediates | Evandro Menezes | 2018-02-20 | 1 | -0/+90
  Get rid of icky goto loops and make the code easier to maintain. Otherwise, NFC.
  Restore r324903 and fix PR36369.
  Differential revision: https://reviews.llvm.org/D43364
  llvm-svn: 325621
* [AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first. | Amara Emerson | 2018-02-20 | 2 | -9/+71
  This is a follow on commit to r[x] where we fix the other direction of copy. For this case, after converting the source from gpr32 -> fpr32, we use a subregister copy, which is essentially what EXTRACT_SUBREG does in SDAG land.
  https://reviews.llvm.org/D43444
  llvm-svn: 325550
* [AArch64][GlobalISel] Fix an assert fail/miscompile when fp16 types are copied to gpr register banks. | Amara Emerson | 2018-02-18 | 1 | -0/+69
  PR36345.
  rdar://36478867
  Differential Revision: https://reviews.llvm.org/D43310
  llvm-svn: 325463
* [AArch64][GlobalISel] Support G_INSERT/G_EXTRACT of types < s32 bits. | Amara Emerson | 2018-02-18 | 1 | -13/+69
  These are needed for operations on fp16 types in a later patch.
  llvm-svn: 325462
* [AArch64] Coalesce Copy Zero during instruction selection | Haicheng Wu | 2018-02-18 | 4 | -2/+50
  Add special case for copy of zero to avoid a double copy.
  Differential Revision: https://reviews.llvm.org/D36104
  llvm-svn: 325459
* Made test dbg_value_fastisel.ll specific to AArch64 fast-isel. | Sander de Smalen | 2018-02-17 | 1 | -0/+49
  Some buildbots failed on this test (rL325438) because they don't build all targets. I set the triple to aarch64 and moved the test to test/CodeGen/AArch64/fast-isel-dbg-value.ll.
  llvm-svn: 325440
* [AArch64] Implement dynamic stack probing for windows | Martin Storsjo | 2018-02-17 | 2 | -3/+27
  This makes sure that alloca() function calls properly probe the stack as needed.
  Differential Revision: https://reviews.llvm.org/D42356
  llvm-svn: 325433
* Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" | Quentin Colombet | 2018-02-17 | 11 | -131/+28
  This reverts commit r323991.
  This commit breaks targets that don't model all the register constraints in TableGen. So far the workaround was to set the hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the cases. For instance, when mutating an instruction (like in the lowering of COPYs) the isRenamable flag is not properly updated. The same problem will happen when attaching a machine operand from one instruction to another.
  Geoff Berry is working on a fix in https://reviews.llvm.org/D43042.
  llvm-svn: 325421
* [AArch64] Fix BITCAST lowering crash | Evandro Menezes | 2018-02-16 | 1 | -0/+28
  The data type is assumed to be a vector, but sometimes it is not, leading to an assertion. Add simple test-case to verify this.
  Differential revision: https://reviews.llvm.org/D42599
  llvm-svn: 325378
* GlobalISel: IRTranslate llvm.fmuladd.* intrinsic | Volkan Keles | 2018-02-13 | 1 | -0/+34
  Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, bogner
  Reviewed By: qcolombet
  Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D43090
  llvm-svn: 324971
* [AArch64] Fixes for ARMv8.2-A FP16 scalar intrinsic - llvm portion | Abderrazek Zaafrani | 2018-02-12 | 1 | -0/+30
  https://reviews.llvm.org/D42993
  llvm-svn: 324912
* [AArch64] Improve v8.1-A code-gen for atomic load-and | Oliver Stannard | 2018-02-12 | 1 | -0/+96
  Armv8.1-A added an atomic load-clear instruction (which performs bitwise and with the complement of its operand), but not a load-and instruction. Our current code-generation for atomic load-and always inserts an MVN instruction to invert its argument, even if it could be folded into a constant or another instruction.
  This adds lowering early in selection DAG to convert a load-and operation into an xor with -1 and a load-clear, allowing the normal DAG optimisations to work on it.
  To do this, I've had to add a new ISD opcode, ATOMIC_LOAD_CLR. I don't see any easy way to do this with an AArch64-specific ISD node, because the code-generation for atomic operations assumes the SDNodes are of type AtomicSDNode.
  I've left the old tablegen patterns in because they are still needed for global isel.
  Differential revision: https://reviews.llvm.org/D42478
  llvm-svn: 324908
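The rewrite rests on a simple bit identity: and-ing with an operand is the same as clearing the bits of its complement. A small sketch on fixed-width integers, ignoring atomicity entirely; the function names are illustrative, not LLVM APIs.

```python
def load_clear(mem, operand, bits=32):
    """Model of load-clear: the stored result is mem AND NOT operand."""
    mask = (1 << bits) - 1
    return mem & ~operand & mask


def load_and_via_clear(mem, operand, bits=32):
    """Model of load-and expressed as xor with -1 followed by load-clear."""
    mask = (1 << bits) - 1
    inverted = operand ^ mask  # xor with -1 at this bit width
    return load_clear(mem, inverted, bits)
```

When the operand is a constant, the xor folds at compile time, which is why the explicit MVN can disappear.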
* [AArch64] Improve v8.1-A code-gen for atomic load-subtract | Oliver Stannard | 2018-02-12 | 1 | -0/+112
  Armv8.1-A added an atomic load-add instruction, but not a load-subtract instruction. Our current code-generation for atomic load-subtract always inserts a NEG instruction to negate its argument, even if it could be folded into a constant or another instruction.
  This adds lowering early in selection DAG to convert a load-subtract operation into a subtract and a load-add, allowing the normal DAG optimisations to work on it.
  I've left the old tablegen patterns in because they are still needed for global isel.
  Some of the tests in this patch are copied from D35375 by Chad Rosier (which was abandoned).
  Differential revision: https://reviews.llvm.org/D42477
  llvm-svn: 324892
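The same idea applies to subtraction: in two's-complement arithmetic, subtracting an operand equals adding its negation. A minimal sketch on fixed-width integers, again ignoring atomicity and using illustrative names only:

```python
def load_add(mem, operand, bits=32):
    """Model of load-add: the stored result wraps at the given bit width."""
    return (mem + operand) % (1 << bits)


def load_sub_via_add(mem, operand, bits=32):
    """Model of load-subtract expressed as (0 - operand) followed by load-add."""
    negated = (0 - operand) % (1 << bits)
    return load_add(mem, negated, bits)
```

With a constant operand the `0 - operand` step folds away, so no separate NEG is needed.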
* [TargetLowering] try to create -1 constant operand for math ops via demanded bits | Sanjay Patel | 2018-02-11 | 1 | -2/+1
  This reverses instcombine's demanded bits' transform which always tries to clear bits in constants.
  As noted in PR35792 and shown in the test diffs:
  https://bugs.llvm.org/show_bug.cgi?id=35792
  ...we can do better in codegen by trying to form -1. The x86 sub test shows a missed opportunity.
  I did investigate changing instcombine's behavior, but it would be more work to change canonicalization in IR. Clearing bits / shrinking constants can allow killing instructions, so we'd have to figure out how to not regress those cases.
  Differential Revision: https://reviews.llvm.org/D42986
  llvm-svn: 324839
* [AArch64] Return true in enableMultipleCopyHints(). | Jonas Paulsson | 2018-02-09 | 4 | -28/+26
  Enable multiple COPY hints to eliminate more COPYs during register allocation.
  Note that this is something all targets should do, see https://reviews.llvm.org/D38128.
  Review: Martin Storsjö
  llvm-svn: 324720
* [GISel]: Verify COPIES involving generic registers. | Aditya Nandakumar | 2018-02-09 | 4 | -8/+10
  Add verification for copies involving generic registers if they are compatible, i.e. if it is a generic copy, then the types are the same, and if it is a COPY between a generic and a target virtual register, then the sizes should be the same. Only checks if there are no sub registers involved for now.
  https://reviews.llvm.org/D37775
  llvm-svn: 324696
* [AArch64] Don't materialize 0 with "fmov h0, .." when FullFP16 is not supported | Sjoerd Meijer | 2018-02-08 | 1 | -5/+8
  We were generating "fmov h0, wzr" instructions when FullFP16 is not enabled.
  I've not added any tests, because the problem was visible in: test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll, which I had to change: I don't think Cyclone has FullFP16 enabled by default, so it shouldn't be using this v8.2a instruction. I've also removed these rdar tags, please shout if there are any objections.
  Differential Revision: https://reviews.llvm.org/D43020
  llvm-svn: 324581
* [LegalizeDAG] Truncate condition operand of ISD::SELECT | Eugene Leviant | 2018-02-07 | 1 | -0/+61
  Differential revision: https://reviews.llvm.org/D42737
  llvm-svn: 324447
* [Mips][AMDGPU] Update test cases to not use vector lt/gt compares that can be simplified to an equality/inequality or to always true/false. | Craig Topper | 2018-02-07 | 2 | -84/+90
  For example 'ugt X, 0' can be simplified to 'ne X, 0'. Or 'uge X, 0' is always true.
  We already simplify this for scalars in SimplifySetCC, but we don't currently for vectors in SimplifySetCC. D42948 proposes to change that.
  llvm-svn: 324436
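Both simplifications quoted above can be confirmed exhaustively for a small unsigned width. This is only a sanity check on 8-bit values, not a model of SimplifySetCC; the variable names are illustrative.

```python
BITS = 8
values = range(1 << BITS)  # every unsigned 8-bit value

# 'ugt X, 0' agrees with 'ne X, 0' for every unsigned X.
ugt_zero_is_ne_zero = all((x > 0) == (x != 0) for x in values)

# 'uge X, 0' holds for every unsigned X.
uge_zero_is_always_true = all(x >= 0 for x in values)
```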
* [AArch64] add test to show sub-optimal isel; NFC | Sanjay Patel | 2018-02-06 | 1 | -0/+17
  llvm-svn: 324404
* [AArch64][GlobalISel] Fix old use of % sigil in test. | Amara Emerson | 2018-02-02 | 1 | -2/+2
  My rebase had missed the new $ sigil we're using.
  llvm-svn: 324051
* [GlobalISel] Constrain the dest reg of IMPLICIT_DEF. | Amara Emerson | 2018-02-02 | 1 | -0/+19
  This fixes a crash where the user is a COPY, which deliberately does not constrain its source operands, resulting in a vreg without a reg class escaping selection.
  Differential Revision: https://reviews.llvm.org/D42697
  llvm-svn: 324047
* [GlobalISel] Fix assert failure when legalizing non-power-2 loads. | Amara Emerson | 2018-02-01 | 1 | -0/+10
  Until we support extending loads properly we're going to fall back for these. We already handle stores in the same way, so this is just being consistent.
  llvm-svn: 324001
* [MachineCopyPropagation] Extend pass to do COPY source forwarding | Geoff Berry | 2018-02-01 | 10 | -25/+128
  Summary: This change extends MachineCopyPropagation to do COPY source forwarding and adds an additional run of the pass to the default pass pipeline just after register allocation.
  This version of this patch uses the newly added MachineOperand::isRenamable bit to avoid forwarding registers in such a way as to violate constraints that aren't captured in the Machine IR (e.g. ABI or ISA constraints).
  This change is a continuation of the work started in D30751.
  Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar
  Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41835
  llvm-svn: 323991
* [AArch64] add tests with sqrt estimate and ieee denorms; NFC | Sanjay Patel | 2018-02-01 | 1 | -0/+50
  As noted in D42323, we're not checking for denorms as we should.
  llvm-svn: 323985
* [AArch64] auto-generate complete checks; NFC | Sanjay Patel | 2018-02-01 | 1 | -210/+319
  llvm-svn: 323984
* Followup on Proposal to move MIR physical register namespace to '$' sigil. | Puyan Lotfi | 2018-01-31 | 118 | -3610/+3610
  Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html
  In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical registers with named vregs.
  llvm-svn: 323922
* [MachineOutliner] Freeze registers in new functions | Geoff Berry | 2018-01-31 | 1 | -1/+1
  Summary: Call MRI.freezeReservedRegs() on functions created during outlining so that calls to isReserved() by the verifier called after this pass won't assert.
  Reviewers: MatzeB, qcolombet, paquette
  Subscribers: mcrosier, javed.absar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D42749
  llvm-svn: 323905
* [Analysis] Disable calls to *_finite and other glibc-only functions on Android. | Chih-Hung Hsieh | 2018-01-31 | 1 | -0/+24
  Since r322087, glibc's finite lib calls are generated when possible. However, they are not supported on Android. This change also disables other functions not available on Android.
  Differential Revision: http://reviews.llvm.org/D42668
  llvm-svn: 323898
* [DWARF] Allow duplication of tails with CFI instructions | Petar Jovanovic | 2018-01-31 | 1 | -0/+96
  This commit comes as a result of the revert of patch r317579 (originally committed as r317100). The patch made CFI instructions duplicable, because their existence in the epilogue block was affecting the Tail duplication pass. However, duplicating blocks with CFI instructions was an issue for compact unwind info on Darwin, which is why the patch was reverted.
  This patch allows duplicating tails with CFI instructions, though they are not duplicable, by copying them 'manually'.
  Patch by Djordje Kovacevic.
  Differential Revision: https://reviews.llvm.org/D40979
  llvm-svn: 323883
* [MachineCombiner] Add check for optimal pattern order. | Florian Hahn | 2018-01-31 | 3 | -12/+12
  In D41587, @mssimpso discovered that the order of some patterns for AArch64 was sub-optimal. I thought a bit about how we could avoid that case in the future. I do not think there is a need for evaluating all patterns for now. But this patch adds an extra (expensive) check, that evaluates the latencies of all patterns, and ensures that the latency saved decreases for subsequent patterns.
  This catches the sub-optimal order fixed in D41587, but I am not entirely happy with the check, as it only applies to sub-optimal patterns seen while building with EXPENSIVE_CHECKS on. It did not discover any other sub-optimal pattern ordering.
  Reviewers: Gerolf, spatel, mssimpso
  Reviewed By: Gerolf, mssimpso
  Differential Revision: https://reviews.llvm.org/D41766
  llvm-svn: 323873
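The ordering check described above amounts to verifying that a sequence of latency savings never goes up. A rough sketch follows, assuming "decreases" means non-increasing and that the savings are already available as a list of numbers; the function name is hypothetical and this is not the MachineCombiner code.

```python
def first_out_of_order(saved_latencies):
    """Return the index of the first pattern that saves more latency than
    the pattern before it, or None if the order is acceptable."""
    for i in range(1, len(saved_latencies)):
        if saved_latencies[i] > saved_latencies[i - 1]:
            return i
    return None
```

A non-`None` result points at a pattern that should have been tried earlier, which is the kind of ordering problem found in D41587.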
* [AArch64] Expand testing of zero cycle zeroing | Evandro Menezes | 2018-01-30 | 1 | -34/+29
  Make sure that r321824 doesn't change zeroing.
  Differential revision: https://reviews.llvm.org/D42089
  llvm-svn: 323816
* [GlobalISel] Bail out on calls to dllimported functions | Martin Storsjo | 2018-01-30 | 1 | -1/+6
  Differential Revision: https://reviews.llvm.org/D42568
  llvm-svn: 323811
* [AArch64] Properly handle dllimport of variables when using fast-isel | Martin Storsjo | 2018-01-30 | 1 | -4/+7
  Differential Revision: https://reviews.llvm.org/D42567
  llvm-svn: 323810
* [AArch64] Add new target feature to fuse address generation with load or store | Evandro Menezes | 2018-01-30 | 1 | -0/+112
  This feature enables the fusion of the address generation and a corresponding load or store together.
  Differential revision: https://reviews.llvm.org/D42393
  llvm-svn: 323782
* [AArch64] Update test cases for Exynos M3 | Evandro Menezes | 2018-01-30 | 4 | -42/+106
  Update any test case relevant for Exynos M3.
  llvm-svn: 323775
* [AArch64] Add pipeline model for Exynos M3 | Evandro Menezes | 2018-01-30 | 4 | -3/+13
  Add the scheduling and cost model for Exynos M3.
  Differential revision: https://reviews.llvm.org/D42387
  llvm-svn: 323773