summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AArch64/GlobalISel
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores"Amara Emerson2019-04-192-49/+20
| | | | | | This introduces some runtime failures which I'll need to investigate further. llvm-svn: 358771
* [GlobalISel][AArch64] Legalize vector G_FPOWJessica Paquette2019-04-191-0/+274
| | | | | | | | | | This instruction is legalized in the same way as G_FSIN, G_FCOS, G_FLOG10, etc. Update legalize-pow.mir and arm64-vfloatintrinsics.ll to reflect the change. Differential Revision: https://reviews.llvm.org/D60218 llvm-svn: 358764
* [GlobalISel][AArch64] Legalize/select G_(S/Z/ANY)_EXT for v8s8sJessica Paquette2019-04-182-0/+140
| | | | | | | | | | | | | | | This adds legalization for G_SEXT, G_ZEXT, and G_ANYEXT for v8s8s. We were falling back on G_ZEXT in arm64-vabs.ll before, preventing us from selecting the @llvm.aarch64.neon.sabd.v8i8 intrinsic. This adds legalizer support for those 3, which gives us selection via the importer. Update the relevant tests (legalize-ext.mir, select-int-ext.mir) and add a GISel line to arm64-vabs.ll. Differential Revision: https://reviews.llvm.org/D60881 llvm-svn: 358715
* [GlobalISel][AArch64] Legalize v8s8 loadsJessica Paquette2019-04-181-0/+25
| | | | | | | | Add legalizer support for loads of v8s8 and update legalize-load-store.mir. Differential Revision: https://reviews.llvm.org/D60877 llvm-svn: 358714
* [GlobalISel] Add legalization support for non-power-2 loads and storesAmara Emerson2019-04-172-20/+49
| | | | | | | | | | Legalize things like i24 load/store by splitting them into smaller power of 2 operations. This matches how SelectionDAG handles these operations. Differential Revision: https://reviews.llvm.org/D59971 llvm-svn: 358613
* [AArch64][GlobalISel] Don't do extending loads combine for non-pow-2 types.Amara Emerson2019-04-152-1/+38
| | | | | | | | Since non-pow-2 types are going to get split up into multiple loads anyway, don't do the [SZ]EXTLOAD combine for those and save us trouble later in legalization. llvm-svn: 358458
* [GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with ↵Amara Emerson2019-04-1513-65/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time: Program base cse diff test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8% test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7% test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4% test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3% test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2% test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2% test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1% test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1% test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0% Geomean difference -0.2% Code size: Program base cse diff test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7% test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2% test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1% test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1% test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6% test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4% test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0% test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0% Geomean difference -0.9% Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369
* [AArch64][GlobalISel] Enable copy elision in the pre-legalizer combine and ↵Amara Emerson2019-04-131-0/+32
| | | | | | | | | | | | | fix a crash. This enables the simple copy combine that already exists in the CombinerHelper. However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a set of MIs to process, and so would end up causing a crash when deleted MIs were being added to the combiner worklist again. Differential Revision: https://reviews.llvm.org/D60579 llvm-svn: 358318
* [GlobalISel] Fix a crash when handling an invalid MVT during call lowering.Amara Emerson2019-04-121-0/+7
| | | | | | | | This crash was introduced in r358032 as we try to construct an EVT from an MVT in order to find the register type for the calling conv. Fall back instead of trying to do this with an invalid MVT coming from i256. llvm-svn: 358314
* [AArch64][GlobalISel] Fix a crash when selecting shufflevectors with an ↵Amara Emerson2019-04-121-0/+51
| | | | | | | | | | | undef mask element. If a shufflevector's mask vector has an element with "undef" then the generic instruction defining that element register is a G_IMPLICT_DEF instead of G_CONSTANT. This fixes the selector to handle this case, and for now assumes that undef just means zero. In future we'll optimize this case properly. llvm-svn: 358312
* [AArch64][GlobalISel] Flesh out vector load/store support for more types.Amara Emerson2019-04-115-71/+440
| | | | | | | Some of these were legalizing into smaller vector types unnecessarily, others were simply not supported yet. llvm-svn: 358223
* [AArch64][GlobalISel] Legalization and ISel support for load/stores of ↵Amara Emerson2019-04-115-16/+193
| | | | | | | | | | | | | | | | vectors of pointers. Loads and store of values with type like <2 x p0> currently don't get imported because SelectionDAG has no knowledge of pointer types. To leverage the existing support for vector load/stores, we can bitcast the value to have s64 element types instead. We do this as a custom legalization. This patch also adds support for general loads of <2 x s64>, and relaxes some type conditions on selecting G_BITCAST. Differential Revision: https://reviews.llvm.org/D60534 llvm-svn: 358221
* [AArch64][GlobalISel] Make <2 x p0> = G_BUILD_VECTOR legal.Amara Emerson2019-04-102-144/+50
| | | | | | The existing isel support already works for p0 once the legalizer accepts it. llvm-svn: 358144
* [AArch64][GlobalISel] Add legalizer support for <8 x s16> and <16 x s8> G_ADD.Amara Emerson2019-04-102-0/+107
| | | | llvm-svn: 358143
* [AArch64][GlobalISel] Scalarize vector SDIV.Amara Emerson2019-04-101-4/+33
| | | | llvm-svn: 358142
* GlobalISel: Support legalizing G_CONSTANT with irregular breakdownMatt Arsenault2019-04-101-8/+0
| | | | llvm-svn: 358109
* [AArch64][GlobalISel] Add isel support for vector G_ICMP and G_ASHR & G_SHLAmara Emerson2019-04-092-0/+3470
| | | | | | | | | | | | | | | | The selection for G_ICMP is unfortunately not currently importable from SDAG due to the use of custom SDNodes. To support this, this selection method has an opcode table which has been generated by a script, indexed by various instruction properties. Ideally in future we will have a GISel native selection patterns that we can write in tablegen to improve on this. For selection of some types we also need support for G_ASHR and G_SHL which are generated as a result of legalization. This patch also adds support for them, generating the same code as SelectionDAG currently does. Differential Revision: https://reviews.llvm.org/D60436 llvm-svn: 358035
* [AArch64][GlobalISel] Legalize vector G_ICMP.Amara Emerson2019-04-093-5/+1929
| | | | | | | | Selection support will be coming in a later patch. Differential Revision: https://reviews.llvm.org/D60435 llvm-svn: 358034
* [AArch64][GlobalISel] Add legalization for some vector G_SHL and G_ASHR.Amara Emerson2019-04-092-1/+32
| | | | | | | | This is needed for some future support for vector ICMP. Differential Revision: https://reviews.llvm.org/D60433 llvm-svn: 358033
* [GlobalISel][AArch64] Allow CallLowering to handle types which are normallyAmara Emerson2019-04-092-0/+44
| | | | | | | | | | | required to be passed as different register types. E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able truncate it back. Likewise, when dealing with returns of these types, they need to be widened in the appropriate way back. Differential Revision: https://reviews.llvm.org/D60425 llvm-svn: 358032
* [AArch64][GlobalISel] Legalize G_FEXP2Jessica Paquette2019-04-032-1/+253
| | | | | | | | | Same as G_EXP. Add a test, and update legalizer-info-validation.mir and f16-instructions.ll. Differential Revision: https://reviews.llvm.org/D60165 llvm-svn: 357605
* [GlobalISel] Add IRTranslator support for llvm.stacksave and llvm.stackrestoreJessica Paquette2019-04-021-0/+12
| | | | | | | | Also update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D60140 llvm-svn: 357538
* [AArch64][GlobalISel] Select llvm.aarch64.stlxr(i64, i64*)Jessica Paquette2019-04-021-0/+32
| | | | | | | | | | | | | This adds partial instruction selection support for llvm.aarch64.stlxr. It also factors out selection for G_INTRINSIC_W_SIDE_EFFECTS into its own function. The new function removes the restriction that the intrinsic ID on the G_INTRINSIC_W_SIDE_EFFECTS be on operand 0. Also add a test, and add a GISel line to arm64-ldxr-stxr.ll. Differential Revision: https://reviews.llvm.org/D60100 llvm-svn: 357518
* [AArch64][GlobalISe] Select STRQui for stores into v264s instead of scalarizingJessica Paquette2019-04-012-4/+24
| | | | | | | | | This improves selection for vector stores into v2s64s. Before we just scalarized them, but we can just use a STRQui instead. Differential Revision: https://reviews.llvm.org/D60083 llvm-svn: 357432
* [GlobalISel][AArch64] Add isel support for G_INSERT_VECTOR_ELT on v2s32sJessica Paquette2019-03-292-0/+108
| | | | | | | | | This adds support for v2s32 vector inserts, and updates the selection + regbankselect tests for G_INSERT_VECTOR_ELT. Differential Revision: https://reviews.llvm.org/D59910 llvm-svn: 357318
* [GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead ↵Amara Emerson2019-03-272-13/+82
| | | | | | | | | | | | | | | | | | | | instructions. The artifact combiners push instructions which have been marked for deletion onto an list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc). This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction. As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead. Differential Revision: https://reviews.llvm.org/D59892 llvm-svn: 357101
* [AArch64] Split the neon.addp intrinsic into integer and fp variants.Amara Emerson2019-03-212-33/+1
| | | | | | | | | | | | | | | | | | | This is the result of discussions on the list about how to deal with intrinsics which require codegen to disambiguate them via only the integer/fp overloads. It causes problems for GlobalISel as some of that information is lost during translation, while with other operations like IR instructions the information is encoded into the instruction opcode. This patch changes clang to emit the new faddp intrinsic if the vector operands to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to upgrade existing calls to aarch64.neon.addp with fp vector arguments, and we remove the workarounds introduced for GlobalISel in r355865. This is a more permanent solution to PR40968. Differential Revision: https://reviews.llvm.org/D59655 llvm-svn: 356722
* GlobalISel: Fix RegBankSelect for REG_SEQUENCEMatt Arsenault2019-03-211-7/+4
| | | | | | | | | | | | | The AArch64 test was broken since the result register already had a set register class, so this test was a no-op. The mapping verify call would fail because the result size is not the same as the inputs like in a copy or phi. The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR copies which need much more work to handle correctly (same for phis), but add them as a baseline. llvm-svn: 356713
* [AArch64][GlobalISel] Add an optimization to select vector DUP instructions.Amara Emerson2019-03-191-0/+110
| | | | | | | | | This adds pattern matching for the insert+shufflevector sequence so we can generate dup instructions instead of the current TBL sequence. Differential Revision: https://reviews.llvm.org/D59558 llvm-svn: 356526
* [AArch64][GlobalISel] Make v4s32 G_IMPLICIT_DEF legal.Amara Emerson2019-03-191-6/+2
| | | | llvm-svn: 356525
* [AArch64][GlobalISel] Regbankselect: Fix G_BUILD_VECTOR trying to use s16 ↵Amara Emerson2019-03-151-0/+34
| | | | | | | | | gpr sources. Since we can't insert s16 gprs as we don't have 16 bit GPR registers, we need to teach RBS to assign them to the FPR bank so our selector works. llvm-svn: 356282
* [AArch64][GlobalISel] Add isel support for G_UADDO on s32s and s64sJessica Paquette2019-03-142-1/+63
| | | | | | | | | | | | This adds instruction selection support for G_UADDO on s32s and s64s. Also - Add an instruction selection test - Update the arm64-xaluo.ll test to show that we generate the correct assembly Differential Revision: https://reviews.llvm.org/D58734 llvm-svn: 356214
* [AArch64][GlobalISel] Implement selection for G_UNMERGE of vectors to vectors.Amara Emerson2019-03-141-0/+56
| | | | | | | | | This re-uses the previous support for extract vector elt to extract the subvectors. Differential Revision: https://reviews.llvm.org/D59390 llvm-svn: 356213
* [AArch64][GlobalISel] Add some support for G_CONCAT_VECTORS.Amara Emerson2019-03-143-1/+101
| | | | | | | | Handles concatenating 2 x v2s32 and 2 x v4s16 Differential Revision: https://reviews.llvm.org/D59390 llvm-svn: 356212
* [GlobalISel][AArch64] Add partial selection support for G_INSERT_VECTOR_ELTJessica Paquette2019-03-143-1/+220
| | | | | | | | | | | This adds support for inserting elements into packed vectors. It also adds two tests: one for selection, and one for regbank select. Unpacked vectors will come in a follow-up. Differential Revision: https://reviews.llvm.org/D59325 llvm-svn: 356182
* GlobalISel: Use multiple returns for intrinsic structsMatt Arsenault2019-03-141-1/+1
| | | | | | | | | | | This is consistent with what SelectionDAG does and is much easier to work with than the extract sequence with an artificial wide register. For the AMDGPU control flow intrinsics, this was producing an s128 for the i64, i1 tuple return. Any legalization that should apply to a real s128 value would badly obscure the direct values that need to be seen. llvm-svn: 356147
* Recommit "[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT"Jessica Paquette2019-03-112-0/+220
| | | | | | | | | After r355865, we should be able to safely select G_EXTRACT_VECTOR_ELT without running into any problematic intrinsics. Also add a fix for lane copies, which don't support index 0. llvm-svn: 355871
* [GlobalISel][AArch64] Always fall back on aarch64.neon.addp.*Jessica Paquette2019-03-112-1/+33
| | | | | | | | | | | | | | Overloaded intrinsics aren't necessarily safe for instruction selection. One such intrinsic is aarch64.neon.addp.*. This is a temporary workaround to ensure that we always fall back on that intrinsic. Eventually this will be replaced with a proper solution. https://bugs.llvm.org/show_bug.cgi?id=40968 Differential Revision: https://reviews.llvm.org/D59062 llvm-svn: 355865
* [AArch64][GlobalISel] Fix i1 arguments not being zero-extended as required ↵Amara Emerson2019-03-081-0/+9
| | | | | | | | by ABI. Fixes PR41001. llvm-svn: 355745
* Revert "[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT"Jessica Paquette2019-03-052-197/+0
| | | | | | | | | | | This broke test-suite::aarch64_neon_intrinsics.test Reverting while I look into it. Example failure: http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/17740 llvm-svn: 355408
* [GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELTJessica Paquette2019-03-042-0/+197
| | | | | | | | | | | | | This adds instruction selection support for G_EXTRACT_VECTOR_ELT for cases where the index is defined by a G_CONSTANT. It also factos out the lane copy opcode selection part into its own function, `getLaneCopyOpcode`. This is used by both `selectUnmergeValues` and `selectExtractElt`. Differential Revision: https://reviews.llvm.org/D58469 llvm-svn: 355344
* [GlobalISel][AArch64] Legalize vector G_SELECTJessica Paquette2019-03-041-0/+69
| | | | | | | | Just scalarize it, and add a test showing it works. Differential Revision: https://reviews.llvm.org/D58747 llvm-svn: 355339
* Re-commit r355104: "[AArch64][GlobalISel] Add support for 64 bit vector ↵Amara Emerson2019-03-041-25/+49
| | | | | | | | | | | | shuffle using TBL1." The code to materialize a mask from a constant pool load tried to use a 128 bit LDR to load a 64 bit constant pool entry, which was 8 byte aligned. This resulted in a link failure in the NEON tests in the test suite since the LDR address was unaligned. This change fixes that to instead emit a 64 bit LDR if the entry is 64 bit, before converting back to a 128 bit register for the TBL. llvm-svn: 355326
* Revert "[AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1."Amara Emerson2019-02-281-47/+25
| | | | | | Seems to break some neon intrinsics tests. llvm-svn: 355115
* [AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1.Amara Emerson2019-02-281-25/+47
| | | | | | | | | This extends the existing support for shufflevector to handle cases like <2 x float>, which we can implement by concating the vectors and using a TBL1. Differential Revision: https://reviews.llvm.org/D58684 llvm-svn: 355104
* Re-land "[AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTOR""Amara Emerson2019-02-213-1/+206
| | | | | | | Thanks to Richard Trieu for pointing out that the failures were due to a use-after-free of an ArrayRef. llvm-svn: 354616
* Revert "[AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTOR"Amara Emerson2019-02-213-206/+1
| | | | | | This reverts r354521 because it broke the bots, but passes on Darwin somehow. llvm-svn: 354532
* [GlobalISel] Add -O0 to some tests to see if it fixes them. I can't ↵Amara Emerson2019-02-202-2/+2
| | | | | | | | | | reproduce the failures locally, and greendragon also passes, but some other bots fail for reasons I don't understand. The only difference I can see between these tests is it's missing an -O0 If this doesn't work I'll revert and continue investigating. llvm-svn: 354529
* [AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTORAmara Emerson2019-02-203-1/+206
| | | | | | | | | | | | | | This change makes some basic type combinations for G_SHUFFLE_VECTOR legal, and implements them with a very pessimistic TBL2 instruction in the selector. For TBL2, support is also needed to generate constant pool entries and load from them in order to materialize the mask register. Currently supports <2 x s64> and <4 x s32> result types. Differential Revision: https://reviews.llvm.org/D58466 llvm-svn: 354521
* [GlobalISel][AArch64] Legalize + select some llvm.ctlz.* intrinsicsJessica Paquette2019-02-182-1/+201
| | | | | | | | | | | | Legalize/select llvm.ctlz.* Add select-ctlz to show that we actually select them. Update arm64-clrsb.ll and arm64-vclz.ll to show that we perform valid transformations in optimized builds, and document where GISel can improve. Differential Revision: https://reviews.llvm.org/D58155 llvm-svn: 354299
OpenPOWER on IntegriCloud