path: root/llvm/lib/Target/AArch64/AArch64InstructionSelector.cpp
Commit message | Author | Age | Files | Lines
...
* [AArch64][GlobalISel] Use TST for comparisons when possibleJessica Paquette2019-07-081-45/+98
| | | | | | | | | | | | | | | | | Porting over the part of `emitComparison` in AArch64ISelLowering where we use TST to represent a compare. - Rename `tryOptCMN` to `tryFoldIntegerCompare`, since it now also emits TSTs when possible. - Add a utility function for emitting a TST with register operands. - Rename opt-fold-cmn.mir to opt-fold-compare.mir, since it now also tests the TST fold. Differential Revision: https://reviews.llvm.org/D64371 llvm-svn: 365404
* Fix precedence in assert from r364961Jessica Paquette2019-07-031-1/+2
| | | | | | | | Precedence was wrong in an assert added in r364961. Add braces around the assertion condition to make it right. See: https://reviews.llvm.org/D64084 llvm-svn: 365069
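A generic illustration of this kind of precedence slip follows. It is not the assertion from r364961, which is not shown in this log; the function names are hypothetical. In C++, `&&` binds tighter than `||`, so an unparenthesized condition can group differently from what was intended.

```cpp
// Hypothetical illustration; not the actual assertion from r364961.
// An unparenthesized `a && b || c` parses as `(a && b) || c`.
bool asParsed(bool a, bool b, bool c) { return (a && b) || c; }

// One possible intended grouping: `a` must hold in every case.
bool asIntended(bool a, bool b, bool c) { return a && (b || c); }
```

With `a = false, b = false, c = true` the two groupings disagree, which is how such a bug can let an assertion pass when it should fire.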
* [GlobalISel][AArch64] Use getConstantVRegValWithLookThrough for selectArithImmedJessica Paquette2019-07-031-6/+4
| | | | | | | | | | | Instead of only checking whether we have a G_CONSTANT, look through G_TRUNCs, G_SEXTs, and G_ZEXTs. This gives an average ~1.3% code size improvement on CINT2000 at -O3. Differential Revision: https://reviews.llvm.org/D64108 llvm-svn: 365063
* [AArch64][GlobalISel] Overhaul legalization & isel of shifts to select ↵Amara Emerson2019-07-031-15/+179
| | | | | | | | | | | | | | | | | | | | | | | | | immediate forms. There are two main issues preventing us from generating immediate form shifts: 1) We have partial SelectionDAG imported support for G_ASHR and G_LSHR shift immediate forms, but they currently don't work because the amount type is expected to be an s64 constant, but we only legalize them to have homogeneous types. To deal with this, first we introduce a custom legalizer to *only* custom legalize s32 shifts which have a constant operand into a s64. There is also an additional artifact combiner to fold zexts(g_constant) to a larger G_CONSTANT if it's legal, a counterpart to the anyext version committed in an earlier patch. 2) For G_SHL the importer can't cope with the pattern. For this I introduced an early selection phase in the arm64 selector to select these forms manually before the tablegen selector pessimizes it to a register-register variant. Differential Revision: https://reviews.llvm.org/D63910 llvm-svn: 364994
* [AArch64][GlobalISel] Teach tryOptSelect to handle G_ICMPJessica Paquette2019-07-021-106/+139
| | | | | | | | | | | | | | | | | | | | This teaches `tryOptSelect` to handle folding G_ICMP, and removes the requirement that the G_SELECT we're dealing with is floating point. Some refactoring to make this work nicely as well: - Factor out the scalar case from the selection code for G_ICMP into `emitIntegerCompare`. - Make `tryOptCMN` return a MachineInstr* instead of a bool. - Make `tryOptCMN` not modify the instruction being selected. - Factor out the CMN emission into `emitCMN` for readability. By doing it this way, we can get all of the compare selection optimizations in select emission. Differential Revision: https://reviews.llvm.org/D64084 llvm-svn: 364961
* AArch64/GlobalISel: Fix trying to select invalid MIRMatt Arsenault2019-07-011-18/+15
| | | | | | Physical registers are not allowed to be a phi operand. llvm-svn: 364810
* GlobalISel: Remove unsigned variant of SrcOpMatt Arsenault2019-06-241-102/+102
| | | | | | | | | Force using Register. One downside is the generated register enums require explicit conversion. llvm-svn: 364194
* CodeGen: Introduce a class for registersMatt Arsenault2019-06-241-3/+3
| | | | | | | | | Avoids using a plain unsigned for registers throughout codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191
* [COFF, ARM64] Fix encoding of debugtrap for WindowsTom Tan2019-06-211-0/+5
| | | | | | | | | | | | On Windows ARM64, the intrinsic __debugbreak is compiled into brk #0xF000, and Clang maps it to llvm.debugtrap. The instruction brk #0xF000 is the defined breakpoint instruction on ARM64 and is recognized by the Windows debugger and exception handling code, so llvm.debugtrap should map to it instead of redirecting to llvm.trap (brk #1) as the default implementation does. Differential Revision: https://reviews.llvm.org/D63635 llvm-svn: 364115
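To make the two encodings concrete, here is a small sketch of the A64 `BRK` encoding (fixed bits `0xD4200000`, 16-bit immediate in bits [20:5]). The helper name `encodeBrk` is hypothetical and is not part of LLVM.

```cpp
#include <cstdint>

// A64 BRK: fixed opcode bits 0xD4200000 with imm16 placed in bits [20:5].
// `encodeBrk` is an illustrative helper, not an LLVM function.
constexpr std::uint32_t encodeBrk(std::uint16_t imm16) {
  return 0xD4200000u | (static_cast<std::uint32_t>(imm16) << 5);
}

// brk #0xF000: the breakpoint the Windows debugger recognizes.
static_assert(encodeBrk(0xF000) == 0xD43E0000u, "brk #0xF000");
// brk #1: what llvm.trap lowers to according to the message above.
static_assert(encodeBrk(1) == 0xD4200020u, "brk #1");
```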
* [AArch64][GlobalISel] Implement selection support for the new G_JUMP_TABLE ↵Amara Emerson2019-06-211-0/+45
| | | | | | | | | | and G_BRJT ops. With this we can now fully code generate jump tables, which is important for code size. Differential Revision: https://reviews.llvm.org/D63223 llvm-svn: 364086
* [AArch64][GlobalISel] Make s8 and s16 G_CONSTANTs legal.Amara Emerson2019-06-211-2/+5
| | | | | | | | | | | | | | | | | | | | | We sometimes get poor code size because constants of types < 32b are legalized as 32 bit G_CONSTANTs with a truncate to fit. This works but means that the localizer can no longer sink them (although it's possible to extend it to do so). On AArch64 however s8 and s16 constants can be selected in the same way as s32 constants, with a mov pseudo into a W register. If we make s8 and s16 constants legal then we can avoid unnecessary truncates, they can be CSE'd, and the localizer can sink them as normal. There is a caveat: if the user of a smaller constant has to widen the sources, we end up with an anyext of the smaller typed G_CONSTANT. This can cause regressions because of the additional extend and missed pattern matching. To remedy this, there's a new artifact combiner to generate the wider G_CONSTANT if it's legal for the target. Differential Revision: https://reviews.llvm.org/D63587 llvm-svn: 364075
* [GlobalISel][AArch64] Fold G_SUB into G_ICMP when it's safe to do soJessica Paquette2019-06-171-16/+144
| | | | | | | | | | | | | | | | | | | | | | | | | Basically porting over the behaviour in AArch64ISelLowering to GISel. See emitComparison for reference. When we have something like this: ``` lhs = G_SUB 0, y ... G_ICMP lhs, rhs ``` We can fold away the G_SUB and produce a cmn instead, given that we produce the same value in NZCV. Add a test showing that the transformation works, and also showing that we don't perform the transformation when it's unsafe. Also factor out the CSet emission into emitCSetForICMP. Differential Revision: https://reviews.llvm.org/D63163 llvm-svn: 363596
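The fold can be checked with a toy model of 32-bit flag computation. This is only a sketch: the struct and function names are hypothetical, and it does not reproduce the exact safety check the patch performs. It shows that `cmp x, (0 - y)` and `cmn x, y` always agree on N and Z, while C can differ (for example when `y` is 0), so conditions that read the carry flag are the risky ones.

```cpp
#include <cstdint>

struct NZCV { bool n, z, c, v; };

// Flags produced by ADDS a, b (CMN is ADDS with the result discarded).
NZCV addsFlags(std::uint32_t a, std::uint32_t b) {
  std::uint32_t r = a + b;
  bool n = r >> 31;
  bool z = r == 0;
  bool c = r < a;                            // unsigned carry out
  bool v = (~(a ^ b) & (a ^ r)) >> 31;       // signed overflow
  return {n, z, c, v};
}

// Flags produced by SUBS a, b (CMP is SUBS with the result discarded).
NZCV subsFlags(std::uint32_t a, std::uint32_t b) {
  std::uint32_t r = a - b;
  bool n = r >> 31;
  bool z = r == 0;
  bool c = a >= b;                           // carry set means no borrow
  bool v = ((a ^ b) & (a ^ r)) >> 31;        // signed overflow
  return {n, z, c, v};
}

// Unfolded form: lhs = G_SUB 0, y; G_ICMP x, lhs.
NZCV cmpAgainstNegated(std::uint32_t x, std::uint32_t y) {
  return subsFlags(x, 0u - y);
}

// Folded form: cmn x, y.
NZCV cmn(std::uint32_t x, std::uint32_t y) { return addsFlags(x, y); }
```

Equality-style conditions only read Z, which is why they are unaffected by the fold in this model.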
* [AArch64][GlobalISel] Select immediate forms of cmp instructions.Amara Emerson2019-06-091-5/+17
| | | | | | | | A simple re-use of the immediate operand matcher and renderer functions. rdar://43795178 llvm-svn: 362896
* [AArch64][GlobalISel] Add manual selection support for G_ZEXTLOADs to s64.Amara Emerson2019-06-061-0/+23
| | | | | | | | | | | We already get support for G_ZEXTLOAD to s32 from the importer, but it can't deal with the SUBREG_TO_REG in the pattern. Tweaking the existing manual selection code for G_LOAD to handle an additional SUBREG_TO_REG when dealing with G_ZEXTLOAD isn't much work. Also add tests to check the imported pattern selections to s32 work. llvm-svn: 362681
* [AArch64][GlobalISel] Add the new changes to fix PR42129 that were supposed ↵Amara Emerson2019-06-061-0/+5
| | | | | | | | to go into r362666. The changes weren't staged so ended up just re-committing the unmodified reverted change. llvm-svn: 362677
* Revert "Revert "[AArch64][GlobalISel] Optimize G_FCMP + G_SELECT pairs when ↵Amara Emerson2019-06-051-8/+96
| | | | | | | | | | | | G_SELECT is fp"" When looking through copies, make sure to not try to find the vreg def of a physreg. Normally getVRegDef will return nullptr in this case, but if there happens to be multiple defs then it will assert. This fixes PR42129. llvm-svn: 362666
* Revert "[AArch64][GlobalISel] Optimize G_FCMP + G_SELECT pairs when G_SELECT ↵Petr Hosek2019-06-051-96/+8
| | | | | | | | is fp" This reverts commit r362435 as this triggers ICE, see PR42129 for details. llvm-svn: 362662
* [AArch64][GlobalISel] Optimize G_FCMP + G_SELECT pairs when G_SELECT is fpJessica Paquette2019-06-031-8/+96
| | | | | | | | | | | | | | | | Instead of emitting all of the test stuff for a compare when it's only used by a select, just emit the compare + select. The select will use the value of NZCV correctly, so we don't need to emit all of the test instructions etc. For now, only support fp selects which use G_FCMP. Also only support condition codes which will only require one select to represent. Also add a test. Differential Revision: https://reviews.llvm.org/D62695 llvm-svn: 362446
* [AArch64][GlobalISel] Select FCMPSri/FCMPDri when comparing against 0.0Jessica Paquette2019-05-281-13/+27
| | | | | | | | | | | Add support for selecting FCMPSri and FCMPDri when comparing against 0.0, and factor out opcode selection for G_FCMP into its own function. Add a test to show that we don't do this with other immediates. Differential Revision: https://reviews.llvm.org/D62539 llvm-svn: 361888
* [AArch64][GlobalISel] Use fcsel instead of csel for G_SELECT on FPRsJessica Paquette2019-05-031-6/+42
| | | | | | | | | | | | | | | | | | | | | | | | This saves us some unnecessary copies. If the inputs to a G_SELECT are floating point, we should use fcsel rather than csel. Changes here are... - Teach selectCopy about s1-to-s1 copies across register banks. - Teach AArch64RegisterBankInfo about G_SELECT in general. - Teach the instruction selector about the FCSEL instructions. Also add two tests: - select-select.mir to show that we get the expected FCSEL - regbank-select.mir (unfortunately named) to show the register banks on G_SELECT are properly preserved And update fast-isel-select.ll to show that we do the same thing as other instruction selectors in these cases. llvm-svn: 359940
* [GlobalISel][AArch64] Use fmov for G_FCONSTANT when possibleJessica Paquette2019-05-011-2/+46
| | | | | | | | | | This adds support for using fmov rather than a standard mov to materialize G_FCONSTANT when it's safe to do so. Update arm64-fast-isel-materialize.ll and select-constant.mir to show that the selection is correct. llvm-svn: 359734
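The message does not spell out when fmov is "safe". As an assumption, the sketch below tests the usual AArch64 8-bit floating-point immediate form, ±(16..31)/16 × 2^e with e in [-3, 4]. The helper name `fitsFMovImm8` is hypothetical, and the real selector may apply further checks, such as the constant's type.

```cpp
#include <cmath>

// Assumed criterion: `value` is +/-(16..31)/16 * 2^e with e in [-3, 4].
// `fitsFMovImm8` is an illustrative helper, not an LLVM function.
bool fitsFMovImm8(double value) {
  if (!std::isfinite(value) || value == 0.0)
    return false;                                   // 0.0 has no imm8 encoding
  int exp = 0;
  double frac = std::frexp(std::fabs(value), &exp); // frac in [0.5, 1)
  double scaled = frac * 32.0;                      // in [16, 32)
  bool fourBitMantissa = scaled == std::floor(scaled);
  int e = exp - 1;                                  // exponent of the 1.m form
  return fourBitMantissa && e >= -3 && e <= 4;
}
```

Under this criterion 1.0, -2.5, 0.125, and 31.0 qualify, while 0.0, 0.1, and 32.0 do not and would need another materialization strategy.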
* [GlobalISel][AArch64] Select llvm.aarch64.crypto.sha1hJessica Paquette2019-04-291-5/+68
| | | | | | | | | | This was falling back and gives us a reason to create a selectIntrinsic function which we would need eventually anyway. Update arm64-crypto.ll to show that we correctly select it. Also factor out the code for finding an intrinsic ID. llvm-svn: 359501
* [GlobalISel][AArch64] Use getConstantVRegValWithLookThrough for extractsJessica Paquette2019-04-261-38/+6
| | | | | | | | | | | | getConstantVRegValWithLookThrough does the same thing as the getConstantValueForReg function, and has more visibility across GISel. Plus, it supports looking through G_TRUNC, G_SEXT, and G_ZEXT. So, we get better code reuse and more functionality for free by using it. Add some test cases to select-extract-vector-elt.mir to show that we can now look through those instructions. llvm-svn: 359351
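The look-through idea can be sketched on a toy IR. Everything below (`Def`, `Op`, `lookThroughConstant`) is hypothetical and deliberately does not mirror the LLVM signature, which works on machine registers rather than vector indices. It only shows how a constant can be recovered through truncations and extensions.

```cpp
#include <cstdint>
#include <vector>

enum class Op { Constant, Trunc, SExt, ZExt, Other };

// One toy "instruction"; its index in the vector is the register it defines.
struct Def {
  Op op;
  unsigned bits;     // width of the defined value, 1..64
  std::uint64_t imm; // only meaningful for Op::Constant
  int src;           // index of the operand's definition, unused for Constant
};

static std::uint64_t maskTo(std::uint64_t v, unsigned bits) {
  return bits >= 64 ? v : v & ((std::uint64_t{1} << bits) - 1);
}

// Returns true and sets `out` if `reg` is a constant seen through ext/trunc.
bool lookThroughConstant(const std::vector<Def> &defs, int reg,
                         std::uint64_t &out) {
  const Def &d = defs[reg];
  if (d.op == Op::Constant) {
    out = maskTo(d.imm, d.bits);
    return true;
  }
  if (d.op == Op::Other)
    return false; // any other defining instruction stops the search
  std::uint64_t inner = 0;
  if (!lookThroughConstant(defs, d.src, inner))
    return false;
  unsigned srcBits = defs[d.src].bits;
  if (d.op == Op::SExt && srcBits < 64 && ((inner >> (srcBits - 1)) & 1))
    inner |= ~std::uint64_t{0} << srcBits; // replicate the sign bit upwards
  out = maskTo(inner, d.bits);             // Trunc and ZExt only need the mask
  return true;
}
```

A selector that only accepted a direct constant definition would give up at the first extension or truncation in such a chain; walking the chain is what exposes the extra folds.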
* [AArch64][GlobalISel] Select G_BSWAP for vectors of s32 and s64Jessica Paquette2019-04-261-0/+37
| | | | | | | | | There are instructions for these, so mark them as legal. Select the correct instruction in AArch64InstructionSelector.cpp. Update select-bswap.mir and arm64-rev.ll to reflect the changes. llvm-svn: 359331
* Fix alignment in AArch64InstructionSelector::emitConstantPoolEntry()Hans Wennborg2019-04-261-1/+1
| | | | | | | | | | | | | | The code was using the alignment of a pointer to the value, not the alignment of the constant itself. Maybe we got away with it so far because the pointer alignment is fairly high, but we did end up under-aligning <16 x i8> vectors, which was caught in the Chromium build after lld stopped over-aligning the .rodata.cst16 section in r356428. (See crbug.com/953815) Differential revision: https://reviews.llvm.org/D61124 llvm-svn: 359287
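The difference between the two alignments can be shown in plain C++. This is an analogy using `alignof`, not the LLVM DataLayout calls the patch touches; `Vec16` and both helper names are hypothetical stand-ins.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for a <16 x i8> constant: 16 bytes that need 16-byte alignment.
struct alignas(16) Vec16 { std::uint8_t bytes[16]; };

// Buggy shape: the alignment of a pointer to the constant.
template <typename T> constexpr std::size_t pointerAlignment() {
  return alignof(T *);
}

// Fixed shape: the alignment of the constant itself.
template <typename T> constexpr std::size_t valueAlignment() {
  return alignof(T);
}

static_assert(valueAlignment<Vec16>() == 16, "constant needs 16 bytes");
```

On common 64-bit targets the pointer alignment is 8, which is enough for most scalars but under-aligns a 16-byte vector, matching the symptom described above.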
* [AArch64][GlobalISel] Select G_INTRINSIC_ROUNDJessica Paquette2019-04-231-0/+58
| | | | | | | Add selection support for G_INTRINSIC_ROUND, add a selection test, and add check lines to arm64-vfloatintrinsics.ll and f16-instructions.ll. llvm-svn: 359046
* [AArch64][GlobalISel] Actually select G_INTRINSIC_TRUNCJessica Paquette2019-04-231-1/+58
| | | | | | | | | | | | | | Apparently FileCheck wasn't actually matching the fallback check lines in arm64-vfloatintrinsics.ll properly. So, there were selection fallbacks for G_INTRINSIC_TRUNC there. Actually hook it up into AArch64InstructionSelector.cpp and write a proper selection test. I guess I'll figure out the FileCheck magic to make the fallback checks work properly in arm64-vfloatintrinsics.ll. llvm-svn: 359030
* [AArch64][GlobalISel] Fix a crash when selecting shufflevectors with an ↵Amara Emerson2019-04-121-7/+17
| | | | | | | | | | | undef mask element. If a shufflevector's mask vector has an "undef" element, then the generic instruction defining that element register is a G_IMPLICIT_DEF instead of a G_CONSTANT. This fixes the selector to handle this case, and for now assumes that undef just means zero. In future we'll optimize this case properly. llvm-svn: 358312
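A minimal sketch of the "undef means zero" policy, assuming the shuffle mask is expanded into per-byte indices for a table lookup. The marker `kUndef` and the function `buildByteMask` are hypothetical; the actual selector code is not shown in this log.

```cpp
#include <cstdint>
#include <vector>

constexpr int kUndef = -1; // hypothetical marker for an undef mask element

// Expands an element-level shuffle mask into byte indices, treating undef
// elements as element 0, as the commit message describes.
std::vector<std::uint8_t> buildByteMask(const std::vector<int> &mask,
                                        unsigned bytesPerElt) {
  std::vector<std::uint8_t> bytes;
  for (int elt : mask) {
    unsigned idx = elt == kUndef ? 0u : static_cast<unsigned>(elt);
    for (unsigned b = 0; b < bytesPerElt; ++b)
      bytes.push_back(static_cast<std::uint8_t>(idx * bytesPerElt + b));
  }
  return bytes;
}
```

For a mask of `{1, undef}` with 4-byte elements this yields byte indices 4 to 7 followed by 0 to 3. Picking element 0 is always valid because any value is acceptable for an undef lane.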
* [AArch64][GlobalISel] Legalization and ISel support for load/stores of ↵Amara Emerson2019-04-111-5/+4
| | | | | | | | | | | | | | vectors of pointers. Loads and stores of values with a type like <2 x p0> currently don't get imported because SelectionDAG has no knowledge of pointer types. To leverage the existing support for vector load/stores, we can bitcast the value to have s64 element types instead. We do this as a custom legalization. This patch also adds support for general loads of <2 x s64>, and relaxes some type conditions on selecting G_BITCAST. Differential Revision: https://reviews.llvm.org/D60534 llvm-svn: 358221
* [NFC] Fix unused variable warning.Clement Courbet2019-04-101-3/+0
| | | | llvm-svn: 358080
* [AArch64][GlobalISel] Add isel support for vector G_ICMP and G_ASHR & G_SHLAmara Emerson2019-04-091-2/+259
| | | | | | | | | | | | | | | | The selection for G_ICMP is unfortunately not currently importable from SDAG due to the use of custom SDNodes. To support this, this selection method has an opcode table which has been generated by a script, indexed by various instruction properties. Ideally in future we will have GISel-native selection patterns that we can write in tablegen to improve on this. For selection of some types we also need support for G_ASHR and G_SHL which are generated as a result of legalization. This patch also adds support for them, generating the same code as SelectionDAG currently does. Differential Revision: https://reviews.llvm.org/D60436 llvm-svn: 358035
* [AArch64][GlobalISel] Select llvm.aarch64.stlxr(i64, i64*)Jessica Paquette2019-04-021-8/+68
| | | | | | | | | | | | | This adds partial instruction selection support for llvm.aarch64.stlxr. It also factors out selection for G_INTRINSIC_W_SIDE_EFFECTS into its own function. The new function removes the restriction that the intrinsic ID on the G_INTRINSIC_W_SIDE_EFFECTS be on operand 0. Also add a test, and add a GISel line to arm64-ldxr-stxr.ll. Differential Revision: https://reviews.llvm.org/D60100 llvm-svn: 357518
* [GlobalISel][AArch64] Add isel support for G_INSERT_VECTOR_ELT on v2s32sJessica Paquette2019-03-291-6/+45
| | | | | | | | | This adds support for v2s32 vector inserts, and updates the selection + regbankselect tests for G_INSERT_VECTOR_ELT. Differential Revision: https://reviews.llvm.org/D59910 llvm-svn: 357318
* [AArch64][GlobalISel] Add an optimization to select vector DUP instructions.Amara Emerson2019-03-191-0/+105
| | | | | | | | | This adds pattern matching for the insert+shufflevector sequence so we can generate dup instructions instead of the current TBL sequence. Differential Revision: https://reviews.llvm.org/D59558 llvm-svn: 356526
* Revert r356304: remove subreg parameter from MachineIRBuilder::buildCopy()Amara Emerson2019-03-181-6/+13
| | | | | | | | | | | | | After review comments, it was preferred to not teach MachineIRBuilder about non-generic instructions beyond using buildInstr(). For AArch64 I've changed the buildCopy() calls to buildInstr() + a separate addReg() call. This also relaxes the MachineIRBuilder's COPY checking more because it may not always have a SrcOp given to it. llvm-svn: 356396
* [GlobalISel] Allow MachineIRBuilder to build subregister copies.Amara Emerson2019-03-151-38/+21
| | | | | | | | | | | | This relaxes some asserts about sizes, and adds an optional subreg parameter to buildCopy(). Also update AArch64 instruction selector to use this in places where we previously used MachineInstrBuilder manually. Differential Revision: https://reviews.llvm.org/D59434 llvm-svn: 356304
* [AArch64][GlobalISel] Add isel support for G_UADDO on s32s and s64sJessica Paquette2019-03-141-0/+37
| | | | | | | | | | | | This adds instruction selection support for G_UADDO on s32s and s64s. Also - Add an instruction selection test - Update the arm64-xaluo.ll test to show that we generate the correct assembly Differential Revision: https://reviews.llvm.org/D58734 llvm-svn: 356214
* [AArch64][GlobalISel] Implement selection for G_UNMERGE of vectors to vectors.Amara Emerson2019-03-141-53/+99
| | | | | | | | | This re-uses the previous support for extract vector elt to extract the subvectors. Differential Revision: https://reviews.llvm.org/D59390 llvm-svn: 356213
* [AArch64][GlobalISel] Add some support for G_CONCAT_VECTORS.Amara Emerson2019-03-141-5/+29
| | | | | | | | Handles concatenating 2 x v2s32 and 2 x v4s16 Differential Revision: https://reviews.llvm.org/D59390 llvm-svn: 356212
* [GlobalISel][AArch64] Add partial selection support for G_INSERT_VECTOR_ELTJessica Paquette2019-03-141-0/+39
| | | | | | | | | | | This adds support for inserting elements into packed vectors. It also adds two tests: one for selection, and one for regbank select. Unpacked vectors will come in a follow-up. Differential Revision: https://reviews.llvm.org/D59325 llvm-svn: 356182
* [AArch64][GlobalISel] Gardening: Simplify subregister copy in selectBuildVectorJessica Paquette2019-03-131-20/+16
| | | | | | | | | | NFC. Some more preliminary factoring for G_INSERT_VECTOR_ELT. Also better code-reuse, etc., etc. Differential Revision: https://reviews.llvm.org/D59323 llvm-svn: 356107
* [GlobalISel][AArch64] Gardening: Factor out vector insertsJessica Paquette2019-03-131-33/+47
| | | | | | | | | | | Factor out the vector insert code in `selectBuildVector`. Replace part of it with `emitScalarToVector`, since it was pretty much equivalent. This will make implementing G_INSERT_VECTOR_ELT easier. Differential Revision: https://reviews.llvm.org/D59322 llvm-svn: 356106
* [GlobalISel][AArch64] Gardening: Factor out code to find lane indicesJessica Paquette2019-03-131-22/+37
| | | | | | | | | | | | | Some more refactoring for G_INSERT_VECTOR_ELT. Factor out the code used to find a lane index from `selectExtractElt`. Put it into a more general-purpose `getConstantValueForReg` function. This will be shared with the code for G_INSERT_VECTOR_ELT. Differential Revision: https://reviews.llvm.org/D59324 llvm-svn: 356101
* Recommit "[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT"Jessica Paquette2019-03-111-17/+136
| | | | | | | | | After r355865, we should be able to safely select G_EXTRACT_VECTOR_ELT without running into any problematic intrinsics. Also add a fix for lane copies, which don't support index 0. llvm-svn: 355871
* Revert "[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT"Jessica Paquette2019-03-051-125/+17
| | | | | | | | | | | This broke test-suite::aarch64_neon_intrinsics.test Reverting while I look into it. Example failure: http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/17740 llvm-svn: 355408
* [GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELTJessica Paquette2019-03-041-17/+125
| | | | | | | | | | | | | This adds instruction selection support for G_EXTRACT_VECTOR_ELT for cases where the index is defined by a G_CONSTANT. It also factors out the lane copy opcode selection part into its own function, `getLaneCopyOpcode`. This is used by both `selectUnmergeValues` and `selectExtractElt`. Differential Revision: https://reviews.llvm.org/D58469 llvm-svn: 355344
* Re-commit r355104: "[AArch64][GlobalISel] Add support for 64 bit vector ↵Amara Emerson2019-03-041-36/+153
| | | | | | | | | | | | shuffle using TBL1." The code to materialize a mask from a constant pool load tried to use a 128 bit LDR to load a 64 bit constant pool entry, which was 8 byte aligned. This resulted in a link failure in the NEON tests in the test suite since the LDR address was unaligned. This change fixes that to instead emit a 64 bit LDR if the entry is 64 bit, before converting back to a 128 bit register for the TBL. llvm-svn: 355326
* [AArch64/ARM] Fix two compiler warnings in InstructionSelector, NFCIJonas Hahnfeld2019-03-041-0/+1
| | | | | | | | | | | 1) GCC complains that KnownValid is set but not used. 2) In ARMInstructionSelector::selectGlobal() the code is mixing "enumeral and non-enumeral type in conditional expression". Solve this by casting to unsigned which is the final type anyway. Differential Revision: https://reviews.llvm.org/D58834 llvm-svn: 355304
* Revert "[AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1."Amara Emerson2019-02-281-118/+26
| | | | | | Seems to break some neon intrinsics tests. llvm-svn: 355115
* [AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1.Amara Emerson2019-02-281-26/+118
| | | | | | | | | This extends the existing support for shufflevector to handle cases like <2 x float>, which we can implement by concatenating the vectors and using a TBL1. Differential Revision: https://reviews.llvm.org/D58684 llvm-svn: 355104