path: root/llvm/lib/Target/AArch64
Commit message | Author | Age | Files | Lines
...
* Revert r334980 and 334983 | Vlad Tsyrklevich | 2018-06-20 | 6 | -154/+18
  This reverts commits r334980 and r334983 because they were causing build timeouts on the x86_64-linux-ubsan bot.
  llvm-svn: 335085
* [MachineOutliner] NFC: Remove insertOutlinerPrologue, rename insertOutlinerEpilogue | Jessica Paquette | 2018-06-19 | 2 | -8/+2
  insertOutlinerPrologue was not used by any target, and prologue-esque code was beginning to appear in insertOutlinerEpilogue. Refactor that into one function, buildOutlinedFrame.
  This just removes insertOutlinerPrologue and renames insertOutlinerEpilogue.
  llvm-svn: 335076
* [AArch64][SVE] Asm: Fix predicate pattern diagnostics. | Sander de Smalen | 2018-06-18 | 1 | -4/+6
  This patch uses the DiagnosticPredicate for SVE predicate patterns to improve their diagnostics, now giving an 'invalid operand' diagnostic if the type is not an immediate or one of the expected pattern labels.
  Reviewers: samparker, SjoerdMeijer, javed.absar, fhahn
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D48220
  llvm-svn: 334983
* [AArch64][SVE] Asm: Support for saturating INC/DEC (32bit scalar) instructions. | Sander de Smalen | 2018-06-18 | 6 | -14/+148
  The variants added by this patch are:
  - SQINC signed increment, e.g. sqinc x0, w0, all, mul #4
  - SQDEC signed decrement, e.g. sqdec x0, w0, all, mul #4
  - UQINC unsigned increment, e.g. uqinc w0, all, mul #4
  - UQDEC unsigned decrement, e.g. uqdec w0, all, mul #4
  This patch includes asmparser changes to parse a GPR64 as a GPR32 in order to satisfy the constraint check:
    x0 == GPR64(w0)
  in:
    sqinc x0, w0, all, mul #4
          ^___^ (must match)
  Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47716
  llvm-svn: 334980
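The saturating behaviour named in the commit above can be sketched as follows. This is a hypothetical model, not LLVM or parser code: the function names are invented, and the element count selected by the pattern (`all` in the examples) is passed in as a parameter because it depends on the hardware vector length. As I understand the SVE architecture, the signed 32-bit result is sign-extended into the 64-bit destination, which is why the destination `x0` must name the same register as the source `w0`.

```cpp
#include <cstdint>

// Hypothetical model of a signed saturating increment of a 32-bit scalar.
// `elements` is the element count chosen by the predicate pattern (assumed
// input), `mul` is the "mul #imm" multiplier.
constexpr int64_t sqinc32(int32_t w, int64_t elements, int64_t mul) {
  int64_t sum = static_cast<int64_t>(w) + elements * mul;
  if (sum > INT32_MAX) sum = INT32_MAX;  // clamp to the signed 32-bit range
  if (sum < INT32_MIN) sum = INT32_MIN;
  return sum;  // the 32-bit result, sign-extended to 64 bits
}

// Hypothetical model of the unsigned saturating variant.
constexpr uint32_t uqinc32(uint32_t w, uint64_t elements, uint64_t mul) {
  uint64_t sum = static_cast<uint64_t>(w) + elements * mul;
  return sum > UINT32_MAX ? UINT32_MAX : static_cast<uint32_t>(sum);
}
```

The decrement forms would mirror these with a subtraction and a clamp at the lower bound.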
* [AArch64][SVE] Asm: Support for saturating INC/DEC (64bit scalar) instructions. | Sander de Smalen | 2018-06-18 | 2 | -0/+51
  Summary: The variants added by this patch are:
  - SQINC (signed increment)
  - UQINC (unsigned increment)
  - SQDEC (signed decrement)
  - UQDEC (unsigned decrement)
  For example:
    uqincw x0, all, mul #4
  Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
  Differential Revision: https://reviews.llvm.org/D47715
  llvm-svn: 334948
* [AArch64][SVE] Asm: Support for vector element compares. | Sander de Smalen | 2018-06-18 | 2 | -0/+102
  This patch adds instructions for comparing elements from two vectors, e.g.
    cmpgt p0.s, p0/z, z0.s, z1.s
  and also adds support for comparing to a 64-bit wide element vector, e.g.
    cmpgt p0.s, p0/z, z0.s, z1.d
  The patch also contains aliases for certain comparisons, e.g.:
    cmple p0.s, p0/z, z0.s, z1.s => cmpge p0.s, p0/z, z1.s, z0.s
    cmplo p0.s, p0/z, z0.s, z1.s => cmphi p0.s, p0/z, z1.s, z0.s
    cmpls p0.s, p0/z, z0.s, z1.s => cmphs p0.s, p0/z, z1.s, z0.s
    cmplt p0.s, p0/z, z0.s, z1.s => cmpgt p0.s, p0/z, z1.s, z0.s
  llvm-svn: 334931
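The aliases in the commit above work because each "less" comparison equals the matching "greater" comparison with its vector operands swapped. A per-lane sketch, assuming the usual reading of the mnemonics (GE/GT signed, HS/HI unsigned); the function names are illustrative only:

```cpp
#include <cstdint>

// Per-lane models of the "real" comparisons (illustrative names).
constexpr bool cmpge(int32_t a, int32_t b) { return a >= b; }    // signed >=
constexpr bool cmpgt(int32_t a, int32_t b) { return a > b; }     // signed >
constexpr bool cmphs(uint32_t a, uint32_t b) { return a >= b; }  // unsigned >=
constexpr bool cmphi(uint32_t a, uint32_t b) { return a > b; }   // unsigned >

// The aliases swap the two data operands and reuse the comparison above.
constexpr bool cmple(int32_t a, int32_t b) { return cmpge(b, a); }
constexpr bool cmplt(int32_t a, int32_t b) { return cmpgt(b, a); }
constexpr bool cmpls(uint32_t a, uint32_t b) { return cmphs(b, a); }
constexpr bool cmplo(uint32_t a, uint32_t b) { return cmphi(b, a); }
```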
* [AArch64][SVE] Asm: Support for bitwise operations on predicate vectors. | Sander de Smalen | 2018-06-17 | 1 | -0/+29
  This patch adds support for instructions performing bitwise operations on predicate vectors, including AND, BIC, EOR, NAND, NOR, ORN, ORR, and their status flag setting variants ANDS, BICS, EORS, NANDS, ORNS, ORRS.
  This patch also adds several aliases:
    orr p0.b, p1/z, p1.b, p1.b  => mov p0.b, p1.b
    orrs p0.b, p1/z, p1.b, p1.b => movs p0.b, p1.b
    and p0.b, p1/z, p2.b, p2.b  => mov p0.b, p1/z, p2.b
    ands p0.b, p1/z, p2.b, p2.b => movs p0.b, p1/z, p2.b
    eor p0.b, p1/z, p2.b, p1.b  => not p0.b, p1/z, p2.b
    eors p0.b, p1/z, p2.b, p1.b => nots p0.b, p1/z, p2.b
  llvm-svn: 334906
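Why these aliases hold can be checked with a small bitmask model. This is a sketch under the assumption that a zeroing-predicated (`/z`) operation computes the bitwise result and then clears every bit where the governing predicate is inactive; predicates are modelled as 16-bit masks and all names are illustrative.

```cpp
#include <cstdint>

using Pred = uint16_t;  // one bit per predicate element (modelling assumption)

// Zeroing-predicated operations: inactive elements of pg produce 0.
constexpr Pred orr_z(Pred pg, Pred pn, Pred pm) { return static_cast<Pred>(pg & (pn | pm)); }
constexpr Pred and_z(Pred pg, Pred pn, Pred pm) { return static_cast<Pred>(pg & (pn & pm)); }
constexpr Pred eor_z(Pred pg, Pred pn, Pred pm) { return static_cast<Pred>(pg & (pn ^ pm)); }

// What the aliases are expected to compute.
constexpr Pred mov(Pred pn) { return pn; }
constexpr Pred mov_z(Pred pg, Pred pn) { return static_cast<Pred>(pg & pn); }
constexpr Pred not_z(Pred pg, Pred pn) { return static_cast<Pred>(pg & ~pn); }
```

With the governing predicate reused as an operand, `orr` collapses to a plain copy and `eor` collapses to a predicated NOT, which is what the alias table expresses.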
* [AArch64][SVE] Asm: Support for SEL (vector/predicate) instructions. | Sander de Smalen | 2018-06-17 | 2 | -0/+81
  Support for SVE's predicated select instructions to select elements from either vector, both in a data-vector and a predicate-vector variant.
  llvm-svn: 334905
* [AArch64][SVE] Asm: Support for CPY SIMD/FP and GPR instructions. | Sander de Smalen | 2018-06-15 | 2 | -0/+79
  Predicated splat/copy of SIMD/FP register or general purpose register to SVE vector, along with MOV-aliases.
  llvm-svn: 334842
* [AArch64][SVE] Asm: Support for INC/DEC (scalar) instructions. | Sander de Smalen | 2018-06-15 | 5 | -12/+105
  Increment/decrement scalar register by (scaled) element count given by predicate pattern, e.g. 'incw x0, all, mul #4'.
  Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
  Reviewed By: SjoerdMeijer
  Differential Revision: https://reviews.llvm.org/D47713
  llvm-svn: 334838
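As a rough illustration of 'incw x0, all, mul #4': with the `all` pattern the element count is the number of 32-bit words in a vector, so the increment is that count times the multiplier. The sketch below assumes the vector length is supplied in bits (it is a hardware property, not something the assembler knows); the function name is hypothetical.

```cpp
#include <cstdint>

// Hypothetical model of "incw xN, all, mul #mul".
// vl_bits: SVE vector length in bits (assumed input, a multiple of 128).
constexpr uint64_t incw_all(uint64_t x, unsigned vl_bits, unsigned mul) {
  // Number of 32-bit elements in one vector, scaled by the multiplier.
  return x + static_cast<uint64_t>(vl_bits / 32) * mul;
}
```

For a 128-bit vector this adds 4 * 4 = 16; for a 256-bit vector it adds 8 * 4 = 32.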
* [AArch64][SVE] Asm: Support for FADD, FMUL and FMAX immediate instructions. | Sander de Smalen | 2018-06-15 | 3 | -0/+57
  Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
  Reviewed By: javed.absar
  Differential Revision: https://reviews.llvm.org/D47712
  llvm-svn: 334831
* [AArch64][SVE] Asm: Add parsing/printing support for exact FP immediates. | Sander de Smalen | 2018-06-15 | 8 | -46/+164
  Some instructions require one of a limited set of FP immediates as operands, for example '#0.5 or #1.0' for SVE's FADD instruction. This patch adds support for parsing and printing such FP immediates as exact values (e.g. #0.499999 is not accepted for #0.5).
  Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
  Reviewed By: SjoerdMeijer
  Differential Revision: https://reviews.llvm.org/D47711
  llvm-svn: 334826
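The "exact value" rule can be pictured as an equality test against the permitted constants, with no rounding tolerance. The sketch below is not the parser's implementation (which this log does not show); it only models the accept/reject behaviour described for FADD, and the function name is made up.

```cpp
// Hypothetical check: an FADD immediate must be exactly 0.5 or exactly 1.0.
// Both constants are exactly representable as doubles, so == is meaningful.
constexpr bool isExactFAddImm(double v) {
  return v == 0.5 || v == 1.0;
}
```

A tolerance-based comparison would wrongly accept #0.499999, which is the case the commit message calls out.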
* [TableGen] Emit a fatal error on inconsistencies in resource units vs cycles. | Clement Courbet | 2018-06-13 | 1 | -6/+6
  Summary: For targets I'm not familiar with, I've automatically made the "default to 1 for each resource" behaviour explicit in the td files. For more obvious cases, I've ventured a fix.
  Some notes:
  - Exynos is especially fishy.
  - AArch64SchedThunderX2T99.td had some truncated entries. If I understand correctly, the person who wrote that interpreted the ResourceCycle as a range. I made the decision to use the upper/lower bound for consistency with the 'Latency' value. I'm sure there is a better choice.
  - The change to X86ScheduleBtVer2.td is an NFC, it just makes values more explicit.
  Also see PR37310.
  Reviewers: RKSimon, craig.topper, javed.absar
  Subscribers: kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D46356
  llvm-svn: 334586
* [AArch64] Support reserving x20 register | Petr Hosek | 2018-06-12 | 4 | -5/+25
  Register x20 is a callee-saved register which may be used for other purposes in certain contexts, for example to hold special variables within the kernel. This change adds support for reserving this register both to frontend and backend to make this register usable for these purposes.
  Differential Revision: https://reviews.llvm.org/D46552
  llvm-svn: 334531
* [AArch64] Audit on rL333879 to fix FP16 64bit bitpatterns | Luke Geeson | 2018-06-12 | 1 | -2/+2
  llvm-svn: 334488
* [ExynosM1][Sched] Fix resource usage in scheduling model. | Clement Courbet | 2018-06-11 | 1 | -16/+16
  This is part of https://reviews.llvm.org/D46356.
  llvm-svn: 334391
* [AArch64, ARM] Add support for Samsung Exynos M4 | Evandro Menezes | 2018-06-06 | 1 | -0/+1
  Create a separate feature set for Exynos M4 and add test cases.
  llvm-svn: 334115
* [MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup | Peter Smith | 2018-06-06 | 1 | -4/+8
  On targets like Arm some relaxations may only be performed when certain architectural features are available. As functions can be compiled with differing levels of architectural support, we must make a judgement on whether we can relax based on the MCSubtargetInfo for the function. This change passes through the MCSubtargetInfo for the function to fixupNeedsRelaxation so that the decision on whether to relax can be made per function. In this patch, only the ARM backend makes use of this information.
  We must also pass the MCSubtargetInfo to applyFixup because some fixups skip error checking on the assumption that relaxation has occurred. To prevent code-generation errors, applyFixup must see the same MCSubtargetInfo as fixupNeedsRelaxation.
  Differential Revision: https://reviews.llvm.org/D44928
  llvm-svn: 334078
* [MachineOutliner] NFC - Move intermediate data structures to MachineOutliner.h | Jessica Paquette | 2018-06-04 | 2 | -56/+53
  This is setting up to fix bug 37573 cleanly.
  This moves data structures that are technically both used in some way by the target and the general-purpose outlining algorithm into MachineOutliner.h. In particular, the `Candidate` class is of importance.
  Before, the outliner passed the locations of `Candidates` to the target, which would then make some decisions about the prospective outlined function. This change allows us to just pass `Candidates` along to the target. This will allow the target to discard `Candidates` that would be considered unsafe before cost calculation. Thus, we will be able to remove the unsafe candidates described in the bug without resorting to torching the entire prospective function.
  Also, as a side-effect, it makes the outliner a bit cleaner.
  https://bugs.llvm.org/show_bug.cgi?id=37573
  llvm-svn: 333952
* TableGen: Streamline the semantics of NAME | Nicolai Haehnle | 2018-06-04 | 1 | -72/+72
  Summary: The new rules are straightforward. The main rules to keep in mind are:
  1. NAME is an implicit template argument of class and multiclass, and will be substituted by the name of the instantiating def/defm.
  2. The name of a def/defm in a multiclass must contain a reference to NAME. If such a reference is not present, it is automatically prepended.
  And for some additional subtleties, consider these:
  3. defm with no name generates a unique name but has no special behavior otherwise.
  4. def with no name generates an anonymous record, whose name is unique but undefined. In particular, the name won't contain a reference to NAME.
  Keeping rules 1&2 in mind should allow a predictable behavior of name resolution that is simple to follow.
  The old "rules" were rather surprising: sometimes (but not always), NAME would correspond to the name of the toplevel defm. They were also plain bonkers when you pushed them to their limits, as the old version of the TableGen test case shows.
  Having NAME correspond to the name of the toplevel defm introduces "spooky action at a distance" and breaks composability: refactoring the upper layers of a hierarchy of nested multiclass instantiations can cause unexpected breakage by changing the value of NAME at a lower level of the hierarchy. The new rules don't suffer from this problem.
  Some existing .td files have to be adjusted because they ended up depending on the details of the old implementation.
  Change-Id: I694095231565b30f563e6fd0417b41ee01a12589
  Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar
  Subscribers: wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D47430
  llvm-svn: 333900
* [AArch64] Audit on rL333634 to fix FP16 Disasm BitPatterns | Luke Geeson | 2018-06-04 | 2 | -2/+2
  llvm-svn: 333879
* [AArch64][SVE] Fix range for DUP immediates (16bit elts) | Sander de Smalen | 2018-06-04 | 2 | -3/+11
  For immediates used in DUP instructions that have the range -128 to 127, or a multiple of 256 in the range -32768 to 32512, one could argue that when the result element size is 16bits (.h), the value can be considered both signed and unsigned.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47619
  llvm-svn: 333873
* [AArch64][SVE] Asm: Print indexed element 0 as FPR. | Sander de Smalen | 2018-06-04 | 5 | -0/+67
  Print the first indexed element as an FP register, for example:
    mov z0.d, z1.d[0]
  is now printed as:
    mov z0.d, d1
  In addition to printing, this patch also adds aliases to parse 'mov z0.d, d1'.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47571
  llvm-svn: 333872
* [AArch64][SVE] Asm: Support for indexed DUP instructions. | Sander de Smalen | 2018-06-04 | 4 | -71/+127
  Unpredicated copy of indexed SVE element to SVE vector, along with MOV-aliases. For example:
    dup z0.h, z1.h[0]
  duplicates the first 16-bit element from z1 to all elements in the result vector z0.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: SjoerdMeijer
  Differential Revision: https://reviews.llvm.org/D47570
  llvm-svn: 333871
* [AArch64][SVE] Asm: Support for FCPY immediate instructions. | Sander de Smalen | 2018-06-04 | 2 | -2/+43
  Predicated copy of floating-point immediate value to SVE vector, along with MOV-aliases.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: javed.absar
  Differential Revision: https://reviews.llvm.org/D47518
  llvm-svn: 333869
* [AArch64][SVE] Asm: Support for CPY immediate instructions | Sander de Smalen | 2018-06-04 | 2 | -0/+62
  Predicated copy of possibly shifted immediate value into SVE vector, along with MOV-aliases.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47517
  llvm-svn: 333868
* [AArch64][GlobalISel] Zero-extend s1 values when returning. | Amara Emerson | 2018-06-01 | 1 | -1/+6
  Previously we relied on the any-extend of the s1 to s32, but for AAPCS we need to zero-extend it to at least s8.
  Fixes PR36719
  Differential Revision: https://reviews.llvm.org/D47425
  llvm-svn: 333747
* [AArch64][SVE] Asm: Support for FDUP_ZI (copy fp immediate) instruction. | Sander de Smalen | 2018-06-01 | 2 | -0/+39
  Unpredicated copy of floating-point immediate value into SVE vector, along with MOV-aliases.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47482
  llvm-svn: 333744
* [AArch64][SVE] Asm: Support for DUPM (masked immediate) instruction. | Sander de Smalen | 2018-06-01 | 7 | -1/+137
  Unpredicated copy of repeating immediate pattern to SVE vector, along with MOV-aliases.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: SjoerdMeijer
  Differential Revision: https://reviews.llvm.org/D47328
  llvm-svn: 333731
* [MC] Fallback on DWARF when generating compact unwind on AArch64 | Francis Visoiu Mistrih | 2018-05-31 | 1 | -3/+11
  Instead of asserting when using the def_cfa directive with a register different from fp, fall back on DWARF.
  Easily triggered with:
    .cfi_def_cfa x1, 32;
  rdar://40249694
  Differential Revision: https://reviews.llvm.org/D47593
  llvm-svn: 333667
* [AArch64] Reverted rL333427 fixing Clang UnitTest Failure | Luke Geeson | 2018-05-31 | 2 | -5/+39
  llvm-svn: 333634
* [GlobalISel][AArch64] LegalizerInfo verifier: Fixing bugs exposed by LegalizerInfo::verify(...) | Roman Tereshin | 2018-05-31 | 1 | -11/+4
  Reviewers: aemerson, qcolombet
  Reviewed By: qcolombet
  Differential Revision: https://reviews.llvm.org/D46339
  llvm-svn: 333618
* [GlobalISel][AArch64] LegalizerInfo verifier: Adding LegalizerInfo::verify(...) call w/o fixing bugs | Roman Tereshin | 2018-05-30 | 1 | -0/+1
  This is to make it clear what kind of bugs the LegalizerInfo verifier is able to catch, and to test its output.
  Reviewers: aemerson, qcolombet
  Reviewed By: aemerson
  Differential Revision: https://reviews.llvm.org/D46338
  llvm-svn: 333597
* AArch64: print correct annotation for ADRP addresses. | Tim Northover | 2018-05-30 | 1 | -2/+2
  The immediate on an ADRP MCInst needs to be multiplied by 0x1000 to obtain the actual PC-offset that will be calculated.
  llvm-svn: 333525
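The arithmetic behind this fix can be illustrated with a small model of ADRP. It assumes the architectural definition of the instruction (the immediate counts 4 KiB pages relative to the page containing the instruction); the function names are illustrative and are not taken from the patch.

```cpp
#include <cstdint>

// Byte offset encoded by an ADRP immediate: the immediate counts 4 KiB pages.
constexpr int64_t adrpByteOffset(int64_t imm) { return imm * 0x1000; }

// Address produced by ADRP at address `pc`: the page base of pc plus the
// page offset. Unsigned arithmetic is used so negative offsets wrap correctly.
constexpr uint64_t adrpTarget(uint64_t pc, int64_t imm) {
  return (pc & ~0xFFFULL) + static_cast<uint64_t>(adrpByteOffset(imm));
}
```

Printing the raw immediate instead of `imm * 0x1000` would understate the offset by a factor of 4096.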
* [AArch64][AsmParser] Fix segfault on illegal fpimm. | Sander de Smalen | 2018-05-30 | 1 | -2/+2
  A floating-point immediate combining a negative sign and a hexadecimal number, e.g. #-0x0, caused the compiler to crash.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: javed.absar
  Differential Revision: https://reviews.llvm.org/D47483
  llvm-svn: 333524
* [AArch64] Fix PR32384: bump up the number of stores per memset and memcpy | Evandro Menezes | 2018-05-29 | 2 | -5/+11
  As suggested in https://bugs.llvm.org/show_bug.cgi?id=32384#c1, this change makes the inlining of `memset()` and `memcpy()` more aggressive when compiling for speed. The tuning remains the same when optimizing for size.
  Patch by: Sebastian Pop <s.pop@samsung.com>
            Evandro Menezes <e.menezes@samsung.com>
  Differential revision: https://reviews.llvm.org/D45098
  llvm-svn: 333429
* Revert "[AArch64] added FP16 vcvth intrinsic support" | Amara Emerson | 2018-05-29 | 2 | -36/+5
  This reverts commit r333410 due to bot failures.
  llvm-svn: 333427
* [AArch64][SVE] Asm: Support for predicated LSL/LSR (vectors) | Sander de Smalen | 2018-05-29 | 2 | -0/+35
  Reviewers: rengolin, huntergr, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: SjoerdMeijer
  Differential Revision: https://reviews.llvm.org/D47365
  llvm-svn: 333422
* [AArch64][SVE] Asm: Support for AND, ORR, EOR and BIC instructions. | Sander de Smalen | 2018-05-29 | 2 | -1/+47
  This patch addresses the following variants:
  - bitmask immediate, e.g. 'and z0.d, z0.d, #0x6'.
  - unpredicated data vectors, e.g. 'and z0.d, z1.d, z2.d'.
  - predicated data vectors, e.g. 'and z0.d, p0/m, z0.d, z1.d'.
  And also several aliases, such as:
  - ORN, alias of ORR.
  - EON, alias of EOR.
  - BIC, alias of AND (immediate variant)
  - MOV, alias of ORR (if unpredicated and source register operands are the same)
  Reviewers: rengolin, huntergr, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47363
  llvm-svn: 333414
* [AArch64] added FP16 vcvth intrinsic support | Luke Geeson | 2018-05-29 | 2 | -5/+36
  Summary:
  Change-Id: I0df845749c7689dfc99150ba7c19c7d0dadbd705
  Reviewers: javed.absar, SjoerdMeijer
  Reviewed By: SjoerdMeijer
  Subscribers: llvm-commits, SjoerdMeijer
  Differential Revision: https://reviews.llvm.org/D46311
  llvm-svn: 333410
* [AArch64][SVE] Asm: Support for ADD (immediate) instructions. | Sander de Smalen | 2018-05-29 | 4 | -15/+90
  This patch adds addsub_imm8_opt_lsl_(i8|i16|i32|i64) operands that are unsigned values in the range 0 to 255. For element widths of 16 bits or higher it may also be a signed multiple of 256 in the range 0 to 65280.
  Note: This also does some refactoring to reuse the convenience function getShiftedVal<shift>(), and now allows AArch64 scalar 'ADD #-4096' to be accepted and mapped to SUB #4096.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47310
  llvm-svn: 333408
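The operand range described above can be written as a small predicate. This is only a sketch of the stated ranges, not the code from the patch; the function name and the way the element width is passed are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical range check for an SVE ADD/SUB immediate.
// elt_bits: element width of the vector operand (8, 16, 32 or 64).
constexpr bool isSveAddSubImm(uint64_t v, unsigned elt_bits) {
  if (v <= 255) return true;  // plain 8-bit unsigned value
  // Shifted form: a multiple of 256 up to 65280, only for 16-bit or wider.
  return elt_bits >= 16 && v <= 65280 && v % 256 == 0;
}
```

For example, #256 is valid for .h elements (it is 1 shifted left by 8) but not for .b elements.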
* Fix ubsan errors introduced by r333263 re. left-shifting negative values. | Sander de Smalen | 2018-05-25 | 1 | -2/+3
  llvm-svn: 333270
* [AArch64][SVE] Asm: Support for DUP (immediate) instructions. | Sander de Smalen | 2018-05-25 | 9 | -30/+256
  Unpredicated copy of optionally-shifted immediate to SVE vector, along with MOV-aliases.
  This patch contains parsing and printing support for cpy_imm8_opt_lsl_(i8|i16|i32|i64). This operand allows a signed value in the range -128 to +127. For element widths of 16 bits or higher it may also be a signed multiple of 256 in the range -32768 to +32512.
  For an element width of 8 bits a range of -128 to 255 is accepted, since a copy of a byte can be considered either signed or unsigned.
  Note: This patch renames tryParseAddSubImm() -> tryParseImmWithOptionalShift() and moves the behaviour of trying to shift a plain immediate by an allowed shift-value to its addImmWithOptionalShiftOperands() method, so that the parsing itself is generic and allows immediates from multiple shifted operands. This is done because an immediate can be divisible by both shifted operands.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47309
  llvm-svn: 333263
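The accepted ranges for the DUP immediate can be summarised in a short predicate. It models only the ranges listed in this commit message (a later commit in this log adjusts the 16-bit case), and the function name is hypothetical rather than taken from the parser.

```cpp
#include <cstdint>

// Hypothetical range check for an SVE DUP/CPY immediate.
// elt_bits: element width of the destination vector (8, 16, 32 or 64).
constexpr bool isSveCpyImm(int64_t v, unsigned elt_bits) {
  // A byte may be read as signed or unsigned, so -128..255 is accepted.
  if (elt_bits == 8) return v >= -128 && v <= 255;
  // Unshifted signed 8-bit value.
  if (v >= -128 && v <= 127) return true;
  // Shifted form: signed 8-bit value times 256.
  return v >= -32768 && v <= 32512 && v % 256 == 0;
}
```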
* [AArch64] Improve orr+movk sequences for MOVi64imm. | Eli Friedman | 2018-05-24 | 1 | -115/+96
  The existing code has three different ways to try to lower a 64-bit immediate to the sequence ORR+MOVK. The result is messy: it misses some possible sequences, and the order of the checks means we sometimes emit two MOVKs when we only need one.
  Instead, just use a simple loop to try all possible two-instruction ORR+MOVK sequences.
  Differential Revision: https://reviews.llvm.org/D47176
  llvm-svn: 333218
* [AArch64] Take advantage of variable shift/rotate amount implicit mod operation. | Geoff Berry | 2018-05-24 | 1 | -0/+111
  Summary: Optimize code generated for variable shifts/rotates by taking advantage of the implicit and/mod done on the variable shift amount register.
  Resolves bug 27582 and bug 37421.
  Reviewers: t.p.northover, qcolombet, MatzeB, javed.absar
  Subscribers: rengolin, kristof.beyls, mcrosier, llvm-commits
  Differential Revision: https://reviews.llvm.org/D46844
  llvm-svn: 333214
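The "implicit mod" refers to the A64 variable-shift instructions using only the low bits of the amount register (the amount modulo the register width). A minimal model of the 64-bit left shift shows why an explicit mask in the source becomes redundant; the function name is illustrative, and the patch's actual selection patterns are not reproduced here.

```cpp
#include <cstdint>

// Model of a 64-bit variable left shift as the hardware performs it:
// the shift amount is taken modulo 64.
constexpr uint64_t lslv64(uint64_t x, uint64_t amt) {
  return x << (amt % 64);
}
```

Because `lslv64(x, n & 63)` and `lslv64(x, n)` always agree, a compiler can drop the `and` that computes `n & 63` before such a shift.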
* [CodeGen][AArch64] Use RegUnits to track register aliases. (NFC) | Chad Rosier | 2018-05-23 | 1 | -40/+27
  Use RegUnits to track register aliases in AArch64RedundantCopyElimination.
  Differential Revision: https://reviews.llvm.org/D47269
  llvm-svn: 333107
* [AArch64] Use addAliasForDirective to support data directives | Alex Bradbury | 2018-05-23 | 1 | -23/+7
  The AArch64 asm parser currently has custom parsing logic for .hword, .word, and .xword. Rather than use this custom logic, we can just use addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue.
  Differential Revision: https://reviews.llvm.org/D47000
  llvm-svn: 333077
* Delete unused variable from r333015. | Eli Friedman | 2018-05-22 | 1 | -3/+0
  (The assertion suppressed the unused variable warning on Release+Asserts builds, so I didn't notice.)
  llvm-svn: 333018
* [MachineOutliner] Add "thunk" outlining for AArch64. | Eli Friedman | 2018-05-22 | 1 | -18/+83
  When we're outlining a sequence that ends in a call, we can save up to three instructions in the outlined function by turning the call into a tail-call. I refer to this as thunk outlining because the resulting outlined function looks like a thunk; suggestions welcome for a better name.
  In addition to making the outlined function shorter, thunk outlining allows outlining calls which would otherwise be illegal to outline: we don't need to save/restore LR, so we don't need to prove anything about the stack access patterns of the callee.
  To make this work effectively, I also added MachineOutlinerInstrType::LegalTerminator to the generic MachineOutliner code; this allows treating an arbitrary instruction as a terminator in the suffix tree.
  Differential Revision: https://reviews.llvm.org/D47173
  llvm-svn: 333015
* [DAGCombine][X86][AArch64] Masked merge unfolding: vector edition. | Roman Lebedev | 2018-05-21 | 1 | -3/+12
  Summary: This appears to be the last missing piece for the masked merge pattern handling in the backend.
  This is PR37104 (https://bugs.llvm.org/show_bug.cgi?id=37104).
  PR6773 (https://bugs.llvm.org/show_bug.cgi?id=6773) will introduce an IR canonicalization that is likely bad for the end assembly. Previously, `andps`+`andnps` / `bsl` would be generated (see `@out`). Now, they would no longer be generated (see `@in`), and we need to make sure that they are generated.
  Differential Revision: https://reviews.llvm.org/D46528
  llvm-svn: 332904