summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64
Commit message (Collapse)AuthorAgeFilesLines
...
* [SelectionDAG] Disallow indirect "i" constraintFangrui Song2019-12-291-1/+0
| | | | | | | | | This allows us to delete InlineAsm::Constraint_i workarounds in SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and TargetLowering::getInlineAsmMemConstraint overrides. They were introduced to X86 in r237517 to prevent crashes for constraints like "=*imr". They were later copied to other targets.
* [DAGCombine] visitEXTRACT_SUBVECTOR - 'little to big' ↵Sanjay Patel2019-12-231-71/+0
| | | | | | | | | | | | | | | | | | | extract_subvector(bitcast()) support This moves the X86 specific transform from rL364407 into DAGCombiner to generically handle 'little to big' cases (for example: extract_subvector(v2i64 bitcast(v16i8))). This allows us to remove both the x86 implementation and the aarch64 bitcast(extract_subvector(bitcast())) combine. Earlier patches that dealt with regressions initially exposed by this patch: rG5e5e99c041e4 rG0b38af89e2c0 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D63815
* [AArch64] [Windows] Use COFF stubs for calls to extern_weak functionsMartin Storsjö2019-12-233-7/+15
| | | | | | | | | | | | | | | | | | | As the extern_weak target might be missing, resolving to the absolute address zero, we can't use the normal direct PC-relative branch instructions (as that would result in relocations out of range). Improve the classifyGlobalFunctionReference method to set MO_DLLIMPORT/MO_COFFSTUB, and simplify the existing code in AArch64TargetLowering::LowerCall to use the return value from classifyGlobalFunctionReference for these cases. Add code in both AArch64FastISel and GlobalISel/IRTranslator to bail out for function calls to extern weak functions on windows, to let SelectionDAG handle them. This matches what was done for X86 in 6bf108d77a3c. Differential Revision: https://reviews.llvm.org/D71721
* [AArch64] match splat of bitcasted extract subvector to DUPLANESanjay Patel2019-12-221-7/+43
| | | | | | | | | | This is another potential regression exposed by D63815. Here we peek through a bitcast to find an extract subvector and scale the splat offset based on that: splat (bitcast (extract X, C)), LaneC --> duplane (bitcast X), LaneC' Differential Revision: https://reviews.llvm.org/D71672
* Fix "result of 32-bit shift implicitly converted to 64 bits" warning. NFC.Simon Pilgrim2019-12-211-1/+1
|
* [AArch64] Respect reserved registers while renaming in LdSt opt.Florian Hahn2019-12-211-1/+4
| | | | | | We cannot pick reserved registers as rename registers. Fixes https://bugs.llvm.org/show_bug.cgi?id=44358
* Add parentheses to silence warningBill Wendling2019-12-201-2/+2
|
* [AArch64][SVE] Replace integer immediate intrinsics with splat vector variantDanilo Carvalho Grael2019-12-202-22/+39
| | | | | | | | | | | | Summary: Replace the integer immediate intrisics with splat vector variants so they can be applied as optimizations for the C/C++ intrinsics. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71614
* [AArch64][SVE] Correct intrinsics and patterns for logical predicate ↵Paul Walker2019-12-201-17/+17
| | | | | | | | | | | | | | | | | | | | | | | instructions In general SVE intrinsics are considered predicated and merging with everything else having suitable decoration. For predicated zeroing operations (like the predicate logical instructions) we use the "_z" suffix. After this change all intrinsics use their expected names (i.e. orr instead of or and eor instead of xor). I've removed intrinsics and patterns for condition code setting instructions as that data is not returned as part of the intrinsic. The expectation is to ask for a cc flag explicitly. For example: a = and_z(pg, p1, p2) cc = ptest_<flag>(pg, a) With the code generator expected to use "s" variants of instructions when available. Differential Revision: https://reviews.llvm.org/D71715
* [AArch64][SVE] Fold constant multiply of element countCullen Rhodes2019-12-203-1/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: E.g. %0 = tail call i64 @llvm.aarch64.sve.cntw(i32 31) %mul = mul i64 %0, <const> Should emit: cntw x0, all, mul #<const> For <const> in the range 1-16. Patch by Kerry McLaughlin Reviewers: sdesmalen, huntergr, dancgr, rengolin, efriedma Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71014
* [AArch64][SVE] Add intrnisics for saturating scalar arithmeticAndrzej Warzynski2019-12-202-69/+137
| | | | | | | | | | | | | | | | | | | | | | Summary: The following intrnisics are added: * @llvm.aarch64.sve.sqdec{b|h|w|d|p} * @llvm.aarch64.sve.sqinc{b|h|w|d|p} * @llvm.aarch64.sve.uqdec{b|h|w|d|p} * @llvm.aarch64.sve.uqinc{b|h|w|d|p} For every intrnisic there a scalar variants (with n32 or n64 suffix) and vector variants (no suffix). Reviewers: sdesmalen, rengolin, efriedma Reviewed By: sdesmalen, efriedma Subscribers: eli.friedman, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71252
* Recommit "[AArch64][SVE] Add permutation and selection intrinsics"Cullen Rhodes2019-12-205-42/+264
| | | | | | | | Recommit 23c28c40436143006be740533375c036d11c92cd (reverted in dcb48f50bdfa0fa47b62d089b6ed999d857fc9f8) with a fix for an assert "Request for a fixed size on a scalable object" being triggered in `LowerSVEIntrinsicEXT`. The fix is to call `getKnownMinSize` on the TypeSize object.
* [AArch64][SVE] Add intrinsics for binary narrowing operationsAndrzej Warzynski2019-12-203-22/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The following intrinsics for binary narrowing shift righ operations are added: * @llvm.aarch64.sve.shrnb * @llvm.aarch64.sve.uqshrnb * @llvm.aarch64.sve.sqshrnb * @llvm.aarch64.sve.sqshrunb * @llvm.aarch64.sve.uqrshrnb * @llvm.aarch64.sve.sqrshrnb * @llvm.aarch64.sve.sqrshrunb * @llvm.aarch64.sve.shrnt * @llvm.aarch64.sve.uqshrnt * @llvm.aarch64.sve.sqshrnt * @llvm.aarch64.sve.sqshrunt * @llvm.aarch64.sve.uqrshrnt * @llvm.aarch64.sve.sqrshrnt * @llvm.aarch64.sve.sqrshrunt Reviewers: sdesmalen, rengolin, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71552
* [StackMaps] Be explicit about label formation [NFC] (try 2)Philip Reames2019-12-191-2/+9
| | | | | | Recommit after making the same API change in non-x86 targets. This has been build for all targets, and tested for effected ones. Why the difference? Because my disk filled up when I tried make check for all. For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.
* Revert "[AArch64][SVE] Add permutation and selection intrinsics"Cullen Rhodes2019-12-195-264/+42
| | | | | | | | | | | This reverts commit 23c28c40436143006be740533375c036d11c92cd. It caused build failures in the following expensive checks builders: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/1295 http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/700 Reverting for now whilst I figure what the issue is.
* [AArch64][SVE] Add permutation and selection intrinsicsCullen Rhodes2019-12-195-42/+264
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Adds the following intrinsics: * @llvm.aarch64.sve.clasta * @llvm.aarch64.sve.clasta_n * @llvm.aarch64.sve.clastb * @llvm.aarch64.sve.clastb_n * @llvm.aarch64.sve.compact * @llvm.aarch64.sve.ext * @llvm.aarch64.sve.lasta * @llvm.aarch64.sve.lastb * @llvm.aarch64.sve.rev * @llvm.aarch64.sve.splice * @llvm.aarch64.sve.tbl * @llvm.aarch64.sve.trn1 * @llvm.aarch64.sve.trn2 * @llvm.aarch64.sve.uzp1 * @llvm.aarch64.sve.uzp2 * @llvm.aarch64.sve.zip1 * @llvm.aarch64.sve.zip2 Reviewers: sdesmalen, efriedma, dancgr, mgudim, huntergr, rengolin Reviewed By: sdesmalen, efriedma Subscribers: kmclaughlin, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71401
* [AArch64][SVE] Implement pfirst and pnext intrinsicsCullen Rhodes2019-12-192-5/+12
| | | | | | | | | | | | | Reviewers: sdesmalen, efriedma, dancgr, mgudim, cameron.mcinally Reviewed By: cameron.mcinally Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71472
* [AArch64][SVE] Implement ptrue intrinsicCullen Rhodes2019-12-193-10/+19
| | | | | | | | | | | | | | Reviewers: sdesmalen, eli.friedman, dancgr, mgudim, cameron.mcinally, huntergr, efriedma Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71457
* Revert "[AArch64][SVE] Replace integer immediate intrinsics with splat ↵Danilo Carvalho Grael2019-12-182-39/+22
| | | | | | | | | | vector variant" This reverts commit 830e08b98bcb427136443093c282b25328137cf0 and eb1857ce0da481caf82271e6d0c9fc745dfab26f. This commit leads to an unexpected failure on test/CodeGen/AArch64/sve-gather-scatter-dag-combine.ll. The review will need more changes before its re-commited.
* [AArch64][SVE] Replace integer immediate intrinsics with splat vector variantDanilo Carvalho Grael2019-12-182-22/+39
| | | | | | | | | | | | Summary: Replace the integer immediate intrisics with splat vector variants so they can be applied as optimizations for the C/C++ intrinsics. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71614
* [gicombiner] Import tryCombineIndexedLoadStore()Daniel Sanders2019-12-181-6/+1
| | | | | | | | | | | | | | | | | Summary: Now that arbitrary data is supported, import tryCombineIndexedLoadStore() Depends on D69147 Reviewers: bogner, volkan Reviewed By: volkan Subscribers: hiraditya, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69151
* [AArch64] match fcvtl2 with bitcasted extractSanjay Patel2019-12-182-6/+35
| | | | | | | | | | | | | This should eliminate a regression seen in D63815. If we are FP extending the high half extract of a vector, we should be able to peek through a bitcast sitting between the extract and extend. This replaces tablegen patterns with a more general DAG to DAG override, so we can handle any casted type. Differential Revision: https://reviews.llvm.org/D71515
* [gicombiner] Add support for arbitrary match data being passed from match to ↵Daniel Sanders2019-12-181-14/+12
| | | | | | | | | | | | | | | | | | | | | apply Summary: This is used by the extending_loads combine to tell the apply step which use is the preferred one to fold and the other uses should be re-written to consume. Depends on D69117 Reviewers: volkan, bogner Reviewed By: volkan Subscribers: hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69147
* [AArch64] Improve codegen of volatile load/store of i128Victor Campos2019-12-183-14/+69
| | | | | | | | | | | | | | | | Summary: Instead of generating two i64 instructions for each load or store of a volatile i128 value (two LDRs or STRs), now emit a single LDP or STP. Reviewers: labrinea, t.p.northover, efriedma Reviewed By: efriedma Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69559
* [AArch64] Enable clustering memory accesses to fixed stack objectsJay Foad2019-12-183-118/+90
| | | | | | | | | | | | | | | | | | Summary: r347747 added support for clustering mem ops with FI base operands including support for fixed stack objects in shouldClusterFI, but apparently this was never tested. This patch fixes shouldClusterFI to work with scaled as well as unscaled load/store instructions, and fixes the ordering of memory ops in MemOpInfo::operator< to ensure that memory addresses always increase, regardless of which direction the stack grows. Subscribers: MatzeB, kristof.beyls, hiraditya, javed.absar, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71334
* [AArch64][GlobalISel]: Fix a crash in GlobalIsel in dealing with 16bit ↵Xiaoqing Wu2019-12-171-1/+2
| | | | | | | | | | | | | | | | uadd.with.overflow. Summary: AArch64 doesn't support uadd.with.overflow.i16 natively. This change adds a legalization rule to convert the 32bit add result to 16bit. This should fix PR43981. Reviewers: arsenm, qcolombet, paquette, aemerson Reviewed By: paquette Subscribers: wdng, rovka, kristof.beyls, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71587
* Revert "Honor -fuse-init-array when os is not specified on x86"Mitch Phillips2019-12-171-0/+1
| | | | | | | This reverts commit aa5ee8f244441a8ea103a7e0ed8b6f3e74454516. This change broke the sanitizer buildbots. See comments at the patchset (https://reviews.llvm.org/D71360) for more information.
* Fix assertion failure in getMemOperandWithOffsetWidthKristof Beyls2019-12-171-3/+5
| | | | | | | | | | | | | | | | | | | | This fixes an assertion failure that triggers inside getMemOperandWithOffset when Machine Sinking calls it on a MachineInstr that is not a memory operation. Different backends implement getMemOperandWithOffset differently: some return false on non-memory MachineInstrs, others assert. The Machine Sinking pass in at least SinkingPreventsImplicitNullCheck relies on getMemOperandWithOffset to return false on non-memory MachineInstrs, instead of asserting. This patch updates the documentation on getMemOperandWithOffset that it should return false on any MachineInstr it cannot handle, instead of asserting. It also adapts the in-tree backends accordingly where necessary. Differential Revision: https://reviews.llvm.org/D71359
* Honor -fuse-init-array when os is not specified on x86Kamlesh Kumar2019-12-161-1/+0
| | | | | | | | | | | | | | | | | | | | | Currently -fuse-init-array option is not effective when target triple does not specify os, on x86,x86_64. i.e. // -fuse-init-array is not honored. $ clang -target i386 -fuse-init-array test.c -S // -fuse-init-array is honored. $ clang -target i386-linux -fuse-init-array test.c -S This patch fixes first case. And does cleanup. Reviewers: rnk, craig.topper, fhahn, echristo Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D71360
* [AArch64][SVE] Change pattern generation code to fix -Wimplicit-fallthrough ↵Fangrui Song2019-12-161-4/+11
| | | | after D71483
* [AArch64][SVE] Add patterns for logical immediate operations.Danilo Carvalho Grael2019-12-163-4/+56
| | | | | | | | | | | | | | Summary: Add pattern matching for the following SVE logical vector and immediate instructions: - and/bic, orr/orn, eor/eon. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71483
* Reland [AArch64][MachineOutliner] Return address signing for outlined functionsDavid Tellenbach2019-12-161-8/+304
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Reland after fixing a bug that allowed outlining of SP modifying instructions that invalidated return address signing. During AArch64 frame lowering instructions to enable return address signing are inserted into functions if needed. Functions generated during machine outlining don't run through target frame lowering and hence are missing such instructions. This patch introduces the following changes: 1. If not all functions that potentially participate in function outlining agree on their return address signing scope and their return address signing key, outlining is disabled for these functions. 2. If not all functions that potentially participate in function outlining agree on their support for v8.3A features, outlining is disabled for these functions. 3. If an outlining candidate would outline instructions that modify sp in a way that invalidates return address signing, outlining is disabled for that particular candidate. 4. If all candidate functions agree on the signing scope, signing key and their support for v8.3 features, the outlined function behaves as if it had the same scope and key attributes and as if it would provide the same v8.3A support as the original functions. Reviewers: ostannard, paquette Reviewed By: ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70635
* [AArch64][SVE2] Add intrinsics for binary narrowing operationsAndrzej Warzynski2019-12-162-10/+20
| | | | | | | | | | | | | | | | | | | | | | | | Summary: The following intrinsics for binary narrowing add and sub operations are added: * @llvm.aarch64.sve.addhnb * @llvm.aarch64.sve.addhnt * @llvm.aarch64.sve.raddhnb * @llvm.aarch64.sve.raddhnt * @llvm.aarch64.sve.subhnb * @llvm.aarch64.sve.subhnt * @llvm.aarch64.sve.rsubhnb * @llvm.aarch64.sve.rsubhnt Reviewers: sdesmalen, rengolin, efriedma Reviewed By: sdesmalen, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71424
* [AArch64] Enable emission of stack maps for non-Mach-O binaries on AArch64.Kristof Beyls2019-12-161-1/+1
| | | | | | | | | | The emission of stack maps in AArch64 binaries has been disabled for all binary formats except Mach-O since rL206610, probably mistakenly, as far as I can tell. This patch reverts this to its intended state. Differential Revision: https://reviews.llvm.org/D70069 Patch by Loic Ottet.
* [Aarch64][SVE] Add intrinsics for scatter storesAndrzej Warzynski2019-12-165-61/+297
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch adds the following SVE intrinsics for scatter stores: * 64-bit offsets: * @llvm.aarch64.sve.st1.scatter (unscaled) * @llvm.aarch64.sve.st1.scatter.index (scaled) * 32-bit unscaled offsets: * @llvm.aarch64.sve.st1.scatter.uxtw (zero-extended offset) * @llvm.aarch64.sve.st1.scatter.sxtw (sign-extended-offset) * 32-bit scaled offsets: * @llvm.aarch64.sve.st1.scatter.uxtw.index (zero-extended offset) * @llvm.aarch64.sve.st1.scatter.sxtw.index (sign-extended offset) * vector base + immediate: * @llvm.aarch64.sve.st1.scatter.imm Reviewers: rengolin, efriedma, sdesmalen Reviewed By: efriedma, sdesmalen Subscribers: kmclaughlin, eli.friedman, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71074
* Revert "AArch64: Fix frame record chain"Logan Chien2019-12-142-32/+16
| | | | | | | | | | | | | | Breaks aosp-O3-polly-before-vectorizer-unprofitable with the following error message: void llvm::emitFrameOffset(llvm::MachineBasicBlock &, MachineBasicBlock::iterator, const llvm::DebugLoc &, unsigned int, unsigned int, llvm::StackOffset, const llvm::TargetInstrInfo *, MachineInstr::MIFlag, bool, bool, bool *): Assertion `(DestReg != AArch64::SP || Bytes % 16 == 0) && "SP increment/decrement not 16-byte aligned"' failed. This reverts commit d4e10e6adb1b629b3fc1b78f7e281fbcec392edb.
* AArch64: Fix frame record chainLogan Chien2019-12-142-16/+32
| | | | | | | | | | | | | | | | | | The commit r369122 may keep LR and FP register (aka. frame record) in the middle of a frame, thus we must add the offsets to ensure the FP register always points to innermost frame record on the stack. According to AAPCS64[1], a conforming code shall construct a linked list of stack frames that can be traversed with frame records. This commit is also essential to frame-pointer-based stack unwinder (e.g. the stack unwinder in linx-perf-tools.) [1] https://github.com/ARM-software/software-standards/blob/master/abi/aapcs64/aapcs64.rst#the-frame-pointer Test: llvm-lit ${LLVM_SRC}/test/CodeGen/AArch64/framelayout-frame-record.ll Test: llvm-lit ${LLVM_SRC}/test/CodeGen/AArch64 Differential Revision: https://reviews.llvm.org/D70800
* [AArch64] Save FP for leaf functions when disabling frame pointer eliminationFangrui Song2019-12-131-1/+1
| | | | | | | | | | The change allows clang -mno-omit-leaf-frame-pointer to disable frame pointer elimination. This behavior matches X86 and Mips, and also GCC AArch64. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71168
* [AArch64] Emit PAC/BTI .note.gnu.property flagsMomchil Velikov2019-12-131-0/+60
| | | | | | | | | | | | | This patch make LLVM emit the processor specific program property types defined in AArch64 ELF spec https://developer.arm.com/docs/ihi0056/f/elf-for-the-arm-64-bit-architecture-aarch64-abi-2019q2-documentation A file containing no functions gets both property flags. Otherwise, a property is set iff all the functions in the file have the corresponding attribute. Patch by Daniel Kiss and Momchil Velikov. Differential Revision: https://reviews.llvm.org/D71019
* [NFC] Use EVT instead of bool for getSetCCInverse()Alex Richardson2019-12-131-15/+17
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The use of a boolean isInteger flag (generally initialized using VT.isInteger()) caused errors in our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). In our backend, pointers use a separate ValueType (iFATPTR) and therefore .isInteger() returns false. This meant that getSetCCInverse() was using the floating-point variant and generated incorrect code for us: `(void *)0x12033091e < (void *)0xffffffffffffffff` would return false. Committing this change will significantly reduce our merge conflicts for each upstream merge. Reviewers: spatel, bogner Reviewed By: bogner Subscribers: wuzish, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70917
* Recommit "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores"Kerry McLaughlin2019-12-133-2/+95
| | | | | | | | | | | | | | Updated pred_load patterns added to AArch64SVEInstrInfo.td by this patch to use reg + imm non-temporal loads to fix previous test failures. Original commit message: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above.
* [NFC][AArch64] Fix typo.Nate Voorhies2019-12-131-1/+1
| | | | | | | | | | | | | | Summary: Coaleascer should be coalescer. Reviewers: qcolombet, Jim Reviewed By: Jim Subscribers: Jim, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70731
* [AArch64][SVE] Add integer arithmetic with immediate instructions.Danilo Carvalho Grael2019-12-123-9/+65
| | | | | | | | | | | | | | | | | | | | Summary: Add pattern matching for the following instructions: - add, sub, subr, sqadd, sqsub, uqadd, uqsub This patch required complex patterns to match the immediate with optinal left shift. I re-used the Select function from the other SVE repo to implement the complext pattern. I plan on doing another patch to also match constant vector of the same immediate. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71370
* [AArch64][SVE] Remove nxv1f32 and nxv1f64 as legal typesCullen Rhodes2019-12-123-31/+13
| | | | | | | | | | | | | | | Summary: Also cleans up ZPR register class definition. Reviewers: sdesmalen, cameron.mcinally, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71351
* [AArch64][SVE] Add patterns for scalable vselectCameron McInally2019-12-112-2/+12
| | | | | | This patch matches scalable vector selects to predicated move instructions. Differential Revision: https://reviews.llvm.org/D71298
* [IR] Split out target specific intrinsic enums into separate headersReid Kleckner2019-12-115-1/+6
| | | | | | | | | | | | | | | | | | | | This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320
* Rename TTI::getIntImmCost for instructions and intrinsicsReid Kleckner2019-12-112-7/+8
| | | | | | | | | | | | | | Soon Intrinsic::ID will be a plain integer, so this overload will not be possible. Rename both overloads to ensure that downstream targets observe this as a build failure instead of a runtime failure. Split off from D71320 Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D71381
* Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=Off builds after D65958 ↵Fangrui Song2019-12-111-0/+2
| | | | and D70450
* Add intrinsics for unary narrowing operationsAndrzej Warzynski2019-12-112-8/+18
| | | | | | | | | | | | | | | | | | | | | Summary: The following intrinsics for unary narrowing operations are added: * @llvm.aarch64.sve.sqxtnb * @llvm.aarch64.sve.uqxtnb * @llvm.aarch64.sve.sqxtunb * @llvm.aarch64.sve.sqxtnt * @llvm.aarch64.sve.uqxtnt * @llvm.aarch64.sve.sqxtunt Reviewers: sdesmalen, rengolin, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71270
* [AArch64] Be more careful to skip debug operands in LdSt Optimizier.Florian Hahn2019-12-111-7/+10
| | | | This fixes crashes with $noreg operands.
OpenPOWER on IntegriCloud