summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [AArch64][SVE] Replace integer immediate intrinsics with splat vector variantDanilo Carvalho Grael2019-12-202-22/+39
| | | | | | | | | | | | Summary: Replace the integer immediate intrisics with splat vector variants so they can be applied as optimizations for the C/C++ intrinsics. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71614
* [SystemZ] Add a mapping from "select register" to "load on condition" (2-addr).Jonas Paulsson2019-12-204-81/+60
| | | | | | | | | | | | | | | | | | The SELR(Mux) instructions can be converted to two-address form as LOCR(Mux) instructions whenever one of the sources are the same reg as dest. By adding this mapping in getTwoOperandOpcode(), we get: - Two-address hints in getRegAllocationHints() for select register instructions. - No need anymore for special handling in SystemZShortenInst.cpp - shortenSelect() removed. The two-address hints are now added before the GRX32 hints, which should be preferred. Review: Ulrich Weigand https://reviews.llvm.org/D68870
* llvm-symbolizer: support DW_FORM_loclistx locations.Evgenii Stepanov2019-12-202-31/+27
| | | | | | | | | | | | | | | | Summary: With -gdwarf-5 local variable locations are emitted as DW_FORM_loclistx form instead of the regular DW_FORM_sec_offset. Teach DWARFDie::getLocations to understand the new format and use it in llvm-symbolizer "FRAME" command. Reviewers: pcc, jdoerfert Subscribers: srhines, aprantl, hiraditya, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70756
* Temporarily revert "Reapply [LVI] Normalize pointer behavior" and "[LVI] ↵Jordan Rupprecht2019-12-201-134/+174
| | | | | | Restructure caching" This reverts commits 7e18aeba5062cd4324a9efb7bc25c9dbc4a34c2c (D70376) 21fbd5587cdfa11dabb3aeb0ead2d3d5fd0b490d (D69914) due to increased memory usage.
* [SystemZ] Bugfix and improve the handling of CC values.Jonas Paulsson2019-12-206-33/+141
| | | | | | | | | | | | | | | | | It was recently discovered that the handling of CC values was actually broken since overflow was not properly handled ('nsw' flag not checked for). Add and sub instructions now have a new target specific instruction flag named SystemZII::CCIfNoSignedWrap. It means that the CC result can be used instead of a compare with 0, but only if the instruction has the 'nsw' flag set. This patch also adds the improvements of conversion to logical instructions and the analyzing of add with immediates, to be able to eliminate more compares. Review: Ulrich Weigand https://reviews.llvm.org/D66868
* Revert "[ARM] Improve codegen of volatile load/store of i64"Victor Campos2019-12-206-158/+6
| | | | This reverts commit bbcf1c3496ce2bd1ed87e8fb15ad896e279633ce.
* [SystemZ][FPEnv] Enable strict vector FP extends/truncationsUlrich Weigand2019-12-204-13/+66
| | | | | | | | | | | | | | The back-end currently has special DAGCombine code to detect cases where two floating-point extend or truncate operations can be combined into a single vector operation. This patch extends that support to also handle strict FP operations. Note that currently only the case where both operations have the same input chain are supported. This already suffices to cover the common case where the operations result from scalarizing a non-legal vector type. More general cases can be supported in the future.
* [AArch64][SVE] Correct intrinsics and patterns for logical predicate ↵Paul Walker2019-12-201-17/+17
| | | | | | | | | | | | | | | | | | | | | | | instructions In general SVE intrinsics are considered predicated and merging with everything else having suitable decoration. For predicated zeroing operations (like the predicate logical instructions) we use the "_z" suffix. After this change all intrinsics use their expected names (i.e. orr instead of or and eor instead of xor). I've removed intrinsics and patterns for condition code setting instructions as that data is not returned as part of the intrinsic. The expectation is to ask for a cc flag explicitly. For example: a = and_z(pg, p1, p2) cc = ptest_<flag>(pg, a) With the code generator expected to use "s" variants of instructions when available. Differential Revision: https://reviews.llvm.org/D71715
* [OPT-DBG] Teach DbgEntityHistoryCalculator about meta-instructions.Tom Weaver2019-12-201-1/+3
| | | | | | | | | | | | | | The calculator was considering instructions such as KILLs as clobbers of a physical address. This is wrong as meta instructions such as KILLs produce no output in the final program and thus don't clobber or change any physical location's value. As a result they're safe to ignore whilst calculating location list ranges. reviewers: aprantl, vsk diff revision: https://reviews.llvm.org/D70497 fixes: https://bugs.llvm.org/show_bug.cgi?id=38753
* [LV] Strip wrap flags from vectorized reductionsAyal Zaks2019-12-201-1/+39
| | | | | | | | | | | | A sequence of additions or multiplications that is known not to wrap, may wrap if it's order is changed (i.e., reassociated). Therefore when vectorizing integer sum or product reductions, their no-wrap flags need to be removed. Fixes PR43828 Patch by Denis Antrushin Differential Revision: https://reviews.llvm.org/D69563
* [AArch64][SVE] Fold constant multiply of element countCullen Rhodes2019-12-203-1/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: E.g. %0 = tail call i64 @llvm.aarch64.sve.cntw(i32 31) %mul = mul i64 %0, <const> Should emit: cntw x0, all, mul #<const> For <const> in the range 1-16. Patch by Kerry McLaughlin Reviewers: sdesmalen, huntergr, dancgr, rengolin, efriedma Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71014
* [AArch64][SVE] Add intrnisics for saturating scalar arithmeticAndrzej Warzynski2019-12-202-69/+137
| | | | | | | | | | | | | | | | | | | | | | Summary: The following intrnisics are added: * @llvm.aarch64.sve.sqdec{b|h|w|d|p} * @llvm.aarch64.sve.sqinc{b|h|w|d|p} * @llvm.aarch64.sve.uqdec{b|h|w|d|p} * @llvm.aarch64.sve.uqinc{b|h|w|d|p} For every intrnisic there a scalar variants (with n32 or n64 suffix) and vector variants (no suffix). Reviewers: sdesmalen, rengolin, efriedma Reviewed By: sdesmalen, efriedma Subscribers: eli.friedman, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71252
* Recommit "[AArch64][SVE] Add permutation and selection intrinsics"Cullen Rhodes2019-12-205-42/+264
| | | | | | | | Recommit 23c28c40436143006be740533375c036d11c92cd (reverted in dcb48f50bdfa0fa47b62d089b6ed999d857fc9f8) with a fix for an assert "Request for a fixed size on a scalable object" being triggered in `LowerSVEIntrinsicEXT`. The fix is to call `getKnownMinSize` on the TypeSize object.
* [AArch64][SVE] Add intrinsics for binary narrowing operationsAndrzej Warzynski2019-12-203-22/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The following intrinsics for binary narrowing shift righ operations are added: * @llvm.aarch64.sve.shrnb * @llvm.aarch64.sve.uqshrnb * @llvm.aarch64.sve.sqshrnb * @llvm.aarch64.sve.sqshrunb * @llvm.aarch64.sve.uqrshrnb * @llvm.aarch64.sve.sqrshrnb * @llvm.aarch64.sve.sqrshrunb * @llvm.aarch64.sve.shrnt * @llvm.aarch64.sve.uqshrnt * @llvm.aarch64.sve.sqshrnt * @llvm.aarch64.sve.sqshrunt * @llvm.aarch64.sve.uqrshrnt * @llvm.aarch64.sve.sqrshrnt * @llvm.aarch64.sve.sqrshrunt Reviewers: sdesmalen, rengolin, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71552
* [ARM][MVE] Fixes for tail predication.Sam Parker2019-12-204-17/+102
| | | | | | | | | | | | | | | | | | 1) Fix an issue with the incorrect value being used for the number of elements being passed to [d|w]lstp. We were trying to check that the value was available at LoopStart, but this doesn't consider that the last instruction in the block could also define the register. Two helpers have been added to RDA for this. 2) Insert some code to now try to move the element count def or the insertion point so that we can perform more tail predication. 3) Related to (1), the same off-by-one could prevent us from generating a low-overhead loop when a mov lr could have been the last instruction in the block. 4) Fix up some instruction attributes so that not all the low-overhead loop instructions are labelled as branches and terminators - as this is not true for dls/dlstp. Differential Revision: https://reviews.llvm.org/D71609
* [ARM][MVE] Tail predicate in the presence of vcmpSam Parker2019-12-203-76/+270
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Record the discovered VPT blocks while checking for validity and, for now, only handle blocks that begin with VPST and not VPT. We're now allowing more than one instruction to define vpr, but each block must somehow be predicated using the vctp. This leaves us with several scenarios which need fixing up: 1) A VPT block with is only predicated by the vctp and has no internal vpr defs. 2) A VPT block which is only predicated by the vctp but has an internal vpr def. 3) A VPT block which is predicated upon the vctp as well as another vpr def. 4) A VPT block which is not predicated upon a vctp, but contains it and all instructions within the block are predicated upon in. The changes needed are, for: 1) The easy one, just remove the vpst and unpredicate the instructions in the block. 2) Remove the vpst and unpredicate the instructions up to the internal vpr def. Need insert a new vpst to predicate the remaining instructions. 3) No nothing. 4) The vctp will be inside a vpt and the instruction will be removed, so adjust the size of the mask on the vpst. Differential Revision: https://reviews.llvm.org/D71107
* [ARM][MVE] Tail predicate bottom/top muls.Sam Parker2019-12-201-0/+3
| | | | | | Add VMULL and VQDMULL variants to our tail predication white list. Differential Revision: https://reviews.llvm.org/D71465
* [X86] Make EmitCmp into a static function and explicitly return chain result ↵Craig Topper2019-12-192-18/+19
| | | | | | | | | | | | for STRICT_FCMP. NFCI The only thing its getting from the X86TargetLowering class is the subtarget which we can easily pass. This function only has one call site now since this might help the compiler inline it. Explicitly return both the flag result and the chain result for STRICT_FCMP nodes. This removes an assumption in the caller that getValue(1) is the right way to get the chain.
* [X86] Directly call EmitTest in two places instead of creating a null ↵Craig Topper2019-12-191-4/+2
| | | | | | | | | constant and calling EmitCmp. NFCI EmitCmp will just immediately call EmitTest and discard the null constant only to have EmitTest create it again if it doesn't fold. So just skip all that and go directly to EmitTest.
* [Orc][LLJIT] Re-apply 298e183e813 (use JITLink for LLJIT where supported).Lang Hames2019-12-191-3/+7
| | | | Patch d9220b580b3 fixed the underlying issue that casued 298e183e813 to fail.
* [JITLink][MachO] Fix common symbol size plumbing.Lang Hames2019-12-191-1/+1
| | | | This fixes the underlying bug that was exposed by 298e183e813.
* [CommandLine] Add template instantiations of cl::parser for long and long long.River Riddle2019-12-191-0/+24
| | | | | | | | This allows cl::opt<int64_t>. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D71729
* [NFC][InlineCost] Simplify internal inlining cost interfaceMircea Trofin2019-12-191-11/+11
| | | | | | | | | | | | | | | | | Summary: All the use cases of CallAnalyzer use the same call site parameter to both construct the CallAnalyzer, and then pass to the analysis member. This change removes this duplication. Reviewers: davidxl, eraman, Jim Reviewed By: davidxl Subscribers: Jim, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71645
* [ValueTracking] isKnownNonZero() should take non-null-ness assumptions into ↵Roman Lebedev2019-12-201-0/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | consideration (PR43267) Summary: It is pretty common to assume that something is not zero. Even optimizer itself sometimes emits such assumptions (e.g. `addAssumeNonNull()` in `PromoteMemoryToRegister.cpp`). But we currently don't deal with such assumptions :) The only way `isKnownNonZero()` handles assumptions is by calling `computeKnownBits()` which calls `computeKnownBitsFromAssume()`. But `x != 0` does not tell us anything about set bits, it only says that there are *some* set bits. So naturally, `KnownBits` does not get populated, and we fail to make use of this assumption. I propose to deal with this special case by special-casing it via adding a `isKnownNonZeroFromAssume()` that returns boolean when there is an applicable assumption. While there, we also deal with other predicates, mainly if the comparison is with constant. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=43267 | PR43267 ]]. Differential Revision: https://reviews.llvm.org/D71660
* [ValueTracking] isValidAssumeForContext(): CxtI itself also must transfer ↵Roman Lebedev2019-12-201-5/+4
| | | | | | | | | | | | | execution to successor This is a pretty rare case, when CxtI and assume are in the same basic block, with assume being located later. We were already checking that assumption was guaranteed to be executed, but we omitted CxtI itself from consideration, and as the test (miscompile) shows, that is incorrect. As noted in D71660 review by @nikic.
* HotColdSplitting: Do not outline within noreturn functionsVedant Kumar2019-12-191-0/+5
| | | | | | | A function marked `noreturn` may contain unreachable terminators: these should not be considered cold, as the function may be a trampoline. rdar://58068594
* [StackMaps] Be explicit about label formation [NFC] (try 2)Philip Reames2019-12-195-20/+54
| | | | | | Recommit after making the same API change in non-x86 targets. This has been build for all targets, and tested for effected ones. Why the difference? Because my disk filled up when I tried make check for all. For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.
* Temporarily Revert "[Dsymutil][Debuginfo][NFC] Refactor dsymutil to separate ↵Eric Christopher2019-12-195-372/+1
| | | | | | | | | | | DWARF optimizing part 2." as it causes a layering violation/dependency cycle: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp -> llvm/DebugInfo/DWARF/DWARFExpression.h llvm/include/llvm/DebugInfo/DWARF/DWARFOptimizer.h -> llvm/CodeGen/NonRelocatableStringpool.h This reverts commit abc7f6800df8a1f40e1e2c9ccce826abb0208284.
* [XCOFF][AIX] Fix for missing of undefined symbols from symbol tablejasonliu2019-12-191-7/+7
| | | | | | | | | Summary: When we use undefined symbol with its qualname, we are not able to generate that symbol because of the logic of early "continue" that skip the qualname symbol. This patch fixes it. Differential revision: https://reviews.llvm.org/D71667
* Temporarily Revert "[StackMaps] Be explicit about label formation [NFC]"Eric Christopher2019-12-192-25/+14
| | | | | | as it broke the aarch64 build. This reverts commit bc7595d934b958ab481288d7b8e768fe5310be8f.
* [StackMaps] Be explicit about label formation [NFC]Philip Reames2019-12-192-14/+25
| | | | For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.
* [FaultMaps] Make label formation a bit more explicit [NFC]Philip Reames2019-12-192-4/+6
| | | | This is in advance of assembler padding directives support where we'll need to bundle the label w/the corresponding faulting instruction to avoid padding being inserted between.
* [LegalizeDAG] Add return to the strict node handling in ↵Craig Topper2019-12-191-0/+1
| | | | PromoteLegalINT_TO_FP to prevent an invalid strict fp node from being created by falling into non-strict code path.
* [Alignment][NFC] Align compatible methods for CreateElementUnorderedAtomicMemSetGuillaume Chatelet2019-12-191-4/+2
| | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71703
* [RISCV] Don't crash on unsupported relocationsLuís Marques2019-12-191-2/+11
| | | | | | | | | | Summary: Instead of crashing due to the `llvm_unreachable`, provide a proper error when invalid fixups/relocations are encountered. Reviewers: asb, lenary Reviewed By: asb Tags: #llvm Differential Revision: https://reviews.llvm.org/D71536
* [SystemZ] Recognize mrecord-mcount in backendJonas Paulsson2019-12-192-3/+18
| | | | | | | Emit the __mcount_loc section for all fentry calls. Review: Ulrich Weigand https://reviews.llvm.org/D71629
* [RISCV] Enable the machine outliner for RISC-Vlewis-revill2019-12-193-0/+190
| | | | | | | | | | This patch enables the machine outliner for RISC-V and adds the necessary logic for checking whether sequences can be safely outlined, and describing how they should be outlined. Outlined functions are called using the register t0 (x5) as the return address register, which must be available for an occurrence of a sequence to be safely outlined. Differential Revision: https://reviews.llvm.org/D66210
* [DDG] Data Dependence Graph - OrdinalsBardia Mahjour2019-12-191-0/+21
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch associates ordinal numbers to the DDG Nodes allowing the builder to order nodes within a pi-block in program order. The algorithm works by simply assuming the order in which the BBList is fed into the builder. The builder already relies on the blocks being in program order so that it can compute the dependencies correctly. Similarly the order of instructions in their parent basic blocks determine their program order. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack Tags: #llvm Differential Revision: https://reviews.llvm.org/D70986
* [PowerPC] Only use PLT annotations if using PIC relocation modelJustin Hibbits2019-12-191-1/+7
| | | | | | | | | | | | | | Summary: The default static (non-PIC, non-PIE) model for 32-bit powerpc does not use @PLT annotations and relocations in GCC. LLVM shouldn't use @PLT annotations either, because it breaks secure-PLT linking with (some versions of?) GNU LD. Update the available-externally.ll test to reflect that default mode should be the same as the static relocation, by using the same check prefix. Reviewed by: sfertile Differential Revision: https://reviews.llvm.org/D70570
* Revert "[AArch64][SVE] Add permutation and selection intrinsics"Cullen Rhodes2019-12-195-264/+42
| | | | | | | | | | | This reverts commit 23c28c40436143006be740533375c036d11c92cd. It caused build failures in the following expensive checks builders: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/1295 http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/700 Reverting for now whilst I figure what the issue is.
* [ConstantHoisting] Ignore unreachable bb:s when collecting candidatesBjorn Pettersson2019-12-191-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Ignore looking at blocks that are unreachable from entry when collecting candidates for hosting. Normally the consthoist pass is executed in the llc pipeline, just after unreachableblockelim. So it is abnormal to have code that is unreachable from the entry block. But when running the pass as part of opt, for example as part of fuzzy testing, we might trigger various kinds of asserts when collecting candidates if we include unreachable blocks in that analysis. It seems like a waste of time to hoist constants in unreachble blocks, so the solution is to simply ignore such blocks when collecting the hoisting candidates. The two added test cases used to end up in two different asserts, and the intention with the checks is just to verify that we no longer fail. Fixes: PR43903 Reviewers: spatel Reviewed By: spatel Subscribers: hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71678
* [AArch64][SVE] Add permutation and selection intrinsicsCullen Rhodes2019-12-195-42/+264
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Adds the following intrinsics: * @llvm.aarch64.sve.clasta * @llvm.aarch64.sve.clasta_n * @llvm.aarch64.sve.clastb * @llvm.aarch64.sve.clastb_n * @llvm.aarch64.sve.compact * @llvm.aarch64.sve.ext * @llvm.aarch64.sve.lasta * @llvm.aarch64.sve.lastb * @llvm.aarch64.sve.rev * @llvm.aarch64.sve.splice * @llvm.aarch64.sve.tbl * @llvm.aarch64.sve.trn1 * @llvm.aarch64.sve.trn2 * @llvm.aarch64.sve.uzp1 * @llvm.aarch64.sve.uzp2 * @llvm.aarch64.sve.zip1 * @llvm.aarch64.sve.zip2 Reviewers: sdesmalen, efriedma, dancgr, mgudim, huntergr, rengolin Reviewed By: sdesmalen, efriedma Subscribers: kmclaughlin, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71401
* [DebugInfo] Fix verbose printing of rows added via DW_LNE_end_sequenceJames Henderson2019-12-191-1/+1
| | | | | | | | | | | The debug line verbose printing was printing the wrong values for rows added via DW_LNE_end_sequence, because the row was being printed AFTER its state had been reset following it being appended to the line table. This patch fixes this issue by printing the row before appending it. Reviewers: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D71664
* [Dsymutil][Debuginfo][NFC] Refactor dsymutil to separate DWARF optimizing ↵Alexey Lapshin2019-12-195-1/+372
| | | | | | | | | | | part 2. That patch is extracted from the D70709. It moves CompileUnit, DeclContext into llvm/DebugInfo/DWARF. It also adds new file DWARFOptimizer with AddressesMap class. AddressesMap generalizes functionality from RelocationManager. Differential Revision: https://reviews.llvm.org/D71271
* [InstCombine] Canonicalize select immediatesDavid Green2019-12-191-2/+30
| | | | | | | | | | | | | | | | | | | | In certain situations after inlining and simplification we end up with code that is _almost_ a min/max pattern, but contains constants that have been demand-bit optimised to the wrong values, ending up with code like: %1 = icmp slt i32 %shr, -128 %2 = select i1 %1, i32 128, i32 %shr %.inv = icmp sgt i32 %shr, 127 %spec.select.i = select i1 %.inv, i32 127, i32 %2 %conv7 = trunc i32 %spec.select.i to i8 This should be turned into a min/max pattern, but the -128 in the first select was instead transformed into 128, as only the bottom byte was ever demanded. To fix this, I've put in further canonicalisation for the immediates of selects, preferring to use the same value as the icmp if available. Differential Revision: https://reviews.llvm.org/D71516
* [llvm-exegesis] Fix pfm counter names for Haswell for older versions of libpfmMiloš Stojanović2019-12-191-8/+8
| | | | | | | | The inconsistency caused uops mode to fail on an older version of libpfm since the dispatched_port was added as an alias for executed_port only after v4.6.0 of libpfm. Differential revision: https://reviews.llvm.org/D71665
* Make more use of MachineInstr::mayLoadOrStore.Jay Foad2019-12-1913-18/+18
|
* [ARM] Improve codegen of volatile load/store of i64Victor Campos2019-12-196-6/+158
| | | | | | | | | | | | | | | | | | Summary: Instead of generating two i32 instructions for each load or store of a volatile i64 value (two LDRs or STRs), now emit LDRD/STRD. These improvements cover architectures implementing ARMv5TE or Thumb-2. Reviewers: dmgreen, efriedma, john.brawn Reviewed By: efriedma Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70072
* [AArch64][SVE] Implement pfirst and pnext intrinsicsCullen Rhodes2019-12-192-5/+12
| | | | | | | | | | | | | Reviewers: sdesmalen, efriedma, dancgr, mgudim, cameron.mcinally Reviewed By: cameron.mcinally Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71472
* [AArch64][SVE] Implement ptrue intrinsicCullen Rhodes2019-12-193-10/+19
| | | | | | | | | | | | | | Reviewers: sdesmalen, eli.friedman, dancgr, mgudim, cameron.mcinally, huntergr, efriedma Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71457
OpenPOWER on IntegriCloud