summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [RISCV] Don't force Local Exec TLS for non-PICJames Clarke2019-12-031-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Forcing Local Exec TLS requires the use of copy relocations. Copy relocations need special handling in the runtime linker when being used against TLS symbols, which is present in glibc, but not in FreeBSD nor musl, and so cannot be relied upon. Moreover, copy relocations are a hack that embed the size of an object in the ABI when it otherwise wouldn't be, and break protected symbols (which are expected to be DSO local), whilst also wasting space, thus they should be avoided whenever possible. As discussed in D70398, RISC-V should move away from forcing Local Exec, and instead use Initial Exec like other targets, with possible linker relaxation to follow. The RISC-V GCC maintainers also intend to adopt this more-conventional behaviour (see https://github.com/riscv/riscv-elf-psabi-doc/issues/122). Reviewers: asb, MaskRay Reviewed By: MaskRay Subscribers: emaste, krytarowski, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, llvm-commits, bsdjhb Tags: #llvm Differential Revision: https://reviews.llvm.org/D70649
* [InstCombine] Revert aafde063aaf09285c701c80cd4b543c2beb523e8 and ↵Craig Topper2019-12-032-10/+2
| | | | | | | | | | | | | | | | 6749dc3446671df05235d0a218c426a314ac33cd related to bitcast handling of x86_mmx This reverts these two commits [InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double. [InstCombine] Don't transform bitcasts between x86_mmx and v1i64 into insertelement/extractelement We're seeing at least one internal test failure related to a bitcast that was previously before an inline assembly block containing emms being placed after it. This leads to the mmx state ending up not empty after the emms. IR has no way to make any specific guarantees about this. Reverting these patches to get back to previous behavior which at least worked for this test.
* [GlobalISel]: Allow targets to override how to widen constants during ↵Aditya Nandakumar2019-12-032-1/+13
| | | | | | | | | | | | | | | legalization https://reviews.llvm.org/D70922 This adds a hook to allow targets to define exactly what extension operation should be performed for widening constants. This handles cases like widening i1 true which would end up becoming -1 which affects code quality during combines. Additionally, in order to stay consistent with how DAG is promoting constants, we now signextend for byte sized types and zero extend otherwise (by default). Targets can of course override this if necessary.
* [LV] Scalar with predication must not be uniformAyal Zaks2019-12-031-17/+22
| | | | | | | | | | | | | | Fix PR40816: avoid considering scalar-with-predication instructions as also uniform-after-vectorization. Instructions identified as "scalar with predication" will be "vectorized" using a replicating region. If such instructions are also optimized as "uniform after vectorization", namely when only the first of VF lanes is used, such a replicating region becomes erroneous - only the first instance of the region can and should be formed. Fix such cases by not considering such instructions as "uniform after vectorization". Differential Revision: https://reviews.llvm.org/D70298
* Revert "Temporarily revert "build: avoid hardcoding the libxml2 library name""Saleem Abdulrasool2019-12-031-12/+6
| | | | | | This reverts commit 2e75681b55ab55301022533b203269f5f3d6f909. Restore the clean up change. The underlying CMake issue was resolved in 372ad32734ecb455f9fb4d0601229ca2dfc78b66.
* [NFC][KnownBits] Add getMinValue() / getMaxValue() methodsRoman Lebedev2019-12-036-13/+14
| | | | | | | | | | As it can be seen from accompanying cleanup, it is not unheard of to write `~Known.Zero` meaning "what maximal value can this KnownBits produce". But i think `~Known.Zero` isn't *that* self-explanatory, as compared to a method with a name. Note that not all `~Known.Zero` places were cleaned up, only those where this arguably improves things.
* [SLP] Enhance SLPVectorizer to vectorize different combinations of aggregatesAnton Afanasyev2019-12-031-62/+56
| | | | | | | | | | | | | | | | | | | | Summary: Make SLPVectorize to recognize homogeneous aggregates like `{<2 x float>, <2 x float>}`, `{{float, float}, {float, float}}`, `[2 x {float, float}]` and so on. It's a follow-up of https://reviews.llvm.org/D70068. Merged `findBuildVector()` and `findBuildAggregate()` to one `findBuildAggregate()` function making it recursive to recognize multidimensional aggregates. Aggregates required to be homogeneous. Reviewers: RKSimon, ABataev, dtemirbulatov, spatel, vporpo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70587
* [AArch64] Fix over-eager fusing of NEON SIMD MUL/ADDSanne Wouda2019-12-032-8/+362
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The ISel pattern for SIMD MLA is a bit too eager: it replaces the ADD with an MLA even when the MUL cannot be eliminated, e.g. when it has another use. An MLA is usually has a higher latency than an ADD (and there are fewer pipes available that can execute it), so trading an MLA for an ADD is not great. ISel is not taking the number of uses of the MUL result into account, nor any other factors such as the length of the critical path or other resource pressure. The MachineCombiner is able to make these judgments so this patch ports the ISel pattern for MUL/ADD fusing to the MachineCombiner. Similarly for MUL/SUB -> MLS, as well as the indexed variants. The change has no impact on SPEC CPU© intrate nor fprate. Reviewers: dmgreen, SjoerdMeijer, fhahn, Gerolf Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70673
* [SelectionDAG] Reoder ViewXXXDAGs declarations to match execution order. NFCAmaury Séchet2019-12-031-14/+12
|
* [Aarch64][SVE] Add intrinsics for gather loads (vector + imm)Sander de Smalen2019-12-034-29/+47
| | | | | | | | | | | | | | | | | | This patch adds intrinsics for SVE gather loads from memory addresses generated by a vector base plus immediate index: * @llvm.aarch64.sve.ld1.gather.imm This intrinsics maps 1-1 to the corresponding SVE instruction (example for half-words): * ld1h { z0.d }, p0/z, [z0.d, #16] Committed on behalf of Andrzej Warzynski (andwar) Reviewers: sdesmalen, huntergr, kmclaughlin, eli.friedman, rengolin, rovka, dancgr, mgudim, efriedma Reviewed By: sdesmalen Tags: #llvm Differential Revision: https://reviews.llvm.org/D70806
* [DebugInfo] Make DebugVariable class available in DebugInfoMetadatastozer2019-12-032-111/+29
| | | | | | | | The DebugVariable class is a class declared in LiveDebugValues.cpp which is used to uniquely identify a single variable, using its source variable, inline location, and fragment info to do so. This patch moves this class into DebugInfoMetadata.h, making it available in a much broader scope.
* [DDG] Data Dependence Graph - Topological Sort (Memory Leak Fix)Bardia Mahjour2019-12-032-5/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes the memory leak in bec37c3fc766a7b97f8c52c181c325fd47b75259 and re-delivers the reverted patch. In this patch the DDG DAG is sorted topologically to put the nodes in the graph in the order that would satisfy all dependencies. This helps transformations that would like to generate code based on the DDG. Since the DDG is a DAG a reverse-post-order traversal would give us the topological ordering. This patch also sorts the basic blocks passed to the builder based on program order to ensure that the dependencies are computed in the correct direction. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack Tags: #llvm Differential Revision: https://reviews.llvm.org/D70609
* [Aarch64][SVE] Add intrinsics for gather loads with 32-bits offsetsSander de Smalen2019-12-034-46/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds intrinsics for SVE gather loads for which the offsets are 32-bits wide and are: * unscaled * @llvm.aarch64.sve.ld1.gather.sxtw * @llvm.aarch64.sve.ld1.gather.uxtw * scaled (offsets become indices) * @llvm.arch64.sve.ld1.gather.sxtw.index * @llvm.arch64.sve.ld1.gather.uxtw.index The offsets are either zero (uxtw) or sign (sxtw) extended to 64 bits. These intrinsics map 1-1 to the corresponding SVE instructions (examples for half-words): * unscaled * ld1h { z0.s }, p0/z, [x0, z0.s, sxtw] * ld1h { z0.s }, p0/z, [x0, z0.s, uxtw] * scaled * ld1h { z0.s }, p0/z, [x0, z0.s, sxtw #1] * ld1h { z0.s }, p0/z, [x0, z0.s, uxtw #1] Committed on behalf of Andrzej Warzynski (andwar) Reviewers: sdesmalen, kmclaughlin, eli.friedman, rengolin, rovka, huntergr, dancgr, mgudim, efriedma Reviewed By: sdesmalen Tags: #llvm Differential Revision: https://reviews.llvm.org/D70782
* [NFCI][DebugInfo] Corrected a comment.Sourabh Singh Tomar2019-12-031-2/+2
|
* [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsicsKerry McLaughlin2019-12-032-17/+42
| | | | | | | | | | | | | | | | | | | Summary: Adds the following intrinsics: - faddp - fmaxp, fminp, fmaxnmp & fminnmp - fmlalb, fmlalt, fmlslb & fmlslt - flogb Reviewers: huntergr, sdesmalen, dancgr, efriedma Reviewed By: sdesmalen Subscribers: efriedma, tschuett, kristof.beyls, hiraditya, cameron.mcinally, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70253
* [Support] Add ProcName to TimeTraceProfilerRussell Gallop2019-12-031-5/+9
| | | | | | | | | | This was hard-coded to "clang". This change allows it to to be used on processes other than clang (such as lld). This gets reported as clang-10 on Linux and clang.exe on Windows so adapted test to accommodate this. Differential Revision: https://reviews.llvm.org/D70950
* [AArch64][SVE] Add intrinsics for gather loads with 64-bit offsetsSander de Smalen2019-12-036-29/+158
| | | | | | | | | | | | | | | | | | | | This patch adds the following intrinsics for gather loads with 64-bit offsets: * @llvm.aarch64.sve.ld1.gather (unscaled offset) * @llvm.aarch64.sve.ld1.gather.index (scaled offset) These intrinsics map 1-1 to the following AArch64 instructions respectively (examples for half-words): * ld1h { z0.d }, p0/z, [x0, z0.d] * ld1h { z0.d }, p0/z, [x0, z0.d, lsl #1] Committing on behalf of Andrzej Warzynski (andwar) Reviewers: sdesmalen, huntergr, rovka, mgudim, dancgr, rengolin, efriedma Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D70542
* Revert "[LiveDebugValues] Introduce entry values of unmodified params"Djordje Todorovic2019-12-031-301/+78
| | | | This reverts commit rG4cfceb910692 due to LLDB test failing.
* [VPlan] Add dump function to VPlan class.Florian Hahn2019-12-032-8/+14
| | | | | | | | | | | | | | This adds a dump() function to VPlan, which uses the existing operator<<. This method provides a convenient way to dump a VPlan while debugging, e.g. from lldb. Reviewers: hsaito, Ayal, gilr, rengolin Reviewed By: hsaito Differential Revision: https://reviews.llvm.org/D70920
* [AArch64][SVE] Implement shift intrinsicsKerry McLaughlin2019-12-035-21/+69
| | | | | | | | | | | | | | | | | | | | Summary: Adds the following intrinsics: - asr & asrd - insr - lsl & lsr This patch also adds a new AArch64ISD node (INSR) to represent the int_aarch64_sve_insr intrinsic. Reviewers: huntergr, sdesmalen, dancgr, mgudim, rengolin, efriedma Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70437
* [CodeGen] Move ARMCodegenPrepare to TypePromotionSam Parker2019-12-036-175/+102
| | | | | | | | | | | | | | | | | | Convert ARMCodeGenPrepare into a generic type promotion pass by: - Removing the insertion of arm specific intrinsics to handle narrow types as we weren't using this. - Removing ARMSubtarget references. - Now query a generic TLI object to know which types should be promoted and what they should be promoted to. - Move all codegen tests into Transforms folder and testing using opt and not llc, which is how they should have been written in the first place... The pass searches up from icmp operands in an attempt to safely promote types so we can avoid generating unnecessary unsigned extends during DAG ISel. Differential Revision: https://reviews.llvm.org/D69556
* [DWARF] Add support for parsing/dumping section indices in location listsPavel Labath2019-12-033-32/+51
| | | | | | | | | | | | | | | | | | | Summary: This does exactly what it says on the box. The only small gotcha is the section index computation for offset_pair entries, which can use either the base address section, or the section from the offset_pair entry. This is to support both the cases where the base address is relocated (points to the base of the CU, typically), and the case where the base address is a constant (typically zero) and relocations are on the offsets themselves. Reviewers: dblaikie, JDevlieghere, aprantl, SouraVX Subscribers: hiraditya, llvm-commits, probinson Tags: #llvm Differential Revision: https://reviews.llvm.org/D70540
* [asan] Remove debug locations from alloca prologue instrumentationJohannes Altmanninger2019-12-031-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes https://llvm.org/PR26673 "Wrong debugging information with -fsanitize=address" where asan instrumentation causes the prologue end to be computed incorrectly: findPrologueEndLoc, looks for the first instruction with a debug location to determine the prologue end. Since the asan instrumentation instructions had debug locations, that prologue end was at some instruction, where the stack frame is still being set up. There seems to be no good reason for extra debug locations for the asan instrumentations that set up the frame; they don't have a natural source location. In the debugger they are simply located at the start of the function. For certain other instrumentations like -fsanitize-coverage=trace-pc-guard the same problem persists - that might be more work to fix, since it looks like they rely on locations of the tracee functions. This partly reverts aaf4bb239487e0a3b20a8eaf94fc641235ba2c29 "[asan] Set debug location in ASan function prologue" whose motivation was to give debug location info to the coverage callback. Its test only ensures that the call to @__sanitizer_cov_trace_pc_guard is given the correct source location; as the debug location is still set in ModuleSanitizerCoverage::InjectCoverageAtBlock, the test does not break. So -fsanitize-coverage is hopefully unaffected - I don't think it should rely on the debug locations of asan-generated allocas. Related revision: 3c6c14d14b40adfb581940859ede1ac7d8ceae7a "ASAN: Provide reliable debug info for local variables at -O0." Below is how the X86 assembly version of the added test case changes. We get rid of some .loc lines and put prologue_end where the user code starts. ```diff --- 2.master.s 2019-12-02 12:32:38.982959053 +0100 +++ 2.patch.s 2019-12-02 12:32:41.106246674 +0100 @@ -45,8 +45,6 @@ .cfi_offset %rbx, -24 xorl %eax, %eax movl %eax, %ecx - .Ltmp2: - .loc 1 3 0 prologue_end # 2.c:3:0 cmpl $0, __asan_option_detect_stack_use_after_return movl %edi, 92(%rbx) # 4-byte Spill movq %rsi, 80(%rbx) # 8-byte Spill @@ -57,9 +55,7 @@ callq __asan_stack_malloc_0 movq %rax, 72(%rbx) # 8-byte Spill .LBB1_2: - .loc 1 0 0 is_stmt 0 # 2.c:0:0 movq 72(%rbx), %rax # 8-byte Reload - .loc 1 3 0 # 2.c:3:0 cmpq $0, %rax movq %rax, %rcx movq %rax, 64(%rbx) # 8-byte Spill @@ -72,9 +68,7 @@ movq %rax, %rsp movq %rax, 56(%rbx) # 8-byte Spill .LBB1_4: - .loc 1 0 0 # 2.c:0:0 movq 56(%rbx), %rax # 8-byte Reload - .loc 1 3 0 # 2.c:3:0 movq %rax, 120(%rbx) movq %rax, %rcx addq $32, %rcx @@ -99,7 +93,6 @@ movb %r8b, 31(%rbx) # 1-byte Spill je .LBB1_7 # %bb.5: - .loc 1 0 0 # 2.c:0:0 movq 40(%rbx), %rax # 8-byte Reload andq $7, %rax addq $3, %rax @@ -118,7 +111,8 @@ movl %ecx, (%rax) movq 80(%rbx), %rdx # 8-byte Reload movq %rdx, 128(%rbx) - .loc 1 4 3 is_stmt 1 # 2.c:4:3 +.Ltmp2: + .loc 1 4 3 prologue_end # 2.c:4:3 movq %rax, %rdi callq f movq 48(%rbx), %rax # 8-byte Reload ``` Reviewers: eugenis, aprantl Reviewed By: eugenis Subscribers: ormris, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70894
* ImplicitNullChecks: Don't add a dead definition of DepMI as live-inJonas Paulsson2019-12-031-1/+1
| | | | | | | | This is one of the fixes needed to reapply D68267 which improves verification of live-in lists. Review: craig.topper https://reviews.llvm.org/D70434
* [LiveDebugValues] Introduce entry values of unmodified paramsDjordje Todorovic2019-12-031-78/+301
| | | | | | | | | | | | | | | | | | | | | The idea is to remove front-end analysis for the parameter's value modification and leave it to the value tracking system. Front-end in some cases marks a parameter as modified even the line of code that modifies the parameter gets optimized, that implies that this will cover more entry values even. In addition, extending the support for modified parameters will be easier with this approach. Since the goal is to recognize if a parameter’s value has changed, the idea at very high level is: If we encounter a DBG_VALUE other than the entry value one describing the same variable (parameter), we can assume that the variable’s value has changed and we should not track its entry value any more. That would be ideal scenario, but due to various LLVM optimizations, a variable’s value could be just moved around from one register to another (and there will be additional DBG_VALUEs describing the same variable), so we have to recognize such situation (otherwise, we will lose a lot of entry values) and salvage the debug entry value. Differential Revision: https://reviews.llvm.org/D68209
* [NFC] Tidy-ups to TimeProfiler.cppRussell Gallop2019-12-031-10/+8
| | | | | | | | Remove unused include Make fields const where possible Move initialisation to initialiser list Differential Revision: https://reviews.llvm.org/D70904
* [MachineVerifier] Improve checks of target instructions operands.Jonas Paulsson2019-12-031-7/+17
| | | | | | | | | | | | | | | | While working with a patch for instruction selection, the splitting of a large immediate ended up begin treated incorrectly by the backend. Where a register operand should have been created, it instead became an immediate. To my surprise the machine verifier failed to report this, which at the time would have been helpful. This patch improves the verifier so that it will report this type of error. This patch XFAILs CodeGen/SPARC/fp128.ll, which has been reported at https://bugs.llvm.org/show_bug.cgi?id=44091 Review: thegameg, arsenm, fhahn https://reviews.llvm.org/D63973
* [LegalizeDAG] Return true from ExpandNode for some nodes that don't have ↵Craig Topper2019-12-021-1/+3
| | | | | | | | | | | | | | expand support. These nodes have a FIXME that they only get here because a Custom handler returned SDValue() instead of the original Op. Even though we aren't expanding them, we should return true here to prevent ConvertNodeToLibcall from also trying to process them until the FIXME has been addressed. I'm hoping to add checking to ConvertNodeToLibcall to make sure we don't give it nodes it doesn't have support for.
* [LegalizeDAG] When expanding vector SRA/SRL/SHL add the new BUILD_VECTOR to ↵Craig Topper2019-12-021-1/+1
| | | | | | | | | | | | | | | | the Results vector instead of just calling ReplaceNode The code that processes the Results vector also calls ReplaceNode and makes ExpandNode return true. If we don't add it to the Results node, we end up returning false from ExpandNode. This causes ConvertNodeToLibcall to be called next. But ConvertNodeToLibcall doesn't do anything for shifts so they just pass through unmodified. Except for printing a debug message. Ultimately, I'd like to add more checks to ExpandNode and ConvertNodeToLibcall to make sure we don't have nodes marked as Expand that don't have any Expand or libcall handling.
* [NFC][PowerPC] Add the inheritable and additional features to make the ↵QingShan Zhang2019-12-031-39/+81
| | | | | | | | | | | | | | | processor definition more clear The old processor design assume that, all the old processor's feature must be inherited into future processor. That is not true as instruction fusion or some implementation defined features are not inheritable. What this patch did: * Rename the old "specific features" to "additional features" that keep the new added inheritable features. * Use the "specific features" to keep those features only for specific processor. * Add the "inheritable features" to keep all the features that inherited from early processor. Differential Revision: https://reviews.llvm.org/D70768
* Recommit "[DWARF5]Addition of alignment atrribute in typedef DIE."Sourabh Singh Tomar2019-12-033-7/+16
| | | | | | | | | | | | | | | | This revision is revised to update Go-bindings and Release Notes. The original commit message follows. This patch, adds support for DW_AT_alignment[DWARF5] attribute, to be emitted with typdef DIE. When explicit alignment is specified. Patch by Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok, deadalinx Differential Revision: https://reviews.llvm.org/D70111
* [DebugInfo] Support for debug_macinfo.dwo section in llvm and llvm-dwarfdump.Sourabh Singh Tomar2019-12-034-2/+45
| | | | | | | | | | | This patch adds support for debug_macinfo.dwo section[pre-standardized] to llvm and llvm-dwarfdump. Reviewers: probinson, dblaikie, aprantl, jini.susan.george, alok Differential Revision: https://reviews.llvm.org/D70705 Tags: #debug-info #llvm
* [X86] Model MXCSR for AVX instructions other than AVX512Wang, Pengfei2019-12-032-12/+17
| | | | | | | | | | | | Summary: Model MXCSR for AVX instructions other than AVX512 Reviewers: craig.topper, RKSimon Subscribers: hiraditya, llvm-commits, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70875
* Place the "cold" code piece into the same section as the original functionBill Wendling2019-12-021-0/+3
| | | | | | | | | | | | | | | | Summary: This cropped up in the Linux kernel where cold code was placed in an incompatible section. Reviewers: compnerd, vsk, tejohnson Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70925
* Temporarily revert "build: avoid hardcoding the libxml2 library name"Eric Christopher2019-12-021-6/+12
| | | | | | | | as it breaks uses of llvm-config --system-libs and the follow-on commit "build: avoid cached literals being linked against" This reverts commits 340e7c0b77a7037afefe7255503afe362967b577 and 340e7c0b77a7037afefe7255503afe362967b577.
* [PGO][PGSO] Add an optional query type parameter to shouldOptimizeForSize.Hiroshi Yamauchi2019-12-027-13/+24
| | | | | | | | | | | | | | Summary: In case of a need to distinguish different query sites for gradual commit or debugging of PGSO. NFC. Reviewers: davidxl Subscribers: hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70510
* [Remarks][ThinLTO] Use the correct file extension based on the formatFrancis Visoiu Mistrih2019-12-021-1/+5
| | | | | Since we now have multiple formats, the ThinLTO remark files should also respect that.
* [MIBundles] Move analyzePhysReg out of MIBundleOperands iterator (NFC).Florian Hahn2019-12-024-17/+8
| | | | | | | | | | | analyzePhysReg does not really fit into the iterator and moving it makes it easier to change the base iterator. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm, qcolombet Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D70559
* [GlobalISel] CombinerHelper: Fix a bug in matchCombineCopyVolkan Keles2019-12-021-3/+26
| | | | | | | | | | | | | | | | | Summary: When combining COPY instructions, we were replacing the destination registers with the source register without checking register constraints. This patch adds a simple logic to check if the constraints match before replacing registers. Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, dsanders, Petar.Avramovic Reviewed By: aditya_nandakumar Subscribers: rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70616
* [ARM] Add ARMVCCThen to tablegen and make use of it. NFCDavid Green2019-12-022-71/+76
| | | | | | | Similar to the parent, this adds some constants to tablegen to replace the existing magic values. Differential Revision: https://reviews.llvm.org/D70825
* [ARM] Add ARMCC constants to tablegen. NFCDavid Green2019-12-023-151/+133
| | | | | | | | | | I got tired of looking at magic constants in tablegen files. This adds condition codes like ARMCCeq and makes use of them. I also removed the extra patterns for reverse condition codes from D70296, they should now be covered by the parent commit. Differential Revision: https://reviews.llvm.org/D70824
* [ARM] Add some VCMP folding and canonicalisationDavid Green2019-12-023-27/+62
| | | | | | | | | | The VCMP instructions in MVE can accept a register or ZR, but only as the right hand operator. Most of the time this will already be correct because the icmp will have been canonicalised that way already. There are some cases in the lowering of float conditions that this will not apply to though. This code should fix up those cases. Differential Revision: https://reviews.llvm.org/D70822
* [MIBundles] Move analyzeVirtReg out of MIBundleOperands iterator (NFC).Florian Hahn2019-12-022-17/+16
| | | | | | | | | | | analyzeVirtReg does not really fit into the iterator and moving it makes it easier to change the base iterator. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm, qcolombet Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D70558
* Reland "b19ec1eb3d0c [BPI] Improve unreachable/ColdCall heurstics to handle ↵Taewook Oh2019-12-021-57/+75
| | | | | | | | | | | | | loops." Summary: b19ec1eb3d0c has been reverted because of the test failures with PowerPC targets. This patch addresses the issues from the previous commit. Test Plan: ninja check-all. Confirmed that CodeGen/PowerPC/pr36292.ll and CodeGen/PowerPC/sms-cpy-1.ll pass Subscribers: llvm-commits
* [VPlan] Move graph traits (NFC).Florian Hahn2019-12-021-121/+122
| | | | | | | | | | | By defining the graph traits right after the VPBlockBase definitions, we can make use of them earlier in the file. Reviewers: hsaito, Ayal, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D70733
* [DAGCombine] Factor oplist operations. NFCAmaury Séchet2019-12-021-6/+10
|
* [InstCombine] fix undef propagation for vector urem transform (PR44186)Sanjay Patel2019-12-021-2/+4
| | | | | | | | | As described here: https://bugs.llvm.org/show_bug.cgi?id=44186 The match() code safely allows undef values, but we can't safely propagate a vector constant that contains an undef to the new compare instruction.
* [SelectionDAG] Reduce assumptions made about levels. NFCAmaury Séchet2019-12-021-7/+8
|
* [ARM,MVE] Add intrinsics to deal with predicates.Simon Tatham2019-12-021-12/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This commit adds the `vpselq` intrinsics which take an MVE predicate word and select lanes from two vectors; the `vctp` intrinsics which create a tail predicate word suitable for processing the first m elements of a vector (e.g. in the last iteration of a loop); and `vpnot`, which simply complements a predicate word and is just syntactic sugar for the `~` operator. The `vctp` ACLE intrinsics are lowered to the IR intrinsics we've already added (and which D70592 just reorganized). I've filled in the missing isel rule for VCTP64, and added another set of rules to generate the predicated forms. I needed one small tweak in MveEmitter to allow the `unpromoted` type modifier to apply to predicates as well as integers, so that `vpnot` doesn't pointlessly convert its input integer to an `<n x i1>` before complementing it. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70485
* [ARM,MVE] Rename and clean up VCTP IR intrinsics.Simon Tatham2019-12-022-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: D65884 added a set of Arm IR intrinsics for the MVE VCTP instruction, to use in tail predication. But the 64-bit one doesn't work properly: its predicate type is `<2 x i1>` / `v2i1`, which isn't a legal MVE type (due to not having a full set of instructions that manipulate it usefully). The test of `vctp64` in `basic-tail-pred.ll` goes through `opt` fine, as the test expects, but if you then feed it to `llc` it causes a type legality failure at isel time. The usual workaround we've been using in the rest of the MVE intrinsics family is to bodge `v2i1` into `v4i1`. So I've adjusted the `vctp64` IR intrinsic to do that, and completely removed the code (and test) that uses that intrinsic for 64-bit tail predication. That will allow me to add isel rules (upcoming in D70485) that actually generate the VCTP64 instruction. Also renamed all four of these IR intrinsics so that they have `mve` in the name, since its absence was confusing. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: MarkMurrayARM Subscribers: samparker, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70592
OpenPOWER on IntegriCloud