bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[SimpleLoopUnswitch] Invalidate the topmost loop with ExitBB as exiting.	Florian Hahn	2019-12-04	1	-2/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SCEV caches the exiting blocks when computing exit counts. In SimpleLoopUnswitch, we split the exit block of the loop to unswitch. Currently we only invalidate the loop containing that exit block, but if that block is the exiting block for a parent loop, we have stale cache entries. We have to invalidate the top-most loop that contains the exit block as exiting block. We might also be able to skip invalidating the loop containing the exit block, if the exit block is not an exiting block of that loop. There are also 2 more places in SimpleLoopUnswitch, that use a similar problematic approach to get the loop to invalidate. If the patch makes sense, I will also update those places to a similar approach (they deal with multiple exit blocks, so we cannot directly re-use getTopMostExitingLoop). Fixes PR43972. Reviewers: skatkov, reames, asbirlea, chandlerc Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D70786
*	Allow negative offsets in MipsMCInstLower::LowerOperand	Alex Richardson	2019-12-04	2	-7/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We rely on this in our CHERI backend to address the GOT by generating a $pc-relative addresses. For this we emit the following code sequence: lui $1, %pcrel_hi(_CHERI_CAPABILITY_TABLE_-8) daddiu $1, $1, %pcrel_lo(_CHERI_CAPABILITY_TABLE_-4) cgetpccincoffset $c1, $1 However, without this change the addend is implicitly converted to UINT32_MAX and an invalid pointer value is generated. Reviewers: atanasyan Reviewed By: atanasyan Subscribers: merge_guards_bot, sdardis, hiraditya, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70953
*	Handle BUNDLE instructions in MipsAsmPrinter	Alex Richardson	2019-12-04	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In our CHERI fork we use BUNDLE instructions to ensure that a three-instruction sequence to generate a program-counter-relative value is emitted without reordering or insertions (since that would break the 32-bit offset computation). Currently MipsAsmPrinter asserts when it encounters a pseudo instruction. To handle BUNDLE we can simply skip the instruction which will then make EmitInstruction() process the contents of the bundle in order. Reviewers: atanasyan Reviewed By: atanasyan Subscribers: merge_guards_bot, sdardis, hiraditya, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70945
*	MipsDelaySlotFiller: Don't move BUNDLE instructions into the delay slot	Alex Richardson	2019-12-04	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In our CHERI fork we use BUNDLE instructions to ensure that a three-instruction sequence to generate a program-counter-relative value is emitted without reordering or insertions (since that would break the 32-bit offset computation). This sequence is created in MipsExpandPseudo and we use finalizeBundle() to create the BUNDLE instruction. However, the delay slot filler currently breaks this pattern since the BUNDLE will be removed and so all instructions are moved into the delay slot. Since the delay slot only executes the first instruction, this results in incorrect computations (and run-time crashes) if the branch is taken. The original test cases uses CHERI instructions, so for the test case here I simple filled a BUNDLE with a no-op DADDiu $sp_64, -16 and DADDiu $sp_64, 16. Reviewers: atanasyan Reviewed By: atanasyan Subscribers: merge_guards_bot, sdardis, hiraditya, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70944
*	Add debug output to MipsDelaySlotFiller pass	Alex Richardson	2019-12-04	1	-5/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I was tracking down a code-generation bug in this pass and found that the added output was useful. It is also helpful to find out why a delay slot could not be filled even though there is clearly a valid instruction (which appears to mostly be caused by CFI instructions). Reviewers: atanasyan Reviewed By: atanasyan Subscribers: merge_guards_bot, sdardis, hiraditya, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70940
*	[AArch64TTI] Compute imm materialization cost for AArch64 intrinsics	Florian Hahn	2019-12-04	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, getIntImmCost returns TCC_Free for almost all intrinsics. For most AArch64 specific intrinsics however, it looks like integer constants cannot be folded into most of them (at least the ones I checked). Unless we know that we can fold integer operands with the intrinsic, we handle more cases correctly by returning the cost to materialize the immediate than return TCC_Free. Reviewers: SjoerdMeijer, dmgreen, t.p.northover, ributzka Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D70669
*	[NFC] Use default case in EVT::getEVTString	Cullen Rhodes	2019-12-04	1	-127/+2
\| \| \| \| \| \| \| \| \| \|	Summary: The default case handles the majority of MVTs so most of the individual cases can be removed. Also added a case for floating point types. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D70955
*	AMDGPU: Avoid folding 2 constant operands into an SALU operation	David Stuttard	2019-12-04	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Catch the (admittedly unusual) case where SIFoldOperands attempts to fold 2 constant operands into the same SALU operation, with neither operand able to be encoded as an inline constant. Change-Id: Ibc48d662c9ffd8bbacd154976b0b1c257ace0927 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70896
*	[yaml2obj] - Make DynamicSymbols to be Optional<> too.	Georgii Rymar	2019-12-04	1	-16/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already have Symbols property to list regular symbols and it is currently Optional<>. This patch makes DynamicSymbols to be optional too. With this there is no need to define a dummy symbol anymore to trigger creation of the .dynsym and it is now possible to define an empty .dynsym using just the following line: DynamicSymbols: [] (it is important to have when you do not want to have dynamic symbols, but want to have a .dynsym) Now the code is consistent and it helped to fix a bug: previously we did not report an error when both Content/Size and an empty Symbols/DynamicSymbols list were specified. Differential revision: https://reviews.llvm.org/D70956
*	[APFloat] Prevent construction of APFloat with Semantics and FP value	Ehud Katz	2019-12-04	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Constructor invocations such as `APFloat(APFloat::IEEEdouble(), 0.0)` may seem like they accept a FP (floating point) value, but the overload they reach is actually the `integerPart` one, not a `float` or `double` overload (which only exists when `fltSemantics` isn't passed). This may lead to possible loss of data, by the conversion from `float` or `double` to `integerPart`. To prevent future mistakes, a new constructor overload, which accepts any FP value and marked with `delete`, to prevent its usage. Fixes PR34095. Differential Revision: https://reviews.llvm.org/D70425
*	[DWARFDebugLoclists] Add support for other DW_LLE encodings	Pavel Labath	2019-12-04	1	-8/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: lldb's loclists parser has support for DW_LLE_start_end(x) encodings. To avoid regressing when switching the implementation to llvm's, I add parsing support for all previously unsupported location list encodings. Reviewers: dblaikie, JDevlieghere, aprantl, SouraVX Subscribers: hiraditya, probinson, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70949
*	[DWARFDebugRnglists] Add a callback-based version of the getAbsoluteRanges ↵	Pavel Labath	2019-12-04	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	function Summary: The dump() function already accepts a callback. This makes getAbsoluteRanges do the same. The existing DWARFUnit overload is implemented on top of the new function. This enables usage of the debug_rnglists parser from within lldb (which has it's own dwarf parser). Reviewers: dblaikie, JDevlieghere, aprantl Subscribers: hiraditya, probinson, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70952
*	[SelectionDAG] Expand nnan FMINNUM/FMAXNUM to select sequence	Ulrich Weigand	2019-12-04	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	InstCombine may synthesize FMINNUM/FMAXNUM nodes from fcmp+select sequences (where the fcmp is marked nnan). Currently, if the target does not otherwise handle these nodes, they'll get expanded to libcalls to fmin/fmax. However, these functions may reside in libm, which may introduce a library dependency that was not originally present in the source code, potentially resulting in link failures. To fix this problem, add code to TargetLowering::expandFMINNUM_FMAXNUM to expand FMINNUM/FMAXNUM to a compare+select sequence instead of the libcall. This is done only if the node is marked as "nnan"; in this case, the expansion to compare+select is always correct. This also suffices to catch all cases where FMINNUM/FMAXNUM was synthesized as above. Differential Revision: https://reviews.llvm.org/D70965
*	[MacroFusion] Limit the max fused number as 2 to reduce the dependency	QingShan Zhang	2019-12-04	1	-1/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the example: int foo(int a, int b, int c, int d) { return a + b + c + d; } And this is the Dependency Graph: +------+ +------+ +------+ +------+ \| A \| \| B \| \| C \| \| D \| +--+--++ +---+--+ +--+---+ +--+---+ ^ ^ ^ ^ ^ ^ \| \| \| \| \| \| \| \| \| \|New1 +--------------+ \| \| \| \| \| \| \| \| \| +--+---+ \| \|New2 \| +-------+ ADD1 \| \| \| \| +--+---+ \| \| \| Fuse ^ \| \| +-------------+ \| +------------+ \| \| \| Fuse +--+---+ +----------->+ ADD2 \| \| +------+ +--+---+ \| ADD3 \| +------+ We need also create an artificial edge from ADD1 to A if https://reviews.llvm.org/D69998 is landed. That will force the Node A scheduled before the ADD1 and ADD2. But in fact, it is ok to schedule the Node A in-between ADD3 and ADD2, as ADD3 and ADD2 are NOT a fusion pair because ADD2 has been matched to ADD1. We are creating these unnecessary dependency edges that override the heuristics. Differential Revision: https://reviews.llvm.org/D70066
*	[PowerPC] folding rlwinm + rlwinm to rlwinm	czhengsz	2019-12-03	1	-0/+138
\| \| \| \| \| \| \| \| \| \| \| \|	For example: x3 = rlwinm x3, 27, 5, 31 x3 = rlwinm x3, 19, 0, 12 can be combined to x3 = rlwinm x3, 14, 0, 12 Reviewed by: steven.zhang, lkail Differential Revision: https://reviews.llvm.org/D70374
*	[X86] Model DAZ and FTZ	Wang, Pengfei	2019-12-04	2	-21/+43
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a follow-up of D70881. It models DAZ and FTZ for releated instructions. Reviewers: craig.topper, RKSimon, andrew.w.kaylor Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70938
*	[X86] Model MXCSR for all AVX512 instructions	Wang, Pengfei	2019-12-04	2	-57/+79
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Model MXCSR for all AVX512 instructions Reviewers: craig.topper, RKSimon, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70881
*	[FPEnv] [PowerPC] Lowering ppc_fp128 StrictFP Nodes to libcalls	Craig Topper	2019-12-03	4	-210/+283
\| \| \| \| \| \| \| \|	This is an alternative to D64662 that shares more code between strict and non-strict nodes. It's modeled after the implementation that I did for softening. Differential Revision: https://reviews.llvm.org/D70867
*	[RISCV] Don't force Local Exec TLS for non-PIC	James Clarke	2019-12-03	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Forcing Local Exec TLS requires the use of copy relocations. Copy relocations need special handling in the runtime linker when being used against TLS symbols, which is present in glibc, but not in FreeBSD nor musl, and so cannot be relied upon. Moreover, copy relocations are a hack that embed the size of an object in the ABI when it otherwise wouldn't be, and break protected symbols (which are expected to be DSO local), whilst also wasting space, thus they should be avoided whenever possible. As discussed in D70398, RISC-V should move away from forcing Local Exec, and instead use Initial Exec like other targets, with possible linker relaxation to follow. The RISC-V GCC maintainers also intend to adopt this more-conventional behaviour (see https://github.com/riscv/riscv-elf-psabi-doc/issues/122). Reviewers: asb, MaskRay Reviewed By: MaskRay Subscribers: emaste, krytarowski, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, llvm-commits, bsdjhb Tags: #llvm Differential Revision: https://reviews.llvm.org/D70649
*	[InstCombine] Revert aafde063aaf09285c701c80cd4b543c2beb523e8 and ↵	Craig Topper	2019-12-03	2	-10/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	6749dc3446671df05235d0a218c426a314ac33cd related to bitcast handling of x86_mmx This reverts these two commits [InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double. [InstCombine] Don't transform bitcasts between x86_mmx and v1i64 into insertelement/extractelement We're seeing at least one internal test failure related to a bitcast that was previously before an inline assembly block containing emms being placed after it. This leads to the mmx state ending up not empty after the emms. IR has no way to make any specific guarantees about this. Reverting these patches to get back to previous behavior which at least worked for this test.
*	[GlobalISel]: Allow targets to override how to widen constants during ↵	Aditya Nandakumar	2019-12-03	2	-1/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	legalization https://reviews.llvm.org/D70922 This adds a hook to allow targets to define exactly what extension operation should be performed for widening constants. This handles cases like widening i1 true which would end up becoming -1 which affects code quality during combines. Additionally, in order to stay consistent with how DAG is promoting constants, we now signextend for byte sized types and zero extend otherwise (by default). Targets can of course override this if necessary.
*	[LV] Scalar with predication must not be uniform	Ayal Zaks	2019-12-03	1	-17/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix PR40816: avoid considering scalar-with-predication instructions as also uniform-after-vectorization. Instructions identified as "scalar with predication" will be "vectorized" using a replicating region. If such instructions are also optimized as "uniform after vectorization", namely when only the first of VF lanes is used, such a replicating region becomes erroneous - only the first instance of the region can and should be formed. Fix such cases by not considering such instructions as "uniform after vectorization". Differential Revision: https://reviews.llvm.org/D70298
*	Revert "Temporarily revert "build: avoid hardcoding the libxml2 library name""	Saleem Abdulrasool	2019-12-03	1	-12/+6
\| \| \| \| \| \|	This reverts commit 2e75681b55ab55301022533b203269f5f3d6f909. Restore the clean up change. The underlying CMake issue was resolved in 372ad32734ecb455f9fb4d0601229ca2dfc78b66.
*	[NFC][KnownBits] Add getMinValue() / getMaxValue() methods	Roman Lebedev	2019-12-03	6	-13/+14
\| \| \| \| \| \| \| \| \| \|	As it can be seen from accompanying cleanup, it is not unheard of to write `~Known.Zero` meaning "what maximal value can this KnownBits produce". But i think `~Known.Zero` isn't that self-explanatory, as compared to a method with a name. Note that not all `~Known.Zero` places were cleaned up, only those where this arguably improves things.
*	[SLP] Enhance SLPVectorizer to vectorize different combinations of aggregates	Anton Afanasyev	2019-12-03	1	-62/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Make SLPVectorize to recognize homogeneous aggregates like `{<2 x float>, <2 x float>}`, `{{float, float}, {float, float}}`, `[2 x {float, float}]` and so on. It's a follow-up of https://reviews.llvm.org/D70068. Merged `findBuildVector()` and `findBuildAggregate()` to one `findBuildAggregate()` function making it recursive to recognize multidimensional aggregates. Aggregates required to be homogeneous. Reviewers: RKSimon, ABataev, dtemirbulatov, spatel, vporpo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70587
*	[AArch64] Fix over-eager fusing of NEON SIMD MUL/ADD	Sanne Wouda	2019-12-03	2	-8/+362
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The ISel pattern for SIMD MLA is a bit too eager: it replaces the ADD with an MLA even when the MUL cannot be eliminated, e.g. when it has another use. An MLA is usually has a higher latency than an ADD (and there are fewer pipes available that can execute it), so trading an MLA for an ADD is not great. ISel is not taking the number of uses of the MUL result into account, nor any other factors such as the length of the critical path or other resource pressure. The MachineCombiner is able to make these judgments so this patch ports the ISel pattern for MUL/ADD fusing to the MachineCombiner. Similarly for MUL/SUB -> MLS, as well as the indexed variants. The change has no impact on SPEC CPU© intrate nor fprate. Reviewers: dmgreen, SjoerdMeijer, fhahn, Gerolf Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70673
*	[SelectionDAG] Reoder ViewXXXDAGs declarations to match execution order. NFC	Amaury Séchet	2019-12-03	1	-14/+12
\|
*	[Aarch64][SVE] Add intrinsics for gather loads (vector + imm)	Sander de Smalen	2019-12-03	4	-29/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds intrinsics for SVE gather loads from memory addresses generated by a vector base plus immediate index: * @llvm.aarch64.sve.ld1.gather.imm This intrinsics maps 1-1 to the corresponding SVE instruction (example for half-words): * ld1h { z0.d }, p0/z, [z0.d, #16] Committed on behalf of Andrzej Warzynski (andwar) Reviewers: sdesmalen, huntergr, kmclaughlin, eli.friedman, rengolin, rovka, dancgr, mgudim, efriedma Reviewed By: sdesmalen Tags: #llvm Differential Revision: https://reviews.llvm.org/D70806
*	[DebugInfo] Make DebugVariable class available in DebugInfoMetadata	stozer	2019-12-03	2	-111/+29
\| \| \| \| \| \| \| \|	The DebugVariable class is a class declared in LiveDebugValues.cpp which is used to uniquely identify a single variable, using its source variable, inline location, and fragment info to do so. This patch moves this class into DebugInfoMetadata.h, making it available in a much broader scope.
*	[DDG] Data Dependence Graph - Topological Sort (Memory Leak Fix)	Bardia Mahjour	2019-12-03	2	-5/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes the memory leak in bec37c3fc766a7b97f8c52c181c325fd47b75259 and re-delivers the reverted patch. In this patch the DDG DAG is sorted topologically to put the nodes in the graph in the order that would satisfy all dependencies. This helps transformations that would like to generate code based on the DDG. Since the DDG is a DAG a reverse-post-order traversal would give us the topological ordering. This patch also sorts the basic blocks passed to the builder based on program order to ensure that the dependencies are computed in the correct direction. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack Tags: #llvm Differential Revision: https://reviews.llvm.org/D70609
*	[Aarch64][SVE] Add intrinsics for gather loads with 32-bits offsets	Sander de Smalen	2019-12-03	4	-46/+98
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds intrinsics for SVE gather loads for which the offsets are 32-bits wide and are: * unscaled * @llvm.aarch64.sve.ld1.gather.sxtw * @llvm.aarch64.sve.ld1.gather.uxtw * scaled (offsets become indices) * @llvm.arch64.sve.ld1.gather.sxtw.index * @llvm.arch64.sve.ld1.gather.uxtw.index The offsets are either zero (uxtw) or sign (sxtw) extended to 64 bits. These intrinsics map 1-1 to the corresponding SVE instructions (examples for half-words): * unscaled * ld1h { z0.s }, p0/z, [x0, z0.s, sxtw] * ld1h { z0.s }, p0/z, [x0, z0.s, uxtw] * scaled * ld1h { z0.s }, p0/z, [x0, z0.s, sxtw #1] * ld1h { z0.s }, p0/z, [x0, z0.s, uxtw #1] Committed on behalf of Andrzej Warzynski (andwar) Reviewers: sdesmalen, kmclaughlin, eli.friedman, rengolin, rovka, huntergr, dancgr, mgudim, efriedma Reviewed By: sdesmalen Tags: #llvm Differential Revision: https://reviews.llvm.org/D70782
*	[NFCI][DebugInfo] Corrected a comment.	Sourabh Singh Tomar	2019-12-03	1	-2/+2
\|
*	[AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics	Kerry McLaughlin	2019-12-03	2	-17/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds the following intrinsics: - faddp - fmaxp, fminp, fmaxnmp & fminnmp - fmlalb, fmlalt, fmlslb & fmlslt - flogb Reviewers: huntergr, sdesmalen, dancgr, efriedma Reviewed By: sdesmalen Subscribers: efriedma, tschuett, kristof.beyls, hiraditya, cameron.mcinally, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70253
*	[Support] Add ProcName to TimeTraceProfiler	Russell Gallop	2019-12-03	1	-5/+9
\| \| \| \| \| \| \| \| \| \|	This was hard-coded to "clang". This change allows it to to be used on processes other than clang (such as lld). This gets reported as clang-10 on Linux and clang.exe on Windows so adapted test to accommodate this. Differential Revision: https://reviews.llvm.org/D70950
*	[AArch64][SVE] Add intrinsics for gather loads with 64-bit offsets	Sander de Smalen	2019-12-03	6	-29/+158
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds the following intrinsics for gather loads with 64-bit offsets: * @llvm.aarch64.sve.ld1.gather (unscaled offset) * @llvm.aarch64.sve.ld1.gather.index (scaled offset) These intrinsics map 1-1 to the following AArch64 instructions respectively (examples for half-words): * ld1h { z0.d }, p0/z, [x0, z0.d] * ld1h { z0.d }, p0/z, [x0, z0.d, lsl #1] Committing on behalf of Andrzej Warzynski (andwar) Reviewers: sdesmalen, huntergr, rovka, mgudim, dancgr, rengolin, efriedma Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D70542
*	Revert "[LiveDebugValues] Introduce entry values of unmodified params"	Djordje Todorovic	2019-12-03	1	-301/+78
\| \| \| \|	This reverts commit rG4cfceb910692 due to LLDB test failing.
*	[VPlan] Add dump function to VPlan class.	Florian Hahn	2019-12-03	2	-8/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a dump() function to VPlan, which uses the existing operator<<. This method provides a convenient way to dump a VPlan while debugging, e.g. from lldb. Reviewers: hsaito, Ayal, gilr, rengolin Reviewed By: hsaito Differential Revision: https://reviews.llvm.org/D70920
*	[AArch64][SVE] Implement shift intrinsics	Kerry McLaughlin	2019-12-03	5	-21/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds the following intrinsics: - asr & asrd - insr - lsl & lsr This patch also adds a new AArch64ISD node (INSR) to represent the int_aarch64_sve_insr intrinsic. Reviewers: huntergr, sdesmalen, dancgr, mgudim, rengolin, efriedma Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70437
*	[CodeGen] Move ARMCodegenPrepare to TypePromotion	Sam Parker	2019-12-03	6	-175/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert ARMCodeGenPrepare into a generic type promotion pass by: - Removing the insertion of arm specific intrinsics to handle narrow types as we weren't using this. - Removing ARMSubtarget references. - Now query a generic TLI object to know which types should be promoted and what they should be promoted to. - Move all codegen tests into Transforms folder and testing using opt and not llc, which is how they should have been written in the first place... The pass searches up from icmp operands in an attempt to safely promote types so we can avoid generating unnecessary unsigned extends during DAG ISel. Differential Revision: https://reviews.llvm.org/D69556
*	[DWARF] Add support for parsing/dumping section indices in location lists	Pavel Labath	2019-12-03	3	-32/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This does exactly what it says on the box. The only small gotcha is the section index computation for offset_pair entries, which can use either the base address section, or the section from the offset_pair entry. This is to support both the cases where the base address is relocated (points to the base of the CU, typically), and the case where the base address is a constant (typically zero) and relocations are on the offsets themselves. Reviewers: dblaikie, JDevlieghere, aprantl, SouraVX Subscribers: hiraditya, llvm-commits, probinson Tags: #llvm Differential Revision: https://reviews.llvm.org/D70540
*	[asan] Remove debug locations from alloca prologue instrumentation	Johannes Altmanninger	2019-12-03	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes https://llvm.org/PR26673 "Wrong debugging information with -fsanitize=address" where asan instrumentation causes the prologue end to be computed incorrectly: findPrologueEndLoc, looks for the first instruction with a debug location to determine the prologue end. Since the asan instrumentation instructions had debug locations, that prologue end was at some instruction, where the stack frame is still being set up. There seems to be no good reason for extra debug locations for the asan instrumentations that set up the frame; they don't have a natural source location. In the debugger they are simply located at the start of the function. For certain other instrumentations like -fsanitize-coverage=trace-pc-guard the same problem persists - that might be more work to fix, since it looks like they rely on locations of the tracee functions. This partly reverts aaf4bb239487e0a3b20a8eaf94fc641235ba2c29 "[asan] Set debug location in ASan function prologue" whose motivation was to give debug location info to the coverage callback. Its test only ensures that the call to @__sanitizer_cov_trace_pc_guard is given the correct source location; as the debug location is still set in ModuleSanitizerCoverage::InjectCoverageAtBlock, the test does not break. So -fsanitize-coverage is hopefully unaffected - I don't think it should rely on the debug locations of asan-generated allocas. Related revision: 3c6c14d14b40adfb581940859ede1ac7d8ceae7a "ASAN: Provide reliable debug info for local variables at -O0." Below is how the X86 assembly version of the added test case changes. We get rid of some .loc lines and put prologue_end where the user code starts. ```diff --- 2.master.s 2019-12-02 12:32:38.982959053 +0100 +++ 2.patch.s 2019-12-02 12:32:41.106246674 +0100 @@ -45,8 +45,6 @@ .cfi_offset %rbx, -24 xorl %eax, %eax movl %eax, %ecx - .Ltmp2: - .loc 1 3 0 prologue_end # 2.c:3:0 cmpl $0, __asan_option_detect_stack_use_after_return movl %edi, 92(%rbx) # 4-byte Spill movq %rsi, 80(%rbx) # 8-byte Spill @@ -57,9 +55,7 @@ callq __asan_stack_malloc_0 movq %rax, 72(%rbx) # 8-byte Spill .LBB1_2: - .loc 1 0 0 is_stmt 0 # 2.c:0:0 movq 72(%rbx), %rax # 8-byte Reload - .loc 1 3 0 # 2.c:3:0 cmpq $0, %rax movq %rax, %rcx movq %rax, 64(%rbx) # 8-byte Spill @@ -72,9 +68,7 @@ movq %rax, %rsp movq %rax, 56(%rbx) # 8-byte Spill .LBB1_4: - .loc 1 0 0 # 2.c:0:0 movq 56(%rbx), %rax # 8-byte Reload - .loc 1 3 0 # 2.c:3:0 movq %rax, 120(%rbx) movq %rax, %rcx addq $32, %rcx @@ -99,7 +93,6 @@ movb %r8b, 31(%rbx) # 1-byte Spill je .LBB1_7 # %bb.5: - .loc 1 0 0 # 2.c:0:0 movq 40(%rbx), %rax # 8-byte Reload andq $7, %rax addq $3, %rax @@ -118,7 +111,8 @@ movl %ecx, (%rax) movq 80(%rbx), %rdx # 8-byte Reload movq %rdx, 128(%rbx) - .loc 1 4 3 is_stmt 1 # 2.c:4:3 +.Ltmp2: + .loc 1 4 3 prologue_end # 2.c:4:3 movq %rax, %rdi callq f movq 48(%rbx), %rax # 8-byte Reload ``` Reviewers: eugenis, aprantl Reviewed By: eugenis Subscribers: ormris, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70894
*	ImplicitNullChecks: Don't add a dead definition of DepMI as live-in	Jonas Paulsson	2019-12-03	1	-1/+1
\| \| \| \| \| \| \| \|	This is one of the fixes needed to reapply D68267 which improves verification of live-in lists. Review: craig.topper https://reviews.llvm.org/D70434
*	[LiveDebugValues] Introduce entry values of unmodified params	Djordje Todorovic	2019-12-03	1	-78/+301
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The idea is to remove front-end analysis for the parameter's value modification and leave it to the value tracking system. Front-end in some cases marks a parameter as modified even the line of code that modifies the parameter gets optimized, that implies that this will cover more entry values even. In addition, extending the support for modified parameters will be easier with this approach. Since the goal is to recognize if a parameter’s value has changed, the idea at very high level is: If we encounter a DBG_VALUE other than the entry value one describing the same variable (parameter), we can assume that the variable’s value has changed and we should not track its entry value any more. That would be ideal scenario, but due to various LLVM optimizations, a variable’s value could be just moved around from one register to another (and there will be additional DBG_VALUEs describing the same variable), so we have to recognize such situation (otherwise, we will lose a lot of entry values) and salvage the debug entry value. Differential Revision: https://reviews.llvm.org/D68209
*	[NFC] Tidy-ups to TimeProfiler.cpp	Russell Gallop	2019-12-03	1	-10/+8
\| \| \| \| \| \| \| \|	Remove unused include Make fields const where possible Move initialisation to initialiser list Differential Revision: https://reviews.llvm.org/D70904
*	[MachineVerifier] Improve checks of target instructions operands.	Jonas Paulsson	2019-12-03	1	-7/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While working with a patch for instruction selection, the splitting of a large immediate ended up begin treated incorrectly by the backend. Where a register operand should have been created, it instead became an immediate. To my surprise the machine verifier failed to report this, which at the time would have been helpful. This patch improves the verifier so that it will report this type of error. This patch XFAILs CodeGen/SPARC/fp128.ll, which has been reported at https://bugs.llvm.org/show_bug.cgi?id=44091 Review: thegameg, arsenm, fhahn https://reviews.llvm.org/D63973
*	[LegalizeDAG] Return true from ExpandNode for some nodes that don't have ↵	Craig Topper	2019-12-02	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	expand support. These nodes have a FIXME that they only get here because a Custom handler returned SDValue() instead of the original Op. Even though we aren't expanding them, we should return true here to prevent ConvertNodeToLibcall from also trying to process them until the FIXME has been addressed. I'm hoping to add checking to ConvertNodeToLibcall to make sure we don't give it nodes it doesn't have support for.
*	[LegalizeDAG] When expanding vector SRA/SRL/SHL add the new BUILD_VECTOR to ↵	Craig Topper	2019-12-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the Results vector instead of just calling ReplaceNode The code that processes the Results vector also calls ReplaceNode and makes ExpandNode return true. If we don't add it to the Results node, we end up returning false from ExpandNode. This causes ConvertNodeToLibcall to be called next. But ConvertNodeToLibcall doesn't do anything for shifts so they just pass through unmodified. Except for printing a debug message. Ultimately, I'd like to add more checks to ExpandNode and ConvertNodeToLibcall to make sure we don't have nodes marked as Expand that don't have any Expand or libcall handling.
*	[NFC][PowerPC] Add the inheritable and additional features to make the ↵	QingShan Zhang	2019-12-03	1	-39/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	processor definition more clear The old processor design assume that, all the old processor's feature must be inherited into future processor. That is not true as instruction fusion or some implementation defined features are not inheritable. What this patch did: * Rename the old "specific features" to "additional features" that keep the new added inheritable features. * Use the "specific features" to keep those features only for specific processor. * Add the "inheritable features" to keep all the features that inherited from early processor. Differential Revision: https://reviews.llvm.org/D70768
*	Recommit "[DWARF5]Addition of alignment atrribute in typedef DIE."	Sourabh Singh Tomar	2019-12-03	3	-7/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This revision is revised to update Go-bindings and Release Notes. The original commit message follows. This patch, adds support for DW_AT_alignment[DWARF5] attribute, to be emitted with typdef DIE. When explicit alignment is specified. Patch by Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok, deadalinx Differential Revision: https://reviews.llvm.org/D70111
*	[DebugInfo] Support for debug_macinfo.dwo section in llvm and llvm-dwarfdump.	Sourabh Singh Tomar	2019-12-03	4	-2/+45
\| \| \| \| \| \| \| \| \| \| \|	This patch adds support for debug_macinfo.dwo section[pre-standardized] to llvm and llvm-dwarfdump. Reviewers: probinson, dblaikie, aprantl, jini.susan.george, alok Differential Revision: https://reviews.llvm.org/D70705 Tags: #debug-info #llvm