bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[mips] Add a 'generic' Mips CPU	Miloš Stojanović	2019-11-21	1	-0/+1
\| \| \| \| \| \| \|	Having a generic CPU removes a warning when creating a subtarget without the CPU being explicitly specified. Differential Revision: https://reviews.llvm.org/D70490
*	[mips] Remove unused `IsPCRelativeLoad` MIPS instructions attribute. NFC	Simon Atanasyan	2019-11-21	3	-11/+3
\| \| \| \|	This attribute is always set to zero.
*	[AMDGPU] Add attribute for target loop unroll threshold default	Tim Corringham	2019-11-21	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add a function attribute to allow the target specific default loop unroll threshold to be specified on a per-function basis. This allows a front-end to give guidance where it has insight that is not available to the back-end, while still allowing the target specific heuristics to also have an effect. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68873
*	[X86] Fix i16->f128 sitofp to promote the i16 to i32 before trying to form a libcall.	Craig Topper	2019-11-20	1	-8/+9
\| \| \| \| \| \|	Previously one of the test cases added here gave an error.
*	[X86] Fix f128->i16 fptosi to promote the i16 to i32 before trying to form a libcall.	Craig Topper	2019-11-20	1	-15/+16
\| \| \| \| \| \|	Previously one of the test cases added here gave an error.
*	Revert "[AArch64] Add the pipeline model for Exynos M5"	Eric Christopher	2019-11-20	2	-1014/+1
\| \| \| \| \| \|	as it's causing test failures in llvm-mca. This reverts commit 9bdfee2a3bd13d405ce1592930182f23849d2897.
*	[BPF] Fix a bug in peephole optimization	Yonghong Song	2019-11-20	1	-21/+59
\|	One of the current peephole optimizations is to remove SLL/SRL if the sub register has been zero extended. This phase has two bugs and one limitation.
\|
\|	First, for the physical subregister used in pseudo insn COPY like below, it permits incorrect optimization.
\|	    %0:gpr32 = COPY $w0
\|	    ...
\|	    %4:gpr = MOV_32_64 %0:gpr32
\|	    %5:gpr = SLL_ri %4:gpr(tied-def 0), 32
\|	    %6:gpr = SRA_ri %5:gpr(tied-def 0), 32
\|	The $w0 could be from the return value of a previous function call and its upper 32-bit value might contain some non-zero values. The same applies to function arguments.
\|
\|	Second, the current code may permit removing SLL/SRA like below:
\|	    %0:gpr32 = COPY $w0
\|	    %1:gpr32 = COPY %0:gpr32
\|	    ...
\|	    %4:gpr = MOV_32_64 %1:gpr32
\|	    %5:gpr = SLL_ri %4:gpr(tied-def 0), 32
\|	    %6:gpr = SRA_ri %5:gpr(tied-def 0), 32
\|	The reason is that it did not follow the def-use chain to skip all intermediate 32bit-to-32bit COPY instructions.
\|
\|	The current implementation is also very conservative for PHI instructions. If any PHI insn component is another PHI or COPY insn, it will just permit SLL/SRA.
\|
\|	This patch fixes the issue as follows:
\|	 - During def/use chain traversal, if any physical register is read, SLL/SRA will be preserved, as these physical registers are mostly from function return values or current function arguments.
\|	 - Recursively visit all COPY and PHI instructions.
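The traversal rule described in this message can be illustrated with a small standalone model. This is only a sketch and not the patch's code: `Def`, `Kind`, `upperBitsKnownZero` and `canRemoveShiftPair` are hypothetical names, instructions are reduced to vector indices instead of LLVM's `MachineInstr`, and the model assumes that any definition other than COPY or PHI already zero-extends its result.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy stand-in for a defining instruction; this is not LLVM's MachineInstr.
enum class Kind { Copy, Phi, Other };

struct Def {
  Kind kind;
  bool readsPhysReg;                // true for something like `COPY $w0`
  std::vector<std::size_t> sources; // indices of the defs feeding a COPY/PHI
};

// Returns true if, in this toy model, the upper 32 bits of the value defined
// at `idx` can be treated as zero.
static bool upperBitsKnownZero(const std::vector<Def> &defs, std::size_t idx,
                               std::set<std::size_t> &visitedPhis) {
  assert(idx < defs.size() && "index must name an existing def");
  const Def &d = defs[idx];
  // Any read of a physical register blocks the optimization: the value may be
  // a call result or an incoming argument with unknown upper bits.
  if (d.readsPhysReg)
    return false;
  // Assumption of this sketch: every non-COPY, non-PHI def zero-extends.
  if (d.kind == Kind::Other)
    return true;
  // A PHI that is already part of the walk is not revisited, so loops in the
  // def-use graph cannot cause infinite recursion.
  if (d.kind == Kind::Phi && !visitedPhis.insert(idx).second)
    return true;
  // Recursively visit every source of a COPY or PHI.
  for (std::size_t src : d.sources)
    if (!upperBitsKnownZero(defs, src, visitedPhis))
      return false;
  return true;
}

// The shift pair may be dropped only if the whole chain is known to be safe.
bool canRemoveShiftPair(const std::vector<Def> &defs, std::size_t idx) {
  std::set<std::size_t> visitedPhis;
  return upperBitsKnownZero(defs, idx, visitedPhis);
}
```

In this model, a chain of COPYs that ends in a physical-register read keeps its shifts, which corresponds to the second case in the message, while a PHI whose inputs are all ordinary 32-bit definitions does not.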
*	[AArch64] Add the pipeline model for Exynos M5	Evandro Menezes	2019-11-20	2	-1/+1014
\| \| \| \|	Add the scheduling and cost models for Exynos M5.
*	[AMDGPU][SILoadStoreOptimizer] Merge TBUFFER loads/stores	Piotr Sobczak	2019-11-20	4	-8/+444
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Extend SILoadStoreOptimizer to merge tbuffer loads and stores. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69794
*	[X86] Mark vector STRICT_FP_ROUND as Legal instead of Custom.	Craig Topper	2019-11-20	1	-3/+9
\| \| \| \| \| \| \| \| \|	The Custom handler doesn't do anything for these nodes anyway. SelectionDAGISel won't mutate them if they are Legal or Custom. X86 has custom code for mutating them due to missing isel patterns. When the isel patterns are added Legal will be the right answer. So go ahead and change it now since that's where we'll end up.
*	[AMDGPU] Keep consistent check of legal addressing mode.	Michael Liao	2019-11-20	2	-14/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Add test cases for GFX10, which has narrower offset range compared to GFX9. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70473
*	[SelectionDAG][X86] Mutate strictFP nodes to non-strict in DoInstructionSelection when the node is marked Expand rather than when it is not Legal.	Craig Topper	2019-11-20	1	-0/+7
\| \| \| \| \| \| \| \| \| \|	This allows operations that are marked Custom, but have some type combinations that are legal to get past this code. Add custom mutation code to X86's Select function for the nodes that don't have isel patterns yet.
*	[AIX][XCOFF] Add support for generating assembly code for one-byte mergeable strings	Xing Xue	2019-11-20	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for generating assembly code for one-byte mergeable strings. Generating assembly code for multi-byte mergeable strings and the `XCOFF` object code for mergeable strings will be supported later. Reviewers: hubert.reinterpretcast, jasonliu, daltenty, sfertile, DiggerLin, Xiangling_L Reviewed by: daltenty Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70310
*	[AIX] Lowering jump table, constant pool and block address in asm	Xiangling Liao	2019-11-20	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	This patch lowers jump tables, constant pools and block addresses in assembly. 1. On AIX, the jump table index is always relative; 2. Put CPI and JTI into ReadOnlySection until we support unique data sections; 3. Create the temp symbol for the block address symbol; 4. Update MIR testcases and add the related assembly part. Differential Revision: https://reviews.llvm.org/D70243
*	[AMDGPU][GFX10] Disabled v_movrel*[sdwa\|dpp] opcodes in codegen	Dmitry Preobrazhensky	2019-11-20	2	-0/+27
\| \| \| \| \| \| \| \|	These opcodes use indirect register addressing so they need special handling by codegen (currently missing). Reviewers: vpykhtin, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D70400
*	[mips] Define mem_simm# operands using tblgen `foreach` loop. NFC	Simon Atanasyan	2019-11-20	1	-29/+5
\|
*	[ARM][MVE] Select vqabs	Anna Welker	2019-11-20	1	-0/+35
\| \| \| \| \| \| \| \|	Adds a pattern to ARMInstrMVE.td to use a VQABS instruction if an equivalent multi-instruction construct is found. Differential revision: https://reviews.llvm.org/D70181
*	[mips] Put conditions when we need to expand memory operand into a separate function. NFC	Simon Atanasyan	2019-11-20	1	-29/+36
\| \| \| \| \| \| \| \|	`expandMemInst` expects an instruction with 3 or 4 operands, where the last operand requires expanding. It's redundant to scan all operands in a loop. We can check just the last operand.
*	[mips] Make MipsAsmParser::isEvaluated static function. NFC	Simon Atanasyan	2019-11-20	1	-21/+20
\|
*	[AMDGPU][DPP] Corrected DPP combiner	Dmitry Preobrazhensky	2019-11-20	1	-6/+9
\| \| \| \| \| \| \| \|	Added a check to make sure that the selected dpp opcode is supported by target. Reviewers: vpykhtin, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D70402
*	[AMDGPU] add support for hostcall buffer pointer as hidden kernel argument	Sameer Sahasrabuddhe	2019-11-20	2	-2/+21
\| \| \| \| \| \| \| \| \| \| \|	Hostcall is a service that allows a kernel to submit requests to the host using shared buffers, and block until a response is received. This will eventually replace the shared buffer currently used for printf, and repurposes the same hidden kernel argument. This change introduces a new ValueKind in the HSA metadata to represent the hostcall buffer. Differential Revision: https://reviews.llvm.org/D70038
*	ExecutionEngine: add preliminary support for COFF ARM64	Adam Kallai	2019-11-20	1	-1/+4
\| \| \| \|	Differential Revision: https://reviews.llvm.org/D69434
*	AMDGPU/GlobalISel: Legalize FDIV64	Austin Kerbow	2019-11-19	2	-0/+87
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70403
*	[musttail] Don't forward AL on Win64	Reid Kleckner	2019-11-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	AL is only used for varargs on SysV platforms. Don't forward it on Windows. This allows control flow guard to set up an extra hidden parameter in RAX, as described in PR44049. This also has the effect of freeing up RAX for use in virtual member pointer thunks, which may also be a nice little code size improvement on Win64. Fixes PR44049 Reviewers: ajpaverd, efriedma, hans Differential Revision: https://reviews.llvm.org/D70413
*	[LegalizeDAG][X86] Enable STRICT_FP_TO_SINT/UINT to be promoted	Craig Topper	2019-11-19	1	-4/+7
\| \| \| \|	Differential Revision: https://reviews.llvm.org/D70220
*	[X86] Add custom type legalization and lowering for scalar STRICT_FP_TO_SINT/UINT	Craig Topper	2019-11-19	2	-30/+119
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a first pass at Custom lowering for these operations. I also updated some of the vector code where it was obviously easy and straightforward. More work needed in follow up. This enables these operations to be handled with X87 where special rounding control adjustments are needed to perform a truncate. Still need to fix Promotion in the target independent code in LegalizeDAG. llrint/llround split into separate test file because we can't make a strict libcall properly yet either and we need to do that when i64 isn't a legal type. This does not include any isel support. So we still rely on the mutation in SelectionDAGISel to remove the strict from this stuff later. Except for the X87 stuff which goes through custom nodes that already had chains. Differential Revision: https://reviews.llvm.org/D70214
*	[ARC] Add InitializePasses header to fix ARC build.	Pete Couperus	2019-11-19	2	-0/+2
\|
*	MTE: add more unchecked instructions.	Evgenii Stepanov	2019-11-19	1	-3/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In particular, 1- and 2-byte loads and stores ignore the pointer tag when using SP as the base register. Reviewers: pcc, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70341
*	[ARM] MVE interleaving load and stores.	David Green	2019-11-19	3	-41/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have the intrinsics, we can add VLD2/4 and VST2/4 lowering for MVE. This works the same way as Neon, recognising the load/shuffles combination and converting them into intrinsics in a pre-isel pass, which just calls getMaxSupportedInterleaveFactor, lowerInterleavedLoad and lowerInterleavedStore. The main difference to Neon is that we do not have a VLD3 instruction. Otherwise most of the code works very similarly, with just some minor differences in the form of the intrinsics to work around. VLD3 is disabled by making isLegalInterleavedAccessType return false for those cases. We may need some other future adjustments, such as VLD4 taking up half the available registers, so it should maybe cost more. This patch should get the basics in though. Differential Revision: https://reviews.llvm.org/D69392
*	[ARM,MVE] Add intrinsics for scalar shifts.	Simon Tatham	2019-11-19	2	-8/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fills in the small family of MVE intrinsics that have nothing to do with vectors: they implement bit-shift operations on 32- or 64-bit values held in one or two general-purpose registers. Most of these shift operations saturate if shifting left, and round to nearest if shifting right, although LSLL and ASRL behave like ordinary shifts. When these instructions take a variable shift count in a register, they pay attention to its sign, so that (for example) LSLL or UQRSHLL will shift left if given a positive number but right if given a negative one. That makes even LSLL and ASRL different enough from standard LLVM IR shift semantics that I couldn't see any better alternative than to simply model the whole family as a set of MVE-specific IR intrinsics. (The //immediate// forms of LSLL and ASRL, on the other hand, do behave exactly like a standard IR shift of a 64-bit value. In fact, those forms don't have ACLE intrinsics defined at all, because you can just write an ordinary C shift operation if you want one of those.) The 64-bit shifts have to be instruction-selected in C++, because they deliver two output values. But the 32-bit ones are simple enough that I could write a DAG isel pattern directly into each Instruction record. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70319
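The signed shift count described in this message can be modelled in a few lines. This is a sketch under stated assumptions, not the instruction's architectural definition: `lsllModel` is a hypothetical name, the 64-bit value is held in one `uint64_t` rather than a register pair, and only counts with magnitude below 64 are covered.

```cpp
#include <cassert>
#include <cstdint>

// Models a 64-bit logical shift whose register count is signed:
// a positive count shifts left, a negative count shifts right.
// Only |count| < 64 is modelled; the behaviour of larger counts is
// deliberately left out of this sketch.
std::uint64_t lsllModel(std::uint64_t value, int count) {
  assert(count > -64 && count < 64 && "sketch covers only |count| < 64");
  if (count >= 0)
    return value << count; // ordinary logical shift left
  return value >> -count;  // negative count: logical shift right instead
}
```

For example, `lsllModel(1, 4)` yields 16 and `lsllModel(16, -4)` yields 1. A plain LLVM IR `shl` has no such direction reversal, and a shift amount that is not smaller than the bit width produces poison, which is consistent with the message's choice to model the family as dedicated intrinsics.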
*	AMDGPU: Refactor treatment of denormal mode	Matt Arsenault	2019-11-19	18	-78/+141
\| \| \| \| \| \| \| \| \| \| \|	Start moving towards treating this as a property of the calling convention, and not the subtarget. The default denormal mode should not be part of the subtarget, and should be moved into a separate function attribute. This patch is still NFC. The denormal mode remains a subtarget feature for now, but this makes the necessary changes to switch to using an attribute.
*	AMDGPU: Be explicit about denormal mode in MIR tests	Matt Arsenault	2019-11-19	1	-10/+16
\| \| \| \| \| \| \|	Start checking the machine function in GlobalISel instead of the target directly. This temporarily breaks fcanonicalize selection in GlobalISel.
*	DAG: Add function context to isFMAFasterThanFMulAndFAdd	Matt Arsenault	2019-11-19	15	-21/+50
\| \| \| \| \| \| \| \|	AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget. AArch64 had to make this more complicated by using this from an IR hook, so add an IR typed overload.
*	[AMDGPU] Tune inlining parameters for AMDGPU target (part 2)	dfukalov	2019-11-19	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Most IR instructions got better code size estimations after commit 47a5c36b. So default parameter values should be updated to improve inlining and unrolling for the target. Reviewers: rampitec, arsenm Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70391
*	[X86][SSE] Remove XFormVExtractWithShuffleIntoLoad to prevent legalization infinite loops (PR43971)	Simon Pilgrim	2019-11-19	1	-122/+2
\| \| \| \| \| \| \| \|	As detailed in PR43971/D70267, the use of XFormVExtractWithShuffleIntoLoad causes issues where we end up in infinite loops of extract(targetshuffle(vecload)) -> extract(shuffle(vecload)) -> extract(vecload) -> extract(targetshuffle(vecload)). There are so many legalization checks at every stage that we can't guarantee that extract(shuffle(vecload)) -> scalarload can occur. At the moment we see a number of minor regressions as we don't fold extract(shuffle(vecload)) -> scalarload before legal ops; these can be addressed in future patches and by extending X86ISelLowering's combineExtractWithShuffle.
*	[mips] Join MipsMemSimmXXXAsmOperand into a single template class. NFC	Simon Atanasyan	2019-11-19	2	-60/+14
\|
*	[ARM][MVE] Enable narrow vectors for tail pred	Sam Parker	2019-11-19	1	-1/+1
\| \| \| \| \| \| \| \|	Remove the restriction, from the MVE tail predication pass, that all masked vector instructions need to be 128 bits. This allows us to support extending loads and truncating stores. Differential Revision: https://reviews.llvm.org/D69946
*	[ARM][MVE] Tail predication conversion	Sam Parker	2019-11-19	2	-134/+295
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch modifies ARMLowOverheadLoops to convert a predicated vector low-overhead loop into a tail-predicated one. This is currently a very basic conversion, with the following restrictions: - Operates only on single block loops. - The loop can only contain a single vctp instruction. - No other instructions can write to the vpr. - We only allow a subset of the mve instructions in the loop. TODO: Pass the number of elements, not the number of iterations to dlstp/wlstp. Differential Revision: https://reviews.llvm.org/D69945
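The idea behind tail predication can be shown with a scalar simulation. This is a sketch only and not the pass's implementation: `kLanes`, `lanePredicate` and `addOneTailPredicated` are hypothetical names, the vector is assumed to have four 32-bit lanes, and the predicate is assumed to activate exactly the lanes whose index is below the number of remaining elements.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kLanes = 4; // assumed: four 32-bit lanes per vector

// Models a vctp-style predicate: lane i is active while i < remaining.
std::array<bool, kLanes> lanePredicate(std::size_t remaining) {
  std::array<bool, kLanes> pred{};
  for (std::size_t lane = 0; lane < kLanes; ++lane)
    pred[lane] = lane < remaining;
  return pred;
}

// Adds 1 to every element. The final, partial vector is handled by the
// predicate instead of a separate scalar tail loop.
void addOneTailPredicated(std::vector<int> &data) {
  for (std::size_t base = 0; base < data.size(); base += kLanes) {
    const auto pred = lanePredicate(data.size() - base);
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
      if (!pred[lane])
        continue; // inactive lane: no memory access, no update
      assert(base + lane < data.size()); // the predicate keeps us in bounds
      data[base + lane] += 1;
    }
  }
}
```

The loop is driven by the element count rather than by a precomputed iteration count, which is the same quantity the TODO in the message proposes passing to dlstp/wlstp.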
*	[PowerPC] Improve float vector gather codegen	Stefan Pintilie	2019-11-18	1	-2/+36
\| \| \| \| \| \| \| \| \| \|	This patch aims to improve the code generation for float vector gather on POWER9. Patterns have been implemented to utilize instructions that deliver improved performance. Patch by: Kamau Bridgeman Differential Revision: https://reviews.llvm.org/D62908
*	[X86] Add a 'break;' to the end of the last case in a switch to avoid surprising the next person to add a case after this one. NFC	Craig Topper	2019-11-18	1	-0/+2
\|
*	arm64_32: support function return in FastISel.	Tim Northover	2019-11-18	1	-5/+8
\|
*	[AMDGPU][MC][GFX10] Enabled v_movrel*[sdwa\|dpp\|dpp8] opcodes	Dmitry Preobrazhensky	2019-11-18	2	-41/+62
\| \| \| \| \| \| \| \|	See https://bugs.llvm.org/show_bug.cgi?id=43712 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D70170
*	[SVE][CodeGen] Scalable vector MVT size queries	Graham Hunter	2019-11-18	8	-17/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Implements scalable size queries for MVTs, split out from D53137. * Contains a fix for FindMemType to avoid using scalable vector type to contain non-scalable types. * Explicit casts for several places where implicit integer sign changes or promotion from 32 to 64 bits caused problems. * CodeGenDAGPatterns will treat scalable and non-scalable vector types as different. Reviewers: greened, cameron.mcinally, sdesmalen, rovka Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D66871
*	Fix signed/unsigned comparison warning. NFCI.	Simon Pilgrim	2019-11-18	1	-1/+1
\|
*	[RISCV] Add assembly mnemonic spell checking	Simon Cook	2019-11-18	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows the assembler to suggest alternative assembly mnemonics when an invalid one has been provided. Reviewers: asb, lenary, lewis-revill Reviewed By: asb Subscribers: hiraditya, rbar, johnrusso, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69894
*	[ARM] Allocatable Global Register Variables for ARM	Anna Welker	2019-11-18	8	-22/+68
\| \| \| \| \| \| \| \| \| \| \| \|	Provides support for using r6-r11 as globally scoped register variables. This requires a -ffixed-rN flag in order to reserve rN against general allocation. If for a given GRV declaration the corresponding flag is not found, or the register in question is the target's FP, we fail with a diagnostic. Differential Revision: https://reviews.llvm.org/D68862
*	[Sparc] Fix "Cannot select" error for AtomicFence on 32-bit V9	James Clarke	2019-11-18	2	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This also adds testing of 32-bit V9 atomic lowering, splitting the 64-bit-only tests out into their own file. Reviewers: venkatra, jyknight Reviewed By: jyknight Subscribers: hiraditya, fedor.sergeev, jfb, llvm-commits, glaubitz Tags: #llvm Differential Revision: https://reviews.llvm.org/D69352
*	[PowerPC] extend PPCPreIncPrep Pass for ds/dq form	czhengsz	2019-11-17	1	-54/+338
\| \| \| \| \| \| \| \| \| \|	Now, PPCPreIncPrep pass changes a loop to update form and update all load/store with same base accordingly. We can do more for load/store with same base, for example, convert load/store with same base to ds/dq form. Reviewed by: jsji Differential Revision: https://reviews.llvm.org/D67088
*	[mips] Remove redundant cast. NFC	Simon Atanasyan	2019-11-16	1	-10/+7
\|
*	[mips] Remove old FIXME comment. NFC	Simon Atanasyan	2019-11-16	1	-2/+0
\| \| \| \|	The issue was fixed at r275050.