bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[ARM,MVE] Add intrinsics for scalar shifts.	Simon Tatham	2019-11-19	2	-8/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fills in the small family of MVE intrinsics that have nothing to do with vectors: they implement bit-shift operations on 32- or 64-bit values held in one or two general-purpose registers. Most of these shift operations saturate if shifting left, and round to nearest if shifting right, although LSLL and ASRL behave like ordinary shifts. When these instructions take a variable shift count in a register, they pay attention to its sign, so that (for example) LSLL or UQRSHLL will shift left if given a positive number but right if given a negative one. That makes even LSLL and ASRL different enough from standard LLVM IR shift semantics that I couldn't see any better alternative than to simply model the whole family as a set of MVE-specific IR intrinsics. (The //immediate// forms of LSLL and ASRL, on the other hand, do behave exactly like a standard IR shift of a 64-bit value. In fact, those forms don't have ACLE intrinsics defined at all, because you can just write an ordinary C shift operation if you want one of those.) The 64-bit shifts have to be instruction-selected in C++, because they deliver two output values. But the 32-bit ones are simple enough that I could write a DAG isel pattern directly into each Instruction record. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70319
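To illustrate the sign-dependent behaviour described in the entry above, here is a minimal C++ sketch of a 64-bit logical shift whose direction follows the sign of the count. The name `lsll_model` is hypothetical, and treating counts of magnitude 64 or more as producing zero is an assumption of this sketch, not a statement of the architectural definition of LSLL.

```cpp
#include <cstdint>

// Illustrative model only (hypothetical helper, not part of LLVM or ACLE).
// A positive count shifts left; a negative count shifts right by its
// magnitude. Counts with magnitude >= 64 yield 0 here, which is an
// assumption made to keep the C++ shifts well defined.
constexpr uint64_t lsll_model(uint64_t value, int32_t count) {
  return count >= 64  ? 0
         : count >= 0 ? value << count
         : count > -64 ? value >> -count
                       : 0;
}
```

A standard IR `shl` has no such direction switch (its count is never read as a signed value), which is the mismatch the commit message gives as the reason for using target-specific intrinsics.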
*	AMDGPU: Refactor treatment of denormal mode	Matt Arsenault	2019-11-19	18	-78/+141
\| \| \| \| \| \| \| \| \| \| \|	Start moving towards treating this as a property of the calling convention, and not the subtarget. The default denormal mode should not be part of the subtarget, and should be moved into a separate function attribute. This patch is still NFC. The denormal mode remains as a subtarget feature for now, but this makes the necessary changes to switch to using an attribute.
*	AMDGPU: Be explicit about denormal mode in MIR tests	Matt Arsenault	2019-11-19	1	-10/+16
\| \| \| \| \| \| \|	Start checking the machine function in GlobalISel instead of the target directly. This temporarily breaks fcanonicalize selection in GlobalISel.
*	DAG: Add function context to isFMAFasterThanFMulAndFAdd	Matt Arsenault	2019-11-19	15	-21/+50
\| \| \| \| \| \| \| \|	AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget. AArch64 had to make this more complicated by using this from an IR hook, so add an IR typed overload.
*	[AMDGPU] Tune inlining parameters for AMDGPU target (part 2)	dfukalov	2019-11-19	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Most IR instructions got better code size estimates after commit 47a5c36b, so the default parameter values should be updated to improve inlining and unrolling for the target. Reviewers: rampitec, arsenm Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70391
*	[X86][SSE] Remove XFormVExtractWithShuffleIntoLoad to prevent legalization infinite loops (PR43971)	Simon Pilgrim	2019-11-19	1	-122/+2
\| \| \| \| \| \| \| \|	As detailed in PR43971/D70267, the use of XFormVExtractWithShuffleIntoLoad causes issues where we end up in infinite loops of extract(targetshuffle(vecload)) -> extract(shuffle(vecload)) -> extract(vecload) -> extract(targetshuffle(vecload)). There are just too many legalization checks at every stage, so we can't guarantee that extract(shuffle(vecload)) -> scalarload can occur. At the moment we see a number of minor regressions because we don't fold extract(shuffle(vecload)) -> scalarload before legal ops; these can be addressed in future patches and by extending X86ISelLowering's combineExtractWithShuffle.
*	[mips] Join MipsMemSimmXXXAsmOperand into a single template class. NFC	Simon Atanasyan	2019-11-19	2	-60/+14
\|
*	[ARM][MVE] Enable narrow vectors for tail pred	Sam Parker	2019-11-19	1	-1/+1
\| \| \| \| \| \| \| \|	Remove the restriction, from the MVE tail predication pass, that all masked vector instructions need to be 128 bits wide. This allows us to support extending loads and truncating stores. Differential Revision: https://reviews.llvm.org/D69946
*	[ARM][MVE] Tail predication conversion	Sam Parker	2019-11-19	2	-134/+295
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch modifies ARMLowOverheadLoops to convert a predicated vector low-overhead loop into a tail-predicated one. This is currently a very basic conversion, with the following restrictions: - Operates only on single block loops. - The loop can only contain a single vctp instruction. - No other instructions can write to the vpr. - We only allow a subset of the mve instructions in the loop. TODO: Pass the number of elements, not the number of iterations to dlstp/wlstp. Differential Revision: https://reviews.llvm.org/D69945
*	[PowerPC] Improve float vector gather codegen	Stefan Pintilie	2019-11-18	1	-2/+36
\| \| \| \| \| \| \| \| \| \|	This patch aims to improve the code generation for float vector gather on POWER9. Patterns have been implemented to utilize instructions that deliver improved performance. Patch by: Kamau Bridgeman Differential Revision: https://reviews.llvm.org/D62908
*	[X86] Add a 'break;' to the end of the last case in a switch to avoid surprising the next person to add a case after this one. NFC	Craig Topper	2019-11-18	1	-0/+2
\|
*	arm64_32: support function return in FastISel.	Tim Northover	2019-11-18	1	-5/+8
\|
*	[AMDGPU][MC][GFX10] Enabled v_movrel*[sdwa\|dpp\|dpp8] opcodes	Dmitry Preobrazhensky	2019-11-18	2	-41/+62
\| \| \| \| \| \| \| \|	See https://bugs.llvm.org/show_bug.cgi?id=43712 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D70170
*	[SVE][CodeGen] Scalable vector MVT size queries	Graham Hunter	2019-11-18	8	-17/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Implements scalable size queries for MVTs, split out from D53137. * Contains a fix for FindMemType to avoid using scalable vector type to contain non-scalable types. * Explicit casts for several places where implicit integer sign changes or promotion from 32 to 64 bits caused problems. * CodeGenDAGPatterns will treat scalable and non-scalable vector types as different. Reviewers: greened, cameron.mcinally, sdesmalen, rovka Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D66871
*	Fix signed/unsigned comparison warning. NFCI.	Simon Pilgrim	2019-11-18	1	-1/+1
\|
*	[RISCV] Add assembly mnemonic spell checking	Simon Cook	2019-11-18	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows the assembler to suggest alternative assembly mnemonics when an invalid one has been provided. Reviewers: asb, lenary, lewis-revill Reviewed By: asb Subscribers: hiraditya, rbar, johnrusso, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69894
*	[ARM] Allocatable Global Register Variables for ARM	Anna Welker	2019-11-18	8	-22/+68
\| \| \| \| \| \| \| \| \| \| \| \|	Provides support for using r6-r11 as globally scoped register variables. This requires a -ffixed-rN flag in order to reserve rN against general allocation. If for a given GRV declaration the corresponding flag is not found, or the register in question is the target's FP, we fail with a diagnostic. Differential Revision: https://reviews.llvm.org/D68862
*	[Sparc] Fix "Cannot select" error for AtomicFence on 32-bit V9	James Clarke	2019-11-18	2	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This also adds testing of 32-bit V9 atomic lowering, splitting the 64-bit-only tests out into their own file. Reviewers: venkatra, jyknight Reviewed By: jyknight Subscribers: hiraditya, fedor.sergeev, jfb, llvm-commits, glaubitz Tags: #llvm Differential Revision: https://reviews.llvm.org/D69352
*	[PowerPC] extend PPCPreIncPrep Pass for ds/dq form	czhengsz	2019-11-17	1	-54/+338
\| \| \| \| \| \| \| \| \| \|	Currently, the PPCPreIncPrep pass changes a loop to update form and updates all loads/stores with the same base accordingly. We can do more for loads/stores with the same base, for example, converting them to ds/dq form. Reviewed by: jsji Differential Revision: https://reviews.llvm.org/D67088
*	[mips] Remove redundant cast. NFC	Simon Atanasyan	2019-11-16	1	-10/+7
\|
*	[mips] Remove old FIXME comment. NFC	Simon Atanasyan	2019-11-16	1	-2/+0
\| \| \| \|	The issue was fixed at r275050.
*	MCObjectStreamer: assign MCSymbols in the dummy fragment to offset 0.	James Y Knight	2019-11-16	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In MCObjectStreamer, when there is no current fragment, initially symbols are created in a "pending" state and assigned to a dummy empty fragment. Previously, they were not being assigned an offset, and thus evaluateAbsolute would fail if trying to evaluate an expression 'a - b', where both 'a' and 'b' were in this pending state. Also slightly refactored the EmitLabel overload which takes an MCFragment for clarity. Fixes: https://llvm.org/PR41825 Differential Revision: https://reviews.llvm.org/D70062
*	AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently	Nicolai Hähnle	2019-11-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We should check for same instruction class before checking whether they have the same base address, else we might iterate out of bounds of a MachineInstr operands list. The InstClass check is also cheaper. This was introduced in SVN r373630. Reviewers: tstellar Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68690
*	[RISCV] Handle variable sized objects when the stack needs to be realigned	Shiva Chen	2019-11-16	5	-12/+44
\| \| \| \|	Differential Revision: https://reviews.llvm.org/D68979
*	[WebAssembly] Fix miscompile of select with and	Thomas Lively	2019-11-15	1	-7/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Rolls back the remaining bad optimizations introduced in eb15d00193f. Some of them were already rolled back in e661f946a7db and this finishes the job. Fixes https://bugs.llvm.org/show_bug.cgi?id=44012. Reviewers: dschuff, aheejin Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70347
*	[mips] Enable `la` pseudo instruction on 64-bit arch.	Simon Atanasyan	2019-11-15	1	-5/+1
\| \| \| \| \| \| \|	This patch makes LLVM compatible with GAS. It accepts `la` pseudo instruction on arch with 64-bit pointers and just shows a warning. Differential Revision: https://reviews.llvm.org/D70202
*	[mips] Do not emit R_MIPS_JALR for sym+offset in case of O32 ABI	Simon Atanasyan	2019-11-15	1	-1/+14
\| \| \| \| \| \| \| \| \| \|	O32 ABI uses relocations in REL format. Relocation's addend is written in place. R_MIPS_JALR relocation points to the `jalr` instruction which does not have a place to store the relocation addend. So it's impossible to save non-zero "offset". This patch blocks emission of `R_MIPS_JALR` relocations in such cases. Differential Revision: https://reviews.llvm.org/D70201
*	Add read-only data assembly writing for aix	diggerlin	2019-11-15	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	SUMMARY: The patch will emit read-only variable assembly code for aix. Reviewers: daltenty,Xiangling_Liao Subscribers: rupprecht, seiyai,hiraditya Differential Revision: https://reviews.llvm.org/D70182
*	[ARM,MVE] Add reversed isel patterns for MVE `vcmp qN,rN`	Simon Tatham	2019-11-15	1	-13/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: As well as vector/vector compare instructions, MVE also has a family of comparisons taking a vector and a scalar, which compare every lane of the vector against the same value. We generate those at isel time using isel patterns that match `(ARMvcmp vector, (ARMvdup scalar))`. This commit adds corresponding patterns for the operand-reversed form `(ARMvcmp (ARMvdup scalar), vector)`, with condition codes swapped as necessary. That way, we can still generate the vector/scalar compare instruction if the IR happens to have been rearranged to put the operands the other way round, which can happen in some optimization phases. Previously, a vcmp the other way round was handled by emitting a `vdup` instruction to //explicitly// replicate the scalar input into a vector, and then doing a vector/vector comparison. I haven't added a new test, because it turned out that several existing tests were already exhibiting that failure mode. So just updating the expected output in the existing MVE codegen tests demonstrates what's been improved. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70296
*	[AMDGPU] Lower llvm.amdgcn.s.buffer.load.v3[i\|f]32	Piotr Sobczak	2019-11-15	1	-6/+24
\| \| \| \| \| \| \| \| \| \|	Summary: Add lowering support for 32-bit vec3 variant of s.buffer.load intrinsic. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70118
*	[ARM][MVE] tail-predication	Sjoerd Meijer	2019-11-15	1	-0/+3
\| \| \| \| \| \| \|	This is a follow up of d90804d, to also flag fcmp instructions as instructions that we do not support in tail-predicated vector loops. Differential Revision: https://reviews.llvm.org/D70295
*	[MIPS GlobalISel] Select andi, ori and xori	Petar Avramovic	2019-11-15	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce IntImmLeaf version of PatLeaf immZExt16 for 32-bit immediates. Change immZExt16 with imm32ZExt16 for andi, ori and xori. This keeps the same behavior for SDAG and allows GlobalISel selectImpl to select 'G_CONSTANT imm' + G_AND, G_OR, G_XOR into ANDi, ORi, XORi, respectively, when the 32-bit imm satisfies the imm32ZExt16 predicate: zero extending the 16 low bits of imm is equal to imm. The large number of test changes comes from zero extension of small types, which is transformed into 'and' with a bitmask in the legalizer. Differential Revision: https://reviews.llvm.org/D70185
*	[MIPS GlobalISel] Select addiu	Petar Avramovic	2019-11-15	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	Introduce IntImmLeaf version of PatLeaf immSExt16 for 32-bit immediates. Change immSExt16 with imm32SExt16 for addiu. This keeps same behavior for SDAG and allows for GlobalISel selectImpl to select 'G_CONSTANT imm' + G_ADD into ADDIu when 32-bit imm satisfies imm32SExt16 predicate: sign extending 16 low bits of imm is equal to imm. Differential Revision: https://reviews.llvm.org/D70184
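The two immediate predicates described in the entries above can be sketched in plain C++ as follows. This is only an illustration of the stated conditions ("zero/sign extending the 16 low bits of imm is equal to imm"); the function names `fitsZExt16` and `fitsSExt16` are hypothetical and this is not the TableGen definition used by the patches.

```cpp
#include <cstdint>

// Hypothetical helper: true when zero-extending the low 16 bits of imm
// gives back imm, i.e. imm is in the range [0, 65535].
constexpr bool fitsZExt16(int32_t imm) {
  return (static_cast<uint32_t>(imm) & 0xFFFFu) == static_cast<uint32_t>(imm);
}

// Hypothetical helper: true when sign-extending the low 16 bits of imm
// gives back imm, i.e. imm is in the range [-32768, 32767]. The
// xor/subtract pair performs the sign extension in unsigned arithmetic,
// which wraps modulo 2^32 and is therefore well defined.
constexpr bool fitsSExt16(int32_t imm) {
  return (((static_cast<uint32_t>(imm) & 0xFFFFu) ^ 0x8000u) - 0x8000u) ==
         static_cast<uint32_t>(imm);
}
```

Under this reading, a constant such as 0xFFFF qualifies for the zero-extended forms (andi, ori, xori) but not for addiu, while -1 qualifies for addiu only.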
*	AMDGPU: Change boolean content type to 0 or 1	Matt Arsenault	2019-11-15	4	-8/+15
\| \| \| \| \| \| \| \|	The usage of target boolean checks is overly inflexible, since sext and zext of a compare are equally cheap. The choice is arbitrary, but using 0/1 to some degree is the choice of lower resistance since that's what most targets use. This enables a few combines that don't bother to support ZeroOrNegativeOneBooleanContent.
*	AMDGPU: Try to commute sub of boolean ext	Matt Arsenault	2019-11-15	1	-3/+26
\| \| \| \|	Avoids another regression in a future patch.
*	GlobalISel: Lower s1 source G_SITOFP/G_UITOFP	Matt Arsenault	2019-11-15	3	-48/+2
\|
*	[WinEH] Fix the wrong alignment orientation during calculating EH frame.	Wang, Pengfei	2019-11-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a bug fix for further issues in PR43585. Reviewers: rnk, RKSimon, craig.topper, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D70224
*	Add missing includes needed to prune LLVMContext.h include, NFC	Reid Kleckner	2019-11-14	3	-0/+3
\| \| \| \| \|	These are a pre-requisite to removing #include "llvm/Support/Options.h" from LLVMContext.h: https://reviews.llvm.org/D70280
*	[Hexagon] Validate the iterators before converting them to mux.	Sumanth Gundapaneni	2019-11-14	1	-2/+8
\| \| \| \| \| \|	The conditional instructions that are translated to mux instructions are deleted and the iterators to these deleted instructions are being used later. This patch fixed this issue.
*	[RISCV] Use addi rather than add x0	Sam Elliott	2019-11-14	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The RISC-V backend used to generate `add <reg>, x0, <reg>` in a few instances. It seems most places no longer generate this sequence. This is semantically equivalent to `addi <reg>, <reg>, 0`, but the latter has the advantage of being noted to be the canonical instruction to be used for moves (which microarchitectures can and should recognise as such). The changed testcases use instruction aliases - `mv <reg>, <reg>` is an alias for `addi <reg>, <reg>, 0`. Reviewers: luismarques Reviewed By: luismarques Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70124
*	[RISCV] Fix wrong CFI directives	Luís Marques	2019-11-14	1	-55/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Removes CFI CFA directives that could incorrectly propagate beyond the basic block they were intended for. Specifically it removes the epilogue CFI directives. See the branch_and_tail_call test for an example of the issue. Should fix the stack unwinding issues caused by the incorrect directives. Reviewers: asb, lenary, shiva0217 Reviewed By: lenary Tags: #llvm Differential Revision: https://reviews.llvm.org/D69723
*	ARM: allow rewriting frame indexes for all prefetch variants.	Tim Northover	2019-11-14	1	-0/+14
\| \| \| \| \|	For some reason we could handle PLD but not PLDW or PLI, but all of them can potentially refer to the stack region (if weirdly for PLI).
*	Fix uninitialized variable warning. NFCI.	Simon Pilgrim	2019-11-14	1	-1/+1
\|
*	Fix uninitialized variable warnings. NFCI.	Simon Pilgrim	2019-11-14	2	-6/+5
\|
*	Hexagon - fix uninitialized variable warnings. NFCI.	Simon Pilgrim	2019-11-14	4	-6/+6
\|
*	MSP430 - fix uninitialized variable warnings. NFCI.	Simon Pilgrim	2019-11-14	3	-21/+18
\|
*	[AArch64][SVE] Implement floating-point comparison & reduction intrinsics	Kerry McLaughlin	2019-11-14	2	-16/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds intrinsics for the following: - fadda & faddv - fminv, fmaxv, fminnmv & fmaxnmv - facge & facgt - fcmp[eq\|ge\|gt\|ne\|uo] Reviewers: sdesmalen, huntergr, dancgr, mgudim Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69858
*	[mips][NFC] Remove old FIXME comment	Miloš Stojanović	2019-11-14	1	-1/+0
\| \| \| \| \| \|	This was fixed in rL229595 but this comment was missed. Differential Revision: https://reviews.llvm.org/D70231
*	[AArch64][SVE] Implement remaining floating-point arithmetic intrinsics	Kerry McLaughlin	2019-11-14	2	-19/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds intrinsics for the following: - fabs & fneg - fexpa - frint[a\|i\|m\|n\|p\|x\|z] - frecpe, frecps & frecpx - fsqrt, frsqrte & frsqrts Reviewers: huntergr, sdesmalen, dancgr, mgudim Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69800
*	[AArch64][SVE] Implement additional floating-point arithmetic intrinsics	Kerry McLaughlin	2019-11-14	3	-38/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds intrinsics for the following: - ftssel - fcadd, fcmla - fmla, fmls, fnmla, fnmls - fmad, fmsb, fnmad, fnmsb Reviewers: sdesmalen, huntergr, dancgr, mgudim Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69707