bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU: Use tablegen pattern for sendmsg intrinsics	Matt Arsenault	2019-08-01	3	-17/+23
\| \| \| \| \| \| \|	Since this now emits a direct copy to m0, SIFixSGPRCopies has to handle a physical register. llvm-svn: 367593
*	[LV] Tail-Loop Folding	Sjoerd Meijer	2019-08-01	2	-54/+99
\| \| \| \| \| \| \| \| \| \| \|	This allows folding of the scalar epilogue loop (the tail) into the main vectorised loop body when the loop is annotated with a "vector predicate" metadata hint. To fold the tail, instructions need to be predicated (masked), enabling/disabling lanes for the remainder iterations. Differential Revision: https://reviews.llvm.org/D65197 llvm-svn: 367592
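The idea of folding the tail into the vector body can be illustrated outside LLVM. The following is a minimal sketch, not the vectorizer's implementation: the vectorization factor `VF`, the function name `sumWithFoldedTail`, and the scalar emulation of lanes are all assumptions made for illustration. Each "vector" iteration computes a per-lane predicate, so the final, partial iteration disables the lanes that would run past the end and no separate scalar epilogue loop is needed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical vectorization factor; a real target chooses this per type.
constexpr std::size_t VF = 4;

// Sums n ints using one "vector" loop whose lanes are predicated (masked).
// The remainder iterations are handled by disabling out-of-range lanes,
// so there is no scalar epilogue loop.
int sumWithFoldedTail(const int *data, std::size_t n) {
  assert((data != nullptr || n == 0) && "null input with non-zero length");
  int lanes[VF] = {0, 0, 0, 0};
  for (std::size_t base = 0; base < n; base += VF) {
    for (std::size_t lane = 0; lane < VF; ++lane) {
      // Predicate: the lane is active only while its index is in range.
      const bool active = base + lane < n;
      if (active)
        lanes[lane] += data[base + lane];
    }
  }
  // Horizontal reduction of the per-lane partial sums.
  int total = 0;
  for (int partial : lanes)
    total += partial;
  return total;
}
```

With seven elements and `VF = 4`, the second iteration runs with only three active lanes, which is the part a scalar tail loop would otherwise handle.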
*	GlobalISel: Fix widenScalar for G_MERGE_VALUES to pointer	Matt Arsenault	2019-08-01	1	-1/+3
\| \| \| \| \| \| \|	AMDGPU testcase isn't broken now, but will be in a future patch without this. llvm-svn: 367591
*	[WebAssembly] Assembler/InstPrinter: support call_indirect type index.	Wouter van Oortmerssen	2019-08-01	5	-37/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A TYPE_INDEX operand (as used by call_indirect) used to be represented by the InstPrinter as a symbol (e.g. .Ltype_index0@TYPE_INDEX) which was a bit of a mismatch with the WasmObjectWriter which expects an unnamed symbol, to receive the signature from and then turn into a reloc. There was really no good way to round-trip this information. An earlier version of this patch tried to attach the signature information using a .functype, but that ran into trouble when the symbol was re-emitted without a name. Removing the name was a giant hack also. The current version changes the assembly syntax to have an inline signature spec for TYPEINDEX operands that is always unnamed, which is much more elegant both in syntax and in implementation (as now the assembler is able to follow the same path as the regular backend) Reviewers: sbc100, dschuff, aheejin, jgravelle-google, sunfish, tlively Subscribers: arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64758 llvm-svn: 367590
*	[TargetLowering] SimplifyMultipleUseDemandedBits - Add ↵	Simon Pilgrim	2019-08-01	1	-0/+10
\| \| \| \| \| \| \| \|	ISD::INSERT_VECTOR_ELT handling Allow us to peek through vector insertions to avoid dependencies on entire insertion chains. llvm-svn: 367588
*	[X86][SSE] Add PEXTR(PINSR(v, s, c), c) -> s combine.	Simon Pilgrim	2019-08-01	1	-4/+15
\| \| \| \| \| \|	We should probably extend this to cover bitcasts as well to help other cases in promote-vec3.ll. llvm-svn: 367582
*	[Attributor][FIX] Indicate a missing update change	Johannes Doerfert	2019-08-01	1	-3/+7
\| \| \| \| \| \| \| \| \| \|	Users of AAReturnedValues need to know if HasOverdefinedReturnedCalls changed from false to true as it will impact the result of the return value traversal (calls are not ignored anymore). This will be tested with the tests in D59978. llvm-svn: 367581
*	[mips] Fix lowering load/store instruction in PIC case	Simon Atanasyan	2019-08-01	1	-1/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an operand of the `lw/sw` instructions is a symbol, these instructions are incorrectly lowered using a non-position-independent chain of commands. For PIC code we should use `lw/addiu` instructions with the `R_MIPS_GOT16` and `R_MIPS_LO16` relocations respectively. Instead, LLVM generates position-dependent code with the `R_MIPS_HI16` and `R_MIPS_LO16` relocations. This patch provides a fix for the bug by handling the PIC case separately in `MipsAsmParser::expandMemInst`. The main idea is to generate a chain of PIC instructions to load a symbol address into a register and then load the address content. The fix is not optimal and does not fix all PIC-related problems. This is a task for subsequent patches. Differential Revision: https://reviews.llvm.org/D65524 llvm-svn: 367580
*	[X86][SSE] SimplifyMultipleUseDemandedBits - Add PEXTR/PINSR B+W handling	Simon Pilgrim	2019-08-01	2	-0/+31
\| \| \| \| \| \|	This adds SimplifyMultipleUseDemandedBitsForTargetNode X86 support and uses it to allow us to peek through vector insertions to avoid dependencies on entire insertion chains. llvm-svn: 367570
*	[X86] EltsFromConsecutiveLoads - don't attempt to merge volatile loads (PR42846)	Simon Pilgrim	2019-08-01	1	-1/+4
\| \| \| \|	llvm-svn: 367556
*	[RISCV] Add Custom Parser for Atomic Memory Operands	Sam Elliott	2019-08-01	4	-4/+113
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: GCC accepts both (reg) and 0(reg) for atomic instruction memory operands. These instructions do not allow for an offset in their encoding, so in the latter case, the 0 is silently dropped. Due to how we have structured the RISCVAsmParser, the easiest way to add support for parsing this offset is to add a custom AsmOperand and parser. This parser drops all the parens, and just keeps the register. This commit also adds a custom printer for these operands, which matches the GCC canonical printer, printing both `(a0)` and `0(a0)` as `(a0)`. Reviewers: asb, lewis-revill Reviewed By: asb Subscribers: s.egerton, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65205 llvm-svn: 367553
*	[IR] Value: add replaceUsesWithIf() utility	Roman Lebedev	2019-08-01	6	-42/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: While there is always a `Value::replaceAllUsesWith()`, sometimes the replacement needs to be conditional. I have only cleaned a few cases where `replaceUsesWithIf()` could be used, to both add test coverage, and show that it is actually useful. Reviewers: jdoerfert, spatel, RKSimon, craig.topper Reviewed By: jdoerfert Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, george.burgess.iv, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65528 llvm-svn: 367548
*	[IR] SelectInst: add swapValues() utility	Roman Lebedev	2019-08-01	3	-14/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Sometimes we need to swap true-val and false-val of a `SelectInst`. Having a function for that is nicer than hand-writing it each time. Reviewers: spatel, RKSimon, craig.topper, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65520 llvm-svn: 367547
*	[ARM] Fix for MVE VREV64	David Green	2019-08-01	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \|	The VREV64 instruction is apparently unpredictable if Qd == Qm, due to the cross-beat nature of the instruction. This adds an earlyclobber to Qd, which seems to be the same way we deal with this on other instructions like the write-back on loads and stores. Differential Revision: https://reviews.llvm.org/D65502 llvm-svn: 367544
*	[AArch64] Do not allocate unnecessary emergency slot.	Sander de Smalen	2019-08-01	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix an issue where the compiler still allocates an emergency spill slot even though it already decided to spill an extra callee-save register to use as a scratch register. Reviewers: gberry, thegameg, mstorsjo, t.p.northover Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D65504 llvm-svn: 367540
*	[MIPS GlobalISel] Fold load/store + G_GEP + G_CONSTANT	Petar Avramovic	2019-08-01	1	-2/+23
\| \| \| \| \| \| \| \| \|	Fold load/store + G_GEP + G_CONSTANT when immediate in G_CONSTANT fits into 16 bit signed integer. Differential Revision: https://reviews.llvm.org/D65507 llvm-svn: 367535
*	[NFC][ARM][ParallelDSP] Getters and renaming	Sam Parker	2019-08-01	1	-16/+22
\| \| \| \| \| \| \|	Add a couple of getters for Reduction and do some renaming of variables around CreateSMLAD for clarity. llvm-svn: 367522
*	[SelectionDAG] Use APInt::isSubsetOf/intersects to simplify some code.	Craig Topper	2019-08-01	1	-2/+2
\| \| \| \| \| \|	Also use KnownBits::isNegative/isNonNegative to further simplify. llvm-svn: 367518
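For readers unfamiliar with the two predicates named in this commit, the sketch below restates them on plain 64-bit masks. It is an illustration only: the free functions `isSubsetOf`, `intersects`, and the helper `lowMask` are stand-ins written for this example and operate on `std::uint64_t`, not on LLVM's arbitrary-width `APInt`.

```cpp
#include <cassert>
#include <cstdint>

// Helper for building example masks: the low `bits` bits set.
inline std::uint64_t lowMask(unsigned bits) {
  assert(bits <= 64 && "mask width out of range");
  return bits == 64 ? ~0ULL : ((1ULL << bits) - 1);
}

// True if every bit set in `a` is also set in `b`.
// Replaces the hand-written form (a & ~b) == 0.
inline bool isSubsetOf(std::uint64_t a, std::uint64_t b) {
  return (a & ~b) == 0;
}

// True if `a` and `b` share at least one set bit.
// Replaces the hand-written form (a & b) != 0.
inline bool intersects(std::uint64_t a, std::uint64_t b) {
  return (a & b) != 0;
}
```

Naming the operation makes the intent of a known-bits check readable at the call site instead of leaving the reader to decode a mask expression.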
*	AMDGPU/SILoadStoreOptimizer: Make some functions const	Tom Stellard	2019-08-01	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, pendingchaos, rampitec Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65316 llvm-svn: 367517
*	recommit:[PowerPC] Eliminate loads/swap feeding swap/store for vector type ↵	Zi Xuan Wu	2019-08-01	3	-1/+117
\| \| \| \| \| \| \| \| \| \| \| \|	by using big-endian load/store In PowerPC, there is an instruction to load a vector in big-endian element order on a little-endian target. So we can combine vector load + reverse into big endian load to eliminate the swap instruction. Also combine vector reverse + store into big endian store. Differential Revision: https://reviews.llvm.org/D65063 llvm-svn: 367516
*	AMDGPU/GlobalISel: Fix flat load/store of pointer types	Matt Arsenault	2019-08-01	4	-8/+13
\| \| \| \|	llvm-svn: 367513
*	AMDGPU/GlobalISel: Remove manual store select code	Matt Arsenault	2019-08-01	2	-58/+23
\| \| \| \| \| \| \|	This regresses the weird types that are newly treated as legal load types, but fixes incorrectly using flat instructions on SI. llvm-svn: 367512
*	AMDGPU/GlobalISel: Select local atomic cmpxchg	Matt Arsenault	2019-08-01	3	-28/+13
\| \| \| \|	llvm-svn: 367511
*	AMDGPU/GlobalISel: Handle G_ATOMICRMW_FADD	Matt Arsenault	2019-08-01	4	-0/+6
\| \| \| \|	llvm-svn: 367509
*	AMDGPU/GlobalISel: Allow selection of DS atomicrmw	Matt Arsenault	2019-08-01	4	-6/+27
\| \| \| \|	llvm-svn: 367507
*	AMDGPU: Start redefining atomic PatFrags	Matt Arsenault	2019-08-01	6	-189/+210
\| \| \| \| \| \| \| \|	Start migrating to a form that will be compatible with the global isel emitter. Also should fix some overly lax checks on the memory type, which allowed mis-selecting some illegal atomics. llvm-svn: 367506
*	AMDGPU: Correct FP atomic patterns	Matt Arsenault	2019-08-01	3	-9/+10
\| \| \| \| \| \| \|	These need to use an fadd, not an add. Also make the noret part clear in the name. llvm-svn: 367505
*	AMDGPU/GlobalISel: Select simple local stores	Matt Arsenault	2019-08-01	6	-19/+52
\| \| \| \|	llvm-svn: 367504
*	GlobalISel: moreElementsVector for G_LOAD/G_STORE	Matt Arsenault	2019-08-01	2	-1/+12
\| \| \| \| \| \| \|	AMDGPU change and test is a placeholder until a future patch with complete handling. llvm-svn: 367503
*	Create unique, but identically-named ELF sections for explicitly-sectioned ↵	Peter Collingbourne	2019-08-01	1	-2/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	functions and globals when using -function-sections and -data-sections. This allows functions and globals to be reordered later in the linking phase (using the -symbol-ordering-file) even though reordering will be limited to the scope of the explicit section. Patch by Rahman Lavaee! Differential Revision: https://reviews.llvm.org/D65478 llvm-svn: 367501
*	Reapply "AMDGPU: Split block for si_end_cf"	Matt Arsenault	2019-08-01	5	-25/+139
\| \| \| \| \| \|	This reverts commit r359363, reapplying r357634 llvm-svn: 367500
*	Fix a release-only build warning triggered by rL367485	Philip Reames	2019-08-01	1	-0/+3
\| \| \| \|	llvm-svn: 367499
*	AMDGPU/GlobalISel: Select local loads	Matt Arsenault	2019-08-01	5	-9/+108
\| \| \| \|	llvm-svn: 367498
*	Revert "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG" and	Amy Huang	2019-07-31	2	-21/+0
\| \| \| \| \| \| \| \| \| \|	partial fix. Causes Windows buildbot errors. This reverts commit 6e65c34523963094acd0d6c94a5f5c64b32fe6aa and 53da7ca94343166ac68aef81db0398932fc258bb. llvm-svn: 367496
*	[ARM] Lower "(x<<c) > 0x80000000U" to "lsls" on Thumb1.	Eli Friedman	2019-07-31	5	-0/+34
\| \| \| \| \| \| \| \| \|	This is extremely specific, but saves three instructions when it's legal. I don't think the code can be usefully generalized. Differential Revision: https://reviews.llvm.org/D65351 llvm-svn: 367492
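One way to see why a single flag-setting shift can replace this comparison is sketched below. This is an assumption about the shape of the lowering, not a transcription of the patch: the model supposes the shift amount becomes `c + 1` and the branch uses the unsigned "higher" condition (carry set and zero clear). The names `Flags`, `lslsFlags`, `viaCompare`, and `viaShift` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Carry and zero flags as produced by a flag-setting logical shift left.
struct Flags {
  bool carry;
  bool zero;
};

// Models a 32-bit flag-setting left shift by 1..31 bits:
// carry receives the last bit shifted out, zero reflects the result.
inline Flags lslsFlags(std::uint32_t value, unsigned shift) {
  assert(shift >= 1 && shift <= 31 && "shift amount out of modeled range");
  const bool carry = ((value >> (32 - shift)) & 1u) != 0;
  const std::uint32_t result = value << shift;
  return Flags{carry, result == 0};
}

// The source-level comparison from the commit title.
inline bool viaCompare(std::uint32_t x, unsigned c) {
  return (x << c) > 0x80000000U;
}

// Assumed lowering: shift by c + 1, then test "higher" (carry && !zero).
// Carry is bit 31 of (x << c); a non-zero result means some lower bit is set.
inline bool viaShift(std::uint32_t x, unsigned c) {
  const Flags f = lslsFlags(x, c + 1);
  return f.carry && !f.zero;
}
```

The comparison is true exactly when the top bit of `x << c` is set and at least one lower bit is set, which is what the two flags encode after the extra one-bit shift.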
*	[ARM] Transform compare of masked value to shift on Thumb1.	Eli Friedman	2019-07-31	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \|	Thumb1 has very limited immediate modes, so turning an "and" into a shift can save multiple instructions. It's possible to simplify the generated code for test2 and test3 in cmp-and-fold.ll a little more, but I'll implement that as a followup. Differential Revision: https://reviews.llvm.org/D65175 llvm-svn: 367491
*	[ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use ↵	Craig Topper	2019-07-31	1	-11/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	scalar bit tests for the branches. X86 at least is able to use movmsk or kmov to move the mask to the scalar domain. Then we can just use test instructions to test individual bits. This is more efficient than extracting each mask element individually. I special cased v1i1 to use the previous behavior. This avoids poor type legalization of bitcast of v1i1 to i1. I've skipped expandload/compressstore as I think we need to handle constant masks for those better first. Many tests end up with duplicate test instructions due to tail duplication in the branch folding pass. But the same thing happens when constructing similar code in C. So it's not unique to the scalarization. Not sure if this lowering code will also be good for other targets, but we're only testing X86 today. Differential Revision: https://reviews.llvm.org/D65319 llvm-svn: 367489
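The effect of moving the mask into the scalar domain can be sketched in ordinary C++. This is an illustration under assumptions, not the pass's output: the mask is taken to be already packed into an 8-bit integer (one bit per lane), and the function name `maskedLoad` and its signature are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalarized masked load: lane i is read from memory only if bit i of
// maskBits is set; otherwise the pass-through value is kept. Each lane
// costs one scalar bit test and a branch, with no per-element extraction
// from a vector mask.
std::vector<int> maskedLoad(const int *mem, std::uint8_t maskBits,
                            const std::vector<int> &passthru) {
  assert(passthru.size() <= 8 && "this sketch packs at most 8 lanes");
  std::vector<int> result = passthru;
  for (std::size_t lane = 0; lane < result.size(); ++lane) {
    if (maskBits & (1u << lane))
      result[lane] = mem[lane];
  }
  return result;
}
```

Inactive lanes never touch memory, which matches the requirement that a masked load must not access addresses for disabled elements.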
*	[X86] Add DAG combine to fold any_extend_vector_inreg+truncstore to an ↵	Craig Topper	2019-07-31	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \|	extractelement+store We have custom code that ignores the normal promoting type legalization on less than 128-bit vector types like v4i8 to emit pavgb, paddusb, psubusb since we don't have the equivalent instruction on a larger element type like v4i32. If this operation appears before a store, we can be left with an any_extend_vector_inreg followed by a truncstore after type legalization. When truncstore isn't legal, this will normally be decomposed into shuffles and a non-truncating store. This will then combine away the any_extend_vector_inreg and shuffle leaving just the store. On avx512, truncstore is legal so we don't decompose it and we had no combines to fix it. This patch adds a new DAG combine to detect this case and emit either an extract_store for 64-bit stores or an extractelement+store for 32 and 16 bit stores. This makes the avx512 codegen match the avx2 codegen for these situations. I'm restricting to only when -x86-experimental-vector-widening-legalization is false. When we're widening we're not likely to create this any_extend_inreg+truncstore combination. This means we should be able to remove this code when we flip the default. I would like to flip the default soon, but I need to investigate some performance regressions it's causing in our branch that I wasn't seeing on trunk. Differential Revision: https://reviews.llvm.org/D65538 llvm-svn: 367488
*	Migrate some more fadd and fsub cases away from UnsafeFPMath control to ↵	Michael Berg	2019-07-31	2	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	utilize NoSignedZerosFPMath options control Summary: Honoring no signed zeroes is also available as a user control through clang separately regardless of fastmath or UnsafeFPMath context, DAG guards should reflect this context. Reviewers: spatel, arsenm, hfinkel, wristow, craig.topper Reviewed By: spatel Subscribers: rampitec, foad, nhaehnle, wuzish, nemanjai, jvesely, wdng, javed.absar, MaskRay, jsji Differential Revision: https://reviews.llvm.org/D65170 llvm-svn: 367486
*	[IndVars, RLEV] Support rewriting exit values in loops without known exits ↵	Philip Reames	2019-07-31	1	-9/+7
\| \| \| \| \| \| \| \| \| \|	(prep work) This is a preparatory patch for future work on supporting exit value rewriting in loops with a mixture of computable and non-computable exit counts. The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes. The test differences are caused by cases where getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop. llvm-svn: 367485
*	Fix to r367374 "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG"	Amy Huang	2019-07-31	1	-4/+8
\| \| \| \| \| \| \| \| \|	after windows buildbot failure. Added a check that the MachineInstr exists and is a call before trying to add symbols around it. llvm-svn: 367483
*	Fix unused variable warning for non-assert builds.	Eric Christopher	2019-07-31	1	-0/+1
\| \| \| \|	llvm-svn: 367482
*	[GISel] Address review feedback on passing MD_callees to lowerCall.	Mark Lacey	2019-07-31	4	-5/+5
\| \| \| \| \| \| \|	Preserve the nullptr default for KnownCallees that appears in the base class. llvm-svn: 367477
*	[GISel] Pass MD_callees metadata down in call lowering.	Mark Lacey	2019-07-31	9	-12/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This will make it possible to improve IPRA by taking into account register usage in indirect calls. NFC yet; this is just laying the groundwork to start building up patches to take advantage of the information for improved register allocation. Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65488 llvm-svn: 367476
*	AArch64: Add a tagged-globals backend feature.	Peter Collingbourne	2019-07-31	7	-1/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This feature instructs the backend to allow locally defined global variable addresses to contain a pointer tag in bits 56-63 that will be ignored by the hardware (i.e. TBI), but may be used by an instrumentation pass such as HWASAN. It works by adding a MOVK instruction to the regular ADRP/ADD sequence that sets bits 48-63 to the corresponding bits of the global, with the linker bounds check disabled on the ADRP instruction to prevent the tag from causing a link failure. This implementation of the feature omits the MOVK when loading from or storing to a global, which is sufficient for TBI. If the same approach is extended to MTE, assuming that 0 is not configured as a catch-all tag, we will most likely also need the MOVK in this case in order to avoid a tag mismatch. Differential Revision: https://reviews.llvm.org/D65364 llvm-svn: 367475
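The pointer layout this feature relies on can be sketched with plain integer arithmetic. The code below only illustrates carrying a tag in bits 56-63 of a 64-bit address; it does not model the ADRP/ADD/MOVK sequence or the linker behavior, and the names `withTag`, `tagOf`, and `stripTag` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The tag occupies the top byte (bits 56-63) of a 64-bit pointer value.
constexpr unsigned kTagShift = 56;
constexpr std::uint64_t kTagMask = 0xFFULL << kTagShift;

// Returns `address` with its top byte replaced by `tag`.
inline std::uint64_t withTag(std::uint64_t address, std::uint8_t tag) {
  return (address & ~kTagMask) | (static_cast<std::uint64_t>(tag) << kTagShift);
}

// Extracts the tag byte from a tagged pointer value.
inline std::uint8_t tagOf(std::uint64_t pointer) {
  return static_cast<std::uint8_t>(pointer >> kTagShift);
}

// Recovers the untagged address, i.e. what hardware with top-byte-ignore
// effectively uses when forming the memory access.
inline std::uint64_t stripTag(std::uint64_t pointer) {
  assert(tagOf(pointer & ~kTagMask) == 0 && "mask must clear the tag byte");
  return pointer & ~kTagMask;
}
```

An instrumentation pass can compare `tagOf(pointer)` against a tag stored elsewhere, while ordinary loads and stores behave as if the tag were absent.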
*	SelectionDAG, MI, AArch64: Widen target flags fields/arguments from unsigned ↵	Peter Collingbourne	2019-07-31	8	-31/+27
\| \| \| \| \| \| \| \| \| \| \| \| \|	char to unsigned. This makes the field wider than MachineOperand::SubReg_TargetFlags so that we don't end up silently truncating any higher bits. We should still catch any bits truncated from the MachineOperand field as a consequence of the assertion in MachineOperand::setTargetFlags(). Differential Revision: https://reviews.llvm.org/D65465 llvm-svn: 367474
*	[DAGCombine] Limit the number of times for the same store and root nodes	Wei Mi	2019-07-31	1	-3/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	to bail out in store merging dependence check. We ran into a case where the dependence check in store merging bails out many times for the same store and root nodes in a huge basic block. That increases compile time by almost 100x. The patch adds a map to track how many times the bail-out happens for the same store and root, and if it is over a limit, stops considering the store with the same root as a merging candidate. Differential Revision: https://reviews.llvm.org/D65174 llvm-svn: 367472
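The bookkeeping described here, a map from (store, root) to a bail-out count with a cutoff, can be sketched as follows. This is a simplified model, not the DAGCombiner code: nodes are represented by plain integer ids, and the class name `BailoutTracker`, its methods, and the limit passed to it are assumptions for illustration.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Counts how often the dependence check bailed out for a (store, root)
// pair and reports whether the pair should still be considered.
class BailoutTracker {
  unsigned Limit;
  std::map<std::pair<int, int>, unsigned> Counts;

public:
  explicit BailoutTracker(unsigned MaxBailouts) : Limit(MaxBailouts) {
    assert(MaxBailouts > 0 && "limit must allow at least one attempt");
  }

  // Called each time the dependence check gives up for this pair.
  void recordBailout(int StoreId, int RootId) {
    ++Counts[std::make_pair(StoreId, RootId)];
  }

  // A store stays a merging candidate until its bail-out count for this
  // root goes over the limit.
  bool isCandidate(int StoreId, int RootId) const {
    auto It = Counts.find(std::make_pair(StoreId, RootId));
    return It == Counts.end() || It->second <= Limit;
  }
};
```

Keying on the pair rather than on the store alone means a store that repeatedly fails against one root is still tried against other roots.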
*	[SCCP] Update condition to avoid overflow.	Alina Sbirlea	2019-07-31	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Update condition to remove addition that may cause an overflow. Resolves PR42814. Reviewers: sanjoy, RKSimon Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65417 llvm-svn: 367461
*	[MemorySSA] Add additional verification for phis.	Alina Sbirlea	2019-07-31	2	-1/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Verify that the incoming defs into phis are the last defs from the respective incoming blocks. When moving blocks, insertDef must RenameUses. Adding this verification makes GVNHoist tests fail that uncovered this issue. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63147 llvm-svn: 367451
*	[InstCombine] canonicalize fneg before fmul/fdiv	Sanjay Patel	2019-07-31	2	-20/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it easier to implement the transforms (and possibly other fneg transforms) in 1 place because we can always start the pattern match from fneg (either the legacy binop or the new unop). There's a secondary practical benefit seen in PR21914 and PR42681: https://bugs.llvm.org/show_bug.cgi?id=21914 https://bugs.llvm.org/show_bug.cgi?id=42681 ...hoisting fneg rather than sinking seems to play nicer with LICM in IR (although this change may expose analysis holes in the other direction). 1. The instcombine test changes show the expected neutral IR diffs from reversing the order. 2. The reassociation tests show that we were missing an optimization opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says that all of these transforms are allowed (regardless of binop/unop fneg version) because: "For all other operations [besides copy/abs/negate/copysign], this standard does not specify the sign bit of a NaN result." In all of these transforms, we always have some other binop (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a potential intermediate NaN operand. (If that interpretation is wrong, then we must already have a bug in the existing transforms?) 3. The clang tests shouldn't exist as-is, but that's effectively a revert of rL367149 (the test broke with an extension of the pre-existing fneg canonicalization in rL367146). Differential Revision: https://reviews.llvm.org/D65399 llvm-svn: 367447