bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[X86][BMI1] Fix BLSI/BLSMSK/BLSR BMI1 scheduling on btver2	Simon Pilgrim	2018-09-14	1	-1/+1
\| \| \| \| \| \|	These have the same behaviour as tzcnt on btver2 - confirmed with AMD 16h SOG, Agner and instlatx64. llvm-svn: 342235
*	[X86][BMI1] Add scheduler class for BLSI/BLSMSK/BLSR BMI1 instructions	Simon Pilgrim	2018-09-14	11	-48/+35
\| \| \| \|	llvm-svn: 342234
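For readers unfamiliar with the three BMI1 instructions named in the two commits above, the following is a minimal sketch of their bit-level semantics on 32-bit values. It models only the architectural result, not the btver2 scheduling data the commits change; the function names and the `MASK32` constant are illustrative.

```python
MASK32 = 0xFFFFFFFF  # illustrative: model 32-bit registers


def blsi(x):
    """Isolate the lowest set bit: x & -x."""
    return x & -x & MASK32


def blsmsk(x):
    """Mask up to and including the lowest set bit: x ^ (x - 1)."""
    return (x ^ (x - 1)) & MASK32


def blsr(x):
    """Reset (clear) the lowest set bit: x & (x - 1)."""
    return x & (x - 1) & MASK32


if __name__ == "__main__":
    x = 0b101000
    print(bin(blsi(x)), bin(blsmsk(x)), bin(blsr(x)))
```

Each operation is a single subtract or negate combined with one logical operation, which is the kind of instruction the scheduling fix concerns.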
*	[AMDGPU] Ensure trig range reduction only used for subtargets that require it	David Stuttard	2018-09-14	4	-9/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: GFX9 and above support sin/cos instructions with a greater range and thus don't require a fract instruction prior to invocation. Added a subtarget feature to reflect this and added code to take advantage of the expanded range on GFX9+. Also updated the tests to check correct behaviour. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D51933 Change-Id: I1c1f1d3726a5ae32116646ca5cfa1ab4ef69e5b0 llvm-svn: 342222
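The range reduction described in the commit above can be illustrated with a small numeric model. This is a sketch under stated assumptions, not the AMDGPU lowering code: `hw_sin` is a hypothetical stand-in for a hardware sine whose input is measured in revolutions, and `needs_range_reduction` is a hypothetical flag playing the role of the new subtarget feature.

```python
import math


def fract(t):
    """Fractional part, t - floor(t), always in [0, 1)."""
    return t - math.floor(t)


def hw_sin(t):
    """Hypothetical model of a hardware sine taking input in revolutions."""
    return math.sin(2.0 * math.pi * t)


def lower_sin(x, needs_range_reduction):
    """Lower sin(x) for x in radians, optionally reducing the range first."""
    t = x * (1.0 / (2.0 * math.pi))  # radians -> revolutions
    if needs_range_reduction:
        # Subtargets with a narrow valid input range wrap into [0, 1) first.
        t = fract(t)
    return hw_sin(t)


if __name__ == "__main__":
    for x in (-3.0, 0.5, 100.0):
        print(x, lower_sin(x, True), lower_sin(x, False), math.sin(x))
```

In this model both paths agree because `math.sin` accepts any input; on real hardware the unreduced path is only valid where the instruction's supported range is wide enough, which is why the reduction is gated per subtarget.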
*	[ARM] bottom-top mul support in ARMParallelDSP	Sam Parker	2018-09-14	1	-27/+152
\| \| \| \| \| \| \| \| \| \|	On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 342210
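As a rough illustration of why one wide load can feed these multiplies, the sketch below models the signed 16x16 multiplies on 32-bit register values. The helper names (`pack`, `bottom`, `top`) are illustrative, and the packing assumes a little-endian load in which the element at the lower address lands in the bottom half.

```python
def s16(v):
    """Interpret the low 16 bits of v as a signed 16-bit integer."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def bottom(reg):
    return s16(reg)


def top(reg):
    return s16(reg >> 16)


def pack(lo, hi):
    """Model a 32-bit load of two sequential 16-bit values (little-endian)."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def smulbb(rn, rm):
    return bottom(rn) * bottom(rm)


def smulbt(rn, rm):
    return bottom(rn) * top(rm)


def smultb(rn, rm):
    return top(rn) * bottom(rm)


def smultt(rn, rm):
    return top(rn) * top(rm)


if __name__ == "__main__":
    a = pack(3, -4)
    b = pack(5, 6)
    print(smulbb(a, b), smulbt(a, b), smultb(a, b), smultt(a, b))
```

With both 16-bit elements already sitting in one register, each multiply selects the half it needs, so the two narrow loads are not required.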
*	[SystemZ] Adjust cost functions for subtargets that use LI + LOC instead of IPM	Jonas Paulsson	2018-09-14	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After recent improvements which make better use of LOC instead of IPM, the TTI cost functions also need to be updated to reflect this. This involves sext, zext and xor of i1. The tests were updated so that for z13 the new costs are expected, while the old costs are still checked for on zEC12. Review: Ulrich Weigand https://reviews.llvm.org/D51339 llvm-svn: 342207
*	[AMDGPU] Removed unused method	Tim Renouf	2018-09-13	1	-22/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I accidentally left this behind in D50306, and it causes a build warning when I build with gcc7. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52022 Change-Id: I30f7a47047e9d9d841f652da66d2fea19e74842c llvm-svn: 342189
*	[X86] Fix register resizings for inline assembly register operands.	Nirav Dave	2018-09-13	2	-7/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When replacing a named register input with the appropriately sized sub/super-register, in the case of a 64-bit value being assigned to a register in 32-bit mode, match GCC's assignment. Reviewers: eli.friedman, craig.topper Subscribers: nickdesaulniers, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D51502 llvm-svn: 342175
*	[X86] Cleanup pair returns. NFCI.	Nirav Dave	2018-09-13	1	-32/+14
\| \| \| \|	llvm-svn: 342174
*	[RISCV][MC] Reject bare symbols for the simm6 and simm6nonzero operand types	Ana Pazos	2018-09-13	1	-14/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fixed assertions due to invalid fixup when encoding compressed instructions (c.addi, c.addiw, c.li, c.andi) with bare symbols with/without modifiers. This matches GAS behavior as well. This bug was uncovered by an LLVM MC Disassembler Protocol Buffer Fuzzer for the RISC-V assembly language. Reviewers: asb Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb Differential Revision: https://reviews.llvm.org/D52005 llvm-svn: 342160
*	[RISCV] Fix decoding of invalid instruction with C extension enabled.	Ana Pazos	2018-09-13	2	-2/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The illegal instruction 0x00 0x00 is being wrongly decoded as c.addi4spn with 0 immediate. The invalid instruction 0x01 0x61 is being wrongly decoded as c.addi16sp with 0 immediate. This bug was uncovered by an LLVM MC Disassembler Protocol Buffer Fuzzer for the RISC-V assembly language. Reviewers: asb Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb Differential Revision: https://reviews.llvm.org/D51815 llvm-svn: 342159
*	[WebAssembly] Fix signature of `main` in FixFunctionBitcasts	Sam Clegg	2018-09-13	1	-2/+4
\| \| \| \| \| \| \| \| \|	Also, add a check to ensure that when main has the expected signature we do not create a wrapper. Differential Revision: https://reviews.llvm.org/D51562 llvm-svn: 342157
*	[ARM] Allow truncs as sources in ARM CGP	Sam Parker	2018-09-13	1	-19/+23
\| \| \| \| \| \| \| \| \| \|	We previously only allowed truncs as sinks, but now allow them as sources too. We do this by checking that the result type is the narrow type that we're trying to optimise for. Differential Revision: https://reviews.llvm.org/D51978 llvm-svn: 342141
*	[ARM] Fix FixConst for ARMCodeGenPrepare	Sam Parker	2018-09-13	1	-20/+3
\| \| \| \| \| \| \| \| \| \|	Part of FixConsts wrongly assumes either a 8- or 16-bit constant which can result in the wrong constants being generated during promotion. Differential Revision: https://reviews.llvm.org/D52032 llvm-svn: 342140
*	AMDGPU: Fix not preserving alignment in call setups	Matt Arsenault	2018-09-13	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	If an argument was passed on the stack, this was using the default alignment. I'm not sure there's an observable change from this. This was observable due to bugs in expansion of unaligned loads and stores, but since that is fixed I don't think this matters much. llvm-svn: 342133
*	ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4.	Tim Northover	2018-09-13	4	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \|	The Technical Reference Manuals for these two CPUs state that branching to an unaligned 32-bit instruction incurs an extra pipeline reload penalty. That's bad. This also enables the optimization at -Os since it costs on average one byte per loop in return for 1 cycle per iteration, which is pretty good going. llvm-svn: 342127
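The "one byte per loop on average" figure in the commit above follows from simple arithmetic, sketched here. The assumption, not stated in the commit, is that Thumb loop headers are 2-byte aligned and equally likely to fall at either offset within a 4-byte window; the function names are illustrative.

```python
def padding_to_align(addr, align=4):
    """Bytes of padding needed to bring addr up to the next multiple of align."""
    return (-addr) % align


def average_padding(offsets, align=4):
    """Mean padding over a collection of equally likely starting offsets."""
    offsets = list(offsets)
    return sum(padding_to_align(o, align) for o in offsets) / len(offsets)


if __name__ == "__main__":
    # A 2-byte-aligned loop header sits at offset 0 or 2 within a 4-byte window.
    print(average_padding([0, 2]))
```

A header at offset 0 needs no padding and one at offset 2 needs two bytes, giving a mean of one byte per loop under that assumption.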
*	[AMDGPU] Load divergence predicate refactoring	Alexander Timofeev	2018-09-13	2	-8/+26
\| \| \| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D51931 Reviewers: rampitec llvm-svn: 342120
*	[mips] Enable the mnemonic spell corrector	Simon Atanasyan	2018-09-13	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	This implements suggesting alternative mnemonics when an invalid one is specified. For example `addru $9, $6, 17767` leads to the following error message: error: unknown instruction, did you mean: add, addiu, addu, maddu? Differential revision: https://reviews.llvm.org/D40646 llvm-svn: 342119
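One way such suggestions can be produced is by ranking known mnemonics by edit distance to the unknown one. The sketch below is an assumption-laden illustration, not the MIPS backend's code: it uses an insert/delete-only distance with a cutoff of 2, chosen because that reproduces the example in the commit message, and `MNEMONICS` is a small hypothetical table rather than the real instruction set.

```python
# Hypothetical, tiny mnemonic table for illustration only.
MNEMONICS = ("add", "addi", "addiu", "addu", "and", "lw", "maddu", "sub")


def indel_distance(a, b):
    """Edit distance allowing only insertions and deletions.

    Computed as len(a) + len(b) - 2 * LCS(a, b), with the longest common
    subsequence found by row-by-row dynamic programming.
    """
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            if ca == cb:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return len(a) + len(b) - 2 * prev[-1]


def suggest(word, table=MNEMONICS, max_dist=2):
    """Return table entries within max_dist of word, in table order."""
    return [m for m in table if indel_distance(word, m) <= max_dist]


if __name__ == "__main__":
    print("did you mean:", ", ".join(suggest("addru")))
```

Disallowing substitutions makes `addi` cost three edits from `addru` (delete `r`, delete `u`, insert `i`), so it falls outside the cutoff while `addiu` stays inside it.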
*	[AMDGPU] Preliminary patch for divergence driven instruction selection. Load offset inlining pattern changed.	Alexander Timofeev	2018-09-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D51975 Reviewers: rampitec llvm-svn: 342115
*	[X86] Type legalize v2i32 div/rem by scalarizing rather than promoting	Craig Topper	2018-09-13	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Previously we type legalized v2i32 div/rem by promoting to v2i64. But we don't support div/rem of vectors so op legalization would then scalarize it using i64 scalar ops since it doesn't know about the original promotion. 64-bit scalar divides on Intel hardware are known to be slow and in 32-bit mode they require a libcall. This patch switches type legalization to do the scalarizing itself using i32. It looks like the division by power of 2 optimization is still kicking in and leaving the code as a vector. The division by other constant optimization doesn't kick in pre type legalization since it ignores illegal types. And previously, after type legalization we scalarized the v2i64 since we don't have v2i64 MULHS/MULHU support. Another option might be to widen v2i32 to v4i32 so we could do division by constant optimizations, but we'd have to be careful to only do that for constant divisors or we risk scalarizing to 4 scalar divides. Reviewers: RKSimon, spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51325 llvm-svn: 342114
*	ARM: correct the relocation type for `bl` on WoA	Saleem Abdulrasool	2018-09-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	The `IMAGE_REL_ARM_BRANCH20T` applies only to a `b.w` instruction. A thumb-2 `bl` should be relocated using a `IMAGE_REL_ARM_BRANCH24T`. Correct the relocation that we emit in such a case. Resolves PR38620! Based on the patch by Jordan Rhee! llvm-svn: 342109
*	Remove isAsCheapAsAMove from v128.const	Thomas Lively	2018-09-13	1	-1/+1
\| \| \| \|	llvm-svn: 342106
*	Remove isAsCheapAsAMove from mem ops	Thomas Lively	2018-09-13	1	-2/+2
\| \| \| \|	llvm-svn: 342105
*	[WebAssembly] Add missing SIMD instruction attributes	Thomas Lively	2018-09-13	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These attributes are copied from equivalent instructions in WebAssemblyInstrInfo.td. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51518 llvm-svn: 342104
*	[Hexagon] Use shuffles when lowering "gather" shufflevectors	Krzysztof Parzyszek	2018-09-12	1	-0/+70
\| \| \| \| \| \| \| \|	Shufflevector instructions in LLVM IR that extract a subset of elements of a longer input into a shorter vector can be done using VECTOR_SHUFFLEs. This will avoid expanding them into costly extracts and inserts. llvm-svn: 342091
*	[Hexagon] Improve the selection algorithm in scalarizeShuffle	Krzysztof Parzyszek	2018-09-12	1	-22/+89
\| \| \| \| \| \|	Use topological ordering for newly generated nodes. llvm-svn: 342090
*	[WebAssembly] Make tied inline asm operands work again	Heejin Ahn	2018-09-12	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: rL341389 broke code with tied register operands in inline assembly. For example, `asm("" : "=r"(var) : "0"(var));` The code above specifies the input operand to be in the same register as the output operand, tying the two registers. This patch makes this kind of code work again. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, eraman, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51991 llvm-svn: 342084
*	[Hexagon] Use legalized type for extracted elements in scalarizeShuffle	Krzysztof Parzyszek	2018-09-12	1	-2/+4
\| \| \| \| \| \| \| \| \|	Scalarization of a shuffle will break up the source vectors into individual elements, and use them to assemble the resulting vector. An element type of a legal vector type may not necessarily be a legal scalar type, so make sure that the extracted values are extended to a legal scalar type. llvm-svn: 342079
*	AMDGPU: Print all kernel descriptor directives (including the ones with default values)	Konstantin Zhuravlyov	2018-09-12	1	-101/+88
\| \| \| \| \| \| \| \| \| \|	Change by Tony Tye Differential Revision: https://reviews.llvm.org/D51954 llvm-svn: 342077
*	AMDGPU: Re-apply r341982 after fixing the layering issue	Konstantin Zhuravlyov	2018-09-12	11	-391/+364
\| \| \| \| \| \| \| \| \| \| \| \|	Move isa version determination into TargetParser. Also switch away from target features to CPU string when determining isa version. This fixes an issue when we output wrong isa version in the object code when features of a particular CPU are altered (i.e. gfx902 w/o xnack used to result in gfx900). llvm-svn: 342069
*	[WebAssembly] SIMD comparisons	Thomas Lively	2018-09-12	2	-1/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Match the ordering semantics of non-vector comparisons. For floating point comparisons that do not correspond to instructions, the tests check that some vector comparison instruction was emitted but do not care about the full implementation. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51765 llvm-svn: 342064
*	[ARM] Tighten f64<->f16 conversion requirements	Diogo N. Sampaio	2018-09-12	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix missing Requires fields. Patch by Bernard Ogden (bogden) Reviewers: SjoerdMeijer, javed.absar, t.p.northover Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D51631 llvm-svn: 342061
*	[X86] Remove isel patterns for ADCX instruction	Craig Topper	2018-09-12	1	-30/+7
\| \| \| \| \| \| \| \| \| \| \| \|	There's no advantage to this instruction unless you need to avoid touching other flag bits. Its encoding is longer, it can't fold an immediate, it doesn't write all the flags. I don't think gcc will generate this instruction either. Fixes PR38852. Differential Revision: https://reviews.llvm.org/D51754 llvm-svn: 342059
*	[AArch64] Implement aarch64_vector_pcs codegen support.	Sander de Smalen	2018-09-12	3	-41/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds codegen support for the saving/restoring V8-V23 for functions specified with the aarch64_vector_pcs calling convention attribute, as added in patch D51477. Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D51479 llvm-svn: 342049
*	[ARM] Follow-up to rL342033	Sam Parker	2018-09-12	1	-1/+1
\| \| \| \| \| \|	Fixed typo which can cause segfault. llvm-svn: 342040
*	[AArch64] NFC: Refactoring to prepare for vector PCS.	Sander de Smalen	2018-09-12	1	-39/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch refactors several parts of AArch64FrameLowering so that it can be easily extended to support saving/restoring of FPR128 (Q) registers. Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D51478 llvm-svn: 342038
*	[ARM] Exchange MAC operands in ARMParallelDSP	Sam Parker	2018-09-12	1	-115/+154
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SMLAD and SMLALD instructions also come in the form of SMLADX and SMLALDX which perform an exchange on their second operand. To support this, more of the loads in the MAC candidates are compared for sequential access and a boolean value has been added to BinOpChain. AddMACCandiate has been refactored into a small pattern matching state machine to reduce the amount of duplicated code, but also to enable the matching to be more flexible. CreateParallelMACPairs now iterates through all the candidates to find parallel ones. Differential Revision: https://reviews.llvm.org/D51424 llvm-svn: 342033
*	[ARM] Allow bitcasts in ARMCodeGenPrepare	Sam Parker	2018-09-12	1	-5/+4
\| \| \| \| \| \| \| \|	Allow bitcasts in the use-def chains, treating them as sources. Differential Revision: https://reviews.llvm.org/D50758 llvm-svn: 342032
*	[AArch64] Add parsing of aarch64_vector_pcs attribute.	Sander de Smalen	2018-09-12	2	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds parsing support for the 'aarch64_vector_pcs' calling convention attribute to calls and function declarations. More information describing the vector ABI and procedure call standard can be found here: https://developer.arm.com/products/software-development-tools/hpc/arm-compiler-for-hpc/vector-function-abi Reviewers: t.p.northover, rnk, rengolin, javed.absar, thegameg, SjoerdMeijer Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D51477 llvm-svn: 342030
*	Revert "AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser."	Ilya Biryukov	2018-09-12	11	-274/+391
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit r341982. The change introduced a layering violation. Reverting to unbreak our integrate. llvm-svn: 342023
*	[X86] Teach X86SelectionDAGInfo::EmitTargetCodeForMemcpy about GNUX32	Craig Topper	2018-09-12	2	-24/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In GNUX32, is64BitMode returns true, but pointers are 32 bits. So we shouldn't copy pointer values into RSI/RDI since the widths don't match. Fixes PR38865 despite what the title says. I think the llvm_unreachable in the copyPhysReg code tricked the optimizer and made the fatal error trigger. Reviewers: rnk, efriedma, MatzeB, echristo Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51893 llvm-svn: 342015
*	AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination	Konstantin Zhuravlyov	2018-09-11	11	-391/+274
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	into TargetParser. Also switch away from target features to CPU string when determining isa version. This fixes an issue when we output wrong isa version in the object code when features of a particular CPU are altered (i.e. gfx902 w/o xnack used to result in gfx900). Differential Revision: https://reviews.llvm.org/D51890 llvm-svn: 341982
*	[X86] Prefer unpckhpd over movhlps in isel for fake unary cases	Craig Topper	2018-09-11	1	-13/+0
\| \| \| \| \| \| \| \| \| \| \| \|	In r337348, I changed lowering to prefer X86ISD::UNPCKL/UNPCKH opcodes over MOVLHPS/MOVHLPS for v2f64 {0,0} and {1,1} shuffles when we have SSE2. This enabled the removal of a bunch of weirdly bitcasted isel patterns in r337349. To avoid changing the tests I placed a gross hack in isel to still emit movhlps instructions for fake unary unpckh nodes. A similar hack was not needed for unpckl and movlhps because we do execution domain switching for those. But unpckh and movhlps have swapped operand order. This patch removes the hack. This is a code size increase since unpckhpd requires a 0x66 prefix and movhlps does not. But if that's a big concern we should be using movhlps for all unpckhpd opcodes and let commuteInstruction turn it into unpckhpd when it's an advantage. Differential Revision: https://reviews.llvm.org/D49499 llvm-svn: 341973
*	[X86] Teach X86FastISel::X86SelectRet to use EAX for the sret pointer in GNUX32	Craig Topper	2018-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	GNUX32 uses 32-bit pointers despite is64BitMode being true. So we should use EAX to return the value. Fixes one of the failures from PR38865. Differential Revision: https://reviews.llvm.org/D51940 llvm-svn: 341972
*	Test commit: remove trailing whitespace	Josh Stone	2018-09-11	1	-1/+1
\| \| \| \|	llvm-svn: 341966
*	[X86] Correct the one use check from r341915.	Craig Topper	2018-09-11	1	-1/+1
\| \| \| \| \| \|	The one use check should be on the bitcast, not the input to the bitcast. llvm-svn: 341956
*	[MIPS] Fix illegal type assert in single-float mode	Simon Atanasyan	2018-09-11	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \|	An fp_to_sint node would be incorrectly lowered to a TruncIntFP node in single-float mode. This would trigger an "Unexpected illegal type!" assert. Patch by Dan Ravensloft. Differential revision: https://reviews.llvm.org/D51810 llvm-svn: 341952
*	[ARM] Add smlald support in ARMParallelDSP	Sam Parker	2018-09-11	1	-13/+41
\| \| \| \| \| \| \| \| \|	Search from i64 reducing phis, as well as i32, to allow the generation of smlald instructions. Differential Revision: https://reviews.llvm.org/D51101 llvm-svn: 341941
*	[ARM] Enable ARMCodeGenPrepare by default	Sam Parker	2018-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \|	We've had the pass enabled downstream for a couple of weeks and it seems to be okay, so enable it by default. Differential Revision: https://reviews.llvm.org/D51920 llvm-svn: 341932
*	[AMDGPU] Preliminary patch for divergence driven instruction selection. Immediate selection predicate changed	Alexander Timofeev	2018-09-11	4	-21/+62
\| \| \| \| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D51734 Reviewers: rampitec llvm-svn: 341928
*	[mips] Add a pattern for 64-bit GPR variant of the `rdhwr` instruction	Simon Atanasyan	2018-09-11	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MIPS ISAs support a third operand for the `rdhwr` instruction starting from Revision 6, but LLVM generates assembler code with the three-operand version of this instruction on any MIPS64 ISA. The third operand is always zero, so in case of direct code generation we get correct code. This patch fixes the bug by adding an instruction alias. The same alias already exists for the 32-bit ISA. Ideally, we also need to reject the three-operand version of the `rdhwr` instruction in assembler code if the ISA revision is less than 6. That is a task for a separate patch. This fixes PR38861 (https://bugs.llvm.org/show_bug.cgi?id=38861) Differential revision: https://reviews.llvm.org/D51773 llvm-svn: 341919