bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AVX-512] Add vpermilps/pd to load folding tables.	Craig Topper	2016-12-09	2	-0/+200
	llvm-svn: 289173
*	[AVX-512] Move some floating point stack folding test cases out of the integer test.	Craig Topper	2016-12-09	4	-192/+192
	llvm-svn: 289172
*	GlobalISel: fall back gracefully for debug intrinsics.	Tim Northover	2016-12-08	1	-0/+33
	Supporting them properly is a reasonably complex chunk of work, so to allow bot testing before then we should at least be able to fall back to DAG ISel. llvm-svn: 289150
*	[mips] Change gnueabi to gnu in the triple because EABI has been removed recently. NFC	Simon Atanasyan	2016-12-08	1	-1/+1
	llvm-svn: 289114
*	[mips] Remove N32 Android test because Android does not support N32 ABI. NFC	Simon Atanasyan	2016-12-08	1	-2/+0
	llvm-svn: 289113
*	Don't emit .seh_handler directives for any cleanup funclets	Reid Kleckner	2016-12-08	1	-1/+1
	We were falsely claiming that we had an LSDA for the relevant EH personality before this change, which could lead to the EH machinery interpreting random adjacent data as an LSDA. Fixes PR31317. This change is safe because cleanups can't contain exception handlers today. We do these things to maintain that invariant: C++ destructors are naturally out-of-line; __finally blocks are outlined in clang; LLVM's inliner will not inline EH constructs into cleanups. llvm-svn: 289101
*	AMDGPU: Make f16 ConstantFP legal	Matt Arsenault	2016-12-08	1	-2/+3
	Not having this legal led to combine failures, resulting in dumb things like bitcasts of constants not being folded away. The only reason I'm leaving the v_mov_b32 hack that f32 already uses is to avoid madak formation test regressions. PeepholeOptimizer has an ordering issue where the immediate fold attempt is into the sgpr->vgpr copy instead of the actual use. Running it twice avoids that problem. llvm-svn: 289096
*	AMDGPU: Fix commuting v_sub_u16	Matt Arsenault	2016-12-08	1	-0/+169
	The correct commutable opcode was set to itself, so this was simply swapping the operands to commute instead of also changing the opcode to v_subrev_u16. llvm-svn: 289093
*	[AMDGPU] Add amdgpu-unify-metadata pass	Stanislav Mekhanoshin	2016-12-08	1	-0/+26
	Multiple metadata values for records such as opencl.ocl.version, llvm.ident and similar are created after linking several modules. For some of them, notably opencl.ocl.version, this creates a semantic problem because we cannot tell which version of OpenCL the composite module conforms to. Moreover, such repetitions of identical values often create a huge list of unneeded metadata, which grows bitcode size both in memory and stored on disk. It can go up to several Mb when linked against our OpenCL library. Lastly, such long lists obscure reading of dumped IR. The pass unifies metadata after linking. Differential Revision: https://reviews.llvm.org/D25381 llvm-svn: 289092
*	IR, X86: Understand !absolute_symbol metadata on global variables.	Peter Collingbourne	2016-12-08	4	-0/+167
	Summary: Attaching !absolute_symbol to a global variable does two things: 1) Marks it as an absolute symbol reference. 2) Specifies the value range of that symbol's address. Teach the X86 backend to allow absolute symbols to appear in place of immediates by extending the relocImm and mov64imm32 matchers. Start using relocImm in more places where it is legal. As previously proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-October/105800.html Differential Revision: https://reviews.llvm.org/D25878 llvm-svn: 289087
*	[AMDGPU] Scalarization of global uniform loads.	Alexander Timofeev	2016-12-08	2	-0/+206
	Summary: LC can currently select scalar load for uniform memory access based on the readonly memory address space only. This restriction originated from the fact that in HW prior to VI, vector and scalar caches are not coherent. With MemoryDependenceAnalysis we can check that the memory location corresponding to the memory operand of the LOAD is not clobbered along all the paths from the function entry. Reviewers: rampitec, tstellarAMD, arsenm Subscribers: wdng, arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D26917 llvm-svn: 289076
*	X86: Add checks for fma_patterns[_wide].ll with -enable-no-infs-fp-math	Nicolai Haehnle	2016-12-08	2	-701/+1244
	This re-adds checks for the patterns that were disabled with r288506. Reviewers: spatel, delena, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27346 llvm-svn: 289049
*	AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg	Nicolai Haehnle	2016-12-08	7	-98/+122
	Summary: Without the fix to isFrameOffsetLegal to consider the instruction's immediate offset, the new test case hits the corresponding assertion in resolveFrameIndex, because the LocalStackSlotAllocation pass re-uses a different base register. With only the fix to isFrameOffsetLegal, code quality reduces in a bunch of places because frame base registers are added where they're not needed. This is addressed by properly implementing needsFrameBaseReg, which also helps to avoid unnecessary zero frame indices in a bunch of other places. Fixes piglit glsl-1.50/execution/variable-indexing/gs-output-array-vec4-index-wr.shader_test Reviewers: arsenm, tstellarAMD Subscribers: qcolombet, kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D27344 llvm-svn: 289048
*	[AVR] Add MIR tests for pseudo instruction expansions	Dylan McKay	2016-12-08	13	-0/+308
	This adds tests for 13 pseudo instruction expansions. llvm-svn: 289039
*	[X86][SSE] Add vector test for (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2) detailed in D19325	Simon Pilgrim	2016-12-08	1	-0/+36
	llvm-svn: 289035
*	[AVR] Add MIR tests for a few pseudo instructions	Dylan McKay	2016-12-08	3	-0/+72
	llvm-svn: 289031
*	The few days mentioned in r267095 are over	Matthias Braun	2016-12-08	1	-1/+0
	llvm-svn: 289004
*	[InlineSpiller] Don't call TargetInstrInfo::foldMemoryOperand with an empty list.	Quentin Colombet	2016-12-08	1	-0/+22
	Since r287792 if we try to do that we will hit an assert. llvm-svn: 289001
*	GlobalISel: use correct builder for ConstantExprs.	Tim Northover	2016-12-07	1	-2/+1
	ConstantExpr instances were emitting code into the current block rather than the entry block. This meant they didn't necessarily dominate all uses, which is clearly wrong. llvm-svn: 288985
*	GlobalISel: simplify MachineIRBuilder interface.	Tim Northover	2016-12-07	5	-22/+57
	MachineIRBuilder had weird before/after and beginning/end flags for the insert point. Unfortunately the non-default means that instructions will be inserted in reverse order which is almost never what anyone wants. Really, I think we just want (like IRBuilder has) the ability to insert at any C++ iterator-style point (i.e. before any instruction or before MBB.end()). So this fixes MIRBuilders to behave like IRBuilders in this respect. llvm-svn: 288980
*	[X86] Skip over DEBUG_VALUE while looking for start of call sequence	Michael Kuperstein	2016-12-07	1	-0/+55
	If we don't skip over DEBUG_VALUEs, we get differences between -g and non-g code. This fixes PR31242. Differential Revision: https://reviews.llvm.org/D27485 llvm-svn: 288965
*	[X86] Do not assume "ri" instructions always have an immediate operand	Michael Kuperstein	2016-12-07	1	-0/+20
	The second operand of an "ri" instruction may be an immediate, but it may also be a global variable, so we should not make any assumptions. This fixes PR31271. Differential Revision: https://reviews.llvm.org/D27481 llvm-svn: 288964
*	[SelectionDAG] Add knownbits support for vector demandedelts in SMAX/SMIN/UMAX/UMIN opcodes	Simon Pilgrim	2016-12-07	1	-10/+2
	llvm-svn: 288926
*	[X86] Add knownbits vector UMAX test	Simon Pilgrim	2016-12-07	1	-0/+31
	In preparation for demandedelts support llvm-svn: 288920
*	[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes	Simon Pilgrim	2016-12-07	1	-10/+2
	llvm-svn: 288916
*	[X86] Add test to show missed opportunities to calculate knownbits in INSERT_VECTOR_ELT	Simon Pilgrim	2016-12-07	1	-0/+37
	llvm-svn: 288912
*	[X86][SSE] Fix vpextrd/vpextrq checks	Simon Pilgrim	2016-12-07	1	-2/+2
	They were testing for the pre-vex versions llvm-svn: 288911
*	[X86][SSE] Force execution domain of 32-bit extractps/pextrd in the stack folding tests	Simon Pilgrim	2016-12-07	4	-14/+22
	llvm-svn: 288910
*	[X86][SSE] Regenerate test.	Simon Pilgrim	2016-12-07	1	-3/+13
	llvm-svn: 288906
*	[AVR] Expand 'SELECT_CC' nodes wherever possible	Dylan McKay	2016-12-07	1	-2/+0
	llvm-svn: 288905
*	[X86][SSE] Consistently set MOVD/MOVQ load/store/move instructions to integer domain	Simon Pilgrim	2016-12-07	13	-57/+59
	We are being inconsistent with these instructions (and all their variants...) with a random mix of them using the default float domain. Differential Revision: https://reviews.llvm.org/D27419 llvm-svn: 288902
*	[AVR] Move a pseudo expansion test into a folder	Dylan McKay	2016-12-07	1	-0/+0
	llvm-svn: 288899
*	[X86][XOP] Fix VPERMIL2 non-constant pool shuffle decoding (PR31296)	Simon Pilgrim	2016-12-07	1	-2/+8
	The non-constant pool version of DecodeVPERMIL2PMask was not offsetting correctly for the second input. I've updated the code to match the implementation in the constant-pool version. Annoyingly this bug was hidden for so long as it's tricky to combine to useful variable shuffle masks that don't become constant-pool entries. llvm-svn: 288898
*	[AVR] Allow loading from stack slots where src and dest registers are identical	Dylan McKay	2016-12-07	2	-56/+33
	Fixes PR 31256 llvm-svn: 288897
*	AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.	Tom Stellard	2016-12-07	1	-55/+84
	Patch By: Wei Ding Summary: This patch fixes the fdiv precision issues. Reviewers: b-sumner, cfang, wdng, arsenm Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D26424 llvm-svn: 288879
*	AMDGPU: Add llvm.amdgcn.interp.mov intrinsic	Tom Stellard	2016-12-06	1	-1/+5
	Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D26725 llvm-svn: 288865
*	AMDGPU: Fix crash on i16 constant expression	Matt Arsenault	2016-12-06	1	-0/+28
	llvm-svn: 288861
*	[X86][XOP] Add test case for PR31296	Simon Pilgrim	2016-12-06	1	-0/+20
	llvm-svn: 288858
*	[CodeGen] Fix result type for SMULO/UMULO legalization	Eli Friedman	2016-12-06	1	-0/+32
	On some platforms (like MSP430) the second element of the result structure for SMULO/UMULO may have a shorter type than the one returned by SetCC. We need to truncate it to the right type, or else some incorrect code may be generated later on. This fixes issue https://github.com/rust-lang/rust/issues/37829 Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D27154 llvm-svn: 288857
*	AMDGPU/SI: Set correct value for amd_kernel_code_t::kernarg_segment_alignment	Tom Stellard	2016-12-06	1	-0/+28
	Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27416 llvm-svn: 288852
*	AMDGPU/SI: Don't move copies of immediates to the VALU	Tom Stellard	2016-12-06	1	-0/+25
	Summary: If we write an immediate to a VGPR and then copy the VGPR to an SGPR, we can replace the copy with a S_MOV_B32 sgpr, imm, rather than moving the copy to the VALU. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27272 llvm-svn: 288849
*	GlobalISel: correctly handle small args via memory.	Tim Northover	2016-12-06	1	-0/+9
	We were rounding size in bits down rather than up, leading to 0-sized slots for i1 (assert!) and bugs for other types not byte-aligned. llvm-svn: 288848
*	[X86] Prefer reduced width multiplication over pmulld on Silvermont	Zvi Rackover	2016-12-06	1	-0/+71
	Summary: Prefer expansions such as: pmullw, pmulhw, unpacklwd, unpackhwd over pmulld. On Silvermont [source: Optimization Reference Manual]: PMULLD has a throughput of 1/11 [instruction/cycles]. PMULHUW/PMULHW/PMULLW have a throughput of 1/2 [instruction/cycles]. Fixes pr31202. Analysis of this issue was done by Fahana Aleen. Reviewers: wmi, delena, mkuper Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D27203 llvm-svn: 288844
*	[DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combine	Simon Pilgrim	2016-12-06	1	-12/+0
	Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted. Differential Revision: https://reviews.llvm.org/D27461 llvm-svn: 288842
*	GlobalISel: fall back gracefully when we hit unhandled legalizer default.	Tim Northover	2016-12-06	1	-0/+8
	llvm-svn: 288840
*	[SelectionDAG] We can ignore knownbits from an undef shuffle vector index if we don't actually demand that element	Simon Pilgrim	2016-12-06	1	-2/+23
	llvm-svn: 288839
*	GlobalISel: handle G_SEQUENCE fallbacks gracefully.	Tim Northover	2016-12-06	1	-0/+9
	There were two problems: + AArch64 was reusing random data from its binary op tables, which is complete nonsense for G_SEQUENCE. + Even when AArch64 gave up and said it couldn't handle G_SEQUENCE, the generic code asserted. llvm-svn: 288836
*	GlobalISel: allow G_SELECT instructions for pointers.	Tim Northover	2016-12-06	1	-0/+11
	llvm-svn: 288835
*	GlobalISel: stop the legalizer from trying to handle oddly-sized types.	Tim Northover	2016-12-06	1	-0/+8
	It'll almost immediately fail because it always tries to halve/double the size until it finds a legal one. Unfortunately, this triggers an assertion preventing the DAG fallback from being possible. llvm-svn: 288834
*	[X86][SSE] Add knownbits test demonstrating demandedelts not ignoring undef shuffle elements	Simon Pilgrim	2016-12-06	1	-0/+21
	llvm-svn: 288825
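
Note on the "GlobalISel: correctly handle small args via memory" entry (llvm-svn: 288848): the message attributes the bug to rounding a size in bits down, rather than up, when converting it to a slot size. The sketch below is not LLVM code. The function names are hypothetical and exist only to illustrate why floor division gives a 0-sized slot for a 1-bit value while ceiling division does not.

```python
def bytes_for_bits_floor(bits: int) -> int:
    """Round down to whole bytes: the behaviour the message describes as buggy."""
    return bits // 8


def bytes_for_bits_ceil(bits: int) -> int:
    """Round up to the next whole byte, so any non-zero width gets storage."""
    return (bits + 7) // 8


if __name__ == "__main__":
    # Compare both roundings for a few widths, including non-byte-aligned ones.
    for bits in (1, 8, 9, 17, 32):
        print(bits, bytes_for_bits_floor(bits), bytes_for_bits_ceil(bits))
```

For widths that are already a multiple of 8 the two functions agree; they differ only for widths that are not byte-aligned, which matches the types the message calls out.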