bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[InstSimplify] allow integer vector types to use computeKnownBits	Sanjay Patel	2016-11-27	2	-7/+7
\| \| \| \| \| \| \| \|	Note that the non-splat lshr+lshr test folded, but that does not work in general. Something is missing or wrong in computeKnownBits as the non-splat shl+shl test still shows. llvm-svn: 288005
*	[AVX-512] Add integer and fp unpck instructions to load folding tables.	Craig Topper	2016-11-27	1	-0/+108
\| \| \| \|	llvm-svn: 288004
*	[X86][SSE] Split lowerVectorShuffleAsShift ready for combines. NFCI.	Simon Pilgrim	2016-11-27	1	-31/+60
\| \| \| \| \| \|	Moved most of the matching code into matchVectorShuffleAsShift to share with target shuffle combines (in a future commit). llvm-svn: 288003
*	[X86] Add TB_NO_REVERSE to entries in the load folding table where the ↵	Craig Topper	2016-11-27	1	-188/+206
\| \| \| \| \| \| \| \| \| \|	instruction's load size is smaller than the register size. If we were to unfold these, the load size would be increased to the register size. This is not safe to do since the enlarged load can do things like cross a page boundary into a page that doesn't exist. I probably missed some instructions, but this should be a large portion of them. llvm-svn: 288001
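The page-boundary hazard described in the message above can be illustrated with a small Python sketch. It is not LLVM code: the 4 KiB page size and the helper name `crosses_page` are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def crosses_page(addr: int, size: int, page_size: int = PAGE_SIZE) -> bool:
    """Return True if the byte range [addr, addr + size) touches two pages."""
    first_page = addr // page_size
    last_page = (addr + size - 1) // page_size
    return first_page != last_page


# A 4-byte scalar load placed in the last 4 bytes of a page stays inside it.
addr = PAGE_SIZE - 4
narrow_load_crosses = crosses_page(addr, 4)

# Unfolding would widen the access to the register size (16 bytes for an
# XMM register), so the same address now reads into the following page.
wide_load_crosses = crosses_page(addr, 16)
```

If the following page is unmapped, the widened load faults even though the original narrow load was valid, which is the case TB_NO_REVERSE is meant to rule out.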
*	fix formatting; NFC	Sanjay Patel	2016-11-27	1	-13/+15
\| \| \| \|	llvm-svn: 287997
*	[AVX-512] Add masked EVEX vpmovzx/sx instructions to load folding tables.	Craig Topper	2016-11-27	1	-0/+84
\| \| \| \|	llvm-svn: 287995
*	[X86] Remove alignment restrictions from load folding table for some ↵	Craig Topper	2016-11-27	1	-13/+13
\| \| \| \| \| \| \| \|	instructions that don't have a restriction. Most of these are the SSE4.1 PMOVZX/PMOVSX instructions, which all read less than 128 bits. The only other was MOVUPD, which by definition is an unaligned load. llvm-svn: 287991
*	[X86] Remove hasOneUse check that is redundant with the one in ↵	Craig Topper	2016-11-26	1	-2/+0
\| \| \| \| \| \|	IsProfitableToFold. llvm-svn: 287987
*	[X86] Fix the zero extending load detection in ↵	Craig Topper	2016-11-26	1	-11/+12
\| \| \| \| \| \| \| \|	X86DAGToDAGISel::selectScalarSSELoad to pass the load node to IsProfitableToFold and IsLegalToFold. Previously we were passing the SCALAR_TO_VECTOR node. llvm-svn: 287986
*	[X86] Simplify control flow. NFCI	Craig Topper	2016-11-26	1	-3/+2
\| \| \| \|	llvm-svn: 287985
*	[X86] Add a hasOneUse check to selectScalarSSELoad to keep the same load ↵	Craig Topper	2016-11-26	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	from being folded multiple times. Summary: When selectScalarSSELoad is looking for a scalar_to_vector of a scalar load, it makes sure the load is only used by the scalar_to_vector. But it doesn't make sure the scalar_to_vector is only used once. This can cause the same load to be folded multiple times. This can be bad for performance. This also causes the chain output to be duplicated, but not connected to anything so chain dependencies will not be satisfied. Reviewers: RKSimon, zvi, delena, spatel Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D26790 llvm-svn: 287983
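A toy model of the use-count problem described above, written in Python. The node names and helper functions are hypothetical and do not correspond to LLVM's SelectionDAG API; the sketch only shows why checking the load's use count alone is not enough.

```python
# Toy use lists: each node maps to the nodes that consume its value.
# "addss_1" and "addss_2" are hypothetical users of the scalar_to_vector.
uses = {
    "load": ["scalar_to_vector"],
    "scalar_to_vector": ["addss_1", "addss_2"],
}


def has_one_use(node, uses):
    return len(uses.get(node, [])) == 1


def old_check(uses):
    # Before the fix: only the load's use count is verified.
    return has_one_use("load", uses)


def new_check(uses):
    # After the fix: the scalar_to_vector must also have a single user.
    return has_one_use("load", uses) and has_one_use("scalar_to_vector", uses)


def fold_count(uses, check):
    # When folding is allowed, every user of the scalar_to_vector
    # ends up with its own folded copy of the load.
    return len(uses["scalar_to_vector"]) if check(uses) else 0
```

With the old check the single load is folded into both users; with the added check it is not folded at all and stays a single separate load.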
*	[InstCombine] don't drop metadata in FoldOpIntoSelect()	Sanjay Patel	2016-11-26	1	-3/+3
\| \| \| \|	llvm-svn: 287980
*	add optional param to copy metadata when creating selects; NFC	Sanjay Patel	2016-11-26	1	-7/+3
\| \| \| \| \| \| \| \| \| \| \|	There are other spots where we can use this; we're currently dropping metadata in some places, and there are proposed changes where we will want to propagate metadata. IRBuilder's CreateSelect() already has a parameter like this, so this change makes the regular 'Create' API line up with that. llvm-svn: 287976
*	[AVX-512] Add unmasked EVEX vpmovzx/sx instructions to load folding tables.	Craig Topper	2016-11-26	1	-0/+36
\| \| \| \|	llvm-svn: 287975
*	[AVX-512] Add masked 128/256-bit integer add/sub instructions to load ↵	Craig Topper	2016-11-26	1	-0/+64
\| \| \| \| \| \|	folding tables. llvm-svn: 287974
*	[AVX-512] Add masked 512-bit integer add/sub instructions to load folding ↵	Craig Topper	2016-11-26	1	-0/+31
\| \| \| \| \| \|	tables. llvm-svn: 287972
*	[AVX-512] Teach LowerFormalArguments to use the extended register class when ↵	Craig Topper	2016-11-26	1	-4/+4
\| \| \| \| \| \|	available. Fix the avx512vl stack folding tests to clobber more registers; otherwise they use xmm16 after this change. llvm-svn: 287971
*	[AVX-512] Add VLX versions of VDIVPD/PS and VMULPD/PS to load folding tables.	Craig Topper	2016-11-26	1	-0/+8
\| \| \| \|	llvm-svn: 287970
*	AMDGPU/SI: Use float as the operand type for amdgcn.interp intrinsics	Tom Stellard	2016-11-26	2	-2/+4
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26724 llvm-svn: 287962
*	[X86][XOP] Add a reversed reg/reg form for VPROT instructions.	Craig Topper	2016-11-26	1	-0/+7
\| \| \| \| \| \|	The W bit distinguishes which operand is the memory operand. But if the mod bits are 3 then the memory operand is a register and there are two possible encodings. We already did this correctly for several other XOP instructions. llvm-svn: 287961
*	[X86] Add SSE, AVX, and AVX2 version of MOVDQU to the load/store folding ↵	Craig Topper	2016-11-26	1	-0/+6
\| \| \| \| \| \| \| \|	tables for consistency. Not sure this is truly needed but we had the floating point equivalents, the aligned equivalents, and the EVEX equivalents. So this just makes it complete. llvm-svn: 287960
*	[AVX-512] Put the AVX-512 sections of the load folding tables into mostly ↵	Craig Topper	2016-11-25	1	-365/+373
\| \| \| \| \| \|	alphabetical order. This is consistent with the older sections of the table. NFC llvm-svn: 287956
*	Replace some callers of setTailCall with setTailCallKind	David Majnemer	2016-11-25	4	-13/+12
\| \| \| \| \| \| \|	We were a little sloppy with adding tailcall markers. Be more consistent by using setTailCallKind instead of setTailCall. llvm-svn: 287955
*	AMDGPU/SI: Add back reverted SGPR spilling code, but disable it	Marek Olsak	2016-11-25	8	-96/+284
\| \| \| \| \| \|	suggested as a better solution by Matt llvm-svn: 287942
*	Use SDValue helpers instead of explicitly going via SDValue::getNode(). NFCI	Simon Pilgrim	2016-11-25	1	-7/+7
\| \| \| \|	llvm-svn: 287941
*	Use SDValue helper instead of explicitly going via SDValue::getNode(). NFCI	Simon Pilgrim	2016-11-25	1	-5/+5
\| \| \| \|	llvm-svn: 287940
*	[AVX-512] Add support for changing VSHUFF64x2 to VSHUFF32x4 when it's feeding ↵	Craig Topper	2016-11-25	1	-9/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a vselect with 32-bit element size. Summary: Shuffle lowering may have widened the element size of an i32 shuffle to i64 before selecting X86ISD::SHUF128. If this shuffle was used by a vselect this can prevent us from selecting masked operations. This patch detects this and changes the element size to match the vselect. I don't handle changing integer to floating point or vice versa as it's not clear if it's better to push such a bitcast to the inputs of the shuffle or to the user of the vselect. So I'm ignoring that case for now. Reviewers: delena, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27087 llvm-svn: 287939
*	[AVX-512] Add VPERMT2* and VPERMI2* instructions to load folding tables.	Craig Topper	2016-11-25	1	-0/+32
\| \| \| \|	llvm-svn: 287937
*	Revert "AMDGPU: Implement SGPR spilling with scalar stores"	Marek Olsak	2016-11-25	3	-153/+10
\| \| \| \| \| \|	This reverts commit 4404d0d6e354e80dd7f8f0a0e12d8ad809cf007e. llvm-svn: 287936
*	Revert "AMDGPU: Fix MMO when splitting spill"	Marek Olsak	2016-11-25	2	-79/+47
\| \| \| \| \| \|	This reverts commit 79d4f8b8b1ce430c3d5dac4fc72a9eebaed24fe1. llvm-svn: 287935
*	Revert "AMDGPU: Fix adding extra implicit def of register"	Marek Olsak	2016-11-25	1	-25/+14
\| \| \| \| \| \|	This reverts commit e834ce5976567575621901fb967b8018b9916d71. llvm-svn: 287934
*	Revert "AMDGPU: Fix not setting kill flag on temp reg when spilling"	Marek Olsak	2016-11-25	1	-1/+1
\| \| \| \| \| \|	This reverts commit 057bbbe4ae170247ba37f08f2e70ef185267d1bb. llvm-svn: 287933
*	Revert "AMDGPU: Make m0 unallocatable"	Marek Olsak	2016-11-25	6	-23/+16
\| \| \| \| \| \|	This reverts commit 124ad83dae04514f943902446520c859adee0e96. llvm-svn: 287932
*	Revert "AMDGPU: Remove m0 spilling code"	Marek Olsak	2016-11-25	1	-3/+37
\| \| \| \| \| \|	This reverts commit f18de36554eb22416f8ba58e094e0272523a4301. llvm-svn: 287931
*	Revert "AMDGPU: Preserve m0 value when spilling"	Marek Olsak	2016-11-25	1	-34/+5
\| \| \| \| \| \|	This reverts commit a5a179ffd94fd4136df461ec76fb30f04afa87ce. llvm-svn: 287930
*	[Loop Unswitch] Patch to selectively unswitch only the reachable branch ↵	Abhilash Bhandari	2016-11-25	1	-1/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instructions. Summary: The iterative algorithm for Loop Unswitching may render some of the branches unreachable in the unswitched loops. Given the exponential nature of the algorithm, this is quite an overhead. This patch fixes this problem by selectively unswitching only those branches within a loop that are reachable from the loop header. Reviewers: Michael Zolothukin, Anna Thomas, Weiming Zhao. Subscribers: llvm-commits. Differential Revision: http://reviews.llvm.org/D26299 llvm-svn: 287925
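The idea of restricting unswitching to branches reachable from the loop header can be sketched in Python as follows. This is a toy model under stated assumptions: the CFG is a plain dictionary of successor lists, the block names are made up, and edges already known to be dead after an earlier unswitch are assumed to have been removed. It is not the code from the patch.

```python
from collections import deque


def reachable_from(header, successors):
    """Breadth-first walk collecting every block reachable from the header."""
    seen = {header}
    worklist = deque([header])
    while worklist:
        block = worklist.popleft()
        for succ in successors.get(block, ()):
            if succ not in seen:
                seen.add(succ)
                worklist.append(succ)
    return seen


def unswitch_candidates(header, successors, invariant_branches):
    """Keep only the loop-invariant branches that sit in reachable blocks."""
    live = reachable_from(header, successors)
    return sorted(b for b in invariant_branches if b in live)


# In this unswitched copy the header's condition is constant, so only the
# "then" side is reachable; "else" still exists but is dead.
successors = {
    "header": ["then"],
    "then": ["latch"],
    "else": ["latch"],
    "latch": ["header"],
}
invariant_branches = {"then", "else"}  # blocks ending in an invariant branch
```

Here `unswitch_candidates("header", successors, invariant_branches)` keeps only `"then"`, so no further loop copies are created for the branch in the dead `"else"` block.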
*	[mips] Correct jal expansion for local symbols in .local directives.	Simon Dardis	2016-11-25	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch corrects the behaviour of code such as the three-line sequence ".local foo" / "jal foo" / "foo:" to use the correct jal expansion when writing ELF files. Patch by: Daniel Sanders Reviewers: zoran.jovanovic, seanbruno, vkalintiris Differential Revision: https://reviews.llvm.org/D24722 llvm-svn: 287918
*	[X86] Invert an 'if' and early out to fix a weird indentation. NFCI	Craig Topper	2016-11-25	1	-1/+2
\| \| \| \|	llvm-svn: 287909
*	[X86] Size a SmallVector to the worst case mask size for a 512-bit shuffle. NFCI	Craig Topper	2016-11-25	1	-1/+1
\| \| \| \|	llvm-svn: 287908
*	[DAGCombine] Teach DAG combine that if both inputs of a vselect are the ↵	Craig Topper	2016-11-24	1	-0/+4
\| \| \| \| \| \| \| \|	same, then the condition doesn't matter and the vselect can be removed. Selects with scalar condition already handle this correctly. llvm-svn: 287904
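Why the condition is irrelevant when both inputs match can be seen in a short Python sketch. Vectors are modelled as lists, and value equality stands in for the "same node" check a DAG combine would perform; the function names are assumptions for the example.

```python
def vselect(cond, a, b):
    """Lane-wise select: take a[i] where cond[i] is true, otherwise b[i]."""
    return [x if c else y for c, x, y in zip(cond, a, b)]


def fold_vselect(cond, a, b):
    """Return the shared operand when both inputs match, else None (no fold)."""
    return a if a == b else None


v = [1, 2, 3, 4]
mask = [True, False, True, False]

selected = vselect(mask, v, v)    # every lane picks from the same vector
folded = fold_vselect(mask, v, v)  # the select can be replaced by v itself
```

Whatever the mask holds, each lane reads the same value from either side, so replacing the select with its operand does not change the result.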
*	Test commit access.	Serge Rogatch	2016-11-24	1	-0/+1
\| \| \| \|	llvm-svn: 287898
*	Fix unused variable warning	Simon Pilgrim	2016-11-24	1	-1/+0
\| \| \| \|	llvm-svn: 287889
*	[X86] Don't round trip a unique_ptr through a raw pointer for assignment.	Benjamin Kramer	2016-11-24	1	-1/+1
\| \| \| \| \| \|	No functional change. llvm-svn: 287888
*	[X86][SSE] Improve UINT_TO_FP v2i32 -> v2f64	Simon Pilgrim	2016-11-24	1	-8/+38
\| \| \| \| \| \| \| \| \| \|	Vectorize UINT_TO_FP v2i32 -> v2f64 instead of scalarizing it (albeit still on the SIMD unit). The codegen matches that generated by legalization (and is in fact used by AVX for UINT_TO_FP v4i32 -> v4f64), but has to be done in the x86 backend to account for legalization via v4i32. Differential Revision: https://reviews.llvm.org/D26938 llvm-svn: 287886
*	[X86][AVX512] Add support for v2i64 fptosi/fptoui/sitofp/uitofp on ↵	Simon Pilgrim	2016-11-24	3	-5/+29
\| \| \| \| \| \| \| \|	AVX512DQ-only targets. Use 512-bit instructions with subvector insertion/extraction, as we do in a number of similar circumstances. llvm-svn: 287882
*	[X86][AVX512DQVL] Add awareness of vcvtqq2ps and vcvtuqq2ps implicit zeroing ↵	Simon Pilgrim	2016-11-24	1	-0/+11
\| \| \| \| \| \|	of upper 64-bits of xmm result llvm-svn: 287878
*	[X86][AVX512DQVL] Add support for v2i64 -> v2f32 SINT_TO_FP/UINT_TO_FP lowering	Simon Pilgrim	2016-11-24	1	-4/+22
\| \| \| \|	llvm-svn: 287877
*	[x86] Fixing PR28755 by precomputing the address used in CMPXCHG8B	Nikolai Bozhenov	2016-11-24	3	-1/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The bug arises during register allocation on i686 for the CMPXCHG8B instruction when a base pointer is needed. CMPXCHG8B needs 4 implicit registers (EAX, EBX, ECX, EDX) and a memory address, plus ESI is reserved as the base pointer. With such constraints, the only way the register allocator can do its job successfully is when the addressing mode of the instruction requires only one register. If that is not the case, we emit an additional LEA instruction to compute the address. It fixes PR28755. Patch by Alexander Ivchenko <alexander.ivchenko@intel.com> Differential Revision: https://reviews.llvm.org/D25088 llvm-svn: 287875
*	[x86] Minor refactoring of X86TargetLowering::EmitInstrWithCustomInserter	Nikolai Bozhenov	2016-11-24	1	-10/+6
\| \| \| \| \| \| \| \| \| \|	Move the definitions of three variables out of the switch. Patch by Alexander Ivchenko <alexander.ivchenko@intel.com> Differential Revision: https://reviews.llvm.org/D25192 llvm-svn: 287874
*	[x86] Rewrite getAddressFromInstr helper function	Nikolai Bozhenov	2016-11-24	1	-17/+18
\| \| \| \| \| \| \| \| \| \| \| \| \|	- It does not modify the input instruction. - The second operand of any address is always an Index Register; make sure we actually check for that, instead of checking for an immediate value. Patch by Alexander Ivchenko <alexander.ivchenko@intel.com> Differential Revision: https://reviews.llvm.org/D24938 llvm-svn: 287873