bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[X86] Lower v16i16->v8i16 truncate using an 'and' with 255, an extract_subvector, and a packuswb instruction.	Craig Topper	2018-11-18	1	-4/+2
	Summary: This is an improvement over the two pshufbs and punpcklqdq we'd get otherwise. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54671 llvm-svn: 347171
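The lowering described above can be modelled lane by lane. The following is a minimal sketch, assuming 16-bit lanes held as Python integers; the helper names `packuswb` and `truncate_v16i16` are illustrative and are not LLVM APIs. It shows why masking with 255 first makes the signed saturation of packuswb harmless.

```python
def packuswb(lo, hi):
    """Model PACKUSWB: read each 16-bit word as signed, clamp it to 0..255."""
    def saturate(word):
        signed = word - 0x10000 if word & 0x8000 else word
        return max(0, min(255, signed))
    return [saturate(w) for w in lo + hi]


def truncate_v16i16(words):
    """Truncate sixteen 16-bit lanes to bytes via and + extract + pack."""
    masked = [w & 0xFF for w in words]   # the 'and' with 255
    lo, hi = masked[:8], masked[8:]      # extract_subvector: low and high halves
    return packuswb(lo, hi)              # every lane is already in 0..255


# Hypothetical input values chosen to cover the sign bit and the upper byte.
words = [0x1234, 0xFFFF, 0x8000, 0x00FF] * 4
assert truncate_v16i16(words) == [w & 0xFF for w in words]
```

After the mask every lane is a small non-negative value, so the clamp inside the pack never changes it and the result equals a plain truncate.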
*	[CodeGen] Unify MBB reference format in both MIR and debug output	Francis Visoiu Mistrih	2017-12-04	1	-3/+3
	As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions.
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
	* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
	* grep -nr 'BB#' and fix
	Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665
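The third substitution above, 's/BB#([0-9]+)/%bb.\1/g', can be tried on a single string before it is run over a tree. This is a minimal sketch using Python's `re` module instead of sed; the function name `rewrite_mbb_refs` and the sample text are hypothetical.

```python
import re


def rewrite_mbb_refs(text):
    """Rewrite old-style 'BB#<n>' references to the '%bb.<n>' form."""
    return re.sub(r"BB#([0-9]+)", r"%bb.\1", text)


# Hypothetical line of debug output, used only for illustration.
sample = "Successors according to CFG: BB#5 BB#12"
print(rewrite_mbb_refs(sample))
```

Text without a 'BB#' followed by digits passes through unchanged, which is why the commit still ends with a manual grep for leftovers.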
*	[X86 Codegen] Fixed a bug in unsigned saturation	Elena Demikhovsky	2017-01-29	1	-24/+0
	PACKUSWB converts signed words to unsigned bytes (and likewise PACKUSDW for dwords), so it can't be used for the umin+truncate pattern. AVX-512 VPMOVUS* instructions fit the pattern since they convert unsigned to unsigned. See https://llvm.org/bugs/show_bug.cgi?id=31773 Differential Revision: https://reviews.llvm.org/D29196 llvm-svn: 293431
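The mismatch can be seen on individual lanes. The following is a minimal sketch, assuming one 16-bit lane held as a Python integer; `packuswb_lane` and `umin_truncate_lane` are illustrative names, not LLVM or intrinsic APIs.

```python
def packuswb_lane(word):
    """PACKUSWB on one lane: read the 16-bit input as signed, clamp to 0..255."""
    signed = word - 0x10000 if word & 0x8000 else word
    return max(0, min(255, signed))


def umin_truncate_lane(word):
    """The IR pattern: unsigned min with 255, then truncate to 8 bits."""
    return min(word, 255) & 0xFF


# Print lane value, expected umin+truncate result, and the PACKUSWB result.
for word in (0x0010, 0x0100, 0x7FFF, 0x8000, 0xFFFF):
    print(hex(word), umin_truncate_lane(word), packuswb_lane(word))
```

The two functions agree while the sign bit is clear. From 0x8000 upward the pattern expects 255, but the signed interpretation clamps the lane to 0.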
*	Recommitting unsigned saturation with a bugfix.	Elena Demikhovsky	2017-01-19	1	-0/+26
	A test case that crashed is added to avx512-trunc.ll. (PR31589) llvm-svn: 292479
*	Revert r291670 because it introduces a crash.	Michael Kuperstein	2017-01-18	1	-26/+0
	r291670 doesn't crash on the original testcase from PR31589, but it crashes on a slightly more complex one. PR31589 has the new reproducer. llvm-svn: 292444
*	X86 CodeGen: Optimized pattern for truncate with unsigned saturation.	Elena Demikhovsky	2017-01-11	1	-0/+26
	DAG patterns optimization: truncate + unsigned saturation is supported by VPMOVUS* instructions in AVX-512 and by VPACKUS* instructions on SSE* targets. Differential Revision: https://reviews.llvm.org/D28216 llvm-svn: 291670
*	[x86] use a single shufps when it can save instructions	Sanjay Patel	2016-12-15	1	-3/+1
	This is a tiny patch with a big pile of test changes. This partially fixes PR27885: https://llvm.org/bugs/show_bug.cgi?id=27885
	My motivating case looks like this:
	- vpshufd {{.*#+}} xmm1 = xmm1[0,1,0,2]
	- vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
	- vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3],xmm1[4,5,6,7]
	+ vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm1[0,2]
	And this happens several times in the diffs. For chips with domain-crossing penalties, the instruction count and size reduction should usually overcome any potential domain-crossing penalty due to using an FP op in a sequence of int ops. For chips such as recent Intel big cores and Atom, there is no domain-crossing penalty for shufps, so using shufps is a pure win.
	So the test case diffs all appear to be improvements except one test in vector-shuffle-combining.ll where we miss an opportunity to use a shift to generate zero elements and one test in combine-sra.ll where multiple uses prevent the expected shuffle combining.
	Differential Revision: https://reviews.llvm.org/D27692 llvm-svn: 289837
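The equivalence claimed in the motivating case can be checked on symbolic lanes. This is a minimal sketch, assuming each xmm register is a list of four 32-bit elements; the helper names are illustrative and model only the specific masks shown in the diff.

```python
def pshufd(v, idx):
    """Permute the four 32-bit elements of one register."""
    return [v[i] for i in idx]


def pblendw_low_high(a, b):
    """Blend with words 0-3 from a and words 4-7 from b (dwords 0-1 and 2-3)."""
    return a[:2] + b[2:]


def shufps(a, b, idx_a, idx_b):
    """Two elements picked from a, followed by two picked from b."""
    return [a[i] for i in idx_a] + [b[i] for i in idx_b]


xmm0 = ["a0", "a1", "a2", "a3"]
xmm1 = ["b0", "b1", "b2", "b3"]

# Old sequence: two pshufd permutes followed by a word blend.
old = pblendw_low_high(pshufd(xmm0, [0, 2, 2, 3]), pshufd(xmm1, [0, 1, 0, 2]))
# New sequence: a single shufps.
new = shufps(xmm0, xmm1, [0, 2], [0, 2])
assert old == new == ["a0", "a2", "b0", "b2"]
```

Both sequences produce the same four lanes, so the three-instruction form collapses to one instruction for this mask.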
*	[x86] fix test specifications	Sanjay Patel	2016-12-12	1	-4/+4
	llvm-svn: 289493
*	[X86][AVX] Regenerated AVX tests	Simon Pilgrim	2016-01-16	1	-8/+29
	Updated i1 select, vector truncation and subvector extraction tests llvm-svn: 257995
*	[x86] Teach the 128-bit vector shuffle lowering routines to take advantage of the existence of a reasonable blend instruction.	Chandler Carruth	2015-02-16	1	-3/+3
	The 256-bit vector shuffle lowering has leveraged the general technique of decomposed shuffles and blends for quite some time, but this never made it back into the 128-bit code, and there are a large number of patterns where this is substantially better. For example, this removes almost all domain crossing in vector shuffles that involve some blend and some permutation with SSE4.1 and later. See the massive reduction in 'shufps' for integer test cases in this commit.
	This isn't perfect yet for a few reasons:
	1) The v8i16 shuffle lowering continues to plague me. We don't always form an unpack-based blend when that would be better. But the wins pretty drastically outstrip the losses here.
	2) The v16i8 shuffle lowering is just a disaster here. I never went and implemented blend support here for some terrible reason. I'll do that next probably. I've not updated it for now.
	More variations on this technique are coming as well -- we don't shuffle-into-unpack or shuffle-into-palignr, both of which would also be profitable.
	Note that some test cases grow significantly in the number of instructions, but I expect to actually be faster. We use pshufd+pshufd+blendw instead of a single shufps, but the pshufd's are very likely to pipeline well (two ports on most modern intel chips) and the blend is a very fast instruction. The domain switch penalty will essentially always be more than a blend instruction, which is the only increase in tree height.
	llvm-svn: 229350
*	Lower AVX v4i64->v4i32 truncate to one shuffle.	Cameron McInally	2014-03-05	1	-3/+5
	llvm-svn: 202996
*	X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate.	Benjamin Kramer	2013-10-23	1	-1/+6
	Also update the cost model. llvm-svn: 193270
*	Cleanup: test source files do not need to be executable	Arnaud A. de Grandmaison	2013-04-22	1	-0/+0
	llvm-svn: 180003
*	Revert revision: 171467. This transformation is incorrect and makes some tests fail.	Nadav Rotem	2013-01-04	1	-15/+0
	Original message: Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. llvm-svn: 171468
*	Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1.	Elena Demikhovsky	2013-01-03	1	-0/+15
	Added a test. llvm-svn: 171467
*	Unix line endings	Matt Beaumont-Gay	2012-02-02	1	-15/+15
	llvm-svn: 149615
*	Optimization for "truncate" operation on AVX.	Elena Demikhovsky	2012-02-01	1	-0/+15
	Truncating v4i64 -> v4i32 and v8i32 -> v8i16 may be done with a set of shuffles. llvm-svn: 149485
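The idea that a truncate is a shuffle can be sketched on plain integers. This is a minimal illustration, assuming little-endian lane layout; it shows only which elements are selected, not the particular shuffle instructions the commit emits, and the function name is hypothetical.

```python
def truncate_v4i64_to_v4i32(values):
    """Truncate four 64-bit lanes by viewing them as eight 32-bit lanes."""
    as_v8i32 = []
    for v in values:
        # Little-endian: the low half of each i64 comes first.
        as_v8i32 += [v & 0xFFFFFFFF, (v >> 32) & 0xFFFFFFFF]
    # Keeping the even-numbered 32-bit lanes is a shuffle with mask <0, 2, 4, 6>.
    return [as_v8i32[i] for i in (0, 2, 4, 6)]


# Hypothetical input values; the high halves are discarded.
values = [0x1111111100000001, 0x2222222200000002, 0x3333333300000003, 0x4444444400000004]
assert truncate_v4i64_to_v4i32(values) == [1, 2, 3, 4]
```

The v8i32 -> v8i16 case follows the same pattern with sixteen 16-bit lanes and the mask <0, 2, 4, ..., 14>.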