bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[x86] add load fold patterns for movddup with vzext_load	Sanjay Patel	2018-12-22	1	-4/+2
	The missed load folding noticed in D55898 is visible independent of that change either with an adjusted IR pattern to start or with AVX2/AVX512 (where the build vector becomes a broadcast first; movddup is not produced until we get into isel via tablegen patterns). Differential Revision: https://reviews.llvm.org/D55936 llvm-svn: 350005
*	[x86] add movddup specialization for build vector lowering (PR37502)	Sanjay Patel	2018-12-21	1	-22/+10
	This is admittedly a narrow fix for the problem: https://bugs.llvm.org/show_bug.cgi?id=37502 ...but as the XOP restriction shows, it's a maze to get this right. In the motivating example, note that we have movddup before SSE4.1 and again with AVX2. That's because insertps isn't available pre-SSE41 and vbroadcast is (more generally) available with AVX2 (and the splat is reduced to movddup via isel pattern). Differential Revision: https://reviews.llvm.org/D55898 llvm-svn: 349937
*	[x86] move test for movddup; NFC	Sanjay Patel	2018-12-21	1	-48/+0
	This adds an AVX512 run as suggested in D55936. The test didn't really belong with other build vector tests because that's not the pattern here. I don't see much value in adding 64-bit RUNs because they wouldn't exercise the isel patterns that we're aiming to expose. llvm-svn: 349920
*	[x86] add test to show missed movddup load fold; NFC	Sanjay Patel	2018-12-20	1	-0/+48
	llvm-svn: 349773
*	[x86] add test to show ddup hole; NFC (PR37502)	Sanjay Patel	2018-12-19	1	-0/+65
	llvm-svn: 349680
*	[X86][SSE] Don't vectorize splat buildvector of binops (PR30780)	Simon Pilgrim	2017-12-31	1	-14/+9
	Don't combine buildvector(binop(),binop(),binop(),binop()) -> binop(buildvector(), buildvector()) if it's a splat - keep the binop scalar and just splat the result to avoid large vector constants. llvm-svn: 321607
*	[X86][SSE] Add PR30780 test cases	Simon Pilgrim	2017-12-30	1	-0/+103
	Broadcast of sign/zero extended scalars resulting in unnecessary vector constants. llvm-svn: 321584
*	[CodeGen] Unify MBB reference format in both MIR and debug output	Francis Visoiu Mistrih	2017-12-04	1	-30/+30
	As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions.
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(\1)/g'
	  find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
	* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
	* grep -nr 'BB#' and fix
	Differential Revision: https://reviews.llvm.org/D40422
	llvm-svn: 319665
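	The third substitution above (`BB#<n>` to `%bb.<n>`) can be sketched in Python for a single string. This is only an illustration of the regular expression; the function name `rename_mbb_refs` and the sample text are hypothetical and are not part of the commit.

```python
import re


def rename_mbb_refs(text: str) -> str:
    """Rewrite old-style 'BB#<n>' block references to '%bb.<n>'.

    Mirrors the sed expression s/BB#([0-9]+)/%bb.\\1/g from the commit
    message; the function itself is a hypothetical helper.
    """
    return re.sub(r"BB#([0-9]+)", r"%bb.\1", text)


if __name__ == "__main__":
    # Hypothetical sample line, not taken from any real test file.
    print(rename_mbb_refs("jne BB#5"))
```

	The other two sed expressions rewrite C++ printing code rather than test output, so they are not covered by this sketch.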
*	[X86] Teach the execution domain fixing tables to use movlhps in place of unpcklpd for the packed single domain.	Craig Topper	2017-09-18	1	-3/+3
	MOVLHPS has a smaller encoding than UNPCKLPD in the legacy encodings. With VEX and EVEX encodings it doesn't matter. llvm-svn: 313509
*	Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset	Nirav Dave	2017-07-05	1	-18/+5
	Relanding after rewriting undef.ll test to avoid host-dependent endianness.
	As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks.
	Tests of note:
	* test/CodeGen/X86/build-vector* - Improved.
	* test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge
	* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation.
	Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab
	Subscribers: nemanjai, llvm-commits
	Differential Revision: https://reviews.llvm.org/D34472
	llvm-svn: 307114
*	Revert "[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset"	Nirav Dave	2017-06-30	1	-5/+18
	This reverts commit r306819, which appears to be exposing underlying issues in a stage1 ppc64be build. llvm-svn: 306820
*	[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset	Nirav Dave	2017-06-30	1	-18/+5
	As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks.
	Tests of note:
	* test/CodeGen/X86/build-vector* - Improved.
	* test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge
	* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation.
	Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab
	Subscribers: nemanjai, llvm-commits
	Differential Revision: https://reviews.llvm.org/D34472
	llvm-svn: 306819
*	[X86][SSE] Change BUILD_VECTOR interleaving ordering to improve coalescing/combine opportunities	Simon Pilgrim	2017-06-04	1	-48/+44
	We currently generate BUILD_VECTOR as a tree of UNPCKL shuffles of the same type, e.g. for v4f32:
	  Step 1: unpcklps 0, 2 ==> X: <?, ?, 2, 0>
	        : unpcklps 1, 3 ==> Y: <?, ?, 3, 1>
	  Step 2: unpcklps X, Y ==> <3, 2, 1, 0>
	The issue is that, because we are not placing sequential vector elements together early enough, we fail to recognise many combinable patterns - consecutive scalar loads, extractions etc.
	Instead, this patch unpacks progressively larger sequential vector elements together, e.g. for v4f32:
	  Step 1: unpcklps 0, 1 ==> X: <?, ?, 1, 0>
	        : unpcklps 2, 3 ==> Y: <?, ?, 3, 2>
	  Step 2: unpcklpd X, Y ==> <3, 2, 1, 0>
	This does mean that we are creating UNPCKL shuffles of different value types, but the relevant combines that benefit from this are quite capable of handling the additional BITCASTs that are now included in the shuffle tree.
	Differential Revision: https://reviews.llvm.org/D33864
	llvm-svn: 304688
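	The two interleaving orders can be compared with a small lane-level model. This is a rough sketch, not LLVM code: vectors are plain Python lists with lane 0 first (the commit message writes the highest lane first), and the helper names `scalar`, `build_old` and `build_new` are hypothetical. The second scheme is modelled as pairing elements 0, 1 and 2, 3, which is what the intermediate results <?, ?, 1, 0> and <?, ?, 3, 2> imply.

```python
UNDEF = None  # a lane whose value does not matter


def unpcklps(a, b):
    """Interleave the two low 32-bit lanes of a and b (lane 0 first)."""
    return [a[0], b[0], a[1], b[1]]


def unpcklpd(a, b):
    """Concatenate the low 64-bit halves (two lanes each) of a and b."""
    return [a[0], a[1], b[0], b[1]]


def scalar(i):
    """Model scalar element i sitting in lane 0 of its own vector."""
    return [i, UNDEF, UNDEF, UNDEF]


def build_old():
    """Same-type tree: pair elements (0, 2) and (1, 3), then unpcklps."""
    x = unpcklps(scalar(0), scalar(2))
    y = unpcklps(scalar(1), scalar(3))
    return x, y, unpcklps(x, y)


def build_new():
    """Sequential pairs: (0, 1) and (2, 3), then a wider unpcklpd."""
    x = unpcklps(scalar(0), scalar(1))
    y = unpcklps(scalar(2), scalar(3))
    return x, y, unpcklpd(x, y)


if __name__ == "__main__":
    for name, build in (("old", build_old), ("new", build_new)):
        x, y, result = build()
        print(name, "X =", x, "Y =", y, "result =", result)
```

	Both trees end with the elements in order, but only the second keeps adjacent elements together in the intermediate values X and Y, which is the property the commit relies on for recognising consecutive loads and extractions.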
*	[X86][SSE] Add 128/256/512 bit vector build vector from register tests	Simon Pilgrim	2017-05-05	1	-0/+428
	llvm-svn: 302243