bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	CodeGen: BlockPlacement: Increase tail duplication size for O3.	Kyle Butt	2017-05-15	2	-9/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At O3 we are more willing to increase size if we believe it will improve performance. The current threshold for tail-duplication of 2 instructions is conservative, and can be relaxed at O3. Benchmark results: llvm test-suite: 6% improvement in aha, due to duplication of loop latch; 3% improvement in hexxagon; 2% slowdown in lpbench (seems related, but couldn't completely diagnose). Internal google benchmark: produces 4% improvement on internal google protocol buffer serialization benchmarks. Differential-Revision: https://reviews.llvm.org/D32324 llvm-svn: 303084
*	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ↵	Simon Pilgrim	2017-05-15	9	-3616/+3618
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	ReadMem/WriteMem (PR32146) Follow up to D33147. NVPTXTargetLowering::LowerCall was trusting the default argument values. Fixes another 17 of the NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146. Differential Revision: https://reviews.llvm.org/D33189 llvm-svn: 303082
*	Add an extra test for archive symbol tables.	Rafael Espindola	2017-05-15	1	-0/+19
\| \| \| \| \| \|	The table should include only defined symbols. llvm-svn: 303075
*	[SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8 ↵	Simon Pilgrim	2017-05-15	3	-0/+1998
\| \| \| \| \| \|	add/sub/mul llvm-svn: 303074
*	[AArch64] Enable FeatureFuseAES on Cortex-A72.	Florian Hahn	2017-05-15	1	-0/+33
\| \| \| \| \| \| \| \|	This patch enables fusing dependent AESE/AESMC and AESD/AESIMC instruction pairs on Cortex-A72, as recommended in the Software Optimization Guide, section 4.10. llvm-svn: 303073
*	[AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64	Dmitry Preobrazhensky	2017-05-15	8	-94/+94
\| \| \| \| \| \| \| \| \| \|	See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D33123 llvm-svn: 303070
*	[SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8 shifts	Simon Pilgrim	2017-05-15	3	-0/+2589
\| \| \| \|	llvm-svn: 303069
*	Test commit.	Dinar Temirbulatov	2017-05-15	1	-0/+1
\| \| \| \|	llvm-svn: 303059
*	[AMDGPU][MC] Removed V_MQSAD_U16_U8	Dmitry Preobrazhensky	2017-05-15	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \|	This instruction does not really exist. See bug 33018: https://bugs.llvm.org//show_bug.cgi?id=33018 Reviewers: vpykhtin, artem.tamazov Differential Revision: https://reviews.llvm.org/D33126 llvm-svn: 303055
*	[ARM] Mark LEApcrel instructions as isAsCheapAsAMove	John Brawn	2017-05-15	2	-1/+30
\| \| \| \| \| \| \| \| \| \| \|	Doing this means that if an LEApcrel is used in two places we will rematerialize instead of generating two MOVs. This is particularly useful for printfs using the same format string, where we want to generate an address into a register that's going to get corrupted by the call. Differential Revision: https://reviews.llvm.org/D32858 llvm-svn: 303054
*	[ARM] Mark LEApcrel as not having side effects	John Brawn	2017-05-15	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Doing this lets us hoist it out of loops, and I've also marked it as rematerializable the same as the thumb1 and thumb2 counterparts. It looks like it being marked as such was just a mistake, as the commit that made that change only mentions LEApcrelJT and in thumb1 and thumb2 only the LEApcrelJT instructions were marked as having side-effects, so it looks like the intent was to only mark LEApcrelJT as having side-effects but LEApcrel was accidentally marked as such also. Differential Revision: https://reviews.llvm.org/D32857 llvm-svn: 303053
*	[X86] Relocate code of replacement of subtarget unsupported masked memory ↵	Ayman Musa	2017-05-15	3	-1/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	intrinsics to run also on -O0 option. Currently, when masked load, store, gather or scatter intrinsics are used, we check in the CodeGenPrepare pass whether the subtarget supports these intrinsics; if not, we replace them with scalar code. This is a functional transformation, not an optimization (it is not optional). The CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0). Functional transformations should run at all optimization levels, so here I created a new pass which runs at all optimization levels and does no more than this transformation. Differential Revision: https://reviews.llvm.org/D32487 llvm-svn: 303050
*	[TableGen] Add EncoderMethod to RegisterOperand	Sam Kolton	2017-05-15	2	-1/+36
\| \| \| \| \| \| \| \|	Reviewers: stoklund, grosbach, vpykhtin Differential Revision: https://reviews.llvm.org/D32493 llvm-svn: 303044
*	MCObjectStreamer : fail with a diagnostic when emitting an out of range value.	Arnaud A. de Grandmaison	2017-05-15	1	-0/+9
\| \| \| \| \| \| \| \| \|	We were previously silently emitting bogus data in release mode, making it very hard to diagnose the error, or crashing with an assert in debug mode. A proper diagnostic is now always emitted when the value to be emitted is out of range. llvm-svn: 303041
*	[GlobalISel][X86] G_BR instruction select test	Igor Breger	2017-05-15	2	-0/+58
\| \| \| \|	llvm-svn: 303036
*	Add '#' to test regex that I forgot in r303025.	Daniel Jasper	2017-05-15	1	-1/+1
\| \| \| \|	llvm-svn: 303034
*	Fix two tests that weren't correctly copied.	Daniel Jasper	2017-05-14	2	-2/+1
\| \| \| \| \| \| \|	One didn't correctly define the regex variable, the other still had a RUN line for FNOBUILTIN-checks, which weren't copied to the file. llvm-svn: 303025
*	[X86][AVX1] Account for cost of extract/insert of 256-bit shifts	Simon Pilgrim	2017-05-14	3	-52/+52
\| \| \| \|	llvm-svn: 303023
*	[X86][AVX2] Fix costs for v4i64 ashr by splat	Simon Pilgrim	2017-05-14	1	-2/+2
\| \| \| \|	llvm-svn: 303022
*	[X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splat	Simon Pilgrim	2017-05-14	3	-38/+38
\| \| \| \|	llvm-svn: 303021
*	[X86] Add avx512vl command lines to the 128/256-bit vector-lzcnt tests so we ↵	Craig Topper	2017-05-14	2	-92/+656
\| \| \| \| \| \| \| \|	can see what compare instructions are being used in the lookup table code. I noticed the 512-bit lzcnts don't use the X86 specific lookup table code and instead use the EXPAND case in LegalizeDAG. I was toying around with fixing this and noticed it would require compare instructions that generate i1 masks and then converting from mask to vector. Then I noticed that we don't test which compares are used with avx512vl and no avx512cd. llvm-svn: 303020
*	[X86] Cleanup some of the check-prefixes in the vector-lzcnt tests.	Craig Topper	2017-05-14	2	-136/+96
\| \| \| \| \| \|	Remove an unneeded prefix from the 32-bit command line. Make all the 64-bit triples match. Replace ALL with X64 and remove it from the 32-bit test. llvm-svn: 303019
*	[X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul ↵	Simon Pilgrim	2017-05-14	1	-16/+16
\| \| \| \| \| \|	sequences llvm-svn: 303017
*	[COFF] Gracefully handle empty .drectve sections	Shoaib Meenai	2017-05-14	2	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Running `llvm-readobj -coff-directives msvcrt.lib` resulted in this error: "Invalid data was encountered while parsing the file". This happened because some of the object files in the archive have empty `.drectve` sections. These empty sections result in a `parse_failed` error being returned from `COFFObjectFile::getSectionContents()`, which in turn caused `llvm-readobj` to stop. With this change, `getSectionContents` now returns success, and like before the resulting array is empty. Patch by Dave Lee. Differential Revision: https://reviews.llvm.org/D32652 llvm-svn: 303014
*	[X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift + ↵	Simon Pilgrim	2017-05-14	2	-6/+6
\| \| \| \| \| \| \| \|	mask. Tweak cost model to match what lowering actually does. llvm-svn: 303013
*	[X86][SSE] Account for cost of extract/insert of v32i8 vector shifts	Simon Pilgrim	2017-05-14	3	-12/+12
\| \| \| \|	llvm-svn: 303012
*	[X86][XOP] Account for cost of extract/insert of 256-bit vector shifts	Simon Pilgrim	2017-05-14	3	-98/+98
\| \| \| \|	llvm-svn: 303010
*	[X86][AVX] Allow 32-bit targets to peek through subvectors to extract ↵	Simon Pilgrim	2017-05-14	4	-33/+17
\| \| \| \| \| \|	constant splats for vXi64 shifts. llvm-svn: 303009
*	[X86][AVX] Add additional 32-bit target vector shift tests	Simon Pilgrim	2017-05-14	3	-0/+1400
\| \| \| \| \| \|	Shows issue with 32-bits not being able to peek through subvectors to extract constant splats llvm-svn: 303008
*	[InstSimplify] Add patterns for folding (A & B) \| (~A ^ B) -> (~A ^ B) and ↵	Craig Topper	2017-05-14	1	-32/+16
\| \| \| \| \| \| \| \|	its commuted variants. We already had (A & ~B) \| (A ^ B), but we missed the cases where the not was part of the xor. llvm-svn: 303004
*	foo	Craig Topper	2017-05-14	1	-0/+122
\| \| \| \|	llvm-svn: 303003
*	Re-enable test that was disabled due to cost analysis	Xinliang David Li	2017-05-14	1	-1/+1
\| \| \| \|	llvm-svn: 303000
*	[llvm-pdbdump] Add the option to sort functions and data.	Zachary Turner	2017-05-14	5	-6/+98
\| \| \| \|	llvm-svn: 302998
*	[SelectionDAG] Added support for EXTRACT_SUBVECTOR/CONCAT_VECTORS ↵	Simon Pilgrim	2017-05-13	1	-23/+4
\| \| \| \| \| \|	demandedelts in ComputeNumSignBits llvm-svn: 302997
*	[X86][SSE] Test showing missing EXTRACT_SUBVECTOR/CONCAT_VECTORS ↵	Simon Pilgrim	2017-05-13	1	-0/+55
\| \| \| \| \| \|	demandedelts support in ComputeNumSignBits llvm-svn: 302994
*	[SelectionDAG] Add VECTOR_SHUFFLE support to ComputeNumSignBits	Simon Pilgrim	2017-05-13	1	-45/+5
\| \| \| \|	llvm-svn: 302993
*	[X86][SSE] Test showing inability of ComputeNumSignBits to resolve shuffles	Simon Pilgrim	2017-05-13	1	-0/+72
\| \| \| \|	llvm-svn: 302992
*	MSan: Mark MemorySanitizer tests that use x86 intrinsics as REQUIRES: x86	Justin Bogner	2017-05-13	7	-64/+73
\| \| \| \| \| \| \|	Tests that use target intrinsics are inherently target specific. Mark them as such. llvm-svn: 302990
*	[x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization)	Simon Pilgrim	2017-05-13	7	-104/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Further perf tests on Jaguar indicate that: `vxorps %ymm0, %ymm0, %ymm0; vcmpps $15, %ymm0, %ymm0, %ymm0` is consistently faster (by about 9%) than: `vpcmpeqd %xmm0, %xmm0, %xmm0; vinsertf128 $1, %xmm0, %ymm0, %ymm0`. Testing equivalent code on a SandyBridge (E5-2640) puts it slightly (~3%) faster as well. Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D32416 llvm-svn: 302989
*	[LoopOptimizer][Fix]PR32859, PR24738	Simon Pilgrim	2017-05-13	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The loop vectorizer pass introduced an undef value while fixing up the output of LCSSA form. Here it is: before: `%e.0.ph = phi i32 [ 0, %for.inc.2.i ]`; after: `%e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ]`; and after this change we have: `%e.0.ph = phi i32 [ 0, %for.inc.2.i ]`; `%e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ]`. Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D33055 llvm-svn: 302988
*	[InstCombine] Prevent InstCombine from triggering an extra iteration if ↵	Craig Topper	2017-05-13	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	something changed in the initial Worklist creation Summary: If the Worklist build causes an IR change, this change flag currently factors into the flag for running another iteration of the iteration loop. But only changes during processing should trigger another loop. This patch captures the worklist creation change flag into the outside-the-loop flag currently used for DbgDeclares and only sends that flag up to the caller. Rerunning the loop only depends on IC.run() now. This uses the debug output of InstCombine to determine if one or two iterations run. I couldn't think of a better way to detect it since the second spurious iteration shouldn't make any visible changes. Just wasted computation. I can do a pre-commit of the test case with the CHECK-NOT as a CHECK if this is an ok way to check this. This is a subset of D31678 as I'm still not sure how to verify the analysis behavior for that. Reviewers: davide, majnemer, spatel, chandlerc Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32453 llvm-svn: 302982
*	ConstProp: Split x86 SSE intrinsic tests out of calls.ll	Justin Bogner	2017-05-13	2	-206/+209
\| \| \| \| \| \| \|	This allows us to mark this as `REQUIRES: x86`, since it uses x86 target specific intrinsics. llvm-svn: 302980
*	InstCombine: Move tests that use target intrinsics into subdirectories	Justin Bogner	2017-05-13	38	-175/+190
\| \| \| \| \| \| \| \|	Tests with target intrinsics are inherently target specific, so it doesn't actually make sense to run them if we've excluded their target. llvm-svn: 302979
*	Disable llvm/test/Transforms/NewGVN/pr32934.ll while Davide is investigating.	NAKAMURA Takumi	2017-05-13	1	-1/+1
\| \| \| \|	llvm-svn: 302977
*	[NewGVN] XFAIL a flaky test until I find out what's going on.	Davide Italiano	2017-05-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	I bet the change is correct but this test seems to expose some underlying problem that manifests only on some buildbots, and I'm not able to reproduce locally. Unfortunately I can't debug right now but I don't want to annoy people with spurious failures, so I'll XFAIL until I can take a look (over the weekend). llvm-svn: 302976
*	[AVR] When lowering Select8/Select16, put newly generated MBBs in the same spot	Dylan McKay	2017-05-13	1	-0/+35
\| \| \| \| \| \| \| \| \| \|	Contributed by Dr. Gergő Érdi. Fixes a bug. Raised from (https://github.com/avr-rust/rust/issues/49). llvm-svn: 302973
*	AA: Use generic intrinsics for tests instead of target specific ones	Justin Bogner	2017-05-13	5	-67/+87
\| \| \| \| \| \| \| \| \|	Update a few tests to use llvm.masked.load/store instead of arm neon vector loads and stores, and move the tests that are actually specific to those arm intrinsics to their own files. This lets us mark the tests that use target specific intrinsics as requiring those targets. llvm-svn: 302972
*	[PartialInlining] Profile based cost analysis	Xinliang David Li	2017-05-12	9	-12/+160
\| \| \| \| \| \| \| \| \| \| \| \|	Implemented frequency based cost/saving analysis and related options. The pass is now in a state ready to be turned on in the pipeline (in a follow-up). Differential Revision: http://reviews.llvm.org/D32783 llvm-svn: 302967
*	[TLI] Add mapping for various '__<func>_finite' forms of the math routines ↵	Andrew Kaylor	2017-05-12	1	-0/+187
\| \| \| \| \| \| \| \| \| \|	to SVML routines Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31789 llvm-svn: 302957
*	[ConstantFolding] Add folding for various math '__<func>_finite' routines ↵	Andrew Kaylor	2017-05-12	1	-0/+83
\| \| \| \| \| \| \| \| \| \|	generated from -ffast-math Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31788 llvm-svn: 302956