bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[Stackmaps] Make the frame-pointer required for stackmaps.	Juergen Ributzka	2014-10-02	7	-12/+12
\| \| \| \| \| \| \| \| \|	Do not eliminate the frame pointer if there is a stackmap or patchpoint in the function. All stackmap references should be FP relative. This fixes PR21107. llvm-svn: 218920
*	Revert "DI: Fold constant arguments into a single MDString"	Duncan P. N. Exon Smith	2014-10-02	54	-1617/+1619
\| \| \| \| \| \|	This reverts commit r218914 while I investigate some bots. llvm-svn: 218918
*	DI: Fold constant arguments into a single MDString	Duncan P. N. Exon Smith	2014-10-02	54	-1619/+1617
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch addresses the first stage of PR17891 by folding constant arguments together into a single MDString. Integers are stringified and a `\0` character is used as a separator. Part of PR17891. Note: I've attached my testcases upgrade scripts to the PR. If I've just broken your out-of-tree testcases, they might help. llvm-svn: 218914
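The folding described in this commit message (integers stringified, fields joined with a `\0` separator) can be pictured with a small sketch. This is only an illustration of the string layout; the function names and the field values are hypothetical and are not taken from LLVM.

```python
def fold_fields(*fields):
    """Join constant fields into one string, separated by NUL characters.

    Integers are stringified first, as the commit message describes.
    """
    return "\0".join(str(field) for field in fields)


def unfold_fields(folded):
    """Split a folded string back into its stringified fields."""
    return folded.split("\0")


# Hypothetical field values, chosen only for illustration.
header = fold_fields(52, "x", 7)
print(unfold_fields(header))
```

Every field comes back as a string after unfolding, so a reader of the folded form has to know which positions hold integers.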
*	[x86] Teach the new vector shuffle lowering to widen floating point	Chandler Carruth	2014-10-02	2	-10/+194
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	elements as well as integer elements in order to form simpler shuffle patterns. This is the primary reason why we were failing to match some of the 2-and-2 floating point shuffles such as PR21140. Even after fixing this we need to support some extra patterns in the backend in order to match the resulting X86ISD::UNPCKL nodes into the correct instructions. This commit should fix PR21140 and includes more comprehensive testing of insertion patterns in v4 shuffles. Not all of the added tests are beautiful. For example, we don't have clever instructions to insert-via-load in the integer domain. There are also some places where we aren't sufficiently cunning with our use of movq and movd, but that's future work. llvm-svn: 218911
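Widening elements to form simpler shuffle patterns rests on a check over the shuffle mask: each adjacent pair of lanes must select an aligned pair of adjacent source elements. The sketch below shows that check in isolation. It is an assumption-laden illustration, not the code in the x86 backend; `widen_mask` and the use of `-1` for an undefined lane are conventions chosen here.

```python
def widen_mask(mask):
    """Try to express a shuffle mask over elements twice as wide.

    Returns the widened mask, or None when some adjacent pair of lanes
    does not select an aligned pair of adjacent source elements.
    A lane value of -1 marks an undefined lane.
    """
    if len(mask) % 2:
        return None
    widened = []
    for lo, hi in zip(mask[0::2], mask[1::2]):
        if lo == -1 and hi == -1:
            widened.append(-1)       # both halves undefined
        elif lo != -1 and lo % 2 == 0 and hi in (lo + 1, -1):
            widened.append(lo // 2)  # aligned pair, high half may be undefined
        elif lo == -1 and hi % 2 == 1:
            widened.append(hi // 2)  # low half undefined, high half is odd
        else:
            return None
    return widened


print(widen_mask([0, 1, 4, 5]))  # a 4-lane mask that widens to 2 lanes
print(widen_mask([0, 2, 1, 3]))  # an interleave, which cannot be widened
```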
*	[x86] Move the vperm2f128 test to be vperm2x128 and test both the	Chandler Carruth	2014-10-02	3	-116/+217
\| \| \| \| \| \| \| \| \|	floating point and integer domains. Merge the AVX2 test into it and add an extra RUN line. Generate clean FileCheck statements with my script. Remove the now merged AVX2 tests. llvm-svn: 218903
*	[x86] Just delete the last combine test file.	Chandler Carruth	2014-10-02	1	-448/+0
\| \| \| \| \| \| \| \| \| \|	This file isn't really doing anything useful. Many of the tests that seem to be combined are also repeats from other test files. Many of the other tests, despite the comment that they should be combined into a single shuffle... well... aren't combined into a single shuffle. =/ llvm-svn: 218862
*	[x86] Merge still more combine tests into the common file. These at	Chandler Carruth	2014-10-02	2	-237/+382
\| \| \| \| \| \| \|	least seem slightly more interesting test wise, although given how spottily we actually combine anything, I remain somewhat suspicious. llvm-svn: 218861
*	[x86] Merge the third combining test into the generic one and add proper	Chandler Carruth	2014-10-02	2	-380/+1001
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	checks for all the ISA variants. If the SSE2 checks here terrify you, good. This is (in large part) the kind of amazingly bad code that is holding LLVM back when vectorizing on older ISAs. At the same time, these tests seem increasingly dubious to me. There are a very large number of tests and it isn't clear that they are systematically covering a specific set of functionality. Anyways, I don't want to reduce testing during the transition, I just want to consolidate it to where it is easier to manage. llvm-svn: 218860
*	[x86] Merge the second set of vector combining tests into a common test	Chandler Carruth	2014-10-02	2	-317/+470
\| \| \| \| \| \| \| \| \| \| \| \| \|	file. Some of these really don't make sense to test -- we're testing for the lack of combining two shuffles into one, presumably because the two would generate better shuffles in the end. But if you look at the generated code shown here, in many cases the generated code is, frankly, terrible. Or we combine any two generated shuffles back into a single instruction! I've left a FIXME to revisit these decisions. llvm-svn: 218859
*	[x86] Merge the bitwise operation shuffle combining into the common test	Chandler Carruth	2014-10-02	2	-253/+468
\| \| \| \| \| \|	file, adding assertions across the ISA variants for it. llvm-svn: 218858
*	[x86] Update this test to run a full complement of the ISA extensions,	Chandler Carruth	2014-10-02	1	-54/+92
\| \| \| \| \| \| \| \| \|	and use the new grouped FileCheck patterns to match them. No interesting changes yet, but this test is now in proper form to have the other shuffle combining tests merged into it. llvm-svn: 218857
*	[x86] Minimize the parameters to this test for clarity.	Chandler Carruth	2014-10-02	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \|	The test has to do with DAG combines, and so it doesn't need the new vector shuffle lowering to be effective. Also, it has a nice in-IR triple string which we should really be using rather than command line flags (unless it varies from RUN-line to RUN-line). Finally, I much prefer letting LLVM synthesize the correct datalayout string from the triple rather than baking one in here that will just become stale. llvm-svn: 218856
*	[x86] Add a comment clarifying that this test should span all manners of	Chandler Carruth	2014-10-02	1	-0/+5
\| \| \| \| \| \| \| \| \| \|	generic DAG combining of shuffles relevant to x86. My plan is to fold a bunch of the other DAG combining test cases into this one, while converting them to use the nice new FileCheck assertion syntax. llvm-svn: 218855
*	[x86] Switch some of the new consolidated vector tests to use	Chandler Carruth	2014-10-02	3	-208/+239
\| \| \| \| \| \| \| \| \|	a bare-metal triple and have nice BB labels, etc. No significant change here, just tidying up to have a consistent set of OS-agnostic vector functionality here. llvm-svn: 218854
*	[x86] Improve and correct how the new vector shuffle lowering was	Chandler Carruth	2014-10-01	1	-15/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	matching and lowering 64-bit insertions. The first problem was that we weren't looking through bitcasts to discover that we could lower as insertions. Once fixed, we in turn weren't looking through bitcasts to discover that we could fold a load into the lowering. Once fixed, we weren't forming a SCALAR_TO_VECTOR node around the inserted element and instead were passing a scalar to a DAG node that expected a vector. It turns out there are some patterns that will "lower" this into the correct asm, but the rest of the X86 backend is very unhappy with such antics. This should fix a few more edge case regressions I've spotted going through the regression test suite to enable the new vector shuffle lowering. llvm-svn: 218839
*	Lower FNEG ( FABS (x) ) -> FNABS (x) [X86 codegen] PR20578	Sanjay Patel	2014-10-01	1	-0/+77
\| \| \| \| \| \| \| \| \| \| \| \|	Negative FABS of either a scalar or vector should be handled the same way on x86 with SSE/AVX: a single OR instruction of the FP operand with a constant to light up the sign bit(s). http://llvm.org/bugs/show_bug.cgi?id=20578 Differential Revision: http://reviews.llvm.org/D5201 llvm-svn: 218822
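The single-OR lowering works because IEEE-754 keeps the sign in the top bit, so setting that bit yields the negated absolute value whatever the input sign was. A minimal sketch for one 32-bit float follows; `fnabs_f32` and `SIGN_MASK_F32` are names made up for this example, and the code models the bit operation rather than the emitted x86 instruction.

```python
import struct

SIGN_MASK_F32 = 0x80000000  # only the IEEE-754 single-precision sign bit set


def fnabs_f32(x):
    """Return -|x| for a 32-bit float by OR-ing in the sign bit."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (result,) = struct.unpack("<f", struct.pack("<I", bits | SIGN_MASK_F32))
    return result


print(fnabs_f32(1.5), fnabs_f32(-2.0))
```

For a vector operand the same constant would be repeated in every lane, which is why scalars and vectors can share one code path.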
*	[x86] Merge the remaining test cases into vector-blend.ll and remove all	Chandler Carruth	2014-10-01	4	-218/+117
\| \| \| \| \| \|	the ISA-specific test files. llvm-svn: 218818
*	[x86] Expand the ISA coverage of our blend test in preparation for	Chandler Carruth	2014-10-01	1	-86/+423
\| \| \| \| \| \|	merging ISA-specific testing into this file. llvm-svn: 218816
*	[x86] Merge the interesting test cases from blend-msb.ll into	Chandler Carruth	2014-10-01	2	-40/+18
\| \| \| \| \| \|	vector-blend.ll and remove the former. llvm-svn: 218814
*	[x86] Move the AVX blend test to a generic name. I'm going to fold other	Chandler Carruth	2014-10-01	1	-0/+0
\| \| \| \| \| \|	blend tests into this one. llvm-svn: 218813
*	[x86] Remove a test that wasn't doing anything really. We have plenty of	Chandler Carruth	2014-10-01	1	-69/+0
\| \| \| \| \| \|	better tests for zext of vectors at this point. llvm-svn: 218811
*	[x86] Add a 32-bit run to the sext test, and remove a sad vec_sext.ll	Chandler Carruth	2014-10-01	2	-80/+181
\| \| \| \| \| \| \| \| \| \| \|	test file. This old test had a bunch of functions that were never even checked. =/ The only thing it really did was to make sure that we did something reasonable in 32-bit mode with SSE4.1. Adding another run line to the main vector-sext.ll test seems a better way to do that. llvm-svn: 218810
*	[x86] Teach both sext and zext vector tests to cover a nice wide range	Chandler Carruth	2014-10-01	2	-184/+662
\| \| \| \| \| \| \| \| \| \| \| \| \|	of architectures: SSE2, SSSE3, SSE4.1, AVX, and AVX2. Unfortunately, this exposes the absolute horror of the code we generate for many of these patterns. Anyone wanting to familiarize themselves with the x86 backend and improve performance could do a lot of good sitting down and making these test cases not look so terrible. While the new vector shuffle code I'm working on will help some, it won't fix all of the crimes here. llvm-svn: 218807
*	[x86] Sort the ISA-specific RUN lines for vector-sext.ll to go from	Chandler Carruth	2014-10-01	1	-155/+155
\| \| \| \| \| \| \|	oldest to newest. This makes more sense to me and is more consistent with other tests. llvm-svn: 218802
*	ARM: yes it can (as of r218789)	Tim Northover	2014-10-01	1	-3/+0
\| \| \| \|	llvm-svn: 218801
*	[x86] Rename avx-{s,z}ext.ll to vector-{s,z}ext.ll.	Chandler Carruth	2014-10-01	2	-0/+0
\| \| \| \| \| \| \| \|	These tests are far and away the best sext and zext tests we have for vectors. I'm going to merge the other similar tests into them and expand the ISA coverage. llvm-svn: 218800
*	[x86] Cleanup and re-generate the checks for avx-zext.ll using the new	Chandler Carruth	2014-10-01	1	-19/+32
\| \| \| \| \| \|	script. llvm-svn: 218799
*	[x86] Generate the FileCheck assertions for avx-blend.ll with my new	Chandler Carruth	2014-10-01	1	-96/+75
\| \| \| \| \| \| \|	script to make them nice and predictable. This will ease updating them for the new vector shuffle lowering and seeing the delta if any. llvm-svn: 218795
*	[x86] Clean up and generate detailed FileCheck assertions for	Chandler Carruth	2014-10-01	1	-123/+365
\| \| \| \| \| \| \| \| \| \| \| \| \|	avx-sext.ll using my new script. Also add an AVX2 mode to this test. Part of cleaning up the test suite before enabling the new vector shuffle lowering. This also highlights some of the abysmal failures of the old shuffle lowering. Check out those 'pinsrw' and 'pextrw' sequences! llvm-svn: 218794
*	ARM: allow copying of CPSR when all else fails.	Tim Northover	2014-10-01	1	-0/+41
\| \| \| \| \| \| \| \| \| \| \| \|	As with x86 and AArch64, certain situations can arise where we need to spill CPSR in the middle of a calculation. These should be avoided where possible (MRS/MSR is rather expensive), which ARM is actually better at than the other two since it tries to Glue defs to uses, but as a last ditch effort, copying is better than crashing. rdar://problem/18011155 llvm-svn: 218789
*	Move the complex address expression out of DIVariable and into an extra	Adrian Prantl	2014-10-01	44	-220/+220
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! Note: I accidentally committed a bogus older version of this patch previously. llvm-svn: 218787
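The storage argument in this commit message can be illustrated with a toy model: when the address expression lives inside the variable node, n pieces need n distinct variable nodes, whereas passing the expression separately lets all pieces share one. The classes below are hypothetical stand-ins invented for this sketch and do not mirror LLVM's metadata classes.

```python
from collections import namedtuple

# Hypothetical stand-ins for metadata nodes; not LLVM's real classes.
OldVariable = namedtuple("OldVariable", "name type expr")  # expression embedded
Variable = namedtuple("Variable", "name type")             # expression split out


def describe_pieces_old(name, type_, piece_exprs):
    """Old scheme: one variable node per piece, differing only in expr."""
    return [OldVariable(name, type_, expr) for expr in piece_exprs]


def describe_pieces_new(name, type_, piece_exprs):
    """New scheme: a single shared variable node plus a separate expression."""
    var = Variable(name, type_)
    return [(var, expr) for expr in piece_exprs]


# Three made-up pieces given as (operation, offset, size) tuples.
pieces = [("piece", 0, 4), ("piece", 4, 4), ("piece", 8, 4)]
old = describe_pieces_old("s", "struct S", pieces)
new = describe_pieces_new("s", "struct S", pieces)
print(len(set(old)), len({var for var, _ in new}))
```

Because the expressions are plain values in this model, identical ones compare equal, which loosely corresponds to the uniquing across the CU that the message mentions.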
*	Add fptrunc to mips fast-isel	Reed Kotler	2014-10-01	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Implement conversion of 64 to 32 bit floating point numbers (fptrunc) in mips fast-isel Test Plan: fptrunc.ll checked also with 4 internal mips build bot flavors mips32r1/mips32r2 and at -O0 and -O2 Reviewers: dsanders Reviewed By: dsanders Subscribers: rfuhler Differential Revision: http://reviews.llvm.org/D5553 llvm-svn: 218785
*	Revert r218778 while investigating buildbot breakage.	Adrian Prantl	2014-10-01	44	-220/+220
\| \| \| \| \| \|	"Move the complex address expression out of DIVariable and into an extra" llvm-svn: 218782
*	Move the complex address expression out of DIVariable and into an extra	Adrian Prantl	2014-10-01	44	-220/+220
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! llvm-svn: 218778
*	R600: Call EmitFunctionHeader() in the AsmPrinter to populate the ELF symbol ↵	Tom Stellard	2014-10-01	265	-1519/+1520
\| \| \| \| \| \|	table llvm-svn: 218776
*	Revert r216862 due to a performance regression	Jingyue Wu	2014-10-01	3	-50/+7
\| \| \| \| \| \|	Reported by Alexey Volkov in PR21115 llvm-svn: 218771
*	[ARM] Allow selecting VRINT[APMXZR] and VCVT[BT] instructions for FPv5	Oliver Stannard	2014-10-01	3	-40/+65
\| \| \| \| \| \| \| \| \| \|	Currently, we only codegen the VRINT[APMXZR] and VCVT[BT] instructions when targeting ARMv8, but they are actually present on any target with FP-ARMv8. Note that FP-ARMv8 is called FPv5 when it is part of an M-profile core, but they have the same instructions so we model them both as FPARMv8 in the ARM backend. llvm-svn: 218763
*	[x86] Fix a few more tiny patterns with the new vector shuffle lowering	Chandler Carruth	2014-10-01	1	-0/+190
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	that keep cropping up in the regression test suite. This also addresses one of the issues raised on the mailing list with failing to form 'movsd' in as many cases as we realistically should. There will be corresponding patches forthcoming for v4f32 at least. This was a lot of fuss for a relatively small gain, but all the fuss was on my end trying different ways of holding the pieces of the x86 fragment patterns just right. Now that it works, the code is reasonably simple. In the new test cases I'm adding here, v2i64 sticks out as just plain horrible. I've not come up with any great ideas here other than that it would be nice to recognize when we're going to take a domain crossing hit and cross earlier to get the decent instructions. At least with AVX it is slightly less silly.... llvm-svn: 218756
*	Add missing natural vector cast.	Asiri Rathnayake	2014-10-01	1	-0/+65
\| \| \| \| \| \| \| \| \|	Summary: The natural vector cast node (similar to bitcast) AArch64ISD::NVCAST was introduced in r217159 and r217138. This patch adds a missing cast from v2f32 to v1i64 which is causing some compilation failures. Also added test cases to cover various modimm types and BUILD_VECTORs with i64 elements. llvm-svn: 218751
*	[ARM] Add support for Cortex-M7, FPv5-SP and FPv5-DP (LLVM)	Oliver Stannard	2014-10-01	6	-17/+55
\| \| \| \| \| \| \| \| \|	The Cortex-M7 has 3 options for its FPU: none, FPv5-SP-D16 and FPv5-DP-D16. FPv5 has the same instructions as FP-ARMv8, so it can be modelled using the same target feature, and all double-precision operations are already disabled by the fp-only-sp target features. llvm-svn: 218747
*	[mips] For indirect calls we don't need $gp to point to .got. Mips linker	Sasa Stankovic	2014-10-01	2	-4/+15
\| \| \| \| \| \| \| \| \|	doesn't generate lazy binding stub for a function whose address is taken in the program. Differential Revision: http://reviews.llvm.org/D5067 llvm-svn: 218744
*	[x86] Teach the new vector shuffle lowering to be even more aggressive	Chandler Carruth	2014-10-01	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	in exposing the scalar value to the broadcast DAG fragment so that we can catch even reloads and fold them into the broadcast. This is somewhat magical I'm afraid but seems to work. It is also what the old lowering did, and I've switched an old test to run both lowerings demonstrating that we get the same result. Unlike the old code, I'm not lowering f32 or f64 scalars through this path when we only have AVX1. The target patterns include pretty heinous code to re-cast those as shuffles when the scalar happens to not be spilled because AVX1 provides no broadcast mechanism from registers what-so-ever. This is terribly brittle. I'd much rather go through our generic lowering code to get this. If needed, we can add a peephole to get even more opportunities to broadcast-from-spill-slots that are exposed post-RA, but my suspicion is this just doesn't matter that much. llvm-svn: 218734
*	[x86] Hoist the zext-lowering up in the v4i32 lowering routine -- it is	Chandler Carruth	2014-10-01	1	-5/+20
\| \| \| \| \| \| \| \| \| \|	the same speed as pshufd but we can fold loads into the pmovzx instructions. This fixes some regressions that came up in the regression test suite for the new vector shuffle lowering. llvm-svn: 218733
*	[x86] Teach the new vector shuffle lowering about VBROADCAST and	Chandler Carruth	2014-10-01	8	-263/+310
\| \| \| \| \| \| \| \| \| \|	VPBROADCAST. This has the somewhat expected pervasive impact. I don't know why I forgot about this. Everything seems good with lots of significant improvements in the tests. llvm-svn: 218724
*	[x86] Add AVX1 and AVX2 testing to all of the 128-bit shuffle test	Chandler Carruth	2014-09-30	4	-375/+855
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	cases. While clearly we don't need the AVX vector width, these ISA extensions often cause us to select different instructions and we should cover them even with the narrow vector width. Also, while here, nuke the stress_test2 contents. There is no reason to try to FileCheck this entire body when it is mostly a test for successfully surviving the code generator. llvm-svn: 218710
*	[x86] Update the exact FileCheck syntax of the 256-bit and 512-bit	Chandler Carruth	2014-09-30	5	-1961/+1962
\| \| \| \| \| \| \| \| \| \| \|	shuffle tests to match that used in the script I posted and now used consistently in 128-bit tests. Nothing interesting changing here, just using the label name as the FileCheck label and a slightly more general comment marker consumption strategy. llvm-svn: 218709
*	[x86] Rework all of the 128-bit vector shuffle tests with my handy test	Chandler Carruth	2014-09-30	4	-1222/+2541
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	updating script so that they are more thorough and consistent. Specific fixes here include: - Actually test VEX-encoded AVX mnemonics. - Actually use an SSE 4.1 run to test SSE 4.1 features! - Correctly check instruction sequences from the start of the function. - Elide the shuffle operands and comment designator in a consistent way. - Test all of the architectures instead of just the ones I was motivated to manually author. I've gone back through and fixed up any egregious issues I spotted. Let me know if I missed something you really dislike. One downside to this is that we're now not as diligently using FileCheck variables for registers. I would be much more concerned with this if we had larger register usage, but there just aren't that interesting of register choices here and most of the registers are constrained by the ABI. Ultimately, I don't think this is likely to be the maintenance burden for these tests and updating them again should be straightforward. llvm-svn: 218707
*	Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.	Juergen Ributzka	2014-09-30	1	-0/+125
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Note: This version fixed an issue with the TBZ/TBNZ instructions that were generated in FastISel. The issue was that the 64bit version of TBZ (TBZX) automagically sets the upper bit of the immediate field that is used to specify the bit we want to test. To test for any of the lower 32bits we have to first extract the subregister and use the 32bit version of the TBZ instruction (TBZW). Original commit message: Teach selectBranch to fold bit test and branch into a single instruction (TBZ or TBNZ). llvm-svn: 218693
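The fix described in this commit message amounts to choosing the instruction variant from the bit position: bits 0 to 31 go through the 32-bit form on an extracted subregister, and only bits 32 to 63 use the 64-bit form. The sketch below restates that rule and nothing more. It relies only on the commit message's description, and `select_tbz_variant` with its tuple result is a hypothetical helper, not FastISel code.

```python
def select_tbz_variant(bit):
    """Pick the TBZ variant for testing `bit` of a 64-bit value.

    Returns (opcode, needs_subregister_extract). Per the commit message,
    the 64-bit form always sets the upper bit of its immediate field, so
    it can only address bits 32..63; lower bits need the 32-bit form
    applied to the extracted 32-bit subregister.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit index must be in the range 0..63")
    if bit < 32:
        return ("TBZW", True)
    return ("TBZX", False)


print(select_tbz_variant(3), select_tbz_variant(40))
```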
*	R600/SI: Fix printing of clamp and omod	Matt Arsenault	2014-09-30	5	-15/+15
\| \| \| \| \| \| \| \|	No tests for omod since nothing uses it yet, but this should get rid of the remaining annoying trailing zeros after some instructions. llvm-svn: 218692
*	Add numeric extend, truncate to mips fast-isel	Reed Kotler	2014-09-30	2	-0/+200
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add numeric extend, truncate to mips fast-isel Reactivates D4827 Test Plan: fpext.ll loadstoreconv.ll Reviewers: dsanders Subscribers: mcrosier Differential Revision: http://reviews.llvm.org/D5251 llvm-svn: 218681