bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Remove an unreachable 'break' following a 'return'.	Craig Topper	2013-04-22	1	-1/+0
	llvm-svn: 179991
*	Improve performance of file I/O.	Bill Wendling	2013-04-22	1	-17/+21
	The fread / fwrite calls were happening for each timer. However, that could be pretty expensive for a large number of timers. Instead, read and write the timers in one call. This gives a ~10% speedup in compilation time.
	llvm-svn: 179990
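	The batching idea in the commit above can be sketched as follows. This is only an illustration, not the code from the commit: `TimerRecord`, `writeEach`, `writeAll` and `readAll` are hypothetical names, and the record layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for a timer record; not the type used in the commit.
struct TimerRecord {
  std::uint64_t id;
  std::uint64_t ticks;
};

// One fwrite call per record: pays the per-call overhead once per timer.
bool writeEach(std::FILE *f, const std::vector<TimerRecord> &recs) {
  for (const TimerRecord &r : recs)
    if (std::fwrite(&r, sizeof r, 1, f) != 1)
      return false;
  return true;
}

// One fwrite call for the whole array: same bytes, a single library call.
bool writeAll(std::FILE *f, const std::vector<TimerRecord> &recs) {
  return std::fwrite(recs.data(), sizeof(TimerRecord), recs.size(), f) ==
         recs.size();
}

// One fread call that fills a vector already sized to the expected count.
bool readAll(std::FILE *f, std::vector<TimerRecord> &recs) {
  return std::fread(recs.data(), sizeof(TimerRecord), recs.size(), f) ==
         recs.size();
}
```

	Both write functions produce identical file contents; only the number of library calls differs, which is where the saving would come from.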
*	Legalize vector truncates by parts rather than just splitting.	Jim Grosbach	2013-04-21	5	-38/+81
	Rather than just splitting the input type and hoping for the best, apply a bit more cleverness. Just splitting the types until the source is legal often leads to an illegal result type, which is then widened, and a scalarization step is introduced, which leads to truly horrible code generation. With the loop vectorizer, these sorts of operations are much more common, so it's worth extra effort to do them well.

	Add a legalization hook for the operands of a TRUNCATE node, which will be encountered after the result type has been legalized but while the operand type is still illegal. If simple splitting of both types ends up with the result type of each half still being legal, just do that (v16i16 -> v16i8 on ARM, for example). If, however, that would result in an illegal result type (v8i32 -> v8i8 on ARM, for example), we can get more clever with power-of-two vectors. Specifically, split the input type, but also widen the result element size, then concatenate the halves and truncate again. For example, on ARM, to perform a "%res = v8i8 trunc v8i32 %in" we transform to:

	    %inlo = v4i32 extract_subvector %in, 0
	    %inhi = v4i32 extract_subvector %in, 4
	    %lo16 = v4i16 trunc v4i32 %inlo
	    %hi16 = v4i16 trunc v4i32 %inhi
	    %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
	    %res = v8i8 trunc v8i16 %in16

	This allows instruction selection to generate three VMOVN instructions instead of a sequence of moves, stores and loads.

	Update the ARMTargetTransformInfo to take this improved legalization into account.

	Consider the simplified IR:

	    define <16 x i8> @test1(<16 x i32>* %ap) {
	      %a = load <16 x i32>* %ap
	      %tmp = trunc <16 x i32> %a to <16 x i8>
	      ret <16 x i8> %tmp
	    }
	    define <8 x i8> @test2(<8 x i32>* %ap) {
	      %a = load <8 x i32>* %ap
	      %tmp = trunc <8 x i32> %a to <8 x i8>
	      ret <8 x i8> %tmp
	    }

	Previously, we would generate the truly hideous:

	    .syntax unified
	    .section __TEXT,__text,regular,pure_instructions
	    .globl _test1
	    .align 2
	    _test1: @ @test1
	    @ BB#0:
	    push {r7}
	    mov r7, sp
	    sub sp, sp, #20
	    bic sp, sp, #7
	    add r1, r0, #48
	    add r2, r0, #32
	    vld1.64 {d24, d25}, [r0:128]
	    vld1.64 {d16, d17}, [r1:128]
	    vld1.64 {d18, d19}, [r2:128]
	    add r1, r0, #16
	    vmovn.i32 d22, q8
	    vld1.64 {d16, d17}, [r1:128]
	    vmovn.i32 d20, q9
	    vmovn.i32 d18, q12
	    vmov.u16 r0, d22[3]
	    strb r0, [sp, #15]
	    vmov.u16 r0, d22[2]
	    strb r0, [sp, #14]
	    vmov.u16 r0, d22[1]
	    strb r0, [sp, #13]
	    vmov.u16 r0, d22[0]
	    vmovn.i32 d16, q8
	    strb r0, [sp, #12]
	    vmov.u16 r0, d20[3]
	    strb r0, [sp, #11]
	    vmov.u16 r0, d20[2]
	    strb r0, [sp, #10]
	    vmov.u16 r0, d20[1]
	    strb r0, [sp, #9]
	    vmov.u16 r0, d20[0]
	    strb r0, [sp, #8]
	    vmov.u16 r0, d18[3]
	    strb r0, [sp, #3]
	    vmov.u16 r0, d18[2]
	    strb r0, [sp, #2]
	    vmov.u16 r0, d18[1]
	    strb r0, [sp, #1]
	    vmov.u16 r0, d18[0]
	    strb r0, [sp]
	    vmov.u16 r0, d16[3]
	    strb r0, [sp, #7]
	    vmov.u16 r0, d16[2]
	    strb r0, [sp, #6]
	    vmov.u16 r0, d16[1]
	    strb r0, [sp, #5]
	    vmov.u16 r0, d16[0]
	    strb r0, [sp, #4]
	    vldmia sp, {d16, d17}
	    vmov r0, r1, d16
	    vmov r2, r3, d17
	    mov sp, r7
	    pop {r7}
	    bx lr
	    .globl _test2
	    .align 2
	    _test2: @ @test2
	    @ BB#0:
	    push {r7}
	    mov r7, sp
	    sub sp, sp, #12
	    bic sp, sp, #7
	    vld1.64 {d16, d17}, [r0:128]
	    add r0, r0, #16
	    vld1.64 {d20, d21}, [r0:128]
	    vmovn.i32 d18, q8
	    vmov.u16 r0, d18[3]
	    vmovn.i32 d16, q10
	    strb r0, [sp, #3]
	    vmov.u16 r0, d18[2]
	    strb r0, [sp, #2]
	    vmov.u16 r0, d18[1]
	    strb r0, [sp, #1]
	    vmov.u16 r0, d18[0]
	    strb r0, [sp]
	    vmov.u16 r0, d16[3]
	    strb r0, [sp, #7]
	    vmov.u16 r0, d16[2]
	    strb r0, [sp, #6]
	    vmov.u16 r0, d16[1]
	    strb r0, [sp, #5]
	    vmov.u16 r0, d16[0]
	    strb r0, [sp, #4]
	    ldm sp, {r0, r1}
	    mov sp, r7
	    pop {r7}
	    bx lr

	Now, however, we generate the much more straightforward:

	    .syntax unified
	    .section __TEXT,__text,regular,pure_instructions
	    .globl _test1
	    .align 2
	    _test1: @ @test1
	    @ BB#0:
	    add r1, r0, #48
	    add r2, r0, #32
	    vld1.64 {d20, d21}, [r0:128]
	    vld1.64 {d16, d17}, [r1:128]
	    add r1, r0, #16
	    vld1.64 {d18, d19}, [r2:128]
	    vld1.64 {d22, d23}, [r1:128]
	    vmovn.i32 d17, q8
	    vmovn.i32 d16, q9
	    vmovn.i32 d18, q10
	    vmovn.i32 d19, q11
	    vmovn.i16 d17, q8
	    vmovn.i16 d16, q9
	    vmov r0, r1, d16
	    vmov r2, r3, d17
	    bx lr
	    .globl _test2
	    .align 2
	    _test2: @ @test2
	    @ BB#0:
	    vld1.64 {d16, d17}, [r0:128]
	    add r0, r0, #16
	    vld1.64 {d18, d19}, [r0:128]
	    vmovn.i32 d16, q8
	    vmovn.i32 d17, q9
	    vmovn.i16 d16, q8
	    vmov r0, r1, d16
	    bx lr

	llvm-svn: 179989
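	As a rough scalar model of the v8i32 -> v8i8 case described in this commit, the sketch below checks that truncating by parts (split, narrow to 16 bits, concatenate, narrow to 8 bits) yields the same lanes as a direct truncate. It is not SelectionDAG code; the type aliases and function names are hypothetical and exist only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical lane-array models of the vector types named in the commit.
using V8i32 = std::array<std::uint32_t, 8>;
using V4i16 = std::array<std::uint16_t, 4>;
using V8i16 = std::array<std::uint16_t, 8>;
using V8i8 = std::array<std::uint8_t, 8>;

// Reference: truncate every 32-bit lane straight to 8 bits.
V8i8 truncDirect(const V8i32 &in) {
  V8i8 res{};
  for (std::size_t i = 0; i < 8; ++i)
    res[i] = static_cast<std::uint8_t>(in[i]);
  return res;
}

// By parts: split into halves, truncate each half to 16-bit lanes,
// concatenate the halves, then truncate the 16-bit lanes to 8 bits.
V8i8 truncByParts(const V8i32 &in) {
  V4i16 lo16{}, hi16{};
  for (std::size_t i = 0; i < 4; ++i) {
    lo16[i] = static_cast<std::uint16_t>(in[i]);     // lanes 0..3
    hi16[i] = static_cast<std::uint16_t>(in[i + 4]); // lanes 4..7
  }
  V8i16 in16{};
  for (std::size_t i = 0; i < 4; ++i) {
    in16[i] = lo16[i];
    in16[i + 4] = hi16[i];
  }
  V8i8 res{};
  for (std::size_t i = 0; i < 8; ++i)
    res[i] = static_cast<std::uint8_t>(in16[i]);
  return res;
}
```

	The two functions agree because truncation only keeps low bits, so dropping the upper 16 bits first and the next 8 bits afterwards discards exactly the same bits as dropping all 24 at once.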
*	ARM: Split out cost model vcvt testcases.	Jim Grosbach	2013-04-21	2	-172/+171
	They had a separate RUN line already, so may as well be in a separate file.
	llvm-svn: 179988
*	Passing arguments to varargs functions under the SPARC v9 ABI.	Jakob Stoklund Olesen	2013-04-21	2	-0/+60
	Arguments after the fixed arguments never use the floating point registers.
	llvm-svn: 179987
*	Tidy up comment grammar.	Jim Grosbach	2013-04-21	1	-2/+2
	llvm-svn: 179986
*	Fix the SETHIimm pattern for 64-bit code.	Jakob Stoklund Olesen	2013-04-21	2	-2/+7
	Don't ignore the high 32 bits of the immediate.
	llvm-svn: 179985
*	Fix return type of isBitfield in the binding definition	Dmitri Gribenko	2013-04-21	1	-1/+1
	Patch by Loïc Jaquemet.
	llvm-svn: 179984
*	Remove unused, undefined ArgFlagsTy::getArgFlagsString; add a comment about ↵	Stephen Lin	2013-04-21	1	-5/+2
	'returned'
	llvm-svn: 179983
*	SROA: Don't crash on a select with two identical operands.	Benjamin Kramer	2013-04-21	2	-8/+19
	This is an edge case that can happen if we modify a chain of multiple selects. Update all operands in that case and remove the assert. PR15805.
	llvm-svn: 179982
*	[Mips] Convert a GNU style Mips ABI name to the name accepted by LLVM	Simon Atanasyan	2013-04-21	2	-1/+44
	Mips backend.
	llvm-svn: 179981
*	Revert "SimplifyCFG: If convert single conditional stores"	Arnold Schwaighofer	2013-04-21	2	-171/+4
	There is the temptation to make this transform dependent on target information as it is not going to be beneficial on all (sub)targets. Therefore, we should probably do this in MI Early-Ifconversion.

	This reverts commit r179957. Original commit message:

	"SimplifyCFG: If convert single conditional stores

	This transformation will transform a conditional store with a preceding unconditional store to the same location:

	    a[i] = X
	    may-alias with a[i] load
	    if (cond)
	      a[i] = Y

	into an unconditional store.

	    a[i] = X
	    may-alias with a[i] load
	    tmp = cond ? Y : X;
	    a[i] = tmp

	We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance.

	hmmer's execution time improves by 30% on an iMac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2% performance improvement on an ARM Swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off.

	I am going to watch performance numbers across the buildbots and will revert this if anything unexpected comes up."

	llvm-svn: 179980
*	[Mips] Do not add unnecessary Mips toolchain path to the list	Simon Atanasyan	2013-04-21	2	-37/+0
	of system include directories with extern "C" semantics.
	llvm-svn: 179979
*	ARM: fix part of test which actually needed an asserts build	Tim Northover	2013-04-21	2	-6/+30
	This should fix a buildbot failure that occurred after r179977.
	llvm-svn: 179978
*	ARM: Use ldrd/strd to spill 64-bit pairs when available.	Tim Northover	2013-04-21	4	-50/+133
	This allows common sp-offsets to be part of the instruction and is probably faster on modern CPUs too.
	llvm-svn: 179977
*	Remove the executable bit on cmake files	Sylvestre Ledru	2013-04-21	2	-0/+0
	llvm-svn: 179976
*	SLPVectorize: Add support for vectorization of casts.	Nadav Rotem	2013-04-21	2	-0/+107
	llvm-svn: 179975
*	SLPVectorizer: Fix a bug in the code that scans the tree in search of nodes ↵	Nadav Rotem	2013-04-21	1	-0/+1
	with multiple users. We did not terminate the switch case and we executed the search routine twice.
	llvm-svn: 179974
*	[objc-arc] Cleaned up tail-call-invariant-enforcement.ll.	Michael Gottesman	2013-04-21	1	-25/+40
	Specifically:
	1. Added checks that unwind is being properly added to various instructions.
	2. Fixed the declaration/calling of objc_release to have a return type of void.
	3. Moved all checks to precede the functions and added checks to ensure that the checks would only match inside the specific function that we are attempting to check.
	llvm-svn: 179973
*	[objc-arc] Check that objc-arc-expand properly handles all strictly ↵	Michael Gottesman	2013-04-21	1	-5/+71
	forwarding calls and does not touch calls which are not strictly forwarding (i.e. objc_retainBlock).
	llvm-svn: 179972
*	[objc-arc] Renamed the test file ↵	Michael Gottesman	2013-04-21	1	-0/+0
	clang-arc-used-intrinsic-removed-if-isolated.ll -> intrinsic-use-isolated.ll to match the other test file intrinsic-use.ll.
	llvm-svn: 179971
*	Remove tbaa metadata.	Bill Wendling	2013-04-21	1	-7/+3
	llvm-svn: 179970
*	The 'constexpr implies const' rule for non-static member functions is gone in	Richard Smith	2013-04-21	25	-82/+133
	C++1y, so stop adding the 'const' there. Provide a compatibility warning for code relying on this in C++11, with a fix-it hint. Update our lazily-written tests to add the const, except for those ones which were testing our implementation of this rule.
	llvm-svn: 179969
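	A minimal sketch of what this rule change means for user code, assuming a hypothetical `Counter` type that is not taken from the commit or its tests: in C++11 a constexpr non-static member function is implicitly const, while in C++1y it is not, so writing the `const` explicitly keeps the same meaning under both standards.

```cpp
#include <cassert>

// Hypothetical type, used only to illustrate the rule change.
struct Counter {
  int value;

  // Explicit 'const': identical meaning in C++11 and C++1y. Without it,
  // C++11 would add 'const' implicitly and C++1y would not.
  constexpr int get() const { return value; }
};

// Callable through a const reference because get() is const-qualified.
inline int readThroughConst(const Counter &c) { return c.get(); }
```

	Code that omitted the `const` and relied on the C++11 behaviour is the case the compatibility warning and fix-it described above are aimed at.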
*	When we strength reduce an objc_retainBlock call to objc_retain, increment ↵	Michael Gottesman	2013-04-21	1	-1/+6
	NumPeeps and make sure that Changed is set to true.
	llvm-svn: 179968
*	Fixed comment typo.	Michael Gottesman	2013-04-21	1	-1/+1
	llvm-svn: 179967
*	[objc-arc] Fixed typo in debug message.	Michael Gottesman	2013-04-21	1	-1/+1
	llvm-svn: 179966
*	[objc-arc] Fixed comment typo.	Michael Gottesman	2013-04-21	1	-1/+1
	llvm-svn: 179965
*	[objc-arc] Refactored OptimizeReturns so that it uses continue instead of a ↵	Michael Gottesman	2013-04-21	1	-25/+30
	large multi-level nested if statement.
	llvm-svn: 179964
*	[objc-arc] Added debug statement saying when we are resetting a sequence's ↵	Michael Gottesman	2013-04-20	1	-0/+1
	progress. This will make it clearer when we are actually resetting a sequence's progress vs just changing state. This is an important distinction because the former case clears any pointers that we are tracking while the latter does not.
	llvm-svn: 179963
*	Disable VLA diagnostic in C++1y mode, and add some tests.	Richard Smith	2013-04-20	5	-1/+82
	Still to do here:
	- we have a collection of syntactic accepts-invalids to diagnose
	- support non-PODs in VLAs, including dynamic initialization / destruction
	- runtime checks (and throw std::bad_array_length) for bad bound
	- support VLA capture by reference in lambdas
	- properly support VLAs in range-based for (don't recompute bound)
	llvm-svn: 179962
*	Compile varargs functions for SPARCv9.	Jakob Stoklund Olesen	2013-04-20	2	-31/+119
	With a little help from the frontend, it looks like the standard va_* intrinsics can do the job. Also clean up an old bitcast hack in LowerVAARG that dealt with unaligned double loads. Load SDNodes can specify an alignment now. Still missing: Calling varargs functions with float arguments.
	llvm-svn: 179961
*	Fix PR15800. Do not try to vectorize vectors and structs.	Nadav Rotem	2013-04-20	2	-1/+24
	llvm-svn: 179960
*	Add another test I forgot to svn add.	Richard Smith	2013-04-20	1	-0/+30
	llvm-svn: 179959
*	C++1y: Allow aggregates to have default initializers.	Richard Smith	2013-04-20	37	-41/+495
	Add a CXXDefaultInitExpr, analogous to CXXDefaultArgExpr, and use it both in CXXCtorInitializers and in InitListExprs to represent a default initializer. There's an additional complication here: because the default initializer can refer to the initialized object via its 'this' pointer, we need to make sure that 'this' points to the right thing within the evaluation.
	llvm-svn: 179958
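	The language feature this commit implements can be illustrated with a small user-level sketch. It assumes a C++14 (or later) compiler, and the `Extent` type and helper functions are hypothetical; the default initializer for `end` reads `begin` through the implicit `this`, which is the complication the message describes.

```cpp
#include <cassert>

// Hypothetical aggregate with a default member initializer. In C++11 the
// initializer would stop this from being an aggregate; in C++14 it does not.
struct Extent {
  int begin;
  int end = begin + 1; // refers to this->begin of the object being built
};

// 'end' is omitted from the braced list, so its default initializer runs.
inline Extent makeDefaultEnd(int b) { return Extent{b}; }

// 'end' is given explicitly, so the default initializer is not used.
inline Extent makeExplicit(int b, int e) { return Extent{b, e}; }
```

	For `makeDefaultEnd(3)` the default initializer must see the `begin` of the object under construction, so `end` becomes 4 rather than a value taken from some other object.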
*	SimplifyCFG: If convert single conditional stores	Arnold Schwaighofer	2013-04-20	2	-4/+171
	This transformation will transform a conditional store with a preceding unconditional store to the same location:

	    a[i] = X
	    may-alias with a[i] load
	    if (cond)
	      a[i] = Y

	into an unconditional store.

	    a[i] = X
	    may-alias with a[i] load
	    tmp = cond ? Y : X;
	    a[i] = tmp

	We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance.

	hmmer's execution time improves by 30% on an iMac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2% performance improvement on an ARM Swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off.

	I am going to watch performance numbers across the buildbots and will revert this if anything unexpected comes up.

	llvm-svn: 179957
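	At the source level, the two shapes described in this commit message look roughly like the sketch below. The function names are hypothetical and this is not the SimplifyCFG implementation; it only shows that the branchy form and the select form leave the same value in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: an unconditional store followed by a conditional store to the
// same location, which needs a branch.
void storeBranchy(std::vector<int> &a, std::size_t i, bool cond, int x, int y) {
  a[i] = x;
  if (cond)
    a[i] = y;
}

// After: the conditional store becomes a select feeding an unconditional
// store, so the branch disappears at the cost of a second store.
void storeSelect(std::vector<int> &a, std::size_t i, bool cond, int x, int y) {
  a[i] = x;
  int tmp = cond ? y : x;
  a[i] = tmp;
}
```

	The trade-off discussed in the message is visible here: `storeSelect` always performs two stores, while `storeBranchy` performs the second one only when `cond` holds.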
*	ARM: don't add FrameIndex offset for LDMIA (has no immediate)	Tim Northover	2013-04-20	2	-1/+37
	Previously, when spilling 64-bit paired registers, an LDMIA with both a FrameIndex and an offset was produced. This kind of instruction shouldn't exist, and the extra operand was being confused with the predicate, causing aborts later on. This removes the invalid 0-offset from the instruction being produced.
	llvm-svn: 179956
*	recommit tests	Nuno Lopes	2013-04-20	1	-0/+20
	llvm-svn: 179955
*	Minor renaming of tests (for consistency with an in-development patch)	Stephen Lin	2013-04-20	1	-10/+10
	llvm-svn: 179954
*	Update some stuff on the open projects page to reflect things we've already ↵	Richard Smith	2013-04-20	1	-11/+9
	done.
	llvm-svn: 179953
*	AArch64: remove useless comment	Tim Northover	2013-04-20	1	-2/+0
	llvm-svn: 179952
*	Switch C++11 open project to C++1y :)	Richard Smith	2013-04-20	1	-2/+2
	llvm-svn: 179951
*	Add note that some of these links are dead for now.	Richard Smith	2013-04-20	1	-0/+3
	llvm-svn: 179950
*	VLAs in C++14!	Richard Smith	2013-04-20	1	-2/+2
	llvm-svn: 179949
*	Move 'kw_align' case to proper section, reorganize function attribute ↵	Stephen Lin	2013-04-20	1	-12/+25
	keyword case statements to be consistent with r179119
	llvm-svn: 179948
*	Variable templates and generic lambdas are approved for C++14.	Richard Smith	2013-04-20	1	-2/+2
	llvm-svn: 179947
*	Clarifying memory allocation: approved for C++14. Move from N/A to no, since ↵	Richard Smith	2013-04-20	1	-2/+2
	we currently relax 'operator new' calls which didn't come from new-expressions.
	llvm-svn: 179946
*	No digit separators for C++14.	Richard Smith	2013-04-20	1	-7/+0
	llvm-svn: 179945
*	Generalized constexpr is approved for C++14.	Richard Smith	2013-04-20	1	-1/+1
	llvm-svn: 179944
*	More approved C++14 features.	Richard Smith	2013-04-20	1	-12/+5
	llvm-svn: 179943
*	Binary literals are approved for C++14.	Richard Smith	2013-04-20	1	-1/+6
	llvm-svn: 179942