bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[ARM] Promote small global constants to constant pools	James Molloy	2016-09-14	1	-0/+109
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281484
*	[X86] Added i128 lshr+shl -> mask combine test	Simon Pilgrim	2016-09-14	1	-0/+6
\| \| \| \|	llvm-svn: 281480
*	Fix code-gen crash on Power9 for insert_vector_elt with variable index (PR30189)	Nemanja Ivanovic	2016-09-14	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \|	This patch corresponds to review: https://reviews.llvm.org/D24021 In the initial implementation of this instruction, I forgot to account for variable indices. This patch fixes PR30189 and should probably be merged into 3.9.1 (I'll open a bug according to the new instructions). llvm-svn: 281479
*	[X86][SSE] Don't blend vector shifts with MOVSS/MOVSD directly, lower from ↵	Simon Pilgrim	2016-09-14	1	-32/+60
\| \| \| \| \| \| \| \|	generic shuffle Shuffle lowering will correctly lower to MOVSS/MOVSD/PBLEND, improving commutation opportunities llvm-svn: 281471
*	Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently"	James Molloy	2016-09-14	7	-93/+21
\| \| \| \| \| \|	This reverts commit r281323. It caused chromium test failures and a selfhost failure. llvm-svn: 281451
*	GlobalISel: mark pointer stores as legal on AArch64.	Tim Northover	2016-09-14	1	-0/+7
\| \| \| \|	llvm-svn: 281448
*	This reapplies r281304. The issue was that I had missed	Sjoerd Meijer	2016-09-14	1	-25/+5
\| \| \| \| \| \|	to copy the new isAdd field in the tablegen data structure. llvm-svn: 281447
*	AVX-512: Fixed a bug in kortest.z intrinsic	Elena Demikhovsky	2016-09-14	1	-3/+1
\| \| \| \| \| \|	Lowering was wrong - X86ISD::SETCC node should return i8 type. llvm-svn: 281446
*	[AVX512BW] Change truncStore action (v16i16->v16i8). It can be legal only ↵	Igor Breger	2016-09-14	1	-6/+69
\| \| \| \| \| \| \| \|	with AVX512VL. Differential Revision: http://reviews.llvm.org/D24547 llvm-svn: 281445
*	[X86] Remove the VCVTSI2SD32 with rounding intrinsic. It's not used by clang ↵	Craig Topper	2016-09-14	1	-10/+0
\| \| \| \| \| \|	and not needed since 32-bit integer to double is always exact. llvm-svn: 281442
*	[AArch64] Simplify patchpoint/stackmap size test (r281301). NFC.	Ahmed Bougacha	2016-09-13	2	-42/+0
\| \| \| \|	llvm-svn: 281407
*	[CodeGen] Fix invalid shift in mul expansion	Pawel Bylica	2016-09-13	2	-0/+6659
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: When expanding mul in type legalization make sure the type for shift amount can actually fit the value. This fixes PR30354 https://llvm.org/bugs/show_bug.cgi?id=30354. Reviewers: hfinkel, majnemer, RKSimon Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D24478 llvm-svn: 281403
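The check described in r281403 can be illustrated with a minimal sketch. It is not the LLVM code: the helper names `bits_needed` and `shift_amount_fits` are hypothetical, and the widths in the usage note are illustrative rather than taken from PR30354.

```python
def bits_needed(value: int) -> int:
    """Number of bits required to represent a non-negative integer."""
    return max(1, value.bit_length())


def shift_amount_fits(shift_amount: int, shift_type_bits: int) -> bool:
    """True if shift_amount is representable in an unsigned type of shift_type_bits bits.

    Hypothetical helper: models the question "can the shift-amount type
    actually hold this value?" that the commit message describes.
    """
    return bits_needed(shift_amount) <= shift_type_bits
```

For example, under these assumptions a shift by 256 does not fit an 8-bit shift-amount type (`shift_amount_fits(256, 8)` is false), while a shift by 32 does.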
*	[DAG] Allow build-to-shuffle combine to combine builds from two wide vectors.	Michael Kuperstein	2016-09-13	1	-99/+31
\| \| \| \| \| \| \| \| \| \| \|	This allows us to, in some cases, create a vector_shuffle out of a build_vector, when the inputs to the build are extract_elements from two different vectors, at least one of which is wider than the output. (E.g. a <8 x i16> being constructed out of elements from a <16 x i16> and a <8 x i16>). Differential Revision: https://reviews.llvm.org/D24491 llvm-svn: 281402
*	[Hexagon] Better handling of HVX vector lowering	Krzysztof Parzyszek	2016-09-13	1	-0/+21
\| \| \| \| \| \| \|	- Expand SELECT_CC and BR_CC for vector types. - Implement TLI::isShuffleMaskLegal. llvm-svn: 281397
*	AArch64: Cleanup tailcall CC check, enable swiftcc.	Matthias Braun	2016-09-13	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	Clean up/change the code that checks for possible tailcall conventions to look the same as the one in the X86 target. This makes the distinction between calling conventions that can guarantee tailcalls and the ones that may tailcall more obvious. - Add Swift to the mayTailCall list - PreserveMost seemed to be incorrectly part of the guaranteed tail call list, move it to the mayTailCall list. llvm-svn: 281376
*	AMDGPU: Support commuting a FrameIndex operand	Matt Arsenault	2016-09-13	1	-0/+15
\| \| \| \|	llvm-svn: 281369
*	[DAGCombiner] Use APInt directly in (shl (zext (srl x, C)), C) combine range ↵	Simon Pilgrim	2016-09-13	1	-0/+7
\| \| \| \| \| \| \| \| \| \|	test To avoid assertion, we must ensure that the inner shift constant is within range before calling ConstantSDNode::getZExtValue(). We already know that the outer shift constant is in range. Followup to D23007 llvm-svn: 281362
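The ordering issue in r281362 (range-check the constant at full width before converting it to a 64-bit value) can be modelled as follows. This is a sketch under stated assumptions: `get_zext_value` and `safe_shift_amount` are hypothetical Python stand-ins, not the LLVM API, and Python integers play the role of arbitrary-precision constants.

```python
UINT64_BITS = 64


def get_zext_value(value: int) -> int:
    """Model of a 64-bit conversion that asserts when the value does not fit."""
    assert value.bit_length() <= UINT64_BITS, "value does not fit in 64 bits"
    return value


def safe_shift_amount(shift_const: int, operand_bits: int):
    """Return the shift amount as a plain integer, or None if it is out of range."""
    # Compare at full precision first, so an oversized constant is rejected
    # before the asserting 64-bit conversion is ever reached.
    if shift_const >= operand_bits:
        return None
    return get_zext_value(shift_const)
```

With this ordering, an out-of-range constant such as `2**70` is rejected by the comparison instead of tripping the assertion.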
*	[Myriad]: set LeonCASA processor feature	Douglas Katzman	2016-09-13	1	-0/+1
\| \| \| \|	llvm-svn: 281359
*	[X86][SSE] Added AVX512F and additional vector truncate test cases	Simon Pilgrim	2016-09-13	1	-88/+273
\| \| \| \| \| \|	trunc16i16_16i8 is currently commented out due to PR25684 llvm-svn: 281356
*	[DAGCombiner] Use APInt directly in (shl (ext (shl x, c1)), c2) combine	Simon Pilgrim	2016-09-13	1	-0/+18
\| \| \| \| \| \| \| \|	Fix failure to detect out of range shift constants leading to assert in ConstantSDNode::getZExtValue() Followup to D23007 llvm-svn: 281354
*	[X86] Regenerated shift combine tests.	Simon Pilgrim	2016-09-13	1	-26/+104
\| \| \| \| \| \|	Added x86_64 tests llvm-svn: 281341
*	[Hexagon] Clear the flow queue after visiting a single instruction	Krzysztof Parzyszek	2016-09-13	1	-0/+47
\| \| \| \|	llvm-svn: 281339
*	Revert "[ARM] Promote small global constants to constant pools"	James Molloy	2016-09-13	1	-109/+0
\| \| \| \| \| \|	This reverts commit r281314. Speculatively revert as it's possible this caused linker errors: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19656 llvm-svn: 281327
*	[ARM] Add ".code 32" to functions in the ARM instruction set	Pablo Barrio	2016-09-13	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before, only Thumb functions were marked as ".code 16". These ".code x" directives are effective until the next directive of its kind is encountered. Therefore, in code with interleaved ARM and Thumb functions, it was possible to declare a function as ARM and end up with a Thumb function after assembly. A test has been added. An existing test has also been fixed to take this change into account. Reviewers: aschwaighofer, t.p.northover, jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D24337 llvm-svn: 281324
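Why a missing ".code 32" matters can be seen in a toy model of how the directives persist. This is a simplified sketch, not an assembler: `function_modes` is a hypothetical helper, it recognises only `.code 16`, `.code 32` and labels, and the initial mode is an assumption.

```python
def function_modes(asm_lines, initial_mode="arm"):
    """Map each label to the instruction set in effect when it is reached."""
    mode = initial_mode
    result = {}
    for raw in asm_lines:
        line = raw.strip()
        if line == ".code 16":
            mode = "thumb"   # stays in effect until the next .code directive
        elif line == ".code 32":
            mode = "arm"
        elif line.endswith(":"):
            result[line[:-1]] = mode
    return result


# Before the patch (modelled): only the Thumb function is marked.
before = [".code 16", "thumb_fn:", "arm_fn:"]
# After the patch (modelled): the ARM function gets its own directive.
after = [".code 16", "thumb_fn:", ".code 32", "arm_fn:"]
```

In the `before` listing `arm_fn` inherits Thumb mode from the preceding directive; in the `after` listing it is assembled as ARM.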
*	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently	James Molloy	2016-09-13	7	-21/+93
\| \| \| \| \| \| \| \| \| \| \| \| \|	For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 281323
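The contiguity test and the four cases listed above can be sketched in a few lines. This is an illustration only, not the ISel code: `classify_mask` and its return strings are hypothetical, and the cases are tested in the order the message lists them, which may differ from the order the actual implementation uses.

```python
def classify_mask(bm: int, width: int = 32) -> str:
    """Classify a non-zero bitmask according to the four cases in the message."""
    assert 0 < bm < (1 << width)
    popcount = bin(bm).count("1")
    clz = width - bm.bit_length()          # leading zero bits
    ctz = (bm & -bm).bit_length() - 1      # trailing zero bits

    # One consecutive run of set bits: width - clz - ctz == popcount.
    if width - clz - ctz != popcount:
        return "not contiguous"
    if ctz == 0:
        return "LSLS"                      # case 1: mask touches the LSB
    if clz == 0:
        return "LSRS"                      # case 2: mask touches the MSB
    if popcount == 1:
        return "LSLS, test sign bit"       # case 3: single bit, NE/EQ -> MI/PL
    return "LSLS+LSRS"                     # case 4: strip both ends
```

For instance, `0x000000FF` falls under case 1, `0xFF000000` under case 2, `0x00000100` under case 3 and `0x00000FF0` under case 4, while `0x00000505` is not a contiguous mask.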
*	[ARM] Promote small global constants to constant pools	James Molloy	2016-09-13	1	-0/+109
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281314
*	[WebAssembly] Trying to fix broken tests in CodeGen/WebAssembly caused by ↵	Eric Liu	2016-09-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	r281285. Reviewers: bkramer, ddcc, dschuff, sunfish Subscribers: jfb, llvm-commits, dschuff Differential Revision: https://reviews.llvm.org/D24497 llvm-svn: 281312
*	Remove MVT:i1 xor instruction before SELECT. (Performance improvement).	Ayman Musa	2016-09-13	2	-11/+53
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D23764 llvm-svn: 281308
*	Revert of r281304 as it is causing build bot failures in hexagon	Sjoerd Meijer	2016-09-13	1	-5/+25
\| \| \| \| \| \| \|	hwloop regression tests. These tests pass locally; will be investigating where these differences come from. llvm-svn: 281306
*	This adds a new field isAdd to MCInstrDesc. The ARM and Hexagon instruction	Sjoerd Meijer	2016-09-13	1	-25/+5
\| \| \| \| \| \| \| \| \| \| \|	descriptions now tag add instructions, and the Hexagon backend is using this to identify loop induction statements. Patch by Sam Parker and Sjoerd Meijer. Differential Revision: https://reviews.llvm.org/D23601 llvm-svn: 281304
*	AVX-512: Fix for PR28175 - Scalar code optimization.	Elena Demikhovsky	2016-09-13	4	-22/+88
\| \| \| \| \| \| \| \| \|	Optimized (truncate (assertzext x) to i1) and anyext i1 to i8/16/32. Optimizing these patterns is one more step towards i1 optimization on AVX-512. Differential Revision: https://reviews.llvm.org/D24456 llvm-svn: 281302
*	[AArch64] Support stackmap/patchpoint in getInstSizeInBytes	Diana Picus	2016-09-13	2	-0/+170
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently return 4 for stackmaps and patchpoints, which is very optimistic and can in rare cases cause the branch relaxation pass to fail to relax certain branches. This patch causes getInstSizeInBytes to return a pessimistic estimate of the size as the number of bytes requested in the stackmap/patchpoint. In the future, we could provide a more accurate estimate by sharing some of the logic in AArch64::LowerSTACKMAP/PATCHPOINT. Fixes part of https://llvm.org/bugs/show_bug.cgi?id=28750 Differential Revision: https://reviews.llvm.org/D24073 llvm-svn: 281301
*	[X86] Remove masked shufpd/shufps intrinsics and autoupgrade to native ↵	Craig Topper	2016-09-13	4	-118/+118
\| \| \| \| \| \|	vector shuffles. They were removed from clang previously but accidentally left in the backend. llvm-svn: 281300
*	DebugInfo: New metadata representation for global variables.	Peter Collingbourne	2016-09-13	9	-41/+40
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reverses the edge from DIGlobalVariable to GlobalVariable. This will allow us to more easily preserve debug info metadata when manipulating global variables. Fixes PR30362. A program for upgrading test cases is attached to that bug. Differential Revision: http://reviews.llvm.org/D20147 llvm-svn: 281284
*	X86: Conditional tail calls should not have isBarrier = 1	Hans Wennborg	2016-09-13	2	-3/+31
\| \| \| \| \| \| \| \| \| \|	That confuses e.g. machine basic block placement, which then doesn't realize that control can fall through a block that ends with a conditional tail call. Instead, isBranch=1 should be set. Also, mark EFLAGS as used by these instructions. llvm-svn: 281281
*	Revert r281215, it caused PR30358.	Nico Weber	2016-09-12	7	-93/+21
\| \| \| \|	llvm-svn: 281263
*	Lower consecutive select instructions correctly.	Dehao Chen	2016-09-12	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: If consecutive select instructions are lowered separately in CGP, it will introduce redundant condition checks and branches that cannot be removed by later optimization phases. This patch lowers all consecutive select instructions at the same time to avoid inefficient code as demonstrated in https://llvm.org/bugs/show_bug.cgi?id=29095 Reviewers: davidxl Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D24147 llvm-svn: 281252
*	AVX-512: Added a test for -O0 mode. NFC.	Elena Demikhovsky	2016-09-12	1	-0/+52
\| \| \| \|	llvm-svn: 281246
*	AVX-512: Simplified masked_gather_scatter test. NFC.	Elena Demikhovsky	2016-09-12	1	-183/+6
\| \| \| \|	llvm-svn: 281244
*	AMDGPU: Do not clobber SCC in SIWholeQuadMode	Nicolai Haehnle	2016-09-12	1	-0/+37
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D22198 llvm-svn: 281230
*	Revert "[ARM] Promote small global constants to constant pools"	James Molloy	2016-09-12	1	-99/+0
\| \| \| \| \| \|	This reverts commit r281213. It made a bot go bang: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/14625 llvm-svn: 281228
*	[BranchFolding] Unique added live-ins after hoisting code.	Ahmed Bougacha	2016-09-12	1	-1/+1
\| \| \| \| \| \|	We're not supposed to have duplicate live-ins. llvm-svn: 281224
*	[X86] Copy imp-uses when folding tailcall into conditional branch.	Ahmed Bougacha	2016-09-12	1	-0/+84
\| \| \| \| \| \| \| \| \| \| \|	r280832 added 32-bit support for emitting conditional tail-calls, but dropped imp-used parameter registers. This went unnoticed until r281113, which added 64-bit support, as this is only exposed with parameter passing via registers. Don't drop the imp-used parameters. llvm-svn: 281223
*	Add select i1 test, reproducer for PR30249.	Igor Breger	2016-09-12	1	-0/+12
\| \| \| \|	llvm-svn: 281218
*	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently	James Molloy	2016-09-12	7	-21/+93
\| \| \| \| \| \| \| \| \| \| \| \| \|	For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 281215
*	[ARM] Promote small global constants to constant pools	James Molloy	2016-09-12	1	-0/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281213
*	Fix the Thumb test for vfloat intrinsics	Pablo Barrio	2016-09-12	1	-55/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This test was not testing the intrinsics. A function like this: define %v4f32 @test_v4f32.floor(%v4f32 %a){ ... %1 = call %v4f32 @llvm.floor.v4f32(%v4f32 %a) ... } is transformed into the following assembly: _test_v4f32.floor: @ @test_v4f32.floor ... bl _floorf ... In each function tested, there are two CHECK lines: one that checks for the label and another for the intrinsic that should be used inside the function (in our case, "floor"). However, although the first CHECK was matching the label, the second was not matching the intrinsic, but the second "floor" on the same line as the label. This is fixed by making the first CHECK match the entire line. Reviewers: jmolloy, rengolin Subscribers: rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D24398 llvm-svn: 281211
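The false match described in r281211 can be reproduced with a toy model of ordered checks, where each pattern is searched for starting just past the previous match. This is a simplified sketch, not FileCheck itself: `run_checks` is a hypothetical helper, the patterns are guesses at the shape of the test rather than its actual CHECK lines, and the assembly is the abbreviated excerpt from the message.

```python
def run_checks(text: str, patterns):
    """Return the 1-based line each pattern matched on, or None if one fails."""
    pos = 0
    lines_hit = []
    for pat in patterns:
        idx = text.find(pat, pos)          # search only past the previous match
        if idx < 0:
            return None
        lines_hit.append(text.count("\n", 0, idx) + 1)
        pos = idx + len(pat)
    return lines_hit


asm = "_test_v4f32.floor: @ @test_v4f32.floor\n  bl _floorf\n"

# Label-only first check: "floor" is then found later on the same line.
old_checks = ["_test_v4f32.floor:", "floor"]
# First check consumes the whole line: "floor" must match the call below.
new_checks = ["_test_v4f32.floor: @ @test_v4f32.floor", "floor"]
```

In this model the old checks both land on line 1, so the call is never examined, whereas the new checks match line 1 and then line 2.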
*	GlobalISel: support translation of global addresses.	Tim Northover	2016-09-12	1	-0/+28
\| \| \| \|	llvm-svn: 281207
*	GlobalISel: translate GEP instructions.	Tim Northover	2016-09-12	1	-0/+85
\| \| \| \| \| \| \| \|	Unlike SDag, we use a separate G_GEP instruction (much simplified, only taking a single byte offset) to preserve the pointer type information through selection. llvm-svn: 281205
*	GlobalISel: disambiguate types when printing MIR	Tim Northover	2016-09-12	9	-83/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some generic instructions have multiple types. While in theory these can always be discovered by inspecting the single definition of each generic vreg, in practice those definitions won't always be local and traipsing through a big function to find them will not be fun. So this changes MIRPrinter to print out the type of uses as well as defs, if they're known to be different or not known to be the same. On the parsing side, we're a little more flexible: provided each register is given a type in at least one place it's mentioned (and all types are consistent) we accept the MIR. This doesn't introduce ambiguity but makes writing tests manually a bit less painful. llvm-svn: 281204