bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	SelectionDAG: Fix a crash on inline asm when output register supports ↵	Tom Stellard	2016-03-09	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	multiple types Summary: The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size. Reviewers: arsenm, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17940 llvm-svn: 263022
*	[AMDGPU] Assembler: Support DPP instructions.	Sam Kolton	2016-03-09	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support DPP syntax as used in SP3 (except several operands syntax). Added dpp-specific operands in td-files. Added DPP flag to TSFlags to determine if instruction is dpp in InstPrinter. Support for VOP2 DPP instructions in td-files. Some tests for DPP instructions. ToDo: - VOP2bInst: - vcc is considered as operand - AsmMatcher doesn't apply mnemonic aliases when parsing operands - v_mac_f32 - v_nop - disable instructions with 64-bit operands - change dpp_ctrl assembler representation to conform to sp3 Review: http://reviews.llvm.org/D17804 llvm-svn: 263008
*	[WebAssembly] Implement irreducible control flow.	Dan Gohman	2016-03-09	1	-0/+88
\| \| \| \| \| \| \| \|	This implements a very simple conservative transformation that doesn't require more than linear code size growth. There's room for much more optimization in this space. llvm-svn: 262982
*	[AArch64] Disable the MI scheduler to turn bots green after r262942.	Chad Rosier	2016-03-08	1	-4/+4
\| \| \| \|	llvm-svn: 262944
*	Revert r262759 and r262760.	Quentin Colombet	2016-03-08	1	-30/+0
\| \| \| \| \| \| \| \|	The fix, which uses the library call for atomic compare and swap when the instruction is not safe to use, may be incorrect: the library call may not exist on all platforms. In other words, we need a better fix! llvm-svn: 262943
*	Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ↵	Hans Wennborg	2016-03-08	2	-18/+33
\| \| \| \| \| \| \| \|	ZERO_EXTEND_VECTOR_INREG" This caused PR26870. llvm-svn: 262935
*	AVX512: Add extract_subvector patterns v8i1->v4i1 , v4i1->v2i1.	Igor Breger	2016-03-08	1	-0/+23
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17953 llvm-svn: 262929
*	[X86] Regenerated vector float extension tests	Simon Pilgrim	2016-03-08	1	-19/+65
\| \| \| \|	llvm-svn: 262919
*	[WebAssembly] Update for spec change from tableswitch to br_table.	Dan Gohman	2016-03-08	2	-4/+4
\| \| \| \| \| \| \|	Also note that the operand order changed; the default label is now listed after the regular labels. llvm-svn: 262903
*	[AArch64][GlobalISel] Add a test case for the IRTranslator.	Quentin Colombet	2016-03-08	1	-0/+18
\| \| \| \|	llvm-svn: 262898
*	[MIR] Teach the parser/printer that generic virtual registers do not need a ↵	Quentin Colombet	2016-03-08	1	-7/+10
\| \| \| \| \| \|	register class. llvm-svn: 262893
*	[MIR] Teach the parser how to parse complex types of generic machine ↵	Quentin Colombet	2016-03-08	1	-0/+27
\| \| \| \| \| \| \| \|	instructions. By complex types, I mean aggregate or vector types. llvm-svn: 262890
*	[MIR] Print the type of generic machine instructions.	Quentin Colombet	2016-03-08	1	-1/+1
\| \| \| \|	llvm-svn: 262880
*	[MIR] Teach the mir parser about types on generic machine instructions.	Quentin Colombet	2016-03-08	1	-1/+2
\| \| \| \|	llvm-svn: 262879
*	[x86] add test to show missing optimization	Sanjay Patel	2016-03-07	1	-0/+31
\| \| \| \| \| \| \| \|	This should make it clearer how this proposed patch: http://reviews.llvm.org/D11393 ...will change codegen. llvm-svn: 262875
*	[x86] simplify test and tighten checks	Sanjay Patel	2016-03-07	1	-15/+22
\| \| \| \| \| \| \| \| \|	I noticed this test as part of: http://reviews.llvm.org/D11393 ...which is confusing enough as-is. Let's show the exact codegen, so the changes will be more obvious. llvm-svn: 262874
*	[MIR] Teach the MIPrinter about size for generic virtual registers.	Quentin Colombet	2016-03-07	1	-1/+1
\| \| \| \|	llvm-svn: 262867
*	AMDGPU: Match more med3 integer patterns	Matt Arsenault	2016-03-07	2	-0/+694
\| \| \| \|	llvm-svn: 262864
*	[MIR] Teach the parser how to handle the size of generic virtual registers.	Quentin Colombet	2016-03-07	1	-0/+17
\| \| \| \|	llvm-svn: 262862
*	[X86][AVX512] Fixed VPERMT2* shuffle mask decoding and enabled target ↵	Simon Pilgrim	2016-03-06	1	-20/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	shuffle combining. Patch to add support for target shuffle combining of X86ISD::VPERMV3 nodes, including support for detecting unary shuffles. This uncovered several issues with the X86ISD::VPERMV3 shuffle mask decoding of non-64 bit shuffle mask elements - the bit masking wasn't being correctly computed. Removed non-constant pool mask decode path as we have no way of testing it right now. Differential Revision: http://reviews.llvm.org/D17916 llvm-svn: 262809
*	AVX512BW: Support llvm intrinsic masked vector load/store for i8/i16 element ↵	Igor Breger	2016-03-06	2	-0/+258
\| \| \| \| \| \| \| \|	types on SKX Differential Revision: http://reviews.llvm.org/D17913 llvm-svn: 262803
*	AVX512: Remove VSHRI kmask patterns from TD file. It is incorrect to use ↵	Igor Breger	2016-03-06	3	-30/+42
\| \| \| \| \| \| \| \|	kshiftw to implement VSHRI v4i1: bits 15-4 are undef, so the upper bits of v4i1 may not be zeroed. v4i1 should be zero-extended to v16i1 (or any natively supported vector). Differential Revision: http://reviews.llvm.org/D17763 llvm-svn: 262797
*	[X86][AVX] Improved VPERMILPS variable shuffle mask decoding.	Simon Pilgrim	2016-03-05	1	-4/+0
\| \| \| \| \| \| \| \| \| \|	Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool. Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization. Followup to D17681 llvm-svn: 262784
*	RegisterCoalescer: Remap subregister lanemasks before exchanging operands	Matthias Braun	2016-03-05	1	-3/+2
\| \| \| \| \| \| \| \| \| \|	Rematerializing and merging into a bigger register class at the same time requires remapping the subregister range lanemasks to the new register class. This fixes http://llvm.org/PR26805 llvm-svn: 262768
*	[X86] Fix the lowering of setjmp intrinsic on i386.	Quentin Colombet	2016-03-05	1	-0/+23
\| \| \| \| \| \| \| \| \| \|	When the lowering of the setjmp intrinsic requires a global base pointer to be set, make sure such pointer gets defined by the CGBR pass. This fixes PR26742. llvm-svn: 262762
*	Add missing triple in my previous commit!	Quentin Colombet	2016-03-04	1	-0/+1
\| \| \| \|	llvm-svn: 262760
*	[X86] Do not use cmpxchgXXb when we need the base pointer (RBX).	Quentin Colombet	2016-03-04	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \|	cmpxchgXXb uses RBX as one of its implicit arguments, i.e., when we use that instruction we need to clobber RBX. This is generally fine, except when RBX is a reserved register, because in that case the register allocator will not track its value and will not save and restore it when interferences occur. rdar://problem/24851412 llvm-svn: 262759
*	[x86] add tests for masked loads with constant masks	Sanjay Patel	2016-03-04	1	-14/+191
\| \| \| \|	llvm-svn: 262758
*	[X86] Support cleaning more than 2**16 bytes of stack	David Majnemer	2016-03-04	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The x86 ret instruction has a 16-bit immediate indicating how many bytes to pop off of the stack beyond the return address. There is a problem when extremely large structs are passed by value: we might not be able to fit the number of bytes to pop into the return instruction. To fix this, expand RET_FLAG a little later and use a special sequence to clean the stack: pop %ecx ; return address is now in %ecx add $n, %esp ; clean the stack push %ecx ; bring the return address back on the stack ret ; pop the return address and jmp to its value llvm-svn: 262755
*	[DAGCombine] Fix divrem combine not to assume div/rem type is simple.	Michael Kuperstein	2016-03-04	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	The divrem combine assumed the type of the div/rem is simple, which isn't necessarily true. This probably worked fine until r250825, since it only saw legal types, but now breaks when it runs as a pre-type-legalization combine. This fixes PR26835. Differential Revision: http://reviews.llvm.org/D17878 llvm-svn: 262746
*	[ARM] Merging 64-bit divmod lib calls into one	Renato Golin	2016-03-04	1	-1/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for values of up to 32 bits. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long-standing FIXME in the tests). llvm-svn: 262738
*	AMDGPU/SI: Add support for spilling SGPRs to scratch buffer	Tom Stellard	2016-03-04	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732
*	[mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for ↵	Zoran Jovanovic	2016-03-04	1	-12/+17
\| \| \| \| \| \| \| \| \| \|	microMIPS is generated. Author: milena.vujosevic.janicic Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D17373 llvm-svn: 262725
*	[X86][AVX512] Added some basic X86ISD::VPERMV3 shuffle combining tests	Simon Pilgrim	2016-03-04	1	-0/+72
\| \| \| \| \| \|	None of these actually combine yet as we haven't enabled X86ISD::VPERMV3 for target shuffle combining llvm-svn: 262718
*	[X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing ↵	Simon Pilgrim	2016-03-04	1	-3/+20
\| \| \| \| \| \|	elements from the second input of a binary shuffle (punpcklbw) llvm-svn: 262710
*	AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics	Nikolay Haustov	2016-03-04	1	-0/+124
\| \| \| \| \| \| \| \| \| \| \|	These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. Initial change by Nicolai Hähnle Differential Revision: http://reviews.llvm.org/D17401 llvm-svn: 262701
*	llvm/test/CodeGen/ARM/rem_crash.ll: Avoid unsupported targets to specify ↵	NAKAMURA Takumi	2016-03-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	explicit triple. Otherwise, when targeting win32, we see: LLVM ERROR: CPU: 'generic' does not support ARM mode execution! llvm-svn: 262668
*	[X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test.	Simon Pilgrim	2016-03-03	1	-0/+15
\| \| \| \| \| \|	PSHUFB decoder was assuming that input was 128 or 256-bit vector only. llvm-svn: 262661
*	[X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS	Simon Pilgrim	2016-03-03	1	-9/+0
\| \| \| \| \| \| \| \| \| \|	The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic. This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle. Differential Revision: http://reviews.llvm.org/D17681 llvm-svn: 262635
*	[X86] Don't assume that shuffle non-mask operands starts at #0.	Ahmed Bougacha	2016-03-03	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	That's not the case for VPERMV/VPERMV3, which cover all possible combinations (the C intrinsics use a different order; the AVX vs AVX512 intrinsics are different still). Since: r246981 AVX-512: Lowering for 512-bit vector shuffles. VPERMV is recognized in getTargetShuffleMask. This breaks assumptions in most callers, as they expect the non-mask operands to start at index 0. VPERMV has the mask as operand #0; VPERMV3 has it in the middle. Instead of the faulty assumption, have getTargetShuffleMask return its operands as well. One alternative we considered was to change the operand order of VPERMV, but we agreed to stick to the instruction order, as there is more AVX512 weirdness to cover (vpermt2/vpermi2 in particular). Differential Revision: http://reviews.llvm.org/D17041 llvm-svn: 262627
*	[AArch64] fold 'isPositive' vector integer operations (PR26819)	Sanjay Patel	2016-03-03	1	-15/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is one of the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26819 Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might come from autovectorization even if it's unlikely to be produced from C NEON intrinsics. The patch is based on the x86 equivalent: http://reviews.llvm.org/rL262036 Differential Revision: http://reviews.llvm.org/D17834 llvm-svn: 262623
*	AVX512: Combine AND + TESTM instructions .	Igor Breger	2016-03-03	1	-0/+61
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17844 llvm-svn: 262621
*	Making rem_crash.ll target-specific	Renato Golin	2016-03-03	3	-1/+516
\| \| \| \| \| \| \| \| \| \|	This test failed in some ARM bots after a divmod change because it was running on a native llc, instead of a targeted one. This makes sure the test is target-specific (as intended), and also copies it to the ARM and AArch64 directories. If it is also supposed to work on other architectures, I'll leave that as an exercise to the respective maintainers. llvm-svn: 262620
*	[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG	Simon Pilgrim	2016-03-03	2	-33/+18
\| \| \| \| \| \| \| \|	Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 262599
*	Revert "[ARM] Merging 64-bit divmod lib calls into one"	Renato Golin	2016-03-03	1	-3/+1
\| \| \| \| \| \|	This reverts commit r262507, which broke some ARM buildbots. llvm-svn: 262594
*	[LLVM][AVX512] PSRLWI Change imm8 to int	Michael Zuckerman	2016-03-03	2	-15/+15
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17753 llvm-svn: 262592
*	[X86] Enable forwarding bool arguments in tail calls (PR26305)	Hans Wennborg	2016-03-03	1	-0/+27
\| \| \| \| \| \| \| \| \|	The code was previously not able to track a boolean argument at a call site back to the formal argument of the caller. Differential Revision: http://reviews.llvm.org/D17786 llvm-svn: 262575
*	[PPCVSXFMAMutate] Temporarily disable this pass	Tim Shen	2016-03-03	4	-9/+9
\| \| \| \|	llvm-svn: 262573
*	[MBP] Avoid placing random blocks between loop preheader and header	Philip Reames	2016-03-03	1	-0/+39
\| \| \| \| \| \| \| \| \| \|	If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code. The particular reason for the rejection of the loop chain was that we were scanning predecessors of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact the backedge predecessor was already part of the existing loop chain (oops!). Differential Revision: http://reviews.llvm.org/D17830 llvm-svn: 262547
*	[X86] Don't give catch objects a displacement of zero	David Majnemer	2016-03-03	3	-15/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Catch objects with a displacement of zero do not initialize a catch object. The displacement is relative to %rsp at the end of the function's prologue for x86_64 targets. If we place an object at the top-of-stack, we will end up with a displacement of zero resulting in our catch object remaining uninitialized. Address this by creating our catch objects as fixed objects. We will ensure that the UnwindHelp object is created after the catch objects so that no catch object will have a displacement of zero. Differential Revision: http://reviews.llvm.org/D17823 llvm-svn: 262546