bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[AMDGPU] Assembler: Support DPP instructions.	Sam Kolton	2016-03-09	8	-46/+350
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support DPP syntax as used in SP3 (except several operands syntax). Added dpp-specific operands in td-files. Added DPP flag to TSFlags to determine if instruction is dpp in InstPrinter. Support for VOP2 DPP instructions in td-files. Some tests for DPP instructions. ToDo: - VOP2bInst: - vcc is considered as operand - AsmMatcher doesn't apply mnemonic aliases when parsing operands - v_mac_f32 - v_nop - disable instructions with 64-bit operands - change dpp_ctrl assembler representation to conform to sp3 Review: http://reviews.llvm.org/D17804 llvm-svn: 263008
*	[AMDGPU] Assembler: Support abs() syntax.	Nikolay Haustov	2016-03-09	1	-2/+19
\| \| \| \| \| \| \| \| \|	Support legacy SP3 abs(v1) syntax. InstPrinter still uses \|v1\|. Add tests. Differential Revision: http://reviews.llvm.org/D17887 llvm-svn: 263006
*	[AMDGPU] Assembler: Fix s_setpc_b64	Nikolay Haustov	2016-03-09	1	-1/+1
\| \| \| \| \| \| \| \|	s_setpc_b64 has just one 64-bit source which is the address of instruction to jump to. Differential Revision: http://reviews.llvm.org/D17888 llvm-svn: 263005
*	[WebAssembly] Update comments about irreducible control flow.	Dan Gohman	2016-03-09	2	-8/+13
\| \| \| \|	llvm-svn: 262995
*	[WebAssembly] Implement irreducible control flow.	Dan Gohman	2016-03-09	5	-35/+297
\| \| \| \| \| \| \| \|	This implements a very simple conservative transformation that doesn't require more than linear code size growth. There's room for much more optimization in this space. llvm-svn: 262982
*	Revert r262759 and r262760.	Quentin Colombet	2016-03-08	1	-9/+0
\| \| \| \| \| \| \| \|	The fix, which uses the library call for atomic compare and swap when the instruction is not safe to use, may be incorrect. Indeed, the library call may not exist on all platforms. In other words, we need a better fix! llvm-svn: 262943
*	[AArch64] Add MMOs to unscaled pairs.	Chad Rosier	2016-03-08	1	-3/+2
\| \| \| \| \| \| \|	Test to be committed in follow up commit, per discussion in D17097. http://reviews.llvm.org/D17097 llvm-svn: 262942
*	[ARM] Simplify ARMInstr*.td by getting rid of identity PatFrags (NFC)	Artyom Skrobov	2016-03-08	3	-107/+74
\| \| \| \| \| \| \| \| \| \|	Reviewers: t.p.northover, grosbach, resistor Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D17636 llvm-svn: 262936
*	Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ↵	Hans Wennborg	2016-03-08	1	-24/+9
\| \| \| \| \| \| \| \|	ZERO_EXTEND_VECTOR_INREG" This caused PR26870. llvm-svn: 262935
*	AVX512: Add extract_subvector patterns v8i1->v4i1 , v4i1->v2i1.	Igor Breger	2016-03-08	1	-0/+8
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17953 llvm-svn: 262929
*	[Power9] Implement new vsx instructions: load, store instructions for vector ↵	Kit Barton	2016-03-08	7	-0/+214
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and scalar We follow the comments mentioned in http://reviews.llvm.org/D16842#344378 to implement this new patch. This patch implements the following vsx instructions: Vector load/store: lxv lxvx lxvb16x lxvl lxvll lxvh8x lxvwsx stxv stxvb16x stxvh8x stxvl stxvll stxvx Scalar load/store: lxsd lxssp lxsibzx lxsihzx stxsd stxssp stxsibx stxsihx 21 instructions Phabricator: http://reviews.llvm.org/D16919 llvm-svn: 262906
*	[WebAssembly] Update for spec change from tableswitch to br_table.	Dan Gohman	2016-03-08	5	-18/+18
\| \| \| \| \| \| \|	Also note that the operand order changed; the default label is now listed after the regular labels. llvm-svn: 262903
*	[AArch64] Initialize GlobalISel as part of the target initialization.	Quentin Colombet	2016-03-08	1	-0/+2
\| \| \| \|	llvm-svn: 262897
*	A couple more UB fixes for C++14 sized deallocation.	Richard Smith	2016-03-08	1	-0/+4
\| \| \| \|	llvm-svn: 262891
*	AMDGPU: Match more med3 integer patterns	Matt Arsenault	2016-03-07	2	-0/+33
\| \| \| \|	llvm-svn: 262864
*	AMDGPU: Remove a fixme for ptrtoint handling	Matt Arsenault	2016-03-07	1	-1/+0
\| \| \| \|	llvm-svn: 262854
*	AMDGPU: Move function only used by R600	Matt Arsenault	2016-03-07	4	-18/+17
\| \| \| \|	llvm-svn: 262853
*	[ms-inline-asm][AVX512] Add ability to use k registers in MS inline asm + ↵	Marina Yatsina	2016-03-07	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fix bug with curly braces Until now curly braces could only be used in MS inline assembly to mark block start/end. All curly braces were removed completely at a very early stage. This approach caused bugs like: "m{o}v eax, ebx" turned into "mov eax, ebx" without any error. In addition, AVX-512 added special operands (e.g., k registers), which are also surrounded by curly braces that mark them as such. Now, we need to keep the curly braces and identify at a later stage if they are marking block start/end (if so, ignore them), or surrounding special AVX-512 operands (if so, parse them as such). This patch fixes the bug described above and enables the use of AVX-512 special operands. This commit is the llvm part of the patch. The clang part of the review is: http://reviews.llvm.org/D17766 The llvm part of the review is: http://reviews.llvm.org/D17767 Differential Revision: http://reviews.llvm.org/D17767 llvm-svn: 262843
*	[X86][AVX512] Fixed VPERMT2* shuffle mask decoding and enabled target ↵	Simon Pilgrim	2016-03-06	3	-11/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	shuffle combining. Patch to add support for target shuffle combining of X86ISD::VPERMV3 nodes, including support for detecting unary shuffles. This uncovered several issues with the X86ISD::VPERMV3 shuffle mask decoding of non-64 bit shuffle mask elements - the bit masking wasn't being correctly computed. Removed non-constant pool mask decode path as we have no way of testing it right now. Differential Revision: http://reviews.llvm.org/D17916 llvm-svn: 262809
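As a rough illustration of the index masking that decoding a two-source permute mask involves, the Python sketch below keeps only the low bits of each raw index that can address one of the 2N elements of the two concatenated N-element sources. The function name and the flat 0..2N-1 result convention are assumptions made for this example; this is not the LLVM decoder and does not reproduce the specific bug fixed by the commit.

```python
def decode_two_source_permute_mask(raw_indices, num_elts):
    """Reduce raw index values to the range 0..2*num_elts-1.

    Hypothetical helper: results below num_elts refer to the first source,
    results at or above num_elts refer to the second source.
    """
    if num_elts <= 0 or num_elts & (num_elts - 1):
        raise ValueError("num_elts must be a power of two")
    if len(raw_indices) != num_elts:
        raise ValueError("expected one index per element")
    # Only log2(2 * num_elts) low bits of each index are meaningful,
    # whatever the element width is.
    index_mask = 2 * num_elts - 1
    return [raw & index_mask for raw in raw_indices]


# 16 x 32-bit elements: five index bits are significant, higher bits are ignored.
raw = [0, 16, 31, 37] + [0] * 12
decoded = decode_two_source_permute_mask(raw, 16)
```

Deriving the mask from the element count rather than from a fixed constant keeps the same helper valid for 8, 16, 32 or 64 elements.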
*	[AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler.	Valery Pykhtin	2016-03-06	2	-157/+8
\| \| \| \| \| \| \| \|	Engages code from r262804. Differential Revision: http://reviews.llvm.org/D17151 llvm-svn: 262808
*	fix sanitizer-ppc64be-linux failure for r262804	Valery Pykhtin	2016-03-06	1	-1/+1
\| \| \| \| \| \| \| \|	error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move] http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/930 llvm-svn: 262805
*	[AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields	Valery Pykhtin	2016-03-06	4	-0/+369
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17150 llvm-svn: 262804
*	AVX512BW: Support llvm intrinsic masked vector load/store for i8/i16 element ↵	Igor Breger	2016-03-06	2	-37/+56
\| \| \| \| \| \| \| \|	types on SKX Differential Revision: http://reviews.llvm.org/D17913 llvm-svn: 262803
*	[AMDGPU] SOPxx instructions operand naming fixed in td files.	Valery Pykhtin	2016-03-06	3	-68/+68
\| \| \| \| \| \| \| \| \|	dst -> sdst ssrcN -> srcN Differential Revision: http://reviews.llvm.org/D17646 llvm-svn: 262801
*	[X86] Use high bits of return value from getEncoding instead of predicate ↵	Craig Topper	2016-03-06	1	-162/+101
\| \| \| \| \| \|	functions to populate the REX and VEX prefix bits that extend register encodings. NFC llvm-svn: 262800
*	[X86] Remove unnecessary masking. The assert above it already guaranteed it. NFC	Craig Topper	2016-03-06	1	-2/+0
\| \| \| \|	llvm-svn: 262799
*	[X86] Use uint8_t instead of unsigned char as it shortens the code and more ↵	Craig Topper	2016-03-06	1	-27/+26
\| \| \| \| \| \|	explicitly reflects the desired size. llvm-svn: 262798
*	AVX512: Remove VSHRI kmask patterns from TD file. It is incorrect to use ↵	Igor Breger	2016-03-06	2	-87/+80
\| \| \| \| \| \| \| \|	kshiftw to implement VSHRI v4i1: bits 15-4 are undef, so the upper bits of v4i1 may not be zeroed. v4i1 should be zero-extended to v16i1 (or any natively supported vector). Differential Revision: http://reviews.llvm.org/D17763 llvm-svn: 262797
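The problem described above can be modelled with plain integers. The Python sketch below treats a mask register as a 16-bit value whose low four bits hold a v4i1 and whose bits 15-4 hold arbitrary contents; the helper names and the sample "garbage" value are assumptions chosen for illustration, not LLVM code.

```python
def kshiftr16(k, amount):
    """Model a 16-bit logical right shift of a mask register."""
    return (k & 0xFFFF) >> amount


def low4(k):
    """View the low four bits of the register as a v4i1 value."""
    return k & 0xF


value = 0b1000            # the v4i1 we actually care about
garbage_upper = 0xABD0    # hypothetical undefined contents of bits 15-4
k = garbage_upper | value

# Shifting the raw register pulls an undefined bit (bit 4) into the result.
shifted_raw = low4(kshiftr16(k, 1))

# Zero-extending first (clearing bits 15-4) gives the intended result.
shifted_zext = low4(kshiftr16(low4(k), 1))
```

With these sample values `shifted_raw` is `0b1100` while `shifted_zext` is `0b0100`, which is why the upper bits have to be cleared before the 16-bit shift.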
*	[X86][AVX] Improved VPERMILPS variable shuffle mask decoding.	Simon Pilgrim	2016-03-05	3	-1/+43
\| \| \| \| \| \| \| \| \| \|	Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool. Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization. Followup to D17681 llvm-svn: 262784
*	[X86] AMD Bobcat CPU (btver1) doesn't support XSAVE	Simon Pilgrim	2016-03-05	1	-1/+0
\| \| \| \| \| \| \| \|	btver1 is a SSSE3/SSE4a only CPU - it doesn't have AVX and doesn't support XSAVE. Differential Revision: http://reviews.llvm.org/D17683 llvm-svn: 262782
*	[X86] Fix the lowering of setjmp intrinsic on i386.	Quentin Colombet	2016-03-05	1	-0/+10
\| \| \| \| \| \| \| \| \| \|	When the lowering of the setjmp intrinsic requires a global base pointer to be set, make sure such pointer gets defined by the CGBR pass. This fixes PR26742. llvm-svn: 262762
*	[X86] Do not use cmpxchgXXb when we need the base pointer (RBX).	Quentin Colombet	2016-03-04	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \|	cmpxchgXXb uses RBX as one of its implicit arguments. I.e., when we use that instruction we need to clobber RBX. This is generally fine, except when RBX is a reserved register because in that case, the register allocator will not track its value and will not save and restore it when interferences occur. rdar://problem/24851412 llvm-svn: 262759
*	Fix build breakage	David Majnemer	2016-03-04	1	-4/+5
\| \| \| \|	llvm-svn: 262756
*	[X86] Support cleaning more than 2**16 bytes of stack	David Majnemer	2016-03-04	6	-8/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The x86 ret instruction has a 16 bit immediate indicating how many bytes to pop off of the stack beyond the return address. There is a problem when extremely large structs are passed by value: we might not be able to fit the number of bytes to pop into the return instruction. To fix this, expand RET_FLAG a little later and use a special sequence to clean the stack: pop %ecx ; return address is now in %ecx add $n, %esp ; clean the stack push %ecx ; bring the return address back on the stack ret ; pop the return address and jmp to its value llvm-svn: 262755
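The choice between the two return forms can be sketched as follows. This Python function only produces assembly text for illustration; its name, the AT&T-style output, and the exact threshold check are assumptions for the example and are not taken from the LLVM patch.

```python
def emit_callee_cleanup_return(bytes_to_pop):
    """Return assembly lines for a 32-bit x86 return that pops bytes_to_pop bytes.

    Hypothetical helper: uses "ret $n" when n fits in the 16-bit immediate,
    otherwise falls back to the pop/add/push/ret sequence from the commit message.
    """
    if bytes_to_pop < 0:
        raise ValueError("bytes_to_pop must be non-negative")
    if bytes_to_pop == 0:
        return ["ret"]
    if bytes_to_pop < (1 << 16):
        # The amount fits in the 16-bit immediate of ret.
        return ["ret $%d" % bytes_to_pop]
    return [
        "pop %ecx",                       # return address is now in %ecx
        "add $%d, %%esp" % bytes_to_pop,  # clean the stack
        "push %ecx",                      # put the return address back
        "ret",                            # pop the return address and jump to it
    ]


small = emit_callee_cleanup_return(8)
large = emit_callee_cleanup_return(1 << 16)
```

The fallback sequence clobbers %ecx, so it is only usable where that register is free at the return point.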
*	[WebAssembly] Add another possible code-size optimization to README.txt	Dan Gohman	2016-03-04	1	-0/+6
\| \| \| \|	llvm-svn: 262740
*	[ARM] Merging 64-bit divmod lib calls into one	Renato Golin	2016-03-04	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for values of up to 32 bits. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long-time FIXME in the tests). llvm-svn: 262738
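The effect of the merge can be illustrated with a toy model. The Python sketch below walks a list of 64-bit division and remainder operations and emits one __aeabi_ldivmod call per distinct operand pair; the tuple-based representation and the function name are assumptions for illustration and bear no relation to the actual DAG combiner code.

```python
def lower_divrem64(ops):
    """Lower (opcode, lhs, rhs) tuples to runtime calls.

    Hypothetical model: an sdiv and an srem on the same operands share a
    single __aeabi_ldivmod call, which returns both quotient and remainder.
    """
    calls = []
    already_called = set()
    for opcode, lhs, rhs in ops:
        if opcode not in ("sdiv", "srem"):
            continue
        key = (lhs, rhs)
        if key in already_called:
            # The earlier call already produced both results.
            continue
        already_called.add(key)
        calls.append(("__aeabi_ldivmod", lhs, rhs))
    return calls


merged = lower_divrem64([("sdiv", "a", "b"), ("srem", "a", "b")])
separate = lower_divrem64([("sdiv", "a", "b"), ("srem", "a", "c")])
```

Here `merged` contains one call and `separate` contains two, because the second pair of operations does not share its operands.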
*	AMDGPU/SI: Add support for spilling SGPRs to scratch buffer	Tom Stellard	2016-03-04	5	-30/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732
*	AMDGPU/SI: Enable frame index scavenging during PrologEpilogueInserter	Tom Stellard	2016-03-04	2	-8/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows us to use virtual registers when we need extra registers for inserting spill instructions in SIRegisterInfo::eliminateFrameIndex(). Once all the frame indices have been eliminated, the PrologEpilogueInserter does an extra pass over the program to replace all virtual registers with physical ones. This allows us to make more efficient use of our emergency spill slots, so we only need to create one. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17591 llvm-svn: 262728
*	[Hexagon] Fix lowering of calls with the return type of i1	Krzysztof Parzyszek	2016-03-04	1	-10/+30
\| \| \| \| \| \| \|	This fixes an assertion in test/CodeGen/Hexagon/ifcvt-edge-weight.ll when run with -debug-only=isel llvm-svn: 262726
*	[mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for ↵	Zoran Jovanovic	2016-03-04	2	-2/+2
\| \| \| \| \| \| \| \| \| \|	microMIPS is generated. Author: milena.vujosevic.janicic Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D17373 llvm-svn: 262725
*	Test commit access	Sam Kolton	2016-03-04	1	-1/+1
\| \| \| \|	llvm-svn: 262714
*	test commit	Valery Pykhtin	2016-03-04	1	-1/+1
\| \| \| \|	llvm-svn: 262709
*	Make headers self-contained again.	Benjamin Kramer	2016-03-04	1	-0/+1
\| \| \| \|	llvm-svn: 262702
*	AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics	Nikolay Haustov	2016-03-04	4	-32/+169
\| \| \| \| \| \| \| \| \| \| \|	These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. Initial change by Nicolai Hähnle Differential Revision: http://reviews.llvm.org/D17401 llvm-svn: 262701
*	[X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test.	Simon Pilgrim	2016-03-03	1	-3/+3
\| \| \| \| \| \|	PSHUFB decoder was assuming that input was 128 or 256-bit vector only. llvm-svn: 262661
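A width-independent decode can be sketched as below. PSHUFB shuffles bytes within each 128-bit lane, so the Python function handles any mask whose length is a multiple of 16 bytes; the function name and the use of None for a zeroed byte are assumptions made for this example, not the LLVM implementation.

```python
def decode_pshufb_mask(raw_mask):
    """Decode a PSHUFB control mask into source byte indices.

    Hypothetical helper: each result is an index into the source vector,
    or None where the control byte has bit 7 set (the result byte is zeroed).
    """
    if len(raw_mask) % 16 != 0:
        raise ValueError("mask length must be a multiple of 16 bytes")
    decoded = []
    for position, control in enumerate(raw_mask):
        if control & 0x80:
            decoded.append(None)
            continue
        # Bytes never cross 128-bit lanes: add the base of the current lane.
        lane_base = (position // 16) * 16
        decoded.append(lane_base + (control & 0x0F))
    return decoded


# A 512-bit (64-byte) mask that selects byte 3 of each lane everywhere.
mask512 = decode_pshufb_mask([3] * 64)
```

Computing the lane base from the byte position, instead of special-casing 128-bit and 256-bit vectors, is what lets the same loop cover the 512-bit form.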
*	[X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS	Simon Pilgrim	2016-03-03	3	-19/+35
\| \| \| \| \| \| \| \| \| \|	The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic. This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle. Differential Revision: http://reviews.llvm.org/D17681 llvm-svn: 262635
*	[X86] Tidied up 256-bit -> 2 x 128-bit vector shift extraction.	Simon Pilgrim	2016-03-03	1	-14/+2
\| \| \| \| \| \|	lowerShift was manually splitting BUILD_VECTOR cases when it could just call Extract128BitVector which does this anyway. llvm-svn: 262633
*	[X86] Pulled out repeated code testing for constant vector shift amount. NFCI.	Simon Pilgrim	2016-03-03	1	-8/+6
\| \| \| \|	llvm-svn: 262631
*	MCU target has its own ABI, however X86 interrupt handler calling convention ↵	Amjad Aboud	2016-03-03	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	overrides this ABI. Fixed the ordering to check first for X86 interrupt handler then for MCU target. Differential Revision: http://reviews.llvm.org/D17801 llvm-svn: 262628
*	[X86] Don't assume that shuffle non-mask operands starts at #0.	Ahmed Bougacha	2016-03-03	2	-32/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	That's not the case for VPERMV/VPERMV3, which cover all possible combinations (the C intrinsics use a different order; the AVX vs AVX512 intrinsics are different still). Since: r246981 AVX-512: Lowering for 512-bit vector shuffles. VPERMV is recognized in getTargetShuffleMask. This breaks assumptions in most callers, as they expect the non-mask operands to start at index 0. VPERMV has the mask as operand #0; VPERMV3 has it in the middle. Instead of the faulty assumption, have getTargetShuffleMask return its operands as well. One alternative we considered was to change the operand order of VPERMV, but we agreed to stick to the instruction order, as there is more AVX512 weirdness to cover (vpermt2/vpermi2 in particular). Differential Revision: http://reviews.llvm.org/D17041 llvm-svn: 262627