bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[DAGCombine] Make sure we check the ResNo from UADDO before combining	Amaury Sechet	2017-06-11	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: UADDO has two results, and one must check the result number before doing any kind of combine. Without it, the transform is invalid. Reviewers: joerg Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34088 llvm-svn: 305162
*	[X86][SSE] Extended PR32368 to SSE/AVX1/AVX2	Simon Pilgrim	2017-06-10	1	-8/+142
\| \| \| \|	llvm-svn: 305154
*	[X86][AVX512] Added test case for PR32368	Simon Pilgrim	2017-06-10	1	-0/+19
\| \| \| \|	llvm-svn: 305153
*	AMDGPU : Fix ISA Version Definitions.	Wei Ding	2017-06-10	1	-0/+13
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D28531 llvm-svn: 305137
*	[PowerPC] add memcmp test with one constant operand and equality cmp; NFC	Sanjay Patel	2017-06-09	1	-3/+29
\| \| \| \|	llvm-svn: 305131
*	[AArch64] Add fallback in FastISel fp16 conversions	I-Jui (Ray) Sung	2017-06-09	1	-0/+131
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Fix assertion failures on F16 to/from int types in FastISel by falling back to regular ISel - Add a testcase of various conversion cases with FastISel (-O0) Reviewers: kristof.beyls, jmolloy, SjoerdMeijer Reviewed By: SjoerdMeijer Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls Differential Revision: https://reviews.llvm.org/D33734 llvm-svn: 305127
*	[AMDGPU] Add intrinsics for alignbit and alignbyte instructions	Stanislav Mekhanoshin	2017-06-09	1	-0/+23
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D34046 llvm-svn: 305098
*	[X86][SSE] Add support for PACKSS nodes to faux shuffle extraction	Simon Pilgrim	2017-06-09	1	-273/+265
\| \| \| \| \| \|	If the inputs won't saturate during packing then we can treat the PACKSS as a truncation shuffle llvm-svn: 305091
*	Reland "[SelectionDAG] Enable target specific vector scalarization of calls ↵	Simon Dardis	2017-06-09	4	-24/+1697
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and returns" By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown, backends can request that LLVM scalarize vector types for calls and returns. The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'. Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for calls and returns if vector types were not legal. If vector types were legal, a single 128bit vector argument would be assigned to a single 32 bit / 64 bit integer register. By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors. Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a)" into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)". This patch enables the MIPS backend to take either form for vector types. The previous version of this patch had a "conditional move or jump depends on uninitialized value". Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D27845 llvm-svn: 305083
*	[AMDGPU] Fix for issue in alloca to vector promotion pass	David Stuttard	2017-06-09	1	-0/+131
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The alloca promotion pass was not dealing with non-canonical input. Added some additional checks so the pass simply backs off from forms it can't deal with (non-canonical). Also added some test cases in non-canonical form to check that it no longer crashes. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31710 llvm-svn: 305079
*	Prevent RemoveDeadNodes from deleting an already deleted node.	Nirav Dave	2017-06-09	1	-0/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This protects against assertion errors like PR32659 which occur from a replacement deleting a node after it's been added to the list argument of RemoveDeadNodes. The specific failure from PR32659 does not currently happen, but it is still potentially possible. The underlying cause is that the callers of the change function build up a list of nodes to delete after having moved their uses, and it is possible that a move of a later node will cause a previously deleted node to be deleted. Reviewers: bkramer, spatel, davide Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33731 llvm-svn: 305070
*	[ARM] Add scheduling info for VFMS	Oliver Stannard	2017-06-09	1	-5/+86
\| \| \| \| \| \| \| \| \| \|	The scalar VFMS instructions did not have scheduling information attached (but VFMA did), which was causing assertion failures with the Cortex-A57 scheduling model and -fp-contract=fast. Differential Revision: https://reviews.llvm.org/D34040 llvm-svn: 305064
*	RegAllocPBQP: Do not assign reserved physical register	Matthias Braun	2017-06-08	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register. (1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite. (2) MachineVerifier: updated the test per MatzeB's review. (3) +testcase Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>! Differential Revision: https://reviews.llvm.org/D33947 llvm-svn: 305016
*	[Hexagon] Skip mux generation when predicate register is undefined	Krzysztof Parzyszek	2017-06-08	1	-0/+27
\| \| \| \|	llvm-svn: 305014
*	[CGP] don't expand a memcmp with nobuiltin attribute	Sanjay Patel	2017-06-08	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This matches the behavior used in the SDAG when expanding memcmp. For reference, we're intentionally treating the earlier fortified call transforms differently after: https://bugs.llvm.org/show_bug.cgi?id=23093 https://reviews.llvm.org/rL233776 One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers: https://reviews.llvm.org/D19781 https://reviews.llvm.org/D19801 Differential Revision: https://reviews.llvm.org/D34043 llvm-svn: 305007
*	AMDGPU: Use correct register names in inline assembly	Matt Arsenault	2017-06-08	14	-401/+401
\| \| \| \| \| \|	Fixes using physical registers in inline asm from clang. llvm-svn: 305004
*	[PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64	Guozhi Wei	2017-06-08	2	-14/+33
\| \| \| \| \| \| \| \| \| \|	In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64. This patch fixed PR32442. Differential Revision: https://reviews.llvm.org/D31407 llvm-svn: 305001
*	[AMDGPU] Force qsads instrs to use different dest register than source registers	Mark Searles	2017-06-08	3	-37/+68
\| \| \| \| \| \| \| \|	The V_MQSAD_PK_U16_U8, V_QSAD_PK_U16_U8, and V_MQSAD_U32_U8 take more than 1 pass in hardware. For these three instructions, the destination registers must be different than all sources, so that the first pass does not overwrite sources for the following passes. Differential Revision: https://reviews.llvm.org/D33783 llvm-svn: 304998
*	[Power9] Exploit vector integer extend instructions	Zaara Syeda	2017-06-08	1	-0/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds build vector patterns to exploit the vector integer extend instructions:
	  vextsb2w - Vector Extend Sign Byte To Word
	  vextsb2d - Vector Extend Sign Byte To Doubleword
	  vextsh2w - Vector Extend Sign Halfword To Word
	  vextsh2d - Vector Extend Sign Halfword To Doubleword
	  vextsw2d - Vector Extend Sign Word To Doubleword
	Differential Revision: https://reviews.llvm.org/D33510 llvm-svn: 304992
*	[PowerPC] add memcmp test with nobuiltin attr; NFC	Sanjay Patel	2017-06-08	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \|	In SDAG, we don't expand libcalls with a nobuiltin attribute. It's not clear if that's correct from the existing code comment: "Don't do the check if marked as nobuiltin for some reason." ...adding a test here either way to show that there is currently a different behavior implemented in the CGP-based expansion. llvm-svn: 304991
*	[x86] remove unused param from tests; NFC	Sanjay Patel	2017-06-08	1	-10/+10
\| \| \| \|	llvm-svn: 304989
*	[CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion	Sanjay Patel	2017-06-08	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The test diff for PowerPC shows we can better optimize if this case is one block. For x86, there would be a substantial difference if CGP expansion was enabled because branches are assumed cheap and SDAG can't optimize across blocks. Instead of this:
	_cmp_eq8:
	  movq (%rdi), %rax
	  cmpq (%rsi), %rax
	  je LBB23_1
	## BB#2: ## %res_block
	  movl $1, %ecx
	  jmp LBB23_3
	LBB23_1:
	  xorl %ecx, %ecx
	LBB23_3: ## %endblock
	  xorl %eax, %eax
	  testl %ecx, %ecx
	  sete %al
	  retq
	We get this:
	cmp_eq8:
	  movq (%rdi), %rcx
	  xorl %eax, %eax
	  cmpq (%rsi), %rcx
	  sete %al
	  retq
	And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable CGP memcmp() expansion for x86. I.e., we'll bypass the power-of-2 special cases currently optimized in SDAG because we can lower the IR produced here optimally. Differential Revision: https://reviews.llvm.org/D34005 llvm-svn: 304987
*	Add scheduler classes to integer/float horizontal operations.	Andrew V. Tischenko	2017-06-08	1	-16/+16
\| \| \| \| \| \| \|	This patch will close PR32801. Differential Revision: https://reviews.llvm.org/D33203 llvm-svn: 304986
*	[x86] add tests for memcmp expansion; NFC	Sanjay Patel	2017-06-08	1	-37/+293
\| \| \| \| \| \| \| \| \| \| \| \|	We already had a test to demonstrate PR33325: https://bugs.llvm.org/show_bug.cgi?id=33325 I'm adding tests for general memcmp expansion (see D34005 / D33963) and: https://bugs.llvm.org/show_bug.cgi?id=33329 ...plus non-power-of-2 sizes, so we can see what that looks like currently or if expanded. llvm-svn: 304979
*	This patch closes PR28513: an optimization of multiplication by different ↵	Andrew V. Tischenko	2017-06-08	4	-360/+4253
\| \| \| \| \| \| \| \|	constants. The initial patch was rejected; I fixed the issue and re-applied it. llvm-svn: 304972
*	[ARM] GlobalISel: Add more tests. NFC	Diana Picus	2017-06-08	1	-0/+149
\| \| \| \| \| \| \| \|	Add a couple of tests to increase coverage for the TableGen'erated code, in particular for rules where 2 generic instructions may be combined into a single machine instruction. llvm-svn: 304971
*	[Hexagon] Generate 'inbounds' GEPs in HexagonCommonGEP	Krzysztof Parzyszek	2017-06-07	1	-0/+20
\| \| \| \|	llvm-svn: 304937
*	[CGP] avoid zext/trunc of a memcmp expansion compare	Sanjay Patel	2017-06-07	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This could be viewed as another shortcoming of the DAGCombiner: when both operands of a compare are zexted from the same source type, we should be able to compare the original types. The effect on PowerPC perf is likely unnoticeable, but there's a visible regression for x86 if we feed the suboptimal IR for memcmp expansion to the DAG:
	_cmp_eq4_zexted_to_i64:
	  movl (%rdi), %ecx
	  movl (%rsi), %edx
	  xorl %eax, %eax
	  cmpq %rdx, %rcx
	  sete %al
	_cmp_eq4_better:
	  movl (%rdi), %ecx
	  xorl %eax, %eax
	  cmpl (%rsi), %ecx
	  sete %al
	llvm-svn: 304923
*	[mips][dsp] Modify repl.ph to accept signed immediate values	Petar Jovanovic	2017-06-07	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \|	Changed immediate type for repl.ph from uimm10 to simm10 as per the specs. Repl.qb still accepts uimm8. Both instructions now mimic the behaviour of GNU as. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D33594 llvm-svn: 304918
*	[X86] Add test to demonstrate inefficient lowering of v48i8 shuffle.	Guy Blank	2017-06-07	1	-0/+49
\| \| \| \|	llvm-svn: 304915
*	AMDGPU/GlobalISel: Mark 32-bit G_SELECT as legal	Tom Stellard	2017-06-07	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33949 llvm-svn: 304910
*	[x86] avoid flipping sign bits for vector icmp by using known bits	Sanjay Patel	2017-06-07	2	-104/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we know that both operands of an unsigned integer vector comparison are non-negative, then it's safe to directly use a signed-compare-greater-than instruction (the only non-equality integer vector compare predicate provided by SSE/AVX). We're intentionally not changing the condition code to signed in order to preserve the existing transforms that use min/max/psubus below here. This should solve PR33276: https://bugs.llvm.org/show_bug.cgi?id=33276 Differential Revision: https://reviews.llvm.org/D33862 llvm-svn: 304909
*	[PowerPC] Eliminate integer compare instructions - vol. 5	Nemanja Ivanovic	2017-06-07	5	-6/+567
\| \| \| \| \| \| \| \|	Adds handling for i64 SETNE comparison (both sign and zero extended). Differential Revision: https://reviews.llvm.org/D33720 llvm-svn: 304907
*	[mips] do not use FastISel when -mxgot is present	Petar Jovanovic	2017-06-07	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The clang compiler by default uses FastISel when invoked with -O0, which is also the default. In that case, passing -mxgot does not get honored, i.e. the code path that is to deal with a large GOT is not taken. Clang produces the same output regardless of -mxgot being present or not. This change checks whether -mxgot is passed as an option, and turns off FastISel if it is. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D33593 llvm-svn: 304906
*	[ARM] GlobalISel: Purge G_SEQUENCE	Diana Picus	2017-06-07	5	-64/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to the commit message from r296921, G_MERGE_VALUES and G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating G_SEQUENCE in the ARM backend and remove the code dealing with it. This boils down to the code breaking up double values for the soft float calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR + VMOVRRD and simplifies the code in the instruction selector. There's one occurrence of G_SEQUENCE left in arm-irtranslator.ll, but that is part of the target-independent code for translating constant structs. Therefore, it is beyond the scope of this commit. llvm-svn: 304902
*	[PowerPC] Eliminate integer compare instructions - vol. 3	Nemanja Ivanovic	2017-06-07	9	-39/+798
\| \| \| \| \| \| \| \|	Adds handling for i32 SETNE comparison (both sign and zero extended). Differential Revision: https://reviews.llvm.org/D33718 llvm-svn: 304901
*	[ARM] GlobalISel: Support G_XOR	Diana Picus	2017-06-07	4	-0/+168
\| \| \| \| \| \| \| \| \|	Same as the other binary operators: - legalize to 32 bits - map to GPRs - select to EORrr via TableGen'erated code llvm-svn: 304898
*	Revert "[mips] Fix test mips64fpldst.ll with machine verifier enabled"	Simon Dardis	2017-06-07	8	-19/+41
\| \| \| \| \| \| \|	This reverts commit r301394. It broke some internal buildbots, reverting while the issue is being investigated. llvm-svn: 304896
*	[X86][SSE] Fix an issue with PEXTRW/PEXTRB indices during shuffle combining	Simon Pilgrim	2017-06-07	1	-40/+4
\| \| \| \| \| \|	We were checking that the index was in range of the destination vector type, not the (larger) source vector type llvm-svn: 304894
*	[ARM] GlobalISel: Support G_OR	Diana Picus	2017-06-07	4	-0/+168
\| \| \| \| \| \| \| \| \|	Same as the other binary operators: - legalize to 32 bits - map to GPRs - select ORRrr thanks to TableGen'erated code llvm-svn: 304890
*	[ARM] GlobalISel: Support G_AND	Diana Picus	2017-06-07	4	-0/+170
\| \| \| \| \| \| \| \| \|	This is identical to the support for the other binary operators: - widen to s32 - map into GPR - select ANDrr (via TableGen'erated code) llvm-svn: 304885
*	[CGP / PowerPC] use direct compares if there's only one load per block in ↵	Sanjay Patel	2017-06-07	1	-20/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	memcmp() expansion I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle. I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time limitation in the DAG. We might have to just avoid those special-cases here in CGP. But regardless of that, I think this is a win for the more general cases. http://rise4fun.com/Alive/cbQ Differential Revision: https://reviews.llvm.org/D33963 llvm-svn: 304849
*	[PowerPC] auto-generate full checks and increase test coverage	Sanjay Patel	2017-06-06	1	-77/+160
\| \| \| \| \| \| \|	3 of the tests were testing exactly the same thing: memcmp(x, y, 16) != 0. I changed that to test 4, 7, and 16 bytes, so we can see how those differ. llvm-svn: 304838
*	Added tests for X86InterleavedStore.	Evgeny Stupachenko	2017-06-06	1	-0/+93
\| \| \| \| \| \| \| \| \| \|	Reviewers: RKSimon, DavidKreitzer Differential Revision: https://reviews.llvm.org/D33684 Patch by: Aleen Farhana <Farhana.aleen@gmail.com> llvm-svn: 304834
*	llc: Add ability to parse mir from stdin	Matthias Braun	2017-06-06	1	-0/+20
\| \| \| \| \| \| \| \|	- Add -x <language> option to switch between IR and MIR inputs. - Change MIR parser to read from stdin when filename is '-'. - Add a simple mir roundtrip test. llvm-svn: 304825
*	Fix PR23384 (part 3 of 3)	Evgeny Stupachenko	2017-06-06	7	-67/+67
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The patch makes instruction count the highest priority for LSR solution for X86 (previously registers had highest priority). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30562 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304824
*	MIRPrinter: Avoid assert() when printing empty INLINEASM strings.	Matthias Braun	2017-06-06	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \|	CodeGen uses MO_ExternalSymbol to represent the inline assembly strings. Empty strings for symbol names appear to be invalid. For now just special case the output code to avoid hitting an `assert()` in `printLLVMNameWithoutPrefix()`. This fixes https://llvm.org/PR33317 llvm-svn: 304815
*	[mips] Add madd4 subtarget feature	Petar Jovanovic	2017-06-06	1	-222/+233
\| \| \| \| \| \| \| \| \| \| \|	Addition of a feature and a predicate used to control generation of madd.fmt and similar instructions. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D33400 llvm-svn: 304801
*	[X86][AVX1] Split 256-bit vector non-temporal FastISel loads to keep it ↵	Simon Pilgrim	2017-06-06	1	-6/+30
\| \| \| \| \| \| \| \|	non-temporal (PR32744) Extension to D33728 llvm-svn: 304798
*	AMDGPU/GlobalISel: Mark 32-bit G_ICMP as legal	Tom Stellard	2017-06-06	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33890 llvm-svn: 304797
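
Several of the memcmp commits in this log (r304987, r304923, r304849) rest on two small equivalences: an equality-only memcmp(x, y, 8) can be replaced by one 8-byte load per operand plus a single compare, and comparing two values that were zero-extended from the same narrower type gives the same answer as comparing the narrower values directly. The C sketch below illustrates those equivalences at the source level only. It is not the CodeGenPrepare implementation, and every function name in it is hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; none of them come from LLVM. */

/* Library-call form of the equality test: memcmp(x, y, 8) == 0. */
int cmp_eq8_libcall(const void *x, const void *y) {
    return memcmp(x, y, 8) == 0;
}

/* Expanded form: one 8-byte load per operand, then one compare.
 * memcpy is the portable way to express an unaligned load in C. */
int cmp_eq8_expanded(const void *x, const void *y) {
    uint64_t a, b;
    memcpy(&a, x, sizeof a);
    memcpy(&b, y, sizeof b);
    return a == b;
}

/* 4-byte case, suboptimal shape: load 32 bits, zero-extend both
 * operands to 64 bits, then compare the wide values. */
int cmp_eq4_zexted(const void *x, const void *y) {
    uint32_t a, b;
    memcpy(&a, x, sizeof a);
    memcpy(&b, y, sizeof b);
    return (uint64_t)a == (uint64_t)b;
}

/* 4-byte case, better shape: compare the 32-bit values directly. */
int cmp_eq4_direct(const void *x, const void *y) {
    uint32_t a, b;
    memcpy(&a, x, sizeof a);
    memcpy(&b, y, sizeof b);
    return a == b;
}

/* Checks that each pair of forms agrees on the given inputs.
 * Both pointers must reference at least 8 readable bytes. */
void check_expansion(const void *x, const void *y) {
    assert(cmp_eq8_libcall(x, y) == cmp_eq8_expanded(x, y));
    assert(cmp_eq4_zexted(x, y) == cmp_eq4_direct(x, y));
}
```

Calling check_expansion on any two 8-byte buffers should pass silently. The equivalence holds only for equality tests: an ordered memcmp result depends on byte order, so a single integer compare would not be a drop-in replacement there.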