bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[SystemZ] TargetTransformInfo cost functions implemented.	Jonas Paulsson	2017-04-12	11	-32/+620
	getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052
*	[AMDGPU] SDWA: make pass global	Sam Kolton	2017-04-12	1	-183/+175
	Summary: Remove checks for basic blocks. Reviewers: vpykhtin, rampitec, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31935 llvm-svn: 300040
*	[AMDGPU] Add a new pass to insert waitcnts. Leave under an option for testing.	Kannan Narayanan	2017-04-12	5	-1/+1881
	Based on comments in https://reviews.llvm.org/D31161. llvm-svn: 300023
*	Revert "[WebAssembly] Update use of Attributes after r299875"	Derek Schuff	2017-04-12	1	-14/+17
	This reverts commit 2a0eb61dcccb15058d5b2a572bb3da0cf47fd550, r300015. I raced with rnk on the commit. llvm-svn: 300016
*	[WebAssembly] Update use of Attributes after r299875	Derek Schuff	2017-04-12	1	-17/+14
	This fixes the failing WebAssemblyLowerEmscriptenEHSjLj tests. llvm-svn: 300015
*	AMDGPU: Insert wait at start of callee functions	Matt Arsenault	2017-04-11	1	-0/+14
	llvm-svn: 300000
*	AMDGPU: Refactor SIMachineFunctionInfo slightly	Matt Arsenault	2017-04-11	3	-16/+38
	Prepare for handling non-entry functions. llvm-svn: 299999
*	AMDGPU: Refactor argument lowering	Matt Arsenault	2017-04-11	10	-276/+375
	Split into smaller functions and prepare for handling non-entry functions. llvm-svn: 299998
*	AMDGPU: Fix folding reg_sequence into copy to phys reg	Matt Arsenault	2017-04-11	1	-0/+4
	This was producing an illegal reg_sequence defining a physical register with virtual register inputs. llvm-svn: 299997
*	AMDGPU: Prune unnecessary include	Matt Arsenault	2017-04-11	1	-2/+0
	llvm-svn: 299996
*	[AArch64] Fix scheduling info for INS(vector, general) instruction.	Balaram Makam	2017-04-11	2	-1/+6
	llvm-svn: 299994
*	[x86] Relax the check in areLoadsFromSameBasePtr	Easwaran Raman	2017-04-11	1	-19/+16
	Check if the scale operand is identical (doesn't have to be 1) and do not check the chain operand. Differential revision: https://reviews.llvm.org/D31833 llvm-svn: 299986
*	[AArch64] Simplify MacroFusion	Evandro Menezes	2017-04-11	1	-79/+89
	This patch assumes that the dependents to be scanned for the ExitSU are its predecessors; otherwise, the successors of the instr are scanned. Furthermore, sometimes the ExitSU was being fused twice, since it may be fused once when scanning the successors from the beginning of the BB and then again when scanning the predecessors of ExitSU. Thus, when scanning the successors of an instr, skip the ExitSU. llvm-svn: 299974
*	[X86] Create the correct ADC/SBB SDNode when lowering add.	Davide Italiano	2017-04-11	1	-2/+4
	Differential Revision: https://reviews.llvm.org/D31911 llvm-svn: 299973
*	Fix spelling compliment->complement. Mostly referring to 2s complement. NFC	Craig Topper	2017-04-11	3	-4/+4
	llvm-svn: 299970
*	[AMDGPU] Add A5 to data layout for amdgiz environment	Yaxun Liu	2017-04-11	1	-1/+1
	Differential Revision: https://reviews.llvm.org/D31589 llvm-svn: 299964
*	Module::getOrInsertFunction is using C-style vararg instead of variadic templates.	Serge Guelton	2017-04-11	3	-4/+3
	From a user perspective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's no type checking on the arguments. The variadic template is an obvious solution to both issues. Differential Revision: https://reviews.llvm.org/D31070 llvm-svn: 299949
*	Remove unused functions. Remove static qualifier from functions in header files. NFC.	Vassil Vassilev	2017-04-11	1	-10/+0
	llvm-svn: 299947
*	[AVR] Migrate to new MCAsmBackend applyFixup	Jonathan Roelofs	2017-04-11	2	-2/+2
	https://reviews.llvm.org/D31875 Patch by Leslie Zhai! llvm-svn: 299946
*	[ARM] Refactor Thumb2 sat instructions	Sam Parker	2017-04-11	1	-48/+30
	Refactor the USAT, SSAT, USAT16 and SSAT16 instruction descriptions for Thumb2. Differential Revision: https://reviews.llvm.org/D31933 llvm-svn: 299945
*	GlobalISel: Allow legalizing G_FADD to a libcall	Diana Picus	2017-04-11	1	-0/+3
	Use the same handling in the generic legalizer code as for the other libcalls (G_FREM, G_FPOW). Enable it on ARM for float and double so we can test it. llvm-svn: 299931
*	Revert "Turn some C-style vararg into variadic templates"	Diana Picus	2017-04-11	3	-3/+4
	This reverts commit r299925 because it broke the buildbots. See e.g. http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/6008 llvm-svn: 299928
*	Turn some C-style vararg into variadic templates	Serge Guelton	2017-04-11	3	-4/+3
	Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user perspective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's no type checking on the arguments. The variadic template is an obvious solution to both issues. llvm-svn: 299925
*	[PowerPC] multiply-with-overflow might use the CTR register	Hal Finkel	2017-04-11	1	-9/+11
	Check the legality of ISD::[US]MULO to see whether Intrinsic::[us]mul_with_overflow will legalize into a function call (and, thus, will use the CTR register). Fixes PR32485. Patch by Tim Neumann! Differential Revision: https://reviews.llvm.org/D31790 llvm-svn: 299910
*	Allow DataLayout to specify addrspace for allocas.	Matt Arsenault	2017-04-10	1	-1/+2
	LLVM makes several assumptions about address space 0. However, alloca is presently constrained to always return this address space. There's no real way to avoid using alloca, so without this there is no way to opt out of these assumptions. The problematic assumptions include:
	- That the pointer size used for the stack is the same size as the code size pointer, which is also the maximum sized pointer.
	- That 0 is an invalid, non-dereferenceable pointer value.
	These are problems for AMDGPU because alloca is used to implement the private address space, which uses a 32-bit index as the pointer value. Other pointers are 64-bit and behave more like LLVM's notion of generic address space. By changing the address space used for allocas, we can change our generic pointer type to be LLVM's generic pointer type, which does have similar properties. llvm-svn: 299888
*	Get the TOC save offset off of PPCFrameLowering rather than a separate copy of the same data.	Eric Christopher	2017-04-10	1	-1/+1
	llvm-svn: 299887
*	[mips] Use Triple::isLittleEndian to check endianness. NFC	Simon Atanasyan	2017-04-10	1	-3/+1
	llvm-svn: 299872
*	[ARM/AArch64] Ensure valid vector element types for interleaved accesses	Matthew Simpson	2017-04-10	6	-39/+86
	This patch refactors and strengthens the type checks performed for interleaved accesses. The primary functional change is to ensure that the interleaved accesses have valid element types. The added test cases previously failed because the element type is f128. Differential Revision: https://reviews.llvm.org/D31817 llvm-svn: 299864
*	AMDGPU: Fix crash when disassembling VOP3 mac	Matt Arsenault	2017-04-10	10	-19/+23
	The unused dummy src2_modifiers is missing, so it crashes when trying to print it. I tried to fully remove src2_modifiers, but there are some irritations in the places where it is converted to mad since it starts to require modifying use lists while iterating over them. llvm-svn: 299861
*	[X86][MMX] Add fast-isel support for MMX non-temporal writes	Simon Pilgrim	2017-04-10	1	-0/+4
	Differential Revision: https://reviews.llvm.org/D31754 llvm-svn: 299852
*	[ARM] GlobalISel: Support G_FPOW for float and double	Diana Picus	2017-04-10	1	-2/+3
	Legalize to a libcall. llvm-svn: 299841
*	AMDGPU: Actually write nops for writeNopData	Matt Arsenault	2017-04-08	1	-1/+14
	Before, this was just writing 0s, which ends up looking like a v_cndmask_b32 v0, s0, v0, vcc. Write out an encoded s_nop instead. llvm-svn: 299816
*	[AArch64] Refine Falkor Machine Model - Part 3	Balaram Makam	2017-04-08	5	-26/+135
	This concludes the refinements to Falkor Machine Model. It includes SchedPredicates for immediate zero and LSL Fast. Forwarding logic is also modeled for vector multiply and accumulate only. llvm-svn: 299810
*	[ARM] Prefer BIC over BFC in ARM mode.	Eli Friedman	2017-04-07	1	-0/+1
	BIC is generally faster, and it can put the output in a different register from the input. We already do this in Thumb2 mode; not sure why the equivalent fix never got applied to ARM mode. Differential Revision: https://reviews.llvm.org/D31797 llvm-svn: 299803
*	[AArch64] Allow global register asm("x18") or asm("w18") under -ffixed-x18	Petr Hosek	2017-04-07	1	-0/+5
	When using -ffixed-x18, the x18 (or w18) register can safely be used with the "global register variable" GCC extension, but the backend fails to recognize it. Patch by Roland McGrath. Differential Revision: https://reviews.llvm.org/D31793 llvm-svn: 299799
*	Revert "[SelectionDAG] Enable target specific vector scalarization of calls and returns"	Simon Dardis	2017-04-07	6	-198/+15
	This reverts commit r299766. This change appears to have broken the MIPS buildbots. Reverting while I investigate.
	Revert "[mips] Remove usage of debug only variable (NFC)"
	This reverts commit r299769. Follow up commit. llvm-svn: 299788
*	[AMDGPU] Unroll more to eliminate phis and conditions	Stanislav Mekhanoshin	2017-04-07	1	-2/+52
	Increase the threshold to unroll a loop which contains an "if" statement whose condition is defined by a PHI belonging to the loop. This may help to eliminate the if region and potentially even the PHI itself, saving on both divergence and registers used for the PHI. Add a small bonus for each such "if" statement. Differential Revision: https://reviews.llvm.org/D31693 llvm-svn: 299779
*	Use PMADDWD to expand reduction in a loop	Dehao Chen	2017-04-07	1	-0/+47
	Summary: PMADDWD can help improve 8/16 bit integer multiply-add operation performance for cases like:
	    for (int i = 0; i < count; i++)
	      a += x[i] * y[i];
	Reviewers: wmi, davidxl, hfinkel, RKSimon, zvi, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31679 llvm-svn: 299776
*	[GlobalISel] implement narrowing for G_CONSTANT.	Igor Breger	2017-04-07	1	-0/+16
	Summary: [GlobalISel] implement narrowing for G_CONSTANT. Reviewers: bogner, zvi, t.p.northover Reviewed By: t.p.northover Subscribers: llvm-commits, dberris, rovka, kristof.beyls Differential Revision: https://reviews.llvm.org/D31744 llvm-svn: 299772
*	[mips] Remove usage of debug only variable (NFC)	Simon Dardis	2017-04-07	1	-2/+2
	Fix the lld-x86_64-darwin13 buildbot by removing the declaration of a debug only variable and instead moving the value into the debug statement. llvm-svn: 299769
*	[mips][msa] Fix generation of bm(n)zi and bins[lr]i instructions	Petar Jovanovic	2017-04-07	2	-4/+4
	We have two cases here, the first one being the following instruction selection from the builtin function:
	    bm(n)zi builtin -> vselect node -> bins[lr]i machine instruction
	In case of bm(n)zi having an immediate which has either its high or low bits set, a bins[lr] instruction can be selected through the selectVSplatMask[LR] function. The function counts the number of bits set, and that value is being passed to the bins[lr]i instruction as its immediate, which in turn copies immediate modulo the size of the element in bits plus 1 as per specs, where we get the off-by-one error.
	The other case is:
	    bins[lr]i -> vselect node -> bsel.v
	In this case, a bsel.v instruction gets selected with a mask having one bit less set than required. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D30579 llvm-svn: 299768
*	[AMDGPU][MC] Fix for Bug 28211 + LIT tests	Dmitry Preobrazhensky	2017-04-07	2	-36/+48
	- corrected DS_GWS_* opcodes (see VI_Shader_Programming#16.pdf for detailed description)
	  - address operand is not used
	  - several opcodes have data operand
	  - all opcodes have offset modifier
	- DS_AND_SRC2_B32: corrected typo in mnemo
	- DS_WRAP_RTN_F32 replaced with DS_WRAP_RTN_B32
	- added CI/VI opcodes:
	  - DS_CONDXCHG32_RTN_B64
	  - DS_GWS_SEMA_RELEASE_ALL
	- added VI opcodes:
	  - DS_CONSUME
	  - DS_APPEND
	  - DS_ORDERED_COUNT
	Differential Revision: https://reviews.llvm.org/D31707 llvm-svn: 299767
*	[SelectionDAG] Enable target specific vector scalarization of calls and returns	Simon Dardis	2017-04-07	6	-15/+198
	By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown, backends can request that LLVM scalarize vector types for calls and returns.
	The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'.
	Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for calls and returns if vector types were not legal. If vector types were legal, a single 128bit vector argument would be assigned to a single 32 bit / 64 bit integer register.
	By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors.
	Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a)" into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)". This patch enables the MIPS backend to take either form for vector types.
	Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D27845 llvm-svn: 299766
*	[SystemZ] Check for presence of vector support in SystemZISelLowering	Jonas Paulsson	2017-04-07	2	-2/+6
	A test case was found with llvm-stress that caused DAGCombiner to crash when compiling for an older subtarget without vector support. SystemZTargetLowering::combineTruncateExtract() should do nothing for older subtargets. This check was placed in canTreatAsByteVector(), which also helps in a few other places. Review: Ulrich Weigand llvm-svn: 299763
*	[SystemZ] Remove confusing comment in combineEXTRACT_VECTOR_ELT()	Jonas Paulsson	2017-04-07	1	-2/+0
	It isn't just one-element vectors that can appear here. llvm-svn: 299762
*	[AMDGPU] Move SiShrinkInstruction and SDWAPeephole to SSAOptimization passes	Sam Kolton	2017-04-07	1	-5/+5
	Summary: The difference between PreRegAlloc() and MachineSSAOptimization() is that the former runs even at the -O0 optimization level. In my understanding, SiShrinkInstructions and SDWAPeephole shouldn't run when optimizations are disabled. With this change, the order of passes will not change. Reviewers: arsenm, vpykhtin, rampitec Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31705 llvm-svn: 299757
*	[ARM] GlobalISel: Support frem for 64-bit values	Diana Picus	2017-04-07	1	-0/+1
	Legalize to a libcall. llvm-svn: 299756
*	[ARM] GlobalISel: Support frem for 32-bit values	Diana Picus	2017-04-07	2	-5/+3
	Legalize to a libcall. On this occasion, also start allowing soft float subtargets. For the moment G_FREM is the only legal floating point operation for them. llvm-svn: 299753
*	[WebAssembly] Fix -Wcovered-switch-default warning	Derek Schuff	2017-04-06	1	-2/+1
	llvm-svn: 299736
*	AMDGPU/GFX9: Fix shared and private aperture queries	Konstantin Zhuravlyov	2017-04-06	3	-14/+35
	Differential Revision: https://reviews.llvm.org/D31786 llvm-svn: 299727
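
As a rough illustration of the reduction loop quoted in the PMADDWD entry (llvm-svn: 299776), the sketch below writes the same dot product two ways in plain C++: the straightforward loop, and a version grouped into adjacent pairs, which is the shape PMADDWD computes (multiply packed signed 16-bit values, then add adjacent 32-bit products). The function names `dot_scalar` and `dot_pairwise` are hypothetical and do not come from LLVM, and the 64-bit accumulator is an assumption made to keep the sketch free of overflow, so the instruction's 32-bit wraparound is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Straightforward form of the loop from the commit message:
//   for (int i = 0; i < count; i++) a += x[i] * y[i];
// (hypothetical helper, not LLVM code)
inline int64_t dot_scalar(const std::vector<int16_t>& x,
                          const std::vector<int16_t>& y) {
  assert(x.size() == y.size());
  int64_t a = 0;
  for (std::size_t i = 0; i < x.size(); ++i)
    a += static_cast<int64_t>(x[i]) * y[i];
  return a;
}

// Same reduction, grouped into adjacent pairs: each step forms two
// 16x16 -> 32-bit products and adds them, as one PMADDWD lane does.
// (hypothetical helper, not LLVM code)
inline int64_t dot_pairwise(const std::vector<int16_t>& x,
                            const std::vector<int16_t>& y) {
  assert(x.size() == y.size());
  int64_t a = 0;
  std::size_t i = 0;
  for (; i + 1 < x.size(); i += 2) {
    // A product of two int16_t values always fits in int32_t.
    const int32_t p0 = static_cast<int32_t>(x[i]) * y[i];
    const int32_t p1 = static_cast<int32_t>(x[i + 1]) * y[i + 1];
    // Widen before adding so the pair sum cannot overflow.
    a += static_cast<int64_t>(p0) + p1;
  }
  // Leftover element when the length is odd.
  if (i < x.size())
    a += static_cast<int64_t>(x[i]) * y[i];
  return a;
}
```

Both functions return the same value for any pair of equal-length inputs; the pairwise form only makes visible the grouping that a vectorizer could map onto the instruction.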