bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	AMDGPU/SI: Implement GroupStaticSize Intrinsic for Dynamic LDS	Changpeng Fang	2016-03-15	2	-2/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Static LDS size is saved in MachineFunctionInfo::LDSSize. We define a pseudo instruction with the usesCustomInserter bit set. Then, in EmitInstrWithCustomInserter, we replace this pseudo instruction with a mov of MachineFunctionInfo::LDSSize. Reviewers: arsenm tstellarAMD Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18064 llvm-svn: 263563
*	Myriad: Add new sparc CPU kinds.	Douglas Katzman	2016-03-15	1	-0/+3
\| \| \| \|	llvm-svn: 263557
*	[MC] Rename TLSCALL as it's not ARM specific.	Davide Italiano	2016-03-15	2	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	`MCSymbolRefExpr` variant kind for TLSCALL is prefixed with _ARM_ since this is how it was originally implemented. The X86_64 version is exactly the same so there's no reason to create a new variant, we can just rename the existing one to be machine-independent. This generalization is the first step to implement support for GNU2 TLS dialect in MC. Differential Revision: http://reviews.llvm.org/D18160 llvm-svn: 263515
*	Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on	Eric Christopher	2016-03-14	1	-3/+1
\| \| \| \| \| \| \| \| \|	pre-SSE41 hardware" as it seems to be causing crashes during code generation in halide. PR forthcoming. This reverts commit r263303. llvm-svn: 263512
*	[SystemZ] Add missing isBranch flags to certain instruction	Ulrich Weigand	2016-03-14	1	-9/+11
\| \| \| \| \| \| \| \| \| \| \| \|	Some instructions were missing isBranch, isCall, or isTerminator flags. This didn't really affect code generation since most of the affected patterns were used only for the AsmParser and/or disassembler. However, it could affect tools using the MC layer to disassemble and parse binary code (e.g. via MCInstrDesc::mayAffectControlFlow). llvm-svn: 263478
*	[AArch64] Refactor AArch64FrameLowering::emitPrologue. NFC.	Chad Rosier	2016-03-14	1	-52/+50
\| \| \| \| \| \| \|	http://reviews.llvm.org/D18125 Patch by Aditya Kumar. llvm-svn: 263461
*	[AArch64] Break the dependency between FP and SP when possible.	Chad Rosier	2016-03-14	2	-3/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	When the SP is not changed because of realignment/VLAs etc., we restore the SP by using the previous value of SP and not the FP. Breaking the dependency will help in cases when the epilog of a callee is close to the epilog of the caller; for then "sub sp, fp, #" depends on the load restoring the FP in the epilog of the callee. http://reviews.llvm.org/D18060 Patch by Aditya Kumar and Evandro Menezes. llvm-svn: 263458
*	[Mips] Fix -Wunused-private-field warning after r263444.	Chad Rosier	2016-03-14	3	-7/+6
\| \| \| \|	llvm-svn: 263454
*	[DAG] use !isUndef() ; NFCI	Sanjay Patel	2016-03-14	6	-37/+32
\| \| \| \|	llvm-svn: 263453
*	[DAG] use isUndef() ; NFCI	Sanjay Patel	2016-03-14	9	-98/+85
\| \| \| \|	llvm-svn: 263448
*	AMDGPU/SI: Handle wait states required for DPP instructions	Tom Stellard	2016-03-14	2	-0/+63
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17543 llvm-svn: 263447
*	[x86, AVX] replace masked load with full vector load when possible	Sanjay Patel	2016-03-14	1	-7/+25
\| \| \| \| \| \| \| \| \| \| \| \| \|	Converting masked vector loads to regular vector loads for x86 AVX should always be a win. I raised the legality issue of reading the extra memory bytes on llvm-dev. I did not see any objections. 1. x86 already does this kind of optimization for multiple scalar loads -> vector load. 2. If other targets have the same flexibility, we could move this transform up to CGP or DAGCombiner. Differential Revision: http://reviews.llvm.org/D18094 llvm-svn: 263446
*	[mips] MIPS32R6 compact branch support	Daniel Sanders	2016-03-14	13	-56/+358
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: MIPSR6 introduces a class of branches called compact branches. Unlike the traditional MIPS branches which have a delay slot, compact branches do not have a delay slot. The instruction following the compact branch is only executed if the branch is not taken and must not be a branch. It works by generating compact branches for MIPS32R6 when the delay slot filler cannot fill a delay slot. Then, inspecting the generated code for forbidden slot hazards (a compact branch with an adjacent branch or other CTI) and inserting nops to clear this hazard. Patch by Simon Dardis. Reviewers: vkalintiris, dsanders Subscribers: MatzeB, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D16353 llvm-svn: 263444
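The hazard-clearing step described above can be sketched as follows. This is a minimal model, not the MIPS backend's code: the `Kind` enum, `isCTI`, and `clearForbiddenSlots` are hypothetical names introduced for illustration, and real instructions are reduced to a single tag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instruction tags; real MachineInstrs carry far more detail.
enum class Kind { Plain, Branch, CompactBranch, Nop };

// A control transfer instruction (CTI) is any kind of branch in this model.
bool isCTI(Kind K) { return K == Kind::Branch || K == Kind::CompactBranch; }

// Insert a nop after every compact branch that is directly followed by a CTI,
// so that no CTI ever sits in a forbidden slot.
std::vector<Kind> clearForbiddenSlots(const std::vector<Kind> &In) {
  std::vector<Kind> Out;
  for (std::size_t I = 0; I < In.size(); ++I) {
    Out.push_back(In[I]);
    bool NextIsCTI = I + 1 < In.size() && isCTI(In[I + 1]);
    if (In[I] == Kind::CompactBranch && NextIsCTI)
      Out.push_back(Kind::Nop);
  }
  return Out;
}
```

A compact branch followed by an ordinary instruction is left alone; only the compact-branch-then-CTI pair gets a nop between the two.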
*	AMDGPU/SI: Incomplete shader binaries need to finish execution at the end	Marek Olsak	2016-03-14	2	-8/+24
\| \| \| \| \| \| \| \| \| \|	Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D18058 llvm-svn: 263441
*	AMDGPU: mark llvm.amdgcn.image.atomic.* as a source of divergence	Nicolai Haehnle	2016-03-14	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When multiple threads perform an atomic op with the same arguments, they will usually see different return values. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18101 llvm-svn: 263440
*	[mips] Use range-based for loops. NFC.	Vasileios Kalintiris	2016-03-14	1	-7/+5
\| \| \| \|	llvm-svn: 263438
*	[SystemZ] Avoid LER on z13 due to partial register dependencies	Ulrich Weigand	2016-03-14	2	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On the z13, it turns out to be more efficient to access a full floating-point register than just the upper half (as done e.g. by the LE and LER instructions). Current code already takes this into account when loading from memory by using the LDE instruction in place of LE. However, we still generate LER, which shows the same performance issues as LE in certain circumstances. This patch changes the back-end to emit LDR instead of LER to implement FP32 register-to-register copies on z13. llvm-svn: 263431
*	[mips] Fix an issue with long double when function roundl is defined	Zlatko Buljan	2016-03-14	1	-2/+2
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17760 llvm-svn: 263428
*	[mips] Range check uimm16_64	Daniel Sanders	2016-03-14	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D17725 llvm-svn: 263427
*	[mips] Simplify ordering of range checked immediate classes.	Daniel Sanders	2016-03-14	1	-29/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: With the addition of checks to ensure that operands have a strict ordering it has become tricky to manage the order in the way I originally intended. This patch linearizes the ordering which simplifies the implementation but requires an order that is arbitrary in places. Here are some examples: * uimm4 < uimm5 < uimm6 * simm4 < uimm4 < simm5 < uimm5 * uimm5 < uimm5_plus1 (1..32) < uimm5_plus32 (32..63) < uimm6 The term 'superset' starts to break down here since the _plus classes are not true supersets of uimm5 (but they are still subsets of uimm6). * uimm5 < uimm5_64, and uimm5 < vsplat_uimm5 This is entirely arbitrary. We need an ordering and what we pick is unimportant since only one is possible for a given mnemonic. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D17723 llvm-svn: 263423
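To make the 'superset' remark concrete, here is a small sketch of the ranges involved. The function names are hypothetical and only mirror the class names in the message; the ranges for the _plus classes (1..32 and 32..63) are the ones quoted above.

```cpp
#include <cstdint>

// Unsigned immediate of the given bit width: 0 .. 2^Bits - 1.
bool isUImm(unsigned Bits, int64_t V) {
  return V >= 0 && V < (int64_t(1) << Bits);
}

// uimm5_plus1 covers 1..32 and uimm5_plus32 covers 32..63 (per the message).
bool isUImm5Plus1(int64_t V) { return V >= 1 && V <= 32; }
bool isUImm5Plus32(int64_t V) { return V >= 32 && V <= 63; }
```

Under these definitions, 0 is a valid uimm5 but not a uimm5_plus1, so uimm5_plus1 is not a superset of uimm5, while every value accepted by either _plus class is still a valid uimm6.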
*	[AMDGPU] Assembler: SOP* instruction fixes	Nikolay Haustov	2016-03-14	2	-27/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	s_bitset0_b64, s_bitset1_b64 has 32-bit src0, not 64-bit. s_rfe_b64 has just one destination operand and no source. Uncomment S_BITCMP* and S_SETVSKIP, adjust SOPC_* classes for that. Add s_memrealtime test and change comments in smem.s to follow common style. Change test for s_memtime to use non-zero register to make it really test encoding. Add tests for s_buffer_load*. Add tests for SOPC instructions (same for SI and VI) Differential Revision: http://reviews.llvm.org/D18040 llvm-svn: 263420
*	[mips] Range check uimm6_lsl2.	Daniel Sanders	2016-03-14	4	-43/+32
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D17291 llvm-svn: 263419
*	Try to fix build of WebAssemblyRegStackify.cpp on Windows	Hans Wennborg	2016-03-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's failing to build on VS2015 with: C:\b\build\slave\ClangToTWin\build\src\third_party\llvm\lib\Target\WebAssembly\WebAssemblyRegStackify.cpp(520): error C2668: 'llvm::make_reverse_iterator': ambiguous call to overloaded function C:\b\build\slave\ClangToTWin\build\src\third_party\llvm\include\llvm/ADT/STLExtras.h(217): note: could be 'std::reverse_iterator<llvm::MachineBasicBlock::iterator> llvm::make_reverse_iterator<llvm::MachineInstrBundleIterator<llvm::MachineInstr>>(IteratorTy)' with [ IteratorTy=llvm::MachineInstrBundleIterator<llvm::MachineInstr> ] C:\b\depot_tools\win_toolchain\vs_files\391bbf1220d3edcd3cc3fccdb56224181e3b13a7\win_sdk\bin\..\..\VC\include\xutility(1217): note: or 'std::reverse_iterator<llvm::MachineBasicBlock::iterator> std::make_reverse_iterator<llvm::MachineInstrBundleIterator<llvm::MachineInstr>>(_RanIt)' [found using argument-dependent lookup] with [ _RanIt=llvm::MachineInstrBundleIterator<llvm::MachineInstr> ] I don't have VS2015 locally at the moment, but hopefully this will help. llvm-svn: 263418
*	AVX512: icmp operation should be always lowered to CMPM (AVX-512) ↵	Igor Breger	2016-03-14	1	-22/+23
\| \| \| \| \| \| \| \| \| \|	instruction on SKX. implemented by delena Differential Revision: http://reviews.llvm.org/D18054 llvm-svn: 263417
*	[AMDGPU] AsmParser: Factor out parseRegister. NFC.	Valery Pykhtin	2016-03-14	1	-24/+40
\| \| \| \|	llvm-svn: 263411
*	[AMDGPU] AsmParser: refactor post push_back vector access. NFC.	Valery Pykhtin	2016-03-14	1	-6/+5
\| \| \| \|	llvm-svn: 263409
*	[AMDGPU] AsmParser: remove redundant isReg checks. NFC.	Valery Pykhtin	2016-03-14	1	-7/+7
\| \| \| \|	llvm-svn: 263407
*	Remove PreserveNames template parameter from IRBuilder	Mehdi Amini	2016-03-13	1	-1/+1
\| \| \| \| \| \| \| \|	This reapplies r263258, which was reverted in r263321 because of issues on Clang side. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263393
*	[X86][SSE41] Avoid variable blend for constant v8i16 shifts	Simon Pilgrim	2016-03-13	1	-2/+7
\| \| \| \| \| \|	The SSE41 v8i16 shift lowering using (v)pblendvb is great for non-constant shift amounts, but if it is constant then we can efficiently reduce the VSELECT to shuffles with the pre-SSE41 lowering. llvm-svn: 263383
*	[X86] Remove many operands that represent memory stores from outs to ins. ↵	Craig Topper	2016-03-13	6	-34/+34
\| \| \| \| \| \|	These operands are the registers and immediates that specify the memory address not the memory itself thus they are inputs. llvm-svn: 263354
*	Fix for PR 26378	Nemanja Ivanovic	2016-03-12	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \|	This patch corresponds to review: http://reviews.llvm.org/D17712 We were not clearing the TOC vector in PPCAsmPrinter when initializing it. This caused duplicate definition asserts when the pass is reused on the module (i.e. with -compile-twice or in JIT contexts). llvm-svn: 263338
*	[X86] Make sure we do not clobber RBX with cmpxchg when used as a base pointer.	Quentin Colombet	2016-03-12	5	-12/+147
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cmpxchg[8\|16]b uses RBX as one of its arguments. In other words, using this instruction clobbers RBX, as it is defined to hold one of the inputs. When the backend uses a dynamically allocated stack, RBX is used as a reserved register for the base pointer. Reserved registers have special semantics that only the target understands and enforces; because of that, the register allocator doesn't use them, but it also doesn't try to make sure they are used properly (remember it does not know how they are supposed to be used). Therefore, when RBX is used as a reserved register but defined by something that is not compatible with that use, the register allocator will not fix the surrounding code to make sure it gets saved and restored properly around the broken code. It is the responsibility of the target to do the right thing with its reserved registers. To fix that, when the base pointer needs to be preserved, we use a different pseudo instruction for cmpxchg that saves RBX. That pseudo takes two more arguments than the regular instruction: - One is the value to be copied into RBX to set the proper value for the comparison. - The other is the virtual register holding the save of the value of RBX as the base pointer. This saving is done as part of isel (i.e., we emit a copy from rbx). cmpxchg_save_rbx <regular cmpxchg args>, input_for_rbx_reg, save_of_rbx_as_bp This gets expanded into: rbx = copy input_for_rbx_reg cmpxchg <regular cmpxchg args> rbx = save_of_rbx_as_bp Note: The actual modeling of the pseudo is a bit more complicated to make sure the interferences that appear after the pseudo gets expanded are properly modeled before that expansion. This fixes PR26883. llvm-svn: 263325
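The expansion of the pseudo can be illustrated with a toy model that works on strings instead of machine instructions. Everything here is hypothetical (the `CmpxchgSaveRbx` struct and `expandCmpxchgSaveRbx` are not LLVM APIs); it only reproduces the three-step sequence spelled out in the message.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the pseudo instruction's operands.
struct CmpxchgSaveRbx {
  std::string CmpxchgArgs;   // the regular cmpxchg operands
  std::string InputForRbx;   // value RBX must hold for the comparison
  std::string SaveOfRbxAsBp; // virtual register holding the saved base pointer
};

// Expand the pseudo into: set RBX, run cmpxchg, restore RBX as base pointer.
std::vector<std::string> expandCmpxchgSaveRbx(const CmpxchgSaveRbx &P) {
  return {
      "rbx = copy " + P.InputForRbx,
      "cmpxchg " + P.CmpxchgArgs,
      "rbx = " + P.SaveOfRbxAsBp,
  };
}
```

The point of the ordering is that RBX holds the comparison input only for the duration of the cmpxchg itself and is the base pointer again immediately afterwards.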
*	Temporarily revert:	Eric Christopher	2016-03-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit ae14bf6488e8441f0f6d74f00455555f6f3943ac Author: Mehdi Amini <mehdi.amini@apple.com> Date: Fri Mar 11 17:15:50 2016 +0000 Remove PreserveNames template parameter from IRBuilder Summary: Following r263086, we are now relying on a flag on the Context to discard Value names in release builds. Reviewers: chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18023 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258 91177308-0d34-0410-b5e6-96231b3b80d8 until we can figure out what to do about clang and Release build testing. This reverts commit 263258. llvm-svn: 263321
*	[X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware	Simon Pilgrim	2016-03-11	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask patterns - (Fix for PR25718). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 263303
*	[AArch64] Don't blindly lower f16/f128 FCCMPs.	Ahmed Bougacha	2016-03-11	1	-3/+16
\| \| \| \| \| \| \| \| \|	Instead, extend f16 (like we do when lowering a standalone SETCC), and let f128 be legalized to the RT calls. Fixes PR26803. llvm-svn: 263301
*	[WebAssembly] Add `final` keywords to a few more subclasses, for consistency.	Dan Gohman	2016-03-11	2	-2/+2
\| \| \| \|	llvm-svn: 263287
*	Fix spelling.	Simon Pilgrim	2016-03-11	1	-1/+1
\| \| \| \|	llvm-svn: 263266
*	Remove PreserveNames template parameter from IRBuilder	Mehdi Amini	2016-03-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Following r263086, we are now relying on a flag on the Context to discard Value names in release builds. Reviewers: chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18023 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263258
*	[AMDGPU] Fix VOPC instruction operand namings	Valery Pykhtin	2016-03-11	1	-2/+2
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17966 llvm-svn: 263242
*	[X86][AVX] Fixed issue where a long chain of shuffles could attempt to ↵	Simon Pilgrim	2016-03-11	1	-1/+4
\| \| \| \| \| \| \| \|	combine to a single (illegal) PSHUFB instruction. It's not enough that we test for SSSE3 - that's only OK for 128-bit vectors - we also need to test for AVX2 / AVX512BW for 256/512 bit vector cases. llvm-svn: 263239
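The width-dependent check described above amounts to something like the sketch below. The `SubtargetFeatures` struct and `canUsePSHUFB` are hypothetical names for illustration, not the X86 backend's actual interface.

```cpp
// Hypothetical feature flags; the real backend queries its subtarget object.
struct SubtargetFeatures {
  bool HasSSSE3;
  bool HasAVX2;
  bool HasAVX512BW;
};

// PSHUFB is only legal when the feature matching the vector width is present:
// SSSE3 for 128-bit, AVX2 for 256-bit, AVX512BW for 512-bit vectors.
bool canUsePSHUFB(unsigned VectorBits, const SubtargetFeatures &ST) {
  switch (VectorBits) {
  case 128:
    return ST.HasSSSE3;
  case 256:
    return ST.HasAVX2;
  case 512:
    return ST.HasAVX512BW;
  default:
    return false;
  }
}
```

A target with SSSE3 but without AVX2 therefore accepts a 128-bit PSHUFB and rejects a 256-bit one, which is the case the original check missed.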
*	[mips] MIPSR6 Instruction itineraries	Vasileios Kalintiris	2016-03-11	3	-65/+186
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Defines instruction itineraries for common MIPSR6 instructions. Patch by Simon Dardis. Reviewers: vkalintiris Subscribers: MatzeB, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D17198 llvm-svn: 263229
*	[mips] Range check simm4.	Daniel Sanders	2016-03-11	4	-34/+50
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D16811 llvm-svn: 263220
*	[AMDGPU] Assembler: change v_madmk operands to have same order as mad.	Nikolay Haustov	2016-03-11	4	-28/+19
\| \| \| \| \| \| \| \| \| \|	The constant is now at source operand 1 (previously at 2). This is also how it is in legacy AMD sp3 assembler. Update tests. Differential Revision: http://reviews.llvm.org/D17984 llvm-svn: 263212
*	[PM] Port GVN to the new pass manager, wire it up, and teach a couple of	Chandler Carruth	2016-03-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tests to run GVN in both modes. This is mostly the boring refactoring just like SROA and other complex transformation passes. There is some trickiness in that GVN's ValueNumber class requires hand holding to get to compile cleanly. I'm open to suggestions about a better pattern there, but I tried several before settling on this. I was trying to balance my desire to sink as much implementation detail into the source file as possible without introducing overly many layers of abstraction. Much like with SROA, the design of this system is made somewhat more cumbersome by the need to support both pass managers without duplicating the significant state and logic of the pass. The same compromise is struck here. I've also left a FIXME in a doxygen comment as the GVN pass seems to have pretty woeful documentation within it. I'd like to submit this with the FIXME and let those more deeply familiar backfill the information here now that we have a nice place in an interface to put that kind of documentation. Differential Revision: http://reviews.llvm.org/D18019 llvm-svn: 263208
*	AMDGPU: Don't use InstVisitor for AMDGPUPromoteAlloca	Matt Arsenault	2016-03-11	1	-6/+12
\| \| \| \| \| \| \| \|	Frontend authors are strongly encouraged to keep allocas in the entry block, so don't bother visiting every instruction in the other blocks of the function. llvm-svn: 263206
*	AMDGPU: R600 code splitting cleanup	Matt Arsenault	2016-03-11	32	-105/+93
\| \| \| \| \| \| \|	Move a few functions only used by R600 to R600 specific code, fix header macros to stop using R600, mark classes as final. llvm-svn: 263204
*	AMDGPU: Materialize sign bits with bfrev	Matt Arsenault	2016-03-11	1	-0/+24
\| \| \| \| \| \| \|	If a constant is the same as the reverse of an inline immediate, this is 4 bytes smaller than having to embed a 32-bit literal. llvm-svn: 263201
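A rough sketch of the test this commit implies: a constant qualifies when it is not an inline immediate itself but its bit-reversal is. The helper names are hypothetical, and the inline integer range of -16..64 is an assumption about the hardware rather than something stated in the message; floating-point inline constants are ignored here.

```cpp
#include <cstdint>

// Reverse the bit order of a 32-bit value, as a bit-reverse instruction would.
uint32_t reverseBits32(uint32_t V) {
  uint32_t R = 0;
  for (int I = 0; I < 32; ++I) {
    R = (R << 1) | (V & 1u);
    V >>= 1;
  }
  return R;
}

// Assumption: integer inline immediates cover -16..64.
bool isInlineIntImmediate(int32_t V) { return V >= -16 && V <= 64; }

// Returns true if Imm needs a literal but its bit-reversal is inline; in that
// case ReversedOperand receives the inline value to feed to the bit reverse.
bool canMaterializeWithBitReverse(uint32_t Imm, int32_t &ReversedOperand) {
  if (isInlineIntImmediate(static_cast<int32_t>(Imm)))
    return false; // already cheap, nothing to gain
  int32_t Rev = static_cast<int32_t>(reverseBits32(Imm));
  if (!isInlineIntImmediate(Rev))
    return false;
  ReversedOperand = Rev;
  return true;
}
```

The sign bit 0x80000000 is the motivating case: it is the bit-reversal of 1, so it can be produced from an inline operand instead of an embedded 32-bit literal.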
*	AArch64: only try to use scaled fcvt ops on legal vector types.	Tim Northover	2016-03-10	1	-1/+2
\| \| \| \| \| \| \|	Before, we ended up calling getSimpleVectorType on a <3 x float>, which asserted. llvm-svn: 263169
*	[x86] don't use a shuffle when a vselect will do; NFCI	Sanjay Patel	2016-03-10	1	-16/+5
\| \| \| \| \| \| \| \|	Looking at the IR definition of a masked load made me realize there was no reason to use a shuffle here, so we don't need to convert the format of the mask at all. llvm-svn: 263167
*	[X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ↵	Simon Pilgrim	2016-03-10	1	-9/+24
\| \| \| \| \| \| \| \| \| \| \| \|	ZERO_EXTEND_VECTOR_INREG Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion). Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 263159