bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	NFC: MergeFunctions return early	JF Bastien	2016-04-12	1	-1/+1
	Same effect, easier to read. llvm-svn: 266128
*	AMDGPU: add llvm.amdgcn.buffer.load/store intrinsics	Nicolai Haehnle	2016-04-12	1	-59/+69
	Summary: They correspond to BUFFER_LOAD/STORE_DWORD[_X2,X3,X4] and mostly behave like llvm.amdgcn.buffer.load/store.format. They will be used by Mesa for SSBO and atomic counters at least when robust buffer access behavior is desired. (These instructions perform no format conversion and do buffer range checking per component.) As a side effect of sharing patterns with llvm.amdgcn.buffer.store.format, it has become trivial to add support for the f32 and v2f32 variants of that intrinsic, so the patch does so. Also DAG-ify (and fix) some tests that I noticed intermittent failures in while developing this patch. Some tests were (temporarily) adjusted for the required mayLoad/hasSideEffects changes to the BUFFER_STORE_DWORD* instructions. See also http://reviews.llvm.org/D18291. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18292 llvm-svn: 266126
*	[ThinLTO] Only compute imports for current module in FunctionImport pass	Teresa Johnson	2016-04-12	2	-24/+68
	Summary: The function import pass was computing all the imports for all the modules in the index, and only using the imports for the current module. Change this to instead compute only for the given module. This means that the exports list can't be populated, but they weren't being used anyway. Longer term, the linker can collect all the imports and export lists and serialize them out for consumption by the distributed backend processes which use this pass. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18945 llvm-svn: 266125
*	NFC: MergeFunctions update more comments	JF Bastien	2016-04-12	1	-7/+8
	They are wordy. Some words were wrong. llvm-svn: 266124
*	Add __atomic_* lowering to AtomicExpandPass.	James Y Knight	2016-04-12	3	-9/+562
	(Recommit of r266002, with r266011, r266016, and not accidentally including an extra unused/uninitialized element in LibcallRoutineNames) AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266115
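The "no mixing and matching" rule in the message above can be pictured with a toy model. This is only a sketch, not the code in AtomicExpandPass: the function `lower_atomic_load` and its parameter `max_native_bytes` are hypothetical, alignment is ignored, and the libcall names follow the sized/generic `__atomic_load_N` / `__atomic_load` convention of the GCC atomic library, which is assumed here to be what the pass targets. The point is that the choice depends only on the operation size, so every atomic operation of one size takes the same route.

```cpp
#include <cassert>
#include <string>

// Hypothetical model: decide how an atomic load of `size_bytes` is lowered on
// a target whose widest native atomic is `max_native_bytes` (an assumption of
// this sketch, not an LLVM API). Alignment is deliberately ignored.
inline std::string lower_atomic_load(unsigned size_bytes,
                                     unsigned max_native_bytes) {
    const bool pow2 =
        size_bytes != 0 && (size_bytes & (size_bytes - 1)) == 0;
    // Sizes the target supports stay as native instructions.
    if (pow2 && size_bytes <= max_native_bytes)
        return "native";
    // Unsupported power-of-two sizes up to 16 bytes use the sized libcall.
    if (pow2 && size_bytes <= 16)
        return "__atomic_load_" + std::to_string(size_bytes);
    // Anything else falls back to the generic, size-parameterised libcall.
    return "__atomic_load";
}
```

Because the decision is a pure function of the size, a lock-based libcall and a native instruction can never both be used on the same object.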
*	AMDGPU/SI: Insert wait states required after v_readfirstlane on SI	Tom Stellard	2016-04-12	1	-0/+6
	Summary: We will be able to handle this case much better once the hazard recognizer is finished, but this conservative implementation fixes a hang with the piglit test: spec/arb_arrays_of_arrays/execution/sampler/fs-nested-struct-arrays-nonconst-nested-arra Reviewers: arsenm, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18988 llvm-svn: 266105
*	AMDGPU: Eliminate half of i64 or if one operand is zero_extend from i32	Matt Arsenault	2016-04-12	1	-0/+30
	This helps clean up some of the mess when expanding unaligned 64-bit loads when changed to be promoted to v2i32, and fixes situations where or x, 0 was emitted after splitting 64-bit ors during moveToVALU. I think this could be a generic combine but I'm not sure. llvm-svn: 266104
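The identity behind this combine can be checked with plain integers. The sketch below is an illustration only (the helper names `or_full` and `or_split` are made up here and are not from the patch): when one operand of a 64-bit OR is zero-extended from 32 bits, its high half is all zeros, so only the low half needs an actual OR and the high half of the other operand passes through untouched.

```cpp
#include <cassert>
#include <cstdint>

// Reference: the full 64-bit OR with a zero-extended 32-bit operand.
inline std::uint64_t or_full(std::uint64_t x, std::uint32_t y) {
    return x | static_cast<std::uint64_t>(y);
}

// Split form: OR only the low 32 bits; the high 32 bits of x are copied.
inline std::uint64_t or_split(std::uint64_t x, std::uint32_t y) {
    const std::uint32_t lo = static_cast<std::uint32_t>(x) | y;
    const std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

This is also why the split form avoids the `or x, 0` mentioned in the message: the high half needs no OR at all.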
*	[IR/Verifier] Each DISubprogram with isDefinition: true must belong to a CU.	Davide Italiano	2016-04-12	1	-0/+16
	Add a check to catch violations. ~60 tests were broken and prevented this change from being committed. Adrian and I (thanks Adrian!) went through them in the last week or so updating. The check can be done more efficiently but I'd still like to get this in ASAP to avoid more broken tests being checked in (if any). PR: 27101 llvm-svn: 266102
*	[CodeGen] Remove constant-folding dead code. NFC.	Ahmed Bougacha	2016-04-12	1	-12/+4
	This code was specific to vector operations with scalar operands: all the opcodes in FoldValue (via FoldConstantArithmetic) can't match those criteria. Replace it with an assert if that ever changes: at that point, we might need to add back a splat BUILD_VECTOR. llvm-svn: 266100
*	Check alloca's special state	JF Bastien	2016-04-12	1	-0/+4
	Following up to a similar fix in MergeFunctions: r266022. This patch keeps both in sync, it would be nice to not have to do this. It doesn't look like there's an easy way to test this code directly at the moment: AFAICT all current uses of isSameOperationAs are looking at instructions deep inside a function. IndVarSimplify/pr24952.ll and InstMerge/st_sink_* look at alloca inadvertently but are brittle tests. llvm-svn: 266099
*	Introduce a GCRelocateInst class [NFC]	Philip Reames	2016-04-12	5	-16/+9
	Previously, we were using isGCRelocate predicates. Using a subclass of IntrinsicInst is far more idiomatic. The refactoring also enables a couple of minor simplifications and code sharing. llvm-svn: 266098
*	fix indentation; NFC	Sanjay Patel	2016-04-12	1	-12/+12
	llvm-svn: 266097
*	AMDGPU/SI: Fix a mis-compilation of multi-level breaks	Nicolai Haehnle	2016-04-12	1	-0/+16
	Summary: Under certain circumstances, multi-level breaks (or what is understood by the control flow passes as such) could be miscompiled in a way that causes infinite loops, by emitting incorrect control flow intrinsics. This fixes a hang in dEQP-GLES3.functional.shaders.loops.while_dynamic_iterations.conditional_continue_vertex Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18967 llvm-svn: 266088
*	Support arbitrary addrspace pointers in masked load/store intrinsics	Artur Pilipenko	2016-04-12	2	-10/+49
	This is a resubmission of the 263158 change. This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 266086
*	[ScheduleDAGInstrs] Handle instructions with multiple MMOs	Geoff Berry	2016-04-12	1	-30/+41
	Summary: In getUnderlyingObjectsForInstr(): Don't give up on instructions with multiple MMOs, instead look through all the MMOs and if they all meet the conservative criteria previously used for single MMO instructions, then return all of the underlying objects derived from the MMOs. The change to ScheduleDAGInstrs::buildSchedGraph() is needed to avoid the case where multiple underlying objects are present and are related in such a way that successive iterations of the loop end up adding a dependency from an instruction to itself. Reviewers: atrick, hfinkel Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18093 llvm-svn: 266084
*	[mips] add assembler support for .set arch=octeon	Petar Jovanovic	2016-04-12	1	-1/+1
	This patch enables assembler support for .set arch=octeon. It will fix issues with inline assembler when this directive is used. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18548 llvm-svn: 266081
*	AMDGPU: Implement i64 global atomics	Matt Arsenault	2016-04-12	2	-14/+44
	llvm-svn: 266075
*	AMDGPU: Add atomic_inc + atomic_dec intrinsics	Matt Arsenault	2016-04-12	8	-11/+101
	These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074
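To make the difference from a plain `atomicrmw add 1` concrete, here is a rough reference model of a clamped increment and decrement. The commit message only says there is "an additional input value to clamp the result"; the exact wrap rules used below (reset to 0 once the limit is reached, reload the limit when decrementing from 0 or from above the limit) are an assumption of this sketch, and the names `wrapping_inc` and `wrapping_dec` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed rule: new = (old >= limit) ? 0 : old + 1. Returns the old value.
inline std::uint32_t wrapping_inc(std::atomic<std::uint32_t>& mem,
                                  std::uint32_t limit) {
    std::uint32_t old = mem.load();
    std::uint32_t next;
    do {
        next = (old >= limit) ? 0u : old + 1u;
        // On failure compare_exchange_weak reloads `old`, so `next` is
        // recomputed from the fresh value on the following iteration.
    } while (!mem.compare_exchange_weak(old, next));
    return old;
}

// Assumed rule: new = (old == 0 || old > limit) ? limit : old - 1.
inline std::uint32_t wrapping_dec(std::atomic<std::uint32_t>& mem,
                                  std::uint32_t limit) {
    std::uint32_t old = mem.load();
    std::uint32_t next;
    do {
        next = (old == 0u || old > limit) ? limit : old - 1u;
    } while (!mem.compare_exchange_weak(old, next));
    return old;
}
```

An ordinary add of 1 would keep counting past the limit, which is why these operations need their own intrinsics rather than reusing `atomicrmw add`.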
*	AMDGPU: Remove trailing whitespace	Matt Arsenault	2016-04-12	1	-7/+7
	llvm-svn: 266073
*	This reverts commit r266002, r266011 and r266016.	Rafael Espindola	2016-04-12	3	-562/+9
	They broke the msan bot. Original message: Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266062
*	Revert "[mips] MIPSR6 Compact branch aliases"	Simon Dardis	2016-04-12	2	-7/+1
	This reverts commit r266055. ps4-buildslave2 is highlighting a failure. llvm-svn: 266061
*	[SystemZ] Use LDE32 instead of LE, when Offset is small.	Jonas Paulsson	2016-04-12	1	-1/+7
	On z13, if eliminateFrameIndex() chooses LE (and not LEY), immediately transform that LE to LDE32 to avoid partial register dependencies. LEY should be generally preferred for big offsets over an expansion into LAY + LDE32. Reviewed by Ulrich Weigand. llvm-svn: 266060
*	[mips] MIPSR6 Compact branch aliases	Simon Dardis	2016-04-12	2	-1/+7
	Summary: Alias 'jic $reg, 0' to 'jrc $reg' and 'jialc $reg, 0' to 'jalrc $reg' like binutils. Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D18856 llvm-svn: 266055
*	Avoid GCC -fpermissive error about llvm::Mangler hidden by member named Mangler	Stephan Bergmann	2016-04-12	1	-1/+1
	llvm-svn: 266049
*	Refactor the Internalize stage of libLTO in a separate file (NFC)	Mehdi Amini	2016-04-12	4	-135/+230
	This is intended to be shared by the ThinLTOCodeGenerator. Note that there is a change in the way the verifier is run: previously it was run as a Pass on the merged module during internalization, while now the verifier is called explicitly on the merged module outside of the internalize "pass pipeline". What remains strange in the API is the fact that `DisableVerify` in the API does not disable this initial verifier. Differential Revision: http://reviews.llvm.org/D19000 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266047
*	[PPC64] Mark CR0 Live if PPCInstrInfo::optimizeCompareInstr Creates a Use of CR0	Chuang-Yu Cheng	2016-04-12	1	-0/+4
	Resolve Bug 27046 (https://llvm.org/bugs/show_bug.cgi?id=27046). The PPCInstrInfo::optimizeCompareInstr function could create a new use of CR0, even if CR0 were previously dead. This patch marks CR0 live if a use of CR0 is created. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D18884 llvm-svn: 266040
*	[PPC64] Use mfocrf in prologue when we only need to save 1 nonvolatile CR field	Chuang-Yu Cheng	2016-04-12	1	-8/+16
	In the ELFv2 ABI, we are not required to save all CR fields. If only one nonvolatile CR field is clobbered, use mfocrf instead of mfcr to selectively save the field, because mfocrf has shorter latency compared to mfcr. Thanks to Nemanja for the invaluable hint! Reviewers: nemanjai tjablin hfinkel kbarton http://reviews.llvm.org/D17749 llvm-svn: 266038
*	AArch64: Drive-by cleanup	Matthias Braun	2016-04-12	1	-3/+2
	llvm-svn: 266035
*	Attempt to make buildbot happier with r266032.	George Burgess IV	2016-04-12	1	-2/+1
	Apparently std::numeric_limits<unsigned>::max() isn't constexpr everywhere yet. llvm-svn: 266034
*	Add the allocsize attribute to LLVM.	George Burgess IV	2016-04-12	11	-65/+321
	`allocsize` is a function attribute that allows users to request that LLVM treat arbitrary functions as allocation functions. This patch makes LLVM accept the `allocsize` attribute, and makes `@llvm.objectsize` recognize said attribute. The review for this was split into two patches for ease of reviewing: D18974 and D14933. As promised on the revisions, I'm landing both patches as a single commit. Differential Revision: http://reviews.llvm.org/D14933 llvm-svn: 266032
*	[RegBankSelect] Teach the repairing code how to handle physical registers.	Quentin Colombet	2016-04-12	1	-2/+6
	llvm-svn: 266029
*	[RegisterBankInfo] Do not provide a default mapping for non-reg of phi operations.	Quentin Colombet	2016-04-12	1	-0/+7
	llvm-svn: 266027
*	[RegBankSelect] Teach how to repair definitions.	Quentin Colombet	2016-04-12	1	-13/+112
	Although repairing definitions is not mandatory for correctness (only phis would be impacted because of the RPO traversal), not repairing might go against the cost model. Therefore, just repair when it is possible. llvm-svn: 266025
*	MergeFunctions: test alloca better	JF Bastien	2016-04-12	1	-9/+6
	r237193 fixed handling of alloca size / align in MergeFunctions, but only tested one and didn't follow FunctionComparator::cmpOperations's usual comparison pattern. It also didn't update Instruction.cpp:haveSameSpecialState which I'll do separately. llvm-svn: 266022
*	Replace MachineRegisterInfo::TracksLiveness with a MachineFunctionProperty	Derek Schuff	2016-04-11	2	-9/+7
	Use the MachineFunctionProperty mechanism to indicate whether the liveness info is accurate instead of a bool flag on MRI. Keeps the MRI accessor function for convenience. NFC Differential Revision: http://reviews.llvm.org/D18767 llvm-svn: 266020
*	ThinLTO renaming: use module hash instead of position in the summary	Mehdi Amini	2016-04-11	1	-1/+1
	This is more robust to changes in the link ordering. Differential Revision: http://reviews.llvm.org/D18946 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266018
*	AtomicExpandPass: mark assert variable as used	JF Bastien	2016-04-11	1	-0/+3
	Avoid -Wunused-variable llvm-svn: 266016
*	Fix compile with GCC after r266002 (Add __atomic_* lowering to AtomicExpandPass)	James Y Knight	2016-04-11	1	-8/+8
	It doesn't like implicitly calling the ArrayRef constructor with a returned array -- it appears to decay the returned value to a pointer, first, before trying to make an ArrayRef out of it. llvm-svn: 266011
*	CodeGen: Fix a use-after-free in TailDuplication	Justin Bogner	2016-04-11	1	-2/+0
	The call to processPHI already erased MI from its parent, so MI isn't even valid here, making the getParent() call a use-after-free in addition to being redundant. Found by ASan with the ArrayRecycler changes in llvm.org/pr26808. llvm-svn: 266008
*	NFC: keep comment up to date	JF Bastien	2016-04-11	1	-4/+4
	MergeFunctions was refactored a while ago, and Instruction.cpp's comments went out of sync. The content did as well, will fix later. llvm-svn: 266007
*	[safestack] Add canary to unsafe stack frames	Evgeniy Stepanov	2016-04-11	3	-26/+80
	Add StackProtector to SafeStack. This adds limited protection against data corruption in the caller frame. Current implementation treats all stack protector levels as -fstack-protector-all. llvm-svn: 266004
*	ARM: use r7 as the frame-pointer on all MachO targets.	Tim Northover	2016-04-11	2	-13/+10
	This is better for a few reasons: + It matches the other tooling for iOS. + It matches EABI in more cases (i.e. Thumb-mode, and in practice we don't use ARM mode). + It leads to infinitesimally smaller code (0.2%, yay!). rdar://25369506 llvm-svn: 266003
*	Add __atomic_* lowering to AtomicExpandPass.	James Y Knight	2016-04-11	3	-9/+559
	AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266002
*	[DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) anytime before LegalizeVectorOprs	Simon Pilgrim	2016-04-11	1	-2/+2
	xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage, this patch permits the combine to occur anytime before then as well. The main aim with this is to improve the ability to recognise bitmasks that can be converted to shuffles. I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result. Differential Revision: http://reviews.llvm.org/D18944 llvm-svn: 265998
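The fold relies on bitwise operations not caring how the bits are grouped into lanes. A small self-contained check of that property, using `std::array` and `memcpy` as a stand-in for an LLVM bitcast between a 4 x i32 and a 2 x i64 vector (the type aliases and helper names are invented for this sketch and do not come from the patch):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using V4i32 = std::array<std::uint32_t, 4>;
using V2i64 = std::array<std::uint64_t, 2>;
static_assert(sizeof(V4i32) == sizeof(V2i64), "bitcast needs equal sizes");

// Stand-in for a bitcast: reinterpret the same 16 bytes as two 64-bit lanes.
inline V2i64 bitcast(const V4i32& v) {
    V2i64 out;
    std::memcpy(out.data(), v.data(), sizeof(out));
    return out;
}

// xor(bitcast(A), bitcast(B)): cast first, then xor the 64-bit lanes.
inline V2i64 xor_after_cast(const V4i32& a, const V4i32& b) {
    const V2i64 ca = bitcast(a), cb = bitcast(b);
    return {ca[0] ^ cb[0], ca[1] ^ cb[1]};
}

// bitcast(xor(A, B)): xor the 32-bit lanes first, then cast once.
inline V2i64 cast_after_xor(const V4i32& a, const V4i32& b) {
    V4i32 t;
    for (std::size_t i = 0; i < t.size(); ++i) t[i] = a[i] ^ b[i];
    return bitcast(t);
}
```

Both functions return the same bytes for any inputs and on either endianness; the same argument holds for `and` and `or`, which is what lets the combine move the operation inside a single bitcast.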
*	Swift Calling Convention: swifterror target support.	Manman Ren	2016-04-11	14	-7/+219
	Differential Revision: http://reviews.llvm.org/D18716 llvm-svn: 265997
*	Revert "AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute"	Tom Stellard	2016-04-11	1	-1/+1
	This reverts commit r263720. Just confirmed that s_waitcnt is required after ds_permute/ds_bpermute. llvm-svn: 265992
*	Fix broken assert, PR24624	Hans Wennborg	2016-04-11	1	-1/+1
	llvm-svn: 265989
*	Remove redundant .c_str(), as suggested by PR25633	Hans Wennborg	2016-04-11	1	-1/+1
	llvm-svn: 265988
*	Fix a couple of redundant conditional expressions (PR27283, PR28282)	Hans Wennborg	2016-04-11	2	-3/+3
	llvm-svn: 265987
*	use range-loops; NFCI	Sanjay Patel	2016-04-11	1	-13/+8
	llvm-svn: 265985