bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU/SI: Fix regression with no-return atomics	Nicolai Haehnle	2016-04-15	1	-0/+9
	Summary: In the added test-case, the atomic instruction feeds into a non-machine CopyToReg node which hasn't been selected yet, so guard against non-machine opcodes here. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19043 llvm-svn: 266433
*	Move divergent-target test into CodeGen/NVPTX because it requires an NVPTX ↵	Justin Lebar	2016-04-15	1	-0/+24
	target. llvm-svn: 266403
*	AMDGPU: Include LDS size in printed comment	Matt Arsenault	2016-04-14	1	-4/+10
	llvm-svn: 266382
*	AMDGPU: Run SIFoldOperands after PeepholeOptimizer	Matt Arsenault	2016-04-14	15	-45/+49
	PeepholeOptimizer cleans up redundant copies, which makes the operand folding more effective.
	shader-db stats:
	Totals:
	SGPRS: 34200 -> 34336 (0.40 %)
	VGPRS: 22118 -> 21655 (-2.09 %)
	Code Size: 632144 -> 633460 (0.21 %) bytes
	LDS: 11 -> 11 (0.00 %) blocks
	Scratch: 10240 -> 11264 (10.00 %) bytes per wave
	Max Waves: 8822 -> 8918 (1.09 %)
	Wait states: 0 -> 0 (0.00 %)
	Totals from affected shaders:
	SGPRS: 7704 -> 7840 (1.77 %)
	VGPRS: 5169 -> 4706 (-8.96 %)
	Code Size: 234444 -> 235760 (0.56 %) bytes
	LDS: 2 -> 2 (0.00 %) blocks
	Scratch: 0 -> 1024 (0.00 %) bytes per wave
	Max Waves: 1188 -> 1284 (8.08 %)
	Wait states: 0 -> 0 (0.00 %)
	Increases:
	SGPRS: 35 (0.01 %)
	VGPRS: 1 (0.00 %)
	Code Size: 59 (0.02 %)
	LDS: 0 (0.00 %)
	Scratch: 1 (0.00 %)
	Max Waves: 48 (0.02 %)
	Wait states: 0 (0.00 %)
	Decreases:
	SGPRS: 26 (0.01 %)
	VGPRS: 54 (0.02 %)
	Code Size: 68 (0.03 %)
	LDS: 0 (0.00 %)
	Scratch: 0 (0.00 %)
	Max Waves: 4 (0.00 %)
	Wait states: 0 (0.00 %)
	llvm-svn: 266378
*	AMDGPU: Fold bitcasts of scalar constants to vectors	Matt Arsenault	2016-04-14	4	-50/+49
	This cleans up some messes since the individual scalar components can be CSEed. llvm-svn: 266376
*	AMDGPU: Add skeleton GlobalIsel implementation	Tom Stellard	2016-04-14	1	-0/+12
	Summary: This adds the necessary target code to be able to run the ir translator. Lowering function arguments and returns is a nop and there is no support for RegBankSelect. Reviewers: arsenm, qcolombet Subscribers: arsenm, joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19077 llvm-svn: 266356
*	[lanai] Add custom lowering for SRL_PARTS i32.	Jacques Pienaar	2016-04-14	1	-0/+12
	llvm-svn: 266349
*	[DivergenceAnalysis] Treat PHI with incoming undef as constant	Nicolai Haehnle	2016-04-14	1	-0/+41
	Summary: If a PHI has an incoming undef, we can pretend that it is equal to one non-undef, non-self incoming value. This is particularly relevant in combination with the StructurizeCFG pass, which introduces PHI nodes with undefs. Previously, this led to branch conditions that were uniform before StructurizeCFG becoming non-uniform afterwards, which confused the SIAnnotateControlFlow pass. This fixes a crash when Mesa radeonsi compiles a shader from dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_dynamic_vertex Reviewers: arsenm, tstellarAMD, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19013 llvm-svn: 266347
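The rule described in the summary above can be sketched as a small Python model. This is an illustration only, not the LLVM implementation: the `UNDEF` sentinel, the function name `phi_effective_value`, and the use of strings as value names are all hypothetical.

```python
UNDEF = object()  # hypothetical stand-in for an LLVM undef value


def phi_effective_value(phi, incoming):
    """Return the one value the PHI can be treated as, or None.

    Incoming undefs and self-references are ignored. If every remaining
    incoming value is the same, the PHI can be treated as that value.
    """
    unique = None
    for value in incoming:
        if value is UNDEF or value == phi:
            continue  # undef and self edges do not constrain the result
        if unique is None:
            unique = value
        elif value != unique:
            return None  # two distinct real values: not constant
    return unique
```

For example, `phi_effective_value("%phi", ["%a", UNDEF])` yields `"%a"`, so a PHI introduced with an undef edge no longer makes a uniform condition look divergent in this model.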
*	AMDGPU: Remove SIFixSGPRLiveRanges pass	Nicolai Haehnle	2016-04-14	1	-1/+1
	Summary: This pass is unnecessary and overly conservative. It was motivated by situations like
	    def %vreg0:SGPR_32
	    ...
	  if-block:
	    ..
	    def %vreg1:SGPR_32
	    ...
	  else-block:
	    ...
	    use %vreg0:SGPR_32
	    ...
	and similar situations with uses after the non-uniform control flow, where we are not allowed to assign %vreg0 and %vreg1 to the same physical register, even though in the original, thread/workitem-based CFG, it looks like the live ranges of these registers do not overlap. However, by the time register allocation runs, we have moved to a wave-based CFG that accurately represents the fact that the wave may run through both the if- and the else-block. So the live ranges of %vreg0 and %vreg1 already overlap even without the SIFixSGPRLiveRanges pass. In addition to proving this change correct, I have tested it with Piglit and a small number of other tests. Reviewers: arsenm, tstellarAMD Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19041 llvm-svn: 266345
*	AArch64: expand cmpxchg after regalloc at -O0.	Tim Northover	2016-04-14	1	-0/+75
	FastRegAlloc works only at the basic-block level and spills all live-out registers. Unfortunately for a stack-based cmpxchg near the spill slots, this can perpetually clear the exclusive monitor, which means the cmpxchg will never succeed. I believe the only way to handle this within LLVM is by expanding the loop post-regalloc. We don't want this in general because it severely limits the optimisations that can be done, so we limit this to -O0 compilations. It's an ugly hack, and about the one good point in the whole mess is that we can treat all cmpxchg operations in the most naive way possible (seq_cst, no clrex faff) without affecting correctness. Should fix PR25526. llvm-svn: 266339
*	[lanai] Add areMemAccessesTriviallyDisjoint, getMemOpBaseRegImmOfs and ↵	Jacques Pienaar	2016-04-14	1	-0/+55
	getMemOpBaseRegImmOfsWidth. Summary: Add getMemOpBaseRegImmOfsWidth to enable determining independence during MiSched. Reviewers: eliben, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18903 llvm-svn: 266338
*	AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit	Tom Stellard	2016-04-14	2	-0/+113
	Summary: For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use fewer registers, as we have to run more waves per SIMD. This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults to 256, as that has no wave restrictions. Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug. Reviewers: mareko, arsenm, tstellarAMD, nhaehnle Subscribers: FireBurn, kerberizer, llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18340 Patch By: Bas Nieuwenhuizen llvm-svn: 266337
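The trade-off between workgroup size and register budget described above can be made concrete with some rough arithmetic. The sketch below is not taken from the patch: the constants (64 work-items per wave, 4 SIMDs per compute unit, 256 VGPRs shared by the waves on a SIMD) are assumptions about a GCN-style compute unit, the function name is hypothetical, and allocation granularity is ignored.

```python
WAVE_SIZE = 64        # assumption: work-items per wave
SIMDS_PER_CU = 4      # assumption: SIMD units per compute unit
VGPRS_PER_SIMD = 256  # assumption: VGPR budget shared by waves on one SIMD


def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


def max_vgprs_per_wave(max_workgroup_size):
    """Upper bound on VGPRs per wave if a whole workgroup must fit in one CU."""
    waves = ceil_div(max_workgroup_size, WAVE_SIZE)
    waves_per_simd = ceil_div(waves, SIMDS_PER_CU)
    return VGPRS_PER_SIMD // waves_per_simd
```

Under these assumptions a maximum workgroup size of 256 needs only one wave per SIMD, so the full budget stays available, while 1024 needs four waves per SIMD and leaves each wave a quarter of it.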
*	AMDGPU/SI: Use the correct scratch wave offset register for shaders.	Tom Stellard	2016-04-14	2	-4/+5
	Summary: The code previously always used s1 as it was using the user + system SGPR information for compute kernels. This is incorrect for Mesa shaders though. The register should be the next SGPR after all user and system SGPRs. We rely on the fact that Mesa adds arguments for all input and system SGPRs and take the next available SGPR for the scratch wave offset register. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewers: mareko, arsenm, nhaehnle, tstellarAMD Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18941 Patch By: Bas Nieuwenhuizen llvm-svn: 266336
*	[mips] MIPSR6 Compact branch aliases	Simon Dardis	2016-04-14	5	-48/+54
	Alias 'jic $reg, 0' to 'jrc $reg' and 'jialc $reg, 0' to 'jalrc $reg' like binutils. This patch was previously committed as r266055 and seemed to have caused some spurious test failures. They did not reappear after further local testing. llvm-svn: 266301
*	[mips] Remove duplicate tests and add missing prefixes for *-LABEL checks. NFC.	Vasileios Kalintiris	2016-04-14	4	-419/+119
	Summary: The only difference between the removed tests and the pre-existing ones is the materialization of the zero constant, which shouldn't matter for these cases. Reviewers: dsanders, sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D18693 llvm-svn: 266285
*	Revert "Support arbitrary addrspace pointers in masked load/store intrinsics"	Adam Nemet	2016-04-14	2	-112/+111
	This reverts commit r266086. It breaks the LTO build of gcc in SPEC2000. llvm-svn: 266282
*	[CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN	David Majnemer	2016-04-14	1	-0/+17
	The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN would return NaN. It is desirable to lower @llvm.{min,max}num to -NAN if the target doesn't have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN semantics. N.B. Of course, it is only safe to do this if the intrinsic call is marked nnan. llvm-svn: 266279
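The difference between the two families can be sketched in a few lines of Python. This is an illustrative model rather than LLVM code: the helper names `min_num` and `min_nan` are hypothetical, and the ordering of signed zeros is ignored.

```python
import math


def min_num(a: float, b: float) -> float:
    # -NUM semantics: if exactly one input is NaN, return the other input.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def min_nan(a: float, b: float) -> float:
    # -NAN semantics: a NaN in either input propagates to the result.
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return min(a, b)
```

The two helpers agree whenever neither input is NaN, which is the situation the nnan flag promises and the reason the lowering is only safe for calls carrying it.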
*	AMDGPU: Implement canonicalize	Matt Arsenault	2016-04-14	1	-0/+320
	Also add generic DAG node for it. llvm-svn: 266272
*	[ppc] add tests to show potential andc optimization	Sanjay Patel	2016-04-13	1	-0/+36
	llvm-svn: 266261
*	ARM: override cost function to re-enable ConstantHoisting (& fix it).	Tim Northover	2016-04-13	2	-2/+24
	At some point, ARM stopped getting any benefit from ConstantHoisting because the pass called a different variant of getIntImmCost. Reimplementing the correct variant revealed some problems, however:
	+ ConstantHoisting was modifying switch statements. This is simply invalid: the cases must remain integer constants no matter the notional cost.
	+ ConstantHoisting was mangling alloca instructions in the entry block. These should be handled by FrameLowering, so constants actually have a cost of 0. Worse, the resulting bitcasts meant they became dynamic allocas.
	rdar://25707382 llvm-svn: 266260
*	ARM: Use a callee save register for the swiftself parameter.	Matthias Braun	2016-04-13	1	-23/+56
	It is very likely that the swiftself parameter is alive throughout most functions, so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers; I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callee's parameter. Differential Revision: http://reviews.llvm.org/D18901 llvm-svn: 266253
*	X86: Use a callee save register for the swiftself parameter.	Matthias Braun	2016-04-13	1	-33/+54
	It is very likely that the swiftself parameter is alive throughout most functions, so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers; I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callee's parameter. Differential Revision: http://reviews.llvm.org/D18902 llvm-svn: 266252
*	AArch64: Use a callee save register for swiftself parameters	Matthias Braun	2016-04-13	1	-21/+59
	It is very likely that the swiftself parameter is alive throughout most functions, so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers; I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callee's parameter. Differential Revision: http://reviews.llvm.org/D19007 llvm-svn: 266251
*	[x86] add tests to show potential BMI optimization	Sanjay Patel	2016-04-13	1	-0/+68
	llvm-svn: 266243
*	[AArch64] Disable LDP/STP for quads	Evandro Menezes	2016-04-13	2	-0/+79
	Disable LDP/STP for quads on Exynos M1 as they are not as efficient as pairs of regular LDR/STR. Patch by Abderrazek Zaafrani <a.zaafrani@samsung.com>. llvm-svn: 266223
*	Cleanup Store Merging in UseAA case	Nirav Dave	2016-04-13	1	-0/+64
	This patch fixes a bug (PR26827) when using alias analysis in store merging. This sets the chain users of the component stores to point to the new store instead of the component stores' chain parent. Reviewers: jyknight Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18909 llvm-svn: 266217
*	AArch64: don't create instructions that write to xzr/wzr twice.	Tim Northover	2016-04-13	1	-3/+3
	These are unpredictable even on AArch64. Patch by Yichao Yu. llvm-svn: 266206
*	[AMDGPU][llvm-mc] Support of Trap Handler registers (TTMP0..11 and ↵	Artem Tamazov	2016-04-13	1	-16/+8
	TBA/TMA) Tests added along with implemented feature. Note that there is a small leftover of an unnecessary MI scheduling issue (more info in the review). CodeGen/AMDGPU/salu-to-valu.ll updated to fix the false regression. TODO: Support for TTMP quads, comma-separated syntax in "[]" and more. Differential Revision: http://reviews.llvm.org/D17825 llvm-svn: 266205
*	[mips] Fix emitAtomicCmpSwapPartword to handle 64 bit pointers correctly	Zoran Jovanovic	2016-04-13	1	-0/+17
	Differential Revision: http://reviews.llvm.org/D18995 llvm-svn: 266204
*	[mips] Sign-extend i32 values truncated from previously zero-extended i32 ↵	Vasileios Kalintiris	2016-04-13	3	-4/+66
	values. Summary: This is a special case for MIPS64 because the architecture requires properly 32-bit sign-extended values in the register containers. Additionally, we merge consecutive trunc + AssertZExt nodes in order to avoid unnecessary sign-extensions when the extension comes from a type smaller than i32. Reviewers: dsanders Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D18893 llvm-svn: 266203
*	[X86][SSE] Regenerated vector integer absolute tests	Simon Pilgrim	2016-04-13	1	-200/+503
	llvm-svn: 266194
*	Added missing autogeneration note	Simon Pilgrim	2016-04-13	1	-0/+1
	llvm-svn: 266185
*	[mips][microMIPS] Add CodeGen support for DIV, MOD, DIVU, MODU, DDIV, DMOD, ↵	Zlatko Buljan	2016-04-13	4	-9/+558
	DDIVU and DMODU instructions Differential Revision: http://reviews.llvm.org/D17137 This patch was reverted after the reversion of the dependent patch http://reviews.llvm.org/D17068. There was a problem with a test-suite failure. The problem is hopefully solved by the dependent patch, so this patch is committed again. llvm-svn: 266179
*	[mips][microMIPS] Fix for "Cannot copy registers" assertion	Hrvoje Varga	2016-04-13	2	-0/+50
	Differential Revision: http://reviews.llvm.org/D17068 This change contains a fix for the failing test-suite, so this patch should hopefully work now. llvm-svn: 266171
*	Recommit r265547, and r265610,r265639,r265657 on top of it, plus	Wei Mi	2016-04-13	7	-518/+522
	two fixes: one for an error reported by verify-regalloc, and another for the live range update of a phi after rematerialization.
	r265547: Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates.
	analyzeSiblingValues is an N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes a significant compile time issue when N is large, it is also important for performance since it removes redundant spills and enables rematerialization.
	To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerialization and keeps it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance.
	Patches on top of r265547:
	r265610 "Fix the compare-clang diff error introduced by r265547."
	r265639 "Fix the sanitizer bootstrap error in r265547."
	r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]"
	Differential Revision: http://reviews.llvm.org/D15302 Differential Revision: http://reviews.llvm.org/D18934 Differential Revision: http://reviews.llvm.org/D18935 Differential Revision: http://reviews.llvm.org/D18936 llvm-svn: 266162
*	AMDGPU: Add test for m0 initialization in basic loop	Matt Arsenault	2016-04-13	1	-0/+48
	Initialization of m0 is emitted for each LDS operation, so every block with LDS usage ends up with one. MachineLICM used to fail to hoist this out of the loop, so every loop iteration with LDS usage in it would re-initialize it. This seems to be fixed now, so add a test to make sure that it stays this way. llvm-svn: 266156
*	CodeGen: Clear the MFI's save and restore point after PrologEpilogInserter	Justin Bogner	2016-04-12	2	-2/+27
	This state is no longer useful and not guaranteed to be valid in later codegen passes. For example, see the added test, which would print a savepoint of %bb.-1 without this change, and crashes with a use-after-free error under ASan if you apply the recycling allocator patch from llvm.org/PR26808. llvm-svn: 266150
*	AMDGPU: add llvm.amdgcn.buffer.load/store intrinsics	Nicolai Haehnle	2016-04-12	8	-22/+242
	Summary: They correspond to BUFFER_LOAD/STORE_DWORD[_X2,X3,X4] and mostly behave like llvm.amdgcn.buffer.load/store.format. They will be used by Mesa for SSBO and atomic counters at least when robust buffer access behavior is desired. (These instructions perform no format conversion and do buffer range checking per component.) As a side effect of sharing patterns with llvm.amdgcn.buffer.store.format, it has become trivial to add support for the f32 and v2f32 variants of that intrinsic, so the patch does so. Also DAG-ify (and fix) some tests that I noticed intermittent failures in while developing this patch. Some tests were (temporarily) adjusted for the required mayLoad/hasSideEffects changes to the BUFFER_STORE_DWORD* instructions. See also http://reviews.llvm.org/D18291. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18292 llvm-svn: 266126
*	[WebAssembly] Fix debug info in reg-stackify.ll test	Derek Schuff	2016-04-12	1	-18/+23
	It lacked a CU and thus became invalid with r266102 llvm-svn: 266114
*	AMDGPU/SI: Insert wait states required after v_readfirstlane on SI	Tom Stellard	2016-04-12	2	-0/+2
	Summary: We will be able to handle this case much better once the hazard recognizer is finished, but this conservative implementation fixes a hang with the piglit test: spec/arb_arrays_of_arrays/execution/sampler/fs-nested-struct-arrays-nonconst-nested-arra Reviewers: arsenm, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18988 llvm-svn: 266105
*	AMDGPU: Eliminate half of i64 or if one operand is zero_extend from i32	Matt Arsenault	2016-04-12	1	-0/+41
	This helps clean up some of the mess when expanding unaligned 64-bit loads when changed to be promoted to v2i32, and fixes situations where or x, 0 was emitted after splitting 64-bit ors during moveToVALU. I think this could be a generic combine but I'm not sure. llvm-svn: 266104
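The idea behind the combine can be checked with plain integer arithmetic: when one operand of a 64-bit or is zero-extended from 32 bits, its high half is zero, so only the low halves need to be combined. The following Python sketch models that observation; it is not the DAG combine itself, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def or64_generic(x, y):
    """Reference: a full 64-bit or."""
    return (x | y) & MASK64


def or64_with_zext32(x, y32):
    """Or a 64-bit value with a zero-extended 32-bit value.

    The high half of the zero-extended operand is 0, and hi | 0 == hi,
    so the high half of x passes through and only one 32-bit or is needed.
    """
    lo = (x & MASK32) | (y32 & MASK32)
    hi = (x >> 32) & MASK32
    return (hi << 32) | lo
```

Both functions return the same result for any 64-bit `x` and 32-bit `y32`, while the second performs a single 32-bit or.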
*	AMDGPU/SI: Fix a mis-compilation of multi-level breaks	Nicolai Haehnle	2016-04-12	1	-0/+41
	Summary: Under certain circumstances, multi-level breaks (or what is understood by the control flow passes as such) could be miscompiled in a way that causes infinite loops, by emitting incorrect control flow intrinsics. This fixes a hang in dEQP-GLES3.functional.shaders.loops.while_dynamic_iterations.conditional_continue_vertex Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18967 llvm-svn: 266088
*	Support arbitrary addrspace pointers in masked load/store intrinsics	Artur Pilipenko	2016-04-12	2	-111/+112
	This is a resubmission of the 263158 change. This patch fixes the problem which occurs when loop-vectorize tries to use the @llvm.masked.load/store intrinsics for a non-default addrspace pointer. It fails with a "Calling a function with a bad signature!" assertion in the CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has the default addrspace. The fix is to add the pointer type as another overloaded type to the @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 266086
*	[ScheduleDAGInstrs] Handle instructions with multiple MMOs	Geoff Berry	2016-04-12	1	-0/+23
	Summary: In getUnderlyingObjectsForInstr(): Don't give up on instructions with multiple MMOs, instead look through all the MMOs and if they all meet the conservative criteria previously used for single MMO instructions, then return all of the underlying objects derived from the MMOs. The change to ScheduleDAGInstrs::buildSchedGraph() is needed to avoid the case where multiple underlying objects are present and are related in such a way that successive iterations of the loop end up adding a dependency from an instruction to itself. Reviewers: atrick, hfinkel Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18093 llvm-svn: 266084
*	Test commit, NFC.	Than McIntosh	2016-04-12	1	-0/+1
	Adds a blank line. llvm-svn: 266082
*	AMDGPU: Implement i64 global atomics	Matt Arsenault	2016-04-12	1	-0/+842
	llvm-svn: 266075
*	AMDGPU: Add atomic_inc + atomic_dec intrinsics	Matt Arsenault	2016-04-12	3	-1/+502
	These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074
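How the extra input changes the result can be sketched with a small, non-atomic Python model. It assumes the wrap-around behavior described for the GCN inc/dec instructions (increment resets to 0 once the limit is reached, decrement reloads the limit at 0 or when above the limit). The function names and the one-element list standing in for a 32-bit memory cell are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def atomic_inc(mem, limit):
    # Assumed model: old = *mem; *mem = 0 if old >= limit else old + 1.
    old = mem[0]
    mem[0] = 0 if old >= limit else (old + 1) & MASK32
    return old


def atomic_dec(mem, limit):
    # Assumed model: old = *mem; *mem = limit if old == 0 or old > limit
    # else old - 1.
    old = mem[0]
    mem[0] = limit if (old == 0 or old > limit) else old - 1
    return old


def atomicrmw_add(mem, value):
    # Plain 32-bit add for comparison: no limit operand, wraps at 2**32.
    old = mem[0]
    mem[0] = (old + value) & MASK32
    return old
```

Starting from a cell holding 3 with a limit of 3, `atomic_inc` leaves 0 in memory, whereas `atomicrmw_add` with 1 leaves 4.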
*	AMDGPU: Add volatile to test loads and stores	Matt Arsenault	2016-04-12	19	-291/+291
	When the memory vectorizer is enabled, these tests break. These tests don't really care about the memory instructions, and it's easier to write check lines with the unmerged loads. llvm-svn: 266071
*	[X86] Regenerated avx512 calling convention test checks	Simon Pilgrim	2016-04-12	1	-5/+5
	llvm-svn: 266070
*	Revert "[mips] MIPSR6 Compact branch aliases"	Simon Dardis	2016-04-12	5	-54/+48
	This reverts commit r266055. ps4-buildslave2 is highlighting a failure. llvm-svn: 266061