bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU: Fix mishandling alignment when scalarizing vector loads/stores	Matt Arsenault	2016-02-12	1	-2/+5
\| \| \| \| \| \| \|	I don't think this was causing any real problems, so I'm not sure how to test for this. llvm-svn: 260646
*	AMDGPU: Split R600 and SI store lowering	Matt Arsenault	2016-02-11	1	-63/+2
\| \| \| \| \| \| \|	These were only sharing some somewhat incorrect logic for when to scalarize or split vectors. llvm-svn: 260490
*	AMDGPU: Split R600 and SI load lowering	Matt Arsenault	2016-02-10	1	-93/+0
\| \| \| \| \| \| \|	These weren't actually sharing anything in the common LowerLOAD. llvm-svn: 260398
*	[CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.	Ahmed Bougacha	2016-02-09	1	-5/+2
\| \| \| \|	llvm-svn: 260316
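The idiom named in the commit title above declares the result inside the `if` condition, which works when the value type is contextually convertible to `bool`. A minimal C++14 sketch, assuming a hypothetical stand-in `Value` type (not the real `llvm::SDValue`) and a made-up `tryCombine` helper:

```cpp
// Hypothetical stand-in for llvm::SDValue; NodeId == 0 plays the role of a
// null node pointer. This is not the LLVM class.
struct Value {
  int NodeId;
  constexpr explicit operator bool() const { return NodeId != 0; }
};

// Made-up combine for illustration: succeeds only for non-zero even inputs.
constexpr Value tryCombine(int X) {
  return Value{X % 2 == 0 ? X / 2 : 0};
}

// Preferred form: R is scoped to the if statement and tested in one step,
// instead of declaring R first and then checking "if (R.getNode())".
constexpr int lowerOrFail(int X) {
  if (Value R = tryCombine(X))
    return R.NodeId;
  return -1; // no combine applied
}

static_assert(lowerOrFail(4) == 2, "combine succeeded");
static_assert(lowerOrFail(3) == -1, "combine failed");
```

Keeping `R` inside the condition means it cannot be used by accident after a failed combine.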
*	AMDGPU: Remove bfi and bfm intrinsics	Matt Arsenault	2016-02-08	1	-11/+0
\| \| \| \| \| \|	Nothing is using them. llvm-svn: 260123
*	AMDGPU: Account for LDS alignment	Matt Arsenault	2016-02-05	1	-4/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	The current situation isn't great, because the amount of padding required is determined by the inverse order of the first encountered use. We should eventually somehow sort these to minimize wasted space. Another problem is the alignment of kernel arguments isn't respected. The group_segment_alignment is always emitted as the default 16, and typed arguments with higher alignments or an explicitly set alignment are also ignored. llvm-svn: 259912
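The remark about allocation order and wasted space can be illustrated with a small sketch. The helper names `alignTo` and `allocate` and the variable sizes are assumptions made for this example, not code from the commit:

```cpp
#include <cstdint>

// Round Offset up to the next multiple of Align (Align must be non-zero).
constexpr uint32_t alignTo(uint32_t Offset, uint32_t Align) {
  return (Offset + Align - 1) / Align * Align;
}

// Place one variable after the current end of LDS and return the new end.
constexpr uint32_t allocate(uint32_t End, uint32_t Size, uint32_t Align) {
  return alignTo(End, Align) + Size;
}

// A 1-byte variable first, then a 16-byte variable aligned to 16:
// 15 bytes of padding are inserted between them.
static_assert(allocate(allocate(0, 1, 1), 16, 16) == 32, "padded layout");

// The same two variables in the opposite order need no padding.
static_assert(allocate(allocate(0, 16, 16), 1, 1) == 17, "packed layout");
```

Sorting by descending alignment, as the message suggests doing eventually, would give the second layout.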
*	Refactor backend diagnostics for unsupported features	Oliver Stannard	2016-02-02	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Re-commit of r258951 after fixing layering violation. The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass, that can be used by any backend. There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available. llvm-svn: 259498
*	AMDGPU: Remove 24-bit intrinsics	Matt Arsenault	2016-01-29	1	-16/+0
\| \| \| \| \| \| \|	The known bit matching code seems to work reasonably well, so these shouldn't really be needed. llvm-svn: 259180
*	AMDGPU: Match fmed3 patterns with legacy fmin/fmax	Matt Arsenault	2016-01-28	1	-2/+7
\| \| \| \|	llvm-svn: 259090
*	AMDGPU: Match some med3 patterns	Matt Arsenault	2016-01-28	1	-1/+4
\| \| \| \|	llvm-svn: 259089
*	Revert r259035, it introduces a cyclic library dependency	Oliver Stannard	2016-01-28	1	-5/+5
\| \| \| \|	llvm-svn: 259045
*	Add backend diagnostic printer for unsupported features	Oliver Stannard	2016-01-28	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Re-commit of r258951 after fixing layering violation. The related LLVM patch adds a backend diagnostic type for reporting unsupported features, this adds a printer for them to clang. In the case where debug location information is not available, I've changed the printer to report the location as the first line of the function, rather than the closing brace, as the latter does not give the user any information. This also affects optimisation remarks. Differential Revision: http://reviews.llvm.org/D16590 llvm-svn: 259035
*	Revert r258951 (and r258950), "Refactor backend diagnostics for unsupported features"	NAKAMURA Takumi	2016-01-28	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \|	It introduced a layering violation in LLVMIR. clang r258950 "Add backend diagnostic printer for unsupported features" llvm r258951 "Refactor backend diagnostics for unsupported features" llvm-svn: 259016
*	Refactor backend diagnostics for unsupported features	Oliver Stannard	2016-01-27	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass, that can be used by any backend. There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available. The implementation of DiagnosticInfoUnsupported::print must be in lib/Codegen rather than in the existing file in lib/IR/ to avoid introducing a dependency from IR to CodeGen. Differential Revision: http://reviews.llvm.org/D16590 llvm-svn: 258951
*	AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now	Matt Arsenault	2016-01-26	1	-4/+0
\| \| \| \| \| \|	Also move into backend intrinsics to discourage use of the old name. llvm-svn: 258783
*	AMDGPU: Remove more unused intrinsics	Matt Arsenault	2016-01-23	1	-23/+0
\| \| \| \| \| \|	Replace tests with lrp with basic IR expansion llvm-svn: 258612
*	AMDGPU: Move amdgcn intrinsic handling into SITargetLowering	Matt Arsenault	2016-01-23	1	-72/+2
\| \| \| \|	llvm-svn: 258608
*	AMDGPU: Rename intrinsics to use amdgcn prefix	Matt Arsenault	2016-01-22	1	-8/+10
\| \| \| \| \| \| \| \| \| \| \|	The intrinsic target prefix should match the target name as it appears in the triple. This is not yet complete, but gets most of the important ones. llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled for compatibility for now. llvm-svn: 258557
*	AMDGPU: Remove AMDGPU.trunc intrinsic	Matt Arsenault	2016-01-20	1	-2/+0
\| \| \| \|	llvm-svn: 258348
*	AMDGPU: Remove AMDIL.round.nearest intrinsic	Matt Arsenault	2016-01-20	1	-2/+0
\| \| \| \|	llvm-svn: 258346
*	AMDGPU: Remove abs intrinsic	Matt Arsenault	2016-01-20	1	-14/+0
\| \| \| \|	llvm-svn: 258343
*	AMDGPU: Remove min/max intrinsics	Matt Arsenault	2016-01-20	1	-44/+0
\| \| \| \| \| \|	This removes support for mesa 11.0.x llvm-svn: 258342
*	AMDGPU: Reduce 64-bit SRAs	Matt Arsenault	2016-01-18	1	-0/+60
\| \| \| \|	llvm-svn: 258096
*	AMDGPU: Split 64-bit and of constant up	Matt Arsenault	2016-01-18	1	-1/+60
\| \| \| \| \| \| \| \| \| \|	This breaks the tests that were meant for testing 64-bit inline immediates, so move those to shl where they won't be broken up. This should be repeated for the other related bit ops. llvm-svn: 258095
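Splitting a 64-bit `and` with a constant means operating on the two 32-bit halves independently. The sketch below only checks the arithmetic identity; the helper names are hypothetical, and the motivation in the comments (a half of the constant that is all zeros or all ones folds away) is an assumption about why the split helps:

```cpp
#include <cstdint>

constexpr uint32_t lo32(uint64_t V) { return static_cast<uint32_t>(V); }
constexpr uint32_t hi32(uint64_t V) { return static_cast<uint32_t>(V >> 32); }

// and i64 X, C computed as two independent 32-bit ands.
// If hi32(C) or lo32(C) is 0 or 0xffffffff, that half needs no "and" at all.
constexpr uint64_t and64Split(uint64_t X, uint64_t C) {
  return (static_cast<uint64_t>(hi32(X) & hi32(C)) << 32) |
         (lo32(X) & lo32(C));
}

// High half of the constant is zero, so only the low 32-bit and remains.
static_assert(and64Split(0x123456789abcdef0ull, 0x00000000ffff0000ull) ==
                  0x000000009abc0000ull,
              "split and matches 64-bit and");
```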
*	AMDGPU: Generalize shl combine	Matt Arsenault	2016-01-18	1	-8/+14
\| \| \| \| \| \| \|	Reduce 64-bit shl with constant > 32. We already special cased this for the == 32 case, but this also works for any >= 32 constant. llvm-svn: 258092
*	AMDGPU: Reduce 64-bit lshr by constant to 32-bit	Matt Arsenault	2016-01-18	1	-0/+44
\| \| \| \| \| \|	64-bit shifts are very slow on some subtargets. llvm-svn: 258090
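The shift reductions in the two commits above rest on a simple identity: for a constant shift amount of 32 or more, only one 32-bit half of the input contributes to the result. A sketch of that arithmetic, with hypothetical helper names and valid only for 32 <= C < 64:

```cpp
#include <cstdint>

constexpr uint32_t lo32(uint64_t V) { return static_cast<uint32_t>(V); }
constexpr uint32_t hi32(uint64_t V) { return static_cast<uint32_t>(V >> 32); }

// shl i64 X, C: the low half of the result is 0 and the high half is
// lo32(X) << (C - 32), a single 32-bit shift.
constexpr uint64_t shl64(uint64_t X, unsigned C) {
  return static_cast<uint64_t>(lo32(X) << (C - 32)) << 32;
}

// lshr i64 X, C: the high half of the result is 0 and the low half is
// hi32(X) >> (C - 32), again a single 32-bit shift.
constexpr uint64_t lshr64(uint64_t X, unsigned C) {
  return hi32(X) >> (C - 32);
}

static_assert(shl64(0x123456789abcdef0ull, 40) ==
                  (0x123456789abcdef0ull << 40),
              "shl reduction");
static_assert(lshr64(0x123456789abcdef0ull, 40) ==
                  (0x123456789abcdef0ull >> 40),
              "lshr reduction");
```

With C == 32 both shifts disappear and only moves remain.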
*	GlobalValue: use getValueType() instead of getType()->getPointerElementType().	Manuel Jacob	2016-01-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: mjacob Subscribers: jholewinski, arsenm, dsanders, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16260 llvm-svn: 257999
*	AMDGPU/SI: Add support for non-void functions	Marek Olsak	2016-01-13	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Return values can be stored in SGPRs (i32) and VGPRs (f32). This will be used by functions which expect some bytecode or other binary to be appended at the end. It allows defining in which registers the return values will be stored. v2: don't do this for compute shaders Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16033 llvm-svn: 257621
*	AMDGPU: Implement {{s\|u}}int_to_fp i64 -> f32	Matt Arsenault	2016-01-11	1	-19/+98
\| \| \| \| \| \| \|	The old lowering for uint_to_fp failed opencl conformance. It might be OK for fast math mode, but I'm not sure. llvm-svn: 257393
*	AMDGPU: Fix ctlz combine for sub 32-bit types	Matt Arsenault	2016-01-11	1	-6/+24
\| \| \| \|	llvm-svn: 257353
*	AMDGPU: Pattern match ffbh pattern to instruction.	Matt Arsenault	2016-01-11	1	-20/+83
\| \| \| \| \| \| \| \|	The hardware instruction's output on 0 is -1 rather than 32. Eliminate a test and select to -1. This removes an extra instruction from the compatibility function with HSAIL's firstbit instruction. llvm-svn: 257352
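The difference on a zero input can be modeled as follows (C++14). The function names are hypothetical software models, not the hardware instruction or LLVM code, and `firstbitU32` assumes the compatibility function returns -1 for zero, as the message implies:

```cpp
#include <cstdint>

// Model of the hardware instruction: count of leading zero bits,
// but -1 (not 32) when the input is 0.
constexpr int ffbhU32(uint32_t X) {
  if (X == 0)
    return -1;
  int N = 0;
  for (uint32_t Mask = 0x80000000u; (X & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// Generic ctlz semantics: 32 for a zero input.
constexpr int ctlz32(uint32_t X) { return X == 0 ? 32 : ffbhU32(X); }

// Compatibility-style wrapper written in terms of ctlz: a compare and a
// select to -1. This is the pattern that collapses to a bare ffbh.
constexpr int firstbitU32(uint32_t X) { return X == 0 ? -1 : ctlz32(X); }

static_assert(ctlz32(0) == 32, "ctlz of zero");
static_assert(firstbitU32(0) == ffbhU32(0), "select to -1 is redundant");
static_assert(firstbitU32(1) == ffbhU32(1), "same result for non-zero");
```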
*	AMDGPU: Custom lower i64 ctlz	Matt Arsenault	2016-01-11	1	-1/+58
\| \| \| \|	llvm-svn: 257348
*	LegalizeDAG: Expand ctlz with ctlz_zero_undef if legal	Matt Arsenault	2016-01-11	1	-0/+3
\| \| \| \|	llvm-svn: 257345
*	AMDGPU: Use generic bitreverse intrinsic	Matt Arsenault	2015-12-14	1	-4/+2
\| \| \| \| \| \|	Also fix bug in vector legalization for bitreverse. llvm-svn: 255512
*	AMDGPU: Fix splitting vector loads with existing offsets	Matt Arsenault	2015-12-14	1	-9/+18
\| \| \| \| \| \| \|	If the original MMO had an offset, it was dropped. Also use the correct alignment after adding the new offset. llvm-svn: 255508
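When a load is split, the piece at a non-zero byte offset can only be assumed aligned to the largest power of two that divides both the base alignment and the offset. A sketch of that computation, similar in spirit to LLVM's `MinAlign` helper; whether this commit uses that exact helper is an assumption:

```cpp
#include <cstdint>

// Largest power of two dividing both A and B (the lowest set bit of A | B).
// A is a power-of-two alignment; B is a byte offset and may be 0.
constexpr uint64_t minAlign(uint64_t A, uint64_t B) {
  return (A | B) & (~(A | B) + 1);
}

// A 16-byte-aligned vector split in half: the second half starts 8 bytes in.
static_assert(minAlign(16, 0) == 16, "first half keeps the base alignment");
static_assert(minAlign(16, 8) == 8, "second half is only 8-byte aligned");
// An existing offset on the original memory operand lowers it further.
static_assert(minAlign(16, 8 + 4) == 4, "existing offset of 4 plus 8");
```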
*	Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)	Artyom Skrobov	2015-11-25	1	-10/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 llvm-svn: 254085
*	AMDGPU: Split LDS vector loads	Matt Arsenault	2015-11-24	1	-1/+1
\| \| \| \| \| \|	If properly aligned this could allow using ds_read_b64. llvm-svn: 253975
*	AMDGPU: Split x8 and x16 vector loads instead of scalarize	Matt Arsenault	2015-11-24	1	-0/+10
\| \| \| \| \| \| \| \|	The one regression in the builtin tests is in the read2 test which now (again) has many extra copies, but this should be solved once the pass is replaced with a DAG combine. llvm-svn: 253974
*	AMDGPU: Split DiagnosticInfoUnsupported into its own file	Matt Arsenault	2015-10-21	1	-41/+1
\| \| \| \|	llvm-svn: 250959
*	Don't pretend AMDGPU backend knows how to custom-lower UDIVREM for vector types; it can't	Artyom Skrobov	2015-10-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, jvesely, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13734 llvm-svn: 250384
*	DAGCombiner: Combine extract_vector_elt from build_vector	Matt Arsenault	2015-10-12	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This basic combine was surprisingly missing. AMDGPU legalizes many operations in terms of 32-bit vector components, so not doing this results in many extra copies and subregister extracts that need to be cleaned up later. InstCombine already does this for the hasOneUse case. The target hook is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn from a vector materialize repeated immediate instruction to a constant vector load with more scalar copies from it. llvm-svn: 250129
*	propagate fast-math-flags on DAG nodes	Sanjay Patel	2015-09-16	1	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815
*	AMDGPU: Produce error on dynamic_stackalloc	Matt Arsenault	2015-08-26	1	-0/+13
\| \| \| \|	llvm-svn: 246048
*	[TLI] Refactor "is integer division cheap" queries.	Michael Kuperstein	2015-08-19	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	This removes the isPow2SDivCheap() query, as it is not currently used in any meaningful way. isIntDivCheap() no longer relies on a state variable (as all in-tree targets set it to false), but the interface allows querying based on the type optimization level. NFC. Differential Revision: http://reviews.llvm.org/D12082 llvm-svn: 245430
*	Remove redundant TargetFrameLowering::getFrameIndexOffset virtual function.	James Y Knight	2015-08-15	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \|	This was the same as getFrameIndexReference, but without the FrameReg output. Differential Revision: http://reviews.llvm.org/D12042 llvm-svn: 245148
*	[AMDGPU] Use the general SMAX/SMIN/UMAX/UMIN pattern matching and remove the AMDGPU implementation	Simon Pilgrim	2015-08-13	1	-45/+0
\| \| \| \| \| \| \| \| \| \|	D9746 added general SMAX/SMIN/UMAX/UMIN pattern matching to SelectionDAGBuilder::visitSelect. Differential Revision: http://reviews.llvm.org/D12007 llvm-svn: 244960
*	AMDGPU: Avoid using 64-bit shift for i64 (shl x, 32)	Matt Arsenault	2015-07-14	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This can be done with only moves, which in theory will optimize better later. Although this transform increases the instruction count, it should be code size / cycle count neutral in the worst VALU case. It also seems to slightly improve a couple of testcases due to other DAG combines this exposes. This is probably slightly worse for the SALU case, so it might be better to handle this during moveToVALU, although then you lose some simplifications like the load width reducing in the simple testcase. llvm-svn: 242177
*	AMDGPU: Add helper function for implicit parameter offsets.	Tom Stellard	2015-07-09	1	-0/+12
\| \| \| \| \| \|	Patch by: Zoltan Gilian llvm-svn: 241861
*	Remove getDataLayout() from TargetLowering	Mehdi Amini	2015-07-09	1	-12/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779
*	Make TargetLowering::getPointerTy() take DataLayout as an argument	Mehdi Amini	2015-07-09	1	-16/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775