bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[DAGCombiner] re-enable truncation of binops	Sanjay Patel	2018-12-08	1	-2/+2
	This is effectively re-committing the changes from:
	rL347917 (D54640)
	rL348195 (D55126)
	...which were effectively reverted here:
	rL348604
	...because the code had a bug that could induce infinite looping or eventual out-of-memory compilation. The bug was that this code did not guard against transforming opaque constants. More details are in the post-commit mailing list thread for r347917. A reduced test for that is included in the x86 bool-math.ll file. (I wasn't able to reduce a PPC backend test for this, but it was almost the same pattern.)
	Original commit message for r347917:
	The motivating case for this is shown in: https://bugs.llvm.org/show_bug.cgi?id=32023 and the corresponding rot16.ll regression tests. Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc sequences that don't get folded in IR.
	As the TODO comments suggest, there will be regressions if we extend this (for x86, we mostly seem to be missing LEA opportunities, but there are likely vector folds missing too). I think those should be considered existing bugs because this is the same transform that we do as an IR canonicalization in instcombine. We just need more tests to make those visible independent of this patch.
	llvm-svn: 348706
*	[DAGCombiner] disable truncation of binops by default	Sanjay Patel	2018-12-07	1	-2/+2
	As discussed in the post-commit thread of r347917, this transform is fighting with an existing transform causing an infinite loop or out-of-memory, so this is effectively reverting r347917 and its follow-up r348195 while we investigate the bug. llvm-svn: 348604
*	[DAGCombiner] narrow truncated binops	Sanjay Patel	2018-11-29	1	-2/+2
	The motivating case for this is shown in: https://bugs.llvm.org/show_bug.cgi?id=32023 and the corresponding rot16.ll regression tests. Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc sequences that don't get folded in IR.
	As the TODO comments suggest, there will be regressions if we extend this (for x86, we mostly seem to be missing LEA opportunities, but there are likely vector folds missing too). I think those should be considered existing bugs because this is the same transform that we do as an IR canonicalization in instcombine. We just need more tests to make those visible independent of this patch.
	Differential Revision: https://reviews.llvm.org/D54640 llvm-svn: 347917
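The narrowing described above relies on truncation commuting with wrap-around integer operations. The sketch below checks that property in plain Python on modelled 32-bit values truncated to 8 bits. The helper names and the set of operations are illustrative assumptions, not taken from the DAGCombiner source, and the sketch says nothing about the extra conditions the patch applies before narrowing.

```python
import itertools

MASK8 = 0xFF          # models an i8 result
MASK32 = 0xFFFFFFFF   # models an i32 operand

def trunc8(x):
    """Model 'trunc iN -> i8' by keeping the low 8 bits."""
    return x & MASK8

# Wrap-around binops on modelled i32 values (illustrative selection).
OPS = {
    "add": lambda a, b: (a + b) & MASK32,
    "sub": lambda a, b: (a - b) & MASK32,
    "mul": lambda a, b: (a * b) & MASK32,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}

def wide_then_trunc(op, x, y):
    """trunc (binop x, y): operate wide, then narrow the result."""
    return trunc8(OPS[op](x, y))

def narrow_binop(op, x, y):
    """binop (trunc x), (trunc y): narrow the operands first."""
    return trunc8(OPS[op](trunc8(x), trunc8(y)))

samples = [0, 1, 0x7F, 0x80, 0xFF, 0x100, 0x1234, 0xDEADBEEF, 0xFFFFFFFF]
for op, x, y in itertools.product(OPS, samples, samples):
    assert wide_then_trunc(op, x, y) == narrow_binop(op, x, y)

# A logical right shift does not commute with truncation: the high bits
# shifted in from above the truncation point are lost if we narrow first.
assert trunc8(0x100 >> 8) != trunc8(trunc8(0x100) >> trunc8(8))
```

The last assertion shows why only operations whose low result bits depend solely on the low operand bits are candidates for this kind of narrowing.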
*	[AMDGPU] Add pattern for v_alignbit_b32 with immediate	Stanislav Mekhanoshin	2017-06-28	1	-5/+4
	If the immediate shift amount is less than 32, we can use alignbit too. Differential Revision: https://reviews.llvm.org/D34729 llvm-svn: 306500
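The message above does not spell out the matched pattern. The sketch below assumes it is a 64-bit logical right shift followed by truncation to 32 bits, and models v_alignbit_b32 as the low 32 bits of the 64-bit value {src0, src1} shifted right by the low five bits of src2. Both the model and the function names are assumptions for illustration, written in Python rather than as a TableGen pattern.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def alignbit_b32(src0, src1, src2):
    """Assumed model: low 32 bits of ({src0, src1} >> src2[4:0])."""
    wide = ((src0 & MASK32) << 32) | (src1 & MASK32)
    return (wide >> (src2 & 31)) & MASK32

def trunc_srl_i64(x, amt):
    """trunc (srl i64 x, amt) to i32."""
    return ((x & MASK64) >> amt) & MASK32

x = 0x0123456789ABCDEF
hi, lo = x >> 32, x & MASK32

# For every shift amount below 32 the two forms agree.
for amt in range(32):
    assert alignbit_b32(hi, lo, amt) == trunc_srl_i64(x, amt)

# At 32 they diverge, because only five bits of the amount are used.
assert alignbit_b32(hi, lo, 32) != trunc_srl_i64(x, 32)
```

Under these assumptions, the "less than 32" restriction in the message corresponds to the five-bit shift field: larger immediates would wrap around instead of shifting further.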
*	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel	Matt Arsenault	2017-03-21	1	-6/+6
	Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
*	[AMDGPU] Remove getBidirectionalReasonRank	Stanislav Mekhanoshin	2017-03-11	1	-3/+4
	This method inverts the Reason field of a scheduling candidate. It compares RegCritical and RegExcess correctly, but everything else is broken. In fact, it can prefer a weaker reason such as Weak over RegCritical because Weak > -RegCritical. The CandReason enum is properly sorted, so just remove the artificial ranking. Differential Revision: https://reviews.llvm.org/D30557 llvm-svn: 297536
*	Enable FeatureFlatForGlobal on Volcanic Islands	Matt Arsenault	2017-01-24	1	-2/+2
	This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
*	AMDGPU: Select branch on undef to uniform scc branch	Matt Arsenault	2016-12-15	1	-6/+7
	llvm-svn: 289877
*	AMDGPU: Add VI i16 support	Tom Stellard	2016-11-10	1	-2/+7
	Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 286464
*	AMDGPU: Remove unnecessary and on conditional branch	Matt Arsenault	2016-11-07	1	-2/+1
	The comment explaining why this was necessary is incorrect in its description of v_cmp's behavior for inactive workitems. llvm-svn: 286134
*	Revert "AMDGPU: Add VI i16 support"	Tom Stellard	2016-11-04	1	-7/+2
	This reverts commits r285939 and r285948. These broke some conformance tests. llvm-svn: 285995
*	AMDGPU: Add VI i16 support	Tom Stellard	2016-11-03	1	-2/+7
	Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 285939
*	DAGCombiner: Reduce 64-bit BFE pattern to pattern on 32-bit component	Matt Arsenault	2016-04-21	1	-4/+2
	If the extracted bits are restricted to the upper half or lower half, this can be truncated. llvm-svn: 267024
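The reduction described above can be illustrated with a small bit-field-extract model. This is a sketch in Python under the assumption that "BFE" means taking `width` bits starting at bit `offset` of an unsigned value. The function names are hypothetical and do not correspond to DAGCombiner code.

```python
MASK32 = 0xFFFFFFFF

def bfe(value, offset, width):
    """Unsigned bit-field extract: 'width' bits starting at bit 'offset'."""
    return (value >> offset) & ((1 << width) - 1)

def bfe64_via_32(value, offset, width):
    """Extract from a 64-bit value using only one 32-bit half when possible.

    Returns None when the field straddles the two halves, in which case
    the 64-bit form has to be kept.
    """
    lo = value & MASK32
    hi = (value >> 32) & MASK32
    if offset + width <= 32:      # field lies entirely in the lower half
        return bfe(lo, offset, width)
    if offset >= 32:              # field lies entirely in the upper half
        return bfe(hi, offset - 32, width)
    return None

x = 0x0123456789ABCDEF
for offset in range(64):
    for width in range(1, 64 - offset + 1):
        narrow = bfe64_via_32(x, offset, width)
        if narrow is not None:
            assert narrow == bfe(x, offset, width)
```

Fields that cross bit 32 are the only case the sketch leaves in 64-bit form, which matches the "restricted to the upper half or lower half" condition in the message.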
*	AMDGPU: Set HasExtractBitInsn	Matt Arsenault	2016-03-01	1	-0/+303
	This currently does not have control over the bit width, and there are missing optimizations to reduce the integer to 32-bit if it can be. But in most situations we do want the sinking to occur. llvm-svn: 262296