bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	transform fadd chains to increase parallelism	Sanjay Patel	2015-04-28	1	-0/+18
	This is a compromise: with this simple patch, we should always handle a chain of exactly 3 operations optimally, but we're not generating the optimal balanced binary tree for a longer sequence. In general, this transform will reduce the dependency chain for a sequence of instructions using N operands from a worst case N-1 dependent operations to N/2 dependent operations. The optimal balanced binary tree would reduce the chain to log2(N). The trade-off for not dealing with longer sequences is: (1) we have less complexity in the compiler, (2) we avoid unknown compile-time blowup calculating a balanced tree, and (3) we don't need to worry about the increased register pressure required to parallelize longer sequences. It also seems unlikely that we would ever encounter really long strings of dependent ops like that in the wild, but I'm not sure how to verify that speculation. FWIW, I see no perf difference for test-suite running on btver2 (x86-64) with -ffast-math and this patch. We can extend this patch to cover other associative operations such as fmul, fmax, fmin, integer add, integer mul. This is a partial fix for: https://llvm.org/bugs/show_bug.cgi?id=17305 and if extended: https://llvm.org/bugs/show_bug.cgi?id=21768 https://llvm.org/bugs/show_bug.cgi?id=23116 The issue also came up in: http://reviews.llvm.org/D8941 Differential Revision: http://reviews.llvm.org/D9232 llvm-svn: 236031
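The dependency-chain arithmetic in the message above can be checked with a small model. This is a minimal sketch, not the DAGCombiner code: expression trees are plain Python tuples, and the names `chain`, `balanced`, `depth`, and `reassociate` are hypothetical helpers invented for illustration. It assumes the rewrite has the shape ((w + x) + y) + z -> (w + x) + (y + z), which matches "a chain of exactly 3 operations". Reassociating floating-point adds can change rounding, which is presumably why the message measures with -ffast-math.

```python
def chain(operands):
    """Build the left-leaning tree ((a + b) + c) + d ... (hypothetical helper)."""
    tree = operands[0]
    for operand in operands[1:]:
        tree = ("fadd", tree, operand)
    return tree


def balanced(operands):
    """Build a balanced binary tree over the operands, for comparison only."""
    if len(operands) == 1:
        return operands[0]
    mid = len(operands) // 2
    return ("fadd", balanced(operands[:mid]), balanced(operands[mid:]))


def depth(tree):
    """Length of the longest chain of dependent fadd operations."""
    if not isinstance(tree, tuple):
        return 0  # a leaf operand has no dependencies
    return 1 + max(depth(tree[1]), depth(tree[2]))


def reassociate(tree):
    """Rewrite ((w + x) + y) + z as (w + x) + (y + z); otherwise return the tree unchanged."""
    if isinstance(tree, tuple) and isinstance(tree[1], tuple):
        _, inner, z = tree
        _, wx, y = inner
        if isinstance(wx, tuple):
            return ("fadd", wx, ("fadd", y, z))
    return tree


ops = ["w", "x", "y", "z"]
linear = chain(ops)
print(depth(linear))               # dependent operations in the linear chain
print(depth(reassociate(linear)))  # after one reassociation step
print(depth(balanced(ops)))        # balanced tree over the same operands
```

For four operands the single rewrite already reaches the balanced depth; the gap the message describes only appears for longer chains.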
*	move IR-level optimization flags into their own struct	Sanjay Patel	2015-04-28	1	-3/+4
	This is a preliminary step to using the IR-level floating-point fast-math-flags in the SDAG (D8900). In this patch, we introduce the optimization flags as their own struct. As noted in the TODO comment, we should eventually share this data between the IR passes and the backend. We also switch the existing nsw / nuw / exact bit functionality of the BinaryWithFlagsSDNode class to use the new struct. The tradeoff is that instead of using the free but limited space of SDNode's SubclassData, we add a data member to the subclass. This means we don't have to repeat all of the get/set methods per flag, but we're potentially adding size to all nodes of this subclass type. In practice on 64-bit systems (measured on Linux and MacOS X), there is no size difference between an SDNode and BinaryWithFlagsSDNode after this change: they're both 80 bytes. This means that we had at least one free byte to play with due to struct alignment. Differential Revision: http://reviews.llvm.org/D9325 llvm-svn: 235997
*	Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"	Sergey Dmitrouk	2015-04-28	1	-355/+490
	[DebugInfo] Add debug locations to constant SD nodes This adds debug locations to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases the choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything; instead they just check it for SDNodes, ARM and AArch64, where it's easy to get incorrect locations on constants. This is not a complete fix, as FastISel contains a workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). That is a slightly different issue, though, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989
*	Revert "[DebugInfo] Add debug locations to constant SD nodes"	Daniel Jasper	2015-04-28	1	-490/+355
	This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987
*	[DebugInfo] Add debug locations to constant SD nodes	Sergey Dmitrouk	2015-04-28	1	-355/+490
	This adds debug locations to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases the choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything; instead they just check it for SDNodes, ARM and AArch64, where it's easy to get incorrect locations on constants. This is not a complete fix, as FastISel contains a workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). That is a slightly different issue, though, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977
*	[DAGCombiner] Fix the type used in canFoldInAddressingMode to account for the right scaling.	Quentin Colombet	2015-04-24	1	-2/+2
	In the function canFoldInAddressingMode, VT is computed as the type of the destination/source of a LOAD/STORE operation, instead of the memory type of the operation. On targets with a scaling factor on the offset of the LOAD/STORE operations, the function may return false for actually valid cases. This may then prevent the selection of profitable pre or post indexed load/store operations, and instead select pre or post indexed load/store for unprofitable cases. Patch by Francois de Ferriere <francois.de-ferriere@st.com>! Differential Revision: http://reviews.llvm.org/D9146 llvm-svn: 235780
*	[DAGCombiner] Remove extra bitcasts surrounding vector shuffles	Simon Pilgrim	2015-04-23	1	-0/+45
	Patch to remove extra bitcasts from shuffles, this is often a legacy of XformToShuffleWithZero being used to combine bitmaskings (of float vectors bitcast to integer vectors) into shuffles: bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) Differential Revision: http://reviews.llvm.org/D9097 llvm-svn: 235578
*	Fixed logic to enable complex FMA formation.	Olivier Sallenave	2015-04-22	1	-2/+2
	llvm-svn: 235508
*	[DAGCombine] Disable select(c, load,load) for indexed loads	Hal Finkel	2015-04-22	1	-0/+3
	This turned up after r235333, but was a pre-existing bug. The optimization which transforms select(c, load, load) into a load of a select of the addresses does not handle indexed loads (pre/post inc/dec). However, it did not check for them either, leading to a crash if it tried to transform one of them. llvm-svn: 235497
*	Refactoring and enhancement to FMA combine.	Olivier Sallenave	2015-04-20	1	-171/+369
	llvm-svn: 235344
*	DAGCombine: Remove redundant NaN checks around ISD::FSQRT	Tom Stellard	2015-04-20	1	-0/+35
	This folds: (select (setcc x, -0.0, *lt), NaN, (fsqrt x)) -> (fsqrt x) llvm-svn: 235333
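The reason the check is redundant can be seen in a small model. This is a sketch under stated assumptions, not the DAGCombiner code: Python's `math.sqrt` raises on negative input, so `ieee_sqrt` is a hypothetical stand-in that models a square root returning NaN for negative inputs, which is the behaviour the fold relies on. `guarded`, `folded`, and `same` are likewise names invented here.

```python
import math


def ieee_sqrt(x):
    """Model of a square root that yields NaN for negative (or NaN) inputs.

    Assumption: this stands in for the fsqrt node; it is not an LLVM API.
    """
    return math.sqrt(x) if x >= 0.0 else math.nan


def guarded(x):
    """The original pattern: select(x < -0.0, NaN, fsqrt(x))."""
    return math.nan if x < -0.0 else ieee_sqrt(x)


def folded(x):
    """The pattern after the fold: fsqrt(x) alone."""
    return ieee_sqrt(x)


def same(a, b):
    """Equality that treats two NaNs as equal."""
    return (math.isnan(a) and math.isnan(b)) or a == b


samples = [4.0, 0.0, -0.0, -1.0, math.inf, -math.inf, math.nan]
print(all(same(guarded(x), folded(x)) for x in samples))
```

Every input for which the select would pick NaN is one for which the square root already produces NaN, so the comparison and select add nothing.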
*	Add support to promote f16 to f32	Pirama Arumuga Nainar	2015-04-17	1	-0/+17
	Summary: This patch adds legalization support to operate on FP16 as a load/store type and do operations on it as floats. Tests for ARM are added to test/CodeGen/ARM/fp16-promote.ll Reviewers: srhines, t.p.northover Differential Revision: http://reviews.llvm.org/D8755 llvm-svn: 235215
*	[CodeGen] Re-apply r234809 (concat of scalars), with an x86_mmx fix.	Ahmed Bougacha	2015-04-16	1	-0/+66
	The only type that isn't an integer, isn't floating point, and isn't a vector; ladies and gentlemen, the gift that keeps on giving: x86_mmx! Fixes PR23246. Original message (reverted in r235062): [CodeGen] Combine concat_vectors of scalars into build_vector. Combine something like: (v8i8 concat_vectors (v2i8 bitcast (i16)) x4) into: (v8i8 (bitcast (v4i16 BUILD_VECTOR (i16) x4))) If any of the scalars are floating point, use that throughout. Differential Revision: http://reviews.llvm.org/D8948 llvm-svn: 235072
*	Revert r234809 because it caused PR23246.	Nick Lewycky	2015-04-16	1	-60/+0
	llvm-svn: 235062
*	[CodeGen] Combine concat_vectors of scalars into build_vector.	Ahmed Bougacha	2015-04-13	1	-0/+60
	Combine something like: (v8i8 concat_vectors (v2i8 bitcast (i16)) x4) into: (v8i8 (bitcast (v4i16 BUILD_VECTOR (i16) x4))) If any of the scalars are floating point, use that throughout. Differential Revision: http://reviews.llvm.org/D8948 llvm-svn: 234809
*	DAGCombiner: Fix crash in select(select) opt.	Matthias Braun	2015-04-13	1	-2/+2
	In case of different types used for the condition of the selects the select(select) -> select(and) normalisation cannot be performed. See also: http://reviews.llvm.org/D7622 llvm-svn: 234763
*	Reduce dyn_cast<> to isa<> or cast<> where possible.	Benjamin Kramer	2015-04-10	1	-7/+7
	No functional change intended. llvm-svn: 234586
*	[CodeGen] Combine concat_vector of trunc'd scalar to scalar_to_vector.	Ahmed Bougacha	2015-04-09	1	-3/+13
	We already do: concat_vectors(scalar, undef) -> scalar_to_vector(scalar) When the scalar is legal. When it's not, but is a truncated legal scalar, we can also do: concat_vectors(trunc(scalar), undef) -> scalar_to_vector(scalar) Which is equivalent, since the upper lanes are undef anyway. While there, teach the combine to look at more than 2 operands. Differential Revision: http://reviews.llvm.org/D8883 llvm-svn: 234530
*	Revert "Refactoring and enhancement to FMA combine."	Rafael Espindola	2015-04-09	1	-361/+172
	This reverts commit r234513. It was failing on the bots. llvm-svn: 234518
*	Refactoring and enhancement to FMA combine.	Olivier Sallenave	2015-04-09	1	-172/+361
	llvm-svn: 234513
*	[DAGCombine] Fix a bug in MergeConsecutiveStores.	Akira Hatanaka	2015-04-08	1	-20/+21
	The bug manifests when there are two loads and two stores chained as follows in a DAG, (ld v3f32) -> (st f32) -> (ld v3f32) -> (st f32) and the stores' values are extracted from the preceding vector loads. MergeConsecutiveStores would replace the first store in the chain with the merged vector store, which would create a cycle between the merged store node and the last load node that appears in the chain. This commit fixes the bug by replacing the last store in the chain instead. rdar://problem/20275084 Differential Revision: http://reviews.llvm.org/D8849 llvm-svn: 234430
*	[DAGCombiner] Add support for FCEIL, FFLOOR and FTRUNC vector constant folding	Simon Pilgrim	2015-04-06	1	-6/+3
	Differential Revision: http://reviews.llvm.org/D8715 llvm-svn: 234179
*	[DAGCombiner] Merge FMUL Scalar and Vector constant canonicalization to RHS. NFCI.	Simon Pilgrim	2015-04-05	1	-8/+2
	llvm-svn: 234118
*	[DAGCombiner] Canonicalize vector constants for ADD/MUL/AND/OR/XOR re-association	Simon Pilgrim	2015-04-04	1	-6/+11
	Scalar integers are commuted to move constants to the RHS for re-association - this ensures vectors do the same. llvm-svn: 234092
*	[DAGCombiner] Combine shuffles of BUILD_VECTOR and SCALAR_TO_VECTOR	Simon Pilgrim	2015-04-03	1	-0/+37
	This patch attempts to fold the shuffling of 'scalar source' inputs - BUILD_VECTOR and SCALAR_TO_VECTOR nodes - if the shuffle node is the only user. This folds away a lot of unnecessary shuffle nodes, and allows quite a bit of constant folding that was being missed. Differential Revision: http://reviews.llvm.org/D8516 llvm-svn: 234004
*	Fix PR23065. Avoid optimizing bitcast of build_vector with constant input to scalar_to_vector.	Jiangning Liu	2015-04-01	1	-5/+0
	llvm-svn: 233778
*	typos; NFC	Sanjay Patel	2015-03-31	1	-5/+5
	llvm-svn: 233701
*	Use SDValue bool check to tidyup some possible vector folding ops. NFC.	Simon Pilgrim	2015-03-29	1	-40/+35
	llvm-svn: 233498
*	Use SDValue bool check to tidyup some possible ReassociateOps. NFC.	Simon Pilgrim	2015-03-29	1	-10/+5
	llvm-svn: 233495
*	[DAGCombiner] Fixed incorrect test for buildvector of constant integers.	Simon Pilgrim	2015-03-28	1	-14/+8
	DAGCombiner::ReassociateOps was correctly testing for a constant integer scalar but failed to correctly test for constant integer vectors (it was testing for any constant vector). llvm-svn: 233482
*	revert inadvertent change	Sanjay Patel	2015-03-26	1	-2/+0
	llvm-svn: 233294
*	comment cleanup; NFC	Sanjay Patel	2015-03-26	1	-0/+2
	llvm-svn: 233293
*	fix indent; NFC	Sanjay Patel	2015-03-26	1	-1/+1
	llvm-svn: 233288
*	[DAGCombiner] Add support for TRUNCATE + FP_EXTEND vector constant folding	Simon Pilgrim	2015-03-25	1	-53/+23
	This patch adds support for the vector constant folding of TRUNCATE and FP_EXTEND instructions and tidies up the SINT_TO_FP and UINT_TO_FP instructions to match. It also moves the vector constant folding for the FNEG and FABS instructions to use the DAG.getNode() functionality like the other unary instructions. Differential Revision: http://reviews.llvm.org/D8593 llvm-svn: 233224
*	'optnone' should not disable DAG combiner.	Paul Robinson	2015-03-25	1	-5/+0
	Reverts the code change from r221168 and the relevant test. It was a mistake to disable the combiner, and based on the ultimate definition of 'optnone' we shouldn't have considered the test case as failing in the first place. llvm-svn: 233153
*	Move private classes into anonymous namespaces	Benjamin Kramer	2015-03-23	1	-0/+2
	NFC. llvm-svn: 232944
*	Fix a nasty bug in DAGCombine of STORE nodes.	Owen Anderson	2015-03-19	1	-3/+8
	This is very related to the bug fixed in r174431. The problem is that SelectionDAG does not include alignment in the uniquing of loads and stores. When an otherwise no-op DAGCombine would increase the alignment of a load or store, the original node would be returned (with the alignment increased), which would cause the node not to be processed by any further DAGCombines. I don't have a direct testcase for this that manifests on an in-tree target, but I did see some noise in the tests for other targets and have updated them for it. llvm-svn: 232780
*	DAGCombiner: fold (xor (shl 1, x), -1) -> (rotl ~1, x)	David Majnemer	2015-03-18	1	-0/+26
	Targets which provide a rotate make it possible to replace a sequence of (XOR (SHL 1, x), -1) with (ROTL ~1, x). This saves an instruction on architectures like X86 and POWER(64). Differential Revision: http://reviews.llvm.org/D8350 llvm-svn: 232572
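The identity behind this fold can be verified exhaustively for one bit width. The following is a minimal sketch assuming 32-bit values and shift amounts in the range 0 to 31; `MASK32`, `rotl32`, `xor_shl`, and `rotl_form` are names made up for this illustration, not LLVM functions.

```python
MASK32 = 0xFFFFFFFF  # assumption: model 32-bit integer values


def rotl32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def xor_shl(x):
    """The original sequence: (xor (shl 1, x), -1)."""
    return ((1 << x) & MASK32) ^ MASK32


def rotl_form(x):
    """The replacement: (rotl ~1, x)."""
    return rotl32(~1 & MASK32, x)


# Compare both forms for every in-range shift amount.
print(all(xor_shl(x) == rotl_form(x) for x in range(32)))
```

Both forms produce a word with every bit set except bit x: the first clears it by inverting a single shifted bit, the second rotates the one zero bit of ~1 into position.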
*	XformToShuffleWithZero - Added clearer early outs and general tidy up. NFCI	Simon Pilgrim	2015-03-17	1	-31/+38
	llvm-svn: 232557
*	[DAGCombiner] Add a shuffle mask commutation helper function. NFCI.	Simon Pilgrim	2015-03-07	1	-19/+2
	We have an increasing number of cases where we are creating commuted shuffle masks - all implementing nearly the same code. This patch adds a static helper function - ShuffleVectorSDNode::commuteMask() and replaces a number of cases to use it. Differential Revision: http://reviews.llvm.org/D8139 llvm-svn: 231581
*	Use SDValue bool check to tidyup some possible combines. NFC.	Simon Pilgrim	2015-03-07	1	-6/+5
	llvm-svn: 231569
*	[DAGCombiner] Fix wrong folding of AND dag nodes.	Andrea Di Biagio	2015-03-07	1	-3/+7
	This patch fixes the logic in the DAGCombiner that folds an AND node according to rule: (and (X (load V)), C) -> (X (load V)) An AND between a vector load 'X' and a constant build_vector 'C' can be folded into the load itself only if we can prove that the AND operation is redundant. The algorithm implemented by 'visitAND' firstly computes the splat value 'S' from C, and then checks if S has the lower 'B' bits set (where B is the size in bits of the vector element type). The algorithm takes into account also the 'undef' bits in the splat mask. Unfortunately, the algorithm only worked under the assumption that the size of S is a multiple of the vector element type. With this patch, we conservatively avoid folding the AND if the splat bits are not compatible with the vector element type. Added X86 test and-load-fold.ll Differential Revision: http://reviews.llvm.org/D8085 llvm-svn: 231563
*	[DAGCombiner] SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(V,C)) -> VECTOR_SHUFFLE	Simon Pilgrim	2015-03-07	1	-0/+30
	This patch attempts to convert a SCALAR_TO_VECTOR using an operand from an EXTRACT_VECTOR_ELT into a VECTOR_SHUFFLE. This prevents many cases of spilling scalar data between the gpr + simd registers. At present the optimization only accepts cases where there is no TRUNC of the scalar type (i.e. all types must match). Differential Revision: http://reviews.llvm.org/D8132 llvm-svn: 231554
*	DAGCombiner: Canonicalize select(and/or,x,y) depending on target.	Matthias Braun	2015-03-06	1	-0/+63
	This is based on the following equivalences: select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y) select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y)) Many targets cannot perform and/or on the CPU flags and therefore the right side should be chosen to avoid materializing the i1 flags in an integer register. If the target can perform this operation efficiently we normalize to the left form. Differential Revision: http://reviews.llvm.org/D7622 llvm-svn: 231507
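Both equivalences can be confirmed with a truth table. This is a small sketch, not the DAGCombiner logic: `select` is a hypothetical model of the select node over Python booleans, and `equivalences_hold` is a helper invented here.

```python
from itertools import product


def select(cond, x, y):
    """Model of a select node: x when cond is true, otherwise y."""
    return x if cond else y


def equivalences_hold():
    """Check both rewrites for every combination of the two conditions."""
    x, y = "X", "Y"
    for c0, c1 in product([False, True], repeat=2):
        and_ok = select(c0 and c1, x, y) == select(c0, select(c1, x, y), y)
        or_ok = select(c0 or c1, x, y) == select(c0, x, select(c1, x, y))
        if not (and_ok and or_ok):
            return False
    return True


print(equivalences_hold())
```

The nested forms on the right never combine the two conditions into a single value, which is what lets a target keep each condition in its flags register.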
*	DAGCombiner: Factor out some and/or combines.	Matthias Braun	2015-03-06	1	-225/+252
	This is in preparation for changing visitSELECT to normalize towards select(Cond0, select(Cond1, X, Y), Y); select(Cond0, X, select(Cond1, X, Y)) which perform an implicit and/or of the conditions. The factored function contains all DAGCombine rules which reduce two values combined by an And/Or operation to a single value. This does not include rules involving constants as visitSELECT already handles that case. Differential Revision: http://reviews.llvm.org/D8026 llvm-svn: 231506
*	[DagCombiner] Allow shuffles to merge through bitcasts	Simon Pilgrim	2015-03-05	1	-0/+83
	Currently shuffles may only be combined if they are of the same type, despite the fact that bitcasts are often introduced in between shuffle nodes (e.g. x86 shuffle type widening). This patch allows a single input shuffle to peek through bitcasts and if the input is another shuffle will merge them, shuffling using the smallest sized type, and re-applying the bitcasts at the inputs and output instead. Dropped old ShuffleToZext test - this patch removes the use of the zext and vector-zext.ll covers these anyhow. Differential Revision: http://reviews.llvm.org/D7939 llvm-svn: 231380
*	[DAGCombine] Fix a bug in a BUILD_VECTOR combine	Michael Kuperstein	2015-03-04	1	-2/+3
	When trying to convert a BUILD_VECTOR into a shuffle, we try to split a single source vector that is twice as wide as the destination vector. We can not do this when we also need the zero vector to create a blend. This fixes PR22774. Differential Revision: http://reviews.llvm.org/D8040 llvm-svn: 231219
*	DAGCombiner::LoadedSlice: Remove explicit copy ctor in favor of the Rule of Zero	David Blaikie	2015-03-03	1	-3/+0
	This way, the copy assignment operator can be used without hitting the deprecated case in C++11. llvm-svn: 231144
*	Revert "Remove the explicit SDNodeIterator::operator= in favor of the implicit default"	David Blaikie	2015-03-03	1	-0/+3
	Accidentally committed a few more of these cleanup changes than intended. Still breaking these out & tidying them up. This reverts commit r231135. llvm-svn: 231136
*	Remove the explicit SDNodeIterator::operator= in favor of the implicit default	David Blaikie	2015-03-03	1	-3/+0
	There doesn't seem to be any need to assert that iterator assignment is between iterators over the same node - if you want to reuse an iterator variable to iterate another node, that's perfectly acceptable. Just don't mix comparisons between iterators into disjoint sequences, as usual. llvm-svn: 231135