bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Add extra headers that got deleted by my revert in r289916 but for which	Chandler Carruth	2016-12-16	1	-1/+2
\| \| \| \| \| \|	new usage had already grown in the file. llvm-svn: 289917
*	Revert patch series introducing the DAG combine to match a load-by-bytes	Chandler Carruth	2016-12-16	1	-283/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	idiom. r289538: Match load by bytes idiom and fold it into a single load r289540: Fix a buildbot failure introduced by r289538 r289545: Use more detailed assertion messages in the code ... r289646: Add a couple of assertions to the load combine code ... This DAG combine has a bad crash in it that is quite hard to trigger sadly -- it relies on sneaking code with UB through the SDAG build and into this particular combine. I've responded to the original commit with a test case that reproduces it. However, the code also has other problems that will require substantial changes to address and so I'm going ahead and reverting it for now. This should unblock us and perhaps others that are hitting the crash in the wild and will let a fresh patch with updated approach come in cleanly afterward. Sorry for any trouble or disruption! llvm-svn: 289916
*	Don't combine splats with other shuffles.	Eli Friedman	2016-12-15	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	We sometimes end up creating shuffles which are worse than the obvious translation of the IR. Fixes https://llvm.org/bugs/show_bug.cgi?id=31301 . Differential Revision: https://reviews.llvm.org/D27793 llvm-svn: 289882
*	Don't combine a shuffle of two BUILD_VECTORs with duplicate elements.	Eli Friedman	2016-12-15	1	-10/+23
\| \| \| \| \| \| \| \| \| \| \| \| \|	Targets can't handle this case well in general; we often transform a shuffle of two cheap BUILD_VECTORs to element-by-element insertion, which is very inefficient. Fixes https://llvm.org/bugs/show_bug.cgi?id=31364 . Partially fixes https://llvm.org/bugs/show_bug.cgi?id=31301. Differential Revision: https://reviews.llvm.org/D27787 llvm-svn: 289874
*	[DAG] allow more select folding for targets that have 'and not' (PR31175)	Sanjay Patel	2016-12-14	1	-6/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original motivation for this patch comes from wanting to canonicalize more IR to selects and also canonicalizing min/max. If we're going to do that, we need more backend fixups to undo select codegen when simpler ops will do. I chose AArch64 for the tests because that shows the difference in the simplest way. This should fix: https://llvm.org/bugs/show_bug.cgi?id=31175 Differential Revision: https://reviews.llvm.org/D27489 llvm-svn: 289738
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵	Nirav Dave	2016-12-14	1	-228/+278
\| \| \| \| \| \| \| \| \| \|	UseAA is enabled." Reverting due to ARM MCJIT and MIPS LLD error. This reverts commit r289659. llvm-svn: 289667
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵	Nirav Dave	2016-12-14	1	-278/+228
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	enabled. Retrying after fixing after removing load-store factoring through token factors in favor of improved token factor operand pruning Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289659
*	[DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of ↵	Simon Pilgrim	2016-12-14	1	-28/+47
\| \| \| \| \| \| \| \| \| \| \| \|	just APInt::isPowerOf2 Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases. Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value. Differential Revision: https://reviews.llvm.org/D27714 llvm-svn: 289654
*	Add a couple of assertions to the load combine code introduced by r289538	Artur Pilipenko	2016-12-14	1	-1/+5
\| \| \| \|	llvm-svn: 289646
*	Use more detailed assertion messages in the code introduced by r289538	Artur Pilipenko	2016-12-13	1	-4/+8
\| \| \| \|	llvm-svn: 289545
*	Fix a buildbot failure introduced by r289538	Artur Pilipenko	2016-12-13	1	-2/+1
\| \| \| \| \| \|	Build failed because of unused variable in product mode. llvm-svn: 289540
*	[DAGCombiner] Match load by bytes idiom and fold it into a single load	Artur Pilipenko	2016-12-13	1	-0/+276
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it. Assuming little endian target: i8 a = ... i32 val = a[0] \| (a[1] << 8) \| (a[2] << 16) \| (a[3] << 24) => i32 val = ((i32)a) i8 a = ... i32 val = (a[0] << 24) \| (a[1] << 16) \| (a[2] << 8) \| a[3] => i32 val = BSWAP(((i32)a)) This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations. Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part: i32 val = a[i] \| (a[i + 1] << 8) \| (a[i + 2] << 16) \| (a[i + 3] << 24) Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses. The general scheme is to match OR expressions by recursively calculating the origin of individual bits which constitute the resulting OR value. If all the OR bits come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed. Reviewed By: hfinkel, RKSimon, filcab Differential Revision: https://reviews.llvm.org/D26149 llvm-svn: 289538
*	Move BaseIndexOffset in DAGCombiner.cpp so it will be available for the ↵	Artur Pilipenko	2016-12-13	1	-104/+104
\| \| \| \| \| \|	upcoming user llvm-svn: 289537
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵	Nirav Dave	2016-12-09	1	-141/+275
\| \| \| \| \| \| \| \|	UseAA is enabled." This reverts commit r289221 which appears to be triggering an assertion llvm-svn: 289226
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵	Nirav Dave	2016-12-09	1	-275/+141
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	enabled. Retrying after fixing overly aggressive load-store forwarding optimization. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289221
*	[DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combine	Simon Pilgrim	2016-12-06	1	-0/+9
\| \| \| \| \| \| \| \|	Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted. Differential Revision: https://reviews.llvm.org/D27461 llvm-svn: 288842
*	[DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default	Nicolai Haehnle	2016-12-02	1	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When X = 0 and Y = inf, the original code produces inf, but the transformed code produces nan. So this transform (and its relatives) should only be used when the no-infs-fp-math flag is explicitly enabled. Also disable the transform using fmad (intermediate rounding) when unsafe-math is not enabled, since it can reduce the precision of the result; consider this example with binary floating point numbers with two bits of mantissa: x = 1.01 y = 111 x * (y + 1) = 1.01 * 1000 = 1010 (this is the exact result; no rounding occurs at any step) x * y + x = 1000.11 + 1.01 =r 1000 + 1.01 = 1001.01 =r 1000 (with rounding towards zero) The example relies on rounding towards zero at least in the second step. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98578 Reviewers: RKSimon, tstellarAMD, spatel, arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26602 llvm-svn: 288506
*	[SelectionDAG] Rename and clarify visitFMULForFMADCombine (NFC)	Nicolai Haehnle	2016-12-01	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Suggested by @spatel in D26602. Reviewers: spatel, hfinkel Subscribers: spatel, llvm-commits Differential Revision: https://reviews.llvm.org/D27260 llvm-svn: 288336
*	Test commit. Comment changes. NFC.	Warren Ristow	2016-11-29	1	-5/+5
\| \| \| \|	llvm-svn: 288100
*	[DAG] clean up foldSelectCCToShiftAnd(); NFCI	Sanjay Patel	2016-11-28	1	-35/+35
\| \| \| \|	llvm-svn: 288088
*	[DAG] add helper function for selectcc --> and+shift transforms; NFC	Sanjay Patel	2016-11-28	1	-42/+51
\| \| \| \|	llvm-svn: 288073
*	Revert "[DAG] Improve loads-from-store forwarding to handle TokenFactor"	Nirav Dave	2016-11-28	1	-13/+2
\| \| \| \| \| \|	This reverts commit r287773 which caused issues with ppc64le builds. llvm-svn: 288035
*	Use SDValue helpers instead of explicitly going via SDValue::getNode(). NFCI	Simon Pilgrim	2016-11-25	1	-7/+7
\| \| \| \|	llvm-svn: 287941
*	[DAGCombine] Teach DAG combine that if both inputs of a vselect are the ↵	Craig Topper	2016-11-24	1	-0/+4
\| \| \| \| \| \| \| \|	same, then the condition doesn't matter and the vselect can be removed. Selects with scalar condition already handle this correctly. llvm-svn: 287904
*	[DAG] Improve loads-from-store forwarding to handle TokenFactor	Nirav Dave	2016-11-23	1	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	Forward store values to matching loads down through token factors. Factored from D14834. Reviewers: jyknight, hfinkel Subscribers: hfinkel, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D26080 llvm-svn: 287773
*	[DAGCombiner] Fix infinite loop in vector mul/shl combining	John Brawn	2016-11-23	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have the following DAGCombiner transformations: (mul (shl X, c1), c2) -> (mul X, c2 << c1) (mul (shl X, C), Y) -> (shl (mul X, Y), C) (shl (mul x, c1), c2) -> (mul x, c1 << c2) Usually the constant shift is optimised by SelectionDAG::getNode when it is constructed, by SelectionDAG::FoldConstantArithmetic, but when we're dealing with vectors and one of those vector constants contains an undef element FoldConstantArithmetic does not fold and we enter an infinite loop. Fix this by making FoldConstantArithmetic use getNode to decide how to fold each vector element, the same as FoldConstantVectorArithmetic does, and rather than adding the constant shift to the work list instead only apply the transformation if it's already been folded into a constant, as if it's not we're going to loop endlessly. Additionally add missing NoOpaques to one of those transformations, which I noticed when writing the tests for this. Differential Revision: https://reviews.llvm.org/D26605 llvm-svn: 287766
*	Type legalization for compressstore and expandload intrinsics.	Elena Demikhovsky	2016-11-23	1	-16/+13
\| \| \| \| \| \| \| \|	Implemented widening (v2f32) and splitting (v16f64). On splitting, I use "popcnt" to calculate memory increment. More type legalization work will come in the next patches. llvm-svn: 287761
*	Fix spelling mistakes in SelectionDAG comments. NFC.	Simon Pilgrim	2016-11-20	1	-2/+2
\| \| \| \| \| \|	Identified by Pedro Giffuni in PR27636. llvm-svn: 287487
*	DAGCombiner: fix combine of trunc and select	Asaf Badouh	2016-11-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	bugzilla: https://llvm.org/bugs/show_bug.cgi?id=29002 pr29002 Differential Revision: https://reviews.llvm.org/D26449 llvm-svn: 286938
*	[DAG Combiner] Fix the native computation of the Newton series for reciprocals	Evandro Menezes	2016-11-10	1	-28/+30
\| \| \| \| \| \| \| \| \| \| \| \|	The generic infrastructure to compute the Newton series for reciprocal and reciprocal square root was conceived to allow a target to compute the series itself. However, the original code did not properly consider this condition if returned by a target. This patch addresses the issues to allow a target to compute the series on its own. Differential revision: https://reviews.llvm.org/D22975 llvm-svn: 286523
*	Use common SDLoc. NFCI.	Simon Pilgrim	2016-11-10	1	-3/+3
\| \| \| \|	llvm-svn: 286473
*	[DAGCombiner] Correctly extract the ConstOrConstSplat shift value for SHL nodes	Simon Pilgrim	2016-11-10	1	-3/+2
\| \| \| \| \| \| \| \|	We were failing to extract a constant splat shift value if the shifted value was being masked. The (shl (and (setcc) N01CV) N1CV) -> (and (setcc) N01CV<<N1CV) combine was unnecessarily preventing this. llvm-svn: 286454
*	DAGCombiner: fix use-after-free when merging consecutive stores	Nicolai Haehnle	2016-11-03	1	-18/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Have MergeConsecutiveStores explicitly return information about the stores that were merged, so that we can safely determine whether the starting node has been freed. Reviewers: chandlerc, bogner, niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25601 llvm-svn: 285916
*	Expandload and Compressstore intrinsics	Elena Demikhovsky	2016-11-03	1	-2/+2
\| \| \| \| \| \| \| \|	2 new intrinsics covering AVX-512 compress/expand functionality. This implementation includes syntax, DAG builder, operation lowering and tests. Does not include: handling of illegal data types, codegen prepare pass and the cost model. llvm-svn: 285876
*	[DAG] x \| x --> x	Sanjay Patel	2016-10-30	1	-0/+4
\| \| \| \|	llvm-svn: 285522
*	[DAG] x & x --> x	Sanjay Patel	2016-10-30	1	-0/+4
\| \| \| \|	llvm-svn: 285521
*	[DAGCombiner] Fix a crash visiting `AND` nodes.	Davide Italiano	2016-10-28	1	-1/+6
\| \| \| \| \| \| \| \| \| \|	Instead of asserting that the shift count is != 0 we just bail out as it's not profitable trying to optimize a node which will be removed anyway. Differential Revision: https://reviews.llvm.org/D26098 llvm-svn: 285480
*	[DAGCombiner] Enable (urem x, (shl pow2, y)) -> (and x, (add (shl pow2, y), ↵	Simon Pilgrim	2016-10-25	1	-3/+3
\| \| \| \| \| \|	-1)) combine for splatted vectors llvm-svn: 285129
*	[DAGCombiner] Enable srem(x.y) -> urem(x,y) combine for vectors	Simon Pilgrim	2016-10-25	1	-4/+2
\| \| \| \| \| \|	SelectionDAG::SignBitIsZero (via SelectionDAG::computeKnownBits) has supported vectors since rL280927 llvm-svn: 285123
*	[DAGCombiner] Enable sdiv(x.y) -> udiv(x,y) combine for vectors	Simon Pilgrim	2016-10-25	1	-4/+2
\| \| \| \| \| \|	SelectionDAG::SignBitIsZero (via SelectionDAG::computeKnownBits) has supported vectors since rL280927 llvm-svn: 285118
*	[DAGCombine] Preserve shuffles when one of the vector operands is constant	Zvi Rackover	2016-10-25	1	-34/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Do not perform combines such as: vector_shuffle<4,1,2,3>(build_vector(Ud, C0, C1 C2), scalar_to_vector(X)) -> build_vector(X, C0, C1, C2) Keeping the shuffle allows lowering the constant build_vector to a materialized constant vector (such as a vector-load from the constant-pool or some other idiom). Reviewers: delena, igorb, spatel, mkuper, andreadb, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25524 llvm-svn: 285063
*	Use SDValue::getConstantOperandVal() helper. NFCI.	Simon Pilgrim	2016-10-23	1	-4/+1
\| \| \| \|	llvm-svn: 284949
*	[DAG] fold negation of sign-bit	Sanjay Patel	2016-10-21	1	-11/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	0 - X --> 0, if the sub is NUW 0 - X --> 0, if X is 0 or the minimum signed value and the sub is NSW 0 - X --> X, if X is 0 or the minimum signed value This is the DAG equivalent of: https://reviews.llvm.org/rL284649 plus the fold for the NUW case which already existed in InstSimplify. Note that we miss a vector fold because of a deficiency in the DAG version of computeKnownBits(). llvm-svn: 284844
*	[DAG] use SDNode flags 'nsz' to enable fadd/fsub with zero folds	Sanjay Patel	2016-10-21	1	-16/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As discussed in D24815, let's start the process of killing off the broken fast-math global state housed in TargetOptions and eliminate the need for function-level fast-math attributes. Here we enable two similar folds that are possible when we don't care about signed-zero: fadd nsz x, 0 --> x fsub nsz 0, x --> -x Note that although the test cases include a 'sin' function call, I'm side-stepping the FMF-on-calls question (and lack of support in the DAG) for now. It's not needed for these tests - isNegatibleForFree/GetNegatedExpression just look through a ISD::FSIN node. Also, when we create an FNEG node and propagate the Flags of the FSUB to it, this doesn't actually do anything today because Flags are silently dropped for any node that is not a binary operator. Differential Revision: https://reviews.llvm.org/D25297 llvm-svn: 284824
*	[Target] remove TargetRecip class; 2nd try	Sanjay Patel	2016-10-20	1	-11/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a retry of r284495 which was reverted at r284513 due to use-after-scope bugs caused by faulty usage of StringRef. This version also renames a pair of functions: getRecipEstimateDivEnabled() getRecipEstimateSqrtEnabled() as suggested by Eric Christopher. original commit msg: [Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering This is a follow-up to https://reviews.llvm.org/D24816 - where we changed reciprocal estimates to be function attributes rather than TargetOptions. This patch is intended to be a structural, but not functional change. By moving all of the TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate state, shield the callers from the string format implementation, and simplify/localize the logic needed for a target to enable this. If a function has a "reciprocal-estimates" attribute, those settings may override the target's default reciprocal preferences for whatever operation and data type we're trying to optimize. If there's no attribute string or specific setting for the op/type pair, just use the target default settings. As noted earlier, a better solution would be to move the reciprocal estimate settings to IR instructions and SDNodes rather than function attributes, but that's a multi-step job that requires infrastructure improvements. I intend to work on that, but it's not clear how long it will take to get all the pieces in place. Differential Revision: https://reviews.llvm.org/D25440 llvm-svn: 284746
*	[DAGCombiner] Add general constant vector support to (srl (shl x, c), c) -> ↵	Simon Pilgrim	2016-10-20	1	-8/+8
\| \| \| \| \| \| \| \|	(and x, cst2) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284717
*	Merged nested ifs. NFCI.	Simon Pilgrim	2016-10-19	1	-7/+6
\| \| \| \|	llvm-svn: 284616
*	[DAGCombiner] Add general constant vector support to (shl (add x, c1), c2) ↵	Simon Pilgrim	2016-10-19	1	-4/+5
\| \| \| \| \| \| \| \|	-> (add (shl x, c2), c1 << c2) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284613
*	[DAGCombiner] Add general constant vector support to (shl (sra x, c1), c1) ↵	Simon Pilgrim	2016-10-19	1	-7/+6
\| \| \| \| \| \| \| \|	-> (and x, (shl -1, c1)) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284608
*	[DAGCombiner] Add general constant vector support to (shl (mul x, c1), c2) ↵	Simon Pilgrim	2016-10-19	1	-5/+6
\| \| \| \| \| \| \| \|	-> (mul x, c1 << c2) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284607