bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Fix name typo in SelectonDAG	Joel Jones	2016-12-16	1	-4/+4
\| \| \| \|	llvm-svn: 289969
*	Add extra headers that got deleted by my revert in r289916 but for which	Chandler Carruth	2016-12-16	1	-1/+2
\| \| \| \| \| \|	new usage had already grown in the file. llvm-svn: 289917
*	Revert patch series introducing the DAG combine to match a load-by-bytes	Chandler Carruth	2016-12-16	1	-283/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	idiom. r289538: Match load by bytes idiom and fold it into a single load r289540: Fix a buildbot failure introduced by r289538 r289545: Use more detailed assertion messages in the code ... r289646: Add a couple of assertions to the load combine code ... This DAG combine has a bad crash in it that is quite hard to trigger sadly -- it relies on sneaking code with UB through the SDAG build and into this particular combine. I've responded to the original commit with a test case that reproduces it. However, the code also has other problems that will require substantial changes to address and so I'm going ahead and reverting it for now. This should unblock us and perhaps others that are hitting the crash in the wild and will let a fresh patch with updated approach come in cleanly afterward. Sorry for any trouble or disruption! llvm-svn: 289916
*	Don't combine splats with other shuffles.	Eli Friedman	2016-12-15	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	We sometimes end up creating shuffles which are worse than the obvious translation of the IR. Fixes https://llvm.org/bugs/show_bug.cgi?id=31301 . Differential Revision: https://reviews.llvm.org/D27793 llvm-svn: 289882
*	Don't combine a shuffle of two BUILD_VECTORs with duplicate elements.	Eli Friedman	2016-12-15	1	-10/+23
\| \| \| \| \| \| \| \| \| \| \| \| \|	Targets can't handle this case well in general; we often transform a shuffle of two cheap BUILD_VECTORs to element-by-element insertion, which is very inefficient. Fixes https://llvm.org/bugs/show_bug.cgi?id=31364 . Partially fixes https://llvm.org/bugs/show_bug.cgi?id=31301. Differential Revision: https://reviews.llvm.org/D27787 llvm-svn: 289874
*	[DAG] allow more select folding for targets that have 'and not' (PR31175)	Sanjay Patel	2016-12-14	1	-6/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original motivation for this patch comes from wanting to canonicalize more IR to selects and also canonicalizing min/max. If we're going to do that, we need more backend fixups to undo select codegen when simpler ops will do. I chose AArch64 for the tests because that shows the difference in the simplest way. This should fix: https://llvm.org/bugs/show_bug.cgi?id=31175 Differential Revision: https://reviews.llvm.org/D27489 llvm-svn: 289738
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵	Nirav Dave	2016-12-14	1	-228/+278
\| \| \| \| \| \| \| \| \| \|	UseAA is enabled." Reverting due to ARM MCJIT and MIPS LLD error. This reverts commit r289659. llvm-svn: 289667
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵	Nirav Dave	2016-12-14	1	-278/+228
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	enabled. Retrying after fixing after removing load-store factoring through token factors in favor of improved token factor operand pruning Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289659
*	[DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of ↵	Simon Pilgrim	2016-12-14	2	-30/+63
\| \| \| \| \| \| \| \| \| \| \| \|	just APInt::isPowerOf2 Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases. Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value. Differential Revision: https://reviews.llvm.org/D27714 llvm-svn: 289654
*	Replace APFloatBase static fltSemantics data members with getter functions	Stephan Bergmann	2016-12-14	4	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic. Differential Revision: https://reviews.llvm.org/D26671 llvm-svn: 289647
*	Add a couple of assertions to the load combine code introduced by r289538	Artur Pilipenko	2016-12-14	1	-1/+5
\| \| \| \|	llvm-svn: 289646
*	Use more detailed assertion messages in the code introduced by r289538	Artur Pilipenko	2016-12-13	1	-4/+8
\| \| \| \|	llvm-svn: 289545
*	Fix a buildbot failure introduced by r289538	Artur Pilipenko	2016-12-13	1	-2/+1
\| \| \| \| \| \|	Build failed because of unused variable in product mode. llvm-svn: 289540
*	[DAGCombiner] Match load by bytes idiom and fold it into a single load	Artur Pilipenko	2016-12-13	1	-0/+276
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it. Assuming little endian target: i8 a = ... i32 val = a[0] \| (a[1] << 8) \| (a[2] << 16) \| (a[3] << 24) => i32 val = ((i32)a) i8 a = ... i32 val = (a[0] << 24) \| (a[1] << 16) \| (a[2] << 8) \| a[3] => i32 val = BSWAP(((i32)a)) This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations. Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part: i32 val = a[i] \| (a[i + 1] << 8) \| (a[i + 2] << 16) \| (a[i + 3] << 24) Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses. The general scheme is to match OR expressions by recursively calculating the origin of individual bits which constitute the resulting OR value. If all the OR bits come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed. Reviewed By: hfinkel, RKSimon, filcab Differential Revision: https://reviews.llvm.org/D26149 llvm-svn: 289538
*	Move BaseIndexOffset in DAGCombiner.cpp so it will be available for the ↵	Artur Pilipenko	2016-12-13	1	-104/+104
\| \| \| \| \| \|	upcoming user llvm-svn: 289537
*	[SelectionDAG] computeKnownBits - simplified knownbits sign extension. NFCI.	Simon Pilgrim	2016-12-13	1	-13/+4
\| \| \| \| \| \|	We don't need to extract+test the sign bit of the known ones/zeros, we can use sext which will handle all of this. llvm-svn: 289534
*	[Statepoints] Reuse stack slots more than once within a basic block	Philip Reames	2016-12-13	1	-4/+9
\| \| \| \| \| \| \| \| \| \|	The stack slot reuse code had a really amusing bug. We ended up only reusing a stack slot exact once (initial use + reuse) within a basic block. If we had a third statepoint to process, we ended up allocating a new set of stack slots. If we crossed a basic block boundary, the set got cleared. As a result, code which is invoke heavy doesn't see the problem, but multiple calls within a basic block does. Net result: as we optimize invokes into calls, lowering gets worse. The root error here is that the bitmap uses by the custom allocator wasn't kept in sync. The result was that we ended up resizing the bitmap on the next statepoint (to handle the cross block case), reset the bit once, but then never reset it again. Differential Revision: https://reviews.llvm.org/D25243 llvm-svn: 289509
*	[SelectionDAG] Add support for EXTRACT_SUBVECTOR to ComputeNumSignBits	Simon Pilgrim	2016-12-12	1	-0/+2
\| \| \| \| \| \|	Pre-commit as discussed on D27657 llvm-svn: 289425
*	[SelectionDAG] Add ability for computeKnownBits to peek through bitcasts ↵	Simon Pilgrim	2016-12-10	1	-1/+23
\| \| \| \| \| \| \| \|	from 'large element' scalar/vector to 'small element' vector. Extension to D27129 which already supported bitcasts from 'small element' vector to 'large element' scalar/vector types. llvm-svn: 289329
*	[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes (REAPPLIED)	Simon Pilgrim	2016-12-09	1	-0/+36
\| \| \| \| \| \|	Reapplied with fix for PR31323 - X86 SSE2 vXi16 multiplies for illegal types were creating CONCAT_VECTORS nodes with vector inputs that might not total the number of elements in the result type. llvm-svn: 289232
*	AMDGPU: Fix i128 mul	Matt Arsenault	2016-12-09	1	-1/+1
\| \| \| \|	llvm-svn: 289231
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵	Nirav Dave	2016-12-09	1	-141/+275
\| \| \| \| \| \| \| \|	UseAA is enabled." This reverts commit r289221 which appears to be triggering an assertion llvm-svn: 289226
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵	Nirav Dave	2016-12-09	1	-275/+141
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	enabled. Retrying after fixing overly aggressive load-store forwarding optimization. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289221
*	Use SelectionDAG.getSplatBuildVector helper. NFCI.	Simon Pilgrim	2016-12-09	1	-6/+5
\| \| \| \|	llvm-svn: 289220
*	[SelectionDAG] Use SelectionDAG.getBuildVector helper. NFCI.	Simon Pilgrim	2016-12-09	2	-9/+6
\| \| \| \| \| \|	Makes interception of BUILD_VECTOR creation easier for debugging. llvm-svn: 289218
*	[SelectionDAG] Add additional checks to CONCAT_VECTORS creation	Simon Pilgrim	2016-12-09	1	-0/+10
\| \| \| \| \| \|	Part of the work for PR31323 - add extra asserts checking that the input vectors are of consistent type and result in the correct number of vector elements. llvm-svn: 289214
*	[SelectionDAG] Add partial BITCAST support to computeKnownBits	Simon Pilgrim	2016-12-09	1	-0/+44
\| \| \| \| \| \| \| \| \| \|	Adds support for bitcasting a little endian 'small element' vector to 'large element' scalar/vector (e.g. v16i8 to v4i32 or v2i32 to i64), which is required for PR30845. We extract the knownbits for each 'small element' part and concatenate the results together. We can add support for big endian and 'large element' scalar/vector to 'small element' vector bitcasting once we have test cases for them. Differential Revision: https://reviews.llvm.org/D27129 llvm-svn: 289200
*	Revert "[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes"	Daniel Jasper	2016-12-09	1	-36/+0
\| \| \| \| \| \| \| \|	This reverts commit r288916 as it is currently causing a crasher in Halide. Reproducer on llvm.org/PR31323. While it might be that halide is generating invalid IR, llc shouldn't crash. llvm-svn: 289194
*	[SelectionDAG] Add expansion and promotion of [US]MUL_LOHI	Nicolai Haehnle	2016-12-08	4	-27/+179
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Most targets set the action for these nodes to Expand even though there isn't actually any code for them in ExpandNode. Instead, targets simply relied on the fact that no code generates these nodes as long as the nodes aren't legal or custom. However, generating these nodes can be useful e.g. for divide-by-constant in wider integer types. Expand of [US]MUL_LOHI will use MULH[US] when legal or custom, and a sequence of half-width multiplications otherwise. Promote uses a wider multiply. This patch intends to not change the generated code, but indirect effects are possible since expansions/promotions that were previously done in DAGCombine may now be done in LegalizeDAG. See D24822 for a change that actually uses the new expansion. Reviewers: spatel, bkramer, venkatra, efriedma, hfinkel, ast, nadav, tstellarAMD Subscribers: arsenm, jyknight, nemanjai, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24956 llvm-svn: 289050
*	[SelectionDAG] Add knownbits support for vector demandedelts in ↵	Simon Pilgrim	2016-12-07	1	-2/+4
\| \| \| \| \| \|	SMAX/SMIN/UMAX/UMIN opcodes llvm-svn: 288926
*	[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes	Simon Pilgrim	2016-12-07	1	-0/+36
\| \| \| \|	llvm-svn: 288916
*	[SelectionDAG] Removed old knownbits TODO comment. NFCI.	Simon Pilgrim	2016-12-07	1	-3/+0
\| \| \| \| \| \|	EXTRACT_VECTOR_ELT does support demanded elts if the element index is known and in range. llvm-svn: 288913
*	[CodeGen] Fix result type for SMULO/UMULO legalization	Eli Friedman	2016-12-06	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On some platforms (like MSP430) the second element of the result structure for SMULO/UMULO may have a shorter type than the one returned by SetCC. We need to truncate it to the right type, or else some incorrect code may be generated later on. This fixes issue https://github.com/rust-lang/rust/issues/37829 Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D27154 llvm-svn: 288857
*	[DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combine	Simon Pilgrim	2016-12-06	1	-0/+9
\| \| \| \| \| \| \| \|	Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted. Differential Revision: https://reviews.llvm.org/D27461 llvm-svn: 288842
*	[SelectionDAG] We can ignore knownbits from an undef shuffle vector index if ↵	Simon Pilgrim	2016-12-06	1	-3/+3
\| \| \| \| \| \|	we don't actually demand that element llvm-svn: 288839
*	Avoid repeated calls to Op.getOpcode(). NFCI.	Simon Pilgrim	2016-12-06	1	-3/+4
\| \| \| \|	llvm-svn: 288814
*	[TargetLowering] add special-case for demanded bits analysis of 'not'	Sanjay Patel	2016-12-05	1	-5/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We treat bitwise 'not' as a special operation and try not to reduce its all-ones mask. Presumably, this is because a 'not' may be cheaper than a generic 'xor' or it may get folded into another logic op if the target has those. However, if we can remove a logic instruction by changing the xor's constant mask value, that should always be a win. Note that the IR version of SimplifyDemandedBits() does not treat 'not' as a special-case currently (although that's marked with a FIXME). So if you run this IR through -instcombine, you should get the same end result. I'm hoping to add a different backend transform that will expose this problem though, so I need to solve this first. Differential Revision: https://reviews.llvm.org/D27356 llvm-svn: 288676
*	DAG: Fold out out of bounds insert_vector_elt	Matt Arsenault	2016-12-03	1	-0/+7
\| \| \| \| \| \| \|	getNode already prevents formation of out of bounds constant extract_vector_elts. Do the same for insert_vector_elt. llvm-svn: 288603
*	[DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default	Nicolai Haehnle	2016-12-02	1	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When X = 0 and Y = inf, the original code produces inf, but the transformed code produces nan. So this transform (and its relatives) should only be used when the no-infs-fp-math flag is explicitly enabled. Also disable the transform using fmad (intermediate rounding) when unsafe-math is not enabled, since it can reduce the precision of the result; consider this example with binary floating point numbers with two bits of mantissa: x = 1.01 y = 111 x * (y + 1) = 1.01 * 1000 = 1010 (this is the exact result; no rounding occurs at any step) x * y + x = 1000.11 + 1.01 =r 1000 + 1.01 = 1001.01 =r 1000 (with rounding towards zero) The example relies on rounding towards zero at least in the second step. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98578 Reviewers: RKSimon, tstellarAMD, spatel, arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26602 llvm-svn: 288506
*	IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵	Peter Collingbourne	2016-12-02	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
*	SDAG: Avoid a large, usually empty SmallVector in a recursive function	Justin Bogner	2016-12-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This SmallVector is using up 128 bytes on the stack every time despite almost always being empty[1], and since this function can recurse quite deeply that adds up to a lot of overhead. We've seen this run afoul of ulimits in some cases with ASAN on. Replacing the SmallVector with a std::vector trades an occasional heap allocation for vastly less stack usage. [1]: I gathered some stats on an internal test suite and the vector was non-empty in only 45,000 of 10,000,000 calls to this function. llvm-svn: 288441
*	Move most EH from MachineModuleInfo to MachineFunction	Matthias Braun	2016-12-01	3	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recommitting r288293 with some extra fixes for GlobalISel code. Most of the exception handling members in MachineModuleInfo is actually per function data (talks about the "current function") so it is better to keep it at the function instead of the module. This is a necessary step to have machine module passes work properly. Also: - Rename TidyLandingPads() to tidyLandingPads() - Use doxygen member groups instead of "//===- EH ---"... so it is clear where a group ends. - I had to add an ugly const_cast at two places in the AsmPrinter because the available MachineFunction pointers are const, but the code wants to call tidyLandingPads() in between (markFunctionEnd()/endFunction()). Differential Revision: https://reviews.llvm.org/D27227 llvm-svn: 288405
*	[SelectionDAG] Rename and clarify visitFMULForFMADCombine (NFC)	Nicolai Haehnle	2016-12-01	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Suggested by @spatel in D26602. Reviewers: spatel, hfinkel Subscribers: spatel, llvm-commits Differential Revision: https://reviews.llvm.org/D27260 llvm-svn: 288336
*	Temporarily Revert "Move most EH from MachineModuleInfo to MachineFunction"	Eric Christopher	2016-12-01	3	-12/+12
\| \| \| \| \| \| \| \| \|	This apprears to have broken the global isel bot: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-globalisel_build/5174/console This reverts commit r288293. llvm-svn: 288322
*	Move most EH from MachineModuleInfo to MachineFunction	Matthias Braun	2016-11-30	3	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most of the exception handling members in MachineModuleInfo is actually per function data (talks about the "current function") so it is better to keep it at the function instead of the module. This is a necessary step to have machine module passes work properly. Also: - Rename TidyLandingPads() to tidyLandingPads() - Use doxygen member groups instead of "//===- EH ---"... so it is clear where a group ends. - I had to add an ugly const_cast at two places in the AsmPrinter because the available MachineFunction pointers are const, but the code wants to call tidyLandingPads() in between (markFunctionEnd()/endFunction()). Differential Revision: https://reviews.llvm.org/D27227 llvm-svn: 288293
*	Move VariableDbgInfo from MachineModuleInfo to MachineFunction	Matthias Braun	2016-11-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	VariableDbgInfo is per function data, so it makes sense to have it with the function instead of the module. This is a necessary step to have machine module passes work properly. Differential Revision: https://reviews.llvm.org/D27186 llvm-svn: 288292
*	[SelectionDAG] Refactor TargetLowering::expandMUL (NFC)	Nicolai Haehnle	2016-11-30	1	-50/+32
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Further preparation for the expansion of MUL_LOHI added in D24956. Reviewers: efriedma, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27064 llvm-svn: 288248
*	Test commit. Comment changes. NFC.	Warren Ristow	2016-11-29	1	-5/+5
\| \| \| \|	llvm-svn: 288100
*	[DAG] clean up foldSelectCCToShiftAnd(); NFCI	Sanjay Patel	2016-11-28	1	-35/+35
\| \| \| \|	llvm-svn: 288088
*	[DAG] add helper function for selectcc --> and+shift transforms; NFC	Sanjay Patel	2016-11-28	1	-42/+51
\| \| \| \|	llvm-svn: 288073