bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[DAGCombiner] Just call isConstOrConstSplat directly. NFCI.	Simon Pilgrim	2016-10-19	1	-8/+4
	This will get the same ConstantSDNode scalar or vector splat value as the current separate dyn_cast<ConstantSDNode> / isVector() approach. llvm-svn: 284578
*	[DAGCombine] Generalize distributeTruncateThroughAnd to work with any ↵	Simon Pilgrim	2016-10-19	1	-13/+9
	non-opaque constant or constant vector llvm-svn: 284574
*	revert r284495: [Target] remove TargetRecip class	Sanjay Patel	2016-10-18	1	-32/+9
	There's something wrong with the StringRef usage while parsing the attribute string. llvm-svn: 284513
*	[Target] remove TargetRecip class; move reciprocal estimate isel ↵	Sanjay Patel	2016-10-18	1	-9/+32
	functionality to TargetLowering This is a follow-up to D24816 - where we changed reciprocal estimates to be function attributes rather than TargetOptions. This patch is intended to be a structural, but not functional change. By moving all of the TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate state, shield the callers from the string format implementation, and simplify/localize the logic needed for a target to enable this. If a function has a "reciprocal-estimates" attribute, those settings may override the target's default reciprocal preferences for whatever operation and data type we're trying to optimize. If there's no attribute string or specific setting for the op/type pair, just use the target default settings. As noted earlier, a better solution would be to move the reciprocal estimate settings to IR instructions and SDNodes rather than function attributes, but that's a multi-step job that requires infrastructure improvements. I intend to work on that, but it's not clear how long it will take to get all the pieces in place. Differential Revision: https://reviews.llvm.org/D25440 llvm-svn: 284495
*	[DAGCombiner] Add splatted vector support to (udiv x, (shl pow2, y)) -> x ↵	Simon Pilgrim	2016-10-18	1	-2/+3
	>>u (log2(pow2)+y) llvm-svn: 284491
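The udiv fold in this entry rests on a plain unsigned identity: dividing by a power of two shifted left by y equals a logical right shift by log2(pow2) + y. The following is a minimal scalar sketch of that identity only. The helper names are hypothetical and are not taken from DAGCombiner, and the check assumes the combined shift amount stays below the bit width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: log2 of a value assumed to be a power of two.
static unsigned log2OfPow2(uint32_t P) {
  unsigned N = 0;
  while ((P >>= 1) != 0)
    ++N;
  return N;
}

// Unfolded form: udiv x, (shl pow2, y).
static uint32_t udivByShiftedPow2(uint32_t X, uint32_t Pow2, unsigned Y) {
  return X / (Pow2 << Y);
}

// Folded form: x >>u (log2(pow2) + y).
static uint32_t lshrByLogPlusY(uint32_t X, uint32_t Pow2, unsigned Y) {
  return X >> (log2OfPow2(Pow2) + Y);
}

// Compares both forms over a few samples; the total shift stays below 32.
static bool udivFoldHolds() {
  const uint32_t Xs[] = {0u, 1u, 7u, 1000u, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t X : Xs)
    for (unsigned L = 0; L < 8; ++L)
      for (unsigned Y = 0; Y < 8; ++Y)
        if (udivByShiftedPow2(X, 1u << L, Y) != lshrByLogPlusY(X, 1u << L, Y))
          return false;
  return true;
}
```

The splatted-vector support added by the commit would apply the same identity lane by lane when every lane of the constant holds the same power of two.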
*	Strip trailing whitespace (NFCI)	Simon Pilgrim	2016-10-18	1	-1/+1
	llvm-svn: 284478
*	[DAG] make isConstOrConstSplat and isConstOrConstSplatFP more accessible; NFC	Sanjay Patel	2016-10-17	1	-38/+0
	As noted in: https://reviews.llvm.org/D25685 This is the next-to-smallest step needed to enable the ComputeNumSignBits fix in that patch. In a minor attempt to keep some structure, we're pulling the FP helper over along with its integer sibling, but clearly we can and should do more refactoring of the similar helper functions in DAGCombiner and SelectionDAG to simplify and not duplicate functionality. llvm-svn: 284421
*	[DAG] optimize away an arithmetic-right-shift of a 0 or -1 value	Sanjay Patel	2016-10-17	1	-0/+4
	This came up as part of: https://reviews.llvm.org/D25485 Note that the vector case is missed because ComputeNumSignBits() is deficient for vectors. llvm-svn: 284395
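The reason an arithmetic right shift of 0 or -1 can be dropped is that both values consist only of copies of the sign bit, so shifting in more sign bits changes nothing. A small sketch of that observation follows. It assumes `>>` on a negative `int32_t` is an arithmetic shift, which C++20 guarantees and mainstream compilers implement in earlier standards; the function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Assumes >> on a negative int32_t is an arithmetic shift (guaranteed from
// C++20, and the behaviour of mainstream compilers before that).
static int32_t ashr(int32_t V, unsigned Amt) { return V >> Amt; }

// 0 and -1 consist only of copies of the sign bit, so the shift is a no-op.
static bool ashrOfZeroOrMinusOneIsNoop() {
  const int32_t Vals[] = {0, -1};
  for (int32_t V : Vals)
    for (unsigned Amt = 0; Amt < 32; ++Amt)
      if (ashr(V, Amt) != V)
        return false;
  return true;
}
```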
*	[DAG] avoid creating illegal node when transforming negated shifted sign bit	Sanjay Patel	2016-10-14	1	-2/+3
	Eli noted this potential bug in the post-commit thread for: https://reviews.llvm.org/rL284239 ...but I'm not sure how to trigger it, so there's no test case yet. llvm-svn: 284268
*	[DAG] add folds for negated shifted sign bit	Sanjay Patel	2016-10-14	1	-0/+13
	The same folds exist in InstCombine already. This came up as part of: https://reviews.llvm.org/D25485 llvm-svn: 284239
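The message does not spell out the folds. One reading, stated here as an assumption, is that negating a value shifted right by bitwidth-1 is the same as switching between logical and arithmetic shift: -(X >>u 31) == X >>s 31 and -(X >>s 31) == X >>u 31. The sketch below checks those two identities on 32-bit scalars. It assumes the conversion to `int32_t` wraps and that `>>` is arithmetic for negative values.

```cpp
#include <cassert>
#include <cstdint>

// Logical shift that leaves only the sign bit: 0 or 1.
static uint32_t lshrSign(uint32_t X) { return X >> 31; }

// Arithmetic shift that smears the sign bit: 0 or 0xFFFFFFFF.
// Assumes the int32_t conversion wraps and >> is arithmetic for negatives.
static uint32_t ashrSign(uint32_t X) {
  return static_cast<uint32_t>(static_cast<int32_t>(X) >> 31);
}

// Checks both directions of the assumed fold on a few boundary values.
static bool negatedSignBitFoldsHold() {
  const uint32_t Xs[] = {0u, 1u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t X : Xs) {
    if (0u - lshrSign(X) != ashrSign(X))
      return false; // -(X >>u 31) == X >>s 31
    if (0u - ashrSign(X) != lshrSign(X))
      return false; // -(X >>s 31) == X >>u 31
  }
  return true;
}
```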
*	Fix use-after-frees	Nicolai Haehnle	2016-10-14	1	-2/+2
	Extracted from D25313, as suggested by Justin Bogner. llvm-svn: 284220
*	[DAGCombiner] Teach createBuildVecShuffle to handle cases where input ↵	Craig Topper	2016-10-14	1	-5/+9
	vectors are less than half of the output vector size. This will be needed by a future commit to support sign/zero extending from v8i8 to v8i64 which requires a sign/zero_extend_vector_inreg to be created which requires v8i8 to be concatenated up to v64i8 and goes through this code. llvm-svn: 284204
*	[DAG] hoist DL(N) and fix formatting; NFC	Sanjay Patel	2016-10-13	1	-24/+31
	llvm-svn: 284170
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵	Nirav Dave	2016-10-13	1	-120/+271
	UseAA is enabled." This reverts commit r284151 which appears to be triggering LTO failures on Hexagon llvm-svn: 284157
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵	Nirav Dave	2016-10-13	1	-271/+120
	enabled. Retrying after upstream changes. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do not do a 16-byte constant-zero store efficiently.
CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 284151
*	[DAGCombiner] Add vector support to (mul (shl X, Y), Z) -> (shl (mul X, Z), ↵	Simon Pilgrim	2016-10-13	1	-7/+6
	Y) style combines llvm-svn: 284122
*	[DAGCombiner] Add vector support to C2-(A+C1) -> (C2-C1)-A folding	Simon Pilgrim	2016-10-13	1	-5/+5
	llvm-svn: 284117
*	[DAGCombiner] Add vector support to (sub -1, x) -> (xor x, -1) canonicalization	Simon Pilgrim	2016-10-13	1	-1/+12
	Improves commutation potential llvm-svn: 284113
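The three vector-support entries above (r284122, r284117, r284113) extend combines that are plain wrapping-integer identities. As a hedged illustration, the scalar sketch below checks each identity with 32-bit unsigned arithmetic, where overflow wraps modulo 2^32. The function names are made up for this example, and the vector forms in the commits would apply the same identities per lane.

```cpp
#include <cassert>
#include <cstdint>

// (mul (shl X, Y), Z) versus (shl (mul X, Z), Y), for Y < 32.
static bool mulOfShlFoldHolds(uint32_t X, unsigned Y, uint32_t Z) {
  return ((X << Y) * Z) == ((X * Z) << Y);
}

// C2 - (A + C1) versus (C2 - C1) - A.
static bool subOfAddFoldHolds(uint32_t A, uint32_t C1, uint32_t C2) {
  return (C2 - (A + C1)) == ((C2 - C1) - A);
}

// (sub -1, x) versus (xor x, -1).
static bool subFromAllOnesFoldHolds(uint32_t X) {
  return (0xFFFFFFFFu - X) == (X ^ 0xFFFFFFFFu);
}

// Runs all three checks over a small grid of sample values.
static bool allThreeFoldsHold() {
  const uint32_t Vals[] = {0u, 1u, 3u, 0x12345678u, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t A : Vals) {
    if (!subFromAllOnesFoldHolds(A))
      return false;
    for (uint32_t B : Vals) {
      for (unsigned Y = 0; Y < 32; ++Y)
        if (!mulOfShlFoldHolds(A, Y, B))
          return false;
      for (uint32_t C : Vals)
        if (!subOfAddFoldHolds(A, B, C))
          return false;
    }
  }
  return true;
}
```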
*	[DAGCombiner] Update most ADD combines to support general vector combines	Simon Pilgrim	2016-10-12	1	-12/+54
	Add a number of helper functions to match scalar or vector equivalent constant/splat values to allow most of the combine patterns to be used by vectors. Differential Revision: https://reviews.llvm.org/D25374 llvm-svn: 284015
*	[DAGCombiner] Do not remove the load of stored values when optimizations are ↵	Konstantin Zhuravlyov	2016-10-12	1	-1/+2
	disabled This combiner breaks debug experience and should not be run when optimizations are disabled. For example: int main() { int j = 0; j += 2; if (j == 2) return 0; return 5; } When debugging this code compiled in /O0, it should be valid to break at line "j+=2;" and edit the value of j. It should change the return value of the function. Differential Revision: https://reviews.llvm.org/D19268 llvm-svn: 284014
*	[DAG] Fix crash in build_vector -> vector_shuffle combine	Michael Kuperstein	2016-10-11	1	-0/+5
	Fixes a crash in the build_vector -> vector_shuffle combine when the first vector input is twice as wide as the output, and the second input vector is even wider. llvm-svn: 283953
*	[DAG] add fold for masked negated sign-extended bool	Sanjay Patel	2016-10-11	1	-5/+11
	This enhances the fold added with: https://reviews.llvm.org/rL283900 llvm-svn: 283905
*	[DAG] add fold for masked negated extended bool	Sanjay Patel	2016-10-11	1	-2/+15
	The non-obvious motivation for adding this fold (which already happens in InstCombine) is that we want to canonicalize IR towards select instructions and canonicalize DAG nodes towards boolean math. So we need to recreate some folds in the DAG to handle that change in direction. An interesting implementation difference for cases like this is that InstCombine generally works top-down while the DAG goes bottom-up. That means we need to detect different patterns. In this case, the SimplifyDemandedBits fold prevents us from performing a zext to sext fold that would then be recognized as a negation of a sext. llvm-svn: 283900
*	[DAG] simplify logic; NFC	Sanjay Patel	2016-10-11	1	-8/+6
	llvm-svn: 283885
*	[DAG] hoist DL(N) and fix formatting; NFC	Sanjay Patel	2016-10-11	1	-25/+32
	llvm-svn: 283884
*	[DAG] fix formatting; NFC	Sanjay Patel	2016-10-11	1	-72/+68
	llvm-svn: 283878
*	[DAG] clean up foldSelectOfConstants(); NFCI	Sanjay Patel	2016-10-07	1	-19/+14
	Rename variables, simplify logic. Not clear yet why we don't handle a target with ZeroOrNegativeOneBooleanContent too. llvm-svn: 283613
*	[DAG] move fold (select C, 0, 1 -> xor C, 1) to a helper function; NFC	Sanjay Patel	2016-10-07	1	-16/+31
	We're missing at least 3 other similar folds based on what we have in InstCombine. llvm-svn: 283596
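The fold named in this entry, (select C, 0, 1) -> (xor C, 1), holds when the condition is represented as 0 or 1. A tiny sketch of the equivalence follows. It assumes that zero-or-one boolean representation, and the function names are illustrative rather than taken from the helper the commit introduces.

```cpp
#include <cassert>
#include <cstdint>

// select C, 0, 1: yields 0 when the condition is set, 1 otherwise.
static uint32_t selectZeroOne(uint32_t C) { return C ? 0u : 1u; }

// xor C, 1: flips the low bit of a 0/1 condition.
static uint32_t xorWithOne(uint32_t C) { return C ^ 1u; }

// The two forms agree for both values of a 0/1 condition.
static bool selectFoldHolds() {
  return selectZeroOne(0u) == xorWithOne(0u) &&
         selectZeroOne(1u) == xorWithOne(1u);
}
```

With an all-ones "true" value the two forms differ (xor gives 0xFFFFFFFE instead of 0), so this particular rewrite does not carry over unchanged to that representation.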
*	Delete some dead code in SelectionDAG (NFC)	Vedant Kumar	2016-10-06	1	-4/+0
	Differential Revision: https://reviews.llvm.org/D24435 llvm-svn: 283505
*	[DAG] Generalize build_vector -> vector_shuffle combine for more than 2 inputs	Michael Kuperstein	2016-10-06	1	-135/+191
	This generalizes the build_vector -> vector_shuffle combine to support any number of inputs. The idea is to create a binary tree of shuffles, where the first layer performs pairwise shuffles of the input vectors placing each input element into the correct lane, and the rest of the tree blends these shuffles together. This doesn't try to be smart and create any sort of "optimal" shuffles. The assumption is that even a "poor" shuffle sequence is better than extracting and inserting the elements one by one. Differential Revision: https://reviews.llvm.org/D24683 llvm-svn: 283480
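The binary tree of shuffles described in this message can be modelled outside LLVM. The sketch below is a simplified illustration and not the DAGCombiner implementation: the types `Lane` and `LaneSource` and all function names are hypothetical, vectors are plain `std::vector<int>`, and an undefined lane is modelled by a flag. The first layer shuffles each pair of inputs so that requested elements land in their final lanes, and the upper layers blend partial results pairwise.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A lane that may be undefined, standing in for an undef shuffle result.
struct Lane {
  bool Defined;
  int Value;
};
using PartialVec = std::vector<Lane>;

// Request for one output lane: element Elt of input vector Src.
struct LaneSource {
  std::size_t Src;
  std::size_t Elt;
};

// First layer: shuffle inputs Left and Left+1, placing each requested
// element into its final output lane and leaving other lanes undefined.
static PartialVec pairwiseShuffle(const std::vector<std::vector<int>> &Inputs,
                                  std::size_t Left,
                                  const std::vector<LaneSource> &Out) {
  PartialVec R(Out.size(), Lane{false, 0});
  for (std::size_t I = 0; I < Out.size(); ++I)
    if (Out[I].Src == Left || Out[I].Src == Left + 1)
      R[I] = Lane{true, Inputs[Out[I].Src][Out[I].Elt]};
  return R;
}

// Upper layers: take each lane from A if defined there, otherwise from B.
static PartialVec blend(const PartialVec &A, const PartialVec &B) {
  PartialVec R(A.size(), Lane{false, 0});
  for (std::size_t I = 0; I < A.size(); ++I)
    R[I] = A[I].Defined ? A[I] : B[I];
  return R;
}

// Builds the output by reducing the pairwise shuffles as a binary tree.
static PartialVec buildViaShuffleTree(
    const std::vector<std::vector<int>> &Inputs,
    const std::vector<LaneSource> &Out) {
  std::vector<PartialVec> Level;
  for (std::size_t L = 0; L < Inputs.size(); L += 2)
    Level.push_back(pairwiseShuffle(Inputs, L, Out));
  while (Level.size() > 1) {
    std::vector<PartialVec> Next;
    for (std::size_t I = 0; I + 1 < Level.size(); I += 2)
      Next.push_back(blend(Level[I], Level[I + 1]));
    if (Level.size() % 2 == 1)
      Next.push_back(Level.back()); // odd one out moves up unchanged
    Level = std::move(Next);
  }
  return Level.empty() ? PartialVec(Out.size(), Lane{false, 0}) : Level.front();
}
```

For example, with three two-element inputs {10, 11}, {20, 21}, {30, 31} and output requests (2,1), (0,0), (1,1), (2,0), the first layer yields two partial vectors and one blend combines them into 31, 10, 21, 30. As the message says, no attempt is made here to find an optimal shuffle sequence.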
*	fix formatting; NFC	Sanjay Patel	2016-10-03	1	-8/+5
	llvm-svn: 283115
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵	Nirav Dave	2016-09-28	1	-120/+271
	UseAA is enabled." This reverts commit r282600 due to test failures with MCJIT llvm-svn: 282604
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵	Nirav Dave	2016-09-28	1	-271/+120
	enabled. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do not do a 16-byte constant-zero store efficiently.
CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 282600
*	[DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombine	Michael Kuperstein	2016-09-28	1	-7/+0
	This check currently doesn't seem to do anything useful on any in-tree target: On non-x86, it always evaluates to false, so we never hit the code path that creates the shuffle with zero. On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to query in general, but doesn't make sense if only restricted to zero blends. Differential Revision: https://reviews.llvm.org/D24625 llvm-svn: 282567
*	Change the order of the split store from high - low to low - high.	Wei Mi	2016-09-18	1	-2/+2
	It is a trivial change which could make the testcase easier to reuse for the store splitting in CodeGenPrepare. llvm-svn: 281846
*	getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	1	-9/+8
	llvm-svn: 281493
*	getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	1	-32/+24
	llvm-svn: 281490
*	getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	1	-32/+32
	llvm-svn: 281489
*	[DAG] Allow build-to-shuffle combine to combine builds from two wide vectors.	Michael Kuperstein	2016-09-13	1	-27/+53
	This allows us to, in some cases, create a vector_shuffle out of a build_vector, when the inputs to the build are extract_elements from two different vectors, at least one of which is wider than the output. (E.g. a <8 x i16> being constructed out of elements from a <16 x i16> and a <8 x i16>). Differential Revision: https://reviews.llvm.org/D24491 llvm-svn: 281402
*	[DAGCombiner] Use APInt directly in (shl (zext (srl x, C)), C) combine range ↵	Simon Pilgrim	2016-09-13	1	-2/+2
	test To avoid assertion, we must ensure that the inner shift constant is within range before calling ConstantSDNode::getZExtValue(). We already know that the outer shift constant is in range. Followup to D23007 llvm-svn: 281362
*	[DAGCombiner] Use APInt directly in (shl (ext (shl x, c1)), c2) combine	Simon Pilgrim	2016-09-13	1	-11/+15
	Fix failure to detect out of range shift constants leading to assert in ConstantSDNode::getZExtValue() Followup to D23007 llvm-svn: 281354
*	Remove MVT:i1 xor instruction before SELECT. (Performance improvement).	Ayman Musa	2016-09-13	1	-0/+16
	Differential Revision: https://reviews.llvm.org/D23764 llvm-svn: 281308
*	[DAG] Refactor BUILD_VECTOR combine to make it easier to extend. NFCI.	Michael Kuperstein	2016-09-13	1	-123/+156
	This should make it easier to add cases that we currently don't cover, like supporting more kinds of type mismatches and more than 2 input vectors. llvm-svn: 281283
*	[CodeGen] Split out the notions of MI invariance and MI dereferenceability.	Justin Lebar	2016-09-11	1	-0/+2
	Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23371 llvm-svn: 281151
*	[SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar type	Simon Pilgrim	2016-09-09	1	-1/+1
	Fixes issue with rL280927 identified by Mikael Holmén llvm-svn: 281042
*	[DAGCombiner] Enable AND combines of splatted constant vectors	Simon Pilgrim	2016-09-08	1	-7/+7
	Allow AND combines to use a vector splatted constant as well as a constant scalar. Preliminary part of D24253. llvm-svn: 280926
*	[DAGCombine] More fixups to SETCC legality checking (visitANDLike/visitORLike)	Hal Finkel	2016-09-06	1	-28/+46
	I might have called this "r246507, the sequel". It fixes the same issue, as the issue has cropped up in a few more places. The underlying problem is that isSetCCEquivalent can pick up select_cc nodes with a result type that is not legal for a setcc node to have, and if we use that type to create new setcc nodes, nothing fixes that (and so we've violated the contract that the infrastructure has with the backend regarding setcc node types). Fixes PR30276. For convenience, here's the commit message from r246507, which explains the problem in greater detail: [DAGCombine] Fixup SETCC legality checking SETCC is one of those special node types for which operation actions (legality, etc.) are keyed off of an operand type, not the node's value type. This makes sense because the value type of a legal SETCC node is determined by its operands' value type (via the TLI function getSetCCResultType). When the SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value type, or directly with the value type provided by TLI.getSetCCResultType. The first problem being fixed here is that DAGCombine had several places querying TLI.isOperationLegal on SETCC, but providing the return of getSetCCResultType, instead of the operand type directly. This does not mean what the author thought, and "luckily", most in-tree targets have SETCC with Custom lowering, instead of marking them Legal, so these checks return false anyway.
The second problem being fixed here is that two of the DAGCombines could create SETCC nodes with arbitrary (integer) value types; specifically, those that would simplify: (setcc a, b, op1) and|or (setcc a, b, op2) -> setcc a, b, op3 (which is possible for some combinations of (op1, op2)) If the operands of the and|or node are actual setcc nodes, then this is not an issue (because the and|or must share the same type), but, the relevant code in DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls DAGCombiner::isSetCCEquivalent on each operand, and that function will recognise setcc-like select_cc nodes with other return types. And, thus, when creating new SETCC nodes, we need to be careful to respect the value-type constraint. This is even true before type legalization, because it is quite possible for the SELECT_CC node to have a legal type that does not happen to match the corresponding TLI.getSetCCResultType type. To be explicit, there is nothing that later fixes the value types of SETCC nodes (if the type is legal, but does not happen to match TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to work only because, either MVT::i1 is not legal, or it is what TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change, however. For the time being, restrict the relevant transformations to produce only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1 prior to type legalization). Fixes PR24636. llvm-svn: 280767
*	Split the store of a wide value merged from an int-fp pair into multiple stores.	Wei Mi	2016-09-02	1	-0/+103
	For the store of a wide value merged from a pair of values, especially an int-fp pair, it is sometimes more efficient to split it into separate narrow stores, which can remove the bitwise instructions or sink them to colder places. For now, the feature is enabled only on the x86 target, and only stores of int-fp pairs are split. The scope may be extended in the future if performance evidence supports it. Differential Revision: https://reviews.llvm.org/D22840 llvm-svn: 280505
*	[DAGcombiner] Fix incorrect sinking of a truncate into the operand of a shift.	Andrea Di Biagio	2016-09-02	1	-3/+3
	This fixes a regression introduced by revision 268094. Revision 268094 added the following dag combine rule: // trunc (shl x, K) -> shl (trunc x), K => K < vt.size / 2 That rule converts a truncate of a shift-by-constant into a shift of a truncated value. We do this only if the shift count is less than half the size in bits of the truncated value (K < vt.size / 2). The problem is that the constraint on the shift count is incorrect, so the rule doesn't work well in some cases involving vector types. The combine rule should have been written instead like this: // trunc (shl x, K) -> shl (trunc x), K => K < vt.getScalarSizeInBits() Basically, if K is smaller than the "scalar size in bits" of the truncated value then we know that by "sinking" the truncate into the operand of the shift we would never accidentally make the shift undefined. This patch fixes the check on the shift count, and adds test cases to make sure that we don't regress the behavior. Differential Revision: https://reviews.llvm.org/D24154 llvm-svn: 280482
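The corrected rule can be illustrated on scalars, with a separate look at why the old bound misfires on vectors. The sketch below is a hedged model, not LLVM code: it truncates i32 to i16, checks that sinking the truncate is value-preserving for every K below the 16-bit scalar width, and encodes the two bounds from the message as hypothetical predicates. The v8i16 figures in the usage note are an assumed example type.

```cpp
#include <cassert>
#include <cstdint>

// trunc (shl x, K), truncating i32 to i16.
static uint16_t truncOfShl(uint32_t X, unsigned K) {
  return static_cast<uint16_t>(X << K);
}

// shl (trunc x), K, with the result kept to 16 bits.
static uint16_t shlOfTrunc(uint32_t X, unsigned K) {
  uint32_t Narrow = static_cast<uint16_t>(X);
  return static_cast<uint16_t>(Narrow << K);
}

// Old bound from the commit message: K < vt.size / 2 (whole-type bits).
static bool oldBoundAllows(unsigned K, unsigned NumElts, unsigned ScalarBits) {
  return K < (NumElts * ScalarBits) / 2;
}

// Corrected bound: K < scalar size in bits of the truncated type.
static bool newBoundAllows(unsigned K, unsigned ScalarBits) {
  return K < ScalarBits;
}

// Both forms agree for every shift amount below the 16-bit scalar width.
static bool sinkingIsSafeBelowScalarWidth() {
  const uint32_t Xs[] = {0u, 1u, 0xFFFFu, 0x12345678u, 0xFFFFFFFFu};
  for (uint32_t X : Xs)
    for (unsigned K = 0; K < 16; ++K)
      if (truncOfShl(X, K) != shlOfTrunc(X, K))
        return false;
  return true;
}
```

For a truncated type such as v8i16, the whole type is 128 bits wide, so the old bound would accept a shift amount of 40 even though each element has only 16 bits; the corrected bound rejects it.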
*	[DAGCombine] Don't fold a trunc if it feeds an anyext	Michael Kuperstein	2016-09-01	1	-0/+4
	Legalization tends to create anyext(trunc) patterns. This should always be combined - into either a single trunc, a single ext, or nothing if the types match exactly. But if we happen to combine the trunc first, we may pull the trunc away from the anyext or make it implicit (e.g. the truncate(extract) -> extract(bitcast) fold). To prevent this, we can avoid doing the fold, similarly to how we already handle fpround(fpextend). Differential Revision: https://reviews.llvm.org/D23893 llvm-svn: 280386