bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."	Nirav Dave	2016-09-28	1	-120/+271
	This reverts commit r282600 due to test failures with MCJIT. llvm-svn: 282604
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.	Nirav Dave	2016-09-28	1	-271/+120
	Simplify Consecutive Merge Store Candidate Search. Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. 
CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 282600
*	[DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombine	Michael Kuperstein	2016-09-28	1	-7/+0
	This check currently doesn't seem to do anything useful on any in-tree target: On non-x86, it always evaluates to false, so we never hit the code path that creates the shuffle with zero. On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to query in general, but doesn't make sense if only restricted to zero blends. Differential Revision: https://reviews.llvm.org/D24625 llvm-svn: 282567
*	Add support to optionally limit the size of jump tables.	Evandro Menezes	2016-09-26	1	-12/+26
	Many high-performance processors have a dedicated branch predictor for indirect branches, commonly used with jump tables. As sophisticated as such branch predictors are, they tend to have well defined limits beyond which their effectiveness is hampered or even nullified. One such limit is the number of possible destinations for a given indirect branch that such branch predictors can handle. This patch considers a limit that a target may set to the number of destination addresses in a jump table. Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar <aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>. Differential revision: https://reviews.llvm.org/D21940 llvm-svn: 282412
*	[X86][avx512] Fix bug in masked compress store.	Ayman Musa	2016-09-26	1	-3/+3
	Differential Revision: https://reviews.llvm.org/D23984 llvm-svn: 282381
*	[DAG] Fix incorrect alignment of ext load.	Nirav Dave	2016-09-22	1	-1/+1
	Correctly use alignment size from loaded size, not output value size. Reviewers: jyknight, tstellarAMD, arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23356 llvm-svn: 282177
*	Disable tail calls if there is an swifterror argument	Arnold Schwaighofer	2016-09-21	1	-0/+5
	ISel does not handle them correctly yet, i.e. we crash trying to emit tail call code. radar://28407842 llvm-svn: 282088
*	[AVX-512] Don't lower CVTPD2PS intrinsics to ISD::FP_ROUND with an X86 rounding mode encoding in the second operand.	Craig Topper	2016-09-18	1	-1/+2
	This immediate should only be 0 or 1 and indicates if the truncation loses precision. Also enhance an assert in SelectionDAG::getNode to flag this sort of problem in the future. llvm-svn: 281868
*	[X86][SSE] Improve recognition of uitofp conversions that can be performed as sitofp	Simon Pilgrim	2016-09-18	1	-4/+0
	With D24253 we can now use SelectionDAG::SignBitIsZero with vector operations. This patch uses SelectionDAG::SignBitIsZero to recognise that a zero sign bit means that we can use a sitofp instead of a uitofp (which is not directly supported on pre-AVX512 hardware). While AVX512 does provide support for uitofp, the conversion to sitofp should not cause any regressions. Differential Revision: https://reviews.llvm.org/D24343 llvm-svn: 281852
*	Change the order of the splitted store from high - low to low - high.	Wei Mi	2016-09-18	1	-2/+2
	It is a trivial change which could make the testcase easier to reuse for the store splitting in CodeGenPrepare. llvm-svn: 281846
*	Make analyzeBranch family of instruction names consistent	Matt Arsenault	2016-09-14	1	-1/+1
	analyzeBranch was renamed to use lowercase first; rename the related set to match. llvm-svn: 281506
*	getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits(), round 2 ; NFCI	Sanjay Patel	2016-09-14	2	-5/+4
	llvm-svn: 281498
*	getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	6	-16/+14
	llvm-svn: 281495
*	getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	12	-62/+54
	llvm-svn: 281493
*	getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	3	-46/+33
	llvm-svn: 281490
*	getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	6	-70/+70
	llvm-svn: 281489
*	[CodeGen] Fix invalid shift in mul expansion	Pawel Bylica	2016-09-13	1	-6/+11
	Summary: When expanding mul in type legalization, make sure the type for the shift amount can actually fit the value. This fixes PR30354 https://llvm.org/bugs/show_bug.cgi?id=30354. Reviewers: hfinkel, majnemer, RKSimon Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D24478 llvm-svn: 281403
*	[DAG] Allow build-to-shuffle combine to combine builds from two wide vectors.	Michael Kuperstein	2016-09-13	1	-27/+53
	This allows us to, in some cases, create a vector_shuffle out of a build_vector, when the inputs to the build are extract_elements from two different vectors, at least one of which is wider than the output. (E.g. a <8 x i16> being constructed out of elements from a <16 x i16> and a <8 x i16>). Differential Revision: https://reviews.llvm.org/D24491 llvm-svn: 281402
*	[DAGCombiner] Use APInt directly in (shl (zext (srl x, C)), C) combine range test	Simon Pilgrim	2016-09-13	1	-2/+2
	To avoid assertion, we must ensure that the inner shift constant is within range before calling ConstantSDNode::getZExtValue(). We already know that the outer shift constant is in range. Followup to D23007 llvm-svn: 281362
*	[DAGCombiner] Use APInt directly in (shl (ext (shl x, c1)), c2) combine	Simon Pilgrim	2016-09-13	1	-11/+15
	Fix failure to detect out of range shift constants leading to assert in ConstantSDNode::getZExtValue(). Followup to D23007 llvm-svn: 281354
*	Remove MVT:i1 xor instruction before SELECT. (Performance improvement).	Ayman Musa	2016-09-13	1	-0/+16
	Differential Revision: https://reviews.llvm.org/D23764 llvm-svn: 281308
*	[DAG] Refactor BUILD_VECTOR combine to make it easier to extend. NFCI.	Michael Kuperstein	2016-09-13	1	-123/+156
	This should make it easier to add cases that we currently don't cover, like supporting more kinds of type mismatches and more than 2 input vectors. llvm-svn: 281283
*	[CodeGen] Split out the notions of MI invariance and MI dereferenceability.	Justin Lebar	2016-09-11	5	-16/+20
	Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23371 llvm-svn: 281151
*	Create phi nodes for swifterror values at the end of the phi instructions list	Arnold Schwaighofer	2016-09-09	1	-1/+1
	ISel makes assumptions about the order of phi nodes. rdar://28190150 llvm-svn: 281095
*	[SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar type	Simon Pilgrim	2016-09-09	2	-3/+3
	Fixes issue with rL280927 identified by Mikael Holmén llvm-svn: 281042
*	[SDAGBuilder] Don't create a binary tree for switches in minsize mode	James Molloy	2016-09-08	1	-1/+2
	This bloats codesize - all of the non-leaf nodes are extra code. llvm-svn: 280932
*	[SelectionDAG] Add BUILD_VECTOR support to computeKnownBits and SimplifyDemandedBits	Simon Pilgrim	2016-09-08	2	-0/+47
	Add the ability for computeKnownBits and SimplifyDemandedBits to extract the known zero/one bits from BUILD_VECTOR, returning the known bits that are shared by every vector element. This is an initial step towards determining the sign bits of a vector (PR29079). Differential Revision: https://reviews.llvm.org/D24253 llvm-svn: 280927
*	[DAGCombiner] Enable AND combines of splatted constant vectors	Simon Pilgrim	2016-09-08	1	-7/+7
	Allow AND combines to use a vector splatted constant as well as a constant scalar. Preliminary part of D24253. llvm-svn: 280926
*	Shift-left (ISD::SHL) operation crashes on "DAG Legalization" phase.	Elena Demikhovsky	2016-09-07	1	-21/+27
	https://llvm.org/bugs/show_bug.cgi?id=29058. While legalizing a node, we tried to legalize its operands. If an operand node is replaced during legalization, the user node may be destroyed. Differential Revision: https://reviews.llvm.org/D24244 llvm-svn: 280862
*	Remove unnecessary call to getAllocatableRegClass	Matt Arsenault	2016-09-07	1	-9/+15
	This reapplies r252565 and r252674, effectively reverting r252956. This allows VS_32/VS_64 to be unallocatable like they should be. llvm-svn: 280783
*	[DAGCombine] More fixups to SETCC legality checking (visitANDLike/visitORLike)	Hal Finkel	2016-09-06	1	-28/+46
	I might have called this "r246507, the sequel". It fixes the same issue, as the issue has cropped up in a few more places. The underlying problem is that isSetCCEquivalent can pick up select_cc nodes with a result type that is not legal for a setcc node to have, and if we use that type to create new setcc nodes, nothing fixes that (and so we've violated the contract that the infrastructure has with the backend regarding setcc node types). Fixes PR30276. For convenience, here's the commit message from r246507, which explains the problem in greater detail: [DAGCombine] Fixup SETCC legality checking. SETCC is one of those special node types for which operation actions (legality, etc.) are keyed off of an operand type, not the node's value type. This makes sense because the value type of a legal SETCC node is determined by its operands' value type (via the TLI function getSetCCResultType). When the SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value type, or directly with the value type provided by TLI.getSetCCResultType. The first problem being fixed here is that DAGCombine had several places querying TLI.isOperationLegal on SETCC, but providing the return of getSetCCResultType, instead of the operand type directly. This does not mean what the author thought, and "luckily", most in-tree targets have SETCC with Custom lowering, instead of marking them Legal, so these checks return false anyway. 
The second problem being fixed here is that two of the DAGCombines could create SETCC nodes with arbitrary (integer) value types; specifically, those that would simplify: (setcc a, b, op1) and\|or (setcc a, b, op2) -> setcc a, b, op3 (which is possible for some combinations of (op1, op2)) If the operands of the and\|or node are actual setcc nodes, then this is not an issue (because the and\|or must share the same type), but, the relevant code in DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls DAGCombiner::isSetCCEquivalent on each operand, and that function will recognise setcc-like select_cc nodes with other return types. And, thus, when creating new SETCC nodes, we need to be careful to respect the value-type constraint. This is even true before type legalization, because it is quite possible for the SELECT_CC node to have a legal type that does not happen to match the corresponding TLI.getSetCCResultType type. To be explicit, there is nothing that later fixes the value types of SETCC nodes (if the type is legal, but does not happen to match TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to work only because, either MVT::i1 is not legal, or it is what TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change, however. For the time being, restrict the relevant transformations to produce only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1 prior to type legalization). Fixes PR24636. llvm-svn: 280767
*	[SelectionDAG] Simplify extract_subvector( insert_subvector ( Vec, In, Idx ), Idx ) -> In	Simon Pilgrim	2016-09-06	1	-0/+6
	If we are extracting a subvector that has just been inserted then we should just use the original inserted subvector. This has come up in several x86 shuffle lowering cases where we are crossing 128-bit lanes. Differential Revision: https://reviews.llvm.org/D24254 llvm-svn: 280715
*	Split the store of a wide value merged from an int-fp pair into multiple stores.	Wei Mi	2016-09-02	1	-0/+103
	For the store of a wide value merged from a pair of values, especially an int-fp pair, sometimes it is more efficient to split it into separate narrow stores, which can remove the bitwise instructions or sink them to colder places. Now the feature is only enabled on the x86 target, and only the store of an int-fp pair is split. It is possible that the application scope gets extended with perf evidence support in the future. Differential Revision: https://reviews.llvm.org/D22840 llvm-svn: 280505
*	[DAGcombiner] Fix incorrect sinking of a truncate into the operand of a shift.	Andrea Di Biagio	2016-09-02	1	-3/+3
	This fixes a regression introduced by revision 268094. Revision 268094 added the following dag combine rule: // trunc (shl x, K) -> shl (trunc x), K => K < vt.size / 2 That rule converts a truncate of a shift-by-constant into a shift of a truncated value. We do this only if the shift count is less than half the size in bits of the truncated value (K < vt.size / 2). The problem is that the constraint on the shift count is incorrect, so the rule doesn't work well in some cases involving vector types. The combine rule should have been written instead like this: // trunc (shl x, K) -> shl (trunc x), K => K < vt.getScalarSizeInBits() Basically, if K is smaller than the "scalar size in bits" of the truncated value then we know that by "sinking" the truncate into the operand of the shift we would never accidentally make the shift undefined. This patch fixes the check on the shift count, and adds test cases to make sure that we don't regress the behavior. Differential Revision: https://reviews.llvm.org/D24154 llvm-svn: 280482
*	[SelectionDAGBuilder] Add const to relevant places	Aditya Kumar	2016-09-01	2	-16/+17
	Reviewers: hans, evandro, sebpop Differential Revision: https://reviews.llvm.org/D24112 llvm-svn: 280430
*	[Legalizer] Don't throw away false low half when expanding GT/LT SETCC	Michael Kuperstein	2016-09-01	1	-9/+9
	When expanding a SETCC for which the low half is known to evaluate to false, we can only throw it away for LT/GT comparisons, not LE/GE. This fixes PR29170. Differential Revision: https://reviews.llvm.org/D24151 llvm-svn: 280424
*	[SelectionDAG] Generate vector_shuffle nodes for undersized result vector sizes	Michael Kuperstein	2016-09-01	1	-45/+63
	Prior to this, we could generate a vector_shuffle from an IR shuffle when the size of the result was exactly the sum of the sizes of the input vectors. If the output vector was narrower - e.g. a <12 x i8> being formed by a shuffle with two <8 x i8> inputs - we would lower the shuffle to a sequence of extracts and inserts. Instead, we can form a larger vector_shuffle, and then extract a subvector of the right size - e.g. shuffle the two <8 x i8> inputs into a <16 x i8> and then extract a <12 x i8>. This also includes a target-specific X86 combine that in the presence of AVX2 combines: (vector_shuffle <mask> (concat_vectors t1, undef) (concat_vectors t2, undef)) into: (vector_shuffle <mask> (concat_vectors t1, t2), undef) in cases where this allows us to form VPERMD/VPERMQ. (This is not a separate commit, as that pattern does not appear without the DAGBuilder change.) llvm-svn: 280418
*	Rename some variables to have meaningful names. NFC.	Michael Kuperstein	2016-09-01	1	-29/+29
	llvm-svn: 280391
*	[DAGCombine] Don't fold a trunc if it feeds an anyext	Michael Kuperstein	2016-09-01	1	-0/+4
	Legalization tends to create anyext(trunc) patterns. This should always be combined - into either a single trunc, a single ext, or nothing if the types match exactly. But if we happen to combine the trunc first, we may pull the trunc away from the anyext or make it implicit (e.g. the truncate(extract) -> extract(bitcast) fold). To prevent this, we can avoid doing the fold, similarly to how we already handle fpround(fpextend). Differential Revision: https://reviews.llvm.org/D23893 llvm-svn: 280386
*	Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPC	Hal Finkel	2016-09-01	3	-12/+20
	LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is lowered using: ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET) where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86, FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not work for PowerPC. Because of the way that the stack layout works, the canonical frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC (there is a lower save-area offset as well), so it is not just a matter of implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its semantics -- we can do that, since it is currently used only for @llvm.eh.dwarf.cfa lowering, but it is better to directly lower the CFA construct itself (since it can be easily represented as a fixed-offset FrameIndex)). Mips currently does this, but by using a custom lowering for ADD that specifically recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern. This change introduces an ISD::EH_DWARF_CFA node, which by default expands using the existing logic, but can be directly lowered by the target. Mips is updated to use this method (which simplifies its implementation, and I suspect makes it more robust), and PowerPC is updated to do the same. Fixes PR26761. Differential Revision: https://reviews.llvm.org/D24038 llvm-svn: 280350
*	[statepoints][experimental] Add support for live-in semantics of values in deopt bundles	Philip Reames	2016-08-31	1	-5/+35
	This is a first step towards supporting deopt value lowering and reporting entirely with the register allocator. I hope to build on this in the near future to support live-on-return semantics, but I have a use case which allows me to test and investigate code quality with just the live-in semantics so I've chosen to start there. For those curious, my use case is our implementation of the "__llvm_deoptimize" function we bind to @llvm.deoptimize. I'm choosing not to hard code that fact in the patch and instead make it configurable via function attributes. The basic approach here is modelled on what is done for the "Live In" values on stackmaps and patchpoints. (A secondary goal here is to remove one of the last barriers to merging the pseudo instructions.) We start by adding the operands directly to the STATEPOINT SDNode. Once we've lowered to MI, we extend the remat logic used by the register allocator to fold virtual register uses into StackMap::Indirect entries as needed. This does rely on the fact that the register allocator rematerializes. If it didn't along some code path, we could end up with more vregs than physical registers and fail to allocate. Today, we only fold in the register allocator. This can create some weird effects when combined with arguments passed on the stack because we don't fold them appropriately. I have an idea how to fix that, but it needs this patch in place to work on that effectively. (There's some weird interaction with the scheduler as well, more investigation needed.) My near term plan is to land this patch off-by-default, experiment in my local tree to identify any correctness issues and then start fixing codegen problems one by one as I find them. Once I have the live-in lowering fully working (both correctness and code quality), I'm hoping to move on to the live-on-return semantics. 
Note: I don't have any known miscompiles with this patch enabled, but I'm pretty sure I'll find at least a couple. Thus, the "experimental" tag and the fact it's off by default. Differential Revision: https://reviews.llvm.org/D24000 llvm-svn: 280250
*	Propagate TBAA info in SelectionDAG::getIndexedLoad	Krzysztof Parzyszek	2016-08-29	1	-1/+2
	Patch by Pranav Bhandarkar. llvm-svn: 279998
*	Fixed a bug in type legalizer for masked gather.	Igor Breger	2016-08-29	1	-1/+9
	The problem occurs when the node is not updated in place: UpdateNodeOperation() returns a node that already exists. In this case an assert fails in PromoteIntegerOperand(); N has 2 results (val + chain). Differential Revision: http://reviews.llvm.org/D23756 llvm-svn: 279961
*	[SelectionDAG] Do not run the ISel process on already selected code.	Quentin Colombet	2016-08-26	1	-0/+4
	Right now, this cannot happen, but with the fall back path of GlobalISel it will show up eventually. llvm-svn: 279877
*	Reuse an SDLoc throughout a function. NFC.	Michael Kuperstein	2016-08-25	1	-18/+12
	llvm-svn: 279767
*	[SelectionDAG] Use a union of bitfield structs for SDNode::SubclassData.	Justin Lebar	2016-08-23	1	-43/+18
	Summary: This greatly simplifies our handling of SDNode::SubclassData. NFC, hopefully. :) See discussion in D23035 for discussion about the design API of these bitfields. Reviewers: chandlerc Subscribers: llvm-commits, rnk Differential Revision: https://reviews.llvm.org/D23036 llvm-svn: 279537
*	Fix some more asserts after r279466.	Pete Cooper	2016-08-23	2	-2/+2
	That commit added a new version of Intrinsic::getName which should only be called when the intrinsic has no overloaded types. There are several debugging paths, such as SDNode::dump which are printing the name of the intrinsic but don't have the overloaded types. These paths should be ok to just print the name instead of crashing. The fix here is ultimately to just add a 'None' second argument as that calls the overload capable getName, which is less efficient, but this is a debugging path anyway, and not perf critical. Thanks to Björn Pettersson for pointing out that there were more crashes. llvm-svn: 279528
*	Use SDValue::getOpcode() helper instead of via SDValue::getNode()	Simon Pilgrim	2016-08-20	1	-2/+2
	llvm-svn: 279381
*	[CodeGen] Fix a trivial type conversion bug dating back to pre-2008	James Molloy	2016-08-19	1	-1/+1
	The heuristic above this code is incredibly suspect, but disregarding that, it mutates the cast opcode, so we need to check the mutated opcode later to see if we need to emit an AssertSext or AssertZext node. Fixes PR29041. llvm-svn: 279223
*	Replace a few more "fall through" comments with LLVM_FALLTHROUGH	Justin Bogner	2016-08-17	6	-15/+14
	Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970