bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	SLSR: Pass address space to isLegalAddressingMode	Matt Arsenault	2015-06-11	2	-0/+109
\| \| \| \| \| \| \| \| \|	This only updates one of the uses. The other is used in cases that may never touch memory, so I'm not sure why this is even calling it. Perhaps there should be a new, similar hook for such cases or pass -1 for unknown address space. llvm-svn: 239540
*	ArgumentPromotion: Drop sret attribute on functions that are only called ↵	Peter Collingbourne	2015-06-10	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	directly. If the first argument to a function is a 'this' argument and the second has the sret attribute, the ArgumentPromotion pass may promote the 'this' argument to more than one argument, violating the IR constraint that 'sret' may only be applied to the first or second argument. Although this IR constraint is arguably unnecessary, it highlighted the fact that ArgPromotion does not need to preserve this attribute. Dropping the attribute reduces register pressure in the backend by avoiding the register copy required by sret. Because sret implies noalias, we also replace the former with the latter. Differential Revision: http://reviews.llvm.org/D10353 llvm-svn: 239488
*	[GVN] Set proper debug locations for some instructions created by GVN.	Alexey Samsonov	2015-06-10	1	-12/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Determining proper debug locations for instructions created in PHITransAddr is tricky. We use a simple approach here and simply copy debug locations from the instructions computing the load address to "corresponding" instructions re-creating the address computation in predecessor basic blocks. This may not always be correct, given all the rearrangement and simplification going on, and debug locations may jump around a lot, as the basic blocks we copy locations between may be very far from each other. Still, this should work well in most simple cases (e.g. when the chain of address-computing instructions is short, or our mapping turns out to be 1-to-1), and we desire to have some reasonable debug locations associated with newly inserted instructions. See http://reviews.llvm.org/D10351 review thread for more details. Test Plan: regression test suite Reviewers: spatel, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10351 llvm-svn: 239479
*	[Statepoints] Add test case to check that statepoint is marked with ↵	Igor Laevsky	2015-06-10	1	-0/+24
\| \| \| \| \| \| \| \|	Throwable attribute. Differential Revision: http://reviews.llvm.org/D10215 llvm-svn: 239473
*	[BasicBlockUtils] Set debug locations for instructions created in ↵	Alexey Samsonov	2015-06-09	1	-0/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	SplitBlockPredecessors. Test Plan: regression test suite Reviewers: eugenis, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10343 llvm-svn: 239438
*	MergeFunctions: Don't replace a weak function use by another equivalent weak ↵	Arnold Schwaighofer	2015-06-09	1	-9/+39
\| \| \| \| \| \| \| \| \| \|	function. We don't know whether the weak function's definition is the definitive definition. rdar://21303727 llvm-svn: 239422
*	MergeFunctions: Impose a total order on the replacement of functions	Arnold Schwaighofer	2015-06-09	2	-4/+34
\| \| \| \| \| \| \| \| \| \| \| \| \|	We don't want to replace function A by Function B in one module and Function B by Function A in another module. If these functions are marked with linkonce_odr we would end up with a function stub calling B in one module and a function stub calling A in another module. If the linker decides to pick these two we will have two stubs calling each other. rdar://21265586 llvm-svn: 239367
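The stub cycle described above can be sketched in a few lines. This is a toy model, not the MergeFunctions implementation: ordering by name is an assumption made for illustration (the commit message does not say which order is used), and `keep_first_seen` and `keep_smallest` are hypothetical helpers.

```python
def keep_first_seen(equivalent_functions):
    """Order-dependent choice: keep whichever function the module visits first."""
    return equivalent_functions[0]

def keep_smallest(equivalent_functions):
    """Choice under a total order (here: by name, purely as an illustration)."""
    return min(equivalent_functions)

# Two modules see the same pair of equivalent functions in different orders.
module1 = ["A", "B"]
module2 = ["B", "A"]

# Without a total order, module1 turns B into a stub calling A, while
# module2 turns A into a stub calling B. Linking stub A with stub B
# yields two stubs calling each other.
unordered = (keep_first_seen(module1), keep_first_seen(module2))

# With a total order both modules keep the same function.
ordered = (keep_smallest(module1), keep_smallest(module2))
```

Any order works as long as every module computes the same one for the same set of functions.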
*	[LoopVectorize] Teach Loop Vectorizer about interleaved memory accesses.	Hao Liu	2015-06-08	2	-20/+484
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector. E.g.
	    for (i = 0; i < N; i+=2) {
	      a = A[i];      // load of even element
	      b = A[i+1];    // load of odd element
	      ...            // operations on a, b, c, d
	      A[i] = c;      // store of even element
	      A[i+1] = d;    // store of odd element
	    }
	The loads of even and odd elements are identified as an interleave load group, which will be transformed into vectorized IRs like:
	    %wide.vec = load <8 x i32>, <8 x i32>* %ptr
	    %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
	    %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
	The stores of even and odd elements are identified as an interleave store group, which will be transformed into vectorized IRs like:
	    %interleaved.vec = shufflevector <4 x i32> %vec.even, <4 x i32> %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
	    store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
	This optimization is currently disabled by default. To try it, add '-enable-interleaved-mem-accesses=true'. llvm-svn: 239291
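The shuffle masks in the message above can be checked with a small model. This is only a sketch of what the masks select, using Python lists in place of vectors; `shufflevector` here is a hypothetical helper that mimics the IR instruction's indexing over the concatenation of its two operands, and `None` stands in for `undef`.

```python
def shufflevector(a, b, mask):
    """Pick elements from the concatenation of a and b, as the IR instruction does."""
    both = a + b
    return [both[i] for i in mask]

UNDEF = [None] * 8  # placeholder for the undef operand

# One wide load covering four (even, odd) pairs.
wide_vec = [10, 11, 20, 21, 30, 31, 40, 41]

# Interleave load group: split the wide vector by stride.
vec_even = shufflevector(wide_vec, UNDEF, [0, 2, 4, 6])
vec_odd = shufflevector(wide_vec, UNDEF, [1, 3, 5, 7])

# Interleave store group: zip the two halves back together.
interleaved_vec = shufflevector(vec_even, vec_odd, [0, 4, 1, 5, 2, 6, 3, 7])
```

The store mask is the inverse of the two load masks, so an unmodified round trip reproduces the original wide vector.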
*	SeparateConstOffsetFromGEP: Pass address space to isLegalAddressingMode	Matt Arsenault	2015-06-07	2	-0/+97
\| \| \| \|	llvm-svn: 239262
*	[LoopUnroll] Fix truncation bug in canUnrollCompletely.	Sanjoy Das	2015-06-06	1	-0/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: canUnrollCompletely takes `unsigned` values for `UnrolledCost` and `RolledDynamicCost` but is passed in `uint64_t`s that are silently truncated. Because of this, when `UnrolledSize` is a large integer that has a small remainder with UINT32_MAX, LLVM tries to completely unroll loops with high trip counts. Reviewers: mzolotukhin, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10293 llvm-svn: 239218
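A minimal sketch of the silent truncation, assuming `unsigned` is 32 bits wide. The threshold and cost values are made up for illustration and `truncate_to_unsigned` is a hypothetical helper.

```python
UINT32_MASK = 0xFFFFFFFF

def truncate_to_unsigned(value):
    """Model passing a uint64_t where a 32-bit unsigned is expected."""
    return value & UINT32_MASK

threshold = 150                     # hypothetical unroll threshold
unrolled_cost = (1 << 32) + 40      # huge 64-bit cost estimate

truncated_cost = truncate_to_unsigned(unrolled_cost)

# The real cost is far above the threshold, but the truncated one is not.
looks_cheap = truncated_cost <= threshold
really_cheap = unrolled_cost <= threshold
```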
*	[CVP] Don't assume Constants of type i1 can be known to be true or false	David Majnemer	2015-06-06	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	CVP wants to analyze the condition operand of a select along an edge. It succeeds in getting back a Constant but not a ConstantInt. Instead, it gets a ConstantExpr. It then assumes that the Constant must be equal to false because it isn't equal to true. Instead, perform an additional comparison. This fixes PR23752. llvm-svn: 239217
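The faulty inference is a two-valued test applied to a three-valued situation. A rough model, where `UNKNOWN` is a stand-in for a ConstantExpr whose value is not known at this point (all names are illustrative, not CVP's API):

```python
UNKNOWN = "constant-expression"  # stand-in for a ConstantExpr of type i1

def buggy_is_false(constant):
    """Old reasoning: anything that is not 'true' must be 'false'."""
    return constant is not True

def is_known_true(constant):
    return constant is True

def is_known_false(constant):
    """Fixed reasoning: compare against 'false' explicitly."""
    return constant is False
```

With the explicit comparison, `UNKNOWN` is neither known true nor known false, so no conclusion is drawn from it.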
*	[InstCombine] Don't miscompile select to poison	David Majnemer	2015-06-06	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we have (select a, b, c), it is sometimes valid to simplify this to a single select operand. However, doing so is only valid if the computation doesn't inject poison into the computation. It might be helpful to consider the following example: (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN) The select is equivalent to (add %i, 1) but not (add nsw %i, 1). Self hosting on x86_64 revealed that this occurs very, very rarely so bailing out is hopefully pretty reasonable. llvm-svn: 239215
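The example in the message above can be replayed with a toy model of 32-bit arithmetic. This is a sketch under stated assumptions: `POISON` is a sentinel standing in for LLVM's poison value, and the helpers are hypothetical, not LLVM APIs.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1
POISON = object()  # sentinel for a poison result

def add(a, b):
    """Plain 32-bit add: wraps on overflow."""
    return (a + b + 2**31) % 2**32 - 2**31

def add_nsw(a, b):
    """32-bit add with nsw: signed overflow produces poison."""
    total = a + b
    return total if INT_MIN <= total <= INT_MAX else POISON

def select(cond, if_true, if_false):
    return if_true if cond else if_false

def original(i):
    # (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)
    return select(i != INT_MAX, add_nsw(i, 1), INT_MIN)
```

For `i == INT_MAX` the select yields `INT_MIN`, which the wrapping `add` also produces, while `add_nsw` alone would yield poison.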
*	Revert "[InstCombine] Rephrase fix to SimplifyWithOpReplaced"	Renato Golin	2015-06-05	1	-10/+0
\| \| \| \| \| \| \| \| \|	This reverts commit r239141. This commit was an attempt to reintroduce a previous patch that broke many self-hosting bots with clang timeouts, but it still has slowdown issues, at least on ARM, increasing the compilation time (stage 2, clang's) by 5x. llvm-svn: 239175
*	[InstCombine] Fix PR23751.	Sanjoy Das	2015-06-05	1	-0/+13
\| \| \| \| \| \|	PR23751 was caused by a missing ``break;`` in r234388. llvm-svn: 239171
*	[Unroll] Rework the naming and structure of the new unroll heuristics.	Chandler Carruth	2015-06-05	2	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new naming is (to me) much easier to understand. Here is a summary of the new state of the world: - 'Threshold' is the threshold for full unrolling. It is measured against the estimated unrolled cost as computed by getUserCost in TTI (or CodeMetrics, etc). We will exceed this threshold when unrolling loops where unrolling exposes a significant degree of simplification of the logic within the loop. - 'PercentDynamicCostSavedThreshold' is the percentage of the loop's estimated dynamic execution cost which needs to be saved by unrolling to apply a discount to the estimated unrolled cost. - 'DynamicCostSavingsDiscount' is the discount applied to the estimated unrolling cost when the dynamic savings are expected to be high. When actually analyzing the loop, we now produce both an estimated unrolled cost, and an estimated rolled cost. The rolled cost is notably a dynamic estimate based on our analysis of the expected execution of each iteration. While we're still working to build up the infrastructure for making these estimates, to me it is much more clear how to make them better when they have reasonably descriptive names. For example, we may want to apply estimated (from heuristics or profiles) dynamic execution weights to the dynamic cost estimates. If we start doing that, we would also need to track the static unrolled cost and the dynamic unrolled cost, as only the latter could reasonably be weighted by profile information. This patch is sadly not without functionality change for the new unroll analysis logic. Buried in the heuristic management were several things that surprised me. For example, we never subtracted the optimized instruction count off when comparing against the unroll heuristics! I don't know if this just got lost somewhere along the way or what, but with the new accounting of things, this is much easier to keep track of and we use the post-simplification cost estimate to compare to the thresholds, and use the dynamic cost reduction ratio to select whether we can exceed the baseline threshold. The old values of these flags also don't necessarily make sense. My impression is that none of these thresholds or discounts have been tuned yet, and so they're just arbitrary placeholder numbers. As such, I've not bothered to adjust for the fact that this is now a discount and not a two-tier threshold model. We need to tune all these values once the logic is ready to be enabled. Differential Revision: http://reviews.llvm.org/D9966 llvm-svn: 239164
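One possible reading of how the three knobs named above combine, as a hedged sketch. The function name, the way the saved percentage is computed, and the numbers used with it are assumptions for illustration; the exact comparison in the unroll pass may differ.

```python
def can_unroll_completely(unrolled_cost, rolled_dynamic_cost, threshold,
                          percent_dynamic_cost_saved_threshold,
                          dynamic_cost_savings_discount):
    """Decide full unrolling from the estimated costs (illustrative model)."""
    # Cheap enough outright: under the baseline threshold.
    if unrolled_cost <= threshold:
        return True
    if rolled_dynamic_cost <= 0:
        return False
    # Assumed definition: share of the rolled dynamic cost that unrolling saves.
    percent_saved = ((rolled_dynamic_cost - unrolled_cost) * 100
                     // rolled_dynamic_cost)
    # If enough is saved, apply the discount before comparing to the threshold.
    if percent_saved >= percent_dynamic_cost_saved_threshold:
        return unrolled_cost - dynamic_cost_savings_discount <= threshold
    return False
```

Under this model the discount only matters once the saved percentage clears its own threshold, which matches the "discount, not two-tier threshold" wording in the message.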
*	[LoopVectorize] Don't crash on zero-sized types in isInductionPHI	David Majnemer	2015-06-05	1	-0/+27
\| \| \| \| \| \| \| \| \|	isInductionPHI wants to calculate the stride based on the pointee size. However, this is not possible when the pointee is zero sized. This fixes PR23763. llvm-svn: 239143
*	[InstCombine] Rephrase fix to SimplifyWithOpReplaced	David Majnemer	2015-06-05	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I don't have the IR which is causing the build bot breakage but I can postulate as to why they are timing out: 1. SimplifyWithOpReplaced was stripping flags from the simplified value. 2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because its simplification wasn't correct. 3. InstCombine would revisit the add instruction and note that it can rederive the flags. 4. By modifying the value, we chose to revisit instructions which reuse the value. One of the instructions is the original select, causing LLVM to never reach fixpoint. Instead, strip the flags only when we are sure we are going to perform the simplification. llvm-svn: 239141
*	Revert "[InstCombine] Don't miscompile safe increment idiom"	Daniel Jasper	2015-06-05	1	-10/+0
\| \| \| \| \| \| \| \| \|	This is breaking a lot of build bots and is causing very long-running compiles (infinite loops)? Likely, we shouldn't return nullptr? llvm-svn: 239139
*	[InstCombine] Don't miscompile safe increment idiom	David Majnemer	2015-06-04	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \|	We cleverly handle cases where computation done in one argument of a select instruction is suitable for the other operand, thus obviating the need of the select and the comparison. However, the other operand cannot have flags. This fixes PR23757. llvm-svn: 239115
*	Make the test introduced in r239015 more targeted.	David Majnemer	2015-06-04	1	-29/+0
\| \| \| \| \| \| \| \|	We don't need to go through LSR to trigger this bug. Instead, hand-craft a tricky GEP and get the constant folder to hack on it when parsing the IR. llvm-svn: 239017
*	[ConstantFold] Don't skip the first gep index when folding geps	David Majnemer	2015-06-04	1	-0/+29
\| \| \| \| \| \| \| \| \|	We neglected to check if the first index made the GEP ineligible for 'inbounds'. This fixes PR23753. llvm-svn: 239015
*	[RewriteStatepointsForGC] Strip deref info after rewriting.	Sanjoy Das	2015-06-02	1	-0/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Once a gc.statepoint has been rewritten to relocate live references, the SSA values represent physical pointers instead of logical references. Logical dereferenceability does not imply physical dereferenceability and after RewriteStatepointsForGC has run any attributes that imply dereferenceability of the logical references need to be stripped. This current approach is conservative, and can be made more precise later if needed. For starters, we need to strip dereferenceable attributes only from pointers that live in the GC address space. Reviewers: reames, pgavlin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10105 llvm-svn: 238883
*	Teach the IR Sink pass to (conservatively) respect convergent annotations.	Owen Anderson	2015-06-01	1	-0/+24
\| \| \| \|	llvm-svn: 238762
*	[PHITransAddr] Don't translate unreachable values	David Majnemer	2015-06-01	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \|	Unreachable values may use themselves in strange ways due to their dominance property. Attempting to translate through them can lead to infinite recursion, crashing LLVM. Instead, claim that we weren't able to translate the value. This fixes PR23096. llvm-svn: 238702
*	[IR] fptrunc-of-fptrunc isn't an EliminableCastPair.	Ahmed Bougacha	2015-05-29	1	-0/+8
\| \| \| \| \| \| \|	Double and single rounding can produce different results. This is the IR counterpart to r228911. llvm-svn: 238531
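Double rounding can be reproduced with Python's `struct` module, which can round a double to single precision (`'f'`) and to half precision (`'e'`). This sketch uses double, float and half as the three widths; the input value is chosen by hand so that the intermediate rounding lands exactly on a tie.

```python
import struct

def to_float32(x):
    """Round a Python float (double) to single precision."""
    return struct.unpack("f", struct.pack("f", x))[0]

def to_float16(x):
    """Round a Python float (double) to half precision."""
    return struct.unpack("e", struct.pack("e", x))[0]

# Slightly above the midpoint between the half-precision values 1 and 1 + 2**-10.
x = 1 + 2**-11 + 2**-30

direct = to_float16(x)               # one rounding: just above the tie, rounds up
twice = to_float16(to_float32(x))    # float drops 2**-30, leaving an exact tie
```

Rounding once gives `1 + 2**-10`, while rounding through float first gives `1.0`, so collapsing two truncations into one changes the result.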
*	Enable exitValue rewrite only when the cost of expansion is low.	Wei Mi	2015-05-28	3	-2/+77
\| \| \| \| \| \| \| \|	The patch evaluates the expansion cost of exitValue in indVarSimplify pass, and only does the rewriting when the expansion cost is low or loop can be deleted with the rewriting. It provides an option "-replexitval=" to control the default aggressiveness of the exitvalue rewriting. It also fixes some missing cases in SCEVExpander::isHighCostExpansionHelper to enhance the evaluation of SCEV expansion cost. Differential Revision: http://reviews.llvm.org/D9800 llvm-svn: 238507
*	[InstCombine] Fold IntToPtr and PtrToInt into preceding loads.	David Majnemer	2015-05-28	2	-0/+157
\| \| \| \| \| \| \| \| \| \| \|	Currently we only fold a BitCast into a Load when the BitCast is its only user. Do the same for any no-op cast. Differential Revision: http://reviews.llvm.org/D9152 llvm-svn: 238452
*	[Reassociate] Canonicalizing 'x [+-] (-Constant * y)' isn't always a win	David Majnemer	2015-05-28	2	-14/+12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Canonicalizing 'x [+-] (-Constant * y)' is not a win if we don't know we will open up CSE opportunities. If the multiply was 'nsw', then negating 'y' requires us to clear the 'nsw' flag. If this is actually worth pursuing, it is probably more appropriate to do so in GVN or EarlyCSE. This fixes PR23675. llvm-svn: 238397
*	[NaryReassociate] Run EarlyCSE after NaryReassociate	Jingyue Wu	2015-05-28	1	-9/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch made two improvements to NaryReassociate and the NVPTX pipeline 1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common expressions. 2. When adding an instruction to SeenExprs, maps both the SCEV before and after reassociation to that instruction. Test Plan: updated @reassociate_gep_nsw in nary-gep.ll Reviewers: meheff, broune Reviewed By: broune Subscribers: dberlin, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9947 llvm-svn: 238396
*	[inliner] Fix the early-exit of the inline cost analysis to correctly	Chandler Carruth	2015-05-27	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	model the dense vector instruction bonuses. Previously, this code really didn't effectively compute the density of inlined vector instructions and apply the intended inliner bonus. It would try to compute it repeatedly while analyzing the function and didn't handle the case where future vector instructions would tip the scales back towards the bonus. Instead, speculatively apply all possible bonuses to the threshold initially. Once we know that a certain bonus can not be applied, subtract it. This should delay early bailout enough to get much more consistent results without actually causing us to analyze huge swaths of code. I expect some (hopefully mild) compile time hit here, and some swings in performance, but this was definitely the intended behavior of these bonuses. This also dramatically simplifies the computation of the bonuses to not interact with each other in confusing ways. The previous code didn't do a good job of this and the values for bonuses may be surprising but are at least now clearly written in the code. Finally, fix code to be in line with comments and use zero as the bailout condition. Patch by Easwaran Raman, with some comment tweaks by me to try and further clarify what is going on with this code. http://reviews.llvm.org/D8267 llvm-svn: 238276
*	Forgot to add lit.local.cfg for new R600 directory	Matt Arsenault	2015-05-26	1	-0/+3
\| \| \| \|	llvm-svn: 238218
*	CodeGenPrepare: Don't match addressing modes through addrspacecast	Matt Arsenault	2015-05-26	1	-0/+49
\| \| \| \| \| \| \|	This was resulting in the addrspacecast being removed and incorrectly replaced with a ptrtoint when sinking. llvm-svn: 238217
*	Remove conflicting attributes before adding deduced readonly/readnone	Bjorn Steinbrink	2015-05-25	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In case of functions that have a pointer argument and only pass it to each other, the function attributes pass deduces that the pointer should get the readnone attribute, but fails to remove a readonly attribute that may already have been present. Reviewers: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9995 llvm-svn: 238152
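The fix amounts to dropping the conflicting attribute before adding the deduced one. A small model with attribute sets; the `CONFLICTS` table and `add_deduced_attribute` are hypothetical names for illustration, not the FunctionAttrs code.

```python
# Attributes that must not appear together on one argument (illustrative subset).
CONFLICTS = {
    "readnone": {"readonly"},
    "readonly": {"readnone"},
}

def add_deduced_attribute(attributes, deduced):
    """Return a new attribute set with `deduced` added and its conflicts removed."""
    return (set(attributes) - CONFLICTS.get(deduced, set())) | {deduced}

before = {"nocapture", "readonly"}
after = add_deduced_attribute(before, "readnone")
```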
*	Correct a mistaken comment from 238071 [NFC]	Philip Reames	2015-05-23	1	-3/+2
\| \| \| \|	llvm-svn: 238074
*	Extend EarlyCSE to handle basic cases from JumpThreading and CVP	Philip Reames	2015-05-22	2	-0/+281
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch extends EarlyCSE to take advantage of the information that a controlling branch gives us about the value of a Value within this and dominated basic blocks. If the current block has a single predecessor with a controlling branch, we can infer what the branch condition must have been to execute this block. The actual change to support this is downright simple because EarlyCSE's existing scoped hash table logic deals with most of the complexity around merging. The patch actually implements two optimizations. 1) The first is analogous to JumpThreading in that it enables EarlyCSE's CSE handling to fold branches which are exactly redundant due to a previous branch to branches on constants. (It doesn't actually replace the branch or change the CFG.) This is pretty clearly a win since it enables substantial CFG simplification before we start trying to inline. 2) The second is analogous to CVP in that it exploits the knowledge gained to replace dominated uses of the original value. EarlyCSE does not otherwise reason about specific uses, so this is the more arguable one. It does enable further simplification and constant folding within the rest of the visit by EarlyCSE. In both cases, the added code only handles the easy dominance based case of each optimization. The general case is deferred to the existing passes. Differential Revision: http://reviews.llvm.org/D9763 llvm-svn: 238071
*	[InstCombine] Don't eagerly propagate nsw for AB+AC => A*(B+C)	David Majnemer	2015-05-22	1	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	InstCombine transforms A *nsw B +nsw A *nsw C to A *nsw (B + C). This is incorrect -- e.g. if A = -1, B = 1, C = INT_SMAX. Then nothing in the LHS overflows, but the multiplication in RHS overflows. We need to first make sure that we won't multiply by INT_SMAX + 1. Test case `add_of_mul` contributed by Sanjoy Das. This fixes PR23635. Differential Revision: http://reviews.llvm.org/D9629 llvm-svn: 238066
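The counterexample from the message can be replayed with a toy 32-bit model. `POISON` is a sentinel standing in for the result of an nsw operation that overflows, and the helper names are illustrative only.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1
POISON = object()  # sentinel: an nsw operation overflowed

def nsw(value):
    """Result of a no-signed-wrap operation on the exact mathematical value."""
    return value if INT_MIN <= value <= INT_MAX else POISON

def add_wrap(a, b):
    """Plain 32-bit add without nsw: wraps on overflow."""
    return (a + b + 2**31) % 2**32 - 2**31

A, B, C = -1, 1, INT_MAX

# Left-hand side: A *nsw B +nsw A *nsw C. No step overflows.
ab = nsw(A * B)
ac = nsw(A * C)
lhs = nsw(ab + ac)

# Right-hand side: A *nsw (B + C). B + C wraps to INT_MIN,
# and -1 * INT_MIN does not fit in 32 bits.
rhs = nsw(A * add_wrap(B, C))
```

The left-hand side is well defined (`INT_MIN`), while the right-hand side overflows in the multiplication, so the nsw flag cannot be kept on the factored form.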
*	[InstSimplify] Handle some overflow intrinsics in InstSimplify	David Majnemer	2015-05-22	2	-2/+26
\| \| \| \| \| \| \| \| \|	This change does a few things: - Move some InstCombine transforms to InstSimplify - Run SimplifyCall from within InstCombine::visitCallInst - Teach InstSimplify to fold [us]mul_with_overflow(X, undef) to 0. llvm-svn: 237995
*	[LICM] Sinking doesn't involve the preheader	Philip Reames	2015-05-22	1	-0/+50
\| \| \| \| \| \|	PR23608 pointed out that using the preheader to gain a context instruction isn't always legal because a loop might not have a preheader. When looking into that, I realized that using the preheader to determine legality for sinking is questionable at best. Given no test covers that case and the original commit didn't seem to intend it, I restructured the code to only ask context sensitive queries for hoisting of loads and stores. This is effectively a partial revert of 237593. llvm-svn: 237985
*	[NaryReassoc] reassociate GEP for CSE	Jingyue Wu	2015-05-21	1	-0/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: x = &a[i]; y = &a[i + j]; => y = x + j; along with some refactoring work such as extracting method findClosestMatchingDominator. Depends on D9786 which provides the ScalarEvolution::getGEPExpr interface. Test Plan: nary-gep.ll Reviewers: meheff, broune Reviewed By: broune Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9802 llvm-svn: 237971
*	[InstCombine] X - 0 is equal to X, not undef	David Majnemer	2015-05-21	1	-0/+8
\| \| \| \| \| \| \| \| \|	A refactoring made @llvm.ssub.with.overflow.i32(i32 %X, i32 0) transform into undef instead of %X. This fixes PR23624. llvm-svn: 237968
*	[PPC/LoopUnrollRuntime] Don't avoid high-cost trip count computation on the ↵	Hal Finkel	2015-05-21	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \|	PPC/A2 On X86 (and similar OOO cores) unrolling is very limited, and even if the runtime unrolling is otherwise profitable, the expense of a division to compute the trip count could greatly outweigh the benefits. On the A2, we unroll a lot, and the benefits of unrolling are more significant (seeing a 5x or 6x speedup is not uncommon), so we're more able to tolerate the expense, on average, of a division to compute the trip count. llvm-svn: 237947
*	[MemCpyOpt] Do move the memset, but look at its dest's dependencies.	Ahmed Bougacha	2015-05-21	1	-1/+21
\| \| \| \| \| \| \| \| \|	In effect a partial revert of r237858, which was a dumb shortcut. Looking at the dependencies of the destination should be the proper fix: if the new memset would depend on anything other than itself, the transformation isn't correct. llvm-svn: 237874
*	[MemCpyOpt] Don't move the memset when optimizing memset+memcpy.	Ahmed Bougacha	2015-05-20	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Fixes PR23599, another miscompile introduced by r235232: when there is another dependency on the destination of the created memset (i.e., the part of the original destination that the memcpy doesn't depend on) between the memcpy and the original memset, we would insert the created memset after the memcpy, and thus after the other dependency. Instead, insert the created memset right after the old one. llvm-svn: 237858
*	Reapply r237539 with a fix for the Chromium build.	James Molloy	2015-05-20	1	-0/+112
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure if we're truncating a constant that would then be sign extended that the sign extension of the truncated constant is the same as the original constant. > Canonicalize min/max expressions correctly. > > This patch introduces a canonical form for min/max idioms where one operand > is extended or truncated. This often happens when the other operand is a > constant. For example: > > %1 = icmp slt i32 %a, i32 0 > %2 = sext i32 %a to i64 > %3 = select i1 %1, i64 %2, i64 0 > > Would now be canonicalized into: > > %1 = icmp slt i32 %a, i32 0 > %2 = select i1 %1, i32 %a, i32 0 > %3 = sext i32 %2 to i64 > > This builds upon a patch posted by David Majnemer > (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass > passively stopped instcombine from ruining canonical patterns. This > patch additionally actively makes instcombine canonicalize too. > > Canonicalization of expressions involving a change in type from int->fp > or fp->int are not yet implemented. llvm-svn: 237821
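The condition in the first sentence above is a round-trip check: truncate the constant, sign extend it again, and compare with the original. A minimal model on Python integers (the helper names are hypothetical, not InstCombine functions):

```python
def trunc(value, bits):
    """Keep the low `bits` bits, as an unsigned bit pattern."""
    return value & ((1 << bits) - 1)

def sext(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value

def survives_trunc_then_sext(constant, narrow_bits):
    """True if sext(trunc(constant)) gives back the original constant."""
    return sext(trunc(constant, narrow_bits), narrow_bits) == constant
```

For a 64-bit constant narrowed to 32 bits, `0` and `-1` survive the round trip, but `2**31` does not, because its truncation sign extends to `-2**31`.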
*	Add a GCStrategy for CoreCLR	Swaroop Sridhar	2015-05-20	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change adds a new GC strategy for supporting the CoreCLR runtime. This strategy is currently identical to Statepoint-example GC, but is necessary for several upcoming changes specific to CoreCLR, such as: 1. Base-pointers not explicitly reported for interior pointers 2. Different format for stack-map encoding 3. Location of Safe-point polls: polls are only needed before loop-back edges and before tail-calls (not needed at function-entry) 4. Runtime specific handshake between calls to managed/unmanaged functions. llvm-svn: 237753
*	[PlaceSafepoints] Stop special casing some intrinsics	Philip Reames	2015-05-19	1	-0/+20
\| \| \| \| \| \|	We were special casing a handful of intrinsics as not needing a safepoint before them. After running into another valid case - memset - I took a closer look and realized that almost no intrinsics need to have a safepoint poll before them. Restructure the code to make that apparent so that we stop hitting these bugs. The only intrinsics which need a safepoint poll before them are ones which can run arbitrary code. llvm-svn: 237744
*	Revert r237539: "Reapply r237520 with another fix for infinite looping"	Hans Wennborg	2015-05-19	1	-99/+0
\| \| \| \| \| \|	This caused PR23583. llvm-svn: 237739
*	Dereferenceable, dereferenceable_or_null metadata for loads	Sanjoy Das	2015-05-19	1	-1/+135
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Introduce dereferenceable, dereferenceable_or_null metadata for loads with the same semantic as corresponding attributes. This patch depends on http://reviews.llvm.org/D9253 Patch by Artur Pilipenko! Reviewers: hfinkel, sanjoy, reames Reviewed By: sanjoy, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9365 llvm-svn: 237720
*	[RewriteStatepointsForGC] For some values (like gep's and bitcasts) it's ↵	Igor Laevsky	2015-05-19	4	-3/+225
\| \| \| \| \| \| \| \|	cheaper to clone them after statepoint than to emit proper relocates for them. This change implements this logic. There is already a similar optimization in CodeGenPrepare, but doing so during RewriteStatepointsForGC allows capturing more opportunities such as relocates in loops and longer instruction chains. Differential Revision: http://reviews.llvm.org/D9774 llvm-svn: 237701
*	[PlaceSafepoints] Assertion on that gc_result can not have preceding phis ↵	Chen Li	2015-05-18	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	should only apply to invoke statepoint Summary: When PlaceSafepoints pass replaces old return result with gc_result from statepoint, it asserts that gc_result can not have preceding phis in its parent block. This is only true on invoke statepoint, which terminates the block and puts its result at the beginning of the normal successor block. Call statepoint does not terminate the block and thus its result is in the same block with it. There should be no restriction on whether there are phis or not. Reviewers: reames, igor-laevsky Reviewed By: igor-laevsky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9803 llvm-svn: 237597