bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Add minnum / maxnum intrinsics	Matt Arsenault	2014-10-21	4	-0/+557
\| \| \| \| \| \| \| \| \| \| \| \|	These are named following the IEEE-754 names for these functions, rather than the libm fmin / fmax to avoid possible ambiguities. Some languages may implement something resembling fmin / fmax which return NaN if either operand is to propagate errors. These implement the IEEE-754 semantics of returning the other operand if either is a NaN representing missing data. llvm-svn: 220341
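The NaN handling described in this entry can be sketched in Python. The helper names `minnum` and `maxnum` are illustrative stand-ins, not the LLVM intrinsics themselves, and the sketch deliberately ignores signed-zero ordering and signaling NaNs.

```python
import math


def minnum(a: float, b: float) -> float:
    """Return the smaller operand, treating a single NaN as missing data."""
    if math.isnan(a):
        return b  # also covers the case where both operands are NaN
    if math.isnan(b):
        return a
    return min(a, b)


def maxnum(a: float, b: float) -> float:
    """Return the larger operand, treating a single NaN as missing data."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)
```

A NaN-propagating variant would instead return NaN as soon as either operand is NaN, which is the ambiguity the naming is meant to avoid.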
*	InstCombine: Simplify FoldICmpCstShrCst	David Majnemer	2014-10-21	3	-330/+332
\| \| \| \| \| \| \| \| \|	This function was complicated by the fact that it tried to perform canonicalizations that were already performed by InstSimplify. Remove this extra code and move the tests over to InstSimplify. Add asserts to make sure our preconditions hold before we make any assumptions. llvm-svn: 220314
*	Teach the load analysis to allow finding available values which require	Chandler Carruth	2014-10-21	1	-9/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	inttoptr or ptrtoint cast provided there is datalayout available. Eventually, the datalayout can just be required but in practice it will always be there today. To go with the ability to expose available values requiring a ptrtoint or inttoptr cast, helpers are added to perform one of these three casts. These smarts are necessary to finish canonicalizing loads and stores to the operational type requirements without regressing fundamental combines. I've added some test cases. These should actually improve as the load combining and store combining improves, but they may fundamentally be highlighting some missing combines for select in addition to exercising the specific added logic to load analysis. llvm-svn: 220277
*	Introduce a 'nonnull' metadata on Load instructions.	Philip Reames	2014-10-20	1	-0/+23
\| \| \| \| \| \| \| \| \|	The newly introduced 'nonnull' metadata is analogous to existing 'nonnull' attributes, but applies to load instructions rather than call arguments or returns. Long term, it would be nice to combine these into a single construct. The value of the load is allowed to vary between successive loads, but null is not a valid value to be loaded by any load marked nonnull. Reviewed by: Hal Finkel Differential Revision: http://reviews.llvm.org/D5220 llvm-svn: 220240
*	Fix a miscompile introduced in r220178.	Chandler Carruth	2014-10-20	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original code had an implicit assumption that if the test for allocas or globals was reached, the two pointers were not equal. With my changes to make the pointer analysis more powerful here, I also had to guard against circumstances where the results weren't useful. That in turn violated the assumption and gave rise to a circumstance in which we could have a store with both the queried pointer and stored pointer rooted at the same alloca. Clearly, we cannot ignore such a store. There are other things we might do in this code to better handle the case of both pointers ending up at the same alloca or global, but it seems best to at least make the test explicit in what it intends to check. I've added tests for both the alloca and global case here. llvm-svn: 220190
*	Fix a somewhat subtle pair of issues with JumpThreading I introduced in	Chandler Carruth	2014-10-20	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	r220178. First, the creation routine doesn't insert prior to the terminator of the basic block provided, but really at the end of the basic block. Instead, get the terminator and insert before that. The next issue was that we need to ensure multiple PHI node entries for a single predecessor re-use the same cast instruction rather than creating new ones. All of the logic here was without tests previously. I've reduced and added a test case from the test suite that crashed without both of these fixes. llvm-svn: 220186
*	Teach the load analysis driving core instcombine logic and other bits of	Chandler Carruth	2014-10-20	1	-0/+109
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	logic to look through pointer casts, making them trivially stronger in the face of loads and stores with intervening pointer casts. I've included a few test cases that demonstrate the kind of folding instcombine can do without pointer casts and then variations which obfuscate the logic through bitcasts. Without this patch, the variations all fail to optimize fully. This is more important now than it has been in the past as I've started moving the load canonicalization to more closely follow the value type requirements rather than the pointer type requirements and thus this needs to be prepared for more pointer casts. When I made the same change to stores several test cases regressed without logic along these lines so I wanted to systematically improve matters first. llvm-svn: 220178
*	Add a datalayout string to this test so that it exercises the full gamut	Chandler Carruth	2014-10-20	1	-13/+15
\| \| \| \| \| \| \| \| \| \| \| \| \|	of InstCombine rather than just the bits enabled when datalayout is optional. The primary fixes here are because now things are little endian. In good news, silliness like this seems like it will be going away as we've got pretty strong consensus on dropping optional datalayout entirely. llvm-svn: 220176
*	Do a better and more complete job of preserving metadata when combining	Chandler Carruth	2014-10-19	2	-25/+86
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	loads. This handles many more cases than just the AA metadata, some of them suggested by Hal in his review of the AA metadata handling patch. I've tried to test this behavior where tractable to do so. I'll point out that I have specifically not included a test for debuginfo because it was going to require 2 or 3 times as much work to craft some input which would survive the "helpful" stripping of debug info metadata that doesn't match the desired schema. This is another good example of why the current state of write-ability for our debug info metadata is unacceptable. I spent over 30 minutes trying to conjure some test case that would survive, even copying from other debug info tests, but it always failed to survive with no explanation of why or how I might fix it. =[ llvm-svn: 220165
*	Move previously dead code to handle computing the known bits of an alias	Chandler Carruth	2014-10-19	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	up to where it actually works as intended. The problem is that a GlobalAlias isa GlobalValue and so the prior block handled all of the cases. This allows us to constant fold based on the actual constant expression in the global alias. As an example, see the last function in the newly added test case which explicitly aligns an unaligned pointer using constant expression math. Without this change, we fail to see that and fold an alignment test to zero. llvm-svn: 220164
*	InstCombine: (sub (or A B) (xor A B)) --> (and A B)	David Majnemer	2014-10-19	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \|	The following implements the transformation: (sub (or A B) (xor A B)) --> (and A B). Patch by Ankur Garg! Differential Revision: http://reviews.llvm.org/D5719 llvm-svn: 220163
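The identity behind this fold can be checked exhaustively at a small bit width. The sketch below uses 8-bit values and wrapping subtraction; the width and the function names are assumptions made for illustration.

```python
BITS = 8
MASK = (1 << BITS) - 1


def sub_or_xor(a: int, b: int) -> int:
    """Compute (a | b) - (a ^ b) with wrap-around at BITS bits."""
    return ((a | b) - (a ^ b)) & MASK


def identity_holds() -> bool:
    """Check (a | b) - (a ^ b) == (a & b) for every 8-bit pair."""
    return all(
        sub_or_xor(a, b) == (a & b)
        for a in range(1 << BITS)
        for b in range(1 << BITS)
    )
```

The identity holds because `a | b` is the sum of the disjoint bit sets `a ^ b` and `a & b`, so subtracting the first leaves the second.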
*	InstCombine: Optimize icmp eq/ne (shl Const2, A), Const1	David Majnemer	2014-10-19	1	-0/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following implements the optimization for sequences of the form: icmp eq/ne (shl Const2, A), Const1 Such sequences can be transformed to: icmp eq/ne A, (TrailingZeros(Const1) - TrailingZeros(Const2)) This handles only the equality operators for now. Other operators need to be handled. Patch by Ankur Garg! llvm-svn: 220162
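A small worked check of this rewrite, at 8 bits, might look as follows. The sketch assumes Const1 is exactly Const2 shifted left by some in-range amount with no bits lost; the commit message does not spell out its preconditions, so treat that assumption and the helper names as illustrative only.

```python
BITS = 8
MASK = (1 << BITS) - 1


def trailing_zeros(v: int) -> int:
    """Count trailing zero bits of a nonzero value."""
    assert v != 0
    return (v & -v).bit_length() - 1


def original_cmp(const2: int, a: int, const1: int) -> bool:
    """icmp eq (shl Const2, A), Const1 with 8-bit wrap-around."""
    return ((const2 << a) & MASK) == const1


def rewritten_cmp(const2: int, a: int, const1: int) -> bool:
    """icmp eq A, (TrailingZeros(Const1) - TrailingZeros(Const2))."""
    return a == trailing_zeros(const1) - trailing_zeros(const2)


def forms_agree(const2: int, const1: int) -> bool:
    """Compare both forms for every in-range shift amount."""
    return all(
        original_cmp(const2, a, const1) == rewritten_cmp(const2, a, const1)
        for a in range(BITS)
    )
```

For Const2 = 2 and Const1 = 16, both forms are true only for A = 3.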
*	Fix a long-standing miscompile in the load analysis that was uncovered	Chandler Carruth	2014-10-19	3	-4/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	by my refactoring of this code. The method isSafeToLoadUnconditionally assumes that the load will proceed with the preferred type alignment. Given that, it has to ensure that the alloca or global is at least that aligned. It has always done this historically when a datalayout is present, but has never checked it when the datalayout is absent. When I refactored the code in r220156, I exposed this path when datalayout was present and that turned the latent bug into a patent bug. This fixes the issue by just removing the special case which allows folding things without datalayout. This isn't worth the complexity of trying to tease apart when it is or isn't safe without actually knowing the preferred alignment. llvm-svn: 220161
*	Preserve AA metadata when combining (cast (load (...))) -> (load (cast	Chandler Carruth	2014-10-18	1	-0/+25
\| \| \| \| \| \|	(...))). llvm-svn: 220141
*	[InstCombine] Do an about-face on how LLVM canonicalizes (cast (load	Chandler Carruth	2014-10-18	6	-32/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	...)) and (load (cast ...)): canonicalize toward the former. Historically, we've tried to load using the type of the pointer, and tried to match that type as closely as possible removing as many pointer casts as we could and trading them for bitcasts of the loaded value. This is deeply and fundamentally wrong. Repeat after me: memory does not have a type! This was a hard lesson for me to learn working on SROA. There is only one thing that should actually drive the type used for a pointer, and that is the type which we need to use to load from that pointer. Matching up pointer types to the loaded value types is very useful because it minimizes the physical size of the IR required for no-op casts. Similarly, the only thing that should drive the type used for a loaded value is how that value is used! Again, this minimizes casts. And in fact, the only thing motivating types in any part of LLVM's IR are the types used by the operations in the IR. We should match them as closely as possible. I've ended up removing some tests here as they were testing bugs or behavior that is no longer present. Mostly though, this is just cleanup to let the tests continue to function as intended. The only fallout I've found so far from this change was SROA and I have fixed it to not be impeded by the different type of load. If you find more places where this change causes optimizations not to fire, those too are likely bugs where we are assuming that the type of pointers is "significant" for optimization purposes. llvm-svn: 220138
*	Remove a test that was ported from the old llvm-gcc frontend test suite.	Chandler Carruth	2014-10-18	1	-39/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This test is pretty awesome. It is claiming to test devirtualization. However, the code in question is not in fact devirtualized by LLVM. If you take the original C++ test case and run it through Clang at -O3 we fail to devirtualize it completely. It also isn't a sufficiently focused test case. The reason we fail to devirtualize it isn't because of any missing instcombine though. Instead, it is because we fail to emit an available externally vtable and thus the vtable is just an external and completely opaque. If I cause the vtable to be emitted, we successfully devirtualize things. Anyways, I'm just removing it because it is providing negative value at this point: it isn't representative of the output of Clang really, LLVM isn't doing the transform it claims to be testing, LLVM's failure to do the transform isn't actually an LLVM bug at all and we shouldn't be testing for it here, and finally the test is written in such a way that it will trivially pass even when the point of the test is failing. llvm-svn: 220137
*	[SROA] Change how SROA does vector-based promotion of allocas to handle	Chandler Carruth	2014-10-18	1	-0/+136
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cases where the alloca type, the load types, and the store types used all disagree. Previously, the only way that vector-based promotion occurred was if the alloca type was a vector type. This was one of the very few remaining uses of the alloca's type to guide SROA/mem2reg left in LLVM. It turns out it was a bad idea. The alloca type can change very easily based on the mixture of types loaded and stored to that alloca. We shouldn't be relying on it as a signal for very much. Instead, the source of truth should be loads and stores. We should canonicalize the loads and stores as much as possible and then rely on them exclusively in SROA. When looking at loads and stores, we may find many different candidate vector types. This change will let SROA try all of them to find a vector type which is a viable way to promote the entire alloca to a vector register. With this change, it becomes possible to do better canonicalization and optimization of loads and stores without breaking SROA in random ways, and that should allow fixing a core source of performance loss in hot numerical loops such as those in Eigen. llvm-svn: 220116
*	Revert "TRE: make TRE a bit more aggressive"	Rafael Espindola	2014-10-17	3	-38/+6
\| \| \| \| \| \| \| \| \|	This reverts commit r219899. This also updates byval-tail-call.ll to make it clear what was breaking. Adding r219899 again will cause the load/store to disappear. llvm-svn: 220093
*	[DSE] Remove no-data-layout-only type-based overlap checking	Hal Finkel	2014-10-17	3	-12/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DSE's overlap checking contained special logic, used only when no DataLayout was available, which inferred a complete overwrite when the pointee types were equal. This logic seems fine for regular loads/stores, but does not work for memcpy and friends. Instead of fixing this, I'm just removing it. Philosophically, transformations should not contain enhanced behavior used only when data layout is lacking (data layout should be strictly additive), and maintaining these rarely-tested code paths seems not worthwhile at this stage. Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case (slightly reduced from that provided by Aliaksei) replaces the original contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few other tests have been updated to have a data layout. llvm-svn: 220035
*	Delete -std-compile-opts.	Rafael Espindola	2014-10-16	3	-3/+3
\| \| \| \| \| \|	These days -std-compile-opts was just a silly alias for -O3. llvm-svn: 219951
*	Allow call-slot optzn for destinations with a suitable dereferenceable attribute	Bjorn Steinbrink	2014-10-16	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently, call slot optimization requires that if the destination is an argument, the argument has the sret attribute. This is to ensure that the memory access won't trap. In addition to sret, we can also allow the optimization to happen for arguments that have the new dereferenceable attribute, which gives the same guarantee. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5832 llvm-svn: 219950
*	fold: sqrt(x * x * y) -> fabs(x) * sqrt(y)	Sanjay Patel	2014-10-16	1	-0/+170
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a square root call has an FP multiplication argument that can be reassociated, then we can hoist a repeated factor out of the square root call and into a fabs(). In the simplest case, this: y = sqrt(x * x); becomes this: y = fabs(x); This patch relies on an earlier optimization in instcombine or reassociate to put the multiplication tree into a canonical form, so we don't have to search over every permutation of the multiplication tree. Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to use function-level attributes to do this optimization. This needs to be fixed for both the intrinsics and in the backend. Differential Revision: http://reviews.llvm.org/D5787 llvm-svn: 219944
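The algebra behind this fold can be illustrated numerically. The two functions below are hypothetical names for the expression before and after the rewrite. In general the two forms may round differently, which is consistent with the fold requiring reassociation to be allowed.

```python
import math


def sqrt_before(x: float, y: float) -> float:
    """The original expression: sqrt(x * x * y)."""
    return math.sqrt(x * x * y)


def sqrt_after(x: float, y: float) -> float:
    """The folded expression: fabs(x) * sqrt(y)."""
    return math.fabs(x) * math.sqrt(y)
```

With x = -3.0 and y = 4.0 both forms evaluate to 6.0; the `fabs` is what keeps the result non-negative for negative x.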
*	Reapply r219832 - InstCombine: Narrow switch instructions using known bits.	Akira Hatanaka	2014-10-16	1	-0/+93
\| \| \| \| \| \| \|	The code committed in r219832 asserted when it attempted to shrink a switch statement whose type was larger than 64-bit. llvm-svn: 219902
*	TRE: make TRE a bit more aggressive	Saleem Abdulrasool	2014-10-16	3	-2/+37
\| \| \| \| \| \| \| \| \|	Make tail recursion elimination a bit more aggressive. This allows us to get tail recursion on functions that are just branches to a different function. The fact that the function takes a byval argument does not restrict it from being optimised into just a tail call. llvm-svn: 219899
*	Revert r219832.	Akira Hatanaka	2014-10-16	1	-61/+0
\| \| \| \|	llvm-svn: 219884
*	Revert "r219834 - Teach ScalarEvolution to sharpen range information"	Sanjoy Das	2014-10-15	2	-40/+2
\| \| \| \| \| \| \|	This change breaks the asan buildbots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13468 llvm-svn: 219878
*	Preserve non-byval pointer alignment attributes using @llvm.assume when inlining	Hal Finkel	2014-10-15	1	-0/+98
\| \| \| \| \| \| \| \| \|	For pointer-typed function arguments, enhanced alignment can be asserted using the 'align' attribute. When inlining, if this enhanced alignment information is not otherwise available, preserve it using @llvm.assume-based alignment assumptions. llvm-svn: 219876
*	Teach ScalarEvolution to sharpen range information.	Sanjoy Das	2014-10-15	2	-2/+40
\| \| \| \| \| \| \| \| \| \| \| \|	If x is known to have the range [a, b) in a loop predicated by (icmp ne x, a), its range can be sharpened to [a + 1, b). Get ScalarEvolution and hence IndVars to exploit this fact. This change triggers an optimization to widen-loop-comp.ll, so it had to be edited to get it to pass. phabricator: http://reviews.llvm.org/D5639 llvm-svn: 219834
*	InstCombine: Narrow switch instructions using known bits.	Akira Hatanaka	2014-10-15	1	-0/+61
\| \| \| \| \| \| \| \| \|	Truncate the operands of a switch instruction to a narrower type if the upper bits are known to be all ones or zeros. rdar://problem/17720004 llvm-svn: 219832
*	[SLPVectorize] Basic ephemeral-value awareness	Hal Finkel	2014-10-15	1	-0/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SLP vectorizer should not vectorize ephemeral values. These are used to express information to the optimizer, and vectorizing them does not lead to faster code (because the ephemeral values are dropped prior to code generation, vectorized or not), and obscures the information the instructions are attempting to communicate (the logic that interprets the arguments to @llvm.assume generically does not understand vectorized conditions). Also, uses by ephemeral values are free (because they, and the necessary extractelement instructions, will be dropped prior to code generation). llvm-svn: 219816
*	[LoopVectorize] Ignore @llvm.assume for cost estimates and legality	Hal Finkel	2014-10-14	1	-0/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	A few minor changes to prevent @llvm.assume from interfering with loop vectorization. First, treat @llvm.assume like the lifetime intrinsics, which are scalarized (but don't otherwise interfere with the legality checking). Second, ignore the cost of ephemeral instructions in the loop (these will go away anyway during CodeGen). Alignment assumptions and other uses of @llvm.assume can often end up inside of loops that should be vectorized (this is not uncommon for assumptions generated by __attribute__((align_value(n))), for example). llvm-svn: 219741
*	Optimize away fabs() calls when input is squared (known positive).	Sanjay Patel	2014-10-14	1	-0/+100
\| \| \| \| \| \| \| \| \| \| \| \|	Eliminate library calls and intrinsic calls to fabs when the input is a squared value. Note that no unsafe-math / fast-math assumptions are needed for this optimization. Differential Revision: http://reviews.llvm.org/D5777 llvm-svn: 219717
*	InstCombine: Don't miscompile X % ((Pow2 << A) >>u B)	David Majnemer	2014-10-14	1	-1/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We assumed that A must be greater than B because the right hand side of a remainder operator must be nonzero. However, it is possible for A to be less than B if Pow2 is a power of two greater than 1. Take for example: i32 %A = 0 i32 %B = 31 i32 Pow2 = 2147483648 ((Pow2 << 0) >>u 31) is non-zero but A is less than B. This fixes PR21274. llvm-svn: 219713
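The counterexample in this entry can be reproduced with 32-bit arithmetic. The function name `divisor` is an illustrative assumption.

```python
BITS = 32
MASK = (1 << BITS) - 1


def divisor(pow2: int, a: int, b: int) -> int:
    """Compute (pow2 << a) >>u b with 32-bit wrap-around on the shift left."""
    return ((pow2 << a) & MASK) >> b
```

Calling `divisor(2147483648, 0, 31)` gives 1, a nonzero divisor even though A (0) is less than B (31).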
*	Revert "r216914 - Revert: [APFloat] Fixed a bug in method 'fusedMultiplyAdd'"	Hal Finkel	2014-10-14	1	-0/+119
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reapply r216913, a fix for PR20832 by Andrea Di Biagio. The commit was reverted because of buildbot failures, and credit goes to Ulrich Weigand for isolating the underlying issue (which can be confirmed by Valgrind, which does helpfully light up like the fourth of July). Uli explained the problem with the original patch as: It seems the problem is calling multiplySignificand with an addend of category fcZero; that is not expected by this routine. Note that for fcZero, the significand parts are simply uninitialized, but the code in (or rather, called from) multiplySignificand will unconditionally access them -- in effect using uninitialized contents. This version avoids using a category == fcZero addend within multiplySignificand, which avoids this problem (the Valgrind output is also now clean). Original commit message: [APFloat] Fixed a bug in method 'fusedMultiplyAdd'. When folding a fused multiply-add builtin call, make sure that we propagate the correct result in the case where the addend is zero, and the two other operands are finite non-zero. Example: define double @test() { %1 = call double @llvm.fma.f64(double 7.0, double 8.0, double 0.0) ret double %1 } Before this patch, the instruction simplifier wrongly folded the builtin call in function @test to constant 'double 7.0'. With this patch, method 'fusedMultiplyAdd' correctly evaluates the multiply and propagates the expected result (i.e. 56.0). Added test fold-builtin-fma.ll with the reproducible from PR20832 plus extra test cases to verify the behavior of method 'fusedMultiplyAdd' in the presence of NaN/Inf operands. This fixes PR20832. llvm-svn: 219708
*	[LVI] Check for @llvm.assume dominating the edge branch	Hal Finkel	2014-10-14	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \|	When LazyValueInfo uses @llvm.assume intrinsics to provide edge-value constraints, we should check for intrinsics that dominate the edge's branch, not just any potential context instructions. An assumption that dominates the edge's branch represents a truth on that edge. This is specifically useful, for example, if multiple predecessors assume a pointer to be nonnull, allowing us to simplify a later null comparison. The test case, and an initial patch, were provided by Philip Reames. Thanks! llvm-svn: 219688
*	Switch to select optimization for two-case switches	Marcello Maggioni	2014-10-14	4	-3/+119
\| \| \| \| \| \| \|	This is the same optimization of r219233 with modifications to support PHIs with multiple incoming edges from the same block and a test to check that this condition is handled. llvm-svn: 219656
*	InstCombine: Fix miscompile in X % -Y -> X % Y transform	David Majnemer	2014-10-13	2	-8/+3
\| \| \| \| \| \| \| \| \| \| \|	We assumed that negation operations of the form (0 - %Z) resulted in a negative number. This isn't true if %Z was originally negative. Substituting the negative number into the remainder operation may result in undefined behavior because the dividend might be INT_MIN. This fixes PR21256. llvm-svn: 219639
*	InstCombine: Don't miscompile (x lshr C1) udiv C2	David Majnemer	2014-10-13	1	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	We have a transform that changes: (x lshr C1) udiv C2 into: x udiv (C2 << C1) However, it is unsafe to do so if C2 << C1 discards any of C2's bits. This fixes PR21255. llvm-svn: 219634
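A concrete 32-bit case shows why the shift must not discard bits of C2. The values below are chosen for illustration and do not come from PR21255; the function names are likewise assumptions.

```python
BITS = 32
MASK = (1 << BITS) - 1


def before(x: int, c1: int, c2: int) -> int:
    """(x lshr C1) udiv C2."""
    return (x >> c1) // c2


def after(x: int, c1: int, c2: int) -> int:
    """x udiv (C2 << C1), with the shift truncated to 32 bits."""
    return x // ((c2 << c1) & MASK)
```

With x = 0xFFFFFFFF, C1 = 30 and C2 = 6, the shift `6 << 30` loses its top bit, so `before` yields 0 while `after` yields 1. With C1 = 4 and C2 = 3 nothing is lost and the two forms agree.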
*	Revert r219223, it creates invalid PHI nodes.	Joerg Sonnenberger	2014-10-12	3	-79/+3
\| \| \| \|	llvm-svn: 219587
*	InstCombine: Turn (x != 0 & x <u C) into the canonical range check form (x-1 ↵	Benjamin Kramer	2014-10-12	1	-0/+11
\| \| \| \| \| \|	<u C-1) llvm-svn: 219585
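The equivalence of the two forms can be checked exhaustively at 8 bits. This sketch only covers nonzero C, which is an assumption made here (for C = 0 the wrapped `C-1` would make the two forms differ); the function names are illustrative.

```python
BITS = 8
MASK = (1 << BITS) - 1


def original_check(x: int, c: int) -> bool:
    """x != 0 & x <u C."""
    return x != 0 and x < c


def range_check(x: int, c: int) -> bool:
    """x-1 <u C-1, with both subtractions wrapping at 8 bits."""
    return ((x - 1) & MASK) < ((c - 1) & MASK)


def forms_agree() -> bool:
    """Compare both forms for every x and every nonzero C."""
    return all(
        original_check(x, c) == range_check(x, c)
        for x in range(1 << BITS)
        for c in range(1, 1 << BITS)
    )
```

The case x = 0 works because `x - 1` wraps to the maximum value, which is never below `C - 1`.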
*	InstCombine: Don't fold (X <<s log(INT_MIN)) /s INT_MIN to X	David Majnemer	2014-10-11	1	-0/+17
\| \| \| \| \| \| \| \| \| \|	Consider the case where X is 2. (2 <<s 31)/s-2147483648 is zero but we would fold to X. Note that this is valid when we are in the unsigned domain because we require NUW: 2 <<u 31 results in poison. This fixes PR21245. llvm-svn: 219568
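The X = 2 case from this entry can be worked through with a small model of 32-bit signed arithmetic. The helpers `to_signed` and `sdiv` are assumptions of this sketch, modelling two's-complement wrap-around and division that truncates toward zero.

```python
BITS = 32
MASK = (1 << BITS) - 1
INT_MIN = -(1 << (BITS - 1))


def to_signed(v: int) -> int:
    """Interpret the low 32 bits of v as a two's-complement value."""
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def sdiv(a: int, b: int) -> int:
    """Signed division truncating toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def shl_then_sdiv(x: int) -> int:
    """(x << 31) /s INT_MIN on wrapped 32-bit values."""
    return sdiv(to_signed(x << 31), INT_MIN)
```

For X = 2 the shift wraps to 0, so the expression evaluates to 0 rather than X.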
*	InstCombine, InstSimplify: (%X /s C1) /s C2 isn't always 0 when C1 * C2 overflow	David Majnemer	2014-10-11	2	-14/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	consider: C1 = INT_MIN C2 = -1 C1 * C2 overflows without a doubt but consider the following: %x = i32 INT_MIN This means that (%X /s C1) is 1 and (%X /s C1) /s C2 is -1. N. B. Move the unsigned version of this transform to InstSimplify, it doesn't create any new instructions. This fixes PR21243. llvm-svn: 219567
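The counterexample above can be evaluated directly. The `sdiv` helper is an assumption of this sketch and models signed division that truncates toward zero.

```python
INT_MIN = -(1 << 31)
INT_MAX = (1 << 31) - 1


def sdiv(a: int, b: int) -> int:
    """Signed division truncating toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def nested_sdiv(x: int, c1: int, c2: int) -> int:
    """(x /s C1) /s C2."""
    return sdiv(sdiv(x, c1), c2)


def product_overflows(c1: int, c2: int) -> bool:
    """True if C1 * C2 does not fit in a signed 32-bit integer."""
    return not (INT_MIN <= c1 * c2 <= INT_MAX)
```

With x = C1 = INT_MIN and C2 = -1 the product overflows, yet the nested division yields -1, not 0.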
*	InstCombine: mul to shl shouldn't preserve nsw	David Majnemer	2014-10-11	4	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	consider: mul i32 nsw %x, -2147483648 this instruction will not result in poison if %x is 1 however, if we transform this into: shl i32 nsw %x, 31 then we will be generating poison because we just shifted into the sign bit. This fixes PR21242. llvm-svn: 219566
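The difference between the two instructions can be modelled by asking whether the exact mathematical result fits in a signed 32-bit integer. That is a simplified model of `nsw` assumed for this sketch, and the function names are illustrative.

```python
INT_MIN = -(1 << 31)
INT_MAX = (1 << 31) - 1


def fits_i32(v: int) -> bool:
    """True if v is representable as a signed 32-bit integer."""
    return INT_MIN <= v <= INT_MAX


def mul_nsw_is_poison(x: int, c: int) -> bool:
    """Model mul nsw: poison when the exact product overflows."""
    return not fits_i32(x * c)


def shl_nsw_is_poison(x: int, shift: int) -> bool:
    """Model shl nsw: poison when x * 2**shift overflows as a signed value."""
    return not fits_i32(x * (1 << shift))
```

For x = 1, multiplying by -2147483648 is well defined, while shifting left by 31 is poison under this model, so carrying `nsw` over would introduce poison that was not there before.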
*	Return undef on FP <-> Int conversions that overflow (PR21330).	Sanjay Patel	2014-10-10	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The LLVM Lang Ref states for signed/unsigned int to float conversions: "If the value cannot fit in the floating point value, the results are undefined." And for FP to signed/unsigned int: "If the value cannot fit in ty2, the results are undefined." This matches the C definitions. The existing behavior pins to infinity or a max int value, but that may just lead to more confusion as seen in: http://llvm.org/bugs/show_bug.cgi?id=21130 Returning undef will hopefully lead to a less silent failure. Differential Revision: http://reviews.llvm.org/D5603 llvm-svn: 219542
*	This patch teaches ScalarEvolution to pick and use !range metadata.	Sanjoy Das	2014-10-10	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \|	It also makes it more aggressive in querying range information by adding a call to isKnownPredicateWithRanges to isLoopBackedgeGuardedByCond and isLoopEntryGuardedByCond. phabricator: http://reviews.llvm.org/D5638 Reviewed by: atrick, hfinkel llvm-svn: 219532
*	This patch de-pessimizes the calculation of loop trip counts in	Mark Heffernan	2014-10-10	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ScalarEvolution in the presence of multiple exits. Previously all loops exits had to have identical counts for a loop trip count to be considered computable. This pessimization was implemented by calling getBackedgeTakenCount(L) rather than getExitCount(L, ExitingBlock) inside of ScalarEvolution::getSmallConstantTripCount() (see the FIXME in the comments of that function). The pessimization was added to fix a corner case involving undefined behavior (pr/16130). This patch more precisely handles the undefined behavior case allowing the pessimization to be removed. ControlsExit replaces IsSubExpr to more precisely track the case where undefined behavior is expected to occur. Because undefined behavior is tracked more precisely we can remove MustExit from ExitLimit. MustExit was used to track the case where the limit was computed potentially assuming undefined behavior even if undefined behavior didn't necessarily occur. llvm-svn: 219517
*	SimplifyCFG: Don't convert phis into selects if we could remove undef behavior	Arnold Schwaighofer	2014-10-10	1	-0/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instead We used to transform this: define void @test6(i1 %cond, i8* %ptr) { entry: br i1 %cond, label %bb1, label %bb2 bb1: br label %bb2 bb2: %ptr.2 = phi i8* [ %ptr, %entry ], [ null, %bb1 ] store i8 2, i8* %ptr.2, align 8 ret void } into this: define void @test6(i1 %cond, i8* %ptr) { %ptr.2 = select i1 %cond, i8* null, i8* %ptr store i8 2, i8* %ptr.2, align 8 ret void } because the simplifycfg transformation into selects happened to run before the simplifycfg transformation that removes unreachable control flow (We have 'unreachable control flow' due to the store to null which is undefined behavior). The existing transformation that removes unreachable control flow in simplifycfg is: /// If BB has an incoming value that will always trigger undefined behavior /// (eg. null pointer dereference), remove the branch leading here. static bool removeUndefIntroducingPredecessor(BasicBlock *BB) Now we generate: define void @test6(i1 %cond, i8* %ptr) { store i8 2, i8* %ptr.2, align 8 ret void } I did not see any impact on the test-suite + externals. rdar://18596215 llvm-svn: 219462
*	[Reassociate] Don't canonicalize X - undef to X + (-undef).	Chad Rosier	2014-10-09	1	-0/+21
\| \| \| \| \| \| \|	Phabricator Revision: http://reviews.llvm.org/D5674 PR21205 llvm-svn: 219434
*	[InstCombine] Fix wrong folding of constant comparisons involving ashr and ↵	Andrea Di Biagio	2014-10-09	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	negative values. This patch fixes a bug in method InstCombiner::FoldCmpCstShrCst where we wrongly computed the distance between the highest bits set of two negative values. This fixes PR21222. Differential Revision: http://reviews.llvm.org/D5700 llvm-svn: 219406
*	Inliner: Non-local functions in COMDATs shouldn't be dropped	David Majnemer	2014-10-08	1	-0/+18
\| \| \| \| \| \| \| \| \| \|	A function with discardable linkage cannot be discarded if it's a member of a COMDAT group without considering all the other COMDAT members as well. This sort of thing is already handled by GlobalOpt/GlobalDCE. This fixes PR21206. llvm-svn: 219335