bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	InstSimplify: ((X % Y) % Y) -> (X % Y)	David Majnemer	2014-09-17	1	-0/+9
\| \| \| \| \| \| \| \|	Patch by Sonam Kumari! Differential Revision: http://reviews.llvm.org/D5350 llvm-svn: 217937
*	[InstCombine] Remove redundant test case.	Tilmann Scheller	2014-09-16	1	-16/+6
\| \| \| \| \| \| \| \|	Patch by Sonam Kumari! Differential Revision: http://reviews.llvm.org/D5284 llvm-svn: 217865
*	InstSimplify: Simplify trivial and/or of icmps	David Majnemer	2014-09-15	1	-0/+120
\| \| \| \| \| \| \| \| \| \| \| \| \|	Some ICmpInsts when anded/ored with another ICmpInst trivially reduces to true or false depending on whether or not all integers or no integers satisfy the intersected/unioned range. This sort of trivial looking code can come about when InstCombine performs a range reduction-type operation on sdiv and the like. This fixes PR20916. llvm-svn: 217750
*	FileCheckize. NFC.	Chad Rosier	2014-09-12	1	-21/+25
\| \| \| \|	llvm-svn: 217698
*	[AlignmentFromAssumptions] Don't crash just because the target is 32-bit	Hal Finkel	2014-09-11	1	-0/+215
\| \| \| \| \| \| \| \| \|	We used to crash processing any relevant @llvm.assume on a 32-bit target (because we'd ask SE to subtract expressions of differing types). I've copied our 'simple.ll' test, but with the data layout from arm-linux-gnueabihf to get some meaningful test coverage here. llvm-svn: 217574
*	[AlignmentFromAssumptions] Don't divide by zero for unknown starting alignment	Hal Finkel	2014-09-10	1	-0/+154
\| \| \| \| \| \| \| \| \| \|	The routine that determines an alignment given some SCEV returns zero if the answer is unknown. In a case where we could determine the increment of an AddRec but not the starting alignment, we would compute the integer modulus by zero (which is illegal and traps). Prevent this by returning early if either the start or increment alignment is unknown (zero). llvm-svn: 217544
*	Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option ↵	Sanjay Patel	2014-09-10	83	-92/+92
\| \| \| \| \| \| \| \| \| \| \|	names controlling this variable. "Unroll" is not the appropriate name for this variable. Clang already uses the term "interleave" in pragmas and metadata for this. Differential Revision: http://reviews.llvm.org/D5066 llvm-svn: 217528
*	Add a test for hoisting instructions with metadata out of then/else blocks	Bjorn Steinbrink	2014-09-09	1	-0/+20
\| \| \| \| \| \|	Test for the bug fixed in r215723. llvm-svn: 217453
*	Check for all known bits on ret in InstCombine	Hal Finkel	2014-09-07	1	-0/+12
\| \| \| \| \| \| \| \| \| \|	From a combination of @llvm.assume calls (and perhaps through other means, such as range metadata), it is possible that all bits of a return value might be known. Previously, InstCombine did not check for this (which is understandable given assumptions of constant propagation), but means that we'd miss simple cases where assumptions are involved. llvm-svn: 217346
*	Make use of @llvm.assume from LazyValueInfo	Hal Finkel	2014-09-07	1	-0/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change teaches LazyValueInfo to use the @llvm.assume intrinsic. Like with the known-bits change (r217342), this requires feeding a "context" instruction pointer through many functions. Aside from a little refactoring to reuse the logic that turns predicates into constant ranges in LVI, the only new code is that which can 'merge' the range from an assumption into that otherwise computed. There is also a small addition to JumpThreading so that it can have LVI use assumptions in the same block as the comparison feeding a conditional branch. With this patch, we can now simplify this as expected: int foo(int a) { __builtin_assume(a > 5); if (a > 3) { bar(); return 1; } return 0; } llvm-svn: 217345
*	Add an AlignmentFromAssumptions Pass	Hal Finkel	2014-09-07	1	-0/+215
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a ScalarEvolution-powered transformation that updates load, store and memory intrinsic pointer alignments based on invariant((a+q) & b == 0) expressions. Many of the simple cases we can get with ValueTracking, but we still need something like this for the more complicated cases (such as those with an offset) that require some algebra. Note that gcc's __builtin_assume_aligned's optional third argument provides exactly for this kind of 'misalignment' offset for which this kind of logic is necessary. The primary motivation is to fixup alignments for vector loads/stores after vectorization (and unrolling). This pass is added to the optimization pipeline just after the SLP vectorizer runs (which, admittedly, does not preserve SE, although I imagine it could). Regardless, I actually don't think that the preservation matters too much in this case: SE computes lazily, and this pass won't issue any SE queries unless there are any assume intrinsics, so there should be no real additional cost in the common case (SLP does preserve DT and LoopInfo). llvm-svn: 217344
*	Add additional patterns for @llvm.assume in ValueTracking	Hal Finkel	2014-09-07	1	-0/+174
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This builds on r217342, which added the infrastructure to compute known bits using assumptions (@llvm.assume calls). That original commit added only a few patterns (to catch common cases related to determining pointer alignment); this change adds several other patterns for simple cases. r217342 contained that, for assume(v & b = a), bits in the mask that are known to be one, we can propagate known bits from the a to v. It also had a known-bits transfer for assume(a = b). This patch adds: assume(~(v & b) = a) : For those bits in the mask that are known to be one, we can propagate inverted known bits from the a to v. assume(v \| b = a) : For those bits in b that are known to be zero, we can propagate known bits from the a to v. assume(~(v \| b) = a): For those bits in b that are known to be zero, we can propagate inverted known bits from the a to v. assume(v ^ b = a) : For those bits in b that are known to be zero, we can propagate known bits from the a to v. For those bits in b that are known to be one, we can propagate inverted known bits from the a to v. assume(~(v ^ b) = a) : For those bits in b that are known to be zero, we can propagate inverted known bits from the a to v. For those bits in b that are known to be one, we can propagate known bits from the a to v. assume(v << c = a) : For those bits in a that are known, we can propagate them to known bits in v shifted to the right by c. assume(~(v << c) = a) : For those bits in a that are known, we can propagate them inverted to known bits in v shifted to the right by c. assume(v >> c = a) : For those bits in a that are known, we can propagate them to known bits in v shifted to the right by c. assume(~(v >> c) = a) : For those bits in a that are known, we can propagate them inverted to known bits in v shifted to the right by c. assume(v >=_s c) where c is non-negative: The sign bit of v is zero assume(v >_s c) where c is at least -1: The sign bit of v is zero assume(v <=_s c) where c is negative: The sign bit of v is one assume(v <_s c) where c is non-positive: The sign bit of v is one assume(v <=_u c): Transfer the known high zero bits assume(v <_u c): Transfer the known high zero bits (if c is know to be a power of 2, transfer one more) A small addition to InstCombine was necessary for some of the test cases. The problem is that when InstCombine was simplifying and, or, etc. it would fail to check the 'do I know all of the bits' condition before checking less specific conditions and would not fully constant-fold the result. I'm not sure how to trigger this aside from using assumptions, so I've just included the change here. llvm-svn: 217343
*	Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)	Hal Finkel	2014-09-07	2	-0/+183
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and also a DomTree pointer, and most of the changes are just to pass this new information when it is easily available from InstSimplify, InstCombine, etc. As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use (because we care that the @llvm.assume dominates the use because assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses. The significant changes are all in ValueTracking. Two main changes: First, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure, and I made internal static versions of the relevant functions that take this structure as a parameter. The new code does as you might expect, it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive. Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, the assume(a == b), to recurse on itself. The context and DT params are used to find applicable assumptions. An assumption needs to dominate the context instruction, or come after it deterministically. In this latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption) because otherwise the assumption would prove its feeding comparison trivial and would be removed. This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests). llvm-svn: 217342
*	Add functions for finding ephemeral values	Hal Finkel	2014-09-07	2	-0/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a set of utility functions for collecting 'ephemeral' values. These are LLVM IR values that are used only by @llvm.assume intrinsics (directly or indirectly), and thus will be removed prior to code generation, implying that they should be considered free for certain purposes (like inlining). The inliner's cost analysis, and a few other passes, have been updated to account for ephemeral values using the provided functionality. This functionality is important for the usability of @llvm.assume, because it limits the "non-local" side-effects of adding llvm.assume on inlining, loop unrolling, etc. (these are hints, and do not generate code, so they should not directly contribute to estimates of execution cost). llvm-svn: 217335
*	InstCombine: Remove a special case pattern	David Majnemer	2014-09-05	1	-6/+21
\| \| \| \| \| \| \|	The special case did not work when run under -reassociate and can easily be expressed by a further generalization of an existing pattern. llvm-svn: 217227
*	IndVarSimplify: Don't let LFTR compare against a poison value	David Majnemer	2014-09-03	5	-13/+240
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	LinearFunctionTestReplace tries to use the next indvar to compare against when possible. However, it may be the case that the calculation for the next indvar has NUW/NSW flags and that it may only be safely used inside the loop. Using it in a comparison to calculate the exit condition could result in observing poison. This fixes PR20680. Differential Revision: http://reviews.llvm.org/D5174 llvm-svn: 217102
*	Use target-dependent emitLeading/TrailingFence instead of the ↵	Robin Morisset	2014-09-03	2	-43/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	target-independent insertLeading/TrailingFence (in AtomicExpandPass) Fixes two latent bugs: - There was no fence inserted before expanded seq_cst load (unsound on Power) - There was only a fence release before seq_cst stores (again unsound, in particular on Power) It is not even clear if this is correct on ARM swift processors (where release fences are DMB ishst instead of DMB ish). This behaviour is currently preserved on ARM Swift as it is not clear whether it is incorrect. I would love to get documentation stating whether it is correct or not. These two bugs were not triggered because Power is not (yet) using this pass, and these behaviours happen to be (mostly?) working on ARM (although they completely butchered the semantics of the llvm IR). See: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075821.html for an example of the problems that can be caused by the second of these bugs. I couldn't see a way of fixing these in a completely target-independent way without adding lots of unnecessary fences on ARM, hence the target-dependent parts of this patch. This patch implements the new target-dependent parts only for ARM (the default of not doing anything is enough for AArch64), other architectures will use this infrastructure in later patches. llvm-svn: 217076
*	Preserve IR flags (nsw, nuw, exact, fast-math) in SLP vectorizer (PR20802).	Sanjay Patel	2014-09-03	12	-24/+374
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math) when converting scalar instructions into vectors. But this isn't a simple copy - we need to take the intersection (the logical 'and') of the sets of flags on the scalars. The solution is further complicated because we can have non-uniform (non-SIMD) vector ops after: http://reviews.llvm.org/D4015 http://llvm.org/viewvc/llvm-project?view=revision&revision=211339 The vast majority of changed files are existing tests that were not propagating IR flags, but I've also added a new test file for focused testing of IR flag possibilities. Differential Revision: http://reviews.llvm.org/D5172 llvm-svn: 217051
*	Generate extract for in-tree uses if the use is scalar operand in vectorized ↵	Yi Jiang	2014-09-02	1	-0/+70
\| \| \| \| \| \|	instruction. radar://18144665 llvm-svn: 216946
*	Fix crash when looking up the addrspace of GEPs with vector types	Matt Arsenault	2014-09-02	1	-0/+12
\| \| \| \| \| \|	Patch by Björn Steinbrink llvm-svn: 216930
*	Revert: [APFloat] Fixed a bug in method 'fusedMultiplyAdd'.	Andrea Di Biagio	2014-09-02	1	-119/+0
\| \| \| \| \| \| \|	This reverts revision 216913; the new test added at revision 216913 caused regression failures on a couple of buildbots. llvm-svn: 216914
*	[APFloat] Fixed a bug in method 'fusedMultiplyAdd'.	Andrea Di Biagio	2014-09-02	1	-0/+119
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When folding a fused multiply-add builtin call, make sure that we propagate the correct result in the case where the addend is zero, and the two other operands are finite non-zero. Example: define double @test() { %1 = call double @llvm.fma.f64(double 7.0, double 8.0, double 0.0) ret double %1 } Before this patch, the instruction simplifier wrongly folded the builtin call in function @test to constant 'double 7.0'. With this patch, method 'fusedMultiplyAdd' correctly evaluates the multiply and propagates the expected result (i.e. 56.0). Added test fold-builtin-fma.ll with the reproducible from PR20832 plus extra test cases to verify the behavior of method 'fusedMultiplyAdd' in the presence of NaN/Inf operands. This fixes PR20832. Differential Revision: http://reviews.llvm.org/D5152 llvm-svn: 216913
*	LICM: Don't crash when an instruction is used by an unreachable BB	David Majnemer	2014-09-02	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: BBs might contain non-LCSSA'd values after the LCSSA pass is run if they are unreachable from the entry block. Normally, the users of the instruction would be PHIs but the unreachable BBs have normal users; rewrite their uses to be undef values. An alternative fix could involve fixing this at LCSSA but that would require this invariant to hold after subsequent transforms. If a BB created an unreachable block, they would be in violation of this. This fixes PR19798. Differential Revision: http://reviews.llvm.org/D5146 llvm-svn: 216911
*	SROA: Don't insert instructions before a PHI	David Majnemer	2014-09-01	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SROA may decide that it needs to insert a bitcast and would set it's insertion point before a PHI. This will create an invalid module right quick. Instead, choose the first insertion point in the basic block that holds our PHI. This fixes PR20822. Differential Revision: http://reviews.llvm.org/D5141 llvm-svn: 216891
*	Revert "Revert two GEP-related InstCombine commits"	David Majnemer	2014-09-01	1	-0/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit r216698 which reverted r216523 and r216598. We would attempt to perform the transformation even if the match() failed because, as a side effect, it would set V. This would trick us into believing that we correctly found a place to correctly apply the transform. An additional test case was added to getelementptr.ll so that we might not regress in the future. llvm-svn: 216890
*	Add a convenience method to copy wrapping, exact, and fast-math flags (NFC).	Sanjay Patel	2014-09-01	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The loop vectorizer preserves wrapping, exact, and fast-math properties of scalar instructions. This patch adds a convenience method to make that operation easier because we need to do this in the loop vectorizer, SLP vectorizer, and possibly other places. Although this is a 'no functional change' patch, I've added a testcase to verify that the exact flag is preserved by the loop vectorizer. The wrapping and fast-math flags are already checked in existing testcases. Differential Revision: http://reviews.llvm.org/D5138 llvm-svn: 216886
*	Fix a really bad miscompile introduced in r216865 - the else-if logic	Chandler Carruth	2014-09-01	1	-9/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	chain became completely broken here as all intrinsic users ended up being skipped, and the ones that seemed to be singled out were actually the exact wrong set. This is a great example of why long else-if chains can be easily confusing. Switch the entire code to use early exits and early continues to have simpler (and more importantly, correct) logic here, as well as fixing the reversed logic for detecting and continuing on lifetime intrinsics. I've also significantly cleaned up the test case and added another test case demonstrating an example where the optimization is not (trivially) safe to perform. llvm-svn: 216871
*	Small refactor on VectorizerHint for deduplication	Renato Golin	2014-09-01	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, the hint mechanism relied on clean up passes to remove redundant metadata, which still showed up if running opt at low levels of optimization. That also has shown that multiple nodes of the same type, but with different values could still coexist, even if temporary, and cause confusion if the next pass got the wrong value. This patch makes sure that, if metadata already exists in a loop, the hint mechanism will never append a new node, but always replace the existing one. It also enhances the algorithm to cope with more metadata types in the future by just adding a new type, not a lot of code. Re-applying again due to MSVC 2013 being minimum requirement, and this patch having C++11 that MSVC 2012 didn't support. Fixes PR20655. llvm-svn: 216870
*	Feed AA to the inliner and use AA->getModRefBehavior in AddAliasScopeMetadata	Hal Finkel	2014-09-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This feeds AA through the IFI structure into the inliner so that AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which functions only access their arguments (instead of just hard-coding some knowledge of memory intrinsics). Most of the information is only available from BasicAA; this is important for preserving alias scoping information for target-specific intrinsics when doing the noalias parameter attribute to metadata conversion. llvm-svn: 216866
*	Ignore lifetime intrinsics in use list for MemCpyOptimizer. Patch by Luqman ↵	Nick Lewycky	2014-09-01	1	-0/+28
\| \| \| \| \| \|	Aden, review by Hal Finkel. llvm-svn: 216865
*	Fix AddAliasScopeMetadata again - alias.scope must be a complete description	Hal Finkel	2014-09-01	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I thought that I had fixed this problem in r216818, but I did not do a very good job. The underlying issue is that when we add alias.scope metadata we are asserting that this metadata completely describes the aliasing relationships within the current aliasing scope domain, and so in the context of translating noalias argument attributes, the pointers must all be based on noalias arguments (as underlying objects) and have no other kind of underlying object. In r216818 excluding appropriate accesses from getting alias.scope metadata is done by looking for underlying objects that are not identified function-local objects -- but that's wrong because allocas, etc. are also function-local objects and we need to explicitly check that all underlying objects are the noalias arguments for which we're adding metadata aliasing scopes. This fixes the underlying-object check for adding alias.scope metadata, and does some refactoring of the related capture-checking eligibility logic (and adds more comments; hopefully making everything a bit clearer). Fixes self-hosting on x86_64 with -mllvm -enable-noalias-to-md-conversion (the feature is still disabled by default). llvm-svn: 216863
*	Fix AddAliasScopeMetadata to not add scopes when deriving from unknown pointers	Hal Finkel	2014-08-30	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous implementation of AddAliasScopeMetadata, which adds noalias metadata to preserve noalias parameter attribute information when inlining had a flaw: it would add alias.scope metadata to accesses which might have been derived from pointers other than noalias function parameters. This was incorrect because even some access known not to alias with all noalias function parameters could easily alias with an access derived from some other pointer. Instead, when deriving from some unknown pointer, we cannot add alias.scope metadata at all. This fixes a miscompile of the test-suite's tramp3d-v4. Furthermore, we cannot add alias.scope to functions unless we know they access only argument-derived pointers (currently, we know this only for memory intrinsics). Also, we fix a theoretical problem with using the NoCapture attribute to skip the capture check. This is incorrect (as explained in the comment added), but would not matter in any code generated by Clang because we get only inferred nocapture attributes in Clang-generated IR. This functionality is not yet enabled by default. llvm-svn: 216818
*	InstCombine: Try harder to combine icmp instructions	David Majnemer	2014-08-30	1	-1/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	consider: (and (icmp X, Y), (and Z, (icmp A, B))) It may be possible to combine (icmp X, Y) with (icmp A, B). If we successfully combine, create an 'and' instruction with Z. This fixes PR20814. N.B. There is room for improvement after this change but I'm not convinced it's worth chasing yet. llvm-svn: 216814
*	Relax the constraint more in MemoryDependencyAnalysis.cpp	Robin Morisset	2014-08-29	2	-58/+107
\| \| \| \| \| \| \| \|	Even loads/stores that have a stronger ordering than monotonic can be safe. The rule is no release-acquire pair on the path from the QueryInst, assuming that the QueryInst is not atomic itself. llvm-svn: 216771
*	Make fabs safe to speculatively execute	Matt Arsenault	2014-08-29	1	-0/+17
\| \| \| \|	llvm-svn: 216736
*	Revert two GEP-related InstCombine commits	David Majnemer	2014-08-29	1	-58/+0
\| \| \| \| \| \| \|	This reverts commit r216523 and r216598; people have reported regressions. llvm-svn: 216698
*	Don't promote byval pointer arguments when padding matters	Reid Kleckner	2014-08-28	2	-0/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Don't promote byval pointer arguments when when their size in bits is not equal to their alloc size in bits. This can happen for x86_fp80, where the size in bits is 80 but the alloca size in bits in 128. Promoting these types can break passing unions of x86_fp80s and other types. Patch by Thomas Jablin! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D5057 llvm-svn: 216693
*	Fix: SLPVectorizer tried to move an instruction which was replaced by a ↵	Erik Eckstein	2014-08-28	1	-0/+41
\| \| \| \| \| \| \| \| \| \|	vector instruction. For a detailed description of the problem see the comment in the test file. The problematic moveBefore() calls are not required anymore because the new scheduling algorithm ensures a correct ordering anyway. llvm-svn: 216656
*	InstSimplify: Move a transform from InstCombine to InstSimplify	David Majnemer	2014-08-28	2	-34/+70
\| \| \| \| \| \| \| \|	Several combines involving icmp (shl C2, %X) C1 can be simplified without introducing any new instructions. Move them to InstSimplify; while we are at it, make them more powerful. llvm-svn: 216642
*	InstCombine: Combine gep X, (Y-X) to Y	David Majnemer	2014-08-27	1	-0/+13
\| \| \| \| \| \| \| \|	We try to perform this transform in InstSimplify but we aren't always able to. Sometimes, we need to insert a bitcast if X and Y don't have the same time. llvm-svn: 216598
*	InstSimplify: Don't simplify gep X, (Y-X) to Y if types differ	David Majnemer	2014-08-27	1	-0/+14
\| \| \| \| \| \| \| \| \|	It's incorrect to perform this simplification if the types differ. A bitcast would need to be inserted for this to work. This fixes PR20771. llvm-svn: 216597
*	Reland r216439 215441, majnemer has a real fix for PR20771.	Nico Weber	2014-08-27	1	-0/+66
\| \| \| \|	llvm-svn: 216586
*	Revert r216439 (and r216441, else the former doesn't revert cleanly).	Nico Weber	2014-08-27	1	-66/+0
\| \| \| \| \| \|	It caused PR 20771. I'll land a test on the clang side. llvm-svn: 216582
*	InstSimplify: Compute comparison ranges for left shift instructions	David Majnemer	2014-08-27	1	-0/+27
\| \| \| \| \| \| \| \|	'shl nuw CI, x' produces [CI, CI << CLZ(CI)] 'shl nsw CI, x' produces [CI << CLO(CI)-1, CI] if CI is negative 'shl nsw CI, x' produces [CI, CI << CLZ(CI)-1] if CI is non-negative llvm-svn: 216570
*	[SLP] Re-enable vectorization of GEP expressions (re-apply r210342 with a fix).	Michael Zolotukhin	2014-08-27	1	-0/+41
\| \| \| \|	llvm-svn: 216549
*	InstCombine: Optimize GEP's involving ptrtoint better	David Majnemer	2014-08-27	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We supported transforming: (gep i8* X, -(ptrtoint Y)) to: (inttoptr (sub (ptrtoint X), (ptrtoint Y))) However, this only fired if 'X' had type i8*. Generalize this to support various types of different sizes. This results in much better CodeGen, especially for pointers to packed structs. llvm-svn: 216523
*	Revert r210342 and r210343, add test case for the crasher.	Joerg Sonnenberger	2014-08-26	2	-41/+19
\| \| \| \| \| \|	PR 20642. llvm-svn: 216475
*	InstSimplify: Fold gep X, (sub 0, ptrtoint(X)) to null	David Majnemer	2014-08-26	1	-0/+29
\| \| \| \| \| \| \|	Save InstCombine some work if we can perform this fold during InstSimplify. llvm-svn: 216441
*	InstSimplify: Simplify trivial pointer expressions like b + (e - b)	David Majnemer	2014-08-26	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	consider: long long f(long long b, long long e) { return b + (e - b); } we would lower this to something like: define i64 @f(i64* %b, i64* %e) { %1 = ptrtoint i64* %e to i64 %2 = ptrtoint i64* %b to i64 %3 = sub i64 %1, %2 %4 = ashr exact i64 %3, 3 %5 = getelementptr inbounds i64* %b, i64 %4 ret i64* %5 } This should fold away to just 'e'. N.B. This adds m_SpecificInt as a convenient way to match against a particular 64-bit integer when using LLVM's match interface. llvm-svn: 216439
*	musttail: Don't eliminate varargs packs if there is a forwarding call	Reid Kleckner	2014-08-26	1	-6/+30
\| \| \| \| \| \|	Also clean up and beef up this grep test for the feature. llvm-svn: 216425