bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[Tests] Autogen a test so future changes are visible	Philip Reames	2019-06-04	1	-34/+122
\| \| \| \| \| \|	Oddly, I had to change a value name from "tmp0" to "bc0" to get the autogened test to pass. I'm putting this down to an oddity of update_test_checks or FileCheck, but don't understand it. llvm-svn: 362532
*	[Tests] Update a test to consistently use new pass manager and FileCheck the ↵	Philip Reames	2019-06-04	1	-1/+1
\| \| \| \| \| \|	result llvm-svn: 362518
*	[Tests] Autogen tests so that diffs for a future change are understandable	Philip Reames	2019-06-04	2	-29/+119
\| \| \| \|	llvm-svn: 362516
*	[Tests] Add LFTR tests for multiple exit loops (try 2)	Philip Reames	2019-06-03	1	-0/+276
\| \| \| \| \| \| \| \|	(Recommit after fixing a keymash in the run line. Sorry for breakage.) This is preparation for D62625 <https://reviews.llvm.org/D62625> llvm-svn: 362426
*	Revert "[Tests] Add LFTR tests for multiple exit loops"	Dmitri Gribenko	2019-06-03	1	-276/+0
\| \| \| \| \| \|	This reverts commit r362417. There's a syntax error in the RUN line. llvm-svn: 362418
*	[Tests] Add LFTR tests for multiple exit loops	Philip Reames	2019-06-03	1	-0/+276
\| \| \| \| \| \|	This is preparation for D62625 llvm-svn: 362417
*	[IndVarSimplify] Add tests for saturating math on IV; NFC	Nikita Popov	2019-06-02	1	-0/+123
\| \| \| \| \| \|	These saturating math ops can be replaced with simple math. llvm-svn: 362320
*	[IndVarSimplify] Fixup nowrap flags during LFTR (PR31181)	Nikita Popov	2019-06-01	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix for https://bugs.llvm.org/show_bug.cgi?id=31181 and partial fix for LFTR poison handling issues in general. When LFTR moves a condition from pre-inc to post-inc, it may now depend on value that is poison due to nowrap flags. To avoid this, we clear any nowrap flag that SCEV cannot prove for the post-inc addrec. Additionally, LFTR may switch to a different IV that is dynamically dead and as such may be arbitrarily poison. This patch will correct nowrap flags in some but not all cases where this happens. This is related to the adoption of IR nowrap flags for the pre-inc addrec. (See some of the switch_to_different_iv tests, where flags are not dropped or insufficiently dropped.) Finally, there are likely similar issues with the handling of GEP inbounds, but we don't have a test case for this yet. Differential Revision: https://reviews.llvm.org/D60935 llvm-svn: 362292
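The poison issue described in the PR31181 fix can be illustrated with a small model. This is a minimal sketch, assuming a hypothetical i8 loop that counts from 0 and exits when the pre-increment IV equals 255. The loop, the `run` helper, and the use of `None` to stand for poison are illustrative assumptions, not taken from the patch or its tests.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1
POISON = None  # stand-in for an LLVM poison value (assumption of this sketch)

def add_nuw(a, b):
    """Model of 'add nuw': yields poison if the unsigned add overflows."""
    r = a + b
    return POISON if r > MASK else r

def add_wrap(a, b):
    """Model of a plain 'add' with no nowrap flags: wraps modulo 2**WIDTH."""
    return (a + b) & MASK

def run(exit_on_post_inc, keep_nuw):
    """Run the loop; return (iterations, whether the exit test saw poison)."""
    i = 0
    trips = 0
    while True:
        trips += 1
        i_next = add_nuw(i, 1) if keep_nuw else add_wrap(i, 1)
        if exit_on_post_inc:
            # Exit test after LFTR: compares the post-increment value.
            if i_next is POISON:
                return trips, True
            done = (i_next == 0)
        else:
            # Original exit test: compares the pre-increment value.
            done = (i == MASK)
        if done:
            return trips, False
        i = i_next

print(run(exit_on_post_inc=False, keep_nuw=True))
print(run(exit_on_post_inc=True, keep_nuw=True))
print(run(exit_on_post_inc=True, keep_nuw=False))
```

In this model the original loop never looks at the overflowing increment, the rewritten loop with the flag kept tests a poison value on its last iteration, and the rewritten loop with the flag cleared exits cleanly after the same number of iterations.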
*	[IndVarSimplify] Add additional PR33181 tests; NFC	Nikita Popov	2019-06-01	1	-4/+90
\| \| \| \| \| \| \|	Two more tests with a switch to a dynamically dead IV, with poison occurring on the first or second iteration. llvm-svn: 362291
*	[LFTR] Add additional PR31181 test cases	Nikita Popov	2019-05-20	1	-0/+122
\| \| \| \| \| \| \| \|	One case where overflow happens in the first loop iteration, and two cases where we switch to a dynamically dead IV with post/pre increment, respectively. llvm-svn: 361189
*	[Tests] Consolidate more lftr tests	Philip Reames	2019-05-17	3	-318/+299
\| \| \| \| \| \|	These are all of the ones involving the same data layout string. Remainder take a bit more consideration, but at least everything can be auto-updated now. llvm-svn: 360961
*	[Tests] Expand basic lftr coverage	Philip Reames	2019-05-16	1	-5/+121
\| \| \| \| \| \|	Newly written tests to cover the simple cases. We don't appear to have broad coverage of this transform anywhere. llvm-svn: 360957
*	[Tests] More consolidation of lftr tests	Philip Reames	2019-05-16	3	-73/+69
\| \| \| \|	llvm-svn: 360936
*	[Test] Remove a bunch of cruft from a test	Philip Reames	2019-05-16	1	-26/+11
\| \| \| \| \| \|	This test hadn't been fully reduced, so do so. llvm-svn: 360935
*	[Tests] Start consolidating lftr tests into a single file	Philip Reames	2019-05-16	4	-126/+117
\| \| \| \|	llvm-svn: 360934
*	[Tests] Autogen the last lftr test	Philip Reames	2019-05-16	1	-14/+222
\| \| \| \|	llvm-svn: 360933
*	[Tests] Autogen a few more lftr tests for readability	Philip Reames	2019-05-16	2	-13/+43
\| \| \| \|	llvm-svn: 360932
*	[Tests] Autogen a few lftr test in preparation for merging	Philip Reames	2019-05-16	3	-40/+203
\| \| \| \|	llvm-svn: 360931
*	[IndVars] Extend reasoning about loop invariant exits to non-header blocks	Philip Reames	2019-05-14	1	-0/+65
\| \| \| \| \| \|	Noticed while glancing through the code for other reasons. The extension is trivial enough, decided to just do it. llvm-svn: 360694
*	[Test] Autogen a test for ease of later changing	Philip Reames	2019-05-14	1	-28/+61
\| \| \| \|	llvm-svn: 360690
*	[SCEV] Add explicit representations of umin/smin	Keno Fischer	2019-05-07	1	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently we express umin as `~umax(~x, ~y)`. However, this becomes a problem for operands in non-integral pointer spaces, because `~x` is not something we can compute for `x` non-integral. However, since comparisons are generally still allowed, we are actually able to express `umin(x, y)` directly as long as we don't try to express it as a umax. Support this by adding an explicit umin/smin representation to SCEV. We do this by factoring the existing getUMax/getSMax functions into a new function that does all four. The previous two functions were largely identical. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D50167 llvm-svn: 360159
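The old encoding mentioned in the umin/smin commit relies on the identity umin(x, y) = ~umax(~x, ~y), and likewise for the signed pair. A minimal sketch that checks both identities exhaustively on 8-bit integers follows. The helper names are hypothetical, and this only checks the arithmetic. It does not model SCEV or non-integral pointers.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1

def bit_not(v):
    """Bitwise NOT of an unsigned WIDTH-bit value."""
    return ~v & MASK

def umin_via_umax(x, y):
    """umin expressed through umax and NOT, as in the old SCEV encoding."""
    return bit_not(max(bit_not(x), bit_not(y)))

def smin_via_smax(x, y):
    """smin through smax and NOT; on signed values, ~v equals -v - 1."""
    return ~max(~x, ~y)

def identities_hold():
    """Check both identities for every pair of 8-bit values."""
    unsigned_ok = all(umin_via_umax(x, y) == min(x, y)
                      for x in range(1 << WIDTH) for y in range(1 << WIDTH))
    lo, hi = -(1 << (WIDTH - 1)), 1 << (WIDTH - 1)
    signed_ok = all(smin_via_smax(x, y) == min(x, y)
                    for x in range(lo, hi) for y in range(lo, hi))
    return unsigned_ok and signed_ok

print(identities_hold())
```

The identity is only usable when `~x` can be computed, which is exactly what fails for non-integral pointers and motivates the explicit representation.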
*	[IndVarSimplify] Generate full checks for some LFTR tests; NFC	Nikita Popov	2019-04-20	7	-128/+326
\| \| \| \|	llvm-svn: 358813
*	[IndVarSimplify] Add tests for PR31181; NFC	Nikita Popov	2019-04-20	1	-0/+152
\| \| \| \|	llvm-svn: 358812
*	Revert "Temporarily Revert "Add basic loop fusion pass.""	Eric Christopher	2019-04-17	164	-0/+13435
\| \| \| \| \| \| \| \|	The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
*	Temporarily Revert "Add basic loop fusion pass."	Eric Christopher	2019-04-17	164	-13435/+0
\| \| \| \| \| \| \| \|	As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
*	[SCEV] Ensure that isHighCostExpansion takes into account what is being divided	David Green	2019-03-05	2	-57/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	A SCEV is not low-cost just because you can divide it by a power of 2. We need to also check what we are dividing to make sure it too is not a high-cost expansion. This helps to not expand the exit value of certain loops, helping not to bloat the code. The change in no-iv-rewrite.ll is reverting back to what it was testing before rL194116, and looks a lot like the other tests in replace-loop-exit-folds.ll. Differential Revision: https://reviews.llvm.org/D58435 llvm-svn: 355393
*	[SCEV] Add some extra tests for IndVarSimplifys loop exit values. NFC.	David Green	2019-03-05	1	-0/+232
\| \| \| \| \| \| \| \| \| \| \|	Add some tests for various loops of the form: while(S >= 32) { S -= 32; something(); }; return S; llvm-svn: 355389
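For loops of the form quoted in the test description, the value of S after the loop has a closed form. A minimal sketch, assuming S is an ordinary integer and ignoring `something()`: it compares a direct simulation with the closed form. The function names are hypothetical, and the sketch says nothing about when the pass considers such an expansion cheap enough to emit.

```python
def run_loop(s):
    """Simulate: while (S >= 32) { S -= 32; } return S; also count iterations."""
    trips = 0
    while s >= 32:
        s -= 32
        trips += 1
    return s, trips

def closed_form(s):
    """Exit value and trip count without running the loop."""
    if s < 0:
        return s, 0          # the loop body never executes
    return s % 32, s // 32   # remainder and quotient by the step

def agree(limit=2000):
    """Check the closed form against the simulation on a range of inputs."""
    return all(run_loop(s) == closed_form(s) for s in range(-limit, limit))

print(run_loop(100), closed_form(100))
```

The closed form needs a division (or shift) and a remainder of S, which is why the cost of computing the dividend itself matters when deciding whether to rewrite the exit value.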
*	[SCEV] Handle case where MaxBECount is less precise than ExactBECount for OR.	Florian Hahn	2019-03-02	1	-20/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some cases, MaxBECount can be less precise than ExactBECount for AND and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are undef, but MaxBECount are different, so we hit the assertion below. This patch uses the same solution the AND case already uses. Assertion failed: ((isa<SCEVCouldNotCompute>(ExactNotTaken) \|\| !isa<SCEVCouldNotCompute>(MaxNotTaken)) && "Exact is not allowed to be less precise than Max"), function ExitLimit This patch also consolidates test cases for both AND and OR in a single test case. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245 Reviewers: sanjoy, efriedma, mkazantsev Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58853 llvm-svn: 355259
*	[IndVars] Fix corner case with unreachable Phi inputs. PR40454	Max Kazantsev	2019-02-12	1	-3/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Logic in `getInsertPointForUses` doesn't account for a corner case when `Def` only comes to a Phi user from unreachable blocks. In this case, the incoming value may be arbitrary (and not even available in the input block) and break the loop-related invariants that are asserted below. In fact, if we encounter this situation, no IR modification is needed. This Phi will be simplified away with nearest cleanup. Differential Revision: https://reviews.llvm.org/D58045 Reviewed By: spatel llvm-svn: 353816
*	[TEST] Add missing opportunity test for PR39673	Max Kazantsev	2019-02-11	1	-0/+56
\| \| \| \|	llvm-svn: 353693
*	[TEST] Add failing test from PR40454	Max Kazantsev	2019-02-11	1	-0/+41
\| \| \| \|	llvm-svn: 353688
*	Return "[IndVars] Smart hard uses detection"	Max Kazantsev	2018-11-08	2	-0/+89
\| \| \| \| \| \| \| \| \| \|	The patch has been reverted because it ended up prohibiting propagation of a constant to exit value. For such values, we should skip all checks related to hard uses because propagating a constant is always profitable. Differential Revision: https://reviews.llvm.org/D53691 llvm-svn: 346397
*	[NFC] Add motivating test case for revert in rL346198	Max Kazantsev	2018-11-06	1	-0/+35
\| \| \| \|	llvm-svn: 346199
*	Revert "[IndVars] Smart hard uses detection"	Max Kazantsev	2018-11-06	2	-89/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 2f425e9c7946b9d74e64ebbfa33c1caa36914402. It seems that the check that we still should do the transform if we know the result is constant is missing in this code. So the logic that has been deleted by this change is still sometimes accidentally useful. I revert the change to see what can be done about it. The motivating case is the following: @Y = global [400 x i16] zeroinitializer, align 1 define i16 @foo() { entry: br label %for.body for.body: ; preds = %entry, %for.body %i = phi i16 [ 0, %entry ], [ %inc, %for.body ] %arrayidx = getelementptr inbounds [400 x i16], [400 x i16]* @Y, i16 0, i16 %i store i16 0, i16* %arrayidx, align 1 %inc = add nuw nsw i16 %i, 1 %cmp = icmp ult i16 %inc, 400 br i1 %cmp, label %for.body, label %for.end for.end: ; preds = %for.body %inc.lcssa = phi i16 [ %inc, %for.body ] ret i16 %inc.lcssa } We should be able to figure out that the result is constant, but the patch breaks it. Differential Revision: https://reviews.llvm.org/D51584 llvm-svn: 346198
*	[IndVars] Smart hard uses detection	Max Kazantsev	2018-11-01	2	-0/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When rewriting loop exit values, IndVars considers this transform not profitable if the loop instruction has a loop user which it believes cannot be optimized away. In the current implementation only calls that immediately use the instruction are considered as such. This patch extends the definition of "hard" users to any side-effecting instructions (which usually cannot be optimized away from the loop) and also allows handling of not just immediate users, but use chains. Differential Revision: https://reviews.llvm.org/D51584 Reviewed By: etherzhhb llvm-svn: 345814
*	[IndVars] Strengthen restriction in rewriteLoopExitValues	Max Kazantsev	2018-10-31	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For some unclear reason rewriteLoopExitValues considers recalculation after the loop profitable if it has some "soft uses" outside the loop (i.e. any use other than call and return), even if we have proved that it has a user inside the loop which we think will not be optimized away. There is no existing unit test that would explain this. This patch provides an example when rematerialisation of exit value is not profitable but it passes this check due to presence of a "soft use" outside the loop. It makes no sense to recalculate the value on exit if we are going to compute it anyway due to some irremovable use within the loop. This patch disallows applying this transform in the described situation. Differential Revision: https://reviews.llvm.org/D51581 Reviewed By: etherzhhb llvm-svn: 345708
*	[IndVars] Drop "exact" flag from lshr and udiv when substituting their args	Max Kazantsev	2018-10-11	1	-0/+99
\| \| \| \| \| \| \| \| \| \| \| \|	There is a transform that may replace `lshr (x+1), 1` with `lshr x, 1` in case if it can prove that the result will be the same. However the initial instruction might have an `exact` flag set, and it now should be dropped unless we prove that it may hold. Incorrectly set `exact` attribute may then produce poison. Differential Revision: https://reviews.llvm.org/D53061 Reviewed By: sanjoy llvm-svn: 344223
*	[IndVars] Remove unreasonable checks in rewriteLoopExitValues	Max Kazantsev	2018-09-18	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \|	A piece of logic in rewriteLoopExitValues has a weird check on number of users which allowed an unprofitable transform in case if an instruction has more than 6 users. Differential Revision: https://reviews.llvm.org/D51404 Reviewed By: etherzhhb llvm-svn: 342444
*	AMDGPU: Fix some outdated datalayouts in tests	Matt Arsenault	2018-09-13	1	-1/+1
\| \| \| \|	llvm-svn: 342131
*	[NFC] Specify test's option to reduce reliance on defaults	Max Kazantsev	2018-09-11	1	-1/+1
\| \| \| \|	llvm-svn: 341904
*	[IndVars] Set Changed if sinkUnusedInvariants changes IR. PR38863	Max Kazantsev	2018-09-10	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \|	Currently, `sinkUnusedInvariants` does not set Changed flag even if it makes changes in the IR. There is no clear evidence that it can cause a crash, but it looks highly suspicious and likely invalid. Differential Revision: https://reviews.llvm.org/D51777 Reviewed By: skatkov llvm-svn: 341777
*	[SimplifyIndVar] Avoid generating truncate instructions with non-hoisted ↵	Abderrazek Zaafrani	2018-09-07	1	-0/+84
\| \| \| \| \| \| \| \|	Load operand. Differential Revision: https://reviews.llvm.org/D49151 llvm-svn: 341726
*	[IndVars] Set Changed when we delete dead instructions. PR38855	Max Kazantsev	2018-09-07	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \|	IndVars does not set `Changed` flag when it eliminates dead instructions. As a result, it may make IR modifications and report that it has done nothing. It leads to inconsistent preserved analyses results. Differential Revision: https://reviews.llvm.org/D51770 Reviewed By: skatkov llvm-svn: 341633
*	[NFC] Add test on full IV widening	Max Kazantsev	2018-09-05	1	-0/+44
\| \| \| \|	llvm-svn: 341456
*	[IndVars] Fix usage of SCEVExpander to not mess with SCEVConstant. PR38674	Max Kazantsev	2018-09-04	1	-0/+141
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes the function `expandSCEVIfNeeded` which behaves not as it was intended. This function tries to make a lookup for exact existing expansion and only goes to normal expansion via `expandCodeFor` if this lookup hasn't found anything. As a result of this, if some instruction above the loop has a `SCEVConstant` SCEV, this logic will return this instruction when asked for this `SCEVConstant` rather than return a constant value. This is both non-profitable and in some cases leads to breach of LCSSA form (as in PR38674). Whether or not it is possible to break LCSSA with this algorithm and with some non-constant SCEVs is still in question, this is still being investigated. I wasn't able to construct such a test so far, so maybe this situation is impossible. If it is, it will go as a separate fix. Rather than do it, it is always correct to just invoke `expandCodeFor` unconditionally: it behaves smarter about insertion points, and as side effect of this it will choose a constant value for SCEVConstants. For other SCEVs it may end up finding a better insertion point. So it should not be worse in any case. NOTE: So far the only known case for which this transform may break LCSSA is mapping of SCEVConstant to an instruction. However there is a suspicion that the entire algorithm can compromise LCSSA form for other cases as well (yet not proved). Differential Revision: https://reviews.llvm.org/D51286 Reviewed By: etherzhhb llvm-svn: 341345
*	[PPC] Remove Darwin support from POWER backend.	Kit Barton	2018-08-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch issues an error message if Darwin ABI is attempted with the PPC backend. It also cleans up existing test cases, either converting the test to use an alternative triple or removing the test if the coverage is no longer needed. Updated Tests ------------- The majority of test cases were updated to use a different triple that does not include the Darwin ABI. Many tests were also updated to use FileCheck, in place of grep. Deleted Tests ------------- llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test specific functionality of dsymutil using an object file created with an old version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he suggested removing the test. llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a PPC test to a SystemZ test, as the behavior is also reproducible there. All other tests that were deleted were specific to the darwin/ppc ABI and no longer necessary. Phabricator Review: https://reviews.llvm.org/D50988 llvm-svn: 340795
*	[SimplifyIndVar] Canonicalize comparisons to unsigned while eliminating truncs	Max Kazantsev	2018-07-27	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \|	This is a follow-up for the patch rL335020. When we replace compares against trunc with compares against wide IV, we can also replace signed predicates with unsigned where it is legal. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D48763 llvm-svn: 338115
*	[SCEV] Add [zs]ext{C,+,x} -> (D + [zs]ext{C-D,+,x})<nuw><nsw> transform	Roman Tereshin	2018-07-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	as well as sext(C + x + ...) -> (D + sext(C-D + x + ...))<nuw><nsw> similar to the equivalent transformation for zext's if the top level addition in (D + (C-D + x * n)) could be proven to not wrap, where the choice of D also maximizes the number of trailing zeroes of (C-D + x * n), ensuring homogeneous behaviour of the transformation and better canonicalization of such AddRec's (indeed, there are 2^(2w) different expressions in `B1 + ext(B2 + Y)` form for the same Y, but only 2^(2w - k) different expressions in the resulting `B3 + ext((B4 * 2^k) + Y)` form, where w is the bit width of the integral type) This patch generalizes sext(C1 + C2X) --> sext(C1) + sext(C2X) and sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformations added in r209568 relaxing the requirements the following way: 1. C2 doesn't have to be a power of 2, it's enough if it's divisible by 2 a sufficient number of times; 2. C1 doesn't have to be less than C2, instead of extracting the entire C1 we can split it into 2 terms: (00...0XXX + YY...Y000), keep the second one that may cause wrapping within the extension operator, and move the first one that doesn't affect wrapping out of the extension operator, enabling further simplifications; 3. C1 and C2 don't have to be positive, splitting C1 like shown above produces a sum that is guaranteed to not wrap, signed or unsigned; 4. 
in AddExpr case there could be more than 2 terms, and in case of AddExpr the 2nd and following terms and in case of AddRecExpr the Step component don't have to be in the C2X form or constant (respectively), they just need to have enough trailing zeros, which in turn could be guaranteed by means other than arithmetic, e.g. by a pointer alignment; 5. the extension operator doesn't have to be a sext, the same transformation works and is profitable for zext's as well. Apparently, optimizations like SLPVectorizer currently fail to vectorize even rather trivial cases like the following: double bar(double *a, unsigned n) { double x = 0.0; double y = 0.0; for (unsigned i = 0; i < n; i += 2) { x += a[i]; y += a[i + 1]; } return x * y; } If compiled with `clang -std=c11 -Wpedantic -Wall -O3 main.c -S -o - -emit-llvm` (!{!"clang version 7.0.0 (trunk 337339) (llvm/trunk 337344)"}) it produces scalar code with the loop not unrolled with the unsigned `n` and `i` (like shown above), but vectorized and unrolled loop with signed `n` and `i`. With the changes made in this commit the unsigned version will be vectorized (though not unrolled for unclear reasons). How it all works: Let's say we have an AddExpr that looks like (C + x + y + ...), where C is a constant and x, y, ... are arbitrary SCEVs. Let's compute the minimum number of trailing zeroes guaranteed of that sum w/o the constant term: (x + y + ...). If, for example, those terms look as follows: i XXXX...X000 YYYY...YY00 ... ZZZZ...0000 then the rightmost non-guaranteed-zero bit (a potential one at i-th position above) can change the bits of the sum to the left (and at i-th position itself), but it can not possibly change the bits to the right. So we can compute the number of trailing zeroes by taking a minimum between the numbers of trailing zeroes of the terms. Now let's say that our original sum with the constant is effectively just C + X, where X = x + y + .... 
Let's also say that we've got 2 guaranteed trailing zeros for X: j CCCC...CCCC XXXX...XX00 // this is X = (x + y + ...) Any bit of C to the left of j may in the end cause the C + X sum to wrap, but the rightmost 2 bits of C (at positions j and j - 1) do not affect wrapping in any way. If the upper bits cause a wrap, it will be a wrap regardless of the values of the 2 least significant bits of C. If the upper bits do not cause a wrap, it won't be a wrap regardless of the values of the 2 bits on the right (again). So let's split C into 2 constants as follows: 0000...00CC = D CCCC...CC00 = (C - D) and represent the whole sum as D + (C - D + X). The second term of this new sum looks like this: CCCC...CC00 XXXX...XX00 ----------- // let's add them up YYYY...YY00 The sum above (let's call it Y) may or may not wrap, we don't know, so we need to keep it under a sext/zext. Adding D to that sum though will never wrap, signed or unsigned, if performed on the original bit width or the extended one, because all that the final add does is set the 2 least significant bits of Y to the bits of D: YYYY...YY00 = Y 0000...00CC = D ----------- <nuw><nsw> YYYY...YYCC Which means we can safely move that D out of the sext or zext and claim that the top-level sum neither sign wraps nor unsigned wraps. Let's run an example, let's say we're working in i8's and the original expression (zext's or sext's operand) is 21 + 12x + 8y. So it goes like this: 0001 0101 // 21 XXXX XX00 // 12x YYYY Y000 // 8y 0001 0101 // 21 ZZZZ ZZ00 // 12x + 8y 0000 0001 // D 0001 0100 // 21 - D = 20 ZZZZ ZZ00 // 12x + 8y 0000 0001 // D WWWW WW00 // 21 - D + 12x + 8y = 20 + 12x + 8y therefore zext(21 + 12x + 8y) = (1 + zext(20 + 12x + 8y))<nuw><nsw> This approach could be improved if we move away from using trailing zeroes and use KnownBits instead. For instance, with KnownBits we could have the following picture: i 10 1110...0011 // this is C XX X1XX...XX00 // this is X = (x + y + ...) 
Notice that some of the bits of X are known ones, also notice that known bits of X are interspersed with unknown bits and not grouped on the right or left. We can see at the position i that C(i) and X(i) are both known ones, therefore the (i + 1)th carry bit is guaranteed to be 1 regardless of the bits of C to the right of i. For instance, the C(i - 1) bit only affects the bits of the sum at positions i - 1 and i, and does not influence if the sum is going to wrap or not. Therefore we could split the constant C the following way: i 00 0010...0011 = D 10 1100...0000 = (C - D) Let's compute the KnownBits of (C - D) + X: XX1 1 = carry bit, blanks stand for known zeroes 10 1100...0000 = (C - D) XX X1XX...XX00 = X --- ----------- XX X0XX...XX00 Whether this add wraps depends essentially on the bits of X. Adding D to this sum, however, is guaranteed not to wrap: 0 X 00 0010...0011 = D sX X0XX...XX00 = (C - D) + X --- ----------- sX XXXX XX11 As could be seen above, adding D preserves the sign bit of (C - D) + X, if any, and has a guaranteed 0 carry out, as expected. The more bits of (C - D) we constrain, the better the transformations introduced here canonicalize expressions as it leaves less freedom to what values the constant part of ((C - D) + x + y + ...) can take. Reviewed By: mzolotukhin, efriedma Differential Revision: https://reviews.llvm.org/D48853 llvm-svn: 337943
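The i8 example from the commit message above, 21 + 12x + 8y, can be checked by brute force. This is a minimal sketch of the trailing-zeros variant only (not the KnownBits refinement). The helper names are hypothetical and the code does not reflect how SCEV implements the transform.

```python
WIDTH = 8                      # work in i8, as in the commit message example
MASK = (1 << WIDTH) - 1

def trailing_zeros(v):
    """Trailing zero bits of v within WIDTH bits (WIDTH if v is 0)."""
    v &= MASK
    return WIDTH if v == 0 else (v & -v).bit_length() - 1

def split_constant(c, min_tz):
    """Return (D, C - D), where D keeps only the low min_tz bits of C."""
    d = c & ((1 << min_tz) - 1)
    return d, (c - d) & MASK

def zext(v):
    """Value of the i8 bit pattern v after zero extension."""
    return v & MASK

def sext(v):
    """Value of the i8 bit pattern v after sign extension."""
    v &= MASK
    return v - (1 << WIDTH) if v >> (WIDTH - 1) else v

def check_example():
    """Compare ext(21 + 12x + 8y) with D + ext(C - D + 12x + 8y) for all x, y."""
    min_tz = min(trailing_zeros(12), trailing_zeros(8))  # zeros guaranteed for 12x + 8y
    d, rest = split_constant(21, min_tz)
    for x in range(1 << WIDTH):
        for y in range(1 << WIDTH):
            tail = 12 * x + 8 * y
            for ext in (zext, sext):
                # The outer add is done in unbounded integers, so equality
                # here also shows that it cannot wrap in the wider type.
                if ext(21 + tail) != d + ext(rest + tail):
                    return False
    return True

print(split_constant(21, 2), check_example())
```

Because the inner sum always ends in two zero bits, adding D only fills those bits in, so both the zero-extended and the sign-extended forms agree for every x and y.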
*	[IndVarSimplify] Ignore unreachable users of truncs	Max Kazantsev	2018-06-28	1	-0/+47
\| \| \| \| \| \| \|	If a trunc has a user in a block which is not reachable from entry, we can safely perform trunc elimination as if this user didn't exist. llvm-svn: 335816
*	[SimplifyIndVars] Eliminate redundant truncs	Max Kazantsev	2018-06-19	3	-3/+496
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds logic to deal with the following constructions: %iv = phi i64 ... %trunc = trunc i64 %iv to i32 %cmp = icmp <pred> i32 %trunc, %invariant Replacing it with %iv = phi i64 ... %cmp = icmp <pred> i64 %iv, sext/zext(%invariant) In case if it is legal. Specifically, if `%iv` has signed comparison users, it is required that `sext(trunc(%iv)) == %iv`, and if it has unsigned comparison uses then we require `zext(trunc(%iv)) == %iv`. The current implementation bails if `%trunc` has other uses than `icmp`, but in theory we can handle more cases here (e.g. if the user of trunc is bitcast). Differential Revision: https://reviews.llvm.org/D47928 Reviewed By: reames llvm-svn: 335020
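The legality condition for eliminating the trunc can be checked exhaustively at small bit widths. A minimal sketch follows, assuming a 4-bit narrow type and an 8-bit wide type in place of i32 and i64, and using only the `slt` and `ult` predicates. The helper names are hypothetical.

```python
NARROW, WIDE = 4, 8   # small stand-ins for i32 and i64 (assumption of this sketch)

def as_signed(v, width):
    """Interpret the low `width` bits of v as a two's complement value."""
    v &= (1 << width) - 1
    return v - (1 << width) if v >> (width - 1) else v

def trunc(v):
    """Truncate a WIDE bit pattern to NARROW bits."""
    return v & ((1 << NARROW) - 1)

def zext(v):
    """Zero-extend a NARROW bit pattern to WIDE bits."""
    return v & ((1 << NARROW) - 1)

def sext(v):
    """Sign-extend a NARROW bit pattern to WIDE bits."""
    return as_signed(v, NARROW) & ((1 << WIDE) - 1)

def legal(iv, signed):
    """The condition from the commit message: ext(trunc(iv)) == iv."""
    return (sext if signed else zext)(trunc(iv)) == iv

def rewrite_agrees(iv, inv, signed):
    """Does 'icmp lt trunc(iv), inv' equal 'icmp lt iv, ext(inv)'?"""
    if signed:
        narrow = as_signed(trunc(iv), NARROW) < as_signed(inv, NARROW)
        wide = as_signed(iv, WIDE) < as_signed(sext(inv), WIDE)
    else:
        narrow = trunc(iv) < inv
        wide = iv < zext(inv)
    return narrow == wide

def always_agrees_when_legal():
    """Check every iv and invariant for which the condition holds."""
    return all(rewrite_agrees(iv, inv, signed)
               for signed in (False, True)
               for iv in range(1 << WIDE) if legal(iv, signed)
               for inv in range(1 << NARROW))

print(always_agrees_when_legal(), rewrite_agrees(0x10, 1, signed=False))
```

The second printed value uses an IV of 0x10, which does not survive truncation, and shows that the narrow and wide comparisons can disagree when the condition is not met.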