summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
* Thread comparisons over udiv/sdiv/ashr/lshr exact and lshr nuw/nsw wheneverNick Lewycky2011-03-051-0/+21
| | | | | | | | | possible. This goes into instcombine and instsimplify because instsimplify doesn't need to check hasOneUse since it returns (almost exclusively) constants. This fixes PR9343 #4 #5 and #8! llvm-svn: 127064
* Try once again to optimize "icmp (srem X, Y), Y" by turning the comparison intoNick Lewycky2011-03-051-0/+29
| | | | | | true/false or "icmp slt/sge Y, 0". llvm-svn: 127063
* Make InstCombiner::FoldAndOfICmps create a ConstantRange that's theAnders Carlsson2011-03-011-8/+12
| | | | | | | | | intersection of the LHS and RHS ConstantRanges and return "false" when the range is empty. This simplifies some code and catches some extra cases. llvm-svn: 126744
* srem doesn't actually have the same resulting sign as its numerator, you couldNick Lewycky2011-02-281-10/+0
| | | | | | | also have a zero when numerator = denominator. Reverts parts of r126635 and r126637. llvm-svn: 126644
* Teach InstCombine to fold "(shr exact X, Y) == 0" --> X == 0, fixing #1 fromNick Lewycky2011-02-281-5/+13
| | | | | | PR9343. llvm-svn: 126643
* The sign of an srem instruction is the sign of its dividend (the firstNick Lewycky2011-02-281-3/+13
| | | | | | | argument), regardless of the divisor. Teach instcombine about this and fix test7 in PR9343! llvm-svn: 126635
* change instcombine to not turn a call to non-varargs bitcast ofChris Lattner2011-02-241-5/+15
| | | | | | | | | function prototype into a call to a varargs prototype. We do allow the xform if we have a definition, but otherwise we don't want to risk that we're changing the abi in a subtle way. On X86-64, for example, varargs require passing stuff in %al. llvm-svn: 126363
* Move "A | ~(A & ?) -> -1" from InstCombine to InstructionSimplify.Benjamin Kramer2011-02-201-16/+8
| | | | llvm-svn: 126082
* InstCombine: Add a bunch of combines of the form x | (y ^ z).Benjamin Kramer2011-02-201-0/+41
| | | | | | | | | We usually catch this kind of optimization through InstSimplify's distributive magic, but or doesn't distribute over xor in general. "A | ~(A | B) -> A | ~B" hits 24 times on gcc.c. llvm-svn: 126081
* PR9218: SimplifyDemandedVectorElts can return a non-null value that is notEli Friedman2011-02-191-2/+7
| | | | | | | the instruction passed in. Make sure to account for this correctly, instead of looping infinitely. llvm-svn: 126058
* Add some transforms of the kind X-Y>X -> 0>Y which are valid when there is noDuncan Sands2011-02-181-18/+17
| | | | | | overflow. These subsume some existing equality transforms, so zap those. llvm-svn: 125843
* have instcombine preserve nsw/nuw/exact when sinkingChris Lattner2011-02-171-11/+53
| | | | | | common operations through a phi. llvm-svn: 125790
* fix typoChris Lattner2011-02-171-1/+1
| | | | llvm-svn: 125787
* fix instcombine merging GEPs through a PHI to only make theChris Lattner2011-02-171-3/+7
| | | | | | result inbounds if all of the inputs are inbounds. llvm-svn: 125785
* add is always integer, thanks to Frits for noticing this.Chris Lattner2011-02-171-1/+1
| | | | llvm-svn: 125774
* Transform "A + B >= A + C" into "B >= C" if the adds do not wrap. Likewise ↵Duncan Sands2011-02-171-93/+106
| | | | | | | | | for some variations (some of these were already present so I unified the code). Spotted by my auto-simplifier as occurring a lot. llvm-svn: 125734
* preserve NUW/NSW when transforming add x,xChris Lattner2011-02-171-2/+7
| | | | llvm-svn: 125711
* Spelling fix: consequtive -> consecutive.Duncan Sands2011-02-151-1/+1
| | | | llvm-svn: 125563
* Fix 9216 - Endless loop in InstCombine pass.Nadav Rotem2011-02-151-1/+5
| | | | | | | The pattern "A&(A^B) -> A & ~B" recreated itself because ~B is actually a xor -1. llvm-svn: 125557
* Do not forget DebugLoc!Devang Patel2011-02-151-0/+1
| | | | llvm-svn: 125547
* tidy up a bit.Chris Lattner2011-02-151-7/+9
| | | | llvm-svn: 125546
* convert ConstantVector::get to use ArrayRef.Chris Lattner2011-02-151-7/+3
| | | | llvm-svn: 125537
* revert my ConstantVector patch, it seems to have made the llvm-gccChris Lattner2011-02-141-3/+7
| | | | | | builders unhappy. llvm-svn: 125504
* Switch ConstantVector::get to use ArrayRef instead of a pointer+sizeChris Lattner2011-02-141-7/+3
| | | | | | idiom. Change various clients to simplify their code. llvm-svn: 125487
* remove a now-unneccesary cast.Chris Lattner2011-02-131-1/+1
| | | | llvm-svn: 125464
* implement instcombine folding for things like (x >> c) < 42.Chris Lattner2011-02-131-8/+50
| | | | | | We were previously simplifying divisions, but not right shifts! llvm-svn: 125454
* refactor some code out into a helper method.Chris Lattner2011-02-132-46/+56
| | | | llvm-svn: 125451
* Also fold (A+B) == A -> B == 0 when the add is commuted.Benjamin Kramer2011-02-111-2/+4
| | | | llvm-svn: 125411
* When lowering an inbounds gep, the intermediate adds can haveChris Lattner2011-02-111-6/+3
| | | | | | | | unsigned overflow (e.g. due to a negative array index), but the scales on array size multiplications are known to not sign wrap. llvm-svn: 125409
* implement the first part of PR8882: when lowering an inboundsChris Lattner2011-02-101-8/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | gep to explicit addressing, we know that none of the intermediate computation overflows. This could use review: it seems that the shifts certainly wouldn't overflow, but could the intermediate adds overflow if there is a negative index? Previously the testcase would instcombine to: define i1 @test(i64 %i) { %p1.idx.mask = and i64 %i, 4611686018427387903 %cmp = icmp eq i64 %p1.idx.mask, 1000 ret i1 %cmp } now we get: define i1 @test(i64 %i) { %cmp = icmp eq i64 %i, 1000 ret i1 %cmp } llvm-svn: 125271
* Enhance a bunch of transformations in instcombine to start generatingChris Lattner2011-02-102-142/+145
| | | | | | | | | | | exact/nsw/nuw shifts and have instcombine infer them when it can prove that the relevant properties are true for a given shift without them. Also, a variety of refactoring to use the new patternmatch logic thrown in for good luck. I believe that this takes care of a bunch of related code quality issues attached to PR8862. llvm-svn: 125267
* Enhance the "compare with shift" and "compare with div" Chris Lattner2011-02-101-44/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | optimizations to be much more aggressive in the face of exact/nsw/nuw div and shifts. For example, these (which are the same except the first is 'exact' sdiv: define i1 @sdiv_icmp4_exact(i64 %X) nounwind { %A = sdiv exact i64 %X, -5 ; X/-5 == 0 --> x == 0 %B = icmp eq i64 %A, 0 ret i1 %B } define i1 @sdiv_icmp4(i64 %X) nounwind { %A = sdiv i64 %X, -5 ; X/-5 == 0 --> x == 0 %B = icmp eq i64 %A, 0 ret i1 %B } compile down to: define i1 @sdiv_icmp4_exact(i64 %X) nounwind { %1 = icmp eq i64 %X, 0 ret i1 %1 } define i1 @sdiv_icmp4(i64 %X) nounwind { %X.off = add i64 %X, 4 %1 = icmp ult i64 %X.off, 9 ret i1 %1 } This happens when you do something like: (ptr1-ptr2) == 42 where the pointers are pointers to non-unit types. llvm-svn: 125266
* more cleanups, notably bitcast isn't used for "signed to unsigned type Chris Lattner2011-02-101-45/+27
| | | | | | conversions". :) llvm-svn: 125265
* A bunch of cleanups and simplifications using the new PatternMatch predicatesChris Lattner2011-02-101-176/+132
| | | | | | | | | | and generally tidying things up. Only very trivial functionality changes like now doing (-1 - A) -> (~A) for vectors too. InstCombineAddSub.cpp | 296 +++++++++++++++++++++----------------------------- 1 file changed, 126 insertions(+), 170 deletions(-) llvm-svn: 125264
* teach SimplifyDemandedBits that exact shifts demand the bits they Chris Lattner2011-02-101-3/+23
| | | | | | | are shifting out since they do require them to be zeros. Similarly for NUW/NSW bits of shl llvm-svn: 125263
* Teach instsimplify some tricks about exact/nuw/nsw shifts.Chris Lattner2011-02-091-3/+7
| | | | | | improve interfaces to instsimplify to take this info. llvm-svn: 125196
* Rework InstrTypes.h so to reduce the repetition around the NSW/NUW/ExactChris Lattner2011-02-091-2/+2
| | | | | | | | | | | | | | | | | versions of creation functions. Eventually, the "insertion point" versions of these should just be removed, we do have IRBuilder afterall. Do a massive rewrite of much of pattern match. It is now shorter and less redundant and has several other widgets I will be using in other patches. Among other changes, m_Div is renamed to m_IDiv (since it only matches integer divides) and m_Shift is gone (it used to match all binops!!) and we now have m_LogicalShift for the one client to use. Enhance IRBuilder to have "isExact" arguments to things like CreateUDiv and reduce redundancy within IRbuilder by having these methods chain to each other more instead of duplicating code. llvm-svn: 125194
* enhance vmcore to know that udiv's can be exact, and add a trivialChris Lattner2011-02-061-2/+2
| | | | | | | | instcombine xform to exercise this. Nothing forms exact udivs yet though. This is progress on PR8862 llvm-svn: 124992
* Conservatively, clear optional flags, such as nsw, when performingDan Gohman2011-02-021-0/+15
| | | | | | | reassociation. No testcase, because I wasn't able to create a testcase which actually demonstrates a problem. llvm-svn: 124713
* Recognize and simplifyAnders Carlsson2011-01-301-1/+11
| | | | | | | (A+B) == A -> B == 0 A == (A+B) -> B == 0 llvm-svn: 124567
* Call SimplifyFDivInst() in InstCombiner::visitFDiv().Frits van Bommel2011-01-292-0/+10
| | | | llvm-svn: 124535
* Move InstCombine's knowledge of fdiv to SimplifyInstruction().Frits van Bommel2011-01-292-15/+0
| | | | llvm-svn: 124534
* My auto-simplifier noticed that ((X/Y)*Y)/Y occurs several times in SPECDuncan Sands2011-01-281-56/+24
| | | | | | | | | | | | | | | | | benchmarks, and that it can be simplified to X/Y. (In general you can only simplify (Z*Y)/Y to Z if the multiplication did not overflow; if Z has the form "X/Y" then this is the case). This patch implements that transform and moves some Div logic out of instcombine and into InstructionSimplify. Unfortunately instcombine gets in the way somewhat, since it likes to change (X/Y)*Y into X-(X rem Y), so I had to teach instcombine about this too. Finally, thanks to the NSW/NUW flags, sometimes we know directly that "Z*Y" does not overflow, because the flag says so, so I added that logic too. This eliminates a bunch of divisions and subtractions in 447.dealII, and has good effects on some other benchmarks too. It seems to have quite an effect on tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions changed, resulting in massive changes all over. llvm-svn: 124487
* Fold select + select where both selects are on the same condition.Nick Lewycky2011-01-281-0/+13
| | | | llvm-svn: 124469
* Null initialize a few variables flagged byTed Kremenek2011-01-231-1/+1
| | | | | | | | | | clang's -Wuninitialized-experimental warning. While these don't look like real bugs, clang's -Wuninitialized-experimental analysis is stricter than GCC's, and these fixes have the benefit of being general nice cleanups. llvm-svn: 124073
* Just because we have determined that an (fcmp | fcmp) is true for A < B,Owen Anderson2011-01-211-1/+3
| | | | | | | A == B, and A > B, does not mean we can fold it to true. We still need to check for A ? B (A unordered B). llvm-svn: 123993
* fix PR9013, an infinite loop in instcombine.Chris Lattner2011-01-211-2/+10
| | | | llvm-svn: 123968
* update obsolete comment.Chris Lattner2011-01-211-4/+3
| | | | llvm-svn: 123965
* Don't try to pull vector bitcasts that change the number of elements throughNick Lewycky2011-01-211-3/+17
| | | | | | | a select. A vector select is pairwise on each element so we'd need a new condition with the right number of elements to select on. Fixes PR8994. llvm-svn: 123963
* At -O123 the early-cse pass is run before instcombine has run. According to myDuncan Sands2011-01-201-32/+11
| | | | | | | | | | | | | | | | auto-simplier the transform most missed by early-cse is (zext X) != 0 -> X != 0. This patch adds this transform and some related logic to InstructionSimplify and removes some of the logic from instcombine (unfortunately not all because there are several situations in which instcombine can improve things by making new instructions, whereas instsimplify is not allowed to do this). At -O2 this often results in more than 15% more simplifications by early-cse, and results in hundreds of lines of bitcode being eliminated from the testsuite. I did see some small negative effects in the testsuite, for example a few additional instructions in three programs. One program, 483.xalancbmk, got an additional 35 instructions, which seems to be due to a function getting an additional instruction and then being inlined all over the place. llvm-svn: 123911
OpenPOWER on IntegriCloud