path: root/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
Commit message | Author | Age | Files | Lines
* Fix a bug in instcombine for fmul in fast math mode. | Quentin Colombet | 2013-02-28 | 1 | -3/+3
  The instcombine recognized pattern looks like:
    a = b * c
    d = a +/- Cst
  or
    a = b * c
    d = Cst +/- a
  When creating the new operands for the fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0). The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1.
  llvm-svn: 176300
* 1. Hoist minus sign as high as possible in an attempt to reveal | Shuxin Yang | 2013-01-15 | 1 | -31/+60
  some optimization opportunities (in the enclosing super-expressions).
    rule 1. (-0.0 - X) * Y => -0.0 - (X * Y)
            if expression "-0.0 - X" has only one reference.
    rule 2. (0.0 - X) * Y => -0.0 - (X * Y)
            if expression "0.0 - X" has only one reference, and the instruction is marked "noSignedZero".
  2. Eliminate negation (the compiler was already able to handle these optimizations if the 0.0s are replaced with -0.0).
    rule 3: (0.0 - X) * (0.0 - Y) => X * Y
    rule 4: (0.0 - X) * C => X * -C
    if the expr is flagged "noSignedZero".
  3. Rule 5: (X*Y) * X => (X*X) * Y if X != Y and the expression is flagged with "UnsafeAlgebra".
     The purpose of this transformation is two-fold:
     a) to form a power expression (of X).
     b) potentially shorten the critical path: after the transformation, the latency of the instruction Y is amortized by the expression X*X, and therefore Y is in a "less critical" position compared to what it was before the transformation.
  4. Remove the InstCombine code about simplifying "X * select". The reasons are as follows:
     a) The "select" is somewhat architecture-dependent, therefore the higher-level optimizers are not able to precisely predict whether the simplification really yields any performance improvement.
     b) The "select" operator is a bit complicated, and tends to obscure optimization opportunities. It is better to keep it as low as possible in the expression tree, and let CodeGen tackle the optimization.
  llvm-svn: 172551
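The difference between rule 1 and rule 2 in the commit above comes down to the sign of zero. The following is a minimal standalone sketch in plain C++ (not LLVM code; all function names are hypothetical and chosen for this illustration) showing that "-0.0 - X" behaves like a true negation, while "0.0 - X" loses the sign when X is +0.0, which is presumably why rule 2 asks for "noSignedZero".

```cpp
#include <cmath>

// Hypothetical helpers for illustration only; they are not part of LLVM.

// Negation spelled "-0.0 - x": yields -0.0 when x is +0.0.
double negViaMinusZero(double x) { return -0.0 - x; }

// Negation spelled "0.0 - x": yields +0.0 when x is +0.0.
double negViaPlusZero(double x) { return 0.0 - x; }

// True when two doubles compare equal and also carry the same sign bit.
bool sameValueAndSign(double a, double b) {
    return a == b && std::signbit(a) == std::signbit(b);
}

// Left-hand side of rule 1: (-0.0 - x) * y.
double rule1Before(double x, double y) { return negViaMinusZero(x) * y; }

// Left-hand side of rule 2: (0.0 - x) * y.
double rule2Before(double x, double y) { return negViaPlusZero(x) * y; }

// Shared right-hand side of rules 1 and 2: -0.0 - (x * y).
double hoisted(double x, double y) { return -0.0 - (x * y); }
```

With x = 0.0 and y = 3.0, rule1Before and hoisted both produce -0.0, whereas rule2Before produces +0.0. For non-zero inputs the two spellings agree.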
* This change is to implement the following rules under the condition C_A and/or C_R | Shuxin Yang | 2013-01-14 | 1 | -8/+127
  ---------------------------------------------------------------------------
  C_A: reassociation is allowed
  C_R: reciprocal of a constant C is appropriate, which means
    - 1/C is exact, or
    - reciprocal is allowed and 1/C is neither a special value nor a denormal.
  ---------------------------------------------------------------------------
  rule 1: (X/C1) / C2 => X / (C2*C1)       (if C_A)
                      => X * (1/(C2*C1))   (if C_A && C_R)
  rule 2: X*C1 / C2 => X * (C1/C2)         (if C_A)
  rule 3: (X/Y)/Z => X/(Y*Z)               (if C_A && at least one of Y and Z is a symbolic value)
  rule 4: Z/(X/Y) => (Z*Y)/X               (similar to rule 3)
  rule 5: C1/(X*C2) => (C1/C2) / X         (if C_A)
  rule 6: C1/(X/C2) => (C1*C2) / X         (if C_A)
  rule 7: C1/(C2/X) => (C1/C2) * X         (if C_A)
  llvm-svn: 172488
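The first branch of C_R ("1/C is exact") can be illustrated outside LLVM. The sketch below is plain C++ under stated assumptions: it only models the exact-reciprocal case, it does not model the "reciprocal is allowed" flag, and the names reciprocalIsExact and divByTwoConstants are hypothetical. It treats 1/C as exact when C is a finite, non-zero power of two whose reciprocal is still a normal number.

```cpp
#include <cmath>

// Hypothetical helper: true when 1/c is exactly representable as a normal
// double, i.e. c is a finite non-zero power of two and 1/c is not a denormal.
bool reciprocalIsExact(double c) {
    if (c == 0.0 || !std::isfinite(c)) return false;
    int exponent = 0;
    // frexp splits c into mantissa * 2^exponent with |mantissa| in [0.5, 1).
    double mantissa = std::frexp(c, &exponent);
    if (std::fabs(mantissa) != 0.5) return false;  // not a power of two
    return std::isnormal(1.0 / c);                 // rejects denormal or infinite reciprocals
}

// Sketch of rule 1: (x / c1) / c2 becomes x * (1 / (c1 * c2)) when the
// reciprocal of the folded constant is exact, and x / (c1 * c2) otherwise.
double divByTwoConstants(double x, double c1, double c2) {
    double folded = c1 * c2;
    return reciprocalIsExact(folded) ? x * (1.0 / folded) : x / folded;
}
```

For example, with c1 = 2.0 and c2 = 4.0 the folded constant is 8.0, its reciprocal 0.125 is exact, and the division can become a multiplication without changing the result. With a folded constant of 3.0 the sketch keeps the division.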
* Cosmetic change in order to conform to coding std. | Shuxin Yang | 2013-01-07 | 1 | -5/+6
  Thanks to Eric Christopher for figuring out these problems!
  llvm-svn: 171805
* This change is to implement the following rules: | Shuxin Yang | 2013-01-07 | 1 | -0/+127
  o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal)
  o. X/C1 * C2 -> X/(C1/C2) (if C2/C1 is either special FP or denormal, but C1/C2 is a normal FP)
  Let MDC denote multiplication or division with one and only one operand being a constant.
  o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2) (so long as the constant-folding doesn't yield any denormal or special value)
  llvm-svn: 171793
* Move all of the header files which are involved in modelling the LLVM IR | Chandler Carruth | 2013-01-02 | 1 | -1/+1
  into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM.
  There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier.
  The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today.
  I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something).
  I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily.
  llvm-svn: 171366
* rdar://12753946 | Shuxin Yang | 2012-12-14 | 1 | -0/+32
  Implement rule: "x * (select cond 1.0, 0.0) -> select cond x, 0.0"
  llvm-svn: 170226
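The two sides of this rule can be compared in plain C++. This is only an illustrative sketch with hypothetical function names; the commit message does not state which preconditions the real transform checks, so the corner cases noted below are facts about IEEE-754 arithmetic rather than claims about the patch.

```cpp
#include <cmath>

// Hypothetical illustration of both sides of the rule; not LLVM code.

// Left-hand side: x * (select cond 1.0, 0.0)
double mulBySelect(double x, bool cond) { return x * (cond ? 1.0 : 0.0); }

// Right-hand side: select cond x, 0.0
double selectOfX(double x, bool cond) { return cond ? x : 0.0; }
```

For finite, non-negative x the two forms agree. They diverge when cond is false and x is negative (the product is -0.0, the select is +0.0) or when x is infinite or NaN (the product is NaN, the select is 0.0).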
* Rename isPowerOfTwo to isKnownToBeAPowerOfTwo. | Rafael Espindola | 2012-12-13 | 1 | -2/+2
  In a previous thread it was pointed out that isPowerOfTwo is not a very precise name since it can return false for powers of two if it is unable to show that they are powers of two.
  llvm-svn: 170093
* The TargetData is not used for the isPowerOfTwo determination. It has never | Rafael Espindola | 2012-12-12 | 1 | -3/+2
  been used in the first place. It simply was passed to the function and to the recursive invocations. Simply drop the parameter and update the callers for the new signature.
  Patch by Saleem Abdulrasool!
  llvm-svn: 169988
* Remove redundant optimizations from InstCombine, instead call the appropriate functions from SimplifyInstruction | Michael Ilseman | 2012-12-12 | 1 | -13/+4
  llvm-svn: 169941
* Use the new script to sort the includes of every file under lib. | Chandler Carruth | 2012-12-03 | 1 | -1/+1
  Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented.
  Many forward declarations and missing includes were added to header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =]
  llvm-svn: 169131
* reversed the logic of the log2 detection routine to reduce the number of nested ifs | Pedro Artigas | 2012-11-30 | 1 | -25/+29
  llvm-svn: 169049
* Addresses many style issues with prior checkin (r169025) | Pedro Artigas | 2012-11-30 | 1 | -58/+44
  llvm-svn: 169043
* Add fast math inst combine X*log2(Y*0.5)-->X*log2(Y)-X | Pedro Artigas | 2012-11-30 | 1 | -0/+77
  reviewed by Michael Ilseman <milseman@apple.com>
  llvm-svn: 169025
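The identity behind this combine is log2(Y*0.5) = log2(Y) + log2(0.5) = log2(Y) - 1, so X*log2(Y*0.5) = X*log2(Y) - X. A small plain C++ sketch (hypothetical function names, not the LLVM implementation) evaluates both forms; the results match up to floating-point rounding, which fits the commit's description of this as a fast math combine.

```cpp
#include <cmath>

// Hypothetical illustration of both sides of the combine; not LLVM code.

// Original form: x * log2(y * 0.5)
double log2Before(double x, double y) { return x * std::log2(y * 0.5); }

// Rewritten form: x * log2(y) - x
double log2After(double x, double y) { return x * std::log2(y) - x; }
```

For x = 3 and y = 8 both forms evaluate to about 6, since log2(4) = 2 and 3*log2(8) - 3 = 9 - 3.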
* Move TargetData to DataLayout. | Micah Villmow | 2012-10-08 | 1 | -2/+2
  llvm-svn: 165402
* Revert 'Fix a typo 'iff' => 'if''. iff is an abbreviation of if and only if. | Sylvestre Ledru | 2012-09-27 | 1 | -2/+2
  See: http://en.wikipedia.org/wiki/If_and_only_if
  Commit 164767
  llvm-svn: 164768
* Fix a typo 'iff' => 'if' | Sylvestre Ledru | 2012-09-27 | 1 | -2/+2
  llvm-svn: 164767
* InstCombine: Make sure we use the pre-zext type when creating a constant of a value that is zext'd. | Benjamin Kramer | 2012-09-21 | 1 | -1/+2
  Fixes PR13250.
  llvm-svn: 164377
* InstCombine: Fix comment to reflect the code. | Benjamin Kramer | 2012-08-30 | 1 | -1/+1
  llvm-svn: 162911
* It is illegal to transform (sdiv (ashr X c1) c2) -> (sdiv x (2^c1 * c2)), | Nadav Rotem | 2012-08-30 | 1 | -10/+0
  because C always rounds towards zero.
  Thanks Dirk and Ben.
  llvm-svn: 162899
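A concrete counterexample makes the rounding problem visible. The sketch below is plain C++ with hypothetical function names; it assumes that right-shifting a negative signed integer is an arithmetic shift (guaranteed since C++20, and the usual behavior on mainstream compilers before that). The shift rounds toward negative infinity while signed division rounds toward zero, so the two forms can disagree for negative inputs.

```cpp
#include <cstdint>

// Hypothetical illustration; not LLVM code.
// Assumes >> on a negative int32_t is an arithmetic shift.

// Original form: sdiv (ashr x, c1), c2
int32_t shiftThenDivide(int32_t x, int c1, int32_t c2) {
    return (x >> c1) / c2;
}

// Rejected rewrite: sdiv x, (2^c1 * c2)
int32_t divideByScaled(int32_t x, int c1, int32_t c2) {
    return x / ((int32_t{1} << c1) * c2);
}
```

With x = -5, c1 = 1, c2 = 3 the original form computes (-5 >> 1) / 3 = -3 / 3 = -1, while the rewrite computes -5 / 6 = 0. For non-negative x such as 20 both forms give 3.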
* InstCombine: Defensively avoid undefined shifts by limiting the amount to the bit width. | Benjamin Kramer | 2012-08-28 | 1 | -2/+2
  No test case, undefined shifts get folded early, but can occur when other transforms generate a constant. Thanks to Duncan for bringing this up.
  llvm-svn: 162755
* InstCombine: Guard the transform introduced in r162743 against large ints and non-const shifts. | Benjamin Kramer | 2012-08-28 | 1 | -10/+10
  llvm-svn: 162751
* Make sure that we don't call getZExtValue on values > 64 bits. | Nadav Rotem | 2012-08-28 | 1 | -8/+8
  Thanks Benjamin for noticing this.
  llvm-svn: 162749
* Teach InstCombine to canonicalize [SU]div+[AL]shl patterns. | Nadav Rotem | 2012-08-28 | 1 | -0/+20
  For example:
    %1 = lshr i32 %x, 2
    %2 = udiv i32 %1, 100
  rdar://12182093
  llvm-svn: 162743
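The commit message only shows the input pattern. Assuming the canonical form folds the shift into the divisor (as the later r162899 message describes for the sdiv variant), the unsigned example would become a single division by 4 * 100 = 400. The plain C++ sketch below uses hypothetical function names and checks that equivalence for unsigned values; it holds because floor(floor(x / 4) / 100) equals floor(x / 400), provided the combined divisor does not overflow.

```cpp
#include <cstdint>

// Hypothetical illustration; not LLVM code.

// Input pattern: udiv (lshr x, 2), 100
uint32_t lshrThenUdiv(uint32_t x) { return (x >> 2) / 100u; }

// Assumed canonical form: udiv x, 400
uint32_t udivCombined(uint32_t x) { return x / 400u; }
```

Both functions return the same value for every 32-bit unsigned input, for example 0 for x = 399 and 1 for x = 400.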
* Look past zext to strength reduce an udiv. Patch by David Majnemer. | Evan Cheng | 2012-06-21 | 1 | -1/+4
  rdar://11721329
  llvm-svn: 158946
* Remove some dead code and tidy things up now that vectors use ConstantDataVector | Chris Lattner | 2012-02-06 | 1 | -20/+7
  instead of always using ConstantVector.
  llvm-svn: 149912
* continue making the world safe for ConstantDataVector. At this point, | Chris Lattner | 2012-01-27 | 1 | -9/+26
  we should (theoretically) optimize and codegen ConstantDataVector as well as ConstantVector.
  llvm-svn: 149116
* use ConstantVector::getSplat in a few places. | Chris Lattner | 2012-01-25 | 1 | -1/+1
  llvm-svn: 148929
* InstCombine now optimizes vector udiv by power of 2 to shifts | Pete Cooper | 2011-11-07 | 1 | -5/+9
  Fixes r8429
  llvm-svn: 144036
* Stop emitting instructions with the name "tmp" they eat up memory and have to be uniqued, without any benefit. | Benjamin Kramer | 2011-09-27 | 1 | -6/+5
  If someone prefers %tmp42 to %42, run instnamer.
  llvm-svn: 140634
* land David Blaikie's patch to de-constify Type, with a few tweaks. | Chris Lattner | 2011-07-18 | 1 | -1/+1
  llvm-svn: 135375
* start using the new helper methods a bit. | Chris Lattner | 2011-07-15 | 1 | -2/+2
  llvm-svn: 135251
* Reapply 132348 with fixes. rdar://problem/6501862 | Stuart Hastings | 2011-06-01 | 1 | -9/+15
  llvm-svn: 132402
* Revert to pacify a buildbot. rdar://problem/6501862 | Stuart Hastings | 2011-05-31 | 1 | -16/+9
  llvm-svn: 132351
* Followup to 132316; accept arbitrary constants, add with a constant, | Stuart Hastings | 2011-05-31 | 1 | -9/+16
  sub with a non-constant. Fix comments, enlarge test case.
  rdar://problem/6501862
  llvm-svn: 132348
* (1 - X) * (-2) -> (x - 1) * 2, for all positive nonzero powers of 2 | Stuart Hastings | 2011-05-30 | 1 | -0/+17
  rdar://problem/6501862
  llvm-svn: 132316
* rearrange two transforms, since one subsumes the other. Make the shift-exactness xform recurse. | Chris Lattner | 2011-05-23 | 1 | -16/+23
  llvm-svn: 131888
* Transform any logical shift of a power of two into an exact/NUW shift when | Chris Lattner | 2011-05-23 | 1 | -0/+17
  in a known-non-zero context.
  llvm-svn: 131887
* use the valuetracking isPowerOfTwo function, which is more powerful than checking for a constant directly. | Chris Lattner | 2011-05-23 | 1 | -4/+4
  Thanks to Duncan for pointing this out.
  llvm-svn: 131885
* add some random notes. | Chris Lattner | 2011-05-22 | 1 | -0/+5
  llvm-svn: 131862
* Carve out a place in instcombine to put transformations which work knowing that their result is non-zero. | Chris Lattner | 2011-05-22 | 1 | -0/+37
  Implement an example optimization (PR9814), which allows us to transform:
    A / ((1 << B) >>u 2)
  into:
    A >>u (B-2)
  which we compile into:
  _divu3:                                 ## @divu3
    leal  -2(%rsi), %ecx
    shrl  %cl, %edi
    movl  %edi, %eax
    ret
  instead of:
  _divu3:                                 ## @divu3
    movb  %sil, %cl
    movl  $1, %esi
    shll  %cl, %esi
    shrl  $2, %esi
    movl  %edi, %eax
    xorl  %edx, %edx
    divl  %esi, %eax
    ret
  llvm-svn: 131860
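The transform relies on the divisor being non-zero: if (1 << B) >>u 2 is non-zero, then B is at least 2 and the divisor is exactly 2^(B-2), so the division is a right shift by B-2. A plain C++ sketch of both forms follows (hypothetical function names, not the LLVM code); the caller is assumed to pass 2 <= b <= 31, which is what the non-zero divisor implies for a 32-bit value.

```cpp
#include <cstdint>

// Hypothetical illustration; not LLVM code.
// Precondition assumed by both functions: 2 <= b <= 31.

// Original form: A / ((1 << B) >>u 2)
uint32_t divByShiftedPow2(uint32_t a, uint32_t b) {
    return a / ((1u << b) >> 2);
}

// Transformed form: A >>u (B - 2)
uint32_t shiftOnly(uint32_t a, uint32_t b) {
    return a >> (b - 2);
}
```

For a = 1000 and b = 5 the divisor is 32 >> 2 = 8, so both forms return 125.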
* Remove unused variable. | Duncan Sands | 2011-05-02 | 1 | -1/+1
  llvm-svn: 130705
* Move some rem transforms out of instcombine and into instsimplify. | Duncan Sands | 2011-05-02 | 1 | -42/+19
  This automagically provides a transform noticed by my super-optimizer as occurring quite often: "rem x, (select cond, x, 1)" -> 0.
  llvm-svn: 130694
* InstCombine: Turn (zext A) udiv (zext B) into (zext (A udiv B)). Same for urem or constant B. | Benjamin Kramer | 2011-04-30 | 1 | -1/+28
  This obviously helps a lot if the division would be turned into a libcall (think i64 udiv on i386), but div is also one of the few remaining instructions on modern CPUs that become more expensive when the bitwidth gets bigger.
  This also helps register pressure on i386 when dividing chars, divb needs two 8-bit parts of a 16 bit register as input where divl uses two registers.
    int foo(unsigned char a) { return a/10; }
    int bar(unsigned char a, unsigned char b) { return a/b; }
  compiles into (x86_64)
  _foo:
    imull  $205, %edi, %eax
    shrl   $11, %eax
    ret
  _bar:
    movzbl %dil, %eax
    divb   %sil, %al
    movzbl %al, %eax
    ret
  llvm-svn: 130615
* Use SimplifyDemandedBits on div instructions. | Benjamin Kramer | 2011-04-30 | 1 | -0/+4
  This folds away silly stuff like (a&255)/1000 -> 0.
  llvm-svn: 130614
* InstCombine: If the divisor of an fdiv has an exact inverse, turn it into an fmul. | Benjamin Kramer | 2011-03-30 | 1 | -0/+12
  Fixes PR9587.
  llvm-svn: 128546
* Enhance a bunch of transformations in instcombine to start generating | Chris Lattner | 2011-02-10 | 1 | -125/+85
  exact/nsw/nuw shifts and have instcombine infer them when it can prove that the relevant properties are true for a given shift without them.
  Also, a variety of refactoring to use the new patternmatch logic thrown in for good luck. I believe that this takes care of a bunch of related code quality issues attached to PR8862.
  llvm-svn: 125267
* enhance vmcore to know that udiv's can be exact, and add a trivial | Chris Lattner | 2011-02-06 | 1 | -2/+2
  instcombine xform to exercise this. Nothing forms exact udivs yet though. This is progress on PR8862
  llvm-svn: 124992
* Call SimplifyFDivInst() in InstCombiner::visitFDiv(). | Frits van Bommel | 2011-01-29 | 1 | -0/+9
  llvm-svn: 124535
* Move InstCombine's knowledge of fdiv to SimplifyInstruction(). | Frits van Bommel | 2011-01-29 | 1 | -14/+0
  llvm-svn: 124534