summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
Commit message (Collapse)AuthorAgeFilesLines
* This is an update to a previous commit (r181216).Jean-Luc Duprat2013-05-221-29/+0
| | | | | | | | | | | | | | | | | | | | | | | The earlier change list introduced the following inst combines: B * (uitofp i1 C) —> select C, B, 0 A * (1 - uitofp i1 C) —> select C, 0, A select C, 0, B + select C, A, 0 —> select C, A, B Together these 3 changes would simplify : A * (1 - uitofp i1 C) + B * uitofp i1 C down to : select C, B, A In practice we found that the first two substitutions can have a negative effect on performance, because they reduce opportunities to use FMA contractions; between the two options FMAs are often the better choice. This change list amends the previous one to enable just these inst combines: select C, B, 0 + select C, 0, A —> select C, B, A A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A llvm-svn: 182499
* Fix two typoSylvestre Ledru2013-05-141-1/+1
| | | | llvm-svn: 181848
* InstCombine: Flip the order of two urem transformsDavid Majnemer2013-05-121-6/+6
| | | | | | | | | | | | | | There are two transforms in visitUrem that conflict with each other. *) One, if a divisor is a power of two, subtracts one from the divisor and turns it into a bitwise-and. *) The other unwraps both operands if they are surrounded by zext instructions. Flipping the order allows the subtraction to go beneath the sign extension. llvm-svn: 181668
* InstCombine: Turn urem to bitwise-and more oftenDavid Majnemer2013-05-111-20/+2
| | | | | | | Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively fold away urem instructions. llvm-svn: 181661
* InstCombine: Verify the type before transforming uitofp into select.Benjamin Kramer2013-05-101-22/+23
| | | | | | PR15952. llvm-svn: 181586
* Provide InstCombines for the following 3 cases:Jean-Luc Duprat2013-05-061-0/+28
| | | | | | | | | | | | | A * (1 - (uitofp i1 C)) -> select C, 0, A B * (uitofp i1 C) -> select C, B, 0 select C, 0, A + select C, B, 0 -> select C, B, A These come up in code that has been hand-optimized from a select to a linear blend, on platforms where that may have mattered. We want to undo such changes with the following transform: A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B llvm-svn: 181216
* Tidy up a bit. No functional change.Jim Grosbach2013-04-051-56/+56
| | | | llvm-svn: 178915
* Fix a bug in instcombine for fmul in fast math mode.Quentin Colombet2013-02-281-3/+3
| | | | | | | | | | | | | | | The instcombine recognized pattern looks like: a = b * c d = a +/- Cst or a = b * c d = Cst +/- a When creating the new operands for fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0). The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1. llvm-svn: 176300
* 1. Hoist minus sign as high as possible in an attempt to revealShuxin Yang2013-01-151-31/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | some optimization opportunities (in the enclosing supper-expressions). rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y) if expression "-0.0 - X" has only one reference. rule 2. (0.0 - X ) * Y => -0.0 - (X * Y) if expression "0.0 - X" has only one reference, and the instruction is marked "noSignedZero". 2. Eliminate negation (The compiler was already able to handle these opt if the 0.0s are replaced with -0.0.) rule 3: (0.0 - X) * (0.0 - Y) => X * Y rule 4: (0.0 - X) * C => X * -C if the expr is flagged "noSignedZero". 3. Rule 5: (X*Y) * X => (X*X) * Y if X!=Y and the expression is flagged with "UnsafeAlgebra". The purpose of this transformation is two-fold: a) to form a power expression (of X). b) potentially shorten the critical path: After transformation, the latency of the instruction Y is amortized by the expression of X*X, and therefore Y is in a "less critical" position compared to what it was before the transformation. 4. Remove the InstCombine code about simplifiying "X * select". The reasons are following: a) The "select" is somewhat architecture-dependent, therefore the higher level optimizers are not able to precisely predict if the simplification really yields any performance improvement or not. b) The "select" operator is bit complicate, and tends to obscure optimization opportunities. It is btter to keep it as low as possible in expr tree, and let CodeGen to tackle the optimization. llvm-svn: 172551
* This change is to implement following rules under the condition C_A and/or C_RShuxin Yang2013-01-141-8/+127
| | | | | | | | | | | | | | | | | | | | | --------------------------------------------------------------------------- C_A: reassociation is allowed C_R: reciprocal of a constant C is appropriate, which means - 1/C is exact, or - reciprocal is allowed and 1/C is neither a special value nor a denormal. ----------------------------------------------------------------------------- rule1: (X/C1) / C2 => X / (C2*C1) (if C_A) => X * (1/(C2*C1)) (if C_A && C_R) rule 2: X*C1 / C2 => X * (C1/C2) if C_A rule 3: (X/Y)/Z = > X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value) rule 4: Z/(X/Y) = > (Z*Y)/X (similar to rule3) rule 5: C1/(X*C2) => (C1/C2) / X (if C_A) rule 6: C1/(X/C2) => (C1*C2) / X (if C_A) rule 7: C1/(C2/X) => (C1/C2) * X (if C_A) llvm-svn: 172488
* Cosmetical changne in order to conform to coding std.Shuxin Yang2013-01-071-5/+6
| | | | | | Thank Eric Christopher for figuring out these problems! llvm-svn: 171805
* This change is to implement following rules:Shuxin Yang2013-01-071-0/+127
| | | | | | | | | | | o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal) o. X/C1 * C2 -> X/(C1/C2) (if C2/C1 is either specical FP or denormal, but C1/C2 is a normal Fp) Let MDC denote multiplication or dividion with one & only one operand being a constant o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2) (so long as the constant-folding doesn't yield any denormal or special value) llvm-svn: 171793
* Move all of the header files which are involved in modelling the LLVM IRChandler Carruth2013-01-021-1/+1
| | | | | | | | | | | | | | | | | | | | | into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
* rdar://12753946Shuxin Yang2012-12-141-0/+32
| | | | | | Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0" llvm-svn: 170226
* Rename isPowerOfTwo to isKnownToBeAPowerOfTwo.Rafael Espindola2012-12-131-2/+2
| | | | | | | | In a previous thread it was pointed out that isPowerOfTwo is not a very precise name since it can return false for powers of two if it is unable to show that they are powers of two. llvm-svn: 170093
* The TargetData is not used for the isPowerOfTwo determination. It has neverRafael Espindola2012-12-121-3/+2
| | | | | | | | | | been used in the first place. It simply was passed to the function and to the recursive invocations. Simply drop the parameter and update the callers for the new signature. Patch by Saleem Abdulrasool! llvm-svn: 169988
* Remove redunant optimizations from InstCombine, instead call the appropriate ↵Michael Ilseman2012-12-121-13/+4
| | | | | | functions from SimplifyInstruction llvm-svn: 169941
* Use the new script to sort the includes of every file under lib.Chandler Carruth2012-12-031-1/+1
| | | | | | | | | | | | | | | | | Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
* reversed the logic of the log2 detection routine to reduce the number of ↵Pedro Artigas2012-11-301-25/+29
| | | | | | nested ifs llvm-svn: 169049
* Addresses many style issues with prior checkin (r169025)Pedro Artigas2012-11-301-58/+44
| | | | llvm-svn: 169043
* Add fast math inst combine X*log2(Y*0.5)-->X*log2(Y)-XPedro Artigas2012-11-301-0/+77
| | | | | | reviewed by Michael Ilseman <milseman@apple.com> llvm-svn: 169025
* Move TargetData to DataLayout.Micah Villmow2012-10-081-2/+2
| | | | llvm-svn: 165402
* Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. ↵Sylvestre Ledru2012-09-271-2/+2
| | | | | | See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768
* Fix a typo 'iff' => 'if'Sylvestre Ledru2012-09-271-2/+2
| | | | llvm-svn: 164767
* InstCombine: Make sure we use the pre-zext type when creating a constant of ↵Benjamin Kramer2012-09-211-1/+2
| | | | | | | | a value that is zext'd. Fixes PR13250. llvm-svn: 164377
* InstCombine: Fix comment to reflect the code.Benjamin Kramer2012-08-301-1/+1
| | | | llvm-svn: 162911
* It is illegal to transform (sdiv (ashr X c1) c2) -> (sdiv x (2^c1 * c2)),Nadav Rotem2012-08-301-10/+0
| | | | | | | | because C always rounds towards zero. Thanks Dirk and Ben. llvm-svn: 162899
* InstCombine: Defensively avoid undefined shifts by limiting the amount to ↵Benjamin Kramer2012-08-281-2/+2
| | | | | | | | | the bit width. No test case, undefined shifts get folded early, but can occur when other transforms generate a constant. Thanks to Duncan for bringing this up. llvm-svn: 162755
* InstCombine: Guard the transform introduced in r162743 against large ints ↵Benjamin Kramer2012-08-281-10/+10
| | | | | | and non-const shifts. llvm-svn: 162751
* Make sure that we don't call getZExtValue on values > 64 bits.Nadav Rotem2012-08-281-8/+8
| | | | | | Thanks Benjamin for noticing this. llvm-svn: 162749
* Teach InstCombine to canonicalize [SU]div+[AL]shl patterns.Nadav Rotem2012-08-281-0/+20
| | | | | | | | | | For example: %1 = lshr i32 %x, 2 %2 = udiv i32 %1, 100 rdar://12182093 llvm-svn: 162743
* Look pass zext to strength reduce an udiv. Patch by David Majnemer. ↵Evan Cheng2012-06-211-1/+4
| | | | | | rdar://11721329 llvm-svn: 158946
* Remove some dead code and tidy things up now that vectors use ConstantDataVectorChris Lattner2012-02-061-20/+7
| | | | | | instead of always using ConstantVector. llvm-svn: 149912
* continue making the world safe for ConstantDataVector. At this point,Chris Lattner2012-01-271-9/+26
| | | | | | | we should (theoretically optimize and codegen ConstantDataVector as well as ConstantVector. llvm-svn: 149116
* use ConstantVector::getSplat in a few places.Chris Lattner2012-01-251-1/+1
| | | | llvm-svn: 148929
* InstCombine now optimizes vector udiv by power of 2 to shiftsPete Cooper2011-11-071-5/+9
| | | | | | Fixes r8429 llvm-svn: 144036
* Stop emitting instructions with the name "tmp" they eat up memory and have ↵Benjamin Kramer2011-09-271-6/+5
| | | | | | | | to be uniqued, without any benefit. If someone prefers %tmp42 to %42, run instnamer. llvm-svn: 140634
* land David Blaikie's patch to de-constify Type, with a few tweaks.Chris Lattner2011-07-181-1/+1
| | | | llvm-svn: 135375
* start using the new helper methods a bit.Chris Lattner2011-07-151-2/+2
| | | | llvm-svn: 135251
* Reapply 132348 with fixes. rdar://problem/6501862Stuart Hastings2011-06-011-9/+15
| | | | llvm-svn: 132402
* Revert to pacify a buildbot. rdar://problem/6501862Stuart Hastings2011-05-311-16/+9
| | | | llvm-svn: 132351
* Followup to 132316; accept arbitrary constants, add with a constant,Stuart Hastings2011-05-311-9/+16
| | | | | | | sub with a non-constant. Fix comments, enlarge test case. rdar://problem/6501862 llvm-svn: 132348
* (1 - X) * (-2) -> (x - 1) * 2, for all positive nonzero powers of 2Stuart Hastings2011-05-301-0/+17
| | | | | | rdar://problem/6501862 llvm-svn: 132316
* rearrange two transforms, since one subsumes the other. Make the ↵Chris Lattner2011-05-231-16/+23
| | | | | | | | shift-exactness xform recurse. llvm-svn: 131888
* Transform any logical shift of a power of two into an exact/NUW shift whenChris Lattner2011-05-231-0/+17
| | | | | | in a known-non-zero context. llvm-svn: 131887
* use the valuetracking isPowerOfTwo function, which is more powerful than ↵Chris Lattner2011-05-231-4/+4
| | | | | | | | checking for a constant directly. Thanks to Duncan for pointing this out. llvm-svn: 131885
* add some random notes.Chris Lattner2011-05-221-0/+5
| | | | llvm-svn: 131862
* Carve out a place in instcombine to put transformations which work knowing ↵Chris Lattner2011-05-221-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | that their result is non-zero. Implement an example optimization (PR9814), which allows us to transform: A / ((1 << B) >>u 2) into: A >>u (B-2) which we compile into: _divu3: ## @divu3 leal -2(%rsi), %ecx shrl %cl, %edi movl %edi, %eax ret instead of: _divu3: ## @divu3 movb %sil, %cl movl $1, %esi shll %cl, %esi shrl $2, %esi movl %edi, %eax xorl %edx, %edx divl %esi, %eax ret llvm-svn: 131860
* Remove unused variable.Duncan Sands2011-05-021-1/+1
| | | | llvm-svn: 130705
* Move some rem transforms out of instcombine and into instsimplify.Duncan Sands2011-05-021-42/+19
| | | | | | | This automagically provides a transform noticed by my super-optimizer as occurring quite often: "rem x, (select cond, x, 1)" -> 0. llvm-svn: 130694
OpenPOWER on IntegriCloud