summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)Hal Finkel2014-09-071-20/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and also a DomTree pointer, and most of the changes are just to pass this new information when it is easily available from InstSimplify, InstCombine, etc. As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use (because we care that the @llvm.assume dominates the use because assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses. The significant changes are all in ValueTracking. Two main changes: First, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure, and I made internal static versions of the relevant functions that take this structure as a parameter. The new code does as you might expect, it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive. Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, the assume(a == b), to recurse on itself. The context and DT params are used to find applicable assumptions. An assumption needs to dominate the context instruction, or come after it deterministically. In this latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption) because otherwise the assumption would prove its feeding comparison trivial and would be removed. This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests). llvm-svn: 217342
* InstCombine: Respect recursion depth in visitUDivOperandDavid Majnemer2014-08-301-4/+4
| | | | llvm-svn: 216817
* Remove an InstCombine that transformed patterns like (x * uitofp i1 y) to ↵Owen Anderson2014-08-171-30/+0
| | | | | | | | | | | | | (select y, x, 0.0) when the multiply has fast math flags set. While this might seem like an obvious canonicalization, there is one subtle problem with it. The result of the original expression is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN. This means that the new expression is strictly more defined than the original one. One unfortunate consequence of this is that the transform is not reversible! It's always legal to make increase the defined-ness of an expression, but it's not legal to reduce it. Thus, targets that prefer the original form of the expression cannot reverse the transform to recover it. Another way to think of it is that the transform has lost source-level information (the fast math flags), which is undesirable. llvm-svn: 215825
* InstCombine: Combine mul with div.David Majnemer2014-08-161-2/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can combne a mul with a div if one of the operands is a multiple of the other: %mul = mul nsw nuw %a, C1 %ret = udiv %mul, C2 => %ret = mul nsw %a, (C1 / C2) This can expose further optimization opportunities if we end up multiplying or dividing by a power of 2. Consider this small example: define i32 @f(i32 %a) { %mul = mul nuw i32 %a, 14 %div = udiv exact i32 %mul, 7 ret i32 %div } which gets CodeGen'd to: imull $14, %edi, %eax imulq $613566757, %rax, %rcx shrq $32, %rcx subl %ecx, %eax shrl %eax addl %ecx, %eax shrl $2, %eax retq We can now transform this into: define i32 @f(i32 %a) { %shl = shl nuw i32 %a, 1 ret i32 %shl } which gets CodeGen'd to: leal (%rdi,%rdi), %eax retq This fixes PR20681. llvm-svn: 215815
* InstCombine: Optimize x/INT_MIN to x==INT_MINDavid Majnemer2014-07-021-0/+4
| | | | | | | The result of x/INT_MIN is either 0 or 1, we can just use an icmp instead. llvm-svn: 212167
* InstCombine: Stop two transforms duelingDavid Majnemer2014-06-191-2/+5
| | | | | | | | | | | | | | | | | | | | | | InstCombineMulDivRem has: // Canonicalize (X+C1)*CI -> X*CI+C1*CI. InstCombineAddSub has: // W*X + Y*Z --> W * (X+Z) iff W == Y These two transforms could fight with each other if C1*CI would not fold away to something simpler than a ConstantExpr mul. The InstCombineMulDivRem transform only acted on ConstantInts until r199602 when it was changed to operate on all Constants in order to let it fire on ConstantVectors. To fix this, make this transform more careful by checking to see if we actually folded away C1*CI. This fixes PR20079. llvm-svn: 211258
* Optimize integral reciprocal (udiv 1, x and sdiv 1, x) to not use division. ↵Nick Lewycky2014-05-141-1/+20
| | | | | | This fires exactly once in a clang bootstrap, but covers a few different results from http://www.cs.utah.edu/~regehr/souper/ llvm-svn: 208750
* Reorder shuffle and binary operation.Serge Pavlov2014-05-111-0/+24
| | | | | | | | | | | | | This patch enables transformations: BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2)) BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2) They allow to eliminate extra shuffles in some cases. Differential Revision: http://reviews.llvm.org/D3525 llvm-svn: 208488
* [C++] Use 'nullptr'. Transforms edition.Craig Topper2014-04-251-40/+41
| | | | llvm-svn: 207196
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-1/+2
| | | | | | | | | | | | | | | | | definition below all of the header #include lines, lib/Transforms/... edition. This one is tricky for two reasons. We again have a couple of passes that define something else before the includes as well. I've sunk their name macros with the DEBUG_TYPE. Also, InstCombine contains headers that need DEBUG_TYPE, so now those headers #define and #undef DEBUG_TYPE around their code, leaving them well formed modular headers. Fixing these headers was a large motivation for all of these changes, as "leaky" macros of this form are hard on the modules implementation. llvm-svn: 206844
* [Modules] Sink all the DEBUG_TYPE defines for InstCombine out of theChandler Carruth2014-04-211-0/+1
| | | | | | | | | | | header files and into the cpp files. These files will require more touches as the header files actually use DEBUG(). Eventually, I'll have to introduce a matched #define and #undef of DEBUG_TYPE for the header files, but that comes as step N of many to clean all of this up. llvm-svn: 206777
* [Modules] Move the LLVM IR pattern match header into the IR library, itChandler Carruth2014-03-041-1/+1
| | | | | | obviously is coupled to the IR. llvm-svn: 202818
* Rename many DataLayout variables from TD to DL.Rafael Espindola2014-02-211-8/+8
| | | | | | | | | I am really sorry for the noise, but the current state where some parts of the code use TD (from the old name: TargetData) and other parts use DL makes it hard to write a patch that changes where those variables come from and how they are passed along. llvm-svn: 201827
* Fix all the remaining lost-fast-math-flags bugs I've been able to find. The ↵Owen Anderson2014-01-201-0/+10
| | | | | | | | most important of these are cases in the generic logic for combining BinaryOperators. This logic hadn't been updated to handle FastMathFlags, and it took me a while to detect it because it doesn't show up in a simple search for CreateFAdd. llvm-svn: 199629
* InstCombine: Teach most integer add/sub/mul/div combines how to deal with ↵Benjamin Kramer2014-01-191-21/+22
| | | | | | vectors. llvm-svn: 199602
* InstCombine: Refactor fmul/fdiv combines to handle vectors.Benjamin Kramer2014-01-191-64/+77
| | | | llvm-svn: 199598
* InstCombine: Make the (fmul X, -1.0) -> (fsub -0.0, X) transform handle ↵Benjamin Kramer2014-01-181-6/+4
| | | | | | | | vectors too. PR18532. llvm-svn: 199553
* Fix an instance where we would drop fast math flags when performing an fdiv ↵Owen Anderson2014-01-161-1/+3
| | | | | | to reciprocal multiply transformation. llvm-svn: 199425
* Fix a bug in InstCombine where we failed to preserve fast math flags when ↵Owen Anderson2014-01-161-2/+5
| | | | | | optimizing an FMUL expression. llvm-svn: 199424
* Teach InstCombine that (fmul X, -1.0) can be simplified to (fneg X), which ↵Owen Anderson2014-01-161-0/+10
| | | | | | LLVM expresses as (fsub -0.0, X). llvm-svn: 199420
* InstCombine: Replace manual fast math flag copying with the new IRBuilder ↵Benjamin Kramer2013-09-301-22/+20
| | | | | | | | | RAII helper. Defines away the issue where cast<Instruction> would fail because constant folding happened. Also slightly cleaner. llvm-svn: 191674
* Fix a bug in InstCombine where it attempted to cast a Value* to an Instruction*Joey Gouly2013-09-301-2/+2
| | | | | | | | | when it was actually a Constant*. There are quite a few other casts to Instruction that might have the same problem, but this is the only one I have a test case for. llvm-svn: 191668
* [Fast-math] Disable "(C1/X)*C2 => (C1*C2)/X" if C1/X has multiple uses.Shuxin Yang2013-09-191-3/+6
| | | | | | | | | | | | | | | | | | If "C1/X" were having multiple uses, the only benefit of this transformation is to potentially shorten critical path. But it is at the cost of instroducing additional div. The additional div may or may not incur cost depending on how div is implemented. If it is implemented using Newton–Raphson iteration, it dosen't seem to incur any cost (FIXME). However, if the div blocks the entire pipeline, that sounds to be pretty expensive. Let CodeGen to take care this transformation. This patch sees 6% on a benchmark. rdar://15032743 llvm-svn: 191037
* Correct case of m_UIToFp to m_UIToFP to match instruction name, add m_SIToFP ↵Stephen Lin2013-07-261-4/+4
| | | | | | for consistency. llvm-svn: 187225
* InstCombine: call FoldOpIntoSelect for all floating binops, not just fmulStephen Lin2013-07-201-0/+9
| | | | llvm-svn: 186759
* Restore r181216, which was partially reverted in r182499.Stephen Lin2013-07-171-0/+29
| | | | llvm-svn: 186533
* Add a microoptimization for urem.Nick Lewycky2013-07-131-0/+7
| | | | llvm-svn: 186235
* InstCombine: Reimplementation of visitUDivOperandDavid Majnemer2013-07-041-56/+139
| | | | | | | | | | | This transform was originally added in r185257 but later removed in r185415. The original transform would create instructions speculatively and then discard them if the speculation was proved incorrect. This has been replaced with a scheme that splits the transform into two parts: preflight and fold. While we preflight, we build up fold actions that inform the folding stage on how to act. llvm-svn: 185667
* Revert r185257 (InstCombine: Be more agressive optimizing 'udiv' instrs with ↵Hal Finkel2013-07-021-77/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'select' denoms) I'm reverting this commit because: 1. As discussed during review, it needs to be rewritten (to avoid creating and then deleting instructions). 2. This is causing optimizer crashes. Specifically, I'm seeing things like this: While deleting: i1 % Use still stuck around after Def is destroyed: <badref> = select i1 <badref>, i32 0, i32 1 opt: /src/llvm-trunk/lib/IR/Value.cpp:79: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed. I'd guess that these will go away once we're no longer creating/deleting instructions here, but just in case, I'm adding a regression test. Because the code is bring rewritten, I've just XFAIL'd the original regression test. Original commit message: InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can to simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185415
* InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denomsDavid Majnemer2013-06-291-44/+77
| | | | | | | | | | | | | | | | | Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can to simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185257
* In InstCombine{AddSub,MulDivRem} convert APFloat.isFiniteNonZero() && ↵Michael Gottesman2013-06-261-4/+4
| | | | | | !APFloat.isDenormal => APFloat.isNormal. llvm-svn: 185037
* [APFloat] Converted all references to APFloat::isNormal => ↵Michael Gottesman2013-06-191-9/+9
| | | | | | | | APFloat::isFiniteNonZero. Turns out all the references were in llvm and not in clang. llvm-svn: 184356
* Simplify code. No functionality change.Jakub Staszak2013-06-061-2/+1
| | | | llvm-svn: 183461
* Simplify multiplications by vectors whose elements are powers of 2.Rafael Espindola2013-05-311-16/+48
| | | | | | Patch by Andrea Di Biagio. llvm-svn: 183005
* This is an update to a previous commit (r181216).Jean-Luc Duprat2013-05-221-29/+0
| | | | | | | | | | | | | | | | | | | | | | | The earlier change list introduced the following inst combines: B * (uitofp i1 C) —> select C, B, 0 A * (1 - uitofp i1 C) —> select C, 0, A select C, 0, B + select C, A, 0 —> select C, A, B Together these 3 changes would simplify : A * (1 - uitofp i1 C) + B * uitofp i1 C down to : select C, B, A In practice we found that the first two substitutions can have a negative effect on performance, because they reduce opportunities to use FMA contractions; between the two options FMAs are often the better choice. This change list amends the previous one to enable just these inst combines: select C, B, 0 + select C, 0, A —> select C, B, A A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A llvm-svn: 182499
* Fix two typoSylvestre Ledru2013-05-141-1/+1
| | | | llvm-svn: 181848
* InstCombine: Flip the order of two urem transformsDavid Majnemer2013-05-121-6/+6
| | | | | | | | | | | | | | There are two transforms in visitUrem that conflict with each other. *) One, if a divisor is a power of two, subtracts one from the divisor and turns it into a bitwise-and. *) The other unwraps both operands if they are surrounded by zext instructions. Flipping the order allows the subtraction to go beneath the sign extension. llvm-svn: 181668
* InstCombine: Turn urem to bitwise-and more oftenDavid Majnemer2013-05-111-20/+2
| | | | | | | Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively fold away urem instructions. llvm-svn: 181661
* InstCombine: Verify the type before transforming uitofp into select.Benjamin Kramer2013-05-101-22/+23
| | | | | | PR15952. llvm-svn: 181586
* Provide InstCombines for the following 3 cases:Jean-Luc Duprat2013-05-061-0/+28
| | | | | | | | | | | | | A * (1 - (uitofp i1 C)) -> select C, 0, A B * (uitofp i1 C) -> select C, B, 0 select C, 0, A + select C, B, 0 -> select C, B, A These come up in code that has been hand-optimized from a select to a linear blend, on platforms where that may have mattered. We want to undo such changes with the following transform: A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B llvm-svn: 181216
* Tidy up a bit. No functional change.Jim Grosbach2013-04-051-56/+56
| | | | llvm-svn: 178915
* Fix a bug in instcombine for fmul in fast math mode.Quentin Colombet2013-02-281-3/+3
| | | | | | | | | | | | | | | The instcombine recognized pattern looks like: a = b * c d = a +/- Cst or a = b * c d = Cst +/- a When creating the new operands for fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0). The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1. llvm-svn: 176300
* 1. Hoist minus sign as high as possible in an attempt to revealShuxin Yang2013-01-151-31/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | some optimization opportunities (in the enclosing supper-expressions). rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y) if expression "-0.0 - X" has only one reference. rule 2. (0.0 - X ) * Y => -0.0 - (X * Y) if expression "0.0 - X" has only one reference, and the instruction is marked "noSignedZero". 2. Eliminate negation (The compiler was already able to handle these opt if the 0.0s are replaced with -0.0.) rule 3: (0.0 - X) * (0.0 - Y) => X * Y rule 4: (0.0 - X) * C => X * -C if the expr is flagged "noSignedZero". 3. Rule 5: (X*Y) * X => (X*X) * Y if X!=Y and the expression is flagged with "UnsafeAlgebra". The purpose of this transformation is two-fold: a) to form a power expression (of X). b) potentially shorten the critical path: After transformation, the latency of the instruction Y is amortized by the expression of X*X, and therefore Y is in a "less critical" position compared to what it was before the transformation. 4. Remove the InstCombine code about simplifiying "X * select". The reasons are following: a) The "select" is somewhat architecture-dependent, therefore the higher level optimizers are not able to precisely predict if the simplification really yields any performance improvement or not. b) The "select" operator is bit complicate, and tends to obscure optimization opportunities. It is btter to keep it as low as possible in expr tree, and let CodeGen to tackle the optimization. llvm-svn: 172551
* This change is to implement following rules under the condition C_A and/or C_RShuxin Yang2013-01-141-8/+127
| | | | | | | | | | | | | | | | | | | | | --------------------------------------------------------------------------- C_A: reassociation is allowed C_R: reciprocal of a constant C is appropriate, which means - 1/C is exact, or - reciprocal is allowed and 1/C is neither a special value nor a denormal. ----------------------------------------------------------------------------- rule1: (X/C1) / C2 => X / (C2*C1) (if C_A) => X * (1/(C2*C1)) (if C_A && C_R) rule 2: X*C1 / C2 => X * (C1/C2) if C_A rule 3: (X/Y)/Z = > X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value) rule 4: Z/(X/Y) = > (Z*Y)/X (similar to rule3) rule 5: C1/(X*C2) => (C1/C2) / X (if C_A) rule 6: C1/(X/C2) => (C1*C2) / X (if C_A) rule 7: C1/(C2/X) => (C1/C2) * X (if C_A) llvm-svn: 172488
* Cosmetical changne in order to conform to coding std.Shuxin Yang2013-01-071-5/+6
| | | | | | Thank Eric Christopher for figuring out these problems! llvm-svn: 171805
* This change is to implement following rules:Shuxin Yang2013-01-071-0/+127
| | | | | | | | | | | o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal) o. X/C1 * C2 -> X/(C1/C2) (if C2/C1 is either specical FP or denormal, but C1/C2 is a normal Fp) Let MDC denote multiplication or dividion with one & only one operand being a constant o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2) (so long as the constant-folding doesn't yield any denormal or special value) llvm-svn: 171793
* Move all of the header files which are involved in modelling the LLVM IRChandler Carruth2013-01-021-1/+1
| | | | | | | | | | | | | | | | | | | | | into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
* rdar://12753946Shuxin Yang2012-12-141-0/+32
| | | | | | Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0" llvm-svn: 170226
* Rename isPowerOfTwo to isKnownToBeAPowerOfTwo.Rafael Espindola2012-12-131-2/+2
| | | | | | | | In a previous thread it was pointed out that isPowerOfTwo is not a very precise name since it can return false for powers of two if it is unable to show that they are powers of two. llvm-svn: 170093
* The TargetData is not used for the isPowerOfTwo determination. It has neverRafael Espindola2012-12-121-3/+2
| | | | | | | | | | been used in the first place. It simply was passed to the function and to the recursive invocations. Simply drop the parameter and update the callers for the new signature. Patch by Saleem Abdulrasool! llvm-svn: 169988
OpenPOWER on IntegriCloud