summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis/InstructionSimplify.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC.Sean Silva2016-07-021-4/+3
| | | | | | | | | | | | | | This actually uncovered a surprisingly large chain of ultimately unused TLI args. From what I can gather, this argument is a remnant of when isKnownNonNull would look at the TLI directly. The current approach seems to be that InferFunctionAttrs runs early in the pipeline and uses TLI to annotate the TLI-dependent non-null information as return attributes. This also removes the dependence of functionattrs on TLI altogether. llvm-svn: 274455
* [InstSimplify] Replace calls to null with undefDavid Majnemer2016-06-251-1/+2
| | | | | | | Calling null is undefined behavior, we can simplify the resulting value to undef. llvm-svn: 273777
* [InstSimplify] analyze (optionally casted) icmps to eliminate obviously ↵Sanjay Patel2016-06-201-4/+31
| | | | | | | | | | | | | | | false logic (PR27869) By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question raised by PR27869: https://llvm.org/bugs/show_bug.cgi?id=27869 ...where InstCombine turns an icmp+zext into a shift causing us to miss the fold. Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp. Differential Revision: http://reviews.llvm.org/D21512 llvm-svn: 273200
* fix formatting, typo; NFCSanjay Patel2016-06-191-1/+1
| | | | llvm-svn: 273118
* IR: Introduce local_unnamed_addr attribute.Peter Collingbourne2016-06-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a local_unnamed_addr attribute is attached to a global, the address is known to be insignificant within the module. It is distinct from the existing unnamed_addr attribute in that it only describes a local property of the module rather than a global property of the symbol. This attribute is intended to be used by the code generator and LTO to allow the linker to decide whether the global needs to be in the symbol table. It is possible to exclude a global from the symbol table if three things are true: - This attribute is present on every instance of the global (which means that the normal rule that the global must have a unique address can be broken without being observable by the program by performing comparisons against the global's address) - The global has linkonce_odr linkage (which means that each linkage unit must have its own copy of the global if it requires one, and the copy in each linkage unit must be the same) - It is a constant or a function (which means that the program cannot observe that the unique-address rule has been broken by writing to the global) Although this attribute could in principle be computed from the module contents, LTO clients (i.e. linkers) will normally need to be able to compute this property as part of symbol resolution, and it would be inefficient to materialize every module just to compute it. See: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html for earlier discussion. Part of the fix for PR27553. Differential Revision: http://reviews.llvm.org/D20348 llvm-svn: 272709
* Avoid copies of std::strings and APInt/APFloats where we only read from itBenjamin Kramer2016-06-081-1/+1
| | | | | | | | As suggested by clang-tidy's performance-unnecessary-copy-initialization. This can easily hit lifetime issues, so I audited every change and ran the tests under asan, which came back clean. llvm-svn: 272126
* [InstSimplify] use computeKnownBits on shift amount operandsSanjay Patel2016-05-101-0/+16
| | | | | | | | | | | | | | Do simplifications common to all shift instructions based on the amount shifted: 1. If the shift amount is known larger than the bitwidth, the result is undefined. 2. If the valid bits of the shift amount are all known to be 0, it's a shift by zero, so the shift operand is the result. Note that we could generalize the shift-by-zero transform into a shift-by-constant if all of the valid bits in the shift amount are known, but that would have to be done in InstCombine rather than here because it would mean we need to create a new shift instruction. Differential Revision: http://reviews.llvm.org/D19874 llvm-svn: 269114
* Fold compares irrespective of whether allocation can be elidedAnna Thomas2016-05-031-5/+21
| | | | | | | | | | | | | | | | | | | Summary When a non-escaping pointer is compared to a global value, the comparison can be folded even if the corresponding malloc/allocation call cannot be elided. We need to make sure the global value is not null, since comparisons to null cannot be folded. In future, we should also handle cases when the the comparison instruction dominates the pointer escape. Reviewers: sanjoy Subscribers s.egerton, llvm-commits Differential Revision: http://reviews.llvm.org/D19549 llvm-svn: 268390
* Introduce llvm.load.relative intrinsic.Peter Collingbourne2016-04-221-0/+61
| | | | | | | | | | | | | | | | | | | This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant. Differential Revision: http://reviews.llvm.org/D18367 llvm-svn: 267223
* Add optimization for 'icmp slt (or A, B), A' and some related idioms based ↵Nick Lewycky2016-04-211-15/+42
| | | | | | | | | | | | | | | | | | | | on knowledge of the sign bit for A and B. No matter what value you OR in to A, the result of (or A, B) is going to be UGE A. When A and B are positive, it's SGE too. If A is negative, OR'ing a value into it can't make it positive, but can increase its value closer to -1, therefore (or A, B) is SGE A. Working through all possible combinations produces this truth table: ``` A is +, -, +/- F F F + B is T F ? - ? F ? +/- ``` The related optimizations are flipping the 'slt' for 'sge' which always NOTs the result (if the result is known), and swapping the LHS and RHS while swapping the comparison predicate. There are more idioms left to implement (aren't there always!) but I've stopped here because any more would risk becoming unreasonable for reviewers. llvm-svn: 266939
* [ValueTracking] Make isImpliedCondition return an Optional<bool>. NFC.Chad Rosier2016-04-201-6/+3
| | | | | | Phabricator Revision: http://reviews.llvm.org/D19277 llvm-svn: 266904
* [ValueTracking] Improve isImpliedCondition for conditions with matching ↵Chad Rosier2016-04-191-6/+12
| | | | | | | | | | | | | | | operands. This patch improves SimplifyCFG to catch cases like: if (a < b) { if (a > b) <- known to be false unreachable; } Phabricator Revision: http://reviews.llvm.org/D18905 llvm-svn: 266767
* [InstCombine] We folded an fcmp to an i1 instead of a vector of i1David Majnemer2016-04-131-12/+19
| | | | | | | | | Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175
* Don't IPO over functions that can be de-refinedSanjoy Das2016-04-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762
* Minor code cleanups. NFC.Junmo Park2016-04-051-2/+2
| | | | llvm-svn: 265468
* [InstSimplify] Restore fsub 0.0, (fsub 0.0, X) ==> X optznBenjamin Kramer2016-02-291-1/+1
| | | | | | | I accidentally removed this in r262212 but there was no test coverage to detect it. llvm-svn: 262215
* [InstSimplify] fsub 0.0, (fsub -0.0, X) ==> X is only safe if signed zeros ↵Benjamin Kramer2016-02-291-7/+8
| | | | | | | | are ignored. Only allow fsub -0.0, (fsub -0.0, X) ==> X without nsz. PR26746. llvm-svn: 262212
* [opaque pointer types] [NFC] Add an explicit type argument to ↵Eduard Burtescu2016-01-221-1/+1
| | | | | | | | | | | | ConstantFoldLoadFromConstPtr. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16418 llvm-svn: 258472
* [InstCombine] Simplify (x >> y) <= xDavid Majnemer2016-01-211-2/+4
| | | | | | | | | | | | This commit extends the patterns recognised by InstSimplify to also handle (x >> y) <= x in the same way as (x /u y) <= x. The missing optimisation was found investigating why LLVM did not optimise away bound checks in a binary search: https://github.com/rust-lang/rust/pull/30917 Patch by Andrea Canciani! Differential Revision: http://reviews.llvm.org/D16402 llvm-svn: 258422
* Change ConstantFoldInstOperands to take Instruction instead of opcode and ↵Manuel Jacob2016-01-211-2/+1
| | | | | | | | | | | | | | | | | | | | | type. NFC. Summary: The previous form, taking opcode and type, is moved to an internal helper and the new form, taking an instruction, is a wrapper around this helper. Although this is a slight cleanup on its own, the main motivation is to refactor the constant folding API to ease migration to opaque pointers. This will be follow-up work. Reviewers: eddyb Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D16383 llvm-svn: 258391
* Introduce ConstantFoldCastOperand function and migrate some callers of ↵Manuel Jacob2016-01-211-1/+1
| | | | | | | | | | | | | | | | | ConstantFoldInstOperands to use it. NFC. Summary: Although this is a slight cleanup on its own, the main motivation is to refactor the constant folding API to ease migration to opaque pointers. This will be follow-up work. Reviewers: eddyb Subscribers: zzheng, dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D16380 llvm-svn: 258390
* Introduce ConstantFoldBinaryOpOperands function and migrate some callers of ↵Manuel Jacob2016-01-211-68/+29
| | | | | | | | | | | | | | | | | ConstantFoldInstOperands to use it. NFC. Summary: Although this is a slight cleanup on its own, the main motivation is to refactor the constant folding API to ease migration to opaque pointers. This will be follow-up work. Reviewers: eddyb Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D16378 llvm-svn: 258389
* fix typo; NFCSanjay Patel2016-01-201-1/+1
| | | | llvm-svn: 258332
* [opaque pointer types] [breaking-change] [NFC] SimplifyGEPInst: take the ↵Manuel Jacob2016-01-171-5/+6
| | | | | | | | | | | | | | source element type of the GEP as an argument. Patch by Eduard Burtescu. Reviewers: dblaikie, mjacob Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16281 llvm-svn: 258024
* getParent()->getParent() == getFunction() and clang-format ; NFCSanjay Patel2016-01-111-15/+11
| | | | llvm-svn: 257399
* don't repeat function names in comments; NFCSanjay Patel2016-01-111-88/+85
| | | | llvm-svn: 257396
* Add a missing const qualifier on the context instruction. This somehowChandler Carruth2015-12-241-1/+1
| | | | | | has always been missing. =/ llvm-svn: 256371
* [IR] Reformulate LLVM's EH funclet IRDavid Majnemer2015-12-121-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While we have successfully implemented a funclet-oriented EH scheme on top of LLVM IR, our scheme has some notable deficiencies: - catchendpad and cleanupendpad are necessary in the current design but they are difficult to explain to others, even to seasoned LLVM experts. - catchendpad and cleanupendpad are optimization barriers. They cannot be split and force all potentially throwing call-sites to be invokes. This has a noticable effect on the quality of our code generation. - catchpad, while similar in some aspects to invoke, is fairly awkward. It is unsplittable, starts a funclet, and has control flow to other funclets. - The nesting relationship between funclets is currently a property of control flow edges. Because of this, we are forced to carefully analyze the flow graph to see if there might potentially exist illegal nesting among funclets. While we have logic to clone funclets when they are illegally nested, it would be nicer if we had a representation which forbade them upfront. Let's clean this up a bit by doing the following: - Instead, make catchpad more like cleanuppad and landingpad: no control flow, just a bunch of simple operands; catchpad would be splittable. - Introduce catchswitch, a control flow instruction designed to model the constraints of funclet oriented EH. - Make funclet scoping explicit by having funclet instructions consume the token produced by the funclet which contains them. - Remove catchendpad and cleanupendpad. Their presence can be inferred implicitly using coloring information. N.B. The state numbering code for the CLR has been updated but the veracity of it's output cannot be spoken for. An expert should take a look to make sure the results are reasonable. Reviewers: rnk, JosephTremoulet, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D15139 llvm-svn: 255422
* Remove unnecessary intermediate lambda. NFCCraig Topper2015-11-291-2/+1
| | | | llvm-svn: 254243
* [ValueTracking] Add parameters to isImpliedCondition; NFCSanjoy Das2015-11-061-3/+3
| | | | | | | | | | | | | | | | Summary: This change makes the `isImpliedCondition` interface similar to the rest of the functions in ValueTracking (in that it takes a DataLayout, AssumptionCache etc.). This is an NFC, intended to make a later diff less noisy. Depends on D14369 Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14391 llvm-svn: 252333
* [InstSimplify] sgt on i1s also encodes implicationPhilip Reames2015-10-291-0/+11
| | | | | | | | | | | | | | | Follow on to http://reviews.llvm.org/D13074, implementing something pointed out by Sanjoy. His truth table from his comment on that bug summarizes things well: LHS | RHS | LHS >=s RHS | LHS implies RHS 0 | 0 | 1 (0 >= 0) | 1 0 | 1 | 1 (0 >= -1) | 1 1 | 0 | 0 (-1 >= 0) | 0 1 | 1 | 1 (-1 >= -1) | 1 The key point is that an "i1 1" is the value "-1", not "1". Differential Revision: http://reviews.llvm.org/D13756 llvm-svn: 251597
* [ValueTracking] Expose `implies` via ValueTracking, NFCSanjoy Das2015-10-281-50/+2
| | | | | | | | | | | | Summary: This will allow a later patch to `JumpThreading` use this functionality. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13971 llvm-svn: 251488
* Extract out getConstantRangeFromMetadata; NFCSanjoy Das2015-10-241-31/+5
| | | | | | | | | | | | The loop idiom creating a ConstantRange is repeated twice in the codebase, time to give it a name and a home. The loop is also repeated in `rangeMetadataExcludesValue`, but using `getConstantRangeFromMetadata` there would not be an NFC -- the range returned by `getConstantRangeFromMetadata` may contain a value that none of the subranges did. llvm-svn: 251180
* Fix whitespace issues in two places; NFCSanjoy Das2015-10-241-1/+2
| | | | llvm-svn: 251179
* Handle non-constant shifts in computeKnownBits, and use computeKnownBits for ↵Hal Finkel2015-10-231-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | constant folding in InstCombine/Simplify First, the motivation: LLVM currently does not realize that: ((2072 >> (L == 0)) >> 7) & 1 == 0 where L is some arbitrary value. Whether you right-shift 2072 by 7 or by 8, the lowest-order bit is always zero. There are obviously several ways to go about fixing this, but the generic solution pursued in this patch is to teach computeKnownBits something about shifts by a non-constant amount. Previously, we would give up completely on these. Instead, in cases where we know something about the low-order bits of the shift-amount operand, we can combine (and together) the associated restrictions for all shift amounts consistent with that knowledge. As a further generalization, I refactored all of the logic for all three kinds of shifts to have this capability. This works well in the above case, for example, because the dynamic shift amount can only be 0 or 1, and thus we can say a lot about the known bits of the result. This brings us to the second part of this change: Even when we know all of the bits of a value via computeKnownBits, nothing used to constant-fold the result. This introduces the necessary code into InstCombine and InstSimplify. I've added it into both because: 1. InstCombine won't automatically pick up the associated logic in InstSimplify (InstCombine uses InstSimplify, but not via the API that passes in the original instruction). 2. Putting the logic in InstCombine allows the resulting simplifications to become part of the iterative worklist 3. Putting the logic in InstSimplify allows the resulting simplifications to be used by everywhere else that calls SimplifyInstruction (inlining, unrolling, and many others). And this requires a small change to our definition of an ephemeral value so that we don't break the rest case from r246696 (where the icmp feeding the @llvm.assume, is also feeding a br). Under the old definition, the icmp would not be considered ephemeral (because it is used by the br), but this causes the assume to remove itself (in addition to simplifying the branch structure), and it seems more-useful to prevent that from happening. llvm-svn: 251146
* [ValueTracking] Add a new predicate: isKnownNonEqual()James Molloy2015-10-221-0/+8
| | | | | | | | | | | | | | isKnownNonEqual(A, B) returns true if it can be determined that A != B. At the moment it only knows two facts, that a non-wrapping add of nonzero to a value cannot be that value: A + B != A [where B != 0, addition is nsw or nuw] and that contradictory known bits imply two values are not equal. This patch also hooks this up to InstSimplify; InstSimplify had a peephole for the first fact but not the second so this teaches InstSimplify a new trick too (alas no measured performance impact!) llvm-svn: 251012
* Fix pr25040 - Handle vectors of i1s in recently added implication codePhilip Reames2015-10-061-4/+11
| | | | | | | | As mentioned in the bug, I'd missed the presence of a getScalarType in the caller of the new implies method. As a result, when we ended up with a implication over two vectors, we'd trip an assert and crash. Differential Revision: http://reviews.llvm.org/D13441 llvm-svn: 249442
* [InstSimplify] Fold simple known implications to truePhilip Reames2015-09-281-0/+47
| | | | | | | | | | This was split off of http://reviews.llvm.org/D13040 to make it easier to test the correctness of the implication logic. For the moment, this only handles a single easy case which shows up when eliminating and combining range checks. In the (near) future, I plan to extend this for other cases which show up in range checks, but I wanted to make those changes incrementally once the framework was in place. At the moment, the implication logic will be used by three places. One in InstSimplify (this review) and two in SimplifyCFG (http://reviews.llvm.org/D13040 & http://reviews.llvm.org/D13070). Can anyone think of other locations this style of reasoning would make sense? Differential Revision: http://reviews.llvm.org/D13074 llvm-svn: 248719
* [Bug 24848] Use range metadata to constant fold comparisons between two valuesChen Li2015-09-261-0/+26
| | | | | | | | | | | | | | | Summary: This is the second part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848. If both operands of a comparison have range metadata, they should be used to constant fold the comparison. Reviewers: sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13177 llvm-svn: 248650
* [Bug 24848] Use range metadata to constant fold comparisons with constant valuesChen Li2015-09-231-2/+32
| | | | | | | | | | | | | | | Summary: This is the first part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848. When range metadata is provided, it should be used to constant fold comparisons with constant values. Reviewers: sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12988 llvm-svn: 248402
* [InstSimplify] add nuw %x, C2 must be at least C2David Majnemer2015-08-201-0/+3
| | | | | | | Use the fact that add nuw always creates a larger bit pattern when trying to simplify comparisons. llvm-svn: 245638
* [InstSimplify] Remove unused variableDavid Majnemer2015-08-181-6/+2
| | | | | | No functionality change is intended. llvm-svn: 245369
* [InstSimplify] Don't assume getAggregateElement will succeedDavid Majnemer2015-08-181-5/+0
| | | | | | | It isn't always possible to get a value from getAggregateElement. This fixes PR24488. llvm-svn: 245365
* [IR] Give catchret an optional 'return value' operandDavid Majnemer2015-08-151-2/+2
| | | | | | | | | | | Some personality routines require funclet exit points to be clearly marked, this is done by producing a token at the funclet pad and consuming it at the corresponding ret instruction. CleanupReturnInst already had a spot for this operand but CatchReturnInst did not. Other personality routines don't need to use this which is why it has been made optional. llvm-svn: 245149
* [InstSimplify] Teach InstSimplify how to simplify extractelementDavid Majnemer2015-07-131-0/+48
| | | | llvm-svn: 242008
* [InstSimplify] Teach InstSimplify how to simplify extractvalueDavid Majnemer2015-07-131-0/+41
| | | | llvm-svn: 242007
* [InstSimplify] Fold away ord/uno fcmps when nnan is present.Benjamin Kramer2015-07-101-8/+17
| | | | | | | This is important to fold away the slow case of complex multiplies emitted by clang. llvm-svn: 241911
* [InstSimplify] Allow folding of fdiv X, X with just NaNs ignoredBenjamin Kramer2015-06-161-3/+3
| | | | | | | | Any combination of +-inf/+-inf is NaN so it's already ignored with nnan and we can skip checking for ninf. Also rephrase logic in comments a bit. llvm-svn: 239821
* [InstSimplify] fsub nnan x, x -> 0.0 is valid without ninfBenjamin Kramer2015-06-141-2/+2
| | | | | | | Both inf - inf and (-inf) - (-inf) are NaN, so it's already covered by nnan. llvm-svn: 239702
* [InstSimplify] Add self-fdiv identities for -ffinite-math-only.Benjamin Kramer2015-06-141-0/+15
| | | | | | | | | When NaNs and Infs are ignored we can fold X / X -> 1.0 -X / X -> -1.0 X / -X -> -1.0 llvm-svn: 239701
OpenPOWER on IntegriCloud