summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] Don't eagerly propagate nsw for A*B+A*C => A*(B+C)David Majnemer2015-05-221-3/+16
| | | | | | | | | | | | | | | | InstCombine transforms A *nsw B +nsw A *nsw C to A *nsw (B + C). This is incorrect -- e.g. if A = -1, B = 1, C = INT_SMAX. Then nothing in the LHS overflows, but the multiplication in RHS overflows. We need to first make sure that we won't multiple by INT_SMAX + 1. Test case `add_of_mul` contributed by Sanjoy Das. This fixes PR23635. Differential Revision: http://reviews.llvm.org/D9629 llvm-svn: 238066
* [InstSimplify] Handle some overflow intrinsics in InstSimplifyDavid Majnemer2015-05-222-12/+6
| | | | | | | | | This change does a few things: - Move some InstCombine transforms to InstSimplify - Run SimplifyCall from within InstCombine::visitCallInst - Teach InstSimplify to fold [us]mul_with_overflow(X, undef) to 0. llvm-svn: 237995
* [InstCombine] X - 0 is equal to X, not undefDavid Majnemer2015-05-211-27/+21
| | | | | | | | | A refactoring made @llvm.ssub.with.overflow.i32(i32 %X, i32 0) transform into undef instead of %X. This fixes PR23624. llvm-svn: 237968
* Reapply r237539 with a fix for the Chromium build.James Molloy2015-05-205-7/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure if we're truncating a constant that would then be sign extended that the sign extension of the truncated constant is the same as the original constant. > Canonicalize min/max expressions correctly. > > This patch introduces a canonical form for min/max idioms where one operand > is extended or truncated. This often happens when the other operand is a > constant. For example: > > %1 = icmp slt i32 %a, i32 0 > %2 = sext i32 %a to i64 > %3 = select i1 %1, i64 %2, i64 0 > > Would now be canonicalized into: > > %1 = icmp slt i32 %a, i32 0 > %2 = select i1 %1, i32 %a, i32 0 > %3 = sext i32 %2 to i64 > > This builds upon a patch posted by David Majenemer > (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass > passively stopped instcombine from ruining canonical patterns. This > patch additionally actively makes instcombine canonicalize too. > > Canonicalization of expressions involving a change in type from int->fp > or fp->int are not yet implemented. llvm-svn: 237821
* Revert r237539: "Reapply r237520 with another fix for infinite looping"Hans Wennborg2015-05-195-63/+7
| | | | | | This caused PR23583. llvm-svn: 237739
* Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced ↵David Blaikie2015-05-181-2/+2
| | | | | | init only llvm-svn: 237624
* Reapply r237520 with another fix for infinite loopingJames Molloy2015-05-175-7/+63
| | | | | | | | | SimplifyDemandedBits was "simplifying" a constant by removing just sign bits. This caused a canonicalization race between different parts of instcombine. Fix and regression test added - third time lucky? llvm-svn: 237539
* Revert commits r237521 and r237520.James Molloy2015-05-164-56/+7
| | | | | | | | The AArch64 LNT bot is unhappy - I've found that the problem is in SimpliftDemandedBits, but that's going to require another code review so reverting in the meantime. llvm-svn: 237528
* Reapply r237453 with a fix for the test timeouts.James Molloy2015-05-164-7/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The test timeouts were due to instcombine fighting itself. Regression test added. Original log message: Canonicalize min/max expressions correctly. This patch introduces a canonical form for min/max idioms where one operand is extended or truncated. This often happens when the other operand is a constant. For example: %1 = icmp slt i32 %a, i32 0 %2 = sext i32 %a to i64 %3 = select i1 %1, i64 %2, i64 0 Would now be canonicalized into: %1 = icmp slt i32 %a, i32 0 %2 = select i1 %1, i32 %a, i32 0 %3 = sext i32 %2 to i64 This builds upon a patch posted by David Majenemer (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass passively stopped instcombine from ruining canonical patterns. This patch additionally actively makes instcombine canonicalize too. Canonicalization of expressions involving a change in type from int->fp or fp->int are not yet implemented. llvm-svn: 237520
* Revert "Canonicalize min/max expressions correctly."James Molloy2015-05-153-47/+7
| | | | | | | This reverts r237453 - it was causing timeouts on some bots. Reverting while I investigate (it's probably InstCombine fighting itself...) llvm-svn: 237458
* Canonicalize min/max expressions correctly.James Molloy2015-05-153-7/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a canonical form for min/max idioms where one operand is extended or truncated. This often happens when the other operand is a constant. For example: %1 = icmp slt i32 %a, i32 0 %2 = sext i32 %a to i64 %3 = select i1 %1, i64 %2, i64 0 Would now be canonicalized into: %1 = icmp slt i32 %a, i32 0 %2 = select i1 %1, i32 %a, i32 0 %3 = sext i32 %2 to i64 This builds upon a patch posted by David Majenemer (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass passively stopped instcombine from ruining canonical patterns. This patch additionally actively makes instcombine canonicalize too. Canonicalization of expressions involving a change in type from int->fp or fp->int are not yet implemented. llvm-svn: 237453
* [ValueTracking] refactor: extract method haveNoCommonBitsSetJingyue Wu2015-05-141-14/+2
| | | | | | | | | | | | | | | | | | | | | Summary: Extract method haveNoCommonBitsSet so that we don't have to duplicate this logic in InstCombine and SeparateConstOffsetFromGEP. This patch also makes SeparateConstOffsetFromGEP more precise by passing DominatorTree to computeKnownBits. Test Plan: value-tracking-domtree.ll that tests ValueTracking indeed leverages dominating conditions Reviewers: broune, meheff, majnemer Reviewed By: majnemer Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9734 llvm-svn: 237407
* Convert PHI getIncomingValue() to foreach over incoming_values(). NFC.Pete Cooper2015-05-124-12/+11
| | | | | | | | We already had a method to iterate over all the incoming values of a PHI. This just changes all eligible code to use it. Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB. llvm-svn: 237169
* [RewriteStatepointsForGC] Fix a bug on creating gc_relocate for pointer to ↵Sanjoy Das2015-05-111-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vector of pointers Summary: In RewriteStatepointsForGC pass, we create a gc_relocate intrinsic for each relocated pointer, and the gc_relocate has the same type with the pointer. During the creation of gc_relocate intrinsic, llvm requires to mangle its type. However, llvm does not support mangling of all possible types. RewriteStatepointsForGC will hit an assertion failure when it tries to create a gc_relocate for pointer to vector of pointers because mangling for vector of pointers is not supported. This patch changes the way RewriteStatepointsForGC pass creates gc_relocate. For each relocated pointer, we erase the type of pointers and create an unified gc_relocate of type i8 addrspace(1)*. Then a bitcast is inserted to convert the gc_relocate to the correct type. In this way, gc_relocate does not need to deal with different types of pointers and the unsupported type mangling is no longer a problem. This change would also ease further merge when LLVM erases types of pointers and introduces an unified pointer type. Some minor changes are also introduced to gc_relocate related part in InstCombineCalls, CodeGenPrepare, and Verifier accordingly. Patch by Chen Li! Reviewers: reames, AndyAyers, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9592 llvm-svn: 237009
* Rip min/max pattern matching out of InstCombine and intoJames Molloy2015-05-112-99/+4
| | | | | | | | | | | ValueTracking. This matching functionality is useful in more than just InstCombine, so make it available in ValueTracking. NFC. llvm-svn: 236998
* [InstCombine/PowerPC] Fix single-precision QPX load/store replacementHal Finkel2015-05-111-5/+10
| | | | | | | | | | The QPX single-precision load/store intrinsics have implied truncation/extension from/to the declared value type of <4 x double> to the memory type of <4 x float>. When we can prove the alignment of the pointer argument, and thus replace the intrinsic with a regular load or store, we need to load or store the correct data type (<4 x float>) instead of (<4 x double>). llvm-svn: 236973
* [InstCombine] Canonicalize single element array storeDavid Majnemer2015-05-111-0/+9
| | | | | | | | Use the element type instead of the aggregate type. Differential Revision: http://reviews.llvm.org/D9591 llvm-svn: 236969
* [InstCombine] Canonicalize single element array loadDavid Majnemer2015-05-111-0/+10
| | | | | | | | Use the element type instead of the aggregate type. Differential Revision: http://reviews.llvm.org/D9596 llvm-svn: 236968
* Update InstCombine to transform aggregate loads into scalar loads.Mehdi Amini2015-05-071-3/+32
| | | | | | | | | | | | | | | | | | | | | Summary: One step further getting aggregate loads and store being optimized properly. This will only handle struct with one element at this point. Test Plan: Added unit tests for the new supported cases. Reviewers: chandlerc, joker-eph, joker.eph, majnemer Reviewed By: majnemer Subscribers: pete, llvm-commits Differential Revision: http://reviews.llvm.org/D8339 Patch by Amaury Sechet. From: Amaury Sechet <amaury@fb.com> llvm-svn: 236695
* Change typeIncompatible to return an AttrBuilder instead of new-ing an ↵Pete Cooper2015-05-061-10/+3
| | | | | | | | | | AttributeSet. This makes use of the new API which can remove attributes from a set given a builder. This is much faster than creating a temporary set and reduces llc time by about 0.3% which was all spent creating temporary attributes sets on the context. llvm-svn: 236668
* [Statepoint] Clean up Statepoint.h: accessor names.Sanjoy Das2015-05-061-1/+1
| | | | | | Use getFoo() as accessors consistently and some other naming changes. llvm-svn: 236564
* [opaque pointer type] Track explicit GEP pointee type through in-memory IRDavid Blaikie2015-05-051-0/+1
| | | | llvm-svn: 236510
* InstCombineSimplifyDemanded: Remove nsw/nuw flags when optimizing demanded bitsMatthias Braun2015-04-301-102/+15
| | | | | | | | | | | | | | | | | | | | | | | | | When optimizing demanded bits of the operands of an Add we have to remove the nsw/nuw flags as we have no guarantee anymore that we don't wrap. This is legal here because the top bit is not demanded. In fact this operaion was already performed but missed in the case of an Add with a constant on the right side. To fix this this patch refactors the code to unify the code paths in SimplifyDemandedUseBits() handling of Add/Sub: - The transformation of Add->Or is removed from the simplify demand code because the equivalent transformation exists in InstCombiner::visitAdd() - KnownOnes/KnownZero are not adjusted for Add x, C anymore as computeKnownBits() already performs these computations. - The simplification of the operands is unified. In this new version constant on the right side of a Sub are shrunk now as I could not find a reason why not to do so. - The special case for clearing nsw/nuw in ShrinkDemandedConstant() is not necessary anymore as the caller does that already. Differential Revision: http://reviews.llvm.org/D9415 llvm-svn: 236269
* InstCombine: Move Sub->Xor rule from SimplifyDemanded to InstCombineMatthias Braun2015-04-302-10/+13
| | | | | | | | | | The rule that turns a sub to xor if the LHS is 2^n-1 and the remaining bits are known zero, does not use the demanded bits at all: Move it to the normal InstCombine code path. Differential Revision: http://reviews.llvm.org/D9417 llvm-svn: 236268
* [InstCombine] Add new rule for MIN(MAX(~A, ~B), ~C) et. al.Sanjoy Das2015-04-301-0/+86
| | | | | | | | | | | | | | | | | Summary: Optimizing these well are especially interesting for IRCE since it "clamps" values by generating this sort of pattern through SCEV expressions. Depends on D9352. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9353 llvm-svn: 236203
* [InstCombine] Add a new formula for SMIN.Sanjoy Das2015-04-301-0/+11
| | | | | | | | | | | | | | | | Summary: After this change `MatchSelectPattern` recognizes the following form of SMIN: Y >s C ? ~Y : ~C == ~Y <s ~C ? ~Y : ~C = SMIN(~Y, ~C) Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9352 llvm-svn: 236202
* [x86] instcombine more cases of insertps into a shufflevectorSanjay Patel2015-04-251-14/+31
| | | | | | | | | | | | This is a follow-on to D8833 (insertps optimization when the zero mask is not used). In this patch, we check for the case where the zmask is used, but both input vectors to the insertps intrinsic are the same operand or the zmask overrides the destination lane. This lets us replace the 2nd shuffle input operand with the zero vector. Differential Revision: http://reviews.llvm.org/D9257 llvm-svn: 235810
* Move Value.isDereferenceablePointer to ValueTracking [NFC]Philip Reames2015-04-231-1/+1
| | | | | | | | | | | Move isDereferenceablePointer function to Analysis. This function recursively tracks dereferencability over a chain of values like other functions in ValueTracking. This refactoring is motivated by further changes to support dereferenceable_or_null attribute (http://reviews.llvm.org/D8650). isDereferenceablePointer will be extended to perform context-sensitive analysis and IR is not a good place to have such functionality. Patch by: Artur Pilipenko <apilipenko@azulsystems.com> Differential Revision: reviews.llvm.org/D9075 llvm-svn: 235611
* [InstCombine] Use a more targeted fix instead of r235544David Majnemer2015-04-221-9/+8
| | | | | | | | | Only clear out the NSW/NUW flags if we are optimizing 'add'/'sub' while taking advantage that the sign bit is not set. We do this optimization to further shrink the mask but shrinking the mask isn't NSW/NUW preserving in this case. llvm-svn: 235558
* [InstCombine] Clear out nsw/nuw if we modify computation in the chainDavid Majnemer2015-04-221-3/+10
| | | | | | | | | | | | An nsw/nuw operation relies on the values feeding into it to not overflow if 'poison' is not to be produced. This means that optimizations which make modifications to the bottom of a chain (like SimplifyDemandedBits) must strip out nsw/nuw if they cannot ensure that they will be preserved. This fixes PR23309. llvm-svn: 235544
* Limiting gep merging to fix the performance problem described inWei Mi2015-04-211-0/+5
| | | | | | | | | | | | | | | | | | https://llvm.org/bugs/show_bug.cgi?id=23163. Gep merging sometimes behaves like a reverse CSE/LICM optimization, which has negative impact on performance. In this patch we restrict gep merging to happen only when the indexes to be merged are both consts, which ensures such merge is always beneficial. The patch makes gep merging only happen in very restrictive cases. It is possible that some analysis/optimization passes rely on the merged geps to get better result, and we havn't notice them yet. We will be ready to further improve it once we see the cases. Differential Revision: http://reviews.llvm.org/D8911 llvm-svn: 235455
* Revert r235451 since it is attached to a wrong Differential Revision. Sorry.Wei Mi2015-04-211-5/+0
| | | | llvm-svn: 235453
* Limiting gep merging to fix the performance problem described inWei Mi2015-04-211-0/+5
| | | | | | | | | | | | | | | | | | https://llvm.org/bugs/show_bug.cgi?id=23163. Gep merging sometimes behaves like a reverse CSE/LICM optimizations, which has negative impact on performance. In this patch we restrict gep merging to happen only when the indexes to be merged are both consts, which ensures such merge is always beneficial. The patch makes gep merging only happen in very restrictive cases. It is possible that some analysis/optimization passes rely on the merged geps to get better result, and we havn't notice them yet. We will be ready to further improve it once we see the cases. Differential Revision: http://reviews.llvm.org/D9007 llvm-svn: 235451
* [InstCombine] Create zero constants on demand.Benjamin Kramer2015-04-181-4/+2
| | | | | | No functional change intended. llvm-svn: 235257
* [InstCombine] (mul nsw 1, INT_MIN) != (shl nsw 1, 31)David Majnemer2015-04-181-2/+6
| | | | | | | Multiplying INT_MIN by 1 doesn't trigger nsw. However, shifting 1 into the sign bit *does* trigger nsw. llvm-svn: 235250
* [X86, SSE] instcombine common cases of insertps intrinsics into shufflesSanjay Patel2015-04-161-2/+45
| | | | | | | | | | | | | | | | This is very similar to D8486 / r232852 (vperm2). If we treat insertps intrinsics as shufflevectors, we can optimize them better. I've left all but the full zero case of the zero mask variants out of this patch. I don't think those can be converted into a single shuffle in all cases, but I'd be happy to be proven wrong as I was for vperm2f128. Either way, we'd need to support whatever sequence we come up with for those cases in the backend before converting them here. Differential Revision: http://reviews.llvm.org/D8833 llvm-svn: 235124
* GCC complains thusly: "attributes at the beginning of statement are ignored ↵Nick Lewycky2015-04-131-1/+1
| | | | | | [-Werror=attributes]". Very well then! NFC llvm-svn: 234788
* Subtraction is not commutative. Fixes PR23212!Nick Lewycky2015-04-132-7/+9
| | | | llvm-svn: 234780
* [InstCombine][CodeGenPrep] Create llvm.uadd.with.overflow in CGP.Sanjoy Das2015-04-101-44/+12
| | | | | | | | | | | | | | | | | | | Summary: This change moves creating calls to `llvm.uadd.with.overflow` from InstCombine to CodeGenPrep. Combining overflow check patterns into calls to the said intrinsic in InstCombine inhibits optimization because it introduces an intrinsic call that not all other transforms and analyses understand. Depends on D8888. Reviewers: majnemer, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8889 llvm-svn: 234638
* [CallSite] Make construction from Value* (or Instruction*) explicit.Benjamin Kramer2015-04-101-1/+1
| | | | | | | | | | | | | | | | | | | CallSite roughly behaves as a common base CallInst and InvokeInst. Bring the behavior closer to that model by making upcasts explicit. Downcasts remain implicit and work as before. Following dyn_cast as a mental model checking whether a Value *V isa CallSite now looks like this: if (auto CS = CallSite(V)) // think dyn_cast instead of: if (CallSite CS = V) This is an extra token but I think it is slightly clearer. Making the ctor explicit has the advantage of not accidentally creating nullptr CallSites, e.g. when you pass a Value * to a function taking a CallSite argument. llvm-svn: 234601
* [InstCombine] Refactor out OptimizeOverflowCheck. NFCI.Sanjoy Das2015-04-083-100/+172
| | | | | | | | | | | | | | | | | | | Summary: This patch adds an enum `OverflowCheckFlavor` and a function `OptimizeOverflowCheck`. This will allow InstCombine to optimize overflow checks without directly introducing an intermediate call to the `llvm.$op.with.overflow` instrinsics. This specific change is a refactoring and does not intend to change behavior. Reviewers: majnemer, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8888 llvm-svn: 234388
* [opaque pointer type] More GEP IRBuilder API migrations...David Blaikie2015-04-031-19/+22
| | | | llvm-svn: 234058
* [InstCombine] Use DataLayout to determine vector element widthDavid Majnemer2015-04-031-3/+2
| | | | | | | | | InstCombine didn't realize that it needs to use DataLayout to determine how wide pointers are. This lead to assertion failures. This fixes PR23113. llvm-svn: 234046
* [opaque pointer type] Change GetElementPtrInst::getIndexedType to take the ↵David Blaikie2015-03-301-2/+4
| | | | | | | | | | pointee type This pushes the use of PointerType::getElementType up into several callers - I'll essentially just have to keep pushing that up the stack until I can eliminate every call to it... llvm-svn: 233604
* Transforms: Use the new DebugLoc API, NFCDuncan P. N. Exon Smith2015-03-301-1/+1
| | | | | | Update lib/Analysis and lib/Transforms to use the new `DebugLoc` API. llvm-svn: 233587
* Constrain the type of a parameter now that callers without this constraint ↵David Blaikie2015-03-272-5/+3
| | | | | | have been removed. llvm-svn: 233419
* Recommit r233116 better: Remove a redundant instcombine involving bitcasts ↵David Blaikie2015-03-271-36/+0
| | | | | | | | | | | | of geps of bitcasts This just didn't need to be here at all, but the assertion I tried to add wasn't appropriate either - the circumstance isn't impossible, it's just not important to deal with it here - the gep-rooted version of this instcombine will handle this case, we don't need to duplicate it for the case where the gep happens to be used in a bitcast. llvm-svn: 233404
* InstCombine: fold (A << C) == (B << C) --> ((A^B) & (~0U >> C)) == 0Benjamin Kramer2015-03-261-0/+15
| | | | | | | Anding and comparing with zero can be done in a single instruction on most archs so this is a bit cheaper. llvm-svn: 233291
* Opaque Pointer Types: GEP API migrations to specify the gep type explicitlyDavid Blaikie2015-03-241-3/+5
| | | | | | | | | | | | | | | | | | | | | The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make anything obviously better to have the caller access the pointers element type (the thing I'm trying to remove) than the GEP itself, but it's a helpful migration step. This will allow me to more obviously lock down GEP (& Load, etc) API usage, then fix all the code that accesses pointer element types except the places that need to be removed (most of the InstCombines) anyway - at which point I'll need to just remove all that code because it won't be meaningful anymore (there will be no pointer types, so no bitcasts to combine) SCEV looks like it'll need some restructuring - we'll have to do a bit more work for GEP canonicalization, since it'll depend on how it's used if we can even manage to canonicalize it to a non-ugly GEP. I guess we can do some fun stuff like voting (do 2 out of 3 load from the GEP with a certain type that gives a pretty GEP? Does every typed use of the GEP use either a specific type or a generic type (i8*, etc)?) llvm-svn: 233131
* optimize the AVX2 (integer) version of vperm2 into a shuffleSanjay Patel2015-03-241-1/+1
| | | | | | | | | | | | ...because this is what happens when an instruction set puts its underwear on after its pants. This is an extension of r232852, r233100, and 233110: http://llvm.org/viewvc/llvm-project?view=revision&revision=232852 http://llvm.org/viewvc/llvm-project?view=revision&revision=233100 http://llvm.org/viewvc/llvm-project?view=revision&revision=233110 llvm-svn: 233127
OpenPOWER on IntegriCloud