path: root/llvm/lib/Transforms/InstCombine
Commit message (Author, Date, Files changed, Lines -deleted/+added)
...
* [InstCombine] Fold ((C1-zext(X)) & C2) -> zext((C1-X) & C2) (David Majnemer, 2017-01-17, 1 file, -0/+15)
  This is valid if C2 fits within the bitwidth of X thanks to two's complement modulo arithmetic. llvm-svn: 292179
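  A schematic illustration of the fold; the i8/i32 types and the constants are invented for this sketch, not taken from the commit:
    %zx  = zext i8 %x to i32
    %sub = sub i32 40, %zx
    %and = and i32 %sub, 15        ; C2 = 15 fits within the 8-bit width of %x
  becomes
    %sub = sub i8 40, %x
    %and = and i8 %sub, 15
    %res = zext i8 %and to i32
  The low bits of a subtraction depend only on the low bits of its inputs, so masking with C2 makes the narrow computation equivalent.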
* SimplifyLibCalls: Replace fabs libcalls with intrinsics (Matt Arsenault, 2017-01-17, 2 files, -15/+30)
  Add the missing fabs(fpext) optimization that worked with the call form, and fix it creating a second fpext when there were multiple uses. llvm-svn: 292172
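  A sketch of the intended shape of the transform (names and types invented here):
    %e = fpext float %x to double
    %f = call double @fabs(double %e)
  becomes, with a single fpext even when %f has multiple uses:
    %a = call float @llvm.fabs.f32(float %x)
    %f = fpext float %a to double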
* [InstCombine] use m_APInt instead of faking it (Sanjay Patel, 2017-01-16, 1 file, -20/+14)
  llvm-svn: 292164
* [InstCombine] fix names in canEvaluateShiftedShift(); NFC (Sanjay Patel, 2017-01-16, 1 file, -27/+26)
  It's not clear what 'First' and 'Second' mean, so use 'Inner' and 'Outer' to match foldShiftedShift() and add comments with formulas, so it's easier to see what's going on. llvm-svn: 292153
* [InstCombine] use m_APInt to allow shift-shift folds for vectors with splat constants (Sanjay Patel, 2017-01-16, 1 file, -4/+5)
  Some existing 'FIXME' tests are still not folded because of splat holes in value tracking. llvm-svn: 292151
* [InstCombine] refactor shift-of-shift folds; NFCI (Sanjay Patel, 2017-01-16, 1 file, -83/+66)
  Reduces code duplication and makes it easier to extend these folds for vectors. llvm-svn: 292145
* [InstCombine][SSE] Add DemandedElts support for PSHUFB instructions (Simon Pilgrim, 2017-01-16, 2 files, -1/+21)
  Simplify a pshufb shuffle mask based on the elements of the mask that are actually demanded.
  Differential Revision: https://reviews.llvm.org/D28745 llvm-svn: 292101
* [InstCombine] fix formatting; NFC (Sanjay Patel, 2017-01-15, 1 file, -24/+22)
  llvm-svn: 292073
* [InstCombine] use m_APInt to allow ashr folds for vectors with splat constants (Sanjay Patel, 2017-01-15, 1 file, -3/+4)
  llvm-svn: 292064
* [PM] Introduce an analysis set used to preserve all analyses over a function's CFG when that CFG is unchanged (Chandler Carruth, 2017-01-15, 1 file, -2/+1)
  This allows transformation passes to simply claim they preserve the CFG and analysis passes to check for the CFG being preserved to remove the fanout of all analyses being listed in all passes. I've gone through and removed or cleaned up as many of the comments reminding us to do this as I could.
  Differential Revision: https://reviews.llvm.org/D28627 llvm-svn: 292054
* [PM] The assumption cache is fundamentally designed to be self-updating, mark it as never invalidated in the new PM (Chandler Carruth, 2017-01-15, 1 file, -1/+0)
  The old PM already required this to work, and after a discussion with Hal this seems to really be the only sensible answer. The cache gracefully degrades as the IR is mutated, and most things which do this should already be incrementally updating the cache. This gets rid of a bunch of logic preserving and testing the invalidation of this analysis. llvm-svn: 292039
* [PM] Fix instcombine's analysis preservation in the new pass manager to cover domtree and alias analysis (Chandler Carruth, 2017-01-14, 1 file, -0/+3)
  These are the pretty clear analyses that we would always want to survive this pass. To make these survive, we also need to preserve the assumption cache. Added a test that verifies the important bits of this preservation. llvm-svn: 292037
* [InstCombine] clean up visitAshr(); NFCI (Sanjay Patel, 2017-01-14, 1 file, -20/+9)
  llvm-svn: 292036
* [InstCombine] optimize unsigned icmp of increment (Sanjay Patel, 2017-01-13, 1 file, -0/+25)
  Allows LLVM to optimize sequences like the following:
    %add = add nuw i32 %x, 1
    %cmp = icmp ugt i32 %add, %y
  Into:
    %cmp = icmp uge i32 %x, %y
  Previously, only signed comparisons were being handled. Decrements could also be handled, but 'sub nuw %x, 1' is currently canonicalized to 'add %x, -1' in InstCombineAddSub, losing the nuw flag. Removing that canonicalization seems like it might have far-reaching ramifications so I kept this simple for now.
  Patch by Matti Niemenmaa!
  Differential Revision: https://reviews.llvm.org/D24700 llvm-svn: 291975
* [InstCombine] use m_APInt to allow lshr folds for vectors with splat constants (Sanjay Patel, 2017-01-13, 1 file, -17/+14)
  llvm-svn: 291972
* [InstCombine] use 'match' and other clean-up; NFCI (Sanjay Patel, 2017-01-13, 1 file, -17/+8)
  llvm-svn: 291937
* [InstCombine] use m_APInt to allow shl folds for vectors with splat constants (Sanjay Patel, 2017-01-13, 1 file, -3/+5)
  llvm-svn: 291934
* [InstCombine] use Op0/Op1 local variables more consistently with shifts; NFC (Sanjay Patel, 2017-01-13, 1 file, -22/+16)
  llvm-svn: 291923
* [InstCombine] if the condition of a select may be known via assumes, eliminate the select (Sanjay Patel, 2017-01-13, 1 file, -0/+14)
  This is a limited solution for PR31512: https://llvm.org/bugs/show_bug.cgi?id=31512
  The motivation is that we will need to increase usage of llvm.assume and/or metadata to solve PR28430: https://llvm.org/bugs/show_bug.cgi?id=28430
  ...and this kind of simplification is needed to take advantage of that extra information. The 'not' test case would be handled by: https://reviews.llvm.org/D28485
  Differential Revision: https://reviews.llvm.org/D28337 llvm-svn: 291915
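  A minimal sketch of the idea, with hypothetical values:
    call void @llvm.assume(i1 %cond)
    %sel = select i1 %cond, i32 %x, i32 %y
  Since %cond is known true at the select, %sel can be replaced by %x.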
* [DebugInfo] Add const to DILocation variable declaration; NFC (Robert Lougher, 2017-01-12, 1 file, -1/+1)
  llvm-svn: 291785
* Make processing @llvm.assume more efficient - Add affected values to the assumption cache (Hal Finkel, 2017-01-11, 1 file, -0/+3)
  Here's my second try at making @llvm.assume processing more efficient. My previous attempt, which leveraged operand bundles, r289755, didn't end up working: it did make assume processing more efficient but eliminating the assumption cache made ephemeral value computation too expensive.
  This is a more-targeted change. We'll keep the assumption cache, but extend it to keep a map of affected values (i.e. values about which an assumption might provide some information) to the corresponding assumption intrinsics. This allows ValueTracking and LVI to find assumptions relevant to the value being queried without scanning all assumptions in the function. The fact that ValueTracking started doing O(number of assumptions in the function) work, for every known-bits query, has become prohibitively expensive in some cases.
  As discussed during the review, this is a pragmatic fix that, longer term, will likely be replaced by a more-principled solution (perhaps based on an extended SSA form).
  Differential Revision: https://reviews.llvm.org/D28459 llvm-svn: 291671
* [InstCombine] add a wrapper for a common pair of transforms; NFCI (Sanjay Patel, 2017-01-10, 6 files, -75/+44)
  Some of the callers are artificially limiting this transform to integer types; this should make it easier to incrementally remove that restriction. llvm-svn: 291620
* InstCombine: Set operands instead of creating new call (Matt Arsenault, 2017-01-10, 1 file, -10/+6)
  llvm-svn: 291612
* InstCombine: fdiv -x, -y -> fdiv x, y (Matt Arsenault, 2017-01-10, 1 file, -0/+10)
  llvm-svn: 291611
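  Illustrative IR, using the fsub-from-negative-zero idiom for fneg that was current at the time (types invented here):
    %nx  = fsub float -0.000000e+00, %x
    %ny  = fsub float -0.000000e+00, %y
    %div = fdiv float %nx, %ny
  becomes
    %div = fdiv float %x, %y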
* fix comment typos; NFC (Sanjay Patel, 2017-01-09, 1 file, -5/+5)
  llvm-svn: 291447
* InstCombine: Fold cos(-x) -> cos(x) (Matt Arsenault, 2017-01-04, 1 file, -0/+14)
  Also cos(fabs(x)) -> cos(x). llvm-svn: 291022
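  A schematic example of the first fold (the float type is chosen arbitrarily):
    %nx = fsub float -0.000000e+00, %x
    %c  = call float @llvm.cos.f32(float %nx)
  becomes
    %c  = call float @llvm.cos.f32(float %x)
  Both folds rely on cosine being an even function.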
* [InstCombine] Move casts around shift operations (David Majnemer, 2017-01-04, 1 file, -0/+19)
  It is possible to perform a left shift before zero extending if the shift would only shift out zeros. llvm-svn: 290928
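  One possible instance, assuming the high bits of %x are known zero (the 'and' supplies that knowledge; all names and types here are invented):
    %m   = and i8 %x, 15
    %zx  = zext i8 %m to i32
    %shl = shl i32 %zx, 4
  becomes
    %m   = and i8 %x, 15
    %s   = shl i8 %m, 4
    %shl = zext i8 %s to i32
  The narrow shift discards only zero bits, so the wide and narrow forms compute the same value.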
* [InstCombine] Combine adds across a zext (David Majnemer, 2017-01-04, 1 file, -0/+12)
  We can perform the following:
    (add (zext (add nuw X, C1)), C2) -> (zext (add nuw X, C1+C2))
  This is only possible if C2 is negative and C2 is greater than or equal to negative C1. llvm-svn: 290927
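  For example, with C1 = 5 and C2 = -3, constants invented here to satisfy the stated conditions (-3 is negative and -3 >= -5):
    %a = add nuw i8 %x, 5
    %z = zext i8 %a to i32
    %r = add i32 %z, -3
  becomes
    %a = add nuw i8 %x, 2
    %r = zext i8 %a to i32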
* InstCombine: Fold fabs on select of constants (Matt Arsenault, 2017-01-03, 1 file, -0/+12)
  llvm-svn: 290913
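  A plausible shape for this fold (the constants are invented for the sketch):
    %s = select i1 %c, float -2.0, float 1.0
    %f = call float @llvm.fabs.f32(float %s)
  becomes
    %f = select i1 %c, float 2.0, float 1.0
  The fabs is applied to each constant arm at compile time, leaving just the select.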
* [InstCombine] use 'match' to reduce code bloat; NFCI (Sanjay Patel, 2017-01-03, 1 file, -15/+11)
  I wrote this patch before seeing the comment in: https://reviews.llvm.org/D27114
  ...that suggests we should actually be canonicalizing the other way. So just in case we decide this is the right way, we might as well have a cleaner implementation. llvm-svn: 290912
* InstCombine: Add fma with constant transforms (Matt Arsenault, 2017-01-03, 1 file, -3/+17)
  DAGCombine already does these. llvm-svn: 290860
* InstCombine: Add fma + fabs/fneg transforms (Matt Arsenault, 2017-01-03, 1 file, -0/+30)
  fma (fneg x), (fneg y), z -> fma x, y, z
  fma (fabs x), (fabs x), z -> fma x, x, z
  llvm-svn: 290859
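  The first transform in IR form, writing fneg as fsub from negative zero as was idiomatic at the time (a sketch, not the committed test):
    %nx = fsub float -0.000000e+00, %x
    %ny = fsub float -0.000000e+00, %y
    %r  = call float @llvm.fma.f32(float %nx, float %ny, float %z)
  becomes
    %r  = call float @llvm.fma.f32(float %x, float %y, float %z)
  The two negations cancel in the product, so the addend %z is unaffected.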
* [InstCombine] use combineMetadataForCSE instead of copying it; NFCI (Sanjay Patel, 2017-01-02, 1 file, -14/+4)
  llvm-svn: 290844
* [InstCombine][AVX-512] Teach InstCombine that llvm.x86.avx512.vcomi.sd and llvm.x86.avx512.vcomi.ss don't use the upper elements of their input (Craig Topper, 2016-12-31, 1 file, -0/+2)
  This was already done for the SSE/SSE2 version of the intrinsics. llvm-svn: 290776
* [InstCombine][AVX-512] When turning intrinsics with masking into native IR, don't emit a select if the mask is known to be all ones (Craig Topper, 2016-12-30, 1 file, -9/+20)
  This saves InstCombine the burden of having to optimize the select later. llvm-svn: 290774
* [InstCombine] Address post-commit feedback (David Majnemer, 2016-12-30, 2 files, -2/+4)
  llvm-svn: 290741
* [InstCombine] More thoroughly canonicalize the position of zexts (David Majnemer, 2016-12-30, 2 files, -9/+120)
  We correctly canonicalized (add (sext x), (sext y)) to (sext (add x, y)) where possible. However, we didn't perform the same canonicalization for zexts or for muls. llvm-svn: 290733
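  The zext add case, schematically; this rewrite assumes the narrow add is known not to overflow (for instance, from known bits), which is what "where possible" refers to:
    %zx = zext i8 %x to i32
    %zy = zext i8 %y to i32
    %s  = add i32 %zx, %zy
  becomes
    %n  = add nuw i8 %x, %y
    %s  = zext i8 %n to i32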
* [InstCombine] Use getVectorNumElements instead of explicitly casting to VectorType and calling getNumElements. NFC (Craig Topper, 2016-12-29, 1 file, -8/+7)
  llvm-svn: 290707
* [InstCombine] Fix typo in comment. NFC (Craig Topper, 2016-12-29, 1 file, -1/+1)
  llvm-svn: 290706
* [InstCombine] Use 32 bits instead of 64 bits for storing the number of elements in VectorType for a ShuffleVector. While there, use getVectorNumElements to avoid an explicit cast. NFC (Craig Topper, 2016-12-29, 1 file, -2/+2)
  llvm-svn: 290705
* [InstCombine][X86] If the lowest element of a scalar intrinsic isn't used, make sure we add it to the worklist so we can DCE it sooner (Craig Topper, 2016-12-29, 1 file, -6/+18)
  We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since it's now dead. This can allow DCE to find it sooner and remove it. Similar was done for InsertElement when the inserted element isn't demanded. llvm-svn: 290704
* [InstCombine] Remove a piece of a comment that said that InstCombiner contains pass infrastructure. That hasn't been true since r226618. NFC (Craig Topper, 2016-12-28, 1 file, -2/+1)
  llvm-svn: 290648
* [InstCombine] Canonicalize insert splat sequences into an insert + shuffle (Michael Kuperstein, 2016-12-28, 1 file, -0/+57)
  This adds a combine that canonicalizes a chain of inserts which broadcasts a value into a single insert + a splat shufflevector. This fixes PR31286.
  Differential Revision: https://reviews.llvm.org/D27992 llvm-svn: 290641
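  The canonical form, sketched for a 4-element splat (types invented here):
    %v0 = insertelement <4 x float> undef, float %x, i32 0
    %v1 = insertelement <4 x float> %v0, float %x, i32 1
    %v2 = insertelement <4 x float> %v1, float %x, i32 2
    %v3 = insertelement <4 x float> %v2, float %x, i32 3
  becomes a single insert plus a splat shuffle:
    %v  = insertelement <4 x float> undef, float %x, i32 0
    %v3 = shufflevector <4 x float> %v, <4 x float> undef, <4 x i32> zeroinitializer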
* [InstCombine][X86] Add DemandedElts support for 512-bit PMULDQ/PMULUDQ instructions (Craig Topper, 2016-12-27, 2 files, -2/+6)
  PMULDQ/PMULUDQ vXi64 instructions only use the even-numbered v2Xi32 input elements, which SimplifyDemandedVectorElts should try to use. This builds on r290554 which added support for 128 and 256-bit. llvm-svn: 290582
* [AVX-512][InstCombine] Teach InstCombine to turn masked scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION (Craig Topper, 2016-12-27, 1 file, -37/+42)
  An earlier commit added support for unmasked scalar operations. At that time isel wouldn't generate an optimal sequence for masked operations, but that has now been fixed. llvm-svn: 290566
* [AVX-512][InstCombine] Teach InstCombine to turn packed add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION (Craig Topper, 2016-12-27, 1 file, -0/+44)
  llvm-svn: 290559
* [InstCombine][X86] Add DemandedElts support for PMULDQ/PMULUDQ instructions (Simon Pilgrim, 2016-12-26, 2 files, -0/+42)
  PMULDQ/PMULUDQ vXi64 instructions only use the even-numbered v2Xi32 input elements, which SimplifyDemandedVectorElts should try to use.
  Differential Revision: https://reviews.llvm.org/D28119 llvm-svn: 290554
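  As an illustration, assuming the 128-bit intrinsic signature of that era: only elements 0 and 2 of each <4 x i32> input feed the <2 x i64> result, so a value inserted into an odd lane is never demanded:
    %a1 = insertelement <4 x i32> %a, i32 %t, i32 1
    %r  = call <2 x i64> @llvm.x86.sse41.pmuldq(<4 x i32> %a1, <4 x i32> %b)
  simplifies to
    %r  = call <2 x i64> @llvm.x86.sse41.pmuldq(<4 x i32> %a, <4 x i32> %b)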
* [AVX-512][InstCombine] Teach InstCombine to turn scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION (Craig Topper, 2016-12-26, 1 file, -3/+49)
  Summary: I only do this for unmasked cases for now because isel is failing to fold the mask. I'll try to fix that soon. I'll do the same thing for packed add/sub/mul/div in a future patch.
  Reviewers: delena, RKSimon, zvi, craig.topper
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D27879 llvm-svn: 290535
* [AVX-512][InstCombine] Teach InstCombine to convert masked vpermv intrinsics into shufflevector instructions (Craig Topper, 2016-12-25, 1 file, -4/+50)
  Summary: This patch adds support for converting the masked vpermv intrinsics into shufflevector instructions if the indices are constants. We also need to wrap a select instruction around the shuffle to take care of the masking part. InstCombine will take care of optimizing the select if the mask is constant so I didn't bother checking for that.
  Reviewers: zvi, delena, spatel, RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D27825 llvm-svn: 290530
* Revert "[InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp" (David Majnemer, 2016-12-21, 2 files, -98/+2)
  This reverts commit r289813; it caused PR31449. llvm-svn: 290266