summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] revert r300977 and r301021Sanjay Patel2017-04-211-14/+4
| | | | | | This can cause an inf-loop. Investigating... llvm-svn: 301035
* [InstCombine] use isSubsetOf() for efficiencySanjay Patel2017-04-211-1/+1
| | | | | | | | | | C | ~D == -1 ~(C | ~D) == 0 ~C & D == 0 D & ~C == 0 D.isSubsetOf(C) llvm-svn: 301021
* [InstCombine] prefer xor with -1 because 'not' is easier to understand (PR32706)Sanjay Patel2017-04-211-4/+14
| | | | | | | | | This matches the demanded bits behavior in the DAG and should fix: https://bugs.llvm.org/show_bug.cgi?id=32706 Differential Revision: https://reviews.llvm.org/D32255 llvm-svn: 300977
* [InstCombine] Remove the zextOrTrunc from ShrinkDemandedConstant.Craig Topper2017-04-201-4/+2
| | | | | | | | The demanded mask and the constant should always be the same width for all callers today. Also stop copying the demanded mask as its passed in. We should avoid allocating memory unless we are going to do something. The final AND to create the new constant will take care of it. llvm-svn: 300927
* [InstCombine] function names start with lower-case letter; NFCSanjay Patel2017-04-201-2/+2
| | | | | | Forgot to make this fix with the signature change in r300911. llvm-svn: 300912
* [InstCombine] allow shl+shr demanded bits folds with splat constantsSanjay Patel2017-04-201-19/+13
| | | | llvm-svn: 300911
* [InstCombine] allow shl demanded bits folds with splat constantsSanjay Patel2017-04-201-2/+4
| | | | | | More fixes are needed to enable the helper SimplifyShrShlDemandedBits(). llvm-svn: 300898
* [InstCombine] Use APInt::intersects and APInt::isSubsetOf to improve a few ↵Craig Topper2017-04-201-4/+4
| | | | | | more places in SimplifyDemandedBits. llvm-svn: 300896
* [InstCombine] allow ashr/lshr demanded bits folds with splat constantsSanjay Patel2017-04-201-11/+14
| | | | llvm-svn: 300888
* [InstCombine] Use APInt::isSubsetOf to simplify some code in ↵Craig Topper2017-04-201-37/+27
| | | | | | | | SimplifyDemandedBits. NFC This allows us to use less temporary APInt for And and Invert operations. llvm-svn: 300885
* [InstCombine] Remove redundant code from SimplifyDemandedBits handling for ↵Craig Topper2017-04-201-18/+0
| | | | | | Or. The code above it is equivalent if you work through the bitwise math. llvm-svn: 300876
* [APInt] Rename getSignBit to getSignMaskCraig Topper2017-04-201-5/+5
| | | | | | | | getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask. Differential Revision: https://reviews.llvm.org/D32108 llvm-svn: 300856
* [APInt] Add isSubsetOf method that can check if one APInt is a subset of ↵Craig Topper2017-04-201-1/+1
| | | | | | | | | | | | | | another without creating temporary APInts This question comes up in many places in SimplifyDemandedBits. This makes it easy to ask without allocating additional temporary APInts. The BitVector class provides a similar functionality through its (IMHO badly named) test(const BitVector&) method. Though its output polarity is reversed. I've provided one example use case in this patch. I plan to do more as a follow up. Differential Revision: https://reviews.llvm.org/D32258 llvm-svn: 300851
* In SimplifyDemandedUseBits, use computeKnownBits directly to handle ConstantsCraig Topper2017-04-201-15/+4
| | | | | | | | | | | | Currently we don't explicitly process ConstantDataSequential, ConstantAggregateZero, or ConstantVector, or Undef before applying the Depth limit. Instead they occur after the depth check in the non-instruction path. For the constant types that we do handle, the code is replicated from computeKnownBits. This patch fixes the missing constant handling and the reduces the amount of code by just using computeKnownBits directly for any type of Constant. Differential Revision: https://reviews.llvm.org/D32123 llvm-svn: 300849
* [APInt] Use lshrInPlace to replace lshr where possibleCraig Topper2017-04-181-5/+5
| | | | | | | | | | This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result. This adds an lshrInPlace(const APInt &) version as well. Differential Revision: https://reviews.llvm.org/D32155 llvm-svn: 300566
* Introduce APInt::isSignBitSet/isSignBitClear. Use in place isSignBitSet in ↵Craig Topper2017-04-171-4/+4
| | | | | | | | place of isNegative in known bits tracking. This makes statements like KnownZero.isNegative() (which means the value we're tracking is positive) less confusing. llvm-svn: 300457
* AMDGPU: SimplifyDemandedElts for image intrinsicsMatt Arsenault2017-04-171-3/+80
| | | | | | | | Causes some VGPR usage improvements in shaderdb, but introduces some SGPR spilling regressions due to random scheduling changes later. llvm-svn: 300453
* [InstCombine][ValueTracking] When computing known bits for Srem make sure we ↵Craig Topper2017-04-161-2/+2
| | | | | | | | don't compute known bits for the LHS twice. If we already called computeKnownBits for the RHS being a constant power of 2, we've already computed everything we can and should just stop. I think previously we would still recurse if we had determined the result was negative or had not determined the sign bit at all. llvm-svn: 300432
* [InstCombine] In SimplifyDemandedUseBits, don't bother to mask known bits of ↵Craig Topper2017-04-161-3/+3
| | | | | | | | constants with DemandedMask. Just because we didn't demand them doesn't mean they aren't known. llvm-svn: 300430
* [InstCombine] MakeAnd/Or/Xor handling to reuse previous APInt computationsCraig Topper2017-04-141-36/+46
| | | | | | | | | | | | When checking if we should return a constant, we create some temporary APInts to see if we know all bits. But the exact computations we do are needed in several other locations in the same code. This patch moves them to named temporaries so we can reuse them. Ideally we'd write directly to KnownZero/One, but we currently seem to only write those variables after all the simplifications checks and I didn't want to change that with this patch. Differential Revision: https://reviews.llvm.org/D32094 llvm-svn: 300376
* [InstCombine] Use APInt::setSignBit and APInt::isNegative(). NFCCraig Topper2017-04-141-3/+3
| | | | llvm-svn: 300305
* [InstCombine] Teach SimplifyMultipleUseDemandedBits to handle And/Or/Xor ↵Craig Topper2017-04-121-11/+46
| | | | | | | | known bits using the LHS/RHS known bits it already acquired without recursing back into computeKnownBits. This replicates the known bits and constant creation code from the single use case for these instructions and adds it here. The computeKnownBits and constant creation code for other instructions is now in the default case of the opcode switch. llvm-svn: 300094
* [InstCombine] Remove unreachable code for turning an And where all demanded ↵Craig Topper2017-04-121-4/+0
| | | | | | | | bits on both sides are known to be zero into a constant 0. We already handled a superset check that included the known ones too and folded to a constant that may include ones. But it can also handle the case of no ones. llvm-svn: 300093
* [InstCombine] In SimplifyMultipleUseDemandedBits, use a switch instead of ↵Craig Topper2017-04-121-3/+11
| | | | | | cascaded ifs on opcode. NFC llvm-svn: 300085
* [InstCombine] Teach SimplifyDemandedInstructionBits that even if we reach an ↵Craig Topper2017-04-121-0/+6
| | | | | | | | | | | | | | instruction that has multiple uses, if we know all the bits for the demanded bits for this context we can go ahead and create a constant. Currently if we reach an instruction with multiples uses we know we can't do any optimizations to that instruction itself since we only have the demanded bits for one of the users. But if we know all of the bits are zero/one for that one user we can still go ahead and create a constant to give to that user. This might then reduce the instruction to having a single use and allow additional optimizations on the other path. This picks up an additional case that r300075 didn't catch. Differential Revision: https://reviews.llvm.org/D31552 llvm-svn: 300084
* [InstCombine] Move portion of SimplifyDemandedUseBits that deals with ↵Craig Topper2017-04-121-76/+96
| | | | | | instructions with multiple uses out to a separate method. NFCI llvm-svn: 300082
* Teach SimplifyDemandedUseBits that adding or subtractings 0s from every bit ↵Craig Topper2017-04-121-1/+10
| | | | | | | | | | | | | | below the highest demanded bit can be simplified If we are adding/subtractings 0s below the highest demanded bit we can just use the other operand and remove the operation. My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove add and a subsequent iteration to be ran. With this change we get bypass the add in the first iteration and prevent the second iteration from changing anything. Differential Revision: https://reviews.llvm.org/D31120 llvm-svn: 300075
* [InstCombine] Use setAllBits in place of getAllOnesValue since we know the ↵Craig Topper2017-04-041-1/+1
| | | | | | bitwidths are the same. NFCI llvm-svn: 299413
* [APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt ↵Craig Topper2017-04-031-1/+1
| | | | | | | | | | class. Implement them without memory allocation for multiword This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temorary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation. Differential Revision: https://reviews.llvm.org/D31565 llvm-svn: 299362
* [APInt] Remove shift functions from APIntOps namespace. Replace the few ↵Craig Topper2017-03-311-5/+5
| | | | | | users with the APInt class methods. NFCI llvm-svn: 299248
* [InstCombine] Change the interface of SimplifyDemandedBits so that it takes ↵Craig Topper2017-03-251-43/+42
| | | | | | | | the instruction and operand instead of the Use. The first thing it did was get the User for the Use to get the instruction back. This requires looking through the Uses for the User using the waymarking walk. That's pretty fast, but its probably still better to just pass the Instruction we already had. llvm-svn: 298772
* Revert r298711 "[InstCombine] Provide a way to calculate KnownZero/One for ↵Craig Topper2017-03-241-5/+4
| | | | | | | | Add/Sub in SimplifyDemandedUseBits without recursing into ComputeKnownBits" Tsan bot is failing. llvm-svn: 298745
* [InstCombine] Provide a way to calculate KnownZero/One for Add/Sub in ↵Craig Topper2017-03-241-4/+5
| | | | | | | | SimplifyDemandedUseBits without recursing into ComputeKnownBits SimplifyDemandedUseBits for Add/Sub already recursed down LHS and RHS for simplifying bits. If that didn't provide any simplifications we fall back to calling computeKnownBits which will recurse again. Instead just take the known bits for LHS and RHS we already have and call into a new function in ValueTracking that can calculate the known bits given the LHS/RHS bits. llvm-svn: 298711
* [InstCombine] Teach SimplifyDemandedUseBits to shrink Constants on the left ↵Craig Topper2017-03-221-1/+2
| | | | | | | | | | | | | | | | side of subtracts Summary: Subtracts can have constants on the left side, but we don't shrink them based on demanded bits. This patch fixes that to match the right hand side. Reviewers: davide, majnemer, spatel, sanjoy, hfinkel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31119 llvm-svn: 298478
* [InstCombine] Remove duplicate code in SimplifyDemandedUseBits for URem. NFCCraig Topper2017-03-191-2/+0
| | | | llvm-svn: 298231
* [InstCombine] Use setHighBits/setLowBits/setBitsFrom in place of ↵Craig Topper2017-03-191-17/+14
| | | | | | getLowBitsSet/getHighBitsSet. llvm-svn: 298204
* AMDGPU: Fix insertion point when reducing load intrinsicsMatt Arsenault2017-03-101-0/+3
| | | | | | | The insertion point may be later than the next instruction, so it is necessary to set it when replacing the call. llvm-svn: 297439
* AMDGPU: Support for SimplifyDemandedVectorElts for load intrinsicsMatt Arsenault2017-03-091-0/+41
| | | | llvm-svn: 297408
* Use APInt::getLowBitsSet instead of APInt::getBitsSet for lower bit mask ↵Simon Pilgrim2017-03-031-1/+1
| | | | | | creation llvm-svn: 296882
* [AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus ↵Craig Topper2017-02-161-2/+5
| | | | | | intrinsics like it does 128/256-bit. llvm-svn: 295294
* [InstCombine] use m_APInt to allow demanded bits analysis on splat constantsSanjay Patel2017-02-091-10/+13
| | | | llvm-svn: 294628
* [InstCombine][X86] MULDQ/MULUDQ undef -> zeroSimon Pilgrim2017-01-241-6/+0
| | | | | | | | Added early out for single undef input - we were already supporting (and testing) this in the constant folding code, we just do it quicker now Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst llvm-svn: 292913
* [InstCombine][X86] Add MULDQ/MULUDQ undef handlingSimon Pilgrim2017-01-201-0/+6
| | | | llvm-svn: 292627
* [InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructionsSimon Pilgrim2017-01-201-0/+54
| | | | | | | | Simplify a packss/packus truncation based on the elements of the mask that are actually demanded. Differential Revision: https://reviews.llvm.org/D28777 llvm-svn: 292591
* [InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shufflesSimon Pilgrim2017-01-181-1/+4
| | | | | | Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded. llvm-svn: 292371
* [InstCombine][X86][AVX] Add DemandedElts support for VPERMILPD/VPERMILPS ↵Simon Pilgrim2017-01-171-1/+9
| | | | | | | | instructions Simplify a vpermilvar shuffle mask based on the elements of the mask that are actually demanded. llvm-svn: 292209
* [InstCombine][SSE] Add DemandedElts support for PSHUFB instructionsSimon Pilgrim2017-01-161-0/+10
| | | | | | | | Simplify a pshufb shuffle mask based on the elements of the mask that are actually demanded. Differential Revision: https://reviews.llvm.org/D28745 llvm-svn: 292101
* [InstCombine] Fix typo in comment. NFCCraig Topper2016-12-291-1/+1
| | | | llvm-svn: 290706
* [InstCombine] Use a 32-bits instead of 64-bits for storing the number of ↵Craig Topper2016-12-291-2/+2
| | | | | | elements in VectorType for a ShuffleVector. While there getVectorNumElements to avoid an explicit cast. NFC llvm-svn: 290705
* [InstCombine][X86] If the lowest element of a scalar intrinsic isn't used ↵Craig Topper2016-12-291-6/+18
| | | | | | | | make sure we add it to the worklist so we can DCE it sooner. We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since its now dead. This can allow DCE to find it sooner and remove it. Similar was done for InsertElement when the inserted element isn't demanded. llvm-svn: 290704
OpenPOWER on IntegriCloud