path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* [InstCombine] Add support for vector srem->urem. | Craig Topper | 2017-04-17 | 1 | -7/+5
    llvm-svn: 300437
* [InstCombine] Add support for turning vector sdiv into udiv. | Craig Topper | 2017-04-17 | 1 | -18/+16
    llvm-svn: 300435
* [LCSSA] Simplify a loop. NFCI. | Davide Italiano | 2017-04-17 | 1 | -7/+3
    llvm-svn: 300433
* [InstCombine][ValueTracking] When computing known bits for Srem make sure we don't compute known bits for the LHS twice. | Craig Topper | 2017-04-16 | 1 | -2/+2
    If we already called computeKnownBits for the RHS being a constant power of 2, we've already computed everything we can and should just stop. I think previously we would still recurse if we had determined the result was negative or had not determined the sign bit at all.
    llvm-svn: 300432
* [LCSSA] Fix non-determinism due to iterating over a SmallPtrSet. | Davide Italiano | 2017-04-16 | 1 | -3/+3
    Use a SmallSetVector instead.
    llvm-svn: 300431
* [InstCombine] In SimplifyDemandedUseBits, don't bother to mask known bits of constants with DemandedMask. | Craig Topper | 2017-04-16 | 1 | -3/+3
    Just because we didn't demand them doesn't mean they aren't known.
    llvm-svn: 300430
* [X86][X86 intrinsics] Folding cmp(sub(a,b),0) into cmp(a,b) optimization | Michael Zuckerman | 2017-04-16 | 1 | -0/+31
    This patch adds a new optimization (folding cmp(sub(a,b),0) into cmp(a,b)) to the instCombineCall pass and was written specifically for X86 CMP intrinsics.
    Differential Revision: https://reviews.llvm.org/D31398
    llvm-svn: 300422
* [InstCombine] allow (X != C1 && X != C2) and similar patterns to match splat vector constants | Sanjay Patel | 2017-04-15 | 1 | -19/+19
    llvm-svn: 300402
* [ProfileData] Unify getInstrProf*SectionName helpers | Vedant Kumar | 2017-04-15 | 2 | -31/+13
    This is a version of D32090 that unifies all of the `getInstrProf*SectionName` helper functions. (Note: the build failures which D32090 would have addressed were fixed with r300352.)
    We should unify these helper functions because they are hard to use in their current form. E.g., we recently introduced more helpers to fix section naming for COFF files. This scheme doesn't totally succeed at hiding low-level details about section naming, so we should switch to an API that is easier to maintain.
    This is not an NFC commit because it fixes llvm-cov's testing support for COFF files (this falls out of the API change naturally). This is an area where we lack tests -- I will see about adding one as a follow-up.
    Testing: check-clang, check-profile, check-llvm.
    Differential Revision: https://reviews.llvm.org/D32097
    llvm-svn: 300381
* [InstCombine] Make And/Or/Xor handling reuse previous APInt computations | Craig Topper | 2017-04-14 | 1 | -36/+46
    When checking if we should return a constant, we create some temporary APInts to see if we know all bits. But the exact computations we do are needed in several other locations in the same code. This patch moves them to named temporaries so we can reuse them.
    Ideally we'd write directly to KnownZero/One, but we currently seem to only write those variables after all the simplification checks and I didn't want to change that with this patch.
    Differential Revision: https://reviews.llvm.org/D32094
    llvm-svn: 300376
* [IR] Make paramHasAttr use arg indices instead of attr indices | Reid Kleckner | 2017-04-14 | 4 | -6/+6
    This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern. Previously we were testing return value attributes with index 0, so I introduced hasReturnAttr() for that use case.
    llvm-svn: 300367
* [InstCombine] (X != C1 && X != C2) --> (X | (C1 ^ C2)) != C2 | Sanjay Patel | 2017-04-14 | 1 | -36/+65
    ...when C1 differs from C2 by one bit and C1 <u C2:
    http://rise4fun.com/Alive/Vuo
    And move related folds to a helper function. This reduces code duplication and will make it easier to remove the scalar-only restriction as a follow-up step.
    llvm-svn: 300364
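The fold in r300364 can be checked exhaustively at a small bit width. The following is a minimal Python sketch, not LLVM code: the helper names and the choice of 6-bit values are assumptions made for illustration, and unsigned Python integers stand in for fixed-width IR values.

```python
def original_form(x, c1, c2):
    """The two-compare pattern: X != C1 && X != C2."""
    return x != c1 and x != c2


def folded_form(x, c1, c2):
    """The single-compare pattern: (X | (C1 ^ C2)) != C2."""
    return (x | (c1 ^ c2)) != c2


def differs_by_one_bit(c1, c2):
    """True when C1 ^ C2 has exactly one bit set."""
    diff = c1 ^ c2
    return diff != 0 and (diff & (diff - 1)) == 0


def fold_holds_for_width(width):
    """Brute-force every (C1, C2, X) with C1 <u C2 and a one-bit difference."""
    limit = 1 << width
    for c1 in range(limit):
        for c2 in range(c1 + 1, limit):  # enforces C1 <u C2
            if not differs_by_one_bit(c1, c2):
                continue
            for x in range(limit):
                if original_form(x, c1, c2) != folded_form(x, c1, c2):
                    return False
    return True
```

The reason it works: OR-ing the differing bit into X maps both C1 and C2 onto C2, so one comparison against C2 covers both constants.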
* [InstCombine] Support folding a subtract with a constant LHS into a phi node | Craig Topper | 2017-04-14 | 8 | -28/+44
    We currently only support folding a subtract into a select but not a PHI. This fixes that.
    I had to fix an assumption in FoldOpIntoPhi that assumed the PHI node was always in operand 0. Now we pass it in like we do for FoldOpIntoSelect. But we still require some dancing to find the Constant when we create the BinOp or ConstantExpr. This code is similar to what we do for selects.
    Since I touched all call sites, this also renames FoldOpIntoPhi to foldOpIntoPhi to match coding standards.
    Differential Revision: https://reviews.llvm.org/D31686
    llvm-svn: 300363
* [InstCombine] Refactor SimplifyUsingDistributiveLaws to more explicitly skip code when LHS/RHS aren't BinaryOperators | Craig Topper | 2017-04-14 | 1 | -30/+33
    Currently this code always makes 2 or 3 calls to tryFactorization regardless of whether the LHS/RHS are BinaryOperators. We make 3 calls when both operands are BinaryOperators with the same opcode. Or, surprisingly, when neither is a BinaryOperator. This is because getBinOpsForFactorization returns Instruction::BinaryOpsEnd when the operand is not a BinaryOperator. If both LHS and RHS are not BinaryOperators then they both have an Opcode of Instruction::BinaryOpsEnd. When this happens we rely on tryFactorization to early out due to A/B/C/D being null. Similar behavior occurs for the other calls: we rely on getBinOpsForFactorization having made A/B or C/D null to get tryFactorization to early out.
    We also rely on these null checks to check the result of getIdentityValue and early out for it.
    This patch refactors this to pull these checks up to SimplifyUsingDistributiveLaws so we don't rely on BinaryOpsEnd as a sentinel or this A/B/C/D null behavior. I think this makes this code easier to reason about. It should also give a tiny performance improvement for cases where the LHS or RHS isn't a BinaryOperator.
    Differential Revision: https://reviews.llvm.org/D31913
    llvm-svn: 300353
* [FunctionImport] assert(false) -> llvm_unreachable(). NFCI. | Davide Italiano | 2017-04-14 | 1 | -1/+1
    llvm-svn: 300344
* Tighten the API for ScalarEvolutionNormalization | Sanjoy Das | 2017-04-14 | 1 | -4/+3
    llvm-svn: 300331
* Remove NormalizeAutodetect; NFC | Sanjoy Das | 2017-04-14 | 1 | -9/+3
    It is cleaner to have a callback based system where the logic of whether an add recurrence is normalized or not lives on IVUsers. This is one step in a multi-step cleanup.
    llvm-svn: 300330
* [LV] Remove implicit single basic block assumption | Gil Rapaport | 2017-04-14 | 1 | -6/+5
    This patch is part of D28975's breakdown - no change in output intended.
    LV's code currently assumes the vectorized loop is a single basic block up until predicateInstructions() is called. This patch removes two manifestations of this assumption (loop phi incoming values, dominator tree update) by replacing the use of vectorLoopBody with the vectorized loop's latch/header.
    Differential Revision: https://reviews.llvm.org/D32040
    llvm-svn: 300310
* [InstCombine] Use APInt::setSignBit and APInt::isNegative(). NFC | Craig Topper | 2017-04-14 | 1 | -3/+3
    llvm-svn: 300305
* Fix test failure on windows: pass module to getInstrProfXXName calls | Xinliang David Li | 2017-04-14 | 1 | -4/+4
    llvm-svn: 300302
* NewGVN: Don't propagate over phi backedges where undef causes us to | Daniel Berlin | 2017-04-14 | 1 | -8/+149
    have >1 value, unless we can prove the phi node is cycle free.
    Fixes PR 32607.
    llvm-svn: 300299
* [Profile] PE binary coverage bug fix | Xinliang David Li | 2017-04-13 | 2 | -12/+11
    PR/32584
    Differential Revision: https://reviews.llvm.org/D32023
    llvm-svn: 300277
* [IR] Make getParamAttributes take argument numbers, not ArgNo+1 | Reid Kleckner | 2017-04-13 | 5 | -35/+34
    Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1, Kind) everywhere.
    The fact that the AttributeList index for an argument is ArgNo+1 should be a hidden implementation detail.
    NFC
    llvm-svn: 300272
* [InstCombine] Use APInt::getBitsSetFrom instead of inverting the result of getLowBitsSet. NFC | Craig Topper | 2017-04-13 | 1 | -4/+2
    llvm-svn: 300265
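Why r300265 is behavior-preserving can be shown with a small Python model of the two mask constructions. This is a sketch under stated assumptions: the functions below are hypothetical stand-ins that mimic the semantics of the APInt helpers on plain integers; they are not the LLVM API.

```python
def low_bits_set(num_bits, lo_bits_set):
    """Model of a mask with the bottom lo_bits_set bits set (hypothetical helper)."""
    assert 0 <= lo_bits_set <= num_bits
    return (1 << lo_bits_set) - 1


def bits_set_from(num_bits, lo_bit):
    """Model of a mask with every bit from lo_bit up to the top bit set (hypothetical helper)."""
    assert 0 <= lo_bit <= num_bits
    all_ones = (1 << num_bits) - 1
    return all_ones & ~((1 << lo_bit) - 1)


def invert(num_bits, value):
    """Bitwise NOT truncated to num_bits."""
    return ~value & ((1 << num_bits) - 1)


def constructions_agree(num_bits):
    """Check that inverting the low mask equals building the high mask directly."""
    return all(
        invert(num_bits, low_bits_set(num_bits, n)) == bits_set_from(num_bits, n)
        for n in range(num_bits + 1)
    )
```

Building the high mask directly skips the separate inversion step, which matches the commit's "NFC" label.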
* [LCSSA] Efficiently compute blocks dominating at least one exit. | Davide Italiano | 2017-04-13 | 1 | -19/+54
    For LCSSA purposes, loop BBs not dominating any of the exits aren't interesting, as none of the values defined in these blocks can be used outside the loop.
    The way the code computed this information was by comparing each BB of the loop with each of the exit blocks and asking the dominator tree about their dominance relation. This is slow.
    A more efficient way, implemented here, is to start from the exit blocks and walk the dominator tree upwards until we hit a header. By transitivity, all the blocks we encounter in our path dominate an exit.
    For the testcase provided in PR31851, this reduces compile time on `opt -O2` by ~25%, going from 1m47s to 1m22s.
    Thanks to Dan/MichaelZ for discussions/suggesting the approach/review.
    Differential Revision: https://reviews.llvm.org/D31843
    llvm-svn: 300255
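The upward walk described in r300255 can be sketched in a few lines of Python. This is an illustrative model only, not the LLVM implementation: the dominator tree is assumed to be a plain dict from each block name to its immediate dominator, and the function name, parameters, and example CFG are all hypothetical.

```python
def blocks_dominating_exits(idom, header, loop_blocks, exit_blocks):
    """Collect the loop blocks that dominate at least one exit block.

    idom maps each block to its immediate dominator (assumed input format).
    For every exit, walk up the dominator tree until the loop header is
    reached, the walk leaves the loop, or a block already collected is seen.
    """
    dominating = set()
    for exit_block in exit_blocks:
        block = idom.get(exit_block)
        while block is not None and block in loop_blocks:
            if block in dominating:
                break  # everything above this block was added by an earlier walk
            dominating.add(block)
            if block == header:
                break
            block = idom.get(block)
    return dominating


# Hypothetical diamond-shaped loop:
# entry -> header -> {body_a, body_b} -> latch -> {header, exit}
example_idom = {
    "header": "entry",
    "body_a": "header",
    "body_b": "header",
    "latch": "header",
    "exit": "latch",
}
example_loop = {"header", "body_a", "body_b", "latch"}
example_result = blocks_dominating_exits(
    example_idom, "header", example_loop, ["exit"]
)
```

In the example, `body_a` and `body_b` are never visited because neither lies on the dominator-tree path from the exit to the header. Each walk costs at most the depth of the tree, rather than one dominance query per (block, exit) pair.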
* Revert accidentally-committed files in r300252. | Richard Smith | 2017-04-13 | 1 | -403/+0
    llvm-svn: 300253
* Remove all allocation and divisions from GreatestCommonDivisor | Richard Smith | 2017-04-13 | 1 | -0/+403
    Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations.
    Remove all memory allocation from within the GCD loop by tweaking our `lshr` implementation so it can operate in-place.
    Differential Revision: https://reviews.llvm.org/D31968
    llvm-svn: 300252
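For readers unfamiliar with Stein's (binary) GCD algorithm mentioned in r300252, here is a minimal Python sketch on non-negative integers. It illustrates the general technique of replacing division with shifts and subtraction; it is not the APInt implementation, and the function name is an assumption.

```python
def binary_gcd(a, b):
    """Stein's algorithm: GCD using only shifts, comparisons, and subtraction."""
    if a == 0:
        return b
    if b == 0:
        return a

    # Factor out the powers of two shared by both operands.
    shift = 0
    while ((a | b) & 1) == 0:
        a >>= 1
        b >>= 1
        shift += 1

    # Make a odd; any remaining factors of two in a are not shared with b.
    while (a & 1) == 0:
        a >>= 1

    while b != 0:
        # Strip factors of two from b; they cannot be part of the GCD now.
        while (b & 1) == 0:
            b >>= 1
        # Keep a <= b so the subtraction stays non-negative.
        if a > b:
            a, b = b, a
        b -= a  # odd minus odd is even (or zero)

    return a << shift
```

On arbitrary-precision values a shift is much cheaper than a division, which is the motivation the commit message gives.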
* [InstCombine] Fix !prof metadata preservation for invokes | Reid Kleckner | 2017-04-13 | 1 | -18/+16
    Summary: Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes.
    Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229).
    Reviewers: danielcdh, davidxl
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D32041
    llvm-svn: 300251
* [LCSSA] Assert that we always have a valid loop. | Davide Italiano | 2017-04-13 | 1 | -0/+1
    We could otherwise add BBs not belonging to a loop in `formLCSSA` and later crash when trying to iterate the loop blocks.
    llvm-svn: 300244
* [LCSSA] Remove spurious whitespaces. NFCI. | Davide Italiano | 2017-04-13 | 1 | -1/+1
    llvm-svn: 300243
* [LCSSA] Use `auto` when the type is obvious. NFCI. | Davide Italiano | 2017-04-13 | 1 | -3/+3
    llvm-svn: 300242
* SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name | Dehao Chen | 2017-04-13 | 1 | -39/+88
    Summary: For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and their inlined instances. This patch adds callee_name to the key of the callsite sample map, and adds helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote the correct targets and inline them before annotation, and ensures that all indirect call targets are annotated correctly.
    Reviewers: davidxl, dnovillo
    Reviewed By: davidxl
    Subscribers: andreadb, llvm-commits
    Differential Revision: https://reviews.llvm.org/D31950
    llvm-svn: 300240
* [LV] Fix the vector code generation for first order recurrence | Anna Thomas | 2017-04-13 | 2 | -24/+25
    Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop. Also fix the code gen when we just unroll, but don't vectorize.
    Fixes PR32396.
    Reviewers: mssimpso, mkuper, anemet
    Subscribers: llvm-commits, mzolotukhin
    Differential Revision: https://reviews.llvm.org/D31979
    llvm-svn: 300238
* [InstCombine] fold X == 0 || X == -1 to one compare (PR32524) | Sanjay Patel | 2017-04-13 | 1 | -1/+5
    This is effectively a retry of:
    https://reviews.llvm.org/rL299851
    but now we have tests and an assert to make sure the bug that was exposed with that attempt will not happen again.
    I'll fix the code duplication and missing sibling fold next, but I want to make this change as small as possible to reduce risk since I messed it up last time.
    This should fix:
    https://bugs.llvm.org/show_bug.cgi?id=32524
    llvm-svn: 300236
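The commit subject does not spell out the single compare it produces. One well-known form, assumed here for illustration, is an unsigned range check `(X + 1) <u 2`, which is consistent with the 'ult' and "make a '2'" remarks in r300222. The Python sketch below checks that identity on fixed-width values; the function names and widths are hypothetical.

```python
def two_compares(x, width):
    """X == 0 || X == -1, with -1 modeled as the all-ones value of the given width."""
    all_ones = (1 << width) - 1
    return x == 0 or x == all_ones


def one_compare(x, width):
    """Assumed folded form: (X + 1) <u 2, with the add wrapping at the bit width."""
    mask = (1 << width) - 1
    return ((x + 1) & mask) < 2


def fold_matches(width):
    """Exhaustively compare both forms for every value of the given width."""
    return all(
        two_compares(x, width) == one_compare(x, width)
        for x in range(1 << width)
    )
```

Adding one maps -1 to 0 and 0 to 1, so the two accepted values become the only ones below 2 in unsigned order.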
* [DAE] Simplify call site replacement code with CallSite NFC | Reid Kleckner | 2017-04-13 | 1 | -27/+24
    llvm-svn: 300235
* [InstCombine] Simplify attribute code with new AttributeList::get NFC | Reid Kleckner | 2017-04-13 | 1 | -31/+20
    llvm-svn: 300230
* [ArgPromotion] Don't drop !prof metadata on promoted calls | Reid Kleckner | 2017-04-13 | 1 | -1/+4
    Noticed by inspection while doing attribute work. DAE, InstCombineCalls, and ArgPromotion have a fair amount of duplicated code for hacking on call sites, and you can find bugs by comparing them.
    Add a test case for this.
    llvm-svn: 300229
* [InstCombine] use similar ops for related folds; NFCI | Sanjay Patel | 2017-04-13 | 1 | -10/+9
    It's less efficient to produce 'ule' than 'ult' since we know we're going to canonicalize to 'ult', but we shouldn't have duplicated code for these folds.
    As a trade-off, this was a pretty terrible way to make a '2'. :)
        if (LHSC == SubOne(RHSC))
          AddC = ConstantExpr::getSub(AddOne(RHSC), LHSC);
    The next steps are to share the code to fix PR32524 and add the missing 'and' fold that was left out when PR14708 was fixed:
    https://bugs.llvm.org/show_bug.cgi?id=14708
    llvm-svn: 300222
* [InstCombine] fix assert to not always be true | Sanjay Patel | 2017-04-13 | 1 | -1/+1
    llvm-svn: 300202
* Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline." | Geoff Berry | 2017-04-13 | 1 | -2/+2
    This reverts commit r296872 now that PR32153 has been fixed.
    llvm-svn: 300200
* [LV] Refactor ILV to provide vectorizeInstruction(); NFC | Ayal Zaks | 2017-04-13 | 1 | -310/+302
    Refactoring InnerLoopVectorizer's vectorizeBlockInLoop() to provide vectorizeInstruction(). Aligning DeadInstructions with its only user. Facilitates driving the transformation by VPlan - follows https://reviews.llvm.org/D28975 and its tentative breakdown.
    Differential Revision: https://reviews.llvm.org/D31997
    llvm-svn: 300183
* [IR] Take func, ret, and arg attrs separately in AttributeList::get | Reid Kleckner | 2017-04-13 | 4 | -82/+55
    This seems like a much more natural API, based on Derek Schuff's comments on r300015. It further hides the implementation detail of AttributeList that function attributes come last and appear at index ~0U, which is easy for the user to screw up.
    git diff says it saves code as well: 97 insertions(+), 137 deletions(-)
    This also makes it easier to change the implementation, which I want to do next.
    llvm-svn: 300153
* [InstCombine] Teach SimplifyMultipleUseDemandedBits to handle And/Or/Xor known bits using the LHS/RHS known bits it already acquired without recursing back into computeKnownBits. | Craig Topper | 2017-04-12 | 1 | -11/+46
    This replicates the known bits and constant creation code from the single use case for these instructions and adds it here. The computeKnownBits and constant creation code for other instructions is now in the default case of the opcode switch.
    llvm-svn: 300094
* [InstCombine] Remove unreachable code for turning an And where all demanded bits on both sides are known to be zero into a constant 0. | Craig Topper | 2017-04-12 | 1 | -4/+0
    We already handled a superset check that included the known ones too and folded to a constant that may include ones. But it can also handle the case of no ones.
    llvm-svn: 300093
* [InstCombine] fix wrong undef handling when converting select to shuffle | Sanjay Patel | 2017-04-12 | 1 | -2/+4
    As discussed in:
    https://bugs.llvm.org/show_bug.cgi?id=32486
    ...the canonicalization of vector select to shufflevector does not hold up when undef elements are present in the condition vector.
    Try to make the undef handling clear in the code and the LangRef.
    Differential Revision: https://reviews.llvm.org/D31980
    llvm-svn: 300092
* [InstCombine] In SimplifyMultipleUseDemandedBits, use a switch instead of cascaded ifs on opcode. NFC | Craig Topper | 2017-04-12 | 1 | -3/+11
    llvm-svn: 300085
* [InstCombine] Teach SimplifyDemandedInstructionBits that even if we reach an instruction that has multiple uses, if we know all the bits for the demanded bits for this context we can go ahead and create a constant. | Craig Topper | 2017-04-12 | 1 | -0/+6
    Currently if we reach an instruction with multiple uses we know we can't do any optimizations to that instruction itself since we only have the demanded bits for one of the users. But if we know all of the bits are zero/one for that one user we can still go ahead and create a constant to give to that user.
    This might then reduce the instruction to having a single use and allow additional optimizations on the other path.
    This picks up an additional case that r300075 didn't catch.
    Differential Revision: https://reviews.llvm.org/D31552
    llvm-svn: 300084
* [InstCombine] Move portion of SimplifyDemandedUseBits that deals with instructions with multiple uses out to a separate method. NFCI | Craig Topper | 2017-04-12 | 2 | -76/+103
    llvm-svn: 300082
* Teach SimplifyDemandedUseBits that adding or subtracting 0s from every bit below the highest demanded bit can be simplified | Craig Topper | 2017-04-12 | 1 | -1/+10
    If we are adding/subtracting 0s below the highest demanded bit we can just use the other operand and remove the operation.
    My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove the add and a subsequent iteration to be run.
    With this change we bypass the add in the first iteration and prevent the second iteration from changing anything.
    Differential Revision: https://reviews.llvm.org/D31120
    llvm-svn: 300075
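The arithmetic behind r300075 can be checked by brute force. The Python sketch below is an illustration, not InstCombine code. It assumes the condition is that the right-hand operand is zero in every bit up to and including the highest demanded bit; that reading, the function name, and the bit width are assumptions.

```python
def add_sub_is_redundant(width):
    """Check that LHS +/- RHS agrees with LHS on all demanded bits whenever
    RHS is zero in every bit at or below the highest demanded bit."""
    mask = (1 << width) - 1
    for demanded in range(1, 1 << width):
        highest = demanded.bit_length() - 1       # index of the highest demanded bit
        low_bits = (1 << (highest + 1)) - 1       # bits 0..highest inclusive
        for rhs in range(1 << width):
            if rhs & low_bits:
                continue                          # RHS has a one where zeros are required
            for lhs in range(1 << width):
                added = (lhs + rhs) & mask
                subtracted = (lhs - rhs) & mask
                if (added & demanded) != (lhs & demanded):
                    return False
                if (subtracted & demanded) != (lhs & demanded):
                    return False
    return True
```

Such an RHS is a multiple of 2^(highest+1), so adding or subtracting it can only change bits above the highest demanded one. The user of the demanded bits can therefore be handed the LHS directly.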
* [InstCombine] morph an existing instruction instead of creating a new one | Sanjay Patel | 2017-04-12 | 1 | -7/+6
    One potential way to make InstCombine (very slightly?) faster is to recycle instructions when possible instead of creating new ones. It's not explicitly stated AFAIK, but we don't consider this an "InstSimplify". We could, however, make a new layer to house transforms like this if that makes InstCombine more manageable (just throwing out an idea; not sure how much opportunity is actually here).
    Differential Revision: https://reviews.llvm.org/D31863
    llvm-svn: 300067