summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis
Commit message (Collapse)AuthorAgeFilesLines
...
* [BasicAA] Use DenseMap::try_emplace after D59151. NFCFangrui Song2019-03-211-5/+5
| | | | llvm-svn: 356651
* [BasicAA] Reduce no of map seaches [NFCI].Alina Sbirlea2019-03-211-14/+32
| | | | | | | | | | | | | | | | | Summary: This is a refactoring patch. - Reduce the number of map searches by reusing the iterator. - Add asserts to check that the entry is in the cache, as this is something BasicAA relies on to avoid infinite recursion. Reviewers: chandlerc, aschwaighofer Subscribers: sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59151 llvm-svn: 356644
* [MSSA] Delete move ctor; remove dynamic never-moved verificationGeorge Burgess IV2019-03-211-14/+0
| | | | | | | | | | | | | | | | Code archaeology in D59315 revealed that MSSA should never be moved. Rather than trying to check dynamically that this hasn't happened in the verify() functions of Walkers, it's likely best to just delete its move constructor. Since all these verify() functions did is check that MSSA hasn't moved, this allows us to remove these verify functions. I can readd the verification checks if someone's super concerned about us trying to `memcpy` MemorySSA or something somewhere, but I imagine we have other problems if we're trying anything like that... llvm-svn: 356641
* [ValueTracking] Compute range for abs without nswNikita Popov2019-03-201-7/+8
| | | | | | | | | | | | | This is a small followup to D59511. The code that was moved into computeConstantRange() there is a bit overly conversative: If the abs is not nsw, it does not compute any range. However, abs without nsw still has a well-defined contiguous unsigned range from 0 to SIGNED_MIN. This is a lot less useful than the usual 0 to SIGNED_MAX range, but if we're already here we might as well specify it... Differential Revision: https://reviews.llvm.org/D59563 llvm-svn: 356586
* [ValueTracking] Use computeConstantRange() for unsigned add/sub overflowNikita Popov2019-03-191-14/+25
| | | | | | | | | | | | | | | | | | Improve computeOverflowForUnsignedAdd/Sub in ValueTracking by intersecting the computeConstantRange() result into the ConstantRange created from computeKnownBits(). This allows us to detect some additional never/always overflows conditions that can't be determined from known bits. This revision also adds basic handling for constants to computeConstantRange(). Non-splat vectors will be handled in a followup. The signed case will also be handled in a followup, as it needs some more groundwork. Differential Revision: https://reviews.llvm.org/D59386 llvm-svn: 356489
* [SelectionDAG] Handle unary SelectPatternFlavor for ABS case in ↵Simon Pilgrim2019-03-191-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SelectionDAGBuilder::visitSelect These changes are related to PR37743 and include: SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node. Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner. Add promoting the integer ABS node in the LegalizeIntegerType. Expand-based legalization of integer result for the ABS nodes. Expand-based legalization of ABS vector operations. Add some integer abs testcases for different typesizes for Thumb arch Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to: tmp = (SRA, Hi, 31) Lo = (UADDO tmp, Lo) Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1)) Lo = (XOR tmp, Lo) The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern: (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))). Change integer abs testcases for codegen with the ABS node support for AArch64. Indicate that the ABS is legal for the i64 type when the NEON is supported. Change the integer abs testcases to show changing of codegen. Add combine and legalization of ABS nodes for Thumb arch. Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743 Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D49837 llvm-svn: 356468
* [InstSimplify] SimplifyICmpInst - icmp eq/ne %X, undef -> undefSimon Pilgrim2019-03-191-0/+7
| | | | | | | | | | | | | | | As discussed on PR41125 and D59363, we have a mismatch between icmp eq/ne cases with an undef operand: When the other operand is constant we fold to undef (handled in ConstantFoldCompareInstruction) When the other operand is non-constant we fold to a bool constant based on isTrueWhenEqual (handled in SimplifyICmpInst). Neither is really wrong, but this patch changes the logic in SimplifyICmpInst to consistently fold to undef. The NewGVN test change is annoying (as with most heavily reduced tests) but AFAICT I have kept the purpose of the test based on rL291968. Differential Revision: https://reviews.llvm.org/D59541 llvm-svn: 356456
* Revert "[ValueTracking][InstSimplify] Support min/max selects in ↵Nikita Popov2019-03-181-22/+1
| | | | | | | | | | computeConstantRange()" This reverts commit 106f0cdefb02afc3064268dc7a71419b409ed2f3. This change impacts the AMDGPU smed3.ll and umed3.ll codegen tests. llvm-svn: 356424
* [ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()Nikita Popov2019-03-181-1/+22
| | | | | | | | | | | | Add support for min/max flavor selects in computeConstantRange(), which allows us to fold comparisons of a min/max against a constant in InstSimplify. This was suggested by spatel as an alternative approach to D59378. I've also added the infinite looping test from that revision here. Differential Revision: https://reviews.llvm.org/D59506 llvm-svn: 356415
* [ValueTracking][InstSimplify] Move abs handling into computeConstantRange(); NFCNikita Popov2019-03-182-41/+32
| | | | | | | | | | | This is preparation for D59506. The InstructionSimplify abs handling is moved into computeConstantRange(), which is the general place for such calculations. This is NFC and doesn't affect the existing tests in test/Transforms/InstSimplify/icmp-abs-nabs.ll. Differential Revision: https://reviews.llvm.org/D59511 llvm-svn: 356409
* [SCEV] Guard movement of insertion point for loop-invariantsWarren Ristow2019-03-181-41/+47
| | | | | | | | | | | | | | | | | | | | | This reinstates r347934, along with a tweak to address a problem with PHI node ordering that that commit created (or exposed). (That commit was reverted at r348426, due to the PHI node issue.) Original commit message: r320789 suppressed moving the insertion point of SCEV expressions with dev/rem operations to the loop header in non-loop-invariant situations. This, and similar, hoisting is also unsafe in the loop-invariant case, since there may be a guard against a zero denominator. This is an adjustment to the fix of r320789 to suppress the movement even in the loop-invariant case. This fixes PR30806. Differential Revision: https://reviews.llvm.org/D57428 llvm-svn: 356392
* [ValueTracking] Use ConstantRange overflow check for signed add; NFCNikita Popov2019-03-171-48/+8
| | | | | | | | | | | | | | | | This is the same change as rL356290, but for signed add. It replaces the existing ripple logic with the overflow logic in ConstantRange. This is NFC in that it should return NeverOverflow in exactly the same cases as the previous implementation. However, it does make computeOverflowForSignedAdd() more powerful by now also determining AlwaysOverflows conditions. As none of its consumers handle this yet, this has no impact on optimization. Making use of AlwaysOverflows in with.overflow folding will be handled as a followup. Differential Revision: https://reviews.llvm.org/D59450 llvm-svn: 356345
* [ConstantRange] Add fromKnownBits() methodNikita Popov2019-03-171-11/+8
| | | | | | | | | | | | | | Following the suggestion in D59450, I'm moving the code for constructing a ConstantRange from KnownBits out of ValueTracking, which also allows us to test this code independently. I'm adding this method to ConstantRange rather than KnownBits (which would have been a bit nicer API wise) to avoid creating a dependency from Support to IR, where ConstantRange lives. Differential Revision: https://reviews.llvm.org/D59475 llvm-svn: 356339
* [ValueTracking] Use ConstantRange overflow checks for unsigned add/sub; NFCNikita Popov2019-03-151-20/+26
| | | | | | | | | | | | | | | Use the methods introduced in rL356276 to implement the computeOverflowForUnsigned(Add|Sub) functions in ValueTracking, by converting the KnownBits into a ConstantRange. This is NFC: The existing KnownBits based implementation uses the same logic as the the ConstantRange based one. This is not the case for the signed equivalents, so I'm only changing unsigned here. This is in preparation for D59386, which will also intersect the computeConstantRange() result into the range determined from KnownBits. llvm-svn: 356290
* [ThinLTO] Restructure AliasSummary to contain ValueInfo of AliaseeTeresa Johnson2019-03-151-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The AliasSummary previously contained the AliaseeGUID, which was only populated when reading the summary from bitcode. This patch changes it to instead hold the ValueInfo of the aliasee, and always populates it. This enables more efficient access to the ValueInfo (specifically in the recent patch r352438 which needed to perform an index hash table lookup using the aliasee GUID). As noted in the comments in AliasSummary, we no longer technically need to keep a pointer to the corresponding aliasee summary, since it could be obtained by walking the list of summaries on the ValueInfo looking for the summary in the same module. However, I am concerned that this would be inefficient when walking through the index during the thin link for various analyses. That can be reevaluated in the future. By always populating this new field, we can remove the guard and special handling for a 0 aliasee GUID when dumping the dot graph of the summary. An additional improvement in this patch is when reading the summaries from LLVM assembly we now set the AliaseeSummary field to the aliasee summary in that same module, which makes it consistent with the behavior when reading the summary from bitcode. Reviewers: pcc, mehdi_amini Subscribers: inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D57470 llvm-svn: 356268
* [InstCombine] canonicalize funnel shift constant shift amount to be modulo ↵Sanjay Patel2019-03-141-1/+0
| | | | | | | | | | | | | | | bitwidth The shift argument is defined to be modulo the bitwidth, so if that argument is a constant, we can always reduce the constant to its minimal form to allow better CSE and other follow-on transforms. We need to be careful to ignore constant expressions here, or we will likely infinite loop. I'm adding a general vector constant query for that case. Differential Revision: https://reviews.llvm.org/D59374 llvm-svn: 356192
* [MemorySSA] Remove redundant walker assignment [NFC].Alina Sbirlea2019-03-141-3/+1
| | | | | Subscribers: llvm-commits llvm-svn: 356189
* [SCEV] Use depth limit for trunc analysisTeresa Johnson2019-03-121-29/+36
| | | | | | | | | | | | | | | | | | | Summary: This fixes an extremely long compile time caused by recursive analysis of truncs, which were not previously subject to any depth limits unlike some of the other ops. I decided to use the same control used for sext/zext, since the routines analyzing these are sometimes mutually recursive with the trunc analysis. Reviewers: mkazantsev, sanjoy Subscribers: sanjoy, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58994 llvm-svn: 355949
* [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI.Sjoerd Meijer2019-03-121-6/+9
| | | | | | | | | | | | | | | | | This is addressing the issue that we're not modeling the cost of clib functions in TTI::getIntrinsicCosts and thus we're basically addressing this fixme: // FIXME: This is wrong for libc intrinsics. To enable analysis of clib functions, we not only need an intrinsic ID and formal arguments, but also the actual user of that function so that we can e.g. look at alignment and values of arguments. So, this is the initial plumbing to pass the user of an intrinsinsic on to getCallCosts, which queries getIntrinsicCosts. Differential Revision: https://reviews.llvm.org/D59014 llvm-svn: 355901
* Reland "Relax constraints for reduction vectorization"Sanjoy Das2019-03-121-2/+8
| | | | | | | | | | | | | | | | | | | | | Change from original commit: move test (that uses an X86 triple) into the X86 subdirectory. Original description: Gating vectorizing reductions on *all* fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355889
* Revert "Relax constraints for reduction vectorization"Sanjoy Das2019-03-111-8/+2
| | | | | | This reverts commit r355868. Breaks hexagon. llvm-svn: 355873
* Relax constraints for reduction vectorizationSanjoy Das2019-03-111-2/+8
| | | | | | | | | | | | | | | | | | Summary: Gating vectorizing reductions on *all* fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355868
* [ValueTracking] Move constant range computation into ValueTracking; NFCNikita Popov2019-03-092-238/+239
| | | | | | | | | | | | | | | | InstructionSimplify currently has some code to determine the constant range of integer instructions for some simple cases. It is used to simplify icmps. This change moves the relevant code into ValueTracking as llvm::computeConstantRange(), so it can also be reused for other purposes. In particular this is with the optimization of overflow checks in mind (ref D59071), where constant ranges cover some cases that known bits don't. llvm-svn: 355781
* [RegionPass] Fix forgotten "!".Michael Kruse2019-03-081-1/+1
| | | | | | | | | | | | Commit r355068 "Fix IR/Analysis layering issue with OptBisect" uses the template return Gate.isEnabled() && !Gate.shouldRunPass(this, getDescription(...)); for all pass kinds. For the RegionPass, it left out the not operator, causing region passes to be skipped as soon as a pass gate is used. llvm-svn: 355733
* [CFLAnders] Fix typo in comment; NFCGeorge Burgess IV2019-03-081-1/+1
| | | | | | | | Patch by Enna1! Differential Revision: https://reviews.llvm.org/D58756 llvm-svn: 355715
* [SelectionDAG] Allow the user to specify a memeq function.Clement Courbet2019-03-081-0/+13
| | | | | | | | | | | | | | | | | | | | | | | Summary: Right now, when we encounter a string equality check, e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a small compile-time constant, and fall back on calling `memcmp()` else. This is sub-optimal because memcmp has to compute much more than equality. This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms that support `bcmp`. `bcmp` can be made much more efficient than `memcmp` because equality compare is trivially parallel while lexicographic ordering has a chain dependency. Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits Differential Revision: https://reviews.llvm.org/D56593 llvm-svn: 355672
* [IDF] Delete a redundant J-edge testFangrui Song2019-03-071-5/+0
| | | | | | | | | | | | | | In the DJ-graph based computation of iterated dominance frontiers, SuccNode->getIDom() == Node is one of the tests to check if (Node,Succ) is a J-edge. If it is true, since Node is dominated by Root, SuccLevel = level(Node)+1 > RootLevel which means the next test SuccLevel > RootLevel will also be true. test the check is redundant and can be deleted as it also involves one indirection and provides no speed-up. llvm-svn: 355589
* [SCEV] Ensure that isHighCostExpansion takes into account what is being dividedDavid Green2019-03-051-2/+5
| | | | | | | | | | | | | A SCEV is not low-cost just because you can divide it by a power of 2. We need to also check what we are dividing to make sure it too is not a high-code expansion. This helps to not expand the exit value of certain loops, helping not to bloat the code. The change in no-iv-rewrite.ll is reverting back to what it was testing before rL194116, and looks a lot like the other tests in replace-loop-exit-folds.ll. Differential Revision: https://reviews.llvm.org/D58435 llvm-svn: 355393
* PHI nodes are not `FPMathOperator` sSanjoy Das2019-03-051-2/+1
| | | | | | | | | | | | | | Reviewers: chandlerc, arsenm Reviewed By: arsenm Subscribers: wdng, arsenm, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58887 llvm-svn: 355362
* [ValueTracking] do not try to peek through bitcasts in ↵Sanjay Patel2019-03-031-3/+1
| | | | | | | | | | computeKnownBitsFromAssume() There are no tests for this case, and I'm not sure how it could ever work, so I'm just removing this option from the matcher. This should fix PR40940: https://bugs.llvm.org/show_bug.cgi?id=40940 llvm-svn: 355292
* [DemandedBits] Remove some redundancy in the work listFangrui Song2019-03-031-8/+9
| | | | | | | | | InputIsKnownDead check is shared by all operands. Compute it once. For non-integer instructions, use Visited.insert(I).second to replace a find() and an insert(). llvm-svn: 355290
* [DemandedBits] Optimize a find()+insert pattern with try_emplace and ↵Fangrui Song2019-03-031-8/+3
| | | | | | APInt::operator|= llvm-svn: 355284
* [SCEV] Handle case where MaxBECount is less precise than ExactBECount for OR.Florian Hahn2019-03-021-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | In some cases, MaxBECount can be less precise than ExactBECount for AND and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are undef, but MaxBECount are different, so we hit the assertion below. This patch uses the same solution the AND case already uses. Assertion failed: ((isa<SCEVCouldNotCompute>(ExactNotTaken) || !isa<SCEVCouldNotCompute>(MaxNotTaken)) && "Exact is not allowed to be less precise than Max"), function ExitLimit This patch also consolidates test cases for both AND and OR in a single test case. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245 Reviewers: sanjoy, efriedma, mkazantsev Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58853 llvm-svn: 355259
* [SCEV] Remove undef check for SCEVConstant (NFC)Florian Hahn2019-03-021-2/+0
| | | | | | | | | | | | | The value stored in SCEVConstant is of type ConstantInt*, which can never be UndefValue. So we should never hit that code. Reviewers: mkazantsev, sanjoy Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58851 llvm-svn: 355257
* [ValueTracking] Known bits support for unsigned saturating add/subNikita Popov2019-03-011-0/+31
| | | | | | | | | | | | | | | | We have two sources of known bits: 1. For adds leading ones of either operand are preserved. For sub leading zeros of LHS and leading ones of RHS become leading zeros in the result. 2. The saturating math is a select between add/sub and an all-ones/ zero value. As such we can carry out the add/sub known bits calculation, and only preseve the known one/zero bits respectively. Differential Revision: https://reviews.llvm.org/D58329 llvm-svn: 355223
* Hide two unused debugging methods, NFCI.Jonas Hahnfeld2019-03-011-0/+4
| | | | | | | | GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used in Release builds. Hide them behind 'ifndef NDEBUG'. llvm-svn: 355205
* [PGO] Context sensitive PGO (part 2)Rong Xu2019-02-281-1/+8
| | | | | | | | | | | Part 2 of CSPGO changes (mostly related to ProfileSummary). Note that I use a default parameter in setProfileSummary() and getSummary(). This is to break the dependency in clang. I will make the parameter explicit after changing clang in a separated patch. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355131
* [ValueTracking] More accurate unsigned sub overflow detectionNikita Popov2019-02-281-11/+9
| | | | | | | | | | | | Second part of D58593. Compute precise overflow conditions based on all known bits, rather than just the sign bits. Unsigned a - b overflows iff a < b, and we can determine whether this always/never happens based on the minimal and maximal values achievable for a and b subject to the known bits constraint. llvm-svn: 355109
* Add support for computing "zext of value" in KnownBits. NFCIBjorn Pettersson2019-02-281-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly been telling that the operation is equivalent to zero extending the value we're tracking. That has not been true, instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control if the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099
* [ValueTracking] More accurate unsigned add overflow detectionNikita Popov2019-02-281-14/+10
| | | | | | | | | | | | Part of D58593. Compute precise overflow conditions based on all known bits, rather than just the sign bits. Unsigned a + b overflows iff a > ~b, and we can determine whether this always/never happens based on the minimal and maximal values achievable for a and ~b subject to the known bits constraint. llvm-svn: 355072
* Fix IR/Analysis layering issue with OptBisectRichard Trieu2019-02-283-7/+33
| | | | | | | | | | | | | OptBisect is in IR due to LLVMContext using it. However, it uses IR units from Analysis as well. This change moves getDescription functions from OptBisect to their respective IR units. Generating names for IR units will now be up to the callers, keeping the Analysis IR units in Analysis. To prevent unnecessary string generation, isEnabled function is added so that callers know when the description needs to be generated. Differential Revision: https://reviews.llvm.org/D58406 llvm-svn: 355068
* [MemorySSA] Make insertDef insert corresponding phi nodes.Alina Sbirlea2019-02-271-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The original assumption for the insertDef method was that it would not materialize Defs out of no-where, hence it will not insert phis needed after inserting a Def. However, when cloning an instruction (use case used in LICM), we do materialize Defs "out of no-where". If the block receiving a Def has at least one other Def, then no processing is needed. If the block just received its first Def, we must check where Phi placement is needed. The only new usage of insertDef is in LICM, hence the trigger for the bug. But the original goal of the method also fails to apply for the move() method. If we move a Def from the entry point of a diamond to either the left or right blocks, then the merge block must add a phi. While this usecase does not currently occur, or may be viewed as an incorrect transformation, MSSA must behave corectly given the scenario. Resolves PR40749 and PR40754. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58652 llvm-svn: 355040
* [InstSimplify] remove zero-shift-guard fold for general funnel shiftSanjay Patel2019-02-261-12/+29
| | | | | | | | | | | | | | | | | | | | | | As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-February/130491.html We can't remove the compare+select in the general case because we are treating funnel shift like a standard instruction (as opposed to a special instruction like select/phi). That means that if one of the operands of the funnel shift is poison, the result is poison regardless of whether we know that the operand is actually unused based on the instruction's particular semantics. The motivating case for this transform is the more specific rotate op (rather than funnel shift), and we are preserving the fold for that case because there is no chance of introducing extra poison when there is no anonymous extra operand to the funnel shift. llvm-svn: 354905
* [Vectorizer] Add vectorization support for fixed smul/umul intrinsicsSimon Pilgrim2019-02-251-0/+5
| | | | | | | | This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix its the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790
* [NFC] Fix typos: preceeding -> precedingJordan Rupprecht2019-02-231-1/+1
| | | | llvm-svn: 354715
* [DTU] Refine the interface and logic of applyUpdatesChijun Sima2019-02-221-59/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch separates two semantics of `applyUpdates`: 1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update. 2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated. Logic changes: Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example, ``` DTU(Lazy) and Edge A->B exists. 1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued 2. Remove A->B 3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended) ``` But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue. Interface changes: The second semantic of `applyUpdates` is separated to `applyUpdatesPermissive`. These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`. Reviewers: kuhar, brzycki, dmgreen, grosser Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58170 llvm-svn: 354669
* [InstSimplify] use any-zero matcher for fcmp foldsSanjay Patel2019-02-201-22/+25
| | | | | | | | | | | | | The m_APFloat matcher does not work with anything but strict splat vector constants, so we could miss these folds and then trigger an assertion in instcombine: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201 The previous attempt at this in rL354406 had a logic bug that actually triggered a regression test failure, but I failed to notice it the first time. llvm-svn: 354467
* Revert "[InstSimplify] use any-zero matcher for fcmp folds"Sanjay Patel2019-02-201-25/+22
| | | | | | | This reverts commit 058bb8351351d56d2a4e8a772570231f9e5305e5. Forgot to update another test affected by this change. llvm-svn: 354408
* [InstSimplify] use any-zero matcher for fcmp foldsSanjay Patel2019-02-201-22/+25
| | | | | | | | | The m_APFloat matcher does not work with anything but strict splat vector constants, so we could miss these folds and then trigger an assertion in instcombine: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201 llvm-svn: 354406
* [BPI] Look through bitcasts in calcZeroHeuristicSam Parker2019-02-151-1/+7
| | | | | | | | | | | | Constant hoisting may have hidden a constant behind a bitcast so that it isn't folded into its users. However, this prevents BPI from calculating some of its heuristics that are based upon constant values. So, I've added a simple helper function to look through these casts. Differential Revision: https://reviews.llvm.org/D58166 llvm-svn: 354119
OpenPOWER on IntegriCloud