summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis
Commit message (Collapse)AuthorAgeFilesLines
* [SCEV] Ensure that isHighCostExpansion takes into account what is being dividedDavid Green2019-03-051-2/+5
| | | | | | | | | | | | | A SCEV is not low-cost just because you can divide it by a power of 2. We need to also check what we are dividing to make sure it too is not a high-code expansion. This helps to not expand the exit value of certain loops, helping not to bloat the code. The change in no-iv-rewrite.ll is reverting back to what it was testing before rL194116, and looks a lot like the other tests in replace-loop-exit-folds.ll. Differential Revision: https://reviews.llvm.org/D58435 llvm-svn: 355393
* PHI nodes are not `FPMathOperator` sSanjoy Das2019-03-051-2/+1
| | | | | | | | | | | | | | Reviewers: chandlerc, arsenm Reviewed By: arsenm Subscribers: wdng, arsenm, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58887 llvm-svn: 355362
* [ValueTracking] do not try to peek through bitcasts in ↵Sanjay Patel2019-03-031-3/+1
| | | | | | | | | | computeKnownBitsFromAssume() There are no tests for this case, and I'm not sure how it could ever work, so I'm just removing this option from the matcher. This should fix PR40940: https://bugs.llvm.org/show_bug.cgi?id=40940 llvm-svn: 355292
* [DemandedBits] Remove some redundancy in the work listFangrui Song2019-03-031-8/+9
| | | | | | | | | InputIsKnownDead check is shared by all operands. Compute it once. For non-integer instructions, use Visited.insert(I).second to replace a find() and an insert(). llvm-svn: 355290
* [DemandedBits] Optimize a find()+insert pattern with try_emplace and ↵Fangrui Song2019-03-031-8/+3
| | | | | | APInt::operator|= llvm-svn: 355284
* [SCEV] Handle case where MaxBECount is less precise than ExactBECount for OR.Florian Hahn2019-03-021-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | In some cases, MaxBECount can be less precise than ExactBECount for AND and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are undef, but MaxBECount are different, so we hit the assertion below. This patch uses the same solution the AND case already uses. Assertion failed: ((isa<SCEVCouldNotCompute>(ExactNotTaken) || !isa<SCEVCouldNotCompute>(MaxNotTaken)) && "Exact is not allowed to be less precise than Max"), function ExitLimit This patch also consolidates test cases for both AND and OR in a single test case. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245 Reviewers: sanjoy, efriedma, mkazantsev Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58853 llvm-svn: 355259
* [SCEV] Remove undef check for SCEVConstant (NFC)Florian Hahn2019-03-021-2/+0
| | | | | | | | | | | | | The value stored in SCEVConstant is of type ConstantInt*, which can never be UndefValue. So we should never hit that code. Reviewers: mkazantsev, sanjoy Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58851 llvm-svn: 355257
* [ValueTracking] Known bits support for unsigned saturating add/subNikita Popov2019-03-011-0/+31
| | | | | | | | | | | | | | | | We have two sources of known bits: 1. For adds leading ones of either operand are preserved. For sub leading zeros of LHS and leading ones of RHS become leading zeros in the result. 2. The saturating math is a select between add/sub and an all-ones/ zero value. As such we can carry out the add/sub known bits calculation, and only preseve the known one/zero bits respectively. Differential Revision: https://reviews.llvm.org/D58329 llvm-svn: 355223
* Hide two unused debugging methods, NFCI.Jonas Hahnfeld2019-03-011-0/+4
| | | | | | | | GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used in Release builds. Hide them behind 'ifndef NDEBUG'. llvm-svn: 355205
* [PGO] Context sensitive PGO (part 2)Rong Xu2019-02-281-1/+8
| | | | | | | | | | | Part 2 of CSPGO changes (mostly related to ProfileSummary). Note that I use a default parameter in setProfileSummary() and getSummary(). This is to break the dependency in clang. I will make the parameter explicit after changing clang in a separated patch. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355131
* [ValueTracking] More accurate unsigned sub overflow detectionNikita Popov2019-02-281-11/+9
| | | | | | | | | | | | Second part of D58593. Compute precise overflow conditions based on all known bits, rather than just the sign bits. Unsigned a - b overflows iff a < b, and we can determine whether this always/never happens based on the minimal and maximal values achievable for a and b subject to the known bits constraint. llvm-svn: 355109
* Add support for computing "zext of value" in KnownBits. NFCIBjorn Pettersson2019-02-281-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly been telling that the operation is equivalent to zero extending the value we're tracking. That has not been true, instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control if the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099
* [ValueTracking] More accurate unsigned add overflow detectionNikita Popov2019-02-281-14/+10
| | | | | | | | | | | | Part of D58593. Compute precise overflow conditions based on all known bits, rather than just the sign bits. Unsigned a + b overflows iff a > ~b, and we can determine whether this always/never happens based on the minimal and maximal values achievable for a and ~b subject to the known bits constraint. llvm-svn: 355072
* Fix IR/Analysis layering issue with OptBisectRichard Trieu2019-02-283-7/+33
| | | | | | | | | | | | | OptBisect is in IR due to LLVMContext using it. However, it uses IR units from Analysis as well. This change moves getDescription functions from OptBisect to their respective IR units. Generating names for IR units will now be up to the callers, keeping the Analysis IR units in Analysis. To prevent unnecessary string generation, isEnabled function is added so that callers know when the description needs to be generated. Differential Revision: https://reviews.llvm.org/D58406 llvm-svn: 355068
* [MemorySSA] Make insertDef insert corresponding phi nodes.Alina Sbirlea2019-02-271-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The original assumption for the insertDef method was that it would not materialize Defs out of no-where, hence it will not insert phis needed after inserting a Def. However, when cloning an instruction (use case used in LICM), we do materialize Defs "out of no-where". If the block receiving a Def has at least one other Def, then no processing is needed. If the block just received its first Def, we must check where Phi placement is needed. The only new usage of insertDef is in LICM, hence the trigger for the bug. But the original goal of the method also fails to apply for the move() method. If we move a Def from the entry point of a diamond to either the left or right blocks, then the merge block must add a phi. While this usecase does not currently occur, or may be viewed as an incorrect transformation, MSSA must behave corectly given the scenario. Resolves PR40749 and PR40754. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58652 llvm-svn: 355040
* [InstSimplify] remove zero-shift-guard fold for general funnel shiftSanjay Patel2019-02-261-12/+29
| | | | | | | | | | | | | | | | | | | | | | As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-February/130491.html We can't remove the compare+select in the general case because we are treating funnel shift like a standard instruction (as opposed to a special instruction like select/phi). That means that if one of the operands of the funnel shift is poison, the result is poison regardless of whether we know that the operand is actually unused based on the instruction's particular semantics. The motivating case for this transform is the more specific rotate op (rather than funnel shift), and we are preserving the fold for that case because there is no chance of introducing extra poison when there is no anonymous extra operand to the funnel shift. llvm-svn: 354905
* [Vectorizer] Add vectorization support for fixed smul/umul intrinsicsSimon Pilgrim2019-02-251-0/+5
| | | | | | | | This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix its the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790
* [NFC] Fix typos: preceeding -> precedingJordan Rupprecht2019-02-231-1/+1
| | | | llvm-svn: 354715
* [DTU] Refine the interface and logic of applyUpdatesChijun Sima2019-02-221-59/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch separates two semantics of `applyUpdates`: 1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update. 2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated. Logic changes: Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example, ``` DTU(Lazy) and Edge A->B exists. 1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued 2. Remove A->B 3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended) ``` But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue. Interface changes: The second semantic of `applyUpdates` is separated to `applyUpdatesPermissive`. These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`. Reviewers: kuhar, brzycki, dmgreen, grosser Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58170 llvm-svn: 354669
* [InstSimplify] use any-zero matcher for fcmp foldsSanjay Patel2019-02-201-22/+25
| | | | | | | | | | | | | The m_APFloat matcher does not work with anything but strict splat vector constants, so we could miss these folds and then trigger an assertion in instcombine: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201 The previous attempt at this in rL354406 had a logic bug that actually triggered a regression test failure, but I failed to notice it the first time. llvm-svn: 354467
* Revert "[InstSimplify] use any-zero matcher for fcmp folds"Sanjay Patel2019-02-201-25/+22
| | | | | | | This reverts commit 058bb8351351d56d2a4e8a772570231f9e5305e5. Forgot to update another test affected by this change. llvm-svn: 354408
* [InstSimplify] use any-zero matcher for fcmp foldsSanjay Patel2019-02-201-22/+25
| | | | | | | | | The m_APFloat matcher does not work with anything but strict splat vector constants, so we could miss these folds and then trigger an assertion in instcombine: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201 llvm-svn: 354406
* [BPI] Look through bitcasts in calcZeroHeuristicSam Parker2019-02-151-1/+7
| | | | | | | | | | | | Constant hoisting may have hidden a constant behind a bitcast so that it isn't folded into its users. However, this prevents BPI from calculating some of its heuristics that are based upon constant values. So, I've added a simple helper function to look through these casts. Differential Revision: https://reviews.llvm.org/D58166 llvm-svn: 354119
* Revert "[INLINER] allow inlining of address taken blocks"Nick Desaulniers2019-02-141-2/+2
| | | | | | This reverts commit 19e95fe61182945b7b68ad15348f144fb996633f. llvm-svn: 354082
* [INLINER] allow inlining of address taken blocksNick Desaulniers2019-02-141-2/+2
| | | | | | | | | | | | | as long as their uses does not contain calls to functions that capture the argument (potentially allowing the blockaddress to "escape" the lifetime of the caller). TODO: - add more tests - fix crash in llvm::updateCGAndAnalysisManagerForFunctionPass when invoking Transforms/Inline/blockaddress.ll llvm-svn: 354079
* Make widenable condition transparent for MemoryWriteTrackingMax Kazantsev2019-02-141-0/+4
| | | | | | | | | Side effects of widenable condition intrinsic are modelled via InaccessibleMemOnly, and there is no way to say that it isn't really writing any memory. This patch teaches MemoryWriteTracking ignore this intrinsic. llvm-svn: 354021
* Teach isGuaranteedToTransferExecutionToSuccessor about widenable conditionsMax Kazantsev2019-02-141-1/+2
| | | | | | | Widenable condition intrinsic is guaranteed to return value, notify the isGuaranteedToTransferExecutionToSuccessor function about it. llvm-svn: 354020
* [NFC] Simplify code & reduce nest slightlyMax Kazantsev2019-02-121-33/+35
| | | | llvm-svn: 353832
* [TargetLibraryInfo] Update run time support for WindowsEvandro Menezes2019-02-111-47/+39
| | | | | | | | | It seems that, since VC19, the `float` C99 math functions are supported for all targets, unlike the C89 ones. According to the discussion at https://reviews.llvm.org/D57625. llvm-svn: 353758
* [MemorySSA] Remove verifyClobberSanity.Alina Sbirlea2019-02-111-29/+11
| | | | | | | | | | | | | | | | | | Summary: This verification may fail after certain transformations due to BasicAA's fragility. Added a small explanation and a testcase that triggers the assert in checkClobberSanity (before its removal). Addresses PR40509. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, llvm-commits, Prazek Tags: #llvm Differential Revision: https://reviews.llvm.org/D57973 llvm-svn: 353739
* Refactor setAlreadyUnrolled() and setAlreadyVectorized().Michael Kruse2019-02-111-27/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Loop::setAlreadyUnrolled() and LoopVectorizeHints::setLoopAlreadyUnrolled() both add loop metadata that stops the same loop from being transformed multiple times. This patch merges both implementations. In doing so we fix 3 potential issues: * setLoopAlreadyUnrolled() kept the llvm.loop.vectorize/interleave.* metadata even though it will not be used anymore. This already caused problems such as http://llvm.org/PR40546. Change the behavior to the one of setAlreadyUnrolled which deletes this loop metadata. * setAlreadyUnrolled() used to create a new LoopID by calling MDNode::get with nullptr as the first operand, then replacing it by the returned references using replaceOperandWith. It is possible that MDNode::get would instead return an existing node (due to de-duplication) that then gets modified. To avoid, use a fresh TempMDNode that does not get uniqued with anything else before replacing it with replaceOperandWith. * LoopVectorizeHints::matchesHintMetadataName() only compares the suffix of the attribute to set the new value for. That is, when called with "enable", would erase attributes such as "llvm.loop.unroll.enable", "llvm.loop.vectorize.enable" and "llvm.loop.distribute.enable" instead of the one to replace. Fortunately, function was only called with "isvectorized". Differential Revision: https://reviews.llvm.org/D57566 llvm-svn: 353738
* [TargetLibraryInfo] Update run time support for WindowsEvandro Menezes2019-02-111-35/+64
| | | | | | | | | | | It seems that the run time for Windows has changed and supports more math functions than it used to, especially on AArch64, ARM, and AMD64. Fixes PR40541. Differential revision: https://reviews.llvm.org/D57625 llvm-svn: 353733
* Move CFLGraph and the AA summary code over to the new `CallBase`Chandler Carruth2019-02-113-38/+36
| | | | | | instruction base class rather than the `CallSite` wrapper. llvm-svn: 353676
* Remove `CallSite` from the CodeMetrics analysis, moving it to the newChandler Carruth2019-02-111-7/+4
| | | | | | `CallBase` and simpler APIs therein. llvm-svn: 353673
* [CallSite removal] Port InstSimplify over to use `CallBase` both in itsChandler Carruth2019-02-111-19/+17
| | | | | | | | interface and implementation. Port code with: `cast<CallBase>(CS.getInstruction())`. llvm-svn: 353662
* [CallSite removal] Migrate ConstantFolding APIs and implementation toChandler Carruth2019-02-113-30/+35
| | | | | | | | | `CallBase`. Users have been updated. You can see how to update any out-of-tree usages: pass `cast<CallBase>(CS.getInstruction())`. llvm-svn: 353661
* Implementation of asm-goto support in LLVMCraig Topper2019-02-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today. This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction. There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model. Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii Differential Revision: https://reviews.llvm.org/D53765 llvm-svn: 353563
* [CodeExtractor] Update function's assumption cache after extracting blocks ↵Sergey Dmitriev2019-02-081-2/+26
| | | | | | | | | | | | | | | | | | from it Summary: Assumption cache's self-updating mechanism does not correctly handle the case when blocks are extracted from the function by the CodeExtractor. As a result function's assumption cache may have stale references to the llvm.assume calls that were moved to the outlined function. This patch fixes this problem by removing extracted llvm.assume calls from the function’s assumption cache. Reviewers: hfinkel, vsk, fhahn, davidxl, sanjoy Reviewed By: hfinkel, vsk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57215 llvm-svn: 353500
* [LSR] Generate cross iteration indexesSam Parker2019-02-071-0/+4
| | | | | | | | | | | | | | Modify GenerateConstantOffsetsImpl to create offsets that can be used by indexed addressing modes. If formulae can be generated which result in the constant offset being the same size as the recurrence, we can generate a pre-indexed access. This allows the pointer to be updated via the single pre-indexed access so that (hopefully) no add/subs are required to update it for the next iteration. For small cores, this can significantly improve performance DSP-like loops. Differential Revision: https://reviews.llvm.org/D55373 llvm-svn: 353403
* [LICM/MSSA] Add promotion to scalars by building an AliasSetTracker with ↵Alina Sbirlea2019-02-061-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | MemorySSA. Summary: Experimentally we found that promotion to scalars carries less benefits than sinking and hoisting in LICM. When using MemorySSA, we build an AliasSetTracker on demand in order to reuse the current infrastructure. We only build it if less than AccessCapForMSSAPromotion exist in the loop, a cap that is by default set to 250. This value ensures there are no runtime regressions, and there are small compile time gains for pathological cases. A much lower value (20) was found to yield a single regression in the llvm-test-suite and much higher benefits for compile times. Conservatively we set the current cap to a high value, but we will explore lowering it when MemorySSA is enabled by default. Reviewers: sanjoy, chandlerc Subscribers: nemanjai, jlebar, Prazek, george.burgess.iv, jfb, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D56625 llvm-svn: 353339
* [AliasSetTracker] Pass MustAlias to addPointer more often.Alina Sbirlea2019-02-061-24/+36
| | | | | | | | | | | | | | | Summary: Pass the alias info to addPointer when available. Will save an alias() call for must sets when adding a known Must or May alias. [Part of a series of cleanup patches] Reviewers: reames, mkazantsev Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D56613 llvm-svn: 353335
* [AliasSetTracker] Minor style tweak to avoid a variable w/two distinct live ↵Philip Reames2019-02-061-4/+2
| | | | | | ranges [NFC] llvm-svn: 353267
* Move DomTreeUpdater from IR to AnalysisRichard Trieu2019-02-063-1/+530
| | | | | | | | DomTreeUpdater depends on headers from Analysis, but is in IR. This is a layering violation since Analysis depends on IR. Relocate this code from IR to Analysis to fix the layering violation. llvm-svn: 353265
* [BasicAA] Cache nonEscapingLocalObjects for alias() calls.Alina Sbirlea2019-02-051-7/+27
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Use a small cache for Values tested by nonEscapingLocalObject(). Since the calls to PointerMayBeCaptured are fairly expensive, this saves a good amount of compile time for anything relying heavily on BasicAA.alias() calls. This uses the same approach as the AliasCache, i.e. the cache is reset after each alias() call. The cache is not used or updated by modRefInfo calls since it's harder to know when to reset the cache. Testcases that show improvements with this patch are too large to include. Example compile time improvement: 7s to 6s. Reviewers: chandlerc, sunfish Subscribers: sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57627 llvm-svn: 353245
* [TargetLibraryInfo] Regroup run time functions for Windows (NFC)Evandro Menezes2019-02-051-38/+37
| | | | | | Regroup supported and unsupported functions by precision and C standard. llvm-svn: 353213
* [NFC] fix trivial typos in commentsHiroshi Inoue2019-02-058-13/+13
| | | | llvm-svn: 353147
* Revert "[PATCH] [TargetLibraryInfo] Update run time support for Windows"Evandro Menezes2019-02-041-59/+53
| | | | | | This reverts accidental commit ff5527718d5d3b9966f6e8948866c0dc15ffcf3c. llvm-svn: 353118
* [PATCH] [TargetLibraryInfo] Update run time support for WindowsEvandro Menezes2019-02-041-53/+59
| | | | | | | | | | | | | It seems that the run time for Windows has changed and supports more math functions than before. Since LLVM requires at least VS2015, I assume that this is the run time that would be redistributed with programs built with Clang. Thus, I based this update on the header file `math.h` that accompanies it. This patch addresses the PR40541. Unfortunately, I have no access to a Windows development environment to validate it. llvm-svn: 353114
* Adjust cardinality of internal inliner thresholdsDavid Callahan2019-02-041-4/+4
| | | | | | | | | | | | | | | | | | | Summary: While compiling openJDK11 (also other workloads), some make files would pass both CFLAGS and LDFLAGS at link step ; resulting in duplicate options on the command line when one is using LTO and trying to influence the inliner. Most of the internal flags are ZeroOrMore, this diff changes the remaining ones. Reviewers: david2050, twoh, modocache Reviewed By: twoh Subscribers: mehdi_amini, dexonsmith, eraman, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57537 Patch by: Abdoul-Kader Keita llvm-svn: 353071
* [SCEV] Do not bother creating separate SCEVUnknown for unreachable nodesMax Kazantsev2019-02-041-1/+1
| | | | | | | | | | | Currently, SCEV creates SCEVUnknown for every node of unreachable code. If we have a huge amounts of such code, we will be littering SE with these nodes. We could just state that they all are undef and save some memory. Differential Revision: https://reviews.llvm.org/D57567 Reviewed By: sanjoy llvm-svn: 353017
OpenPOWER on IntegriCloud