summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis
Commit message (Collapse)AuthorAgeFilesLines
* Allow subclassing ExternalAAMatt Arsenault2018-11-071-22/+0
| | | | | | | | | | | | | | This allows testing AMDGPU alias analysis like any other alias analysis pass. This fixes the existing test pointlessly running opt -O3 when it really just wants to run the one analysis. Before there was no way to test this using -aa-eval with opt, since the default constructed pass is run. The wrapper subclass allows the default constructor to pass the necessary callback. llvm-svn: 346353
* Add support for llvm.is.constant intrinsic (PR4898)James Y Knight2018-11-071-0/+22
| | | | | | | | | | | | | | | This adds the llvm-side support for post-inlining evaluation of the __builtin_constant_p GCC intrinsic. Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call to a function where canConstantFoldTo returns true, and one of the arguments is a struct. Updated from patch initially by Janusz Sobczak. Differential Revision: https://reviews.llvm.org/D4276 llvm-svn: 346322
* Fix unit tests after patch https://reviews.llvm.org/rL346313Calixte Denizet2018-11-071-5/+1
| | | | | | | | | | | | | | Summary: Tests are broken so fix them. Reviewers: marco-c Reviewed By: marco-c Subscribers: sylvestre.ledru, llvm-commits Differential Revision: https://reviews.llvm.org/D54208 llvm-svn: 346318
* [GCOV] Flush counters before to avoid counting the execution before fork ↵Calixte Denizet2018-11-071-0/+24
| | | | | | | | | | | | | | | | | | | | twice and for exec** functions we must flush before the call Summary: This is replacement for patch in https://reviews.llvm.org/D49460. When we fork, the counters are duplicate as they're and so the values are finally wrong when writing gcda for parent and child. So just before to fork, we flush the counters and so the parent and the child have new counters set to zero. For exec** functions, we need to flush before the call to have some data. Reviewers: vsk, davidxl, marco-c Reviewed By: marco-c Subscribers: llvm-commits, sylvestre.ledru, marco-c Differential Revision: https://reviews.llvm.org/D53593 llvm-svn: 346313
* [ThinLTO] Split NotEligibleToImport into legality and inlinability flagsTeresa Johnson2018-11-061-10/+9
| | | | | | | | | | | | | | | | | | | | Summary: The NotEligibleToImport flag on the GlobalValueSummary was set if it isn't legal to import (e.g. because it references unpromotable locals) and when it can't be inlined (in which case importing is pointless). I split out the inlinable piece into a separate flag on the FunctionSummary (doesn't make sense for aliases or global variables), because in the future we may want to import for reasons other than inlining. Reviewers: davidxl Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D53345 llvm-svn: 346261
* Disable calls to *_finite and other glibc-only functions on Musl.Eli Friedman2018-11-061-5/+5
| | | | | | | | | Non-GNU environments don't have __finite_*, so treat them as unavailable. Differential Revision: https://reviews.llvm.org/D51282 llvm-svn: 346250
* [NFC] Turn collectTransitivePredecessors into a static functionMax Kazantsev2018-11-061-2/+5
| | | | llvm-svn: 346217
* [InstSimplify] fold select (fcmp X, Y), X, YSanjay Patel2018-11-051-0/+31
| | | | | | | | | This is NFCI for InstCombine because it calls InstSimplify, so I left the tests for this transform there. As noted in the code comment, we can allow this fold more often by using FMF and/or value tracking. llvm-svn: 346169
* [Inliner] Penalise inlining of calls with loops at OzDavid Green2018-11-051-0/+20
| | | | | | | | | | | | | | | | | | We currently seem to underestimate the size of functions with loops in them, both in terms of absolute code size and in the difficulties of dealing with such code. (Calls, for example, can be tail merged to further reduce codesize). At -Oz, we can then increase code size by inlining small loops multiple times. This attempts to penalise functions with loops at -Oz by adding a CallPenalty for each top level loop in the function. It uses LI (and hence DT) to calculate the number of loops. As we are dealing with minsize, the inline threshold is small and functions at this point should be relatively small, making the construction of these cheap. Differential Revision: https://reviews.llvm.org/D52716 llvm-svn: 346134
* [ValueTracking] determine sign of 0.0 from select when matching min/max FPSanjay Patel2018-11-041-0/+21
| | | | | | | | | | | | | | | | | | | In PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 ..we may fail to recognize/simplify fabs() in some cases because we do not canonicalize fcmp with a -0.0 operand. Adding that canonicalization can cause regressions on min/max FP tests, so that's this patch: for the purpose of determining whether something is min/max, let the value returned by the select determine how we treat a 0.0 operand in the fcmp. This patch doesn't actually change the -0.0 to +0.0. It just changes the analysis, so we don't fail to recognize equivalent min/max patterns that only differ in the signbit of 0.0. Differential Revision: https://reviews.llvm.org/D54001 llvm-svn: 346097
* [ValueTracking] peek through 2-input shuffles in ComputeNumSignBitsSanjay Patel2018-11-031-19/+34
| | | | | | | | | | | This patch gives the IR ComputeNumSignBits the same functionality as the DAG version (the code is derived from the existing code). This an extension of the single input shuffle analysis added with D53659. Differential Revision: https://reviews.llvm.org/D53987 llvm-svn: 346071
* [ProfileSummary] Add options to override hot and cold count thresholds.Easwaran Raman2018-11-021-0/+18
| | | | | | | | | | | | | | Summary: The hot and cold count thresholds are derived from the summary, but for debugging purposes it is convenient to provide the actual thresholds. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54040 llvm-svn: 346005
* [ValueTracking] allow non-canonical shuffles when computing signbitsSanjay Patel2018-11-021-8/+10
| | | | | | | This possibility is noted in D53987 for a different case, so we need to adjust the existing code. llvm-svn: 345988
* [AliasSetTracker] Misc cleanup (NFCI)Alina Sbirlea2018-11-011-14/+6
| | | | | Summary: Remove two redundant checks, add one in the unit test. Remove an unused method. Fix computation of TotalMayAliasSetSize. llvm-svn: 345911
* Fix clang -Wimplicit-fallthrough warnings across llvm, NFCReid Kleckner2018-11-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch should not introduce any behavior changes. It consists of mostly one of two changes: 1. Replacing fall through comments with the LLVM_FALLTHROUGH macro 2. Inserting 'break' before falling through into a case block consisting of only 'break'. We were already using this warning with GCC, but its warning behaves slightly differently. In this patch, the following differences are relevant: 1. GCC recognizes comments that say "fall through" as annotations, clang doesn't 2. GCC doesn't warn on "case N: foo(); default: break;", clang does 3. GCC doesn't warn when the case contains a switch, but falls through the outer case. I will enable the warning separately in a follow-up patch so that it can be cleanly reverted if necessary. Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu Differential Revision: https://reviews.llvm.org/D53950 llvm-svn: 345882
* [InstSimplify] fold icmp based on range of abs/nabs (2nd try)Sanjay Patel2018-11-011-0/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is retrying the fold from rL345717 (reverted at rL347780) ...with a fix for the miscompile demonstrated by PR39510: https://bugs.llvm.org/show_bug.cgi?id=39510 Original commit message: This is a fix for PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 We managed to get some of these patterns using computeKnownBits in https://reviews.llvm.org/D47041, but that can't be used for nabs(). Instead, put in some range-based logic, so we can fold both abs/nabs with icmp with a constant value. Alive proofs: https://rise4fun.com/Alive/21r Name: abs_nsw_is_positive %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp sgt i32 %abs, -1 => %r = i1 true Name: abs_nsw_is_not_negative %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp slt i32 %abs, 0 => %r = i1 false Name: nabs_is_negative_or_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp slt i32 %nabs, 1 => %r = i1 true Name: nabs_is_not_over_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp sgt i32 %nabs, 0 => %r = i1 false Differential Revision: https://reviews.llvm.org/D53844 llvm-svn: 345832
* [NFC] Specialize public API of ICFLoopSafetyInfo for insertions and removalsMax Kazantsev2018-11-011-1/+7
| | | | llvm-svn: 345822
* [SCEV] Avoid redundant computations when doing AddRec mergeMax Kazantsev2018-11-011-5/+6
| | | | | | | | | | | | | When we calculate a product of 2 AddRecs, we end up making quite massive computations to deduce the operands of resulting AddRec. This process can be optimized by computing all args of intermediate sum and then calling `getAddExpr` once rather than calling `getAddExpr` with intermediate result every time a new argument is computed. Differential Revision: https://reviews.llvm.org/D53189 Reviewed By: rtereshin llvm-svn: 345813
* revert rL345717 : [InstSimplify] fold icmp based on range of abs/nabsSanjay Patel2018-10-311-42/+0
| | | | | | | This can miscompile as shown in PR39510: https://bugs.llvm.org/show_bug.cgi?id=39510 llvm-svn: 345780
* [InstSimplify] fold 'fcmp nnan ult X, 0.0' when X is not negativeSanjay Patel2018-10-311-1/+4
| | | | | | This is the inverted case for the transform added with D53874 / rL345725. llvm-svn: 345728
* [InstSimplify] fold 'fcmp nnan oge X, 0.0' when X is not negativeSanjay Patel2018-10-311-0/+4
| | | | | | | | | | | | | | This re-raises some of the open questions about how to apply and use fast-math-flags in IR from PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 ...but given the current implementation (no FMF on casts), this is likely the only way to predicate the transform. This is part of solving PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 Differential Revision: https://reviews.llvm.org/D53874 llvm-svn: 345725
* [InstSimplify] fold icmp based on range of abs/nabsSanjay Patel2018-10-311-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a fix for PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 We managed to get some of these patterns using computeKnownBits in D47041, but that can't be used for nabs(). Instead, put in some range-based logic, so we can fold both abs/nabs with icmp with a constant value. Alive proofs: https://rise4fun.com/Alive/21r Name: abs_nsw_is_positive %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp sgt i32 %abs, -1 => %r = i1 true Name: abs_nsw_is_not_negative %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp slt i32 %abs, 0 => %r = i1 false Name: nabs_is_negative_or_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp slt i32 %nabs, 1 => %r = i1 true Name: nabs_is_not_over_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp sgt i32 %nabs, 0 => %r = i1 false Differential Revision: https://reviews.llvm.org/D53844 llvm-svn: 345717
* [LV] Support vectorization of interleave-groups that require an epilog underDorit Nuzman2018-10-312-5/+28
| | | | | | | | | | | | | | | | | | | | | | optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705
* ADT/STLExtras: Introduce llvm::empty; NFCMatthias Braun2018-10-311-1/+1
| | | | | | | | This is modeled after C++17 std::empty(). Differential Revision: https://reviews.llvm.org/D53909 llvm-svn: 345679
* [AliasSetTracker] Cleanup addPointer interface. [NFCI]Alina Sbirlea2018-10-291-6/+6
| | | | | | | | | | | | | | Summary: Attempting to simplify the addPointer interface. Currently there's code decomposing a MemoryLocation into (Ptr, Size, AAMDNodes) only to recreate the MemoryLocation inside the call. Reviewers: reames, mkazantsev Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D53836 llvm-svn: 345548
* Revert r344172: [LV] Add a new reduction pattern matchRenato Golin2018-10-271-65/+6
| | | | | | | | This patch has caused fast-math issues in the reduction pattern. Will re-work and land again. llvm-svn: 345465
* [ValueTracking] peek through shuffles in ComputeNumSignBits (PR37549)Sanjay Patel2018-10-261-0/+21
| | | | | | | | | | | | | | | | | The motivating case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 The analysis improvement allows us to form a vector 'select' out of bitwise logic (the use of ComputeNumSignBits was added at rL345149). The smaller test shows another InstCombine improvement - we use ComputeNumSignBits to add 'nsw' to shift-left. But the negative test shows an example where we must not add 'nsw' - when the shuffle mask contains undef elements. Differential Revision: https://reviews.llvm.org/D53659 llvm-svn: 345429
* [IAI,LV] Avoid creating a scalar epilogue due to gaps in interleave-groups when Dorit Nuzman2018-10-221-0/+24
| | | | | | | | | | | | | | | | | | | | | | | optimizing for size LV is careful to respect -Os and not to create a scalar epilog in all cases (runtime tests, trip-counts that require a remainder loop) except for peeling due to gaps in interleave-groups. This patch fixes that; -Os will now have us invalidate such interleave-groups and vectorize without an epilog. The patch also removes a related FIXME comment that is now obsolete, and was also inaccurate: "FIXME: return None if loop requiresScalarEpilog(<MaxVF>), or look for a smaller MaxVF that does not require a scalar epilog." (requiresScalarEpilog() has nothing to do with VF). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53420 llvm-svn: 344883
* [LoopVectorize] Loop vectorization for minimum and maximumThomas Lively2018-10-191-0/+2
| | | | | | | | | | | | Summary: Depends on D52766. Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52767 llvm-svn: 344816
* [InstCombine] InstCombine and InstSimplify for minimum and maximumThomas Lively2018-10-192-7/+24
| | | | | | | | | | | | Summary: Depends on D52765 Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52766 llvm-svn: 344799
* [ConstantFolding] Constant fold minimum and maximum intrinsicsThomas Lively2018-10-191-0/+14
| | | | | | | | | | | | Summary: Depends on D52764 Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52765 llvm-svn: 344796
* [TI removal] Switch some newly added code over to use `Instruction`Chandler Carruth2018-10-192-8/+7
| | | | | | directly. llvm-svn: 344768
* [DA] DivergenceAnalysis for unstructured, reducible CFGsNicolai Haehnle2018-10-183-0/+807
| | | | | | | | | | | | | | | | | | | | | | Summary: This is patch 2 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433). This patch contains a generic divergence analysis implementation for unstructured, reducible Control-Flow Graphs. It contains two new classes. The `SyncDependenceAnalysis` class lazily computes sync dependences, which relate divergent branches to points of joining divergent control. The `DivergenceAnalysis` class contains the generic divergence analysis implementation. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: sameerds, kristina, nhaehnle, xbolva00, tschuett, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D51491 llvm-svn: 344734
* [TI removal] Switch an analysis to just use Instruction.Chandler Carruth2018-10-181-5/+5
| | | | llvm-svn: 344713
* [NFC] Remove GOTO from SCEVMax Kazantsev2018-10-171-20/+14
| | | | llvm-svn: 344687
* [LV] Teach vectorizer about variant value store into uniform addressAnna Thomas2018-10-161-10/+6
| | | | | | | | | | | | | | | | | | | | Summary: Teach vectorizer about vectorizing variant value stores to uniform address. Similar to rL343028, we do not allow vectorization if we have multiple stores to the same uniform address. Cost model already has the change for considering the extract instruction cost for a variant value store. See added test cases for how vectorization is done. The patch also contains changes to the ORE messages. Reviewers: Ayal, mkuper, anemet, hsaito Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D52656 llvm-svn: 344613
* [NFC] Introduce ICFLoopSafetyInfoMax Kazantsev2018-10-161-0/+31
| | | | | | | | | | | | | | | | This is an alternative implementation of LoopSafetyInfo that uses the implicit control flow tracking to give precise answers on queries "whether or not this block contains throwing instructions". This rules out false-positive answers on LoopSafetyInfo's queries. This patch only introduces the new implementation. It is not currently used in any pass. The enabling patches will go separately, through review. The plan is to completely replace all uses of LoopSafetyInfo with ICFLoopSafetyInfo in the future, but to avoid introducing functional problems, we will do it pass by pass. llvm-svn: 344601
* [NFC] Remove obsolete method headerMayThrowMax Kazantsev2018-10-161-13/+2
| | | | llvm-svn: 344596
* [NFC] Make LoopSafetyInfo abstract to allow alternative implementationsMax Kazantsev2018-10-161-8/+8
| | | | llvm-svn: 344592
* [NFC] Encapsulate work with BlockColors in LoopSafetyInfoMax Kazantsev2018-10-161-1/+16
| | | | llvm-svn: 344590
* [NFC] Move block throw check inside allLoopPathsLeadToBlockMax Kazantsev2018-10-161-6/+10
| | | | llvm-svn: 344588
* [NFC] Turn isGuaranteedToExecute into a methodMax Kazantsev2018-10-161-8/+8
| | | | llvm-svn: 344587
* [SCEV] Limit AddRec "simplifications" to avoid combinatorial explosionsMax Kazantsev2018-10-161-1/+1
| | | | | | | | | | | | | | | | | | SCEV's transform that turns `{A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L>` into a single AddRec of size `2n+1` with complex combinatorial coefficients can easily trigger exponential growth of the SCEV (in case if nothing gets folded and simplified). We tried to restrain this transform using the option `scalar-evolution-max-add-rec-size`, but its default value seems to be insufficiently small: the test attached to this patch with default value of this option `16` has a SCEV of >3M symbols (when printed out). This patch reduces the simplification limit. It is not a cure to combinatorial explosions, but at least it reduces this corner case to something more or less reasonable. Differential Revision: https://reviews.llvm.org/D53282 Reviewed By: sanjoy llvm-svn: 344584
* [TI removal] Make variables declared as `TerminatorInst` and initializedChandler Carruth2018-10-157-12/+12
| | | | | | | | | | | | | by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502
* [TI removal] Remove TerminatorInst as an input parameter from all publicChandler Carruth2018-10-151-1/+2
| | | | | | | | | LLVM APIs. There weren't very many. We still have the instruction visitor, and APIs with TerminatorInst as a return type or an output parameter. llvm-svn: 344494
* recommit 344472 after fixing build failure on ARM and PPC.Dorit Nuzman2018-10-142-12/+27
| | | | llvm-svn: 344475
* revert 344472 due to failures.Dorit Nuzman2018-10-142-27/+12
| | | | llvm-svn: 344473
* [IAI,LV] Add support for vectorizing predicated strided accesses using maskedDorit Nuzman2018-10-142-12/+27
| | | | | | | | | | | | | | | | | | | | | | | interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472
* [ThinLTO] Don't import GV which contains blockaddressEugene Leviant2018-10-121-3/+16
| | | | | | Differential revision: https://reviews.llvm.org/D53139 llvm-svn: 344325
* [NFC] Factor out getOrCreateAddRecExpr methodMax Kazantsev2018-10-111-18/+24
| | | | llvm-svn: 344227
OpenPOWER on IntegriCloud