summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis
Commit message (Collapse)AuthorAgeFilesLines
* [SCEV] Rely on ConstantRange instead of custom logic; NFCISanjoy Das2016-10-021-124/+52
| | | | llvm-svn: 283058
* [SCEV] Remove commented out code; NFCSanjoy Das2016-10-021-3/+1
| | | | llvm-svn: 283056
* Use StringRef in TLI instead of raw pointer (NFC)Mehdi Amini2016-10-011-15/+13
| | | | llvm-svn: 283005
* Use StringRef in Pass/PassManager APIs (NFC)Mehdi Amini2016-10-011-3/+1
| | | | llvm-svn: 283004
* NFC fix doxygen commentsPiotr Padlewski2016-09-302-25/+25
| | | | llvm-svn: 282950
* [LAA, LV] Port to new streaming interface for opt remarks. Update LVAdam Nemet2016-09-301-24/+38
| | | | | | | | | | | | | | (Recommit after making sure IsVerbose gets properly initialized in DiagnosticInfoOptimizationBase. See previous commit that takes care of this.) OptimizationRemarkAnalysis directly takes the role of the report that is generated by LAA. Then we need the magic to be able to turn an LAA remark into an LV remark. This is done via a new OptimizationRemark ctor. llvm-svn: 282813
* Revert "[LAA, LV] Port to new streaming interface for opt remarks. Update LV"Adam Nemet2016-09-291-38/+24
| | | | | | | | This reverts commit r282758. There are some clang failures I haven't seen. llvm-svn: 282759
* [LAA, LV] Port to new streaming interface for opt remarks. Update LVAdam Nemet2016-09-291-24/+38
| | | | | | | | | | OptimizationRemarkAnalysis directly takes the role of the report that is generated by LAA. Then we need the magic to be able to turn an LAA remark into an LV remark. This is done via a new OptimizationRemark ctor. llvm-svn: 282758
* Refactor the ProfileSummaryInfo to use doInitialization and doFinalization ↵Dehao Chen2016-09-282-18/+13
| | | | | | | | | | | | | | to handle Module update. Summary: This refactors the change in r282616 Reviewers: davidxl, eraman, mehdi_amini Subscribers: mehdi_amini, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D25041 llvm-svn: 282630
* Fix the bug when -compile-twice is specified, the PSI will be invalidated.Dehao Chen2016-09-281-3/+12
| | | | | | | | | | | | | | Summary: When using llc with -compile-twice, module is generated twice, but getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI will still get the old PSI with the original (invalidated) Module. This patch checks if the module has changed when calling getPSI, if yes, update the module and invalidate the Summary. The bug does not show up in the current llc because PSI is not used in CodeGen yet. But with https://reviews.llvm.org/D24989, the bug will be exposed by test/CodeGen/PowerPC/pr26378.ll Reviewers: eraman, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24993 llvm-svn: 282616
* Don't look through addrspacecast in GetPointerBaseWithConstantOffsetArtur Pilipenko2016-09-281-2/+7
| | | | | | | | | | | | Pointers in different addrspaces can have different sizes, so it's not valid to look through addrspace cast calculating base and offset for a value. This is similar to D13008. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D24729 llvm-svn: 282612
* [SCEV] Use a SmallPtrSet as a temporary union predicate; NFCSanjoy Das2016-09-281-55/+65
| | | | | | | | | | | | | | | | Summary: Instead of creating and destroying SCEVUnionPredicate instances (which internally creates and destroys a DenseMap), use temporary SmallPtrSet instances of remember the set of predicates that will get reified into a SCEVUnionPredicate. Reviewers: silviu.baranga, sbaranga Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D25000 llvm-svn: 282606
* [InstSimplify] allow or-of-icmps folds with vector splat constantsSanjay Patel2016-09-281-12/+11
| | | | llvm-svn: 282592
* [InstSimplify] allow and-of-icmps folds with vector splat constantsSanjay Patel2016-09-281-10/+9
| | | | llvm-svn: 282590
* [LAA] Rename emitAnalysis to recordAnalys. NFCAdam Nemet2016-09-281-23/+20
| | | | | | | | | Ever since LAA was split out into an analysis on its own, this function stopped emitting the report directly. Instead it stores it to be retrieved by the client which can then emit it as its own report (e.g. -Rpass-analysis=loop-vectorize). llvm-svn: 282561
* [Inliner] Port all opt remarks to new streaming APIAdam Nemet2016-09-271-1/+7
| | | | llvm-svn: 282559
* Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFCAdam Nemet2016-09-271-10/+9
| | | | | | | With the new streaming interface, these class names need to be typed a lot and it's way too looong. llvm-svn: 282544
* Output optimization remarks in YAMLAdam Nemet2016-09-271-0/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (Re-committed after moving the template specialization under the yaml namespace. GCC was complaining about this.) This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiganosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282539
* [SCEV] Replace a struct with a function; NFCSanjoy Das2016-09-271-153/+139
| | | | | | We can do this now thanks to C++11 lambdas. llvm-svn: 282515
* [SCEV] Use find instead of find_as; NFCSanjoy Das2016-09-271-1/+1
| | | | | | We don't need the extra generality here. llvm-svn: 282514
* [SCEV] Reduce the scope of a struct; NFCSanjoy Das2016-09-271-22/+20
| | | | llvm-svn: 282513
* [SCEV] Remove custom RAII wrapper; NFCSanjoy Das2016-09-271-22/+5
| | | | | | Instead use the pre-existing `scope_exit` class. llvm-svn: 282512
* [SCEV] Make PendingLoopPredicates more frugal; NFCISanjoy Das2016-09-271-3/+4
| | | | | | | | | I don't expect `PendingLoopPredicates` to have very many elements (e.g. when -O3'ing the sqlite3 amalgamation, `PendingLoopPredicates` has at most 3 elements). So now we use a `SmallPtrSet` for it instead of the more heavyweight `DenseSet`. llvm-svn: 282511
* Revert "Output optimization remarks in YAML"Adam Nemet2016-09-271-80/+0
| | | | | | | | This reverts commit r282499. The GCC bots are failing llvm-svn: 282503
* Output optimization remarks in YAMLAdam Nemet2016-09-271-0/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiganosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282499
* [thinlto] Basic thinlto fdo heuristicPiotr Padlewski2016-09-261-13/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch improves thinlto importer by importing 3x larger functions that are called from hot block. I compared performance with the trunk on spec, and there were about 2% on povray and 3.33% on milc. These results seems to be consistant and match the results Teresa got with her simple heuristic. Some benchmarks got slower but I think they are just noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with more iterations to confirm. Geomean of all benchmarks including the noisy ones were about +0.02%. I see much better improvement on google branch with Easwaran patch for pgo callsite inlining (the inliner actually inline those big functions) Over all I see +0.5% improvement, and I get +8.65% on povray. So I guess we will see much bigger change when Easwaran patch will land (it depends on new pass manager), but it is still worth putting this to trunk before it. Implementation details changes: - Removed CallsiteCount. - ProfileCount got replaced by Hotness - hot-import-multiplier is set to 3.0 for now, didn't have time to tune it up, but I see that we get most of the interesting functions with 3, so there is no much performance difference with higher, and binary size doesn't grow as much as with 10.0. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24638 llvm-svn: 282437
* [SCEV] Fix the order of members in the initializer list.Chandler Carruth2016-09-261-1/+1
| | | | | | | Noticed due to the warning on this line. Sanjoy is on a less-than-awesome internet connection, so committing on his behalf. llvm-svn: 282380
* [SCEV] Assign LoopPropertiesCache in the move constructorSanjoy Das2016-09-261-0/+1
| | | | | | | | | | | In a previous change I collapsed two different caches into one. When doing that I noticed that ScalarEvolution's move constructor was not moving those caches. To keep the previous change simple, I've moved that bugfix into this separate change. llvm-svn: 282376
* [SCEV] Combine two predicates into one; NFCSanjoy Das2016-09-261-31/+24
| | | | | | | | | Both `loopHasNoSideEffects` and `loopHasNoAbnormalExits` involve walking the loop and maintaining similar sorts of caches. This commit changes SCEV to compute both the predicates via a single walk, and maintain a single cache instead of two. llvm-svn: 282375
* [SCEV] Make it obvious BackedgeTakenInfo's constructor steals storageSanjoy Das2016-09-261-2/+4
| | | | | | | Specifically, it moves SCEVUnionPredicates from its input into its own storage. Make this obvious at the type level. llvm-svn: 282374
* [SCEV] Further isolate incidental data structure; NFCSanjoy Das2016-09-261-4/+7
| | | | llvm-svn: 282373
* [SCEV] Simplify BackedgeTakenInfo::getMax; NFCSanjoy Das2016-09-261-7/+7
| | | | llvm-svn: 282372
* [SCEV] Reserve space in SmallVector; NFCSanjoy Das2016-09-251-0/+1
| | | | llvm-svn: 282368
* [SCEV] Have ExitNotTakenInfo keep a pointer to its predicate; NFCSanjoy Das2016-09-251-11/+15
| | | | | | | SCEVUnionPredicate is a "heavyweight" structure, so it is beneficial to store the (optional) data out of line. llvm-svn: 282366
* [SCEV] Simplify tracking ExitNotTakenInfo instances; NFCSanjoy Das2016-09-251-72/+24
| | | | | | | | | | This change simplifies a data structure optimization in the `BackedgeTakenInfo` class for loops with exactly one computable exit. I've sanity checked that this does not regress compile time performance, using sqlite3's amalgamated build. llvm-svn: 282365
* [SCEV] Rename a couple of fields; NFCSanjoy Das2016-09-251-48/+55
| | | | llvm-svn: 282364
* [SCEV] Remove incidental data structure; NFCSanjoy Das2016-09-251-15/+19
| | | | llvm-svn: 282363
* Analysis: Return early for UndefValue in computeKnownBitsDuncan P. N. Exon Smith2016-09-241-0/+8
| | | | | | | | | There is no benefit in looking through assumptions on UndefValue to guess known bits. Return early to avoid walking their use-lists, and assert that all instances of ConstantData are handled here for similar reasons (UndefValue was the only integer/pointer holdout). llvm-svn: 282337
* Analysis: Return early in isKnownNonNullAt for ConstantDataDuncan P. N. Exon Smith2016-09-241-0/+4
| | | | | | | | | | | | | Check and return early for ConstantPointerNull and UndefValue specifically in isKnownNonNullAt, and assert that ConstantData never make it to isKnownNonNullFromDominatingCondition. This confirms that isKnownNonNullFromDominatingCondition never walks through the use-list of an instance of ConstantData. Given that such use-lists cross module boundaries, it never really made sense to do so, and was potentially very expensive. llvm-svn: 282333
* [TLI] isdigit / isascii / toascii param type should match return type (PR30484)Sanjay Patel2016-09-231-1/+4
| | | | | | We crash in LibCallSimplifier if we don't check the validity of the function signature properly. llvm-svn: 282278
* Enhance calcColdCallHeuristics for InvokeInstJun Bum Lim2016-09-231-0/+10
| | | | | | | | | | | | Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst. Reviewers: mcrosier, hfinkel, davidxl Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D24868 llvm-svn: 282262
* [LV] When reporting about a specific instruction without debug location use ↵Adam Nemet2016-09-211-1/+4
| | | | | | | | loop's This can occur for example if some optimization drops the debug location. llvm-svn: 282048
* [InferAttributes] Don't access parameters that don't exist.Michael Kuperstein2016-09-201-2/+2
| | | | | | | Check for the correct number of parameters before querying their type. This fixes PR30455. llvm-svn: 282038
* move variables closer to their uses; add FIXMEs; NFCSanjay Patel2016-09-201-10/+10
| | | | llvm-svn: 281972
* [Loop Vectorizer] Consecutive memory access - fixed and simplifiedElena Demikhovsky2016-09-181-4/+4
| | | | | | | | | Amended consecutive memory access detection in Loop Vectorizer. Load/Store were not handled properly without preceding GEP instruction. Differential Revision: https://reviews.llvm.org/D20789 llvm-svn: 281853
* [InstCombine] allow vector types for constant folding / computeKnownBits ↵Sanjay Patel2016-09-161-3/+4
| | | | | | | | | | | | | | | | (PR24942) computeKnownBits() already works for integer vectors, so allow vector types when calling that from InstCombine. I don't think the change to use m_APInt in computeKnownBits is strictly necessary because we do check for ConstantVector later, but it's more efficient to handle the splat case without needing to loop on vector elements. This should work with InstSimplify, but doesn't yet, so I made that a FIXME comment on the test for PR24942: https://llvm.org/bugs/show_bug.cgi?id=24942 Differential Revision: https://reviews.llvm.org/D24677 llvm-svn: 281777
* Reapplying r278731 after fixing the problem that caused it to be reverted.David L Kreitzer2016-09-161-15/+100
| | | | | | | | | | Enhance SCEV to compute the trip count for some loops with unknown stride. Patch by Pankaj Chawla Differential Revision: https://reviews.llvm.org/D22377 llvm-svn: 281732
* [LCG] Redesign the lazy post-order iteration mechanism for theChandler Carruth2016-09-161-103/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LazyCallGraph to support repeated, stable iterations, even in the face of graph updates. This is particularly important to allow the CGSCC pass manager to walk the RefSCCs (and thus everything else) in a module more than once. Lots of unittests and other tests were hard or impossible to write because repeated CGSCC pass managers which didn't invalidate the LazyCallGraph would conclude the module was empty after the first one. =[ Really, really bad. The interesting thing is that in many ways this simplifies the code. We can now re-use the same code for handling reference edge insertion updates of the RefSCC graph as we use for handling call edge insertion updates of the SCC graph. Outside of adapting to the shared logic for this (which isn't trivial, but is *much* simpler than the DFS it replaces!), the new code involves putting newly created RefSCCs when deleting a reference edge into the cached list in the correct way, and to re-formulate the iterator to be stable and effective even in the face of these kinds of updates. I've updated the unittests for the LazyCallGraph to re-iterate the postorder sequence and verify that this all works. We even check for using alternating iterators to trigger the lazy formation of RefSCCs after mutation has occured. It's worth noting that there are a reasonable number of likely simplifications we can make past this. It isn't clear that we need to keep the "LeafRefSCCs" around any more. But I've not removed that mostly because I want this to be a more isolated change. Differential Revision: https://reviews.llvm.org/D24219 llvm-svn: 281716
* [PM] Port CFGViewer and CFGPrinter to the new Pass ManagerSriraman Tallam2016-09-152-50/+69
| | | | | | Differential Revision: https://reviews.llvm.org/D24592 llvm-svn: 281640
* Add some shortcuts in LazyValueInfo to reduce compile time of Correlated ↵Wei Mi2016-09-151-0/+29
| | | | | | | | | | | | | | | Value Propagation. The patch is to partially fix PR10584. Correlated Value Propagation queries LVI to check non-null for pointer params of each callsite. If we know the def of param is an alloca instruction, we know it is non-null and can return early from LVI. Similarly, CVP queries LVI to check whether pointer for each mem access is constant. If the def of the pointer is an alloca instruction, we know it is not a constant pointer. These shortcuts can reduce the cost of CVP significantly. Differential Revision: https://reviews.llvm.org/D18066 llvm-svn: 281586
OpenPOWER on IntegriCloud