summaryrefslogtreecommitdiffstats
path: root/llvm/test/Bitcode/thinlto-function-summary-callgraph-profile-summary.ll
Commit message (Collapse)AuthorAgeFilesLines
* Bitcode: Write the irsymtab to disk.Peter Collingbourne2017-06-271-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D33973 llvm-svn: 306487
* Restrict call metadata based hotness detection to Sample PGO modeTeresa Johnson2017-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: Don't use the metadata on call instructions for determining hotness unless we are in sample PGO mode, where it is needed because profile counts are not accurate. In instrumentation mode this is not necessary and does more harm than good when calls have VP metadata that hasn't been properly scaled after transformations or dropped after constant prop based devirtualization (both should be fixed, but we don't need to do this in the first place for instrumentation PGO). This required adjusting a number of tests to distinguish between sample and instrumentation PGO handling, and to add in profile summary metadata so that getProfileCount can get the summary. Reviewers: davidxl, danielcdh Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D32877 llvm-svn: 302844
* Bitcode: Add a string table to the bitcode format.Peter Collingbourne2017-04-171-15/+33
| | | | | | | | | | | | | | | | | | | | | | | Add a top-level STRTAB block containing a string table blob, and start storing strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in the string table. This change allows us to share names between globals and comdats as well as between modules, and improves the efficiency of loading bitcode files by no longer using a bit encoding for symbol names. Once we start writing the irsymtab to the bitcode file we will also be able to share strings between it and the module. On my machine, link time for Chromium for Linux with ThinLTO decreases by about 7% for no-op incremental builds or about 1% for full builds. Total bitcode file size decreases by about 3%. As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html Differential Revision: https://reviews.llvm.org/D31838 llvm-svn: 300464
* Use ProfileSummary:getProfileCount to get ScaledCount for ModuleSummaryDehao Chen2017-03-211-1/+5
| | | | | | | | | | | | | | Summary: ModuleSummary should use the standard interface of ProfileSummary::getProfileCount. Reviewers: eraman, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D31154 llvm-svn: 298404
* Add function importing info from samplepgo profile to the module summary.Dehao Chen2017-02-281-2/+3
| | | | | | | | | | | | | | Summary: For SamplePGO, the profile may contain cross-module inline stacks. As we need to make sure the profile annotation happens when all the hot inline stacks are expanded, we need to pass this info to the module importer so that it can import proper functions if necessary. This patch implemented this feature by emitting cross-module targets as part of function entry metadata. In the module-summary phase, the metadata is used to build call edges that points to functions need to be imported. Reviewers: mehdi_amini, tejohnson Reviewed By: tejohnson Subscribers: davidxl, llvm-commits Differential Revision: https://reviews.llvm.org/D30053 llvm-svn: 296498
* IR: Eliminate non-determinism in the module summary analysis.Peter Collingbourne2016-12-201-2/+2
| | | | | | | | | Also make the summary ref and call graph vectors immutable. This means a smaller API surface and fewer places to audit for non-determinism. Differential Revision: https://reviews.llvm.org/D27875 llvm-svn: 290200
* [thinlto] Basic thinlto fdo heuristicPiotr Padlewski2016-09-261-0/+98
Summary: This patch improves thinlto importer by importing 3x larger functions that are called from hot block. I compared performance with the trunk on spec, and there were about 2% on povray and 3.33% on milc. These results seems to be consistant and match the results Teresa got with her simple heuristic. Some benchmarks got slower but I think they are just noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with more iterations to confirm. Geomean of all benchmarks including the noisy ones were about +0.02%. I see much better improvement on google branch with Easwaran patch for pgo callsite inlining (the inliner actually inline those big functions) Over all I see +0.5% improvement, and I get +8.65% on povray. So I guess we will see much bigger change when Easwaran patch will land (it depends on new pass manager), but it is still worth putting this to trunk before it. Implementation details changes: - Removed CallsiteCount. - ProfileCount got replaced by Hotness - hot-import-multiplier is set to 3.0 for now, didn't have time to tune it up, but I see that we get most of the interesting functions with 3, so there is no much performance difference with higher, and binary size doesn't grow as much as with 10.0. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24638 llvm-svn: 282437
OpenPOWER on IntegriCloud