summaryrefslogtreecommitdiffstats
path: root/llvm/test/Bitcode/thinlto-function-summary-callgraph-profile-summary.ll
Commit message (Collapse)AuthorAgeFilesLines
* [ThinLTO] Attempt to recommit r365188 after alignment fixEugene Leviant2019-07-051-2/+2
| | | | llvm-svn: 365215
* Reverted r365188 due to alignment problems on i686-androidEugene Leviant2019-07-051-2/+2
| | | | llvm-svn: 365206
* [ThinLTO] Attempt to recommit r365040 after caching fixEugene Leviant2019-07-051-2/+2
| | | | | | | | | | | | | | It's possible that some function can load and store the same variable using the same constant expression: store %Derived* @foo, %Derived** bitcast (%Base** @bar to %Derived**) %42 = load %Derived*, %Derived** bitcast (%Base** @bar to %Derived**) The bitcast expression was mistakenly cached while processing loads, and never examined later when processing store. This caused @bar to be mistakenly treated as read-only variable. See load-store-caching.ll. llvm-svn: 365188
* Revert [ThinLTO] Optimize writeonly globals outReid Kleckner2019-07-041-2/+2
| | | | | | | | | This reverts r365040 (git commit 5cacb914758c7f436b47c8362100f10cef14bbc4) Speculatively reverting, since this appears to have broken check-lld on Linux. Partial analysis in https://crbug.com/981168. llvm-svn: 365097
* [ThinLTO] Optimize writeonly globals outEugene Leviant2019-07-031-2/+2
| | | | | | Differential revision: https://reviews.llvm.org/D63444 llvm-svn: 365040
* [ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligibleTeresa Johnson2019-05-101-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We hit undefined references building with ThinLTO when one source file contained explicit instantiations of a template method (weak_odr) but there were also implicit instantiations in another file (linkonce_odr), and the latter was the prevailing copy. In this case the symbol was marked hidden when the prevailing linkonce_odr copy was promoted to weak_odr. It led to unsats when the resulting shared library was linked with other code that contained a reference (expecting to be resolved due to the explicit instantiation). Add a CanAutoHide flag to the GV summary to allow the thin link to identify when all copies are eligible for auto-hiding (because they were all originally linkonce_odr global unnamed addr), and only do the auto-hide in that case. Most of the changes here are due to plumbing the new flag through the bitcode and llvm assembly, and resulting test changes. I augmented the existing auto-hide test to check for this situation. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi Tags: #llvm Differential Revision: https://reviews.llvm.org/D59709 llvm-svn: 360466
* [LTO] Record whether LTOUnit splitting is enabled in indexTeresa Johnson2019-01-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Records in the module summary index whether the bitcode was compiled with the option necessary to enable splitting the LTO unit (e.g. -fsanitize=cfi, -fwhole-program-vtables, or -fsplit-lto-unit). The information is passed down to the ModuleSummaryIndex builder via a new module flag "EnableSplitLTOUnit", which is propagated onto a flag on the summary index. This is then used during the LTO link to check whether all linked summaries were built with the same value of this flag. If not, an error is issued when we detect a situation requiring whole program visibility of the class hierarchy. This is the case when both of the following conditions are met: 1) We are performing LowerTypeTests or Whole Program Devirtualization. 2) There are type tests or type checked loads in the code. Note I have also changed the ThinLTOBitcodeWriter to also gate the module splitting on the value of this flag. Reviewers: pcc Subscribers: ormris, mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D53890 llvm-svn: 350948
* [ThinLTO] Compute synthetic function entry countEaswaran Raman2018-12-131-1/+1
| | | | | | | | | | | | | | | | | Summary: This patch computes the synthetic function entry count on the whole program callgraph (based on module summary) and writes the entry counts to the summary. After function importing, this count gets attached to the IR as metadata. Since it adds a new field to the summary, this bumps up the version. Reviewers: tejohnson Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43521 llvm-svn: 349076
* [ThinLTO] Internalize readonly globalsEugene Leviant2018-11-161-2/+2
| | | | | | | | An attempt to recommit r346584 after failure on OSX build bot. Fixed cache key computation in ThinLTOCodeGenerator and added test case llvm-svn: 347033
* Revert "[ThinLTO] Internalize readonly globals"Steven Wu2018-11-131-2/+2
| | | | | | This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2. llvm-svn: 346768
* [ThinLTO] Internalize readonly globalsEugene Leviant2018-11-101-2/+2
| | | | | | | | | This patch allows internalising globals if all accesses to them (from live functions) are from non-volatile load instructions Differential revision: https://reviews.llvm.org/D49362 llvm-svn: 346584
* [ThinLTO] Parse module summary index from assemblyTeresa Johnson2018-06-261-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Adds assembly parsing support for the module summary index (follow on to r333335 which added the assembly writing support). I added support to llvm-as to invoke the index parsing, so that it can create either a bitcode file with a Module and a per-module index, or a combined index without a Module. I will send follow on patches soon to do the following: - add support to tools such as llvm-lto2 to parse the per-module indexes from assembly instead of bitcode when testing the thin link. - verification support. Depends on D47844 and D47842. Reviewers: pcc, dexonsmith, mehdi_amini Subscribers: inglorion, eraman, steven_wu, llvm-commits Differential Revision: https://reviews.llvm.org/D47905 llvm-svn: 335602
* [ThinLTO] Fix a few more test match issuesTeresa Johnson2018-05-261-3/+3
| | | | | | | | | | | | Fix a few more bot failures due to r333335: - don't match path other than file name, since the delimiter is different for Windows - The summary IDs in thinlto-function-summary-refgraph.ll may vary and therefore can't be matched exactly, because the ordering depends on the iteration order of the index map which is keyed by GUID. The GUID for private values will depend on the path. llvm-svn: 333338
* [ThinLTO] Print module summary index to assemblyTeresa Johnson2018-05-261-0/+36
| | | | | | | | | | | | | | | | | | | | | Summary: Implements AsmWriter support for printing the module summary index to assembly with the format discussed in the RFC "LLVM Assembly format for ThinLTO Summary". Implements just enough of the parsing support to recognize and ignore the summary entries. As agreed in the RFC thread, this will be the behavior when assembling the IR. A follow on change will implement parsing/assembling of the summary entries for use by tools that currently build the summary index from bitcode. Reviewers: dexonsmith, pcc Subscribers: inglorion, eraman, steven_wu, dblaikie, llvm-commits Differential Revision: https://reviews.llvm.org/D46699 llvm-svn: 333335
* [ThinLTO] Serialize WithGlobalValueDeadStripping index flag for distributed ↵Teresa Johnson2018-02-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | backends Summary: A recent fix to drop dead symbols (r323633) did not work for ThinLTO distributed backends because we lose the WithGlobalValueDeadStripping set on the index during the thin link. This patch adds a new flags record to the bitcode format for the index, and serializes this flag for the combined index (it would always be 0 for the per-module index generated by the compile step, so no need to serialize the new flags record there until/unless we add another flag that applies to the per-module indexes). Generally this flag should always be set for the distributed backends, which are necessarily performed after the thin link. However, if we were to simply set this flag on the index applied to the distributed backends (invoked via clang), we would lose the ability to disable dead stripping via -compute-dead=false for debugging purposes. Reviewers: grimar, pcc Subscribers: mehdi_amini, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D42799 llvm-svn: 324444
* [ThinLTO] Add FunctionAttrs to ThinLTO indexCharles Saternos2017-08-041-2/+2
| | | | | | Adds function attributes to index: ReadNone, ReadOnly, NoRecurse, NoAlias. This attributes will be used for future ThinLTO optimizations that will propagate function attributes across modules. llvm-svn: 310061
* Increase the import-threshold for crtical functions.Dehao Chen2017-07-071-1/+1
| | | | | | | | | | | | | | Summary: For interative sample-pgo, if a hot call site is inlined in the profiling binary, we should inline it in before profile annotation in the backend. Before that, the compile phase first collects all GUIDs that needs to be imported and creates virtual "hot" call edge in the summary. However, "hot" is not good enough to guarantee the callsites get inlined. This patch introduces "critical" call edge, and assign much higher importing threshold for those edges. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, mehdi_amini, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D35096 llvm-svn: 307439
* Bitcode: Write the irsymtab to disk.Peter Collingbourne2017-06-271-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D33973 llvm-svn: 306487
* Restrict call metadata based hotness detection to Sample PGO modeTeresa Johnson2017-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: Don't use the metadata on call instructions for determining hotness unless we are in sample PGO mode, where it is needed because profile counts are not accurate. In instrumentation mode this is not necessary and does more harm than good when calls have VP metadata that hasn't been properly scaled after transformations or dropped after constant prop based devirtualization (both should be fixed, but we don't need to do this in the first place for instrumentation PGO). This required adjusting a number of tests to distinguish between sample and instrumentation PGO handling, and to add in profile summary metadata so that getProfileCount can get the summary. Reviewers: davidxl, danielcdh Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D32877 llvm-svn: 302844
* Bitcode: Add a string table to the bitcode format.Peter Collingbourne2017-04-171-15/+33
| | | | | | | | | | | | | | | | | | | | | | | Add a top-level STRTAB block containing a string table blob, and start storing strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in the string table. This change allows us to share names between globals and comdats as well as between modules, and improves the efficiency of loading bitcode files by no longer using a bit encoding for symbol names. Once we start writing the irsymtab to the bitcode file we will also be able to share strings between it and the module. On my machine, link time for Chromium for Linux with ThinLTO decreases by about 7% for no-op incremental builds or about 1% for full builds. Total bitcode file size decreases by about 3%. As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html Differential Revision: https://reviews.llvm.org/D31838 llvm-svn: 300464
* Use ProfileSummary:getProfileCount to get ScaledCount for ModuleSummaryDehao Chen2017-03-211-1/+5
| | | | | | | | | | | | | | Summary: ModuleSummary should use the standard interface of ProfileSummary::getProfileCount. Reviewers: eraman, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D31154 llvm-svn: 298404
* Add function importing info from samplepgo profile to the module summary.Dehao Chen2017-02-281-2/+3
| | | | | | | | | | | | | | Summary: For SamplePGO, the profile may contain cross-module inline stacks. As we need to make sure the profile annotation happens when all the hot inline stacks are expanded, we need to pass this info to the module importer so that it can import proper functions if necessary. This patch implemented this feature by emitting cross-module targets as part of function entry metadata. In the module-summary phase, the metadata is used to build call edges that points to functions need to be imported. Reviewers: mehdi_amini, tejohnson Reviewed By: tejohnson Subscribers: davidxl, llvm-commits Differential Revision: https://reviews.llvm.org/D30053 llvm-svn: 296498
* IR: Eliminate non-determinism in the module summary analysis.Peter Collingbourne2016-12-201-2/+2
| | | | | | | | | Also make the summary ref and call graph vectors immutable. This means a smaller API surface and fewer places to audit for non-determinism. Differential Revision: https://reviews.llvm.org/D27875 llvm-svn: 290200
* [thinlto] Basic thinlto fdo heuristicPiotr Padlewski2016-09-261-0/+98
Summary: This patch improves thinlto importer by importing 3x larger functions that are called from hot block. I compared performance with the trunk on spec, and there were about 2% on povray and 3.33% on milc. These results seems to be consistant and match the results Teresa got with her simple heuristic. Some benchmarks got slower but I think they are just noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with more iterations to confirm. Geomean of all benchmarks including the noisy ones were about +0.02%. I see much better improvement on google branch with Easwaran patch for pgo callsite inlining (the inliner actually inline those big functions) Over all I see +0.5% improvement, and I get +8.65% on povray. So I guess we will see much bigger change when Easwaran patch will land (it depends on new pass manager), but it is still worth putting this to trunk before it. Implementation details changes: - Removed CallsiteCount. - ProfileCount got replaced by Hotness - hot-import-multiplier is set to 3.0 for now, didn't have time to tune it up, but I see that we get most of the interesting functions with 3, so there is no much performance difference with higher, and binary size doesn't grow as much as with 10.0. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24638 llvm-svn: 282437
OpenPOWER on IntegriCloud