<feed xmlns='http://www.w3.org/2005/Atom'>
<title>bcm5719-llvm/llvm/test/Transforms/SampleProfile, branch meklort-10.0.1</title>
<subtitle>Project Ortega BCM5719 LLVM</subtitle>
<id>https://git.raptorcs.com/git/bcm5719-llvm/atom?h=meklort-10.0.1</id>
<link rel='self' href='https://git.raptorcs.com/git/bcm5719-llvm/atom?h=meklort-10.0.1'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/'/>
<updated>2019-12-25T00:27:51+00:00</updated>
<entry>
<title>Migrate function attribute "no-frame-pointer-elim"="false" to "frame-pointer"="none" as cleanups after D56351</title>
<updated>2019-12-25T00:27:51+00:00</updated>
<author>
<name>Fangrui Song</name>
<email>maskray@google.com</email>
</author>
<published>2019-12-25T00:11:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=a36ddf0aa9db5c1086e04f56b5f077b761712eb5'/>
<id>urn:sha1:a36ddf0aa9db5c1086e04f56b5f077b761712eb5</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351</title>
<updated>2019-12-24T23:57:33+00:00</updated>
<author>
<name>Fangrui Song</name>
<email>maskray@google.com</email>
</author>
<published>2019-12-24T23:52:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=502a77f125f43ffde57af34d3fd1b900248a91cd'/>
<id>urn:sha1:502a77f125f43ffde57af34d3fd1b900248a91cd</id>
<content type='text'>
</content>
</entry>
<entry>
<title>[AutoFDO] Statistic for context sensitive profile guided inlining</title>
<updated>2019-12-12T05:37:21+00:00</updated>
<author>
<name>Wenlei He</name>
<email>aktoon@gmail.com</email>
</author>
<published>2019-11-22T07:59:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=d275a064871763ab3a7712c74712d2fd1d0bef5d'/>
<id>urn:sha1:d275a064871763ab3a7712c74712d2fd1d0bef5d</id>
<content type='text'>
Summary: AutoFDO compilation has two places that do inlining: the sample profile loader, which inlines using the context-sensitive profile, and the regular inliner, which runs as a CGSCC pass. Ideally we want most inlining to come from the sample profile loader, as that is driven by the context-sensitive profile and also retains context sensitivity after inlining. In reality, however, most of the inlining happens in the regular inliner. To track the number of inline instances from the sample profile loader and help move more inlining to it, I'm adding statistics and optimization remarks for the sample profile loader's inlining.

Reviewers: wmi, davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70584
</content>
</entry>
<entry>
<title>[AutoFDO] Inline replay for cold/small callees from sample profile loader</title>
<updated>2019-12-06T19:44:45+00:00</updated>
<author>
<name>Wenlei He</name>
<email>aktoon@gmail.com</email>
</author>
<published>2019-11-26T23:53:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=7b61ae68ecd7a127e69c9e0d2563bddb7eccad7a'/>
<id>urn:sha1:7b61ae68ecd7a127e69c9e0d2563bddb7eccad7a</id>
<content type='text'>
Summary:
Sample profile loader of AutoFDO tries to replay previous inlining using context-sensitive profile. The replay only repeats inlining if the call site block is hot. As a result it punts inlining of small functions, some of which can be beneficial for size and will still be inlined by the CGSCC inliner later. The oscillation between sample profile loader's inlining and regular CGSCC inlining causes unnecessary loss of context-sensitive profile. It doesn't have much impact on the inline decision itself, but it negatively affects post-inline profile quality, as the CGSCC inliner has to scale counts, which is not as accurate as the original context-sensitive profile, and a bad post-inline profile can misguide code layout.

This change added regular Inline Cost calculation for sample profile loader, so we can inline small functions upfront under switch -sample-profile-inline-size. In addition -sample-profile-cold-inline-threshold is added so we can tune the separate size threshold - currently the default is chosen to be the same as regular inliner's cold call-site threshold.

Reviewers: wmi, davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70750
</content>
</entry>
<entry>
<title>[AutoFDO] Top-down Inlining for specialization with context-sensitive profile</title>
<updated>2019-12-06T00:07:01+00:00</updated>
<author>
<name>Wenlei He</name>
<email>aktoon@gmail.com</email>
</author>
<published>2019-11-25T07:54:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=532196d811ad4db1e522012c9d20e4a95aae2eb3'/>
<id>urn:sha1:532196d811ad4db1e522012c9d20e4a95aae2eb3</id>
<content type='text'>
Summary:
AutoFDO's sample profile loader processes functions in arbitrary source code order, so if I change the order of two functions in source code, the inline decision can change. This also prevented the use of context-sensitive profile to do specialization while inlining. This commit enforces SCC top-down order for sample profile loader. With this change, we can now do specialization, as illustrated by the added test case:

Say we have the call paths A-&gt;B-&gt;C and D-&gt;B-&gt;C, and we want to inline C into B when the root inliner is B, but not when the root inliner is A or D. This is not possible without enforcing top-down order: once C is inlined into B, A and D can only choose to inline (B-&gt;C) as a whole or nothing, but what we want is to inline only B into A and D, not its recursive callee C. If we process functions in top-down order, this is no longer a problem, which is what this commit does.
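
As an illustrative sketch only (not code from this patch), a caller-before-callee order for the example call graph can be computed with a reverse post-order walk. The function name top_down_order and the dictionary encoding of the call graph are assumptions made for this illustration; the sketch assumes an acyclic call graph and does not handle recursion, which the SCC-based ordering in the patch does.

```python
def top_down_order(call_graph):
    """Return functions so that every caller precedes its callees.

    Assumes an acyclic call graph; recursion would need SCC handling.
    """
    visited = set()
    post_order = []

    def visit(fn):
        if fn in visited:
            return
        visited.add(fn)
        for callee in call_graph.get(fn, []):
            visit(callee)
        post_order.append(fn)  # appended only after all of its callees

    for fn in call_graph:
        visit(fn)
    return post_order[::-1]  # reverse post-order puts callers first


# Hypothetical call graph for the A, B, C, D example: caller maps to callees.
call_graph = {"C": [], "B": ["C"], "A": ["B"], "D": ["B"]}
order = top_down_order(call_graph)
print(order)
```

With this order, A and D are visited before B, and B before C, so a decision about inlining B into A or D is made before C could be inlined into B.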

This change is guarded with a new switch "-sample-profile-top-down-load" for tuning, and it depends on D70653. Eventually, top-down can be the default order for sample profile loader.

Reviewers: wmi, davidxl

Subscribers: hiraditya, llvm-commits, tejohnson

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70655
</content>
</entry>
<entry>
<title>[AutoFDO] Properly merge context-sensitive profile of inlinee back to outlined function</title>
<updated>2019-12-05T23:57:55+00:00</updated>
<author>
<name>Wenlei He</name>
<email>aktoon@gmail.com</email>
</author>
<published>2019-11-25T07:31:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=e503fd85d3ac9d3e1493a7a63bc43c6939e132cc'/>
<id>urn:sha1:e503fd85d3ac9d3e1493a7a63bc43c6939e132cc</id>
<content type='text'>
Summary:
When the sample profile loader decides not to inline a previously inlined call-site, we adjust the profile of the outlined function simply by scaling up its profile counts by the call-site count. This means the context-sensitive profile of that inlined instance will be thrown away. This commit tries to keep the context-sensitive profile for such cases:

 - Instead of scaling the outlined function's profile, we now properly merge the FunctionSamples of the inlined instance into the outlined function, including all recursively inlined profiles.
 - Instead of adjusting the profile for negative inline decision at the end of the sample profile loader pass, we do the profile merge right after processing each function. This change paired with top-down ordering of annotation/inline-replay (a separate diff) will make sure we recursively merge profile back before the profile is used for annotation and inline replay.

A new switch -sample-profile-merge-inlinee is added to enable the new profile merge for tuning. It should be the default behavior eventually.
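
As a rough illustration of the merge described above (a sketch under assumptions, not the patch's implementation), a recursive merge can add the inlined instance's counts into the outlined function and descend into nested inlinees. The dictionary layout with "body" and "inlinees" keys, the function name merge_samples, and all counts are hypothetical stand-ins and do not mirror LLVM's FunctionSamples class.

```python
def merge_samples(dst, src):
    """Add src's counts into dst in place, recursing into nested inlinees."""
    for loc, count in src["body"].items():
        dst["body"][loc] = dst["body"].get(loc, 0) + count
    for callsite, nested in src["inlinees"].items():
        # Create an empty nested profile in dst if this call site is new.
        target = dst["inlinees"].setdefault(
            callsite, {"body": {}, "inlinees": {}}
        )
        merge_samples(target, nested)
    return dst


# Hypothetical profiles: B as an outlined function, and B as inlined into A
# (which itself carries a nested profile for C inlined at line 2).
outlined_b = {"body": {1: 10, 2: 4}, "inlinees": {}}
inlined_b_in_a = {
    "body": {1: 100, 2: 60},
    "inlinees": {"2:C": {"body": {1: 60}, "inlinees": {}}},
}
merge_samples(outlined_b, inlined_b_in_a)
print(outlined_b)
```

The nested profile for C survives the merge instead of being discarded, which is the property the description above is after.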

Reviewers: wmi, davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70653
</content>
</entry>
<entry>
<title>Keep import function list for inlinee profile update</title>
<updated>2019-11-07T02:36:00+00:00</updated>
<author>
<name>Wenlei He</name>
<email>aktoon@gmail.com</email>
</author>
<published>2019-11-01T19:57:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=ba1dfae054b4c9a8b11aabd62fd0dcb792366206'/>
<id>urn:sha1:ba1dfae054b4c9a8b11aabd62fd0dcb792366206</id>
<content type='text'>
Summary:
When adjusting function entry counts after inlining, Function::setEntryCount is called without providing an import function list. The side effect of that is the previously set import function list will be dropped. The import function list is used by ThinLTO to help import hot cross-module callees for LTO inlining, so dropping that during ThinLTO pre-link may adversely affect LTO inlining. The fix is to keep the list while updating entry counts for inlining.

Reviewers: wmi, davidxl, tejohnson

Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69736
</content>
</entry>
<entry>
<title>[SampleFDO] Add profile remapping support for profile on-demand loading used</title>
<updated>2019-10-18T22:35:20+00:00</updated>
<author>
<name>Wei Mi</name>
<email>wmi@google.com</email>
</author>
<published>2019-10-18T22:35:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=8c8ec1f6868bd0f96801fabc55ea395d9d171f06'/>
<id>urn:sha1:8c8ec1f6868bd0f96801fabc55ea395d9d171f06</id>
<content type='text'>
by ExtBinary format profile

Profile on-demand loading was added for ExtBinary format profile in rL374233,
but currently profile on-demand loading doesn't work well with profile
remapping. The patch adds the support.

Suppose a function in the current module has an outline instance in the profile.
The function name in the module is different from the name of the outline
instance, but remapper knows the two names are equal. When loading profile
on-demand, the outline instance has to be loaded with remapper's help.

At the same time SampleProfileReaderItaniumRemapper is changed from a proxy
of SampleProfileReader to a helper member in SampleProfileReader.

Differential Revision: https://reviews.llvm.org/D68901

llvm-svn: 375295
</content>
</entry>
<entry>
<title>[SampleFDO] Add indexing for function profiles so they can be loaded on demand</title>
<updated>2019-10-09T21:36:03+00:00</updated>
<author>
<name>Wei Mi</name>
<email>wmi@google.com</email>
</author>
<published>2019-10-09T21:36:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=09dcfe68057082207f47da230c1c2618ce3aadca'/>
<id>urn:sha1:09dcfe68057082207f47da230c1c2618ce3aadca</id>
<content type='text'>
in ExtBinary format

Currently for Text, Binary and ExtBinary format profiles, when we compile a
module with samplefdo, even if there is no function showing up in the profile,
we have to load all the function profiles from the profile input. That is a
waste of compile time.

CompactBinary format profile already supports loading function
profiles on demand. In this patch, we add support for loading profiles on
demand for ExtBinary format. It works whether or not the sections in ExtBinary
format profile are compressed. Experiment shows it reduces the time to
compile a server benchmark by 30%.

When profile remapping and loading function profiles on demand are both used,
extra work needs to be done so that the loading on demand process will take
the name remapping into consideration. It will be addressed in a follow-up
patch.

Differential Revision: https://reviews.llvm.org/D68601

llvm-svn: 374233
</content>
</entry>
<entry>
<title>[SampleFDO] Add compression support for any section in ExtBinary profile format</title>
<updated>2019-10-07T16:12:37+00:00</updated>
<author>
<name>Wei Mi</name>
<email>wmi@google.com</email>
</author>
<published>2019-10-07T16:12:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=b523790ae1b30a1708d2fc7937f90e283330ef33'/>
<id>urn:sha1:b523790ae1b30a1708d2fc7937f90e283330ef33</id>
<content type='text'>
Previously ExtBinary profile format only supported compression using zlib for
profile symbol list. In this patch, we extend the compression support to any
section. Users can select some or all of the sections to compress. In an
experiment, for a 45M profile in ExtBinary format, compressing name table
reduced its size to 24M, and compressing all the sections reduced its size
to 11M.

Differential Revision: https://reviews.llvm.org/D68253

llvm-svn: 373914
</content>
</entry>
</feed>
