bcm5719-llvm/llvm/test/Transforms/SampleProfile, branch meklort-10.0.1

Project Ortega BCM5719 LLVM
https://git.raptorcs.com/git/bcm5719-llvm/atom?h=meklort-10.0.1
Updated: 2019-12-25T00:27:51+00:00

Migrate function attribute "no-frame-pointer-elim"="false" to "frame-pointer"="none" as cleanups after D56351
Updated: 2019-12-25T00:27:51+00:00
Author: Fangrui Song <maskray@google.com>
Published: 2019-12-25T00:11:33+00:00
Commit: urn:sha1:a36ddf0aa9db5c1086e04f56b5f077b761712eb5

Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351
Updated: 2019-12-24T23:57:33+00:00
Author: Fangrui Song <maskray@google.com>
Published: 2019-12-24T23:52:21+00:00
Commit: urn:sha1:502a77f125f43ffde57af34d3fd1b900248a91cd

[AutoFDO] Statistic for context sensitive profile guided inlining
Updated: 2019-12-12T05:37:21+00:00
Author: Wenlei He <aktoon@gmail.com>
Published: 2019-11-22T07:59:41+00:00
Commit: urn:sha1:d275a064871763ab3a7712c74712d2fd1d0bef5d

Summary: AutoFDO compilation has two places that do inlining - the sample profile loader that does inlining with context sensitive profile, and the regular inliner as CGSCC pass. Ideally we want most inlining to come from sample profile loader as that is driven by context sensitive profile and also retains context sensitivity after inlining. However the reality is most of the inlining actually happens during regular inliner. To track the number of inline instances from sample profile loader and help move more inlining to sample profile loader, I'm adding statistics and optimization remarks for sample profile loader's inlining.

Reviewers: wmi, davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70584

[AutoFDO] Inline replay for cold/small callees from sample profile loader
Updated: 2019-12-06T19:44:45+00:00
Author: Wenlei He <aktoon@gmail.com>
Published: 2019-11-26T23:53:06+00:00
Commit: urn:sha1:7b61ae68ecd7a127e69c9e0d2563bddb7eccad7a

Summary: Sample profile loader of AutoFDO tries to replay previous inlining using context sensitive profile.
The replay only repeats inlining if the call site block is hot. As a result it punts inlining of small functions, some of which can be beneficial for size, and which will still be inlined by the CGSCC inliner later. The oscillation between the sample profile loader's inlining and regular CGSCC inlining causes unnecessary loss of context-sensitive profile. It doesn't have much impact on the inline decision itself, but it negatively affects post-inline profile quality, because the CGSCC inliner has to scale counts, which is not as accurate as the original context-sensitive profile, and a bad post-inline profile can misguide code layout. This change added regular Inline Cost calculation for the sample profile loader, so we can inline small functions upfront under the switch -sample-profile-inline-size. In addition, -sample-profile-cold-inline-threshold is added so we can tune the separate size threshold; currently the default is chosen to be the same as the regular inliner's cold call-site threshold.

Reviewers: wmi, davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70750

[AutoFDO] Top-down Inlining for specialization with context-sensitive profile
Updated: 2019-12-06T00:07:01+00:00
Author: Wenlei He <aktoon@gmail.com>
Published: 2019-11-25T07:54:07+00:00
Commit: urn:sha1:532196d811ad4db1e522012c9d20e4a95aae2eb3

Summary: AutoFDO's sample profile loader processes functions in arbitrary source code order, so if I change the order of two functions in source code, the inline decision can change. This also prevented the use of context-sensitive profile to do specialization while inlining. This commit enforces SCC top-down order for the sample profile loader. With this change, we can now do specialization, as illustrated by the added test case: say we have the call paths A->B->C and D->B->C, and we want to inline C into B when the root inliner is B, but not when the root inliner is A or D. This is not possible without enforcing top-down order. E.g.
Once C is inlined into B, A and D can only choose to inline (B->C) as a whole or nothing, but what we want is to inline only B into A and D, not its recursive callee C. If we process functions in top-down order, this is no longer a problem, which is what this commit does. This change is guarded with a new switch "-sample-profile-top-down-load" for tuning, and it depends on D70653. Eventually, top-down can be the default order for the sample profile loader.

Reviewers: wmi, davidxl
Subscribers: hiraditya, llvm-commits, tejohnson
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70655

[AutoFDO] Properly merge context-sensitive profile of inlinee back to outlined function
Updated: 2019-12-05T23:57:55+00:00
Author: Wenlei He <aktoon@gmail.com>
Published: 2019-11-25T07:31:02+00:00
Commit: urn:sha1:e503fd85d3ac9d3e1493a7a63bc43c6939e132cc

Summary: When the sample profile loader decides not to inline a previously inlined call-site, we adjust the profile of the outlined function simply by scaling up its profile counts by the call-site count. This means the context-sensitive profile of that inlined instance will be thrown away. This commit tries to keep the context-sensitive profile for such cases:

- Instead of scaling the outlined function's profile, we now properly merge the FunctionSamples of the inlined instance into the outlined function, including all recursively inlined profile.
- Instead of adjusting the profile for a negative inline decision at the end of the sample profile loader pass, we do the profile merge right after processing each function. This change paired with top-down ordering of annotation/inline-replay (a separate diff) will make sure we recursively merge profile back before the profile is used for annotation and inline replay.

A new switch -sample-profile-merge-inlinee is added to enable the new profile merge for tuning. It should be the default behavior eventually.
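The recursive merge described in the summary above can be illustrated with a small Python sketch. The Samples class and the merge helper are hypothetical stand-ins invented for this illustration; they are not LLVM's FunctionSamples API, and the counts are made up.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Samples:
    """Toy function profile (hypothetical; not LLVM's FunctionSamples)."""
    total: int = 0
    # line offset -> sample count
    body: Dict[int, int] = field(default_factory=dict)
    # call-site offset -> callee name -> profile of the inlined instance
    callsites: Dict[int, Dict[str, "Samples"]] = field(default_factory=dict)


def merge(dst: Samples, src: Samples) -> None:
    """Add src into dst, recursing into nested inlined instances."""
    dst.total += src.total
    for offset, count in src.body.items():
        dst.body[offset] = dst.body.get(offset, 0) + count
    for offset, callees in src.callsites.items():
        for name, inlined in callees.items():
            target = dst.callsites.setdefault(offset, {}).setdefault(name, Samples())
            merge(target, inlined)


# Standalone (outlined) profile of B, and the copy of B that was inlined
# into A in the profiled binary. That copy carries an inlined C at offset 2.
outlined_b = Samples(total=10, body={1: 10})
inlined_b_in_a = Samples(
    total=90,
    body={1: 60},
    callsites={2: {"C": Samples(total=30, body={1: 30})}},
)

# Assume the loader decides not to inline B into A: fold the inlined
# instance back into the outlined profile instead of scaling counts.
merge(outlined_b, inlined_b_in_a)
```

After the merge, the outlined profile of B still records the nested C instance at its call site, which a plain scale-up of B's counts would not preserve.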
Reviewers: wmi, davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70653

Keep import function list for inlinee profile update
Updated: 2019-11-07T02:36:00+00:00
Author: Wenlei He <aktoon@gmail.com>
Published: 2019-11-01T19:57:23+00:00
Commit: urn:sha1:ba1dfae054b4c9a8b11aabd62fd0dcb792366206

Summary: When adjusting function entry counts after inlining, Function::setEntryCount is called without providing an import function list. The side effect is that the previously set import function list is dropped. The import function list is used by ThinLTO to help import hot cross-module callees for LTO inlining, so dropping it during ThinLTO pre-link may adversely affect LTO inlining. The fix is to keep the list while updating entry counts for inlining.

Reviewers: wmi, davidxl, tejohnson
Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69736

[SampleFDO] Add profile remapping support for profile on-demand loading used by ExtBinary format profile
Updated: 2019-10-18T22:35:20+00:00
Author: Wei Mi <wmi@google.com>
Published: 2019-10-18T22:35:20+00:00
Commit: urn:sha1:8c8ec1f6868bd0f96801fabc55ea395d9d171f06

Profile on-demand loading was added for ExtBinary format profile in rL374233, but currently profile on-demand loading doesn't work well with profile remapping. The patch adds the support. Suppose a function in the current module has an outline instance in the profile. The function name in the module is different from the name of the outline instance, but the remapper knows the two names are equivalent. When loading the profile on demand, the outline instance has to be loaded with the remapper's help. At the same time, SampleProfileReaderItaniumRemapper is changed from a proxy of SampleProfileReader to a helper member in SampleProfileReader.
Differential Revision: https://reviews.llvm.org/D68901
llvm-svn: 375295

[SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format
Updated: 2019-10-09T21:36:03+00:00
Author: Wei Mi <wmi@google.com>
Published: 2019-10-09T21:36:03+00:00
Commit: urn:sha1:09dcfe68057082207f47da230c1c2618ce3aadca

Currently for Text, Binary and ExtBinary format profiles, when we compile a module with samplefdo, even if no function in the module shows up in the profile, we have to load all the function profiles from the profile input. That is a waste of compile time. The CompactBinary format profile already supports loading function profiles on demand. In this patch, we add support for loading profiles on demand for the ExtBinary format. It works whether or not the sections in the ExtBinary format profile are compressed. Experiment shows it reduces the time to compile a server benchmark by 30%. When profile remapping and loading function profiles on demand are both used, extra work needs to be done so that the on-demand loading process takes the name remapping into consideration. It will be addressed in a follow-up patch.

Differential Revision: https://reviews.llvm.org/D68601
llvm-svn: 374233

[SampleFDO] Add compression support for any section in ExtBinary profile format
Updated: 2019-10-07T16:12:37+00:00
Author: Wei Mi <wmi@google.com>
Published: 2019-10-07T16:12:37+00:00
Commit: urn:sha1:b523790ae1b30a1708d2fc7937f90e283330ef33

Previously the ExtBinary profile format only supported zlib compression for the profile symbol list. In this patch, we extend the compression support to any section. Users can select some or all of the sections to compress. In an experiment, for a 45M profile in ExtBinary format, compressing the name table reduced its size to 24M, and compressing all the sections reduced its size to 11M.

Differential Revision: https://reviews.llvm.org/D68253
llvm-svn: 373914
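
The two ideas in the last two SampleFDO entries, an index that lets a reader load only the function profiles a module needs, and optional zlib compression of a section, can be sketched roughly in Python. The layout used here (a one-byte compression flag followed by the payload, plus a name -> (offset, size) table) is an assumption made for illustration and is not the actual ExtBinary format; write_section and load_on_demand are hypothetical helpers.

```python
import zlib
from typing import Dict, Iterable, Tuple

Index = Dict[str, Tuple[int, int]]  # function name -> (offset, size)


def write_section(functions: Dict[str, bytes], compress: bool) -> Tuple[Index, bytes]:
    """Concatenate per-function profiles and record where each one starts."""
    index: Index = {}
    payload = bytearray()
    for name, data in functions.items():
        index[name] = (len(payload), len(data))
        payload += data
    raw = bytes(payload)
    # Hypothetical layout: first byte says whether the rest is zlib-compressed.
    if compress:
        return index, b"\x01" + zlib.compress(raw)
    return index, b"\x00" + raw


def load_on_demand(index: Index, blob: bytes, wanted: Iterable[str]) -> Dict[str, bytes]:
    """Return only the profiles of functions that appear in the module."""
    section = zlib.decompress(blob[1:]) if blob[:1] == b"\x01" else blob[1:]
    loaded = {}
    for name in wanted:
        if name in index:  # functions missing from the profile are skipped
            offset, size = index[name]
            loaded[name] = section[offset:offset + size]
    return loaded


profile = {"foo": b"foo-samples", "bar": b"bar-samples", "baz": b"baz-samples"}
module_functions = ["bar", "qux"]  # "qux" has no profile

index_plain, blob_plain = write_section(profile, compress=False)
index_zlib, blob_zlib = write_section(profile, compress=True)

from_plain = load_on_demand(index_plain, blob_plain, module_functions)
from_zlib = load_on_demand(index_zlib, blob_zlib, module_functions)
```

In this sketch the lookup result is the same with and without compression, mirroring the commit's statement that on-demand loading works either way; only the stored blob differs.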