summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/IPO/SampleProfile.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Import all inlined indirect call targets for SamplePGO.Dehao Chen2017-09-191-3/+5
| | | | | | | | | | | | | | Summary: In the ThinLTO compilation, if a function is inlined in the profiling binary, we need to inline it before annotation. If the callee is not available in the primary module, a first step is needed to import that callee function. For the current implementation, if the call is an indirect call, which has been promoted to >1 targets and inlined, SamplePGO will only import one target with the largest sample count. This patch fixed the bug to import all targets instead. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36637 llvm-svn: 313678
* Handle profile mismatch correctly for SamplePGO.Dehao Chen2017-09-191-1/+6
| | | | | | | | | | | | | | Summary: Fix the bug when promoted call return type mismatches with the promoted function, we should not try to inline it. Otherwise it may lead to compiler crash. Reviewers: davidxl, tejohnson, eraman Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38018 llvm-svn: 313658
* Invoke GetInlineCost for legality check before inline functions in ↵Dehao Chen2017-09-141-6/+37
| | | | | | | | | | | | | | | | SampleProfileLoader. Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: vitalybuka, sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37779 llvm-svn: 313277
* Revert "Invoke GetInlineCost for legality check before inline functions in ↵Vitaly Buka2017-09-141-37/+6
| | | | | | | | | | SampleProfileLoader." Patch introduced uninitialized value. This reverts commit r313195. llvm-svn: 313230
* Invoke GetInlineCost for legality check before inline functions in ↵Dehao Chen2017-09-131-6/+37
| | | | | | | | | | | | | | | | SampleProfileLoader. Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37779 llvm-svn: 313195
* Refactor the code to pass down ACT to SampleProfileLoader correctly.Dehao Chen2017-09-121-13/+23
| | | | | | | | | | | | | | Summary: This change passes down ACT to SampleProfileLoader for the new PM. Also remove the default value for SampleProfileLoader class as it is not used. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37773 llvm-svn: 313080
* [OptDiag] Updating Remarks in SampleProfileEli Friedman2017-08-111-30/+45
| | | | | | | | | | | | | | | | Updating remark API to newer OptimizationDiagnosticInfo API. This allows remarks to show up in diagnostic yaml file, and enables use of opt-viewer tool. Hotness information for remarks (L505 and L751) do not display hotness information, most likely due to profile information not being propagated yet. Unsure if this is the desired outcome. Patch by Tarun Rajendran. Differential Revision: https://reviews.llvm.org/D36127 llvm-svn: 310763
* Guard print() functions only used by dump() functions.Florian Hahn2017-07-311-0/+2
| | | | | | | | | | | | | | | | | | | Summary: Since r293359, most dump() function are only defined when `!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions only used by dump() functions are now unused in release builds, generating lots of warnings. This patch only defines some print() functions if they are used. Reviewers: MatzeB Reviewed By: MatzeB Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D35949 llvm-svn: 309553
* [Dominators] Make IsPostDominator a template parameterJakub Kuderski2017-07-141-4/+7
| | | | | | | | | | | | | | | | | Summary: DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere. This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to incidentally assign a postdominator tree to a dominator tree (and vice versa), and to further simplify template code in GenericDominatorTreeConstruction. Reviewers: dberlin, sanjoy, davide, grosser Reviewed By: dberlin Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35315 llvm-svn: 308040
* Hook the sample PGO machinery in the new PMDehao Chen2017-06-291-1/+2
| | | | | | | | | | | | | | Summary: This patch hooks up SampleProfileLoaderPass with the new PM. Reviewers: chandlerc, davidxl, davide, tejohnson Reviewed By: chandlerc, tejohnson Subscribers: tejohnson, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D34720 llvm-svn: 306763
* Do not inline recursive direct calls in sample loader pass.Dehao Chen2017-06-211-0/+3
| | | | | | | | | | | | | | Summary: r305009 disables recursive inlining for indirect calls in sample loader pass. The same logic applies to direct recursive calls. Reviewers: iteratee, davidxl Reviewed By: iteratee Subscribers: sanjoy, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D34456 llvm-svn: 305934
* Do not early-inline recursive calls in sample profile loader.Dehao Chen2017-06-081-0/+7
| | | | | | | | | | | | | | Summary: Early-inlining of recursive call makes the code size bloat exponentially. We should not disable it. Reviewers: davidxl, dnovillo, iteratee Reviewed By: iteratee Subscribers: iteratee, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D34017 llvm-svn: 305009
* [SampleProfile] Don't assert when printing the DebugLoc of a branch. NFC.Andrea Di Biagio2017-04-181-2/+4
| | | | llvm-svn: 300544
* [SampleProfile] Skip intrinsic calls when visiting callsites in ↵Andrea Di Biagio2017-04-181-1/+1
| | | | | | | | | | | | | | InlineHotFunctions. Before this patch, we always called method 'findCalleeFunctionSamples()' on intrinsic calls. However, intrinsic calls like llvm.dbg.value() are not viable candidates for obvious reasons. No functional change intended. Differential Revision: https://reviews.llvm.org/D32008 llvm-svn: 300541
* Build SymbolMap in SampleProfileLoader to help matchin function names with ↵Dehao Chen2017-04-171-1/+31
| | | | | | | | | | | | | | | | suffix. Summary: If there is suffix added in the function name (e.g. module hash added by thinLTO), we will not be able to find a match in profile as the suffix does not exist in profile. This patch build a map from function name to Function *. The map includes the entry for the stripped function name so that inlineHotFunctions can find the corresponding function to promote/inline. Reviewers: davidxl, dnovillo, tejohnson Reviewed By: davidxl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D31952 llvm-svn: 300507
* SamplePGO: convert callsite samples map key from callsite_location to ↵Dehao Chen2017-04-131-39/+88
| | | | | | | | | | | | | | | | callsite_location+callee_name Summary: For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and its inlined instance. This patch adds callee_name to the key of the callsite sample map. And added helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote correct targets and inline it before annotation, and ensures all indirect call targets to be annotated correctly. Reviewers: davidxl, dnovillo Reviewed By: davidxl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D31950 llvm-svn: 300240
* Emit less compiler optimization remarks in samplepgo to reduce a call to ↵Dehao Chen2017-04-101-3/+1
| | | | | | | | | | | | | | | | findCalleeFunctionSamples which is going to be refactored. Summary: Now the SamplePGO support is more stable, we do not need so many verbose optimization remarks emitted. Reviewers: dnovillo, davidxl Reviewed By: davidxl Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D31826 llvm-svn: 299883
* Do not set branch weight if the branch weight annotation is present.Dehao Chen2017-03-231-1/+5
| | | | | | | | | | | | | | Summary: ThinLTO will annotate the CFG twice. If the branch weight is set by the first annotation, we should not set the branch weight again in the second annotation because the first annotation is more accurate as there is less optimization that could affect debug info accuracy. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: mehdi_amini, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D31228 llvm-svn: 298602
* SamplePGO ThinLTO ICP fix for local functions.Dehao Chen2017-03-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Summary: In SamplePGO, if the profile is collected from non-LTO binary, and used to drive ThinLTO, the indirect call promotion may fail because ThinLTO adjusts local function names to avoid conflicts. There are two places of where the mismatch can happen: 1. thin-link prepends SourceFileName to front of FuncName to build the GUID (GlobalValue::getGlobalIdentifier). Unlike instrumentation FDO, SamplePGO does not use the PGOFuncName scheme and therefore the indirect call target profile data contains a hash of the OriginalName. 2. backend compiler promotes some local functions to global and appends .llvm.{$ModuleHash} to the end of the FuncName to derive PromotedFunctionName This patch tries at the best effort to find the GUID from the original local function name (in profile), and use that in ICP promotion, and in SamplePGO matching that happens in the backend after importing/inlining: 1. in thin-link, it builds the map from OriginalName to GUID so that when thin-link reads in indirect call target profile (represented by OriginalName), it knows which GUID to import. 2. in backend compiler, if sample profile reader cannot find a profile match for PromotedFunctionName, it will try to find if there is a match for OriginalFunctionName. 3. in backend compiler, we build symbol table entry for OriginalFunctionName and pointer to the same symbol of PromotedFunctionName, so that ICP can find the correct target to promote. Reviewers: mehdi_amini, tejohnson Reviewed By: tejohnson Subscribers: llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D30754 llvm-svn: 297757
* Remove the sample pgo annotation heuristic that uses call count to annotate ↵Dehao Chen2017-03-061-5/+3
| | | | | | | | | | | | | | | | basic block count. Summary: We do not need that special handling because the debug info is more accurate now. Performance testing shows no regression on google internal benchmarks. Reviewers: davidxl, aprantl Reviewed By: aprantl Subscribers: llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D30658 llvm-svn: 297038
* Add function importing info from samplepgo profile to the module summary.Dehao Chen2017-02-281-8/+19
| | | | | | | | | | | | | | Summary: For SamplePGO, the profile may contain cross-module inline stacks. As we need to make sure the profile annotation happens when all the hot inline stacks are expanded, we need to pass this info to the module importer so that it can import proper functions if necessary. This patch implemented this feature by emitting cross-module targets as part of function entry metadata. In the module-summary phase, the metadata is used to build call edges that points to functions need to be imported. Reviewers: mehdi_amini, tejohnson Reviewed By: tejohnson Subscribers: davidxl, llvm-commits Differential Revision: https://reviews.llvm.org/D30053 llvm-svn: 296498
* Add call branch annotation for ICP promoted direct call in SamplePGO mode.Dehao Chen2017-02-231-1/+1
| | | | | | | | | | | | | | Summary: SamplePGO uses branch_weight annotation to represent callsite hotness. When ICP promotes an indirect call to direct call, we need to make sure the direct call is annotated with branch_weight in SamplePGO mode, so that downstream function inliner can use hot callsite heuristic. Reviewers: davidxl, eraman, xur Reviewed By: davidxl, xur Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D30282 llvm-svn: 296028
* Use base discriminator in sample pgo profile matching.Dehao Chen2017-02-231-7/+8
| | | | | | | | | | | | | | Summary: The discriminator has been encoded, and only the base discriminator should be used during profile matching. Reviewers: dblaikie, davidxl Reviewed By: dblaikie, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30218 llvm-svn: 295999
* Fix the samplepgo indirect call promotion bug: we should not promote a ↵Dehao Chen2017-02-061-1/+2
| | | | | | | | | | | | | | | | direct call. Summary: Checking CS.getCalledFunction() == nullptr does not necessary indicate indirect call. We also need to check if CS.getCalledValue() is not a constant. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29570 llvm-svn: 294260
* Fix the bug of samplepgo indirect call promption when type casting of the ↵Dehao Chen2017-02-061-1/+3
| | | | | | | | | | | | | | | | return value is needed. Summary: When type casting of the return value is needed, promoteIndirectCall will return the type casting instruction instead of the direct call. This patch changed to return the direct call instruction instead. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29569 llvm-svn: 294205
* Refactor SampleProfile.cpp to make it cleaner. (NFC)Dehao Chen2017-02-051-32/+14
| | | | llvm-svn: 294118
* Explicitly promote indirect calls before sample profile annotation.Dehao Chen2017-01-311-5/+24
| | | | | | | | | | | | | | Summary: In iterative sample pgo where profile is collected from PGOed binary, we may see indirect call targets promoted and inlined in the profile. Before profile annotation, we need to make this happen in order to annotate correctly on IR. This patch explicitly promotes these indirect calls and inlines them before profile annotation. Reviewers: xur, davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29040 llvm-svn: 293657
* Revert r292979 which causes compile time failure.Dehao Chen2017-01-301-19/+5
| | | | llvm-svn: 293557
* Explicitly promote indirect calls before sample profile annotation.Dehao Chen2017-01-241-5/+19
| | | | | | | | | | | | | | Summary: In iterative sample pgo where profile is collected from PGOed binary, we may see indirect call targets promoted and inlined in the profile. Before profile annotation, we need to make this happen in order to annotate correctly on IR. This patch explicitly promotes these indirect calls and inlines them before profile annotation. Reviewers: xur, davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29040 llvm-svn: 292979
* Revert "Refactor SampleProfile.cpp to move computation inside a branch. (NFC)"Evgeniy Stepanov2017-01-231-2/+2
| | | | | | Causes MSan failures on the buildbot. llvm-svn: 292840
* Refactor SampleProfile.cpp to move computation inside a branch. (NFC)Dehao Chen2017-01-231-2/+2
| | | | llvm-svn: 292803
* Add indirect call promotion to SamplePGODehao Chen2017-01-201-7/+49
| | | | | | | | | | | | | | Summary: This patch adds metadata for indirect call promotion in the sample profile loader. Reviewers: xur, davidxl, dnovillo Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28923 llvm-svn: 292672
* clang-format SampleProfile.cpp (NFC)Dehao Chen2017-01-191-6/+4
| | | | llvm-svn: 292533
* Revert @llvm.assume with operator bundles (r289755-r289757)Daniel Jasper2016-12-191-5/+21
| | | | | | | This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086
* Remove the AssumptionCacheHal Finkel2016-12-151-21/+5
| | | | | | | | | After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756
* Only sets profile summary when it was not preset.Dehao Chen2016-12-141-1/+2
| | | | | | | | | | | | Summary: SampleProfileLoader pass may be invoked twice by LTO. The 2nd pass should not append more summary info as it is already preset by the 1st pass. Reviewers: eraman, davidxl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D27733 llvm-svn: 289725
* Fix the bug in r289714 (NFC).Dehao Chen2016-12-141-1/+1
| | | | llvm-svn: 289724
* Change CoverageTracker from a global variable to member variable to avoid ↵Dehao Chen2016-12-131-52/+52
| | | | | | breaking thread-safety. (NFC) llvm-svn: 289603
* Use CallSite to simplify codeDavid Blaikie2016-11-291-5/+3
| | | | llvm-svn: 288192
* Before sample pgo annotation, do not inline a function that has no debug ↵Dehao Chen2016-11-221-0/+2
| | | | | | | | info. (NFC) If there is no debug info in the callee, inlining it will not help annotator. This avoids infinite loop as reported in PR/31119. llvm-svn: 287710
* Use StringRef in Pass/PassManager APIs (NFC)Mehdi Amini2016-10-011-1/+1
| | | | llvm-svn: 283004
* Change the basic block weight calculation algorithm to use max instead of ↵Dehao Chen2016-09-211-14/+6
| | | | | | | | | | | | | | voting. Summary: Now that we have more precise debug info, we should change back to use maximum to get basic block weight. Reviewers: dnovillo Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D24788 llvm-svn: 282084
* Handle early inline for hot callsites that reside in the same basic block.Dehao Chen2016-09-191-2/+7
| | | | | | | | | | | | Summary: Callsites in the same basic block should share the same hotness. This patch checks for the hottest callsite in the same basic block, and use the hotness for all callsites in that basic block for early inline decisions. It also fixes the test to add "-S" so theat the "CHECK-NOT" is actually checking the content. Reviewers: dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24734 llvm-svn: 281927
* Only set branch weight during sample pgo annotation when max_weight of the ↵Dehao Chen2016-09-191-11/+15
| | | | | | | | | | | | | | branch is non-zero. Otherwise use default static profile to set branch probability. Summary: It does not make sense to set equal weights for all unkown branches as we have static branch prediction available. Reviewers: dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24732 llvm-svn: 281912
* Use call target count to derive the call instruction weightDehao Chen2016-09-191-3/+5
| | | | | | | | | | | | Summary: The call target count profile is directly derived from LBR branch->target data. This is more reliable than instruction frequency profiles that could be moved across basic block boundaries. This patches uses call target count profile to annotate call instructions. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24410 llvm-svn: 281911
* Handle Invoke during sample profiler annotation: make it inlinable.Dehao Chen2016-09-181-23/+32
| | | | | | | | | | | | Summary: Previously we reline on inst-combine to remove inlinable invoke instructions. This causes trouble because a few extra optimizations are schedule early that could introduce too much CFG change (e.g. simplifycfg removes too much control flow). This patch handles invoke instruction in-place during sample profile annotation, so that we do not rely on instcombine to remove those invoke instructions. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24409 llvm-svn: 281870
* Fine tuning of sample profile propagation algorithm.Dehao Chen2016-08-121-31/+100
| | | | | | | | | | | | Summary: The refined propagation algorithm is more accurate and robust. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23224 llvm-svn: 278522
* Consistently use ModuleAnalysisManagerSean Silva2016-08-091-1/+1
| | | | | | | | | | | Besides a general consistently benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278078
* Avoid using a raw AssumptionCacheTracker in various inliner functions.Sean Silva2016-07-231-1/+3
| | | | | | | | | | This unblocks the new PM part of River's patch in https://reviews.llvm.org/D22706 Conveniently, this same change was needed for D21921 and so these changes are just spun out from there. llvm-svn: 276515
* Implement callsite-hotness based inline cost for Sample-based PGODehao Chen2016-07-111-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value. In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 llvm-svn: 275073
OpenPOWER on IntegriCloud