summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis/InlineCost.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Rename isHotFunction/isColdFunction to ↵Dehao Chen2016-10-101-2/+2
| | | | | | | | isFunctionEntryHot/isFunctionEntryCold. (NFC) This is in preparation for https://reviews.llvm.org/D25048 llvm-svn: 283805
* NFC fix doxygen commentsPiotr Padlewski2016-09-301-18/+18
| | | | llvm-svn: 282950
* Fix a thinko in r278189.Easwaran Raman2016-08-291-1/+1
| | | | llvm-svn: 280008
* Make more fields of InlineParams Optional.Easwaran Raman2016-08-111-4/+8
| | | | | | Differential revision: https://reviews.llvm.org/D23386 llvm-svn: 278312
* Changed sign of LastCallToStaticBounsPiotr Padlewski2016-08-101-1/+1
| | | | | | | | | | | | | | | | | Summary: I think it is much better this way. When I firstly saw line: Cost += InlineConstants::LastCallToStaticBonus; I though that this is a bug, because everywhere where the cost is being reduced it is usuing -=. Reviewers: eraman, tejohnson, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23222 llvm-svn: 278290
* Do not directly use inline threshold cl options in cost analysis.Easwaran Raman2016-08-101-57/+98
| | | | | | | | This adds an InlineParams struct which is populated from the command line options by getInlineParams and passed to getInlineCost for the call analyzer to use. Differential revision: https://reviews.llvm.org/D22120 llvm-svn: 278189
* Remove cold callsite heuristic that is not necessary because of cold callee ↵Dehao Chen2016-08-051-7/+5
| | | | | | heuristic. llvm-svn: 277863
* Replace hot-callsite based heuristic to use its own threshold parameter ↵Dehao Chen2016-08-051-6/+17
| | | | | | | | | | | | | | instead of share inline-hint parameter Summary: Hot callsites should have higher threshold than inline hints. This patch uses separate threshold parameter for hot callsites. Reviewers: davidxl, eraman Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D22368 llvm-svn: 277860
* Avoid using a raw AssumptionCacheTracker in various inliner functions.Sean Silva2016-07-231-29/+29
| | | | | | | | | | This unblocks the new PM part of River's patch in https://reviews.llvm.org/D22706 Conveniently, this same change was needed for D21921 and so these changes are just spun out from there. llvm-svn: 276515
* Implement callsite-hotness based inline cost for Sample-based PGODehao Chen2016-07-111-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value. In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 llvm-svn: 275073
* Fix size computation of array allocation in inline cost analysisEaswaran Raman2016-06-271-3/+4
| | | | | | Differential revision: http://reviews.llvm.org/D21690 llvm-svn: 273952
* Use ProfileSummaryInfo in inline cost analysis.Easwaran Raman2016-06-091-39/+28
| | | | | | | | Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo. Differential revision: http://reviews.llvm.org/D21045 llvm-svn: 272321
* Allow -inline-threshold to override default threshold.Easwaran Raman2016-05-191-4/+7
| | | | | | | | Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior. Differential Revision: http://reviews.llvm.org/D20452 llvm-svn: 270153
* Revert r269131Easwaran Raman2016-05-101-4/+2
| | | | llvm-svn: 269138
* Reapply r266477 and r266488Easwaran Raman2016-05-101-2/+4
| | | | llvm-svn: 269131
* [Inliner] don't assume that a Constant alloca size is a ConstantInt (PR27277)Sanjay Patel2016-05-091-4/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D20077 llvm-svn: 268980
* [Inliner] Formatting. NFC.Chad Rosier2016-04-281-36/+41
| | | | | | | Patch by Aditya Kumar! Differential Revision: http://reviews.llvm.org/D19047 llvm-svn: 267888
* Introduce llvm.load.relative intrinsic.Peter Collingbourne2016-04-221-0/+5
| | | | | | | | | | | | | | | | | | | This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant. Differential Revision: http://reviews.llvm.org/D18367 llvm-svn: 267223
* Revert "Replace the use of MaxFunctionCount module flag"Eric Liu2016-04-181-4/+2
| | | | | | | | | | This reverts commit r266477. This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData", while "ProfileData" depends on "Object", which depends on "BitCode", which depends on "Analysis". llvm-svn: 266619
* Replace the use of MaxFunctionCount module flagEaswaran Raman2016-04-151-2/+4
| | | | | | | | Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count. Differential Revision: http://reviews.llvm.org/D18622 llvm-svn: 266477
* [TTI] Add getInliningThresholdMultiplier.Justin Lebar2016-04-151-0/+4
| | | | | | | | | | | | | | | | Summary: InlineCost's threshold is multiplied by this value. This lets us adjust the inlining threshold up or down on a per-target basis. For example, we might want to increase the threshold on targets where calls are unusually expensive. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18560 llvm-svn: 266405
* Return immediately from analyzeCall if analyzeBlock returns false.Easwaran Raman2016-04-131-14/+2
| | | | | | This is part of the patch reviewed at http://reviews.llvm.org/D17584 llvm-svn: 266249
* Refactor Threshold computation. NFC.Easwaran Raman2016-04-081-22/+35
| | | | | | This is part of changes reviewed in http://reviews.llvm.org/D17584. llvm-svn: 265852
* Don't IPO over functions that can be de-refinedSanjoy Das2016-04-081-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762
* Revert revisions 262636, 262643, 262679, and 262682.Easwaran Raman2016-03-081-86/+16
| | | | llvm-svn: 262883
* Fix a memory leak.Easwaran Raman2016-03-041-1/+4
| | | | llvm-svn: 262682
* Fix breakage caused by r262636.Easwaran Raman2016-03-031-1/+1
| | | | | | Use LLVM_ATTRIBUTE_UNUSED instead of __attribute_((unused)) llvm-svn: 262643
* Infrastructure for PGO enhancements in inlinerEaswaran Raman2016-03-031-16/+83
| | | | | | | | | | | | This patch provides the following infrastructure for PGO enhancements in inliner: Enable the use of block level profile information in inliner Incremental update of block frequency information during inlining Update the function entry counts of callees when they get inlined into callers. Differential Revision: http://reviews.llvm.org/D16381 llvm-svn: 262636
* CallAnalyzer::analyzeCall: change the condition back to "Cost < Threshold"Hans Wennborg2016-02-051-1/+1
| | | | | | | | In r252595, I inadvertently changed the condition to "Cost <= Threshold", which caused a significant size regression in Chrome. This commit rectifies that. llvm-svn: 259915
* Avoid inlining call sites in unreachable-terminated blockJun Bum Lim2016-02-011-6/+17
| | | | | | | | | | | | | Summary: If the normal destination of the invoke or the parent block of the call site is unreachable-terminated, there is little point in inlining the call site unless there is literally zero cost. Unlike my previous change (D15289), this change specifically handle the call sites followed by unreachable in the same basic block for call or in the normal destination for the invoke. This change could be a reasonable first step to conservatively inline call sites leading to an unreachable-terminated block while BFI / BPI is not yet available in inliner. Reviewers: manmanren, majnemer, hfinkel, davidxl, mcrosier, dblaikie, eraman Subscribers: dblaikie, davidxl, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16616 llvm-svn: 259403
* Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith ↵Yaron Keren2016-01-291-1/+1
| | | | | | | | r259192 post commit comment. clang part in r259232, this is the LLVM part of the patch. llvm-svn: 259240
* Lower inlining threshold when the caller has minsize attribute.Easwaran Raman2016-01-281-8/+8
| | | | | | | | | | | | When the caller has optsize attribute, we reduce the inlinining threshold to OptSizeThreshold (=75) if it is not already lower than that. We don't do the same for minsize and I suspect it was not intentional. This also addresses a FIXME regarding checking optsize attribute explicitly instead of using the right wrapper. Differential Revision: http://reviews.llvm.org/D16493 llvm-svn: 259120
* Change ConstantFoldInstOperands to take Instruction instead of opcode and ↵Manuel Jacob2016-01-211-2/+1
| | | | | | | | | | | | | | | | | | | | | type. NFC. Summary: The previous form, taking opcode and type, is moved to an internal helper and the new form, taking an instruction, is a wrapper around this helper. Although this is a slight cleanup on its own, the main motivation is to refactor the constant folding API to ease migration to opaque pointers. This will be follow-up work. Reviewers: eddyb Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D16383 llvm-svn: 258391
* Refactor threshold computation for inline cost analysisEaswaran Raman2016-01-141-4/+106
| | | | | | Differential Revision: http://reviews.llvm.org/D15401 llvm-svn: 257832
* Refactor inline costs analysis by removing the InlineCostAnalysis classEaswaran Raman2015-12-281-36/+12
| | | | | | | | | | InlineCostAnalysis is an analysis pass without any need for it to be one. Once it stops being an analysis pass, it doesn't maintain any useful state and the member functions inside can be made free functions. NFC. Differential Revision: http://reviews.llvm.org/D15701 llvm-svn: 256521
* Provide a way to specify inliner's attribute compatibility and merging.Akira Hatanaka2015-12-221-3/+1
| | | | | | | | | | | | | | | | | | | | | This reapplies r256277 with two changes: - In emitFnAttrCompatCheck, change FuncName's type to std::string to fix a use-after-free bug. - Remove an unnecessary install-local target in lib/IR/Makefile. Original commit message for r252949: Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 256304
* Revert r256277 and r256279.Akira Hatanaka2015-12-221-1/+3
| | | | | | Some of the bots failed again. llvm-svn: 256280
* Provide a way to specify inliner's attribute compatibility and merging.Akira Hatanaka2015-12-221-3/+1
| | | | | | | | | | | | | | | | | | This reapplies r252990 and r252949. I've added member function getKind to the Attr classes which returns the enum or string of the attribute. Original commit message for r252949: Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 256277
* Use updated threshold for indirect call bonusEaswaran Raman2015-12-071-2/+2
| | | | | | | | When considering foo->bar inlining, if there is an indirect call in foo which gets resolved to a direct call (say baz), then we try to inline baz into bar with a threshold T and subtract max(T - Cost(bar->baz), 0) from Cost(foo->bar). This patch uses max(Threshold(bar->baz) - Cost(bar->baz)) instead, where Thresheld(bar->baz) could be different from T due to bonuses or subtractions. Threshold(bar->baz) - Cost(bar->baz) better represents the desirability of inlining baz into bar. Differential Revision: http://reviews.llvm.org/D14309 llvm-svn: 254945
* Test commit.Easwaran Raman2015-12-031-2/+2
| | | | | | Remove blank spaces at the end of comments llvm-svn: 254630
* Revert r252990.Akira Hatanaka2015-11-131-1/+10
| | | | | | Some of the buildbots are still failing. llvm-svn: 252999
* Provide a way to specify inliner's attribute compatibility and merging.Akira Hatanaka2015-11-131-10/+1
| | | | | | | | | | | | | | | | | | This reapplies r252949. I've changed the type of FuncName to be std::string instead of StringRef in emitFnAttrCompatCheck. Original commit message for r252949: Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 252990
* Revert r252949.Akira Hatanaka2015-11-121-1/+10
| | | | | | It broke some of the bots including clang-x64-ninja-win7. llvm-svn: 252951
* Provide a way to specify inliner's attribute compatibility and mergingAkira Hatanaka2015-11-121-10/+1
| | | | | | | | | | | | rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 252949
* Inliner: Do zero-cost inlines even if above a negative threshold (PR24851)Hans Wennborg2015-11-101-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D14499 llvm-svn: 252595
* Analysis: Remove implicit ilist iterator conversionsDuncan P. N. Exon Smith2015-10-101-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove implicit ilist iterator conversions from LLVMAnalysis. I came across something really scary in `llvm::isKnownNotFullPoison()` which relied on `Instruction::getNextNode()` being completely broken (not surprising, but scary nevertheless). This function is documented (and coded to) return `nullptr` when it gets to the sentinel, but with an `ilist_half_node` as a sentinel, the sentinel check looks into some other memory and we don't recognize we've hit the end. Rooting out these scary cases is the reason I'm removing the implicit conversions before doing anything else with `ilist`; I'm not at all surprised that clients rely on badness. I found another scary case -- this time, not relying on badness, just bad (but I guess getting lucky so far) -- in `ObjectSizeOffsetEvaluator::compute_()`. Here, we save out the insertion point, do some things, and then restore it. Previously, we let the iterator auto-convert to `Instruction*`, and then set it back using the `Instruction*` version: Instruction *PrevInsertPoint = Builder.GetInsertPoint(); /* Logic that may change insert point */ if (PrevInsertPoint) Builder.SetInsertPoint(PrevInsertPoint); The check for `PrevInsertPoint` doesn't protect correctly against bad accesses. If the insertion point has been set to the end of a basic block (i.e., `SetInsertPoint(SomeBB)`), then `GetInsertPoint()` returns an iterator pointing at the list sentinel. The version of `SetInsertPoint()` that's getting called will then call `PrevInsertPoint->getParent()`, which explodes horribly. The only reason this hasn't blown up is that it's fairly unlikely the builder is adding to the end of the block; usually, we're adding instructions somewhere before the terminator. llvm-svn: 249925
* 80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; ↵Sanjay Patel2015-09-151-4/+5
| | | | | | NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC llvm-svn: 247700
* [WinEH] Require token linkage in EH pad/ret signaturesJoseph Tremoulet2015-08-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: WinEHPrepare is going to require that cleanuppad and catchpad produce values of token type which are consumed by any cleanupret or catchret exiting the pad. This change updates the signatures of those operators to require/enforce that the type produced by the pads is token type and that the rets have an appropriate argument. The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that restriction, this change adds a notion of an operator constraint to both LLParser and BitcodeReader, allowing appropriate sentinels to be constructed for forward references and appropriate error messages to be emitted for illegal inputs. Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad predecessor must have no other predecessors; this ensures that WinEHPrepare will see the expected linear relationship between sibling catches on the same try. Lastly, remove some superfluous/vestigial casts from instruction operand setters operating on BasicBlocks. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12108 llvm-svn: 245797
* [PM/AA] Remove the last relics of the separate IPA library from LLVM,Chandler Carruth2015-08-181-0/+1451
| | | | | | | | | | | | | | | | | | | | | folding the code into the main Analysis library. There already wasn't much of a distinction between Analysis and IPA. A number of the passes in Analysis are actually IPA passes, and there doesn't seem to be any advantage to separating them. Moreover, it makes it hard to have interactions between analyses that are both local and interprocedural. In trying to make the Alias Analysis infrastructure work with the new pass manager, it becomes particularly awkward to navigate this split. I've tried to find all the places where we referenced this, but I may have missed some. I have also adjusted the C API to continue to be equivalently functional after this change. Differential Revision: http://reviews.llvm.org/D12075 llvm-svn: 245318
* Sink InlineCost.cpp into IPA -- it is now officially an interproceduralChandler Carruth2013-01-211-1237/+0
| | | | | | | | | | analysis. How cute that it wasn't previously. ;] Part of this confusion stems from the flattened header file tree. Thanks to Benjamin for pointing out the goof on IRC, and we're considering un-flattening the headers, so speak now if that would bug you. llvm-svn: 173033
OpenPOWER on IntegriCloud