summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis/InlineCost.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Tidy checking for the soft float attribute.Eric Christopher2017-04-151-10/+1
| | | | llvm-svn: 300394
* Cache the DataLayout rather than looking it up frequently.Eric Christopher2017-04-151-20/+14
| | | | llvm-svn: 300393
* [IR] Make paramHasAttr to use arg indices instead of attr indicesReid Kleckner2017-04-141-2/+1
| | | | | | | | | This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern. Previously we were testing return value attributes with index 0, so I introduced hasReturnAttr() for that use case. llvm-svn: 300367
* [IR] Redesign the case iterator in SwitchInst to actually be an iteratorChandler Carruth2017-04-121-3/+3
| | | | | | | | | | | | | | | | and to expose a handle to represent the actual case rather than having the iterator return a reference to itself. All of this allows the iterator to be used with common STL facilities, standard algorithms, etc. Doing this exposed some missing facilities in the iterator facade that I've fixed and required some work to the actual iterator to fully support the necessary API. Differential Revision: https://reviews.llvm.org/D31548 llvm-svn: 300032
* Remove unused functions. Remove static qualifier from functions in header ↵Vassil Vassilev2017-04-111-7/+0
| | | | | | files. NFC. llvm-svn: 299947
* Do not inline hot callsites for samplepgo in thinlto compile phase.Dehao Chen2017-03-211-1/+1
| | | | | | | | | | | | | | Summary: Because SamplePGO passes will be invoked twice in ThinLTO build: once at compile phase, the other at backend. We want to make sure the IR at the 2nd phase matches the hot part in profile, thus we do not want to inline hot callsites in the first phase. Reviewers: tejohnson, eraman Reviewed By: tejohnson Subscribers: mehdi_amini, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D31201 llvm-svn: 298428
* [InlineCost] Move the code in isGEPOffsetConstant to a lambda.Easwaran Raman2017-02-251-13/+9
| | | | | | Differential revision: https://reviews.llvm.org/D30112 llvm-svn: 296208
* Refactor instruction simplification code in visitors. NFC.Easwaran Raman2017-02-181-88/+67
| | | | | | | | | | | | | | Several visitors check if operands to the instruction are constants, either as it is or after looking up SimplifiedValues, check if the result is a constant and update the SimplifiedValues map. This refactoring splits it into a common function that does the checking of whether the operands are constants and updating of the SimplifiedValues table, and an instruction specific part that is implemented by each instruction visitor as a lambda and passed to the common function. Differential revision: https://reviews.llvm.org/D30104 llvm-svn: 295552
* Improve PGO support for the new inlinerEaswaran Raman2017-01-201-17/+40
| | | | | | | | | | | | | | | | | | | | | | | | This adds the following to the new PM based inliner in PGO mode: * Use block frequency analysis to derive callsite's profile count and use that to adjust thresholds of hot and cold callsites. * Incrementally update the BFI of the caller after a callee gets inlined into it. This incremental update is only within an invocation of the run method - BFI is not preserved across calls to run. Update the function entry count of the callee after inlining it into a caller. * I've tuned the thresholds for the hot and cold callsites using a hacked up version of the old inliner that explicitly computes BFI on a set of internal benchmarks and spec. Once the new PM based pipeline stabilizes (IIRC Chandler mentioned there are known issues) I'll benchmark this again and adjust the thresholds if required. Inliner PGO support. Differential revision: https://reviews.llvm.org/D28331 llvm-svn: 292666
* Recommit "[InlineCost] Use TTI to check if GEP is free." #3Haicheng Wu2017-01-201-2/+18
| | | | | | | | | | | | This is the third attemp to recommit r292526. The original summary: Currently, a GEP is considered free only if its indices are all constant. TTI::getGEPCost() can give target-specific more accurate analysis. TTI is already used for the cost of many other instructions. llvm-svn: 292633
* Revert "Recommit "[InlineCost] Use TTI to check if GEP is free." #2"Haicheng Wu2017-01-201-18/+2
| | | | | | This reverts commit r292616 because the test case still has problem. llvm-svn: 292618
* Recommit "[InlineCost] Use TTI to check if GEP is free." #2Haicheng Wu2017-01-201-2/+18
| | | | | | | | | | | | This is the second attemp to recommit r292526. The original summary: Currently, a GEP is considered free only if its indices are all constant. TTI::getGEPCost() can give target-specific more accurate analysis. TTI is already used for the cost of many other instructions. llvm-svn: 292616
* Revert "Recommit "[InlineCost] Use TTI to check if GEP is free.""Haicheng Wu2017-01-201-18/+2
| | | | | | This reverts commit r292570. The test still has problem. llvm-svn: 292572
* Recommit "[InlineCost] Use TTI to check if GEP is free."Haicheng Wu2017-01-201-2/+18
| | | | | | | | | | | | This recommits r292526 which is reverted in r292529 after fixing the test case. The original summary: Currently, a GEP is considered free only if its indices are all constant. TTI::getGEPCost() can give target-specific more accurate analysis. TTI is already used for the cost of many other instructions. llvm-svn: 292570
* Revert "[InlineCost] Use TTI to check if GEP is free."Haicheng Wu2017-01-191-18/+2
| | | | | | This reverts commit r292526. The test case has problem. llvm-svn: 292529
* [InlineCost] Use TTI to check if GEP is free.Haicheng Wu2017-01-191-2/+18
| | | | | | | | | | Currently, a GEP is considered free only if its indices are all constant. TTI::getGEPCost() can give target-specific more accurate analysis. TTI is already used for the cost of many other instructions. Differential Revision: https://reviews.llvm.org/D28693 llvm-svn: 292526
* Refactor inline threshold update code.Easwaran Raman2017-01-091-22/+19
| | | | | | | | | | Functional change: Previously, if a callee is cold, we used ColdThreshold if it minimizes the existing threshold. This was irrespective of whether we were optimizing for minsize (-Oz) or not. But -Oz uses very low threshold to begin with and the inlining with -Oz is expected to be tuned for lowering code size, so there is no good reason to set an even lower threshold for cold callees. We now lower the threshold for cold callees only when -Oz is not used. For default values of -inlinethreshold and -inlinecold-threshold, this change has no effect and this simplifies the code. NFC changes: Group all threshold updates that are guarded by !Caller->optForMinSize() and within that group threshold updates that require profile summary info. Differential revision: https://reviews.llvm.org/D28369 llvm-svn: 291487
* [PM] Provide an initial, minimal port of the inliner to the new pass manager.Chandler Carruth2016-12-201-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This doesn't implement *every* feature of the existing inliner, but tries to implement the most important ones for building a functional optimization pipeline and beginning to sort out bugs, regressions, and other problems. Notable, but intentional omissions: - No alloca merging support. Why? Because it isn't clear we want to do this at all. Active discussion and investigation is going on to remove it, so for simplicity I omitted it. - No support for trying to iterate on "internally" devirtualized calls. Why? Because it adds what I suspect is inappropriate coupling for little or no benefit. We will have an outer iteration system that tracks devirtualization including that from function passes and iterates already. We should improve that rather than approximate it here. - Optimization remarks. Why? Purely to make the patch smaller, no other reason at all. The last one I'll probably work on almost immediately. But I wanted to skip it in the initial patch to try to focus the change as much as possible as there is already a lot of code moving around and both of these *could* be skipped without really disrupting the core logic. A summary of the different things happening here: 1) Adding the usual new PM class and rigging. 2) Fixing minor underlying assumptions in the inline cost analysis or inline logic that don't generally hold in the new PM world. 3) Adding the core pass logic which is in essence a loop over the calls in the nodes in the call graph. This is a bit duplicated from the old inliner, but only a handful of lines could realistically be shared. (I tried at first, and it really didn't help anything.) All told, this is only about 100 lines of code, and most of that is the mechanics of wiring up analyses from the new PM world. 4) Updating the LazyCallGraph (in the new PM) based on the *newly inlined* calls and references. This is very minimal because we cannot form cycles. 5) When inlining removes the last use of a function, eagerly nuking the body of the function so that any "one use remaining" inline cost heuristics are immediately refined, and queuing these functions to be completely deleted once inlining is complete and the call graph updated to reflect that they have become dead. 6) After all the inlining for a particular function, updating the LazyCallGraph and the CGSCC pass manager to reflect the function-local simplifications that are done immediately and internally by the inline utilties. These are the exact same fundamental set of CG updates done by arbitrary function passes. 7) Adding a bunch of test cases to specifically target CGSCC and other subtle aspects in the new PM world. Many thanks to the careful review from Easwaran and Sanjoy and others! Differential Revision: https://reviews.llvm.org/D24226 llvm-svn: 290161
* Revert @llvm.assume with operator bundles (r289755-r289757)Daniel Jasper2016-12-191-16/+25
| | | | | | | This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086
* Remove the AssumptionCacheHal Finkel2016-12-151-25/+16
| | | | | | | | | After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756
* [Analysis] Fix typo in comment. NFCCraig Topper2016-12-091-1/+1
| | | | llvm-svn: 289171
* IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵Peter Collingbourne2016-12-021-1/+1
| | | | | | | | | | | | | type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
* [InlineCost] Remove skew when calculating call costsJames Molloy2016-11-141-1/+3
| | | | | | | | | | | | | | | When calculating the cost of a call instruction we were applying a heuristic penalty as well as the cost of the instruction itself. However, when calculating the benefit from inlining we weren't discounting the equivalent penalty for the call instruction that would be removed! This caused skew in the calculation and meant we wouldn't inline in the following, trivial case: int g() { h(); } int f() { g(); } llvm-svn: 286814
* Rename isHotFunction/isColdFunction to ↵Dehao Chen2016-10-101-2/+2
| | | | | | | | isFunctionEntryHot/isFunctionEntryCold. (NFC) This is in preparation for https://reviews.llvm.org/D25048 llvm-svn: 283805
* NFC fix doxygen commentsPiotr Padlewski2016-09-301-18/+18
| | | | llvm-svn: 282950
* Fix a thinko in r278189.Easwaran Raman2016-08-291-1/+1
| | | | llvm-svn: 280008
* Make more fields of InlineParams Optional.Easwaran Raman2016-08-111-4/+8
| | | | | | Differential revision: https://reviews.llvm.org/D23386 llvm-svn: 278312
* Changed sign of LastCallToStaticBounsPiotr Padlewski2016-08-101-1/+1
| | | | | | | | | | | | | | | | | Summary: I think it is much better this way. When I firstly saw line: Cost += InlineConstants::LastCallToStaticBonus; I though that this is a bug, because everywhere where the cost is being reduced it is usuing -=. Reviewers: eraman, tejohnson, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23222 llvm-svn: 278290
* Do not directly use inline threshold cl options in cost analysis.Easwaran Raman2016-08-101-57/+98
| | | | | | | | This adds an InlineParams struct which is populated from the command line options by getInlineParams and passed to getInlineCost for the call analyzer to use. Differential revision: https://reviews.llvm.org/D22120 llvm-svn: 278189
* Remove cold callsite heuristic that is not necessary because of cold callee ↵Dehao Chen2016-08-051-7/+5
| | | | | | heuristic. llvm-svn: 277863
* Replace hot-callsite based heuristic to use its own threshold parameter ↵Dehao Chen2016-08-051-6/+17
| | | | | | | | | | | | | | instead of share inline-hint parameter Summary: Hot callsites should have higher threshold than inline hints. This patch uses separate threshold parameter for hot callsites. Reviewers: davidxl, eraman Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D22368 llvm-svn: 277860
* Avoid using a raw AssumptionCacheTracker in various inliner functions.Sean Silva2016-07-231-29/+29
| | | | | | | | | | This unblocks the new PM part of River's patch in https://reviews.llvm.org/D22706 Conveniently, this same change was needed for D21921 and so these changes are just spun out from there. llvm-svn: 276515
* Implement callsite-hotness based inline cost for Sample-based PGODehao Chen2016-07-111-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value. In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 llvm-svn: 275073
* Fix size computation of array allocation in inline cost analysisEaswaran Raman2016-06-271-3/+4
| | | | | | Differential revision: http://reviews.llvm.org/D21690 llvm-svn: 273952
* Use ProfileSummaryInfo in inline cost analysis.Easwaran Raman2016-06-091-39/+28
| | | | | | | | Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo. Differential revision: http://reviews.llvm.org/D21045 llvm-svn: 272321
* Allow -inline-threshold to override default threshold.Easwaran Raman2016-05-191-4/+7
| | | | | | | | Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior. Differential Revision: http://reviews.llvm.org/D20452 llvm-svn: 270153
* Revert r269131Easwaran Raman2016-05-101-4/+2
| | | | llvm-svn: 269138
* Reapply r266477 and r266488Easwaran Raman2016-05-101-2/+4
| | | | llvm-svn: 269131
* [Inliner] don't assume that a Constant alloca size is a ConstantInt (PR27277)Sanjay Patel2016-05-091-4/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D20077 llvm-svn: 268980
* [Inliner] Formatting. NFC.Chad Rosier2016-04-281-36/+41
| | | | | | | Patch by Aditya Kumar! Differential Revision: http://reviews.llvm.org/D19047 llvm-svn: 267888
* Introduce llvm.load.relative intrinsic.Peter Collingbourne2016-04-221-0/+5
| | | | | | | | | | | | | | | | | | | This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant. Differential Revision: http://reviews.llvm.org/D18367 llvm-svn: 267223
* Revert "Replace the use of MaxFunctionCount module flag"Eric Liu2016-04-181-4/+2
| | | | | | | | | | This reverts commit r266477. This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData", while "ProfileData" depends on "Object", which depends on "BitCode", which depends on "Analysis". llvm-svn: 266619
* Replace the use of MaxFunctionCount module flagEaswaran Raman2016-04-151-2/+4
| | | | | | | | Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count. Differential Revision: http://reviews.llvm.org/D18622 llvm-svn: 266477
* [TTI] Add getInliningThresholdMultiplier.Justin Lebar2016-04-151-0/+4
| | | | | | | | | | | | | | | | Summary: InlineCost's threshold is multiplied by this value. This lets us adjust the inlining threshold up or down on a per-target basis. For example, we might want to increase the threshold on targets where calls are unusually expensive. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18560 llvm-svn: 266405
* Return immediately from analyzeCall if analyzeBlock returns false.Easwaran Raman2016-04-131-14/+2
| | | | | | This is part of the patch reviewed at http://reviews.llvm.org/D17584 llvm-svn: 266249
* Refactor Threshold computation. NFC.Easwaran Raman2016-04-081-22/+35
| | | | | | This is part of changes reviewed in http://reviews.llvm.org/D17584. llvm-svn: 265852
* Don't IPO over functions that can be de-refinedSanjoy Das2016-04-081-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762
* Revert revisions 262636, 262643, 262679, and 262682.Easwaran Raman2016-03-081-86/+16
| | | | llvm-svn: 262883
* Fix a memory leak.Easwaran Raman2016-03-041-1/+4
| | | | llvm-svn: 262682
* Fix breakage caused by r262636.Easwaran Raman2016-03-031-1/+1
| | | | | | Use LLVM_ATTRIBUTE_UNUSED instead of __attribute_((unused)) llvm-svn: 262643
OpenPOWER on IntegriCloud