path: root/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
* Use Argument::hasAttribute and AttributeList::ReturnIndex more (Reid Kleckner, 2017-04-28; 1 file changed, -2/+3)
  This eliminates many extra 'Idx' induction variables in loops over arguments in CodeGen/ and Target/. It also reduces the number of places where we assume that ReturnIndex is 0 and that we should add one to argument numbers to get the corresponding attribute list index. NFC. llvm-svn: 301666
* Prefer addAttr(Attribute::AttrKind) over the AttributeList overload (Reid Kleckner, 2017-04-19; 1 file changed, -28/+12)
  This should simplify the call sites, which typically want to tweak one attribute at a time. It should also avoid creating ephemeral AttributeLists that live forever. llvm-svn: 300718
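  For illustration, the preferred pattern looks roughly like this (a minimal sketch, not code from the patch; the helper name is invented here):
  ```
  // Sketch: mark a single attribute directly via the Attribute::AttrKind
  // overload instead of building a temporary AttributeList for one attribute.
  #include "llvm/IR/Argument.h"
  #include "llvm/IR/Attributes.h"
  using namespace llvm;

  static void markArgNonNull(Argument &A) {
    A.addAttr(Attribute::NonNull); // no ephemeral AttributeList involved
  }
  ```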
* [IR] Make paramHasAttr use arg indices instead of attr indices (Reid Kleckner, 2017-04-14; 1 file changed, -1/+1)
  This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern. Previously we were testing return value attributes with index 0, so I introduced hasReturnAttr() for that use case. llvm-svn: 300367
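  A minimal sketch of the query convention after this change (illustration only; assumes the post-change CallSite API, and the helper name is invented):
  ```
  // Sketch: querying a parameter attribute on a call site.
  // Before this change the query used attribute-list indices, where index 0
  // was the return value and argument N mapped to index N + 1:
  //   CS.paramHasAttr(ArgNo + 1, Attribute::NonNull)
  // After this change the 0-based argument number is used directly.
  #include "llvm/IR/Attributes.h"
  #include "llvm/IR/CallSite.h"
  using namespace llvm;

  static bool callArgIsNonNull(CallSite CS, unsigned ArgNo) {
    return CS.paramHasAttr(ArgNo, Attribute::NonNull);
  }
  ```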
* Rename AttributeSet to AttributeList (Reid Kleckner, 2017-03-21; 1 file changed, -15/+16)
  Summary: This class is a list of AttributeSetNodes corresponding to the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name. Rename AttributeSetImpl to AttributeListImpl to follow suit. It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value. Reviewers: sanjoy, javed.absar, chandlerc, pete. Reviewed By: pete. Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits. Differential Revision: https://reviews.llvm.org/D31102 llvm-svn: 298393
* FunctionAttrs: Factor out a function for querying memory access of a specific copy of a function. NFC. (Peter Collingbourne, 2017-02-14; 1 file changed, -16/+21)
  This will later be used by ThinLTOBitcodeWriter to add copies of readnone functions to the regular LTO module. Differential Revision: https://reviews.llvm.org/D29695 llvm-svn: 295008
* [FunctionAttrs] try to extend nonnull-ness of arguments from a callsite back to its parent function (Sanjay Patel, 2017-02-13; 1 file changed, -0/+53)
  As discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-December/108182.html ...we should be able to propagate 'nonnull' info from a callsite back to its parent. The original motivation for this patch is our botched optimization of "dyn_cast" (PR28430), but this won't solve that problem. The transform is currently disabled by default while we wait for clang to work around potential security problems: http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html Differential Revision: https://reviews.llvm.org/D27855 llvm-svn: 294998
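  The idea can be sketched roughly as follows (a simplified, hedged sketch of the described transform, not the actual implementation: it only scans the entry block, ignores the transform's off-by-default flag, and assumes the later argument-index convention for paramHasAttr):
  ```
  #include "llvm/Analysis/ValueTracking.h"
  #include "llvm/IR/Argument.h"
  #include "llvm/IR/CallSite.h"
  #include "llvm/IR/Function.h"
  using namespace llvm;

  // Sketch: if an argument of F is passed to a call site whose corresponding
  // parameter is nonnull, and that call site is guaranteed to execute whenever
  // F is entered, then F's own argument can also be marked nonnull.
  static void propagateNonNullFromCallSites(Function &F) {
    for (Instruction &I : F.getEntryBlock()) {
      CallSite CS(&I);
      if (CS) {
        for (unsigned ArgNo = 0, E = CS.getNumArgOperands(); ArgNo != E;
             ++ArgNo) {
          auto *A = dyn_cast<Argument>(CS.getArgOperand(ArgNo));
          if (A && A->getParent() == &F &&
              CS.paramHasAttr(ArgNo, Attribute::NonNull))
            A->addAttr(Attribute::NonNull);
        }
      }
      // Stop once execution past this instruction is no longer guaranteed
      // (e.g. a call that may throw or not return).
      if (!isGuaranteedToTransferExecutionToSuccessor(&I))
        break;
    }
  }
  ```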
* De-duplicate some code for creating an AARGetter suitable for the legacy PM. (Peter Collingbourne, 2017-02-09; 1 file changed, -14/+2)
  I'm about to use this in a couple more places. Differential Revision: https://reviews.llvm.org/D29793 llvm-svn: 294648
* [PH] Replace uses of AssertingVH from members of analysis results with a lazy-asserting PoisoningVH (Chandler Carruth, 2017-01-24; 1 file changed, -9/+2)
  AssertingVH is fundamentally incompatible with cache-invalidation of analysis results. The invalidation happens after the AssertingVH has already fired. Instead, use a PoisoningVH that will assert if the dangling handle is ever used rather than merely be assigned or destroyed. This patch also removes all of the (numerous) doomed attempts to work around this fundamental incompatibility. It is a pretty significant simplification IMO. The most interesting change is in the Inliner, where we still do some clearing because we don't want to rely on the coarse-grained invalidation strategy of the containing pass manager. However, I prefer the approach that contains this logic to the cleanup phase of the Inliner, and I think we could enhance the CGSCC analysis management layer to make this even better in the future if desired. The rest is straight cleanup. I've also added a test for one of the harder cases to work around: when a *module analysis* contains many AssertingVHes pointing at functions. Differential Revision: https://reviews.llvm.org/D29006 llvm-svn: 292928
* Revert @llvm.assume with operator bundles (r289755-r289757) (Daniel Jasper, 2016-12-19; 1 file changed, -0/+3)
  This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086
* Remove the AssumptionCache (Hal Finkel, 2016-12-15; 1 file changed, -3/+0)
  After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756
* Fix 80-column violations. NFC. (Chad Rosier, 2016-11-07; 1 file changed, -3/+7)
  llvm-svn: 286117
* [FunctionAttrs] Don't try to infer returned if it is already on an argument (David Majnemer, 2016-09-12; 1 file changed, -0/+5)
  Trying to infer the 'returned' attribute if an argument is already 'returned' can lead to verification failure: inference might determine that a different argument is passed through, which would result in two different arguments marked as 'returned'. This fixes PR30350. llvm-svn: 281221
* s/static inline/static/ for headers I have changed in r279475. NFC. (Tim Shen, 2016-08-31; 1 file changed, -5/+3)
  llvm-svn: 280257
* [PM] Introduce basic update capabilities to the new PM's CGSCC pass manager, including both plumbing and logic to handle function pass updates (Chandler Carruth, 2016-08-24; 1 file changed, -2/+4)
  There are three fundamentally tied changes here:
  1) Plumbing *some* mechanism for updating the CGSCC pass manager as the CG changes while passes are running.
  2) Changing the CGSCC pass manager infrastructure to have support for the underlying graph to mutate mid-pass run.
  3) Actually updating the CG after function passes run.
  I can separate them if necessary, but I think it's really useful to have them together as the needs of #3 drove #2, and that in turn drove #1.
  The plumbing technique is to extend the "run" method signature with extra arguments. We provide the call graph that intrinsically is available as it is the basis of the pass manager's IR units, and an output parameter that records the results of updating the call graph during an SCC pass's run. Note that "...UpdateResult" isn't a *great* name here... suggestions very welcome.
  I tried a pretty frustrating number of different data structures and such for the innards of the update result. Every other one failed for one reason or another. Sometimes I just couldn't keep the layers of complexity right in my head. The thing that really worked was to just directly provide access to the underlying structures used to walk the call graph so that their updates could be informed by the *particular* nature of the change to the graph.
  The technique for how to make the pass management infrastructure cope with mutating graphs was also something that took a really, really large number of iterations to get to a place where I was happy. Here are some of the considerations that drove the design:
  - We operate at three levels within the infrastructure: RefSCC, SCC, and Node. In each case, we are working bottom up and so we want to continue to iterate on the "lowest" node as the graph changes. Look at how we iterate over nodes in an SCC running function passes as those function passes mutate the CG. We continue to iterate on the "lowest" SCC, which is the one that continues to contain the function just processed.
  - The call graph structure re-uses SCCs (and RefSCCs) during mutation events for the *highest* entry in the resulting new subgraph, not the lowest. This means that it is necessary to continually update the current SCC or RefSCC as it shifts. This is really surprising and subtle, and took a long time for me to work out. I actually tried changing the call graph to provide the opposite behavior, and it breaks *EVERYTHING*. The graph update algorithms are really deeply tied to this particular pattern.
  - When SCCs or RefSCCs are split apart and refined and we continually re-pin our processing to the bottom one in the subgraph, we need to enqueue the newly formed SCCs and RefSCCs for subsequent processing. Queuing them presents a few challenges:
    1) SCCs and RefSCCs use wildly different iteration strategies at a high level. We end up needing to converge them on worklist approaches that can be extended in order to be able to handle the mutations.
    2) The order of the enqueuing needs to remain bottom-up post-order so that we don't get surprising order of visitation for things like the inliner.
    3) We need the worklists to have set semantics so we don't duplicate things endlessly. We don't need a *persistent* set though because we always keep processing the bottom node!!!! This is super, super surprising to me and took a long time to convince myself this is correct, but I'm pretty sure it is... Once we sink down to the bottom node, we can't re-split out the same node in any way, and the postorder of the current queue is fixed and unchanging.
    4) We need to make sure that the "current" SCC or RefSCC actually gets enqueued here such that we re-visit it because we continue processing a *new*, *bottom* SCC/RefSCC.
  - We also need the ability to *skip* SCCs and RefSCCs that get merged into a larger component. We even need the ability to skip *nodes* from an SCC that are no longer part of that SCC.
  This led to the design you see in the patch which uses SetVector-based worklists. The RefSCC worklist is always empty until an update occurs and is just used to handle those RefSCCs created by updates as the others don't even exist yet and are formed on-demand during the bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and we push new SCCs onto it and blacklist existing SCCs on it to get the desired processing. We then *directly* update these when updating the call graph as I was never able to find a satisfactory abstraction around the update strategy.
  Finally, we need to compute the updates for function passes. This is mostly used as an initial customer of all the update mechanisms to drive their design to at least cover some real set of use cases. There are a bunch of interesting things that came out of doing this:
  - It is really nice to do this a function at a time because that function is likely hot in the cache. This means we want even the function pass adaptor to support online updates to the call graph!
  - To update the call graph after arbitrary function pass mutations is quite hard. We have to build a fairly comprehensive set of data structures and then process them. Fortunately, some of this code is related to the code for building the call graph in the first place. Unfortunately, very little of it makes any sense to share because the nature of what we're doing is so very different. I've factored out the one part that made sense at least.
  - We need to transfer these updates into the various structures for the CGSCC pass manager. Once those were more sanely worked out, this became relatively easier. But some of those needs necessitated changes to the LazyCallGraph interface to make it significantly easier to extract the changed SCCs from an update operation.
  - We also need to update the CGSCC analysis manager as the shape of the graph changes. When an SCC is merged away we need to clear analyses associated with it from the analysis manager, which we didn't have support for in the analysis manager infrastructure. New SCCs are easy! But then we have the case that the original SCC has its shape changed but remains in the call graph. There we need to *invalidate* the analyses associated with it.
  - We also need to invalidate analyses after we *finish* processing an SCC. But the analyses we need to invalidate here are *only those for the newly updated SCC*!!! Because we only continue processing the bottom SCC, if we split SCCs apart the original one gets invalidated once when its shape changes and is not processed farther so its analyses will be correct. It is the bottom SCC which continues being processed and needs to have the "normal" invalidation done based on the preserved analyses set.
  All of this is mostly background and context for the changes here. Many thanks to all the reviewers who helped here. Especially Sanjoy who caught several interesting bugs in the graph algorithms, David, Sean, and others who all helped with feedback. Differential Revision: http://reviews.llvm.org/D21464 llvm-svn: 279618
* [GraphTraits] Replace all NodeType usage with NodeRef (Tim Shen, 2016-08-22; 1 file changed, -9/+4)
  This should finish the GraphTraits migration. Differential Revision: http://reviews.llvm.org/D23730 llvm-svn: 279475
* Replace a few more "fall through" comments with LLVM_FALLTHROUGH (Justin Bogner, 2016-08-17; 1 file changed, -1/+2)
  Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970
* IPO: Swap || operands to avoid dereferencing end() (Duncan P. N. Exon Smith, 2016-08-17; 1 file changed, -2/+2)
  IsOperandBundleUse conveniently indicates whether std::next(F->arg_begin(), UseIndex) will get to (or past) end(). Check it first to avoid dereferencing end(). llvm-svn: 278884
* Consistently use ModuleAnalysisManager (Sean Silva, 2016-08-09; 1 file changed, -1/+1)
  Besides a general consistency benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278078
* Add some comments linking back to PR28400. (Sean Silva, 2016-08-08; 1 file changed, -0/+2)
  Thanks to Mehdi for the suggestion! llvm-svn: 277984
* [PM] Invalidate CallGraphAnalysis because it holds AssertingVH (Sean Silva, 2016-08-08; 1 file changed, -0/+5)
  This is essentially PR28400. The fix here is similar to that implemented in r274656. llvm-svn: 277980
* [ADT] NFC: Generalize GraphTraits requirement of "NodeType *" in interfaces to "NodeRef", and migrate SCCIterator.h to use NodeRef (Tim Shen, 2016-08-01; 1 file changed, -0/+1)
  Summary: By generalizing the interface, users are able to inject more flexible Node tokens into the algorithm, for example, a pair of vector<Node>* and index integer. Currently I only migrated SCCIterator to use NodeRef, but more is coming. It's an NFC. Reviewers: dblaikie, chandlerc. Subscribers: llvm-commits. Differential Revision: https://reviews.llvm.org/D22937 llvm-svn: 277399
* [FunctionAttrs] Correct the safety analysis for inference of 'returned' (David Majnemer, 2016-07-19; 1 file changed, -0/+51)
  We skipped over ReturnInsts which didn't return an argument, which would lead us to incorrectly conclude that an argument returned by another ReturnInst was 'returned'. This reverts commit r275756. This fixes PR28610. llvm-svn: 276008
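  A minimal sketch of the corrected safety rule (illustration only; the real inference also handles details such as casts of the argument): an argument may be inferred 'returned' only if every return in the function returns that same argument.
  ```
  #include "llvm/IR/Function.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Sketch: find the unique argument returned by every ReturnInst, or nullptr.
  // Crucially, a return of something that is *not* an argument blocks the
  // inference instead of being skipped (the bug fixed by this commit).
  static Argument *findUniquelyReturnedArg(Function &F) {
    Argument *Candidate = nullptr;
    for (BasicBlock &BB : F) {
      auto *RI = dyn_cast<ReturnInst>(BB.getTerminator());
      if (!RI)
        continue;
      auto *A = dyn_cast_or_null<Argument>(RI->getReturnValue());
      if (!A)
        return nullptr; // non-argument return value: cannot infer 'returned'
      if (Candidate && Candidate != A)
        return nullptr; // two different arguments are returned
      Candidate = A;
    }
    return Candidate;
  }
  ```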
* Revert r275678, "Revert "Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute"" (NAKAMURA Takumi, 2016-07-18; 1 file changed, -50/+0)
  This also reverts r275029, "Update Clang tests after adding inference for the returned argument attribute". It broke the LTO build; seems to be a miscompilation. llvm-svn: 275756
* Revert "Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute"Hal Finkel2016-07-161-0/+50
| | | | | | | | | | | | | | | | This reverts commit r275042; the initial commit triggered self-hosting failures on ARM/AArch64. James Molloy identified the problematic backend code, which has been disabled in r275677. Trying again... Original commit message: Let FuncAttrs infer the 'returned' argument attribute A function can have one argument with the 'returned' attribute, indicating that the associated argument is always the return value of the function. Add FuncAttrs inference logic. llvm-svn: 275678
* Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute (Hal Finkel, 2016-07-11; 1 file changed, -50/+0)
  Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot. llvm-svn: 275042
* Don't use a SmallSet for returned attribute inference (Hal Finkel, 2016-07-11; 1 file changed, -11/+19)
  Suggested post-commit by David Majnemer on IRC (following up on a pre-commit review comment). llvm-svn: 275033
* Let FuncAttrs infer the 'returned' argument attribute (Hal Finkel, 2016-07-10; 1 file changed, -0/+42)
  A function can have one argument with the 'returned' attribute, indicating that the associated argument is always the return value of the function. Add FuncAttrs inference logic. Differential Revision: http://reviews.llvm.org/D22202 llvm-svn: 275027
* [PM] Some preparatory refactoring to minimize the diff of D21921 (Sean Silva, 2016-07-03; 1 file changed, -14/+20)
  llvm-svn: 274456
* Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC. (Sean Silva, 2016-07-02; 1 file changed, -21/+6)
  This actually uncovered a surprisingly large chain of ultimately unused TLI args. From what I can gather, this argument is a remnant of when isKnownNonNull would look at the TLI directly. The current approach seems to be that InferFunctionAttrs runs early in the pipeline and uses TLI to annotate the TLI-dependent non-null information as return attributes. This also removes the dependence of functionattrs on TLI altogether. llvm-svn: 274455
* Apply clang-tidy's modernize-loop-convert to most of lib/Transforms. (Benjamin Kramer, 2016-06-26; 1 file changed, -16/+11)
  Only minor manual fixes. No functionality change intended. llvm-svn: 273808
* [PM] Port ReversePostOrderFunctionAttrs to the new PM (Sean Silva, 2016-06-12; 1 file changed, -18/+20)
  Below are my super rough notes when porting. They can probably serve as a basic guide for porting other passes to the new PM. As I port more passes I'll expand and generalize this and make a proper docs/HowToPortToNewPassManager.rst document. There is also missing documentation for general concepts and APIs in the new PM which will require some documentation. Once there is proper documentation in place we can put up a list of passes that have to be ported and game-ify/crowdsource the rest of the porting (at least of the middle end; the backend is still unclear). I will however be taking personal responsibility for ensuring that the LLD/ELF LTO pipeline is ported in a timely fashion.
  The remaining passes to be ported are (do something like `git grep "<the string in the bullet point below>"` to find the pass):
  General Scalar:
  [ ] Simplify the CFG
  [ ] Jump Threading
  [ ] MemCpy Optimization
  [ ] Promote Memory to Register
  [ ] MergedLoadStoreMotion
  [ ] Lazy Value Information Analysis
  General IPO:
  [ ] Dead Argument Elimination
  [ ] Deduce function attributes in RPO
  Loop stuff / vectorization stuff:
  [ ] Alignment from assumptions
  [ ] Canonicalize natural loops
  [ ] Delete dead loops
  [ ] Loop Access Analysis
  [ ] Loop Invariant Code Motion
  [ ] Loop Vectorization
  [ ] SLP Vectorizer
  [ ] Unroll loops
  Devirtualization / CFI:
  [ ] Cross-DSO CFI
  [ ] Whole program devirtualization
  [ ] Lower bitset metadata
  CGSCC passes:
  [ ] Function Integration/Inlining
  [ ] Remove unused exception handling info
  [ ] Promote 'by reference' arguments to scalars
  Please let me know if you are interested in working on any of the passes in the above list (e.g. reply to the post-commit thread for this patch). I'll probably be tackling "General Scalar" and "General IPO" first FWIW.
  Steps as I port "Deduce function attributes in RPO" (note: if you are doing any work based on these notes, please leave a note in the post-commit review thread for this commit with any improvements / suggestions / incompleteness you ran into!). Note: "Deduce function attributes in RPO" is a module pass.
  1. Do preparatory refactoring. In this case all I had to do was to pull out a static helper (r272503). (TODO: give more advice here, e.g. if the pass holds state or something.)
  2. Rename the old pass class. In llvm/lib/Transforms/IPO/FunctionAttrs.cpp, rename class ReversePostOrderFunctionAttrs -> ReversePostOrderFunctionAttrsLegacyPass in preparation for adding a class ReversePostOrderFunctionAttrs as the pass in the new PM. (edit: actually wait what? The new class name will be ReversePostOrderFunctionAttrsPass, so it doesn't conflict. So this step is sort of useless churn.) In llvm/include/llvm/InitializePasses.h, llvm/lib/LTO/LTOCodeGenerator.cpp, llvm/lib/Transforms/IPO/IPO.cpp, and llvm/lib/Transforms/IPO/FunctionAttrs.cpp, rename initializeReversePostOrderFunctionAttrsPass -> initializeReversePostOrderFunctionAttrsLegacyPassPass (note that the "PassPass" thing falls out of `s/ReversePostOrderFunctionAttrs/ReversePostOrderFunctionAttrsLegacyPass/`). Note that the INITIALIZE_PASS macro is what creates this identifier name, so renaming the class requires this renaming too. Note that createReversePostOrderFunctionAttrsPass does not need to be renamed since its name is not generated from the class name.
  3. Add the new PM pass class. In the new PM all passes need to have their declaration in a header somewhere, so you will often need to add a header. In this case llvm/include/llvm/Transforms/IPO/FunctionAttrs.h is already there because PostOrderFunctionAttrsPass was already ported. The file-level comment from the .cpp file can be used as the file-level comment for the new header. You may want to tweak the wording slightly from "this file implements" to "this file provides" or similar. Add a declaration for the new PM pass in this header: class ReversePostOrderFunctionAttrsPass : public PassInfoMixin<ReversePostOrderFunctionAttrsPass> { public: PreservedAnalyses run(Module &M, AnalysisManager<Module> &AM); }; Its name should end with `Pass` for consistency (note that this doesn't collide with the names of most old PM passes), e.g. call it `<name of the old PM pass>Pass`. Also, move the doxygen comment from the old PM pass to the declaration of this class in the header. Also, include the declaration for the new PM class `llvm/Transforms/IPO/FunctionAttrs.h` at the top of the file (in this case, it was already done when the other pass in this file was ported). Now define the `run` method for the new class. The main things here are: a) Use AM.getResult<...>(M) to get results instead of `getAnalysis<...>()`. b) If the old PM pass would have returned "false" (i.e. `Changed == false`), then you should return PreservedAnalyses::all(). c) In the old PM getAnalysisUsage method, observe the calls `AU.addPreserved<...>();`. In the case `Changed == true`, for each preserved analysis you should call `PA.preserve<...>()` on a PreservedAnalyses object and return it, e.g.: PreservedAnalyses PA; PA.preserve<CallGraphAnalysis>(); return PA; Note that calls to skipModule/skipFunction are not supported in the new PM currently, so optnone and optimization bisect support do not work. You can just drop those calls for now.
  4. Add the pass to the new PM pass registry to make it available in opt. In llvm/lib/Passes/PassBuilder.cpp add a #include for your header: `#include "llvm/Transforms/IPO/FunctionAttrs.h"`. In this case there is already an include (from when PostOrderFunctionAttrsPass was ported). Add your pass to llvm/lib/Passes/PassRegistry.def. In this case, I added `MODULE_PASS("rpo-functionattrs", ReversePostOrderFunctionAttrsPass())`. The string is from the `INITIALIZE_PASS*` macros used in the old pass manager. Then choose a test that uses the pass and use the new PM `-passes=...` to run it. E.g. in this case there is a test that does: ; RUN: opt < %s -basicaa -functionattrs -rpo-functionattrs -S | FileCheck %s I have added the line: ; RUN: opt < %s -aa-pipeline=basic-aa -passes='require<targetlibinfo>,cgscc(function-attrs),rpo-functionattrs' -S | FileCheck %s The `-aa-pipeline=basic-aa` and `require<targetlibinfo>,cgscc(function-attrs)` are what is needed to run functionattrs in the new PM (note that in the new PM "functionattrs" becomes "function-attrs" for some reason). This is just pulled from `readattrs.ll`, which contains the change from when functionattrs was ported to the new PM. Adding rpo-functionattrs causes the pass that was just ported to run. llvm-svn: 272505
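  Pulling the pieces of these notes together, the new-PM skeleton looks roughly like this (a sketch assembled from the notes above; deduceFunctionAttrsInRPO is a hypothetical stand-in for the pass's real work):
  ```
  #include "llvm/Analysis/CallGraph.h"
  #include "llvm/IR/Module.h"
  #include "llvm/IR/PassManager.h"
  using namespace llvm;

  // Hypothetical stand-in for the existing static helper that does the real
  // reverse-post-order attribute deduction.
  static bool deduceFunctionAttrsInRPO(Module &M) { (void)M; return false; }

  class ReversePostOrderFunctionAttrsPass
      : public PassInfoMixin<ReversePostOrderFunctionAttrsPass> {
  public:
    PreservedAnalyses run(Module &M, AnalysisManager<Module> &AM) {
      (void)AM; // analyses would be fetched with AM.getResult<...>(M)
      if (!deduceFunctionAttrsInRPO(M))
        return PreservedAnalyses::all(); // 'Changed == false' in old-PM terms
      PreservedAnalyses PA;
      PA.preserve<CallGraphAnalysis>(); // mirrors AU.addPreserved<...>()
      return PA;
    }
  };
  ```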
* Factor out a helper. NFC (Sean Silva, 2016-06-12; 1 file changed, -5/+10)
  Prep for porting to new PM. llvm-svn: 272503
* [FunctionAttrs] Volatile loads should disable readonly (David Majnemer, 2016-05-25; 1 file changed, -0/+5)
  A volatile load has side effects beyond what callers expect readonly to signify. For example, it is not safe to reorder two function calls which each perform a volatile load to the same memory location. llvm-svn: 270671
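  A minimal sketch of the kind of check this implies (illustration only, not the pass's actual memory-access scan):
  ```
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Sketch: a volatile load must be treated as having side effects, so it
  // prevents marking the enclosing function readonly (or readnone), even
  // though a plain load would be compatible with readonly.
  static bool loadBlocksReadOnly(const Instruction &I) {
    if (auto *LI = dyn_cast<LoadInst>(&I))
      return LI->isVolatile();
    return false;
  }
  ```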
* ReversePostOrderFunctionAttrs is not modifying the call graph, let's preserve it. (Mehdi Amini, 2016-05-02; 1 file changed, -0/+1)
  When running cc1 with -flto=thin, it is followed by GlobalOpt, which requires the callgraph. This saves rebuilding one. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268266
* Re-commit optimization bisect support (r267022) without new pass manager support. (Andrew Kaylor, 2016-04-22; 1 file changed, -0/+6)
  The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231
* Revert "Initial implementation of optimization bisect support."Vedant Kumar2016-04-221-9/+0
| | | | | | | | This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115
* Initial implementation of optimization bisect support. (Andrew Kaylor, 2016-04-21; 1 file changed, -0/+9)
  This patch implements an optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an additional OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022
* Don't IPO over functions that can be de-refined (Sanjoy Das, 2016-04-08; 1 file changed, -14/+17)
  Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch".
  Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming:
  ```
  void f(unsigned x) { unsigned t = 5 / x; (void)t; }
  ```
  to
  ```
  void f(unsigned x) { }
  ```
  is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value).
  Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store.
  As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and then folding the comparison to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither.
  This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined`, that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function.
  Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk. Subscribers: mcrosier, llvm-commits. Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762
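  For FunctionAttrs the guard amounts to something like the following (a hedged sketch using current GlobalValue predicate names, which may differ from the exact API the patch introduced):
  ```
  #include "llvm/IR/Function.h"
  using namespace llvm;

  // Sketch: only draw IPO conclusions from a function body whose definition
  // cannot be replaced ("de-refined") by a differently optimized copy at link
  // time, e.g. via interposable or comdat-style linkage.
  static bool bodyIsSafeToAnalyze(const Function &F) {
    return !F.isDeclaration() && F.hasExactDefinition();
  }
  ```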
* [attrs] Handle convergent CallSites. (Justin Lebar, 2016-03-14; 1 file changed, -39/+34)
  Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Originally landed as r261544, then reverted in r261549 for (incidental) build breakage. Re-landed here with no changes. Reviewers: chandlerc, jingyue. Subscribers: llvm-commits, tra, jhen, hfinkel. Differential Revision: http://reviews.llvm.org/D17739 llvm-svn: 263481
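  A rough sketch of the resulting rule (illustration only; the real pass reasons over a whole SCC, and the exact CallSite helpers may differ):
  ```
  #include "llvm/IR/Attributes.h"
  #include "llvm/IR/CallSite.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/InstIterator.h"
  using namespace llvm;

  // Sketch: the convergent attribute can be dropped from F if no call inside
  // it is convergent. A call counts as convergent if it is explicitly marked
  // convergent or if it targets a function known to be convergent.
  static bool canRemoveConvergent(Function &F) {
    if (!F.hasFnAttribute(Attribute::Convergent))
      return false; // nothing to remove
    for (Instruction &I : instructions(F)) {
      CallSite CS(&I);
      if (!CS)
        continue;
      const Function *Callee = CS.getCalledFunction();
      if (CS.hasFnAttr(Attribute::Convergent) ||
          (Callee && Callee->hasFnAttribute(Attribute::Convergent)))
        return false;
    }
    return true;
  }
  ```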
* [PM] Make the AnalysisManager parameter to run methods a reference. (Chandler Carruth, 2016-03-11; 1 file changed, -4/+4)
  This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, *many* parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear. llvm-svn: 263219
* [AA] Hoist the logic to reformulate various AA queries in terms of other parts of the AA interface out of the base class of every single AA result object (Chandler Carruth, 2016-03-02; 1 file changed, -1/+1)
  Because this logic reformulates the query in terms of some other aspect of the API, it would easily cause O(n^2) query patterns in alias analysis. These could in turn be magnified further based on the number of call arguments, and then further based on the number of AA queries made for a particular call. This ended up causing problems for Rust that were actually noticeable enough to get a bug (PR26564) and probably other places as well.
  When originally re-working the AA infrastructure, the desire was to regularize the pattern of refinement without losing any generality. While I think it was successful, that is clearly proving to be too costly. And the cost is needless: we gain no actual improvement for this generality of making a direct query to tbaa actually be able to re-use some other alias analysis's refinement logic for one of the other APIs, or some such. In short, this is entirely wasted work. To the extent possible, delegation to other API surfaces should be done at the aggregation layer so that we can avoid re-walking the aggregation. In fact, this significantly simplifies the logic as we no longer need to smuggle the aggregation layer into each alias analysis (or the TargetLibraryInfo into each alias analysis just so we can form argument memory locations!).
  However, we also have some delegation logic inside of BasicAA and some of it even makes sense. When the delegation logic is baking in specific knowledge of aliasing properties of the LLVM IR, as opposed to simply reformulating the query to utilize a different alias analysis interface entry point, it makes a lot of sense to restrict that logic to a different layer such as BasicAA. So one aspect of the delegation that was in every AA base class is that when we don't have operand bundles, we re-use function AA results as a fallback for callsite alias results. This relies on the IR properties of calls and functions w.r.t. aliasing, and so seems a better fit to BasicAA. I've lifted the logic up to that point where it seems to be a natural fit. This still does a bit of redundant work (we query function attributes twice, once via the callsite and once via the function AA query) but it is *exactly* twice here, no more.
  The end result is that all of the delegation logic is hoisted out of the base class and into either the aggregation layer when it is a pure retargeting to a different API surface, or into BasicAA when it relies on the IR's aliasing properties. This should fix the quadratic query pattern reported in PR26564, although I don't have a stand-alone test case to reproduce it. It also seems general goodness. Now the numerous AAs that don't need target library info don't carry it around and depend on it. I think I can even rip out the general access to the aggregation layer and only expose that in BasicAA as it is the only place where we re-query in that manner. However, this is a non-trivial change to the AA infrastructure so I want to get some additional eyes on this before it lands. Sadly, it can't wait long because we should really cherry-pick this into 3.8 if we're going to go this route. Differential Revision: http://reviews.llvm.org/D17329 llvm-svn: 262490
* Revert "[attrs] Handle convergent CallSites."Justin Lebar2016-02-221-25/+37
| | | | | | | This reverts r261544, which was causing a test failure in Transforms/FunctionAttrs/readattrs.ll. llvm-svn: 261549
* [attrs] Handle convergent CallSites. (Justin Lebar, 2016-02-22; 1 file changed, -37/+25)
  Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Reviewers: chandlerc, jingyue. Subscribers: hfinkel, jhen, tra, llvm-commits. Differential Revision: http://reviews.llvm.org/D17317 llvm-svn: 261544
* ADT: Remove == and != comparisons between ilist iterators and pointers (Duncan P. N. Exon Smith, 2016-02-21; 1 file changed, -1/+1)
  I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498
* [PM] Port the PostOrderFunctionAttrs pass to the new pass manager and convert one test to use this (Chandler Carruth, 2016-02-18; 1 file changed, -34/+97)
  This is a particularly significant milestone because it required a working per-function AA framework which can be queried over each function from within a CGSCC transform pass (and additionally a module analysis to be accessible). This is essentially *the* point of the entire pass manager rewrite. A CGSCC transform is able to query for multiple different functions' analysis results. It works. The whole thing appears to actually work and accomplish the original goal. While we were able to hack function attrs and basic-aa to "work" in the old pass manager, this port doesn't use any of that; it directly leverages the new fundamental functionality. For this to work, the CGSCC framework also has to support SCC-based behavior analysis, etc. The only part of the CGSCC pass infrastructure not sorted out at this point are the updates in the face of inlining and running function passes that mutate the call graph. The changes are pretty boring and boiler-plate. Most of the work was factored into more focused preparatory patches. But this is what wires it all together. llvm-svn: 261203
* [attrs] Move the norecurse deduction to operate on the node set rather than the SCC object, and have it scan the instruction stream directly rather than relying on call records (Chandler Carruth, 2016-02-13; 1 file changed, -15/+16)
  This makes the behavior of this routine consistent between libc routines and LLVM intrinsics for libc routines. We can go and start teaching it about those being norecurse, but we should behave the same for the intrinsic and the libc routine rather than differently. I chatted with James Molloy and the inconsistency doesn't seem intentional and likely is due to intrinsic calls not being modelled in the call graph analyses. This also fixes a bug where we would deduce norecurse on optnone functions, when generally we try to handle optnone functions as-if they were replaceable and thus unanalyzable. llvm-svn: 260813
* [attrs] Simplify the convergent removal to directly use the pre-built node set rather than walking the SCC directly (Chandler Carruth, 2016-02-12; 1 file changed, -22/+10)
  This directly exposes the functions and has already had null entries filtered out. We also don't need to handle optnone as it has already been handled in the caller -- we never try to remove convergent when there are optnone functions in the SCC. With this change, the code for removing convergent should work with the new pass manager and a different SCC analysis. llvm-svn: 260668
* [attrs] Consolidate the test for a non-SCC, non-convergent function call with the test for a non-convergent intrinsic call (Chandler Carruth, 2016-02-12; 1 file changed, -20/+14)
  While it is possible to use the call records to search for function calls, we're going to do an instruction scan anyways to find the intrinsics; we can handle both cases while scanning instructions. This will also make the logic more amenable to the new pass manager which doesn't use the same call graph structure. My next patch will remove use of CallGraphNode entirely and allow this code to work with both the old and new pass manager. Fortunately, it should also get strictly simpler without changing functionality. llvm-svn: 260666
* [attrs] Run clang-format over a newly added routine in function-attrs (Chandler Carruth, 2016-02-12; 1 file changed, -12/+16)
  Before I update it to be friendly with the new pass manager. llvm-svn: 260653
* Add convergent-removing bits to FunctionAttrs pass. (Justin Lebar, 2016-02-09; 1 file changed, -0/+63)
  Summary: Remove the convergent attribute on any functions which provably do not contain or invoke any convergent functions. After this change, we'll be able to modify clang to conservatively add 'convergent' to all functions when compiling CUDA. Reviewers: jingyue, joker.eph. Subscribers: llvm-commits, tra, jhen, hfinkel, resistor, chandlerc, arsenm. Differential Revision: http://reviews.llvm.org/D17013 llvm-svn: 260319