path: root/llvm/lib/LTO/LTOCodeGenerator.cpp
Commit message | Author | Age | Files | Lines
...
* Add a libLTO diagnostic handler that supports lto_get_error_message API (Yunzhong Gao, 2015-11-11, 1 file, -8/+2)
This is a follow-up from the previous discussion on the thread:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151019/307763.html
The LibLTO lto_get_error_message() API reads error messages from a std::string sLastErrorString. Instead of passing this string around as an argument, this patch creates a diagnostic handler and then sends this handler to the constructor of LTOCodeGenerator.
Differential Revision: http://reviews.llvm.org/D14313
llvm-svn: 252791
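A minimal sketch of how a client consumes the error text mentioned above, assuming the standard llvm-c/lto.h C API; the setup and failure path shown here are illustrative, not taken from the patch:

    #include <llvm-c/lto.h>
    #include <cstdio>

    int main() {
      lto_code_gen_t CG = lto_codegen_create();
      if (!CG) {
        // The diagnostic handler records the message that this call returns.
        std::fprintf(stderr, "libLTO error: %s\n", lto_get_error_message());
        return 1;
      }
      lto_codegen_dispose(CG);
      return 0;
    }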
* Reapply "LTO: Disable extra verify runs in release builds"Duncan P. N. Exon Smith2015-09-151-8/+13
| | | | | | | This reverts commit r247730, effectively reapplying r247729. This time I have an lld commit ready to follow. llvm-svn: 247735
* Revert "LTO: Disable extra verify runs in release builds"Duncan P. N. Exon Smith2015-09-151-13/+8
| | | | | | | This temporarily reverts commit r247729, as it caused lld build failures. I'll recommit once I have an lld patch ready-to-go. llvm-svn: 247730
* LTO: Disable extra verify runs in release builds (Duncan P. N. Exon Smith, 2015-09-15, 1 file, -8/+13)
The verifier currently runs three times in LTO: (1) after parsing, (2) at the beginning of the optimization pipeline, and (3) at the end of it. The first run is important, since we're not sure where the bitcode comes from and it's nice to validate it, but in release builds the extra runs aren't appropriate.
This commit:
- Allows these runs to be disabled in LTOCodeGenerator.
- Adds command-line options to llvm-lto.
- Adds command-line options to libLTO.dylib, and disables the verifier by default in release builds (based on NDEBUG).
This shaves about 3.5% off the runtime of ld64 when linking verify-uselistorder with -flto -g.
rdar://22509081
llvm-svn: 247729
* [PM] Port SROA to the new pass manager. (Chandler Carruth, 2015-09-12, 1 file, -1/+1)
In some ways this is a very boring port to the new pass manager as there are no interesting analyses or dependencies or other oddities.
However, this does introduce the first good example of a transformation pass with non-trivial state porting to the new pass manager. I've tried to carve out patterns here to replicate elsewhere, and would appreciate comments on whether folks like these patterns:
- A common need in the new pass manager is to effectively lift the pass class and some of its state into a public header file. Prior to this, LLVM used anonymous namespaces to provide "module private" types and utilities, but that doesn't scale to cases where a public header file is needed and the new pass manager will exacerbate that. The pattern I've adopted here is to use the namespace-cased-name of the core pass (what would be a module if we had them) as a module-private namespace. Then utility and other code can be declared and defined in this namespace. At some point in the future, we could even have (conditionally compiled) code that used modules features when available to do the same basic thing.
- I've split the actual pass run method in two in order to expose a private method usable by the old pass manager to wrap the new class with a minimum of duplicated code. I actually looked at a bunch of ways to automate or generate these, but they are all quite terrible IMO. The fundamental need is to extract the set of analyses which need to cross this interface boundary, and that will end up being too unpredictable to effectively encapsulate IMO. This is also a relatively small amount of boiler plate that will live a relatively short time, so I'm not too worried about the fact that it is boiler plate.
The rest of the patch is totally boring but results in a massive diff (sorry). It just moves code around and removes or adds qualifiers to reflect the new name and nesting structure.
Differential Revision: http://reviews.llvm.org/D12773
llvm-svn: 247501
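A rough sketch of the "split run method" pattern described in the second bullet above; the class, method, and state names are hypothetical, and the exact analysis-manager parameter type has changed across LLVM releases:

    #include "llvm/IR/Function.h"
    #include "llvm/IR/PassManager.h"

    namespace llvm {

    class ExamplePass {
      // Non-trivial pass state now lives on a class declared in a public header.
      unsigned NumRewrites = 0;

    public:
      // Entry point for the new pass manager.
      PreservedAnalyses run(Function &F, AnalysisManager<Function> &AM);

    private:
      // Shared implementation; the legacy wrapper pass fetches its required
      // analyses from the old pass manager and then calls this directly.
      PreservedAnalyses runImpl(Function &F /*, required analyses... */);
    };

    } // namespace llvm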
* Add a non-exiting diagnostic handler for LTO. (Yunzhong Gao, 2015-09-11, 1 file, -2/+8)
This is in order to give LTO clients a chance to do some clean-up before terminating the process.
llvm-svn: 247461
* [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible with the new pass manager, and no longer relying on analysis groups (Chandler Carruth, 2015-09-09, 1 file, -1/+1)
This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager.
- BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass.
I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself.
One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loop holes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
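A minimal sketch of the preservation rule described above for a legacy function pass; the pass itself is hypothetical, and the wrapper-pass names are assumed to match the ones this patch introduces (BasicAAWrapperPass and GlobalsAAWrapperPass):

    #include "llvm/Analysis/BasicAliasAnalysis.h"
    #include "llvm/Analysis/GlobalsModRef.h"
    #include "llvm/Pass.h"

    using namespace llvm;

    namespace {
    struct ExampleLegacyPass : FunctionPass {
      static char ID;
      ExampleLegacyPass() : FunctionPass(ID) {}

      bool runOnFunction(Function &F) override { return false; }

      void getAnalysisUsage(AnalysisUsage &AU) const override {
        // Preserve the individual AA analyses directly rather than the old
        // "AliasAnalysis" group; BasicAA and GlobalsAA are the ones loop
        // passes and IPO passes still care about.
        AU.addPreserved<BasicAAWrapperPass>();
        AU.addPreserved<GlobalsAAWrapperPass>();
        AU.setPreservesCFG();
      }
    };
    char ExampleLegacyPass::ID = 0;
    } // end anonymous namespace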
* Fix typo. (Yaron Keren, 2015-09-01, 1 file, -1/+1)
llvm-svn: 246538
* LTO: Cleanup parameter names and header docs, NFC (Duncan P. N. Exon Smith, 2015-08-31, 1 file, -53/+48)
Follow LLVM style for the parameter names (`CamelCase` not `camelCase`), and surface the header docs in doxygen.
No functionality change intended.
llvm-svn: 246509
* CodeGen: Introduce splitCodeGen and teach LTOCodeGenerator to use it. (Peter Collingbourne, 2015-08-27, 1 file, -13/+15)
llvm::splitCodeGen is a function that implements the core of parallel LTO code generation. It uses llvm::SplitModule to split the module into linkable partitions and spawns one code generation thread per partition. The function produces multiple object files which can be linked in the usual way.
This has been threaded through to LTOCodeGenerator (and llvm-lto for testing purposes). Separate patches will add parallel LTO support to the gold plugin and lld.
Differential Revision: http://reviews.llvm.org/D12260
llvm-svn: 246236
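A minimal sketch of the partitioning step that splitCodeGen builds on, assuming the llvm::SplitModule utility named above; the helper function and partition handling are illustrative, and the per-partition code generation threads are omitted:

    #include "llvm/IR/Module.h"
    #include "llvm/Transforms/Utils/SplitModule.h"
    #include <memory>
    #include <vector>

    using namespace llvm;

    static std::vector<std::unique_ptr<Module>>
    splitIntoPartitions(std::unique_ptr<Module> M, unsigned NumPartitions) {
      std::vector<std::unique_ptr<Module>> Parts;
      // Each callback invocation receives one linkable partition of the module;
      // splitCodeGen would hand each of these to its own codegen thread.
      SplitModule(std::move(M), NumPartitions,
                  [&](std::unique_ptr<Module> MPart) {
                    Parts.push_back(std::move(MPart));
                  });
      return Parts;
    }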
* LTO: Simplify merged module ownership. (Peter Collingbourne, 2015-08-24, 1 file, -28/+12)
This change moves LTOCodeGenerator's ownership of the merged module to a field of type std::unique_ptr<Module>. This helps simplify parts of the code and clears the way for the module to be consumed by LLVM CodeGen (see D12132 review comments).
Differential Revision: http://reviews.llvm.org/D12205
llvm-svn: 245891
* LTO: Rename mergedModule variables to MergedModule to prepare for ownership change. (Peter Collingbourne, 2015-08-24, 1 file, -20/+17)
Also convert a few loops to range-for loops and correct a comment.
llvm-svn: 245874
* LTO: Maintain target triple, FeatureStr and CGOptLevel in the module or LTOCodeGenerator. (Peter Collingbourne, 2015-08-22, 1 file, -18/+22)
This makes it easier to create new TargetMachines on demand.
llvm-svn: 245781
* LTO: Change signature of LTOCodeGenerator::setCodePICModel() to take a Reloc::Model. (Peter Collingbourne, 2015-08-21, 1 file, -30/+0)
This allows us to remove a bunch of code in LTOCodeGenerator and llvm-lto and has the side effect of improving error handling in the libLTO C API.
llvm-svn: 245756
* LTO: Simplify ownership of LTOCodeGenerator::TargetMach. (Peter Collingbourne, 2015-08-21, 1 file, -6/+3)
llvm-svn: 245671
* LTO: Simplify ownership of LTOCodeGenerator::CodegenOptions. (Peter Collingbourne, 2015-08-21, 1 file, -15/+9)
llvm-svn: 245670
* Remove access to the DataLayout in the TargetMachine (Mehdi Amini, 2015-07-24, 1 file, -1/+1)
Summary: Replace getDataLayout() with a createDataLayout() method to make explicit that it is intended to create a DataLayout only and not accessing it for other purpose. This change is the last of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module.
Reviewers: echristo
Subscribers: jholewinski, llvm-commits, rafael, yaron.keren
Differential Revision: http://reviews.llvm.org/D11103
(cherry picked from commit 5609fc56bca971e5a7efeaa6ca4676638eaec5ea)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243114
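A minimal sketch of the intended usage pattern after this change, assuming the TargetMachine::createDataLayout() and Module::setDataLayout() APIs; the surrounding setup is illustrative:

    #include "llvm/IR/Module.h"
    #include "llvm/Target/TargetMachine.h"

    using namespace llvm;

    static void attachDataLayout(Module &M, TargetMachine &TM) {
      // The TargetMachine only *creates* a DataLayout; the module owns the one
      // copy that the rest of the compilation queries.
      M.setDataLayout(TM.createDataLayout());
      const DataLayout &DL = M.getDataLayout();
      (void)DL; // all consumers read the module-owned DataLayout from here on
    }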
* Revert "Remove access to the DataLayout in the TargetMachine"Mehdi Amini2015-07-241-1/+1
| | | | | | | | | | This reverts commit 0f720d984f419c747709462f7476dff962c0bc41. It breaks clang too badly, I need to prepare a proper patch for clang first. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243089
* Remove access to the DataLayout in the TargetMachine (Mehdi Amini, 2015-07-24, 1 file, -1/+1)
Summary: Replace getDataLayout() with a createDataLayout() method to make explicit that it is intended to create a DataLayout only and not accessing it for other purpose. This change is the last of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module.
Reviewers: echristo
Subscribers: jholewinski, llvm-commits, rafael, yaron.keren
Differential Revision: http://reviews.llvm.org/D11103
(cherry picked from commit 5609fc56bca971e5a7efeaa6ca4676638eaec5ea)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243083
* Simplify the Mangler interface now that DataLayout is mandatory. (Rafael Espindola, 2015-06-23, 1 file, -1/+1)
We only need to pass in a DataLayout when mangling a raw string, not when constructing the mangler.
llvm-svn: 240405
* Make the C++ LTO API easier to use from C++ clients. (Peter Collingbourne, 2015-06-01, 1 file, -14/+7)
Start using C++ types such as StringRef and MemoryBuffer in the C++ LTO API. In doing so, clarify the ownership of the native object file: the caller now owns it, not the LTOCodeGenerator. The C libLTO library has been modified to use a derived class of LTOCodeGenerator that owns the object file.
Differential Revision: http://reviews.llvm.org/D10114
llvm-svn: 238776
* LTO: Add API to choose whether to embed uselists (Duncan P. N. Exon Smith, 2015-04-27, 1 file, -2/+1)
Reverse libLTO's default behaviour for preserving use-list order in bitcode, and add API for controlling it. The default setting is now `false` (don't preserve them), which is consistent with `clang`'s default behaviour.
Users of libLTO should call `lto_codegen_should_embed_uselists(CG,true)` prior to calling `lto_codegen_write_merged_modules()` whenever the output file isn't part of the production workflow in order to reproduce results with subsequent calls to `llc`.
(I haven't added tests since `llvm-lto` (the test tool for LTO) doesn't support bitcode output, and even if it did: there isn't actually a good way to test whether a tool has passed the flag. If the order is already "natural" (if the order will already round-trip) then no use-list directives are emitted at all. At some point I'll circle back to add tests to `llvm-as` (etc.) that they actually respect the flag, at which point I can somehow add a test here as well.)
llvm-svn: 235943
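A minimal sketch of the workflow described above; it assumes the setter is the lto_codegen_set_should_embed_uselists entry point declared in llvm-c/lto.h (the message abbreviates the name), and the output path is illustrative:

    #include <llvm-c/lto.h>

    static void writeReproducibleBitcode(lto_code_gen_t CG) {
      // Embed use-list order so a later llc run over the dumped bitcode
      // reproduces the same output; the libLTO default is now "off".
      lto_codegen_set_should_embed_uselists(CG, true);
      lto_codegen_write_merged_modules(CG, "/tmp/merged.bc");
    }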
* LTO: Simplify code generator initialization (Duncan P. N. Exon Smith, 2015-04-27, 1 file, -15/+2)
Simplify `LTOCodeGenerator` initialization by initializing simple fields at their definition.
llvm-svn: 235939
* [LTO API] add lto_codegen_set_should_internalize. (Manman Ren, 2015-04-17, 1 file, -1/+2)
When debugging LTO issues with ld64, we use -save-temps to save the merged optimized bitcode file, then invoke ld64 again on the single bitcode file. The saved bitcode file is already internalized, so we can call lto_codegen_set_should_internalize and skip running internalization again.
rdar://20227235
llvm-svn: 235211
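A minimal sketch of using the new setter in that debugging flow, assuming the lto_codegen_set_should_internalize and lto_codegen_add_module entry points from llvm-c/lto.h; module loading and error checks are omitted:

    #include <llvm-c/lto.h>

    static void addInternalizedModule(lto_code_gen_t CG, lto_module_t Merged) {
      // The saved merged bitcode was already internalized on the first link,
      // so tell the code generator not to run internalization a second time.
      lto_codegen_set_should_internalize(CG, false);
      lto_codegen_add_module(CG, Merged);
    }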
* uselistorder: Remove the global bits (Duncan P. N. Exon Smith, 2015-04-15, 1 file, -6/+1)
Remove all the global bits to do with preserving use-list order by moving the `cl::opt`s to the individual tools that want them. There's a minor functionality change to `libLTO`, in that you can't send in `-preserve-bc-uselistorder=false`, but making that bit settable (if it's worth doing) should be through explicit LTO API.
As a drive-by fix, I removed some includes of `UseListOrder.h` that were made unnecessary by recent commits.
llvm-svn: 234973
* uselistorder: Pull the bit through WriteToBitcodeFile() (Duncan P. N. Exon Smith, 2015-04-15, 1 file, -1/+2)
Change the callers of `WriteToBitcodeFile()` to pass `true` or `shouldPreserveBitcodeUseListOrder()` explicitly. I left the callers that want to send `false` alone. I'll keep pushing the bit higher until hopefully I can delete the global `cl::opt` entirely.
llvm-svn: 234957
* Use raw_pwrite_stream in the object writer/streamer. (Rafael Espindola, 2015-04-14, 1 file, -1/+2)
The ELF object writer will take advantage of that in the next commit.
llvm-svn: 234950
* IR: Set -preserve-bc-uselistorder=false by default (Duncan P. N. Exon Smith, 2015-04-14, 1 file, -0/+5)
But keep it on by default in `llvm-as`, `opt`, `bugpoint`, `llvm-link`, `llvm-extract`, and `LTOCodeGenerator`.
Part of PR5680.
llvm-svn: 234921
* Simplify use of formatted_raw_ostream. (Rafael Espindola, 2015-04-09, 1 file, -4/+1)
formatted_raw_ostream is a wrapper over another stream to add column and line number tracking. It is used only for asm printing. This patch moves its creation down to where we know we are printing assembly.
This has the following advantages:
* Simpler lifetime management: std::unique_ptr
* We don't compute column and line number of object files :-)
llvm-svn: 234535
* This reverts commit r234460 and r234461. (Rafael Espindola, 2015-04-09, 1 file, -1/+4)
Revert "Add classof implementations to the raw_ostream classes."
Revert "Use the cast machinery to remove dummy uses of formatted_raw_ostream."
The underlying issue can be fixed without classof.
llvm-svn: 234495
* Use the cast machinery to remove dummy uses of formatted_raw_ostream. (Rafael Espindola, 2015-04-09, 1 file, -4/+1)
If we know we are producing an object, we don't need to wrap the stream in a formatted_raw_ostream anymore.
llvm-svn: 234461
* [LTO] do not run internalize pass from compileOptimized. (Manman Ren, 2015-04-08, 1 file, -3/+0)
The input to compileOptimized is already optimized and internalized, so remove internalize pass from compileOptimized.
rdar://20227235
llvm-svn: 234446
* Verifier: Remove the separate -verify-di pass (Duncan P. N. Exon Smith, 2015-03-19, 1 file, -1/+0)
Remove `DebugInfoVerifierLegacyPass` and the `-verify-di` pass. Instead, call into the `DebugInfoVerifier` from inside `VerifierLegacyPass::finalizeModule()`. This better matches the logic in `verifyModule()` (used by the new PassManager), avoids requiring two separate passes to verify the IR, and makes the API for "add a pass to verify the IR" simple.
Note: the `-verify-debug-info` flag still works (for now, at least; eventually it might make sense to just remove it).
llvm-svn: 232772
* libLTO, llvm-lto, gold: Introduce flag for controlling optimization level. (Peter Collingbourne, 2015-03-19, 1 file, -10/+22)
This change also introduces a link-time optimization level of 1. This optimization level runs only the globaldce pass as well as cleanup passes for passes that run at -O0, specifically simplifycfg which cleans up lowerbitsets.
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266951.html
llvm-svn: 232769
* Make DataLayout Non-Optional in the Module (Mehdi Amini, 2015-03-04, 1 file, -4/+1)
Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC: the string is no longer canonicalized, so you can't rely on two "equal" DataLayouts having the same string returned by getStringRepresentation().
Get rid of DataLayoutPass: the DataLayout is in the Module. The DataLayout is "per-module"; let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module.
Make DataLayout Non-Optional in the Module: Module->getDataLayout() will never return nullptr anymore.
Reviewers: echristo
Subscribers: resistor, llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D7992
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
* [LTO API] fix memory leakage introduced at r230290. (Manman Ren, 2015-02-25, 1 file, -4/+15)
r230290 released the LLVM module but not the LTOModule.
rdar://19024554
llvm-svn: 230544
* [LTO API] add lto_codegen_set_module to set the destination module. (Manman Ren, 2015-02-24, 1 file, -0/+16)
When debugging LTO issues with ld64, we use -save-temps to save the merged optimized bitcode file, then invoke ld64 again on the single bitcode file to speed up debugging code generation passes and ld64 stuff after code generation.
llvm linking a single bitcode file via lto_codegen_add_module will generate a different bitcode file from the single input. With the newly-added lto_codegen_set_module, we can make sure the destination module is the same as the input.
lto_codegen_set_module will transfer the ownership of the module to the code generator.
rdar://19024554
llvm-svn: 230290
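A minimal sketch contrasting the two entry points discussed above, assuming the lto_codegen_set_module API added here plus the existing lto_module_create; the path and error handling are illustrative:

    #include <llvm-c/lto.h>

    static void useSavedMergedModule(lto_code_gen_t CG) {
      lto_module_t Merged = lto_module_create("/tmp/merged.bc");
      // lto_codegen_add_module() would link Merged into a fresh destination
      // module; lto_codegen_set_module() instead makes Merged the destination
      // module itself and transfers its ownership to the code generator.
      lto_codegen_set_module(CG, Merged);
    }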
* [PM] Remove the old 'PassManager.h' header file at the top level of LLVM's include tree and the use of using declarations to hide the 'legacy' namespace for the old pass manager. (Chandler Carruth, 2015-02-13, 1 file, -4/+4)
This undoes the primary modules-hostile change I made to keep out-of-tree targets building. I sent an email inquiring about whether this would be reasonable to do at this phase and people seemed fine with it, so making it a reality. This should allow us to start bootstrapping with modules to a certain extent along with making it easier to mix and match headers in general.
The updates to any code for users of LLVM are very mechanical. Switch from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h". Qualify the types which now produce compile errors with "legacy::". The most common ones are "PassManager", "PassManagerBase", and "FunctionPassManager".
llvm-svn: 229094
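The mechanical update described above, as a minimal sketch; the pipeline-building function itself is illustrative:

    #include "llvm/IR/LegacyPassManager.h"  // was: #include "llvm/PassManager.h"

    using namespace llvm;

    static void buildPipeline(Module &M) {
      // The old pass manager types are now spelled with the legacy:: qualifier.
      legacy::PassManager PM;
      legacy::FunctionPassManager FPM(&M);
      // ...add passes and run as before...
      (void)PM;
      (void)FPM;
    }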
* [LTO API] split lto_codegen_compile to lto_codegen_optimize and lto_codegen_compile_optimized. Also add lto_api_version. (Manman Ren, 2015-02-03, 1 file, -26/+53)
Before this commit, we can only dump the optimized bitcode after running lto_codegen_compile, but it includes some impacts of running codegen passes, one example is StackProtector pass. We will get assertion failure when running llc on the optimized bitcode, because StackProtector is effectively run twice.
After splitting lto_codegen_compile, the linker can choose to dump the bitcode before running lto_codegen_compile_optimized.
lto_api_version is added so ld64 can check for runtime-availability of the new API.
rdar://19565500
llvm-svn: 228000
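A minimal sketch of the split flow, assuming the lto_codegen_optimize, lto_codegen_compile_optimized, and lto_codegen_write_merged_modules entry points from llvm-c/lto.h; the dump path and control flow are illustrative:

    #include <llvm-c/lto.h>
    #include <cstddef>

    static const void *optimizeThenCodegen(lto_code_gen_t CG, bool SaveTemps,
                                           size_t &Length) {
      // Run only the IR-level optimization pipeline.
      lto_codegen_optimize(CG);
      if (SaveTemps) {
        // This bitcode has not been touched by codegen-level passes such as
        // StackProtector, so it can be fed back to llc for debugging.
        lto_codegen_write_merged_modules(CG, "/tmp/merged.opt.bc");
      }
      // lto_api_version() lets a linker probe at runtime whether these newer
      // entry points are available before relying on them.
      return lto_codegen_compile_optimized(CG, &Length);
    }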
* [multiversion] Implement the old pass manager's TTI wrapper pass in terms of the new pass manager's TargetIRAnalysis. (Chandler Carruth, 2015-02-01, 1 file, -1/+2)
Yep, this is one of the nicer bits of the new pass manager's design. Passes can in many cases operate in a vacuum and so we can just nest things when convenient. This is particularly convenient here as I can now consolidate all of the TargetMachine logic on this analysis.
The most important change here is that this pushes the function we need TTI for all the way into the TargetMachine, and re-creates the TTI object for each function rather than re-using it for each function. We're now prepared to teach the targets to produce function-specific TTI objects with specific subtargets cached, etc.
One piece of feedback I'd love here is whether it's worth renaming any of this stuff. None of the names really seem that awesome to me at this point, but TargetTransformInfoWrapperPass is particularly ... odd. TargetIRAnalysisWrapper might make more sense. I would want to do that rename separately anyways, but let me know what you think.
llvm-svn: 227731
* [PM] Switch the TargetMachine interface from accepting a pass manager base which it adds a single analysis pass to, to instead return the type erased TargetTransformInfo object constructed for that TargetMachine. (Chandler Carruth, 2015-01-31, 1 file, -1/+2)
This removes all of the pass variants for TTI. There is now a single TTI *pass* in the Analysis layer. All of the Analysis <-> Target communication is through the TTI's type erased interface itself.
While the diff is large here, it is nothing more than code motion to make types available in a header file for use in a different source file within each target. I've tried to keep all the doxygen comments and file boilerplate in line with this move, but let me know if I missed anything.
With this in place, the next step to making TTI work with the new pass manager is to introduce a really simple new-style analysis that produces a TTI object via a callback into this routine on the target machine. Once we have that, we'll have the building blocks necessary to accept a function argument as well.
llvm-svn: 227685
* [PM] Sink the population of the pass manager with target-specific analyses back into the LTO code generator. (Chandler Carruth, 2015-01-30, 1 file, -1/+4)
The pass manager builder (and the transforms library in general) shouldn't be referencing the target machine at all. This makes the LTO population work like the others -- the data layout and target transform info need to be pre-populated.
llvm-svn: 227576
* [LTO] Scan all per-function subtargets when collecting runtime library names. (Akira Hatanaka, 2015-01-30, 1 file, -12/+22)
accumulateAndSortLibcalls in LTOCodeGenerator.cpp collects names of runtime library functions which are used to identify user-defined functions that should be protected. Previously, this function would only scan the TargetLowering object belonging to the "main" subtarget for the library function names. This commit changes it to scan all per-function subtargets.
Differential Revision: http://reviews.llvm.org/D7275
llvm-svn: 227533
* Move DataLayout back to the TargetMachine from TargetSubtargetInfo derived classes. (Eric Christopher, 2015-01-26, 1 file, -2/+2)
Since global data alignment, layout, and mangling is often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be layed out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets(*) have had subtarget dependent code moved out and onto the TargetMachine.
*One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features.
llvm-svn: 227113
* [PM] Rework how the TargetLibraryInfo pass integrates with the new pass manager to support the actual uses of it. =] (Chandler Carruth, 2015-01-24, 1 file, -2/+3)
When I ported instcombine to the new pass manager I discovered that it didn't work because TLI wasn't available in the right places. This is a somewhat surprising and/or subtle aspect of the new pass manager design that came up before but I think is useful to be reminded of:
While the new pass manager *allows* a function pass to query a module analysis, it requires that the module analysis is already run and cached prior to the function pass manager starting up, possibly with a 'require<foo>' style utility in the pass pipeline. This is an intentional hurdle because using a module analysis from a function pass *requires* that the module analysis is run prior to entering the function pass manager. Otherwise the other functions in the module could be in who-knows-what state, etc.
A somewhat surprising consequence of this design decision (at least to me) is that you have to design a function pass that leverages a module analysis to do so as an optional feature. Even if that means your function pass does no work in the absence of the module analysis, you have to handle that possibility and remain conservatively correct. This is a natural consequence of things being able to invalidate the module analysis and us being unable to re-run it. And it's a generally good thing because it lets us reorder passes arbitrarily without breaking correctness, etc.
This ends up causing problems in one case. What if we have a module analysis that is *definitionally* impossible to invalidate. In the places this might come up, the analysis is usually also definitionally trivial to run even while other transformation passes run on the module, regardless of the state of anything. And so, it follows that it is natural to have a hard requirement on such analyses from a function pass. It turns out that TargetLibraryInfo is just such an analysis, and InstCombine has a hard requirement on it.
The approach I've taken here is to produce an analysis that models this flexibility by making it both a module and a function analysis. This exposes the fact that it is in fact safe to compute at any point. We can even make it a valid CGSCC analysis at some point if that is useful. However, we don't want to have a copy of the actual target library info state for each function! This state is specific to the triple. The somewhat direct and blunt approach here is to turn TLI into a pimpl, with the state and mutators in the implementation class and the query routines primarily in the wrapper. Then the analysis can lazily construct and cache the implementations, keyed on the triple, and on-demand produce wrappers of them for each function.
One minor annoyance is that we will end up with a wrapper for each function in the module. While this is a bit wasteful (one pointer per function) it seems tolerable. And it has the advantage of ensuring that we pay the absolute minimum synchronization cost to access this information should we end up with a nice parallel function pass manager in the future. We could look into trying to mark when analysis results are especially cheap to recompute and more eagerly GC-ing the cached results, or we could look at supporting a variant of analyses whose results are specifically *not* cached and expected to just be used and discarded by the consumer.
Either way, these seem like incremental enhancements that should happen when we start profiling the memory and CPU usage of the new pass manager and not before.
The other minor annoyance is that if we end up using the TLI in both a module pass and a function pass, those will be produced by two separate analyses, and thus will point to separate copies of the implementation state. While a minor issue, I dislike this and would like to find a way to cleanly allow a single analysis instance to be used across multiple IR unit managers. But I don't have a good solution to this today, and I don't want to hold up all of the work waiting to come up with one. This too seems like a reasonable thing to incrementally improve later.
llvm-svn: 226981
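A minimal sketch of the wrapper/pimpl split described above, assuming the TargetLibraryInfoImpl and TargetLibraryInfo types from llvm/Analysis/TargetLibraryInfo.h; the header paths reflect the LLVM tree of that era and the helper is illustrative:

    #include "llvm/ADT/Triple.h"
    #include "llvm/Analysis/TargetLibraryInfo.h"
    #include "llvm/IR/Module.h"

    using namespace llvm;

    static void buildTLIForModule(const Module &M) {
      // One implementation object per target triple holds the library-function
      // state and the mutators...
      TargetLibraryInfoImpl TLII(Triple(M.getTargetTriple()));
      // ...while lightweight TargetLibraryInfo wrappers over it expose the
      // query interface and can be handed out per function without copying.
      TargetLibraryInfo TLI(TLII);
      (void)TLI;
    }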
* [PM] Separate the InstCombiner from its pass. (Chandler Carruth, 2015-01-20, 1 file, -1/+1)
This creates a small internal pass which runs the InstCombiner over a function. This is the hard part of porting InstCombine to the new pass manager, as at this point none of the code in InstCombine has access to a Pass object any longer.
The resulting interface for the InstCombiner is pretty terrible. I'm not planning on leaving it that way. The key thing missing is that we need to separate the worklist from the combiner a touch more. Once that's done, it should be possible for *any* part of LLVM to just create a worklist with instructions, populate it, and then combine it until empty. The pass will just be the (obvious and important) special case of doing that for an entire function body.
For now, this is the first increment of factoring to make all of this work.
llvm-svn: 226618
* [PM] Move TargetLibraryInfo into the Analysis library. (Chandler Carruth, 2015-01-15, 1 file, -1/+1)
While the term "Target" is in the name, it doesn't really have to do with the LLVM Target library -- this isn't an abstraction which LLVM targets generally need to implement or extend. It has much more to do with modeling the various runtime libraries on different OSes and with different runtime environments. The "target" in this sense is the more general sense of a target of cross compilation.
This is in preparation for porting this analysis to the new pass manager.
No functionality changed, and updates inbound for Clang and Polly.
llvm-svn: 226078
* libLTO: Assert if LTOCodeGenerator and LTOModule are from different contexts (Duncan P. N. Exon Smith, 2014-11-11, 1 file, -0/+3)
llvm-svn: 221730
* libLTO: Allow LTOCodeGenerator to own a context (Duncan P. N. Exon Smith, 2014-11-11, 1 file, -4/+18)
llvm-svn: 221726
* Add an option to the LTO code generator to disable vectorization during LTO (Arnold Schwaighofer, 2014-10-26, 1 file, -3/+9)
We used to always vectorize (slp and loop vectorize) in the LTO pass pipeline. r220345 changed it so that we used the PassManager's fields 'LoopVectorize' and 'SLPVectorize' out of the desire to be able to disable vectorization using the cl::opt flags 'vectorize-loops'/'slp-vectorize' which the before mentioned fields default to. Unfortunately, this turns off vectorization because those fields default to false.
This commit adds flags to the LTO library to disable lto vectorization which reconciles the desire to optionally disable vectorization during LTO and the desired behavior of defaulting to enabled vectorization. We really want tools to set PassManager flags directly to enable/disable vectorization and not go the route via cl::opt flags *in* PassManagerBuilder.cpp.
llvm-svn: 220652
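A minimal sketch of setting the PassManagerBuilder fields named above directly; the populate call and the EnableLTOVectorization parameter are illustrative, not the actual libLTO plumbing:

    #include "llvm/IR/LegacyPassManager.h"
    #include "llvm/Transforms/IPO/PassManagerBuilder.h"

    using namespace llvm;

    static void populateLTOPasses(legacy::PassManager &PM,
                                  bool EnableLTOVectorization) {
      PassManagerBuilder PMB;
      // Set the fields explicitly rather than relying on the cl::opt defaults,
      // so LTO clients control vectorization directly.
      PMB.LoopVectorize = EnableLTOVectorization;
      PMB.SLPVectorize = EnableLTOVectorization;
      PMB.populateLTOPassManager(PM);
    }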