summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/IPO
Commit message (Collapse)AuthorAgeFilesLines
...
* clang-format SampleProfile.cpp (NFC)Dehao Chen2017-01-191-6/+4
| | | | llvm-svn: 292533
* LowerTypeTests: Implement exporting of type identifiers.Peter Collingbourne2017-01-191-31/+103
| | | | | | | | | | | | Type identifiers are exported by: - Adding coarse-grained information about how to test the type identifier to the summary. - Creating symbols in the object file (aliases and absolute symbols) containing fine-grained information about the type identifier. Differential Revision: https://reviews.llvm.org/D28424 llvm-svn: 292462
* ThinLTOBitcodeWriter: Clear comdats on filtered globals.Peter Collingbourne2017-01-181-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D28839 llvm-svn: 292431
* Apply clang-tidy's performance-unnecessary-value-param to LLVM.Benjamin Kramer2017-01-133-3/+3
| | | | | | | With some minor manual fixes for using function_ref instead of std::function. No functional change intended. llvm-svn: 291904
* [ThinLTO] Import static functions from the same module as callerTeresa Johnson2017-01-121-4/+22
| | | | | | | | | | | | | | | | | | | | | | | | Summary: We can sometimes end up with multiple copies of a local function that have the same GUID in the index. This happens when there are local functions with the same name that are in different source files with the same name (but in different directories), and they were compiled in their own directory so had the same path at compile time. In this case make sure we import the copy in the caller's module. While it isn't a correctness problem (the renamed reference which is based on the module IR hash will be unique since the module must have had an externally visible function that was imported), importing the wrong copy will result in lost performance opportunity since it won't be referenced and inlined. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28440 llvm-svn: 291841
* LowerTypeTests: Represent the memory region size with the constant size-1.Peter Collingbourne2017-01-111-15/+15
| | | | | | | | | This means that we can use a shorter instruction sequence in the case where the size is a power of two and on the boundary between two representations. Differential Revision: https://reviews.llvm.org/D28421 llvm-svn: 291706
* Re-apply r291205, "LowerTypeTests: Split the pass in two: a resolution phase ↵Peter Collingbourne2017-01-111-110/+155
| | | | | | and a lowering phase.", with a fix for an off-by-one error. llvm-svn: 291699
* Revert rL291205 because it breaks Chrome tests under CFI.Ivan Krasin2017-01-111-155/+110
| | | | | | | | | | | | | | | | | | | | | Summary: Revert LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase. This change separates how type identifiers are resolved from how intrinsic calls are lowered. All information required to lower an intrinsic call is stored in a new TypeIdLowering data structure. The idea is that this data structure can either be initialized using the module itself during regular LTO, or using the module summary in ThinLTO backends. Original URL: https://reviews.llvm.org/D28341 Reviewers: pcc Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D28532 llvm-svn: 291684
* LowerTypeTests: Thread summary and action from the API and command line into ↵Peter Collingbourne2017-01-072-37/+75
| | | | | | | | | | | the pass. Also move command line handling out of the pass constructor and into a separate function. Differential Revision: https://reviews.llvm.org/D28422 llvm-svn: 291323
* LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.Peter Collingbourne2017-01-061-110/+155
| | | | | | | | | | | | This change separates how type identifiers are resolved from how intrinsic calls are lowered. All information required to lower an intrinsic call is stored in a new TypeIdLowering data structure. The idea is that this data structure can either be initialized using the module itself during regular LTO, or using the module summary in ThinLTO backends. Differential Revision: https://reviews.llvm.org/D28341 llvm-svn: 291205
* ThinLTO: add early "dead-stripping" on the IndexTeresa Johnson2017-01-051-5/+98
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Using the linker-supplied list of "preserved" symbols, we can compute the list of "dead" symbols, i.e. the one that are not reachable from a "preserved" symbol transitively on the reference graph. Right now we are using this information to mark these functions as non-eligible for import. The impact is two folds: - Reduction of compile time: we don't import these functions anywhere or import the function these symbols are calling. - The limited number of import/export leads to better internalization. Patch originally by Mehdi Amini. Reviewers: mehdi_amini, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23488 llvm-svn: 291177
* [ThinLTO] Subsume all importing checks into a single flagTeresa Johnson2017-01-051-73/+1
| | | | | | | | | | | | | | | | | | | Summary: This adds a new summary flag NotEligibleToImport that subsumes several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal and IsNotViableToInline). It also subsumes the checking of references on the summary that was being done during the thin link by eligibleForImport() for each candidate. It is much more efficient to do that checking once during the per-module summary build and record it in the summary. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28169 llvm-svn: 291108
* IR: Module summary representation for type identifiers; summary test ↵Peter Collingbourne2017-01-051-0/+51
| | | | | | | | | | | | | | | scaffolding for lowertypetests. Set up basic YAML I/O support for module summaries, plumb the summary into the pass and add a few command line flags to test YAML I/O support. Bitcode support to come separately, as will the code in LowerTypeTests that actually uses the summary. Also add a couple of tests that pass by virtue of the pass doing nothing with the summary (which happens to be the correct thing to do for those tests). Differential Revision: https://reviews.llvm.org/D28041 llvm-svn: 291069
* Use lazy-loading of Metadata in MetadataLoader when importing is enabled (NFC)Mehdi Amini2017-01-041-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is a relatively simple scheme: we use the index emitted in the bitcode to avoid loading all the global metadata. Instead we load the index with their position in the bitcode so that we can load each of them individually. Materializing the global metadata block in this condition only triggers loading the named metadata, and the ones referenced from there (transitively). When materializing a function, metadata from the global block are loaded lazily as they are referenced. Two main current limitations are: 1) Global values other than functions are not materialized on demand, so we need to eagerly load METADATA_GLOBAL_DECL_ATTACHMENT records (and their transitive dependencies). 2) When we load a single metadata, we don't recurse on the operands, instead we use a placeholder or a temporary metadata. Unfortunately tepmorary nodes are very expensive. This is why we don't have it always enabled and only for importing. These two limitations can be lifted in a subsequent improvement if needed. With this change, the total link time of opt with ThinLTO and Debug Info enabled is going down from 282s to 224s (~20%). Reviewers: pcc, tejohnson, dexonsmith Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28113 llvm-svn: 291027
* [PMBuilder] Remove RunFloat2Int cl::opt.Davide Italiano2017-01-021-6/+1
| | | | | | The pass has been on by default for a long time without problems. llvm-svn: 290814
* [PM] Teach the inliner's call graph update to handle inserting new edgesChandler Carruth2016-12-281-7/+9
| | | | | | | | | | | | | | | | | when they are call edges at the leaf but may (transitively) be reached via ref edges. It turns out there is a simple rule: insert everything as a ref edge which is a safe conservative default. Then we let the existing update logic handle promoting some of those to call edges. Note that it would be fairly cheap to make these call edges right away if that is desirable by testing whether there is some existing call path from the source to the target. It just seemed like slightly more complexity in this code path that isn't strictly necessary. If anyone feels strongly about handling this differently I'm happy to change it. llvm-svn: 290649
* [PM] Add one of the features left out of the initial inliner patch:Chandler Carruth2016-12-271-7/+23
| | | | | | | | | | | | | | | | | skipping indirectly recursive inline chains. To do this, we implicitly build an inline stack for each callsite and check prior to inlining that doing so would not form a cycle. This uses the exact same technique and even shares some code with the legacy PM inliner. This solution remains deeply unsatisfying to me because it means we cannot actually iterate the inliner externally. Doing so would not be able to easily detect and avoid such cycles. Some day I would very much like to have a solution that works without this internal state to detect cycles, but this is not that day. llvm-svn: 290590
* [PM] Teach the inliner in the new PM to merge attributes after inlining.Chandler Carruth2016-12-271-0/+3
| | | | | | | Also enable the new PM in the attributes test case which caught this issue. llvm-svn: 290572
* [PM] Teach the always inliner in the new pass manager to supportChandler Carruth2016-12-262-38/+30
| | | | | | | | | | | | | | | | removing fully-dead comdats without removing dead entries in comdats with live members. This factors the core logic out of the current inliner's internals to a reusable utility and leverages that in both places. The factored out code should also be (minorly) more efficient in cases where we have very few dead functions or dead comdats to consider. I've added a test case to cover this behavior of the always inliner. This is the last significant bug in the new PM's always inliner I've found (so far). llvm-svn: 290557
* [NewGVN] Add a flag to enable the pass via `-mllvm`.Davide Italiano2016-12-261-4/+14
| | | | | | | | NewGVN can be tested passing `-mllvm -enable-newgvn` to clang. Differential Revision: https://reviews.llvm.org/D28059 llvm-svn: 290548
* [PM] Teach the always inlining test case to be much more strict aboutChandler Carruth2016-12-231-0/+15
| | | | | | | | | | | | | | | | | | | | whether functions are removed, and fix the new PM's always inliner to actually pass this test. Without this, the new PM's always inliner leaves all the functions kicking around which won't work out very well given the semantics of always inline. Doing this really highlights how frustrating the current alwaysinline semantic contract is though -- why can we put it on *external* functions, etc? Also I've added a number of tricky and interesting test cases for removing functions with the always inliner. There is one remaining case not handled -- fully removing comdats -- and I've left a FIXME about this. llvm-svn: 290457
* Function-import: Disable IRVerifier on lazy-loaded modules: the ODR ↵Mehdi Amini2016-12-231-8/+0
| | | | | | TypeUniquing generates invalid debug info. llvm-svn: 290442
* Fix build after r290437 (missing include)Mehdi Amini2016-12-231-0/+1
| | | | llvm-svn: 290438
* FunctionImport: fix typo '#ifndef NDEBUG' instead of '#ifndef DEBUG'Mehdi Amini2016-12-231-1/+1
| | | | llvm-svn: 290437
* [ThinLTO] Verify lazy-loaded source module for function importing when ↵Mehdi Amini2016-12-231-0/+8
| | | | | | assertions are enabled (NFC) llvm-svn: 290416
* [cfi] Emit jump tables as a function-level inline asm.Evgeniy Stepanov2016-12-221-132/+90
| | | | | | | | | | | | | | | | | | Use a dummy private function with inline asm calls instead of module level asm blocks for CFI jumptables. The main advantage is that now jumptable codegen can be affected by the function attributes (like target_cpu on ARM). Module level asm gets the default subtarget based on the target triple, which is often not good enough. This change also uses asm constraints/arguments to reference jumptable targets and aliases directly. We no longer do asm name mangling in an IR pass. Differential Revision: https://reviews.llvm.org/D28012 llvm-svn: 290384
* Pass GetAssumptionCache to InlineFunctionInfo constructorEaswaran Raman2016-12-221-1/+1
| | | | | | Differential revision: https://reviews.llvm.org/D28038 llvm-svn: 290295
* [LDist] Match behavior between invoking via optimization pipeline or opt ↵Adam Nemet2016-12-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | -loop-distribute In r267672, where the loop distribution pragma was introduced, I tried it hard to keep the old behavior for opt: when opt is invoked with -loop-distribute, it should distribute the loop (it's off by default when ran via the optimization pipeline). As MichaelZ has discovered this has the unintended consequence of breaking a very common developer work-flow to reproduce compilations using opt: First you print the pass pipeline of clang with -debug-pass=Arguments and then invoking opt with the returned arguments. clang -debug-pass will include -loop-distribute but the pass is invoked with default=off so nothing happens unless the loop carries the pragma. While through opt (default=on) we will try to distribute all loops. This changes opt's default to off as well to match clang. The tests are modified to explicitly enable the transformation. llvm-svn: 290235
* IPO: Remove the ModuleSummary argument to the FunctionImport pass. NFCI.Peter Collingbourne2016-12-212-34/+15
| | | | | | | | | No existing client is passing a non-null value here. This will come back in a slightly different form as part of the type identifier summary work. Differential Revision: https://reviews.llvm.org/D28006 llvm-svn: 290222
* [PM] Provide an initial, minimal port of the inliner to the new pass manager.Chandler Carruth2016-12-203-19/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This doesn't implement *every* feature of the existing inliner, but tries to implement the most important ones for building a functional optimization pipeline and beginning to sort out bugs, regressions, and other problems. Notable, but intentional omissions: - No alloca merging support. Why? Because it isn't clear we want to do this at all. Active discussion and investigation is going on to remove it, so for simplicity I omitted it. - No support for trying to iterate on "internally" devirtualized calls. Why? Because it adds what I suspect is inappropriate coupling for little or no benefit. We will have an outer iteration system that tracks devirtualization including that from function passes and iterates already. We should improve that rather than approximate it here. - Optimization remarks. Why? Purely to make the patch smaller, no other reason at all. The last one I'll probably work on almost immediately. But I wanted to skip it in the initial patch to try to focus the change as much as possible as there is already a lot of code moving around and both of these *could* be skipped without really disrupting the core logic. A summary of the different things happening here: 1) Adding the usual new PM class and rigging. 2) Fixing minor underlying assumptions in the inline cost analysis or inline logic that don't generally hold in the new PM world. 3) Adding the core pass logic which is in essence a loop over the calls in the nodes in the call graph. This is a bit duplicated from the old inliner, but only a handful of lines could realistically be shared. (I tried at first, and it really didn't help anything.) All told, this is only about 100 lines of code, and most of that is the mechanics of wiring up analyses from the new PM world. 4) Updating the LazyCallGraph (in the new PM) based on the *newly inlined* calls and references. This is very minimal because we cannot form cycles. 5) When inlining removes the last use of a function, eagerly nuking the body of the function so that any "one use remaining" inline cost heuristics are immediately refined, and queuing these functions to be completely deleted once inlining is complete and the call graph updated to reflect that they have become dead. 6) After all the inlining for a particular function, updating the LazyCallGraph and the CGSCC pass manager to reflect the function-local simplifications that are done immediately and internally by the inline utilties. These are the exact same fundamental set of CG updates done by arbitrary function passes. 7) Adding a bunch of test cases to specifically target CGSCC and other subtle aspects in the new PM world. Many thanks to the careful review from Easwaran and Sanjoy and others! Differential Revision: https://reviews.llvm.org/D24226 llvm-svn: 290161
* [IR] Remove the DIExpression field from DIGlobalVariable.Adrian Prantl2016-12-201-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariables holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGLobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades and a change to the Bitcode record for DIGlobalVariable, that makes upgrading the old format unambiguous also for variables without DIExpressions. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 290153
* Revert @llvm.assume with operator bundles (r289755-r289757)Daniel Jasper2016-12-197-12/+65
| | | | | | | This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086
* Revert "[GVNHoist] Move GVNHoist to function simplification part of pipeline."Evgeniy Stepanov2016-12-171-2/+2
| | | | | | | | This reverts r289696, which caused TSan perf regression. See PR31382. llvm-svn: 290030
* Revert "[IR] Remove the DIExpression field from DIGlobalVariable."Adrian Prantl2016-12-161-10/+7
| | | | | | | | | | | | | | | | | This reverts commit 289920 (again). I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable has not DIExpression. Unfortunately it is not possible to safely upgrade these variables without adding a flag to the bitcode record indicating which version they are. My plan of record is to roll the planned follow-up patch that adds a unit: field to DIGlobalVariable into this patch before recomitting. This way we only need one Bitcode upgrade for both changes (with a version flag in the bitcode record to safely distinguish the record formats). Sorry for the churn! llvm-svn: 289982
* [IR] Remove the DIExpression field from DIGlobalVariable.Adrian Prantl2016-12-161-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariables holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGLobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289920
* [ThinLTO] Thin link efficiency: More efficient export list computationTeresa Johnson2016-12-161-32/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Instead of checking whether a global referenced by a function being imported is defined in the same module, speculatively always add the referenced globals to the module's export list. After all imports are computed, for each module prune any not in its defined set from its export list. For a huge C++ app with aggressive importing thresholds, even with D27687 we spent a lot of time invoking modulePath() from exportGlobalInModule (modulePath() was still the 2nd hottest routine in profile). The reason is that with comdat/linkonce the summary lists for each GUID can be long. For the app in question, for example, we were invoking exportGlobalInModule almost 2 million times, and we traversed an average of 63 entries in the summary list each time. This patch reduced the thin link time for the app by about 10% (on top of D27687) when using aggressive importing thresholds, and about 3.5% on average with default importing thresholds. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27755 llvm-svn: 289918
* Revert "[IR] Remove the DIExpression field from DIGlobalVariable."Adrian Prantl2016-12-161-10/+7
| | | | | | This reverts commit 289902 while investigating bot berakage. llvm-svn: 289906
* Add missing library dep.Peter Collingbourne2016-12-161-1/+1
| | | | llvm-svn: 289903
* [IR] Remove the DIExpression field from DIGlobalVariable.Adrian Prantl2016-12-161-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariables holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGLobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289902
* IPO: Introduce ThinLTOBitcodeWriter pass.Peter Collingbourne2016-12-162-0/+345
| | | | | | | | | | | | | | This pass prepares a module containing type metadata for ThinLTO by splitting it into regular and thin LTO parts if possible, and writing both parts to a multi-module bitcode file. Modules that do not contain type metadata are written unmodified as a single module. All globals with type metadata are added to the regular LTO module, and the rest are added to the thin LTO module. Differential Revision: https://reviews.llvm.org/D27324 llvm-svn: 289899
* [ThinLTO] Thin link efficiency improvement: don't re-export globals (NFC)Teresa Johnson2016-12-151-9/+13
| | | | | | | | | | | | | | | | | Summary: We were reinvoking exportGlobalInModule numerous times redundantly. No need to re-export globals referenced by a global that was already imported from its module. This resulted in a large speedup in the thin link for a big application, particularly when importing aggressiveness was cranked up. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27687 llvm-svn: 289896
* [ThinLTO] Revert part of r289843 that belonged to another patch.Teresa Johnson2016-12-151-13/+9
| | | | | | | | The code change for D27687 accidentally got committed along with the main change in r289843. Revert it temporarily, so that I can recommit it along with its test as intended. llvm-svn: 289875
* [ThinLTO] Remove stale comment (NFC)Teresa Johnson2016-12-151-2/+1
| | | | | | This should have been removed with r288446. llvm-svn: 289871
* [ThinLTO] Thin link efficiency: skip candidate added later with higher ↵Teresa Johnson2016-12-151-4/+13
| | | | | | | | | | | | | | | | | | | | | | | threshold (NFC) Summary: Thin link efficiency improvement. After adding an importing candidate to the worklist we might have later added it again with a higher threshold. Skip it when popped from the worklist if we recorded a higher threshold than the current worklist entry, it will get processed again at the higher threshold when that entry is popped. This required adding the summary's GUID to the worklist, so that it can be used to query the recorded highest threshold for it when we pop from the worklist. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27696 llvm-svn: 289867
* [ThinLTO] Ensure callees get hot threshold when first seen on cold pathTeresa Johnson2016-12-151-24/+28
| | | | | | | | | | | | | | | | | This is split out from D27696, since it turned out to be a bug fix and not part of the NFC efficiency change. Keep the same adjusted (possibly decayed) threshold in both the worklist and the ImportList. Otherwise if we encountered it first along a cold path, the callee would be added to the worklist with a lower decayed threshold than when it is later encountered along a hot path. But the logic uses the threshold recorded in the ImportList entry to check if we should re-add it, and without this patch the threshold recorded there is the same along both paths so we don't re-add it. Using the same possibly decayed threshold in the ImportList ensures we re-add it later with the higher non-decayed hot path threshold. llvm-svn: 289843
* Remove the AssumptionCacheHal Finkel2016-12-157-65/+12
| | | | | | | | | After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756
* Only sets profile summary when it was not preset.Dehao Chen2016-12-141-1/+2
| | | | | | | | | | | | Summary: SampleProfileLoader pass may be invoked twice by LTO. The 2nd pass should not append more summary info as it is already preset by the 1st pass. Reviewers: eraman, davidxl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D27733 llvm-svn: 289725
* Fix the bug in r289714 (NFC).Dehao Chen2016-12-141-1/+1
| | | | llvm-svn: 289724
* Create SampleProfileLoader pass in llvm instead of clangDehao Chen2016-12-141-0/+5
| | | | | | | | | | | | Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder. Reviewers: tejohnson, davidxl, dnovillo Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D27743 llvm-svn: 289714
* [GVNHoist] Move GVNHoist to function simplification part of pipeline.Geoff Berry2016-12-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Move GVNHoist to later in the optimization pipeline, specifically, to the function simplification part of the pipeline. The new pipeline location allows GVNHoist to run on a function after its callees have been inlined but before the function has been considered for inlining into its callers, exposing more opportunities for hoisting. Performance results on AArch64 kryo: Improvements: Benchmarks/CoyoteBench/fftbench -24.952% spec2006/bzip2 -4.071% internal bmark -3.177% Benchmarks/PAQ8p/paq8p -1.754% spec2000/perlbmk -1.328% spec2006/h264ref -1.140% Regressions: internal bmark +1.818% Benchmarks/mafft/pairlocalalign +1.084% Reviewers: sebpop, dberlin, hiraditya Subscribers: aemerson, mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27722 llvm-svn: 289696
OpenPOWER on IntegriCloud