summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/IPO
Commit message (Collapse)AuthorAgeFilesLines
...
* [FunctionAttrs][ArgumentPromotion][GlobalOpt] Disable some optimisations ↵Luke Cheeseman2018-02-223-2/+15
| | | | | | | | | | | | | | passes for naked functions - Fix for bug 36078. - Prevent the functionattrs, function-attrs, globalopt and argpromotion passes from changing naked functions. - These passes can perform some alterations to the functions that should not be applied. An example is removing parameters that are seemingly not used because they are only referenced in the inline assembly. Another example is marking the function as fastcc. llvm-svn: 325788
* [SampleProf] NFC. Expose reusable functionality in SampleProfile.Mircea Trofin2018-02-221-29/+9
| | | | | | | | | | | | | | | | | | Summary: Exposing getOffset and findFunctionSamples as members of SampleProfile. They are intimately tied to design choices of the sample profile format - using offsets instead of line numbers, and traversing inlined functions stack, respectively. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43605 llvm-svn: 325747
* [ThinLTO] Add GraphTraits for FunctionSummariesCharles Saternos2018-02-191-1/+1
| | | | | | | | Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. Third attempt - moved function from lambda to static function due to build failures. llvm-svn: 325506
* Revert: [llvm] r325448 - [ThinLTO] Add GraphTraits for FunctionSummaries Simon Pilgrim2018-02-181-1/+1
| | | | | | | | | | Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function). Reverted due to buildbot failures llvm-svn: 325454
* [ThinLTO] Add GraphTraits for FunctionSummariesCharles Saternos2018-02-171-1/+1
| | | | | | | | Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function). llvm-svn: 325448
* [ThinLTO] Import global variablesEugene Leviant2018-02-161-12/+84
| | | | | | Differential revision: https://reviews.llvm.org/D43077 llvm-svn: 325320
* Pass a module reference to CloneModule.Rafael Espindola2018-02-141-1/+1
| | | | | | | It can never be null and most callers were already using references or std::unique_ptr. llvm-svn: 325160
* Pass a reference to a module to the bitcode writer.Rafael Espindola2018-02-141-9/+8
| | | | | | | This simplifies most callers as they are already using references or std::unique_ptr. llvm-svn: 325155
* Revert "[ThinLTO] Add GraphTraits for FunctionSummaries"Volodymyr Sapsai2018-02-121-1/+1
| | | | | | | | | It caused assertion failure Assertion failed: (!DD.IsLambda && !MergeDD.IsLambda && "faked up lambda definition?"), function MergeDefinitionData, file /Users/buildslave/jenkins/workspace/clang-stage1-configure-RA/llvm/tools/clang/lib/Serialization/ASTReaderDecl.cpp, line 1675. on the second stage build bots. llvm-svn: 324932
* [ThinLTO] Add GraphTraits for FunctionSummariesCharles Saternos2018-02-111-1/+1
| | | | | | Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. llvm-svn: 324854
* [ThinLTO] Teach ThinLTO about auto hide symbolsSteven Wu2018-02-091-0/+7
| | | | | | | | | | | | | | | | | | Summary: For symbols that has linkonce_odr linkage and unnamed_addr, it can be auto hide by linker to avoid weak external symbols. Teach ThinLTO to perform auto hide so it can safely promote linkonce_odr to weak symbols without breaking this nice property. Reviewers: tejohnson, mehdi_amini Reviewed By: tejohnson Subscribers: inglorion, eraman, rnk, pcc, llvm-commits Differential Revision: https://reviews.llvm.org/D43130 llvm-svn: 324757
* [ThinLTO] Skip BlockAddresses while replacing uses in function import.Dmitry Mikulin2018-02-081-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D43027 llvm-svn: 324658
* Recommit r324455 "[ThinLTO] - Simplify code in ThinLTOBitcodeWriter."George Rimar2018-02-081-40/+12
| | | | | | | | | | | | | With fix: reimplemented. Original commit message: Recently introduced convertToDeclaration is very similar to code used in filterModule function. Patch reuses it to reduce duplication. Differential revision: https://reviews.llvm.org/D42971 llvm-svn: 324574
* Revert r324455 "[ThinLTO] - Simplify code in ThinLTOBitcodeWriter."George Rimar2018-02-071-11/+37
| | | | | | | It broke BB: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/23721 llvm-svn: 324458
* [ThinLTO] - Simplify code in ThinLTOBitcodeWriter.George Rimar2018-02-071-37/+11
| | | | | | | | | | Recently introduced convertToDeclaration is very similar to code used in filterModule function. Patch reuses it to reduce duplication. Differential revision: https://reviews.llvm.org/D42971 llvm-svn: 324455
* [DeadArgumentElim] Set pointer to DISubprogram before calling RAUW. NFCPetar Jovanovic2018-02-061-3/+3
| | | | | | | | | | | | It is better to update pointer of the DISuprogram before we call RAUW for still live arguments of the function, because with the change reviewed in D42541 in RAUW we compare DISubprograms rather than functions itself. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D42794 llvm-svn: 324335
* ThinLTOBitcodeWriter: Do not include module-level inline asm in the merged ↵Peter Collingbourne2018-02-061-0/+1
| | | | | | | | | | | module. If the inline asm provides the definition of a symbol, this can result in duplicate symbol errors. Differential Revision: https://reviews.llvm.org/D42944 llvm-svn: 324313
* [ThinLTO] Convert dead alias to declarationsTeresa Johnson2018-02-051-12/+26
| | | | | | | | | | | | | | | | Summary: This complements the fixes in r323633 and r324075 which drop the definitions of dead functions and variables, respectively. Fixes PR36208. Reviewers: grimar, rafael Subscribers: mehdi_amini, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D42856 llvm-svn: 324242
* [ThinLTO] - Add comment. NFC.George Rimar2018-02-021-0/+2
| | | | | | Was requested during review of D42798. llvm-svn: 324095
* [GlobalOpt] Include padding in debug fragmentsMikael Holmen2018-02-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When creating the debug fragments for a SRA'd variable, use the types' allocation sizes. This fixes issues where the pass would emit too small fragments, placed at the wrong offset, for padded types. An example of this is long double on x86. The type is represented using x86_fp80, which is 10 bytes, but the value is aligned to 12/16 bytes. The padding is included in the type's DW_AT_byte_size attribute; therefore, the fragments should also include that. Newer GCC releases (I tested 7.2.0) emit 12/16-byte pieces for long double. Earlier releases, e.g. GCC 5.5.0, behaved as LLVM did, i.e. by emitting a 10-byte piece, followed by an empty 2/6-byte piece for the padding. Failing to cover all `DW_AT_byte_size' bytes of a value with non-empty pieces results in the value being printed as <optimized out> by GDB. Patch by: David Stenberg Reviewers: aprantl, JDevlieghere Reviewed By: aprantl, JDevlieghere Subscribers: llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D42807 llvm-svn: 324066
* [GlobalOpt] Improve common case efficiency of static global initializer ↵Amara Emerson2018-01-311-2/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | evaluation For very, very large global initializers which can be statically evaluated, the code would create vectors of temporary Constants, modifying them in place, before committing the resulting Constant aggregate to the global's initializer value. This had effectively O(n^2) complexity in the size of the global initializer and would cause memory and non-termination issues compiling some workloads. This change performs the static initializer evaluation and creation in batches, once for each global in the evaluated IR memory. The existing code is maintained as a last resort when the initializers are more complex than simple values in a large aggregate. This should theoretically by NFC, no test as the example case is massive. The existing test cases pass with this, as well as the llvm test suite. To give an example, consider the following C++ code adapted from the clang regression tests: struct S { int n = 10; int m = 2 * n; S(int a) : n(a) {} }; template<typename T> struct U { T *r = &q; T q = 42; U *p = this; }; U<S> e; The global static constructor for 'e' will need to initialize 'r' and 'p' of the outer struct, while also initializing the inner 'q' structs 'n' and 'm' members. This batch algorithm will simply use general CommitValueTo() method to handle the complex nested S struct initialization of 'q', before processing the outermost members in a single batch. Using CommitValueTo() to handle member in the outer struct is inefficient when the struct/array is very large as we end up creating and destroy constant arrays for each initialization. For the above case, we expect the following IR to be generated: %struct.U = type { %struct.S*, %struct.S, %struct.U* } %struct.S = type { i32, i32 } @e = global %struct.U { %struct.S* gep inbounds (%struct.U, %struct.U* @e, i64 0, i32 1), %struct.S { i32 42, i32 84 }, %struct.U* @e } The %struct.S { i32 42, i32 84 } inner initializer is treated as a complex constant expression, while the other two elements of @e are "simple". Differential Revision: https://reviews.llvm.org/D42612 llvm-svn: 323933
* LTO: Drop comdats when converting definitions to declarations.Peter Collingbourne2018-01-311-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D42715 llvm-svn: 323844
* Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass ↵Zaara Syeda2018-01-301-6/+158
| | | | | | | | | | | | | | | | | | | | | to mark candidates with coldcc attribute. This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778
* [ThinLTO] - Stop internalizing and drop non-prevailing symbols.George Rimar2018-01-291-20/+30
| | | | | | | | | | | Implementation marks non-prevailing symbols as not live in the summary. Then them are dropped in backends. Fixes https://bugs.llvm.org/show_bug.cgi?id=35938 Differential revision: https://reviews.llvm.org/D42107 llvm-svn: 323633
* [SyntheticCounts] Rewrite the code using only graph traits.Easwaran Raman2018-01-251-6/+17
| | | | | | | | | | | | | | | | | | | Summary: The intent of this is to allow the code to be used with ThinLTO. In Thinlink phase, a traditional Callgraph can not be computed even though all the necessary information (nodes and edges of a call graph) is available. This is due to the fact that CallGraph class is closely tied to the IR. This patch first extends GraphTraits to add a CallGraphTraits graph. This is then used to implement a version of counts propagation on a generic callgraph. Reviewers: davidxl Subscribers: mehdi_amini, tejohnson, llvm-commits Differential Revision: https://reviews.llvm.org/D42311 llvm-svn: 323475
* Re-land "[ThinLTO] Add call edges' relative block frequency to per-module ↵Easwaran Raman2018-01-251-2/+3
| | | | | | | | | | | | | | summary." It was reverted after buildbot regressions. Original commit message: This allows relative block frequency of call edges to be passed to the thinlink stage where it will be used to compute synthetic entry counts of functions. llvm-svn: 323460
* Another try to commit 323321 (aggressive instruction combine).Amjad Aboud2018-01-252-1/+5
| | | | llvm-svn: 323416
* [GlobalOpt] Emit fragments using field offsets from struct layoutMikael Holmen2018-01-251-4/+2
| | | | | | | | | | | | | | | | | | | | | | Summary: When creating the debug fragments for a SRA'd struct, use the fields' offsets, taken from the struct layout, as the offsets for the resulting fragments. This fixes an issue where GlobalOpt would emit fragments with incorrect offsets for padded fields. This should solve PR36016. Patch by David Stenberg. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42489 llvm-svn: 323411
* Reverted 323321.Amjad Aboud2018-01-242-5/+1
| | | | llvm-svn: 323326
* [InstCombine] Introducing Aggressive Instruction Combine pass ↵Amjad Aboud2018-01-242-1/+5
| | | | | | | | | | | | | | | | | | (-aggressive-instcombine). Combine expression patterns to form expressions with fewer, simple instructions. This pass does not modify the CFG. For example, this pass reduce width of expressions post-dominated by TruncInst into smaller width when applicable. It differs from instcombine pass in that it contains pattern optimization that requires higher complexity than the O(1), thus, it should run fewer times than instcombine pass. Differential Revision: https://reviews.llvm.org/D38313 llvm-svn: 323321
* BlockExtractor: Remove unused variable. NFC.Volkan Keles2018-01-231-1/+1
| | | | llvm-svn: 323271
* [llvm-extract] Support extracting basic blocksVolkan Keles2018-01-234-153/+176
| | | | | | | | | | | | | | | | Summary: Currently, there is no way to extract a basic block from a function easily. This patch extends llvm-extract to extract the specified basic block(s). Reviewers: loladiro, rafael, bogner Reviewed By: bogner Subscribers: hintonda, mgorny, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D41638 llvm-svn: 323266
* [ThinLTO] Re-commit of dot dumper after test fixEugene Leviant2018-01-223-4/+4
| | | | llvm-svn: 323116
* Temporarily revert r323062 to investigate buildbot failuresEugene Leviant2018-01-213-4/+4
| | | | llvm-svn: 323065
* [ThinLTO] Implement summary visualizerEugene Leviant2018-01-213-4/+4
| | | | | | Differential revision: https://reviews.llvm.org/D41297 llvm-svn: 323062
* [NFC] fix trivial typos in commentsHiroshi Inoue2018-01-191-1/+1
| | | | | | "the the" -> "the" llvm-svn: 322934
* Add a ProfileCount class to represent entry counts.Easwaran Raman2018-01-173-6/+12
| | | | | | | | | | | | | | Summary: The class wraps a uint64_t and an enum to represent the type of profile count (real and synthetic) with some helper methods. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41883 llvm-svn: 322771
* Revert [PowerPC] This reverts commit rL322721Zaara Syeda2018-01-171-158/+6
| | | | | | Failing build bots. Revert the commit now. llvm-svn: 322748
* [PowerPC] Add handling for ColdCC calling convention and a pass to markZaara Syeda2018-01-171-6/+158
| | | | | | | | | | | | | | | | candidates with coldcc attribute. This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 322721
* Make internal/private GVs implicitly dso_local.Rafael Espindola2018-01-111-0/+1
| | | | | | | | | | | | | | | | While updating clang tests for having clang set dso_local I noticed that: - There are *a lot* of tests to update. - Many of the updates are redundant. They are redundant because a GV is "obviously dso_local". This patch starts formalizing that a bit by requiring that internal and private GVs be dso_local too. Since they all are, we don't have to print dso_local to the textual representation, making it a bit more compact and easier to read. llvm-svn: 322317
* LowerTypeTests: Add limited support for aliasesVlad Tsyrklevich2018-01-102-0/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: LowerTypeTests moves some function definitions from individual object files to the merged module, leaving a stub to be called in the merged module's jump table. If an alias was pointing to such a function definition LowerTypeTests would fail because the alias would be left without a definition to point to. This change 1) emits information about aliases to the ThinLTO summary, 2) replaces aliases pointing to function definitions that are moved to the merged module with function declarations, and 3) re-emits those aliases in the merged module pointing to the correct function definitions. The patch does not correctly fix all possible mis-uses of aliases in LowerTypeTests. For example, it does not handle aliases with a different type from the pointed to function. The addition of alias data increases the size of Chrome build artifacts by less than 1%. Reviewers: pcc Reviewed By: pcc Subscribers: mehdi_amini, eraman, mgrang, llvm-commits, eugenis, kcc Differential Revision: https://reviews.llvm.org/D41741 llvm-svn: 322139
* Add a pass to generate synthetic function entry counts.Easwaran Raman2018-01-093-1/+129
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This pass synthesizes function entry counts by traversing the callgraph and using the relative block frequencies of the callsites. The intended use of these counts is in inlining to determine hot/cold callsites in the absence of profile information. The pass is split into two files with the code that propagates the counts in a callgraph in a Utils file. I plan to add support for propagation in the thinlto link phase and the propagation code will be shared and hence this split. I did not add support to the old PM since hot callsite determination in inlining is not possible in old PM (although we could use hot callee heuristic with synthetic counts in the old PM it is not worth the effort tuning it) Reviewers: davidxl, silvas Subscribers: mgorny, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D41604 llvm-svn: 322110
* AlwaysInliner: Alow setting InsertLifetime in the new-style passJustin Bogner2018-01-081-1/+2
| | | | llvm-svn: 322033
* ArgPromotion: Allow setting MaxElements in the new-style passJustin Bogner2018-01-081-1/+1
| | | | llvm-svn: 322025
* WholeProgramDevirt: Simplify ORE getter mechanism for old PM. NFCI.Peter Collingbourne2018-01-051-34/+17
| | | | llvm-svn: 321841
* Remove superfluous copies in sample profiling.Benjamin Kramer2017-12-281-5/+6
| | | | | | No functionliaty change intended. llvm-svn: 321530
* Avoid int to string conversion in Twine or raw_ostream contexts.Benjamin Kramer2017-12-281-2/+1
| | | | | | Some output changes from uppercase hex to lowercase hex, no other functionality change intended. llvm-svn: 321526
* Add hasProfileData() to check if a function has profile data. NFC.Easwaran Raman2017-12-221-4/+4
| | | | | | | | | | | | | | | | | | | Summary: This replaces calls to getEntryCount().hasValue() with hasProfileData that does the same thing. This refactoring is useful to do before adding synthetic function entry counts but also a useful cleanup IMO even otherwise. I have used hasProfileData instead of hasRealProfileData as David had earlier suggested since I think profile implies "real" and I use the phrase "synthetic entry count" and not "synthetic profile count" but I am fine calling it hasRealProfileData if you prefer. Reviewers: davidxl, silvas Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41461 llvm-svn: 321331
* Silence a bunch of implicit fallthrough warningsAdrian Prantl2017-12-192-0/+2
| | | | llvm-svn: 321114
* [PGO] Fix handling of cold entry count for instrumented PGOTeresa Johnson2017-12-181-1/+4
| | | | | | | | | | | | | | | | | | | | | Summary: In r277849, getEntryCount was changed to return None when the entry count was 0, specifically for SamplePGO where it means no samples were recorded. However, for instrumentation PGO a 0 entry count should be returned directly, since it does mean that the function was completely cold. Otherwise we end up treating these functions conservatively in isFunctionEntryCold() and isColdBB(). Instead, for SamplePGO use -1 when there are no samples, and change getEntryCount to return None when the value is -1. Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41307 llvm-svn: 321018
OpenPOWER on IntegriCloud