path: root/llvm/lib/Transforms/IPO
Commit message | Author | Age | Files | Lines
* [IR] Remove the DIExpression field from DIGlobalVariable. | Adrian Prantl | 2016-12-16 | 1 | -7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source-level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289920
* [ThinLTO] Thin link efficiency: More efficient export list computation | Teresa Johnson | 2016-12-16 | 1 | -32/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Instead of checking whether a global referenced by a function being imported is defined in the same module, speculatively always add the referenced globals to the module's export list. After all imports are computed, for each module prune any not in its defined set from its export list. For a huge C++ app with aggressive importing thresholds, even with D27687 we spent a lot of time invoking modulePath() from exportGlobalInModule (modulePath() was still the 2nd hottest routine in profile). The reason is that with comdat/linkonce the summary lists for each GUID can be long. For the app in question, for example, we were invoking exportGlobalInModule almost 2 million times, and we traversed an average of 63 entries in the summary list each time. This patch reduced the thin link time for the app by about 10% (on top of D27687) when using aggressive importing thresholds, and about 3.5% on average with default importing thresholds. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27755 llvm-svn: 289918
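The speculate-then-prune idea described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: `GUIDSet`, `pruneExportLists`, and the string-keyed maps are hypothetical stand-ins, not the data structures or function names used in the actual patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

using GUID = std::uint64_t;
using GUIDSet = std::set<GUID>;

// Hypothetical pruning step: during import computation every referenced
// global was added to the export list of the module being imported from
// without checking where it is defined. Afterwards, drop from each module's
// export list every GUID that the module does not define.
void pruneExportLists(std::map<std::string, GUIDSet> &ExportLists,
                      const std::map<std::string, GUIDSet> &DefinedGUIDs) {
  for (auto &Entry : ExportLists) {
    GUIDSet &Exports = Entry.second;
    auto DefinedIt = DefinedGUIDs.find(Entry.first);
    if (DefinedIt == DefinedGUIDs.end()) {
      // The module defines nothing we know of, so it can export nothing.
      Exports.clear();
      continue;
    }
    const GUIDSet &Defined = DefinedIt->second;
    for (auto It = Exports.begin(); It != Exports.end();) {
      if (Defined.count(*It))
        ++It;
      else
        It = Exports.erase(It);
    }
  }
}
```

The point of the design is that the per-reference work becomes a plain set insertion, and the "is it defined here?" question is answered once per export-list entry at the end rather than once per reference.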
* Revert "[IR] Remove the DIExpression field from DIGlobalVariable." | Adrian Prantl | 2016-12-16 | 1 | -10/+7
| | | | | | This reverts commit 289902 while investigating bot breakage. llvm-svn: 289906
* Add missing library dep. | Peter Collingbourne | 2016-12-16 | 1 | -1/+1
| | | | llvm-svn: 289903
* [IR] Remove the DIExpression field from DIGlobalVariable. | Adrian Prantl | 2016-12-16 | 1 | -7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source-level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289902
* IPO: Introduce ThinLTOBitcodeWriter pass. | Peter Collingbourne | 2016-12-16 | 2 | -0/+345
| | | | | | | | | | | | | | This pass prepares a module containing type metadata for ThinLTO by splitting it into regular and thin LTO parts if possible, and writing both parts to a multi-module bitcode file. Modules that do not contain type metadata are written unmodified as a single module. All globals with type metadata are added to the regular LTO module, and the rest are added to the thin LTO module. Differential Revision: https://reviews.llvm.org/D27324 llvm-svn: 289899
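The partitioning rule stated above (globals with type metadata go to the regular LTO part, the rest to the thin LTO part) can be illustrated with a small sketch. `Global`, `SplitModule`, and `splitByTypeMetadata` are hypothetical names chosen for this example; the real pass operates on LLVM modules and writes bitcode, which is not modelled here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a global value; only the property the split
// depends on is modelled.
struct Global {
  std::string Name;
  bool HasTypeMetadata;
};

struct SplitModule {
  std::vector<std::string> RegularLTO; // globals carrying type metadata
  std::vector<std::string> ThinLTO;    // everything else
};

// Partition the globals of one module. If no global carries type metadata,
// RegularLTO stays empty, which corresponds to writing a single module.
SplitModule splitByTypeMetadata(const std::vector<Global> &Globals) {
  SplitModule Result;
  for (const Global &G : Globals) {
    if (G.HasTypeMetadata)
      Result.RegularLTO.push_back(G.Name);
    else
      Result.ThinLTO.push_back(G.Name);
  }
  return Result;
}
```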
* [ThinLTO] Thin link efficiency improvement: don't re-export globals (NFC) | Teresa Johnson | 2016-12-15 | 1 | -9/+13
| | | | | | | | | | | | | | | | | Summary: We were reinvoking exportGlobalInModule numerous times redundantly. No need to re-export globals referenced by a global that was already imported from its module. This resulted in a large speedup in the thin link for a big application, particularly when importing aggressiveness was cranked up. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27687 llvm-svn: 289896
* [ThinLTO] Revert part of r289843 that belonged to another patch. | Teresa Johnson | 2016-12-15 | 1 | -13/+9
| | | | | | | | The code change for D27687 accidentally got committed along with the main change in r289843. Revert it temporarily, so that I can recommit it along with its test as intended. llvm-svn: 289875
* [ThinLTO] Remove stale comment (NFC) | Teresa Johnson | 2016-12-15 | 1 | -2/+1
| | | | | | This should have been removed with r288446. llvm-svn: 289871
* [ThinLTO] Thin link efficiency: skip candidate added later with higher threshold (NFC) | Teresa Johnson | 2016-12-15 | 1 | -4/+13
| | | | | | | | | | | | | | | | | | | | | | | Summary: Thin link efficiency improvement. After adding an importing candidate to the worklist we might have later added it again with a higher threshold. Skip it when popped from the worklist if we recorded a higher threshold than the current worklist entry; it will get processed again at the higher threshold when that entry is popped. This required adding the summary's GUID to the worklist, so that it can be used to query the recorded highest threshold for it when we pop from the worklist. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27696 llvm-svn: 289867
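The stale-entry check described above can be sketched as follows. This is a simplified model, assuming a worklist of (GUID, threshold) pairs and a separate map recording the highest threshold seen per GUID; `WorklistEntry` and `drainWorklist` are hypothetical names, not the ones used in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using GUID = std::uint64_t;

// Hypothetical worklist entry: the candidate's GUID plus the threshold it
// was queued with.
struct WorklistEntry {
  GUID Callee;
  unsigned Threshold;
};

// Pop entries from the back of the worklist and return the GUIDs that were
// actually processed. An entry is skipped when a higher threshold has been
// recorded for the same GUID, because the entry carrying that higher
// threshold will be processed instead.
std::vector<GUID>
drainWorklist(std::vector<WorklistEntry> Worklist,
              const std::map<GUID, unsigned> &HighestThreshold) {
  std::vector<GUID> Processed;
  while (!Worklist.empty()) {
    WorklistEntry Entry = Worklist.back();
    Worklist.pop_back();
    auto It = HighestThreshold.find(Entry.Callee);
    if (It != HighestThreshold.end() && It->second > Entry.Threshold)
      continue; // stale entry: queued again later with a higher threshold
    Processed.push_back(Entry.Callee);
  }
  return Processed;
}
```

Carrying the GUID in the entry is what makes the lookup possible at pop time, which matches the reason the commit message gives for adding it to the worklist.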
* [ThinLTO] Ensure callees get hot threshold when first seen on cold path | Teresa Johnson | 2016-12-15 | 1 | -24/+28
| | | | | | | | | | | | | | | | | This is split out from D27696, since it turned out to be a bug fix and not part of the NFC efficiency change. Keep the same adjusted (possibly decayed) threshold in both the worklist and the ImportList. Otherwise if we encountered it first along a cold path, the callee would be added to the worklist with a lower decayed threshold than when it is later encountered along a hot path. But the logic uses the threshold recorded in the ImportList entry to check if we should re-add it, and without this patch the threshold recorded there is the same along both paths so we don't re-add it. Using the same possibly decayed threshold in the ImportList ensures we re-add it later with the higher non-decayed hot path threshold. llvm-svn: 289843
* Remove the AssumptionCache | Hal Finkel | 2016-12-15 | 7 | -65/+12
| | | | | | | | | After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756
* Only sets profile summary when it was not preset. | Dehao Chen | 2016-12-14 | 1 | -1/+2
| | | | | | | | | | | | Summary: SampleProfileLoader pass may be invoked twice by LTO. The 2nd pass should not append more summary info as it is already preset by the 1st pass. Reviewers: eraman, davidxl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D27733 llvm-svn: 289725
* Fix the bug in r289714 (NFC). | Dehao Chen | 2016-12-14 | 1 | -1/+1
| | | | llvm-svn: 289724
* Create SampleProfileLoader pass in llvm instead of clang | Dehao Chen | 2016-12-14 | 1 | -0/+5
| | | | | | | | | | | | Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder. Reviewers: tejohnson, davidxl, dnovillo Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D27743 llvm-svn: 289714
* [GVNHoist] Move GVNHoist to function simplification part of pipeline. | Geoff Berry | 2016-12-14 | 1 | -2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Move GVNHoist to later in the optimization pipeline, specifically, to the function simplification part of the pipeline. The new pipeline location allows GVNHoist to run on a function after its callees have been inlined but before the function has been considered for inlining into its callers, exposing more opportunities for hoisting. Performance results on AArch64 kryo: Improvements: Benchmarks/CoyoteBench/fftbench -24.952%; spec2006/bzip2 -4.071%; internal bmark -3.177%; Benchmarks/PAQ8p/paq8p -1.754%; spec2000/perlbmk -1.328%; spec2006/h264ref -1.140%. Regressions: internal bmark +1.818%; Benchmarks/mafft/pairlocalalign +1.084%. Reviewers: sebpop, dberlin, hiraditya Subscribers: aemerson, mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27722 llvm-svn: 289696
* revert r289669 which breaks bots | Dehao Chen | 2016-12-14 | 1 | -5/+0
| | | | llvm-svn: 289676
* Create SampleProfileLoader pass in llvm instead of clang | Dehao Chen | 2016-12-14 | 1 | -0/+5
| | | | | | | | | | | | Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder. Reviewers: tejohnson, davidxl, dnovillo Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D27743 llvm-svn: 289669
* Change CoverageTracker from a global variable to member variable to avoid breaking thread-safety. (NFC) | Dehao Chen | 2016-12-13 | 1 | -52/+52
| | | | | | llvm-svn: 289603
* [ThinLTO] Remove useless code (NFC) | Teresa Johnson | 2016-12-12 | 1 | -4/+0
| | | | | | Should have been removed in r288446. llvm-svn: 289466
* WholeProgramDevirt: Teach the pass to handle structs of arrays. | Peter Collingbourne | 2016-12-09 | 1 | -23/+22
| | | | | | This will become necessary in some cases once D22296 lands. llvm-svn: 289165
* Make WholeProgramDevirt understand ConstStruct vtables. | Peter Collingbourne | 2016-12-09 | 1 | -13/+37
| | | | | | | | Based on a patch by LemonBoy! Differential Revision: https://reviews.llvm.org/D26581 llvm-svn: 289162
* CFI-icall on Thumb | Evgeniy Stepanov | 2016-12-08 | 1 | -4/+14
| | | | | | | | | | Replace @progbits in the section directive with %progbits, because "@" starts a comment on arm/thumb. Use b.w branch instruction. Use .thumb_function and .thumb_set for proper arm/thumb interwork. This way jumptable entry addresses on thumb have bit 0 set (correctly). This does not affect CFI check math, because the address of the jumptable start also has that bit set. This does not work on thumbv5, because it does not support b.w, and the linker would not insert a veneer (trampoline?) to extend the range of b.n. We may need to do full-range plt-style jumptables on thumbv5, which are 12 bytes per entry. Another option is "push lr; bl; pop pc" (4 bytes) but that needs unwinding instructions, etc. Differential Revision: https://reviews.llvm.org/D27499 llvm-svn: 289008
* Try unbreaking the MSVC build. | Benjamin Kramer | 2016-12-07 | 1 | -1/+1
| | | | llvm-svn: 288907
* [LowerTypeTests] Use the TrailingObjects infrastructure for trailing objects. | Benjamin Kramer | 2016-12-07 | 1 | -6/+10
| | | | | | Also avoid allocating ~3x as much memory as needed. llvm-svn: 288904
* LowerTypeTests: Improve performance by optimising type metadata queries. | Peter Collingbourne | 2016-12-06 | 1 | -88/+129
| | | | | | | | | | | | | | | | | | | | | | Requesting metadata for a global is a relatively expensive operation as it involves a map lookup, but it's one that we need to do relatively frequently in this pass to collect the list of type metadata nodes associated with a global. This change improves the performance of type metadata queries by prebuilding data structures that keep the global together with its list of type metadata, and changing the pass to use that data structure wherever we were previously passing global references around. This change also eliminates some O(N^2) behavior by collecting the list of globals associated with each type identifier during the first pass over the list of globals rather than visiting each global to compute that list every time we add a new type identifier. Reduces pass runtime on a module containing Chrome's vtables from over 60s to 0.9s. Differential Revision: https://reviews.llvm.org/D27484 llvm-svn: 288859
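The second optimisation above, collecting the globals for every type identifier in one pass instead of rescanning all globals for each identifier, can be sketched as below. The types and the function name are hypothetical; the real pass works on LLVM metadata nodes rather than strings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical prebuilt record: a global kept together with its list of
// type identifiers, so the metadata lookup happens once per global.
struct GlobalWithTypes {
  std::string Name;
  std::vector<std::string> TypeIds;
};

// Build the type identifier -> globals index in a single pass over the
// globals. Each global is visited once, so the cost is proportional to the
// total number of (global, type identifier) pairs rather than to
// (number of globals) x (number of type identifiers).
std::map<std::string, std::vector<std::size_t>>
indexGlobalsByTypeId(const std::vector<GlobalWithTypes> &Globals) {
  std::map<std::string, std::vector<std::size_t>> Index;
  for (std::size_t I = 0; I < Globals.size(); ++I)
    for (const std::string &TypeId : Globals[I].TypeIds)
      Index[TypeId].push_back(I); // store the global's position in Globals
  return Index;
}
```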
* IR: Move NumElements field from {Array,Vector}Type to SequentialType. | Peter Collingbourne | 2016-12-02 | 1 | -12/+2
| | | | | | | | | | Now that PointerType is no longer a SequentialType, all SequentialTypes have an associated number of elements, so we can move that information to the base class, allowing for a number of simplifications. Differential Revision: https://reviews.llvm.org/D27122 llvm-svn: 288464
* IR: Change PointerType to derive from Type rather than SequentialType. | Peter Collingbourne | 2016-12-02 | 1 | -3/+5
| | | | | | | | | | | | | | | | | | | As proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106640.html This is for a couple of reasons: - Values of type PointerType are unlike the other SequentialTypes (arrays and vectors) in that they do not hold values of the element type. By moving PointerType we can unify certain aspects of how the other SequentialTypes are handled. - PointerType will have no place in the SequentialType hierarchy once pointee types are removed, so this is a necessary step towards removing pointee types. Differential Revision: https://reviews.llvm.org/D26595 llvm-svn: 288462
* IR: Change the gep_type_iterator API to avoid always exposing the "current" type. | Peter Collingbourne | 2016-12-02 | 1 | -13/+7
| | | | | | | | | | | Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
* [ThinLTO] Stop importing constant global vars as copies in the backend | Teresa Johnson | 2016-12-02 | 1 | -12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We were doing an optimization in the ThinLTO backends of importing constant unnamed_addr globals unconditionally as a local copy (regardless of whether the thin link decided to import them). This should be done in the thin link instead, so that resulting exported references are marked and promoted appropriately, but will need a summary enhancement to mark these variables as constant unnamed_addr. The function import logic during the thin link was trying to handle this proactively, by conservatively marking all values referenced in the initializer lists of exported global variables as also exported. However, this only handled values referenced directly from the initializer list of an exported global variable. If the value is itself a constant unnamed_addr variable, we could end up exporting its references as well. This caused multiple issues. The first is that the transitively exported references weren't promoted. Secondly, some could not be promoted/renamed (e.g. they had a section or other constraint). recursively, instead of just adding the first level of initializer list references to the ExportList directly. Remove this optimization and the associated handling in the function import backend. SPEC measurements indicate we weren't getting much from it in any case. Fixes PR31052. Reviewers: mehdi_amini Subscribers: krasin, llvm-commits Differential Revision: https://reviews.llvm.org/D26880 llvm-svn: 288446
* Object: Extract a ModuleSymbolTable class from IRObjectFile. | Peter Collingbourne | 2016-12-01 | 1 | -1/+1
| | | | | | | | | | | | This class represents a symbol table built from in-memory IR. It provides access to GlobalValues and should only be used if such access is required (e.g. in the LTO implementation). We will eventually change IRObjectFile to read from a bitcode symbol table rather than using ModuleSymbolTable, so it would not be able to expose the module. Differential Revision: https://reviews.llvm.org/D27073 llvm-svn: 288319
* Use CallSite to simplify code | David Blaikie | 2016-11-29 | 1 | -5/+3
| | | | llvm-svn: 288192
* Replace some callers of setTailCall with setTailCallKind | David Majnemer | 2016-11-25 | 2 | -6/+6
| | | | | | | We were a little sloppy with adding tailcall markers. Be more consistent by using setTailCallKind instead of setTailCall. llvm-svn: 287955
* Before sample pgo annotation, do not inline a function that has no debug info. (NFC) | Dehao Chen | 2016-11-22 | 1 | -0/+2
| | | | | | | | If there is no debug info in the callee, inlining it will not help the annotator. This avoids an infinite loop as reported in PR/31119. llvm-svn: 287710
* [GlobalSplit] Port to the new pass manager. | Davide Italiano | 2016-11-21 | 1 | -0/+7
| | | | llvm-svn: 287511
* Fix spelling mistakes in Transforms comments. NFC. | Simon Pilgrim | 2016-11-20 | 1 | -1/+1
| | | | | | Identified by Pedro Giffuni in PR27636. llvm-svn: 287488
* [CMake] NFC. Updating CMake dependency specifications | Chris Bieneman | 2016-11-17 | 1 | -2/+3
| | | | | | This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system. llvm-svn: 287206
* Introduce GlobalSplit pass. | Peter Collingbourne | 2016-11-16 | 4 | -0/+171
| | | | | | | | | This pass splits globals into elements using inrange annotations on getelementptr indices. Differential Revision: https://reviews.llvm.org/D22295 llvm-svn: 287178
* [ThinLTO] Only promote exported locals as marked in index | Teresa Johnson | 2016-11-14 | 1 | -0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We have always speculatively promoted all renamable local values (except const non-address taken variables) for both the exporting and importing module. We would then internalize them back based on the ThinLink results if they weren't actually exported. This is inefficient, and results in unnecessary renames. It also meant we had to check the non-renamability of a value in the summary, which was already checked during function importing analysis in the ThinLink. Made renameModuleForThinLTO (which does the promotion/renaming) instead use the index when exporting, to avoid unnecessary renames/promotions. For importing modules, we can simply promote all values as any local we import by definition is exported and needs promotion. This required changes to the method used by the FunctionImport pass (only invoked from 'opt' for testing) and when invoked from llvm-link, since neither does a ThinLink. We simply conservatively mark all locals in the index as promoted, which preserves the current aggressive promotion behavior. I also needed to change an llvm-lto based test where we had previously been aggressively promoting values that weren't importable (aliasees), but now will not promote. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26467 llvm-svn: 286871
* [ThinLTO] Make inline assembly handling more efficient in summary | Teresa Johnson | 2016-11-14 | 1 | -1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The change in r285513 to prevent exporting of locals used in inline asm added all locals in the llvm.used set to the reference set of functions containing inline asm. Since these locals were marked NoRename, this automatically prevented importing of the function. Unfortunately, this caused an explosion in the summary reference lists in some cases. In my particular example, it happened for a large protocol buffer generated C++ file, where many of the generated functions contained an inline asm call. It was exacerbated when doing a ThinLTO PGO instrumentation build, where the PGO instrumentation included thousands of private __profd_* values that were added to llvm.used. We really only need to include a single llvm.used local (NoRename) value in the reference list of a function containing inline asm to block it being imported. However, it seems cleaner to add a flag to the summary that explicitly describes this situation, which is what this patch does. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26402 llvm-svn: 286840
* Bitcode: Change module reader functions to return an llvm::Expected. | Peter Collingbourne | 2016-11-13 | 1 | -1/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D26562 llvm-svn: 286752
* [cfi] Fix weak functions handling. | Evgeniy Stepanov | 2016-11-11 | 1 | -2/+74
| | | | | | | | | | | | | | | When a function pointer is replaced with a jumptable pointer, special case is needed to preserve the semantics of extern_weak functions. Since a jumptable entry can not be extern_weak, we emulate that behaviour by replacing all references to F (the extern_weak function) with the following expression: F != nullptr ? JumpTablePtr : nullptr. Extra special care is needed for global initializers, since most (or probably all) backends can not lower an initializer that includes this kind of constant expression. Initializers like that are replaced with a global constructor (i.e. a runtime initializer). llvm-svn: 286636
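The replacement expression quoted above, F != nullptr ? JumpTablePtr : nullptr, behaves like the following ordinary C++ function. This is only a source-level analogy: the pass rewrites LLVM IR constants, and `targetFunction`, `jumpTableEntry`, and `resolveWeakReference` are hypothetical names introduced for the illustration.

```cpp
#include <cassert>

using FnPtr = void (*)();

// Hypothetical placeholders for a real function and its jump table entry.
void targetFunction() {}
void jumpTableEntry() {}

// Mirrors the expression from the commit message:
//   F != nullptr ? JumpTablePtr : nullptr
// A reference to an undefined extern_weak function must still compare equal
// to null after the rewrite, even though the jump table entry itself always
// exists.
FnPtr resolveWeakReference(FnPtr F, FnPtr JumpTablePtr) {
  assert(JumpTablePtr != nullptr && "jump table entry must exist");
  return F != nullptr ? JumpTablePtr : nullptr;
}
```

Because the result depends on whether F resolves at link or load time, a global initialized with such a value cannot in general be a link-time constant, which is consistent with the commit's choice to move those initializers into a global constructor.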
* Make the FunctionComparator of the MergeFunctions pass a stand-alone utility. | Erik Eckstein | 2016-11-11 | 1 | -1217/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is pure refactoring. NFC. This change moves the FunctionComparator (together with the GlobalNumberState utility) in to a separate file so that it can be used by other passes. For example, the SwiftMergeFunctions pass in the Swift compiler: https://github.com/apple/swift/blob/master/lib/LLVMPasses/LLVMMergeFunctions.cpp Details of the change: *) The big part is just moving code out of MergeFunctions.cpp into FunctionComparator.h/cpp *) Make FunctionComparator member functions protected (instead of private) so that a derived comparator class can use them. Following refactoring helps to share code between the base FunctionComparator class and a derived class: *) Add a beginCompare() function *) Move some basic function property comparisons into a separate function compareSignature() *) Do the GEP comparison inside cmpOperations() which now has a new needToCmpOperands reference parameter https://reviews.llvm.org/D25385 llvm-svn: 286632
* Bitcode: Change getModuleSummaryIndex() to return an llvm::Expected. | Peter Collingbourne | 2016-11-11 | 1 | -35/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D26539 llvm-svn: 286624
* [cfi] Implement cfi-icall using inline assembly. | Evgeniy Stepanov | 2016-11-11 | 1 | -75/+159
| | | | | | | | | | | | | | | | | | | | | | | | | The current implementation is emitting a global constant that happens to evaluate to the same bytes + relocation as a jump instruction on X86. This does not work for PIE executables and shared libraries though, because we end up with a wrong relocation type. And it has no chance of working on ARM/AArch64 which use different relocation types for jump instructions (R_ARM_JUMP24) that is never generated for data. This change replaces the constant with module-level inline assembly followed by a hidden declaration of the jump table. Works fine for ARM/AArch64, but has some drawbacks. * Extra symbols are added to the static symbol table, which inflate the size of the unstripped binary a little. Stripped binaries are not affected. This happens because jump table declarations must be external (because their body is in the inline asm). * Original functions that were anonymous are now named <original name>.cfi, and it affects symbolization sometimes. This is necessary because the only user of these functions is the (inline asm) jump table, so they had to be added to @llvm.used, which does not allow unnamed functions. llvm-svn: 286611
* Add comments about why we put LoopSink pass at the very late stage. | Dehao Chen | 2016-11-10 | 1 | -0/+4
| | | | llvm-svn: 286480
* Bitcode: Change the materializer interface to return llvm::Error. | Peter Collingbourne | 2016-11-09 | 1 | -8/+22
| | | | | | Differential Revision: https://reviews.llvm.org/D26439 llvm-svn: 286382
* Enable Loop Sink pass for functions that has profile. | Dehao Chen | 2016-11-09 | 1 | -4/+4
| | | | | | | | | | | | Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code. Reviewers: davidxl, hfinkel Subscribers: mehdi_amini, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D26155 llvm-svn: 286325
* Fix typo in comment. NFC. | Chad Rosier | 2016-11-08 | 1 | -2/+2
| | | | llvm-svn: 286270
* Remove unused include. NFC. | Chad Rosier | 2016-11-08 | 1 | -1/+0
| | | | llvm-svn: 286250