path: root/llvm/lib/Transforms/IPO
Commit message | Author | Age | Files | Lines
* Vector of pointers in function attributes calculation | Elena Demikhovsky | 2015-11-17 | 1 | -1/+1
| | | | | | | | | While setting function attributes we check all instructions that may access memory. For a call instruction we check all arguments. The special check is required for pointers. I added vector-of-pointers to the call arguments types that should be checked. Differential Revision: http://reviews.llvm.org/D14693 llvm-svn: 253363
* [GlobalOpt] Address post-commit review comments on r253168 | James Molloy | 2015-11-16 | 1 | -3/+17
| | | | | | | | Address Duncan Exon Smith's comments on D14148, which were added after the patch had been LGTM'd and committed: * clang-format one area where whitespace diffs occurred. * Add a threshold to limit the store/load dominance checks as they are quadratic. llvm-svn: 253192
* Move helper classes into anonymous namespaces. NFC. | Benjamin Kramer | 2015-11-16 | 1 | -0/+2
| | | | llvm-svn: 253189
* [GlobalOpt] Demote globals to locals more aggressively | James Molloy | 2015-11-15 | 1 | -7/+76
| | | | | | | | | | | | | | | | Global to local demotion can speed up programs that use globals a lot. It is particularly useful with LTO, when the entire call graph is known and most functions have been internalized. For a global to be demoted, it must only be accessed by one function and that function: 1. Must never recurse directly or indirectly, else the GV would be clobbered. 2. Must never rely on the value in GV at the start of the function (apart from the initializer). GlobalOpt can already do this, but it is hamstrung and only ever tries to demote globals inside "main", because C++ gives extra guarantees about how main is called - once and only once. In LTO mode, we can often prove the first property (if the function is internal by this point, we know enough about the callgraph to determine if it could possibly recurse). FunctionAttrs now infers the "norecurse" attribute for this reason. The second property can be proven for a subset of functions by proving that all loads from GV are dominated by a store to GV. This is conservative in the name of compile time - this only requires a DominatorTree which is fairly cheap in the grand scheme of things. We could do more fancy stuff with MemoryDependenceAnalysis too to catch more cases but this appears to catch most of the useful ones in my testing. llvm-svn: 253168
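The second property above (every load from the global is dominated by a store to it) can be sketched on a toy control-flow graph. This is a simplified illustration, not GlobalOpt's code: the names `dominators` and `loads_dominated_by_store`, the dictionary-based CFG, and the per-block access lists are all assumptions made for the example, and the compile-time threshold added later in r253192 is not modeled.

```python
def dominators(cfg, entry):
    """Iteratively compute dom[b], the set of blocks that dominate b.

    cfg maps each block name to the list of its successor blocks.
    """
    blocks = set(cfg)
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    preds = {b: [p for p in cfg if b in cfg[p]] for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            pred_doms = [dom[p] for p in preds[b]]
            new = {b} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def loads_dominated_by_store(cfg, entry, accesses):
    """Check that every load of the global is dominated by some store.

    accesses maps a block name to its 'load'/'store' operations on the
    global, in program order (a hypothetical, simplified representation).
    """
    dom = dominators(cfg, entry)
    store_blocks = {b for b, ops in accesses.items() if "store" in ops}
    for block, ops in accesses.items():
        for i, op in enumerate(ops):
            if op != "load":
                continue
            if "store" in ops[:i]:
                continue  # an earlier store in the same block dominates it
            if not any(s != block and s in dom[block] for s in store_blocks):
                return False
    return True


# Diamond-shaped CFG: entry branches to a and b, which both reach exit.
cfg = {"entry": ["a", "b"], "a": ["exit"], "b": ["exit"], "exit": []}
stored_on_entry = loads_dominated_by_store(
    cfg, "entry", {"entry": ["store"], "exit": ["load"]})
stored_on_one_arm = loads_dominated_by_store(
    cfg, "entry", {"a": ["store"], "exit": ["load"]})
```

A store on only one arm of the diamond does not dominate the load in `exit`, so that case is rejected. This is the conservative behavior the commit message describes.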
* [GlobalOpt] Make sure all debug lines end with '\n' | James Molloy | 2015-11-13 | 1 | -2/+2
| | | | | | GlobalVariable::print() used to emit a newline. It hasn't for a while now, but these debug lines weren't updated. llvm-svn: 253030
* [GlobalOpt] Coding style - remove function names from doxygen comments | James Molloy | 2015-11-13 | 1 | -126/+115
| | | | | | Suggested by Mehdi in the review of D14148. llvm-svn: 253029
* Revert r252990. | Akira Hatanaka | 2015-11-13 | 1 | -1/+34
| | | | | | Some of the buildbots are still failing. llvm-svn: 252999
* Provide a way to specify inliner's attribute compatibility and merging. | Akira Hatanaka | 2015-11-13 | 1 | -34/+1
| | | | | | | | | | | | | | | | | | This reapplies r252949. I've changed the type of FuncName to be std::string instead of StringRef in emitFnAttrCompatCheck. Original commit message for r252949: Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 252990
* Revert r252949. | Akira Hatanaka | 2015-11-12 | 1 | -1/+34
| | | | | | It broke some of the bots including clang-x64-ninja-win7. llvm-svn: 252951
* Provide a way to specify inliner's attribute compatibility and merging | Akira Hatanaka | 2015-11-12 | 1 | -34/+1
| | | | | | | | | | | | rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 252949
* Revert "Revert "[FunctionAttrs] Identify norecurse functions"" | James Molloy | 2015-11-12 | 1 | -1/+78
| | | | | | This reapplies this patch, with test fixes. llvm-svn: 252871
* Revert "[FunctionAttrs] Identify norecurse functions" | James Molloy | 2015-11-12 | 1 | -78/+1
| | | | | | This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened. llvm-svn: 252863
* [FunctionAttrs] Identify norecurse functions | James Molloy | 2015-11-12 | 1 | -1/+78
| | | | | | | | | | | | | A function can be marked as norecurse if: * The SCC to which it belongs has cardinality 1; and either a) It does not call any non-norecurse function. This includes self-recursion; or b) It only has one callsite and the function that callsite is within is marked norecurse. a) is best propagated bottom-up and b) is best propagated top-down. We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down. llvm-svn: 252862
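Rule a) above can be sketched as a small fixpoint over a call graph. This is an illustration under stated assumptions, not the FunctionAttrs implementation: it swaps the bottom-up SCC traversal for a plain fixpoint loop (functions in a call cycle never qualify, because each waits on the other), it omits the top-down rule b), and the name `infer_norecurse` and the dictionary call graph are hypothetical.

```python
def infer_norecurse(calls):
    """Return the set of functions provably norecurse under rule a).

    calls maps each defined function to the list of functions it calls.
    Callees that are not keys of `calls` are treated as unknown external
    functions and are never assumed to be norecurse.
    """
    norecurse = set()
    changed = True
    while changed:
        changed = False
        for func, callees in calls.items():
            if func in norecurse or func in callees:
                continue  # already proven, or directly self-recursive
            if all(callee in norecurse for callee in callees):
                norecurse.add(func)
                changed = True
    return norecurse


calls = {
    "leaf": [],            # calls nothing
    "mid": ["leaf"],       # only calls a norecurse function
    "self": ["self"],      # direct recursion
    "a": ["b"],            # mutual recursion between a and b
    "b": ["a"],
    "ext_user": ["printf"],  # calls an unknown external function
}
result = infer_norecurse(calls)
```

Only `leaf` and `mid` are marked. The self-recursive function, the mutually recursive pair, and the caller of an unknown function are all left unmarked.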
* [IR] Add support for empty tokens | David Majnemer | 2015-11-11 | 1 | -1/+3
| | | | | | | | | | | | | | When working with tokens, it is often the case that one has instructions which consume a token and produce a new token. Currently, we have no mechanism to represent an initial token state. Instead, we can create a notional "empty token" by inventing a new constant which captures the semantics we would like. This new constant is called ConstantTokenNone and is written textually as "token none". Differential Revision: http://reviews.llvm.org/D14581 llvm-svn: 252811
* GlobalOpt should maintain externally_initialized when splitting aggregates | Oliver Stannard | 2015-11-09 | 1 | -0/+2
| | | | | | | | | | | | | When GlobalOpt splits an internal, global variable with an aggregate type, it should propagate the externally_initialized flag to the newly created globals. This makes the pass safe for our downstream use of this flag, while still allowing some useful optimisations (such as removing dead parts of the split aggregate) to be performed. Differential Revision: http://reviews.llvm.org/D13382 llvm-svn: 252490
* Unbreak the build | Sanjoy Das | 2015-11-07 | 1 | -1/+1
| | | | | | | My code clashed with some ilist iterator changes upstream. Fix by adding an explicit "&*" coercion. llvm-svn: 252392
* [FunctionAttrs] Add comment and clarify assertion message; NFC | Sanjoy Das | 2015-11-07 | 1 | -1/+6
| | | | llvm-svn: 252389
* [FunctionAttrs] Add handling for operand bundles | Sanjoy Das | 2015-11-07 | 1 | -4/+31
| | | | | | | | | | | | | | Summary: Teach the FunctionAttrs to do the right thing for IR with operand bundles. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14408 llvm-svn: 252387
* [FunctionAttrs] Fix an iterator wraparound bug | Sanjoy Das | 2015-11-07 | 1 | -18/+19
| | | | | | | | | | | | | | | | | | | Summary: This change fixes an iterator wraparound bug in `determinePointerReadAttrs`. Ideally, ++'ing off the `end()` of an iplist should result in a failed assert, but currently iplist seems to silently wrap to the head of the list on `end()++`. This is why the bad behavior is difficult to demonstrate. Reviewers: chandlerc, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14350 llvm-svn: 252386
* ADT: Remove last implicit ilist iterator conversions, NFC | Duncan P. N. Exon Smith | 2015-11-07 | 1 | -1/+1
| | | | | | | | | | Some implicit ilist iterator conversions have crept back into Analysis, Transforms, Hexagon, and llvm-stress. This removes them. I'll commit a patch immediately after this to disallow them (in a separate patch so that it's easy to revert if necessary). llvm-svn: 252371
* DI: Reverse direction of subprogram -> function edge. | Peter Collingbourne | 2015-11-05 | 3 | -35/+11
| | | | | | | | | | | | | | | | | | | | | | | Previously, subprograms contained a metadata reference to the function they described. Because most clients need to get or set a subprogram for a given function rather than the other way around, this created unneeded inefficiency. For example, many passes needed to call the function llvm::makeSubprogramMap() to build a mapping from functions to subprograms, and the IR linker needed to fix up function references in a way that caused quadratic complexity in the IR linking phase of LTO. This change reverses the direction of the edge by storing the subprogram as function-level metadata and removing DISubprogram's function field. Since this is an IR change, a bitcode upgrade has been provided. Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is attached to the PR. Differential Revision: http://reviews.llvm.org/D14265 llvm-svn: 252219
* [FunctionAttrs] Remove a loop, NFC refactor | Sanjoy Das | 2015-11-05 | 1 | -16/+14
| | | | | | | | | | | | | | | | | | | | | | Summary: Remove the loop over the uses of the CallSite in ArgumentUsesTracker. Since we have the `Use *` for actual argument operand, we can just use pointer subtraction. The time complexity remains the same though (except for a vararg argument) -- `std::advance` is O(UseIndex) for the ArgumentList iterator. The real motivation is to make a later change adding support for operand bundles simpler. Reviewers: reames, chandlerc, nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14363 llvm-svn: 252141
* LLE 6/6: Add LoopLoadElimination pass | Adam Nemet | 2015-11-03 | 1 | -0/+10
| | | | Summary: The goal of this pass is to perform store-to-load forwarding across the backedge of a loop. E.g.:
| | | |
| | | |   for (i)
| | | |     A[i + 1] = A[i] + B[i]
| | | |   =>
| | | |   T = A[0]
| | | |   for (i)
| | | |     T = T + B[i]
| | | |     A[i + 1] = T
| | | |
| | | | The pass relies on loop dependence analysis via LoopAccessAnalysis to find opportunities of loop-carried dependences with a distance of one between a store and a load. Since it's using LoopAccessAnalysis, it was easy to also add support for versioning away may-aliasing intervening stores that would otherwise prevent this transformation.
| | | | This optimization is also performed by Load-PRE in GVN without the option of multi-versioning. As was discussed with Daniel Berlin in http://reviews.llvm.org/D9548, this is inferior to a more loop-aware solution applied here. Hopefully, we will be able to remove some complexity from GVN/MemorySSA as a consequence.
| | | | In the long run, we may want to extend this pass (or create a new one if there is little overlap) to also eliminate loop-independent redundant loads and stores that *require* versioning due to may-aliasing intervening stores/loads. I have some motivating cases for store elimination. My plan right now is to wait for MemorySSA to come online first rather than using memdep for this.
| | | | The main motivation for this pass is the 456.hmmer loop in SPECint2006 where after distributing the original loop and vectorizing the top part, we are left with the critical path exposed in the bottom loop. Being able to promote the memory dependence into a register dependence (even though the HW does perform store-to-load forwarding as well) results in a major gain (~20%). This gain also transfers over to x86: it's around 8-10%.
| | | | Right now the pass is off by default and can be enabled with -enable-loop-load-elim. On the LNT testsuite, there are two performance changes (negative number -> improvement): 1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the critical paths is reduced 2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this outside of LNT
| | | | The pass is scheduled after the loop vectorizer (which is after loop distribution). The rationale is to try to reuse LAA state, rather than recomputing it. The order between LV and LLE is not critical because normally LV does not touch scalar st->ld forwarding cases where vectorizing would inhibit the CPU's st->ld forwarding from kicking in.
| | | | LoopLoadElimination requires LAA to provide the full set of dependences (including forward dependences). LAA is known to omit loop-independent dependences in certain situations. The big comment before removeDependencesFromMultipleStores explains why this should not occur for the cases that we're interested in.
| | | | Reviewers: dberlin, hfinkel Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13259 llvm-svn: 252017
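The loop rewrite from the summary can be checked by hand with a small model. The sketch below only mirrors the source-level example from the commit message; the function names `original_loop` and `forwarded_loop` are made up for illustration, and nothing here reflects how the pass operates on LLVM IR or how it versions loops for may-aliasing stores.

```python
def original_loop(a, b):
    """for (i) A[i + 1] = A[i] + B[i]: each iteration reloads A[i]."""
    a = list(a)
    for i in range(len(b)):
        a[i + 1] = a[i] + b[i]
    return a


def forwarded_loop(a, b):
    """Same loop with the stored value carried in a scalar T.

    The load of A[i] is replaced by T, which holds the value stored
    to A[i] by the previous iteration (or A[0] before the loop).
    """
    a = list(a)
    t = a[0]
    for i in range(len(b)):
        t = t + b[i]
        a[i + 1] = t
    return a


a0 = [1, 0, 0, 0]
b0 = [2, 3, 4]
before = original_loop(a0, b0)
after = forwarded_loop(a0, b0)
```

Both versions compute the same running sum into `A`. The second version has no load from `A` inside the loop, which is the point of forwarding the stored value across the backedge.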
* Restore "Support for ThinLTO function importing and symbol linking." | Teresa Johnson | 2015-11-03 | 1 | -0/+39
| | | | | | | This restores commit r251837, with the new library dependence added to llvm-link/Makefile to address bot failures. llvm-svn: 251866
* Revert "Support for ThinLTO function importing and symbol linking." | Teresa Johnson | 2015-11-02 | 1 | -39/+0
| | | | | | | | | | | | | | | | | | | | This reverts commit r251837, due to a number of bot failures of the form: /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined reference to 'llvm::object::FunctionIndexObjectFile::create(llvm::MemoryBufferRef, llvm::LLVMContext&, llvm::Module const*, bool)' /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined reference to 'llvm::object::FunctionIndexObjectFile::takeIndex()' I'm not sure why these are happening - I added Object to the required libraries in tools/llvm-link/LLVMBuild.txt and the LLVM_LINK_COMPONENTS in tools/llvm-link/CMakeLists.txt. Confirmed for my build that these symbols come out of libLLVMObject.a. What am I missing? llvm-svn: 251841
* Support for ThinLTO function importing and symbol linking. | Teresa Johnson | 2015-11-02 | 1 | -0/+39
| | | | | | | | | | | | | | | | | | | | | Summary: Support for necessary linkage changes and symbol renaming during ThinLTO function importing. Also includes llvm-link support for manually importing functions and associated llvm-link based tests. Note that this does not include support for intelligently importing metadata, which is currently imported duplicate times. That support will be in the follow-on patch, and currently is ignored by the tests. Reviewers: dexonsmith, joker.eph, davidxl Subscribers: tobiasvk, tejohnson, llvm-commits Differential Revision: http://reviews.llvm.org/D13515 llvm-svn: 251837
* StringRef-ify DiagnosticInfoSampleProfile::Filename | David Blaikie | 2015-11-02 | 1 | -3/+2
| | | | llvm-svn: 251823
* Clang format a few prior patches (NFC) | Teresa Johnson | 2015-11-02 | 1 | -13/+14
| | | | | | | I had clang formatted my earlier patches using the wrong style. Reformatted with the LLVM style. llvm-svn: 251812
* SamplePGO - Count sample records in embedded profiles when computing coverage. | Diego Novillo | 2015-10-31 | 1 | -30/+54
| | | | | | | The initial coverage checking code for sample records failed to count records inside inlined profiles. This change fixes the oversight. llvm-svn: 251752
* [FunctionAttrs] Inline the prototype attribute inference to an existing | Chandler Carruth | 2015-10-31 | 1 | -21/+6
| | | | | | | | loop over the SCC. The separate function wasn't really adding much, NFC. llvm-svn: 251728
* [PM] Port StripDeadPrototypes to the new pass manager | Justin Bogner | 2015-10-30 | 2 | -22/+32
| | | | | | | This is a really straightforward port. Also adds a test for the pass, since it only seemed to be tested tangentially before. llvm-svn: 251726
* Whitespace. NFC | Justin Bogner | 2015-10-30 | 2 | -5/+5
| | | | llvm-svn: 251724
* [FunctionAttrs] Separate another chunk of the logic for functionattrs | Chandler Carruth | 2015-10-30 | 1 | -10/+16
| | | | | | | | | | | from its pass harness by providing a lambda to query for AA results. This allows the legacy pass to easily provide a lambda that uses the special helpers to construct function AA results from a legacy CGSCC pass. With the new pass manager (the next patch) the lambda just directly wraps the intuitive query API. llvm-svn: 251715
* [FunctionAttrs] Provide a single SCC node set to all of the | Chandler Carruth | 2015-10-29 | 1 | -91/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | transformations in FunctionAttrs rather than building a new one each time. This isn't trivial because there are different heuristics from different passes for exactly what set they want. The primary difference is whether an *overridable* function completely disables the synthesis of attributes. I've modeled this by directly testing for overridable, and using the common set that excludes external and opt-none functions. This does cause some changes by disabling more optimizations in the face of opt-none. Specifically, we were still optimizing *calls* to opt-none functions based on their attributes, just not the bodies. It seems better to be conservative on both fronts given the intended semantics here (best effort to not assume or disturb anything). I've not tried to test this change as it seems complex, brittle, and not important to the implicit contract of opt-none. Instead, it seems more like a choice that should be dictated by the simplified implementation and the change to be acceptable differences within the space of opt-none. A big benefit here is that these transformations no longer rely on the legacy pass manager's SCC types, they just work on generic sets of function pointers. This will make it easy to re-use their logic in the new pass manager. I've also made the transforms static functions instead of members where trivial while I was touching the signatures. llvm-svn: 251640
* Fix use-after-free. Thanks ASAN for giving me a detailed report :-). | Daniel Jasper | 2015-10-29 | 1 | -2/+2
| | | | llvm-svn: 251623
* SamplePGO - Add flag to check sampling coverage. | Diego Novillo | 2015-10-28 | 1 | -3/+83
| | | | | | | | | | | | | | | | This adds the flag -mllvm -sample-profile-check-coverage=N to the SampleProfile pass. N is the percent of input sample records that the user expects to apply. If the pass does not use N% (or more) of the sample records in the input, it emits a warning. This is useful to detect some forms of stale profiles. If the code has drifted enough from the original profile, there will be records that do not match the IR anymore. This will not detect cases where a sample profile record for line L is referring to some other instructions that also used to be at line L. llvm-svn: 251568
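The coverage check described above amounts to comparing the share of applied sample records against the user's threshold N. The sketch below is a hedged illustration only: the function names, the integer rounding, the handling of an empty profile, and the warning text are all assumptions and are not taken from the SampleProfile pass.

```python
def coverage_percent(used_records, total_records):
    """Integer percentage of profile records that were applied.

    An empty profile is treated as fully covered (an assumption made
    here to avoid dividing by zero).
    """
    if total_records == 0:
        return 100
    return used_records * 100 // total_records


def check_coverage(used_records, total_records, min_percent):
    """Return a warning string if coverage is below min_percent, else None.

    min_percent plays the role of N in -sample-profile-check-coverage=N.
    The message wording is hypothetical.
    """
    percent = coverage_percent(used_records, total_records)
    if percent < min_percent:
        return (f"{used_records} of {total_records} available profile "
                f"records ({percent}%) were applied")
    return None


ok = check_coverage(90, 100, 90)      # exactly N% used: no warning
stale = check_coverage(45, 100, 90)   # likely a stale profile
```

As the commit message notes, a check of this kind only catches records that no longer match anything in the IR. A record that now matches different instructions on the same line still counts as used.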
* SamplePGO - Clear per-function data after applying a profile. | Diego Novillo | 2015-10-28 | 1 | -4/+21
| | | | | | | | | | The pass was keeping around a lot of per-function data (visited blocks, edges, dominance, etc) that is just taking up memory for no reason. In fact, from function to function it could potentially confuse the propagator since some maps are indexed by line offsets which can be common between functions. llvm-svn: 251531
* [GlobalOpt] Add newlines to DEBUG messages | James Molloy | 2015-10-28 | 1 | -4/+4
| | | | | | | | I think these were affected by a change way back when to stop printing newlines in Value::dump() by default. This change simply allows the debug output to be readable. NFC. llvm-svn: 251517
* Tidy a comment. NFC. | Diego Novillo | 2015-10-27 | 1 | -1/+1
| | | | llvm-svn: 251434
* Fix SamplePGO segfault when debug info is missing. | Diego Novillo | 2015-10-27 | 1 | -2/+4
| | | | | | | | When emitting a remark for a conditional branch annotation, the remark uses the line location information of the conditional branch in the message. In some cases, that information is unavailable and the optimization would segfault. I'm still not sure whether this is a bug or WAI, but the optimizer should not die because of this. llvm-svn: 251420
* [function-attrs] Refactor code to handle shorter code with early exits. | Chandler Carruth | 2015-10-27 | 1 | -31/+37
| | | | | | | | | | | No functionality changed here, but the indentation is substantially reduced and IMO the code is much easier to read. I've also added some helpful comments. This is just a clean-up I wrote while studying the code, and that has been in my backlog for a while. llvm-svn: 251381
* Remove unused local variable. NFC. | Diego Novillo | 2015-10-26 | 1 | -2/+0
| | | | llvm-svn: 251344
* SamplePGO - Add optimization reports. | Diego Novillo | 2015-10-26 | 1 | -6/+30
| | | | | | | | | | | | | | | | | | | | | | | This adds a couple of optimization remarks to the SamplePGO transformation. The pass now emits a remark when it decides to inline a hot function (to mimic the inline stack and repeat useful inline decisions in the original build). It will also report branch destinations. For instance, given the code fragment: 6 if (i < 1000) 7 sum -= i; 8 else 9 sum += -i * rand(); If the 'else' branch is taken most of the time, building this code with -Rpass=sample-profile will produce: a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile] sum += -i * rand(); ^ llvm-svn: 251330
* Tolerate negative offset when matching sample profile. | Dehao Chen | 2015-10-21 | 1 | -9/+20
| | | | | | In some cases (as illustrated in the unittest), lineno can be less than the header lineno because the function body is included from some other file. In this case, offset will be negative. This patch makes clang still able to match the profile to IR in this situation. http://reviews.llvm.org/D13914 llvm-svn: 250873
* Sample Profiles - Adjust integer types. Mostly NFC. | Diego Novillo | 2015-10-15 | 1 | -22/+29
| | | | | | | | | | | | This adjusts all integers in the reader/writer to reflect the types stored on profile files. They should all be unsigned 32-bit or 64-bit values. Changed all associated internal types to be uint32_t or uint64_t. The only place that needed some adjustments is in the sample profile transformation. Although the weights read from the profile are 64-bit values, the internal API for branch weights only accepts 32-bit values. The pass now saturates weights that overflow uint32_t. llvm-svn: 250427
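Saturating a 64-bit weight into a 32-bit slot means clamping to the largest representable value instead of letting the high bits wrap away. A minimal sketch, assuming a hypothetical helper name `saturate_weight` (the pass's own code is not shown in this log):

```python
UINT32_MAX = 2**32 - 1


def saturate_weight(weight):
    """Clamp an unsigned 64-bit profile weight to the uint32_t range."""
    return min(weight, UINT32_MAX)


def truncate_weight(weight):
    """What a plain narrowing conversion would do instead (for contrast)."""
    return weight & UINT32_MAX


small = saturate_weight(1234)
huge = saturate_weight(2**32 + 5)
wrapped = truncate_weight(2**32 + 5)
```

With plain truncation a very hot edge (weight 2**32 + 5) would look almost cold (weight 5). Saturation keeps it at the maximum, so the relative ordering of branch weights is preserved as far as the 32-bit range allows.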
* IPO: Remove implicit ilist iterator conversions, NFC | Duncan P. N. Exon Smith | 2015-10-13 | 13 | -99/+97
| | | | llvm-svn: 250187
* [GlobalsAA] Turn GlobalsAA on again by default | James Molloy | 2015-10-13 | 1 | -1/+1
| | | | | | | | Now that all the known faults with GlobalsAA have been fixed, flip the big switch on -enable-non-lto-gmr again. Feel free to pester me with any more bugs found, and don't hesitate to flip the switch back off. llvm-svn: 250157
* GlobalOpt does not treat externally_initialized globals correctly | Oliver Stannard | 2015-10-12 | 1 | -1/+1
| | | | | | | | GlobalOpt currently merges stores into the initialisers of internal, externally_initialized globals, but should not do so as the value of the global may change between the initialiser and any code in the module being run. llvm-svn: 250035
* Make HeaderLineno a local variable. | Dehao Chen | 2015-10-09 | 1 | -12/+8
| | | | | | http://reviews.llvm.org/D13576 As we are using hierarchical profile, there is no need to keep HeaderLineno a member variable. This is because each level of the inline stack will have its own header lineno. One should use the header lineno of its own inline stack level instead of the actual symbol. llvm-svn: 249848
* Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups. | Hans Wennborg | 2015-10-06 | 2 | -8/+7
| | | | | | | | Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13321 llvm-svn: 249482