path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* Change ModuleLinker to take a set of GlobalValues to import instead of a single one | Mehdi Amini | 2015-12-02 | 1 | -1/+5
  For efficiency reasons, when importing multiple functions from the same Module, we can avoid reparsing it every time.
  Differential Revision: http://reviews.llvm.org/D15102
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 254486
* [sanitizer coverage] when adding a bb trace instrumentation, do it instead of, not in addition to, regular coverage. | Kostya Serebryany | 2015-12-02 | 1 | -15/+10
  Do the regular coverage in the run-time instead.
  llvm-svn: 254482
* Modify FunctionImport to take a callback to load modules | Mehdi Amini | 2015-12-02 | 1 | -4/+7
  When linking a static archive, there are no individual module files to load. Instead they can be mmap'ed and could be initialized from a buffer directly. The callback provides flexibility to override the scheme for loading modules from the summary.
  Differential Revision: http://reviews.llvm.org/D15101
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 254479
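The loading callback and the "import a set, parse once" idea from the two FunctionImport/ModuleLinker commits can be sketched as follows. This is a minimal Python illustration, not the LLVM API: `import_functions`, `load_module`, `ARCHIVE`, and the dictionary-based "module" are all hypothetical names invented for the example.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical stand-in for a parsed module: function name -> body.
Module = Dict[str, str]
ModuleLoader = Callable[[str], Module]


def import_functions(
    wanted: Iterable[Tuple[str, str]],  # (module identifier, function name)
    load_module: ModuleLoader,
) -> Tuple[Dict[str, str], List[str]]:
    """Import each wanted function, loading every source module only once."""
    # Group the requests so one load serves all functions from a module.
    by_module: Dict[str, List[str]] = {}
    for module_id, func in wanted:
        by_module.setdefault(module_id, []).append(func)

    imported: Dict[str, str] = {}
    load_log: List[str] = []
    for module_id, funcs in by_module.items():
        # The caller decides how loading works: a file, an in-memory buffer, ...
        module = load_module(module_id)
        load_log.append(module_id)
        for func in funcs:
            imported[func] = module[func]
    return imported, load_log


# A loader backed by in-memory buffers, as one might use for a static archive.
ARCHIVE = {
    "a.o": {"foo": "body of foo", "bar": "body of bar"},
    "b.o": {"baz": "body of baz"},
}

imported, load_log = import_functions(
    [("a.o", "foo"), ("b.o", "baz"), ("a.o", "bar")],
    load_module=lambda module_id: ARCHIVE[module_id],
)
```

Passing the loader as a parameter keeps the importer independent of where module contents come from, which is the flexibility the commit message describes.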
* Use references now that it is natural to do so. | Rafael Espindola | 2015-12-01 | 1 | -2/+2
  The linker never takes ownership of a module or changes which module it is referring to, making it natural to use references.
  llvm-svn: 254449
* [ThinLTO] Wrap dbgs() output in DEBUG macro | Teresa Johnson | 2015-12-01 | 1 | -5/+5
  Missed in a couple places.
  llvm-svn: 254422
* [ThinLTO] Remove stale comment (NFC) | Teresa Johnson | 2015-12-01 | 1 | -4/+0
  Stale as of r254036 which added basic profitability check.
  llvm-svn: 254421
* Bring r254336 back: | Rafael Espindola | 2015-12-01 | 1 | -3/+3
  The difference is that now we don't error on out-of-comdat access to internal global values. We copy them instead. This seems to match the expectation of COFF linkers (see pr25686).
  Original message:
  Start deciding earlier what to link.
  A traditional linker is roughly split in symbol resolution and "copying stuff". The two tasks are badly mixed in lib/Linker. This starts splitting them apart.
  With this patch there are no direct calls to linkGlobalValueBody or linkGlobalValueProto. Everything is linked via MapValue.
  This also includes a few fixes:
  * A GV goes undefined if the comdat is dropped (comdat11.ll).
  * We error if an internal GV goes undefined (comdat13.ll).
  * We don't link an unused comdat.
  The first two match the behavior of an ELF linker. The third one is equivalent to running globaldce on the input.
  llvm-svn: 254418
* [LIR] Push check into helper function. NFC. | Chad Rosier | 2015-12-01 | 1 | -4/+4
  llvm-svn: 254416
* [safestack] Protect byval function arguments. | Evgeniy Stepanov | 2015-12-01 | 2 | -48/+117
  Detect unsafe byval function arguments and move them to the unsafe stack.
  llvm-svn: 254353
* [safestack] Fix handling of array allocas. | Evgeniy Stepanov | 2015-12-01 | 1 | -5/+17
  The current code does not take alloca array size into account and, as a result, considers any access past the first array element to be unsafe.
  llvm-svn: 254350
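The array-alloca problem can be pictured with a small bounds check. This is only a sketch of the idea in Python, assuming a simple byte-offset model; `access_is_safe` and its parameters are hypothetical and do not mirror the SafeStack pass's actual code.

```python
def access_is_safe(offset, access_size, elem_size, array_count=1):
    """Return True if bytes [offset, offset + access_size) lie inside the allocation.

    elem_size is the size of one element; array_count is the alloca's
    array size. Leaving array_count at 1 reproduces the old behavior of
    treating the allocation as a single element.
    """
    alloc_size = elem_size * array_count
    return 0 <= offset and offset + access_size <= alloc_size


# Access to the second 4-byte element of an 8-element array.
ignoring_array_size = access_is_safe(offset=4, access_size=4, elem_size=4)
with_array_size = access_is_safe(offset=4, access_size=4, elem_size=4, array_count=8)
past_the_end = access_is_safe(offset=32, access_size=4, elem_size=4, array_count=8)
```

With the array size ignored, the in-bounds access to the second element is classified as unsafe, which is the over-conservative result the commit message describes.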
* This reverts commit r254336 and r254344. | Rafael Espindola | 2015-11-30 | 1 | -3/+3
  They broke a bot and I am debugging why.
  llvm-svn: 254347
* Start deciding earlier what to link. | Rafael Espindola | 2015-11-30 | 1 | -3/+3
  A traditional linker is roughly split in symbol resolution and "copying stuff". The two tasks are badly mixed in lib/Linker. This starts splitting them apart.
  With this patch there are no direct calls to linkGlobalValueBody or linkGlobalValueProto. Everything is linked via MapValue.
  This also includes a few fixes:
  * A GV goes undefined if the comdat is dropped (comdat11.ll).
  * We error if an internal GV goes undefined (comdat13.ll).
  * We don't link an unused comdat.
  The first two match the behavior of an ELF linker. The third one is equivalent to running globaldce on the input.
  llvm-svn: 254336
* [SimplifyLibCalls] Transform log(exp2(y)) to y*log(2) under fast-math. | Davide Italiano | 2015-11-30 | 1 | -1/+9
  llvm-svn: 254317
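The identity behind this rewrite is log(2^y) = y*log(2). A quick numerical check in Python, offered as an illustration of the algebra rather than of the LLVM transformation itself (the function names and sample values are made up for the example):

```python
import math


def log_of_exp2(y):
    """The original form: log(exp2(y)), with exp2 spelled as 2.0 ** y."""
    return math.log(2.0 ** y)


def rewritten(y):
    """The rewritten form: y * log(2)."""
    return y * math.log(2.0)


samples = [-3.5, 0.0, 1.0, 10.25]
# Largest difference between the two forms, relative to the result size.
max_rel_error = max(
    abs(log_of_exp2(y) - rewritten(y)) / max(1.0, abs(rewritten(y)))
    for y in samples
)
```

For ordinary inputs the two forms agree to within floating-point rounding, but they are not bit-identical in general, which is consistent with the rewrite being restricted to fast-math.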
* fix typos in comments; NFC | Sanjay Patel | 2015-11-29 | 1 | -6/+8
  llvm-svn: 254266
* [SimplifyLibCalls] Don't crash if the function doesn't have a name. | Davide Italiano | 2015-11-29 | 1 | -3/+2
  llvm-svn: 254265
* [SimplifyLibCalls] Cross out implemented transformations. | Davide Italiano | 2015-11-29 | 1 | -2/+0
  llvm-svn: 254264
* [SimplifyLibCalls] Transform log(pow(x, y)) -> y*log(x). | Davide Italiano | 2015-11-29 | 1 | -5/+50
  This one is enabled only under -ffast-math. There are cases where the difference between the value computed and the correct value is huge even for ffast-math, e.g. as Steven pointed out:
  x = -1, y = 4
  log(pow(-1, 4)) = 0
  4*log(-1) = NaN
  I checked what GCC does and apparently they do the same optimization (which results in the dramatic difference). Future work might try to make this (slightly) less bad.
  Differential Revision: http://reviews.llvm.org/D14400
  llvm-svn: 254263
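The counterexample can be reproduced outside the compiler. The sketch below uses Python's `math` module with x = -1 and an exponent of 4; `safe_log` is a hypothetical helper that returns NaN where `math.log` would raise, to mimic the C library behavior the commit message refers to.

```python
import math


def safe_log(value):
    """math.log, but returning NaN on a domain error instead of raising."""
    try:
        return math.log(value)
    except ValueError:
        return math.nan


x, y = -1.0, 4.0
original = safe_log(math.pow(x, y))  # pow(-1, 4) is 1, and log(1) is 0
rewritten = y * safe_log(x)          # log(-1) is NaN, so the product is NaN
```

The original expression is well defined because the even power makes the argument of the logarithm positive; the rewritten one takes the logarithm of a negative number first.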
* SamplePGO - Do not use std::to_string in diagnostics. | Diego Novillo | 2015-11-29 | 1 | -12/+17
  This fixes buildbots on systems where std::to_string is not present. It also tidies the output of the diagnostic to render doubles a bit better (thanks Ben Kramer for help with string streams and format).
  llvm-svn: 254261
* Remove an intermediate lambda. NFC | Craig Topper | 2015-11-29 | 1 | -3/+2
  llvm-svn: 254246
* [SimplifyLibCalls] Use any_of(). Suggested by David Blaikie! | Davide Italiano | 2015-11-28 | 1 | -4/+3
  llvm-svn: 254239
* [SimplifyLibCalls] Fix inverted condition that led to an uninitialized memory read below. | Benjamin Kramer | 2015-11-28 | 1 | -2/+2
  Found by msan!
  llvm-svn: 254238
* Use range-based for loops. NFC | Craig Topper | 2015-11-28 | 1 | -37/+20
  llvm-svn: 254222
* SamplePGO - Add initial support for inliner annotations. | Diego Novillo | 2015-11-27 | 1 | -1/+79
  This adds two thresholds to the sample profiler to affect inlining decisions: the concept of global hotness and coldness. Functions that have accumulated more than a certain fraction of samples at runtime are annotated with the InlineHint attribute. Conversely, functions that accumulate less than a certain fraction of samples are annotated with the Cold attribute.
  This is very similar to the hints emitted by Clang when using instrumentation profiles.
  Notice that this is a very blunt instrument. A function may have globally collected a significant fraction of samples, but that does not necessarily mean that every callsite for that function is hot.
  Ideally, we would annotate each callsite with the samples collected at that callsite. This way, the inliner can incorporate all these weights into its cost model.
  Once the inliner offers this functionality, we can change the hints emitted here to a more precise per-callsite annotation. For now, this is providing some measure of speedups with our internal benchmarks. I've observed speedups of up to 23% (though the geo mean is about 3%). I expect these numbers to improve as the inliner gets better annotations.
  llvm-svn: 254212
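The global hot/cold classification described above amounts to comparing each function's share of all samples against two cutoffs. A minimal Python sketch, assuming made-up threshold values and attribute spellings; `classify_functions` is a hypothetical name and the real pass operates on LLVM IR functions, not dictionaries.

```python
def classify_functions(samples_by_function, hot_fraction, cold_fraction):
    """Map function name -> hint, based on its share of all collected samples."""
    total = sum(samples_by_function.values())
    hints = {}
    for name, samples in samples_by_function.items():
        fraction = samples / total if total else 0.0
        if fraction > hot_fraction:
            hints[name] = "inlinehint"  # globally hot
        elif fraction < cold_fraction:
            hints[name] = "cold"        # globally cold
        # Functions between the two thresholds get no annotation.
    return hints


# Thresholds here are illustrative only, not the pass's defaults.
hints = classify_functions(
    {"parse": 9000, "format": 995, "log_error": 5},
    hot_fraction=0.30,
    cold_fraction=0.001,
)
```

Because the decision uses one number per function, every callsite of `parse` inherits the same hint, which is the bluntness the commit message points out.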
* SamplePGO - Fix default threshold for hot callsites. | Diego Novillo | 2015-11-27 | 1 | -3/+4
  Based on testing of internal benchmarks, I'm lowering this threshold to a value of 0.1%. This means that SamplePGO will respect 99.9% of the original inline decisions when following a profile.
  The performance difference is noticeable in some tests. With the previous threshold, the speedup over baseline -O2 was about 0.63%. With the new default, the speedups are around 3% on average.
  The point of this threshold is not to do more aggressive inlining. When an inlined callsite crosses this threshold, SamplePGO will redo the inline decision so that it can better apply the input profile.
  By respecting most original inline decisions, we can apply more of the input profile because the shape of the code follows the profile more closely.
  In the next series, I'll be looking at adding some inline hints for the cold callsites and for toplevel functions that are hot/cold as well.
  llvm-svn: 254211
* Simplify the linking of recursive data. | Rafael Espindola | 2015-11-27 | 1 | -2/+10
  Now the ValueMapper has two callbacks. The first one maps the declaration. The ValueMapper records the mapping and then materializes the body/initializer.
  llvm-svn: 254209
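Recording the mapping before materializing the body is what lets mutually recursive values terminate. The following Python sketch shows that ordering with plain dictionaries; `link_value`, `source`, and the log strings are hypothetical and only model the two-step idea, not the ValueMapper interface.

```python
def link_value(name, source, mapped, log):
    """Map `name`, then materialize its body; safe for recursive references.

    source: name -> list of referenced names (the "body").
    mapped: name -> already created destination value.
    """
    if name in mapped:
        # Already declared (possibly still being defined): reuse it.
        return mapped[name]

    # Step 1: create the declaration and record the mapping immediately.
    decl = {"name": name, "refs": None}
    mapped[name] = decl
    log.append("declare " + name)

    # Step 2: materialize the body. Recursive references find the
    # declaration recorded above instead of recursing forever.
    decl["refs"] = [link_value(ref, source, mapped, log) for ref in source[name]]
    log.append("define " + name)
    return decl


source = {"a": ["b"], "b": ["a"]}  # two values that reference each other
mapped, log = {}, []
a = link_value("a", source, mapped, log)
```

If the mapping were recorded only after the body was built, linking `a` would recurse into `b`, back into `a`, and never finish.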
* [sanitizer] [dfsan] Unify aarch64 mapping | Adhemerval Zanella | 2015-11-27 | 1 | -16/+21
  This patch changes the DFSan instrumentation for aarch64 so that, instead of using the fixed application mask defined by SANITIZER_AARCH64_VMA, it reads the application shadow mask value from compiler-rt. The value is initialized based on runtime VMA detection.
  Along with this patch a compiler-rt one will also be added to export the shadow mask variable.
  llvm-svn: 254196
* [SimplifyLibCalls] Use range-based loop. NFC. | Davide Italiano | 2015-11-27 | 1 | -4/+2
  llvm-svn: 254193
* [LoopVectorize] Use MapVector rather than DenseMap for MinBWs. | Charlie Turner | 2015-11-26 | 1 | -3/+3
  The order in which instructions are truncated in truncateToMinimalBitwidths affects code generation. Switch to a map with a deterministic order, since the iteration order over a DenseMap is not defined.
  This code is not hot, so the difference in container performance isn't interesting.
  Many thanks to David Blaikie for making me aware of MapVector!
  Fixes PR25490.
  Differential Revision: http://reviews.llvm.org/D14981
  llvm-svn: 254179
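A map with deterministic iteration can be built from a key-to-index map plus a vector of entries, which is the general shape of a MapVector-style container. The Python class below is a simplified sketch of that shape for illustration; the class, its methods, and the example keys are assumptions, not LLVM's implementation. (Python's own `dict` already preserves insertion order, so the class exists only to make the structure visible.)

```python
class MapVector:
    """Map whose iteration order is the order in which keys were first inserted."""

    def __init__(self):
        self._index = {}    # key -> position in self._entries
        self._entries = []  # (key, value) pairs in insertion order

    def __setitem__(self, key, value):
        pos = self._index.get(key)
        if pos is None:
            self._index[key] = len(self._entries)
            self._entries.append((key, value))
        else:
            # Updating a key keeps its original position.
            self._entries[pos] = (key, value)

    def __getitem__(self, key):
        return self._entries[self._index[key]][1]

    def __iter__(self):
        return iter(self._entries)


min_bws = MapVector()
min_bws["%mul"] = 8
min_bws["%add"] = 16
min_bws["%mul"] = 4  # update, position unchanged
order = [key for key, _ in min_bws]
```

The cost is an extra indirection and more memory per entry, which matches the commit's remark that the container's performance does not matter for code that is not hot.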
* Disallow aliases to available_externally. | Rafael Espindola | 2015-11-26 | 1 | -37/+0
  They are as much trouble as aliases to declarations. They are requiring the code generator to define a symbol with the same value as another symbol, but the second symbol is undefined.
  If representing this is important for some optimization, we could add support for available_externally aliases. They would be *required* to point to a declaration (or available_externally definition).
  llvm-svn: 254170
* [SimplifyLibCalls] Don't depend on a called function having a name, it might be an indirect call. | Benjamin Kramer | 2015-11-26 | 1 | -11/+8
  Fixes the crasher in PR25651 and related crashers using the same pattern.
  llvm-svn: 254145
* [safestack] Fix alignment of dynamic allocas. | Evgeniy Stepanov | 2015-11-25 | 1 | -1/+1
  Fixes PR25588.
  llvm-svn: 254109
* [SCCP] More informative message if we don't know how to handle a terminator. | Davide Italiano | 2015-11-25 | 1 | -1/+1
  llvm-svn: 254093
* [OperandBundles] Extract duplicated code into a helper function, NFC | Sanjoy Das | 2015-11-25 | 2 | -10/+2
  llvm-svn: 254047
* [InstCombine] Don't drop operand bundles | Sanjoy Das | 2015-11-25 | 1 | -3/+10
  Reviewers: majnemer
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D14857
  llvm-svn: 254046
* [PGO] Revert revisions r254021, r254028, r254035 | Rong Xu | 2015-11-24 | 6 | -933/+2
  Revert the above revisions due to multiple issues.
  llvm-svn: 254040
* [ThinLTO] Add option to limit importing based on instruction count | Teresa Johnson | 2015-11-24 | 1 | -0/+12
  Add a simple initial heuristic to control importing based on the number of instructions recorded in the function's summary. Add option to control the limit, and test using option.
  llvm-svn: 254036
* SamplePGO - Add test for hot/cold inlined functions. | Diego Novillo | 2015-11-24 | 1 | -17/+47
  When the original binary is executed and sampled, the resulting profile contains information on the original inline stack. We currently follow the original inline plan if we notice that the inlined callsite has more than 0 samples to it.
  A better way is to determine whether the callsite is actually worth inlining. If the callsite accumulates a small fraction of the samples spent in the parent function, then we don't want to bother inlining it (as it means that the callsite is actually cold).
  This patch introduces a threshold expressed in percentage of samples in relation to the parent function. If the callsite uses less than N% of the total samples used by its parent, the original inline decision is not re-applied.
  I've set the threshold to the very arbitrary value of 5%. I have yet to do any actual experiments to see what's a good value. I wanted to separate the basic mechanism from the tuning.
  llvm-svn: 254034
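The callsite threshold described above is a single percentage comparison. A minimal Python sketch, where `should_reapply_inline` and its parameter names are hypothetical and the sample counts are invented for illustration:

```python
def should_reapply_inline(callsite_samples, parent_samples, threshold_percent=5.0):
    """Re-apply the profile's inline decision only if the callsite is hot enough.

    The callsite must account for at least threshold_percent of the
    samples collected in its parent function.
    """
    if parent_samples == 0:
        return False  # no data for the parent: treat the callsite as cold
    percent = 100.0 * callsite_samples / parent_samples
    return percent >= threshold_percent


cold_callsite = should_reapply_inline(30, 1000)   # 3% of the parent
hot_callsite = should_reapply_inline(80, 1000)    # 8% of the parent
# The same kind of callsite under a much lower threshold.
low_threshold = should_reapply_inline(5, 1000, threshold_percent=0.1)
```

Lowering the threshold makes more callsites count as worth re-inlining, so more of the original inline decisions are respected.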
* [PGO] Fix build errors in x86_64-darwin | Rong Xu | 2015-11-24 | 2 | -3/+3
  Fix buildbot failure for x86_64-darwin due to r254021
  llvm-svn: 254028
* [PGO] MST based PGO instrumentation infrastructure | Rong Xu | 2015-11-24 | 6 | -2/+933
  This patch implements a minimum spanning tree (MST) based instrumentation for PGO. The use of MST guarantees minimum number of CFG edges getting instrumented. An additional optimization is to instrument the less executed edges to further reduce the instrumentation overhead.
  The patch contains both the instrumentation and the use of the profile to set the branch weights.
  Differential Revision: http://reviews.llvm.org/D12781
  llvm-svn: 254021
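The edge-selection idea can be sketched with a Kruskal-style pass: put the most frequently executed edges into a spanning tree and instrument only the edges left out. This Python example is an illustration under that assumption; `edges_to_instrument`, the block numbering, and the edge weights are hypothetical, and the sketch covers only edge selection, not how counts for uninstrumented edges are later recovered.

```python
def edges_to_instrument(num_blocks, edges):
    """Return the CFG edges that are NOT on a maximum-weight spanning tree.

    edges: list of (src_block, dst_block, estimated_weight).
    Heavier edges are placed in the tree first, so the counters end up
    on the less frequently executed edges.
    """
    parent = list(range(num_blocks))  # union-find forest over blocks

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    instrumented = []
    for src, dst, weight in sorted(edges, key=lambda e: e[2], reverse=True):
        root_src, root_dst = find(src), find(dst)
        if root_src == root_dst:
            # Adding this edge would close a cycle: it gets a counter.
            instrumented.append((src, dst, weight))
        else:
            parent[root_src] = root_dst  # edge joins the spanning tree
    return instrumented


# Diamond CFG: 0 = entry, 1 = hot branch, 2 = cold branch, 3 = exit.
cfg_edges = [(0, 1, 90), (0, 2, 10), (1, 3, 90), (2, 3, 10)]
counters = edges_to_instrument(4, cfg_edges)
```

A spanning tree over V blocks has V - 1 edges, so E - (V - 1) edges receive counters; for the diamond above that is one edge, and it lies on the cold path.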
* [ThinLTO] Refactor function body scan during importing into helper (NFC) | Teresa Johnson | 2015-11-24 | 1 | -36/+27
  llvm-svn: 254020
* [ThinLTO] Enable iterative importing in FunctionImport pass | Teresa Johnson | 2015-11-24 | 1 | -2/+36
  Analyze imported function bodies and add any new external calls to the worklist for importing. Currently no controls on the importing so this will end up importing everything possible in the call tree below the importing module. Basic profitability checks coming next.
  Update test to check for iteratively inlined functions.
  llvm-svn: 254011
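Iterative importing is a worklist traversal of the call graph. The Python sketch below models it with a dictionary of per-function summaries; `import_iteratively`, `SUMMARIES`, and the optional `max_instructions` cutoff (standing in for the instruction-count heuristic added in a later commit) are all hypothetical names chosen for this illustration.

```python
from collections import deque


def import_iteratively(local, summaries, max_instructions=None):
    """Return the functions imported, in the order they were imported.

    local: names defined in the importing module.
    summaries: name -> (instruction_count, [callee names]).
    """
    worklist = deque()
    seen = set(local)  # never import what we already have or have queued

    def enqueue_callees(name):
        for callee in summaries[name][1]:
            if callee not in seen:
                seen.add(callee)
                worklist.append(callee)

    for name in local:
        enqueue_callees(name)

    imported = []
    while worklist:
        name = worklist.popleft()
        count, _ = summaries[name]
        if max_instructions is not None and count > max_instructions:
            continue  # too large: skip it and do not scan its body
        imported.append(name)
        # Scan the imported body for further external calls.
        enqueue_callees(name)
    return imported


SUMMARIES = {
    "main": (10, ["foo"]),
    "foo": (5, ["bar", "main"]),
    "bar": (200, ["baz"]),
    "baz": (3, []),
}
everything = import_iteratively(["main"], SUMMARIES)
limited = import_iteratively(["main"], SUMMARIES, max_instructions=100)
```

Without a limit the traversal pulls in the whole call tree below `main`; with the limit, rejecting `bar` also prevents `baz` from being discovered.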
* [Utils] Put includes in correct order. NFC. | Weiming Zhao | 2015-11-24 | 8 | -10/+8
  Summary: Followed the guidelines in: http://llvm.org/docs/CodingStandards.html#include-style
  However, I noticed that uppercase named headers come before lowercase ones throughout the codebase. So kept them as is.
  Patch by Mandeep Singh Grang <mgrang@codeaurora.org>
  Reviewers: majnemer, davide, jmolloy, atrick
  Subscribers: sanjoy
  Differential Revision: http://reviews.llvm.org/D14939
  llvm-svn: 254005
* [InstCombine] fix propagation of fast-math-flags | Sanjay Patel | 2015-11-24 | 1 | -10/+5
  Noticed while working on D4583: http://reviews.llvm.org/D4583
  llvm-svn: 253997
* use convenience function for copying IR flags; NFCI | Sanjay Patel | 2015-11-24 | 1 | -12/+2
  llvm-svn: 253996
* [ThinLTO] Fix FunctionImport alias checking and test | Teresa Johnson | 2015-11-24 | 1 | -4/+5
  Skip imports for weak_any aliases as well. Fix the test to check non-import of weak aliases and functions, and import of normal alias.
  llvm-svn: 253991
* Fix build after r253954 | Ismail Donmez | 2015-11-24 | 1 | -1/+1
  llvm-svn: 253969
* Add a FunctionImporter helper to perform summary-based cross-module function importing | Mehdi Amini | 2015-11-24 | 4 | -1/+242
  Summary: This is a helper to perform cross-module import for ThinLTO. Right now it naively imports every possible called function.
  Reviewers: tejohnson
  Subscribers: dexonsmith, llvm-commits
  Differential Revision: http://reviews.llvm.org/D14914
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 253954
* [LIR] Put includes in correct order. NFC. | Chad Rosier | 2015-11-23 | 1 | -1/+1
  llvm-svn: 253915
* SamplePGO - Add coverage tracking for samples. | Diego Novillo | 2015-11-23 | 1 | -26/+85
  The existing coverage tracker counts the number of records that were used from the input profile. An alternative view of coverage is to check how many available samples were applied.
  This way, if the profile contains several records with few samples, it doesn't really matter much that they were not applied. The more interesting records to apply are the ones that contribute many samples.
  llvm-svn: 253912
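The two views of coverage can give very different numbers for the same profile. A small Python sketch, assuming a profile is just a mapping from record identifier to sample count; `coverage` and the record names are hypothetical.

```python
def coverage(records, applied):
    """Return (percent of records used, percent of samples used).

    records: record id -> number of samples in that record.
    applied: set of record ids that were actually applied.
    """
    used_records = sum(1 for record in records if record in applied)
    used_samples = sum(
        samples for record, samples in records.items() if record in applied
    )
    record_percent = 100.0 * used_records / len(records)
    sample_percent = 100.0 * used_samples / sum(records.values())
    return record_percent, sample_percent


# One heavy record was applied; three light ones were not.
record_pct, sample_pct = coverage(
    {"L1": 980, "L2": 10, "L3": 5, "L4": 5},
    applied={"L1"},
)
```

Counting records reports low coverage here, while counting samples shows that nearly all of the profile's weight was applied.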
* [WinEH] Fix a case where GVN could incorrectly PRE a load into an EH pad. | Andrew Kaylor | 2015-11-23 | 1 | -0/+10
  Differential Revision: http://reviews.llvm.org/D14842
  llvm-svn: 253908