summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/AsmPrinter/DebugLocStream.h
Commit message (Collapse)AuthorAgeFilesLines
* Don't generate comments in the DebugLocStream unless required. NFC.Pete Cooper2015-05-201-2/+6
| | | | | | | | | | | | The ByteStreamer here wasn't taking account of whether the asm streamer was text based and verbose. Only with that combination should we emit comments. This change makes sure that we only actually convert a Twine to a string using Twine::str() if we need the comment. This saves about 10000 small allocations on a test case involving the verify-use_list-order bitcode going through llc with debug info. Note, this is NFC as the comments would ultimately never be emitted unless required. Reviewed by Duncan Exon Smith and David Blaikie. llvm-svn: 237851
* AsmPrinter: Create a unified .debug_loc streamDuncan P. N. Exon Smith2015-04-171-0/+129
This commit removes `DebugLocList` and replaces it with `DebugLocStream`. - `DebugLocEntry` no longer contains its byte/comment streams. - The `DebugLocEntry` list for a variable/inlined-at pair is allocated on the stack, and released right after `DebugLocEntry::finalize()` (possible because of the refactoring in r231023). Now, only one list is in memory at a time now. - There's a single unified stream for the `.debug_loc` section that persists, stored in the new `DebugLocStream` data structure. The last point is important: this collapses the nested `SmallVector<>`s from `DebugLocList` into unified streams. We previously had something like the following: vec<tuple<Label, CU, vec<tuple<BeginSym, EndSym, vec<Value>, vec<char>, vec<string>>>>> A `SmallVector` can avoid allocations, but is statically fairly large for a vector: three pointers plus the size of the small storage, which is the number of elements in small mode times the element size). Nesting these is expensive, since an inner vector's size contributes to the element size of an outer one. (Nesting any vector is expensive...) In the old data structure, the outer vector's *element* size was 632B, excluding allocation costs for when the middle and inner vectors exceeded their small sizes. 312B of this was for the "three" pointers in the vector-tree beneath it. If you assume 1M functions with an average of 10 variable/inlined-at pairs each (in an LTO scenario), that's almost 6GB (besides inner allocations), with almost 3GB for the "three" pointers. This came up in a heap profile a little while ago of a `clang -flto -g` bootstrap, with `DwarfDebug::collectVariableInfo()` using something like 10-15% of the total memory. With this commit, we have: tuple<vec<tuple<Label, CU, Offset>>, vec<tuple<BeginSym, EndSym, Offset, Offset>>, vec<char>, vec<string>> The offsets are used to create `ArrayRef` slices of adjacent `SmallVector`s. This reduces the number of vectors to four (unrelated to the number of variable/inlined-at pairs), and caps the number of allocations at the same number. Besides saving memory and limiting allocations, this is NFC. I don't know my way around this code very well yet, but I wonder if we could go further: why stream to a side-table, instead of directly to the output stream? llvm-svn: 235229
OpenPOWER on IntegriCloud