| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 302998
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This adds a visitor that is capable of accessing type
records randomly and caching intermediate results that it
learns about during partial linear scans. This yields
amortized O(1) access to a type stream even though type
streams cannot normally be indexed.
Differential Revision: https://reviews.llvm.org/D33009
llvm-svn: 302936
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the time we know exactly how many type records we
have in a list, and we want to use the visitor to deserialize
them into actual records in a database. Previously we were
just using push_back() every time without reserving the space
up front in the vector. This is obviously terrible from a
performance standpoint, and it's not uncommon to have PDB
files with half a million type records, where the performance
degredation was quite noticeable.
llvm-svn: 302302
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Verifying the hash values as we are currently doing
results in iterating every type record before the user
even tries to access the first one, and the API user
has no control over, or ability to hook into this
process.
As a result, when the user wants to iterate over types
to print them or index them, this results in a second
iteration over the same list of types. When there's
upwards of 1,000,000 type records, this is obviously
quite undesirable.
This patch raises the verification outside of TpiStream
, and llvm-pdbdump hooks a hash verification visitor
into the normal dumping process. So we still verify
the hash records, but we can do it while not requiring
a second iteration over the type stream.
Differential Revision: https://reviews.llvm.org/D32873
llvm-svn: 302206
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I tried to run llvm-pdbdump on a very large (~1.5GB) PDB to
try and identify show-stopping performance problems. This
patch addresses the first such problem.
When loading the DBI stream, before anyone has even tried to
access a single record, we build an in memory map of every
source file for every module. In the particular PDB I was
using, this was over 85 million files. Specifically, the
complexity is O(m*n) where m is the number of modules and
n is the average number of source files (including headers)
per module.
The whole reason for doing this was so that we could have
constant time access to any module and any of its source
file lists. However, we can still get O(1) access to the
source file list for a given module with a simple O(m)
precomputation, and access to the list of modules is
already O(1) anyway.
So this patches reduces the O(m*n) up-front precomputation
to an O(m) one, where n is ~6,500 and n*m is about 85 million
in my pathological test case.
Differential Revision: https://reviews.llvm.org/D32870
llvm-svn: 302205
|
| |
|
|
|
|
|
|
|
| |
Building the type database is expensive, and can take multiple
minutes for large PDBs. But we only need it in certain cases
depending on what command line options are specified. So only
build it when we know we're about to need it.
llvm-svn: 302204
|
| |
|
|
| |
llvm-svn: 302069
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The raw CodeView format references strings by "offsets", but it's
confusing what table the offset refers to. In the case of line
number information, it's an offset into a buffer of records,
and an indirection is required to get another offset into a
different table to find the final string. And in the case of
checksum information, there is no indirection, and the offset
refers directly to the location of the string in another buffer.
This would be less confusing if we always just referred to the
strings by their value, and have the library be smart enough
to correctly resolve the offsets on its own from the right
location.
This patch makes that possible. When either reading or writing,
all the user deals with are strings, and the library does the
appropriate translations behind the scenes.
llvm-svn: 302053
|
| |
|
|
|
|
|
|
|
|
| |
llvm-readobj hand rolls some CodeView parsing code for string
tables, so this patch updates it to re-use some of the newly
introduced parsing code in LLVMDebugInfoCodeView.
Differential Revision: https://reviews.llvm.org/D32772
llvm-svn: 302052
|
| |
|
|
|
|
|
|
| |
With the forthcoming codeview::StringTable which a pdb::StringTable
would hold an instance of as one member, this ambiguity becomes
confusing. Rename to PDBStringTable to avoid this.
llvm-svn: 301948
|
| |
|
|
|
|
|
|
| |
Previously we wrote line information and file checksum
information, but we did not write information about inlinee
lines and functions. This patch adds support for that.
llvm-svn: 301936
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D32716
llvm-svn: 301882
|
| |
|
|
|
|
|
|
|
|
|
|
| |
In preparation for introducing writing capabilities for each of
these classes, I would like to adopt a Foo / FooRef naming
convention, where Foo indicates that the class can manipulate and
serialize Foos, and FooRef indicates that it is an immutable view of
an existing Foo. In other words, Foo is a writer and FooRef is a
reader. This patch names some existing readers to conform to the
FooRef convention, while offering no functional change.
llvm-svn: 301810
|
| |
|
|
| |
llvm-svn: 301738
|
| |
|
|
|
|
|
|
|
| |
There is a lot of duplicate code for printing line info between
YAML and the raw output printer. This introduces a base class
that can be shared between the two, and makes some minor
cleanups in the process.
llvm-svn: 301728
|
| |
|
|
|
|
|
|
|
|
|
|
| |
When dumping raw data from a stream, you might know the offset
of a certain record you're interested in, as well as how long
that record is. Previously, you had to dump the entire stream
and wade through the bytes to find the interesting record.
This patch allows you to specify an offset and length on the
command line, and it will only dump the requested range.
llvm-svn: 301607
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: zturner, hansw, hans
Reviewed By: hans
Subscribers: hans, llvm-commits
Differential Revision: https://reviews.llvm.org/D32611
llvm-svn: 301595
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously parsing of these were all grouped together into a
single master class that could parse any type of debug info
fragment.
With writing forthcoming, the complexity of each individual
fragment is enough to warrant them having their own classes so
that reading and writing of each fragment type can be grouped
together, but isolated from the code for reading and writing
other fragment types.
In doing so, I found a place where parsing code was duplicated
for the FileChecksums fragment, across llvm-readobj and the
CodeView library, and one of the implementations had a bug.
Now that the codepaths are merged, the bug is resolved.
Differential Revision: https://reviews.llvm.org/D32547
llvm-svn: 301557
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have a lot of very similarly named classes related to
dealing with module debug info. This patch has NFC, it just
renames some classes to be more descriptive (albeit slightly
more to type). The mapping from old to new class names is as
follows:
Old | New
ModInfo | DbiModuleDescriptor
ModuleSubstream | ModuleDebugFragment
ModStream | ModuleDebugStream
With the corresponding Builder classes renamed accordingly.
Differential Revision: https://reviews.llvm.org/D32506
llvm-svn: 301555
|
| |
|
|
| |
llvm-svn: 301358
|
| |
|
|
|
|
|
|
|
| |
We were already parsing and dumping this to the human readable
format, but not to the YAML format. This does so, in preparation
for reading it in and reconstructing the line information from
YAML.
llvm-svn: 301357
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The *real* difference between these two was that
a) The "graphical" dumper could recurse, while the text one could
not.
b) The "text" dumper could display nested types and functions,
while the graphical one could not.
Merge these two so that there is only one dumper that can recurse
arbitrarily deep and optionally display nested types or not.
llvm-svn: 301204
|
| |
|
|
|
|
|
|
| |
This reworks the way virtual bases are handled, and also the way
padding is detected across multiple levels of aggregates, producing
a much more accurate result.
llvm-svn: 301203
|
| |
|
|
| |
llvm-svn: 300258
|
| |
|
|
| |
llvm-svn: 300163
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a followup patch I intend to introduce an additional dumping
mode which dumps a graphical representation of a class's layout.
In preparation for this, the text-based layout printer needs to
be split out from the graphical layout printer, and both need
to be able to use the same code for printing the intro and outro
of a class's definition (e.g. base class list, etc).
This patch does so, and in the process introduces a skeleton
definition for the graphical printer, while currently making
the graphical printer just print nothing.
NFC
llvm-svn: 300134
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the dumping of class definitions was very primitive,
and it made it hard to do more than the most trivial of output
formats when dumping. As such, we would only dump one line for
each field, and then dump non-layout items like nested types
and enums.
With this patch, we do a complete analysis of the object
hierarchy including aggregate types, bases, virtual bases,
vftable analysis, etc. The only immediately visible effects
of this are that a) we can now dump a line for the vfptr where
before we would treat that as padding, and b) we now don't
treat virtual bases that come at the end of a class as padding
since we have a more detailed analysis of the class's storage
usage.
In subsequent patches, we should be able to use this analysis
to display a complete graphical view of a class's layout including
recursing arbitrarily deep into an object's base class / aggregate
member hierarchy.
llvm-svn: 300133
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When dumping classes, show where padding occurs, and at the end of the
class print statistics about how many bytes total of padding exist in a
class.
Since PDB doesn't specifically contain information about padding, we have
to mimic this by sort of reversing a small portion of the record layout
algorithm (e.g. looking at offsets and sizes and trying to determine
whether something is part of the same field or a new field).
Differential Revision: https://reviews.llvm.org/D31800
llvm-svn: 299869
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Adds support for pointers to arrays, which was missing
* Adds some tests
* Improves consistency of const and volatile qualifiers
* Eliminates non-composable special case code for arrays and function by using
a more general recursive approach
* Has a hack for getting the calling convention into the right spot for
pointer-to-functions
Given the rapid changes happenning in llvm-pdbdump, this may be difficult to
merge.
Differential Revision: https://reviews.llvm.org/D31832
llvm-svn: 299848
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Added some asserts to make sure concrete symbol types don't
get constructed with RawSymbols that have an incompatible
SymTag enum value.
2. Added new forwarding macros that auto-define an Id/Sym method
pair whenever there is a method that returns a SymIndexId.
Previously we would just provide one method that returned only
the SymIndexId and it was up to the caller to use the Session
object to get a pointer to the symbol. Now we automatically
get both the method that returns the Id, as well as a method
that returns the pointer directly with just one macro.
3. Added some methods for dumping straight to stdout that can
be used from inside the debugger for diagnostics during a
debug session.
4. Added a clone() method and a cast<T>() method to PDBSymbol
that can shorten some usage patterns.
llvm-svn: 299831
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously when dumping class definitions, there were only
two modes - on or off. But it's useful to sometimes get a
little more fine-grained. For example, you might only want
to see the record layout (for example to look for extraneous
padding). This patch adds a third mode, layout mode, which
does exactly that. Only this-relative data members are
displayed in this mode.
Differential Revision: https://reviews.llvm.org/D31794
llvm-svn: 299733
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we just had the -types option, which would dump all
classes, typedefs, and enums. But this produces a lot of output
if you only want to view classes, for example. This patch breaks
this down into 3 additional options, -classes, -enums, and
-typedefs, and keeps the -types option around which implies all
3 more specific options.
Differential Revision: https://reviews.llvm.org/D31791
llvm-svn: 299732
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The TypeTableBuilder provides stable storage for type records. We don't
need to copy all of the bytes into a flat vector before adding it to the
TpiStreamBuilder.
This makes addTypeRecord take an ArrayRef<uint8_t> and a hash code to go
with it, which seems like a simplification.
Reviewers: ruiu, zturner, inglorion
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31634
llvm-svn: 299406
|
| |
|
|
|
|
| |
MASM can produce these type records.
llvm-svn: 299388
|
| |
|
|
|
|
|
|
|
| |
The -output-color option was successful at suppressing color changes, but
was still allowing color resets.
Differential Revision: https://reviews.llvm.org/D31468
llvm-svn: 299006
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When dumping these records from an object file section, we should use
only one type database. However, when dumping from a PDB, we should use
two: one for the type stream and one for the IPI stream.
Certain type records that normally live in the .debug$T object file
section get moved over to the IPI stream of the PDB file and they get
new indices.
So far, I've noticed that the MSVC linker always moves these records
into IPI:
- LF_FUNC_ID
- LF_MFUNC_ID
- LF_STRING_ID
- LF_SUBSTR_LIST
- LF_BUILDINFO
- LF_UDT_MOD_SRC_LINE
These records have index fields that can point into TPI or IPI. In
particular, LF_SUBSTR_LIST and LF_BUILDINFO point to LF_STRING_ID
records to describe compilation command lines.
I've modified the dumper to have an optional pointer to the item DB, and
to do type name lookup of these fields in that DB. See printItemIndex.
The result is that our pdbdump-headers.test is more faithful to the PDB
contents and the output is less confusing.
Reviewers: ruiu
Subscribers: amccarth, zturner, llvm-commits
Differential Revision: https://reviews.llvm.org/D31309
llvm-svn: 298649
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds -color-output option to llvm-pdbdump pretty commands that lets the user
specify whether the output should have color. The default depends on whether
the output is going to a TTY (per prior discussion in
https://reviews.llvm.org/D31246).
This will enable tests that pipe llvm-pdbdump output to FileCheck to work
across platforms without regard to the differences in ANSI codes.
Differential Revision: https://reviews.llvm.org/D31263
llvm-svn: 298610
|
| |
|
|
|
|
|
| |
They are structurally the same, but now we need to distinguish them
because one record lives in the IPI stream and the other lives in TPI.
llvm-svn: 298474
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was discovered when running `llvm-pdbdump diff` against
two files, the second of which was generated by running the
first one through pdb2yaml and then yaml2pdb.
The second one was missing some bytes from the PDB Stream, and
tracking this down showed that at the end of the PDB Stream were
some additional bytes that we were ignoring. Looking back
to the reference code, these seem to specify some additional
flags that indicate whether the PDB supports various optional
features.
This patch adds support for reading, writing, and round-tripping
these flags through YAML and the raw dumper, and updates the
tests accordingly.
llvm-svn: 297984
|
| |
|
|
|
|
|
|
|
|
| |
In doing so I discovered that we completely ignore some bytes
of the PDB Stream after we "finish" loading it. These bytes
seem to specify some additional information about what kind
of data is present in the PDB. A subsequent patch will add
code to read in those fields and store their values.
llvm-svn: 297983
|
| |
|
|
|
|
|
| |
Looks like this file did not have clang-format run on
it when its initial revision was committed.
llvm-svn: 297977
|
| |
|
|
| |
llvm-svn: 297902
|
| |
|
|
| |
llvm-svn: 297901
|
| |
|
|
|
|
|
|
|
|
|
| |
Previously we did not have support for writing detailed
module information for each module, as well as the symbol
records. This patch adds support for this, and in doing
so enables the ability to construct minimal PDBs from
just a few lines of YAML. A test is added to illustrate
this functionality.
llvm-svn: 297900
|
| |
|
|
|
|
|
|
|
|
|
| |
Together, these allow lldb-pdbdump to list all the modules from a PDB using a
native reader (rather than DIA).
Note that I'll probably be specializing NativeRawSymbol in a subsequent patch.
Differential Revision: https://reviews.llvm.org/D30956
llvm-svn: 297883
|
| |
|
|
|
|
|
|
|
|
| |
For now this only diffs the stream directory and the MSF
Superblock. Future patches will drill down into individual
streams to find out where the differences lie.
Differential Revision: https://reviews.llvm.org/D30908
llvm-svn: 297689
|
| |
|
|
|
|
|
|
| |
Previously we could round-trip type records from PDB -> Yaml ->
PDB, but for symbols we could only go from PDB -> Yaml. This
completes the round-tripping for symbols as well.
llvm-svn: 297625
|
| |
|
|
|
|
|
|
|
|
| |
After several smaller patches to get most of the core improvements
finished up, this patch is a straight move and header fixup of
the source.
Differential Revision: https://reviews.llvm.org/D30266
llvm-svn: 296810
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Before the endianness was specified on each call to read
or write of the StreamReader / StreamWriter, but in practice
it's extremely rare for streams to have data encoded in
multiple different endiannesses, so we should optimize for the
99% use case.
This makes the code cleaner and more general, but otherwise
has NFC.
llvm-svn: 296415
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was reverted because it was breaking some builds, and
because of incorrect error code usage. Since the CL was
large and contained many different things, I'm resubmitting
it in pieces.
This portion is NFC, and consists of:
1) Renaming classes to follow a consistent naming convention.
2) Fixing the const-ness of the interface methods.
3) Adding detailed doxygen comments.
4) Fixing a few instances of passing `const BinaryStream& X`. These
are now passed as `BinaryStreamRef X`.
llvm-svn: 296394
|