summaryrefslogtreecommitdiffstats
path: root/llvm/tools/llvm-pdbdump
Commit message (Collapse)AuthorAgeFilesLines
* [llvm-pdbdump] Add the option to sort functions and data.Zachary Turner2017-05-144-11/+136
| | | | llvm-svn: 302998
* [CodeView] Add a random access type visitor.Zachary Turner2017-05-121-1/+1
| | | | | | | | | | | | This adds a visitor that is capable of accessing type records randomly and caching intermediate results that it learns about during partial linear scans. This yields amortized O(1) access to a type stream even though type streams cannot normally be indexed. Differential Revision: https://reviews.llvm.org/D33009 llvm-svn: 302936
* [CodeView] Reserve TypeDatabase records up front.Zachary Turner2017-05-052-4/+4
| | | | | | | | | | | | | Most of the time we know exactly how many type records we have in a list, and we want to use the visitor to deserialize them into actual records in a database. Previously we were just using push_back() every time without reserving the space up front in the vector. This is obviously terrible from a performance standpoint, and it's not uncommon to have PDB files with half a million type records, where the performance degredation was quite noticeable. llvm-svn: 302302
* [pdb] Don't verify TPI hash values up front.Zachary Turner2017-05-041-6/+16
| | | | | | | | | | | | | | | | | | | | | | | | Verifying the hash values as we are currently doing results in iterating every type record before the user even tries to access the first one, and the API user has no control over, or ability to hook into this process. As a result, when the user wants to iterate over types to print them or index them, this results in a second iteration over the same list of types. When there's upwards of 1,000,000 type records, this is obviously quite undesirable. This patch raises the verification outside of TpiStream , and llvm-pdbdump hooks a hash verification visitor into the normal dumping process. So we still verify the hash records, but we can do it while not requiring a second iteration over the type stream. Differential Revision: https://reviews.llvm.org/D32873 llvm-svn: 302206
* [PDB] Don't build the entire source file list up front.Zachary Turner2017-05-043-36/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | I tried to run llvm-pdbdump on a very large (~1.5GB) PDB to try and identify show-stopping performance problems. This patch addresses the first such problem. When loading the DBI stream, before anyone has even tried to access a single record, we build an in memory map of every source file for every module. In the particular PDB I was using, this was over 85 million files. Specifically, the complexity is O(m*n) where m is the number of modules and n is the average number of source files (including headers) per module. The whole reason for doing this was so that we could have constant time access to any module and any of its source file lists. However, we can still get O(1) access to the source file list for a given module with a simple O(m) precomputation, and access to the list of modules is already O(1) anyway. So this patches reduces the O(m*n) up-front precomputation to an O(m) one, where n is ~6,500 and n*m is about 85 million in my pathological test case. Differential Revision: https://reviews.llvm.org/D32870 llvm-svn: 302205
* [llvm-pdbdump] Only build the TypeDatabase if necessary.Zachary Turner2017-05-042-41/+77
| | | | | | | | | Building the type database is expensive, and can take multiple minutes for large PDBs. But we only need it in certain cases depending on what command line options are specified. So only build it when we know we're about to need it. llvm-svn: 302204
* Remove unused private field.Zachary Turner2017-05-031-1/+1
| | | | llvm-svn: 302069
* [CodeView] Use actual strings for dealing with checksums and lines.Zachary Turner2017-05-031-40/+13
| | | | | | | | | | | | | | | | | | | | | The raw CodeView format references strings by "offsets", but it's confusing what table the offset refers to. In the case of line number information, it's an offset into a buffer of records, and an indirection is required to get another offset into a different table to find the final string. And in the case of checksum information, there is no indirection, and the offset refers directly to the location of the string in another buffer. This would be less confusing if we always just referred to the strings by their value, and have the library be smart enough to correctly resolve the offsets on its own from the right location. This patch makes that possible. When either reading or writing, all the user deals with are strings, and the library does the appropriate translations behind the scenes. llvm-svn: 302053
* [llvm-readobj] Update readobj to re-use parsing code.Zachary Turner2017-05-033-19/+42
| | | | | | | | | | llvm-readobj hand rolls some CodeView parsing code for string tables, so this patch updates it to re-use some of the newly introduced parsing code in LLVMDebugInfoCodeView. Differential Revision: https://reviews.llvm.org/D32772 llvm-svn: 302052
* Rename pdb::StringTable -> pdb::PDBStringTable.Zachary Turner2017-05-023-4/+4
| | | | | | | | With the forthcoming codeview::StringTable which a pdb::StringTable would hold an instance of as one member, this ambiguity becomes confusing. Rename to PDBStringTable to avoid this. llvm-svn: 301948
* [PDB/CodeView] Read/write codeview inlinee line information.Zachary Turner2017-05-029-26/+203
| | | | | | | | Previously we wrote line information and file checksum information, but we did not write information about inlinee lines and functions. This patch adds support for that. llvm-svn: 301936
* [CodeView] Write CodeView line information.Zachary Turner2017-05-013-4/+66
| | | | | | Differential Revision: https://reviews.llvm.org/D32716 llvm-svn: 301882
* [PDB/CodeView] Rename some classes.Zachary Turner2017-05-016-18/+16
| | | | | | | | | | | | In preparation for introducing writing capabilities for each of these classes, I would like to adopt a Foo / FooRef naming convention, where Foo indicates that the class can manipulate and serialize Foos, and FooRef indicates that it is an immutable view of an existing Foo. In other words, Foo is a writer and FooRef is a reader. This patch names some existing readers to conform to the FooRef convention, while offering no functional change. llvm-svn: 301810
* Remove unused private field.Zachary Turner2017-04-291-4/+3
| | | | llvm-svn: 301738
* [llvm-pdbdump] Abstract some of the YAML/Raw printing code.Zachary Turner2017-04-297-165/+288
| | | | | | | | | There is a lot of duplicate code for printing line info between YAML and the raw output printer. This introduces a base class that can be shared between the two, and makes some minor cleanups in the process. llvm-svn: 301728
* [llvm-pdbdump] Allow printing only a portion of a stream.Zachary Turner2017-04-283-5/+45
| | | | | | | | | | | | When dumping raw data from a stream, you might know the offset of a certain record you're interested in, as well as how long that record is. Previously, you had to dump the entire stream and wade through the bytes to find the interesting record. This patch allows you to specify an offset and length on the command line, and it will only dump the requested range. llvm-svn: 301607
* Fix a few pedantic warnings.Frederich Munch2017-04-271-5/+5
| | | | | | | | | | | | Reviewers: zturner, hansw, hans Reviewed By: hans Subscribers: hans, llvm-commits Differential Revision: https://reviews.llvm.org/D32611 llvm-svn: 301595
* [CodeView] Isolate Debug Info Fragments into standalone classes.Zachary Turner2017-04-272-25/+22
| | | | | | | | | | | | | | | | | | | | | Previously parsing of these were all grouped together into a single master class that could parse any type of debug info fragment. With writing forthcoming, the complexity of each individual fragment is enough to warrant them having their own classes so that reading and writing of each fragment type can be grouped together, but isolated from the code for reading and writing other fragment types. In doing so, I found a place where parsing code was duplicated for the FileChecksums fragment, across llvm-readobj and the CodeView library, and one of the implementations had a bug. Now that the codepaths are merged, the bug is resolved. Differential Revision: https://reviews.llvm.org/D32547 llvm-svn: 301557
* Rename some PDB classes.Zachary Turner2017-04-276-24/+24
| | | | | | | | | | | | | | | | | | | We have a lot of very similarly named classes related to dealing with module debug info. This patch has NFC, it just renames some classes to be more descriptive (albeit slightly more to type). The mapping from old to new class names is as follows: Old | New ModInfo | DbiModuleDescriptor ModuleSubstream | ModuleDebugFragment ModStream | ModuleDebugStream With the corresponding Builder classes renamed accordingly. Differential Revision: https://reviews.llvm.org/D32506 llvm-svn: 301555
* [llvm-pdbdump] Allow sorting / filtering by immediate paddingZachary Turner2017-04-254-2/+49
| | | | llvm-svn: 301358
* [llvm-pdbdump] Dump File / Line Info to YAML.Zachary Turner2017-04-256-8/+328
| | | | | | | | | We were already parsing and dumping this to the human readable format, but not to the YAML format. This does so, in preparation for reading it in and reconstructing the line information from YAML. llvm-svn: 301357
* [llvm-pdbdump] Merge functionality of graphical and text dumpers.Zachary Turner2017-04-249-208/+100
| | | | | | | | | | | | | | The *real* difference between these two was that a) The "graphical" dumper could recurse, while the text one could not. b) The "text" dumper could display nested types and functions, while the graphical one could not. Merge these two so that there is only one dumper that can recurse arbitrarily deep and optionally display nested types or not. llvm-svn: 301204
* [llvm-pdbdump] Re-write the record layout code to be more resilient.Zachary Turner2017-04-2411-75/+98
| | | | | | | | This reworks the way virtual bases are handled, and also the way padding is detected across multiple levels of aggregates, producing a much more accurate result. llvm-svn: 301203
* [llvm-pdbdump] Recursively dump class layout.Zachary Turner2017-04-1313-81/+335
| | | | llvm-svn: 300258
* Remove some unused private fields.Zachary Turner2017-04-132-5/+1
| | | | llvm-svn: 300163
* [llvm-pdbdump] Minor prepatory refactor of Class Def Dumper.Zachary Turner2017-04-129-125/+301
| | | | | | | | | | | | | | | | | In a followup patch I intend to introduce an additional dumping mode which dumps a graphical representation of a class's layout. In preparation for this, the text-based layout printer needs to be split out from the graphical layout printer, and both need to be able to use the same code for printing the intro and outro of a class's definition (e.g. base class list, etc). This patch does so, and in the process introduces a skeleton definition for the graphical printer, while currently making the graphical printer just print nothing. NFC llvm-svn: 300134
* [llvm-pdbdump] More advanced class definition dumping.Zachary Turner2017-04-129-103/+119
| | | | | | | | | | | | | | | | | | | | | | | | Previously the dumping of class definitions was very primitive, and it made it hard to do more than the most trivial of output formats when dumping. As such, we would only dump one line for each field, and then dump non-layout items like nested types and enums. With this patch, we do a complete analysis of the object hierarchy including aggregate types, bases, virtual bases, vftable analysis, etc. The only immediately visible effects of this are that a) we can now dump a line for the vfptr where before we would treat that as padding, and b) we now don't treat virtual bases that come at the end of a class as padding since we have a more detailed analysis of the class's storage usage. In subsequent patches, we should be able to use this analysis to display a complete graphical view of a class's layout including recursing arbitrarily deep into an object's base class / aggregate member hierarchy. llvm-svn: 300133
* [llvm-pdbdump] Display padding bytes on record layoutZachary Turner2017-04-108-36/+115
| | | | | | | | | | | | | | | When dumping classes, show where padding occurs, and at the end of the class print statistics about how many bytes total of padding exist in a class. Since PDB doesn't specifically contain information about padding, we have to mimic this by sort of reversing a small portion of the record layout algorithm (e.g. looking at offsets and sizes and trying to determine whether something is part of the same field or a new field). Differential Revision: https://reviews.llvm.org/D31800 llvm-svn: 299869
* Improves pretty printing of variable types in llvm-pdbdumpAdrian McCarthy2017-04-103-55/+90
| | | | | | | | | | | | | | | | | * Adds support for pointers to arrays, which was missing * Adds some tests * Improves consistency of const and volatile qualifiers * Eliminates non-composable special case code for arrays and function by using a more general recursive approach * Has a hack for getting the calling convention into the right spot for pointer-to-functions Given the rapid changes happenning in llvm-pdbdump, this may be difficult to merge. Differential Revision: https://reviews.llvm.org/D31832 llvm-svn: 299848
* General usability improvements to generic PDB library.Zachary Turner2017-04-105-26/+14
| | | | | | | | | | | | | | | | | | | | 1. Added some asserts to make sure concrete symbol types don't get constructed with RawSymbols that have an incompatible SymTag enum value. 2. Added new forwarding macros that auto-define an Id/Sym method pair whenever there is a method that returns a SymIndexId. Previously we would just provide one method that returned only the SymIndexId and it was up to the caller to use the Session object to get a pointer to the symbol. Now we automatically get both the method that returns the Id, as well as a method that returns the pointer directly with just one macro. 3. Added some methods for dumping straight to stdout that can be used from inside the debugger for diagnostics during a debug session. 4. Added a clone() method and a cast<T>() method to PDBSymbol that can shorten some usage patterns. llvm-svn: 299831
* Allow specification of what kinds of class members to dump.Zachary Turner2017-04-065-99/+46
| | | | | | | | | | | | | | Previously when dumping class definitions, there were only two modes - on or off. But it's useful to sometimes get a little more fine-grained. For example, you might only want to see the record layout (for example to look for extraneous padding). This patch adds a third mode, layout mode, which does exactly that. Only this-relative data members are displayed in this mode. Differential Revision: https://reviews.llvm.org/D31794 llvm-svn: 299733
* [llvm-pdbdump] Allow pretty to only dump specific types of types.Zachary Turner2017-04-063-30/+59
| | | | | | | | | | | | | Previously we just had the -types option, which would dump all classes, typedefs, and enums. But this produces a lot of output if you only want to view classes, for example. This patch breaks this down into 3 additional options, -classes, -enums, and -typedefs, and keeps the -types option around which implies all 3 more specific options. Differential Revision: https://reviews.llvm.org/D31791 llvm-svn: 299732
* [PDB] Save one type record copyReid Kleckner2017-04-041-2/+2
| | | | | | | | | | | | | | | | | | Summary: The TypeTableBuilder provides stable storage for type records. We don't need to copy all of the bytes into a flat vector before adding it to the TpiStreamBuilder. This makes addTypeRecord take an ArrayRef<uint8_t> and a hash code to go with it, which seems like a simplification. Reviewers: ruiu, zturner, inglorion Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31634 llvm-svn: 299406
* [codeview] Add support for label type recordsReid Kleckner2017-04-031-0/+11
| | | | | | MASM can produce these type records. llvm-svn: 299388
* llvm-pdbdump: If we don't change the color, don't reset the color.Adrian McCarthy2017-03-292-3/+8
| | | | | | | | | The -output-color option was successful at suppressing color changes, but was still allowing color resets. Differential Revision: https://reviews.llvm.org/D31468 llvm-svn: 299006
* [PDB] Use two DBs when dumping the IPI streamReid Kleckner2017-03-232-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When dumping these records from an object file section, we should use only one type database. However, when dumping from a PDB, we should use two: one for the type stream and one for the IPI stream. Certain type records that normally live in the .debug$T object file section get moved over to the IPI stream of the PDB file and they get new indices. So far, I've noticed that the MSVC linker always moves these records into IPI: - LF_FUNC_ID - LF_MFUNC_ID - LF_STRING_ID - LF_SUBSTR_LIST - LF_BUILDINFO - LF_UDT_MOD_SRC_LINE These records have index fields that can point into TPI or IPI. In particular, LF_SUBSTR_LIST and LF_BUILDINFO point to LF_STRING_ID records to describe compilation command lines. I've modified the dumper to have an optional pointer to the item DB, and to do type name lookup of these fields in that DB. See printItemIndex. The result is that our pdbdump-headers.test is more faithful to the PDB contents and the output is less confusing. Reviewers: ruiu Subscribers: amccarth, zturner, llvm-commits Differential Revision: https://reviews.llvm.org/D31309 llvm-svn: 298649
* Add option to control whether llvm-pdbdump outputs in colorAdrian McCarthy2017-03-233-7/+17
| | | | | | | | | | | | | | Adds -color-output option to llvm-pdbdump pretty commands that lets the user specify whether the output should have color. The default depends on whether the output is going to a TTY (per prior discussion in https://reviews.llvm.org/D31246). This will enable tests that pipe llvm-pdbdump output to FileCheck to work across platforms without regard to the differences in ANSI codes. Differential Revision: https://reviews.llvm.org/D31263 llvm-svn: 298610
* [codeview] Use separate records for LF_SUBSTR_LIST and LF_ARGLISTReid Kleckner2017-03-221-1/+5
| | | | | | | They are structurally the same, but now we need to distinguish them because one record lives in the IPI stream and the other lives in TPI. llvm-svn: 298474
* [PDB] Add support for parsing Flags from PDB Stream.Zachary Turner2017-03-166-1/+59
| | | | | | | | | | | | | | | | | | | This was discovered when running `llvm-pdbdump diff` against two files, the second of which was generated by running the first one through pdb2yaml and then yaml2pdb. The second one was missing some bytes from the PDB Stream, and tracking this down showed that at the end of the PDB Stream were some additional bytes that we were ignoring. Looking back to the reference code, these seem to specify some additional flags that indicate whether the PDB supports various optional features. This patch adds support for reading, writing, and round-tripping these flags through YAML and the raw dumper, and updates the tests accordingly. llvm-svn: 297984
* [llvm-pdbdump] Add support for diffing the PDB Stream.Zachary Turner2017-03-161-23/+110
| | | | | | | | | | In doing so I discovered that we completely ignore some bytes of the PDB Stream after we "finish" loading it. These bytes seem to specify some additional information about what kind of data is present in the PDB. A subsequent patch will add code to read in those fields and store their values. llvm-svn: 297983
* [llvm-pdbdump] clang-format Diff.cppZachary Turner2017-03-161-29/+42
| | | | | | | Looks like this file did not have clang-format run on it when its initial revision was committed. llvm-svn: 297977
* Try to fix build break due to template argument deduction.Zachary Turner2017-03-151-6/+7
| | | | llvm-svn: 297902
* [llvm-pdbdump] Add support for diffing the String Table.Zachary Turner2017-03-151-16/+136
| | | | llvm-svn: 297901
* [pdb] Write the module info and symbol record streams.Zachary Turner2017-03-155-78/+94
| | | | | | | | | | | Previously we did not have support for writing detailed module information for each module, as well as the symbol records. This patch adds support for this, and in doing so enables the ability to construct minimal PDBs from just a few lines of YAML. A test is added to illustrate this functionality. llvm-svn: 297900
* Introduce NativeEnumModules and NativeCompilandSymbolAdrian McCarthy2017-03-151-1/+6
| | | | | | | | | | | Together, these allow lldb-pdbdump to list all the modules from a PDB using a native reader (rather than DIA). Note that I'll probably be specializing NativeRawSymbol in a subsequent patch. Differential Revision: https://reviews.llvm.org/D30956 llvm-svn: 297883
* Add the beginning of PDB diffing support.Zachary Turner2017-03-139-131/+526
| | | | | | | | | | For now this only diffs the stream directory and the MSF Superblock. Future patches will drill down into individual streams to find out where the differences lie. Differential Revision: https://reviews.llvm.org/D30908 llvm-svn: 297689
* [llvm-pdbdump] Add support for dumping symbols from Yaml -> PDB.Zachary Turner2017-03-133-31/+38
| | | | | | | | Previously we could round-trip type records from PDB -> Yaml -> PDB, but for symbols we could only go from PDB -> Yaml. This completes the round-tripping for symbols as well. llvm-svn: 297625
* [Support] Move Stream library from MSF -> Support.Zachary Turner2017-03-022-2/+2
| | | | | | | | | | After several smaller patches to get most of the core improvements finished up, this patch is a straight move and header fixup of the source. Differential Revision: https://reviews.llvm.org/D30266 llvm-svn: 296810
* [PDB] Make streams carry their own endianness.Zachary Turner2017-02-281-1/+1
| | | | | | | | | | | | | Before the endianness was specified on each call to read or write of the StreamReader / StreamWriter, but in practice it's extremely rare for streams to have data encoded in multiple different endiannesses, so we should optimize for the 99% use case. This makes the code cleaner and more general, but otherwise has NFC. llvm-svn: 296415
* [PDB] Partial resubmit of r296215, which improved PDB Stream Library.Zachary Turner2017-02-273-12/+13
| | | | | | | | | | | | | | | | | This was reverted because it was breaking some builds, and because of incorrect error code usage. Since the CL was large and contained many different things, I'm resubmitting it in pieces. This portion is NFC, and consists of: 1) Renaming classes to follow a consistent naming convention. 2) Fixing the const-ness of the interface methods. 3) Adding detailed doxygen comments. 4) Fixing a few instances of passing `const BinaryStream& X`. These are now passed as `BinaryStreamRef X`. llvm-svn: 296394
OpenPOWER on IntegriCloud