summaryrefslogtreecommitdiffstats
path: root/llvm/lib/DebugInfo/PDB/Native
Commit message (Collapse)AuthorAgeFilesLines
...
* [PDB] Better printing of builtin types when using DIA dumper.Zachary Turner2018-09-201-0/+1
| | | | llvm-svn: 342658
* [PDB] Add the ability to map forward references to full decls.Zachary Turner2018-09-202-0/+133
| | | | | | | | | | | | | | | | | | | | | Some records point to an LF_CLASS, LF_UNION, LF_STRUCTURE, or LF_ENUM which is a forward reference and doesn't contain complete debug information. In these cases, we'd like to be able to quickly locate the full record. The TPI stream stores an array of pre-computed record hash values, one for each type record. If we pre-process this on startup, we can build a mapping from hash value -> {list of possible matching type indices}. Since hashes of full records are only based on the name and or unique name and not the full record contents, we can then use forward ref record to compute the hash of what *would* be the full record by just hashing the name, use this to get the list of possible matches, and iterate those looking for a match on name or unique name. llvm-pdbutil is updated to resolve forward references for the purposes of testing (plus it's just useful). Differential Revision: https://reviews.llvm.org/D52283 llvm-svn: 342656
* [PDB] Better support for enumerating pointer types.Zachary Turner2018-09-187-42/+122
| | | | | | | | | | | | | | | | | | | There were several issues with the previous implementation. 1) There were no tests. 2) We didn't support creating PDBSymbolTypePointer records for builtin types since those aren't described by LF_POINTER records. 3) We didn't support a wide enough variety of builtin types even ignoring pointers. This patch fixes all of these issues. In order to add tests, it's helpful to be able to ignore the symbol index id hierarchy because it makes the golden output from the DIA version not match our output, so I've extended the dumper to disable dumping of id fields. llvm-svn: 342493
* [PDB] Make the native reader support enumerators.Zachary Turner2018-09-173-9/+237
| | | | | | | | | | | Previously we would dump the names of enum types, but not their enumerator values. This adds support for enumerator values. In doing so, we have to introduce a general purpose mechanism for caching symbol indices of field list members. Unlike global types, FieldList members do not have a TypeIndex. So instead, we identify them by the pair {TypeIndexOfFieldList, IndexInFieldList}. llvm-svn: 342415
* [PDB] Make the native reader support modified types.Zachary Turner2018-09-175-53/+150
| | | | | | | | Previously for cv-qualified types, we would just ignore them and they would never get printed. Now we can enumerate them and cache them like any other symbol type. llvm-svn: 342414
* Give InfoStreamBuilder an opt-in method to write a hash of the PDB as GUID.Nico Weber2018-09-152-10/+34
| | | | | | | | | | | | | Naively computing the hash after the PDB data has been generated is in practice as fast as other approaches I tried. I also tried online-computing the hash as parts of the PDB were written out (https://reviews.llvm.org/D51887; that's also where all the measuring data is) and computing the hash in parallel (https://reviews.llvm.org/D51957). This approach here is simplest, without being slower. Differential Revision: https://reviews.llvm.org/D51956 llvm-svn: 342333
* [PDB] Refactor a little of the Symbol creation code.Zachary Turner2018-09-143-28/+16
| | | | | | | | | | | Eventually we need to be able to support nested types, which don't have an associated CVType record. To handle this, remove the CVType from all of the record classes, and instead store the deserialized record. Then move the deserialization up to the thing that creates the type. This actually makes error handling better anyway as we can return an invalid symbol instead of asserting false. llvm-svn: 342284
* DebugInfo/PDB: Remove unused memberDavid Blaikie2018-09-131-2/+2
| | | | llvm-svn: 342101
* [PDB] Remove all clone() methods.Zachary Turner2018-09-127-28/+0
| | | | | | | These are dead code and encourage poor usage patterns, so I'm removing them. They weren't called anywhere anyway. llvm-svn: 342093
* [PDB] Emit old fpo data to the PDB file.Zachary Turner2018-09-121-7/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | r342003 added support for emitting FPO data from the DEBUG_S_FRAMEDATA subsection of the .debug$S section to the PDB file. However, that is not the end of the story. FPO can end up in two different destinations in a PDB, each corresponding to a different FPO data source. The case handled by r342003 involves copying data from the DEBUG_S_FRAMEDATA subsection of the .debug$S section to the "New FPO" stream in the PDB, which is then referred to by the DBI stream. The case handled by this patch involves copying records from the .debug$F section of an object file to the "FPO" stream (or perhaps more aptly, the "Old FPO" stream) in the PDB file, which is also referred to by the DBI stream. The formats are largely similar, and the difference is mostly only visible in masm generated object files, such as some of the low-level CRT object files like memcpy. MASM doesn't appear to support writing the DEBUG_S_FRAMEDATA subsection, and instead just writes these records to the .debug$F section. Although clang-cl does not emit a .debug$F section ever, lld still needs to support it so we have good debugging for CRT functions. Differential Revision: https://reviews.llvm.org/D51958 llvm-svn: 342080
* [PDB] Write FPO Data to the PDB.Zachary Turner2018-09-111-3/+28
| | | | llvm-svn: 342003
* pdb output: Initialize padding in PublicsStreamHeader.Nico Weber2018-09-111-2/+3
| | | | | | | | | Makes the produced pdbs more deterministic; before they'd contain 2 arbitary bytes where this padding was. Also reorder initialization to match the order of the fields in the struct (nfc) llvm-svn: 341945
* [PDB] Change uint32_t to SymIndex wherever it makes sense.Zachary Turner2018-09-105-39/+17
| | | | | | Although it's just a typedef, it helps for readability. NFC. llvm-svn: 341863
* Fix some of the PDB tests.Zachary Turner2018-09-071-2/+0
| | | | | | | | They were unintentionally calling DIA directly, which requires Windows. We need to pass the -native flag, and this then required fixing up one or two tests. llvm-svn: 341731
* [PDB] Support pointer types in the native reader.Zachary Turner2018-09-078-11/+239
| | | | | | | | | | In order to start testing this, I've added a new mode to llvm-pdbutil which is only really useful for writing tests. It just dumps the value of raw fields in record format. This isn't really ideal and it won't allow us to test some important cases, but it's better than nothing for now. llvm-svn: 341729
* [PDB] Rename some files in the native reader.Zachary Turner2018-09-076-79/+78
| | | | | | | By calling these NativeType<foo>.cpp, they will all be sorted together, and it also distinguishes the types from the symbols. llvm-svn: 341609
* [PDB] Create a SymbolCache class.Zachary Turner2018-09-076-140/+193
| | | | | | | | | | | | | Part of the responsibility of the native PDB reader is to cache symbols the first time they are accessed, so they can then be looked up by an ID. Furthermore, we need to resolve type indices to records that we vend to the user, and other things. Previously this code was all thrown together a bit haphazardly in the native session class, but it makes sense to collect all of this into a single class whose sole responsibility is to manage the collection of known symbols. llvm-svn: 341608
* [PDB] Refactor the PDB symbol classes to fix a reuse bug.Zachary Turner2018-09-057-53/+70
| | | | | | | | | | | | | | | | | | | | The way DIA SDK works is that when you request a symbol, it gets assigned an internal identifier that is unique for the life of the session. You can then use this identifier to get back the same symbol, with all of the same internal state that it had before, even if you "destroyed" the original copy of the object you had. This didn't work properly in our native implementation, and if you destroyed an object for a particular symbol, then requested the same symbol again, it would get assigned a new ID and you'd get a fresh copy of the object. In order to fix this some refactoring had to happen to properly reuse cached objects. Some unittests are added to verify that symbol reuse is taking place, making use of the new unittest input feature. llvm-svn: 341503
* [DebugInfo] Common behavior for error typesAlexandre Ganea2018-08-314-30/+4
| | | | | | | | | | | | | | | Following D50807, and heading towards D50664, this intermediary change does the following: 1. Upgrade all custom Error types in llvm/trunk/lib/DebugInfo/ to use the new StringError behavior (D50807). 2. Implement std::is_error_code_enum and make_error_code() for DebugInfo error enumerations. 3. Rename GenericError -> PDBError (the file will be renamed in a subsequent commit) 4. Update custom error messages to follow the same formatting: (\w\s*)+\. 5. Keep generic "file not found" (ENOENT) errors as they are in PDB code. Previously, there used to be a custom enumeration for that purpose. 6. Remove a few extraneous LF in log() implementations. Printing LF is a responsability at a higher level, not at the error level. Differential Revision: https://reviews.llvm.org/D51499 llvm-svn: 341228
* [codeview] Use push_macro to avoid conflicts instead of a prefixReid Kleckner2018-08-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Summary: This prefix was added in r333421, and it changed our dumper output to say things like "CVRegEAX" instead of just "EAX". That's a functional change that I'd rather avoid. I tested GCC, Clang, and MSVC, and all of them support #pragma push_macro. They don't issue warnings whem the macro is not defined either. I don't have a Mac so I can't test the real termios.h header, but I looked at the termios.h sources online and looked for other conflicts. I saw only the CR* macros, so those are the ones we work around. Reviewers: zturner, JDevlieghere Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50851 llvm-svn: 339907
* [llvm-pdbutil] Support PDBs without a DBI streamAlexandre Ganea2018-08-061-1/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D50258 llvm-svn: 339045
* [llvm] Change 2 instances of std::sort to llvm::sortMandeep Singh Grang2018-07-161-1/+1
| | | | llvm-svn: 337192
* [PDB] memicmp only exists on Windows, use StringRef::compare_lower insteadBenjamin Kramer2018-07-061-2/+2
| | | | llvm-svn: 336469
* [PDB] One more fix for hasing GSI records.Zachary Turner2018-07-061-8/+27
| | | | | | | | | | | | | | | | The reference implementation uses a case-insensitive string comparison for strings of equal length. This will cause the string "tEo" to compare less than "VUo". However we were using a case sensitive comparison, which would generate the opposite outcome. Switch to a case insensitive comparison. Also, when one of the strings contains non-ascii characters, fallback to a straight memcmp. The only way to really test this is with a DIA test. Before this patch, the test will fail (but succeed if link.exe is used instead of lld-link). After the patch, it succeeds even with lld-link. llvm-svn: 336464
* [PDB] Sort globals symbols by name in GSI hash buckets.Zachary Turner2018-07-061-5/+19
| | | | | | | | | | | It seems like the debugger first computes a symbol's bucket, and then does a binary search of entries in the bucket using the symbol's name in order to find it. If the bucket entries are not in sorted order, this obviously won't work. After this patch a couple of simple test cases show that we generate an exactly identical GSI hash stream, which is very nice. llvm-svn: 336405
* Move some code from PDBFileBuilder to MSFBuilder.Zachary Turner2018-06-271-71/+15
| | | | | | | | The code to emit the pieces of the MSF file were actually in PDBFileBuilder. Move this to MSFBuilder so that we can theoretically emit an MSF without having a PDB file. llvm-svn: 335789
* [CodeView] Add prefix to CodeView registers.Jonas Devlieghere2018-05-291-2/+2
| | | | | | | | | | | | | Adds CVReg to CodeView register names to prevent a duplicate symbol with CR3 defined in termios.h, as suggested by Zachary on the mailing list. http://lists.llvm.org/pipermail/llvm-dev/2018-May/123372.html Differential revision: https://reviews.llvm.org/D47478 rdar://39863705 llvm-svn: 333421
* [LLD/PDB] Emit first section contribution for DBI Module Descriptor.Zachary Turner2018-04-201-0/+5
| | | | | | | | | | | | | | | | Part of the DBI stream is a list of variable length structures describing each module that contributes to the final executable. One member of this structure is a section contribution entry that describes the first section contribution in the output file for the given module. We have been leaving this structure unpopulated until now, so with this patch it is now filled out correctly. Differential Revision: https://reviews.llvm.org/D45832 llvm-svn: 330457
* [llvm-pdbutil] Dump first section contribution for each module.Zachary Turner2018-04-172-1/+5
| | | | | | | | | | | | | | The DBI stream contains a list of module descriptors. At the beginning of each descriptor is a structure representing the first section contribution in the output file for that module. LLD currently doesn't fill out this structure at all, but link.exe does. So as a precursor to emitting this data in LLD, we first need a way to dump it so that it can be checked. This patch adds support for the dumping, and verifies via a test that LLD emits bogus information. llvm-svn: 330208
* [PDB] Correctly use the target machine when writing DBI stream.Zachary Turner2018-04-161-0/+5
| | | | | | | | | Using Config->is64() will treat ARM64 as Amd64, which is incorrect. Furthermore, there are more esoteric architectures that could theoretically be encountered. Just set it directly to the machine type, which we already know anyway. llvm-svn: 330157
* Resubmit "Fix some incorrect fields in our generated PDBs."Zachary Turner2018-04-161-1/+9
| | | | | | | This fixes the failing tests. They simply hadn't been updated to match the new output resulting from this patch. llvm-svn: 330145
* Revert "Fix some incorrect fields in our generated PDBs."Zachary Turner2018-04-161-9/+1
| | | | | | | There are a couple of failing tests which slipped under my radar so I'm reverting this while I attempt to fix. llvm-svn: 330133
* Fix some incorrect fields in our generated PDBs.Zachary Turner2018-04-161-1/+9
| | | | | | | | | | | | | | | | | | Most of these are pretty trivial and obvious. Setting the toolchain version to 14.11 is perhaps a little questionable, but we've been bitten in the past where one of our version fields sidn't match MSVC's, and I definitely don't want to go through that diagnosis again as it was pretty time consuming and hard to track down. I found all of these by using llvm-pdbutil export to dump the dbi and pdb streams to a file, then using fc followed by llvm-pdbutil explain to explain the mismatched bytes. There are still some more, these are just the low hanging fruit. Differential Revision: https://reviews.llvm.org/D45276 llvm-svn: 330130
* [DebugInfoPDB] Add DIA implementations of findSymbolByRVA and findSymbolByAddrAaron Smith2018-04-101-0/+11
| | | | llvm-svn: 329724
* [llvm-pdbutil] Add the ability to explain binary files.Zachary Turner2018-04-043-14/+20
| | | | | | | | | | Using this, you can use llvm-pdbutil to export the contents of a stream to a binary file, then run explain on the binary file so that it treats the offset as an offset into the stream instead of an offset into a file. This makes it easy to compare the contents of the same stream from two different files. llvm-svn: 329207
* [llvm-pdbutil] Add an export subcommand.Zachary Turner2018-04-022-4/+13
| | | | | | | | | | | | | | | | | This command can dump the binary contents of a stream to a file. This is useful when you want to do side-by-side comparisons of a specific stream from two PDBs to examine the differences between them. You can export both of them to a file, then open them up side by side in a hex editor (for example), so as to eliminate any differences that might arise from the contents being on different blocks in the PDB. In subsequent patches I plan to improve the "explain" subcommand so that you can explain the contents of a binary file that isn't necessarily a full PDB, but one of these dumped streams, by telling the subcommand how to interpret the contents. llvm-svn: 329002
* [llvm-pdbutil] Dig deeper into the PDB and DBI streams when explaining.Zachary Turner2018-03-301-13/+9
| | | | | | | | | This will show more detail when using `llvm-pdbutil explain` on an offset in the DBI or PDB streams. Specifically, it will dig into individual header fields and substreams to give a more precise description of what the byte represents. llvm-svn: 328878
* [DebugInfoPDB] Add DIA implementation of findLineNumbersByRVAAaron Smith2018-03-261-0/+5
| | | | | | | This method is used to find line numbers for PDBSymbolData that have an invalid virtual address. llvm-svn: 328586
* [DebugInfoPDB] Add DIA implementation of addressForVA and addressForRVAAaron Smith2018-03-261-0/+10
| | | | | | These are used in finding line numbers for PDBSymbolData llvm-svn: 328585
* [PDB] Resubmit "Support embedding natvis files in PDBs."Zachary Turner2018-03-232-1/+123
| | | | | | | | | | | | This was reverted several times due to what ultimately turned out to be incompatibilities in our serialized hash table format. Several changes went in prior to this to fix those issues since they were more fundamental and independent of supporting injected sources, so now that those are fixed this change should hopefully pass. llvm-svn: 328363
* [PDB] Make our PDBs look more like MS PDBs.Zachary Turner2018-03-236-47/+137
| | | | | | | | | | | | | | | | | | When investigating bugs in PDB generation, the first step is often to do the same link with link.exe and then compare PDBs. But comparing PDBs is hard because two completely different byte sequences can both be correct, so it hampers the investigation when you also have to spend time figuring out not just which bytes are different, but also if the difference is meaningful. This patch fixes a couple of cases related to string table emission, hash table emission, and the order in which we emit strings that makes more of our bytes the same as the bytes generated by MS PDBs. Differential Revision: https://reviews.llvm.org/D44810 llvm-svn: 328348
* [Codeview/PDB] Rename some methods for clarity.Zachary Turner2018-03-221-0/+8
| | | | | | | | | NFC, this just renames some methods to better express what they do, and also adds a few helper methods to add some symmetry to the API in a few places (for example there was a getStringFromId but not a getIdFromString method in the string table). llvm-svn: 328221
* [DIA] Add IPDBSectionContrib interfaces and DIA implementationAaron Smith2018-03-221-0/+5
| | | | | | | | | | | | | | | | | | | | | To resolve symbol context at a particular address, we need to determine the compiland for the address. We are able to determine the parent compiland of PDBSymbolFunc, PDBSymbolTypeUDT, PDBSymbolTypeEnum symbols indirectly through line information. However no such information is availabile for PDBSymbolData, i.e. variables. The Section Contribution table from PDBs has information about each compiland's contribution to sections by address. For example, a piece of a contribution looks like, VA RelativeVA Sect No. Offset Length Compiland 14000087B0 000087B0 0001 000077B0 000000BB exe_main.obj So given an address, it's possible to determine its compiland with this information. llvm-svn: 328178
* [PDB] Don't ignore bucket 0 when writing the PDB string table.Zachary Turner2018-03-212-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | The hash table is a list of buckets, and the *value* stored in the bucket cannot be 0 since that is reserved. However, the code here was incorrectly skipping over the 0'th bucket entirely. The 0'th bucket is perfectly fine, just none of these buckets can contain the value 0. As a result, whenever there was a string where hash(S) % Size was equal to 0, we would write the value in the next bucket instead. We never caught this in our tests due to *another* bug, which is that we would iterate the entire list of buckets looking for the value, only using the hash value as a starting point. However, the real algorithm stops when it finds 0 in a bucket since it takes that to mean "the item is not in the hash table". The unit test is updated to carefully construct a set of hash values that will cause one item to hash to 0 mod bucket count, and the reader is also updated to return an error indicating that the item is not found when it encounters a 0 bucket. llvm-svn: 328162
* Revert "Resubmit "Support embedding natvis files in PDBs.""Zachary Turner2018-03-203-161/+13
| | | | | | | | This is still failing on a different bot this time due to some issue related to hashing absolute paths. Reverting until I can figure it out. llvm-svn: 328014
* Resubmit "Support embedding natvis files in PDBs."Zachary Turner2018-03-203-13/+161
| | | | | | | | | | | The issue causing this to fail in certain configurations should be fixed. It was due to the fact that DIA apparently expects there to be a null string at ID 1 in the string table. I'm not sure why this is important but it seems to make a difference, so set it. llvm-svn: 328002
* Revert "Support embedding natvis files in PDBs."Zachary Turner2018-03-193-161/+13
| | | | | | | This is causing a test failure on a certain bot, so I'm removing this temporarily until we can figure out the source of the error. llvm-svn: 327903
* Support embedding natvis files in PDBs.Zachary Turner2018-03-193-13/+161
| | | | | | | | | | | | | | | | | | | | | | | | Natvis is a debug language supported by Visual Studio for specifying custom visualizers. The /NATVIS option is an undocumented link.exe flag which will take a .natvis file and "inject" it into the PDB. This way, you can ship the debug visualizers for a program along with the PDB, which is very useful for postmortem debugging. This is implemented by adding a new "named stream" to the PDB with a special name of /src/files/<natvis file name> and simply copying the contents of the xml into this file. Additionally, we need to emit a single stream named /src/headerblock which contains a hash table of embedded files to records describing them. This patch adds this functionality, including the /NATVIS option to lld-link. Differential Revision: https://reviews.llvm.org/D44328 llvm-svn: 327895
* [PDB] Fix a bug where we were serializing hash tables incorrectly.Zachary Turner2018-03-151-3/+5
| | | | | | | | | | | There was some code that tried to calculate the number of 4-byte words required to hold N bits, but it was instead computing the number of bytes required to hold N bits. This was leading to extraneous data being output into the hash table, which would cause certain operations in DIA (the Microsoft PDB reader) to fail. llvm-svn: 327675
* Refactor the PDB HashTable class.Zachary Turner2018-03-153-192/+27
| | | | | | | | | It previously only worked when the key and value types were both 4 byte integers. We now have a use case for a non trivial value type, so we need to extend it to support arbitrary value types, which means templatizing it. llvm-svn: 327647
OpenPOWER on IntegriCloud