path: root/llvm/lib/DebugInfo/PDB/Native
Commit message | Author | Age | Files | Lines
* [llvm] Change 2 instances of std::sort to llvm::sort | Mandeep Singh Grang | 2018-07-16 | 1 | -1/+1
  llvm-svn: 337192
* [PDB] memicmp only exists on Windows, use StringRef::compare_lower instead | Benjamin Kramer | 2018-07-06 | 1 | -2/+2
  llvm-svn: 336469
* [PDB] One more fix for hashing GSI records. | Zachary Turner | 2018-07-06 | 1 | -8/+27
  The reference implementation uses a case-insensitive string comparison for strings of equal length. This will cause the string "tEo" to compare less than "VUo". However, we were using a case-sensitive comparison, which would generate the opposite outcome. Switch to a case-insensitive comparison. Also, when one of the strings contains non-ASCII characters, fall back to a straight memcmp.
  The only way to really test this is with a DIA test. Before this patch, the test will fail (but succeed if link.exe is used instead of lld-link). After the patch, it succeeds even with lld-link.
  llvm-svn: 336464
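The comparison described in this commit can be sketched in standalone C++ as follows. This is a minimal illustration, not the code from the patch: the names `isAsciiName` and `gsiNameLess` are hypothetical, and ordering names of different length by length is an assumption, since the message only describes the equal-length case.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper: true if every byte of S is 7-bit ASCII.
bool isAsciiName(std::string_view S) {
  return std::all_of(S.begin(), S.end(), [](char C) {
    return static_cast<unsigned char>(C) < 0x80;
  });
}

// Hypothetical comparator for two symbol names in the same hash bucket.
bool gsiNameLess(std::string_view A, std::string_view B) {
  // Assumption: names of different length are ordered by length.
  if (A.size() != B.size())
    return A.size() < B.size();
  // Non-ASCII input: fall back to a plain byte-wise comparison.
  if (!isAsciiName(A) || !isAsciiName(B))
    return std::memcmp(A.data(), B.data(), A.size()) < 0;
  // ASCII input of equal length: compare case-insensitively.
  for (std::size_t I = 0; I < A.size(); ++I) {
    int CA = std::tolower(static_cast<unsigned char>(A[I]));
    int CB = std::tolower(static_cast<unsigned char>(B[I]));
    if (CA != CB)
      return CA < CB;
  }
  return false;
}
```

With this ordering, "tEo" sorts before "VUo" because 't' precedes 'v' once case is ignored, whereas a case-sensitive comparison would place 'V' first.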
* [PDB] Sort globals symbols by name in GSI hash buckets. | Zachary Turner | 2018-07-06 | 1 | -5/+19
  It seems like the debugger first computes a symbol's bucket, and then does a binary search of entries in the bucket using the symbol's name in order to find it. If the bucket entries are not in sorted order, this obviously won't work.
  After this patch a couple of simple test cases show that we generate an exactly identical GSI hash stream, which is very nice.
  llvm-svn: 336405
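To make the sorted-bucket requirement concrete, here is a small sketch of a bucket that is sorted by name and then searched with a binary search. `BucketEntry`, `sortBucket`, and `findInBucket` are hypothetical names, and the plain byte-wise ordering is only a placeholder for whatever ordering the consuming debugger expects.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical bucket entry: a symbol name plus the offset of its record.
struct BucketEntry {
  std::string Name;
  std::uint32_t RecordOffset;
};

// Sort one bucket by name (placeholder: plain byte-wise ordering).
void sortBucket(std::vector<BucketEntry> &Bucket) {
  std::sort(Bucket.begin(), Bucket.end(),
            [](const BucketEntry &A, const BucketEntry &B) {
              return A.Name < B.Name;
            });
}

// Binary search within a sorted bucket; returns nullptr if the name is absent.
// The result is only meaningful if the bucket was sorted with the same ordering.
const BucketEntry *findInBucket(const std::vector<BucketEntry> &Bucket,
                                const std::string &Name) {
  auto It = std::lower_bound(Bucket.begin(), Bucket.end(), Name,
                             [](const BucketEntry &E, const std::string &N) {
                               return E.Name < N;
                             });
  if (It == Bucket.end() || It->Name != Name)
    return nullptr;
  return &*It;
}
```

The key point is that the writer and the reader must agree on one ordering: a binary search over entries emitted in insertion order can miss names that are actually present.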
* Move some code from PDBFileBuilder to MSFBuilder. | Zachary Turner | 2018-06-27 | 1 | -71/+15
  The code to emit the pieces of the MSF file was actually in PDBFileBuilder. Move this to MSFBuilder so that we can theoretically emit an MSF without having a PDB file.
  llvm-svn: 335789
* [CodeView] Add prefix to CodeView registers. | Jonas Devlieghere | 2018-05-29 | 1 | -2/+2
  Adds CVReg to CodeView register names to prevent a duplicate symbol with CR3 defined in termios.h, as suggested by Zachary on the mailing list.
  http://lists.llvm.org/pipermail/llvm-dev/2018-May/123372.html
  Differential revision: https://reviews.llvm.org/D47478
  rdar://39863705
  llvm-svn: 333421
* [LLD/PDB] Emit first section contribution for DBI Module Descriptor. | Zachary Turner | 2018-04-20 | 1 | -0/+5
  Part of the DBI stream is a list of variable length structures describing each module that contributes to the final executable.
  One member of this structure is a section contribution entry that describes the first section contribution in the output file for the given module. We have been leaving this structure unpopulated until now, so with this patch it is now filled out correctly.
  Differential Revision: https://reviews.llvm.org/D45832
  llvm-svn: 330457
* [llvm-pdbutil] Dump first section contribution for each module. | Zachary Turner | 2018-04-17 | 2 | -1/+5
  The DBI stream contains a list of module descriptors. At the beginning of each descriptor is a structure representing the first section contribution in the output file for that module. LLD currently doesn't fill out this structure at all, but link.exe does. So as a precursor to emitting this data in LLD, we first need a way to dump it so that it can be checked.
  This patch adds support for the dumping, and verifies via a test that LLD emits bogus information.
  llvm-svn: 330208
* [PDB] Correctly use the target machine when writing DBI stream. | Zachary Turner | 2018-04-16 | 1 | -0/+5
  Using Config->is64() will treat ARM64 as Amd64, which is incorrect. Furthermore, there are more esoteric architectures that could theoretically be encountered. Just set it directly to the machine type, which we already know anyway.
  llvm-svn: 330157
* Resubmit "Fix some incorrect fields in our generated PDBs." | Zachary Turner | 2018-04-16 | 1 | -1/+9
  This fixes the failing tests. They simply hadn't been updated to match the new output resulting from this patch.
  llvm-svn: 330145
* Revert "Fix some incorrect fields in our generated PDBs." | Zachary Turner | 2018-04-16 | 1 | -9/+1
  There are a couple of failing tests which slipped under my radar so I'm reverting this while I attempt to fix.
  llvm-svn: 330133
* Fix some incorrect fields in our generated PDBs. | Zachary Turner | 2018-04-16 | 1 | -1/+9
  Most of these are pretty trivial and obvious. Setting the toolchain version to 14.11 is perhaps a little questionable, but we've been bitten in the past where one of our version fields didn't match MSVC's, and I definitely don't want to go through that diagnosis again as it was pretty time consuming and hard to track down.
  I found all of these by using llvm-pdbutil export to dump the dbi and pdb streams to a file, then using fc followed by llvm-pdbutil explain to explain the mismatched bytes.
  There are still some more, these are just the low hanging fruit.
  Differential Revision: https://reviews.llvm.org/D45276
  llvm-svn: 330130
* [DebugInfoPDB] Add DIA implementations of findSymbolByRVA and findSymbolByAddr | Aaron Smith | 2018-04-10 | 1 | -0/+11
  llvm-svn: 329724
* [llvm-pdbutil] Add the ability to explain binary files. | Zachary Turner | 2018-04-04 | 3 | -14/+20
  Using this, you can use llvm-pdbutil to export the contents of a stream to a binary file, then run explain on the binary file so that it treats the offset as an offset into the stream instead of an offset into a file. This makes it easy to compare the contents of the same stream from two different files.
  llvm-svn: 329207
* [llvm-pdbutil] Add an export subcommand. | Zachary Turner | 2018-04-02 | 2 | -4/+13
  This command can dump the binary contents of a stream to a file. This is useful when you want to do side-by-side comparisons of a specific stream from two PDBs to examine the differences between them. You can export both of them to a file, then open them up side by side in a hex editor (for example), so as to eliminate any differences that might arise from the contents being on different blocks in the PDB.
  In subsequent patches I plan to improve the "explain" subcommand so that you can explain the contents of a binary file that isn't necessarily a full PDB, but one of these dumped streams, by telling the subcommand how to interpret the contents.
  llvm-svn: 329002
* [llvm-pdbutil] Dig deeper into the PDB and DBI streams when explaining. | Zachary Turner | 2018-03-30 | 1 | -13/+9
  This will show more detail when using `llvm-pdbutil explain` on an offset in the DBI or PDB streams. Specifically, it will dig into individual header fields and substreams to give a more precise description of what the byte represents.
  llvm-svn: 328878
* [DebugInfoPDB] Add DIA implementation of findLineNumbersByRVA | Aaron Smith | 2018-03-26 | 1 | -0/+5
  This method is used to find line numbers for PDBSymbolData that have an invalid virtual address.
  llvm-svn: 328586
* [DebugInfoPDB] Add DIA implementation of addressForVA and addressForRVA | Aaron Smith | 2018-03-26 | 1 | -0/+10
  These are used in finding line numbers for PDBSymbolData.
  llvm-svn: 328585
* [PDB] Resubmit "Support embedding natvis files in PDBs." | Zachary Turner | 2018-03-23 | 2 | -1/+123
  This was reverted several times due to what ultimately turned out to be incompatibilities in our serialized hash table format.
  Several changes went in prior to this to fix those issues since they were more fundamental and independent of supporting injected sources, so now that those are fixed this change should hopefully pass.
  llvm-svn: 328363
* [PDB] Make our PDBs look more like MS PDBs. | Zachary Turner | 2018-03-23 | 6 | -47/+137
  When investigating bugs in PDB generation, the first step is often to do the same link with link.exe and then compare PDBs.
  But comparing PDBs is hard because two completely different byte sequences can both be correct, so it hampers the investigation when you also have to spend time figuring out not just which bytes are different, but also if the difference is meaningful.
  This patch fixes a couple of cases related to string table emission, hash table emission, and the order in which we emit strings that makes more of our bytes the same as the bytes generated by MS PDBs.
  Differential Revision: https://reviews.llvm.org/D44810
  llvm-svn: 328348
* [Codeview/PDB] Rename some methods for clarity. | Zachary Turner | 2018-03-22 | 1 | -0/+8
  NFC, this just renames some methods to better express what they do, and also adds a few helper methods to add some symmetry to the API in a few places (for example there was a getStringFromId but not a getIdFromString method in the string table).
  llvm-svn: 328221
* [DIA] Add IPDBSectionContrib interfaces and DIA implementation | Aaron Smith | 2018-03-22 | 1 | -0/+5
  To resolve symbol context at a particular address, we need to determine the compiland for the address. We are able to determine the parent compiland of PDBSymbolFunc, PDBSymbolTypeUDT, PDBSymbolTypeEnum symbols indirectly through line information. However, no such information is available for PDBSymbolData, i.e. variables.
  The Section Contribution table from PDBs has information about each compiland's contribution to sections by address. For example, a piece of a contribution looks like:
    VA          RelativeVA  Sect No.  Offset    Length    Compiland
    14000087B0  000087B0    0001      000077B0  000000BB  exe_main.obj
  So given an address, it's possible to determine its compiland with this information.
  llvm-svn: 328178
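The address-to-compiland lookup that the section contribution table enables can be sketched as below. This is an illustration only and does not use the IPDBSectionContrib interface: `SectionContrib` and `findContrib` are hypothetical, and the sketch assumes the contributions are sorted by relative virtual address and do not overlap.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified view of one section contribution.
struct SectionContrib {
  std::uint32_t RVA;     // start of the contribution
  std::uint32_t Length;  // size in bytes
  std::string Compiland; // object file that contributed the bytes
};

// Assumes Contribs is sorted by RVA and the ranges do not overlap.
// Returns the contribution covering RVA, or nullptr if there is none.
const SectionContrib *findContrib(const std::vector<SectionContrib> &Contribs,
                                  std::uint32_t RVA) {
  // First contribution that starts strictly after RVA.
  auto It = std::upper_bound(
      Contribs.begin(), Contribs.end(), RVA,
      [](std::uint32_t Addr, const SectionContrib &C) { return Addr < C.RVA; });
  if (It == Contribs.begin())
    return nullptr;
  --It; // last contribution that starts at or before RVA
  if (RVA - It->RVA >= It->Length)
    return nullptr;
  return &*It;
}
```

Using the row from the commit message, any RVA in the range 0x87B0 to 0x886A (0x87B0 + 0xBB - 1) resolves to exe_main.obj.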
* [PDB] Don't ignore bucket 0 when writing the PDB string table. | Zachary Turner | 2018-03-21 | 2 | -2/+3
  The hash table is a list of buckets, and the *value* stored in the bucket cannot be 0 since that is reserved. However, the code here was incorrectly skipping over the 0'th bucket entirely. The 0'th bucket is perfectly fine, just none of these buckets can contain the value 0.
  As a result, whenever there was a string where hash(S) % Size was equal to 0, we would write the value in the next bucket instead. We never caught this in our tests due to *another* bug, which is that we would iterate the entire list of buckets looking for the value, only using the hash value as a starting point. However, the real algorithm stops when it finds 0 in a bucket since it takes that to mean "the item is not in the hash table".
  The unit test is updated to carefully construct a set of hash values that will cause one item to hash to 0 mod bucket count, and the reader is also updated to return an error indicating that the item is not found when it encounters a 0 bucket.
  llvm-svn: 328162
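The distinction between "bucket 0" and "the value 0" can be illustrated with a small open-addressing table. This is a simplified sketch rather than the string table code: the class `IdBuckets` is hypothetical, and it stores and compares bare IDs where the real table would compare the strings those IDs refer to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bucket array in which the stored value 0 means "empty".
class IdBuckets {
public:
  explicit IdBuckets(std::size_t Count) : Buckets(Count, 0) {
    assert(Count > 0 && "need at least one bucket");
  }

  // Probe linearly from Hash % size. Bucket index 0 is a valid slot;
  // only the stored value 0 is reserved.
  bool insert(std::uint32_t Hash, std::uint32_t Id) {
    assert(Id != 0 && "0 is reserved for empty buckets");
    std::size_t Start = Hash % Buckets.size();
    for (std::size_t I = 0; I < Buckets.size(); ++I) {
      std::size_t Slot = (Start + I) % Buckets.size();
      if (Buckets[Slot] == 0) {
        Buckets[Slot] = Id;
        return true;
      }
    }
    return false; // table is full
  }

  // Lookup stops at the first empty bucket: reaching a 0 means "not present".
  bool contains(std::uint32_t Hash, std::uint32_t Id) const {
    std::size_t Start = Hash % Buckets.size();
    for (std::size_t I = 0; I < Buckets.size(); ++I) {
      std::size_t Slot = (Start + I) % Buckets.size();
      if (Buckets[Slot] == Id)
        return true;
      if (Buckets[Slot] == 0)
        return false;
    }
    return false;
  }

  std::uint32_t bucket(std::size_t I) const { return Buckets[I]; }

private:
  std::vector<std::uint32_t> Buckets;
};
```

A writer that skipped index 0 would place an item hashing to 0 into bucket 1, and a reader that stops at the first empty bucket would then see the empty bucket 0 and report the item as missing.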
* Revert "Resubmit "Support embedding natvis files in PDBs."" | Zachary Turner | 2018-03-20 | 3 | -161/+13
  This is still failing on a different bot this time due to some issue related to hashing absolute paths. Reverting until I can figure it out.
  llvm-svn: 328014
* Resubmit "Support embedding natvis files in PDBs." | Zachary Turner | 2018-03-20 | 3 | -13/+161
  The issue causing this to fail in certain configurations should be fixed.
  It was due to the fact that DIA apparently expects there to be a null string at ID 1 in the string table. I'm not sure why this is important but it seems to make a difference, so set it.
  llvm-svn: 328002
* Revert "Support embedding natvis files in PDBs." | Zachary Turner | 2018-03-19 | 3 | -161/+13
  This is causing a test failure on a certain bot, so I'm removing this temporarily until we can figure out the source of the error.
  llvm-svn: 327903
* Support embedding natvis files in PDBs. | Zachary Turner | 2018-03-19 | 3 | -13/+161
  Natvis is a debug language supported by Visual Studio for specifying custom visualizers. The /NATVIS option is an undocumented link.exe flag which will take a .natvis file and "inject" it into the PDB. This way, you can ship the debug visualizers for a program along with the PDB, which is very useful for postmortem debugging.
  This is implemented by adding a new "named stream" to the PDB with a special name of /src/files/<natvis file name> and simply copying the contents of the xml into this file. Additionally, we need to emit a single stream named /src/headerblock which contains a hash table of embedded files to records describing them.
  This patch adds this functionality, including the /NATVIS option to lld-link.
  Differential Revision: https://reviews.llvm.org/D44328
  llvm-svn: 327895
* [PDB] Fix a bug where we were serializing hash tables incorrectly. | Zachary Turner | 2018-03-15 | 1 | -3/+5
  There was some code that tried to calculate the number of 4-byte words required to hold N bits, but it was instead computing the number of bytes required to hold N bits. This was leading to extraneous data being output into the hash table, which would cause certain operations in DIA (the Microsoft PDB reader) to fail.
  llvm-svn: 327675
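The difference between the two quantities is a one-line rounding calculation. The helper names below are hypothetical and the sketch assumes N is small enough that the addition does not overflow 32 bits:

```cpp
#include <cstdint>

// Bytes needed to hold N bits: round up to a multiple of 8.
constexpr std::uint32_t bytesForBits(std::uint32_t N) { return (N + 7) / 8; }

// 4-byte words needed to hold N bits: round up to a multiple of 32.
constexpr std::uint32_t wordsForBits(std::uint32_t N) { return (N + 31) / 32; }

// For 33 bits, two 32-bit words suffice, but the byte count is 5.
static_assert(wordsForBits(33) == 2, "33 bits fit in two words");
static_assert(bytesForBits(33) == 5, "33 bits need five bytes");
```

Writing `bytesForBits(N)` words where `wordsForBits(N)` were expected emits roughly four times as many words as needed, which matches the "extraneous data" symptom described in the commit message.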
* Refactor the PDB HashTable class. | Zachary Turner | 2018-03-15 | 3 | -192/+27
  It previously only worked when the key and value types were both 4 byte integers. We now have a use case for a non trivial value type, so we need to extend it to support arbitrary value types, which means templatizing it.
  llvm-svn: 327647
* [DebugInfo] Add a new method IPDBSession::findLineNumbersBySectOffset | Aaron Smith | 2018-03-15 | 1 | -0/+6
  Summary: Some PDB symbols do not have a valid VA or RVA but have Addr by Section and Offset. For example, a variable in thread-local storage has the following properties:
    get_addressOffset: 0
    get_addressSection: 5
    get_lexicalParentId: 2
    get_name: g_tls
    get_symIndexId: 12
    get_typeId: 4
    get_dataKind: 6
    get_symTag: 7
    get_locationType: 2
  This change provides a new method to locate line numbers by Section and Offset from those symbols.
  Reviewers: zturner, rnk, llvm-commits
  Subscribers: asmith, JDevlieghere
  Differential Revision: https://reviews.llvm.org/D44407
  llvm-svn: 327601
* [PDB] Support dumping injected sources via the DIA reader. | Zachary Turner | 2018-03-13 | 1 | -0/+5
  Injected sources are basically a way to add actual source file content to your PDB. Presumably you could use this for shipping your source code with your debug information, but in practice I can only find this being used for embedding natvis files inside of PDBs.
  In order to effectively test LLVM's natvis file injection, we need a way to dump the injected sources of a PDB in a way that is authoritative (i.e. based on Microsoft's understanding of the PDB format, and not LLVM's). To this end, I've added support for dumping injected sources via DIA. I made a PDB file that used the /natvis option to generate a test case.
  Differential Revision: https://reviews.llvm.org/D44405
  llvm-svn: 327428
* [DebugInfoPDB] Add DIA implementation for getSrcLineOnTypeDefn | Aaron Smith | 2018-03-07 | 1 | -0/+6
  Summary: This helps to determine the line number for a PDB type with definition
  Reviewers: zturner, llvm-commits, rnk
  Reviewed By: zturner
  Subscribers: rengolin, JDevlieghere
  Differential Revision: https://reviews.llvm.org/D44119
  llvm-svn: 326857
* [PDB] Defer writing the build id until the rest of the PDB is written. | Zachary Turner | 2018-03-01 | 2 | -7/+26
  For now this is NFC, but this small refactor opens the door to letting us embed a hash of the PDB in the build id field of the PDB.
  Differential Revision: https://reviews.llvm.org/D43913
  llvm-svn: 326453
* [PDB] Check the result of setLoadAddress() | Aaron Smith | 2018-02-23 | 1 | -1/+1
  Summary: Change setLoadAddress() to return true on success or false on failure.
  Reviewers: zturner, llvm-commits
  Reviewed By: zturner
  Differential Revision: https://reviews.llvm.org/D43638
  llvm-svn: 325843
* [PDB] Implement more find methods for PDB symbols | Aaron Smith | 2018-02-22 | 1 | -0/+44
  Summary: Add additional find methods on PDB raw symbols.
    findChildrenByAddr()
    findChildrenByVA()
    findInlineFramesByAddr()
    findInlineFramesByVA()
    findInlineLines()
    findInlineLinesByAddr()
    findInlineLinesByRVA()
    findInlineLinesByVA()
  Reviewers: zturner, llvm-commits
  Reviewed By: zturner
  Differential Revision: https://reviews.llvm.org/D43637
  llvm-svn: 325824
* Fix emission of PDB string table. | Zachary Turner | 2018-02-16 | 4 | -171/+75
  This was originally reported as a bug with the symptom being "cvdump crashes when printing an LLD-linked PDB that has an S_FILESTATIC record in it". After some additional investigation, I determined that this was a symptom of a larger problem, and in fact the real problem was in the way we emitted the global PDB string table. As evidence of this, you can take any lld-generated PDB, run cvdump -stringtable on it, and it would return no results.
  My hypothesis was that cvdump could not *find* the string table to begin with. Normally it would do this by looking in the "named stream map", finding the string /names, and using its value as the stream index. If this lookup fails, then cvdump would fail to load the string table.
  To test this hypothesis, I looked at the name stream map generated by a link.exe PDB, and I emitted exactly those bytes into an LLD-generated PDB. Suddenly, cvdump could read our string table!
  This code has always been hacky and we knew there was something we didn't understand. After all, there were some comments to the effect of "we have to emit strings in a specific order, otherwise things don't work". The key to fixing this was finally understanding this.
  The way it works is that it makes use of a generic serializable hash map that maps integers to other integers. In this case, the "key" is the offset into a buffer, and the value is the stream number. If you index into the buffer at the offset specified by a given key, you find the name. The underlying cause of all these problems is that we were using the identity function for the hash. i.e. if a string's offset in the buffer was 12, the hash value was 12. Instead, we need to hash the string *at that offset*. There is an additional catch, in that we have to compute the hash as a uint32 and then truncate it to uint16.
  Making this work is a little bit annoying, because we use the same hash table in other places as well, and normally just using the identity function for the hash function is actually what's desired. I'm not totally happy with the template goo I came up with, but it works in any case.
  The reason we never found this bug through our own testing is because we were building a /parallel/ hash table (in the form of an llvm::StringMap<>) and doing all of our lookups and "real" hash table work against that. I deleted all of that code and now everything goes through the real hash table. Then, to test it, I added a unit test which adds 7 strings and queries the associated values. I test every possible insertion order permutation of these 7 strings, to verify that it really does work as expected.
  Differential Revision: https://reviews.llvm.org/D43326
  llvm-svn: 325386
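The difference between hashing the key (an offset) and hashing the string stored at that offset can be sketched as follows. Everything here is illustrative: `placeholderHash32` is FNV-1a, used only as a stand-in and explicitly not the hash function the PDB format uses, and `nameAt`, `identityHash`, and `nameHash` are hypothetical helpers.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Stand-in 32-bit string hash (FNV-1a). NOT the hash used by the PDB format.
std::uint32_t placeholderHash32(std::string_view S) {
  std::uint32_t H = 2166136261u;
  for (unsigned char C : S) {
    H ^= C;
    H *= 16777619u;
  }
  return H;
}

// Names are stored NUL-terminated in one buffer; the key is the start offset.
std::string_view nameAt(const std::string &Buffer, std::uint32_t Offset) {
  return std::string_view(Buffer.c_str() + Offset);
}

// The behaviour described as the bug: the offset itself is used as the hash.
std::uint16_t identityHash(std::uint32_t Offset) {
  return static_cast<std::uint16_t>(Offset);
}

// The behaviour described as the fix: hash the string at that offset as a
// 32-bit value, then truncate the result to 16 bits.
std::uint16_t nameHash(const std::string &Buffer, std::uint32_t Offset) {
  return static_cast<std::uint16_t>(placeholderHash32(nameAt(Buffer, Offset)));
}
```

With `nameHash`, the bucket for a name depends only on its characters, so a reader that hashes the string "/names" lands on the same bucket regardless of where the writer happened to place that string in the buffer.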
* Remove redundant includes from lib/DebugInfo. | Michael Zolotukhin | 2017-12-13 | 9 | -15/+0
  llvm-svn: 320620
* [DebugInfo/PDB] Adding getUndecoratedNameEx and IPDB interfaces for IDiaEnumTables and IDiaTable. | Aaron Smith | 2017-11-16 | 2 | -0/+9
  Initial changes to support debugging PE/COFF files with LLDB on Windows through DIA SDK. There is another set of changes required on the LLDB side before this does anything.
  Differential Revision: https://reviews.llvm.org/D39517
  llvm-svn: 318403
* Test commit. Add a missing dash to the standard llvm file header; NFC. | Aaron Smith | 2017-11-16 | 1 | -1/+1
  llvm-svn: 318400
* Convert FileOutputBuffer to Expected. NFC. | Rafael Espindola | 2017-11-08 | 1 | -3/+2
  llvm-svn: 317649
* [PDB] Handle an empty globals hash table with no buckets | Reid Kleckner | 2017-10-27 | 1 | -2/+3
  llvm-svn: 316722
* COFF: Add type server pdb files to linkrepro tar file. | Peter Collingbourne | 2017-10-20 | 1 | -8/+2
  Differential Revision: https://reviews.llvm.org/D38977
  llvm-svn: 316233
* COFF: PDB: Allow multiple modules with the same name. | Peter Collingbourne | 2017-09-07 | 1 | -18/+3
  It is possible for two modules to have the same name if they are archive members with the same name, or if we are doing LTO (in which case all modules will have the name "lto.tmp").
  Differential Revision: https://reviews.llvm.org/D37589
  llvm-svn: 312744
* Remove dead code. NFCI. | Peter Collingbourne | 2017-09-07 | 1 | -8/+0
  llvm-svn: 312740
* [llvm-pdbutil] Support dumping CodeView from object files. | Zachary Turner | 2017-09-01 | 1 | -2/+14
  We have llvm-readobj for dumping CodeView from object files, and llvm-pdbutil has always been more focused on PDB.
  However, llvm-pdbutil has a lot of useful options for summarizing debug information in aggregate and presenting high level statistical views. Furthermore, it's arguably better as a testing tool since we don't have to write tests to conform to a state-machine like structure where you match multiple lines in succession, each depending on a previous match. llvm-pdbutil dumps much more concisely, so it's possible to use single-line matches in many cases, whereas with readobj tests you have to use multi-line matches with an implicit state machine.
  Because of this, I'm adding object file support to llvm-pdbutil. In fact, this mirrors the cvdump tool from Microsoft, which also supports both object files and pdb files. In the future we could perhaps rename this tool llvm-cvutil.
  In the meantime, this allows us to deep dive into object files the same way we already can with PDB files.
  llvm-svn: 312358
* [llvm-pdbutil] Print detailed S_UDT stats. | Zachary Turner | 2017-08-31 | 1 | -1/+5
  This adds a new command line option, -udt-stats, which breaks down the stats of S_UDT records. These are one of the biggest contributors to the size of /DEBUG:FASTLINK PDBs, so they need some additional tools to be able to analyze their usage.
  This option will dig into each S_UDT record and determine what kind of record it points to, and then break down the statistics by the target type. The goal here is to identify how our object files differ from MSVC object files in S_UDT records, so that we can output fewer of them and reach size parity.
  llvm-svn: 312276
* [lld/pdb] Speed up construction of publics & globals addr map. | Zachary Turner | 2017-08-21 | 1 | -19/+21
  The computeAddrMap function calls std::stable_sort with a comparison function that deserializes symbols every time it's called. As a result, deserializeAs<PublicSym32> is called 20-30 times per symbol. It's much faster to calculate it beforehand and pass a pointer to it to the comparison function.
  Patch by Alex Telishev
  Differential Revision: https://reviews.llvm.org/D36941
  llvm-svn: 311373
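The optimization amounts to decoding each record once before sorting instead of inside the comparator. The sketch below shows the general pattern under stated assumptions: `RawSymbol`, `DecodedSymbol`, `decode`, and `sortedOrder` are hypothetical stand-ins (with `decode` playing the role of an expensive deserialization), and the (segment, offset, name) sort key is an assumption chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical serialized and deserialized forms of a public symbol.
struct RawSymbol {
  std::uint16_t Segment;
  std::uint32_t Offset;
  std::string Name;
};
struct DecodedSymbol {
  std::uint16_t Segment;
  std::uint32_t Offset;
  std::string Name;
};

// Counts calls so the cost of decoding is observable.
int DecodeCalls = 0;

// Stand-in for an expensive deserialization step.
DecodedSymbol decode(const RawSymbol &R) {
  ++DecodeCalls;
  return DecodedSymbol{R.Segment, R.Offset, R.Name};
}

// Decode every record exactly once, then sort indices using the cached results.
std::vector<std::size_t> sortedOrder(const std::vector<RawSymbol> &Raw) {
  std::vector<DecodedSymbol> Decoded;
  Decoded.reserve(Raw.size());
  for (const RawSymbol &R : Raw)
    Decoded.push_back(decode(R));

  std::vector<std::size_t> Order(Raw.size());
  std::iota(Order.begin(), Order.end(), std::size_t{0});
  std::stable_sort(Order.begin(), Order.end(),
                   [&Decoded](std::size_t A, std::size_t B) {
                     const DecodedSymbol &L = Decoded[A];
                     const DecodedSymbol &R = Decoded[B];
                     // Assumed key: segment, then offset, then name.
                     return std::tie(L.Segment, L.Offset, L.Name) <
                            std::tie(R.Segment, R.Offset, R.Name);
                   });
  return Order;
}
```

A sort performs on the order of N log N comparisons, so decoding inside the comparator repeats the work many times per element, while this version performs exactly N decodes.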
* [llvm-pdbutil] Add support for dumping detailed module stats. | Zachary Turner | 2017-08-21 | 1 | -0/+12
  This adds support for dumping a summary of module symbols and CodeView debug chunks. This option prints a table for each module of all of the symbols that occurred in the module, the number of times each occurred, and its total byte size. Then at the end it prints the totals for the entire file.
  Additionally, this patch adds the -jmc (just my code) option, which suppresses modules which are from external libraries or linker imports, so that you can focus only on the object files and libraries that originate from your own source code.
  llvm-svn: 311338
* Move helper classes into anonymous namespaces. | Benjamin Kramer | 2017-08-20 | 1 | -1/+1
  No functionality change intended.
  llvm-svn: 311288
* [LLD/PDB] Write actual records to the globals stream. | Zachary Turner | 2017-08-11 | 1 | -10/+29
  Previously we were writing an empty globals stream. Windows tools interpret this as "private symbols are not present in this PDB", even when they are, so we need to fix this. Regardless, without it we don't have information about global variables, so we need to fix it anyway. This patch does that.
  With this patch, the "lm" command in WinDbg correctly reports that we have private symbols available, but the "dv" command still refuses to display local variables.
  Differential Revision: https://reviews.llvm.org/D36535
  llvm-svn: 310743