| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
This has been made obsolete by the fact that almost all of the
things it previously checked for are no longer relevant since
we can just compare bytes in a lot of places.
llvm-svn: 328562
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was originally reported as a bug with the symptom being "cvdump
crashes when printing an LLD-linked PDB that has an S_FILESTATIC record
in it". After some additional investigation, I determined that this was
a symptom of a larger problem, and in fact the real problem was in the
way we emitted the global PDB string table. As evidence of this, you can
take any lld-generated PDB, run cvdump -stringtable on it, and it would
return no results.
My hypothesis was that cvdump could not *find* the string table to begin
with. Normally it would do this by looking in the "named stream map",
finding the string /names, and using its value as the stream index. If
this lookup fails, then cvdump would fail to load the string table.
To test this hypothesis, I looked at the name stream map generated by a
link.exe PDB, and I emitted exactly those bytes into an LLD-generated
PDB. Suddenly, cvdump could read our string table!
This code has always been hacky and we knew there was something we
didn't understand. After all, there were some comments to the effect of
"we have to emit strings in a specific order, otherwise things don't
work". The key to fixing this was finally understanding this.
The way it works is that it makes use of a generic serializable hash map
that maps integers to other integers. In this case, the "key" is the
offset into a buffer, and the value is the stream number. If you index
into the buffer at the offset specified by a given key, you find the
name. The underlying cause of all these problems is that we were using
the identity function for the hash. i.e. if a string's offset in the
buffer was 12, the hash value was 12. Instead, we need to hash the
string *at that offset*. There is an additional catch, in that we have
to compute the hash as a uint32 and then truncate it to uint16.
Making this work is a little bit annoying, because we use the same hash
table in other places as well, and normally just using the identity
function for the hash function is actually what's desired. I'm not
totally happy with the template goo I came up with, but it works in any
case.
The reason we never found this bug through our own testing is because we
were building a /parallel/ hash table (in the form of an
llvm::StringMap<>) and doing all of our lookups and "real" hash table
work against that. I deleted all of that code and now everything goes
through the real hash table. Then, to test it, I added a unit test which
adds 7 strings and queries the associated values. I test every possible
insertion order permutation of these 7 strings, to verify that it really
does work as expected.
Differential Revision: https://reviews.llvm.org/D43326
llvm-svn: 325386
|
|
|
|
| |
llvm-svn: 320631
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for dumping a summary of module symbols
and CodeView debug chunks. This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size. Then
at the end it prints the totals for the entire file.
Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.
llvm-svn: 311338
|
|
|
|
|
|
|
|
|
|
|
| |
Sometimes the normal module equivalence detection algorithm doesn't
quite work. For example, you might build the same program with
MSVC and clang-cl, outputting to different object files, exes, and
PDBs, then compare them. If the object files have different names
though, then they won't be treated as equivalent. This way we
can force specific module indices to be treated as equivalent.
llvm-svn: 309983
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was originally reverted because of two issues.
1) Printing ANSI color escape codes even when outputting to
a file
2) Module name comparisons were failing when comparing a PDB
generated on one machine to a PDB generated on another
machine.
I attempted to fix #2 by adding command line options which let
you specify prefixes to strip from the beginning of embedded
paths, which effectively lets us specify a path to "base" each
PDB from and only compare the parts under the base. But this is
tricky because PDB paths always use Windows path syntax, even
when they are created on non-Windows hosts. A problem still
existed when constructing the prefix to strip, where we were
accidentally using a host-specific path separator instead of
a Windows path separator.
This resubmission fixes the issue on Linux (and I have verified
that the test now passes on Linux).
llvm-svn: 307571
|
|
|
|
|
|
|
|
| |
This reverts commit 180af3fdbdb17ec35b45ec1f925fd743b28d37e1.
This is still breaking due to linux-specific path differences.
llvm-svn: 307559
|
|
|
|
| |
llvm-svn: 307556
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A test was checked in on Friday that worked by checking in an
object file and PDB generated locally by MSVC, and then having
the test run lld-link on the object file and diffing LLD's PDB
against the checked in PDB.
This failed because part of the diffing algorithm involves
determining if two modules are the same, and if so drilling into
the module and diffing individual fields of the module. The
only thing we can use to make this determination though is the
"name" of the module, which is a path to where the module (obj
file) was read from on the machine where it was linked. This
fails for obvious reasons when comparing a PDB generated on one
machine to a PDB on another machine.
The fix employed here is to add two command line options to the
diff subcommand, which allow the user to specify a "binary root
path". The bin root path, if specified, is stripped from the
beginning of any embedded PDB paths. The test is updated to
specify the user's local test output directory for the left
PDB, and is hardcoded to the location where the original PDB
was created for the right PDB. This way all the equivalence
comparisons should succeed.
llvm-svn: 307555
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without this we would just append whatever the user
wrote on the command line, so if we're in C:\foo
and we run lld-link bar/baz.obj, we would write
C:\foo\bar/baz.obj in various places in the PDB.
MSVC linker does not do this, so we shouldn't either.
This fixes some differences in the diff test, so we
update the test as well.
Differential Revision: https://reviews.llvm.org/D35092
llvm-svn: 307423
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A couple of things were different about our generated PDBs.
1) We were outputting the wrong Version on the PDB Stream.
The version we were setting was newer than what MSVC is setting.
It's not clear what the implications are, but we change LLD
to use PdbImplVC70, as MSVC does.
2) For the optional debug stream indices in the DBI Stream, we
were outputting 0 to mean "the stream is not present". MSVC
outputs uint16_t(-1), which is the "correct" way to specify
that a stream is not present. So we fix that as well.
3) We were setting the PDB Stream signature to 0. This is supposed
to be the result of calling time(nullptr). Although this leads
to non-deterministic builds, a better way to solve that is by
having a command line option explicitly for generating a
reproducible build, and have the default behavior of lld-link
match the default behavior of link.
To test this, I'm making use of the new and improved `pdb diff`
sub command. To make it suitable for writing tests against, I had
to modify the diff subcommand slightly to print less verbose output.
Previously it would always print | <column> | <value1> | <value2> |
which is quite verbose, and the values are fragile. All we really
want to know is "did we produce the same value as link?" So I added
command line options to print a single character representing the
result status (different, identical, equivalent), and another to
hide the value display. Note that just inspecting the diff output
used to write the test, you can see some things that are obviously
wrong. That is just reflective of the fact that this is the state
of affairs today, not that we're asserting that this is "correct".
We can use this as a starting point to discover differences, fix
them, and update the test.
Differential Revision: https://reviews.llvm.org/D35086
llvm-svn: 307422
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're getting to the point that some MS tools (e.g. DIA) can recognize
our PDBs but others (e.g. link.exe) cannot. I think the way forward is
to improve our tooling to help us find differences more easily. For
example, if we can compile the same program with clang-cl and cl and
have a tool tell us all the places where the PDBs differ, this could
tell us what we're doing wrong. It's tricky though, because there are a
lot of "benign" differences in a PDB. For example, if the string table
in one PDB consists of "foo" followed by "bar" and in the other PDB it
consists of "bar" followed by "foo", this is not necessarily a critical
difference, as long as the uses of these strings also refer to the
correct location. On the other hand, if the second PDB doesn't even
contain the string "foo" at all, this is a critical difference.
diff mode has been in llvm-pdbutil for quite a while, but because of the
above challenge along with some others, it's been hard to make it
useful. I think this patch addresses that. It looks for all the same
things, but it now prints the output in tabular format (carefully
formatted and aligned into tables and fields), and it highlights
critical differences in red, non-critical differences in yellow, and
identical fields in green. This makes it easy to spot the places we
differ, and the general concept of outputting arbitrary fields in
tabular format can be extended to provide analysis into many of the
different types of information that show up in a PDB.
Differential Revision: https://reviews.llvm.org/D35039
llvm-svn: 307421
|
|
|
|
| |
llvm-svn: 305818
|
|
This is to reflect the evolving nature of the tool as being
useful for more than just dumping PDBs, as it can do many other
things.
Differential Revision: https://reviews.llvm.org/D34062
llvm-svn: 305106
|