| Commit message | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
| |
LazyRandomTypeCollection is designed for random access, and in
order to provide this it lazily indexes ranges of types. In the
case of types from an object file, there is no partial index
to build off of, so it has to index the full stream up front.
However, merging types only requires sequential access, and when
that is needed, this extra work is simply wasted. Changing the
algorithm to work on sequential arrays of types rather than
random-access type collections eliminates this up-front scan.
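The shape of the change, as a sketch with hypothetical signatures
(the actual names in the patch may differ):
  // Before: demands random access, which forces a full index of the
  // object file's type stream before the first record is merged.
  //   Error mergeTypeStreams(TypeTableBuilder &Dest,
  //                          LazyRandomTypeCollection &Types);
  // After: one forward pass over the serialized records, no index needed.
  //   Error mergeTypeStreams(TypeTableBuilder &Dest,
  //                          ArrayRef<CVType> Types);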
llvm-svn: 303707
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When writing field list records, we would construct a temporary
type serializer that shared a bump ptr allocator with the rest
of the application, so anything allocated from here would live
forever. Furthermore, this temporary serializer had all the
properties of a full-blown serializer, including record hashing
and de-duplication.
These features are required when you're merging multiple type
streams into each other, because different streams may contain
identical records, but records from the same type stream will
never collide with each other. So all of this hashing was
unnecessary.
To solve this, two fixes are made:
1) The temporary serializer keeps its own bump ptr allocator
instead of sharing a global one. When it's finished, all of
its memory is freed.
2) Instead of using the same temporary serializer for the life
of an entire type stream, we use it only for the life of a single
field list record and delete it when the field list record is
completed. This way the hash table will not grow as other
records from the same type stream get inserted. Further improvements
could eliminate hashing entirely from this codepath.
This reduces the link time by 85% in my test, from 1 minute to 9
seconds.
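A minimal sketch of the shape of the fix, with hypothetical names
standing in for the real CodeView classes:
  // (1) The temporary serializer owns its allocator, so all of its
  //     memory is reclaimed when it goes out of scope.
  struct TempFieldListSerializer {
    llvm::BumpPtrAllocator Storage; // private, freed on destruction
    // ... serialization state, hash table, etc. ...
  };
  // (2) One serializer per field list record, not per type stream.
  for (const auto &FL : FieldLists) {
    TempFieldListSerializer Temp;   // fresh allocator and hash table
    serializeFieldList(Temp, FL);
  }                                 // Temp's memory is freed here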
llvm-svn: 303676
|
|
|
|
| |
llvm-svn: 303667
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
First, StringMap uses llvm::HashString, which is only good for short
identifiers and really bad for large blobs of binary data like type
records. Moving to `DenseMap<StringRef, TypeIndex>` with some tricks for
memory allocation fixes that.
Unfortunately, that didn't buy very much performance. Profiling showed
that we spend a long time during DenseMap growth rehashing existing
entries. Also, in general, DenseMap is faster when the keys are small.
This change takes that to the logical conclusion by introducing a small
wrapper value type around a pointer to key data. The key data contains a
precomputed hash, the original record data (pointer and size), and the
type index, which is the "value" of our original map.
This reduces the time to produce llvm-as.exe and llvm-as.pdb from ~15s
on my machine to 3.5s, which is about a 4x improvement.
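Roughly, the key wrapper looks like this (simplified; names
approximate the patch):
  // Heap-allocated key data, computed once when a record is first seen.
  struct HashedTypeData {
    uint32_t Hash;                 // precomputed, reused on every rehash
    ArrayRef<uint8_t> RecordData;  // pointer and size of the record bytes
    TypeIndex Index;               // the "value" of the original map
  };
  // The map key is a single pointer, so DenseMap growth moves and
  // rehashes pointer-sized entries instead of large blobs of record data.
  struct HashedType {
    HashedTypeData *Data;
  };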
Reviewers: zturner, inglorion, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33428
llvm-svn: 303665
|
|
|
|
|
|
|
|
| |
This reverts commit e34ccb7b57da25cc89ded913d8638a2906d1110a.
This is causing failures on the ASAN bots.
llvm-svn: 303640
|
|
|
|
| |
llvm-svn: 303609
|
|
|
|
|
|
| |
Sorry for the bot noise.
llvm-svn: 303592
|
|
|
|
|
|
|
|
|
| |
separate CUs residing there
NFC, just an optimization. Will be building on this for DWP support
shortly.
llvm-svn: 303591
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous algorithm assumed that types and ids are in a single
unified stream. For inputs that come from object files, this
is the case. But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
operate on independent streams of types and ids that refer
to each other across stream boundaries.
Differential Revision: https://reviews.llvm.org/D33417
llvm-svn: 303577
|
|
|
|
| |
llvm-svn: 303576
|
|
|
|
|
|
|
|
|
|
|
| |
llvm-symbolizer would fail to symbolize addresses in unlinked object
files when handling .dwo file data because the addresses would not be
relocated in the same way as the ranges in the skeleton CU in the object
file.
Fix that so object files can be symbolized the same as executables.
llvm-svn: 303532
|
|
|
|
| |
llvm-svn: 303482
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was originally reverted because it was breaking a bunch
of bots and the breakage was not surfacing on Windows. After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling. Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.
llvm-svn: 303446
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a squash of ~5 reverts of, well, pretty much everything
I did today. Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures. I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!
At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
fix the remaining set of problems for real.
llvm-svn: 303409
|
|
|
|
| |
llvm-svn: 303408
|
|
|
|
|
|
|
|
|
|
|
| |
We were using a BumpPtrAllocator to allocate stable storage for
a record, then trying to insert that into a hash table. If a
collision occurred, the bytes were never inserted and the
allocation was unnecessary. At the cost of an extra hash
computation, we now check first whether the record exists, and
only if it does not do we allocate and insert.
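In pseudocode, the new order of operations is (hypothetical helper
names):
  // Probe first: a duplicate record costs one extra hash, nothing more.
  if (TypeIndex *Existing = lookupRecord(RecordBytes))
    return *Existing;              // collision: no allocation at all
  // Only genuinely new records get stable storage from the allocator.
  ArrayRef<uint8_t> Stable = copyBytes(Allocator, RecordBytes);
  return insertRecord(Stable);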
llvm-svn: 303407
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently this was always broken, but previously we were more
graceful about it and we would print "unknown udt" if we couldn't
find the type index, whereas now we just segfault because we
assume it's valid. But this exposed a real bug, which is that
we weren't looking in the right place. So fix that, and also
fix this crash at the same time.
llvm-svn: 303397
|
|
|
|
| |
llvm-svn: 303391
|
|
|
|
|
|
|
| |
This map will be needed to rewrite symbol streams after re-writing
the corresponding type streams.
llvm-svn: 303390
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Merging PDBs is a feature that will be used heavily by
the linker. The functionality already exists but does not
have deep test coverage because it's not easily exposed through
any tools. This patch aims to address that by adding the
ability to merge PDBs via llvm-pdbdump. It takes arbitrarily
many PDBs and outputs a single PDB.
Using this new functionality, a test is added for merging
type records. Future patches will add the ability to merge
symbol records, module information, etc.
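For reference, an invocation looks roughly like this (the exact
flag spelling may differ):
  llvm-pdbdump merge -pdb-output=merged.pdb first.pdb second.pdb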
llvm-svn: 303389
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).
But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.
This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.
This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.
The advantage of this is that all the details of how the collection is
implemented, such as lazy deserialization of partial type streams, are
completely transparent, and you can just treat any collection of types
the same regardless of where it came from.
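The interface is roughly the following (simplified from the patch):
  class TypeCollection {
  public:
    virtual ~TypeCollection() = default;
    virtual CVType getType(TypeIndex Index) = 0;
    virtual StringRef getTypeName(TypeIndex Index) = 0;
    virtual uint32_t size() = 0;
  };
  // Reading-side (TypeDatabase-like) and writing-side
  // (TypeTableBuilder-like) collections both implement this, so a
  // dumper can take a TypeCollection& and accept either one.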
Differential Revision: https://reviews.llvm.org/D33293
llvm-svn: 303388
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) Until now I'd never seen a valid PDB where the DBI stream and
the PDB Stream disagreed on the "Age" field. Because of that,
we had code to assert that they matched. Recently though I was
given a PDB where they disagreed, so this assumption has proven
to be incorrect. Remove this check.
2) We were walking the entire list of hash values for types up front
and then throwing away the values. For large PDBs this was a
significant slow down. Remove this.
With this patch, I can dump the list of all compilands from a
1.5GB PDB file in just a few seconds.
llvm-svn: 303351
|
|
|
|
|
|
|
|
|
| |
We do not need to store the relocation width field.
This patch removes the related code, which simplifies the implementation.
Differential revision: https://reviews.llvm.org/D33274
llvm-svn: 303335
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I revisited the Decompressor API (an issue with it was triggered during
the D32865 review) and found that it probably provides more than we
really need.
The issue was with the following method's signature:
  Error decompress(SmallString<32> &Out);
It is too strict. At first I wanted to change it to
decompress(SmallVectorImpl<char> &Out), but then found that it is still
not flexible because it sticks to SmallVector.
During review it was suggested to use templating to simplify the code.
This patch does that.
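The resulting signature is roughly:
  // Any contiguous byte container with resize() and data() now works,
  // not just SmallString<32>:
  template <class T> Error decompress(T &Out);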
Differential revision: https://reviews.llvm.org/D33200
llvm-svn: 303331
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
llvm-pdbdump yaml2pdb used to fail with a misleading error
message ("An I/O error occurred on the file system") if no output file
was specified. This change adds an assert to PDBFileBuilder to check
that an output file name is specified, and makes llvm-pdbdump generate
an output file name based on the input file name if no output file
name is explicitly specified.
Reviewers: amccarth, zturner
Reviewed By: zturner
Subscribers: fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D33296
llvm-svn: 303299
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is often a lot of boilerplate code required to visit a type
record or type stream. The #1 use case is that you have a sequence
of bytes that represent one or more records, and you want to
deserialize each one, switch on it, and call a callback with the
deserialized record that the user can examine. Currently this
requires at least 6 lines of code:
  codeview::TypeVisitorCallbackPipeline Pipeline;
  Pipeline.addCallbackToPipeline(Deserializer);
  Pipeline.addCallbackToPipeline(MyCallbacks);
  codeview::CVTypeVisitor Visitor(Pipeline);
  consumeError(Visitor.visitTypeRecord(Record));
With this patch, it becomes one line of code:
  consumeError(codeview::visitTypeRecord(Record, MyCallbacks));
This is done by having the deserialization happen internally inside
of the visitTypeRecord function. Since this is occasionally not
desirable, the function provides a 3rd parameter that can be used
to change this behavior.
Hopefully this can significantly reduce the barrier to entry
to using the visitation infrastructure.
Differential Revision: https://reviews.llvm.org/D33245
llvm-svn: 303271
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RelocAddrMap was a pair of <width, address>, where width is the
relocation size (4/8/x, x < 8), and the width field was never used in code.
The relocation processing loop had checks on the width field, but it does
not look like the DWARF parser should do that; there is probably not much
sense in validating relocations while processing them in the parser.
This patch removes the relocation-width-related code from DWARFContext.
llvm-svn: 303251
|
|
|
|
|
|
| |
This was mentioned as a possible cleanup during the review of D33184.
llvm-svn: 303171
|
|
|
|
|
|
|
|
|
|
|
| |
DWARFAddressRangesVector.
Recommit of r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector"
All places have been switched to use DWARFAddressRange now.
Suggested during review of D33184.
llvm-svn: 303163
|
|
|
|
|
|
|
|
|
| |
pair for DWARFAddressRangesVector."
Something went wrong; it broke a buildbot.
http://green.lab.llvm.org/green//job/clang-stage1-cmake-RA-incremental_build/38477/consoleFull#-200034420049ba4694-19c4-4d7e-bec5-911270d8a58c
llvm-svn: 303162
|
|
|
|
|
|
|
|
| |
DWARFAddressRangesVector.
Suggested during review of D33184.
llvm-svn: 303159
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I am working on a speedup of building .gdb_index in LLD and noticed
that relocations processed in DWARFContextInMemory often use the same
symbol several times in a row. This patch introduces caching to reduce
the relocation processing time.
For the benchmark, I took debug LLC binary objects configured with
-ggnu-pubnames and linked them using LLD.
Link time without --gdb-index is about 4.45s.
Link time with --gdb-index: a) Without patch: 19.16s b) With patch: 15.52s
That means the time spent on --gdb-index in this configuration is
19.16s - 4.45s = 14.71s (without patch) vs 15.52s - 4.45s = 11.07s (with patch).
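The cache itself is tiny; conceptually (hypothetical names):
  // Consecutive relocations frequently resolve the same symbol, so
  // remember the previous lookup and reuse it on a match.
  if (CacheValid && Sym == LastSym)
    return LastAddress;            // hit: skip the expensive lookup
  LastAddress = computeSymbolAddress(Sym);
  LastSym = Sym;
  CacheValid = true;
  return LastAddress;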
Differential revision: https://reviews.llvm.org/D31136
llvm-svn: 303051
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a visitor that is capable of accessing type
records randomly and caching intermediate results that it
learns about during partial linear scans. This yields
amortized O(1) access to a type stream even though type
streams cannot normally be indexed.
Differential Revision: https://reviews.llvm.org/D33009
llvm-svn: 302936
|
|
|
|
|
|
|
|
| |
Reviewers: dblaikie
Differential Revision: https://reviews.llvm.org/D32987
llvm-svn: 302574
|
|
|
|
|
|
| |
MSVC 2015); NFC.
llvm-svn: 302531
|
|
|
|
|
|
| |
This reverts commit r302520 because it broke the unit tests.
llvm-svn: 302524
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no other explanation for why this only started happening
now, even though it crashes on old code (supposedly reachable from
here).
The only common factor between the failing bots is that they use GCC
(4.9 and 5.3) to compile Clang, while the others use Clang 3.8, but the
failure is while building the tests, as an assertion, on Clang.
Commenting it out for now in the hope that the bots will go back to
green, but we should keep looking for the real cause and update bugzilla.
llvm-svn: 302520
|
|
|
|
|
|
| |
DWARFDie.
llvm-svn: 302471
|
|
|
|
| |
llvm-svn: 302470
|
|
|
|
| |
llvm-svn: 302466
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously type visitation was done strictly sequentially, and
TypeIndexes were computed by incrementing the TypeIndex of the
last visited record. This works fine for situations like dumping,
but not when you want to visit types in random order. For example,
in a debug session someone might look up a symbol by name, find that
it has TypeIndex 10,000 and then want to go straight to TypeIndex
10,000.
In order to make this work, the visitation framework needs a mode
where it can plumb TypeIndices through the callback pipeline. This
patch adds such a mode. In doing so, it is necessary to provide
an alternative implementation of TypeDatabase that supports random
access, so that is done as well.
Nothing actually uses these random access capabilities yet, but
this will be done in subsequent patches.
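With the new mode, jumping straight to a known index looks roughly
like this (names approximate the new API):
  // Fetch the record for TypeIndex 10,000 and visit just that record,
  // telling the callback pipeline which index it is seeing.
  CVType Record = TypeDB.getType(TypeIndex(10000));
  consumeError(visitTypeRecord(Record, TypeIndex(10000), MyCallbacks));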
Differential Revision: https://reviews.llvm.org/D32928
llvm-svn: 302454
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the time we know exactly how many type records we
have in a list, and we want to use the visitor to deserialize
them into actual records in a database. Previously we were
just using push_back() every time without reserving the space
up front in the vector. This is obviously terrible from a
performance standpoint, and it's not uncommon to have PDB
files with half a million type records, where the performance
degradation was quite noticeable.
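The fix is the usual one; as a sketch with hypothetical names:
  // The record count is known before deserializing, so size the vector
  // once instead of growing it through half a million push_backs.
  Records.reserve(RecordCount);
  for (uint32_t I = 0; I < RecordCount; ++I)
    Records.push_back(deserializeNextRecord());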
llvm-svn: 302302
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llvm-dwarfdump currently prints no message if decompression fails
for some reason. I noticed that while working on one of the LLD
patches, where LLD produced a broken output. It was a bit confusing
to see no output for the dumped section and no error message at all.
This patch adds an error message for such cases.
Differential revision: https://reviews.llvm.org/D32865
llvm-svn: 302221
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Verifying the hash values as we are currently doing
results in iterating every type record before the user
even tries to access the first one, and the API user
has no control over, or ability to hook into, this
process.
As a result, when the user wants to iterate over types
to print them or index them, this results in a second
iteration over the same list of types. When there's
upwards of 1,000,000 type records, this is obviously
quite undesirable.
This patch moves the verification out of TpiStream,
and llvm-pdbdump hooks a hash verification visitor
into the normal dumping process. So we still verify
the hash records, but we can do it while not requiring
a second iteration over the type stream.
Differential Revision: https://reviews.llvm.org/D32873
llvm-svn: 302206
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I tried to run llvm-pdbdump on a very large (~1.5GB) PDB to
try and identify show-stopping performance problems. This
patch addresses the first such problem.
When loading the DBI stream, before anyone has even tried to
access a single record, we build an in-memory map of every
source file for every module. In the particular PDB I was
using, this was over 85 million files. Specifically, the
complexity is O(m*n) where m is the number of modules and
n is the average number of source files (including headers)
per module.
The whole reason for doing this was so that we could have
constant time access to any module and any of its source
file lists. However, we can still get O(1) access to the
source file list for a given module with a simple O(m)
precomputation, and access to the list of modules is
already O(1) anyway.
So this patch reduces the O(m*n) up-front precomputation
to an O(m) one, where n is ~6,500 and n*m is about 85 million
in my pathological test case.
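Conceptually, the O(m) precomputation is just a prefix sum over
per-module file counts (sketch, hypothetical names):
  // Offsets[M] is where module M's file list begins; computing it
  // touches each module once, never each of the ~85 million files.
  std::vector<uint32_t> Offsets(ModuleCount);
  for (uint32_t M = 0, Off = 0; M < ModuleCount; ++M) {
    Offsets[M] = Off;
    Off += getFileCountForModule(M);
  }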
Differential Revision: https://reviews.llvm.org/D32870
llvm-svn: 302205
|
|
|
|
|
|
| |
the .debug_line section.
llvm-svn: 302180
|
|
|
|
| |
llvm-svn: 302086
|
|
|
|
| |
llvm-svn: 302069
|
|
|
|
|
|
|
|
| |
Adrian requested that we break things down to make things clean in the
DWARFVerifier. This patch breaks everything down into nice individual
functions, cleans up the code quite a bit, and prepares us for the next
round of verifiers.
Differential Revision: https://reviews.llvm.org/D32812
llvm-svn: 302062
|
|
|
|
|
|
| |
I should've staged this with my last commit.
llvm-svn: 302059
|