path: root/lld/ELF/InputSection.h
Commit message | Author | Age | Files | Lines
* [ELF] Fix a null pointer dereference when --emit-relocs and --strip-debug are used together | Fangrui Song | 2020-04-10 | 1 | -0/+4
  Fixes https://bugs.llvm.org//show_bug.cgi?id=44878
  When --strip-debug is specified, .debug* are removed from inputSections while .rel[a].debug* (incorrectly) remain. LinkerScript::addOrphanSections() requires the output section of a relocated InputSectionBase to be created first. Because .debug* are not in inputSections, the output sections .debug* are not created, and getOutputSectionName(.rel[a].debug*) dereferences a null pointer.
  Fix the null pointer dereference by deleting .rel[a].debug* from inputSections as well.
  Reviewed By: grimar, nickdesaulniers
  Differential Revision: https://reviews.llvm.org/D74510
  (cherry picked from commit 6c73246179376442705b3a545f4e1f1478777a04)
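The fix described above can be illustrated with a minimal sketch. It is not lld's code: `Section`, `isDebugSection`, and `stripDebug` are hypothetical stand-ins, and the only idea taken from the commit message is that a relocation section must be dropped together with the .debug* section it relocates.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for lld's InputSectionBase.
struct Section {
  std::string name;
  const Section *relocated; // section this one relocates, or nullptr
};

// True for names that start with ".debug".
static bool isDebugSection(const Section &s) {
  return s.name.rfind(".debug", 0) == 0;
}

// Drop .debug* sections and every relocation section that targets one,
// so that no surviving section refers to a section without an output.
void stripDebug(std::vector<Section *> &inputSections) {
  inputSections.erase(
      std::remove_if(inputSections.begin(), inputSections.end(),
                     [](const Section *s) {
                       if (isDebugSection(*s))
                         return true;
                       return s->relocated && isDebugSection(*s->relocated);
                     }),
      inputSections.end());
}
```

Removing only the first kind of section would leave .rela.debug_info pointing at a section that never gets an output section, which is the situation the commit describes.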
* [ELF] Improve --gc-sections compatibility with GNU ld regarding section groups | Fangrui Song | 2019-11-19 | 1 | -0/+4
  Based on D70020 by serge-sans-paille.
  The ELF spec says:
  > Furthermore, there may be internal references among these sections that would not make sense if one of the sections were removed or replaced by a duplicate from another object. Therefore, such groups must be included or omitted from the linked object as a unit. A section cannot be a member of more than one group.
  GNU ld has 2 behaviors that we don't have:
  - Group members (nextInSectionGroup != nullptr) are subject to garbage collection. This includes non-SHF_ALLOC SHT_NOTE sections. In particular, discarding non-SHF_ALLOC SHT_NOTE sections is a behavior expected by the Annobin project. See https://developers.redhat.com/blog/2018/02/20/annobin-storing-information-binaries/ for more information.
  - Group members are retained or discarded as a unit. Members may have internal references that are not expressed as SHF_LINK_ORDER, relocations, etc. It seems that we should be more conservative here: if a section is marked live, mark all the other members within the group.
  Both behaviors are reasonable. This patch implements them.
  A new field InputSectionBase::nextInSectionGroup tracks the next member within a group. On ELF64, this increases sizeof(InputSectionBase) from 144 to 152.
  InputSectionBase::dependentSections tracks section dependencies, which is used by both --gc-sections and /DISCARD/. We can't overload it for the "next member" semantic, because we should allow /DISCARD/ to discard sections independent of --gc-sections (GNU ld behavior). This behavior may be reasonably used by `/DISCARD/ : { *(.ARM.exidx*) }` or `/DISCARD/ : { *(.note*) }` (new test `linkerscript/discard-group.s`).
  Reviewed By: ruiu
  Differential Revision: https://reviews.llvm.org/D70146
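The "retained as a unit" rule can be sketched as follows. This is only an illustration under stated assumptions: the `Section` type and `markLive` helper are hypothetical, and the sketch assumes the members of a group are chained into a circular list through `nextInSectionGroup`, which the commit message does not spell out.

```cpp
#include <cassert>

// Hypothetical, simplified section with only the fields the sketch needs.
struct Section {
  bool live = false;
  Section *nextInSectionGroup = nullptr; // null if not a group member
};

// Mark a section live and, if it belongs to a group, mark every other
// member of that group live too. Assumes the group is a circular list.
void markLive(Section &sec) {
  if (sec.live)
    return;
  sec.live = true;
  for (Section *s = sec.nextInSectionGroup; s && s != &sec;
       s = s->nextInSectionGroup)
    s->live = true;
}
```

A section outside any group has a null `nextInSectionGroup`, so it is handled like before and can still be garbage collected on its own.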
* [ELF] Delete SectionBase::assigned | Fangrui Song | 2019-09-24 | 1 | -17/+4
  D67504 removed uses of `assigned` from OutputSection::addSection, which makes `assigned` purely used in processSectionCommands() and its callees. By replacing its references with `parent`, we can remove `assigned`.
  Reviewed By: grimar
  Differential Revision: https://reviews.llvm.org/D67531
  llvm-svn: 372735
* [ELF] Fix variable names in comments after VariableName -> variableName change | Fangrui Song | 2019-07-16 | 1 | -2/+2
  Also fix some typos.
  llvm-svn: 366181
* [Coding style change] Rename variables so that they start with a lowercase letter | Rui Ueyama | 2019-07-10 | 1 | -113/+113
  This patch is mechanically generated by the clang-llvm-rename tool that I wrote using Clang Refactoring Engine just for creating this patch. You can see the source code of the tool at https://reviews.llvm.org/D64123. There's no manual post-processing; you can generate the same patch by re-running the tool against lld's code base.
  Here is the main discussion thread to change the LLVM coding style: https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html
  In the discussion thread, I proposed we use lld as a testbed for the variable naming scheme change, and this patch does that.
  I chose to rename variables so that they are in camelCase, just because that is a minimal change to make variables start with a lowercase letter.
  Note to downstream patch maintainers: if you are maintaining a downstream lld repo, just rebasing ahead of this commit would cause massive merge conflicts because this patch essentially changes every line in the lld subdirectory. But there's a remedy. The clang-llvm-rename tool is a batch tool, so you can rename variables in your downstream repo with the tool. Given that, here is how to rebase your repo to a commit after the mass renaming:
  1. rebase to the commit just before the mass variable renaming,
  2. apply the tool to your downstream repo to mass-rename variables locally, and
  3. rebase again to the head.
  Most changes made by the tool should be identical for a downstream repo and for the head, so at step 3, almost all changes should be merged and disappear. I'd expect that there would be some lines that you need to merge by hand, but that shouldn't be too many.
  Differential Revision: https://reviews.llvm.org/D64121
  llvm-svn: 365595
* ELF: Create synthetic sections for loadable partitions. | Peter Collingbourne | 2019-06-07 | 1 | -0/+4
  We create several types of synthetic sections for loadable partitions, including:
  - The dynamic symbol table. This allows code outside of the loadable partitions to find entry points with dlsym.
  - Creating a dynamic symbol table also requires the creation of several other synthetic sections for the partition, such as the dynamic table and hash table sections.
  - The partition's ELF header is represented as a synthetic section in the combined output file, and will be used by llvm-objcopy to extract partitions.
  Differential Revision: https://reviews.llvm.org/D62350
  llvm-svn: 362819
* ELF: Add basic partition data structures and behaviours. | Peter Collingbourne | 2019-05-29 | 1 | -6/+11
  This change causes us to read partition specifications from partition specification sections and split output sections into partitions according to their reachability from partition entry points.
  This is only the first step towards a full implementation of partitions. Later changes will add additional synthetic sections to each partition so that they can be loaded independently.
  Differential Revision: https://reviews.llvm.org/D60353
  llvm-svn: 361925
* Revert r358069 "Discard debuginfo for object files empty after GC" | Bob Haarman | 2019-05-16 | 1 | -7/+3
  The change broke some scenarios where debug information is still needed, although MarkLive cannot see it, including the Chromium/Android build. Reverting to unbreak that build.
  llvm-svn: 360955
* [ELF] Place SectionPiece::{Live,Hash} bit fields together | Fangrui Song | 2019-04-18 | 1 | -5/+4
  Summary: We access Live and OutputOff (which may share the same memory location) concurrently in 2 parallelForEachN loops. Separating them avoids subtle data races like D41884/PR35788. This patch places Live and Hash together.
  2 reasons this is appealing:
  1) Hash is immutable. Live is almost read-only: it is only written once in MarkLive.cpp, where Hash is not accessed.
  2) We already discard low bits of Hash to decide ShardID. It doesn't matter much if we reduce the 32-bit Hash to 31 bits.
  For a huge internal clang -O3 executable (1.6GiB), `Strings` in StringTableBuilder::finalizeStringTable contains at most 310253 elements. The expected number of pair-wise collisions 2^(-31) * C(310253,2) ~= 22.41 is too small to have a negative impact on performance. In fact, my benchmark shows a minor performance improvement.
  Differential Revision: https://reviews.llvm.org/D60765
  llvm-svn: 358645
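The layout and the collision estimate above can be written down as a short sketch. The `Piece` struct below is a hypothetical stand-in modelled on the description (a 1-bit live flag sharing a word with a 31-bit hash, and the output offset in its own word); `expectedCollisions` is an illustrative helper, not an lld function.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical piece record: live and hash share one 32-bit word, so a
// concurrent write to outputOff never touches the word holding live.
struct Piece {
  uint32_t inputOff;
  uint32_t live : 1;
  uint32_t hash : 31;
  uint64_t outputOff;
};

// Expected number of colliding pairs among n uniformly distributed
// hashes of hashBits bits: C(n, 2) * 2^(-hashBits).
double expectedCollisions(double n, int hashBits) {
  return n * (n - 1) / 2 * std::ldexp(1.0, -hashBits);
}
```

Calling `expectedCollisions(310253, 31)` reproduces the figure of roughly 22.41 quoted in the commit message.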
* Discard debuginfo for object files empty after GC | Rui Ueyama | 2019-04-10 | 1 | -3/+7
  Patch by Robert O'Callahan.
  Rust projects tend to link in all object files from all dependent libraries and rely on --gc-sections to strip unused code and data. Unfortunately --gc-sections doesn't currently strip any debuginfo associated with GC'ed sections, so lld links in the full debuginfo from all dependencies even if almost all that code has been discarded. See https://github.com/rust-lang/rust/issues/56068 for some details.
  Properly stripping debuginfo for discarded sections would be difficult, but a simple approach that helps significantly is to mark debuginfo sections as live only if their associated object file has at least one live code/data section. This patch does that. In a (contrived but not totally artificial) Rust testcase linked above, it reduces the final binary size from 46MB to 5.1MB.
  Differential Revision: https://reviews.llvm.org/D54747
  llvm-svn: 358069
* ELF: Use bump pointer allocator for uncompressed section buffers. NFCI. | Peter Collingbourne | 2019-03-12 | 1 | -5/+6
  This shaves another word off SectionBase and makes it possible to clone a section using the implicit copy constructor.
  This basically reverts r311056, which removed the mutex in order to make the code easier to understand. On balance I think it's probably more straightforward to have a mutex here than to have an unusual copy constructor in SectionBase.
  Differential Revision: https://reviews.llvm.org/D59269
  llvm-svn: 355966
* ELF: Reduce the size of InputSectionBase by two words. NFCI. | Peter Collingbourne | 2019-03-07 | 1 | -21/+21
  - The Assigned bit was previously taking a word on its own. Move it into the bit fields in SectionBase.
  - NumRelocations and AreRelocsRela were previously also taking up a word despite only using half of it. Move them into the alignment gap after SectionBase's fields.
  Differential Revision: https://reviews.llvm.org/D59044
  llvm-svn: 355622
* Update the file headers across all of the LLVM projects in the monorepo | Chandler Carruth | 2019-01-19 | 1 | -4/+3
  to reflect the new license.
  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
  llvm-svn: 351636
* Do not use a hash table to uniquify mergeable strings. | Rui Ueyama | 2018-12-05 | 1 | -1/+0
  Previously, we had a hash table containing strings and their offsets to manage mergeable strings. Technically we can live without that, because we can do a binary search on a vector of mergeable strings to find a mergeable string. We had both the hash table and the binary search because we thought that was faster.
  We recently observed that lld tends to consume more memory than gold when building an output with debug info. A few percent of memory is consumed by the hash table. So, we needed to reevaluate whether or not having the extra hash table is a good CPU/memory tradeoff.
  I ran a few benchmarks with and without the hash table and got mixed results. We observed a regression for some programs by removing the hash table (that's what we expected), but we also observed performance improvements for some programs. This is perhaps due to reduced memory usage.
  Differential Revision: https://reviews.llvm.org/D55234
  llvm-svn: 348401
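The binary search mentioned above can be sketched like this. It is a simplified illustration, not lld's implementation: `Piece` and `findPiece` are hypothetical names, and the sketch assumes the pieces are stored sorted by their starting offset in the input section.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record for one mergeable string: where it starts in the
// input section. Pieces are assumed to be sorted by inputOff.
struct Piece {
  uint32_t inputOff;
};

// Return the index of the piece that contains `offset`, i.e. the last
// piece whose starting offset is <= offset. No hash table is needed.
std::size_t findPiece(const std::vector<Piece> &pieces, uint32_t offset) {
  auto it = std::upper_bound(
      pieces.begin(), pieces.end(), offset,
      [](uint32_t off, const Piece &p) { return off < p.inputOff; });
  assert(it != pieces.begin() && "offset precedes the first piece");
  return static_cast<std::size_t>(it - pieces.begin()) - 1;
}
```

The lookup costs O(log n) comparisons per query instead of an expected O(1) hash probe, which is the CPU/memory tradeoff the commit message weighs.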
* [LLD][ELF] - Simplify. NFCI. | George Rimar | 2018-11-23 | 1 | -1/+0
  This makes getRISCVPCRelHi20 a static local helper, and rotates the 'if' condition.
  llvm-svn: 347497
* Avoid unnecessary buffer allocation and memcpy for compressed sections. | Rui Ueyama | 2018-10-08 | 1 | -14/+22
  Previously, we uncompressed all compressed sections before doing anything. That works, and that is conceptually simple, but that could result in a waste of CPU time and memory if uncompressed sections are then discarded or just copied to the output buffer.
  In particular, if .debug_gnu_pub{names,types} are compressed and if no -gdb-index option is given, we wasted CPU and memory because we uncompressed them into newly allocated buffers and then memcpy'd the buffers to the output buffer. That temporary buffer was redundant.
  This patch changes how to uncompress sections. Now, compressed sections are uncompressed lazily. To do that, the `Data` member of `InputSectionBase` is now hidden from outside, and the `data()` accessor automatically expands a compressed buffer if necessary. If no one calls `data()`, then `writeTo()` directly uncompresses compressed data into the output buffer. That eliminates the redundant memory allocation and redundant memcpy.
  This patch significantly reduces memory consumption (20 GiB max RSS to 15 GiB) for an executable whose .debug_gnu_pub{names,types} are in total 5 GiB in an uncompressed form.
  Differential Revision: https://reviews.llvm.org/D52917
  llvm-svn: 343979
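The lazy scheme above can be sketched with a toy class. Everything here is an assumption for illustration: `LazySection` is hypothetical, `std::string` stands in for the section and output buffers, and a trivial run-length decoder stands in for the real decompressor. Only the control flow follows the description: `data()` expands on first use, while `writeTo()` decodes straight into the output without keeping a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical section whose contents may still be compressed.
class LazySection {
public:
  LazySection(std::string raw, bool compressed)
      : raw_(std::move(raw)), compressed_(compressed) {}

  // Expand the buffer the first time someone asks for the contents.
  const std::string &data() {
    if (compressed_) {
      raw_ = decode(raw_);
      compressed_ = false;
      ++decodeCount;
    }
    return raw_;
  }

  // Append the contents to `out`; a still-compressed section is decoded
  // directly into the output and no expanded copy is retained.
  void writeTo(std::string &out) const {
    out += compressed_ ? decode(raw_) : raw_;
  }

  int decodeCount = 0; // how many times data() had to expand the buffer

private:
  // Toy run-length decoder: "3a2b" becomes "aaabb" (single-digit counts).
  static std::string decode(const std::string &in) {
    std::string out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2)
      out.append(static_cast<std::size_t>(in[i] - '0'), in[i + 1]);
    return out;
  }

  std::string raw_;
  bool compressed_;
};
```

A section that is only ever written out never pays for a retained expanded buffer, which is where the memory saving reported in the commit message would come from.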
* Revert r342297: Discard uncompressed buffer after creating .gdb_index contents. | Rui Ueyama | 2018-09-14 | 1 | -0/+1
  Looks like it broke some local builds that use -gdb-index.
  llvm-svn: 342298
* Discard uncompressed buffer after creating .gdb_index contents. | Rui Ueyama | 2018-09-14 | 1 | -1/+0
  Once we create .gdb_index contents, .zdebug_gnu_pub{names,types} are useless, so there's no need to keep their uncompressed data in memory.
  I observed that for a test case in which lld creates a 3GB .gdb_index section, the maximum resident set size reduced from 43GB to 29GB after this patch.
  Differential Revision: https://reviews.llvm.org/D52126
  llvm-svn: 342297
* Support RISC-V | Rui Ueyama | 2018-08-09 | 1 | -0/+2
  Patch by PkmX.
  This patch makes lld recognize the RISC-V target and implements basic relocation for RV32/RV64 (and RVC). This should be necessary for static linking ELF applications.
  The ABI documentation for RISC-V can be found at: https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md. Note that the documentation is far from complete so we had to figure out some details from bfd.
  The patch should be pretty straightforward. Some highlights:
  - A new relocation Expr R_RISCV_PC_INDIRECT is added. This is needed as the low part of a PC-relative relocation is linked to the corresponding high part (auipc), see: https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#pc-relative-symbol-addresses
  - LLVM's MC support for RISC-V is very incomplete (we are working on this), so tests are given in objectyaml format with the original assembly included in the comments. Once we have complete support for RISC-V in MC, we can switch to llvm-as/llvm-objdump.
  - We don't support linker relaxation for now as it requires greater changes to lld that are beyond the scope of this patch. Once this is accepted we can start to work on adding relaxation to lld.
  Differential Revision: https://reviews.llvm.org/D39322
  llvm-svn: 339364
* Implement framework for linking split-stack object files, and x86_64 support. | Sterling Augustine | 2018-07-17 | 1 | -1/+15
  llvm-svn: 337332
* ELF: Do not ICF two sections with different output sections. | Peter Collingbourne | 2018-05-23 | 1 | -1/+1
  Note that this doesn't do the right thing in the case where there is a linker script. We probably need to move output section assignment before ICF to get the correct behaviour here.
  Differential Revision: https://reviews.llvm.org/D47241
  llvm-svn: 333052
* [ELF] Implement --keep-unique option | Peter Smith | 2018-05-15 | 1 | -2/+5
  The --keep-unique <symbol> option is taken from gold. The intention is that <symbol> will be prevented from being folded by ICF. Although not specifically mentioned in the documentation, <symbol> only matches global symbols, with a warning if the symbol is not found.
  The implementation finds the Section defining <symbol> and removes it from the set of sections considered for ICF.
  Differential Revision: https://reviews.llvm.org/D46755
  llvm-svn: 332332
* Split merge sections early. | Rafael Espindola | 2018-04-27 | 1 | -8/+0
  Now that getSectionPiece is fast (uses a hash) it is probably OK to split merge sections early.
  The reason I want to do this is to split eh_frame sections in the same place.
  This does mean that we have to decompress early. Given that the only compressed sections are debug info, I don't think we are missing much.
  It is a small improvement: 0.5% on the geometric mean.
  llvm-svn: 331058
* Define InputSection::getOffset inline. | Rafael Espindola | 2018-04-19 | 1 | -0/+2
  This is much simpler than the other section types and there are many places where the section type is statically known.
  llvm-svn: 330350
* Rename MergeInputSection::getOffset. | Rafael Espindola | 2018-04-19 | 1 | -3/+3
  Unlike the getOffset in the base class, this one computes the offset in the parent synthetic section, not the final output section.
  llvm-svn: 330339
* Reduce code duplication. | Rafael Espindola | 2018-04-13 | 1 | -1/+1
  getVA was already implemented in the base class.
  llvm-svn: 330036
* Initialize OutputOff to zero. | Rafael Espindola | 2018-04-05 | 1 | -1/+1
  We have a dedicated Live bit, so we don't need a special value, and we were not accounting for it in at least one place.
  llvm-svn: 329307
* Inline initOffsetMap. | Rafael Espindola | 2018-04-03 | 1 | -3/+1
  In the lld perf builder r328686 had a negative impact in stalled-cycles-frontend. Somehow that stat is not showing on my machine, but the attached patch shows an improvement on cache-misses, which is probably a reasonable proxy.
  My working theory is that given a large input the pieces vector is out of cache by the time initOffsetMap runs.
  Both finalizeContents implementations have a convenient location for initializing the OffsetMap, so this seems the best solution.
  llvm-svn: 329117
* Initialize OffsetMap in a known location. | Rafael Espindola | 2018-03-28 | 1 | -4/+2
  This is a small optimization and avoids the need to use call_once.
  llvm-svn: 328686
* Define a trivial method inline. | Rafael Espindola | 2018-03-28 | 1 | -1/+3
  llvm-svn: 328685
* Store live offsets as uint32_t. | Rafael Espindola | 2018-03-28 | 1 | -1/+1
  We don't support input merge sections larger than 4GB, so these can be uint32_t.
  llvm-svn: 328684
* Add a SectionBase::getVA helper. NFC. | Rafael Espindola | 2018-03-24 | 1 | -0/+2
  There were a few too many places duplicating this.
  llvm-svn: 328402
* s/uncompress/decompress/g. | Rui Ueyama | 2018-02-12 | 1 | -3/+3
  In lld, we use both "uncompress" and "decompress", which is confusing. Since LLVM uses "decompress", we should use the same term.
  llvm-svn: 324944
* Move function to the file where it is used. | Rafael Espindola | 2018-01-30 | 1 | -4/+0
  llvm-svn: 323780
* Detemplate reportDuplicate. | Rafael Espindola | 2017-12-23 | 1 | -1/+0
  We normally avoid "switch (Config->EKind)", but in this case I think it is worth it. It is only executed when there is an error and it allows detemplating a lot of code.
  llvm-svn: 321404
* Pass an InputFile to the InputSection constructor. | Rafael Espindola | 2017-12-21 | 1 | -1/+1
  This simplifies toRegularSection and reduces the noise in a followup patch.
  llvm-svn: 321240
* Convert a few more InputFiles to references. | Rafael Espindola | 2017-12-21 | 1 | -4/+4
  We use null files in sections to represent linker created sections, so ObjFile<ELFT> is never null.
  llvm-svn: 321238
* Detemplate createCommentSection. | Rafael Espindola | 2017-12-21 | 1 | -0/+3
  It was only templated so it could create a dummy section header that was immediately parsed back.
  llvm-svn: 321235
* Move Repl to SectionBase. | Rafael Espindola | 2017-12-13 | 1 | -10/+10
  It is currently in InputSectionBase. Only InputSections are used in ICF, so Repl should be moved to InputSection to clean up the class hierarchy or, like this patch does, to SectionBase for convenience.
  The convenience of having it on the base class is that we can just access the replacement without having to first check if it is an InputSection.
  It is a bit less code and a bit faster as some of this code is very hot. I got up to 1.77% improvement in clang-gdb-index and no regressions according to lnt.
  llvm-svn: 320654
* Fix the type of the Discarded section. | Rafael Espindola | 2017-12-13 | 1 | -8/+2
  It is constructed with a kind of Regular and will dyn_cast to InputSection, but is declared to be an InputSectionBase.
  llvm-svn: 320539
* Fix line endings. NFC. | Rafael Espindola | 2017-12-12 | 1 | -5/+5
  llvm-svn: 320502
* [ELF] Reset OutputSection size prior to processing linker script commands | James Henderson | 2017-12-12 | 1 | -7/+9
  The size of an OutputSection is calculated early, to aid handling of compressed debug sections. However, subsequent to this point, unused synthetic sections are removed. In the event that an OutputSection, from which such an InputSection is removed, is still required (e.g. because it has a symbol assignment), and no longer has any InputSections, dot assignments, or BYTE()-family directives, the size member is never updated when processing the commands. If the removed InputSection had a non-zero size (such as a .got.plt section), the section ends up with the wrong size in the output.
  The fix is to reset the OutputSection size prior to processing the linker script commands relating to that OutputSection. This ensures that the size is correct even in the above situation.
  Additionally, to reduce the risk of developers misusing OutputSection Size and InputSection OutSecOff, they are set to simply the number of InputSections in an OutputSection, and the corresponding index respectively. We cannot completely stop using them, due to SHF_LINK_ORDER sections requiring them.
  Compressed debug sections also require the full size. This is now calculated in maybeCompress for these kinds of sections.
  Reviewers: ruiu, rafael
  Differential Revision: https://reviews.llvm.org/D38361
  llvm-svn: 320472
* Delete dead code. | Rafael Espindola | 2017-11-30 | 1 | -2/+0
  llvm-svn: 319403
* ELF: Merge DefinedRegular and Defined. | Peter Collingbourne | 2017-11-06 | 1 | -2/+2
  Now that DefinedRegular is the only remaining derived class of Defined, we can merge the two classes.
  Differential Revision: https://reviews.llvm.org/D39667
  llvm-svn: 317448
* ELF: Remove DefinedCommon. | Peter Collingbourne | 2017-11-06 | 1 | -2/+3
  Common symbols are now represented with a DefinedRegular that points to a BssSection, even during symbol resolution.
  Differential Revision: https://reviews.llvm.org/D39666
  llvm-svn: 317447
* Rename SymbolBody -> Symbol | Rui Ueyama | 2017-11-03 | 1 | -2/+3
  Now that we have only SymbolBody as the symbol class, "SymbolBody" is a bit of a strange name.
  This is a mechanical change generated by
    perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF)
  and clang-format-diff.
  Differential Revision: https://reviews.llvm.org/D39459
  llvm-svn: 317370
* [ELF] - Teach LLD to report line numbers for data symbols. | George Rimar | 2017-11-01 | 1 | -1/+1
  This is PR34826.
  Currently LLD is unable to report a line number when reporting a duplicate declaration of some variable.
  That happens because for extracting line information we always use the .debug_line section content, which describes the mapping from machine instructions to source file locations. That does not help for variables, as it does not describe them.
  In this patch I am taking the appropriate information about variable locations from the .debug_info section.
  Differential revision: https://reviews.llvm.org/D38721
  llvm-svn: 317080
* Revert r316305: Remove a fast lookup table from MergeInputSection. | Rui Ueyama | 2017-10-31 | 1 | -0/+3
  This reverts commit r316305 because a performance regression was observed.
  llvm-svn: 317026
* Move "Assigned" bit from SectionBase to InputSectionBase. | Rui Ueyama | 2017-10-29 | 1 | -14/+14
  This bit is to manage whether an input section has already been assigned to some output section by linker scripts or not. So it logically belongs to InputSectionBase. SectionBase is a common base class for input and output sections, so that wasn't the right place to define the bit.
  llvm-svn: 316879
* Initialize members not by assignment but by the member initializer list. | Rui Ueyama | 2017-10-29 | 1 | -9/+4
  llvm-svn: 316876