| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 275153
|
|
|
|
|
|
|
|
|
|
|
| |
The TinyPtrVector of const Thunk<ELFT>* in InputSections.h can cause
build failures on certain compiler/library combinations when Thunk<ELFT>
is not a complete type or is an abstract class. Fixed by making Thunk<ELFT>
non Abstract.
type or is an abstract class
llvm-svn: 274863
|
|
|
|
|
|
|
| |
This seems to be causing a buildbot failure on lld-x86_64-freebsd. Will
reproduce locally and fix.
llvm-svn: 274841
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generalise the Mips LA25 Thunk code and implement ARM and Thumb
interworking Thunks.
- Introduce a new module Thunks.cpp to store the Target Specific Thunk
implementations.
- DefinedRegular and Shared have a ThunkData field to record Thunk.
- A Target can have more than one type of Thunk.
- Support PC-relative calls to Thunks.
- Support Thunks to PLT entries.
- Existing Mips LA25 Thunk code integrated.
- Support for ARMv7A interworking Thunks.
Limitations:
- Only one Thunk per SymbolBody, this is sufficient for all currently
implemented Thunks.
- ARM thunks assume presence of V6T2 MOVT and MOVW instructions.
Differential revision: http://reviews.llvm.org/D21891
llvm-svn: 274836
|
|
|
|
|
|
|
|
| |
Previously, ch_size was read in host byte order, so if a host and
a target are different in byte order, we would produce a corrupted
output.
llvm-svn: 274729
|
|
|
|
|
|
|
|
|
|
| |
Patch implements support of zlib style compressed sections.
SHF_COMPRESSED flag is used to recognize that decompression is required.
After that decompression is performed and flag is removed from output.
Differential revision: http://reviews.llvm.org/D20272
llvm-svn: 273661
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch adds one more partition to the MIPS GOT. This time it is for
TLS related GOT entries. Such entries are located after 'local' and 'global'
ones. We cannot get a final offset for these entries at the time of
creation because we do not know size of 'local' and 'global' partitions.
So we have to adjust the offset later using `getMipsTlsOffset()` method.
All MIPS TLS relocations which need GOT entries operates MIPS style GOT
offset - 'offset from the GOT's beginning' - MipsGPOffset constant. That
is why I add new types of relocation expressions.
One more difference from othe ABIs is that the MIPS ABI does not support
any TLS relocation relaxations. I decided to make a separate function
`handleMipsTlsRelocation` and put MIPS TLS relocation handling code
there. It is similar to `handleTlsRelocation` routine and duplicates its
code. But it allows to make the code cleaner and prevent pollution of
the `handleTlsRelocation` by MIPS 'if' statements.
Differential Revision: http://reviews.llvm.org/D21606
llvm-svn: 273569
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Peter Smith found while trying to support thunk creation for ARM that
LLD sometimes creates broken thunks for MIPS. The cause of the bug is
that we assign file offsets to input sections too early. We need to
create all sections and then assign section offsets because appending
thunks changes file offsets for all following sections.
This patch separates the pass to assign file offsets from thunk
creation pass. This effectively reverts r265673.
Differential Revision: http://reviews.llvm.org/D21598
llvm-svn: 273532
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
addends
There are two motivations for this patch. The first one is a preparation
for support MIPS TLS relocations. It might sound like a joke but for GOT
entries related to TLS relocations MIPS ABI uses almost regular approach
with creation of dynamic relocations for each GOT enty etc. But we need
to separate these 'regular' TLS related entries from MIPS specific local
and global parts of GOT. ABI declare simple solution - all TLS related
entries allocated at the end of GOT after local/global parts. The second
motivation it to support GOT relocations for non-preemptible symbols
with addends. If we have more than one GOT relocations against symbol S
with different addends we need to create GOT entries for each unique
Symbol/Addend pairs.
So we store all MIPS GOT entries in separate containers. For non-preemptible
symbols we have to maintain two data structures. The first one is MipsLocal
vector. Each entry corresponds to the GOT entry from the 'local' part
of the GOT contains the symbol's address plus addend. The second one
is MipsLocalMap. It is a map from Symbol/Addend pair to the GOT index.
Differential Revision: http://reviews.llvm.org/D21297
llvm-svn: 273127
|
|
|
|
|
|
|
|
| |
I think it is me who named these variables, but I always find that
they are slightly confusing because align is a verb.
Adding four letters is worth it.
llvm-svn: 272984
|
|
|
|
| |
llvm-svn: 271815
|
|
|
|
|
|
|
| |
This remove some EM_386 specific code from InputSection.cpp and opens
the way for more relaxations.
llvm-svn: 271814
|
|
|
|
|
|
| |
We were not handling page relative relocations.
llvm-svn: 271798
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is mostly extracted from http://reviews.llvm.org/D18960.
The general idea for tlsdesc is that the two GD got entries are used
for a function pointer and its argument. The dynamic linker sets
both. In the non-dlopen case the dynamic linker sets the function to
the identity and the argument to the offset in the tls block.
All that the static linker has to do in the non-dlopen case is
relocate the code to point to the got entries and create a dynamic
relocation.
The dlopen case is more complicated, but can be implemented in another patch.
llvm-svn: 271569
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch implements next relaxation from latest ABI:
"Convert memory operand of test and binop into immediate operand, where binop is one of adc, add, and, cmp, or,
sbb, sub, xor instructions, when position-independent code is disabled."
It is described in System V Application Binary Interface AMD64 Architecture Processor
Supplement Draft Version 0.99.8 (https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf,
B.2 "B.2 Optimize GOTPCRELX Relocations").
Differential revision: http://reviews.llvm.org/D20793
llvm-svn: 271405
|
|
|
|
|
|
|
| |
This reverts commit r271365.
Sorry, wrong branch.
llvm-svn: 271366
|
|
|
|
| |
llvm-svn: 271365
|
|
|
|
| |
llvm-svn: 271133
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MergedInputSection::getOffset is the busiest function in LLD if string
merging is enabled and input files have lots of mergeable sections.
It is usually the case when creating executable with debug info,
so it is pretty common.
The reason why it is slow is because it has to do faily complex
computations. For non-mergeable sections, section contents are
contiguous in output, so in order to compute an output offset,
we only have to add the output section's base address to an input
offset. But for mergeable strings, section contents are split for
merging, so they are not contigous. We've got to do some lookups.
We used to do binary search on the list of section pieces.
It is slow because I think it's hostile to branch prediction.
This patch replaces it with hash table lookup. Seems it's working
pretty well. Below is "perf stat -r10" output when linking clang
with debug info. In this case this patch speeds up about 4%.
Before:
6584.153205 task-clock (msec) # 1.001 CPUs utilized ( +- 0.09% )
238 context-switches # 0.036 K/sec ( +- 6.59% )
0 cpu-migrations # 0.000 K/sec ( +- 50.92% )
1,067,675 page-faults # 0.162 M/sec ( +- 0.15% )
18,369,931,470 cycles # 2.790 GHz ( +- 0.09% )
9,640,680,143 stalled-cycles-frontend # 52.48% frontend cycles idle ( +- 0.18% )
<not supported> stalled-cycles-backend
21,206,747,787 instructions # 1.15 insns per cycle
# 0.45 stalled cycles per insn ( +- 0.04% )
3,817,398,032 branches # 579.786 M/sec ( +- 0.04% )
132,787,249 branch-misses # 3.48% of all branches ( +- 0.02% )
6.579106511 seconds time elapsed ( +- 0.09% )
After:
6312.317533 task-clock (msec) # 1.001 CPUs utilized ( +- 0.19% )
221 context-switches # 0.035 K/sec ( +- 4.11% )
1 cpu-migrations # 0.000 K/sec ( +- 45.21% )
1,280,775 page-faults # 0.203 M/sec ( +- 0.37% )
17,611,539,150 cycles # 2.790 GHz ( +- 0.19% )
10,285,148,569 stalled-cycles-frontend # 58.40% frontend cycles idle ( +- 0.30% )
<not supported> stalled-cycles-backend
18,794,779,900 instructions # 1.07 insns per cycle
# 0.55 stalled cycles per insn ( +- 0.03% )
3,287,450,865 branches # 520.799 M/sec ( +- 0.03% )
72,259,605 branch-misses # 2.20% of all branches ( +- 0.01% )
6.307411828 seconds time elapsed ( +- 0.19% )
Differential Revision: http://reviews.llvm.org/D20645
llvm-svn: 270999
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MIPS .reginfo and .MIPS.options sections are consumed by the linker, and
the linker produces a single output section. But it is possible that
input files contain section symbol points to the corresponding input
section. In case of generation a relocatable output we need to write
such symbols to the output file.
Fixes bug 27878.
Differential Revision: http://reviews.llvm.org/D20688
llvm-svn: 270910
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
System V Application Binary Interface AMD64 Architecture Processor Supplement Draft Version 0.99.8
(https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf, B.2 "B.2 Optimize GOTPCRELX Relocations")
introduces possible relaxations for R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX.
That patch implements the next relaxation:
mov foo@GOTPCREL(%rip), %reg => lea foo(%rip), %reg
and also opens door for implementing all other ones.
Implementation was suggested by Rafael Ávila de Espíndola with few additions and testcases by myself.
Differential revision: http://reviews.llvm.org/D15779
llvm-svn: 270705
|
|
|
|
| |
llvm-svn: 270563
|
|
|
|
| |
llvm-svn: 270555
|
|
|
|
|
|
|
|
| |
This reverts commit r270551.
Sorry, I commited the wrong branch :-(
llvm-svn: 270554
|
|
|
|
| |
llvm-svn: 270551
|
|
|
|
| |
llvm-svn: 270532
|
|
|
|
| |
llvm-svn: 270526
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, mergeable section's constructors did more than just
setting member variables; it split section contents into small
pieces. It is not always computationally cheap task because if
the section is a mergeable string section, it needs to scan the
entire section to split them by NUL characters.
If a section would be thrown away by GC, that cost ended up
being a waste of time. It is going to be larger problem if the
section is compressed -- the whole time to uncompress it and
split it up is going to be a waste.
Luckily, we can defer section splitting after GC. We just have
to remember which offsets are in use during GC and apply that later.
This patch implements it.
Differential Revision: http://reviews.llvm.org/D20516
llvm-svn: 270455
|
|
|
|
| |
llvm-svn: 270387
|
|
|
|
| |
llvm-svn: 270386
|
|
|
|
| |
llvm-svn: 270385
|
|
|
|
|
|
| |
So that we don't need to cut a slice when we use a SectionPiece.
llvm-svn: 270348
|
|
|
|
|
|
|
|
| |
This patch adds Size member to SectionPiece so that getRangeAndSize
can just return a SectionPiece instead of a std::pair<SectionPiece *, uint_t>.
Also renamed the function.
llvm-svn: 270346
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were using std::pair to represents pieces of splittable section
contents. It hurt readability because "first" and "second" are not
meaningful. This patch give them names.
One more thing is that piecewise liveness information is stored to
the second element of the pair as a special value of output section
offset. It was confusing, so I defiend a new bit, "Live", in the
new struct.
llvm-svn: 270340
|
|
|
|
|
|
| |
non-alloc input section
llvm-svn: 270328
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it explicit that each R_RELAX_TLS_* is equivalent to some
other expression.
With this I think we are at a sweet spot for how much is done in
Target.cpp. I did experiment with moving *all* the value math out of it.
It has the advantage that we know the final value in target independent
code, but it gets quite verbose.
llvm-svn: 270277
|
|
|
|
| |
llvm-svn: 270275
|
|
|
|
|
|
| |
This avoid doing math in Target.cpp to compensate.
llvm-svn: 270266
|
|
|
|
|
|
|
|
| |
This adds direct support for computing offsets from the thread pointer
for both variants. Of the architectures we support, variant 1 is used
only by aarch64 (but that doesn't seem to be documented anywhere.)
llvm-svn: 270243
|
|
|
|
|
|
|
|
|
| |
New names reflect purpose of corresponding GOT entries better.
Both expression types related to entries allocated in the 'local'
part of MIPS GOT. R_MIPS_GOT_LOCAL_PAGE is for entries contain 'page'
addresses. R_MIPS_GOT_LOCAL is for entries contain 'full' address.
llvm-svn: 269597
|
|
|
|
|
|
|
| |
This speeds up a link of chromium with -O2 (but no icf,gc) from
1.940664632 to 1.925578119.
llvm-svn: 268639
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were previously using an output offset of -1 for both GC'd and tail
merged pieces. We need to distinguish these two cases in order to filter
GC'd symbols from the symbol table -- we were previously asserting when we
asked for the VA of a symbol pointing into a dead piece, which would end
up asking the tail merging string table for an offset even though we hadn't
initialized it properly.
This patch fixes the bug by using an offset of -1 to exclusively mean GC'd
pieces, using 0 for tail merges, and distinguishing the tail merge case from
an offset of 0 by asking the output section whether it is tail merge.
Differential Revision: http://reviews.llvm.org/D19953
llvm-svn: 268604
|
|
|
|
| |
llvm-svn: 268501
|
|
|
|
| |
llvm-svn: 268486
|
|
|
|
|
|
|
|
|
| |
MIPS N64 ABI introduces .MIPS.options section which specifies miscellaneous
options to be applied to an object/shared/executable file. LLVM as well as
modern versions of GNU tools read and write the only type of the options -
ODK_REGINFO. It is exact copy of .reginfo section used by O32 ABI.
llvm-svn: 268485
|
|
|
|
|
|
|
| |
r267917 produces corrupted debug info because it didn't apply
relocations to right offsets.
llvm-svn: 267979
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Relocations against sections with no SHF_ALLOC bit are R_ABS relocations.
Currently we are creating Relocations vector for them, but that is wasteful.
This patch is to skip vector construction and to directly apply relocations
in place.
This patch seems to be pretty effective for large executables with debug info.
r266158 (Rafael's patch to change the way how we apply relocations) caused a
temporary performance degradation for such executables, but this patch makes
it even faster than before.
Time to link clang with debug info (output size is 1070 MB):
before r266158: 15.312 seconds (0%)
r266158: 17.301 seconds (+13.0%)
Head: 16.484 seconds (+7.7%)
w/patch: 13.166 seconds (-14.0%)
Differential Revision: http://reviews.llvm.org/D19645
llvm-svn: 267917
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19490
llvm-svn: 267637
|
|
|
|
|
|
| |
Every caller was doing it.
llvm-svn: 267603
|
|
|
|
| |
llvm-svn: 267245
|