summaryrefslogtreecommitdiffstats
path: root/lld/ELF/Arch/X86.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [ELF] Add -z force-ibt and -z shstk for Intel Control-flow Enforcement ↵Fangrui Song2020-01-131-0/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | Technology This patch is a joint work by Rui Ueyama and me based on D58102 by Xiang Zhang. It adds Intel CET (Control-flow Enforcement Technology) support to lld. The implementation follows the draft version of psABI which you can download from https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI. CET introduces a new restriction on indirect jump instructions so that you can limit the places to which you can jump to using indirect jumps. In order to use the feature, you need to compile source files with -fcf-protection=full. * IBT is enabled if all input files are compiled with the flag. To force enabling ibt, pass -z force-ibt. * SHSTK is enabled if all input files are compiled with the flag, or if -z shstk is specified. IBT-enabled executables/shared objects have two PLT sections, ".plt" and ".plt.sec". For the details as to why we have two sections, please read the comments. Reviewed By: xiangzhangllvm Differential Revision: https://reviews.llvm.org/D59780
* [lld] Fix trivial typos in commentsKazuaki Ishizaki2020-01-061-1/+1
| | | | | | Reviewed By: ruiu, MaskRay Differential Revision: https://reviews.llvm.org/D72196
* [ELF] writePlt, writeIplt: replace parameters gotPltEntryAddr and index with ↵Fangrui Song2019-12-181-19/+19
| | | | | | | | | | `const Symbol &`. NFC PPC::writeIplt (IPLT code sequence, D71621) needs to access `Symbol`. Reviewed By: grimar, ruiu Differential Revision: https://reviews.llvm.org/D71631
* [ELF] Add IpltSectionFangrui Song2019-12-171-4/+7
| | | | | | | | | | | | | | | | | | | PltSection is used by both PLT and IPLT. The PLT section may have a header while the IPLT section does not. Split off IpltSection from PltSection to be clearer. Unlike other targets, PPC64 cannot use the same code sequence for PLT and IPLT. This helps make a future PPC64 patch (D71509) more isolated. On EM_386 and EM_X86_64, when PLT is empty while IPLT is not, currently we are inconsistent whether the PLT header is conceptually attached to in.plt or in.iplt . Consistently attach the header to in.plt can make the -z retpolineplt logic simpler. It also makes `jmp` point to an aesthetically better place for non-retpolineplt cases. Reviewed By: grimar, ruiu Differential Revision: https://reviews.llvm.org/D71519
* [ELF] Delete relOff from TargetInfo::writePLTFangrui Song2019-12-161-9/+9
| | | | | | | | | | | | | | This change only affects EM_386. relOff can be computed from `index` easily, so it is unnecessarily passed as a parameter. Both in.plt and in.iplt entries are written by writePLT. For in.iplt, the instruction `push reloc_offset` will change because `index` is now different. Fortunately, this does not matter because `push; jmp` is only used by PLT. IPLT does not need the code sequence. Reviewed By: grimar, ruiu Differential Revision: https://reviews.llvm.org/D71518
* [ELF] Wrap things in `namespace lld { namespace elf {`, NFCFangrui Song2019-10-071-3/+7
| | | | | | | | | | | This makes it clear `ELF/**/*.cpp` files define things in the `lld::elf` namespace and simplifies `elf::foo` to `foo`. Reviewed By: atanasyan, grimar, ruiu Differential Revision: https://reviews.llvm.org/D68323 llvm-svn: 373885
* [Coding style change] Rename variables so that they start with a lowercase ↵Rui Ueyama2019-07-101-183/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | letter This patch is mechanically generated by clang-llvm-rename tool that I wrote using Clang Refactoring Engine just for creating this patch. You can see the source code of the tool at https://reviews.llvm.org/D64123. There's no manual post-processing; you can generate the same patch by re-running the tool against lld's code base. Here is the main discussion thread to change the LLVM coding style: https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html In the discussion thread, I proposed we use lld as a testbed for variable naming scheme change, and this patch does that. I chose to rename variables so that they are in camelCase, just because that is a minimal change to make variables to start with a lowercase letter. Note to downstream patch maintainers: if you are maintaining a downstream lld repo, just rebasing ahead of this commit would cause massive merge conflicts because this patch essentially changes every line in the lld subdirectory. But there's a remedy. clang-llvm-rename tool is a batch tool, so you can rename variables in your downstream repo with the tool. Given that, here is how to rebase your repo to a commit after the mass renaming: 1. rebase to the commit just before the mass variable renaming, 2. apply the tool to your downstream repo to mass-rename variables locally, and 3. rebase again to the head. Most changes made by the tool should be identical for a downstream repo and for the head, so at the step 3, almost all changes should be merged and disappear. I'd expect that there would be some lines that you need to merge by hand, but that shouldn't be too many. Differential Revision: https://reviews.llvm.org/D64121 llvm-svn: 365595
* [ELF] Make the rule to create relative relocations in a writable section ↵Fangrui Song2019-06-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | stricter The current rule is loose: `!Sym.IsPreemptible || Expr == R_GOT`. When the symbol is non-preemptable, this allows absolute relocation types with smaller numbers of bits, e.g. R_X86_64_{8,16,32}. They are disallowed by ld.bfd and gold, e.g. ld.bfd: a.o: relocation R_X86_64_8 against `.text' can not be used when making a shared object; recompile with -fPIC This patch: a) Add TargetInfo::SymbolicRel to represent relocation types that resolve to a symbol value (e.g. R_AARCH_ABS64, R_386_32, R_X86_64_64). As a side benefit, we currently (ab)use GotRel (R_*_GLOB_DAT) to resolve GOT slots that are link-time constants. Since we now use Target->SymbolRel to do the job, we can remove R_*_GLOB_DAT from relocateOne() for all targets. R_*_GLOB_DAT cannot be used as static relocation types. b) Change the condition to `!Sym.IsPreemptible && Type != Target->SymbolicRel || Expr == R_GOT`. Some tests are caught by the improved error checking (ld.bfd/gold also issue errors on them). Many misuse .long where .quad should be used instead. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D63121 llvm-svn: 363059
* ELF: Create synthetic sections for loadable partitions.Peter Collingbourne2019-06-071-1/+1
| | | | | | | | | | | | | | | We create several types of synthetic sections for loadable partitions, including: - The dynamic symbol table. This allows code outside of the loadable partitions to find entry points with dlsym. - Creating a dynamic symbol table also requires the creation of several other synthetic sections for the partition, such as the dynamic table and hash table sections. - The partition's ELF header is represented as a synthetic section in the combined output file, and will be used by llvm-objcopy to extract partitions. Differential Revision: https://reviews.llvm.org/D62350 llvm-svn: 362819
* [ELF] Delete GotEntrySize and GotPltEntrySizeFangrui Song2019-05-311-2/+0
| | | | | | | | | | | | | GotEntrySize and GotPltEntrySize were added in D22288. Later, with the introduction of wordsize() (then Config->Wordsize), they become redundant, because there is no target that sets GotEntrySize or GotPltEntrySize to a number different from Config->Wordsize. Reviewed By: grimar, ruiu Differential Revision: https://reviews.llvm.org/D62727 llvm-svn: 362220
* [ELF][X86] Allow R_386_TLS_LDO_32 and R_X86_64_DTPOFF{32,64} to preemptable ↵Fangrui Song2019-04-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | local-dynamic symbols Summary: Fixes PR35242. A simplified reproduce: thread_local int i; int f() { return i; } % {g++,clang++} -fPIC -shared -ftls-model=local-dynamic -fuse-ld=lld a.cc ld.lld: error: can't create dynamic relocation R_X86_64_DTPOFF32 against symbol: i in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output In isStaticLinkTimeConstant(), Syn.IsPreemptible is true, so it is not seen as a constant. The error is then issued in processRelocAux(). A symbol of the local-dynamic TLS model cannot be preempted but it can preempt symbols of the global-dynamic TLS model in other DSOs. So it makes some sense that the variable is not static. This patch fixes the linking error by changing getRelExpr() on R_386_TLS_LDO_32 and R_X86_64_DTPOFF{32,64} from R_ABS to R_DTPREL. R_PPC64_DTPREL_* and R_MIPS_TLS_DTPREL_* need similar fixes, but they are not handled in this patch. As a bonus, we use `if (Expr == R_ABS && !Config->Shared)` to find ld-to-le opportunities. R_ABS is overloaded here for such STT_TLS symbols. A dedicated R_DTPREL is clearer. Differential Revision: https://reviews.llvm.org/D60945 llvm-svn: 358870
* [ELF][X86] Rename R_RELAX_TLS_GD_TO_IE_END to R_RELAX_TLS_GD_TO_IE_GOTPLTFangrui Song2019-04-221-1/+1
| | | | | | | | | | Summary: This relocation type is used by R_386_TLS_GD. Its formula is the same as R_GOTPLT (e.g R_X86_64_GOT{32,64} R_386_TLS_GOTIE). Rename it to be clearer. Differential Revision: https://reviews.llvm.org/D60941 llvm-svn: 358868
* Simplify. NFC.Rui Ueyama2019-04-011-13/+13
| | | | llvm-svn: 357373
* Inline a trivial function. NFC.Rui Ueyama2019-03-281-3/+3
| | | | | | | I found that hiding this particular actual expression doesn't help readers understand the code. So I remove and inline that function. llvm-svn: 357140
* [ELF] Change GOT*_FROM_END (relative to end(.got)) to GOTPLT* (start(.got.plt))Fangrui Song2019-03-251-23/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This should address remaining issues discussed in PR36555. Currently R_GOT*_FROM_END are exclusively used by x86 and x86_64 to express relocations types relative to the GOT base. We have _GLOBAL_OFFSET_TABLE_ (GOT base) = start(.got.plt) but end(.got) != start(.got.plt) This can have problems when _GLOBAL_OFFSET_TABLE_ is used as a symbol, e.g. glibc dl_machine_dynamic assumes _GLOBAL_OFFSET_TABLE_ is start(.got.plt), which is not true. extern const ElfW(Addr) _GLOBAL_OFFSET_TABLE_[] attribute_hidden; return _GLOBAL_OFFSET_TABLE_[0]; // R_X86_64_GOTPC32 In this patch, we * Change all GOT*_FROM_END to GOTPLT* to fix the problem. * Add HasGotPltOffRel to denote whether .got.plt should be kept even if the section is empty. * Simplify GotSection::empty and GotPltSection::empty by setting HasGotOffRel and HasGotPltOffRel according to GlobalOffsetTable early. The change of R_386_GOTPC makes X86::writePltHeader simpler as we don't have to compute the offset start(.got.plt) - Ebx (it is constant 0). We still diverge from ld.bfd (at least in most cases) and gold in that .got.plt and .got are not adjacent, but the advantage doing that is unclear. Reviewers: ruiu, sivachandra, espindola Subscribers: emaste, mehdi_amini, arichardson, dexonsmith, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59594 llvm-svn: 356968
* Improve error message for unknown relocations.Rui Ueyama2019-02-141-2/+4
| | | | | | | | | | | | Previously, we showed the following message for an unknown relocation: foo.o: unrecognized reloc 256 This patch improves it so that the error message includes a symbol name: foo.o: unknown relocation (256) against symbol bar llvm-svn: 354040
* Recommit r353293 "[LLD][ELF] - Set DF_STATIC_TLS flag for i386 target."George Rimar2019-02-061-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the following changes: 1) Compilation fix: std::atomic<bool> HasStaticTlsModel = false; -> std::atomic<bool> HasStaticTlsModel{false}; 2) Adjusted the comment in code. Initial commit message: DF_STATIC_TLS flag indicates that the shared object or executable contains code using a static thread-local storage scheme. Patch checks if IE/LE relocations were used to check if the code uses a static model. If so it sets the DF_STATIC_TLS flag. Differential revision: https://reviews.llvm.org/D57749 ---- Modified : /lld/trunk/ELF/Arch/X86.cpp Modified : /lld/trunk/ELF/Config.h Modified : /lld/trunk/ELF/SyntheticSections.cpp Added : /lld/trunk/test/ELF/Inputs/i386-static-tls-model1.s Added : /lld/trunk/test/ELF/Inputs/i386-static-tls-model2.s Added : /lld/trunk/test/ELF/Inputs/i386-static-tls-model3.s Added : /lld/trunk/test/ELF/Inputs/i386-static-tls-model4.s Added : /lld/trunk/test/ELF/i386-static-tls-model.s Modified : /lld/trunk/test/ELF/i386-tls-ie-shared.s Modified : /lld/trunk/test/ELF/tls-dynamic-i686.s Modified : /lld/trunk/test/ELF/tls-opt-iele-i686-nopic.s llvm-svn: 353299
* Revert r353293 "[LLD][ELF] - Set DF_STATIC_TLS flag for i386 target."George Rimar2019-02-061-9/+0
| | | | | | | | | | | | It broke BB: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/43450 http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/27891 Error is: tools/lld/ELF/Config.h:84:41: error: copying member subobject of type 'std::atomic<bool>' invokes deleted constructor std::atomic<bool> HasStaticTlsModel = false; llvm-svn: 353297
* [LLD][ELF] - Set DF_STATIC_TLS flag for i386 target.George Rimar2019-02-061-0/+9
| | | | | | | | | | | | DF_STATIC_TLS flag indicates that the shared object or executable contains code using a static thread-local storage scheme. Patch checks if IE/LE relocations were used to check if the code uses a static model. If so it sets the DF_STATIC_TLS flag. Differential revision: https://reviews.llvm.org/D57749 llvm-svn: 353293
* [PPC64] Set the number of relocations processed for R_PPC64_TLS[GL]D to 2Fangrui Song2019-02-061-1/+5
| | | | | | | | | | | | | | | | | | | | | | Summary: R_PPC64_TLSGD and R_PPC64_TLSLD are used as markers on TLS code sequences. After GD-to-IE or GD-to-LE relaxation, the next relocation R_PPC64_REL24 should be skipped to not create a false dependency on __tls_get_addr. When linking statically, the false dependency may cause an "undefined symbol: __tls_get_addr" error. R_PPC64_GOT_TLSGD16_HA R_PPC64_GOT_TLSGD16_LO R_PPC64_TLSGD R_TLSDESC_CALL R_PPC64_REL24 __tls_get_addr Reviewers: ruiu, sfertile, syzaara, espindola Reviewed By: sfertile Subscribers: emaste, nemanjai, arichardson, kbarton, jsji, llvm-commits, tamur Tags: #llvm Differential Revision: https://reviews.llvm.org/D57673 llvm-svn: 353262
* Inline a trivial function and update comment. NFC.Rui Ueyama2019-02-051-9/+7
| | | | llvm-svn: 353200
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [ELF] Make TrapInstr and Filler byte arrays. NFC.Simon Atanasyan2018-11-141-1/+1
| | | | | | | | | | | | The uint32_t type does not clearly convey that these fields are interpreted in the target endianness. Converting them to byte arrays should make this more obvious and less error-prone. Patch by James Clarke Differential Revision: http://reviews.llvm.org/D54207 llvm-svn: 346893
* [ELF] - Do not fail on R_*_NONE relocations when parsing the debug info.George Rimar2018-09-261-0/+1
| | | | | | | | | | | | | | | | | This is https://bugs.llvm.org//show_bug.cgi?id=38919. Currently, LLD may report "unsupported relocation target while parsing debug info" when parsing the debug information. At the same time LLD does that for zeroed R_X86_64_NONE relocations, which obviously has "invalid" targets. The nature of R_*_NONE relocation assumes them should be ignored. This patch teaches LLD to stop reporting the debug information parsing errors for them. Differential revision: https://reviews.llvm.org/D52408 llvm-svn: 343078
* Reset input section pointers to null on each linker invocation.Rui Ueyama2018-09-251-9/+9
| | | | | | | | | | Previously, if you invoke lld's `main` more than once in the same process, the second invocation could fail or produce a wrong result due to a stale pointer values of the previous run. Differential Revision: https://reviews.llvm.org/D52506 llvm-svn: 343009
* Align AArch64 and i386 image base to superpageDimitry Andric2018-09-211-0/+4
| | | | | | | | | | | | | | | | | | | Summary: As for x86_64, the default image base for AArch64 and i386 should be aligned to a superpage appropriate for the architecture. On AArch64, this is 2 MiB, on i386 it is 4 MiB. Reviewers: emaste, grimar, javed.absar, espindola, ruiu, peter.smith, srhines, rprichard Reviewed By: ruiu, peter.smith Subscribers: jfb, markj, arichardson, krytarowski, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D50297 llvm-svn: 342746
* Rename R_TLSGD/R_TLSLD to add _GOT_FROM_END. NFC.Sean Fertile2018-05-311-2/+2
| | | | | | | | | getRelocTargetVA for R_TLSGD and R_TLSLD RelExprs calculate an offset from the end of the got, so adjust the names to reflect this. Differential Revision: https://reviews.llvm.org/D47379 llvm-svn: 333674
* [ELF] - Relax checks for R_386_8/R_386_16 relocations.George Rimar2018-04-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes PR36927. The issue is next. Imagine we have -Ttext 0x7c and code below. .code16 .global _start _start: movb $_start+0x83,%ah So we have R_386_8 relocation and _start at 0x7C. Addend is 0x83 == 131. We will sign extend it to 0xffffffffffffff83. Now, 0xffffffffffffff83 + 0x7c gives us 0xFFFFFFFFFFFFFFFF. Techically 0x83 + 0x7c == 0xFF, we do not exceed 1 byte value, but currently LLD errors out, because we use checkUInt<8>. Let's try to use checkInt<8> now and the following code to see if it can help (no): main.s: .byte foo input.s: .globl foo .hidden foo foo = 0xff Here, foo is 0xFF. And addend is 0x0. Final value is 0x00000000000000FF. Again, it fits one byte well, but with checkInt<8>, we would error out it, so we can't use it. What we want to do is to check that the result fits 1 byte well. Patch changes the check to checkIntUInt to fix the issue. Differential revision: https://reviews.llvm.org/D45051 llvm-svn: 329061
* Do not use template for check{Int,UInt,IntUInt,Alignment}.Rui Ueyama2018-03-291-5/+5
| | | | | | | | Template is just unnecessary. Differential Revision: https://reviews.llvm.org/D45063 llvm-svn: 328843
* [ELF] Fix X86 & X86_64 PLT retpoline paddingAndrew Ng2018-03-291-14/+19
| | | | | | | | | | | | | | The PLT retpoline support for X86 and X86_64 did not include the padding when writing the header and entries. This issue was revealed when linker scripts were used, as this disables the built-in behaviour of filling the last page of executable segments with trap instructions. This particular behaviour was hiding the missing padding. Added retpoline tests with linker scripts. Differential Revision: https://reviews.llvm.org/D44682 llvm-svn: 328777
* [ELF] Recommit 327248 with Arm using the .got for _GLOBAL_OFFSET_TABLE_Peter Smith2018-03-191-1/+0
| | | | | | | | | | | | | | | | | | | | | | This is the same as 327248 except Arm defining _GLOBAL_OFFSET_TABLE_ to be the base of the .got section as some existing code is relying upon it. For most Targets the _GLOBAL_OFFSET_TABLE_ symbol is expected to be at the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] = reserved value that is by convention the address of the dynamic section. Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end of the .got section with the intention that the .got.plt section would follow the .got. However this does not always hold with the current default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent with the reserved first entry of the .got.plt. X86, X86_64 and AArch64 will use the .got.plt. Arm, Mips and Power use .got Fixes PR36555 Differential Revision: https://reviews.llvm.org/D44259 llvm-svn: 327823
* Revert r327248, "For most Targets the _GLOBAL_OFFSET_TABLE_ symbol is ↵Peter Collingbourne2018-03-161-0/+1
| | | | | | | | | | | expected to be at" This change broke ARM code that expects to be able to add _GLOBAL_OFFSET_TABLE_ to the result of an R_ARM_REL32. I will provide a reproducer on llvm-commits. llvm-svn: 327688
* Reduce code duplication a bit.Rafael Espindola2018-03-141-7/+9
| | | | | | | The code for computing the offset of an entry in the plt is simple, but it was duplicated in quite a few places. llvm-svn: 327536
* For most Targets the _GLOBAL_OFFSET_TABLE_ symbol is expected to be atPeter Smith2018-03-111-1/+0
| | | | | | | | | | | | | | | | | | the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] = reserved value that is by convention the address of the dynamic section. Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end of the .got section with the intention that the .got.plt section would follow the .got. However this does not always hold with the current default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent with the reserved first entry of the .got.plt. X86, X86_64, Arm and AArch64 will use the .got.plt. Mips and Power use .got Fixes PR36555 Differential Revision: https://reviews.llvm.org/D44259 llvm-svn: 327248
* Fix retpoline PLT header size for i386.Rui Ueyama2018-01-241-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D42397 llvm-svn: 323288
* Introduce the "retpoline" x86 mitigation technique for variant #2 of the ↵Chandler Carruth2018-01-221-3/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.. Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processors cache to determine which direction the branch took *in the speculative execution*, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain. The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers. However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address. On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address. This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886 We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this case, on x86-64 thu thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction. There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection. The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness. For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile *all* libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller. When manually apply similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel. When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%. However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we *strongly* recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline. We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors. This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design. Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 llvm-svn: 323155
* Make it clear where is a placeholder for later binary patching.Rui Ueyama2017-12-271-9/+9
| | | | | | | | | This is an aesthetic change to represent a placeholder for later binary patching as "0, 0, 0, 0" instead of "0x00, 0x00, 0x00, 0x00". The former is how we represent it in COFF, and I found it easier to read than the latter. llvm-svn: 321471
* Rename SymbolBody -> SymbolRui Ueyama2017-11-031-6/+6
| | | | | | | | | | | | | Now that we have only SymbolBody as the symbol class. So, "SymbolBody" is a bit strange name now. This is a mechanical change generated by perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF) nd clang-format-diff. Differential Revision: https://reviews.llvm.org/D39459 llvm-svn: 317370
* [lld] unified COFF and ELF error handling on new Common/ErrorHandlerBob Haarman2017-10-251-1/+1
| | | | | | | | | | | | | | | | | | | Summary: The COFF linker and the ELF linker have long had similar but separate Error.h and Error.cpp files to implement error handling. This change introduces new error handling code in Common/ErrorHandler.h, changes the COFF and ELF linkers to use it, and removes the old, separate implementations. Reviewers: ruiu Reviewed By: ruiu Subscribers: smeenai, jyknight, emaste, sdardis, nemanjai, nhaehnle, mgorny, javed.absar, kbarton, fedor.sergeev, llvm-commits Differential Revision: https://reviews.llvm.org/D39259 llvm-svn: 316624
* Move comment to the place where it makes more sense.Rui Ueyama2017-10-241-3/+3
| | | | llvm-svn: 316491
* [ELF] Recognize additional relocation typesPetr Hosek2017-10-131-0/+4
| | | | | | | | | | These are generated by the linker itself and it shouldn't treat them as unrecognized. This was introduced in r315552 and is triggering an error when building UBSan shared library for i386. Differential Revision: https://reviews.llvm.org/D38899 llvm-svn: 315737
* Remove one parameter from Target::getRelExpr.Rui Ueyama2017-10-121-14/+27
| | | | | | | | A section was passed to getRelExpr just to create an error message. But if there's an invalid relocation, we would eventually report it in relocateOne. So we don't have to pass a section to getRelExpr. llvm-svn: 315552
* Rewrite comment.Rui Ueyama2017-10-121-10/+34
| | | | llvm-svn: 315548
* Define RelType to represent relocation types.Rui Ueyama2017-10-111-18/+18
| | | | | | | | | | | | | | | | | | We were using uint32_t as the type of relocation kind. It has a readability issue because what Type really means in `uint32_t Type` is not obvious. It could be a section type, a symbol type or a relocation type. Since we do not do any arithemetic operations on relocation types (e.g. adding one to R_X86_64_PC32 doesn't make sense), it would be more natural if they are represented as enums. Unfortunately, that is not doable because relocation type definitions are spread into multiple header files. So I decided to use typedef. This still should be better than the plain uint32_t because the intended type is now obvious. llvm-svn: 315525
* Fix which file is in an error message.Rafael Espindola2017-08-041-4/+4
| | | | | | | When reporting an invalid relocation we were blaming the destination file instead of the file with the relocation. llvm-svn: 310084
* Add trap instructions for ARM and MIPS.Rui Ueyama2017-06-261-3/+1
| | | | | | | This patch fills holes in executable sections with 0xd4 (ARM) or 0xef (MIPS). These trap instructions were suggested by Theo de Raadt. llvm-svn: 306322
* [ELF] Define _GLOBAL_OFFSET_TABLE_ symbol relative to .gotPeter Smith2017-06-261-0/+1
| | | | | | | | | | | | | | | | | | On many architectures gcc and clang will recognize _GLOBAL_OFFSET_TABLE_ - . and produce a relocation that can be processed without needing to know the value of _GLOBAL_OFFSET_TABLE_. This is not always the case; for example ARM gcc produces R_ARM_BASE_PREL but clang produces the more general R_ARM_REL32 to _GLOBAL_OFFSET_TABLE_. To evaluate this relocation correctly _GLOBAL_OFFSET_TABLE_ must be defined to be the either the base of the GOT or end of the GOT dependent on architecture.. If/when llvm-mc is changed to recognize _GLOBAL_OFFSET_TABLE_ - . this change will not be necessary for new objects. However there may still be old objects and versions of clang. Differential Revision: https://reviews.llvm.org/D34355 llvm-svn: 306282
* Do not use make<> to allocate TargetInfo. NFC.Rui Ueyama2017-06-161-2/+4
| | | | llvm-svn: 305577
* Split Target.cpp into small files.Rui Ueyama2017-06-161-0/+363
Target.cpp contains code for all the targets that LLD supports. It was simple and easy, but as the number of supported targets increased, it got messy. This patch splits the file into per-target files under ELF/arch directory. Differential Revision: https://reviews.llvm.org/D34222 llvm-svn: 305565
OpenPOWER on IntegriCloud