3 files changed, 374 insertions, 69 deletions
diff --git a/lld/docs/AtomLLD.rst b/lld/docs/AtomLLD.rst
new file mode 100644
index 00000000000..4d36ac60a7d
--- /dev/null
+++ b/lld/docs/AtomLLD.rst
@@ -0,0 +1,61 @@
+ATOM-based lld
+==============
+
+ATOM-based lld is a new set of modular code for creating linker tools.
+Currently it supports Mach-O.
+
+* End-User Features:
+
+  * Compatible with existing linker options
+  * Reads standard Object Files
+  * Writes standard Executable Files
+  * Remove clang's reliance on "the system linker"
+  * Uses the LLVM `"UIUC" BSD-Style license`__.
+
+* Applications:
+
+  * Modular design
+  * Support cross linking
+  * Easy to add new CPU support
+  * Can be built as static tool or library
+
+* Design and Implementation:
+
+  * Extensive unit tests
+  * Internal linker model can be dumped/read to textual format
+  * Additional linking features can be plugged in as "passes"
+  * OS specific and CPU specific code factored out
+
+Why a new linker?
+-----------------
+
+The fact that clang relies on whatever linker tool you happen to have installed
+means that clang has been very conservative adopting features which require a
+recent linker.
+
+In the same way that the MC layer of LLVM has removed clang's reliance on the
+system assembler tool, the lld project will remove clang's reliance on the
+system linker tool.
+
+
+Contents
+--------
+
+.. toctree::
+   :maxdepth: 2
+
+   design
+   getting_started
+   ReleaseNotes
+   development
+   windows_support
+   open_projects
+   sphinx_intro
+
+Indices and tables
+------------------
+
+* :ref:`genindex`
+* :ref:`search`
+
+__ http://llvm.org/docs/DeveloperPolicy.html#license
diff --git a/lld/docs/NewLLD.rst b/lld/docs/NewLLD.rst
new file mode 100644
index 00000000000..891160cded5
--- /dev/null
+++ b/lld/docs/NewLLD.rst
@@ -0,0 +1,309 @@
+The ELF and COFF Linkers
+========================
+
+We started rewriting the ELF (Unix) and COFF (Windows) linkers in May 2015.
+Since then, we have been making a steady progress towards providing
+drop-in replacements for the system linkers.
+
+Currently, the Windows support is mostly complete and is about 2x faster
+than the linker that comes as a part of Micrsoft Visual Studio toolchain.
+
+The ELF support is in progress and is able to link large programs
+such as Clang or LLD itself. Unless your program depends on linker scripts,
+you can expect it to be linkable with LLD.
+It is currently about 1.2x to 2x faster than GNU gold linker.
+We aim to make it a drop-in replacement for the GNU linker.
+
+We expect that FreeBSD is going to be the first large system
+to adopt LLD as the system linker.
+We are working on it in collaboration with the FreeBSD project.
+
+The linkers are notably small; as of March 2016,
+the COFF linker is under 7k LOC and the ELF linker is about 10k LOC.
+
+The linkers are designed to be as fast and simple as possible.
+Because it is simple, it is easy to extend it to support new features.
+There a few key design choices that we made to achieve these goals.
+We will describe them in this document.
+
+The ELF Linker as a Library
+---------------------------
+
+You can embed LLD to your program by linking against it and calling the linker's
+entry point function lld::elf::link.
+
+The current policy is that it is your reponsibility to give trustworthy object
+files. The function is guaranteed to return as long as you do not pass corrupted
+or malicious object files. A corrupted file could cause a fatal error or SEGV.
+That being said, you don't need to worry too much about it if you create object
+files in the usual way and give them to the linker. It is naturally expected to
+work, or otherwise it's a linker's bug.
+
+Design
+======
+
+We will describe the design of the linkers in the rest of the document.
+
+Key Concepts
+------------
+
+Linkers are fairly large pieces of software.
+There are many design choices you have to make to create a complete linker.
+
+This is a list of design choices we've made for ELF and COFF LLD.
+We believe that these high-level design choices achieved a right balance
+between speed, simplicity and extensibility.
+
+* Implement as native linkers
+
+  We implemented the linkers as native linkers for each file format.
+
+  The two linkers share the same design but do not share code.
+  Sharing code makes sense if the benefit is worth its cost.
+  In our case, ELF and COFF are different enough that we thought the layer to
+  abstract the differences wouldn't worth its complexity and run-time cost.
+  Elimination of the abstract layer has greatly simplified the implementation.
+
+* Speed by design
+
+  One of the most important thing in archiving high performance is to
+  do less rather than do it efficiently.
+  Therefore, the high-level design matters more than local optimizations.
+  Since we are trying to create a high-performance linker,
+  it is very important to keep the design as efficient as possible.
+
+  Broadly speaking, we do not do anything until we have to do it.
+  For example, we do not read section contents or relocations
+  until we need them to continue linking.
+  When we need to do some costly operation (such as looking up
+  a hash table for each symbol), we do it only once.
+  We obtain a handler (which is typically just a pointer to actual data)
+  on the first operation and use it throughout the process.
+
+* Efficient archive file handling
+
+  LLD's handling of archive files (the files with ".a" file extension) is different
+  from the traditional Unix linkers and pretty similar to Windows linkers.
+  We'll describe how the traditional Unix linker handles archive files,
+  what the problem is, and how LLD approached the problem.
+
+  The traditional Unix linker maintains a set of undefined symbols during linking.
+  The linker visits each file in the order as they appeared in the command line
+  until the set becomes empty. What the linker would do depends on file type.
+
+  - If the linker visits an object file, the linker links object files to the result,
+    and undefined symbols in the object file are added to the set.
+
+  - If the linker visits an archive file, it checks for the archive file's symbol table
+    and extracts all object files that have definitions for any symbols in the set.
+
+  This algorithm sometimes leads to a counter-intuitive behavior.
+  If you give archive files before object files, nothing will happen
+  because when the linker visits archives, there is no undefined symbols in the set.
+  As a result, no files are extracted from the first archive file,
+  and the link is done at that point because the set is empty after it visits one file.
+
+  You can fix the problem by reordering the files,
+  but that cannot fix the issue of mutually-dependent archive files.
+
+  Linking mutually-dependent archive files is tricky.
+  You may specify the same archive file multiple times to
+  let the linker visit it more than once.
+  Or, you may use the special command line options, `-(` and `-)`,
+  to let the linker loop over the files between the options until
+  no new symbols are added to the set.
+
+  Visiting the same archive files multiple makes the linker slower.
+
+  Here is how LLD approached the problem. Instead of memorizing only undefined symbols,
+  we program LLD so that it memorizes all symbols.
+  When it sees an undefined symbol that can be resolved by extracting an object file
+  from an archive file it previously visited, it immediately extracts the file and link it.
+  It is doable because LLD does not forget symbols it have seen in archive files.
+
+  We believe that the LLD's way is efficient and easy to justify.
+
+  The semantics of LLD's archive handling is different from the traditional Unix's.
+  You can observe it if you carefully craft archive files to exploit it.
+  However, in reality, we don't know any program that cannot link
+  with our algorithm so far, so we are not too worried about the incompatibility.
+
+Important Data Strcutures
+-------------------------
+
+We will describe the key data structures in LLD in this section.
+The linker can be understood as the interactions between them.
+Once you understand their functions, the code of the linker should look obvious to you.
+
+* SymbolBody
+
+  SymbolBody is a class to represent symbols.
+  They are created for symbols in object files or archive files.
+  The linker creates linker-defined symbols as well.
+
+  There are basically three types of SymbolBodies: Defined, Undefined, or Lazy.
+
+  - Defined symbols are for all symbols that are considered as "resolved",
+    including real defined symbols, COMDAT symbols, common symbols,
+    absolute symbols, linker-created symbols, etc.
+  - Undefined symbols represent undefined symbols, which need to be replaced by
+    Defined symbols by the resolver until the link is complete.
+  - Lazy symbols represent symbols we found in archive file headers
+    which can turn into Defined if we read archieve members.
+
+* Symbol
+
+  Symbol is a pointer to a SymbolBody. There's only one Symbol for
+  each unique symbol name (this uniqueness is guaranteed by the symbol table).
+  Because SymbolBodies are created for each file independently,
+  there can be many SymbolBodies for the same name.
+  Thus, the relationship between Symbols and SymbolBodies is 1:N.
+  You can think of Symbols as handles for SymbolBodies.
+
+  The resolver keeps the Symbol's pointer to always point to the "best" SymbolBody.
+  Pointer mutation is the resolve operation of this linker.
+
+  SymbolBodies have pointers to their Symbols.
+  That means you can always find the best SymbolBody from
+  any SymbolBody by following pointers twice.
+  This structure makes it very easy and cheap to find replacements for symbols.
+  For example, if you have an Undefined SymbolBody, you can find a Defined
+  SymbolBody for that symbol just by going to its Symbol and then to SymbolBody,
+  assuming the resolver have successfully resolved all undefined symbols.
+
+* SymbolTable
+
+  SymbolTable is basically a hash table from strings to Symbols
+  with a logic to resolve symbol conflicts. It resolves conflicts by symbol type.
+
+  - If we add Undefined and Defined symbols, the symbol table will keep the latter.
+  - If we add Defined and Lazy symbols, it will keep the former.
+  - If we add Lazy and Undefined, it will keep the former,
+    but it will also trigger the Lazy symbol to load the archive member
+    to actually resolve the symbol.
+
+* Chunk (COFF specific)
+
+  Chunk represents a chunk of data that will occupy space in an output.
+  Each regular section becomes a chunk.
+  Chunks created for common or BSS symbols are not backed by sections.
+  The linker may create chunks to append additional data to an output as well.
+
+  Chunks know about their size, how to copy their data to mmap'ed outputs,
+  and how to apply relocations to them.
+  Specifically, section-based chunks know how to read relocation tables
+  and how to apply them.
+
+* InputSection (ELF specific)
+
+  Since we have less synthesized data for ELF, we don't abstract slices of
+  input files as Chunks for ELF. Instead, we directly use the input section
+  as an internal data type.
+
+  InputSection knows about their size and how to copy themselves to
+  mmap'ed outputs, just like COFF Chunks.
+
+* OutputSection
+
+  OutputSection is a container of InputSections (ELF) or Chunks (COFF).
+  An InputSection or Chunk belongs to at most one OutputSection.
+
+There are mainly three actors in this linker.
+
+* InputFile
+
+  InputFile is a superclass of file readers.
+  We have a different subclass for each input file type,
+  such as regular object file, archive file, etc.
+  They are responsible for creating and owning SymbolBodies and
+  InputSections/Chunks.
+
+* Writer
+
+  The writer is responsible for writing file headers and InputSections/Chunks to a file.
+  It creates OutputSections, put all InputSections/Chunks into them,
+  assign unique, non-overlapping addresses and file offsets to them,
+  and then write them down to a file.
+
+* Driver
+
+  The linking process is drived by the driver. The driver
+
+  - processes command line options,
+  - creates a symbol table,
+  - creates an InputFile for each input file and put all symbols in it into the symbol table,
+  - checks if there's no remaining undefined symbols,
+  - creates a writer,
+  - and passes the symbol table to the writer to write the result to a file.
+
+Link-Time Optimization
+----------------------
+
+LTO is implemented by handling LLVM bitcode files as object files.
+The linker resolves symbols in bitcode files normally. If all symbols
+are successfully resolved, it then calls an LLVM libLTO function
+with all bitcode files to convert them to one big regular ELF/COFF file.
+Finally, the linker replaces bitcode symbols with ELF/COFF symbols,
+so that we link the input files as if they were in the native
+format from the beginning.
+
+The details are described in this document.
+http://llvm.org/docs/LinkTimeOptimization.html
+
+Glossary
+--------
+
+* RVA (COFF)
+
+  Short for Relative Virtual Address.
+
+  Windows executables or DLLs are not position-independent; they are
+  linked against a fixed address called an image base. RVAs are
+  offsets from an image base.
+
+  Default image bases are 0x140000000 for executables and 0x18000000
+  for DLLs. For example, when we are creating an executable, we assume
+  that the executable will be loaded at address 0x140000000 by the
+  loader, so we apply relocations accordingly. Result texts and data
+  will contain raw absolute addresses.
+
+* VA
+
+  Short for Virtual Address. For COFF, it is equivalent to RVA + image base.
+
+* Base relocations (COFF)
+
+  Relocation information for the loader. If the loader decides to map
+  an executable or a DLL to a different address than their image
+  bases, it fixes up binaries using information contained in the base
+  relocation table. A base relocation table consists of a list of
+  locations containing addresses. The loader adds a difference between
+  RVA and actual load address to all locations listed there.
+
+  Note that this run-time relocation mechanism is much simpler than ELF.
+  There's no PLT or GOT. Images are relocated as a whole just
+  by shifting entire images in memory by some offsets. Although doing
+  this breaks text sharing, I think this mechanism is not actually bad
+  on today's computers.
+
+* ICF
+
+  Short for Identical COMDAT Folding (COFF) or Identical Code Folding (ELF).
+
+  ICF is an optimization to reduce output size by merging read-only sections
+  by not only their names but by their contents. If two read-only sections
+  happen to have the same metadata, actual contents and relocations,
+  they are merged by ICF. It is known as an effective technique,
+  and it usually reduces C++ program's size by a few percent or more.
+
+  Note that this is not entirely sound optimization. C/C++ require
+  different functions have different addresses. If a program depends on
+  that property, it would fail at runtime.
+
+  On Windows, that's not really an issue because MSVC link.exe enabled
+  the optimization by default. As long as your program works
+  with the linker's default settings, your program should be safe with ICF.
+
+  On Unix, your program is generally not guaranteed to be safe with ICF,
+  although large programs happen to work correctly.
+  LLD works fine with ICF for example.
diff --git a/lld/docs/index.rst b/lld/docs/index.rst
index f109610e4d3..d019c4f9fd8 100644
--- a/lld/docs/index.rst
+++ b/lld/docs/index.rst
@@ -4,55 +4,12 @@ lld - The LLVM Linker
 =====================
 
 lld contains two linkers whose architectures are different from each other.
-One is a linker that implements native features directly.
-They are in `COFF` or `ELF` directories. Other directories contains the other
-implementation that is designed to be a set of modular code for creating
-linker tools. This document covers mainly the latter.
-For the former, please read README.md in `COFF` directory.
 
-* End-User Features:
-
-  * Compatible with existing linker options
-  * Reads standard Object Files (e.g. ELF, Mach-O, PE/COFF)
-  * Writes standard Executable Files (e.g. ELF, Mach-O, PE)
-  * Remove clang's reliance on "the system linker"
-  * Uses the LLVM `"UIUC" BSD-Style license`__.
-
-* Applications:
-
-  * Modular design
-  * Support cross linking
-  * Easy to add new CPU support
-  * Can be built as static tool or library
-
-* Design and Implementation:
-
-  * Extensive unit tests
-  * Internal linker model can be dumped/read to textual format
-  * Additional linking features can be plugged in as "passes"
-  * OS specific and CPU specific code factored out
-
-Why a new linker?
------------------
-
-The fact that clang relies on whatever linker tool you happen to have installed
-means that clang has been very conservative adopting features which require a
-recent linker.
-
-In the same way that the MC layer of LLVM has removed clang's reliance on the
-system assembler tool, the lld project will remove clang's reliance on the
-system linker tool.
-
-
-Current Status
---------------
-
-lld can self host on x86-64 FreeBSD and Linux and x86 Windows.
-
-All SingleSource tests in test-suite pass on x86-64 Linux.
+.. toctree::
+   :maxdepth: 1
 
-All SingleSource and MultiSource tests in the LLVM test-suite
-pass on MIPS 32-bit little-endian Linux.
+   NewLLD
+   AtomLLD
 
 Source
 ------
@@ -66,25 +23,3 @@ lld is also available via the read-only git mirror::
   git clone http://llvm.org/git/lld.git
 
 Put it in llvm's tools/ directory, rerun cmake, then build target lld.
-
-Contents
---------
-
-.. toctree::
-   :maxdepth: 2
-
-   design
-   getting_started
-   ReleaseNotes
-   development
-   windows_support
-   open_projects
-   sphinx_intro
-
-Indices and tables
-------------------
-
-* :ref:`genindex`
-* :ref:`search`
-
-__ http://llvm.org/docs/DeveloperPolicy.html#license