| -rw-r--r-- | lld/www/content.css | 27 |
| -rw-r--r-- | lld/www/hello.png | bin, 27616 -> 0 bytes |
| -rw-r--r-- | lld/www/index.html | 102 |
| -rw-r--r-- | lld/www/linker_design.html | 421 |
| -rw-r--r-- | lld/www/menu.css | 39 |
5 files changed, 0 insertions, 589 deletions
diff --git a/lld/www/content.css b/lld/www/content.css deleted file mode 100644 index dca6a329143..00000000000 --- a/lld/www/content.css +++ /dev/null @@ -1,27 +0,0 @@ -html { margin: 0px; } body { margin: 8px; } - -html, body { -  padding:0px; -  font-size:small; font-family:"Lucida Grande", "Lucida Sans Unicode", Arial, Verdana, Helvetica, sans-serif; background-color: #fff; color: #222; -  line-height:1.5; -} - -h1, h2, h3, tt { color: #000 } - -h1 { padding-top:0px; margin-top:0px;} -h2 { color:#333333; padding-top:0.5em; } -h3 { padding-top: 0.5em; margin-bottom: -0.25em; color:#2d58b7} -li { padding-bottom: 0.5em; } -ul { padding-left:1.5em; } - -/* Slides */ -IMG.img_slide { -    display: block; -    margin-left: auto; -    margin-right: auto -} - -.itemTitle { color:#2d58b7 } - -/* Tables */ -tr { vertical-align:top } diff --git a/lld/www/hello.png b/lld/www/hello.png Binary files differdeleted file mode 100644 index 70df111f1ab..00000000000 --- a/lld/www/hello.png +++ /dev/null diff --git a/lld/www/index.html b/lld/www/index.html deleted file mode 100644 index def21dba71e..00000000000 --- a/lld/www/index.html +++ /dev/null @@ -1,102 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" -          "http://www.w3.org/TR/html4/strict.dtd"> -<!-- Material used from: HTML 4.01 specs: http://www.w3.org/TR/html401/ --> -<html> -<head> -  <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> -  <title>lld: a linker for LLVM</title> -  <link type="text/css" rel="stylesheet" href="menu.css"> -  <link type="text/css" rel="stylesheet" href="content.css"> -</head> - -<body> -<div id="menu"> -  <div> -    <a href="http://llvm.org/">LLVM Home</a> -  </div> - -  <div class="submenu"> -    <label>lld Info</label> -    <a href="./index.html">About</a> -    <a href="linker_design.html">Design</a> -  </div> - -  <div class="submenu"> -    <label>Quick Links</label> -    <a href="http://lists.cs.uiuc.edu/pipermail/llvm-commits/">commits</a> -    <a 
href="http://llvm.org/bugs/">Bug Reports</a> -    <a href="http://llvm.org/svn/llvm-project/lld/trunk/">Browse SVN</a> -    <a href="http://llvm.org/viewvc/llvm-project/lld/trunk/">Browse ViewVC</a> -  </div> -</div> - -<div id="content"> -  <!--*********************************************************************--> -  <h1>lld: a linker for LLVM</h1> -  <!--*********************************************************************--> - -  <p>lld is a new set of modular code for creating linker tools.</p> - -  <!--=====================================================================--> -  <h2 id="goals">Features and Goals</h2> -  <!--=====================================================================--> - -    <p><b>End-User Features:</b></p> -    <ul> -        <li>Compatible with existing linker options</li> -        <li>Reads standard Object Files (e.g. ELF, mach-o, PE/COFF)</li> -        <li>Writes standard Executable Files (e.g. ELF, mach-o, PE)</li> -        <li>Fast link times</li> -        <li>Minimal memory use</li> -        <li>Remove clang's reliance on "the system linker"</li> -        <li>Uses the LLVM 'BSD' License</li> -    </ul> - -    <p><b>Applications:</b></p> -    <ul> -        <li>Modular design</li> -        <li>Support cross linking</li> -        <li>Easy to add new CPU support</li> -        <li>Can be built as static tool or library</li> -    </ul> - -    <p><b>Design and Implementation:</b></p> -    <ul> -        <li>Extensive unit tests</li> -        <li>Internal linker model can be dumped/read to textual format</li> -        <li>Internal linker model can be dumped/read to new native format</li> -		<li>Native format designed to be fast to read and write</li> -        <li>Additional linking features can be plugged in as "passes"</li> -        <li>OS specific and CPU specific code factored out</li> -    </ul> - -  <!--=====================================================================--> -  <h2 id="why">Why a new linker?</h2> -  
<!--=====================================================================--> - -  <p>The fact that clang relies on whatever linker tool you happen to have -    installed means that clang has been very conservative adopting features -    which require a recent linker.</p> - - <p>In the same way that the MC layer of LLVM has removed clang's reliance -    on the system assembler tool, the lld project will remove clang's reliance -    on the system linker tool.</p> - - -  <!--=====================================================================--> -  <h2 id="status">Current Status</h2> -  <!--=====================================================================--> - -   <p>lld is in its very early stages of development.</p> - - <!--=====================================================================--> -  <h2>Design Documents</h2> -  <!--=====================================================================--> - -<ul> -   <li><a href="linker_design.html">Design of lld</a></li> -</ul> - -</div> -</body> -</html> diff --git a/lld/www/linker_design.html b/lld/www/linker_design.html deleted file mode 100644 index 308882c2582..00000000000 --- a/lld/www/linker_design.html +++ /dev/null @@ -1,421 +0,0 @@ -<html> -<head> -  <title>Design of lld</title> -</head> -<body> - - -<h1> -  Design of lld  -</h1> - -<h2> -  <a name="introduction">Introduction</a> -</h2> - -<p>lld is a new generation of linker.  It is not "section" based -like traditional linkers which mostly just interlace sections from multiple -object files into the output file.  Instead, lld is based on "Atoms". -Traditional section based linking work well for simple linking, but their model -makes advanced linking features difficult to implement.  Features like dead code  -stripping, reordering functions for locality, and C++ coalescing require the -linker to work at a finer grain. -</p> - -<p>An atom is an indivisible chunk of code or data.  
An atom has a set of -attributes, such as: name, scope, content-type, alignment, etc.  An atom also -has a list of References.  A Reference contains: a kind, an optional offset,  -an optional addend, and an optional target atom.</p> - -<p>The Atom model allows the linker to use standard graph theory models for  -linking data structures.  Each atom is a node, and each Reference is an edge.  -The feature of dead code stripping is implemented by following edges to mark -all live atoms, and then delete the non-live atoms.</p> -<br> -<h2> -  <a name="Atom model">Atom model</a> -</h2> - -<p>An atom is an indivisible chuck of code or data.  Typically each user -written function or global variable is an atom.  In addition, the compiler may -emit other atoms, such as for literal c-strings or floating point constants, or -for runtime data structures like dwarf unwind info or pointers to initializers. -</p> - -<p>A simple "hello world" object file would be modeled like this:</p> -<img src="hello.png" alt="hello world graphic"/> -<p>There are three atoms: main, a proxy for printf, and an anonymous atom  -containing the c-string literal "hello world".  The Atom "main" has two   -references. One is the call site for the call to printf, and the other is  -a refernce for the instruction that loads the address of the c-string literal.  -</p> - -<br> -<h2> -  <a name="File model">File model</a> -</h2> - -<p>The linker views the input files as basically containers of Atoms and  -References, and just a few attributes of their own.  The linker works with three  -kinds of files: object files, static libraries, and dynamic shared libraries.    -Each kind of file has reader object which presents the file in the model  -expected by the linker.</p> -<h4> <a>Object File</a>  -</h4> -An object file is just a container of atoms.  When linking  -an object file, a reader is instantiated which parses the object file and  -instantiates a set of atoms representing all content in the .o file.  
The -linker adds all those atoms to a master graph. - -<h4> <a>Static Library (Archive)</a>  -</h4> -This is the traditional unix static archive which is just a collection of -object files with a "table of contents". When linking with a static library, -by default nothing is added to the master graph of atoms. Instead, if after -merging all atoms from object files into a master graph, if any "undefined" atoms -are left remaining in the master graph, the linker reads the  -table of contents for each static library to see if any have the needed  -definitions. If so, the set of atoms from the specified object file in the  -static library is added to the master graph of atoms.  - -<h4> <a>Dynamic Library (Shared Object)</a>  -</h4> -Dynamic libraries are different than object files and static libraries in that  -they don't directly add any content. -Their purpose is to check at build time that the remaining undefined references  -can be resolved at runtime, and provide a list of dynamic libraries (SO_NEEDED)  -that will be needed at runtime.  -The way this is modeled in the linker is that a dynamic library contributes -no atoms to the initial graph of atoms.  Instead, (like static libraries) if -there are "undefined" atoms in the master graph of all atoms, then each  -dynamic library is checked to see if exports the required symbol. If so, -a "shared library" atom is instantiated by the by the reader which the linker -uses to replace the "undefined" atom.</p> - -<br> -<h2> -  <a name="Linking Steps">Linking Steps</a> -</h2> -<p>Through the use of abstract Atoms, the core of linking is architecture  -independent and file format independent.  
All command line parsing is factored -out into a separate "options" abstraction which enables the linker to be driven -with different command line sets.</p> -<p>The overall steps in linking are:<p> -<ol> -  <li>Command line processing</li> -  <li>Parsing input files</li> -  <li>Resolving</li> -  <li>Passes/Optimizations</li> -  <li>Generate output file</li> -</ol> - -<p>The Resolving and Passes steps are done purely on the master graph of atoms,  -so they have no notion of file formats such as mach-o or ELF.</p> - -<h4> <a>Resolving</a>  -</h4> -<p>The resolving step takes all the atoms graphs from each object file and  -combines them into one master object graph.  Unfortunately, it is not as simple -as appending the atom list from each file into one big list.  There are many -cases where atoms need to be coalesced.  That is, two or more atoms need to  -be coalesced into one atom.  This is necessary to support: C language - "tentative definitions", C++ weak symbols for templates and inlines defined -in headers, replacing undefined atoms with actual definition atoms, and  -for merging copies of constants like c-strings and floating point constants.</p> - -<p>The linker support coalescing by-name and by-content. By-name is used for -tentative definitions and weak symbols.  By-content is used for constant data -that can be merged. </p> - -<p>The resolving process maintains some global linking "state", including -a "symbol table" which is a map from llvm::StringRef to lld::Atom*. -With these data structures, the linker iterates all atoms in all input files. F -or each atom, it checks if the atom is named and has a global or hidden scope. -If so, the atom is added to the symbol table map.  If there already is -a matching atom in that table, that means the current atom needs to be  -coalesced with the found atom, or it is a multiple definition error. 
-</p> - -<p>When all initial input file atoms have been processed by the resolver,   -a scan is made to see if there are any undefined atoms in the graph.  If there -are, the linker scans all libraries (both static and dynamic) looking for   -definitions to replace the undefined atoms.  It is an error if any undefined   -atoms are left remaining. -</p> - -<p>Dead code stripping (if requested) is done at the end of resolving.  The -linker does a simple mark-and-sweep. It starts with "root" atoms (like "main" -in a main executable) and follows each references and marks each Atom that -it visits as "live".  When done, all atoms not marked "live" are removed. -</p> - -<p>The result of the Resolving phase is the creation of an lld::File object.   -The goal is that the lld::File model is <b>the</b> internal representation throughout -the linker. The file readers parse (mach-o, ELF, COFF) into an lld::File. -The file writers (mach-o, ELF, COFF) taken an lld::File and produce their -file kind, and every Pass only operates on an lld::File.  This is not only -a simpler, consistent model, but it enables the state of the linker to be  -dumped at any point in the link for testing purposes. -</p> - - -<h4> <a>Passes</a>  -</h4> -<p>The Passes step -is an open ended set of routines that each get a change to modify or enhance -the current lld::File object. Some example Passes are:</p> -<ul> -  <li>stub (PLT) generation</li> -  <li>GOT instantiation</li> -  <li>order_file optimization</li> -  <li>branch island generation</li> -  <li>branch shim generation</li> -  <li>Objective-C optimizations (Darwin specific)</li> -  <li>TLV instantiation (Darwin specific)</li> -  <li>dtrace probe processing (Darwin specific)</li> -  <li>compact unwind encoding (Darwin specific)</li> -</ul> -<p>Some of these passes are specific to Darwin's runtime environments.  
But many -of the passes are applicable to any OS (such as generating branch island for  -out of range branch instructions).</p> - -<p>The general structure of a pass is to iterate through the atoms in the -current lld::File object, inspecting each atom and doing something.   -For instance, the stub pass, looks for call sites to shared library atoms -(e.g. call to printf).  It then -instantiates a "stub" atom (PLT entry) and a "lazy pointer" atom for each  -proxy atom needed, and these new atoms are added to the current lld::File  -object.  Next, all the noted call sites to shared library atoms have their -References altered to point to the stub atom instead of the shared library -atom.</p>   - -<h4><a>Generate Output File</a>  -</h4> -<p>Once the passes are done, the output file writer is given current lld::File  -object.  The writer's job is to create the executable content file wrapper  -and place the content of the atoms into it.  -</p> - - -<h2> -  <a name="lld::File representations">lld::File representations</a> -</h2> -<p>Just as LLVM has three representations of its IR model, lld has three -representations of its File/Atom/Reference model:  -<ul> -	<li>In memory, abstract C++ classes  -	     (lld::Atom, lld::Reference, and lld::File)</li> -	<li>textual (in YAML)</li> -	<li>binary format ("native")</li> -</ul> -</p> - -<h4><a>Binary File Format</a>  -</h4> -<p>In theory, lld::File objects could be written to disk in an existing  -Object File format standard (e.g. ELF).  Instead we choose to define a new -binary file format. There are two main reasons for this: fidelity and  -performance.</p> -<p>In order for lld to work as a linker on all platforms, its internal model -must be rich enough to model all CPU and OS linking features.  But if we choose  -an existing Object File format as the lld binary format, that means an on -going need to retrofit each platform specific feature needed from alternate -platforms into the existing Object File format.  
Having our own "native" -binary format side steps that issue.  We still need to be able to binary -encode all the features, but once the in-memory model can represent the feature, -it is straight forward to binary encode it.</p> - -<p>The reason to use a binary file format at all, instead of a textual file  -format, is speed.  You want the binary format to be as fast as possible to read  -into the in-memory model. Given that we control the in-memory model and the  -binary format, the obvious way to make reading super fast it to make the file -format be basically just an array of atoms.  The reader just mmaps in the file  -and looks at the header to see how many atoms there are and instantiate that   -many atom objects with the atom attribute information coming from that array.    -The trick is designing this in a way that can be extended as the Atom mode  -evolves and new attributes are added. -</p> - -<p>The native object file format starts with a header that lists how many  -"chunks" are in the file.  A chunk is an array of "ivar data".  The native -file reader instantiates an array of Atom objects (with one large malloc call).  -Each atom contains just a pointer to its vtable and a pointer to its ivar data.  -All methods on lld::Atom are virtual, so all the method implementations return -values based on the ivar data to which it has a pointer. -If a new linking features is added which requires a change to the lld::Atom -model, a new native reader class (e.g. version 2) is defined which knows -how to read the new feature information from the new ivar data.  The old -reader class (e.g. version 1) is updated to do its best to model (the lack -of the new feature) given the old ivar data in existing native object files. -</p> - -<p>With this model for the native file format, files can be read and turned -into the in-memory graph of lld::Atoms with just a few memory allocations.   
-And the format can easily adapt over time to new features</p> - - -<h4><a>Textual representations in YAML</a>  -</h4> -<p> -In designing a textual format we want something easy for humans to read and -easy for the linker to parse.  Since an atom has lots of attributes most of -which are usually just the default, we should define default values for  -every attribute so that those can be omitted from the text representation. -Here is the atoms for a simple hello world program expressed in YAML. -</p> -<pre> ---- -target-triple:   x86_64-apple-darwin11 - -atoms: -    - name:    _main -      scope:   global -      type:    code -      content: [ 55, 48, 89, e5, 48, 8d, 3d, 00, 00, 00, 00, 30, c0, e8, 00, 00, -                 00, 00, 31, c0, 5d, c3 ] -      fixups: -      - offset: 07 -        kind:   pcrel32 -        target: 2 -      - offset: 0E -        kind:   call32 -        target: _fprintf - -    - type:    c-string -      content: [ 73, 5A, 00 ] - -... -</pre> - -<p>The biggest use for the textual format will be writing test cases.  -Writing test cases in C is problematic because the compiler  -may vary its output over time for its own optimization reasons which my  -inadvertently disable or break the linker feature trying to be tested. By  -writing test cases in the linkers own textual format, we can exactly specify  -every attribute of every atom and thus target specific linker logic. -</p> - - - - -<h2> -  <a name="Testing">Testing</a> -</h2> -<p>The lld project contains a test suite which is being built up as new code -is added to lld.  All new lld functionality should have a tests added to the  -test suite.</p> -<p>The test suite is <a href="http://llvm.org/cmds/lit.html">lit</a> driven. -Each test is a text file with comments telling lit how to run the test and -check the result</p> -<p>To facilitate testing, the lld project builds a tool called lld-core. 
-This tool reads a YAML file (default from stdin), parses it into one or -more lld::File objects in memory and then feeds those lld::File objects  -to the resolver phase.  The output of the resolver is written as a native -object file.  It is then read back in using the native object file reader -and then pass to the YAML writer.  This round-about path means that all three -representations (in-memory, binary, and text) are exercised, and any new -feature has to work in all the representations to pass the test. -</p> - -<h4><a>Resolver testing</a>  -</h4> -<p>Basic testing is the "core linking" or resolving phase.  That -is where the linker merges object files.  All test cases are written in YAML. -One feature of YAML is that it allows multiple "documents" to be encoding in -one YAML stream.  That means one text file can appear to the linker as  -multiple .o files - the normal case for the linker.   -</p> -<p>Here is a simple example of a core linking test case. It checks that -an undefined atom from one file will be replaced by a definition from -another file.</p> -<pre> -# RUN: lld-core %s | FileCheck %s - -# -# Test that undefined atoms are replaced with defined atoms. -# - ---- -atoms: -    - name:              foo -      definition:        undefined ---- -atoms: -    - name:              foo -      scope:             global -      type:              code -... - -# CHECK:       name:       foo -# CHECK:       scope:      global -# CHECK:       type:       code -# CHECK-NOT:   name:       foo -# CHECK:       ... -</pre> - -<h4><a>Passes testing</a>  -</h4> -<p>Since Passes just operate on an lld::File object, the lld-core tool has -the option to run a particular pass (after resolving).  Thus, you can write -a YAML test case with carefully crafted input to exercise areas of a Pass -and the check the resulting lld::File object as represented in YAML. 
-</p> - - -<h2> -  <a name="Design Issues">Design Issues</a> -</h2> - -<p>There are a number of open issues in the design of lld.  The plan is to  -wait and make these design decisions when we need to.</p> - - -<h4><a>Debug Info</a>  -</h4> -<p>Currently, the lld model says nothing about debug info.  But the most  -popular debug format is DWARF and there is some impedance mismatch with the  -lld model and DWARF.  In lld there are just Atoms and only Atoms that need -to be in a special section at runtime have an associated section.  Also,  -Atoms do not have addresses.  The way DWARF is spec'ed different parts of  -DWARF are supposed to go into specially named sections and the DWARF references -function code by address.</p> - -<h4><a>CPU and OS specific functionality</a>  -</h4> -<p>Currently, lld has an abstract "Platform" that deals with any CPU or OS -specific differences in linking.  We just keep adding virtual methods to -the base Platform class as we find linking areas that might need customization. -At some point we'll need to structure this better. -</p> - -<h4><a>File Attributes</a>  -</h4> -<p>Currently, lld::File just has a path and a way to iterate its atoms. We  -will need to add mores attributes on a File.  For example, some equivalent -to the target triple.  There is also a number of cached or computed attributes -that could make various Passes more efficient.  For instance, on Darwin -there are a number of Objective-C optimizations that can be done by a Pass. -But it would improve the plain C case if the Objective-C optimization Pass did -not have to scan all atoms looking for any Objective-C data structures.  This -could be done if the lld::File object had an attribute that said if the file -had any Objective-C data in it. The Resolving phase would then be required -to "merge" that attribute as object files are added. 
-</p> - -<h4><a>Command Line Processing</a>  -</h4> -<p>Eventually, we may want this linker to be able to be a drop in replacement  -linker for existing linker tools.  That means being able to handle command -line arguments for different platforms (e.g. darwin or linux).  Currently, -there is no command line processing code in lld. If clang winds up -incorporating the lld libraries into the clang binary, lld may be able to punt -this work because clang will be responsible for setting up the state for lld. -</p> - - - - - -</body> -</html> - diff --git a/lld/www/menu.css b/lld/www/menu.css deleted file mode 100644 index 4a887b1907a..00000000000 --- a/lld/www/menu.css +++ /dev/null @@ -1,39 +0,0 @@ -/***************/ -/* page layout */ -/***************/ - -[id=menu] { -	position:fixed; -	width:25ex; -} -[id=content] { -	/* *****  EDIT THIS VALUE IF CONTENT OVERLAPS MENU ***** */ -	position:absolute; -  left:29ex; -	padding-right:4ex; -} - -/**************/ -/* menu style */ -/**************/ - -#menu .submenu { -	padding-top:1em; -	display:block; -} - -#menu label { -	display:block; -	font-weight: bold; -	text-align: center; -	background-color: rgb(192,192,192); -} -#menu a { -	padding:0 .2em; -	display:block; -	text-align: center; -	background-color: rgb(235,235,235); -} -#menu a:visited { -	color:rgb(100,50,100); -}  | 
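Note on the removed linker_design.html: it describes dead code stripping as a simple mark-and-sweep over the atom graph, starting from "root" atoms such as main and following each Reference. The sketch below illustrates that idea only and is not lld's actual code. The `Atom` struct, the index-based `targets` edges, and the function names `markLive`, `liveNames`, and `deadStripHello` are assumptions made for this example. The graph is the document's "hello world" example (main, a printf proxy, a c-string) plus one hypothetical unused function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for lld::Atom: a graph node whose
// outgoing edges (References) are stored as indices of target atoms.
struct Atom {
    std::string name;
    std::vector<std::size_t> targets;
    bool live;
};

// Mark phase: follow references from each root and mark every atom reached.
void markLive(std::vector<Atom>& atoms, const std::vector<std::size_t>& roots) {
    std::vector<std::size_t> worklist(roots.begin(), roots.end());
    while (!worklist.empty()) {
        std::size_t i = worklist.back();
        worklist.pop_back();
        assert(i < atoms.size());
        if (atoms[i].live) continue;  // already visited
        atoms[i].live = true;
        for (std::size_t t : atoms[i].targets) worklist.push_back(t);
    }
}

// Stand-in for the sweep phase: collect the names of the live atoms.
// A real sweep would remove the non-live atoms from the graph instead.
std::vector<std::string> liveNames(const std::vector<Atom>& atoms) {
    std::vector<std::string> out;
    for (const Atom& a : atoms) {
        if (a.live) out.push_back(a.name);
    }
    return out;
}

// The "hello world" graph plus one unused function that is never reached.
std::vector<std::string> deadStripHello() {
    std::vector<Atom> atoms = {
        {"main", {1, 2}, false},        // calls printf, loads the c-string
        {"printf", {}, false},          // proxy for the shared library symbol
        {"hello-cstring", {}, false},   // anonymous c-string literal
        {"unused", {2}, false},         // references the string, but is dead
    };
    const std::vector<std::size_t> roots{0};  // "main" is the only root
    markLive(atoms, roots);
    return liveNames(atoms);
}
```

In this sketch `deadStripHello()` keeps main, printf, and the c-string and drops `unused`. The c-string survives because main references it. The reference from the dead atom has no effect, since edges are only followed from atoms that are themselves reached.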

