| author | Peter Collingbourne <peter@pcc.me.uk> | 2014-12-03 02:08:38 +0000 |
|---|---|---|
| committer | Peter Collingbourne <peter@pcc.me.uk> | 2014-12-03 02:08:38 +0000 |
| commit | 51d2de7b9edfb8e5aad0c533c249b16989a9af41 (patch) | |
| tree | beb252c6d4c847126fcb191cc5e2a79a5e97e227 /llvm/docs/LangRef.rst | |
| parent | 4c71cc1dcb57d1beb5acbd5fdb0cba7e8c246a48 (diff) | |
Prologue support
Patch by Ben Gamari!
This redefines the `prefix` attribute introduced previously and
introduces a `prologue` attribute. There are three primary use cases
that these attributes aim to serve:
1. Function prologue sigils
2. Function hot-patching: enable the user to insert `nop` operations
at the beginning of the function which can later be safely replaced
with a call to some instrumentation facility
3. Runtime metadata: allow a compiler to insert data for use by the
runtime during execution. GHC is one example of a compiler that
needs this for its tables-next-to-code feature.
Previously `prefix` served cases (1) and (2) quite well by allowing the user
to introduce arbitrary data at the entrypoint but before the function
body. Case (3), however, was poorly handled by this approach as it
required that prefix data was valid executable code.
Here we redefine the notion of prefix data to instead be data which
occurs immediately before the function entrypoint (i.e. the symbol
address). Since prefix data now occurs before the function entrypoint,
there is no need for the data to be valid code.
The previous notion of prefix data now goes under the name "prologue
data" to emphasize its duality with the function epilogue.
The intention here is to handle cases (1) and (2) with prologue data and
case (3) with prefix data.
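As a rough illustration of the split described above, the sketch below attaches prefix data to one function and prologue data to another. The function names `@f`, `@g`, and `@read_prefix` are hypothetical, the `i8 144` (`nop`) prologue assumes an x86 target, and the typed-pointer syntax assumes an LLVM release contemporary with this commit; newer releases spell `getelementptr` and `load` differently.

```llvm
; Case (3): runtime metadata stored immediately before the entry point.
; The i32 is plain data and is never executed.
define void @f() prefix i32 123 {
  ret void
}

; Cases (1) and (2): bytes emitted at the entry point, so they must be
; valid code. 144 (0x90) encodes `nop` on x86.
define void @g() prologue i8 144 {
  ret void
}

; Hypothetical helper: read @f's prefix data through its function pointer.
define i32 @read_prefix() {
  %p = bitcast void ()* @f to i32*        ; view the entry point as an i32*
  %a = getelementptr i32* %p, i32 -1      ; step back one i32 to the prefix data
  %v = load i32* %a
  ret i32 %v
}
```

Because the symbol `@f` points just past the prefix data, calling `@f` never executes the `i32`, whereas a call to `@g` runs the `nop` before reaching the function body.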
References
----------
This idea arose out of discussions[1] with Reid Kleckner in response to a
proposal to introduce the notion of symbol offsets to enable handling of
case (3).
[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html
Test Plan: testsuite
Differential Revision: http://reviews.llvm.org/D6454
llvm-svn: 223189
Diffstat (limited to 'llvm/docs/LangRef.rst')
| -rw-r--r-- | llvm/docs/LangRef.rst | 86 |
1 files changed, 60 insertions, 26 deletions
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index b48769e1ee4..3412ed31015 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -633,7 +633,8 @@
 name, a (possibly empty) argument list (each with optional :ref:`parameter
 attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
 an optional section, an optional alignment, an optional :ref:`comdat <langref_comdats>`,
-an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`, an opening
+an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
+an optional :ref:`prologue <prologuedata>`, an opening
 curly brace, a list of basic blocks, and a closing curly brace.
 
 LLVM function declarations consist of the "``declare``" keyword, an
@@ -643,7 +644,8 @@ an optional :ref:`calling convention <callingconv>`,
 an optional ``unnamed_addr`` attribute, a return type, an optional
 :ref:`parameter attribute <paramattrs>` for the return type, a function
 name, a possibly empty list of arguments, an optional alignment, an optional
-:ref:`garbage collector name <gc>` and an optional :ref:`prefix <prefixdata>`.
+:ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
+and an optional :ref:`prologue <prologuedata>`.
 
 A function definition contains a list of basic blocks, forming the CFG (Control
 Flow Graph) for the function. Each basic block may optionally start with a label
@@ -680,7 +682,7 @@ Syntax::
            [cconv] [ret attrs]
            <ResultType> @<FunctionName> ([argument list])
            [unnamed_addr] [fn Attrs] [section "name"] [comdat $<ComdatName>]
-           [align N] [gc] [prefix Constant] { ... }
+           [align N] [gc] [prefix Constant] [prologue Constant] { ... }
 
 The argument list is a comma seperated sequence of arguments where each
 argument is of the following form
@@ -1021,47 +1023,79 @@ support the named garbage collection algorithm.
 Prefix Data
 -----------
 
-Prefix data is data associated with a function which the code generator
-will emit immediately before the function body. The purpose of this feature
-is to allow frontends to associate language-specific runtime metadata with
-specific functions and make it available through the function pointer while
-still allowing the function pointer to be called. To access the data for a
-given function, a program may bitcast the function pointer to a pointer to
-the constant's type. This implies that the IR symbol points to the start
-of the prefix data.
+Prefix data is data associated with a function which the code
+generator will emit immediately before the function's entrypoint.
+The purpose of this feature is to allow frontends to associate
+language-specific runtime metadata with specific functions and make it
+available through the function pointer while still allowing the
+function pointer to be called.
 
-To maintain the semantics of ordinary function calls, the prefix data must
+To access the data for a given function, a program may bitcast the
+function pointer to a pointer to the constant's type and dereference
+index -1. This implies that the IR symbol points just past the end of
+the prefix data. For instance, take the example of a function annotated
+with a single ``i32``,
+
+.. code-block:: llvm
+
+    define void @f() prefix i32 123 { ... }
+
+The prefix data can be referenced as,
+
+.. code-block:: llvm
+
+    %0 = bitcast *void () @f to *i32
+    %a = getelementptr inbounds *i32 %0, i32 -1
+    %b = load i32* %a
+
+Prefix data is laid out as if it were an initializer for a global variable
+of the prefix data's type. The function will be placed such that the
+beginning of the prefix data is aligned. This means that if the size
+of the prefix data is not a multiple of the alignment size, the
+function's entrypoint will not be aligned. If alignment of the
+function's entrypoint is desired, padding must be added to the prefix
+data.
+
+A function may have prefix data but no body. This has similar semantics
+to the ``available_externally`` linkage in that the data may be used by the
+optimizers but will not be emitted in the object file.
+
+.. _prologuedata:
+
+Prologue Data
+-------------
+
+The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
+be inserted prior to the function body. This can be used for enabling
+function hot-patching and instrumentation.
+
+To maintain the semantics of ordinary function calls, the prologue data must
 have a particular format. Specifically, it must begin with a sequence of
 bytes which decode to a sequence of machine instructions, valid for the
 module's target, which transfer control to the point immediately succeeding
-the prefix data, without performing any other visible action. This allows
+the prologue data, without performing any other visible action. This allows
 the inliner and other passes to reason about the semantics of the function
-definition without needing to reason about the prefix data. Obviously this
-makes the format of the prefix data highly target dependent.
+definition without needing to reason about the prologue data. Obviously this
+makes the format of the prologue data highly target dependent.
 
-Prefix data is laid out as if it were an initializer for a global variable
-of the prefix data's type. No padding is automatically placed between the
-prefix data and the function body. If padding is required, it must be part
-of the prefix data.
-
-A trivial example of valid prefix data for the x86 architecture is ``i8 144``,
+A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
 which encodes the ``nop`` instruction:
 
 .. code-block:: llvm
 
-    define void @f() prefix i8 144 { ... }
+    define void @f() prologue i8 144 { ... }
 
-Generally prefix data can be formed by encoding a relative branch instruction
-which skips the metadata, as in this example of valid prefix data for the
+Generally prologue data can be formed by encoding a relative branch instruction
+which skips the metadata, as in this example of valid prologue data for the
 x86_64 architecture, where the first two bytes encode ``jmp .+10``:
 
 .. code-block:: llvm
 
     %0 = type <{ i8, i8, i8* }>
 
-    define void @f() prefix %0 <{ i8 235, i8 8, i8* @md}> { ... }
+    define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }
 
-A function may have prefix data but no body. This has similar semantics
+A function may have prologue data but no body. This has similar semantics
 to the ``available_externally`` linkage in that the data may be used by the
 optimizers but will not be emitted in the object file.
 

