Diffstat (limited to 'llvm/docs/tutorial/LangImpl09.rst')
-rw-r--r-- llvm/docs/tutorial/LangImpl09.rst 468
1 files changed, 5 insertions, 463 deletions
diff --git a/llvm/docs/tutorial/LangImpl09.rst b/llvm/docs/tutorial/LangImpl09.rst
index d81f9fa0001..1ff4dc8af44 100644
--- a/llvm/docs/tutorial/LangImpl09.rst
+++ b/llvm/docs/tutorial/LangImpl09.rst
@@ -1,465 +1,7 @@
-======================================
-Kaleidoscope: Adding Debug Information
-======================================
+:orphan:
-.. contents::
- :local:
-
-Chapter 9 Introduction
-======================
-
-Welcome to Chapter 9 of the "`Implementing a language with
-LLVM <index.html>`_" tutorial. In chapters 1 through 8, we've built a
-decent little programming language with functions and variables.
-What happens if something goes wrong, though? How do you debug your
-program?
-
-Source level debugging uses formatted data that helps a debugger
-translate from binary and the state of the machine back to the
-source that the programmer wrote. In LLVM we generally use a format
-called `DWARF <http://dwarfstd.org>`_. DWARF is a compact encoding
-that represents types, source locations, and variable locations.
-
-The short summary of this chapter is that we'll go through the
-various things you have to add to a programming language to
-support debug info, and how you translate that into DWARF.
-
-Caveat: For now we can't debug via the JIT, so we'll need to compile
-our program down to something small and standalone. As part of this
-we'll make a few modifications to the running of the language and
-how programs are compiled. This means that we'll have a source file
-with a simple program written in Kaleidoscope rather than the
-interactive JIT. It does involve a limitation that we can only
-have one "top level" command at a time to reduce the number of
-changes necessary.
-
-Here's the sample program we'll be compiling:
-
-.. code-block:: python
-
- def fib(x)
- if x < 3 then
- 1
- else
- fib(x-1)+fib(x-2);
-
- fib(10)
-
-
-Why is this a hard problem?
-===========================
-
-Debug information is a hard problem for a few different reasons - mostly
-centered around optimized code. First, optimization makes keeping source
-locations more difficult. In LLVM IR we keep the original source location
-for each IR level instruction on the instruction. Optimization passes
-should keep the source locations for newly created instructions, but merged
-instructions only get to keep a single location - this can cause jumping
-around when stepping through optimized programs. Secondly, optimization
-can leave variables optimized out, sharing memory with other variables,
-or otherwise difficult to track. For the purposes of this
-tutorial we're going to avoid optimization (as you'll see with one of the
-next sets of patches).
-
-Ahead-of-Time Compilation Mode
-==============================
-
-To highlight only the aspects of adding debug information to a source
-language without needing to worry about the complexities of JIT debugging
-we're going to make a few changes to Kaleidoscope to support compiling
-the IR emitted by the front end into a simple standalone program that
-you can execute, debug, and see results.
-
-First we make our anonymous function that contains our top level
-statement be our "main":
-
-.. code-block:: udiff
-
- - auto Proto = llvm::make_unique<PrototypeAST>("", std::vector<std::string>());
- + auto Proto = llvm::make_unique<PrototypeAST>("main", std::vector<std::string>());
-
-just with the simple change of giving it a name.
-
-Then we're going to remove the command line code wherever it exists:
-
-.. code-block:: udiff
-
- @@ -1129,7 +1129,6 @@ static void HandleTopLevelExpression() {
- /// top ::= definition | external | expression | ';'
- static void MainLoop() {
- while (1) {
- - fprintf(stderr, "ready> ");
- switch (CurTok) {
- case tok_eof:
- return;
- @@ -1184,7 +1183,6 @@ int main() {
- BinopPrecedence['*'] = 40; // highest.
-
- // Prime the first token.
- - fprintf(stderr, "ready> ");
- getNextToken();
-
-Lastly we're going to disable all of the optimization passes and the JIT so
-that the only thing that happens after we're done parsing and generating
-code is that the LLVM IR goes to standard error:
-
-.. code-block:: udiff
-
- @@ -1108,17 +1108,8 @@ static void HandleExtern() {
- static void HandleTopLevelExpression() {
- // Evaluate a top-level expression into an anonymous function.
- if (auto FnAST = ParseTopLevelExpr()) {
- - if (auto *FnIR = FnAST->codegen()) {
- - // We're just doing this to make sure it executes.
- - TheExecutionEngine->finalizeObject();
- - // JIT the function, returning a function pointer.
- - void *FPtr = TheExecutionEngine->getPointerToFunction(FnIR);
- -
- - // Cast it to the right type (takes no arguments, returns a double) so we
- - // can call it as a native function.
- - double (*FP)() = (double (*)())(intptr_t)FPtr;
- - // Ignore the return value for this.
- - (void)FP;
- + if (!FnAST->codegen()) {
- + fprintf(stderr, "Error generating code for top level expr");
- }
- } else {
- // Skip token for error recovery.
- @@ -1439,11 +1459,11 @@ int main() {
- // target lays out data structures.
- TheModule->setDataLayout(TheExecutionEngine->getDataLayout());
- OurFPM.add(new DataLayoutPass());
- +#if 0
- OurFPM.add(createBasicAliasAnalysisPass());
- // Promote allocas to registers.
- OurFPM.add(createPromoteMemoryToRegisterPass());
- @@ -1218,7 +1210,7 @@ int main() {
- OurFPM.add(createGVNPass());
- // Simplify the control flow graph (deleting unreachable blocks, etc).
- OurFPM.add(createCFGSimplificationPass());
- -
- + #endif
- OurFPM.doInitialization();
-
- // Set the global so the code gen can use this.
-
-This relatively small set of changes gets us to the point that we can compile
-our piece of Kaleidoscope language down to an executable program via this
-command line:
-
-.. code-block:: bash
-
- Kaleidoscope-Ch9 < fib.ks |& clang -x ir -
-
-which gives an a.out/a.exe in the current working directory.
-
-Compile Unit
-============
-
-The top level container for a section of code in DWARF is a compile unit.
-This contains the type and function data for an individual translation unit
-(read: one file of source code). So the first thing we need to do is
-construct one for our fib.ks file.
-
-DWARF Emission Setup
-====================
-
-Similar to the ``IRBuilder`` class we have a
-`DIBuilder <http://llvm.org/doxygen/classllvm_1_1DIBuilder.html>`_ class
-that helps in constructing debug metadata for an LLVM IR file. It maps 1:1
-onto that metadata, much as ``IRBuilder`` maps onto LLVM IR, but with nicer names.
-Using it does require that you be more familiar with DWARF terminology than
-you needed to be with ``IRBuilder`` and ``Instruction`` names, but if you
-read through the general documentation on the
-`Metadata Format <http://llvm.org/docs/SourceLevelDebugging.html>`_ it
-should be a little more clear. We'll be using this class to construct all
-of our IR level descriptions. Construction for it takes a module so we
-need to construct it shortly after we construct our module. We've left it
-as a global static variable to make it a bit easier to use.
-
-Next we're going to create a small container to cache some of our frequent
-data. The first will be our compile unit, but we'll also write a bit of
-code for our one type since we won't have to worry about multiple typed
-expressions:
-
-.. code-block:: c++
-
- static DIBuilder *DBuilder;
-
- struct DebugInfo {
- DICompileUnit *TheCU;
- DIType *DblTy;
-
- DIType *getDoubleTy();
- } KSDbgInfo;
-
- DIType *DebugInfo::getDoubleTy() {
- if (DblTy)
- return DblTy;
-
- DblTy = DBuilder->createBasicType("double", 64, dwarf::DW_ATE_float);
- return DblTy;
- }
-
-And then later on in ``main`` when we're constructing our module:
-
-.. code-block:: c++
-
- DBuilder = new DIBuilder(*TheModule);
-
- KSDbgInfo.TheCU = DBuilder->createCompileUnit(
- dwarf::DW_LANG_C, DBuilder->createFile("fib.ks", "."),
- "Kaleidoscope Compiler", 0, "", 0);
-
-There are a couple of things to note here. First, while we're producing a
-compile unit for a language called Kaleidoscope we used the language
-constant for C. This is because a debugger wouldn't necessarily understand
-the calling conventions or default ABI for a language it doesn't recognize
-and we follow the C ABI in our LLVM code generation so it's the closest
-thing to accurate. This ensures we can actually call functions from the
-debugger and have them execute. Secondly, you'll see the "fib.ks" in the
-call to ``createCompileUnit``. This is a default hard coded value since
-we're using shell redirection to put our source into the Kaleidoscope
-compiler. In a usual front end you'd have an input file name and it would
-go there.
-
-One last thing as part of emitting debug information via DIBuilder is that
-we need to "finalize" the debug information. The reasons are part of the
-underlying API for DIBuilder, but make sure you do this near the end of
-main:
-
-.. code-block:: c++
-
- DBuilder->finalize();
-
-before you dump out the module.
-
-Functions
-=========
-
-Now that we have our ``Compile Unit`` and our source locations, we can add
-function definitions to the debug info. So in ``PrototypeAST::codegen()`` we
-add a few lines of code to describe a context for our subprogram, in this
-case the "File", and the actual definition of the function itself.
-
-So the context:
-
-.. code-block:: c++
-
- DIFile *Unit = DBuilder->createFile(KSDbgInfo.TheCU->getFilename(),
- KSDbgInfo.TheCU->getDirectory());
-
-giving us a DIFile and asking the ``Compile Unit`` we created above for the
-directory and filename where we are currently. Then, for now, we use some
-source locations of 0 (since our AST doesn't currently have source location
-information) and construct our function definition:
-
-.. code-block:: c++
-
- DIScope *FContext = Unit;
- unsigned LineNo = 0;
- unsigned ScopeLine = 0;
- DISubprogram *SP = DBuilder->createFunction(
- FContext, P.getName(), StringRef(), Unit, LineNo,
- CreateFunctionType(TheFunction->arg_size(), Unit),
- false /* internal linkage */, true /* definition */, ScopeLine,
- DINode::FlagPrototyped, false);
- TheFunction->setSubprogram(SP);
-
-and we now have a DISubprogram that contains a reference to all of our
-metadata for the function.
-
-Source Locations
-================
-
-The most important thing for debug information is accurate source location -
-this makes it possible to map the binary back to your source code. We have a
-problem, though: Kaleidoscope doesn't have any source location information in
-the lexer or parser, so we'll need to add it.
-
-.. code-block:: c++
-
- struct SourceLocation {
- int Line;
- int Col;
- };
- static SourceLocation CurLoc;
- static SourceLocation LexLoc = {1, 0};
-
- static int advance() {
- int LastChar = getchar();
-
- if (LastChar == '\n' || LastChar == '\r') {
- LexLoc.Line++;
- LexLoc.Col = 0;
- } else
- LexLoc.Col++;
- return LastChar;
- }
-
-This code keeps track of the line and column of the "source file". As we lex
-each token we set our current "lexical location" to the line and column at
-the beginning of the token. We do this by replacing all of the previous calls
-to ``getchar()`` with our new ``advance()``, which keeps track of the
-information. We have also added a source location to all of our AST
-classes:
-
-.. code-block:: c++
-
- class ExprAST {
- SourceLocation Loc;
-
- public:
- ExprAST(SourceLocation Loc = CurLoc) : Loc(Loc) {}
- virtual ~ExprAST() {}
- virtual Value* codegen() = 0;
- int getLine() const { return Loc.Line; }
- int getCol() const { return Loc.Col; }
- virtual raw_ostream &dump(raw_ostream &out, int ind) {
- return out << ':' << getLine() << ':' << getCol() << '\n';
- }
-
-that we pass down through when we create a new expression:
-
-.. code-block:: c++
-
- LHS = llvm::make_unique<BinaryExprAST>(BinLoc, BinOp, std::move(LHS),
- std::move(RHS));
-
-giving us locations for each of our expressions and variables.
-
-To make sure that every instruction gets proper source location information,
-we have to tell ``Builder`` whenever we're at a new source location.
-We use a small helper function for this:
-
-.. code-block:: c++
-
- void DebugInfo::emitLocation(ExprAST *AST) {
- DIScope *Scope;
- if (LexicalBlocks.empty())
- Scope = TheCU;
- else
- Scope = LexicalBlocks.back();
- Builder.SetCurrentDebugLocation(
- DebugLoc::get(AST->getLine(), AST->getCol(), Scope));
- }
-
-This tells the main ``IRBuilder`` both where we are and what scope
-we're in. The scope can either be at the compile-unit level or be the nearest
-enclosing lexical block, like the current function.
-To represent this we create a stack of scopes:
-
-.. code-block:: c++
-
- std::vector<DIScope *> LexicalBlocks;
-
-and push the scope (function) to the top of the stack when we start
-generating the code for each function:
-
-.. code-block:: c++
-
- KSDbgInfo.LexicalBlocks.push_back(SP);
-
-Also, we must not forget to pop the scope back off of the scope stack at the
-end of the code generation for the function:
-
-.. code-block:: c++
-
- // Pop off the lexical block for the function since we added it
- // unconditionally.
- KSDbgInfo.LexicalBlocks.pop_back();
-
-Then we make sure to emit the location every time we start to generate code
-for a new AST object:
-
-.. code-block:: c++
-
- KSDbgInfo.emitLocation(this);
-
-Variables
-=========
-
-Now that we have functions, we need to be able to print out the variables
-we have in scope. Let's get our function arguments set up so we can get
-decent backtraces and see how our functions are being called. It isn't
-a lot of code, and we generally handle it when we're creating the
-argument allocas in ``FunctionAST::codegen``.
-
-.. code-block:: c++
-
- // Record the function arguments in the NamedValues map.
- NamedValues.clear();
- unsigned ArgIdx = 0;
- for (auto &Arg : TheFunction->args()) {
- // Create an alloca for this variable.
- AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());
-
- // Create a debug descriptor for the variable.
- DILocalVariable *D = DBuilder->createParameterVariable(
- SP, Arg.getName(), ++ArgIdx, Unit, LineNo, KSDbgInfo.getDoubleTy(),
- true);
-
- DBuilder->insertDeclare(Alloca, D, DBuilder->createExpression(),
- DebugLoc::get(LineNo, 0, SP),
- Builder.GetInsertBlock());
-
- // Store the initial value into the alloca.
- Builder.CreateStore(&Arg, Alloca);
-
- // Add arguments to variable symbol table.
- NamedValues[Arg.getName()] = Alloca;
- }
-
-
-Here we're first creating the variable, giving it the scope (``SP``),
-the name, source location, type, and since it's an argument, the argument
-index. Next, we create an ``llvm.dbg.declare`` call to indicate at the IR
-level that we've got a variable in an alloca (and it gives a starting
-location for the variable), and set a source location for the
-beginning of the scope on the declare.
-
-One interesting thing to note at this point is that various debuggers have
-assumptions based on how code and debug information was generated for them
-in the past. In this case we need to do a little bit of a hack to avoid
-generating line information for the function prologue so that the debugger
-knows to skip over those instructions when setting a breakpoint. So in
-``FunctionAST::codegen`` we add some more lines:
-
-.. code-block:: c++
-
- // Unset the location for the prologue emission (leading instructions with no
- // location in a function are considered part of the prologue and the debugger
- // will run past them when breaking on a function)
- KSDbgInfo.emitLocation(nullptr);
-
-and then emit a new location when we actually start generating code for the
-body of the function:
-
-.. code-block:: c++
-
- KSDbgInfo.emitLocation(Body.get());
-
-With this we have enough debug information to set breakpoints in functions,
-print out argument variables, and call functions. Not too bad for just a
-few simple lines of code!
-
-Full Code Listing
-=================
-
-Here is the complete code listing for our running example, enhanced with
-debug information. To build this example, use:
-
-.. code-block:: bash
-
- # Compile
- clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
- # Run
- ./toy
-
-Here is the code:
-
-.. literalinclude:: ../../examples/Kaleidoscope/Chapter9/toy.cpp
- :language: c++
-
-`Next: Conclusion and other useful LLVM tidbits <LangImpl10.html>`_
+=====================
+Kaleidoscope Tutorial
+=====================
+The Kaleidoscope Tutorial has `moved to another location <MyFirstLanguageFrontend/index>`_ .
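
The line and column bookkeeping done by the removed chapter's ``advance()`` helper can be illustrated in isolation. What follows is a minimal sketch, not the tutorial's code: it assumes the input is a string rather than ``getchar()``, the name ``locate`` is hypothetical, and unlike the original helper it treats a ``\r\n`` pair as a single line break.

```cpp
#include <cassert>
#include <string>

// Same shape as the SourceLocation struct in the removed chapter.
struct SourceLocation {
  int Line;
  int Col;
};

// Hypothetical helper (not in the tutorial): returns the lexer position after
// consuming every character of Src, starting from line 1, column 0.
SourceLocation locate(const std::string &Src) {
  SourceLocation Loc = {1, 0};
  char Prev = '\0';
  for (char C : Src) {
    if (C == '\n' && Prev == '\r') {
      // Second half of a "\r\n" pair: the '\r' already started the new line.
    } else if (C == '\n' || C == '\r') {
      // A line break moves to the next line and resets the column.
      Loc.Line++;
      Loc.Col = 0;
    } else {
      // Any other character advances the column.
      Loc.Col++;
    }
    Prev = C;
  }
  return Loc;
}
```

For example, ``locate("a\r\nb")`` yields line 2, column 1, whereas counting ``\r`` and ``\n`` separately would report line 3.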