2 files changed, 380 insertions, 0 deletions
diff --git a/lldb/www/projects.html b/lldb/www/projects.html
new file mode 100755
index 00000000000..8551dac80ea
--- /dev/null
+++ b/lldb/www/projects.html
@@ -0,0 +1,379 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
+<link href="style.css" rel="stylesheet" type="text/css" />
+<title>LLDB Projects</title>
+</head>
+
+<body>
+    <div class="www_title">
+      Project ideas for the <strong>LLDB</strong> Debugger
+    </div>
+    
+<div id="container">
+	<div id="content">
+         <!--#include virtual="sidebar.incl"-->
+		<div id="middle">
+			<div class="post">
+				<h1 class="postheader">Goals</h1>
+				<div class="postcontent">
+
+					  <p>The following is a mostly unordered set of ideas for improvements
+                                            to the LLDB debugger.  Some are fairly deep, some would require less 
+                                            effort.
+					     </p>
+
+					  <ol>
+                                            <li>
+                                              Speed up type realization in lldb.  
+                                              <p>
+                                                The type of problem I'm addressing here is the situation where you are 
+                                                debugging a large program (lldb built with debug clang/swift will do) 
+                                                and you go to print a simple expression, and lldb goes away for 30 seconds.  
+                                                When you sample it, it is always busily churning through all the CU's in the
+                                                world looking for something.  The problem isn't that looking for something in 
+                                                particular is slow, but rather that we somehow turned a bounded search 
+                                                (maybe a subtype of "std::string") into an unbounded search (all things with the 
+                                                name of that subtype).  Or we didn't stop when we got a reasonable answer 
+                                                proximate to the context of the search, but let the search leak out globally. 
+                                                And quite likely there are other issues that I haven't guessed yet. 
+                                                But if you end up churning through 3 or 4 Gig of debug info, that's going to be slow 
+                                                no matter how well written your debug reader is...
+                                              </p>
+                                              <p>
+                                                My guess is the work will be more in the general symbol lookup than in the DWARF 
+                                                parser in particular, but it may be a combination of both.
+                                              </p>
+                                              <p>
+                                                As a user debugging a largish program, this is the most obvious lameness of lldb.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Symbol name completion in the expression parser.
+                                              <p>
+                                              This is the other obvious lameness of lldb.  You can do:
+                                              </p>
+<code><pre><tt>
+(lldb) frame var foo.b&lt;TAB&gt;
+</tt></pre></code>
+                                              <p>
+                                                and we will tell you it is "foo.bar".  But you can't do that in the expression parser. 
+                                                This will require collaboration with the clang/swift folks to get the right extension 
+                                                points in the compiler. And whatever they are, lldb will need to use them to tell the 
+                                                compiler about what names are available.  It will be important to avoid the pitfalls 
+                                                of #1 where we wander into the entire DWARF world.
+                                              </p>
+
+                                            </li>
+                                            <li>
+                                              Make a high speed asynchronous communication channel to replace the gdb-remote protocol.
+                                              <p>
+                                                All lldb debugging nowadays is done by talking to a debug agent.  We used the 
+                                                gdb-remote protocol because that is universal, and good enough, and you have 
+                                                to support it anyway since so many little devices &amp; JTAG's and VM's etc 
+                                                support it.  But it is really old, not terribly high performance, and 
+                                                can't really handle sending or receiving messages while the process is 
+                                                supposedly running.  A replacement should have compression built in, drop the hand-built 
+                                                checksums and rely on the robust communication protocols we always have nowadays,
+                                                allow for out-of-order requests/replies, allow for reconnecting to a temporarily 
+                                                disconnected debug session, regularize all of the packet formatting into JSON or 
+                                                BSON or whatever while including a way to do large binary transfers.  It must be 
+                                                possible to come up with something faster, and better tunable for the many 
+                                                communications pathways we end up supporting.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Fix the event handling/process control machinery to support calling SB &amp; Commands 
+                                              everywhere, and to support non-stop debugging
+                                              <p>
+                                                There is a fairly ad-hoc system to handle when it is safe to run SB API's and 
+                                                command line commands.  This is actually a bit of a tricky problem, since we 
+                                                allow access to the command line and SB API from some funky places in lldb. 
+                                                The Operating System plugins are the most obvious instance, since they get 
+                                                run right after lldb is told by debugserver that the process has stopped, but 
+                                                before it has finished collating the information from the stop for presentation 
+                                                to the higher levels.  But breakpoint callbacks have some of the same problems, 
+                                                and other things like the scripted stepping operations and any fancier extension 
+                                                points we want to add to the debugger are going to be hard to implement robustly 
+                                                till we work on a finer-grained and more explicit control over who gets to control 
+                                                the process state.</p>
+                                              <p>
+                                                We also won't have any chance of supporting non-stop debugging - which is a useful
+                                                mode for programs that have a lot of high-priority or real-time worker threads - 
+                                                until we get this sorted out.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Finish the language abstraction and remove all the unnecessary C++/clang-based API's
+                                              <p>
+                                                An important part of making lldb a more useful "debugger toolkit" as opposed to a 
+                                                C/C++/ObjC/Swift debugger is to have a clean abstraction for language support. 
+                                                We did most, but not all, of the physical separation.  We need to finish that. 
+                                                And then by force of necessity the API's really look like the interface to a C++ 
+                                                type system with a few swift bits added on.  How you would go about adding a new 
+                                                language is unclear and much more trouble than it is worth at present.  But if 
+                                                we made this nice, we could add a lot of value to other language projects.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Add some syntax to generate data formatters from type definitions
+                                              <p>
+                                                Uses of the data formatters fall into two types.  There are data formatters for 
+                                                types where the structure elements pretty much tell you how to present the data, 
+                                                you just need a little expression language to express how to turn them into what 
+                                                the user expects to see.  Then there are the ones (like pretty much all our 
+                                                Foundation/AppKit/UIKit formatters) that use deep magic to figure out how the type 
+                                                is actually laid out.  The latter are pretty much always going to have to be done by hand.
+                                              </p>
+                                              <p>
+                                                But for the ones where the information is expressed in the fields, it would be great 
+                                                to have a way to express the instructions to produce summaries and children in some 
+                                                form you could embed next to the types and have the compiler produce a byte code 
+                                                form of the instructions and then make that available to lldb along with the library.
+                                                This isn't as simple as having clang run over the headers and produce something from the 
+                                                types directly.  After all, clang has no way of knowing that the interesting thing about 
+                                                a std::vector is the elements that you get by calling size (for the summary) and [] for 
+                                                the elements.  But it shouldn't be hard to come up with a generic markup to express this.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Allow the expression parser to access dynamic type/data formatter information
+                                              <p>
+                                                This seems like a smaller one.  The symptom is that your object is a Foo, a child of Bar, and in 
+                                                the Locals view you see all the fields of Foo, but because the static type of the object 
+                                                is Bar, you can't see any of the fields of Foo in expressions.  But if you could get this working, 
+                                                you could hijack the mechanism to make the results of the value object summaries/synthetic 
+                                                children available to expressions.  And if you can do that, you could add other properties 
+                                                to an object externally (through Python or some other extension point) and then have 
+                                                these also available in the expression parser.  You could use this to express invariants 
+                                                for data structures, or other more advanced uses of types in the debugger.
+                                              </p>
+                                              <p>
+                                                Another version of this is to allow access to synthetic children in the expression 
+                                                parser.  Otherwise you end up in situations like:
+                                              </p>
+<code><pre><tt>
+(lldb) print return_a_foo()
+(SomeVectorLikeType) $1 = {
+  [0] = 0
+  [1] = 1
+  [2] = 2
+  [3] = 3
+  [4] = 4
+}
+</tt></pre></code>
+
+                                              <p>
+                                                That's good but:
+                                              </p>
+<code><pre><tt>
+(lldb) print return_a_foo()[2]
+</tt></pre></code>
+                                              <p>
+                                              fails because the expression parser doesn't know anything about the array-like nature of 
+                                              SomeVectorLikeType that it gets from the synthetic children.
+                                              </p>
+                                            </li>
+                            
+                                            <li>
+                                              Add an extension point in the breakpoint search machinery.
+                                              <p>
+                                                This would allow highly customizable, algorithmic breakpoint types, like "break on every 
+                                                use of some particular instruction, or instruction pattern, etc."
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Documentation and better examples
+
+                                              <p>
+                                                We need to put the lldb syntax docs in the tutorial somewhere that is more easily accessible.  
+                                                One suggestion is to add non-command based help to the help system, and then have a "help lldb" 
+                                                or "help syntax" type command with this info.  It would be nice if the non-command based help could 
+                                                be hierarchical so you could make topics.
+                                              </p>
+
+                                              <p>
+                                                There's a fair bit of docs about the SB API's, but it is spotty.  Some classes are 
+                                                well documented in the Python "help (lldb.SBWhatever)" and some are not.
+                                              </p>
+
+
+                                              <p>
+                                                We need more conceptual docs.  And we need more examples.  And we could provide a 
+                                                clean pluggable example for using LLDB standalone from Python.  The process_events.py 
+                                                is a start of this, but it just handles process events, and it is really a quick sketch 
+                                                not a polished expandable proto-tool.  
+                                              </p>
+
+                                            </li>
+
+                                            <li>
+                                              Make a more accessible plugin architecture for lldb.
+
+                                              <p>
+                                                Right now, you can only use the Python or SB API's to extend an extant lldb.  You can't 
+                                                implement any of the actual lldb Plugins as plugins.  That means anybody that wants to 
+                                                add new Object file/Process/Language etc support has to build and distribute their own 
+                                                lldb.  This is tricky because the API's the plugins use are currently not stable (and 
+                                                recently have been changing quite a lot.)  We would have to define a subset of lldb_private 
+                                                that you could use, and some way of telling whether the plugins were compatible with 
+                                                the lldb.  But long-term, making this sort of extension possible will make lldb more 
+                                                appealing for research and 3rd party uses.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Use instruction emulation to avoid the overhead of swapping trap and instruction for breakpoints
+
+                                              <p>
+                                                At present, breakpoints are implemented by inserting a trap instruction; then, when the trap 
+                                                is hit, we replace the trap with the actual instruction and single step.  Then we swap back and 
+                                                continue.  This causes problems for read-only text, and also means that non-stop debugging 
+                                                must either stop all threads briefly to handle this two-step or risk missing some 
+                                                breakpoint hits.  If you emulated the instruction and wrote back the results, you wouldn't 
+                                                have these problems, and it would also save a stop per breakpoint hit.  Since we use breakpoints 
+                                                to implement stepping, this savings could be significant on slow connections.
+                                              </p>
+                                            </li>
+                                            <li>
+                                              Use the JIT to speed up conditional breakpoint evaluation
+
+                                              <p>
+                                                We already JIT and cache the conditional expressions for breakpoints for the C family of 
+                                                languages, so we aren't re-compiling every time you hit the breakpoint.  And if we couldn't 
+                                                IR interpret the expression, we leave the JIT'ed code in place for reuse.  But it would 
+                                                be even better if we could also insert the "stop or not" decision into the code at the 
+                                                breakpoint, so you would only actually stop the process when the condition was true.  
+                                                Greg's idea was that if you had a conditional breakpoint set when you started the 
+                                                debug session, Xcode could rebuild and insert enough no-ops that we could instrument 
+                                                the breakpoint site and call the conditional expression, and only trap if the conditional was true.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Broaden the idea in "target stop-hook" to cover more events in the debugger
+
+                                              <p>
+                                                Shared library loads, command execution, user-directed memory/register reads and writes 
+                                                are all places where you would reasonably want to hook into the debugger.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Mock classes for testing
+                                              <p>
+                                                We need "ProcessMock" and "ObjectFileMock" and the like.  These would be real
+                                                plugin implementations for their underlying lldb classes, with the addition 
+                                                that you can prime them from some sort of text based input files.  For classes 
+                                                that manage changes over time (like process) you would need to program the state
+                                                at StopPoint 0, StopPoint 1, etc.  These could then be used for testing reactions 
+                                                to complex threading 
+                                                problems &amp; the like, and also for simulating hard-to-test environments (like bare 
+                                                board debugging).  
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              A Bug-Trapper infrastructure
+                                              <p>
+                                                We very often have bugs that can't be reproduced locally.  So having a bug-report-trapper 
+                                                that can gather enough information from the 
+                                                surroundings of a bug so that we can replay the session locally would be a big help
+                                                in tracking down issues in this situation.  This is tricky because you 
+                                                can't necessarily require folks to leak information about their code in order to 
+                                                file bug reports.  So not only will you have to figure out what state to gather, 
+                                                you're also going to have to anonymize it somehow.  But we very often have bugs 
+                                                from people that can't reduce the problem to a simple test case and can't give us 
+                                                their code, and we often just can't help them as things stand now.  Note that adding
+                                                the ProcessMock would be a good first stage towards this, since you could make a ProcessMock 
+                                                creator/serializer from the current lldb state.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Expression parser needs syntax for "{symbol,type} A in CU B.cpp" etc.
+
+                                              <p>
+                                                Sometimes you need to specify non-visible or ambiguous types to the expression parser.  
+                                                We were planning to do $b_dot_cpp$A or something like that.  You might want to specify a
+                                                static in a function, in a source file, or in a shared library.  So the syntax should
+                                                support all these.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Add a "testButDontAbort" style test to the UnitTest framework.
+
+                                              <p>
+                                                The way we use unittest now (maybe this is the only way it can work, I don't know) 
+                                                you can't report a real failure and continue with the test.  That is appropriate 
+                                                in some cases: if I'm supposed to hit breakpoint A before I evaluate an expression, 
+                                                and don't hit breakpoint A, the test should fail.  But it means that if I want to 
+                                                test five different expressions, I can do it all in one test, which is good 
+                                                because it means I only have to fire up one process, attach to it, and get it to 
+                                                a certain point.  But it also means if the first check fails, the other four don't 
+                                                even get run.  So though at first we wrote a bunch of tests like this, as time went 
+                                                on we switched more to writing "one at a time" tests because they were more robust 
+                                                against a single failure.  That makes the test suite run much more slowly.  It 
+                                                would be great to add a "test_but_dont_abort" variant of the tests, then we could 
+                                                gang tests that all drive to the same place and do similar things.  As an added 
+                                                benefit, it would allow us to be more thorough in writing tests, since each 
+                                                test would have lower costs.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Unify Watchpoints &amp; Breakpoints.  
+                                              <p>
+                                                Option handling isn't shared, and more importantly the PerformAction's have a lot 
+                                                of duplicated common code, most of which works less well on the Watchpoint side.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Reverse debugging.  
+                                              <p>
+                                                This is kind of a holy grail; it's hard to support for complex apps (many 
+                                                threads, shared memory, etc.)  But it would be SO nice to have...
+                                              </p>
+                                            </li>
+                                            
+                                            <li>
+                                              Non-stop debugging.  
+                                              
+                                              <p>
+                                                By this I mean allowing some threads in the target program to run while stopping 
+                                                other threads.  This is supported in name in lldb at present, but lldb makes the 
+                                                assumption "If I get a stop, I won't get another stop unless I actually run the 
+                                                program." in a bunch of places, so getting it to work reliably will be a 
+                                                good bit of work.  And figuring out how to present this in the UI will also be tricky.
+                                              </p>
+                                            </li>
+
+                                            <li>
+                                              Fix and continue.  
+                                              <p>
+                                                We did this in gdb without a real JIT.  The implementation shouldn't be that hard,
+                                                especially if you can build the executable for fix and continue.  The tricky part is
+                                                how to verify that the user can only do the kinds of fixes that are safe to do.  
+                                                No changing object sizes is easy to detect, but there were many more subtle changes 
+                                                (function you are fixing is on the stack...) that take more work to prevent.  
+                                                And then you have to explain these conditions to the user in some helpful way.
+                                              </p>
+                                            </li>
+                                          </ol>
+                </div>
+          	<div class="postfooter"></div>
+                </div>
+      	</div>
+	</div>
+</div>
+</body>
+</html>
diff --git a/lldb/www/sidebar.incl b/lldb/www/sidebar.incl
index 17beaaddfec..184a022790e 100644
--- a/lldb/www/sidebar.incl
+++ b/lldb/www/sidebar.incl
@@ -8,6 +8,7 @@
       <li><a href="/goals.html">Goals</a></li>
       <li><a href="/features.html">Features</a></li>
       <li><a href="/status.html">Status</a></li>
+      <li><a href="/projects.html">Projects</a></li>
     </ul>
   </div>
   <div class="menu">