-rw-r--r--llvm/docs/Proposals/GitHubMove.rst868
-rw-r--r--llvm/docs/Proposals/GitHubSubMod.rst273
-rw-r--r--llvm/docs/index.rst4
3 files changed, 870 insertions, 275 deletions
diff --git a/llvm/docs/Proposals/GitHubMove.rst b/llvm/docs/Proposals/GitHubMove.rst
new file mode 100644
index 00000000000..f3cf749313a
--- /dev/null
+++ b/llvm/docs/Proposals/GitHubMove.rst
@@ -0,0 +1,868 @@
+==============================
+Moving LLVM Projects to GitHub
+==============================
+
+.. contents:: Table of Contents
+ :depth: 4
+ :local:
+
+Introduction
+============
+
+This is a proposal to move our current revision control system from our own
+hosted Subversion to GitHub. Below are the financial and technical arguments as
+to why we are proposing such a move and how people (and validation
+infrastructure) will continue to work with a Git-based LLVM.
+
+There will be a survey pointing at this document which we'll use to gauge the
+community's reaction and, if we collectively decide to move, the time-frame. Be
+sure to make your view count.
+
+Additionally, we will discuss this during a BoF at the next US LLVM Developer
+meeting (http://llvm.org/devmtg/2016-11/).
+
+What This Proposal is *Not* About
+=================================
+
+Changing the development policy.
+
+This proposal relates only to moving the hosting of our source-code repository
+from SVN hosted on our own servers to Git hosted on GitHub. We are not proposing
+using GitHub's issue tracker, pull-requests, or code-review.
+
+Contributors will continue to earn commit access on demand under the Developer
+Policy, except that a GitHub account will be required instead of SVN
+username/password-hash.
+
+Why Git, and Why GitHub?
+========================
+
+Why Move At All?
+----------------
+
+This discussion began because we currently host our own Subversion server
+and Git mirror on a voluntary basis. The LLVM Foundation sponsors the server and
+provides limited support, but there is only so much it can do.
+
+Volunteers are not sysadmins themselves, but compiler engineers that happen
+to know a thing or two about hosting servers. We also don't have 24/7 support,
+and we sometimes wake up to see that continuous integration is broken because
+the SVN server is either down or unresponsive.
+
+We should take advantage of one of the services out there (GitHub, GitLab,
+and BitBucket, among others) that offer better service (24/7 stability, disk
+space, Git server, code browsing, forking facilities, etc) for free.
+
+Why Git?
+--------
+
+Many new coders nowadays start with Git, and a lot of people have never used
+SVN, CVS, or anything else. Websites like GitHub have changed the landscape
+of open source contributions, reducing the cost of first contribution and
+fostering collaboration.
+
+Git is also the version control many LLVM developers use. Despite the
+sources being stored in a SVN server, these developers are already using Git
+through the Git-SVN integration.
+
+Git allows you to:
+
+* Commit, squash, merge, and fork locally without touching the remote server.
+* Maintain local branches, enabling multiple threads of development.
+* Collaborate on these branches (e.g. through your own fork of llvm on GitHub).
+* Inspect the repository history (blame, log, bisect) without Internet access.
+* Maintain remote forks and branches on Git hosting services and
+ integrate back to the main repository.
+
+In addition, because Git seems to be replacing many OSS projects' version
+control systems, there are many tools that are built over Git.
+Future tooling may support Git first (if not only).
+
+Why GitHub?
+-----------
+
+GitHub, like GitLab and BitBucket, provides free code hosting for open source
+projects. Any of these could replace the code-hosting infrastructure that we
+have today.
+
+These services also have a dedicated team to monitor, migrate, improve and
+distribute the contents of the repositories depending on region and load.
+
+GitHub has one important advantage over GitLab and
+BitBucket: it offers read-write **SVN** access to the repository
+(https://github.com/blog/626-announcing-svn-support).
+This would enable people to continue working post-migration as though our code
+were still canonically in an SVN repository.
+
+In addition, there are already multiple LLVM mirrors on GitHub, indicating that
+part of our community has already settled there.
+
+On Managing Revision Numbers with Git
+-------------------------------------
+
+The current SVN repository hosts all the LLVM sub-projects alongside each other.
+A single revision number (e.g. r123456) thus identifies a consistent version of
+all LLVM sub-projects.
+
+Git does not use a sequential integer revision number but instead uses a hash to
+identify each commit. (Linus mentioned that the lack of such a revision number
+is "the only real design mistake" in Git [TorvaldRevNum]_.)
+
+The loss of a sequential integer revision number has been a sticking point in
+past discussions about Git:
+
+- "The 'branch' I most care about is mainline, and losing the ability to say
+ 'fixed in r1234' (with some sort of monotonically increasing number) would
+ be a tragic loss." [LattnerRevNum]_
+- "I like those results sorted by time and the chronology should be obvious, but
+ timestamps are incredibly cumbersome and make it difficult to verify that a
+ given checkout matches a given set of results." [TrickRevNum]_
+- "There is still the major regression with unreadable version numbers.
+ Given the amount of Bugzilla traffic with 'Fixed in...', that's a
+ non-trivial issue." [JSonnRevNum]_
+- "Sequential IDs are important for LNT and llvmlab bisection tool." [MatthewsRevNum]_.
+
+However, Git can emulate this increasing revision number:
+`git rev-list --count <commit-hash>`. This identifier is unique only within a
+single branch, but this means the tuple `(num, branch-name)` uniquely identifies
+a commit.
+
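+As a minimal sketch of the emulation described above, the following script
+builds a throwaway repository (the three empty commits and the identity passed
+with `-c` are assumptions made only for illustration) and derives a
+`<branch>-<count>` identifier from it:
+
```shell
#!/bin/sh
# Sketch only: derive a "<branch>-<count>" identifier in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Name the (still unborn) branch explicitly so the result does not depend on
# the local init.defaultBranch setting.
git symbolic-ref HEAD refs/heads/master

# Three empty commits stand in for real history.
for i in 1 2 3; do
  git -c user.name=Example -c user.email=example@example.invalid \
      commit -q --allow-empty -m "commit $i"
done

branch=$(git rev-parse --abbrev-ref HEAD)
count=$(git rev-list --count HEAD)   # number of commits reachable from HEAD
echo "$branch-$count"
```
+
+The count is only meaningful together with the branch name, which is why the
+sketch prints both.
+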
+We can thus use this revision number to ensure that e.g. `clang -v` reports a
+user-friendly revision number (e.g. `master-12345` or `4.0-5321`), addressing
+the objections raised above with respect to this aspect of Git.
+
+What About Branches and Merges?
+-------------------------------
+
+In contrast to SVN, Git makes branching easy. Git's commit history is
+represented as a DAG, a departure from SVN's linear history. However, we propose
+to mandate making merge commits illegal in our canonical Git repository.
+
+Unfortunately, GitHub does not support server side hooks to enforce such a
+policy. We must rely on the community to avoid pushing merge commits.
+
+GitHub offers a feature called `Status Checks`: a branch protected by
+`status checks` requires commits to be whitelisted before the push can happen.
+We could supply a pre-push hook on the client side that would run and check the
+history, before whitelisting the commit being pushed [statuschecks]_.
+However this solution would be somewhat fragile (how do you update a script
+installed on every developer machine?) and prevents SVN access to the
+repository.
+
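+The history check that such a client-side hook would run could look roughly
+like the sketch below. It is not a complete hook: the scratch repository, the
+`g` helper and the `remote_tip` variable (standing in for the commit the server
+already has) are assumptions for illustration.
+
```shell
#!/bin/sh
# Sketch only: build a scratch repository containing one merge commit, then
# run the kind of check a client-side pre-push hook might perform.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: run git with a fixed identity so that no global
# configuration is required.
g() { git -c user.name=Example -c user.email=example@example.invalid "$@"; }

g commit -q --allow-empty -m "base"
remote_tip=$(git rev-parse HEAD)   # stands in for what the server already has

git checkout -q -b topic
g commit -q --allow-empty -m "topic work"
git checkout -q master
g commit -q --allow-empty -m "mainline work"
g merge -q --no-ff -m "merge topic" topic

# Count the merge commits among the commits that would be pushed.
merges=$(git rev-list --merges --count "$remote_tip..HEAD")
if [ "$merges" -gt 0 ]; then
  verdict="rejected"
else
  verdict="accepted"
fi
echo "$verdict: $merges merge commit(s) in push range"
```
+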
+What About Commit Emails?
+-------------------------
+
+We will need a new bot to send emails for each commit. This proposal leaves the
+email format unchanged besides the commit URL.
+
+Straw Man Migration Plan
+========================
+
+Step #1 : Before The Move
+-------------------------
+
+1. Update docs to mention the move, so people are aware of what is going on.
+2. Set up a read-only version of the GitHub project, mirroring our current SVN
+ repository.
+3. Add the required bots to implement the commit emails, as well as the
+ umbrella repository update (if the multirepo is selected) or the read-only
+ Git views for the sub-projects (if the monorepo is selected).
+
+Step #2 : Git Move
+------------------
+
+4. Update the buildbots to pick up updates and commits from the GitHub
+ repository. Not all bots have to migrate at this point, but it'll help
+ provide infrastructure testing.
+5. Update Phabricator to pick up commits from the GitHub repository.
+6. LNT and llvmlab have to be updated: they rely on a unique, monotonically
+ increasing integer across branches [MatthewsRevNum]_.
+7. Instruct downstream integrators to pick up commits from the GitHub
+ repository.
+8. Review and prepare an update for the LLVM documentation.
+
+Until this point nothing has changed for developers; the steps so far just
+boil down to a lot of work for buildbot and other infrastructure
+owners.
+
+The migration will pause here until all dependencies have cleared, and all
+problems have been solved.
+
+Step #3: Write Access Move
+--------------------------
+
+9. Collect developers' GitHub account information, and add them to the project.
+10. Switch the SVN repository to read-only and allow pushes to the GitHub repository.
+11. Update the documentation.
+12. Mirror Git to SVN.
+
+Step #4 : Post Move
+-------------------
+
+13. Archive the SVN repository.
+14. Update links on the LLVM website pointing to viewvc/klaus/phab etc. to
+ point to GitHub instead.
+
+One or Multiple Repositories?
+=============================
+
+There are two major variants for how to structure our Git repository: The
+"multirepo" and the "monorepo".
+
+Multirepo Variant
+-----------------
+
+This variant recommends moving each LLVM sub-project to a separate Git
+repository. This mimics the existing official read-only Git repositories
+(e.g., http://llvm.org/git/compiler-rt.git), and creates new canonical
+repositories for each sub-project.
+
+This will allow the individual sub-projects to remain distinct: a
+developer interested only in compiler-rt can checkout only this repository,
+build it, and work in isolation of the other sub-projects.
+
+A key need is to be able to check out multiple projects (i.e. lldb+clang+llvm or
+clang+llvm+libcxx for example) at a specific revision.
+
+A tuple of revisions (one entry per repository) accurately describes the state
+across the sub-projects.
+For example, a given version of clang would be
+*<LLVM-12345, clang-5432, libcxx-123, etc.>*.
+
+Umbrella Repository
+^^^^^^^^^^^^^^^^^^^
+
+To make this more convenient, a separate *umbrella* repository will be
+provided. This repository will be used for the sole purpose of understanding
+the sequence in which commits were pushed to the different repositories and to
+provide a single revision number.
+
+This umbrella repository will be read-only and continuously updated
+to record the above tuple. The proposed form to record this is to use Git
+[submodules]_, possibly along with a set of scripts to help check out a
+specific revision of the LLVM distribution.
+
+A regular LLVM developer does not need to interact with the umbrella repository
+-- the individual repositories can be checked out independently -- but you would
+need to use the umbrella repository to bisect multiple sub-projects at the same
+time, or to check-out old revisions of LLVM with another sub-project at a
+consistent state.
+
+This umbrella repository will be updated automatically by a bot (running on
+notice from a webhook on every push, and periodically) on a per commit basis: a
+single commit in the umbrella repository would match a single commit in a
+sub-project.
+
+Living Downstream
+^^^^^^^^^^^^^^^^^
+
+Downstream SVN users can use the read/write SVN bridges with the following
+caveats:
+
+ * Be prepared for a one-time change to the upstream revision numbers.
+ * The upstream sub-project revision numbers will no longer be in sync.
+
+Downstream Git users can continue without any major changes, with the minor
+change of upstreaming using `git push` instead of `git svn dcommit`.
+
+Git users also have the option of adopting an umbrella repository downstream.
+The tooling for the upstream umbrella can easily be reused for downstream needs,
+incorporating extra sub-projects and branching in parallel with sub-project
+branches.
+
+Multirepo Preview
+^^^^^^^^^^^^^^^^^
+
+As a preview (disclaimer: this is a rough prototype, not polished and not
+representative of the final solution), you can look at the following:
+
+ * Repository: https://github.com/llvm-beanz/llvm-submodules
+ * Update bot: http://beanz-bot.com:8180/jenkins/job/submodule-update/
+
+Concerns
+^^^^^^^^
+
+ * Because GitHub does not allow server-side hooks, and because there is no
+ "push timestamp" in Git, the umbrella repository sequence isn't totally
+ exact: commits from different repositories pushed around the same time can
+ appear in different orders. However, we don't expect it to be the common case
+ or to cause serious issues in practice.
+ * You can't have a single cross-projects commit that would update both LLVM and
+ other sub-projects (something that can be achieved now). It would be possible
+ to establish a protocol whereby users add a special token to their commit
+ messages that causes the umbrella repo's updater bot to group all of them
+ into a single revision.
+ * Another option is to group commits that were pushed closely enough together
+ in the umbrella repository. This has the advantage of allowing cross-project
+ commits, and is less sensitive to mis-ordering commits. However, this has the
+ potential to group unrelated commits together, especially if the bot goes
+ down and needs to catch up.
+ * This variant relies on heavier tooling. But the current prototype shows that
+ it is not out-of-reach.
+ * Submodules don't have a good reputation / are complicating the command line.
+ However, in the proposed setup, a regular developer will seldom interact with
+ submodules directly, and certainly never update them.
+ * Refactoring across projects is not friendly: taking some functions from clang
+ to make it part of a utility in libSupport wouldn't carry the history of the
+ code in the llvm repo, preventing recursively applying `git blame` for
+ instance. However, this is not very different from how most people are
+ interacting with the repository today, by splitting such changes into multiple
+ commits.
+
+Workflows
+^^^^^^^^^
+
+ * :ref:`Checkout/Clone a Single Project, without Commit Access <workflow-checkout-commit>`.
+ * :ref:`Checkout/Clone a Single Project, with Commit Access <workflow-multicheckout-nocommit>`.
+ * :ref:`Checkout/Clone Multiple Projects, with Commit Access <workflow-multicheckout-multicommit>`.
+ * :ref:`Commit an API Change in LLVM and Update the Sub-projects <workflow-cross-repo-commit>`.
+ * :ref:`Branching/Stashing/Updating for Local Development or Experiments <workflow-multi-branching>`.
+ * :ref:`Bisecting <workflow-multi-bisecting>`.
+
+Monorepo Variant
+----------------
+
+This variant recommends moving all LLVM sub-projects to a single Git repository,
+similar to https://github.com/llvm-project/llvm-project.
+This would mimic an export of the current SVN repository, with each sub-project
+having its own top-level directory.
+Not all sub-projects are used for building toolchains. In practice, www/
+and test-suite/ will probably stay out of the monorepo.
+
+Putting all sub-projects in a single checkout makes cross-project refactoring
+naturally simple:
+
+ * New sub-projects can be trivially split out for better reuse and/or layering
+ (e.g., to allow libSupport and/or LIT to be used by runtimes without adding a
+ dependency on LLVM).
+ * Changing an API in LLVM and upgrading the sub-projects will always be done in
+ a single commit, designing away a common source of temporary build breakage.
+ * Moving code across sub-project (during refactoring for instance) in a single
+ commit enables accurate `git blame` when tracking code change history.
+ * Tooling based on `git grep` works natively across sub-projects, making it
+ easier to find refactoring opportunities across projects (for example reusing a
+ data structure initially in LLDB by moving it into libSupport).
+ * Having all the sources present encourages maintaining the other sub-projects
+ when changing API.
+
+Finally, the monorepo maintains the property of the existing SVN repository that
+the sub-projects move synchronously, and a single revision number (or commit
+hash) identifies the state of the development across all projects.
+
+.. _build_single_project:
+
+Building a single sub-project
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Nobody will be forced to build unnecessary projects. The exact structure
+is TBD, but making it trivial to configure builds for a single sub-project
+(or a subset of sub-projects) is a hard requirement.
+
+As an example, it could look like the following::
+
+ mkdir build && cd build
+ # Configure only LLVM (default)
+ cmake path/to/monorepo
+ # Configure LLVM and lld
+ cmake path/to/monorepo -DLLVM_ENABLE_PROJECTS=lld
+ # Configure LLVM and clang
+ cmake path/to/monorepo -DLLVM_ENABLE_PROJECTS=clang
+
+.. _git-svn-mirror:
+
+Read/write sub-project mirrors
+------------------------------
+
+With the Monorepo, the existing single-subproject mirrors (e.g.
+http://llvm.org/git/compiler-rt.git) with git-svn read-write access would
+continue to be maintained: developers would continue to be able to use the
+existing single-subproject git repositories as they do today, with *no changes
+to workflow*. Everything (git fetch, git svn dcommit, etc.) could continue to
+work identically to how it works today. The monorepo can be set-up such that the
+SVN revision number matches the SVN revision in the GitHub SVN-bridge.
+
+Living Downstream
+^^^^^^^^^^^^^^^^^
+
+Downstream SVN users can use the read/write SVN bridge. The SVN revision
+number can be preserved in the monorepo, minimizing the impact.
+
+Downstream Git users can continue without any major changes, by using the
+git-svn mirrors on top of the SVN bridge.
+
+Git users can also work upstream with monorepo even if their downstream
+fork has split repositories. They can apply patches in the appropriate
+subdirectories of the monorepo using, e.g., `git am --directory=...`, or
+plain `diff` and `patch`.
+
+Alternatively, Git users can migrate their own fork to the monorepo. As a
+demonstration, we've migrated the "CHERI" fork to the monorepo in two ways:
+
+ * Using a script that rewrites history (including merges) so that it looks
+ like the fork always lived in the monorepo [LebarCHERI]_. The upside of
+ this is when you check out an old revision, you get a copy of all llvm
+ sub-projects at a consistent revision. (For instance, if it's a clang
+ fork, when you check out an old revision you'll get a consistent version
+ of llvm proper.) The downside is that this changes the fork's commit
+ hashes.
+
+ * Merging the fork into the monorepo [AminiCHERI]_. This preserves the
+ fork's commit hashes, but when you check out an old commit you only get
+ the one sub-project.
+
+Monorepo Preview
+^^^^^^^^^^^^^^^^^
+
+As a preview (disclaimer: this is a rough prototype, not polished and not
+representative of the final solution), you can look at the following:
+
+ * Full Repository: https://github.com/joker-eph/llvm-project
+ * Single sub-project view with *SVN write access* to the full repo:
+ https://github.com/joker-eph/compiler-rt
+
+Concerns
+^^^^^^^^
+
+ * Using the monolithic repository may add overhead for those contributing to a
+ standalone sub-project, particularly on runtimes like libcxx and compiler-rt
+ that don't rely on LLVM; currently, a fresh clone of libcxx is only 15MB (vs.
+ 1GB for the monorepo), and the commit rate of LLVM may cause more frequent
+ `git push` collisions when upstreaming. Affected contributors can continue to
+ use the SVN bridge or the single-subproject Git mirrors with git-svn for
+ read-write.
+ * Using the monolithic repository may add overhead for those *integrating* a
+ standalone sub-project, even if they aren't contributing to it, due to the
+ same disk space concern as the point above. The availability of the
+ sub-project Git mirror addresses this, even without SVN access.
+ * Preservation of the existing read/write SVN-based workflows relies on the
+ GitHub SVN bridge, which is an extra dependency. Maintaining this locks us
+ into GitHub and could restrict future workflow changes.
+
+Workflows
+^^^^^^^^^
+
+ * :ref:`Checkout/Clone a Single Project, without Commit Access <workflow-checkout-commit>`.
+ * :ref:`Checkout/Clone a Single Project, with Commit Access <workflow-monocheckout-nocommit>`.
+ * :ref:`Checkout/Clone Multiple Projects, with Commit Access <workflow-monocheckout-multicommit>`.
+ * :ref:`Commit an API Change in LLVM and Update the Sub-projects <workflow-cross-repo-commit>`.
+ * :ref:`Branching/Stashing/Updating for Local Development or Experiments <workflow-mono-branching>`.
+ * :ref:`Bisecting <workflow-mono-bisecting>`.
+
+Multi/Mono Hybrid Variant
+-------------------------
+
+This variant recommends moving only the LLVM sub-projects that are *rev-locked*
+to LLVM into a monorepo (clang, lld, lldb, ...), following the multirepo
+proposal for the rest. While neither variant recommends combining sub-projects
+like www/ and test-suite/ (which are completely standalone), this goes further
+and keeps sub-projects like libcxx and compiler-rt in their own distinct
+repositories.
+
+Concerns
+^^^^^^^^
+
+ * This has most disadvantages of multirepo and monorepo, without bringing many
+ of the advantages.
+ * Downstream users have to upgrade to the monorepo structure, but only partially. So
+ they will keep the infrastructure to integrate the other separate
+ sub-projects.
+ * All projects that use LIT for testing are effectively rev-locked to LLVM.
+ Furthermore, some runtimes (like compiler-rt) are rev-locked with Clang.
+ It's not clear where to draw the lines.
+
+
+Workflow Before/After
+=====================
+
+This section goes through a few examples of workflows, intended to illustrate
+how end-users or developers would interact with the repository for
+various use-cases.
+
+.. _workflow-checkout-commit:
+
+Checkout/Clone a Single Project, without Commit Access
+------------------------------------------------------
+
+Except the URL, nothing changes. The possibilities today are::
+
+ svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
+ # or with Git
+ git clone http://llvm.org/git/llvm.git
+
+After the move to GitHub, you would do either::
+
+ git clone https://github.com/llvm-project/llvm.git
+ # or using the GitHub svn native bridge
+ svn co https://github.com/llvm-project/llvm/trunk
+
+The above works for both the monorepo and the multirepo, as we'll maintain the
+existing read-only views of the individual sub-projects.
+
+Checkout/Clone a Single Project, with Commit Access
+---------------------------------------------------
+
+Currently
+^^^^^^^^^
+
+::
+
+ # direct SVN checkout
+ svn co https://user@llvm.org/svn/llvm-project/llvm/trunk llvm
+ # or using the read-only Git view, with git-svn
+ git clone http://llvm.org/git/llvm.git
+ cd llvm
+ git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
+ git config svn-remote.svn.fetch :refs/remotes/origin/master
+ git svn rebase -l # -l avoids fetching ahead of the git mirror.
+
+Commits are performed using `svn commit` or with the sequence `git commit` and
+`git svn dcommit`.
+
+.. _workflow-multicheckout-nocommit:
+
+Multirepo Variant
+^^^^^^^^^^^^^^^^^
+
+With the multirepo variant, nothing changes but the URL, and commits can be
+performed using `svn commit` or `git commit` and `git push`::
+
+ git clone https://github.com/llvm/llvm.git llvm
+ # or using the GitHub svn native bridge
+ svn co https://github.com/llvm/llvm/trunk/ llvm
+
+.. _workflow-monocheckout-nocommit:
+
+Monorepo Variant
+^^^^^^^^^^^^^^^^
+
+With the monorepo variant, there are a few options, depending on your
+constraints. First, you could just clone the full repository::
+
+ git clone https://github.com/llvm/llvm-projects.git llvm
+ # or using the GitHub svn native bridge
+ svn co https://github.com/llvm/llvm-projects/trunk/ llvm
+
+At this point you have every sub-project (llvm, clang, lld, lldb, ...), which
+:ref:`doesn't imply you have to build all of them <build_single_project>`. You
+can still build only compiler-rt for instance. In this way it's not different
+from someone who would check out all the projects with SVN today.
+
+You can commit as normal using `git commit` and `git push` or `svn commit`, and
+read the history for a single project (`git log libcxx` for example).
+
+Secondly, there are a few options to avoid checking out all the sources.
+
+**Using the GitHub SVN bridge**
+
+The GitHub SVN native bridge allows checking out a subdirectory directly::
+
+ svn co https://github.com/llvm/llvm-projects/trunk/compiler-rt compiler-rt --username=...
+
+This checks out only compiler-rt and provides commit access using "svn commit",
+in the same way as it would do today.
+
+**Using a Subproject Git Mirror**
+
+You can use *git-svn* and one of the sub-project mirrors::
+
+ # Clone from the single read-only Git repo
+ git clone http://llvm.org/git/llvm.git
+ cd llvm
+ # Configure the SVN remote and initialize the svn metadata
+ git svn init https://github.com/joker-eph/llvm-project/trunk/llvm --username=...
+ git config svn-remote.svn.fetch :refs/remotes/origin/master
+ git svn rebase -l
+
+In this case the repository contains only a single sub-project, and commits can
+be made using `git svn dcommit`, again exactly as we do today.
+
+**Using a Sparse Checkout**
+
+You can hide the other directories using a Git sparse checkout::
+
+ git config core.sparseCheckout true
+ echo /compiler-rt > .git/info/sparse-checkout
+ git read-tree -mu HEAD
+
+The data for all sub-projects is still in your `.git` directory, but in your
+checkout, you only see `compiler-rt`.
+Before you push, you'll need to fetch and rebase (`git pull --rebase`) as
+usual.
+
+Note that when you fetch you'll likely pull in changes to sub-projects you don't
+care about. If you are using sparse checkout, the files from other projects
+won't appear on your disk. The only effect is that your commit hash changes.
+
+You can check whether the changes in the last fetch are relevant to your commit
+by running::
+
+ git log origin/master@{1}..origin/master -- libcxx
+
+This command can be hidden in a script so that `git llvmpush` would perform all
+these steps, fail only if such a dependent change exists, and show immediately
+the change that prevented the push. An immediate repeat of the command would
+(almost) certainly result in a successful push.
+Note that today with SVN or git-svn, this step is not possible since the
+"rebase" implicitly happens while committing (unless a conflict occurs).
+
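+The core of the check that a hypothetical `git llvmpush` wrapper might perform
+can be sketched as follows. The scratch repository, the `g` helper and the
+`old_tip`/`new_tip` variables (standing in for `origin/master@{1}` and
+`origin/master`) are assumptions for illustration; no real remote is involved.
+
```shell
#!/bin/sh
# Sketch only: decide whether freshly fetched commits touch the sub-project
# you are working on, using a scratch repository instead of a real remote.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: run git with a fixed identity.
g() { git -c user.name=Example -c user.email=example@example.invalid "$@"; }

mkdir llvm libcxx
echo a > llvm/file
echo a > libcxx/file
g add .
g commit -q -m "initial import"
old_tip=$(git rev-parse HEAD)   # stands in for origin/master@{1}

echo b > llvm/file
g commit -q -am "llvm-only change"
new_tip=$(git rev-parse HEAD)   # stands in for origin/master

# Count the fetched commits that modify anything under libcxx/.
relevant=$(git rev-list --count "$old_tip..$new_tip" -- libcxx)
if [ "$relevant" -eq 0 ]; then
  echo "no libcxx changes fetched; retrying the push should be safe"
else
  echo "libcxx changed upstream; review before pushing:"
  git log --oneline "$old_tip..$new_tip" -- libcxx
fi
```
+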
+Checkout/Clone Multiple Projects, with Commit Access
+----------------------------------------------------
+
+Let's look at how to assemble llvm+clang+libcxx at a given revision.
+
+Currently
+^^^^^^^^^
+
+::
+
+ svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm -r $REVISION
+ cd llvm/tools
+ svn co http://llvm.org/svn/llvm-project/clang/trunk clang -r $REVISION
+ cd ../projects
+ svn co http://llvm.org/svn/llvm-project/libcxx/trunk libcxx -r $REVISION
+
+Or using git-svn::
+
+ git clone http://llvm.org/git/llvm.git
+ cd llvm/
+ git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
+ git config svn-remote.svn.fetch :refs/remotes/origin/master
+ git svn rebase -l
+ git checkout `git svn find-rev -B r258109`
+ cd tools
+ git clone http://llvm.org/git/clang.git
+ cd clang/
+ git svn init https://llvm.org/svn/llvm-project/clang/trunk --username=<username>
+ git config svn-remote.svn.fetch :refs/remotes/origin/master
+ git svn rebase -l
+ git checkout `git svn find-rev -B r258109`
+ cd ../../projects/
+ git clone http://llvm.org/git/libcxx.git
+ cd libcxx
+ git svn init https://llvm.org/svn/llvm-project/libcxx/trunk --username=<username>
+ git config svn-remote.svn.fetch :refs/remotes/origin/master
+ git svn rebase -l
+ git checkout `git svn find-rev -B r258109`
+
+Note that the list would be longer with more sub-projects.
+
+.. _workflow-multicheckout-multicommit:
+
+Multirepo Variant
+^^^^^^^^^^^^^^^^^
+
+With the multirepo variant, the umbrella repository will be used. This is
+where the mapping from a single revision number to the individual repositories'
+revisions is stored::
+
+ git clone https://github.com/llvm-beanz/llvm-submodules
+ cd llvm-submodules
+ git checkout $REVISION
+ git submodule init
+ git submodule update clang llvm libcxx
+ # the list of sub-projects is optional, `git submodule update` would get them all.
+
+At this point the clang, llvm, and libcxx individual repositories are cloned
+and stored alongside each other. There are CMake flags to describe the directory
+structure; alternatively, you can just symlink `clang` to `llvm/tools/clang`,
+etc.
+
+Another option is to checkout repositories based on the commit timestamp::
+
+ git checkout `git rev-list -n 1 --before="2009-07-27 13:37" master`
+
+.. _workflow-monocheckout-multicommit:
+
+Monorepo Variant
+^^^^^^^^^^^^^^^^
+
+The repository natively contains the source for every sub-project at the right
+revision, which makes this straightforward::
+
+ git clone https://github.com/llvm/llvm-projects.git llvm-projects
+ cd llvm-projects
+ git checkout $REVISION
+
+As before, at this point clang, llvm, and libcxx are stored in directories
+alongside each other.
+
+.. _workflow-cross-repo-commit:
+
+Commit an API Change in LLVM and Update the Sub-projects
+--------------------------------------------------------
+
+Today this is possible, even though not common (at least not documented) for
+subversion users and for git-svn users. For example, few Git users try to update
+LLD or Clang in the same commit as they change an LLVM API.
+
+The multirepo variant does not address this: one would have to commit and push
+separately in every individual repository. It would be possible to establish a
+protocol whereby users add a special token to their commit messages that causes
+the umbrella repo's updater bot to group all of them into a single revision.
+
+The monorepo variant handles this natively.
+
+Branching/Stashing/Updating for Local Development or Experiments
+----------------------------------------------------------------
+
+Currently
+^^^^^^^^^
+
+SVN does not allow this use case, but developers that are currently using
+git-svn can do it. Let's look at what this means in practice when dealing with
+multiple sub-projects.
+
+To update the repository to tip of trunk::
+
+ git pull
+ cd tools/clang
+ git pull
+ cd ../../projects/libcxx
+ git pull
+
+To create a new branch::
+
+ git checkout -b MyBranch
+ cd tools/clang
+ git checkout -b MyBranch
+ cd ../../projects/libcxx
+ git checkout -b MyBranch
+
+To switch branches::
+
+ git checkout AnotherBranch
+ cd tools/clang
+ git checkout AnotherBranch
+ cd ../../projects/libcxx
+ git checkout AnotherBranch
+
+.. _workflow-multi-branching:
+
+Multirepo Variant
+^^^^^^^^^^^^^^^^^
+
+The multirepo works the same as the current Git workflow: every command needs
+to be applied to each of the individual repositories.
+However, the umbrella repository makes this easy using `git submodule foreach`
+to replicate a command on all the individual repositories (or submodules
+in this case):
+
+To create a new branch::
+
+ git submodule foreach git checkout -b MyBranch
+
+To switch branches::
+
+ git submodule foreach git checkout AnotherBranch
+
+.. _workflow-mono-branching:
+
+Monorepo Variant
+^^^^^^^^^^^^^^^^
+
+Regular Git commands are sufficient, because everything is in a single
+repository:
+
+To update the repository to tip of trunk::
+
+ git pull
+
+To create a new branch::
+
+ git checkout -b MyBranch
+
+To switch branches::
+
+ git checkout AnotherBranch
+
+Bisecting
+---------
+
+Assume a developer is looking for a bug in clang (or lld, or lldb, ...).
+
+Currently
+^^^^^^^^^
+
+SVN does not have builtin bisection support, but the single revision across
+sub-projects makes it possible to script around.
+
+Using the existing Git read-only view of the repositories, it is possible to use
+the native Git bisection script over the llvm repository, and use some scripting
+to synchronize the clang repository to match the llvm revision.
+
+.. _workflow-multi-bisecting:
+
+Multirepo Variant
+^^^^^^^^^^^^^^^^^
+
+With the multi-repositories variant, the cross-repository synchronization is
+achieved using the umbrella repository. This repository contains only
+submodules for the other sub-projects. The native Git bisection can be used on
+the umbrella repository directly. A subtlety is that the bisect script itself
+needs to make sure the submodules are updated accordingly.
+
+For example, to find which commit introduces a regression where clang-3.9
+crashes but clang-3.8 passes, one should be able to simply do::
+
+ git bisect start release_39 release_38
+ git bisect run ./bisect_script.sh
+
+With the `bisect_script.sh` script being::
+
+ #!/bin/sh
+ cd "$UMBRELLA_DIRECTORY"
+ git submodule update llvm clang libcxx #....
+ cd "$BUILD_DIR"
+
+ ninja clang || exit 125 # an exit code of 125 asks "git bisect"
+ # to "skip" the current commit
+
+ ./bin/clang some_crash_test.cpp
+
+When the `git bisect run` command returns, the umbrella repository is set to
+the state where the regression is introduced. The commit diff in the umbrella
+indicates which submodule was updated, and the last commit in this sub-project
+is the one that the bisect found.
+
+.. _workflow-mono-bisecting:
+
+Monorepo Variant
+^^^^^^^^^^^^^^^^
+
+Bisecting on the monorepo is straightforward, and very similar to the above,
+except that the bisection script does not need to include the
+`git submodule update` step.
+
+The same example, finding which commit introduces a regression where clang-3.9
+crashes but clang-3.8 passes, will look like::
+
+ git bisect start release_39 release_38
+ git bisect run ./bisect_script.sh
+
+With the `bisect_script.sh` script being::
+
+ #!/bin/sh
+ cd "$BUILD_DIR"
+
+ ninja clang || exit 125 # an exit code of 125 asks "git bisect"
+ # to "skip" the current commit
+
+ ./bin/clang some_crash_test.cpp
+
+Also, since the monorepo handles commits that update multiple projects, you're
+less likely to encounter a build failure where a commit changes an API in LLVM
+and another later one "fixes" the build in clang.
+
+
+References
+==========
+
+.. [LattnerRevNum] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041739.html
+.. [TrickRevNum] Andrew Trick, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041721.html
+.. [JSonnRevNum] Joerg Sonnenberger, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041688.html
+.. [TorvaldRevNum] Linus Torvalds, http://git.661346.n2.nabble.com/Git-commit-generation-numbers-td6584414.html
+.. [MatthewsRevNum] Chris Matthews, http://lists.llvm.org/pipermail/cfe-dev/2016-July/049886.html
+.. [submodules] Git submodules, https://git-scm.com/book/en/v2/Git-Tools-Submodules
+.. [statuschecks] GitHub status-checks, https://help.github.com/articles/about-required-status-checks/
+.. [LebarCHERI] Port *CHERI* to a single repository rewriting history, http://lists.llvm.org/pipermail/llvm-dev/2016-July/102787.html
+.. [AminiCHERI] Port *CHERI* to a single repository preserving history, http://lists.llvm.org/pipermail/llvm-dev/2016-July/102804.html
diff --git a/llvm/docs/Proposals/GitHubSubMod.rst b/llvm/docs/Proposals/GitHubSubMod.rst
deleted file mode 100644
index 6b8cd2c24dc..00000000000
--- a/llvm/docs/Proposals/GitHubSubMod.rst
+++ /dev/null
@@ -1,273 +0,0 @@
-===============================================
-Moving LLVM Projects to GitHub with Sub-Modules
-===============================================
-
-Introduction
-============
-
-This is a proposal to move our current revision control system from our own
-hosted Subversion to GitHub. Below are the financial and technical arguments as
-to why we need such a move and how will people (and validation infrastructure)
-continue to work with a Git-based LLVM.
-
-There will be a survey pointing at this document when we'll know the community's
-reaction and, if we collectively decide to move, the time-frames. Be sure to make
-your views count.
-
-Essentially, the proposal is divided in the following parts:
-
-* Outline of the reasons to move to Git and GitHub
-* Description on what the work flow will look like (compared to SVN)
-* Remaining issues and potential problems
-* The proposed migration plan
-
-Why Git, and Why GitHub?
-========================
-
-Why move at all?
-----------------
-
-The strongest reason for the move, and why this discussion started in the first
-place, is that we currently host our own Subversion server and Git mirror in a
-voluntary basis. The LLVM Foundation sponsors the server and provides limited
-support, but there is only so much it can do.
-
-The volunteers are not Sysadmins themselves, but compiler engineers that happen
-to know a thing or two about hosting servers. We also don't have 24/7 support,
-and we sometimes wake up to see that continuous integration is broken because
-the SVN server is either down or unresponsive.
-
-With time and money, the foundation and volunteers could improve our services,
-implement more functionality and provide around the clock support, so that we
-can have a first class infrastructure with which to work. But the cost is not
-small, both in money and time invested.
-
-On the other hand, there are multiple services out there (GitHub, GitLab,
-BitBucket among others) that offer that same service (24/7 stability, disk space,
-Git server, code browsing, forking facilities, etc) for the very affordable price
-of *free*.
-
-Why Git?
---------
-
-Most new coders nowadays start with Git. A lot of them have never used SVN, CVS
-or anything else. Websites like GitHub have changed the landscape of open source
-contributions, reducing the cost of first contribution and fostering
-collaboration.
-
-Git is also the version control most LLVM developers use. Despite the sources
-being stored in an SVN server, most people develop using the Git-SVN integration,
-and that shows that Git is not only more powerful than SVN, but people have
-resorted to using a bridge because its features are now indispensable to their
-internal and external workflows.
-
-In essence, Git allows you to:
-
-* Commit, squash, merge, fork locally without any penalty to the server
-* Add as many branches as necessary to allow for multiple threads of development
-* Collaborate with peers directly, even without access to the Internet
-* Have multiple trees without multiplying disk space.
-
-In addition, because Git seems to be replacing every project's version control
-system, there are many more tools that can use Git's enhanced feature set, so
-new tooling is much more likely to support Git first (if not only), than any
-other version control system.
-
-Why GitHub?
------------
-
-GitHub, like GitLab and BitBucket, provide free code hosting for open source
-projects. Essentially, they will completely replace *all* the infrastructure that
-we have today that serves code repository, mirroring, user control, etc.
-
-They also have a dedicated team to monitor, migrate, improve and distribute the
-contents of the repositories depending on region and load. A level of quality
-that we'd never have without spending money that would be better spent elsewhere,
-for example development meetings, sponsoring disadvantaged people to work on
-compilers and foster diversity and equality in our community.
-
-GitHub has the added benefit that we already have a presence there. Many
-developers use it already, and the mirror from our current repository is already
-set up.
-
-Furthermore, GitHub has an *SVN view* (https://github.com/blog/626-announcing-svn-support)
-where people that still have/want to use SVN infrastructure and tooling can
-slowly migrate or even stay working as if it was an SVN repository (including
-read-write access).
-
-So, any of the three solutions solve the cost and maintenance problem, but GitHub
-has two additional features that would be beneficial to the migration plan as
-well as the community already settled there.
-
-
-What will the new workflow look like
-====================================
-
-In order to move version control, we need to make sure that we get all the
-benefits with the least amount of problems. That's why the migration plan will
-be slow, one step at a time, and we'll try to make it look as close as possible
-to the current style without impacting the new features we want.
-
-Each LLVM project will continue to be hosted as separate GitHub repository
-under a single GitHub organisation. Users can continue to choose to use either
-SVN or Git to access the repositories to suit their current workflow.
-
-In addition, we'll create a repository that will mimic our current *linear
-history* repository. The most accepted proposal, then, was to have an umbrella
-project that will contain *sub-modules* (https://git-scm.com/book/en/v2/Git-Tools-Submodules)
-of all the LLVM projects and nothing else.
-
-This repository can be checked out on its own, in order to have *all* LLVM
-projects in a single check-out, as many people have suggested, but it can also
-only hold the references to the other projects, and be used for the sole purpose
-of understanding the *sequence* in which commits were added by using the
-``git rev-list --count hash`` or ``git describe hash`` commands.
-
-One example of such a repository is Takumi's llvm-project-submodule
-(https://github.com/chapuni/llvm-project-submodule), which when checked out,
-will have the references to all sub-modules but not check them out, so one will
-need to *init* the module manually. This will allow the *exact* same behaviour
-as checking out individual SVN repositories, as it will keep the correct linear
-history.
-
-There is no need to additional tags, flags and properties, or external
-services controlling the history, since both SVN and *git rev-list* can already
-do that on their own.
-
-We will need additional server hooks to avoid non-fast-forwards commits (ex.
-merges, forced pushes, etc) in order to keep the linearity of the history.
-
-The three types hooks to be implemented are:
-
-* Status Checks: By placing status checks on a protected branch, we can guarantee
- that the history is kept linear and sane at all times, on all repositories.
- See: https://help.github.com/articles/about-required-status-checks/
-* Umbrella updates: By using GitHub web hooks, we can update a small web-service
- inside LLVM's own infrastructure to update the umbrella project remotely. The
- maintenance of this service will be lower than the current SVN maintenance and
- the scope of its failures will be less severe.
- See: https://developer.github.com/webhooks/
-* Commits email update: By adding an email web hook, we can make every push show
- in the lists, allowing us to retain history and do post-commit reviews.
- See: https://help.github.com/articles/managing-notifications-for-pushes-to-a-repository/
-
-Access will be transferred one-to-one to GitHub accounts for everyone that already
-has commit access to our current repository. Those who don't have accounts will
-have to create one in order to continue contributing to the project. In the
-future, people only need to provide their GitHub accounts to be granted access.
-
-In a nutshell:
-
-* The projects' repositories will remain identical, with a new address (GitHub).
-* They'll continue to have SVN access (Read-Write), but will also gain Git RW access.
-* The linear history can still be accessed in the (RO) submodule meta project.
-* Individual projects' history will be local (ie. not interlaced with the other
- projects, as the current SVN repos are), and we need the umbrella project
- (using submodules) to have the same view as we had in SVN.
-
-Additionally, each repository will have the following server hooks:
-
-* Pre-commit hooks to stop people from applying non-fast-forward merges
-* Webhook to update the umbrella project (via buildbot or web services)
-* Email hook to each commits list (llvm-commit, cfe-commit, etc)
-
-Essentially, we're adding Git RW access in addition to the already existing
-structure, with all the additional benefits of it being in GitHub.
-
-Example of a working version:
-
-* Repository: https://github.com/llvm-beanz/llvm-submodules
-* Update bot: http://beanz-bot.com:8180/jenkins/job/submodule-update/
-
-What will *not* be changed
---------------------------
-
-This is a change of version control system, not the whole infrastructure. There
-are plans to replace our current tools (review, bugs, documents), but they're
-all orthogonal to this proposal.
-
-We'll also be keeping the buildbots (and migrating them to use Git) as well as
-LNT, and any other system that currently provides value upstream.
-
-Any discussion regarding those tools are out of scope in this proposal.
-
-Remaining questions and problems
-================================
-
-1. How much the SVN view emulates and how much it'll break tools/CI?
-
-For this one, we'll need people that will have problems in that area to tell
-us what's wrong and how to help them fix it.
-
-We also recommend people and companies to migrate to Git, for its many other
-additional benefits.
-
-2. Which tools will need changing?
-
-LNT may break, since it relies on SVN's history. We can continue to
-use LNT with the SVN-View, but it would be best to move it to Git once and for
-all.
-
-The LLVMLab bisect tool will also be affected and will need adjusting. As with
-LNT, it should be fine to use GitHub's SVN view, but changing it to work on Git
-will be required in the long term.
-
-Phabricator will also need to change its configuration to point at the GitHub
-repositories, but since it already works with Git, this will be a trivial change.
-
-Migration Plan
-==============
-
-If we decide to move, we'll have to set a date for the process to begin.
-
-As usual, we should be announcing big changes in one release to happen in the
-next one. But since this won't impact external users (if they rely on our source
-release tarballs), we don't necessarily have to.
-
-We will have to make sure all the *problems* reported are solved before the
-final push. But we can start all non-binding processes (like mirroring to GitHub
-and testing the SVN interface in it) before any hard decision.
-
-Here's a proposed plan:
-
-STEP #1 : Pre Move
-
-0. Update docs to mention the move, so people are aware the it's going on.
-1. Register an official GitHub project with the LLVM foundation.
-2. Setup another (read-only) mirror of llvm.org/git at this GitHub project,
- adding all necessary hooks to avoid broken history (merge, dates, pushes), as
- well as a webhook to update the umbrella project (see below).
-3. Make sure we have an llvm-project (with submodules) setup in the official
- account, with all necessary hooks (history, update, merges).
-4. Make sure bisecting with llvm-project works.
-5. Make sure no one has any other blocker.
-
-STEP #2 : Git Move
-
-6. Update the buildbots to pick up updates and commits from the official git
- repository.
-7. Update Phabricator to pick up commits from the official git repository.
-8. Tell people living downstream to pick up commits from the official git
- repository.
-9. Give things time to settle. We could play some games like disabling the SVN
- repository for a few hours on purpose so that people can test that their
- infrastructure has really become independent of the SVN repository.
-
-Until this point nothing has changed for developers, it will just
-boil down to a lot of work for buildbot and other infrastructure
-owners.
-
-Once all dependencies are cleared, and all problems have been solved:
-
-STEP #3: Write Access Move
-
-10. Collect peoples GitHub account information, adding them to the project.
-11. Switch SVN repository to read-only and allow pushes to the GitHub repository.
-12. Mirror Git to SVN.
-
-STEP #4 : Post Move
-
-13. Archive the SVN repository, if GitHub's SVN is good enough.
-14. Review and update *all* LLVM documentation.
-15. Review website links pointing to viewvc/klaus/phab etc. to point to GitHub
- instead.
diff --git a/llvm/docs/index.rst b/llvm/docs/index.rst
index 36f5a979a41..e24d795946e 100644
--- a/llvm/docs/index.rst
+++ b/llvm/docs/index.rst
@@ -510,13 +510,13 @@ can be better.
:hidden:
CodeOfConduct
- Proposals/GitHubSubMod
+ Proposals/GitHubMove
:doc:`CodeOfConduct`
Proposal to adopt a code of conduct on the LLVM social spaces (lists, events,
IRC, etc).
-:doc:`Proposals/GitHubSubMod`
+:doc:`Proposals/GitHubMove`
Proposal to move from SVN/Git to GitHub.