9 files changed, 945 insertions, 0 deletions
diff --git a/llvm/docs/GlobalISel/GMIR.rst b/llvm/docs/GlobalISel/GMIR.rst
new file mode 100644
index 00000000000..4eaf039b14b
--- /dev/null
+++ b/llvm/docs/GlobalISel/GMIR.rst
@@ -0,0 +1,158 @@
+.. _gmir:
+
+Generic Machine IR
+==================
+
+Machine IR operates on physical registers, register classes, and (mostly)
+target-specific instructions.
+
+To bridge the gap with LLVM IR, GlobalISel introduces "generic" extensions to
+Machine IR:
+
+.. contents::
+   :local:
+
+``NOTE``:
+The generic MIR (GMIR) representation still contains references to IR
+constructs (such as ``GlobalValue``).  Removing those should let us write more
+accurate tests, or delete IR after building the initial MIR.  However, it is
+not part of the GlobalISel effort.
+
+.. _gmir-instructions:
+
+Generic Instructions
+--------------------
+
+The main addition is support for pre-isel generic machine instructions (e.g.,
+``G_ADD``).  Like other target-independent instructions (e.g., ``COPY`` or
+``PHI``), these are available on all targets.
+
+``TODO``:
+While we're progressively adding instructions, one kind in particular exposes
+interesting problems: compares and how to represent condition codes.
+Some targets (x86, ARM) have generic comparisons setting multiple flags,
+which are then used by predicated variants.
+Others (IR) specify the predicate in the comparison and users just get a single
+bit.  SelectionDAG uses SETCC/CONDBR vs BR_CC (and similar for select) to
+represent this.
+
+The ``MachineIRBuilder`` class wraps the ``MachineInstrBuilder`` and provides
+a convenient way to create these generic instructions.
+
+.. _gmir-gvregs:
+
+Generic Virtual Registers
+-------------------------
+
+Generic instructions operate on a new kind of register: "generic" virtual
+registers.  As opposed to non-generic vregs, they are not assigned a Register
+Class.  Instead, generic vregs have a :ref:`gmir-llt`, and can be assigned
+a :ref:`gmir-regbank`.
+
+``MachineRegisterInfo`` tracks the same information that it does for
+non-generic vregs (e.g., use-def chains).  Additionally, it also tracks the
+:ref:`gmir-llt` of the register, and, instead of the ``TargetRegisterClass``,
+its :ref:`gmir-regbank`, if any.
+
+For simplicity, most generic instructions only accept generic vregs:
+
+* instead of immediates, they use a gvreg defined by an instruction
+  materializing the immediate value (see :ref:`irtranslator-constants`).
+* instead of physical register, they use a gvreg defined by a ``COPY``.
+
+``NOTE``:
+We started with an alternative representation, where MRI tracks a size for
+each gvreg, and instructions have lists of types.
+That had two flaws: the type and size are redundant, and there was no generic
+way of getting a given operand's type (as there was no 1:1 mapping between
+instruction types and operands).
+We considered putting the type in some variant of MCInstrDesc instead:
+See `PR26576 <http://llvm.org/PR26576>`_: [GlobalISel] Generic MachineInstrs
+need a type but this increases the memory footprint of the related objects
+
+.. _gmir-regbank:
+
+Register Bank
+-------------
+
+A Register Bank is a set of register classes defined by the target.
+A bank has a size, which is the maximum store size of all covered classes.
+
+In general, cross-class copies inside a bank are expected to be cheaper than
+copies across banks.  They are also coalesceable by the register coalescer,
+whereas cross-bank copies are not.
+
+Also, equivalent operations can be performed on different banks using different
+instructions.
+
+For example, X86 can be seen as having 3 main banks: general-purpose, x87, and
+vector (which could be further split into a bank per domain for single vs
+double precision instructions).
+
+Register banks are described by a target-provided API,
+:ref:`RegisterBankInfo <api-registerbankinfo>`.
+
+.. _gmir-llt:
+
+Low Level Type
+--------------
+
+Additionally, every generic virtual register has a type, represented by an
+instance of the ``LLT`` class.
+
+Like ``EVT``/``MVT``/``Type``, it has no distinction between unsigned and signed
+integer types.  Furthermore, it also has no distinction between integer and
+floating-point types: it mainly conveys absolutely necessary information, such
+as size and number of vector lanes:
+
+* ``sN`` for scalars
+* ``pN`` for pointers
+* ``<N x sM>`` for vectors
+* ``unsized`` for labels, etc..
+
+``LLT`` is intended to replace the usage of ``EVT`` in SelectionDAG.
+
+Here are some LLT examples and their ``EVT`` and ``Type`` equivalents:
+
+   =============  =========  ======================================
+   LLT            EVT        IR Type
+   =============  =========  ======================================
+   ``s1``         ``i1``     ``i1``
+   ``s8``         ``i8``     ``i8``
+   ``s32``        ``i32``    ``i32``
+   ``s32``        ``f32``    ``float``
+   ``s17``        ``i17``    ``i17``
+   ``s16``        N/A        ``{i8, i8}``
+   ``s32``        N/A        ``[4 x i8]``
+   ``p0``         ``iPTR``   ``i8*``, ``i32*``, ``%opaque*``
+   ``p2``         ``iPTR``   ``i8 addrspace(2)*``
+   ``<4 x s32>``  ``v4f32``  ``<4 x float>``
+   ``s64``        ``v1f64``  ``<1 x double>``
+   ``<3 x s32>``  ``v3i32``  ``<3 x i32>``
+   ``unsized``    ``Other``  ``label``
+   =============  =========  ======================================
+
+
+Rationale: instructions already encode a specific interpretation of types
+(e.g., ``add`` vs. ``fadd``, or ``sdiv`` vs. ``udiv``).  Also encoding that
+information in the type system requires introducing bitcast with no real
+advantage for the selector.
+
+Pointer types are distinguished by address space.  This matches IR, as opposed
+to SelectionDAG where address space is an attribute on operations.
+This representation better supports pointers having different sizes depending
+on their addressspace.
+
+``NOTE``:
+Currently, LLT requires at least 2 elements in vectors, but some targets have
+the concept of a '1-element vector'.  Representing them as their underlying
+scalar type is a nice simplification.
+
+``TODO``:
+Currently, non-generic virtual registers, defined by non-pre-isel-generic
+instructions, cannot have a type, and thus cannot be used by a pre-isel generic
+instruction.  Instead, they are given a type using a COPY.  We could relax that
+and allow types on all vregs: this would reduce the number of MI required when
+emitting target-specific MIR early in the pipeline.  This should purely be
+a compile-time optimization.
+
diff --git a/llvm/docs/GlobalISel/IRTranslator.rst b/llvm/docs/GlobalISel/IRTranslator.rst
new file mode 100644
index 00000000000..515df56d794
--- /dev/null
+++ b/llvm/docs/GlobalISel/IRTranslator.rst
@@ -0,0 +1,57 @@
+.. _irtranslator:
+
+IRTranslator
+------------
+
+This pass translates the input LLVM IR ``Function`` to a GMIR
+``MachineFunction``.
+
+``TODO``:
+This currently doesn't support the more complex instructions, in particular
+those involving control flow (``switch``, ``invoke``, ...).
+For ``switch`` in particular, we can initially use the ``LowerSwitch`` pass.
+
+.. _api-calllowering:
+
+API: CallLowering
+^^^^^^^^^^^^^^^^^
+
+The ``IRTranslator`` (using the ``CallLowering`` target-provided utility) also
+implements the ABI's calling convention by lowering calls, returns, and
+arguments to the appropriate physical register usage and instruction sequences.
+
+.. _irtranslator-aggregates:
+
+Aggregates
+^^^^^^^^^^
+
+Aggregates are lowered to a single scalar vreg.
+This differs from SelectionDAG's multiple vregs via ``GetValueVTs``.
+
+``TODO``:
+As some of the bits are undef (padding), we should consider augmenting the
+representation with additional metadata (in effect, caching computeKnownBits
+information on vregs).
+See `PR26161 <http://llvm.org/PR26161>`_: [GlobalISel] Value to vreg during
+IR to MachineInstr translation for aggregate type
+
+.. _irtranslator-constants:
+
+Constant Lowering
+^^^^^^^^^^^^^^^^^
+
+The ``IRTranslator`` lowers ``Constant`` operands into uses of gvregs defined
+by ``G_CONSTANT`` or ``G_FCONSTANT`` instructions.
+Currently, these instructions are always emitted in the entry basic block.
+In a ``MachineFunction``, each ``Constant`` is materialized by a single gvreg.
+
+This is beneficial as it allows us to fold constants into immediate operands
+during :ref:`instructionselect`, while still avoiding redundant materializations
+for expensive non-foldable constants.
+However, this can lead to unnecessary spills and reloads in an -O0 pipeline, as
+these vregs can have long live ranges.
+
+``TODO``:
+We're investigating better placement of these instructions, in fast and
+optimized modes.
+
diff --git a/llvm/docs/GlobalISel/InstructionSelect.rst b/llvm/docs/GlobalISel/InstructionSelect.rst
new file mode 100644
index 00000000000..9798ae7a596
--- /dev/null
+++ b/llvm/docs/GlobalISel/InstructionSelect.rst
@@ -0,0 +1,98 @@
+
+.. _instructionselect:
+
+InstructionSelect
+-----------------
+
+This pass transforms generic machine instructions into equivalent
+target-specific instructions.  It traverses the ``MachineFunction`` bottom-up,
+selecting uses before definitions, enabling trivial dead code elimination.
+
+.. _api-instructionselector:
+
+API: InstructionSelector
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The target implements the ``InstructionSelector`` class, containing the
+target-specific selection logic proper.
+
+The instance is provided by the subtarget, so that it can specialize the
+selector by subtarget feature (with, e.g., a vector selector overriding parts
+of a general-purpose common selector).
+We might also want to parameterize it by MachineFunction, to enable selector
+variants based on function attributes like optsize.
+
+The simple API consists of:
+
+  .. code-block:: c++
+
+    virtual bool select(MachineInstr &MI)
+
+This target-provided method is responsible for mutating (or replacing) a
+possibly-generic MI into a fully target-specific equivalent.
+It is also responsible for doing the necessary constraining of gvregs into the
+appropriate register classes as well as passing through COPY instructions to
+the register allocator.
+
+The ``InstructionSelector`` can fold other instructions into the selected MI,
+by walking the use-def chain of the vreg operands.
+As GlobalISel is Global, this folding can occur across basic blocks.
+
+SelectionDAG Rule Imports
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+TableGen will import SelectionDAG rules and provide the following function to
+execute them:
+
+  .. code-block:: c++
+
+    bool selectImpl(MachineInstr &MI)
+
+The ``--stats`` option can be used to determine what proportion of rules were
+successfully imported. The easiest way to use this is to copy the
+``-gen-globalisel`` tablegen command from ``ninja -v`` and modify it.
+
+Similarly, the ``--warn-on-skipped-patterns`` option can be used to obtain the
+reasons that rules weren't imported. This can be used to focus on the most
+important rejection reasons.
+
+PatLeaf Predicates
+^^^^^^^^^^^^^^^^^^
+
+PatLeafs cannot be imported because their C++ is implemented in terms of
+``SDNode`` objects. PatLeafs that handle immediate predicates should be
+replaced by ``ImmLeaf``, ``IntImmLeaf``, or ``FPImmLeaf`` as appropriate.
+
+There's no standard answer for other PatLeafs. Some standard predicates have
+been baked into TableGen but this should not generally be done.
+
+Custom SDNodes
+^^^^^^^^^^^^^^
+
+Custom SDNodes should be mapped to Target Pseudos using ``GINodeEquiv``. This
+will cause the instruction selector to import them but you will also need to
+ensure the target pseudo is introduced to the MIR before the instruction
+selector. Any preceding pass is suitable but the legalizer will be a
+particularly common choice.
+
+ComplexPatterns
+^^^^^^^^^^^^^^^
+
+ComplexPatterns cannot be imported because their C++ is implemented in terms of
+``SDNode`` objects. GlobalISel versions should be defined with
+``GIComplexOperandMatcher`` and mapped to ComplexPattern with
+``GIComplexPatternEquiv``.
+
+The following predicates are useful for porting ComplexPattern:
+
+* isBaseWithConstantOffset() - Check for base+offset structures
+* isOperandImmEqual() - Check for a particular constant
+* isObviouslySafeToFold() - Check for reasons an instruction can't be sunk and folded into another.
+
+There are some important points for the C++ implementation:
+
+* Don't modify MIR in the predicate
+* Renderer lambdas should capture by value to avoid use-after-free. They will be used after the predicate returns.
+* Only create instructions in a renderer lambda. GlobalISel won't clean up things you create but don't use.
+
+
diff --git a/llvm/docs/GlobalISel/Legalizer.rst b/llvm/docs/GlobalISel/Legalizer.rst
new file mode 100644
index 00000000000..d0cb7324c5e
--- /dev/null
+++ b/llvm/docs/GlobalISel/Legalizer.rst
@@ -0,0 +1,351 @@
+.. _milegalizer:
+
+Legalizer
+---------
+
+This pass transforms the generic machine instructions such that they are legal.
+
+A legal instruction is defined as:
+
+* **selectable** --- the target will later be able to select it to a
+  target-specific (non-generic) instruction.
+
+* operating on **vregs that can be loaded and stored** -- if necessary, the
+  target can select a ``G_LOAD``/``G_STORE`` of each gvreg operand.
+
+As opposed to SelectionDAG, there are no legalization phases.  In particular,
+'type' and 'operation' legalization are not separate.
+
+Legalization is iterative, and all state is contained in GMIR.  To maintain the
+validity of the intermediate code, instructions are introduced:
+
+* ``G_MERGE_VALUES`` --- concatenate multiple registers of the same
+  size into a single wider register.
+
+* ``G_UNMERGE_VALUES`` --- extract multiple registers of the same size
+  from a single wider register.
+
+* ``G_EXTRACT`` --- extract a simple register (as contiguous sequences of bits)
+  from a single wider register.
+
+As they are expected to be temporary byproducts of the legalization process,
+they are combined at the end of the :ref:`milegalizer` pass.
+If any remain, they are expected to always be selectable, using loads and stores
+if necessary.
+
+The legality of an instruction may only depend on the instruction itself and
+must not depend on any context in which the instruction is used. However, after
+deciding that an instruction is not legal, using the context of the instruction
+to decide how to legalize the instruction is permitted. As an example, if we
+have a ``G_FOO`` instruction of the form::
+
+  %1:_(s32) = G_CONSTANT i32 1
+  %2:_(s32) = G_FOO %0:_(s32), %1:_(s32)
+
+it's impossible to say that G_FOO is legal iff %1 is a ``G_CONSTANT`` with
+value ``1``. However, the following::
+
+  %2:_(s32) = G_FOO %0:_(s32), i32 1
+
+can say that it's legal iff operand 2 is an immediate with value ``1`` because
+that information is entirely contained within the single instruction.
+
+.. _api-legalizerinfo:
+
+API: LegalizerInfo
+^^^^^^^^^^^^^^^^^^
+
+The recommended [#legalizer-legacy-footnote]_ API looks like this::
+
+  getActionDefinitionsBuilder({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SHL})
+      .legalFor({s32, s64, v2s32, v4s32, v2s64})
+      .clampScalar(0, s32, s64)
+      .widenScalarToNextPow2(0)
+      .clampNumElements(0, v2s32, v4s32)
+      .clampNumElements(0, v2s64, v2s64)
+      .moreElementsToNextPow2(0);
+
+and describes a set of rules by which we can either declare an instruction legal
+or decide which action to take to make it more legal.
+
+At the core of this ruleset is the ``LegalityQuery`` which describes the
+instruction. We use a description rather than the instruction to both allow other
+passes to determine legality without having to create an instruction and also to
+limit the information available to the predicates to that which is safe to rely
+on. Currently, the information available to the predicates that determine
+legality contains:
+
+* The opcode for the instruction
+
+* The type of each type index (see ``type0``, ``type1``, etc.)
+
+* The size in bytes and atomic ordering for each MachineMemOperand
+
+Rule Processing and Declaring Rules
+"""""""""""""""""""""""""""""""""""
+
+The ``getActionDefinitionsBuilder`` function generates a ruleset for the given
+opcode(s) that rules can be added to. If multiple opcodes are given, they are
+all permanently bound to the same ruleset. The rules in a ruleset are executed
+from top to bottom and will start again from the top if an instruction is
+legalized as a result of the rules. If the ruleset is exhausted without
+satisfying any rule, then it is considered unsupported.
+
+When it doesn't declare the instruction legal, each pass over the rules may
+request that one type changes to another type. Sometimes this can cause multiple
+types to change but we avoid this as much as possible as making multiple changes
+can make it difficult to avoid infinite loops where, for example, narrowing one
+type causes another to be too small and widening that type causes the first one
+to be too big.
+
+In general, it's advisable to declare instructions legal as close to the top of
+the rule as possible and to place any expensive rules as low as possible. This
+helps with performance as testing for legality happens more often than
+legalization and legalization can require multiple passes over the rules.
+
+As a concrete example, consider the rule::
+
+  getActionDefinitionsBuilder({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SHL})
+      .legalFor({s32, s64, v2s32, v4s32, v2s64})
+      .clampScalar(0, s32, s64)
+      .widenScalarToNextPow2(0);
+
+and the instruction::
+
+  %2:_(s7) = G_ADD %0:_(s7), %1:_(s7)
+
+this doesn't meet the predicate for the :ref:`.legalFor() <legalfor>` as ``s7``
+is not one of the listed types so it falls through to the
+:ref:`.clampScalar() <clampscalar>`. It does meet the predicate for this rule
+as the type is smaller than the ``s32`` and this rule instructs the legalizer
+to change type 0 to ``s32``. It then restarts from the top. This time it does
+satisfy ``.legalFor()`` and the resulting output is::
+
+  %3:_(s32) = G_ANYEXT %0:_(s7)
+  %4:_(s32) = G_ANYEXT %1:_(s7)
+  %5:_(s32) = G_ADD %3:_(s32), %4:_(s32)
+  %2:_(s7) = G_TRUNC %5:_(s32)
+
+where the ``G_ADD`` is legal and the other instructions are scheduled for
+processing by the legalizer.
+
+Rule Actions
+""""""""""""
+
+There are various rule factories that append rules to a ruleset but they have a
+few actions in common:
+
+.. _legalfor:
+
+* ``legalIf()``, ``legalFor()``, etc. declare an instruction to be legal if the
+  predicate is satisfied.
+
+* ``narrowScalarIf()``, ``narrowScalarFor()``, etc. declare an instruction to be illegal
+  if the predicate is satisfied and indicates that narrowing the scalars in one
+  of the types to a specific type would make it more legal. This action supports
+  both scalars and vectors.
+
+* ``widenScalarIf()``, ``widenScalarFor()``, etc. declare an instruction to be illegal
+  if the predicate is satisfied and indicates that widening the scalars in one
+  of the types to a specific type would make it more legal. This action supports
+  both scalars and vectors.
+
+* ``fewerElementsIf()``, ``fewerElementsFor()``, etc. declare an instruction to be
+  illegal if the predicate is satisfied and indicates reducing the number of
+  vector elements in one of the types to a specific type would make it more
+  legal. This action supports vectors.
+
+* ``moreElementsIf()``, ``moreElementsFor()``, etc. declare an instruction to be illegal
+  if the predicate is satisfied and indicates increasing the number of vector
+  elements in one of the types to a specific type would make it more legal.
+  This action supports vectors.
+
+* ``lowerIf()``, ``lowerFor()``, etc. declare an instruction to be illegal if the
+  predicate is satisfied and indicates that replacing it with equivalent
+  instruction(s) would make it more legal. Support for this action differs for
+  each opcode.
+
+* ``libcallIf()``, ``libcallFor()``, etc. declare an instruction to be illegal if the
+  predicate is satisfied and indicates that replacing it with a libcall would
+  make it more legal. Support for this action differs for
+  each opcode.
+
+* ``customIf()``, ``customFor()``, etc. declare an instruction to be illegal if the
+  predicate is satisfied and indicates that the backend developer will supply
+  a means of making it more legal.
+
+* ``unsupportedIf()``, ``unsupportedFor()``, etc. declare an instruction to be illegal
+  if the predicate is satisfied and indicates that there is no way to make it
+  legal and the compiler should fail.
+
+* ``fallback()`` falls back on an older API and should only be used while porting
+  existing code from that API.
+
+Rule Predicates
+"""""""""""""""
+
+The rule factories also have predicates in common:
+
+* ``legal()``, ``lower()``, etc. are always satisfied.
+
+* ``legalIf()``, ``narrowScalarIf()``, etc. are satisfied if the user-supplied
+  ``LegalityPredicate`` function returns true. This predicate has access to the
+  information in the ``LegalityQuery`` to make its decision.
+  User-supplied predicates can also be combined using ``all(P0, P1, ...)``.
+
+* ``legalFor()``, ``narrowScalarFor()``, etc. are satisfied if the type matches one in
+  a given set of types. For example ``.legalFor({s16, s32})`` declares the
+  instruction legal if type 0 is either s16 or s32. Additional versions for two
+  and three type indices are generally available. For these, all the type
+  indices considered together must match all the types in one of the tuples. So
+  ``.legalFor({{s16, s32}, {s32, s64}})`` will only accept ``{s16, s32}``, or
+  ``{s32, s64}`` but will not accept ``{s16, s64}``.
+
+* ``legalForTypesWithMemSize()``, ``narrowScalarForTypesWithMemSize()``, etc. are
+  similar to ``legalFor()``, ``narrowScalarFor()``, etc. but additionally require a
+  MachineMemOperand to have a given size in each tuple.
+
+* ``legalForCartesianProduct()``, ``narrowScalarForCartesianProduct()``, etc. are
+  satisfied if each type index matches one element in each of the independent
+  sets. So ``.legalForCartesianProduct({s16, s32}, {s32, s64})`` will accept
+  ``{s16, s32}``, ``{s16, s64}``, ``{s32, s32}``, and ``{s32, s64}``.
+
+Composite Rules
+"""""""""""""""
+
+There are some composite rules for common situations built out of the above facilities:
+
+* ``widenScalarToNextPow2()`` is like ``widenScalarIf()`` but is satisfied iff the type
+  size in bits is not a power of 2 and selects a target type that is the next
+  largest power of 2. 
+
+.. _clampscalar:
+
+* ``minScalar()`` is like ``widenScalarIf()`` but is satisfied iff the type
+  size in bits is smaller than the given minimum and selects the minimum as the
+  target type. Similarly, there is also a ``maxScalar()`` for the maximum and a
+  ``clampScalar()`` to do both at once. 
+
+* ``minScalarSameAs()`` is like ``minScalar()`` but the minimum is taken from another
+  type index.
+
+* ``moreElementsToNextMultiple()`` is like ``moreElementsToNextPow2()`` but is based on
+  multiples of X rather than powers of 2.
+
+Other Information
+"""""""""""""""""
+
+``TODO``:
+An alternative worth investigating is to generalize the API to represent
+actions using ``std::function`` that implements the action, instead of explicit
+enum tokens (``Legal``, ``WidenScalar``, ...).
+
+``TODO``:
+Moreover, we could use TableGen to initially infer legality of operation from
+existing patterns (as any pattern we can select is by definition legal).
+Expanding that to describe legalization actions is a much larger but
+potentially useful project.
+
+.. rubric:: Footnotes
+
+.. [#legalizer-legacy-footnote] An API is broadly similar to
+   SelectionDAG/TargetLowering is available but is not recommended as a more
+   powerful API is available.
+
+.. _min-legalizerinfo:
+
+Minimum Rule Set
+^^^^^^^^^^^^^^^^
+
+GlobalISel's legalizer has a great deal of flexibility in how a given target
+shapes the GMIR that the rest of the backend must handle. However, there are
+a small number of requirements that all targets must meet.
+
+Before discussing the minimum requirements, we'll need some terminology:
+
+Producer Type Set
+  The set of types which is the union of all possible types produced by at
+  least one legal instruction.
+
+Consumer Type Set
+  The set of types which is the union of all possible types consumed by at
+  least one legal instruction.
+
+Both sets are often identical but there's no guarantee of that. For example,
+it's not uncommon to be unable to consume s64 but still be able to produce it
+for a few specific instructions.
+
+Minimum Rules For Scalars
+"""""""""""""""""""""""""
+
+* G_ANYEXT must be legal for all inputs from the producer type set and all larger
+  outputs from the consumer type set.
+* G_TRUNC must be legal for all inputs from the producer type set and all
+  smaller outputs from the consumer type set.
+
+G_ANYEXT, and G_TRUNC have mandatory legality since the GMIR requires a means to
+connect operations with different type sizes. They are usually trivial to support
+since G_ANYEXT doesn't define the value of the additional bits and G_TRUNC is
+discarding bits. The other conversions can be lowered into G_ANYEXT/G_TRUNC
+with some additional operations that are subject to further legalization. For
+example, G_SEXT can lower to::
+
+  %1 = G_ANYEXT %0
+  %2 = G_CONSTANT ...
+  %3 = G_SHL %1, %2
+  %4 = G_ASHR %3, %2
+
+and the G_CONSTANT/G_SHL/G_ASHR can further lower to other operations or target
+instructions. Similarly, G_FPEXT has no legality requirement since it can lower
+to a G_ANYEXT followed by a target instruction.
+
+G_MERGE_VALUES and G_UNMERGE_VALUES do not have legality requirements since the
+former can lower to G_ANYEXT and some other legalizable instructions, while the
+latter can lower to some legalizable instructions followed by G_TRUNC.
+
+Minimum Legality For Vectors
+""""""""""""""""""""""""""""
+
+Within the vector types, there aren't any defined conversions in LLVM IR as
+vectors are often converted by reinterpreting the bits or by decomposing the
+vector and reconstituting it as a different type. As such, G_BITCAST is the
+only operation to account for. We generally don't require that it's legal
+because it can usually be lowered to COPY (or to nothing using
+replaceAllUses()). However, there are situations where G_BITCAST is non-trivial
+(e.g. little-endian vectors of big-endian data such as on big-endian MIPS MSA and
+big-endian ARM NEON, see `_i_bitcast`). To account for this G_BITCAST must be
+legal for all type combinations that change the bit pattern in the value.
+
+There are no legality requirements for G_BUILD_VECTOR, or G_BUILD_VECTOR_TRUNC
+since these can be handled by:
+* Declaring them legal.
+* Scalarizing them.
+* Lowering them to G_TRUNC+G_ANYEXT and some legalizable instructions.
+* Lowering them to target instructions which are legal by definition.
+
+The same reasoning also allows G_UNMERGE_VALUES to lack legality requirements
+for vector inputs.
+
+Minimum Legality for Pointers
+"""""""""""""""""""""""""""""
+
+There are no minimum rules for pointers since G_INTTOPTR and G_PTRTOINT can
+be selected to a COPY from register class to another by the legalizer.
+
+Minimum Legality For Operations
+"""""""""""""""""""""""""""""""
+
+The rules for G_ANYEXT, G_MERGE_VALUES, G_BITCAST, G_BUILD_VECTOR,
+G_BUILD_VECTOR_TRUNC, G_CONCAT_VECTORS, G_UNMERGE_VALUES, G_PTRTOINT, and
+G_INTTOPTR have already been noted above. In addition to those, the following
+operations have requirements:
+
+* At least one G_IMPLICIT_DEF must be legal. This is usually trivial as it
+  requires no code to be selected.
+* G_PHI must be legal for all types in the producer and consumer typesets. This
+  is usually trivial as it requires no code to be selected.
+* At least one G_FRAME_INDEX must be legal
+* At least one G_BLOCK_ADDR must be legal
+
+There are many other operations you'd expect to have legality requirements but
+they can be lowered to target instructions which are legal by definition.
diff --git a/llvm/docs/GlobalISel/Pipeline.rst b/llvm/docs/GlobalISel/Pipeline.rst
new file mode 100644
index 00000000000..2f9bc93c9b5
--- /dev/null
+++ b/llvm/docs/GlobalISel/Pipeline.rst
@@ -0,0 +1,82 @@
+.. _pipeline:
+
+Core Pipeline
+=============
+
+There are four required passes, regardless of the optimization mode:
+
+.. toctree::
+  :maxdepth: 1
+
+  IRTranslator
+  Legalizer
+  RegBankSelect
+  InstructionSelect
+
+Additional passes can then be inserted at higher optimization levels or for
+specific targets. For example, to match the current SelectionDAG set of
+transformations: MachineCSE and a better MachineCombiner between every pass.
+
+``NOTE``:
+In theory, not all passes are always necessary.
+As an additional compile-time optimization, we could skip some of the passes by
+setting the relevant MachineFunction properties.  For instance, if the
+IRTranslator did not encounter any illegal instruction, it would set the
+``legalized`` property to avoid running the :ref:`milegalizer`.
+Similarly, we considered specializing the IRTranslator per-target to directly
+emit target-specific MI.
+However, we instead decided to keep the core pipeline simple, and focus on
+minimizing the overhead of the passes in the no-op cases.
+
+.. _maintainability-verifier:
+
+MachineVerifier
+---------------
+
+The pass approach lets us use the ``MachineVerifier`` to enforce invariants.
+For instance, a ``regBankSelected`` function may not have gvregs without
+a bank.
+
+``TODO``:
+The ``MachineVerifier`` being monolithic, some of the checks we want to do
+can't be integrated to it:  GlobalISel is a separate library, so we can't
+directly reference it from CodeGen.  For instance, legality checks are
+currently done in RegBankSelect/InstructionSelect proper.  We could #ifdef out
+the checks, or we could add some sort of verifier API.
+
+
+.. _maintainability:
+
+Maintainability
+===============
+
+.. _maintainability-iterative:
+
+Iterative Transformations
+-------------------------
+
+Passes are split into small, iterative transformations, with all state
+represented in the MIR.
+
+This differs from SelectionDAG (in particular, the legalizer) using various
+in-memory side-tables.
+
+
+.. _maintainability-mir:
+
+MIR Serialization
+-----------------
+
+.. FIXME: Update the MIRLangRef to include GMI additions.
+
+:ref:`gmir` is serializable (see :doc:`../MIRLangRef`).
+Combined with :ref:`maintainability-iterative`, this enables much finer-grained
+testing, rather than requiring large and fragile IR-to-assembly tests.
+
+The current "stage" in the :ref:`pipeline` is represented by a set of
+``MachineFunctionProperties``:
+
+* ``legalized``
+* ``regBankSelected``
+* ``selected``
+
diff --git a/llvm/docs/GlobalISel/Porting.rst b/llvm/docs/GlobalISel/Porting.rst
new file mode 100644
index 00000000000..f2928639158
--- /dev/null
+++ b/llvm/docs/GlobalISel/Porting.rst
@@ -0,0 +1,21 @@
+.. _porting:
+
+Porting GlobalISel to A New Target
+==================================
+
+There are four major classes to implement by the target:
+
+* :ref:`CallLowering <api-calllowering>` --- lower calls, returns, and arguments
+  according to the ABI.
+* :ref:`RegisterBankInfo <api-registerbankinfo>` --- describe
+  :ref:`gmir-regbank` coverage, cross-bank copy cost, and the mapping of
+  operands onto banks for each instruction.
+* :ref:`LegalizerInfo <api-legalizerinfo>` --- describe what is legal, and how
+  to legalize what isn't.
+* :ref:`InstructionSelector <api-instructionselector>` --- select generic MIR
+  to target-specific MIR.
+
+Additionally:
+
+* ``TargetPassConfig`` --- create the passes constituting the pipeline,
+  including additional passes not included in the :ref:`pipeline`.
diff --git a/llvm/docs/GlobalISel/RegBankSelect.rst b/llvm/docs/GlobalISel/RegBankSelect.rst
new file mode 100644
index 00000000000..2702d689b84
--- /dev/null
+++ b/llvm/docs/GlobalISel/RegBankSelect.rst
@@ -0,0 +1,73 @@
+.. _regbankselect:
+
+RegBankSelect
+-------------
+
+This pass constrains the :ref:`gmir-gvregs` operands of generic
+instructions to some :ref:`gmir-regbank`.
+
+It iteratively maps instructions to a set of per-operand bank assignment.
+The possible mappings are determined by the target-provided
+:ref:`RegisterBankInfo <api-registerbankinfo>`.
+The mapping is then applied, possibly introducing ``COPY`` instructions if
+necessary.
+
+It traverses the ``MachineFunction`` top down so that all operands are already
+mapped when analyzing an instruction.
+
+This pass could also remap target-specific instructions when beneficial.
+In the future, this could replace the ExeDepsFix pass, as we can directly
+select the best variant for an instruction that's available on multiple banks.
+
+.. _api-registerbankinfo:
+
+API: RegisterBankInfo
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``RegisterBankInfo`` class describes multiple aspects of register banks.
+
+* **Banks**: ``addRegBankCoverage`` --- which register bank covers each
+  register class.
+
+* **Cross-Bank Copies**: ``copyCost`` --- the cost of a ``COPY`` from one bank
+  to another.
+
+* **Default Mapping**: ``getInstrMapping`` --- the default bank assignments for
+  a given instruction.
+
+* **Alternative Mapping**: ``getInstrAlternativeMapping`` --- the other
+  possible bank assignments for a given instruction.
+
+``TODO``:
+All this information should eventually be static and generated by TableGen,
+mostly using existing information augmented by bank descriptions.
+
+``TODO``:
+``getInstrMapping`` is currently separate from ``getInstrAlternativeMapping``
+because the latter is more expensive: as we move to static mapping info,
+both methods should be free, and we should merge them.
+
+.. _regbankselect-modes:
+
+RegBankSelect Modes
+^^^^^^^^^^^^^^^^^^^
+
+``RegBankSelect`` currently has two modes:
+
+* **Fast** --- For each instruction, pick a target-provided "default" bank
+  assignment.  This is the default at -O0.
+
+* **Greedy** --- For each instruction, pick the cheapest of several
+  target-provided bank assignment alternatives.
+
+We intend to eventually introduce an additional optimizing mode:
+
+* **Global** --- Across multiple instructions, pick the cheapest combination of
+  bank assignments.
+
+``NOTE``:
+On AArch64, we are considering using the Greedy mode even at -O0 (or perhaps at
+backend -O1):  because :ref:`gmir-llt` doesn't distinguish floating point from
+integer scalars, the default assignment for loads and stores is the integer
+bank, introducing cross-bank copies on most floating point operations.
+
diff --git a/llvm/docs/GlobalISel/Resources.rst b/llvm/docs/GlobalISel/Resources.rst
new file mode 100644
index 00000000000..6ac35d241aa
--- /dev/null
+++ b/llvm/docs/GlobalISel/Resources.rst
@@ -0,0 +1,11 @@
+.. _other_resources:
+
+Resources
+=========
+
+* `Global Instruction Selection - A Proposal by Quentin Colombet @LLVMDevMeeting 2015 <https://www.youtube.com/watch?v=F6GGbYtae3g>`_
+* `Global Instruction Selection - Status by Quentin Colombet, Ahmed Bougacha, and Tim Northover @LLVMDevMeeting 2016 <https://www.youtube.com/watch?v=6tfb344A7w8>`_
+* `GlobalISel - LLVM's Latest Instruction Selection Framework by Diana Picus @FOSDEM17 <https://www.youtube.com/watch?v=d6dF6E4BPeU>`_
+* `GlobalISel: Past, Present, and Future by Quentin Colombet and Ahmed Bougacha @LLVMDevMeeting 2017 <https://www.llvm.org/devmtg/2017-10/#talk11>`_
+* `Head First into GlobalISel by Daniel Sanders, Aditya Nandakumar, and Justin Bogner @LLVMDevMeeting 2017 <https://www.llvm.org/devmtg/2017-10/#tutorial2>`_
+* `Generating Optimized Code with GlobalISel by Volkan Keles, Daniel Sanders @LLVMDevMeeting 2019 <https://www.llvm.org/devmtg/2019-10/talk-abstracts.html#keynote1>`_
diff --git a/llvm/docs/GlobalISel/index.rst b/llvm/docs/GlobalISel/index.rst
new file mode 100644
index 00000000000..f620bc643bb
--- /dev/null
+++ b/llvm/docs/GlobalISel/index.rst
@@ -0,0 +1,94 @@
+============================
+Global Instruction Selection
+============================
+
+.. warning::
+   This document is a work in progress.  It reflects the current state of the
+   implementation, as well as open design and implementation issues.
+
+.. contents::
+   :local:
+   :depth: 1
+
+Introduction
+============
+
+GlobalISel is a framework that provides a set of reusable passes and utilities
+for instruction selection --- translation from LLVM IR to target-specific
+Machine IR (MIR).
+
+GlobalISel is intended to be a replacement for SelectionDAG and FastISel, to
+solve three major problems:
+
+* **Performance** --- SelectionDAG introduces a dedicated intermediate
+  representation, which has a compile-time cost.
+
+  GlobalISel directly operates on the post-isel representation used by the
+  rest of the code generator, MIR.
+  It does require extensions to that representation to support arbitrary
+  incoming IR: :ref:`gmir`.
+
+* **Granularity** --- SelectionDAG and FastISel operate on individual basic
+  blocks, losing some global optimization opportunities.
+
+  GlobalISel operates on the whole function.
+
+* **Modularity** --- SelectionDAG and FastISel are radically different and share
+  very little code.
+
+  GlobalISel is built in a way that enables code reuse. For instance, both the
+  optimized and fast selectors share the :ref:`pipeline`, and targets can
+  configure that pipeline to better suit their needs.
+
+Design and Implementation Reference
+===================================
+
+More information on the design and implementation of GlobalISel can be found in
+the following sections.
+
+.. toctree::
+  :maxdepth: 1
+
+  GMIR
+  Pipeline
+  Porting
+  Resources
+
+
+.. _progress:
+
+Progress and Future Work
+========================
+
+The initial goal is to replace FastISel on AArch64.  The next step will be to
+replace SelectionDAG as the optimized ISel.
+
+``NOTE``:
+While we iterate on GlobalISel, we strive to avoid affecting the performance of
+SelectionDAG, FastISel, or the other MIR passes.  For instance, the types of
+:ref:`gmir-gvregs` are stored in a separate table in ``MachineRegisterInfo``,
+that is destroyed after :ref:`instructionselect`.
+
+.. _progress-fastisel:
+
+FastISel Replacement
+--------------------
+
+For the initial FastISel replacement, we intend to fallback to SelectionDAG on
+selection failures.
+
+Currently, compile-time of the fast pipeline is within 1.5x of FastISel.
+We're optimistic we can get to within 1.1/1.2x, but beating FastISel will be
+challenging given the multi-pass approach.
+Still, supporting all IR (via a complete legalizer) and avoiding the fallback
+to SelectionDAG in the worst case should enable better amortized performance
+than SelectionDAG+FastISel.
+
+``NOTE``:
+We considered never having a fallback to SelectionDAG, instead deciding early
+whether a given function is supported by GlobalISel or not.  The decision would
+be based on :ref:`milegalizer` queries.
+We abandoned that for two reasons:
+a) on IR inputs, we'd need to basically simulate the :ref:`irtranslator`;
+b) to be robust against unforeseen failures and to enable iterative
+improvements.