| author | Daniel Sanders <daniel_l_sanders@apple.com> | 2019-10-25 15:50:36 -0700 |
|---|---|---|
| committer | Daniel Sanders <daniel_l_sanders@apple.com> | 2019-10-25 15:51:09 -0700 |
| commit | feab0334f57d9e103965f7743f587cffcb4269f4 (patch) | |
| tree | 9b7292b11d5cb446ac6cb7f3870cbe0c351b5d2d | |
| parent | 10b5cd8ed5272d135ac75a94d3cf5854a0912f84 (diff) | |
[globalisel] Restructure the GlobalISel documentation
There's a couple minor deletions amongst this but 99% of it is just moving
the documentation around to prepare the way for more meaningful changes.
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | llvm/docs/FuzzingLLVM.rst | 2 |
| -rw-r--r-- | llvm/docs/GlobalISel.rst | 941 |
| -rw-r--r-- | llvm/docs/GlobalISel/GMIR.rst | 158 |
| -rw-r--r-- | llvm/docs/GlobalISel/IRTranslator.rst | 57 |
| -rw-r--r-- | llvm/docs/GlobalISel/InstructionSelect.rst | 98 |
| -rw-r--r-- | llvm/docs/GlobalISel/Legalizer.rst | 351 |
| -rw-r--r-- | llvm/docs/GlobalISel/Pipeline.rst | 82 |
| -rw-r--r-- | llvm/docs/GlobalISel/Porting.rst | 21 |
| -rw-r--r-- | llvm/docs/GlobalISel/RegBankSelect.rst | 73 |
| -rw-r--r-- | llvm/docs/GlobalISel/Resources.rst | 11 |
| -rw-r--r-- | llvm/docs/GlobalISel/index.rst | 94 |
| -rw-r--r-- | llvm/docs/Reference.rst | 6 |
12 files changed, 949 insertions, 945 deletions
diff --git a/llvm/docs/FuzzingLLVM.rst b/llvm/docs/FuzzingLLVM.rst index 825e6d7ae02..ab4b9b557cd 100644 --- a/llvm/docs/FuzzingLLVM.rst +++ b/llvm/docs/FuzzingLLVM.rst @@ -83,7 +83,7 @@ A |LLVM IR fuzzer| aimed at finding bugs in instruction selection. This fuzzer accepts flags after `ignore_remaining_args=1`. The flags match those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example, -the following command would fuzz AArch64 with :doc:`GlobalISel`: +the following command would fuzz AArch64 with :doc:`GlobalISel/index`: .. code-block:: shell diff --git a/llvm/docs/GlobalISel.rst b/llvm/docs/GlobalISel.rst deleted file mode 100644 index 2afc2ab42f9..00000000000 --- a/llvm/docs/GlobalISel.rst +++ /dev/null @@ -1,941 +0,0 @@ -============================ -Global Instruction Selection -============================ - -.. contents:: - :local: - :depth: 1 - -.. warning:: - This document is a work in progress. It reflects the current state of the - implementation, as well as open design and implementation issues. - -Introduction -============ - -GlobalISel is a framework that provides a set of reusable passes and utilities -for instruction selection --- translation from LLVM IR to target-specific -Machine IR (MIR). - -GlobalISel is intended to be a replacement for SelectionDAG and FastISel, to -solve three major problems: - -* **Performance** --- SelectionDAG introduces a dedicated intermediate - representation, which has a compile-time cost. - - GlobalISel directly operates on the post-isel representation used by the - rest of the code generator, MIR. - It does require extensions to that representation to support arbitrary - incoming IR: :ref:`gmir`. - -* **Granularity** --- SelectionDAG and FastISel operate on individual basic - blocks, losing some global optimization opportunities. - - GlobalISel operates on the whole function. - -* **Modularity** --- SelectionDAG and FastISel are radically different and share - very little code. 
- - GlobalISel is built in a way that enables code reuse. For instance, both the - optimized and fast selectors share the :ref:`pipeline`, and targets can - configure that pipeline to better suit their needs. - - -.. _gmir: - -Generic Machine IR -================== - -Machine IR operates on physical registers, register classes, and (mostly) -target-specific instructions. - -To bridge the gap with LLVM IR, GlobalISel introduces "generic" extensions to -Machine IR: - -.. contents:: - :local: - -``NOTE``: -The generic MIR (GMIR) representation still contains references to IR -constructs (such as ``GlobalValue``). Removing those should let us write more -accurate tests, or delete IR after building the initial MIR. However, it is -not part of the GlobalISel effort. - -.. _gmir-instructions: - -Generic Instructions --------------------- - -The main addition is support for pre-isel generic machine instructions (e.g., -``G_ADD``). Like other target-independent instructions (e.g., ``COPY`` or -``PHI``), these are available on all targets. - -``TODO``: -While we're progressively adding instructions, one kind in particular exposes -interesting problems: compares and how to represent condition codes. -Some targets (x86, ARM) have generic comparisons setting multiple flags, -which are then used by predicated variants. -Others (IR) specify the predicate in the comparison and users just get a single -bit. SelectionDAG uses SETCC/CONDBR vs BR_CC (and similar for select) to -represent this. - -The ``MachineIRBuilder`` class wraps the ``MachineInstrBuilder`` and provides -a convenient way to create these generic instructions. - -.. _gmir-gvregs: - -Generic Virtual Registers -------------------------- - -Generic instructions operate on a new kind of register: "generic" virtual -registers. As opposed to non-generic vregs, they are not assigned a Register -Class. Instead, generic vregs have a :ref:`gmir-llt`, and can be assigned -a :ref:`gmir-regbank`. 
- -``MachineRegisterInfo`` tracks the same information that it does for -non-generic vregs (e.g., use-def chains). Additionally, it also tracks the -:ref:`gmir-llt` of the register, and, instead of the ``TargetRegisterClass``, -its :ref:`gmir-regbank`, if any. - -For simplicity, most generic instructions only accept generic vregs: - -* instead of immediates, they use a gvreg defined by an instruction - materializing the immediate value (see :ref:`irtranslator-constants`). -* instead of physical register, they use a gvreg defined by a ``COPY``. - -``NOTE``: -We started with an alternative representation, where MRI tracks a size for -each gvreg, and instructions have lists of types. -That had two flaws: the type and size are redundant, and there was no generic -way of getting a given operand's type (as there was no 1:1 mapping between -instruction types and operands). -We considered putting the type in some variant of MCInstrDesc instead: -See `PR26576 <http://llvm.org/PR26576>`_: [GlobalISel] Generic MachineInstrs -need a type but this increases the memory footprint of the related objects - -.. _gmir-regbank: - -Register Bank -------------- - -A Register Bank is a set of register classes defined by the target. -A bank has a size, which is the maximum store size of all covered classes. - -In general, cross-class copies inside a bank are expected to be cheaper than -copies across banks. They are also coalesceable by the register coalescer, -whereas cross-bank copies are not. - -Also, equivalent operations can be performed on different banks using different -instructions. - -For example, X86 can be seen as having 3 main banks: general-purpose, x87, and -vector (which could be further split into a bank per domain for single vs -double precision instructions). - -Register banks are described by a target-provided API, -:ref:`RegisterBankInfo <api-registerbankinfo>`. - -.. 
_gmir-llt: - -Low Level Type --------------- - -Additionally, every generic virtual register has a type, represented by an -instance of the ``LLT`` class. - -Like ``EVT``/``MVT``/``Type``, it has no distinction between unsigned and signed -integer types. Furthermore, it also has no distinction between integer and -floating-point types: it mainly conveys absolutely necessary information, such -as size and number of vector lanes: - -* ``sN`` for scalars -* ``pN`` for pointers -* ``<N x sM>`` for vectors -* ``unsized`` for labels, etc.. - -``LLT`` is intended to replace the usage of ``EVT`` in SelectionDAG. - -Here are some LLT examples and their ``EVT`` and ``Type`` equivalents: - - ============= ========= ====================================== - LLT EVT IR Type - ============= ========= ====================================== - ``s1`` ``i1`` ``i1`` - ``s8`` ``i8`` ``i8`` - ``s32`` ``i32`` ``i32`` - ``s32`` ``f32`` ``float`` - ``s17`` ``i17`` ``i17`` - ``s16`` N/A ``{i8, i8}`` - ``s32`` N/A ``[4 x i8]`` - ``p0`` ``iPTR`` ``i8*``, ``i32*``, ``%opaque*`` - ``p2`` ``iPTR`` ``i8 addrspace(2)*`` - ``<4 x s32>`` ``v4f32`` ``<4 x float>`` - ``s64`` ``v1f64`` ``<1 x double>`` - ``<3 x s32>`` ``v3i32`` ``<3 x i32>`` - ``unsized`` ``Other`` ``label`` - ============= ========= ====================================== - - -Rationale: instructions already encode a specific interpretation of types -(e.g., ``add`` vs. ``fadd``, or ``sdiv`` vs. ``udiv``). Also encoding that -information in the type system requires introducing bitcast with no real -advantage for the selector. - -Pointer types are distinguished by address space. This matches IR, as opposed -to SelectionDAG where address space is an attribute on operations. -This representation better supports pointers having different sizes depending -on their addressspace. - -``NOTE``: -Currently, LLT requires at least 2 elements in vectors, but some targets have -the concept of a '1-element vector'. 
Representing them as their underlying -scalar type is a nice simplification. - -``TODO``: -Currently, non-generic virtual registers, defined by non-pre-isel-generic -instructions, cannot have a type, and thus cannot be used by a pre-isel generic -instruction. Instead, they are given a type using a COPY. We could relax that -and allow types on all vregs: this would reduce the number of MI required when -emitting target-specific MIR early in the pipeline. This should purely be -a compile-time optimization. - -.. _pipeline: - -Core Pipeline -============= - -There are four required passes, regardless of the optimization mode: - -.. contents:: - :local: - -Additional passes can then be inserted at higher optimization levels or for -specific targets. For example, to match the current SelectionDAG set of -transformations: MachineCSE and a better MachineCombiner between every pass. - -``NOTE``: -In theory, not all passes are always necessary. -As an additional compile-time optimization, we could skip some of the passes by -setting the relevant MachineFunction properties. For instance, if the -IRTranslator did not encounter any illegal instruction, it would set the -``legalized`` property to avoid running the :ref:`milegalizer`. -Similarly, we considered specializing the IRTranslator per-target to directly -emit target-specific MI. -However, we instead decided to keep the core pipeline simple, and focus on -minimizing the overhead of the passes in the no-op cases. - - -.. _irtranslator: - -IRTranslator ------------- - -This pass translates the input LLVM IR ``Function`` to a GMIR -``MachineFunction``. - -``TODO``: -This currently doesn't support the more complex instructions, in particular -those involving control flow (``switch``, ``invoke``, ...). -For ``switch`` in particular, we can initially use the ``LowerSwitch`` pass. - -.. 
_api-calllowering: - -API: CallLowering -^^^^^^^^^^^^^^^^^ - -The ``IRTranslator`` (using the ``CallLowering`` target-provided utility) also -implements the ABI's calling convention by lowering calls, returns, and -arguments to the appropriate physical register usage and instruction sequences. - -.. _irtranslator-aggregates: - -Aggregates -^^^^^^^^^^ - -Aggregates are lowered to a single scalar vreg. -This differs from SelectionDAG's multiple vregs via ``GetValueVTs``. - -``TODO``: -As some of the bits are undef (padding), we should consider augmenting the -representation with additional metadata (in effect, caching computeKnownBits -information on vregs). -See `PR26161 <http://llvm.org/PR26161>`_: [GlobalISel] Value to vreg during -IR to MachineInstr translation for aggregate type - -.. _irtranslator-constants: - -Constant Lowering -^^^^^^^^^^^^^^^^^ - -The ``IRTranslator`` lowers ``Constant`` operands into uses of gvregs defined -by ``G_CONSTANT`` or ``G_FCONSTANT`` instructions. -Currently, these instructions are always emitted in the entry basic block. -In a ``MachineFunction``, each ``Constant`` is materialized by a single gvreg. - -This is beneficial as it allows us to fold constants into immediate operands -during :ref:`instructionselect`, while still avoiding redundant materializations -for expensive non-foldable constants. -However, this can lead to unnecessary spills and reloads in an -O0 pipeline, as -these vregs can have long live ranges. - -``TODO``: -We're investigating better placement of these instructions, in fast and -optimized modes. - - -.. _milegalizer: - -Legalizer ---------- - -This pass transforms the generic machine instructions such that they are legal. - -A legal instruction is defined as: - -* **selectable** --- the target will later be able to select it to a - target-specific (non-generic) instruction. 
- -* operating on **vregs that can be loaded and stored** -- if necessary, the - target can select a ``G_LOAD``/``G_STORE`` of each gvreg operand. - -As opposed to SelectionDAG, there are no legalization phases. In particular, -'type' and 'operation' legalization are not separate. - -Legalization is iterative, and all state is contained in GMIR. To maintain the -validity of the intermediate code, instructions are introduced: - -* ``G_MERGE_VALUES`` --- concatenate multiple registers of the same - size into a single wider register. - -* ``G_UNMERGE_VALUES`` --- extract multiple registers of the same size - from a single wider register. - -* ``G_EXTRACT`` --- extract a simple register (as contiguous sequences of bits) - from a single wider register. - -As they are expected to be temporary byproducts of the legalization process, -they are combined at the end of the :ref:`milegalizer` pass. -If any remain, they are expected to always be selectable, using loads and stores -if necessary. - -The legality of an instruction may only depend on the instruction itself and -must not depend on any context in which the instruction is used. However, after -deciding that an instruction is not legal, using the context of the instruction -to decide how to legalize the instruction is permitted. As an example, if we -have a ``G_FOO`` instruction of the form:: - - %1:_(s32) = G_CONSTANT i32 1 - %2:_(s32) = G_FOO %0:_(s32), %1:_(s32) - -it's impossible to say that G_FOO is legal iff %1 is a ``G_CONSTANT`` with -value ``1``. However, the following:: - - %2:_(s32) = G_FOO %0:_(s32), i32 1 - -can say that it's legal iff operand 2 is an immediate with value ``1`` because -that information is entirely contained within the single instruction. - -.. 
_api-legalizerinfo: - -API: LegalizerInfo -^^^^^^^^^^^^^^^^^^ - -The recommended [#legalizer-legacy-footnote]_ API looks like this:: - - getActionDefinitionsBuilder({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SHL}) - .legalFor({s32, s64, v2s32, v4s32, v2s64}) - .clampScalar(0, s32, s64) - .widenScalarToNextPow2(0) - .clampNumElements(0, v2s32, v4s32) - .clampNumElements(0, v2s64, v2s64) - .moreElementsToNextPow2(0); - -and describes a set of rules by which we can either declare an instruction legal -or decide which action to take to make it more legal. - -At the core of this ruleset is the ``LegalityQuery`` which describes the -instruction. We use a description rather than the instruction to both allow other -passes to determine legality without having to create an instruction and also to -limit the information available to the predicates to that which is safe to rely -on. Currently, the information available to the predicates that determine -legality contains: - -* The opcode for the instruction - -* The type of each type index (see ``type0``, ``type1``, etc.) - -* The size in bytes and atomic ordering for each MachineMemOperand - -Rule Processing and Declaring Rules -""""""""""""""""""""""""""""""""""" - -The ``getActionDefinitionsBuilder`` function generates a ruleset for the given -opcode(s) that rules can be added to. If multiple opcodes are given, they are -all permanently bound to the same ruleset. The rules in a ruleset are executed -from top to bottom and will start again from the top if an instruction is -legalized as a result of the rules. If the ruleset is exhausted without -satisfying any rule, then it is considered unsupported. - -When it doesn't declare the instruction legal, each pass over the rules may -request that one type changes to another type. 
Sometimes this can cause multiple -types to change but we avoid this as much as possible as making multiple changes -can make it difficult to avoid infinite loops where, for example, narrowing one -type causes another to be too small and widening that type causes the first one -to be too big. - -In general, it's advisable to declare instructions legal as close to the top of -the rule as possible and to place any expensive rules as low as possible. This -helps with performance as testing for legality happens more often than -legalization and legalization can require multiple passes over the rules. - -As a concrete example, consider the rule:: - - getActionDefinitionsBuilder({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SHL}) - .legalFor({s32, s64, v2s32, v4s32, v2s64}) - .clampScalar(0, s32, s64) - .widenScalarToNextPow2(0); - -and the instruction:: - - %2:_(s7) = G_ADD %0:_(s7), %1:_(s7) - -this doesn't meet the predicate for the :ref:`.legalFor() <legalfor>` as ``s7`` -is not one of the listed types so it falls through to the -:ref:`.clampScalar() <clampscalar>`. It does meet the predicate for this rule -as the type is smaller than the ``s32`` and this rule instructs the legalizer -to change type 0 to ``s32``. It then restarts from the top. This time it does -satisfy ``.legalFor()`` and the resulting output is:: - - %3:_(s32) = G_ANYEXT %0:_(s7) - %4:_(s32) = G_ANYEXT %1:_(s7) - %5:_(s32) = G_ADD %3:_(s32), %4:_(s32) - %2:_(s7) = G_TRUNC %5:_(s32) - -where the ``G_ADD`` is legal and the other instructions are scheduled for -processing by the legalizer. - -Rule Actions -"""""""""""" - -There are various rule factories that append rules to a ruleset but they have a -few actions in common: - -.. _legalfor: - -* ``legalIf()``, ``legalFor()``, etc. declare an instruction to be legal if the - predicate is satisfied. - -* ``narrowScalarIf()``, ``narrowScalarFor()``, etc. 
declare an instruction to be illegal - if the predicate is satisfied and indicates that narrowing the scalars in one - of the types to a specific type would make it more legal. This action supports - both scalars and vectors. - -* ``widenScalarIf()``, ``widenScalarFor()``, etc. declare an instruction to be illegal - if the predicate is satisfied and indicates that widening the scalars in one - of the types to a specific type would make it more legal. This action supports - both scalars and vectors. - -* ``fewerElementsIf()``, ``fewerElementsFor()``, etc. declare an instruction to be - illegal if the predicate is satisfied and indicates reducing the number of - vector elements in one of the types to a specific type would make it more - legal. This action supports vectors. - -* ``moreElementsIf()``, ``moreElementsFor()``, etc. declare an instruction to be illegal - if the predicate is satisfied and indicates increasing the number of vector - elements in one of the types to a specific type would make it more legal. - This action supports vectors. - -* ``lowerIf()``, ``lowerFor()``, etc. declare an instruction to be illegal if the - predicate is satisfied and indicates that replacing it with equivalent - instruction(s) would make it more legal. Support for this action differs for - each opcode. - -* ``libcallIf()``, ``libcallFor()``, etc. declare an instruction to be illegal if the - predicate is satisfied and indicates that replacing it with a libcall would - make it more legal. Support for this action differs for - each opcode. - -* ``customIf()``, ``customFor()``, etc. declare an instruction to be illegal if the - predicate is satisfied and indicates that the backend developer will supply - a means of making it more legal. - -* ``unsupportedIf()``, ``unsupportedFor()``, etc. declare an instruction to be illegal - if the predicate is satisfied and indicates that there is no way to make it - legal and the compiler should fail. 
- -* ``fallback()`` falls back on an older API and should only be used while porting - existing code from that API. - -Rule Predicates -""""""""""""""" - -The rule factories also have predicates in common: - -* ``legal()``, ``lower()``, etc. are always satisfied. - -* ``legalIf()``, ``narrowScalarIf()``, etc. are satisfied if the user-supplied - ``LegalityPredicate`` function returns true. This predicate has access to the - information in the ``LegalityQuery`` to make its decision. - User-supplied predicates can also be combined using ``all(P0, P1, ...)``. - -* ``legalFor()``, ``narrowScalarFor()``, etc. are satisfied if the type matches one in - a given set of types. For example ``.legalFor({s16, s32})`` declares the - instruction legal if type 0 is either s16 or s32. Additional versions for two - and three type indices are generally available. For these, all the type - indices considered together must match all the types in one of the tuples. So - ``.legalFor({{s16, s32}, {s32, s64}})`` will only accept ``{s16, s32}``, or - ``{s32, s64}`` but will not accept ``{s16, s64}``. - -* ``legalForTypesWithMemSize()``, ``narrowScalarForTypesWithMemSize()``, etc. are - similar to ``legalFor()``, ``narrowScalarFor()``, etc. but additionally require a - MachineMemOperand to have a given size in each tuple. - -* ``legalForCartesianProduct()``, ``narrowScalarForCartesianProduct()``, etc. are - satisfied if each type index matches one element in each of the independent - sets. So ``.legalForCartesianProduct({s16, s32}, {s32, s64})`` will accept - ``{s16, s32}``, ``{s16, s64}``, ``{s32, s32}``, and ``{s32, s64}``. - -Composite Rules -""""""""""""""" - -There are some composite rules for common situations built out of the above facilities: - -* ``widenScalarToNextPow2()`` is like ``widenScalarIf()`` but is satisfied iff the type - size in bits is not a power of 2 and selects a target type that is the next - largest power of 2. - -.. 
_clampscalar: - -* ``minScalar()`` is like ``widenScalarIf()`` but is satisfied iff the type - size in bits is smaller than the given minimum and selects the minimum as the - target type. Similarly, there is also a ``maxScalar()`` for the maximum and a - ``clampScalar()`` to do both at once. - -* ``minScalarSameAs()`` is like ``minScalar()`` but the minimum is taken from another - type index. - -* ``moreElementsToNextMultiple()`` is like ``moreElementsToNextPow2()`` but is based on - multiples of X rather than powers of 2. - -Other Information -""""""""""""""""" - -``TODO``: -An alternative worth investigating is to generalize the API to represent -actions using ``std::function`` that implements the action, instead of explicit -enum tokens (``Legal``, ``WidenScalar``, ...). - -``TODO``: -Moreover, we could use TableGen to initially infer legality of operation from -existing patterns (as any pattern we can select is by definition legal). -Expanding that to describe legalization actions is a much larger but -potentially useful project. - -.. rubric:: Footnotes - -.. [#legalizer-legacy-footnote] An API is broadly similar to - SelectionDAG/TargetLowering is available but is not recommended as a more - powerful API is available. - -.. _min-legalizerinfo: - -Minimum Rule Set -^^^^^^^^^^^^^^^^ - -GlobalISel's legalizer has a great deal of flexibility in how a given target -shapes the GMIR that the rest of the backend must handle. However, there are -a small number of requirements that all targets must meet. - -Before discussing the minimum requirements, we'll need some terminology: - -Producer Type Set - The set of types which is the union of all possible types produced by at - least one legal instruction. - -Consumer Type Set - The set of types which is the union of all possible types consumed by at - least one legal instruction. - -Both sets are often identical but there's no guarantee of that. 
For example, -it's not uncommon to be unable to consume s64 but still be able to produce it -for a few specific instructions. - -Minimum Rules For Scalars -""""""""""""""""""""""""" - -* G_ANYEXT must be legal for all inputs from the producer type set and all larger - outputs from the consumer type set. -* G_TRUNC must be legal for all inputs from the producer type set and all - smaller outputs from the consumer type set. - -G_ANYEXT, and G_TRUNC have mandatory legality since the GMIR requires a means to -connect operations with different type sizes. They are usually trivial to support -since G_ANYEXT doesn't define the value of the additional bits and G_TRUNC is -discarding bits. The other conversions can be lowered into G_ANYEXT/G_TRUNC -with some additional operations that are subject to further legalization. For -example, G_SEXT can lower to:: - - %1 = G_ANYEXT %0 - %2 = G_CONSTANT ... - %3 = G_SHL %1, %2 - %4 = G_ASHR %3, %2 - -and the G_CONSTANT/G_SHL/G_ASHR can further lower to other operations or target -instructions. Similarly, G_FPEXT has no legality requirement since it can lower -to a G_ANYEXT followed by a target instruction. - -G_MERGE_VALUES and G_UNMERGE_VALUES do not have legality requirements since the -former can lower to G_ANYEXT and some other legalizable instructions, while the -latter can lower to some legalizable instructions followed by G_TRUNC. - -Minimum Legality For Vectors -"""""""""""""""""""""""""""" - -Within the vector types, there aren't any defined conversions in LLVM IR as -vectors are often converted by reinterpreting the bits or by decomposing the -vector and reconstituting it as a different type. As such, G_BITCAST is the -only operation to account for. We generally don't require that it's legal -because it can usually be lowered to COPY (or to nothing using -replaceAllUses()). However, there are situations where G_BITCAST is non-trivial -(e.g. 
little-endian vectors of big-endian data such as on big-endian MIPS MSA and -big-endian ARM NEON, see `_i_bitcast`). To account for this G_BITCAST must be -legal for all type combinations that change the bit pattern in the value. - -There are no legality requirements for G_BUILD_VECTOR, or G_BUILD_VECTOR_TRUNC -since these can be handled by: -* Declaring them legal. -* Scalarizing them. -* Lowering them to G_TRUNC+G_ANYEXT and some legalizable instructions. -* Lowering them to target instructions which are legal by definition. - -The same reasoning also allows G_UNMERGE_VALUES to lack legality requirements -for vector inputs. - -Minimum Legality for Pointers -""""""""""""""""""""""""""""" - -There are no minimum rules for pointers since G_INTTOPTR and G_PTRTOINT can -be selected to a COPY from register class to another by the legalizer. - -Minimum Legality For Operations -""""""""""""""""""""""""""""""" - -The rules for G_ANYEXT, G_MERGE_VALUES, G_BITCAST, G_BUILD_VECTOR, -G_BUILD_VECTOR_TRUNC, G_CONCAT_VECTORS, G_UNMERGE_VALUES, G_PTRTOINT, and -G_INTTOPTR have already been noted above. In addition to those, the following -operations have requirements: - -* At least one G_IMPLICIT_DEF must be legal. This is usually trivial as it - requires no code to be selected. -* G_PHI must be legal for all types in the producer and consumer typesets. This - is usually trivial as it requires no code to be selected. -* At least one G_FRAME_INDEX must be legal -* At least one G_BLOCK_ADDR must be legal - -There are many other operations you'd expect to have legality requirements but -they can be lowered to target instructions which are legal by definition. - -.. _regbankselect: - -RegBankSelect -------------- - -This pass constrains the :ref:`gmir-gvregs` operands of generic -instructions to some :ref:`gmir-regbank`. - -It iteratively maps instructions to a set of per-operand bank assignment. 
-The possible mappings are determined by the target-provided -:ref:`RegisterBankInfo <api-registerbankinfo>`. -The mapping is then applied, possibly introducing ``COPY`` instructions if -necessary. - -It traverses the ``MachineFunction`` top down so that all operands are already -mapped when analyzing an instruction. - -This pass could also remap target-specific instructions when beneficial. -In the future, this could replace the ExeDepsFix pass, as we can directly -select the best variant for an instruction that's available on multiple banks. - -.. _api-registerbankinfo: - -API: RegisterBankInfo -^^^^^^^^^^^^^^^^^^^^^ - -The ``RegisterBankInfo`` class describes multiple aspects of register banks. - -* **Banks**: ``addRegBankCoverage`` --- which register bank covers each - register class. - -* **Cross-Bank Copies**: ``copyCost`` --- the cost of a ``COPY`` from one bank - to another. - -* **Default Mapping**: ``getInstrMapping`` --- the default bank assignments for - a given instruction. - -* **Alternative Mapping**: ``getInstrAlternativeMapping`` --- the other - possible bank assignments for a given instruction. - -``TODO``: -All this information should eventually be static and generated by TableGen, -mostly using existing information augmented by bank descriptions. - -``TODO``: -``getInstrMapping`` is currently separate from ``getInstrAlternativeMapping`` -because the latter is more expensive: as we move to static mapping info, -both methods should be free, and we should merge them. - -.. _regbankselect-modes: - -RegBankSelect Modes -^^^^^^^^^^^^^^^^^^^ - -``RegBankSelect`` currently has two modes: - -* **Fast** --- For each instruction, pick a target-provided "default" bank - assignment. This is the default at -O0. - -* **Greedy** --- For each instruction, pick the cheapest of several - target-provided bank assignment alternatives. 
- -We intend to eventually introduce an additional optimizing mode: - -* **Global** --- Across multiple instructions, pick the cheapest combination of - bank assignments. - -``NOTE``: -On AArch64, we are considering using the Greedy mode even at -O0 (or perhaps at -backend -O1): because :ref:`gmir-llt` doesn't distinguish floating point from -integer scalars, the default assignment for loads and stores is the integer -bank, introducing cross-bank copies on most floating point operations. - - -.. _instructionselect: - -InstructionSelect ------------------ - -This pass transforms generic machine instructions into equivalent -target-specific instructions. It traverses the ``MachineFunction`` bottom-up, -selecting uses before definitions, enabling trivial dead code elimination. - -.. _api-instructionselector: - -API: InstructionSelector -^^^^^^^^^^^^^^^^^^^^^^^^ - -The target implements the ``InstructionSelector`` class, containing the -target-specific selection logic proper. - -The instance is provided by the subtarget, so that it can specialize the -selector by subtarget feature (with, e.g., a vector selector overriding parts -of a general-purpose common selector). -We might also want to parameterize it by MachineFunction, to enable selector -variants based on function attributes like optsize. - -The simple API consists of: - - .. code-block:: c++ - - virtual bool select(MachineInstr &MI) - -This target-provided method is responsible for mutating (or replacing) a -possibly-generic MI into a fully target-specific equivalent. -It is also responsible for doing the necessary constraining of gvregs into the -appropriate register classes as well as passing through COPY instructions to -the register allocator. - -The ``InstructionSelector`` can fold other instructions into the selected MI, -by walking the use-def chain of the vreg operands. -As GlobalISel is Global, this folding can occur across basic blocks. 
- -SelectionDAG Rule Imports -^^^^^^^^^^^^^^^^^^^^^^^^^ - -TableGen will import SelectionDAG rules and provide the following function to -execute them: - - .. code-block:: c++ - - bool selectImpl(MachineInstr &MI) - -The ``--stats`` option can be used to determine what proportion of rules were -successfully imported. The easiest way to use this is to copy the -``-gen-globalisel`` tablegen command from ``ninja -v`` and modify it. - -Similarly, the ``--warn-on-skipped-patterns`` option can be used to obtain the -reasons that rules weren't imported. This can be used to focus on the most -important rejection reasons. - -PatLeaf Predicates -^^^^^^^^^^^^^^^^^^ - -PatLeafs cannot be imported because their C++ is implemented in terms of -``SDNode`` objects. PatLeafs that handle immediate predicates should be -replaced by ``ImmLeaf``, ``IntImmLeaf``, or ``FPImmLeaf`` as appropriate. - -There's no standard answer for other PatLeafs. Some standard predicates have -been baked into TableGen but this should not generally be done. - -Custom SDNodes -^^^^^^^^^^^^^^ - -Custom SDNodes should be mapped to Target Pseudos using ``GINodeEquiv``. This -will cause the instruction selector to import them but you will also need to -ensure the target pseudo is introduced to the MIR before the instruction -selector. Any preceding pass is suitable but the legalizer will be a -particularly common choice. - -ComplexPatterns -^^^^^^^^^^^^^^^ - -ComplexPatterns cannot be imported because their C++ is implemented in terms of -``SDNode`` objects. GlobalISel versions should be defined with -``GIComplexOperandMatcher`` and mapped to ComplexPattern with -``GIComplexPatternEquiv``. - -The following predicates are useful for porting ComplexPattern: - -* isBaseWithConstantOffset() - Check for base+offset structures -* isOperandImmEqual() - Check for a particular constant -* isObviouslySafeToFold() - Check for reasons an instruction can't be sunk and folded into another. 
- -There are some important points for the C++ implementation: - -* Don't modify MIR in the predicate -* Renderer lambdas should capture by value to avoid use-after-free. They will be used after the predicate returns. -* Only create instructions in a renderer lambda. GlobalISel won't clean up things you create but don't use. - - -.. _maintainability: - -Maintainability -=============== - -.. _maintainability-iterative: - -Iterative Transformations -------------------------- - -Passes are split into small, iterative transformations, with all state -represented in the MIR. - -This differs from SelectionDAG (in particular, the legalizer) using various -in-memory side-tables. - - -.. _maintainability-mir: - -MIR Serialization ------------------ - -.. FIXME: Update the MIRLangRef to include GMI additions. - -:ref:`gmir` is serializable (see :doc:`MIRLangRef`). -Combined with :ref:`maintainability-iterative`, this enables much finer-grained -testing, rather than requiring large and fragile IR-to-assembly tests. - -The current "stage" in the :ref:`pipeline` is represented by a set of -``MachineFunctionProperties``: - -* ``legalized`` -* ``regBankSelected`` -* ``selected`` - - -.. _maintainability-verifier: - -MachineVerifier ---------------- - -The pass approach lets us use the ``MachineVerifier`` to enforce invariants. -For instance, a ``regBankSelected`` function may not have gvregs without -a bank. - -``TODO``: -The ``MachineVerifier`` being monolithic, some of the checks we want to do -can't be integrated to it: GlobalISel is a separate library, so we can't -directly reference it from CodeGen. For instance, legality checks are -currently done in RegBankSelect/InstructionSelect proper. We could #ifdef out -the checks, or we could add some sort of verifier API. - - -.. _progress: - -Progress and Future Work -======================== - -The initial goal is to replace FastISel on AArch64. The next step will be to -replace SelectionDAG as the optimized ISel. 
- -``NOTE``: -While we iterate on GlobalISel, we strive to avoid affecting the performance of -SelectionDAG, FastISel, or the other MIR passes. For instance, the types of -:ref:`gmir-gvregs` are stored in a separate table in ``MachineRegisterInfo``, -that is destroyed after :ref:`instructionselect`. - -.. _progress-fastisel: - -FastISel Replacement --------------------- - -For the initial FastISel replacement, we intend to fallback to SelectionDAG on -selection failures. - -Currently, compile-time of the fast pipeline is within 1.5x of FastISel. -We're optimistic we can get to within 1.1/1.2x, but beating FastISel will be -challenging given the multi-pass approach. -Still, supporting all IR (via a complete legalizer) and avoiding the fallback -to SelectionDAG in the worst case should enable better amortized performance -than SelectionDAG+FastISel. - -``NOTE``: -We considered never having a fallback to SelectionDAG, instead deciding early -whether a given function is supported by GlobalISel or not. The decision would -be based on :ref:`milegalizer` queries. -We abandoned that for two reasons: -a) on IR inputs, we'd need to basically simulate the :ref:`irtranslator`; -b) to be robust against unforeseen failures and to enable iterative -improvements. - -.. _progress-targets: - -Support For Other Targets -------------------------- - -In parallel, we're investigating adding support for other - ideally quite -different - targets. For instance, there is some initial AMDGPU support. - - -.. _porting: - -Porting GlobalISel to A New Target -================================== - -There are four major classes to implement by the target: - -* :ref:`CallLowering <api-calllowering>` --- lower calls, returns, and arguments - according to the ABI. -* :ref:`RegisterBankInfo <api-registerbankinfo>` --- describe - :ref:`gmir-regbank` coverage, cross-bank copy cost, and the mapping of - operands onto banks for each instruction. 
-* :ref:`LegalizerInfo <api-legalizerinfo>` --- describe what is legal, and how - to legalize what isn't. -* :ref:`InstructionSelector <api-instructionselector>` --- select generic MIR - to target-specific MIR. - -Additionally: - -* ``TargetPassConfig`` --- create the passes constituting the pipeline, - including additional passes not included in the :ref:`pipeline`. - -.. _other_resources: - -Resources -========= - -* `Global Instruction Selection - A Proposal by Quentin Colombet @LLVMDevMeeting 2015 <https://www.youtube.com/watch?v=F6GGbYtae3g>`_ -* `Global Instruction Selection - Status by Quentin Colombet, Ahmed Bougacha, and Tim Northover @LLVMDevMeeting 2016 <https://www.youtube.com/watch?v=6tfb344A7w8>`_ -* `GlobalISel - LLVM's Latest Instruction Selection Framework by Diana Picus @FOSDEM17 <https://www.youtube.com/watch?v=d6dF6E4BPeU>`_ -* `GlobalISel: Past, Present, and Future by Quentin Colombet and Ahmed Bougacha @LLVMDevMeeting 2017 <https://www.llvm.org/devmtg/2017-10/#talk11>`_ -* `Head First into GlobalISel by Daniel Sanders, Aditya Nandakumar, and Justin Bogner @LLVMDevMeeting 2017 <https://www.llvm.org/devmtg/2017-10/#tutorial2>`_ -* `Generating Optimized Code with GlobalISel by Volkan Keles, Daniel Sanders @LLVMDevMeeting 2019 <https://www.llvm.org/devmtg/2019-10/talk-abstracts.html#keynote1>`_ diff --git a/llvm/docs/GlobalISel/GMIR.rst b/llvm/docs/GlobalISel/GMIR.rst new file mode 100644 index 00000000000..4eaf039b14b --- /dev/null +++ b/llvm/docs/GlobalISel/GMIR.rst @@ -0,0 +1,158 @@ +.. _gmir: + +Generic Machine IR +================== + +Machine IR operates on physical registers, register classes, and (mostly) +target-specific instructions. + +To bridge the gap with LLVM IR, GlobalISel introduces "generic" extensions to +Machine IR: + +.. contents:: + :local: + +``NOTE``: +The generic MIR (GMIR) representation still contains references to IR +constructs (such as ``GlobalValue``). 
Removing those should let us write more +accurate tests, or delete IR after building the initial MIR. However, it is +not part of the GlobalISel effort. + +.. _gmir-instructions: + +Generic Instructions +-------------------- + +The main addition is support for pre-isel generic machine instructions (e.g., +``G_ADD``). Like other target-independent instructions (e.g., ``COPY`` or +``PHI``), these are available on all targets. + +``TODO``: +While we're progressively adding instructions, one kind in particular exposes +interesting problems: compares and how to represent condition codes. +Some targets (x86, ARM) have generic comparisons setting multiple flags, +which are then used by predicated variants. +Others (IR) specify the predicate in the comparison and users just get a single +bit. SelectionDAG uses SETCC/CONDBR vs BR_CC (and similar for select) to +represent this. + +The ``MachineIRBuilder`` class wraps the ``MachineInstrBuilder`` and provides +a convenient way to create these generic instructions. + +.. _gmir-gvregs: + +Generic Virtual Registers +------------------------- + +Generic instructions operate on a new kind of register: "generic" virtual +registers. As opposed to non-generic vregs, they are not assigned a Register +Class. Instead, generic vregs have a :ref:`gmir-llt`, and can be assigned +a :ref:`gmir-regbank`. + +``MachineRegisterInfo`` tracks the same information that it does for +non-generic vregs (e.g., use-def chains). Additionally, it also tracks the +:ref:`gmir-llt` of the register, and, instead of the ``TargetRegisterClass``, +its :ref:`gmir-regbank`, if any. + +For simplicity, most generic instructions only accept generic vregs: + +* instead of immediates, they use a gvreg defined by an instruction + materializing the immediate value (see :ref:`irtranslator-constants`). +* instead of physical register, they use a gvreg defined by a ``COPY``. 
+ +``NOTE``: +We started with an alternative representation, where MRI tracks a size for +each gvreg, and instructions have lists of types. +That had two flaws: the type and size are redundant, and there was no generic +way of getting a given operand's type (as there was no 1:1 mapping between +instruction types and operands). +We considered putting the type in some variant of MCInstrDesc instead: +See `PR26576 <http://llvm.org/PR26576>`_: [GlobalISel] Generic MachineInstrs +need a type but this increases the memory footprint of the related objects + +.. _gmir-regbank: + +Register Bank +------------- + +A Register Bank is a set of register classes defined by the target. +A bank has a size, which is the maximum store size of all covered classes. + +In general, cross-class copies inside a bank are expected to be cheaper than +copies across banks. They are also coalesceable by the register coalescer, +whereas cross-bank copies are not. + +Also, equivalent operations can be performed on different banks using different +instructions. + +For example, X86 can be seen as having 3 main banks: general-purpose, x87, and +vector (which could be further split into a bank per domain for single vs +double precision instructions). + +Register banks are described by a target-provided API, +:ref:`RegisterBankInfo <api-registerbankinfo>`. + +.. _gmir-llt: + +Low Level Type +-------------- + +Additionally, every generic virtual register has a type, represented by an +instance of the ``LLT`` class. + +Like ``EVT``/``MVT``/``Type``, it has no distinction between unsigned and signed +integer types. Furthermore, it also has no distinction between integer and +floating-point types: it mainly conveys absolutely necessary information, such +as size and number of vector lanes: + +* ``sN`` for scalars +* ``pN`` for pointers +* ``<N x sM>`` for vectors +* ``unsized`` for labels, etc.. + +``LLT`` is intended to replace the usage of ``EVT`` in SelectionDAG. 
+ +Here are some LLT examples and their ``EVT`` and ``Type`` equivalents: + + ============= ========= ====================================== + LLT EVT IR Type + ============= ========= ====================================== + ``s1`` ``i1`` ``i1`` + ``s8`` ``i8`` ``i8`` + ``s32`` ``i32`` ``i32`` + ``s32`` ``f32`` ``float`` + ``s17`` ``i17`` ``i17`` + ``s16`` N/A ``{i8, i8}`` + ``s32`` N/A ``[4 x i8]`` + ``p0`` ``iPTR`` ``i8*``, ``i32*``, ``%opaque*`` + ``p2`` ``iPTR`` ``i8 addrspace(2)*`` + ``<4 x s32>`` ``v4f32`` ``<4 x float>`` + ``s64`` ``v1f64`` ``<1 x double>`` + ``<3 x s32>`` ``v3i32`` ``<3 x i32>`` + ``unsized`` ``Other`` ``label`` + ============= ========= ====================================== + + +Rationale: instructions already encode a specific interpretation of types +(e.g., ``add`` vs. ``fadd``, or ``sdiv`` vs. ``udiv``). Also encoding that +information in the type system requires introducing bitcast with no real +advantage for the selector. + +Pointer types are distinguished by address space. This matches IR, as opposed +to SelectionDAG where address space is an attribute on operations. +This representation better supports pointers having different sizes depending +on their addressspace. + +``NOTE``: +Currently, LLT requires at least 2 elements in vectors, but some targets have +the concept of a '1-element vector'. Representing them as their underlying +scalar type is a nice simplification. + +``TODO``: +Currently, non-generic virtual registers, defined by non-pre-isel-generic +instructions, cannot have a type, and thus cannot be used by a pre-isel generic +instruction. Instead, they are given a type using a COPY. We could relax that +and allow types on all vregs: this would reduce the number of MI required when +emitting target-specific MIR early in the pipeline. This should purely be +a compile-time optimization. 
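The LLT examples in the table above can be mimicked with a small, self-contained sketch. This is only an illustration of the naming rules: `ToyIRType` and `toyLLTName` are hypothetical names invented here, not part of LLVM, and the real `LLT` class is richer than this model. It shows that the float/integer distinction is dropped and that a 1-element vector collapses to its scalar type.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified description of an IR scalar, pointer, or vector
// type. This is NOT an LLVM class; it exists only for this sketch.
struct ToyIRType {
  unsigned scalarBits;   // size of one element in bits (unused for pointers)
  unsigned numElements;  // 1 for scalars
  bool isFloat;          // deliberately ignored when naming the LLT
  bool isPointer;        // pointers are named by address space instead
  unsigned addressSpace; // only meaningful when isPointer is true
};

// Builds the textual LLT name following the sN / pN / <N x sM> scheme.
std::string toyLLTName(const ToyIRType &ty) {
  assert(ty.numElements >= 1 && "a type has at least one element");
  const std::string element =
      ty.isPointer ? "p" + std::to_string(ty.addressSpace)
                   : "s" + std::to_string(ty.scalarBits);
  // A 1-element vector is represented as its underlying scalar type.
  if (ty.numElements == 1)
    return element;
  return "<" + std::to_string(ty.numElements) + " x " + element + ">";
}
```

For instance, under these assumptions both `i32` and `float` map to `s32`, and `<1 x double>` maps to `s64`.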
+ diff --git a/llvm/docs/GlobalISel/IRTranslator.rst b/llvm/docs/GlobalISel/IRTranslator.rst new file mode 100644 index 00000000000..515df56d794 --- /dev/null +++ b/llvm/docs/GlobalISel/IRTranslator.rst @@ -0,0 +1,57 @@ +.. _irtranslator: + +IRTranslator +------------ + +This pass translates the input LLVM IR ``Function`` to a GMIR +``MachineFunction``. + +``TODO``: +This currently doesn't support the more complex instructions, in particular +those involving control flow (``switch``, ``invoke``, ...). +For ``switch`` in particular, we can initially use the ``LowerSwitch`` pass. + +.. _api-calllowering: + +API: CallLowering +^^^^^^^^^^^^^^^^^ + +The ``IRTranslator`` (using the ``CallLowering`` target-provided utility) also +implements the ABI's calling convention by lowering calls, returns, and +arguments to the appropriate physical register usage and instruction sequences. + +.. _irtranslator-aggregates: + +Aggregates +^^^^^^^^^^ + +Aggregates are lowered to a single scalar vreg. +This differs from SelectionDAG's multiple vregs via ``GetValueVTs``. + +``TODO``: +As some of the bits are undef (padding), we should consider augmenting the +representation with additional metadata (in effect, caching computeKnownBits +information on vregs). +See `PR26161 <http://llvm.org/PR26161>`_: [GlobalISel] Value to vreg during +IR to MachineInstr translation for aggregate type + +.. _irtranslator-constants: + +Constant Lowering +^^^^^^^^^^^^^^^^^ + +The ``IRTranslator`` lowers ``Constant`` operands into uses of gvregs defined +by ``G_CONSTANT`` or ``G_FCONSTANT`` instructions. +Currently, these instructions are always emitted in the entry basic block. +In a ``MachineFunction``, each ``Constant`` is materialized by a single gvreg. + +This is beneficial as it allows us to fold constants into immediate operands +during :ref:`instructionselect`, while still avoiding redundant materializations +for expensive non-foldable constants. 
+However, this can lead to unnecessary spills and reloads in an -O0 pipeline, as +these vregs can have long live ranges. + +``TODO``: +We're investigating better placement of these instructions, in fast and +optimized modes. + diff --git a/llvm/docs/GlobalISel/InstructionSelect.rst b/llvm/docs/GlobalISel/InstructionSelect.rst new file mode 100644 index 00000000000..9798ae7a596 --- /dev/null +++ b/llvm/docs/GlobalISel/InstructionSelect.rst @@ -0,0 +1,98 @@ + +.. _instructionselect: + +InstructionSelect +----------------- + +This pass transforms generic machine instructions into equivalent +target-specific instructions. It traverses the ``MachineFunction`` bottom-up, +selecting uses before definitions, enabling trivial dead code elimination. + +.. _api-instructionselector: + +API: InstructionSelector +^^^^^^^^^^^^^^^^^^^^^^^^ + +The target implements the ``InstructionSelector`` class, containing the +target-specific selection logic proper. + +The instance is provided by the subtarget, so that it can specialize the +selector by subtarget feature (with, e.g., a vector selector overriding parts +of a general-purpose common selector). +We might also want to parameterize it by MachineFunction, to enable selector +variants based on function attributes like optsize. + +The simple API consists of: + + .. code-block:: c++ + + virtual bool select(MachineInstr &MI) + +This target-provided method is responsible for mutating (or replacing) a +possibly-generic MI into a fully target-specific equivalent. +It is also responsible for doing the necessary constraining of gvregs into the +appropriate register classes as well as passing through COPY instructions to +the register allocator. + +The ``InstructionSelector`` can fold other instructions into the selected MI, +by walking the use-def chain of the vreg operands. +As GlobalISel is Global, this folding can occur across basic blocks. 
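The bottom-up traversal described for InstructionSelect can be illustrated with a toy model. The sketch below is an assumption-laden simplification: `ToyInstr`, `Outcome`, and `selectBottomUp` are hypothetical names, and it models only why visiting uses before definitions makes dead code elimination trivial. It does not reflect the actual `InstructionSelector` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine instruction; not an LLVM type.
struct ToyInstr {
  std::string opcode;    // e.g. "G_ADD"
  int def;               // vreg defined by this instruction, or -1 for none
  std::vector<int> uses; // vregs read by this instruction
  bool hasSideEffects;   // stores, calls, etc. must never be dropped
};

enum class Outcome { Selected, Erased };

// Walks the block from the last instruction to the first. By the time a
// definition is visited, all of its users have been visited, so a definition
// with no remaining uses is dead and can be erased on the spot.
std::vector<Outcome> selectBottomUp(const std::vector<ToyInstr> &block) {
  std::vector<int> useCount;
  auto bump = [&useCount](int vreg, int delta) {
    assert(vreg >= 0 && "vreg numbers must be non-negative");
    const std::size_t index = static_cast<std::size_t>(vreg);
    if (index >= useCount.size())
      useCount.resize(index + 1, 0);
    useCount[index] += delta;
  };
  for (const ToyInstr &mi : block)
    for (int vreg : mi.uses)
      bump(vreg, +1);

  std::vector<Outcome> outcome(block.size(), Outcome::Selected);
  for (std::size_t i = block.size(); i-- > 0;) {
    const ToyInstr &mi = block[i];
    const std::size_t defIndex = static_cast<std::size_t>(mi.def);
    const bool defUnused =
        mi.def >= 0 &&
        (defIndex >= useCount.size() || useCount[defIndex] == 0);
    if (!mi.hasSideEffects && defUnused) {
      // Dead: erase it and release its operands, which may in turn make
      // earlier definitions dead before they are visited.
      outcome[i] = Outcome::Erased;
      for (int vreg : mi.uses)
        bump(vreg, -1);
    }
  }
  return outcome;
}
```

In this model, erasing an unused `G_ADD` also releases the constant feeding it, so the constant is erased when the walk reaches it, without any separate cleanup pass.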
+ +SelectionDAG Rule Imports +^^^^^^^^^^^^^^^^^^^^^^^^^ + +TableGen will import SelectionDAG rules and provide the following function to +execute them: + + .. code-block:: c++ + + bool selectImpl(MachineInstr &MI) + +The ``--stats`` option can be used to determine what proportion of rules were +successfully imported. The easiest way to use this is to copy the +``-gen-globalisel`` tablegen command from ``ninja -v`` and modify it. + +Similarly, the ``--warn-on-skipped-patterns`` option can be used to obtain the +reasons that rules weren't imported. This can be used to focus on the most +important rejection reasons. + +PatLeaf Predicates +^^^^^^^^^^^^^^^^^^ + +PatLeafs cannot be imported because their C++ is implemented in terms of +``SDNode`` objects. PatLeafs that handle immediate predicates should be +replaced by ``ImmLeaf``, ``IntImmLeaf``, or ``FPImmLeaf`` as appropriate. + +There's no standard answer for other PatLeafs. Some standard predicates have +been baked into TableGen but this should not generally be done. + +Custom SDNodes +^^^^^^^^^^^^^^ + +Custom SDNodes should be mapped to Target Pseudos using ``GINodeEquiv``. This +will cause the instruction selector to import them but you will also need to +ensure the target pseudo is introduced to the MIR before the instruction +selector. Any preceding pass is suitable but the legalizer will be a +particularly common choice. + +ComplexPatterns +^^^^^^^^^^^^^^^ + +ComplexPatterns cannot be imported because their C++ is implemented in terms of +``SDNode`` objects. GlobalISel versions should be defined with +``GIComplexOperandMatcher`` and mapped to ComplexPattern with +``GIComplexPatternEquiv``. + +The following predicates are useful for porting ComplexPattern: + +* isBaseWithConstantOffset() - Check for base+offset structures +* isOperandImmEqual() - Check for a particular constant +* isObviouslySafeToFold() - Check for reasons an instruction can't be sunk and folded into another. 
+ +There are some important points for the C++ implementation: + +* Don't modify MIR in the predicate +* Renderer lambdas should capture by value to avoid use-after-free. They will be used after the predicate returns. +* Only create instructions in a renderer lambda. GlobalISel won't clean up things you create but don't use. + + diff --git a/llvm/docs/GlobalISel/Legalizer.rst b/llvm/docs/GlobalISel/Legalizer.rst new file mode 100644 index 00000000000..d0cb7324c5e --- /dev/null +++ b/llvm/docs/GlobalISel/Legalizer.rst @@ -0,0 +1,351 @@ +.. _milegalizer: + +Legalizer +--------- + +This pass transforms the generic machine instructions such that they are legal. + +A legal instruction is defined as: + +* **selectable** --- the target will later be able to select it to a + target-specific (non-generic) instruction. + +* operating on **vregs that can be loaded and stored** -- if necessary, the + target can select a ``G_LOAD``/``G_STORE`` of each gvreg operand. + +As opposed to SelectionDAG, there are no legalization phases. In particular, +'type' and 'operation' legalization are not separate. + +Legalization is iterative, and all state is contained in GMIR. To maintain the +validity of the intermediate code, instructions are introduced: + +* ``G_MERGE_VALUES`` --- concatenate multiple registers of the same + size into a single wider register. + +* ``G_UNMERGE_VALUES`` --- extract multiple registers of the same size + from a single wider register. + +* ``G_EXTRACT`` --- extract a simple register (as contiguous sequences of bits) + from a single wider register. + +As they are expected to be temporary byproducts of the legalization process, +they are combined at the end of the :ref:`milegalizer` pass. +If any remain, they are expected to always be selectable, using loads and stores +if necessary. + +The legality of an instruction may only depend on the instruction itself and +must not depend on any context in which the instruction is used. 
However, after +deciding that an instruction is not legal, using the context of the instruction +to decide how to legalize the instruction is permitted. As an example, if we +have a ``G_FOO`` instruction of the form:: + + %1:_(s32) = G_CONSTANT i32 1 + %2:_(s32) = G_FOO %0:_(s32), %1:_(s32) + +it's impossible to say that G_FOO is legal iff %1 is a ``G_CONSTANT`` with +value ``1``. However, the following:: + + %2:_(s32) = G_FOO %0:_(s32), i32 1 + +can say that it's legal iff operand 2 is an immediate with value ``1`` because +that information is entirely contained within the single instruction. + +.. _api-legalizerinfo: + +API: LegalizerInfo +^^^^^^^^^^^^^^^^^^ + +The recommended [#legalizer-legacy-footnote]_ API looks like this:: + + getActionDefinitionsBuilder({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SHL}) + .legalFor({s32, s64, v2s32, v4s32, v2s64}) + .clampScalar(0, s32, s64) + .widenScalarToNextPow2(0) + .clampNumElements(0, v2s32, v4s32) + .clampNumElements(0, v2s64, v2s64) + .moreElementsToNextPow2(0); + +and describes a set of rules by which we can either declare an instruction legal +or decide which action to take to make it more legal. + +At the core of this ruleset is the ``LegalityQuery`` which describes the +instruction. We use a description rather than the instruction to both allow other +passes to determine legality without having to create an instruction and also to +limit the information available to the predicates to that which is safe to rely +on. Currently, the information available to the predicates that determine +legality contains: + +* The opcode for the instruction + +* The type of each type index (see ``type0``, ``type1``, etc.) + +* The size in bytes and atomic ordering for each MachineMemOperand + +Rule Processing and Declaring Rules +""""""""""""""""""""""""""""""""""" + +The ``getActionDefinitionsBuilder`` function generates a ruleset for the given +opcode(s) that rules can be added to. 
If multiple opcodes are given, they are +all permanently bound to the same ruleset. The rules in a ruleset are executed +from top to bottom and will start again from the top if an instruction is +legalized as a result of the rules. If the ruleset is exhausted without +satisfying any rule, then it is considered unsupported. + +When it doesn't declare the instruction legal, each pass over the rules may +request that one type changes to another type. Sometimes this can cause multiple +types to change but we avoid this as much as possible as making multiple changes +can make it difficult to avoid infinite loops where, for example, narrowing one +type causes another to be too small and widening that type causes the first one +to be too big. + +In general, it's advisable to declare instructions legal as close to the top of +the rule as possible and to place any expensive rules as low as possible. This +helps with performance as testing for legality happens more often than +legalization and legalization can require multiple passes over the rules. + +As a concrete example, consider the rule:: + + getActionDefinitionsBuilder({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SHL}) + .legalFor({s32, s64, v2s32, v4s32, v2s64}) + .clampScalar(0, s32, s64) + .widenScalarToNextPow2(0); + +and the instruction:: + + %2:_(s7) = G_ADD %0:_(s7), %1:_(s7) + +this doesn't meet the predicate for the :ref:`.legalFor() <legalfor>` as ``s7`` +is not one of the listed types so it falls through to the +:ref:`.clampScalar() <clampscalar>`. It does meet the predicate for this rule +as the type is smaller than the ``s32`` and this rule instructs the legalizer +to change type 0 to ``s32``. It then restarts from the top. 
This time it does +satisfy ``.legalFor()`` and the resulting output is:: + + %3:_(s32) = G_ANYEXT %0:_(s7) + %4:_(s32) = G_ANYEXT %1:_(s7) + %5:_(s32) = G_ADD %3:_(s32), %4:_(s32) + %2:_(s7) = G_TRUNC %5:_(s32) + +where the ``G_ADD`` is legal and the other instructions are scheduled for +processing by the legalizer. + +Rule Actions +"""""""""""" + +There are various rule factories that append rules to a ruleset but they have a +few actions in common: + +.. _legalfor: + +* ``legalIf()``, ``legalFor()``, etc. declare an instruction to be legal if the + predicate is satisfied. + +* ``narrowScalarIf()``, ``narrowScalarFor()``, etc. declare an instruction to be illegal + if the predicate is satisfied and indicates that narrowing the scalars in one + of the types to a specific type would make it more legal. This action supports + both scalars and vectors. + +* ``widenScalarIf()``, ``widenScalarFor()``, etc. declare an instruction to be illegal + if the predicate is satisfied and indicates that widening the scalars in one + of the types to a specific type would make it more legal. This action supports + both scalars and vectors. + +* ``fewerElementsIf()``, ``fewerElementsFor()``, etc. declare an instruction to be + illegal if the predicate is satisfied and indicates reducing the number of + vector elements in one of the types to a specific type would make it more + legal. This action supports vectors. + +* ``moreElementsIf()``, ``moreElementsFor()``, etc. declare an instruction to be illegal + if the predicate is satisfied and indicates increasing the number of vector + elements in one of the types to a specific type would make it more legal. + This action supports vectors. + +* ``lowerIf()``, ``lowerFor()``, etc. declare an instruction to be illegal if the + predicate is satisfied and indicates that replacing it with equivalent + instruction(s) would make it more legal. Support for this action differs for + each opcode. + +* ``libcallIf()``, ``libcallFor()``, etc. 
declare an instruction to be illegal if the + predicate is satisfied and indicates that replacing it with a libcall would + make it more legal. Support for this action differs for + each opcode. + +* ``customIf()``, ``customFor()``, etc. declare an instruction to be illegal if the + predicate is satisfied and indicates that the backend developer will supply + a means of making it more legal. + +* ``unsupportedIf()``, ``unsupportedFor()``, etc. declare an instruction to be illegal + if the predicate is satisfied and indicates that there is no way to make it + legal and the compiler should fail. + +* ``fallback()`` falls back on an older API and should only be used while porting + existing code from that API. + +Rule Predicates +""""""""""""""" + +The rule factories also have predicates in common: + +* ``legal()``, ``lower()``, etc. are always satisfied. + +* ``legalIf()``, ``narrowScalarIf()``, etc. are satisfied if the user-supplied + ``LegalityPredicate`` function returns true. This predicate has access to the + information in the ``LegalityQuery`` to make its decision. + User-supplied predicates can also be combined using ``all(P0, P1, ...)``. + +* ``legalFor()``, ``narrowScalarFor()``, etc. are satisfied if the type matches one in + a given set of types. For example ``.legalFor({s16, s32})`` declares the + instruction legal if type 0 is either s16 or s32. Additional versions for two + and three type indices are generally available. For these, all the type + indices considered together must match all the types in one of the tuples. So + ``.legalFor({{s16, s32}, {s32, s64}})`` will only accept ``{s16, s32}``, or + ``{s32, s64}`` but will not accept ``{s16, s64}``. + +* ``legalForTypesWithMemSize()``, ``narrowScalarForTypesWithMemSize()``, etc. are + similar to ``legalFor()``, ``narrowScalarFor()``, etc. but additionally require a + MachineMemOperand to have a given size in each tuple. + +* ``legalForCartesianProduct()``, ``narrowScalarForCartesianProduct()``, etc. 
are + satisfied if each type index matches one element in each of the independent + sets. So ``.legalForCartesianProduct({s16, s32}, {s32, s64})`` will accept + ``{s16, s32}``, ``{s16, s64}``, ``{s32, s32}``, and ``{s32, s64}``. + +Composite Rules +""""""""""""""" + +There are some composite rules for common situations built out of the above facilities: + +* ``widenScalarToNextPow2()`` is like ``widenScalarIf()`` but is satisfied iff the type + size in bits is not a power of 2 and selects a target type that is the next + largest power of 2. + +.. _clampscalar: + +* ``minScalar()`` is like ``widenScalarIf()`` but is satisfied iff the type + size in bits is smaller than the given minimum and selects the minimum as the + target type. Similarly, there is also a ``maxScalar()`` for the maximum and a + ``clampScalar()`` to do both at once. + +* ``minScalarSameAs()`` is like ``minScalar()`` but the minimum is taken from another + type index. + +* ``moreElementsToNextMultiple()`` is like ``moreElementsToNextPow2()`` but is based on + multiples of X rather than powers of 2. + +Other Information +""""""""""""""""" + +``TODO``: +An alternative worth investigating is to generalize the API to represent +actions using ``std::function`` that implements the action, instead of explicit +enum tokens (``Legal``, ``WidenScalar``, ...). + +``TODO``: +Moreover, we could use TableGen to initially infer legality of operations from +existing patterns (as any pattern we can select is by definition legal). +Expanding that to describe legalization actions is a much larger but +potentially useful project. + +.. rubric:: Footnotes + +.. [#legalizer-legacy-footnote] An API broadly similar to + SelectionDAG/TargetLowering is available, but it is not recommended because a + more powerful API exists. + +.. _min-legalizerinfo: + +Minimum Rule Set +^^^^^^^^^^^^^^^^ + +GlobalISel's legalizer has a great deal of flexibility in how a given target +shapes the GMIR that the rest of the backend must handle.
However, there are +a small number of requirements that all targets must meet. + +Before discussing the minimum requirements, we'll need some terminology: + +Producer Type Set + The set of types which is the union of all possible types produced by at + least one legal instruction. + +Consumer Type Set + The set of types which is the union of all possible types consumed by at + least one legal instruction. + +Both sets are often identical but there's no guarantee of that. For example, +it's not uncommon to be unable to consume s64 but still be able to produce it +for a few specific instructions. + +Minimum Rules For Scalars +""""""""""""""""""""""""" + +* G_ANYEXT must be legal for all inputs from the producer type set and all larger + outputs from the consumer type set. +* G_TRUNC must be legal for all inputs from the producer type set and all + smaller outputs from the consumer type set. + +G_ANYEXT, and G_TRUNC have mandatory legality since the GMIR requires a means to +connect operations with different type sizes. They are usually trivial to support +since G_ANYEXT doesn't define the value of the additional bits and G_TRUNC is +discarding bits. The other conversions can be lowered into G_ANYEXT/G_TRUNC +with some additional operations that are subject to further legalization. For +example, G_SEXT can lower to:: + + %1 = G_ANYEXT %0 + %2 = G_CONSTANT ... + %3 = G_SHL %1, %2 + %4 = G_ASHR %3, %2 + +and the G_CONSTANT/G_SHL/G_ASHR can further lower to other operations or target +instructions. Similarly, G_FPEXT has no legality requirement since it can lower +to a G_ANYEXT followed by a target instruction. + +G_MERGE_VALUES and G_UNMERGE_VALUES do not have legality requirements since the +former can lower to G_ANYEXT and some other legalizable instructions, while the +latter can lower to some legalizable instructions followed by G_TRUNC. 
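The G_SEXT lowering shown above (G_ANYEXT, then G_SHL, then G_ASHR) can be checked with plain integer arithmetic. The following is a minimal sketch that assumes a 32-bit destination; `ashr32` and `sextViaShifts` are hypothetical helper names, and the `garbage` parameter stands in for the unspecified high bits that G_ANYEXT is allowed to produce.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right on a 32-bit pattern, written with unsigned
// arithmetic so the result does not depend on implementation-defined
// behaviour of signed shifts.
uint32_t ashr32(uint32_t value, unsigned amount) {
  assert(amount > 0 && amount < 32 && "shift amount out of range");
  uint32_t shifted = value >> amount;
  if (value & 0x80000000u)
    shifted |= ~(~0u >> amount); // replicate the sign bit into the top bits
  return shifted;
}

// Models sign-extending a `fromBits`-wide value to 32 bits using only the
// operations named in the lowering. Returns the resulting s32 bit pattern.
uint32_t sextViaShifts(uint32_t narrow, unsigned fromBits, uint32_t garbage) {
  assert(fromBits > 0 && fromBits < 32 && "narrow type must be under 32 bits");
  const uint32_t lowMask = (1u << fromBits) - 1u;
  // %1 = G_ANYEXT %0: the low bits hold the value, the high bits are
  // unspecified (modelled here by `garbage`).
  const uint32_t anyext = (narrow & lowMask) | (garbage & ~lowMask);
  // %2 = G_CONSTANT: the shift amount is the difference in widths.
  const unsigned shift = 32u - fromBits;
  // %3 = G_SHL %1, %2: pushes the unspecified bits out of the register.
  const uint32_t shl = anyext << shift;
  // %4 = G_ASHR %3, %2: brings the value back, filling with the sign bit.
  return ashr32(shl, shift);
}
```

Whatever value is passed as `garbage`, the result depends only on the low `fromBits` bits of the input, which is why G_ANYEXT is sufficient as the first step.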
+ +Minimum Legality For Vectors +"""""""""""""""""""""""""""" + +Within the vector types, there aren't any defined conversions in LLVM IR as +vectors are often converted by reinterpreting the bits or by decomposing the +vector and reconstituting it as a different type. As such, G_BITCAST is the +only operation to account for. We generally don't require that it's legal +because it can usually be lowered to COPY (or to nothing using +replaceAllUses()). However, there are situations where G_BITCAST is non-trivial +(e.g. little-endian vectors of big-endian data such as on big-endian MIPS MSA and +big-endian ARM NEON, see `_i_bitcast`). To account for this, G_BITCAST must be +legal for all type combinations that change the bit pattern in the value. + +There are no legality requirements for G_BUILD_VECTOR or G_BUILD_VECTOR_TRUNC +since these can be handled by: +* Declaring them legal. +* Scalarizing them. +* Lowering them to G_TRUNC+G_ANYEXT and some legalizable instructions. +* Lowering them to target instructions which are legal by definition. + +The same reasoning also allows G_UNMERGE_VALUES to lack legality requirements +for vector inputs. + +Minimum Legality for Pointers +""""""""""""""""""""""""""""" + +There are no minimum rules for pointers since G_INTTOPTR and G_PTRTOINT can +be selected to a COPY from one register class to another by the legalizer. + +Minimum Legality For Operations +""""""""""""""""""""""""""""""" + +The rules for G_ANYEXT, G_MERGE_VALUES, G_BITCAST, G_BUILD_VECTOR, +G_BUILD_VECTOR_TRUNC, G_CONCAT_VECTORS, G_UNMERGE_VALUES, G_PTRTOINT, and +G_INTTOPTR have already been noted above. In addition to those, the following +operations have requirements: + +* At least one G_IMPLICIT_DEF must be legal. This is usually trivial as it + requires no code to be selected. +* G_PHI must be legal for all types in the producer and consumer typesets. This + is usually trivial as it requires no code to be selected.
+* At least one G_FRAME_INDEX must be legal.
+* At least one G_BLOCK_ADDR must be legal.
+
+There are many other operations you'd expect to have legality requirements but
+they can be lowered to target instructions which are legal by definition.
diff --git a/llvm/docs/GlobalISel/Pipeline.rst b/llvm/docs/GlobalISel/Pipeline.rst
new file mode 100644
index 00000000000..2f9bc93c9b5
--- /dev/null
+++ b/llvm/docs/GlobalISel/Pipeline.rst
@@ -0,0 +1,82 @@
+.. _pipeline:
+
+Core Pipeline
+=============
+
+There are four required passes, regardless of the optimization mode:
+
+.. toctree::
+  :maxdepth: 1
+
+  IRTranslator
+  Legalizer
+  RegBankSelect
+  InstructionSelect
+
+Additional passes can then be inserted at higher optimization levels or for
+specific targets. For example, to match the current SelectionDAG set of
+transformations: MachineCSE and a better MachineCombiner between every pass.
+
+``NOTE``:
+In theory, not all passes are always necessary.
+As an additional compile-time optimization, we could skip some of the passes by
+setting the relevant MachineFunction properties. For instance, if the
+IRTranslator did not encounter any illegal instruction, it would set the
+``legalized`` property to avoid running the :ref:`milegalizer`.
+Similarly, we considered specializing the IRTranslator per-target to directly
+emit target-specific MI.
+However, we instead decided to keep the core pipeline simple, and focus on
+minimizing the overhead of the passes in the no-op cases.
+
+.. _maintainability-verifier:
+
+MachineVerifier
+---------------
+
+The pass approach lets us use the ``MachineVerifier`` to enforce invariants.
+For instance, a ``regBankSelected`` function may not have gvregs without
+a bank.
+
+``TODO``:
+Because the ``MachineVerifier`` is monolithic, some of the checks we want to do
+can't be integrated into it: GlobalISel is a separate library, so we can't
+directly reference it from CodeGen. For instance, legality checks are
+currently done in RegBankSelect/InstructionSelect proper. We could #ifdef out
+the checks, or we could add some sort of verifier API.
+
+
+.. _maintainability:
+
+Maintainability
+===============
+
+.. _maintainability-iterative:
+
+Iterative Transformations
+-------------------------
+
+Passes are split into small, iterative transformations, with all state
+represented in the MIR.
+
+This differs from SelectionDAG (in particular, the legalizer) using various
+in-memory side-tables.
+
+
+.. _maintainability-mir:
+
+MIR Serialization
+-----------------
+
+.. FIXME: Update the MIRLangRef to include GMI additions.
+
+:ref:`gmir` is serializable (see :doc:`../MIRLangRef`).
+Combined with :ref:`maintainability-iterative`, this enables much finer-grained
+testing, rather than requiring large and fragile IR-to-assembly tests.
+
+The current "stage" in the :ref:`pipeline` is represented by a set of
+``MachineFunctionProperties``:
+
+* ``legalized``
+* ``regBankSelected``
+* ``selected``
+
diff --git a/llvm/docs/GlobalISel/Porting.rst b/llvm/docs/GlobalISel/Porting.rst
new file mode 100644
index 00000000000..f2928639158
--- /dev/null
+++ b/llvm/docs/GlobalISel/Porting.rst
@@ -0,0 +1,21 @@
+.. _porting:
+
+Porting GlobalISel to A New Target
+==================================
+
+There are four major classes to implement by the target:
+
+* :ref:`CallLowering <api-calllowering>` --- lower calls, returns, and arguments
+  according to the ABI.
+* :ref:`RegisterBankInfo <api-registerbankinfo>` --- describe
+  :ref:`gmir-regbank` coverage, cross-bank copy cost, and the mapping of
+  operands onto banks for each instruction.
+* :ref:`LegalizerInfo <api-legalizerinfo>` --- describe what is legal, and how
+  to legalize what isn't.
+* :ref:`InstructionSelector <api-instructionselector>` --- select generic MIR
+  to target-specific MIR.
+
+Additionally:
+
+* ``TargetPassConfig`` --- create the passes constituting the pipeline,
+  including additional passes not included in the :ref:`pipeline`.
diff --git a/llvm/docs/GlobalISel/RegBankSelect.rst b/llvm/docs/GlobalISel/RegBankSelect.rst
new file mode 100644
index 00000000000..2702d689b84
--- /dev/null
+++ b/llvm/docs/GlobalISel/RegBankSelect.rst
@@ -0,0 +1,73 @@
+.. _regbankselect:
+
+RegBankSelect
+-------------
+
+This pass constrains the :ref:`gmir-gvregs` operands of generic
+instructions to some :ref:`gmir-regbank`.
+
+It iteratively maps instructions to a set of per-operand bank assignments.
+The possible mappings are determined by the target-provided
+:ref:`RegisterBankInfo <api-registerbankinfo>`.
+The mapping is then applied, possibly introducing ``COPY`` instructions if
+necessary.
+
+It traverses the ``MachineFunction`` top down so that all operands are already
+mapped when analyzing an instruction.
+
+This pass could also remap target-specific instructions when beneficial.
+In the future, this could replace the ExeDepsFix pass, as we can directly
+select the best variant for an instruction that's available on multiple banks.
+
+.. _api-registerbankinfo:
+
+API: RegisterBankInfo
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``RegisterBankInfo`` class describes multiple aspects of register banks.
+
+* **Banks**: ``addRegBankCoverage`` --- which register bank covers each
+  register class.
+
+* **Cross-Bank Copies**: ``copyCost`` --- the cost of a ``COPY`` from one bank
+  to another.
+
+* **Default Mapping**: ``getInstrMapping`` --- the default bank assignments for
+  a given instruction.
+
+* **Alternative Mapping**: ``getInstrAlternativeMapping`` --- the other
+  possible bank assignments for a given instruction.
+
+``TODO``:
+All this information should eventually be static and generated by TableGen,
+mostly using existing information augmented by bank descriptions.
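To make the interplay between the default mapping, the alternative mappings, and the cross-bank copy cost more concrete, here is a minimal Python sketch. It is a toy model and not the LLVM API: the ``Mapping`` type, the bank names, the numeric costs, and the helper functions are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """A hypothetical per-operand bank assignment with a base cost."""
    banks: tuple  # bank name for each operand, e.g. ("FPR", "FPR")
    cost: int     # cost of the instruction itself under this assignment


def copy_cost(src_bank, dst_bank):
    """Toy stand-in for a cross-bank copy cost: free within a bank,
    an assumed cost of 2 across banks."""
    return 0 if src_bank == dst_bank else 2


def total_cost(mapping, current_banks):
    """Mapping cost plus the COPYs needed to move each operand into place."""
    copies = sum(copy_cost(cur, want)
                 for cur, want in zip(current_banks, mapping.banks))
    return mapping.cost + copies


def pick_cheapest(default, alternatives, current_banks):
    """Choose the cheapest candidate; ties keep the default mapping."""
    return min([default, *alternatives],
               key=lambda m: total_cost(m, current_banks))
```

For example, with a default of ``Mapping(("GPR", "GPR"), 1)`` and an alternative of ``Mapping(("FPR", "FPR"), 1)``, operands that already live in ``FPR`` make the alternative cheaper because it avoids two cross-bank copies.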
+
+``TODO``:
+``getInstrMapping`` is currently separate from ``getInstrAlternativeMapping``
+because the latter is more expensive: as we move to static mapping info,
+both methods should be free, and we should merge them.
+
+.. _regbankselect-modes:
+
+RegBankSelect Modes
+^^^^^^^^^^^^^^^^^^^
+
+``RegBankSelect`` currently has two modes:
+
+* **Fast** --- For each instruction, pick a target-provided "default" bank
+  assignment. This is the default at -O0.
+
+* **Greedy** --- For each instruction, pick the cheapest of several
+  target-provided bank assignment alternatives.
+
+We intend to eventually introduce an additional optimizing mode:
+
+* **Global** --- Across multiple instructions, pick the cheapest combination of
+  bank assignments.
+
+``NOTE``:
+On AArch64, we are considering using the Greedy mode even at -O0 (or perhaps at
+backend -O1): because :ref:`gmir-llt` doesn't distinguish floating point from
+integer scalars, the default assignment for loads and stores is the integer
+bank, introducing cross-bank copies on most floating point operations.
+
diff --git a/llvm/docs/GlobalISel/Resources.rst b/llvm/docs/GlobalISel/Resources.rst
new file mode 100644
index 00000000000..6ac35d241aa
--- /dev/null
+++ b/llvm/docs/GlobalISel/Resources.rst
@@ -0,0 +1,11 @@
+.. _other_resources:
+
+Resources
+=========
+
+* `Global Instruction Selection - A Proposal by Quentin Colombet @LLVMDevMeeting 2015 <https://www.youtube.com/watch?v=F6GGbYtae3g>`_
+* `Global Instruction Selection - Status by Quentin Colombet, Ahmed Bougacha, and Tim Northover @LLVMDevMeeting 2016 <https://www.youtube.com/watch?v=6tfb344A7w8>`_
+* `GlobalISel - LLVM's Latest Instruction Selection Framework by Diana Picus @FOSDEM17 <https://www.youtube.com/watch?v=d6dF6E4BPeU>`_
+* `GlobalISel: Past, Present, and Future by Quentin Colombet and Ahmed Bougacha @LLVMDevMeeting 2017 <https://www.llvm.org/devmtg/2017-10/#talk11>`_
+* `Head First into GlobalISel by Daniel Sanders, Aditya Nandakumar, and Justin Bogner @LLVMDevMeeting 2017 <https://www.llvm.org/devmtg/2017-10/#tutorial2>`_
+* `Generating Optimized Code with GlobalISel by Volkan Keles, Daniel Sanders @LLVMDevMeeting 2019 <https://www.llvm.org/devmtg/2019-10/talk-abstracts.html#keynote1>`_
diff --git a/llvm/docs/GlobalISel/index.rst b/llvm/docs/GlobalISel/index.rst
new file mode 100644
index 00000000000..f620bc643bb
--- /dev/null
+++ b/llvm/docs/GlobalISel/index.rst
@@ -0,0 +1,94 @@
+============================
+Global Instruction Selection
+============================
+
+.. warning::
+  This document is a work in progress. It reflects the current state of the
+  implementation, as well as open design and implementation issues.
+
+.. contents::
+  :local:
+  :depth: 1
+
+Introduction
+============
+
+GlobalISel is a framework that provides a set of reusable passes and utilities
+for instruction selection --- translation from LLVM IR to target-specific
+Machine IR (MIR).
+
+GlobalISel is intended to be a replacement for SelectionDAG and FastISel, to
+solve three major problems:
+
+* **Performance** --- SelectionDAG introduces a dedicated intermediate
+  representation, which has a compile-time cost.
+
+  GlobalISel directly operates on the post-isel representation used by the
+  rest of the code generator, MIR.
+  It does require extensions to that representation to support arbitrary
+  incoming IR: :ref:`gmir`.
+
+* **Granularity** --- SelectionDAG and FastISel operate on individual basic
+  blocks, losing some global optimization opportunities.
+
+  GlobalISel operates on the whole function.
+
+* **Modularity** --- SelectionDAG and FastISel are radically different and share
+  very little code.
+
+  GlobalISel is built in a way that enables code reuse. For instance, both the
+  optimized and fast selectors share the :ref:`pipeline`, and targets can
+  configure that pipeline to better suit their needs.
+
+Design and Implementation Reference
+===================================
+
+More information on the design and implementation of GlobalISel can be found in
+the following sections.
+
+.. toctree::
+  :maxdepth: 1
+
+  GMIR
+  Pipeline
+  Porting
+  Resources
+
+
+.. _progress:
+
+Progress and Future Work
+========================
+
+The initial goal is to replace FastISel on AArch64. The next step will be to
+replace SelectionDAG as the optimized ISel.
+
+``NOTE``:
+While we iterate on GlobalISel, we strive to avoid affecting the performance of
+SelectionDAG, FastISel, or the other MIR passes. For instance, the types of
+:ref:`gmir-gvregs` are stored in a separate table in ``MachineRegisterInfo``,
+that is destroyed after :ref:`instructionselect`.
+
+.. _progress-fastisel:
+
+FastISel Replacement
+--------------------
+
+For the initial FastISel replacement, we intend to fall back to SelectionDAG on
+selection failures.
+
+Currently, compile-time of the fast pipeline is within 1.5x of FastISel.
+We're optimistic we can get to within 1.1/1.2x, but beating FastISel will be
+challenging given the multi-pass approach.
+Still, supporting all IR (via a complete legalizer) and avoiding the fallback
+to SelectionDAG in the worst case should enable better amortized performance
+than SelectionDAG+FastISel.
+
+``NOTE``:
+We considered never having a fallback to SelectionDAG, instead deciding early
+whether a given function is supported by GlobalISel or not. The decision would
+be based on :ref:`milegalizer` queries.
+We abandoned that for two reasons:
+a) on IR inputs, we'd need to basically simulate the :ref:`irtranslator`;
+b) to be robust against unforeseen failures and to enable iterative
+improvements.
diff --git a/llvm/docs/Reference.rst b/llvm/docs/Reference.rst
index 3fbceba1181..4c421a20927 100644
--- a/llvm/docs/Reference.rst
+++ b/llvm/docs/Reference.rst
@@ -23,7 +23,7 @@ LLVM and API reference documentation.
    FuzzingLLVM
    GarbageCollection
    GetElementPtr
-   GlobalISel
+   GlobalISel/index
    GwpAsan
    HowToSetUpLLVMStyleRTTI
    HowToUseAttributes
@@ -125,7 +125,7 @@ LLVM IR
    A reference manual for the MIR serialization format, which is used to test
    LLVM's code generation passes.

-:doc:`GlobalISel`
+:doc:`GlobalISel/index`
    This describes the prototype instruction selection replacement, GlobalISel.

 =====================
@@ -208,4 +208,4 @@ Additional Topics
    LLVM support for coroutines.

 :doc:`YamlIO`
-   A reference guide for using LLVM's YAML I/O library.
\ No newline at end of file
+   A reference guide for using LLVM's YAML I/O library.