author: Daniel Sanders <daniel_l_sanders@apple.com> 2019-11-05 15:10:00 -0800
committer: Daniel Sanders <daniel_l_sanders@apple.com> 2019-11-05 15:16:43 -0800
commit: ad0dfb0a25344027948a675ec15a6793e44b6463
tree: cd772954a85ccac464eff1c3db2186d526a839f4
parent: 041f35c468088d315bae6c2a71ec901a12cca1b5
[globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference
Summary:
Rework the GMIR documentation to focus more on the end user than the implementation and tie it in to the MIR document. There was also some out-of-date information which has been removed.

The quality of the GenericOpcode reference is highly variable and drops sharply as I worked through them all but we've got to start somewhere :-). It would be great if others could expand on this too as there is an awful lot to get through.

Also fix a typo in the definition of G_FLOG. Previously, the comments said we had two base-2's (G_FLOG and G_FLOG2).

Reviewers: aemerson, volkan, rovka, arsenm
Reviewed By: rovka
Subscribers: wdng, arphaman, jfb, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69545
-rw-r--r-- llvm/docs/GlobalISel/GMIR.rst 207
-rw-r--r-- llvm/docs/GlobalISel/GenericOpcode.rst 658
-rw-r--r-- llvm/docs/GlobalISel/index.rst 1
-rw-r--r-- llvm/docs/LangRef.rst 2
-rw-r--r-- llvm/docs/MIRLangRef.rst 4
-rw-r--r-- llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h 5
-rw-r--r-- llvm/include/llvm/Target/GenericOpcodes.td 2
7 files changed, 803 insertions, 76 deletions
diff --git a/llvm/docs/GlobalISel/GMIR.rst b/llvm/docs/GlobalISel/GMIR.rst
index 4eaf039b14b..fead6de771f 100644
--- a/llvm/docs/GlobalISel/GMIR.rst
+++ b/llvm/docs/GlobalISel/GMIR.rst
@@ -3,38 +3,35 @@
Generic Machine IR
==================
-Machine IR operates on physical registers, register classes, and (mostly)
-target-specific instructions.
-
-To bridge the gap with LLVM IR, GlobalISel introduces "generic" extensions to
-Machine IR:
-
.. contents::
:local:
-``NOTE``:
-The generic MIR (GMIR) representation still contains references to IR
-constructs (such as ``GlobalValue``). Removing those should let us write more
-accurate tests, or delete IR after building the initial MIR. However, it is
-not part of the GlobalISel effort.
+Generic MIR (gMIR) is an intermediate representation that shares the same data
+structures as :doc:`MachineIR (MIR) <../MIRLangRef>` but has more relaxed
+constraints. As the compilation pipeline proceeds, these constraints are
+gradually tightened until gMIR has become MIR.
+
+The rest of this document will assume that you are familiar with the concepts
+in :doc:`MachineIR (MIR) <../MIRLangRef>` and will highlight the differences
+between MIR and gMIR.
.. _gmir-instructions:
-Generic Instructions
---------------------
+Generic Machine Instructions
+----------------------------
+
+.. note::
-The main addition is support for pre-isel generic machine instructions (e.g.,
-``G_ADD``). Like other target-independent instructions (e.g., ``COPY`` or
-``PHI``), these are available on all targets.
+ This section expands on :ref:`mir-instructions` from the MIR Language
+ Reference.
-``TODO``:
-While we're progressively adding instructions, one kind in particular exposes
-interesting problems: compares and how to represent condition codes.
-Some targets (x86, ARM) have generic comparisons setting multiple flags,
-which are then used by predicated variants.
-Others (IR) specify the predicate in the comparison and users just get a single
-bit. SelectionDAG uses SETCC/CONDBR vs BR_CC (and similar for select) to
-represent this.
+Whereas MIR deals largely in Target Instructions and only has a small set of
+target independent opcodes such as ``COPY``, ``PHI``, and ``REG_SEQUENCE``,
+gMIR defines a rich collection of ``Generic Opcodes`` which are target
+independent and describe operations which are typically supported by targets.
+One example is ``G_ADD`` which is the generic opcode for an integer addition.
+More information on each of the generic opcodes can be found at
+:doc:`GenericOpcode`.
The ``MachineIRBuilder`` class wraps the ``MachineInstrBuilder`` and provides
a convenient way to create these generic instructions.
@@ -44,50 +41,109 @@ a convenient way to create these generic instructions.
Generic Virtual Registers
-------------------------
-Generic instructions operate on a new kind of register: "generic" virtual
-registers. As opposed to non-generic vregs, they are not assigned a Register
-Class. Instead, generic vregs have a :ref:`gmir-llt`, and can be assigned
-a :ref:`gmir-regbank`.
+.. note::
-``MachineRegisterInfo`` tracks the same information that it does for
-non-generic vregs (e.g., use-def chains). Additionally, it also tracks the
-:ref:`gmir-llt` of the register, and, instead of the ``TargetRegisterClass``,
-its :ref:`gmir-regbank`, if any.
+ This section expands on :ref:`mir-registers` from the MIR Language
+ Reference.
-For simplicity, most generic instructions only accept generic vregs:
+Generic virtual registers are like virtual registers but they are not assigned a
+Register Class constraint. Instead, generic virtual registers have less strict
+constraints starting with a :ref:`gmir-llt` and then further constrained to a
+:ref:`gmir-regbank`. Eventually they will be constrained to a register class
+at which point they become normal virtual registers.
-* instead of immediates, they use a gvreg defined by an instruction
- materializing the immediate value (see :ref:`irtranslator-constants`).
-* instead of physical register, they use a gvreg defined by a ``COPY``.
+Generic virtual registers can be used with all the virtual register APIs
+provided by ``MachineRegisterInfo``. In particular, the def-use chain APIs can
+be used without needing to distinguish them from non-generic virtual registers.
-``NOTE``:
-We started with an alternative representation, where MRI tracks a size for
-each gvreg, and instructions have lists of types.
-That had two flaws: the type and size are redundant, and there was no generic
-way of getting a given operand's type (as there was no 1:1 mapping between
-instruction types and operands).
-We considered putting the type in some variant of MCInstrDesc instead:
-See `PR26576 <http://llvm.org/PR26576>`_: [GlobalISel] Generic MachineInstrs
-need a type but this increases the memory footprint of the related objects
+For simplicity, most generic instructions only accept virtual registers (both
+generic and non-generic). There are some exceptions to this but in general:
-.. _gmir-regbank:
+* instead of immediates, they use a generic virtual register defined by an
+ instruction that materializes the immediate value (see
+ :ref:`irtranslator-constants`). Typically this is a G_CONSTANT or a
+ G_FCONSTANT. One example of an exception to this rule is G_SEXT_INREG where
+ having an immediate is mandatory.
+* instead of physical registers, they use a generic virtual register that is
+ either defined by a ``COPY`` from the physical register or used by a ``COPY``
+ that defines the physical register.
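+
+As an illustrative sketch (the AArch64 register ``$w0`` and the constant are
+assumptions chosen for the example, not taken from a real test), adding an
+immediate to a value that arrives in a physical register might look like:
+
+.. code-block:: none
+
+ %0:_(s32) = COPY $w0
+ %1:_(s32) = G_CONSTANT i32 42
+ %2:_(s32) = G_ADD %0:_(s32), %1:_(s32)
+ $w0 = COPY %2:_(s32)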
-Register Bank
--------------
+.. admonition:: Historical Note
-A Register Bank is a set of register classes defined by the target.
-A bank has a size, which is the maximum store size of all covered classes.
+ We started with an alternative representation, where MRI tracks a size for
+ each generic virtual register, and instructions have lists of types.
+ That had two flaws: the type and size are redundant, and there was no generic
+ way of getting a given operand's type (as there was no 1:1 mapping between
+ instruction types and operands).
+ We considered putting the type in some variant of MCInstrDesc instead (see
+ `PR26576 <http://llvm.org/PR26576>`_: [GlobalISel] Generic MachineInstrs
+ need a type), but this increases the memory footprint of the related objects.
-In general, cross-class copies inside a bank are expected to be cheaper than
-copies across banks. They are also coalesceable by the register coalescer,
-whereas cross-bank copies are not.
+.. _gmir-regbank:
-Also, equivalent operations can be performed on different banks using different
-instructions.
+Register Bank
+-------------
-For example, X86 can be seen as having 3 main banks: general-purpose, x87, and
-vector (which could be further split into a bank per domain for single vs
-double precision instructions).
+A Register Bank is a set of register classes defined by the target. This
+definition is rather loose so let's talk about what they can achieve.
+
+Suppose we have a processor that has two register files, A and B. These are
+equal in every way and support the same instructions for the same cost. They're
+just physically stored apart and each instruction can only access registers from
+A or from B but never a mix of the two. If we want to perform an operation
+on data that's split between the two register files, we must first copy all
+the data into a single register file.
+
+Given a processor like this, we would benefit from clustering related data
+together into one register file so that we minimize the cost of copying data
+back and forth to satisfy the (possibly conflicting) requirements of all the
+instructions. Register Banks are a means to constrain the register allocator to
+use a particular register file for a virtual register.
+
+In practice, register files A and B are rarely equal. They can typically store
+the same data but there are usually some restrictions on what operations you can
+do on each register file. A fairly common pattern is for one of them to be
+accessible to integer operations and the other accessible to floating point
+operations. To accommodate this, let's rename A and B to GPR (general purpose
+registers) and FPR (floating point registers).
+
+We now have some additional constraints that limit us. An operation like G_FMUL
+has to happen in FPR and G_ADD has to happen in GPR. However, even though this
+prescribes a lot of the assignments we still have some freedom. A G_LOAD can
+happen in both GPR and FPR, and which we want depends on who is going to consume
+the loaded data. Similarly, G_FNEG can happen in both GPR and FPR. If we assign
+it to FPR, then we'll use floating point negation. However, if we assign it to
+GPR then we can equivalently G_XOR the sign bit with 1 to invert it.
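+
+A rough sketch of the two choices for ``G_FNEG``, assuming AArch64-style bank
+names (``fpr`` and ``gpr``) and a 32-bit float. The two alternatives are shown
+side by side and the register numbers are illustrative only:
+
+.. code-block:: none
+
+ ; Alternative 1, value assigned to FPR: keep the floating point negation.
+ %1:fpr(s32) = G_FNEG %0:fpr(s32)
+
+ ; Alternative 2, value assigned to GPR: flip the sign bit (bit 31) with an
+ ; integer XOR instead.
+ %2:gpr(s32) = G_CONSTANT i32 -2147483648
+ %1:gpr(s32) = G_XOR %0:gpr(s32), %2:gpr(s32)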
+
+In summary, Register Banks are a means of disambiguating between seemingly
+equivalent choices based on some analysis of the differences when each choice
+is applied in a given context.
+
+To give some concrete examples:
+
+AArch64
+
+ AArch64 has three main banks. GPR for integer operations, FPR for floating
+ point and also for the NEON vector instruction set. The third is CCR and
+ describes the condition code register used for predication.
+
+MIPS
+
+ MIPS has five main banks of which many programs only really use one or two.
+ GPR is the general purpose bank for integer operations. FGR or CP1 is for
+ the floating point operations as well as the MSA vector instructions and a
+ few other application specific extensions. CP0 is for system registers and
+ few programs will use it. CP2 and CP3 are for any application specific
+ coprocessors that may be present in the chip. Arguably, there is also a sixth
+ for the LO and HI registers but these are only used for the result of a few
+ operations and it's of questionable value to model distinctly from GPR.
+
+X86
+
+ X86 can be seen as having 3 main banks: general-purpose, x87, and
+ vector (which could be further split into a bank per domain for single vs
+ double precision instructions). Arguably, there are also a few
+ more potential banks such as one for the AVX512 Mask Registers.
Register banks are described by a target-provided API,
:ref:`RegisterBankInfo <api-registerbankinfo>`.
@@ -108,7 +164,6 @@ as size and number of vector lanes:
* ``sN`` for scalars
* ``pN`` for pointers
* ``<N x sM>`` for vectors
-* ``unsized`` for labels, etc..
``LLT`` is intended to replace the usage of ``EVT`` in SelectionDAG.
@@ -122,14 +177,13 @@ Here are some LLT examples and their ``EVT`` and ``Type`` equivalents:
``s32`` ``i32`` ``i32``
``s32`` ``f32`` ``float``
``s17`` ``i17`` ``i17``
- ``s16`` N/A ``{i8, i8}``
- ``s32`` N/A ``[4 x i8]``
+ ``s16`` N/A ``{i8, i8}`` [#abi-dependent]_
+ ``s32`` N/A ``[4 x i8]`` [#abi-dependent]_
``p0`` ``iPTR`` ``i8*``, ``i32*``, ``%opaque*``
``p2`` ``iPTR`` ``i8 addrspace(2)*``
``<4 x s32>`` ``v4f32`` ``<4 x float>``
``s64`` ``v1f64`` ``<1 x double>``
``<3 x s32>`` ``v3i32`` ``<3 x i32>``
- ``unsized`` ``Other`` ``label``
============= ========= ======================================
@@ -143,16 +197,23 @@ to SelectionDAG where address space is an attribute on operations.
This representation better supports pointers having different sizes depending
on their addressspace.
-``NOTE``:
-Currently, LLT requires at least 2 elements in vectors, but some targets have
-the concept of a '1-element vector'. Representing them as their underlying
-scalar type is a nice simplification.
+.. note::
+
+ .. caution::
+
+ Is this still true? I thought we'd removed the 1-element vector concept.
+ Hypothetically, it could be distinct from a scalar but I think we failed to
+ find a real occurrence.
+
+ Currently, LLT requires at least 2 elements in vectors, but some targets have
+ the concept of a '1-element vector'. Representing them as their underlying
+ scalar type is a nice simplification.
+
+.. rubric:: Footnotes
+
+.. [#abi-dependent] This mapping is ABI dependent. Here we've assumed no additional padding is required.
-``TODO``:
-Currently, non-generic virtual registers, defined by non-pre-isel-generic
-instructions, cannot have a type, and thus cannot be used by a pre-isel generic
-instruction. Instead, they are given a type using a COPY. We could relax that
-and allow types on all vregs: this would reduce the number of MI required when
-emitting target-specific MIR early in the pipeline. This should purely be
-a compile-time optimization.
+Generic Opcode Reference
+------------------------
+The Generic Opcodes that are available are described at :doc:`GenericOpcode`.
diff --git a/llvm/docs/GlobalISel/GenericOpcode.rst b/llvm/docs/GlobalISel/GenericOpcode.rst
new file mode 100644
index 00000000000..3faaa851132
--- /dev/null
+++ b/llvm/docs/GlobalISel/GenericOpcode.rst
@@ -0,0 +1,658 @@
+
+.. _gmir-opcodes:
+
+Generic Opcodes
+===============
+
+.. contents::
+ :local:
+
+.. note::
+
+ This documentation does not yet fully account for vectors. Many of the
+ scalar/integer/floating-point operations can also take vectors.
+
+Constants
+---------
+
+G_IMPLICIT_DEF
+^^^^^^^^^^^^^^
+
+An undefined value.
+
+.. code-block:: none
+
+ %0:_(s32) = G_IMPLICIT_DEF
+
+G_CONSTANT
+^^^^^^^^^^
+
+An integer constant.
+
+.. code-block:: none
+
+ %0:_(s32) = G_CONSTANT i32 1
+
+G_FCONSTANT
+^^^^^^^^^^^
+
+A floating point constant.
+
+.. code-block:: none
+
+ %0:_(s32) = G_FCONSTANT float 1.0
+
+G_FRAME_INDEX
+^^^^^^^^^^^^^
+
+The address of an object in the stack frame.
+
+.. code-block:: none
+
+ %1:_(p0) = G_FRAME_INDEX %stack.0.ptr0
+
+G_GLOBAL_VALUE
+^^^^^^^^^^^^^^
+
+The address of a global value.
+
+.. code-block:: none
+
+ %0(p0) = G_GLOBAL_VALUE @var_local
+
+G_BLOCK_ADDR
+^^^^^^^^^^^^
+
+The address of a basic block.
+
+.. code-block:: none
+
+ %0:_(p0) = G_BLOCK_ADDR blockaddress(@test_blockaddress, %ir-block.block)
+
+Integer Extension and Truncation
+--------------------------------
+
+G_ANYEXT
+^^^^^^^^
+
+Extend the underlying scalar type of an operation, leaving the high bits
+unspecified.
+
+.. code-block:: none
+
+ %1:_(s32) = G_ANYEXT %0:_(s16)
+
+G_SEXT
+^^^^^^
+
+Sign extend the underlying scalar type of an operation, copying the sign bit
+into the newly-created space.
+
+.. code-block:: none
+
+ %1:_(s32) = G_SEXT %0:_(s16)
+
+G_SEXT_INREG
+^^^^^^^^^^^^
+
+Sign extend a value from an arbitrary bit position, copying the sign bit
+into all bits above it. This is equivalent to a shl + ashr pair with an
+appropriate shift amount. $sz is an immediate (MachineOperand::isImm()
+returns true) to allow targets to have some bitwidths legal and others
+lowered. This opcode is particularly useful if the target has sign-extension
+instructions that are cheaper than the constituent shifts as the optimizer is
+able to make decisions on whether it's better to hang on to the G_SEXT_INREG
+or to lower it and optimize the individual shifts.
+
+.. code-block:: none
+
+ %1:_(s32) = G_SEXT_INREG %0:_(s32), 16
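+
+As a sketch of the shl + ashr equivalence (assuming the same 32-bit value and
+a 16-bit sign position, so the shift amount is 32 - 16 = 16), the lowered form
+might look like this; the extra register numbers are illustrative:
+
+.. code-block:: none
+
+ %2:_(s32) = G_CONSTANT i32 16
+ %3:_(s32) = G_SHL %0:_(s32), %2:_(s32)
+ %1:_(s32) = G_ASHR %3:_(s32), %2:_(s32)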
+
+G_ZEXT
+^^^^^^
+
+Zero extend the underlying scalar type of an operation, putting zero bits
+into the newly-created space.
+
+.. code-block:: none
+
+ %1:_(s32) = G_ZEXT %0:_(s16)
+
+G_TRUNC
+^^^^^^^
+
+Truncate the underlying scalar type of an operation. This is equivalent to
+G_EXTRACT for scalar types, but acts elementwise on vectors.
+
+.. code-block:: none
+
+ %1:_(s16) = G_TRUNC %0:_(s32)
+
+Type Conversions
+----------------
+
+G_INTTOPTR
+^^^^^^^^^^
+
+Convert an integer to a pointer.
+
+.. code-block:: none
+
+ %1:_(p0) = G_INTTOPTR %0:_(s32)
+
+G_PTRTOINT
+^^^^^^^^^^
+
+Convert a pointer to an integer.
+
+.. code-block:: none
+
+ %1:_(s32) = G_PTRTOINT %0:_(p0)
+
+G_BITCAST
+^^^^^^^^^
+
+Reinterpret a value as a new type. This is usually done without changing any
+bits but this is not always the case due to a subtlety in the definition of the
+:ref:`LLVM-IR Bitcast Instruction <i_bitcast>`.
+
+.. code-block:: none
+
+ %1:_(s64) = G_BITCAST %0:_(<2 x s32>)
+
+G_ADDRSPACE_CAST
+^^^^^^^^^^^^^^^^
+
+Convert a pointer to an address space to a pointer to another address space.
+
+.. code-block:: none
+
+ %1:_(p1) = G_ADDRSPACE_CAST %0:_(p0)
+
+.. caution::
+
+ :ref:`i_addrspacecast` doesn't mention what happens if the cast is simply
+ invalid (i.e. if the address spaces are disjoint).
+
+Scalar Operations
+-----------------
+
+G_EXTRACT
+^^^^^^^^^
+
+Extract a register of the specified size, starting from the block given by
+index. This will almost certainly be mapped to sub-register COPYs after
+register banks have been selected.
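+
+For illustration (the register numbers and types are assumptions), extracting
+the high 32 bits of a 64-bit value might be written as:
+
+.. code-block:: none
+
+ %1:_(s32) = G_EXTRACT %0:_(s64), 32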
+
+G_INSERT
+^^^^^^^^
+
+Insert a smaller register into a larger one at the specified bit-index.
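+
+For illustration (the register numbers and types are assumptions), inserting a
+32-bit value into the low half of a 64-bit value might be written as:
+
+.. code-block:: none
+
+ %2:_(s64) = G_INSERT %0:_(s64), %1:_(s32), 0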
+
+G_MERGE_VALUES
+^^^^^^^^^^^^^^
+
+Concatenate multiple registers of the same size into a wider register.
+The input operands are always ordered from lowest bits to highest:
+
+.. code-block:: none
+
+ %0:(s32) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8),
+ %bits_16_23:(s8), %bits_24_31:(s8)
+
+G_UNMERGE_VALUES
+^^^^^^^^^^^^^^^^
+
+Extract multiple registers of the specified size, starting from blocks given by
+indexes. This will almost certainly be mapped to sub-register COPYs after
+register banks have been selected.
+The output operands are always ordered from lowest bits to highest:
+
+.. code-block:: none
+
+ %bits_0_7:(s8), %bits_8_15:(s8),
+ %bits_16_23:(s8), %bits_24_31:(s8) = G_UNMERGE_VALUES %0:(s32)
+
+G_BSWAP
+^^^^^^^
+
+Reverse the order of the bytes in a scalar
+
+.. code-block:: none
+
+ %1:_(s32) = G_BSWAP %0:_(s32)
+
+G_BITREVERSE
+^^^^^^^^^^^^
+
+Reverse the order of the bits in a scalar
+
+.. code-block:: none
+
+ %1:_(s32) = G_BITREVERSE %0:_(s32)
+
+Integer Operations
+-------------------
+
+G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SDIV, G_UDIV, G_SREM, G_UREM
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These each perform their respective integer arithmetic on a scalar.
+
+.. code-block:: none
+
+ %2:_(s32) = G_ADD %0:_(s32), %1:_(s32)
+
+G_SHL, G_LSHR, G_ASHR
+^^^^^^^^^^^^^^^^^^^^^
+
+Shift the bits of a scalar left or right inserting zeros (sign-bit for G_ASHR).
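+
+A minimal sketch, assuming 32-bit operands with the shift amount in ``%1``
+(the register numbers are illustrative):
+
+.. code-block:: none
+
+ %2:_(s32) = G_SHL %0:_(s32), %1:_(s32)
+ %3:_(s32) = G_ASHR %0:_(s32), %1:_(s32)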
+
+G_ICMP
+^^^^^^
+
+Perform integer comparison producing non-zero (true) or zero (false). It's
+target specific whether a true value is 1, ~0U, or some other non-zero value.
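+
+A minimal sketch of an equality comparison. The ``s1`` result type and the
+register numbers are assumptions for the example; a target may prefer a wider
+result type:
+
+.. code-block:: none
+
+ %2:_(s1) = G_ICMP intpred(eq), %0:_(s32), %1:_(s32)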
+
+G_SELECT
+^^^^^^^^
+
+Select between two values depending on a zero/non-zero value.
+
+.. code-block:: none
+
+ %5:_(s32) = G_SELECT %4(s1), %6, %2
+
+G_PTR_ADD
+^^^^^^^^^
+
+Add an offset to a pointer measured in addressable units. Addressable units are
+typically bytes but this can vary between targets.
+
+.. code-block:: none
+
+ %2:_(p0) = G_PTR_ADD %0:_(p0), %1:_(s32)
+
+G_PTR_MASK
+^^^^^^^^^^
+
+Zero the least significant N bits of a pointer.
+
+.. code-block:: none
+
+ %1:_(p0) = G_PTR_MASK %0, 3
+
+G_SMIN, G_SMAX, G_UMIN, G_UMAX
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Take the minimum/maximum of two values.
+
+.. code-block:: none
+
+ %5:_(s32) = G_SMIN %6, %2
+
+G_UADDO, G_SADDO, G_USUBO, G_SSUBO, G_SMULO, G_UMULO
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Perform the requested arithmetic and produce a carry output in addition to the
+normal result.
+
+.. code-block:: none
+
+ %3:_(s32), %4:_(s1) = G_UADDO %0, %1
+
+G_UADDE, G_SADDE, G_USUBE, G_SSUBE
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Perform the requested arithmetic and consume a carry input in addition to the
+normal input. Also produce a carry output in addition to the normal result.
+
+.. code-block:: none
+
+ %4:_(s32), %5:_(s1) = G_UADDE %0, %1, %3:_(s1)
+
+G_UMULH, G_SMULH
+^^^^^^^^^^^^^^^^
+
+Multiply two numbers at twice the incoming bit width (unsigned or signed
+respectively) and return the high half of the result.
+
+.. code-block:: none
+
+ %3:_(s32) = G_UMULH %0, %1
+
+G_CTLZ, G_CTTZ, G_CTPOP
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Count leading zeros, trailing zeros, or number of set bits
+
+G_CTLZ_ZERO_UNDEF, G_CTTZ_ZERO_UNDEF
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Count leading zeros or trailing zeros. If the value is zero then the result is
+undefined.
+
+Floating Point Operations
+-------------------------
+
+G_FCMP
+^^^^^^
+
+Perform floating point comparison producing non-zero (true) or zero
+(false). It's target specific whether a true value is 1, ~0U, or some other
+non-zero value.
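+
+A minimal sketch of an ordered-equal comparison on 32-bit floats. The ``s1``
+result type and the register numbers are assumptions for the example:
+
+.. code-block:: none
+
+ %2:_(s1) = G_FCMP floatpred(oeq), %0:_(s32), %1:_(s32)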
+
+G_FNEG
+^^^^^^
+
+Floating point negation
+
+G_FPEXT
+^^^^^^^
+
+Convert a floating point value to a larger type
+
+G_FPTRUNC
+^^^^^^^^^
+
+Convert a floating point value to a narrower type
+
+G_FPTOSI, G_FPTOUI, G_SITOFP, G_UITOFP
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Convert between integer and floating point
+
+G_FABS
+^^^^^^
+
+Take the absolute value of a floating point value
+
+G_FCOPYSIGN
+^^^^^^^^^^^
+
+Copy the value of the first operand, replacing the sign bit with that of the
+second operand.
+
+G_FCANONICALIZE
+^^^^^^^^^^^^^^^
+
+See :ref:`i_intr_llvm_canonicalize`
+
+G_FMINNUM
+^^^^^^^^^
+
+Perform floating-point minimum on two values.
+
+In the case where a single input is a NaN (either signaling or quiet),
+the non-NaN input is returned.
+
+The return value of (FMINNUM 0.0, -0.0) could be either 0.0 or -0.0.
+
+G_FMAXNUM
+^^^^^^^^^
+
+Perform floating-point maximum on two values.
+
+In the case where a single input is a NaN (either signaling or quiet),
+the non-NaN input is returned.
+
+The return value of (FMAXNUM 0.0, -0.0) could be either 0.0 or -0.0.
+
+G_FMINNUM_IEEE
+^^^^^^^^^^^^^^
+
+Perform floating-point minimum on two values, following the IEEE-754 2008
+definition. This differs from FMINNUM in the handling of signaling NaNs. If one
+input is a signaling NaN, returns a quiet NaN.
+
+G_FMAXNUM_IEEE
+^^^^^^^^^^^^^^
+
+Perform floating-point maximum on two values, following the IEEE-754 2008
+definition. This differs from FMAXNUM in the handling of signaling NaNs. If one
+input is a signaling NaN, returns a quiet NaN.
+
+G_FMINIMUM
+^^^^^^^^^^
+
+NaN-propagating minimum that also treats -0.0 as less than 0.0. While
+FMINNUM_IEEE follows IEEE 754-2008 semantics, FMINIMUM follows IEEE 754-2018
+draft semantics.
+
+G_FMAXIMUM
+^^^^^^^^^^
+
+NaN-propagating maximum that also treats -0.0 as less than 0.0. While
+FMAXNUM_IEEE follows IEEE 754-2008 semantics, FMAXIMUM follows IEEE 754-2018
+draft semantics.
+
+G_FADD, G_FSUB, G_FMUL, G_FDIV, G_FREM
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Perform the specified floating point arithmetic.
+
+G_FMA
+^^^^^
+
+Perform a fused multiply-add (i.e. without the intermediate rounding step).
+
+G_FMAD
+^^^^^^
+
+Perform a non-fused multiply-add (i.e. with the intermediate rounding step).
+
+G_FPOW
+^^^^^^
+
+Raise the first operand to the power of the second.
+
+G_FEXP, G_FEXP2
+^^^^^^^^^^^^^^^
+
+Calculate the base-e or base-2 exponential of a value
+
+G_FLOG, G_FLOG2, G_FLOG10
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Calculate the base-e, base-2, or base-10 logarithm of a value respectively.
+
+G_FCEIL, G_FCOS, G_FSIN, G_FSQRT, G_FFLOOR, G_FRINT, G_FNEARBYINT
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These correspond to the standard C functions of the same name.
+
+G_INTRINSIC_TRUNC
+^^^^^^^^^^^^^^^^^
+
+Returns the operand rounded to the nearest integer not larger in magnitude than the operand.
+
+G_INTRINSIC_ROUND
+^^^^^^^^^^^^^^^^^
+
+Returns the operand rounded to the nearest integer.
+
+Vector Specific Operations
+--------------------------
+
+G_CONCAT_VECTORS
+^^^^^^^^^^^^^^^^
+
+Concatenate two vectors to form a longer vector.
+
+G_BUILD_VECTOR, G_BUILD_VECTOR_TRUNC
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Create a vector from multiple scalar registers. No implicit
+conversion is performed (i.e. the result element type must be the
+same as all source operands)
+
+The _TRUNC version truncates the larger operand types to fit the
+destination vector elt type.
+
+G_INSERT_VECTOR_ELT
+^^^^^^^^^^^^^^^^^^^
+
+Insert an element into a vector
+
+G_EXTRACT_VECTOR_ELT
+^^^^^^^^^^^^^^^^^^^^
+
+Extract an element from a vector
+
+G_SHUFFLE_VECTOR
+^^^^^^^^^^^^^^^^
+
+Concatenate two vectors and shuffle the elements according to the mask operand.
+The mask operand should be an IR Constant which exactly matches the
+corresponding mask for the IR shufflevector instruction.
+
+Memory Operations
+-----------------
+
+G_LOAD, G_SEXTLOAD, G_ZEXTLOAD
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Generic load. Expects a MachineMemOperand in addition to explicit
+operands. If the result size is larger than the memory size, the
+high bits are undefined, sign-extended, or zero-extended respectively.
+
+Only G_LOAD is valid if the result is a vector type. If the result is larger
+than the memory size, the high elements are undefined (i.e. this is not a
+per-element, vector anyextload)
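+
+A sketch of the three scalar forms, each producing a 32-bit result from a
+4-, 2-, and 1-byte memory access respectively. The register numbers are
+illustrative and the exact printed form of the memory operand is an
+assumption that may differ between LLVM versions:
+
+.. code-block:: none
+
+ %1:_(s32) = G_LOAD %0:_(p0) :: (load 4)
+ %2:_(s32) = G_SEXTLOAD %0:_(p0) :: (load 2)
+ %3:_(s32) = G_ZEXTLOAD %0:_(p0) :: (load 1)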
+
+G_INDEXED_LOAD
+^^^^^^^^^^^^^^
+
+Generic indexed load. Combines a GEP with a load. $newaddr is set to $base + $offset.
+If $am is 0 (post-indexed), then the value is loaded from $base; if $am is 1 (pre-indexed)
+then the value is loaded from $newaddr.
+
+G_INDEXED_SEXTLOAD
+^^^^^^^^^^^^^^^^^^
+
+Same as G_INDEXED_LOAD except that the load performed is sign-extending, as with G_SEXTLOAD.
+
+G_INDEXED_ZEXTLOAD
+^^^^^^^^^^^^^^^^^^
+
+Same as G_INDEXED_LOAD except that the load performed is zero-extending, as with G_ZEXTLOAD.
+
+G_STORE
+^^^^^^^
+
+Generic store. Expects a MachineMemOperand in addition to explicit operands.
+
+G_INDEXED_STORE
+^^^^^^^^^^^^^^^
+
+Combines a store with a GEP. See description of G_INDEXED_LOAD for indexing behaviour.
+
+G_ATOMIC_CMPXCHG_WITH_SUCCESS
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Generic atomic cmpxchg with internal success check. Expects a
+MachineMemOperand in addition to explicit operands.
+
+G_ATOMIC_CMPXCHG
+^^^^^^^^^^^^^^^^
+
+Generic atomic cmpxchg. Expects a MachineMemOperand in addition to explicit
+operands.
+
+G_ATOMICRMW_XCHG, G_ATOMICRMW_ADD, G_ATOMICRMW_SUB, G_ATOMICRMW_AND, G_ATOMICRMW_NAND, G_ATOMICRMW_OR, G_ATOMICRMW_XOR, G_ATOMICRMW_MAX, G_ATOMICRMW_MIN, G_ATOMICRMW_UMAX, G_ATOMICRMW_UMIN, G_ATOMICRMW_FADD, G_ATOMICRMW_FSUB
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Generic atomicrmw. Expects a MachineMemOperand in addition to explicit
+operands.
+
+G_FENCE
+^^^^^^^
+
+.. caution::
+
+ I couldn't find any documentation on this at the time of writing.
+
+Control Flow
+------------
+
+G_PHI
+^^^^^
+
+Implement the φ node in the SSA graph representing the function.
+
+.. code-block:: none
+
+ %1(s8) = G_PHI %7(s8), %bb.0, %3(s8), %bb.1
+
+G_BR
+^^^^
+
+Unconditional branch
+
+G_BRCOND
+^^^^^^^^
+
+Conditional branch
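+
+A minimal sketch of a conditional branch followed by an unconditional one
+(the block and register numbers are illustrative assumptions):
+
+.. code-block:: none
+
+ G_BRCOND %0:_(s1), %bb.1
+ G_BR %bb.2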
+
+G_BRINDIRECT
+^^^^^^^^^^^^
+
+Indirect branch
+
+G_BRJT
+^^^^^^
+
+Indirect branch to jump table entry
+
+G_JUMP_TABLE
+^^^^^^^^^^^^
+
+.. caution::
+
+ I found no documentation for this instruction at the time of writing.
+
+G_INTRINSIC, G_INTRINSIC_W_SIDE_EFFECTS
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Call an intrinsic
+
+The _W_SIDE_EFFECTS version is considered to have unknown side-effects and
+as such cannot be reordered across other side-effecting instructions.
+
+.. note::
+
+ Unlike SelectionDAG, there is no _VOID variant. Both of these are permitted
+ to have zero, one, or multiple results.
+
+Variadic Arguments
+------------------
+
+G_VASTART
+^^^^^^^^^
+
+.. caution::
+
+ I found no documentation for this instruction at the time of writing.
+
+G_VAARG
+^^^^^^^
+
+.. caution::
+
+ I found no documentation for this instruction at the time of writing.
+
+Other Operations
+----------------
+
+G_DYN_STACKALLOC
+^^^^^^^^^^^^^^^^
+
+Dynamically realign the stack pointer to the specified alignment
+
+.. code-block:: none
+
+ %8:_(p0) = G_DYN_STACKALLOC %7(s64), 32
+
+.. caution::
+
+ What does it mean for the immediate to be 0? It happens in the tests
diff --git a/llvm/docs/GlobalISel/index.rst b/llvm/docs/GlobalISel/index.rst
index 2a3d0ca39b6..78afc1f5a2b 100644
--- a/llvm/docs/GlobalISel/index.rst
+++ b/llvm/docs/GlobalISel/index.rst
@@ -50,6 +50,7 @@ the following sections.
:maxdepth: 1
GMIR
+ GenericOpcode
Pipeline
Porting
Resources
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 96dec84ea70..098e4485629 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -13954,6 +13954,8 @@ Examples
Specialised Arithmetic Intrinsics
---------------------------------
+.. _i_intr_llvm_canonicalize:
+
'``llvm.canonicalize.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/llvm/docs/MIRLangRef.rst b/llvm/docs/MIRLangRef.rst
index c2c14d9db7c..0380bcced5f 100644
--- a/llvm/docs/MIRLangRef.rst
+++ b/llvm/docs/MIRLangRef.rst
@@ -345,6 +345,8 @@ specified in brackets after the block's definition:
``Alignment`` is specified in bytes, and must be a power of two.
+.. _mir-instructions:
+
Machine Instructions
--------------------
@@ -407,6 +409,8 @@ The syntax for bundled instructions is the following:
The first instruction is often a bundle header. The instructions between ``{``
and ``}`` are bundled with the first instruction.
+.. _mir-registers:
+
Registers
---------
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
index 37c301bb4c2..d3c17aee4a2 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
@@ -406,8 +406,9 @@ public:
/// Build and insert \p Res = G_PTR_ADD \p Op0, \p Op1
///
- /// G_PTR_ADD adds \p Op1 bytes to the pointer specified by \p Op0,
- /// storing the resulting pointer in \p Res.
+ /// G_PTR_ADD adds \p Op1 addressable units to the pointer specified by \p Op0,
+ /// storing the resulting pointer in \p Res. Addressable units are typically
+ /// bytes but this can vary between targets.
///
/// \pre setBasicBlock or setMI must have been called.
/// \pre \p Res and \p Op0 must be generic virtual registers with pointer
diff --git a/llvm/include/llvm/Target/GenericOpcodes.td b/llvm/include/llvm/Target/GenericOpcodes.td
index 1afba4499e2..e3037327b64 100644
--- a/llvm/include/llvm/Target/GenericOpcodes.td
+++ b/llvm/include/llvm/Target/GenericOpcodes.td
@@ -670,7 +670,7 @@ def G_FEXP2 : GenericInstruction {
let hasSideEffects = 0;
}
-// Floating point base-2 logarithm of a value.
+// Floating point base-e logarithm of a value.
def G_FLOG : GenericInstruction {
let OutOperandList = (outs type0:$dst);
let InOperandList = (ins type0:$src1);