summaryrefslogtreecommitdiffstats
path: root/Documentation/arm64
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/arm64')
-rw-r--r--Documentation/arm64/booting.rst3
-rw-r--r--Documentation/arm64/cpu-feature-registers.rst35
-rw-r--r--Documentation/arm64/elf_hwcaps.rst90
-rw-r--r--Documentation/arm64/index.rst1
-rw-r--r--Documentation/arm64/kasan-offsets.sh27
-rw-r--r--Documentation/arm64/memory.rst130
-rw-r--r--Documentation/arm64/silicon-errata.rst19
-rw-r--r--Documentation/arm64/tagged-address-abi.rst156
-rw-r--r--Documentation/arm64/tagged-pointers.rst21
9 files changed, 414 insertions, 68 deletions
diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
index d3f3a60fbf25..5d78a6f5b0ae 100644
--- a/Documentation/arm64/booting.rst
+++ b/Documentation/arm64/booting.rst
@@ -213,6 +213,9 @@ Before jumping into the kernel, the following conditions must be met:
- ICC_SRE_EL3.Enable (bit 3) must be initialiased to 0b1.
- ICC_SRE_EL3.SRE (bit 0) must be initialised to 0b1.
+ - ICC_CTLR_EL3.PMHE (bit 6) must be set to the same value across
+ all CPUs the kernel is executing on, and must stay constant
+ for the lifetime of the kernel.
- If the kernel is entered at EL1:
diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
index 2955287e9acc..41937a8091aa 100644
--- a/Documentation/arm64/cpu-feature-registers.rst
+++ b/Documentation/arm64/cpu-feature-registers.rst
@@ -117,6 +117,8 @@ infrastructure:
+------------------------------+---------+---------+
| Name | bits | visible |
+------------------------------+---------+---------+
+ | RNDR | [63-60] | y |
+ +------------------------------+---------+---------+
| TS | [55-52] | y |
+------------------------------+---------+---------+
| FHM | [51-48] | y |
@@ -168,8 +170,15 @@ infrastructure:
+------------------------------+---------+---------+
- 3) MIDR_EL1 - Main ID Register
+ 3) ID_AA64PFR1_EL1 - Processor Feature Register 1
+ +------------------------------+---------+---------+
+ | Name | bits | visible |
+ +------------------------------+---------+---------+
+ | SSBS | [7-4] | y |
+ +------------------------------+---------+---------+
+
+ 4) MIDR_EL1 - Main ID Register
+------------------------------+---------+---------+
| Name | bits | visible |
+------------------------------+---------+---------+
@@ -188,11 +197,21 @@ infrastructure:
as available on the CPU where it is fetched and is not a system
wide safe value.
- 4) ID_AA64ISAR1_EL1 - Instruction set attribute register 1
+ 5) ID_AA64ISAR1_EL1 - Instruction set attribute register 1
+------------------------------+---------+---------+
| Name | bits | visible |
+------------------------------+---------+---------+
+ | I8MM | [55-52] | y |
+ +------------------------------+---------+---------+
+ | DGH | [51-48] | y |
+ +------------------------------+---------+---------+
+ | BF16 | [47-44] | y |
+ +------------------------------+---------+---------+
+ | SB | [39-36] | y |
+ +------------------------------+---------+---------+
+ | FRINTTS | [35-32] | y |
+ +------------------------------+---------+---------+
| GPI | [31-28] | y |
+------------------------------+---------+---------+
| GPA | [27-24] | y |
@@ -210,7 +229,7 @@ infrastructure:
| DPB | [3-0] | y |
+------------------------------+---------+---------+
- 5) ID_AA64MMFR2_EL1 - Memory model feature register 2
+ 6) ID_AA64MMFR2_EL1 - Memory model feature register 2
+------------------------------+---------+---------+
| Name | bits | visible |
@@ -218,15 +237,23 @@ infrastructure:
| AT | [35-32] | y |
+------------------------------+---------+---------+
- 6) ID_AA64ZFR0_EL1 - SVE feature ID register 0
+ 7) ID_AA64ZFR0_EL1 - SVE feature ID register 0
+------------------------------+---------+---------+
| Name | bits | visible |
+------------------------------+---------+---------+
+ | F64MM | [59-56] | y |
+ +------------------------------+---------+---------+
+ | F32MM | [55-52] | y |
+ +------------------------------+---------+---------+
+ | I8MM | [47-44] | y |
+ +------------------------------+---------+---------+
| SM4 | [43-40] | y |
+------------------------------+---------+---------+
| SHA3 | [35-32] | y |
+------------------------------+---------+---------+
+ | BF16 | [23-20] | y |
+ +------------------------------+---------+---------+
| BitPerm | [19-16] | y |
+------------------------------+---------+---------+
| AES | [7-4] | y |
diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
index 91f79529c58c..7dfb97dfe416 100644
--- a/Documentation/arm64/elf_hwcaps.rst
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -119,10 +119,6 @@ HWCAP_LRCPC
HWCAP_DCPOP
Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0001.
-HWCAP2_DCPODP
-
- Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0010.
-
HWCAP_SHA3
Functionality implied by ID_AA64ISAR0_EL1.SHA3 == 0b0001.
@@ -141,6 +137,41 @@ HWCAP_SHA512
HWCAP_SVE
Functionality implied by ID_AA64PFR0_EL1.SVE == 0b0001.
+HWCAP_ASIMDFHM
+ Functionality implied by ID_AA64ISAR0_EL1.FHM == 0b0001.
+
+HWCAP_DIT
+ Functionality implied by ID_AA64PFR0_EL1.DIT == 0b0001.
+
+HWCAP_USCAT
+ Functionality implied by ID_AA64MMFR2_EL1.AT == 0b0001.
+
+HWCAP_ILRCPC
+ Functionality implied by ID_AA64ISAR1_EL1.LRCPC == 0b0010.
+
+HWCAP_FLAGM
+ Functionality implied by ID_AA64ISAR0_EL1.TS == 0b0001.
+
+HWCAP_SSBS
+ Functionality implied by ID_AA64PFR1_EL1.SSBS == 0b0010.
+
+HWCAP_SB
+ Functionality implied by ID_AA64ISAR1_EL1.SB == 0b0001.
+
+HWCAP_PACA
+ Functionality implied by ID_AA64ISAR1_EL1.APA == 0b0001 or
+ ID_AA64ISAR1_EL1.API == 0b0001, as described by
+ Documentation/arm64/pointer-authentication.rst.
+
+HWCAP_PACG
+ Functionality implied by ID_AA64ISAR1_EL1.GPA == 0b0001 or
+ ID_AA64ISAR1_EL1.GPI == 0b0001, as described by
+ Documentation/arm64/pointer-authentication.rst.
+
+HWCAP2_DCPODP
+
+ Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0010.
+
HWCAP2_SVE2
Functionality implied by ID_AA64ZFR0_EL1.SVEVer == 0b0001.
@@ -165,42 +196,45 @@ HWCAP2_SVESM4
Functionality implied by ID_AA64ZFR0_EL1.SM4 == 0b0001.
-HWCAP_ASIMDFHM
- Functionality implied by ID_AA64ISAR0_EL1.FHM == 0b0001.
+HWCAP2_FLAGM2
-HWCAP_DIT
- Functionality implied by ID_AA64PFR0_EL1.DIT == 0b0001.
+ Functionality implied by ID_AA64ISAR0_EL1.TS == 0b0010.
-HWCAP_USCAT
- Functionality implied by ID_AA64MMFR2_EL1.AT == 0b0001.
+HWCAP2_FRINT
-HWCAP_ILRCPC
- Functionality implied by ID_AA64ISAR1_EL1.LRCPC == 0b0010.
+ Functionality implied by ID_AA64ISAR1_EL1.FRINTTS == 0b0001.
-HWCAP_FLAGM
- Functionality implied by ID_AA64ISAR0_EL1.TS == 0b0001.
+HWCAP2_SVEI8MM
-HWCAP2_FLAGM2
+ Functionality implied by ID_AA64ZFR0_EL1.I8MM == 0b0001.
- Functionality implied by ID_AA64ISAR0_EL1.TS == 0b0010.
+HWCAP2_SVEF32MM
-HWCAP_SSBS
- Functionality implied by ID_AA64PFR1_EL1.SSBS == 0b0010.
+ Functionality implied by ID_AA64ZFR0_EL1.F32MM == 0b0001.
-HWCAP_PACA
- Functionality implied by ID_AA64ISAR1_EL1.APA == 0b0001 or
- ID_AA64ISAR1_EL1.API == 0b0001, as described by
- Documentation/arm64/pointer-authentication.rst.
+HWCAP2_SVEF64MM
-HWCAP_PACG
- Functionality implied by ID_AA64ISAR1_EL1.GPA == 0b0001 or
- ID_AA64ISAR1_EL1.GPI == 0b0001, as described by
- Documentation/arm64/pointer-authentication.rst.
+ Functionality implied by ID_AA64ZFR0_EL1.F64MM == 0b0001.
-HWCAP2_FRINT
+HWCAP2_SVEBF16
- Functionality implied by ID_AA64ISAR1_EL1.FRINTTS == 0b0001.
+ Functionality implied by ID_AA64ZFR0_EL1.BF16 == 0b0001.
+
+HWCAP2_I8MM
+
+ Functionality implied by ID_AA64ISAR1_EL1.I8MM == 0b0001.
+
+HWCAP2_BF16
+
+ Functionality implied by ID_AA64ISAR1_EL1.BF16 == 0b0001.
+
+HWCAP2_DGH
+
+ Functionality implied by ID_AA64ISAR1_EL1.DGH == 0b0001.
+
+HWCAP2_RNG
+ Functionality implied by ID_AA64ISAR0_EL1.RNDR == 0b0001.
4. Unused AT_HWCAP bits
-----------------------
diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
index 96b696ba4e6c..5c0c69dc58aa 100644
--- a/Documentation/arm64/index.rst
+++ b/Documentation/arm64/index.rst
@@ -16,6 +16,7 @@ ARM64 Architecture
pointer-authentication
silicon-errata
sve
+ tagged-address-abi
tagged-pointers
.. only:: subproject and html
diff --git a/Documentation/arm64/kasan-offsets.sh b/Documentation/arm64/kasan-offsets.sh
new file mode 100644
index 000000000000..2b7a021db363
--- /dev/null
+++ b/Documentation/arm64/kasan-offsets.sh
@@ -0,0 +1,27 @@
+#!/bin/sh
+
+# Print out the KASAN_SHADOW_OFFSETS required to place the KASAN SHADOW
+# start address at the mid-point of the kernel VA space
+
+print_kasan_offset () {
+ printf "%02d\t" $1
+ printf "0x%08x00000000\n" $(( (0xffffffff & (-1 << ($1 - 1 - 32))) \
+ + (1 << ($1 - 32 - $2)) \
+ - (1 << (64 - 32 - $2)) ))
+}
+
+echo KASAN_SHADOW_SCALE_SHIFT = 3
+printf "VABITS\tKASAN_SHADOW_OFFSET\n"
+print_kasan_offset 48 3
+print_kasan_offset 47 3
+print_kasan_offset 42 3
+print_kasan_offset 39 3
+print_kasan_offset 36 3
+echo
+echo KASAN_SHADOW_SCALE_SHIFT = 4
+printf "VABITS\tKASAN_SHADOW_OFFSET\n"
+print_kasan_offset 48 4
+print_kasan_offset 47 4
+print_kasan_offset 42 4
+print_kasan_offset 39 4
+print_kasan_offset 36 4
diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst
index 464b880fc4b7..02e02175e6f5 100644
--- a/Documentation/arm64/memory.rst
+++ b/Documentation/arm64/memory.rst
@@ -14,6 +14,10 @@ with the 4KB page configuration, allowing 39-bit (512GB) or 48-bit
64KB pages, only 2 levels of translation tables, allowing 42-bit (4TB)
virtual address, are used but the memory layout is the same.
+ARMv8.2 adds optional support for Large Virtual Address space. This is
+only available when running with a 64KB page size and expands the
+number of descriptors in the first level of translation.
+
User addresses have bits 63:48 set to 0 while the kernel addresses have
the same bits set to 1. TTBRx selection is given by bit 63 of the
virtual address. The swapper_pg_dir contains only kernel (global)
@@ -22,40 +26,43 @@ The swapper_pg_dir address is written to TTBR1 and never written to
TTBR0.
-AArch64 Linux memory layout with 4KB pages + 3 levels::
-
- Start End Size Use
- -----------------------------------------------------------------------
- 0000000000000000 0000007fffffffff 512GB user
- ffffff8000000000 ffffffffffffffff 512GB kernel
-
-
-AArch64 Linux memory layout with 4KB pages + 4 levels::
+AArch64 Linux memory layout with 4KB pages + 4 levels (48-bit)::
Start End Size Use
-----------------------------------------------------------------------
0000000000000000 0000ffffffffffff 256TB user
- ffff000000000000 ffffffffffffffff 256TB kernel
-
-
-AArch64 Linux memory layout with 64KB pages + 2 levels::
-
- Start End Size Use
- -----------------------------------------------------------------------
- 0000000000000000 000003ffffffffff 4TB user
- fffffc0000000000 ffffffffffffffff 4TB kernel
-
-
-AArch64 Linux memory layout with 64KB pages + 3 levels::
+ ffff000000000000 ffff7fffffffffff 128TB kernel logical memory map
+ ffff800000000000 ffff9fffffffffff 32TB kasan shadow region
+ ffffa00000000000 ffffa00007ffffff 128MB bpf jit region
+ ffffa00008000000 ffffa0000fffffff 128MB modules
+ ffffa00010000000 fffffdffbffeffff ~93TB vmalloc
+ fffffdffbfff0000 fffffdfffe5f8fff ~998MB [guard region]
+ fffffdfffe5f9000 fffffdfffe9fffff 4124KB fixed mappings
+ fffffdfffea00000 fffffdfffebfffff 2MB [guard region]
+ fffffdfffec00000 fffffdffffbfffff 16MB PCI I/O space
+ fffffdffffc00000 fffffdffffdfffff 2MB [guard region]
+ fffffdffffe00000 ffffffffffdfffff 2TB vmemmap
+ ffffffffffe00000 ffffffffffffffff 2MB [guard region]
+
+
+AArch64 Linux memory layout with 64KB pages + 3 levels (52-bit with HW support)::
Start End Size Use
-----------------------------------------------------------------------
- 0000000000000000 0000ffffffffffff 256TB user
- ffff000000000000 ffffffffffffffff 256TB kernel
-
-
-For details of the virtual kernel memory layout please see the kernel
-booting log.
+ 0000000000000000 000fffffffffffff 4PB user
+ fff0000000000000 fff7ffffffffffff 2PB kernel logical memory map
+ fff8000000000000 fffd9fffffffffff 1440TB [gap]
+ fffda00000000000 ffff9fffffffffff 512TB kasan shadow region
+ ffffa00000000000 ffffa00007ffffff 128MB bpf jit region
+ ffffa00008000000 ffffa0000fffffff 128MB modules
+ ffffa00010000000 fffff81ffffeffff ~88TB vmalloc
+ fffff81fffff0000 fffffc1ffe58ffff ~3TB [guard region]
+ fffffc1ffe590000 fffffc1ffe9fffff 4544KB fixed mappings
+ fffffc1ffea00000 fffffc1ffebfffff 2MB [guard region]
+ fffffc1ffec00000 fffffc1fffbfffff 16MB PCI I/O space
+ fffffc1fffc00000 fffffc1fffdfffff 2MB [guard region]
+ fffffc1fffe00000 ffffffffffdfffff 3968GB vmemmap
+ ffffffffffe00000 ffffffffffffffff 2MB [guard region]
Translation table lookup with 4KB pages::
@@ -83,7 +90,8 @@ Translation table lookup with 64KB pages::
| | | | [15:0] in-page offset
| | | +----------> [28:16] L3 index
| | +--------------------------> [41:29] L2 index
- | +-------------------------------> [47:42] L1 index
+ | +-------------------------------> [47:42] L1 index (48-bit)
+ | [51:42] L1 index (52-bit)
+-------------------------------------------------> [63] TTBR0/1
@@ -96,3 +104,69 @@ ARM64_HARDEN_EL2_VECTORS is selected for particular CPUs.
When using KVM with the Virtualization Host Extensions, no additional
mappings are created, since the host kernel runs directly in EL2.
+
+52-bit VA support in the kernel
+-------------------------------
+If the ARMv8.2-LVA optional feature is present, and we are running
+with a 64KB page size; then it is possible to use 52-bits of address
+space for both userspace and kernel addresses. However, any kernel
+binary that supports 52-bit must also be able to fall back to 48-bit
+at early boot time if the hardware feature is not present.
+
+This fallback mechanism necessitates the kernel .text to be in the
+higher addresses such that they are invariant to 48/52-bit VAs. Due
+to the kasan shadow being a fraction of the entire kernel VA space,
+the end of the kasan shadow must also be in the higher half of the
+kernel VA space for both 48/52-bit. (Switching from 48-bit to 52-bit,
+the end of the kasan shadow is invariant and dependent on ~0UL,
+whilst the start address will "grow" towards the lower addresses).
+
+In order to optimise phys_to_virt and virt_to_phys, the PAGE_OFFSET
+is kept constant at 0xFFF0000000000000 (corresponding to 52-bit),
+this obviates the need for an extra variable read. The physvirt
+offset and vmemmap offsets are computed at early boot to enable
+this logic.
+
+As a single binary will need to support both 48-bit and 52-bit VA
+spaces, the VMEMMAP must be sized large enough for 52-bit VAs and
+also must be sized large enought to accommodate a fixed PAGE_OFFSET.
+
+Most code in the kernel should not need to consider the VA_BITS, for
+code that does need to know the VA size the variables are
+defined as follows:
+
+VA_BITS constant the *maximum* VA space size
+
+VA_BITS_MIN constant the *minimum* VA space size
+
+vabits_actual variable the *actual* VA space size
+
+
+Maximum and minimum sizes can be useful to ensure that buffers are
+sized large enough or that addresses are positioned close enough for
+the "worst" case.
+
+52-bit userspace VAs
+--------------------
+To maintain compatibility with software that relies on the ARMv8.0
+VA space maximum size of 48-bits, the kernel will, by default,
+return virtual addresses to userspace from a 48-bit range.
+
+Software can "opt-in" to receiving VAs from a 52-bit space by
+specifying an mmap hint parameter that is larger than 48-bit.
+
+For example:
+
+.. code-block:: c
+
+ maybe_high_address = mmap(~0UL, size, prot, flags,...);
+
+It is also possible to build a debug kernel that returns addresses
+from a 52-bit space by enabling the following kernel config options:
+
+.. code-block:: sh
+
+ CONFIG_EXPERT=y && CONFIG_ARM64_FORCE_52BIT=y
+
+Note that this option is only intended for debugging applications
+and should not be used in production.
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index 3e57d09246e6..9120e59578dc 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -70,8 +70,12 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A57 | #834220 | ARM64_ERRATUM_834220 |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A57 | #1319537 | ARM64_ERRATUM_1319367 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A72 | #853709 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A72 | #1319367 | ARM64_ERRATUM_1319367 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A73 | #858921 | ARM64_ERRATUM_858921 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A55 | #1024718 | ARM64_ERRATUM_1024718 |
@@ -84,13 +88,22 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A76 | #1463225 | ARM64_ERRATUM_1463225 |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A55 | #1530923 | ARM64_ERRATUM_1530923 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Neoverse-N1 | #1188873,1418040| ARM64_ERRATUM_1418040 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Neoverse-N1 | #1349291 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Neoverse-N1 | #1542419 | ARM64_ERRATUM_1542419 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | MMU-500 | #841119,826419 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
+| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 |
++----------------+-----------------+-----------------+-----------------------------+
+| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_843419 |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX ITS | #22375,24313 | CAVIUM_ERRATUM_22375 |
+----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX ITS | #23144 | CAVIUM_ERRATUM_23144 |
@@ -107,6 +120,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX2 SMMUv3| #126 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+| Cavium | ThunderX2 Core | #219 | CAVIUM_TX2_ERRATUM_219 |
++----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
+----------------+-----------------+-----------------+-----------------------------+
@@ -115,6 +130,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip0{6,7} | #161010701 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+| Hisilicon | Hip0{6,7} | #161010803 | N/A |
++----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip07 | #161600802 | HISILICON_ERRATUM_161600802 |
+----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A |
@@ -122,7 +139,7 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 |
+----------------+-----------------+-----------------+-----------------------------+
-| Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009 |
+| Qualcomm Tech. | Kryo/Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009 |
+----------------+-----------------+-----------------+-----------------------------+
| Qualcomm Tech. | QDF2400 ITS | E0065 | QCOM_QDF2400_ERRATUM_0065 |
+----------------+-----------------+-----------------+-----------------------------+
diff --git a/Documentation/arm64/tagged-address-abi.rst b/Documentation/arm64/tagged-address-abi.rst
new file mode 100644
index 000000000000..d4a85d535bf9
--- /dev/null
+++ b/Documentation/arm64/tagged-address-abi.rst
@@ -0,0 +1,156 @@
+==========================
+AArch64 TAGGED ADDRESS ABI
+==========================
+
+Authors: Vincenzo Frascino <vincenzo.frascino@arm.com>
+ Catalin Marinas <catalin.marinas@arm.com>
+
+Date: 21 August 2019
+
+This document describes the usage and semantics of the Tagged Address
+ABI on AArch64 Linux.
+
+1. Introduction
+---------------
+
+On AArch64 the ``TCR_EL1.TBI0`` bit is set by default, allowing
+userspace (EL0) to perform memory accesses through 64-bit pointers with
+a non-zero top byte. This document describes the relaxation of the
+syscall ABI that allows userspace to pass certain tagged pointers to
+kernel syscalls.
+
+2. AArch64 Tagged Address ABI
+-----------------------------
+
+From the kernel syscall interface perspective and for the purposes of
+this document, a "valid tagged pointer" is a pointer with a potentially
+non-zero top-byte that references an address in the user process address
+space obtained in one of the following ways:
+
+- ``mmap()`` syscall where either:
+
+ - flags have the ``MAP_ANONYMOUS`` bit set or
+ - the file descriptor refers to a regular file (including those
+ returned by ``memfd_create()``) or ``/dev/zero``
+
+- ``brk()`` syscall (i.e. the heap area between the initial location of
+ the program break at process creation and its current location).
+
+- any memory mapped by the kernel in the address space of the process
+ during creation and with the same restrictions as for ``mmap()`` above
+ (e.g. data, bss, stack).
+
+The AArch64 Tagged Address ABI has two stages of relaxation depending
+how the user addresses are used by the kernel:
+
+1. User addresses not accessed by the kernel but used for address space
+ management (e.g. ``mmap()``, ``mprotect()``, ``madvise()``). The use
+ of valid tagged pointers in this context is always allowed.
+
+2. User addresses accessed by the kernel (e.g. ``write()``). This ABI
+ relaxation is disabled by default and the application thread needs to
+ explicitly enable it via ``prctl()`` as follows:
+
+ - ``PR_SET_TAGGED_ADDR_CTRL``: enable or disable the AArch64 Tagged
+ Address ABI for the calling thread.
+
+ The ``(unsigned int) arg2`` argument is a bit mask describing the
+ control mode used:
+
+ - ``PR_TAGGED_ADDR_ENABLE``: enable AArch64 Tagged Address ABI.
+ Default status is disabled.
+
+ Arguments ``arg3``, ``arg4``, and ``arg5`` must be 0.
+
+ - ``PR_GET_TAGGED_ADDR_CTRL``: get the status of the AArch64 Tagged
+ Address ABI for the calling thread.
+
+ Arguments ``arg2``, ``arg3``, ``arg4``, and ``arg5`` must be 0.
+
+ The ABI properties described above are thread-scoped, inherited on
+ clone() and fork() and cleared on exec().
+
+ Calling ``prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0)``
+ returns ``-EINVAL`` if the AArch64 Tagged Address ABI is globally
+ disabled by ``sysctl abi.tagged_addr_disabled=1``. The default
+ ``sysctl abi.tagged_addr_disabled`` configuration is 0.
+
+When the AArch64 Tagged Address ABI is enabled for a thread, the
+following behaviours are guaranteed:
+
+- All syscalls except the cases mentioned in section 3 can accept any
+ valid tagged pointer.
+
+- The syscall behaviour is undefined for invalid tagged pointers: it may
+ result in an error code being returned, a (fatal) signal being raised,
+ or other modes of failure.
+
+- The syscall behaviour for a valid tagged pointer is the same as for
+ the corresponding untagged pointer.
+
+
+A definition of the meaning of tagged pointers on AArch64 can be found
+in Documentation/arm64/tagged-pointers.rst.
+
+3. AArch64 Tagged Address ABI Exceptions
+-----------------------------------------
+
+The following system call parameters must be untagged regardless of the
+ABI relaxation:
+
+- ``prctl()`` other than pointers to user data either passed directly or
+ indirectly as arguments to be accessed by the kernel.
+
+- ``ioctl()`` other than pointers to user data either passed directly or
+ indirectly as arguments to be accessed by the kernel.
+
+- ``shmat()`` and ``shmdt()``.
+
+Any attempt to use non-zero tagged pointers may result in an error code
+being returned, a (fatal) signal being raised, or other modes of
+failure.
+
+4. Example of correct usage
+---------------------------
+.. code-block:: c
+
+ #include <stdlib.h>
+ #include <string.h>
+ #include <unistd.h>
+ #include <sys/mman.h>
+ #include <sys/prctl.h>
+
+ #define PR_SET_TAGGED_ADDR_CTRL 55
+ #define PR_TAGGED_ADDR_ENABLE (1UL << 0)
+
+ #define TAG_SHIFT 56
+
+ int main(void)
+ {
+ int tbi_enabled = 0;
+ unsigned long tag = 0;
+ char *ptr;
+
+ /* check/enable the tagged address ABI */
+ if (!prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0))
+ tbi_enabled = 1;
+
+ /* memory allocation */
+ ptr = mmap(NULL, sysconf(_SC_PAGE_SIZE), PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ if (ptr == MAP_FAILED)
+ return 1;
+
+ /* set a non-zero tag if the ABI is available */
+ if (tbi_enabled)
+ tag = rand() & 0xff;
+ ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
+
+ /* memory access to a tagged address */
+ strcpy(ptr, "tagged pointer\n");
+
+ /* syscall with a tagged pointer */
+ write(1, ptr, strlen(ptr));
+
+ return 0;
+ }
diff --git a/Documentation/arm64/tagged-pointers.rst b/Documentation/arm64/tagged-pointers.rst
index 2acdec3ebbeb..eab4323609b9 100644
--- a/Documentation/arm64/tagged-pointers.rst
+++ b/Documentation/arm64/tagged-pointers.rst
@@ -20,7 +20,9 @@ Passing tagged addresses to the kernel
--------------------------------------
All interpretation of userspace memory addresses by the kernel assumes
-an address tag of 0x00.
+an address tag of 0x00, unless the application enables the AArch64
+Tagged Address ABI explicitly
+(Documentation/arm64/tagged-address-abi.rst).
This includes, but is not limited to, addresses found in:
@@ -33,13 +35,15 @@ This includes, but is not limited to, addresses found in:
- the frame pointer (x29) and frame records, e.g. when interpreting
them to generate a backtrace or call graph.
-Using non-zero address tags in any of these locations may result in an
-error code being returned, a (fatal) signal being raised, or other modes
-of failure.
+Using non-zero address tags in any of these locations when the
+userspace application did not enable the AArch64 Tagged Address ABI may
+result in an error code being returned, a (fatal) signal being raised,
+or other modes of failure.
-For these reasons, passing non-zero address tags to the kernel via
-system calls is forbidden, and using a non-zero address tag for sp is
-strongly discouraged.
+For these reasons, when the AArch64 Tagged Address ABI is disabled,
+passing non-zero address tags to the kernel via system calls is
+forbidden, and using a non-zero address tag for sp is strongly
+discouraged.
Programs maintaining a frame pointer and frame records that use non-zero
address tags may suffer impaired or inaccurate debug and profiling
@@ -59,6 +63,9 @@ be preserved.
The architecture prevents the use of a tagged PC, so the upper byte will
be set to a sign-extension of bit 55 on exception return.
+This behaviour is maintained when the AArch64 Tagged Address ABI is
+enabled.
+
Other considerations
--------------------
OpenPOWER on IntegriCloud