summaryrefslogtreecommitdiffstats
path: root/Documentation/memory-barriers.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/memory-barriers.txt')
-rw-r--r--Documentation/memory-barriers.txt36
1 files changed, 20 insertions, 16 deletions
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 4710845dbac4..28d1bc3edb1c 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -262,9 +262,14 @@ What is required is some way of intervening to instruct the compiler and the
CPU to restrict the order.
Memory barriers are such interventions. They impose a perceived partial
-ordering between the memory operations specified on either side of the barrier.
-They request that the sequence of memory events generated appears to other
-parts of the system as if the barrier is effective on that CPU.
+ordering over the memory operations on either side of the barrier.
+
+Such enforcement is important because the CPUs and other devices in a system
+can use a variety of tricks to improve performance - including reordering,
+deferral and combination of memory operations; speculative loads; speculative
+branch prediction and various types of caching. Memory barriers are used to
+override or suppress these tricks, allowing the code to sanely control the
+interaction of multiple CPUs and/or devices.
VARIETIES OF MEMORY BARRIER
@@ -282,7 +287,7 @@ Memory barriers come in four basic varieties:
A write barrier is a partial ordering on stores only; it is not required
to have any effect on loads.
- A CPU can be viewed as as commiting a sequence of store operations to the
+ A CPU can be viewed as committing a sequence of store operations to the
memory system as time progresses. All stores before a write barrier will
occur in the sequence _before_ all the stores after the write barrier.
@@ -413,7 +418,7 @@ There are certain things that the Linux kernel memory barriers do not guarantee:
indirect effect will be the order in which the second CPU sees the effects
of the first CPU's accesses occur, but see the next point:
- (*) There is no guarantee that the a CPU will see the correct order of effects
+ (*) There is no guarantee that a CPU will see the correct order of effects
from a second CPU's accesses, even _if_ the second CPU uses a memory
barrier, unless the first CPU _also_ uses a matching memory barrier (see
the subsection on "SMP Barrier Pairing").
@@ -461,8 +466,8 @@ Whilst this may seem like a failure of coherency or causality maintenance, it
isn't, and this behaviour can be observed on certain real CPUs (such as the DEC
Alpha).
-To deal with this, a data dependency barrier must be inserted between the
-address load and the data load:
+To deal with this, a data dependency barrier or better must be inserted
+between the address load and the data load:
CPU 1 CPU 2
=============== ===============
@@ -484,7 +489,7 @@ lines. The pointer P might be stored in an odd-numbered cache line, and the
variable B might be stored in an even-numbered cache line. Then, if the
even-numbered bank of the reading CPU's cache is extremely busy while the
odd-numbered bank is idle, one can see the new value of the pointer P (&B),
-but the old value of the variable B (1).
+but the old value of the variable B (2).
Another example of where data dependency barriers might by required is where a
@@ -597,7 +602,7 @@ Consider the following sequence of events:
This sequence of events is committed to the memory coherence system in an order
that the rest of the system might perceive as the unordered set of { STORE A,
-STORE B, STORE C } all occuring before the unordered set of { STORE D, STORE E
+STORE B, STORE C } all occurring before the unordered set of { STORE D, STORE E
}:
+-------+ : :
@@ -744,7 +749,7 @@ some effectively random order, despite the write barrier issued by CPU 1:
: :
-If, however, a read barrier were to be placed between the load of E and the
+If, however, a read barrier were to be placed between the load of B and the
load of A on CPU 2:
CPU 1 CPU 2
@@ -1461,9 +1466,8 @@ instruction itself is complete.
On a UP system - where this wouldn't be a problem - the smp_mb() is just a
compiler barrier, thus making sure the compiler emits the instructions in the
-right order without actually intervening in the CPU. Since there there's only
-one CPU, that CPU's dependency ordering logic will take care of everything
-else.
+right order without actually intervening in the CPU. Since there's only one
+CPU, that CPU's dependency ordering logic will take care of everything else.
ATOMIC OPERATIONS
@@ -1640,9 +1644,9 @@ functions:
The PCI bus, amongst others, defines an I/O space concept - which on such
CPUs as i386 and x86_64 cpus readily maps to the CPU's concept of I/O
- space. However, it may also mapped as a virtual I/O space in the CPU's
- memory map, particularly on those CPUs that don't support alternate
- I/O spaces.
+ space. However, it may also be mapped as a virtual I/O space in the CPU's
+ memory map, particularly on those CPUs that don't support alternate I/O
+ spaces.
Accesses to this space may be fully synchronous (as on i386), but
intermediary bridges (such as the PCI host bridge) may not fully honour
OpenPOWER on IntegriCloud