Diffstat (limited to 'Documentation')
-rw-r--r--   Documentation/RCU/arrayRCU.txt         | 20
-rw-r--r--   Documentation/RCU/lockdep.txt          | 10
-rw-r--r--   Documentation/RCU/rcu_dereference.txt  | 38
-rw-r--r--   Documentation/RCU/whatisRCU.txt        |  6
-rw-r--r--   Documentation/kernel-parameters.txt    | 33
-rw-r--r--   Documentation/memory-barriers.txt      |  7
6 files changed, 69 insertions, 45 deletions
diff --git a/Documentation/RCU/arrayRCU.txt b/Documentation/RCU/arrayRCU.txt
index 453ebe6953ee..f05a9afb2c39 100644
--- a/Documentation/RCU/arrayRCU.txt
+++ b/Documentation/RCU/arrayRCU.txt
@@ -10,7 +10,19 @@ also be used to protect arrays.  Three situations are as follows:
 
 3.  Resizeable Arrays
 
-Each of these situations are discussed below.
+Each of these three situations involves an RCU-protected pointer to an
+array that is separately indexed.  It might be tempting to consider use
+of RCU to instead protect the index into an array, however, this use
+case is -not- supported.  The problem with RCU-protected indexes into
+arrays is that compilers can play way too many optimization games with
+integers, which means that the rules governing handling of these indexes
+are far more trouble than they are worth.  If RCU-protected indexes into
+arrays prove to be particularly valuable (which they have not thus far),
+explicit cooperation from the compiler will be required to permit them
+to be safely used.
+
+That aside, each of the three RCU-protected pointer situations are
+described in the following sections.
 
 
 Situation 1: Hash Tables
@@ -36,9 +48,9 @@ Quick Quiz:  Why is it so important that updates be rare when
 
 Situation 3: Resizeable Arrays
 
 Use of RCU for resizeable arrays is demonstrated by the grow_ary()
-function used by the System V IPC code.  The array is used to map from
-semaphore, message-queue, and shared-memory IDs to the data structure
-that represents the corresponding IPC construct.  The grow_ary()
+function formerly used by the System V IPC code.  The array is used
+to map from semaphore, message-queue, and shared-memory IDs to the data
+structure that represents the corresponding IPC construct.  The grow_ary()
 function does not acquire any locks; instead its caller must hold the
 ids->sem semaphore.
diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt
index cd83d2348fef..da51d3068850 100644
--- a/Documentation/RCU/lockdep.txt
+++ b/Documentation/RCU/lockdep.txt
@@ -47,11 +47,6 @@ checking of rcu_dereference() primitives:
 		Use explicit check expression "c" along with
 		srcu_read_lock_held()().  This is useful in code that
 		is invoked by both SRCU readers and updaters.
-	rcu_dereference_index_check(p, c):
-		Use explicit check expression "c", but the caller
-		must supply one of the rcu_read_lock_held() functions.
-		This is useful in code that uses RCU-protected arrays
-		that is invoked by both RCU readers and updaters.
 	rcu_dereference_raw(p):
 		Don't check.  (Use sparingly, if at all.)
 	rcu_dereference_protected(p, c):
@@ -64,11 +59,6 @@ checking of rcu_dereference() primitives:
 		but retain the compiler constraints that prevent duplicating
 		or coalescsing.  This is useful when when testing the
 		value of the pointer itself, for example, against NULL.
-	rcu_access_index(idx):
-		Return the value of the index and omit all barriers, but
-		retain the compiler constraints that prevent duplicating
-		or coalescsing.  This is useful when when testing the
-		value of the index itself, for example, against -1.
 
 The rcu_dereference_check() check expression can be any boolean
 expression, but would normally include a lockdep expression.  However,
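For illustration, the supported pattern that arrayRCU.txt describes --
RCU protecting the pointer to a resizeable array, never an index into
it -- looks roughly like the sketch below.  All names (struct foo,
my_array, read_entry, grow_array) are illustrative rather than taken
from kernel source; the updater also shows the rcu_dereference_protected()
plus lockdep_is_held() checking that lockdep.txt documents.

	struct entry {
		int val;			/* payload */
	};

	struct foo {
		int nentries;			/* current array length */
		struct entry entries[];		/* flexible array member */
	};

	static struct foo __rcu *my_array;
	static DEFINE_SPINLOCK(my_array_lock);

	/* Reader: RCU protects the pointer to the array; the index is
	 * ordinary non-RCU data. */
	static int read_entry(int idx, struct entry *out)
	{
		struct foo *p;
		int ret = -ENOENT;

		rcu_read_lock();
		p = rcu_dereference(my_array);
		if (p && idx >= 0 && idx < p->nentries) {
			*out = p->entries[idx];	/* copy out under RCU */
			ret = 0;
		}
		rcu_read_unlock();
		return ret;
	}

	/* Updater: publish a larger copy, then free the old array only
	 * after all pre-existing readers finish.  Assumes new_nentries
	 * is no smaller than the current length. */
	static int grow_array(int new_nentries)
	{
		struct foo *old, *new;

		new = kmalloc(sizeof(*new) +
			      new_nentries * sizeof(new->entries[0]),
			      GFP_KERNEL);
		if (!new)
			return -ENOMEM;
		spin_lock(&my_array_lock);
		old = rcu_dereference_protected(my_array,
					lockdep_is_held(&my_array_lock));
		if (old)
			memcpy(new->entries, old->entries,
			       old->nentries * sizeof(old->entries[0]));
		new->nentries = new_nentries;
		rcu_assign_pointer(my_array, new);
		spin_unlock(&my_array_lock);
		if (old) {
			synchronize_rcu();	/* wait for readers */
			kfree(old);
		}
		return 0;
	}

Note that idx here is plain data; only the pointer loaded by
rcu_dereference() carries RCU's publication and ordering guarantees.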
diff --git a/Documentation/RCU/rcu_dereference.txt b/Documentation/RCU/rcu_dereference.txt
index ceb05da5a5ac..1e6c0da994f5 100644
--- a/Documentation/RCU/rcu_dereference.txt
+++ b/Documentation/RCU/rcu_dereference.txt
@@ -25,17 +25,6 @@ o	You must use one of the rcu_dereference() family of primitives
 	for an example where the compiler can in fact deduce the exact
 	value of the pointer, and thus cause misordering.
 
-o	Do not use single-element RCU-protected arrays.  The compiler
-	is within its right to assume that the value of an index into
-	such an array must necessarily evaluate to zero.  The compiler
-	could then substitute the constant zero for the computation, so
-	that the array index no longer depended on the value returned
-	by rcu_dereference().  If the array index no longer depends
-	on rcu_dereference(), then both the compiler and the CPU
-	are within their rights to order the array access before the
-	rcu_dereference(), which can cause the array access to return
-	garbage.
-
 o	Avoid cancellation when using the "+" and "-" infix arithmetic
 	operators.  For example, for a given variable "x", avoid
 	"(x-x)".  There are similar arithmetic pitfalls from other
@@ -76,14 +65,15 @@ o	Do not use the results from the boolean "&&" and "||" when
 	dereferencing.  For example, the following (rather improbable)
 	code is buggy:
 
-		int a[2];
-		int index;
-		int force_zero_index = 1;
+		int *p;
+		int *q;
 
 		...
 
-		r1 = rcu_dereference(i1)
-		r2 = a[r1 && force_zero_index];  /* BUGGY!!! */
+		p = rcu_dereference(gp)
+		q = &global_q;
+		q += p != &oom_p1 && p != &oom_p2;
+		r1 = *q;  /* BUGGY!!! */
 
 	The reason this is buggy is that "&&" and "||" are often compiled
 	using branches.  While weak-memory machines such as ARM or PowerPC
@@ -94,14 +84,15 @@ o	Do not use the results from relational operators ("==",
 	"!=", ">", ">=", "<", or "<=") when dereferencing.  For example,
 	the following (quite strange) code is buggy:
 
-		int a[2];
-		int index;
-		int flip_index = 0;
+		int *p;
+		int *q;
 
 		...
 
-		r1 = rcu_dereference(i1)
-		r2 = a[r1 != flip_index];  /* BUGGY!!! */
+		p = rcu_dereference(gp)
+		q = &global_q;
+		q += p > &oom_p;
+		r1 = *q;  /* BUGGY!!! */
 
 	As before, the reason this is buggy is that relational operators
 	are often compiled using branches.  And as before, although
@@ -193,6 +184,11 @@ o	Be very careful about comparing pointers obtained from
 	pointer.  Note that the volatile cast in rcu_dereference()
 	will normally prevent the compiler from knowing too much.
 
+	However, please note that if the compiler knows that the
+	pointer takes on only one of two values, a not-equal
+	comparison will provide exactly the information that the
+	compiler needs to deduce the value of the pointer.
+
 o	Disable any value-speculation optimizations that your compiler
 	might provide, especially if you are making use of feedback-based
 	optimizations that take data collected from prior runs.  Such
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 88dfce182f66..5746b0c77f3e 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -256,7 +256,9 @@ rcu_dereference()
 	If you are going to be fetching multiple fields from the
 	RCU-protected structure, using the local variable is of
 	course preferred.  Repeated rcu_dereference() calls look
-	ugly and incur unnecessary overhead on Alpha CPUs.
+	ugly, do not guarantee that the same pointer will be returned
+	if an update happened while in the critical section, and incur
+	unnecessary overhead on Alpha CPUs.
 
 	Note that the value returned by rcu_dereference() is valid
 	only within the enclosing RCU read-side critical section.
@@ -879,9 +881,7 @@ SRCU:	Initialization/cleanup
 
 All:  lockdep-checked RCU-protected pointer access
 
-	rcu_access_index
 	rcu_access_pointer
-	rcu_dereference_index_check
 	rcu_dereference_raw
 	rcu_lockdep_assert
 	rcu_sleep_check
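The whatisRCU.txt change above is easiest to see in code.  A minimal
sketch in the style of that document's own examples; gp, do_something,
and the surrounding declarations are illustrative:

	struct foo {
		int a;
		int b;
	};

	struct foo __rcu *gp;
	void do_something(int a, int b);

	static void reader(void)
	{
		struct foo *p;

		rcu_read_lock();

		/* Preferred: dereference once; both fields come from
		 * the same structure. */
		p = rcu_dereference(gp);
		if (p)
			do_something(p->a, p->b);

		/* Discouraged: each rcu_dereference() may return a
		 * different pointer if an update occurs meanwhile, so
		 * ->a and ->b can come from different structures, and
		 * every call pays a memory barrier on Alpha. */
		if (rcu_dereference(gp))
			do_something(rcu_dereference(gp)->a,
				     rcu_dereference(gp)->b);

		rcu_read_unlock();
	}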
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 61ab1628a057..0b7f3e7a029c 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2992,11 +2992,34 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			Set maximum number of finished RCU callbacks to
 			process in one batch.
 
+	rcutree.dump_tree=	[KNL]
+			Dump the structure of the rcu_node combining tree
+			out at early boot.  This is used for diagnostic
+			purposes, to verify correct tree setup.
+
+	rcutree.gp_cleanup_delay=	[KNL]
+			Set the number of jiffies to delay each step of
+			RCU grace-period cleanup.  This only has effect
+			when CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP is set.
+
 	rcutree.gp_init_delay=	[KNL]
 			Set the number of jiffies to delay each step of
 			RCU grace-period initialization.  This only has
-			effect when CONFIG_RCU_TORTURE_TEST_SLOW_INIT is
-			set.
+			effect when CONFIG_RCU_TORTURE_TEST_SLOW_INIT
+			is set.
+
+	rcutree.gp_preinit_delay=	[KNL]
+			Set the number of jiffies to delay each step of
+			RCU grace-period pre-initialization, that is,
+			the propagation of recent CPU-hotplug changes up
+			the rcu_node combining tree.  This only has effect
+			when CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT is set.
+
+	rcutree.rcu_fanout_exact= [KNL]
+			Disable autobalancing of the rcu_node combining
+			tree.  This is used by rcutorture, and might
+			possibly be useful for architectures having high
+			cache-to-cache transfer latencies.
 
 	rcutree.rcu_fanout_leaf= [KNL]
 			Increase the number of CPUs assigned to each
@@ -3101,7 +3124,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			test, hence the "fake".
 
 	rcutorture.nreaders= [KNL]
-			Set number of RCU readers.
+			Set number of RCU readers.  The value -1 selects
+			N-1, where N is the number of CPUs.  A value
+			"n" less than -1 selects N-n-2, where N is again
+			the number of CPUs.  For example, -2 selects N
+			(the number of CPUs), -3 selects N+1, and so on.
 
 	rcutorture.object_debug= [KNL]
 			Enable debug-object double-call_rcu() testing.
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index a3014bcc5b08..360841da3744 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1795,10 +1795,9 @@ for each construct.  These operations all imply certain barriers:
 
      Memory operations issued before the ACQUIRE may be completed after
      the ACQUIRE operation has completed.  An smp_mb__before_spinlock(),
-     combined with a following ACQUIRE, orders prior loads against
-     subsequent loads and stores and also orders prior stores against
-     subsequent stores.  Note that this is weaker than smp_mb()!  The
-     smp_mb__before_spinlock() primitive is free on many architectures.
+     combined with a following ACQUIRE, orders prior stores against
+     subsequent loads and stores.  Note that this is weaker than smp_mb()!
+     The smp_mb__before_spinlock() primitive is free on many architectures.
 
  (2) RELEASE operation implication:
 
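A rough sketch of the corrected memory-barriers.txt wording: combined
with the following ACQUIRE, smp_mb__before_spinlock() orders prior
stores against the loads and stores inside the critical section, but
leaves prior loads unordered.  The variables flag, waiting, mylock,
and r1 are hypothetical:

	/* CPU 0 */
	WRITE_ONCE(flag, 1);		/* prior store ... */
	smp_mb__before_spinlock();
	spin_lock(&mylock);		/* ... ordered before everything
					 * in the critical section ... */
	r1 = READ_ONCE(waiting);	/* ... loads and stores alike.
					 * Prior LOADS are NOT ordered,
					 * which is why this is weaker
					 * than smp_mb(). */
	spin_unlock(&mylock);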