diff options
author | Ingo Molnar <mingo@elte.hu> | 2006-01-09 15:59:20 -0800 |
---|---|---|
committer | Ingo Molnar <mingo@hera.kernel.org> | 2006-01-09 15:59:20 -0800 |
commit | f3f54ffa703c6298240ffd69616451d645bae4d5 (patch) | |
tree | 0f66c760d21ab3c94b4f0be4229f458c0a3fd9c2 /Documentation | |
parent | 6053ee3b32e3437e8c1e72687850f436e779bd49 (diff) | |
download | talos-obmc-linux-f3f54ffa703c6298240ffd69616451d645bae4d5.tar.gz talos-obmc-linux-f3f54ffa703c6298240ffd69616451d645bae4d5.zip |
[PATCH] mutex subsystem, documentation
Add mutex design related documentation.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/DocBook/kernel-locking.tmpl | 22 | ||||
-rw-r--r-- | Documentation/mutex-design.txt | 135 |
2 files changed, 149 insertions, 8 deletions
diff --git a/Documentation/DocBook/kernel-locking.tmpl b/Documentation/DocBook/kernel-locking.tmpl index 90dc2de8e0af..158ffe9bfade 100644 --- a/Documentation/DocBook/kernel-locking.tmpl +++ b/Documentation/DocBook/kernel-locking.tmpl @@ -222,7 +222,7 @@ <title>Two Main Types of Kernel Locks: Spinlocks and Semaphores</title> <para> - There are two main types of kernel locks. The fundamental type + There are three main types of kernel locks. The fundamental type is the spinlock (<filename class="headerfile">include/asm/spinlock.h</filename>), which is a very simple single-holder lock: if you can't get the @@ -230,16 +230,22 @@ very small and fast, and can be used anywhere. </para> <para> - The second type is a semaphore + The second type is a mutex + (<filename class="headerfile">include/linux/mutex.h</filename>): it + is like a spinlock, but you may block holding a mutex. + If you can't lock a mutex, your task will suspend itself, and be woken + up when the mutex is released. This means the CPU can do something + else while you are waiting. There are many cases when you simply + can't sleep (see <xref linkend="sleeping-things"/>), and so have to + use a spinlock instead. + </para> + <para> + The third type is a semaphore (<filename class="headerfile">include/asm/semaphore.h</filename>): it can have more than one holder at any time (the number decided at initialization time), although it is most commonly used as a - single-holder lock (a mutex). If you can't get a semaphore, - your task will put itself on the queue, and be woken up when the - semaphore is released. This means the CPU will do something - else while you are waiting, but there are many cases when you - simply can't sleep (see <xref linkend="sleeping-things"/>), and so - have to use a spinlock instead. + single-holder lock (a mutex). If you can't get a semaphore, your + task will be suspended and later on woken up - just like for mutexes. </para> <para> Neither type of lock is recursive: see diff --git a/Documentation/mutex-design.txt b/Documentation/mutex-design.txt new file mode 100644 index 000000000000..cbf79881a41c --- /dev/null +++ b/Documentation/mutex-design.txt @@ -0,0 +1,135 @@ +Generic Mutex Subsystem + +started by Ingo Molnar <mingo@redhat.com> + + "Why on earth do we need a new mutex subsystem, and what's wrong + with semaphores?" + +firstly, there's nothing wrong with semaphores. But if the simpler +mutex semantics are sufficient for your code, then there are a couple +of advantages of mutexes: + + - 'struct mutex' is smaller on most architectures: .e.g on x86, + 'struct semaphore' is 20 bytes, 'struct mutex' is 16 bytes. + A smaller structure size means less RAM footprint, and better + CPU-cache utilization. + + - tighter code. On x86 i get the following .text sizes when + switching all mutex-alike semaphores in the kernel to the mutex + subsystem: + + text data bss dec hex filename + 3280380 868188 396860 4545428 455b94 vmlinux-semaphore + 3255329 865296 396732 4517357 44eded vmlinux-mutex + + that's 25051 bytes of code saved, or a 0.76% win - off the hottest + codepaths of the kernel. (The .data savings are 2892 bytes, or 0.33%) + Smaller code means better icache footprint, which is one of the + major optimization goals in the Linux kernel currently. + + - the mutex subsystem is slightly faster and has better scalability for + contended workloads. On an 8-way x86 system, running a mutex-based + kernel and testing creat+unlink+close (of separate, per-task files) + in /tmp with 16 parallel tasks, the average number of ops/sec is: + + Semaphores: Mutexes: + + $ ./test-mutex V 16 10 $ ./test-mutex V 16 10 + 8 CPUs, running 16 tasks. 8 CPUs, running 16 tasks. + checking VFS performance. checking VFS performance. + avg loops/sec: 34713 avg loops/sec: 84153 + CPU utilization: 63% CPU utilization: 22% + + i.e. in this workload, the mutex based kernel was 2.4 times faster + than the semaphore based kernel, _and_ it also had 2.8 times less CPU + utilization. (In terms of 'ops per CPU cycle', the semaphore kernel + performed 551 ops/sec per 1% of CPU time used, while the mutex kernel + performed 3825 ops/sec per 1% of CPU time used - it was 6.9 times + more efficient.) + + the scalability difference is visible even on a 2-way P4 HT box: + + Semaphores: Mutexes: + + $ ./test-mutex V 16 10 $ ./test-mutex V 16 10 + 4 CPUs, running 16 tasks. 8 CPUs, running 16 tasks. + checking VFS performance. checking VFS performance. + avg loops/sec: 127659 avg loops/sec: 181082 + CPU utilization: 100% CPU utilization: 34% + + (the straight performance advantage of mutexes is 41%, the per-cycle + efficiency of mutexes is 4.1 times better.) + + - there are no fastpath tradeoffs, the mutex fastpath is just as tight + as the semaphore fastpath. On x86, the locking fastpath is 2 + instructions: + + c0377ccb <mutex_lock>: + c0377ccb: f0 ff 08 lock decl (%eax) + c0377cce: 78 0e js c0377cde <.text.lock.mutex> + c0377cd0: c3 ret + + the unlocking fastpath is equally tight: + + c0377cd1 <mutex_unlock>: + c0377cd1: f0 ff 00 lock incl (%eax) + c0377cd4: 7e 0f jle c0377ce5 <.text.lock.mutex+0x7> + c0377cd6: c3 ret + + - 'struct mutex' semantics are well-defined and are enforced if + CONFIG_DEBUG_MUTEXES is turned on. Semaphores on the other hand have + virtually no debugging code or instrumentation. The mutex subsystem + checks and enforces the following rules: + + * - only one task can hold the mutex at a time + * - only the owner can unlock the mutex + * - multiple unlocks are not permitted + * - recursive locking is not permitted + * - a mutex object must be initialized via the API + * - a mutex object must not be initialized via memset or copying + * - task may not exit with mutex held + * - memory areas where held locks reside must not be freed + * - held mutexes must not be reinitialized + * - mutexes may not be used in irq contexts + + furthermore, there are also convenience features in the debugging + code: + + * - uses symbolic names of mutexes, whenever they are printed in debug output + * - point-of-acquire tracking, symbolic lookup of function names + * - list of all locks held in the system, printout of them + * - owner tracking + * - detects self-recursing locks and prints out all relevant info + * - detects multi-task circular deadlocks and prints out all affected + * locks and tasks (and only those tasks) + +Disadvantages +------------- + +The stricter mutex API means you cannot use mutexes the same way you +can use semaphores: e.g. they cannot be used from an interrupt context, +nor can they be unlocked from a different context that which acquired +it. [ I'm not aware of any other (e.g. performance) disadvantages from +using mutexes at the moment, please let me know if you find any. ] + +Implementation of mutexes +------------------------- + +'struct mutex' is the new mutex type, defined in include/linux/mutex.h +and implemented in kernel/mutex.c. It is a counter-based mutex with a +spinlock and a wait-list. The counter has 3 states: 1 for "unlocked", +0 for "locked" and negative numbers (usually -1) for "locked, potential +waiters queued". + +the APIs of 'struct mutex' have been streamlined: + + DEFINE_MUTEX(name); + + mutex_init(mutex); + + void mutex_lock(struct mutex *lock); + int mutex_lock_interruptible(struct mutex *lock); + int mutex_trylock(struct mutex *lock); + void mutex_unlock(struct mutex *lock); + int mutex_is_locked(struct mutex *lock); + |