summaryrefslogtreecommitdiffstats
path: root/kernel/locking/osq_lock.c
Commit message (Collapse)AuthorAgeFilesLines
* locking/osq: Relax atomic semanticsDavidlohr Bueso2015-09-181-3/+8
| | | | | | | | | | | | | | | | | ... by using acquire/release for ops around the lock->tail. As such, weakly ordered archs can benefit from more relaxed use of barriers when issuing atomics. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <Waiman.Long@hpe.com> Link: http://lkml.kernel.org/r/1442216244-4409-3-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking: Remove ACCESS_ONCE() usageDavidlohr Bueso2015-02-241-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | With the new standardized functions, we can replace all ACCESS_ONCE() calls across relevant locking - this includes lockref and seqlock while at it. ACCESS_ONCE() does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 Update the new calls regardless of if it is a scalar type, this is cleaner than having three alternatives. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1424662301.6539.18.camel@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking/osq: No need for load/acquire when acquire-pollingDavidlohr Bueso2015-01-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both mutexes and rwsems took a performance hit when we switched over from the original mcs code to the cancelable variant (osq). The reason being the use of smp_load_acquire() when polling for node->locked. This is not needed as reordering is not an issue, as such, relax the barrier semantics. Paul describes the scenario nicely: https://lkml.org/lkml/2013/11/19/405 - If we start polling before the insertion is complete, all that happens is that the first few polls have no chance of seeing a lock grant. - Ordering the polling against the initialization -- the above xchg() is already doing that for us. The smp_load_acquire() when unqueuing make sense. In addition, we don't need to worry about leaking the critical region as osq is only used internally. This impacts both regular and large levels of concurrency, ie on a 40 core system with a disk intensive workload: disk-1 804.83 ( 0.00%) 828.16 ( 2.90%) disk-61 8063.45 ( 0.00%) 18181.82 (125.48%) disk-121 7187.41 ( 0.00%) 20119.17 (179.92%) disk-181 6933.32 ( 0.00%) 20509.91 (195.82%) disk-241 6850.81 ( 0.00%) 20397.80 (197.74%) disk-301 6815.22 ( 0.00%) 20287.58 (197.68%) disk-361 7080.40 ( 0.00%) 20205.22 (185.37%) disk-421 7076.13 ( 0.00%) 19957.33 (182.04%) disk-481 7083.25 ( 0.00%) 19784.06 (179.31%) disk-541 7038.39 ( 0.00%) 19610.92 (178.63%) disk-601 7072.04 ( 0.00%) 19464.53 (175.23%) disk-661 7010.97 ( 0.00%) 19348.23 (175.97%) disk-721 7069.44 ( 0.00%) 19255.33 (172.37%) disk-781 7007.58 ( 0.00%) 19103.14 (172.61%) disk-841 6981.18 ( 0.00%) 18964.22 (171.65%) disk-901 6968.47 ( 0.00%) 18826.72 (170.17%) disk-961 6964.61 ( 0.00%) 18708.02 (168.62%) Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1420573509-24774-7-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking/mcs: Better differentiate between MCS variantsDavidlohr Bueso2015-01-141-0/+203
We have two flavors of the MCS spinlock: standard and cancelable (OSQ). While each one is independent of the other, we currently mix and match them. This patch: - Moves the OSQ code out of mcs_spinlock.h (which only deals with the traditional version) into include/linux/osq_lock.h. No unnecessary code is added to the more global header file, anything locks that make use of OSQ must include it anyway. - Renames mcs_spinlock.c to osq_lock.c. This file only contains osq code. - Introduces a CONFIG_LOCK_SPIN_ON_OWNER in order to only build osq_lock if there is support for it. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Jason Low <jason.low2@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Waiman Long <Waiman.Long@hp.com> Link: http://lkml.kernel.org/r/1420573509-24774-5-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
OpenPOWER on IntegriCloud