summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of ↵Linus Torvalds2011-06-2011-189/+174
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: avoid delayed metadata items during commits btrfs: fix uninitialized return value btrfs: fix wrong reservation when doing delayed inode operations btrfs: Remove unused sysfs code btrfs: fix dereference of ERR_PTR value Btrfs: fix relocation races Btrfs: set no_trans_join after trying to expand the transaction Btrfs: protect the pending_snapshots list with trans_lock Btrfs: fix path leakage on subvol deletion Btrfs: drop the delalloc_bytes check in shrink_delalloc Btrfs: check the return value from set_anon_super
| * Btrfs: avoid delayed metadata items during commitsChris Mason2011-06-173-10/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Snapshot creation has two phases. One is the initial snapshot setup, and the second is done during commit, while nobody is allowed to modify the root we are snapshotting. The delayed metadata insertion code can break that rule, it does a delayed inode update on the inode of the parent of the snapshot, and delayed directory item insertion. This makes sure to run the pending delayed operations before we record the snapshot root, which avoids corruptions. Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * btrfs: fix uninitialized return valueDavid Sterba2011-06-171-3/+3
| | | | | | | | | | | | | | | | | | When allocation fails in btrfs_read_fs_root_no_name, ret is not set although it is returned, holding a garbage value. Signed-off-by: David Sterba <dsterba@suse.cz> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * btrfs: fix wrong reservation when doing delayed inode operationsMiao Xie2011-06-172-6/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have migrated the space for the delayed inode items from trans_block_rsv to global_block_rsv, but we forgot to set trans->block_rsv to global_block_rsv when we doing delayed inode operations, and the following Oops happened: [ 9792.654889] ------------[ cut here ]------------ [ 9792.654898] WARNING: at fs/btrfs/extent-tree.c:5681 btrfs_alloc_free_block+0xca/0x27c [btrfs]() [ 9792.654899] Hardware name: To Be Filled By O.E.M. [ 9792.654900] Modules linked in: btrfs zlib_deflate libcrc32c ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables arc4 rt61pci rt2x00pci rt2x00lib snd_hda_codec_hdmi mac80211 snd_hda_codec_realtek cfg80211 snd_hda_intel edac_core snd_seq rfkill pcspkr serio_raw snd_hda_codec eeprom_93cx6 edac_mce_amd sp5100_tco i2c_piix4 k10temp snd_hwdep snd_seq_device snd_pcm floppy r8169 xhci_hcd mii snd_timer snd soundcore snd_page_alloc ipv6 firewire_ohci pata_acpi ata_generic firewire_core pata_via crc_itu_t radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan] [ 9792.654919] Pid: 2762, comm: rm Tainted: G W 2.6.39+ #1 [ 9792.654920] Call Trace: [ 9792.654922] [<ffffffff81053c4a>] warn_slowpath_common+0x83/0x9b [ 9792.654925] [<ffffffff81053c7c>] warn_slowpath_null+0x1a/0x1c [ 9792.654933] [<ffffffffa038e747>] btrfs_alloc_free_block+0xca/0x27c [btrfs] [ 9792.654945] [<ffffffffa03b8562>] ? map_extent_buffer+0x6e/0xa8 [btrfs] [ 9792.654953] [<ffffffffa038189b>] __btrfs_cow_block+0xfc/0x30c [btrfs] [ 9792.654963] [<ffffffffa0396aa6>] ? btrfs_buffer_uptodate+0x47/0x58 [btrfs] [ 9792.654970] [<ffffffffa0382e48>] ? read_block_for_search+0x94/0x368 [btrfs] [ 9792.654978] [<ffffffffa0381ba9>] btrfs_cow_block+0xfe/0x146 [btrfs] [ 9792.654986] [<ffffffffa03848b0>] btrfs_search_slot+0x14d/0x4b6 [btrfs] [ 9792.654997] [<ffffffffa03b8562>] ? map_extent_buffer+0x6e/0xa8 [btrfs] [ 9792.655022] [<ffffffffa03938e8>] btrfs_lookup_inode+0x2f/0x8f [btrfs] [ 9792.655025] [<ffffffff8147afac>] ? _cond_resched+0xe/0x22 [ 9792.655027] [<ffffffff8147b892>] ? mutex_lock+0x29/0x50 [ 9792.655039] [<ffffffffa03d41b1>] btrfs_update_delayed_inode+0x72/0x137 [btrfs] [ 9792.655051] [<ffffffffa03d4ea2>] btrfs_run_delayed_items+0x90/0xdb [btrfs] [ 9792.655062] [<ffffffffa039a69b>] btrfs_commit_transaction+0x228/0x654 [btrfs] [ 9792.655064] [<ffffffff8106e8da>] ? remove_wait_queue+0x3a/0x3a [ 9792.655075] [<ffffffffa03a2fa5>] btrfs_evict_inode+0x14d/0x202 [btrfs] [ 9792.655077] [<ffffffff81132bd6>] evict+0x71/0x111 [ 9792.655079] [<ffffffff81132de0>] iput+0x12a/0x132 [ 9792.655081] [<ffffffff8112aa3a>] do_unlinkat+0x106/0x155 [ 9792.655083] [<ffffffff81127b83>] ? path_put+0x1f/0x23 [ 9792.655085] [<ffffffff8109c53c>] ? audit_syscall_entry+0x145/0x171 [ 9792.655087] [<ffffffff81128410>] ? putname+0x34/0x36 [ 9792.655090] [<ffffffff8112b441>] sys_unlinkat+0x29/0x2b [ 9792.655092] [<ffffffff81482c42>] system_call_fastpath+0x16/0x1b [ 9792.655093] ---[ end trace 02b696eb02b3f768 ]--- This patch fix it by setting the reservation of the transaction handle to the correct one. Reported-by: Josef Bacik <josef@redhat.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * btrfs: Remove unused sysfs codeMaarten Lankhorst2011-06-173-148/+0
| | | | | | | | | | | | | | | | Removes code no longer used. The sysfs file itself is kept, because the btrfs developers expressed interest in putting new entries to sysfs. Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * btrfs: fix dereference of ERR_PTR valueDavid Sterba2011-06-171-1/+1
| | | | | | | | | | | | | | | | | | | | smatch reports: btrfs_recover_log_trees error: 'wc.replay_dest' dereferencing possible ERR_PTR() Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Merge branch 'for-chris' of ↵Chris Mason2011-06-173-3/+14
| |\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work into for-linus Conflicts: fs/btrfs/transaction.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
| | * Btrfs: set no_trans_join after trying to expand the transactionJosef Bacik2011-06-151-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can lockup if we try to allow new writers join the transaction and we have flushoncommit set or have a pending snapshot. This is because we set no_trans_join and then loop around and try to wait for ordered extents again. The problem is the ordered endio stuff needs to join the transaction, which it can't do because no_trans_join is set. So instead wait until after this loop to set no_trans_join and then make sure to wait for num_writers == 1 in case anybody got started in between us exiting the loop and setting no_trans_join. This could easily be reproduced by mounting -o flushoncommit and running xfstest 13. It cannot be reproduced with this patch. Thanks, Reported-by: Jim Schutt <jaschut@sandia.gov> Signed-off-by: Josef Bacik <josef@redhat.com>
| | * Btrfs: protect the pending_snapshots list with trans_lockJosef Bacik2011-06-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently there is nothing protecting the pending_snapshots list on the transaction. We only hold the directory mutex that we are snapshotting and a read lock on the subvol_sem, so we could race with somebody else creating a snapshot in a different directory and end up with list corruption. So protect this list with the trans_lock. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
| | * Btrfs: fix path leakage on subvol deletionJosef Bacik2011-06-151-0/+1
| | | | | | | | | | | | | | | | | | | | | The delayed ref patch accidently removed the btrfs_free_path in btrfs_unlink_subvol, this puts it back and means we don't leak a path. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
| * | Btrfs: fix relocation racesChris Mason2011-06-174-13/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recent commit to get rid of our trans_mutex introduced some races with block group relocation. The problem is that relocation needs to do some record keeping about each root, and it was relying on the transaction mutex to coordinate things in subtle ways. This fix adds a mutex just for the relocation code and makes sure it doesn't have a big impact on normal operations. The race is really fixed in btrfs_record_root_in_trans, which is where we step back and wait for the relocation code to finish accounting setup. Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: drop the delalloc_bytes check in shrink_delallocChris Mason2011-06-131-4/+0
| | | | | | | | | | | | | | | | | | | | | Even when delalloc_bytes is zero, we may need to sleep while waiting for delalloc space. Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: check the return value from set_anon_superChris Mason2011-06-131-1/+3
| |/ | | | | | | | | | | | | Al Viro noticed we weren't checking for set_anon_super failures. This adds the required checks. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | Merge branch 'kvm-updates/3.0' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2011-06-204-5/+11
|\ \ | | | | | | | | | | | | | | | | | | | | | * 'kvm-updates/3.0' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Fix register corruption in pvclock_scale_delta KVM: MMU: fix opposite condition in mapping_level_dirty_bitmap KVM: VMX: do not overwrite uptodate vcpu->arch.cr3 on KVM_SET_SREGS KVM: MMU: Fix build warnings in walk_addr_generic()
| * | KVM: Fix register corruption in pvclock_scale_deltaZachary Amsden2011-06-191-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | The 128-bit multiply in pvclock.h was missing an output constraint for EDX which caused a register corruption to appear. Thanks to Ulrich for diagnosing the EDX corruption and Avi for providing this fix. Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: fix opposite condition in mapping_level_dirty_bitmapSteve2011-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The condition is opposite, it always maps huge page for the dirty tracked page Reported-by: Steve <stefan.bosak@gmail.com> Signed-off-by: Steve <stefan.bosak@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: VMX: do not overwrite uptodate vcpu->arch.cr3 on KVM_SET_SREGSMarcelo Tosatti2011-06-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Only decache guest CR3 value if vcpu->arch.cr3 is stale. Fixes loadvm with live guest. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-by: Markus Schade <markus.schade@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: Fix build warnings in walk_addr_generic()Borislav Petkov2011-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 3.0-rc1 I get In file included from arch/x86/kvm/mmu.c:2856: arch/x86/kvm/paging_tmpl.h: In function ‘paging32_walk_addr_generic’: arch/x86/kvm/paging_tmpl.h:124: warning: ‘ptep_user’ may be used uninitialized in this function In file included from arch/x86/kvm/mmu.c:2852: arch/x86/kvm/paging_tmpl.h: In function ‘paging64_walk_addr_generic’: arch/x86/kvm/paging_tmpl.h:124: warning: ‘ptep_user’ may be used uninitialized in this function caused by 6e2ca7d1802bf8ed9908435e34daa116662e7790. According to Takuya Yoshikawa, ptep_user won't be used uninitialized so shut up gcc. Cc: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Link: http://lkml.kernel.org/r/20110530094604.GC21833@liondog.tnic Signed-off-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Avi Kivity <avi@redhat.com>
| | |
| \ \
| \ \
| \ \
| \ \
| \ \
*-----. \ \ Merge branches 'perf-urgent-for-linus', 'sched-urgent-for-linus', ↵Linus Torvalds2011-06-1912-29/+61
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'timers-urgent-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tools/perf: Fix static build of perf tool tracing: Fix regression in printk_formats file * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: clocksource: Make watchdog robust vs. interruption timerfd: Fix wakeup of processes when timer is cancelled on clock change * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, MAINTAINERS: Add x86 MCE people x86, efi: Do not reserve boot services regions within reserved areas
| | | | * | | x86, MAINTAINERS: Add x86 MCE peopleIngo Molnar2011-06-181-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Announce the new x86 MCE infrastructure maintainers. Acked-by: Borislav Petkov <bp@amd64.org> Acked-by: Tony Luck <tony.luck@intel.com> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/tip-8hs7yob6wib4vblmrmbpbav4@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | * | | x86, efi: Do not reserve boot services regions within reserved areasMaarten Lankhorst2011-06-183-7/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 916f676f8dc started reserving boot service code since some systems require you to keep that code around until SetVirtualAddressMap is called. However, in some cases those areas will overlap with reserved regions. The proper medium-term fix is to fix the bootloader to prevent the conflicts from occurring by moving the kernel to a better position, but the kernel should check for this possibility, and only reserve regions which can be reserved. Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com> Link: http://lkml.kernel.org/r/4DF7A005.1050407@gmail.com Acked-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * | | | clocksource: Make watchdog robust vs. interruptionThomas Gleixner2011-06-162-11/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The clocksource watchdog code is interruptible and it has been observed that this can trigger false positives which disable the TSC. The reason is that an interrupt storm or a long running interrupt handler between the read of the watchdog source and the read of the TSC brings the two far enough apart that the delta is larger than the unstable treshold. Move both reads into a short interrupt disabled region to avoid that. Reported-and-tested-by: Vernon Mauery <vernux@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org
| | | * | | | timerfd: Fix wakeup of processes when timer is cancelled on clock changeMax Asbock2011-06-141-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently processes waiting with poll on cancelable timerfd timers are not woken up when the timers are canceled. When the system time is set the clock_was_set() function calls timerfd_clock_was_set() to cancel and wake up processes waiting on potential cancelable timerfd timers. However the wake up currently has no effect because in the case of timerfd_read it is dependent on ctx->ticks not being 0. timerfd_poll also requires ctx->ticks being non zero. As a consequence processes waiting on cancelable timers only get woken up when the timers expire. This patch fixes this by incrementing ctx->ticks before calling wake_up. Signed-off-by: Max Asbock <masbock@linux.vnet.ibm.com> Cc: kay.sievers@vrfy.org Cc: virtuoso@slind.org Cc: johnstul <johnstul@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1307985512.4710.41.camel@w-amax.beaverton.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | | generic-ipi: Fix kexec boot crash by initializing call_single_queue before ↵Takao Indoh2011-06-173-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabling interrupts There is a problem that kdump(2nd kernel) sometimes hangs up due to a pending IPI from 1st kernel. Kernel panic occurs because IPI comes before call_single_queue is initialized. To fix the crash, rename init_call_single_data() to call_function_init() and call it in start_kernel() so that call_single_queue can be initialized before enabling interrupts. The details of the crash are: (1) 2nd kernel boots up (2) A pending IPI from 1st kernel comes when irqs are first enabled in start_kernel(). (3) Kernel tries to handle the interrupt, but call_single_queue is not initialized yet at this point. As a result, in the generic_smp_call_function_single_interrupt(), NULL pointer dereference occurs when list_replace_init() tries to access &q->list.next. Therefore this patch changes the name of init_call_single_data() to call_function_init() and calls it before local_irq_enable() in start_kernel(). Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Milton Miller <miltonm@bga.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: kexec@lists.infradead.org Link: http://lkml.kernel.org/r/D6CBEE2F420741indou.takao@jp.fujitsu.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | tools/perf: Fix static build of perf toolMathias Krause2011-06-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To build a statically linked version of the perf tool all needed libraries must be added in the correct order to get the symbols resolved. Currently this is broken when, e.g. python or newt support is enabled -- libpython needs libpthread which is an unconditional link dependency of the perf tool; libslang needs libm, another unconditional dependency. To solve the problem in the long run without the need to keep track of transitive library dependencies, simply make the linker look at the EXTLIBS multiple times until it has all symbols resolved. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/1308171818-20370-1-git-send-email-minipli@googlemail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | Merge branch 'tip/perf/urgent' of ↵Ingo Molnar2011-06-131-4/+1
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent
| | * | | | | | tracing: Fix regression in printk_formats fileSteven Rostedt2011-06-091-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fix to fix the printk_formats of modules broke the printk_formats of trace_printks in the kernel. The update of what to show via the seq_file was only updated if the passed in fmt was NULL, which happens only on the first iteration. The result was showing the first format every time instead of iterating through the available formats. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | | | | | | Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds2011-06-199-403/+466
|\ \ \ \ \ \ \ \ | |_|_|_|_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: Move RCU_BOOST #ifdefs to header file rcu: use softirq instead of kthreads except when RCU_BOOST=y rcu: Use softirq to address performance regression rcu: Simplify curing of load woes
| * | | | | | | rcu: Move RCU_BOOST #ifdefs to header filePaul E. McKenney2011-06-163-382/+390
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit "use softirq instead of kthreads except when RCU_BOOST=y" just applied #ifdef in place. This commit is a cleanup that moves the newly #ifdef'ed code to the header file kernel/rcutree_plugin.h. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| * | | | | | | rcu: use softirq instead of kthreads except when RCU_BOOST=yPaul E. McKenney2011-06-154-41/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch #ifdefs RCU kthreads out of the kernel unless RCU_BOOST=y, thus eliminating context-switch overhead if RCU priority boosting has not been configured. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| * | | | | | | rcu: Use softirq to address performance regressionShaohua Li2011-06-148-6/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread) introduced performance regression. In an AIM7 test, this commit degraded performance by about 40%. The commit runs rcu callbacks in a kthread instead of softirq. We observed high rate of context switch which is caused by this. Out test system has 64 CPUs and HZ is 1000, so we saw more than 64k context switch per second which is caused by RCU's per-CPU kthread. A trace showed that most of the time the RCU per-CPU kthread doesn't actually handle any callbacks, but instead just does a very small amount of work handling grace periods. This means that RCU's per-CPU kthreads are making the scheduler do quite a bit of work in order to allow a very small amount of RCU-related processing to be done. Alex Shi's analysis determined that this slowdown is due to lock contention within the scheduler. Unfortunately, as Peter Zijlstra points out, the scheduler's real-time semantics require global action, which means that this contention is inherent in real-time scheduling. (Yes, perhaps someone will come up with a workaround -- otherwise, -rt is not going to do well on large SMP systems -- but this patch will work around this issue in the meantime. And "the meantime" might well be forever.) This patch therefore re-introduces softirq processing to RCU, but only for core RCU work. RCU callbacks are still executed in kthread context, so that only a small amount of RCU work runs in softirq context in the common case. This should minimize ksoftirqd execution, allowing us to skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Tested-by: "Alex,Shi" <alex.shi@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| * | | | | | | rcu: Simplify curing of load woesPaul E. McKenney2011-06-142-54/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the functions creating the kthreads wake them up. Leverage the fact that the per-node and boost kthreads can run anywhere, thus dispensing with the need to wake them up once the incoming CPU has gone fully online. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Daniel J Blueman <daniel.blueman@gmail.com>
* | | | | | | | Merge branch 'hwmon-for-linus' of ↵Linus Torvalds2011-06-185-8/+6
|\ \ \ \ \ \ \ \ | |_|_|_|_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging: hwmon: (s3c) Initialize sysfs attributes hwmon: (ibmpex) Initialize sysfs attributes hwmon: (ibmaem) Initialize sysfs attributes hwmon: (asus_atk0110) Consolidate sysfs attribute initialization hwmon: (coretemp) Drop unused struct members
| * | | | | | | hwmon: (s3c) Initialize sysfs attributesGuenter Roeck2011-06-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initialize dynamically allocated sysfs attributes before device_create_file() call to suppress lockdep_init_map() warning if lockdep debugging is enabled. Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Acked-by: Jean Delvare <khali@linux-fr.org> Cc: stable@kernel.org # 2.6.34+
| * | | | | | | hwmon: (ibmpex) Initialize sysfs attributesGuenter Roeck2011-06-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initialize dynamically allocated sysfs attributes before device_create_file() call to suppress lockdep_init_map() warning if lockdep debugging is enabled. Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Acked-by: Jean Delvare <khali@linux-fr.org> Cc: stable@kernel.org # 2.6.34+
| * | | | | | | hwmon: (ibmaem) Initialize sysfs attributesGuenter Roeck2011-06-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initialize dynamically allocated sysfs attributes before device_create_file() call to suppress lockdep_init_map() warning if lockdep debugging is enabled. Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Acked-by: Jean Delvare <khali@linux-fr.org> Cc: stable@kernel.org # 2.6.34+
| * | | | | | | hwmon: (asus_atk0110) Consolidate sysfs attribute initializationGuenter Roeck2011-06-171-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Call sysfs_attr_init() from atk_init_attribute() to handle sysfs attribute initialization in a single function. Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Cc: Luca Tettamanti <kronos.it@gmail.com> Acked-by: Jean Delvare <khali@linux-fr.org>
| * | | | | | | hwmon: (coretemp) Drop unused struct membersJean Delvare2011-06-171-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pdev_entry.cpu and pdev_entry.cpu_core_id aren't used anywhere in the driver code so we can drop these struct members. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Guenter Roeck <guenter.roeck@ericsson.com> Cc: Durgadoss R <durgadoss.r@intel.com> Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
* | | | | | | | isofs: fix bh leak in isofs_fill_super() error caseLinus Torvalds2011-06-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In isofs_fill_super(), when an iso_primary_descriptor is found, it is kept in pri_bh. The error cases don't properly release it. Fix it. Reported-and-tested-by: 김원석 <stanley.will.kim@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | Merge branch 'drm-fixes' of ↵Linus Torvalds2011-06-171-0/+10
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/i915: Fix gen6 (SNB) missed BLT ring interrupts.
| * | | | | | | | drm/i915: Fix gen6 (SNB) missed BLT ring interrupts.Daniel J Blueman2011-06-181-0/+10
| |/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The failure appeared in dmesg as: [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blt ring idle [waiting on 35064155, at 35064155], missed IRQ? This works around that problem on by making the blitter command streamer write interrupt state to the Hardware Status Page when a MI_USER_INTERRUPT command is decoded, which appears to force the seqno out to memory before the interrupt happens. v1->v2: Moved to prior interrupt handler installation and RMW flags as per feedback. v2->v3: Removed RMW of flags (by anholt) Cc: stable@kernel.org Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> [v1] Tested-by: Eric Anholt <eric@anholt.net> [v1,v3] (incidence of the bug with a testcase went from avg 2/1000 to 0/12651 in the latest test run (plus more for v1)) Tested-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Tested-by: Robert Hooker <robert.hooker@canonical.com> [v1] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33394 Signed-off-by: Dave Airlie <airlied@redhat.com>
* | | | | | | | Merge branch 'fixes' of ↵Linus Torvalds2011-06-172-4/+10
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] powernow-k8: Don't try to transition if the pstate is incorrect [CPUFREQ] powernow-k8: Don't notify of successful transition if we failed (vid case). [CPUFREQ] Don't set stat->last_index to -1 if the pol->cur has incorrect value.
| * | | | | | | | [CPUFREQ] powernow-k8: Don't try to transition if the pstate is incorrectKonrad Rzeszutek Wilk2011-06-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch augments the pstate transition code to error out (instead of returning 0) when an incorrect pstate is provided. Suggested-by: Borislav Petkov <bp@alien8.de> CC: andre.przywara@amd.com CC: Mark.Langsdorf@amd.com Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Jones <davej@redhat.com>
| * | | | | | | | [CPUFREQ] powernow-k8: Don't notify of successful transition if we failed ↵Konrad Rzeszutek Wilk2011-06-161-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (vid case). Before this patch if we failed the vid transition would still try to submit the "new" frequencies to cpufreq. That is incorrect - also we could submit a non-existing frequency value which would cause cpufreq to crash. The ultimate fix is in cpufreq to deal with incorrect values, but this patch improves the error recovery in the AMD powernowk8 driver. The failure that was reported was as follows: powernow-k8: Found 1 AMD Athlon(tm) 64 Processor 3700+ (1 cpu cores) (version 2.20.00) powernow-k8: fid 0x2 (1000 MHz), vid 0x12 powernow-k8: fid 0xa (1800 MHz), vid 0xa powernow-k8: fid 0xc (2000 MHz), vid 0x8 powernow-k8: fid 0xe (2200 MHz), vid 0x8 Marking TSC unstable due to cpufreq changes powernow-k8: fid trans failed, fid 0x2, curr 0x0 BUG: unable to handle kernel paging request at ffff880807e07b78 IP: [<ffffffff81479163>] cpufreq_stats_update+0x46/0x5b ... And transition fails and data->currfid ends up with 0. Since the machine does not support 800Mhz value when the calculation is done ('find_khz_freq_from_fid(data->currfid);') it reports the new frequency as 800000 which is bogus. This patch fixes the issue during target setting. The patch however does not fix the issue in 'powernowk8_cpu_init' where the pol->cur can also be set with the 800000 value: pol->cur = find_khz_freq_from_fid(data->currfid); dprintk("policy current frequency %d kHz\n", pol->cur); /* min/max the cpu is capable of */ if (cpufreq_frequency_table_cpuinfo(pol, data->powernow_table)) { The fix for that looks to update cpufreq_frequency_table_cpuinfo to check pol->cur.... but that would cause an regression in how the acpi-cpufreq driver works (it sets cpu->cur after calling cpufreq_frequency_table_cpuinfo). Instead the fix will be to let cpufreq gracefully handle bogus data (another patch). Acked-by: Borislav Petkov <bp@alien8.de> CC: andre.przywara@amd.com CC: Mark.Langsdorf@amd.com Reported-by: Tobias Diedrich <ranma+xen@tdiedrich.de> Tested-by: Tobias Diedrich <ranma+xen@tdiedrich.de> [v1: Rebased on v3.0-rc2, reduced patch to deal with vid case] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Jones <davej@redhat.com>
| * | | | | | | | [CPUFREQ] Don't set stat->last_index to -1 if the pol->cur has incorrect value.Konrad Rzeszutek Wilk2011-06-161-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the driver submitted an non-existing pol>cur value (say it used the default initialized value of zero), when the cpufreq stats tries to setup its initial values it incorrectly sets stat->last_index to -1 (or 0xfffff...). And cpufreq_stats_update tries to update at that index location and fails. This can be caused by: stat->last_index = freq_table_get_index(stat, policy->cur); not finding the appropiate frequency in the table (b/c the policy->cur is wrong) and we end up crashing. The fix however is concentrated in the 'cpufreq_stats_update' as the last_index (and old_index) are updated there. Which means it can reset the last_index to -1 again and on the next iteration cause a crash. Without this patch, the following crash is observed: powernow-k8: Found 1 AMD Athlon(tm) 64 Processor 3700+ (1 cpu cores) (version 2.20.00) powernow-k8: fid 0x2 (1000 MHz), vid 0x12 powernow-k8: fid 0xa (1800 MHz), vid 0xa powernow-k8: fid 0xc (2000 MHz), vid 0x8 powernow-k8: fid 0xe (2200 MHz), vid 0x8 Marking TSC unstable due to cpufreq changes powernow-k8: fid trans failed, fid 0x2, curr 0x0 BUG: unable to handle kernel paging request at ffff880807e07b78 IP: [<ffffffff81479163>] cpufreq_stats_update+0x46/0x5b .. snip.. Pid: 1, comm: swapper Not tainted 3.0.0-rc2 #45 MICRO-STAR INTERNATIONAL CO., LTD MS-7094/MS-7094 ..snip.. Call Trace: [<ffffffff81479248>] cpufreq_stat_notifier_trans+0x48/0x7c [<ffffffff81095d68>] notifier_call_chain+0x32/0x5e [<ffffffff81095e6b>] __srcu_notifier_call_chain+0x47/0x63 [<ffffffff81095e96>] srcu_notifier_call_chain+0xf/0x11 [<ffffffff81477e7a>] cpufreq_notify_transition+0x111/0x134 [<ffffffff8147b0d4>] powernowk8_target+0x53b/0x617 [<ffffffff8147723a>] __cpufreq_driver_target+0x2e/0x30 [<ffffffff8147a127>] cpufreq_governor_dbs+0x339/0x356 [<ffffffff81477394>] __cpufreq_governor+0xa8/0xe9 [<ffffffff81477525>] __cpufreq_set_policy+0x132/0x13e [<ffffffff8147848d>] cpufreq_add_dev_interface+0x272/0x28c Reported-by: Tobias Diedrich <ranma+xen@tdiedrich.de> Tested-by: Tobias Diedrich <ranma+xen@tdiedrich.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Jones <davej@redhat.com>
* | | | | | | | | Merge branch 'anon_vma-locking'Linus Torvalds2011-06-171-31/+75
|\ \ \ \ \ \ \ \ \ | |_|/ / / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * anon_vma-locking: mm: avoid anon_vma_chain allocation under anon_vma lock mm: avoid repeated anon_vma lock/unlock sequences in unlink_anon_vmas() mm: avoid repeated anon_vma lock/unlock sequences in anon_vma_clone()
| * | | | | | | | mm: avoid anon_vma_chain allocation under anon_vma lockLinus Torvalds2011-06-171-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hugh Dickins points out that lockdep (correctly) spots a potential deadlock on the anon_vma lock, because we now do a GFP_KERNEL allocation of anon_vma_chain while doing anon_vma_clone(). The problem is that page reclaim will want to take the anon_vma lock of any anonymous pages that it will try to reclaim. So re-organize the code in anon_vma_clone() slightly: first do just a GFP_NOWAIT allocation, which will usually work fine. But if that fails, let's just drop the lock and re-do the allocation, now with GFP_KERNEL. End result: not only do we avoid the locking problem, this also ends up getting better concurrency in case the allocation does need to block. Tim Chen reports that with all these anon_vma locking tweaks, we're now almost back up to the spinlock performance. Reported-and-tested-by: Hugh Dickins <hughd@google.com> Tested-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | mm: avoid repeated anon_vma lock/unlock sequences in unlink_anon_vmas()Peter Zijlstra2011-06-171-21/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This matches the anon_vma_clone() case, and uses the same lock helper functions. Because of the need to potentially release the anon_vma's, it's a bit more complex, though. We traverse the 'vma->anon_vma_chain' in two phases: the first loop gets the anon_vma lock (with the helper function that only takes the lock once for the whole loop), and removes any entries that don't need any more processing. The second phase just traverses the remaining list entries (without holding the anon_vma lock), and does any actual freeing of the anon_vma's that is required. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Hugh Dickins <hughd@google.com> Tested-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | mm: avoid repeated anon_vma lock/unlock sequences in anon_vma_clone()Linus Torvalds2011-06-171-3/+36
| | |_|_|_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In anon_vma_clone() we traverse the vma->anon_vma_chain of the source vma, locking the anon_vma for each entry. But they are all going to have the same root entry, which means that we're locking and unlocking the same lock over and over again. Which is expensive in locked operations, but can get _really_ expensive when that root entry sees any kind of lock contention. In fact, Tim Chen reports a big performance regression due to this: when we switched to use a mutex instead of a spinlock, the contention case gets much worse. So to alleviate this all, this commit creates a small helper function (lock_anon_vma_root()) that can be used to take the lock just once rather than taking and releasing it over and over again. We still have the same "take the lock and release" it behavior in the exit path (in unlink_anon_vmas()), but that one is a bit harder to fix since we're actually freeing the anon_vma entries as we go, and that will touch the lock too. Reported-and-tested-by: Tim Chen <tim.c.chen@linux.intel.com> Tested-by: Hugh Dickins <hughd@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2011-06-179-14/+33
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: hda - Fix no NID error with VIA codecs ALSA: isight: adjust for new queueing API ALSA: 6fire - Fix signedness bug ALSA: lola - Fix section mismatch ALSA: hda - Fix missing static inline to beep dummy function ALSA: 6fire: Fix double-free bug in usb6fire_fw_ezusb_upload() ALSA: hda - Fix beep_device compilation warnings ALSA: hda - Fix wrong auto-mute type for Acer Aspire-one ALSA: emu10k1: Add details for E-mu 0404 PCIe version ALSA: hdspm - Add firmware revision ID for RME MADI PCI version ALSA: hdspm - Fix jumping external wordclock frequency in AutoSync mode ALSA: hdspm - Fix locking in snd_hdspm_midi_input_read
OpenPOWER on IntegriCloud