summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2010-11-1234-519/+367
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-2.6-block: (27 commits) block: remove unused copy_io_context() Documentation: remove anticipatory scheduler info block: remove REQ_HARDBARRIER ioprio: rcu_read_lock/unlock protect find_task_by_vpid call (V2) ioprio: fix RCU locking around task dereference block: ioctl: fix information leak to userland block: read i_size with i_size_read() cciss: fix proc warning on attempt to remove non-existant directory bio: take care not overflow page count when mapping/copying user data block: limit vec count in bio_kmalloc() and bio_alloc_map_data() block: take care not to overflow when calculating total iov length block: check for proper length of iov entries in blk_rq_map_user_iov() cciss: remove controllers supported by hpsa cciss: use usleep_range not msleep for small sleeps cciss: limit commands allocated on reset_devices cciss: Use kernel provided PCI state save and restore functions cciss: fix board status waiting code drbd: Removed checks for REQ_HARDBARRIER on incomming BIOs drbd: REQ_HARDBARRIER -> REQ_FUA transition for meta data accesses drbd: Removed the BIO_RW_BARRIER support form the receiver/epoch code ...
| * block: remove unused copy_io_context()Jens Axboe2010-11-112-15/+0
| | | | | | | | | | Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * Documentation: remove anticipatory scheduler infoRandy Dunlap2010-11-113-7/+7
| | | | | | | | | | | | | | | | | | | | Remove anticipatory block I/O scheduler info from Documentation/ since the code has been deleted. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Reported-by: "Robert P. J. Day" <rpjday@crashcourse.ca> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * block: remove REQ_HARDBARRIERChristoph Hellwig2010-11-1011-51/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | REQ_HARDBARRIER is dead now, so remove the leftovers. What's left at this point is: - various checks inside the block layer. - sanity checks in bio based drivers. - now unused bio_empty_barrier helper. - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it's dead for a while, but Xen really needs to sort out it's barrier situaton. - setting of ordered tags in uas - dead code copied from old scsi drivers. - scsi different retry for barriers - it's dead and should have been removed when flushes were converted to FS requests. - blktrace handling of barriers - removed. Someone who knows blktrace better should add support for REQ_FLUSH and REQ_FUA, though. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * Merge branch 'for-2.6.37/drivers' into for-linusJens Axboe2010-11-1011-413/+265
| |\ | | | | | | | | | | | | | | | | | | Conflicts: drivers/block/cciss.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * cciss: remove controllers supported by hpsaStephen M. Cameron2010-10-231-41/+4
| | | | | | | | | | | | | | | | | | | | | | | | We would prefer not to have any overlap between the two drivers. Remove the cciss_allow_hpsa option, as it it is no longer needed. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * cciss: use usleep_range not msleep for small sleepsStephen M. Cameron2010-10-231-1/+1
| | | | | | | | | | | | | | | Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * cciss: limit commands allocated on reset_devicesStephen M. Cameron2010-10-231-0/+5
| | | | | | | | | | | | | | | | | | | | | This is to conserve memory in a memory-limited kdump scenario Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * cciss: Use kernel provided PCI state save and restore functionsStephen M. Cameron2010-10-231-58/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | and use the doorbell reset method if available (which doesn't lock up the controller if you properly save and restore all the PCI registers that you're supposed to.) Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * cciss: fix board status waiting codeStephen M. Cameron2010-10-232-8/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After a reset, we should first wait for the board to become "not ready", and then wait for it to become "ready", instead of immediately waiting for it to become "ready", and do this waiting *after* restoring PCI config space registers. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * Merge branch 'for-jens' of git://git.drbd.org/linux-2.6-drbd into ↵Jens Axboe2010-10-239-344/+202
| | |\ | | | | | | | | | | | | for-2.6.37/drivers
| | | * drbd: Removed checks for REQ_HARDBARRIER on incomming BIOsPhilipp Reisner2010-10-233-20/+0
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| | | * drbd: REQ_HARDBARRIER -> REQ_FUA transition for meta data accessesPhilipp Reisner2010-10-233-20/+7
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| | | * drbd: Removed the BIO_RW_BARRIER support form the receiver/epoch codePhilipp Reisner2010-10-236-219/+33
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| | | * drbd: Silenced an assertPhilipp Reisner2010-10-222-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | That assertion's condition needed adjustment for today's semantics Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| | | * drbd: rate limit an error messageLars Ellenberg2010-10-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we don't rate limit it, and you happen to log err level messages via serial console, an IO error on a disconnected Primary may cause serious unresponsiveness. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| | | * drbd: fix a misleading printkLars Ellenberg2010-10-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This codepath used to be called only for failed kmalloc GFP_ATOMIC, but is now also triggered by other things. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| | | * drbd: fix potential data divergence after multiple failuresLars Ellenberg2010-10-223-11/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we get an IO-error during an activity log transaction, if we failed to write the bitmap of the evicted extent, we must not write the transaction itself. If we failed to write the transaction, we must not even submit the corresponding bio, as its extent is not yet marked in the activity log. Otherwise, if this was a disconneted Primary (degraded cluster), which now lost its disk as well, and we later re-attach the same backend storage, we possibly "forget" to resync some parts of the disk that potentially have been changed. On the receiving side, when receiving from a peer with unhealthy disk, checking for pdsk == D_DISKLESS is not enough, we need to set out of sync and do AL transactions for everything pdsk < D_INCONSISTENT on the receiving side. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| | | * drbd: fix potential deadlock on detachLars Ellenberg2010-10-224-63/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we have contention in drbd_al_begin_iod (heavy randon IO), an administrative request to detach the disk may deadlock for similar reasons as the recently fixed deadlock if detaching because of IO-error. The approach taken here is to either go through the intermediate cleanup state D_FAILED, or first lock out application io, don't just go directly to D_DISKLESS. We need an additional state bit (WAS_IO_ERROR) to distinguish the -> D_FAILED because of IO-error from other failures. Sanitize D_ATTACHING -> D_FAILED to D_ATTACHING -> D_DISKLESS. If only attaching, ldev may be missing still, but would be referenced from within the after_state_ch for -> D_FAILED, potentially dereferencing a NULL pointer. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| | | * drbd: tag a few error messages with "assert failed"Lars Ellenberg2010-10-221-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If those messages ever get logged, clearly state that they are actually failed ASSERTS, so our regression tests can pick them up from the logs more easily. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| | | * drbd: consolidate explicit drbd_md_sync into drbd_create_new_uuidLars Ellenberg2010-10-222-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Every code path changing the current UUID needs to get it on stable storage anyways. Flush it to disk right there, remove the now obsolte explicit drbd_md_sync statements in the other code paths. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | ioprio: rcu_read_lock/unlock protect find_task_by_vpid call (V2)Sergey Senozhatsky2010-11-101-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns= Assertion failed in sys_ioprio_get. The patch is fixing assertion failure in ioprio_set as well. kernel/pid.c:419 invoked rcu_dereference_check() without protection! stack backtrace: Pid: 4254, comm: iotop Not tainted Call Trace: [<ffffffff810656f2>] lockdep_rcu_dereference+0xaa/0xb2 [<ffffffff81053c67>] find_task_by_pid_ns+0x4f/0x68 [<ffffffff81053c9d>] find_task_by_vpid+0x1d/0x1f [<ffffffff811104e2>] sys_ioprio_get+0x50/0x2da [<ffffffff81002182>] system_call_fastpath+0x16/0x1b V2: rcu critical section expanded according to comment by Paul E. McKenney Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | | ioprio: fix RCU locking around task dereferenceDaniel J Blueman2010-11-101-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With 2.6.37-rc1, I observe sys_ioprio_set not taking the RCU lock [1] across access to the task credentials. Inspecting the code in fs/ioprio.c, the tasklist_lock is held for read across the __task_cred call, which is presumably sufficient to prevent the task credentials becoming stale. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- kernel/pid.c:419 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 1 lock held by start-stop-daem/2246: #0: (tasklist_lock){.?.?..}, at: [<ffffffff811a2dfa>] sys_ioprio_set+0x8a/0x400 stack backtrace: Pid: 2246, comm: start-stop-daem Not tainted 2.6.37-rc1-330cd+ #2 Call Trace: [<ffffffff8109f5f4>] lockdep_rcu_dereference+0xa4/0xc0 [<ffffffff81085651>] find_task_by_pid_ns+0x81/0x90 [<ffffffff8108567d>] find_task_by_vpid+0x1d/0x20 [<ffffffff811a3160>] sys_ioprio_set+0x3f0/0x400 [<ffffffff816efa79>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81003482>] system_call_fastpath+0x16/0x1b Take the RCU lock for read across acquiring the pointer to the task credentials and dereferencing it. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Fixed up by Jens to fix missing rcu_read_unlock() on mismatches. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | | block: ioctl: fix information leak to userlandVasiliy Kulikov2010-11-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Structure hd_geometry is copied to userland with 4 padding bytes between cylinders and start fields uninitialized on 64-bit platforms. It leads to leaking of contents of kernel stack memory. Currently there is no memset() in real implementations of getgeo() in drivers/block/, so it makes sense to have memset() in blkdev_ioctl(). Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | | block: read i_size with i_size_read()Mike Snitzer2010-11-105-18/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert direct reads of an inode's i_size to using i_size_read(). i_size_{read,write} use a seqcount to protect reads from accessing incomple writes. Concurrent i_size_write()s require mutual exclussion to protect the seqcount that is used by i_size_{read,write}. But i_size_read() callers do not need to use additional locking. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: NeilBrown <neilb@suse.de> Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | | cciss: fix proc warning on attempt to remove non-existant directoryJens Axboe2010-11-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Randy reports that he gets the following stack trace when removing the cciss module: [ 109.164277] Pid: 3463, comm: rmmod Not tainted 2.6.37-rc1 #7 [ 109.164280] Call Trace: [ 109.164292] [<ffffffff8107eb8d>] warn_slowpath_common+0xc6/0xf3 [ 109.164299] [<ffffffff8107ecaa>] warn_slowpath_fmt+0x5b/0x6b [ 109.164307] [<ffffffff8155175b>] ? _raw_spin_unlock+0x40/0x4b [ 109.164313] [<ffffffff8123dd1e>] remove_proc_entry+0x156/0x35e [ 109.164320] [<ffffffff812cd91b>] ? do_raw_spin_unlock+0xff/0x10f [ 109.164327] [<ffffffff8113823d>] ? trace_hardirqs_on+0x10/0x4a [ 109.164333] [<ffffffff8155162d>] ? _raw_spin_unlock_irq+0x4c/0x7b [ 109.164339] [<ffffffff8154d4d1>] ? wait_for_common+0x145/0x15e [ 109.164345] [<ffffffff81075337>] ? default_wake_function+0x0/0x22 [ 109.164357] [<ffffffffa0615a8f>] cciss_cleanup+0xa9/0xc7 [cciss] [ 109.164365] [<ffffffff810d3cb0>] sys_delete_module+0x2d6/0x368 [ 109.164371] [<ffffffff8155036b>] ? lockdep_sys_exit_thunk+0x35/0x67 [ 109.164377] [<ffffffff810fdfaf>] ? audit_syscall_entry+0x172/0x1a5 [ 109.164383] [<ffffffff815502f5>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 109.164389] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b [ 109.164394] ---[ end trace 88e8568246ed0b1d ]--- which will happen if you don't actually have an HP CISS adapter, since it'll do an uncondional removal of a proc directory it never attempted to create in that case. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Tested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | | bio: take care not overflow page count when mapping/copying user dataJens Axboe2010-11-101-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the iovec is being set up in a way that causes uaddr + PAGE_SIZE to overflow, we could end up attempting to map a huge number of pages. Check for this invalid input type. Reported-by: Dan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | | block: limit vec count in bio_kmalloc() and bio_alloc_map_data()Jens Axboe2010-11-101-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | Reported-by: Dan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | | block: take care not to overflow when calculating total iov lengthJens Axboe2010-11-101-10/+24
| | | | | | | | | | | | | | | | | | | | | | | | Reported-by: Dan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | | block: check for proper length of iov entries in blk_rq_map_user_iov()Jens Axboe2010-11-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure that we pass down properly validated iov segments before calling into the mapping or copy functions. Reported-by: Dan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* | | | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2010-11-129-155/+119
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pvclock: Remove leftover scale_delta() function x86, apic: Remove double #include x86: Adjust section annotations in AMD Fam10 MMCONF enabling code x86, UV: Update node controller MMRs x86: Remove unnecessary casts of void ptr returning alloc function return values x86: Address gcc4.6 "set but not used" warnings in apic.h x86, mm: Fix section mismatch in tlb.c
| * | | | x86, pvclock: Remove leftover scale_delta() functionKusanagi Kouichi2010-11-101-38/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 92580d64e16402762e2acc3022f065397c780425 ("x86: pvclock: Move scale_delta into common header") forgot to remove scale_delta. Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Cc: Zachary Amsden <zamsden@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Glauber Costa <glommer@redhat.com> LKML-Reference: <20101105110444.BAF6D6FC03B@msa105.auone-net.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86, apic: Remove double #includeJesper Juhl2010-11-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the second <asm/atomic.h> inclusion. Signed-off-by: Jesper Juhl <jj@chaosbits.net> LKML-Reference: <alpine.LNX.2.00.1011072253360.26247@swampdragon.chaosbits.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86: Adjust section annotations in AMD Fam10 MMCONF enabling codeJan Beulich2010-11-101-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | check_enable_amd_mmconf_dmi() gets called only for the BSP, hence everything hanging off of it can be __init*. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4CD2DE1E0200007800020990@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86, UV: Update node controller MMRsJack Steiner2010-11-102-99/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A new version of the SGI UV hub node controller is being developed. A few of the MMRs (control registers) that exist on the current hub no longer exist on the new hub. Fortunately, there are alternate MMRs that are are functionally equivalent and that exist on both hubs. This patch changes the UV code to use MMRs that exist in BOTH versions of the hub node controller. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20101106204056.GA27584@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86: Remove unnecessary casts of void ptr returning alloc function return valuesJesper Juhl2010-11-102-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The [vk][cmz]alloc(_node) family of functions return void pointers which it's completely unnecessary/pointless to cast to other pointer types since that happens implicitly. This patch removes such casts from arch/x86. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: trivial@kernel.org Cc: amd64-microcode@amd64.org Cc: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <alpine.LNX.2.00.1011082310220.23697@swampdragon.chaosbits.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86: Address gcc4.6 "set but not used" warnings in apic.hAndi Kleen2010-11-091-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | native_apic_msr_read() and x2apic_enabled() use rdmsr(msr, low, high), but only use the low part. gcc4.6 complains about this: .../apic.h:144:11: warning: variable 'high' set but not used [-Wunused-but-set-variable] rdmsr() is just a wrapper around rdmsrl() which splits the 64bit value into low and high, so using rdmsrl() directly solves this. [tglx: Changed the variables to u64 as suggested by Cyrill. It's less confusing and has no code impact as this is 64bit only anyway. Massaged changelog as well. ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: x86@kernel.org Cc: Cyrill Gorcunov <gorcunov@gmail.com> LKML-Reference: <1289251229-19589-1-git-send-email-andi@firstfloor.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | x86, mm: Fix section mismatch in tlb.cRakib Mullick2010-11-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark tlb_cpuhp_notify as __cpuinit. It's basically a callback function, which is called from __cpuinit init_smp_flash(). So - it's safe. We were warned by the following warning: WARNING: arch/x86/mm/built-in.o(.text+0x356d): Section mismatch in reference from the function tlb_cpuhp_notify() to the function .cpuinit.text:calculate_tlb_offset() The function tlb_cpuhp_notify() references the function __cpuinit calculate_tlb_offset(). This is often because tlb_cpuhp_notify lacks a __cpuinit annotation or the annotation of calculate_tlb_offset is wrong. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Shaohua Li <shaohua.li@intel.com> LKML-Reference: <AANLkTinWQRG=HA9uB3ad0KAqRRTinL6L_4iKgF84coph@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds2010-11-1221-104/+271
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf, amd: Use kmalloc_node(,__GFP_ZERO) for northbridge structure allocation perf_events: Fix time tracking in samples perf trace: update usage perf trace: update Documentation with new perf trace variants perf trace: live-mode command-line cleanup perf trace record: handle commands correctly perf record: make the record options available outside perf record perf trace scripting: remove system-wide param from shell scripts perf trace scripting: fix some small memory leaks and missing error checks perf: Fix usages of profile_cpu in builtin-top.c to use cpu_list perf, ui: Eliminate stack-smashing protection compiler complaint
| * | | | | perf, amd: Use kmalloc_node(,__GFP_ZERO) for northbridge structure allocationPeter Zijlstra2010-11-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jasper suggested we use the zeroing capability of the allocators instead of calling memset ourselves. Add node affinity while we're at it. Reported-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | perf_events: Fix time tracking in samplesStephane Eranian2010-11-102-8/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch corrects time tracking in samples. Without this patch both time_enabled and time_running are bogus when user asks for PERF_SAMPLE_READ. One uses PERF_SAMPLE_READ to sample the values of other counters in each sample. Because of multiplexing, it is necessary to know both time_enabled, time_running to be able to scale counts correctly. In this second version of the patch, we maintain a shadow copy of ctx->time which allows us to compute ctx->time without calling update_context_time() from NMI context. We avoid the issue that update_context_time() must always be called with ctx->lock held. We do not keep shadow copies of the other event timings because if the lead event is overflowing then it is active and thus it's been scheduled in via event_sched_in() in which case neither tstamp_stopped, tstamp_running can be modified. This timing logic only applies to samples when PERF_SAMPLE_READ is used. Note that this patch does not address timing issues related to sampling inheritance between tasks. This will be addressed in a future patch. With this patch, the libpfm4 example task_smpl now reports correct counts (shown on 2.4GHz Core 2): $ task_smpl -p 2400000000 -e unhalted_core_cycles:u,instructions_retired:u,baclears noploop 5 noploop for 5 seconds IIP:0x000000004006d6 PID:5596 TID:5596 TIME:466,210,211,430 STREAM_ID:33 PERIOD:2,400,000,000 ENA=1,010,157,814 RUN=1,010,157,814 NR=3 2,400,000,254 unhalted_core_cycles:u (33) 2,399,273,744 instructions_retired:u (34) 53,340 baclears (35) Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4cc6e14b.1e07e30a.256e.5190@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | perf trace: update usageTom Zanussi2010-11-101-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update usage to reflect the different perf trace variants. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
| * | | | | perf trace: update Documentation with new perf trace variantsTom Zanussi2010-11-101-8/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add documentation describing new 'perf trace' command changes e.g. <command> handling and live-mode/top variants. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
| * | | | | perf trace: live-mode command-line cleanupTom Zanussi2010-11-101-57/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch attempts to make the perf trace command-line for live-mode commands more user-friendly and consistent with other perf commands. The main change it makes is to allow <commands> to be run as part of perf trace live-mode commands, as other perf commands do, instead of the system-wide traces they're currently hard-coded to by the shell scripts. With this patch, the following live-mode trace now works as expected: $ perf trace rw-by-pid ls -al The previous system-wide behavior for this command would still be available by explicitly specifying -a: $ perf trace rw-by-pid -a ls -al and if no <command> is specified, the output is also system-wide: $ perf trace rw-by-pid Because live-mode requires both record and report steps to be invoked, it isn't always possible to know which args to send to the report and which to send to the record steps - mainly this is the case for report scripts with optional args - in those cases it would be necessary to use separate 'perf trace record' and 'perf trace report' steps. For example: $ perf trace syscall-counts ls Here we can't decide whether ls should be passed as a param to the syscall-counts script or whether we should invoke ls as a <command>. In these cases, we just say that we'll ignore optional script params and always interpret the extra arguments as a <command>. If the user instead wants the other interpretation, that can be accomplished by using separate record and report commands explicitly: $ perf trace record syscall-counts $ perf trace report syscall-counts ls So the rules that this patch implements, which seem to make the most intuitive sense for live-mode commands: - for commands with optional args and commands with no args, no args are sent to the report script, all are sent to the record step - for 'top' commands i.e. that end with 'top', <commands> can't be used - all extra args are send to the report script as params - for commands with required args, the n required args are taken to be the first n args after the script name and sent to the report script, and the rest are sent to the record step Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
| * | | | | perf trace record: handle commands correctlyTom Zanussi2010-11-101-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because the perf-trace shell scripts hard-coded the use of the perf-record system-wide param, a perf trace record session was always system wide, even if it was given a command. If given a command, perf trace record now only records the events for the command, as users expect. If no command is given, or if the '-a' option is used, the recorded events are system-wide, as before. root@tropicana:~# perf trace record syscall-counts ls -al root@tropicana:~# perf trace ls-23152 [000] 39984.890387: sys_enter: NR 12 (0, 0, 0, 0, 0, 0) ls-23152 [000] 39984.890404: sys_enter: NR 9 (0, 0, 0, 0, 0, 0) root@tropicana:~# perf trace record syscall-counts -a ls -al root@tropicana:~# perf trace npviewer.bin-22297 [000] 39831.102709: sys_enter: NR 168 (0, 0, 0, 0, 0, 0) ls-23111 [000] 39831.107679: sys_enter: NR 59 (0, 0, 0, 0, 0, 0) Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
| * | | | | perf record: make the record options available outside perf recordTom Zanussi2010-11-101-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Other perf commands that invoke perf record, such as perf trace, may want to reuse the options used by perf record. This makes them non-static and renames them to avoid clashes with other 'options' variables. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
| * | | | | perf trace scripting: remove system-wide param from shell scriptsTom Zanussi2010-11-1013-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Including -a unconditionally when recording doesn't allow for the option of running scripts without it. Future patches will add add it back if needed at run-time. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
| * | | | | perf trace scripting: fix some small memory leaks and missing error checksTom Zanussi2010-11-101-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Free the other two fields of script_desc which somehow got overlooked, free malloc'ed args in case exec fails, and add missing checks for failed mallocs. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
| * | | | | perf: Fix usages of profile_cpu in builtin-top.c to use cpu_listCorey Ashford2010-11-101-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | profile_cpu was left over from an earlier implementation that supported running perf top on a single CPU. profile_cpu was no longer set by any switch and usages of it resulted in dead code. Instead, convert the code to use cpu_list, which is set by the -C <cpu_list> option. Also improved the printing of nr_cpus and cpu_list by correcting the plurals. Signed-off-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: a.p.zijlstra@chello.nl Cc: acme@redhat.com LKML-Reference: <1289269245-9388-1-git-send-email-cjashfor@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | perf, ui: Eliminate stack-smashing protection compiler complaintCyrill Gorcunov2010-11-101-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gcc complains about small auto-var strings being allocated from stack space. Make them const to avoid this: | CC util/ui/util.o | cc1: warnings being treated as errors | util/ui/util.c: In function ‘ui__dialog_yesno’: | util/ui/util.c:108: error: not protecting function: no buffer at least 8 bytes long | make: *** [util/ui/util.o] Error 1 The real bug is in the newtWinChoice() ABI - but that's an externality we cannot fix here, so we use this workaround. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20101106084724.GA5956@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
OpenPOWER on IntegriCloud