summaryrefslogtreecommitdiffstats
path: root/drivers/block/nbd.c
Commit message (Collapse)AuthorAgeFilesLines
* nbd: fix race in ioctlVegard Nossum2016-08-041-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quentin ran into this bug: WARNING: CPU: 64 PID: 10085 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x65/0x80 sysfs: cannot create duplicate filename '/devices/virtual/block/nbd3/pid' Modules linked in: nbd CPU: 64 PID: 10085 Comm: qemu-nbd Tainted: G D 4.6.0+ #7 0000000000000000 ffff8820330bba68 ffffffff814b8791 ffff8820330bbac8 0000000000000000 ffff8820330bbab8 ffffffff810d04ab ffff8820330bbaa8 0000001f00000296 0000000000017681 ffff8810380bf000 ffffffffa0001790 Call Trace: [<ffffffff814b8791>] dump_stack+0x4d/0x6c [<ffffffff810d04ab>] __warn+0xdb/0x100 [<ffffffff810d0574>] warn_slowpath_fmt+0x44/0x50 [<ffffffff81218c65>] sysfs_warn_dup+0x65/0x80 [<ffffffff81218a02>] sysfs_add_file_mode_ns+0x172/0x180 [<ffffffff81218a35>] sysfs_create_file_ns+0x25/0x30 [<ffffffff81594a76>] device_create_file+0x36/0x90 [<ffffffffa0000e8d>] __nbd_ioctl+0x32d/0x9b0 [nbd] [<ffffffff814cc8e8>] ? find_next_bit+0x18/0x20 [<ffffffff810f7c29>] ? select_idle_sibling+0xe9/0x120 [<ffffffff810f6cd7>] ? __enqueue_entity+0x67/0x70 [<ffffffff810f9bf0>] ? enqueue_task_fair+0x630/0xe20 [<ffffffff810efa76>] ? resched_curr+0x36/0x70 [<ffffffff810f0078>] ? check_preempt_curr+0x78/0x90 [<ffffffff810f00a2>] ? ttwu_do_wakeup+0x12/0x80 [<ffffffff810f01b1>] ? ttwu_do_activate.constprop.86+0x61/0x70 [<ffffffff810f0c15>] ? try_to_wake_up+0x185/0x2d0 [<ffffffff810f0d6d>] ? default_wake_function+0xd/0x10 [<ffffffff81105471>] ? autoremove_wake_function+0x11/0x40 [<ffffffffa0001577>] nbd_ioctl+0x67/0x94 [nbd] [<ffffffff814ac0fd>] blkdev_ioctl+0x14d/0x940 [<ffffffff811b0da2>] ? put_pipe_info+0x22/0x60 [<ffffffff811d96cc>] block_ioctl+0x3c/0x40 [<ffffffff811ba08d>] do_vfs_ioctl+0x8d/0x5e0 [<ffffffff811aa329>] ? ____fput+0x9/0x10 [<ffffffff810e9092>] ? task_work_run+0x72/0x90 [<ffffffff811ba627>] SyS_ioctl+0x47/0x80 [<ffffffff8185f5df>] entry_SYSCALL_64_fastpath+0x17/0x93 ---[ end trace 7899b295e4f850c8 ]--- It seems fairly obvious that device_create_file() is not being protected from being run concurrently on the same nbd. Quentin found the following relevant commits: 1a2ad21 nbd: add locking to nbd_ioctl 90b8f28 [PATCH] end of methods switch: remove the old ones d4430d6 [PATCH] beginning of methods conversion 08f8585 [PATCH] move block_device_operations to blkdev.h It would seem that the race was introduced in the process of moving nbd from BKL to unlocked ioctls. By setting nbd->task_recv while the mutex is held, we can prevent other processes from running concurrently (since nbd->task_recv is also checked while the mutex is held). Reported-and-tested-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Markus Pargmann <mpa@pengutronix.de> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Pavel Machek <pavel@suse.cz> Cc: Jens Axboe <axboe@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-blockLinus Torvalds2016-07-261-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull core block updates from Jens Axboe: - the big change is the cleanup from Mike Christie, cleaning up our uses of command types and modified flags. This is what will throw some merge conflicts - regression fix for the above for btrfs, from Vincent - following up to the above, better packing of struct request from Christoph - a 2038 fix for blktrace from Arnd - a few trivial/spelling fixes from Bart Van Assche - a front merge check fix from Damien, which could cause issues on SMR drives - Atari partition fix from Gabriel - convert cfq to highres timers, since jiffies isn't granular enough for some devices these days. From Jan and Jeff - CFQ priority boost fix idle classes, from me - cleanup series from Ming, improving our bio/bvec iteration - a direct issue fix for blk-mq from Omar - fix for plug merging not involving the IO scheduler, like we do for other types of merges. From Tahsin - expose DAX type internally and through sysfs. From Toshi and Yigal * 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits) block: Fix front merge check block: do not merge requests without consulting with io scheduler block: Fix spelling in a source code comment block: expose QUEUE_FLAG_DAX in sysfs block: add QUEUE_FLAG_DAX for devices to advertise their DAX support Btrfs: fix comparison in __btrfs_map_block() block: atari: Return early for unsupported sector size Doc: block: Fix a typo in queue-sysfs.txt cfq-iosched: Charge at least 1 jiffie instead of 1 ns cfq-iosched: Fix regression in bonnie++ rewrite performance cfq-iosched: Convert slice_resid from u64 to s64 block: Convert fifo_time from ulong to u64 blktrace: avoid using timespec block/blk-cgroup.c: Declare local symbols static block/bio-integrity.c: Add #include "blk.h" block/partition-generic.c: Remove a set-but-not-used variable block: bio: kill BIO_MAX_SIZE cfq-iosched: temporarily boost queue priority for idle classes block: drbd: avoid to use BIO_MAX_SIZE block: bio: remove BIO_MAX_SECTORS ...
| * block, drivers: add REQ_OP_FLUSH operationMike Christie2016-06-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | This adds a REQ_OP_FLUSH operation that is sent to request_fn based drivers by the block layer's flush code, instead of sending requests with the request->cmd_flags REQ_FLUSH bit set. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * drivers: use req op accessorMike Christie2016-06-071-1/+1
| | | | | | | | | | | | | | | | | | | | The req operation REQ_OP is separated from the rq_flag_bits definition. This converts the block layer drivers to use req_op to get the op from the request struct. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* | nbd: pass the nbd pointer for flags debugfsJosef Bacik2016-06-081-1/+1
|/ | | | | | | | | We were passing in &nbd for the private data in debugfs_create_file() for the flags entry. We expect it to just be nbd, fix this so we get proper output from this debugfs entry. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: switch to using blk_queue_write_cache()Jens Axboe2016-04-121-2/+2
| | | | | Signed-off-by: Jens Axboe <axboe@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* nbd: use correct div_s64 helperArnd Bergmann2016-03-041-2/+1
| | | | | | | | | | | | | | | | The do_div() macro now checks its arguments for the correct type, and refuses anything other than u64, so we get a warning about nbd_ioctl passing in an loff_t: drivers/block/nbd.c: In function '__nbd_ioctl': drivers/block/nbd.c:757:77: error: comparison of distinct pointer types lacks a cast [-Werror] This changes the nbd code to use div_s64() instead, which takes a signed argument. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 37091fdd831f ("nbd: Create size change events for userspace") Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Create size change events for userspaceMarkus Pargmann2016-02-151-21/+58
| | | | | | | | | | | | | | | | | The userspace needs to know when nbd devices are ready for use. Currently no events are created for the userspace which doesn't work for systemd. See the discussion here: https://github.com/systemd/systemd/pull/358 This patch uses a central point to setup the nbd-internal sizes. A ioctl to set a size does not lead to a visible size change. The size of the block device will be kept at 0 until nbd is connected. As soon as it connects, the size will be changed to the real value and a uevent is created. When disconnecting, the blockdevice is set to 0 size and another uevent is generated. Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
* nbd: ratelimit error msgs after socket closeDan Streetman2016-02-051-2/+2
| | | | | | | | | | | | | | | | | | Make the "Attempted send on closed socket" error messages generated in nbd_request_handler() ratelimited. When the nbd socket is shutdown, the nbd_request_handler() function emits an error message for every request remaining in its queue. If the queue is large, this will spam a large amount of messages to the log. There's no need for a separate error message for each request, so this patch ratelimits it. In the specific case this was found, the system was virtual and the error messages were logged to the serial port, which overwhelmed it. Fixes: 4d48a542b427 ("nbd: fix I/O hang on disconnected nbds") Signed-off-by: Dan Streetman <dan.streetman@canonical.com> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
* nbd: Move flag parsing to a functionMarkus Pargmann2016-02-051-9/+13
| | | | | | | | nbd changes properties of the blockdevice depending on flags that were received. This patch moves this flag parsing into a separate function nbd_parse_flags(). Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
* nbd: Cleanup reset of nbd and bdev after a disconnectMarkus Pargmann2016-02-051-11/+29
| | | | | | | | Group all variables that are reset after a disconnect into reset functions. This patch adds two of these functions, nbd_reset() and nbd_bdev_reset(). Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
* nbd: Timeouts are not user requested disconnectsMarkus Pargmann2016-02-051-3/+6
| | | | | | | | | It may be useful to know in the client that a connection timed out. The current code returns success for a timeout. This patch reports the error code -ETIMEDOUT for a timeout. Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
* nbd: Remove signal usageMarkus Pargmann2016-02-051-78/+48
| | | | | | | | | | | | | | | | | | | | As discussed on the mailing list, the usage of signals for timeout handling has a lot of potential issues. The nbd driver used for some time signals for timeouts. These signals where able to get the threads out of the blocking socket operations. This patch removes all signal usage and uses a socket shutdown instead. The socket descriptor itself is cleared later when the whole nbd device is closed. The tasks_lock is removed as we do not depend on this anymore. Instead a new lock for the socket is introduced so we can safely work with the socket in the timeout handler outside of the two main threads. Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de>
* nbd: Fix debugfs error handlingMarkus Pargmann2016-02-031-41/+14
| | | | | | | | | | | | Static checker complains about the implemented error handling. It is indeed wrong. We don't care about the return values of created debugfs files. We only have to check the return values of created dirs for NULL pointer. If we use a null pointer as parent directory for files, this may lead to debugfs files in wrong places. Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
* nbd: use ->compat_ioctl()Al Viro2016-01-081-0/+1
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* signal: turn dequeue_signal_lock() into kernel_dequeue_signal()Oleg Nesterov2015-11-061-11/+4
| | | | | | | | | | | | | | | | | | | | | | 1. Rename dequeue_signal_lock() to kernel_dequeue_signal(). This matches another "for kthreads only" kernel_sigaction() helper. 2. Remove the "tsk" and "mask" arguments, they are always current and current->blocked. And it is simply wrong if tsk != current. 3. We could also remove the 3rd "siginfo_t *info" arg but it looks potentially useful. However we can simplify the callers if we change kernel_dequeue_signal() to accept info => NULL. 4. Remove _irqsave, it is never called from atomic context. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Felipe Balbi <balbi@ti.com> Cc: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* nbd: Add locking for tasksMarkus Pargmann2015-10-081-6/+30
| | | | | | | | | | | | | | | | | | The timeout handling introduced in 7e2893a16d3e (nbd: Fix timeout detection) introduces a race condition which may lead to killing of tasks that are not in nbd context anymore. This was not observed or reproducable yet. This patch adds locking to critical use of task_recv and task_send to avoid killing tasks that already left the NBD thread functions. This lock is only acquired if a timeout occures or the nbd device starts/stops. Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Reviewed-by: Ben Hutchings <ben@decadent.org.uk> Fixes: 7e2893a16d3e ("nbd: Fix timeout detection") Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: flags is a u32 variableMarkus Pargmann2015-08-171-1/+1
| | | | | | | | The flags variable is used as u32 variable. This patch changes the type to be u32. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Rename functions for clearness of recv/send pathMarkus Pargmann2015-08-171-6/+6
| | | | | | | | This patch renames functions so that it is clear what the function does. Otherwise it is not directly understandable what for example 'do_it' means. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Change 'disconnect' to be booleanMarkus Pargmann2015-08-171-4/+4
| | | | | | Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Add debugfs entriesMarkus Pargmann2015-08-171-1/+177
| | | | | | | | Add some debugfs files that help to understand the internal state of NBD. This exports the different sizes, flags, tasks and so on. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Remove variable 'pid'Markus Pargmann2015-08-171-10/+9
| | | | | | | | This patch uses nbd->task_recv to determine the value of the previously used variable 'pid' for sysfs. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Move clear queue debug messageMarkus Pargmann2015-08-171-1/+1
| | | | | | | | This message was a warning without a reason. This patch moves it into nbd_clear_que and transforms it to a debug message. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Remove 'harderror' and propagate error properlyMarkus Pargmann2015-08-171-14/+14
| | | | | | | | | | | | Instead of a variable 'harderror' we can simply try to correctly propagate errors to the userspace. This patch removes the harderror variable and passes errors through error pointers and nbd_do_it back to the userspace. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: restructure sock_shutdownMarkus Pargmann2015-08-171-6/+7
| | | | | | | | | This patch restructures sock_shutdown to avoid having the main code path in an if block. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: sock_shutdown, remove conditional lockMarkus Pargmann2015-08-171-8/+8
| | | | | | | | Move the conditional lock from sock_shutdown into the surrounding code. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Fix timeout detectionMarkus Pargmann2015-08-171-28/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment the nbd timeout just detects hanging tcp operations. This is not enough to detect a hanging or bad connection as expected of a timeout. This patch redesigns the timeout detection to include some more cases. The timeout is now in relation to replies from the server. If the server does not send replies within the timeout the connection will be shut down. The patch adds a continous timer 'timeout_timer' that is setup in one of two cases: - The request list is empty and we are sending the first request out to the server. We want to have a reply within the given timeout, otherwise we consider the connection to be dead. - A server response was received. This means the server is still communicating with us. The timer is reset to the timeout value. The timer is not stopped if the list becomes empty. It will just trigger a timeout which will directly leave the handling routine again as the request list is empty. The whole patch does not use any additional explicit locking. The list_empty() calls are safe to be used concurrently. The timer is locked internally as we just use mod_timer and del_timer_sync(). The patch is based on the idea of Michal Belczyk with a previous different implementation. Cc: Michal Belczyk <belczyk@bsd.krakow.pl> Cc: Hermann Lauer <Hermann.Lauer@iwr.uni-heidelberg.de> Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Tested-by: Hermann Lauer <Hermann.Lauer@iwr.uni-heidelberg.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* block: have drivers use blk_queue_max_discard_sectors()Jens Axboe2015-07-171-1/+1
| | | | | | | | | | Some drivers use it now, others just set the limits field manually. But in preparation for splitting this into a hard and soft limit, ensure that they all call the proper function for setting the hw limit for discards. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* block: nbd: convert to blkdev_reread_part()Ming Lei2015-05-201-1/+1
| | | | | | | | Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Jarod Wilson <jarod@redhat.com> Acked-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: stop using req->cmdChristoph Hellwig2015-05-051-25/+23
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* block: rename REQ_TYPE_SPECIAL to REQ_TYPE_DRV_PRIVChristoph Hellwig2015-05-051-1/+1
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Return error pointer directlyMarkus Pargmann2015-04-021-5/+2
| | | | | | Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Return error code directlyMarkus Pargmann2015-04-021-5/+2
| | | | | | | | | By returning the error code directly, we can avoid the jump label error_out. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Remove fixme that was already fixedMarkus Pargmann2015-04-021-6/+3
| | | | | | | The mentioned problem is not present anymore. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Restructure debugging printsMarkus Pargmann2015-04-021-66/+22
| | | | | | | | | | | | | | dprintk has some name collisions with other frameworks and drivers. It is also not necessary to have these custom debug print filters. Dynamic debug offers the same amount of filtered debugging. This patch replaces all dprintks with dev_dbg(). It also removes the ioctl dprintk which prints the ingoing ioctls which should be replaceable by strace or similar stuff. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Fix device bytesize typeMarkus Pargmann2015-04-021-1/+2
| | | | | | | | | The block subsystem uses loff_t to store the device size. Change the type for nbd_device bytesize to loff_t. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Replace kthread_create with kthread_runMarkus Pargmann2015-04-021-3/+3
| | | | | | | | | kthread_run includes the wake_up_process() call, so instead of kthread_create() followed by wake_up_process() we can use this macro. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: Remove kernel internal headerMarkus Pargmann2015-04-021-0/+22
| | | | | | | | The header is not included anywhere. Remove it and include the private nbd_device struct in nbd.c. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: fix possible memory leakSudip Mukherjee2015-03-051-4/+4
| | | | | | | | | | we have already allocated memory for nbd_dev, but we were not releasing that memory and just returning the error value. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Acked-by: Paul Clements <Paul.Clements@SteelEye.com> Cc: <stable@vger.kernel.org> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
* block: disable entropy contributions for nonrot devicesMike Snitzer2014-10-041-0/+1
| | | | | | | | | | | | | | | | | | | | | Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set QUEUE_FLAG_NONROT. Historically, all block devices have automatically made entropy contributions. But as previously stated in commit e2e1a148 ("block: add sysfs knob for turning off disk entropy contributions"): - On SSD disks, the completion times aren't as random as they are for rotational drives. So it's questionable whether they should contribute to the random pool in the first place. - Calling add_disk_randomness() has a lot of overhead. There are more reliable sources for randomness than non-rotational block devices. From a security perspective it is better to err on the side of caution than to allow entropy contributions from unreliable "random" sources. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: zero from and len fields in NBD_CMD_DISCONNECT.Hani Benhabiles2014-06-061-5/+2
| | | | | | | | | | | | Len field is already set to zero, but not the from field which is sent as 0xfffffffffffffe00. This makes no sense, and may cause confuse server implementations doing sanity checks (qemu-nbd is an example.) Signed-off-by: Hani Benhabiles <hani@linux.com> Cc: Paul Clements <paul.clements@us.sios.com> Cc: Paul Clements <Paul.Clements@steeleye.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'sched/urgent' into sched/core, to avoid conflictsIngo Molnar2014-05-071-29/+19
|\ | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * switch nbd to sockfd_lookup/sockfd_putAl Viro2014-04-011-29/+19
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | sched, treewide: Replace hardcoded nice values with MIN_NICE/MAX_NICEDongsheng Yang2014-04-181-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | Replace various -20/+19 hardcoded nice values with MIN_NICE/MAX_NICE. Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ff13819fd09b7a5dba5ab5ae797f2e7019bdfa17.1394532288.git.yangds.fnst@cn.fujitsu.com Cc: devel@driverdev.osuosl.org Cc: devicetree@vger.kernel.org Cc: fcoe-devel@open-fcoe.org Cc: linux390@de.ibm.com Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-s390@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: nbd-general@lists.sourceforge.net Cc: ocfs2-devel@oss.oracle.com Cc: openipmi-developer@lists.sourceforge.net Cc: qla2xxx-upstream@qlogic.com Cc: linux-arch@vger.kernel.org [ Consolidated the patches, twiddled the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* block: Immutable bio vecsKent Overstreet2013-11-231-1/+1
| | | | | | | | | | | | | | | | | This adds a mechanism by which we can advance a bio by an arbitrary number of bytes without modifying the biovec: bio->bi_iter.bi_bvec_done indicates the number of bytes completed in the current bvec. Various driver code still needs to be updated to not refer to the bvec directly before we can use this for interesting things, like efficient bio splitting. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Paul Clements <Paul.Clements@steeleye.com> Cc: drbd-user@lists.linbit.com Cc: nbd-general@lists.sourceforge.net
* block: Convert bio_for_each_segment() to bvec_iterKent Overstreet2013-11-231-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | More prep work for immutable biovecs - with immutable bvecs drivers won't be able to use the biovec directly, they'll need to use helpers that take into account bio->bi_iter.bi_bvec_done. This updates callers for the new usage without changing the implementation yet. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Paul Clements <Paul.Clements@steeleye.com> Cc: Jim Paris <jim@jtan.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Neil Brown <neilb@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com> Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com> Cc: support@lsi.com Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jan Kara <jack@suse.cz> Cc: linux-m68k@lists.linux-m68k.org Cc: linuxppc-dev@lists.ozlabs.org Cc: drbd-user@lists.linbit.com Cc: nbd-general@lists.sourceforge.net Cc: cbe-oss-dev@lists.ozlabs.org Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.linux-foundation.org Cc: linux-raid@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: DL-MPTFusionLinux@lsi.com Cc: linux-scsi@vger.kernel.org Cc: devel@driverdev.osuosl.org Cc: linux-fsdevel@vger.kernel.org Cc: cluster-devel@redhat.com Cc: linux-mm@kvack.org Acked-by: Geoff Levand <geoff@infradead.org>
* nbd: correct disconnect behaviorPaul Clements2013-07-031-1/+6
| | | | | | | | | | | | | | | | | | | | Currently, when a disconnect is requested by the user (via NBD_DISCONNECT ioctl) the return from NBD_DO_IT is undefined (it is usually one of several error codes). This means that nbd-client does not know if a manual disconnect was performed or whether a network error occurred. Because of this, nbd-client's persist mode (which tries to reconnect after error, but not after manual disconnect) does not always work correctly. This change fixes this by causing NBD_DO_IT to always return 0 if a user requests a disconnect. This means that nbd-client can correctly either persist the connection (if an error occurred) or disconnect (if the user requested it). Signed-off-by: Paul Clements <paul.clements@steeleye.com> Acked-by: Rob Landley <rob@landley.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* nbd: remove bogus BUG_ON in NBD_CLEAR_QUEMichal Belczyk2013-07-031-1/+0
| | | | | | | | | | | | The NBD_CLEAR_QUE ioctl has been deprecated for quite some time (its job is now done by two other ioctls). We should stop trying to make bogus assertions in it. Also, user-level code should remove calls to NBD_CLEAR_QUE, ASAP. Signed-off-by: Michal Belczyk <belczyk@bsd.krakow.pl> Signed-off-by: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* block: do not pass disk names as format stringsKees Cook2013-07-031-1/+2
| | | | | | | | | | | | | | | Disk names may contain arbitrary strings, so they must not be interpreted as format strings. It seems that only md allows arbitrary strings to be used for disk names, but this could allow for a local memory corruption from uid 0 into ring 0. CVE-2013-2851 Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* nbd: increase default and max request sizesMichal Belczyk2013-04-301-0/+2
| | | | | | | | | | | | | | | | | | Raise the default max request size for nbd to 128KB (from 127KB) to get it 4KB aligned. This patch also allows the max request size to be increased (via /sys/block/nbd<x>/queue/max_sectors_kb) to 32MB. The patch makes nbd network traffic more efficient by: - reducing request fragmentation (4KB alignment) - reducing the number of requests (fewer round trips, less network overhead) Especially in high latency networks, larger request size can make a dramatic Signed-off-by: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Michal Belczyk <belczyk@bsd.krakow.pl> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OpenPOWER on IntegriCloud