author | Tejun Heo <tj@kernel.org> | 2010-07-20 15:18:07 -0700 |
---|---|---|
committer | Al Viro <viro@zeniv.linux.org.uk> | 2010-08-09 16:48:59 -0400 |
commit | 4f331f01b9c43bf001d3ffee578a97a1e0633eac (patch) | |
tree | 77cd690ab7af2624e3fd7932563f6dc0f5d6441a /fs/super.c | |
parent | 719f2c879f4dda7d7f303bd387d37cd96db29d31 (diff) | |
vfs: don't hold s_umount over close_bdev_exclusive() call
Fix an obscure AB-BA deadlock in get_sb_bdev().
When a superblock is mounted more than once, get_sb_bdev() calls
close_bdev_exclusive() to drop the extra bdev reference while holding
s_umount. However, sb->s_umount nests inside bd_mutex during
__invalidate_device(), and close_bdev_exclusive() acquires bd_mutex during
blkdev_put(), which creates an AB-BA deadlock.
This condition doesn't trigger frequently. For it to be visible to
lockdep, the filesystem must occupy the whole device (as
__invalidate_device() only grabs bd_mutex for the whole device), the FS
must be mounted more than once, and a partition rescan must be issued
while the FS is still mounted.
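The ordering conflict described above can be modelled in userspace with a toy lock-order table. This is only a sketch for illustration: the `lock_id` enum and the `order_reset()` and `order_record()` helpers are hypothetical names, and the code does not reflect how lockdep is implemented in the kernel. It records which lock was taken while another was held and reports when the reverse ordering has already been seen.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock identifiers, used only for this illustration. */
enum lock_id { BD_MUTEX, S_UMOUNT, NR_LOCKS };

/* order[a][b] is true once some path acquired b while holding a. */
static bool order[NR_LOCKS][NR_LOCKS];

/* Forget all recorded orderings. */
static void order_reset(void)
{
	memset(order, 0, sizeof(order));
}

/*
 * Record "inner acquired while outer is held".
 * Returns true if the reverse ordering was already recorded,
 * i.e. the two paths together form an AB-BA pattern.
 */
static bool order_record(enum lock_id outer, enum lock_id inner)
{
	order[outer][inner] = true;
	return order[inner][outer];
}
```

In this model, the __invalidate_device() path corresponds to `order_record(BD_MUTEX, S_UMOUNT)` and the unpatched get_sb_bdev() path to `order_record(S_UMOUNT, BD_MUTEX)`; the second call reports the inversion regardless of which path runs first.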
Fix it by dropping s_umount over close_bdev_exclusive().
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ciprian Docan <docan@eden.rutgers.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Diffstat (limited to 'fs/super.c')
-rw-r--r-- | fs/super.c | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/fs/super.c b/fs/super.c
index 938119ab8dcb..3479ca6f005f 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -773,7 +773,16 @@ int get_sb_bdev(struct file_system_type *fs_type,
 			goto error_bdev;
 		}
 
+		/*
+		 * s_umount nests inside bd_mutex during
+		 * __invalidate_device(). close_bdev_exclusive()
+		 * acquires bd_mutex and can't be called under
+		 * s_umount. Drop s_umount temporarily. This is safe
+		 * as we're holding an active reference.
+		 */
+		up_write(&s->s_umount);
 		close_bdev_exclusive(bdev, mode);
+		down_write(&s->s_umount);
 	} else {
 		char b[BDEVNAME_SIZE];
 