diff options
author | Josef Bacik <jbacik@fusionio.com> | 2013-07-02 10:38:02 -0400 |
---|---|---|
committer | Josef Bacik <jbacik@fusionio.com> | 2013-07-02 11:51:49 -0400 |
commit | 0e267c44c3a402d35111d1935be1167240b5b79f (patch) | |
tree | d09bc2390133da4ca34a4d11ca34ef524d03a87d /fs/btrfs/inode.c | |
parent | 7fb7d76f96bfcbea25007d190ba828b18e13d29d (diff) | |
download | talos-obmc-linux-0e267c44c3a402d35111d1935be1167240b5b79f.tar.gz talos-obmc-linux-0e267c44c3a402d35111d1935be1167240b5b79f.zip |
Btrfs: wait ordered range before doing direct io
My recent truncate patch uncovered this bug, but I can reproduce it without the
truncate patch. If you mount with -o compress-force, do a direct write to some
area, do a buffered write to some other area, and then do a direct read you will
get the wrong data for where you did the buffered write. This is because the
generic direct io helpers only call filemap_write_and_wait once, and for
compression we need it twice. So to be safe add the btrfs_wait_ordered_range to
the start of the direct io function to make sure any compressed writes have
truly been written. This patch makes xfstests 130 pass when you mount with -o
compress-force=lzo. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Diffstat (limited to 'fs/btrfs/inode.c')
-rw-r--r-- | fs/btrfs/inode.c | 10 |
1 files changed, 9 insertions, 1 deletions
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 0a43d42268f7..55dda871437f 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7270,8 +7270,16 @@ static ssize_t btrfs_direct_IO(int rw, struct kiocb *iocb, atomic_inc(&inode->i_dio_count); smp_mb__after_atomic_inc(); + /* + * The generic stuff only does filemap_write_and_wait_range, which isn't + * enough if we've written compressed pages to this area, so we need to + * call btrfs_wait_ordered_range to make absolutely sure that any + * outstanding dirty pages are on disk. + */ + count = iov_length(iov, nr_segs); + btrfs_wait_ordered_range(inode, offset, count); + if (rw & WRITE) { - count = iov_length(iov, nr_segs); /* * If the write DIO is beyond the EOF, we need update * the isize, but it is protected by i_mutex. So we can |