diff options
author | Brian Foster <bfoster@redhat.com> | 2014-07-24 19:49:28 +1000 |
---|---|---|
committer | Dave Chinner <david@fromorbit.com> | 2014-07-24 19:49:28 +1000 |
commit | dc06f398f00059707236d456d954a3a9d2a829db (patch) | |
tree | b7d0f9a2d9a109a32e6b72af3909b9465b406c23 /fs/xfs/xfs_file.c | |
parent | f4526397928fff052f795713748f376a2bba1b5e (diff) | |
download | blackbird-obmc-linux-dc06f398f00059707236d456d954a3a9d2a829db.tar.gz blackbird-obmc-linux-dc06f398f00059707236d456d954a3a9d2a829db.zip |
xfs: run an eofblocks scan on ENOSPC/EDQUOT
From: Brian Foster <bfoster@redhat.com>
Speculative preallocation and and the associated throttling metrics
assume we're working with large files on large filesystems. Users have
reported inefficiencies in these mechanisms when we happen to be dealing
with large files on smaller filesystems. This can occur because while
prealloc throttling is aggressive under low free space conditions, it is
not active until we reach 5% free space or less.
For example, a 40GB filesystem has enough space for several files large
enough to have multi-GB preallocations at any given time. If those files
are slow growing, they might reserve preallocation for long periods of
time as well as avoid the background scanner due to frequent
modification. If a new file is written under these conditions, said file
has no access to this already reserved space and premature ENOSPC is
imminent.
To handle this scenario, modify the buffered write ENOSPC handling and
retry sequence to invoke an eofblocks scan. In the smaller filesystem
scenario, the eofblocks scan resets the usage of preallocation such that
when the 5% free space threshold is met, throttling effectively takes
over to provide fair and efficient preallocation until legitimate
ENOSPC.
The eofblocks scan is selective based on the nature of the failure. For
example, an EDQUOT failure in a particular quota will use a filtered
scan for that quota. Because we don't know which quota might have caused
an allocation failure at any given time, we include each applicable
quota determined to be under low free space conditions in the scan.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Diffstat (limited to 'fs/xfs/xfs_file.c')
-rw-r--r-- | fs/xfs/xfs_file.c | 23 |
1 files changed, 19 insertions, 4 deletions
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 181605da08e4..fcf91a22f5d8 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -38,6 +38,7 @@ #include "xfs_trace.h" #include "xfs_log.h" #include "xfs_dinode.h" +#include "xfs_icache.h" #include <linux/aio.h> #include <linux/dcache.h> @@ -689,14 +690,28 @@ write_retry: ret = generic_perform_write(file, from, pos); if (likely(ret >= 0)) iocb->ki_pos = pos + ret; + /* - * If we just got an ENOSPC, try to write back all dirty inodes to - * convert delalloc space to free up some of the excess reserved - * metadata space. + * If we hit a space limit, try to free up some lingering preallocated + * space before returning an error. In the case of ENOSPC, first try to + * write back all dirty inodes to free up some of the excess reserved + * metadata space. This reduces the chances that the eofblocks scan + * waits on dirty mappings. Since xfs_flush_inodes() is serialized, this + * also behaves as a filter to prevent too many eofblocks scans from + * running at the same time. */ - if (ret == -ENOSPC && !enospc) { + if (ret == -EDQUOT && !enospc) { + enospc = xfs_inode_free_quota_eofblocks(ip); + if (enospc) + goto write_retry; + } else if (ret == -ENOSPC && !enospc) { + struct xfs_eofblocks eofb = {0}; + enospc = 1; xfs_flush_inodes(ip->i_mount); + eofb.eof_scan_owner = ip->i_ino; /* for locking */ + eofb.eof_flags = XFS_EOF_FLAGS_SYNC; + xfs_icache_free_eofblocks(ip->i_mount, &eofb); goto write_retry; } |