summaryrefslogtreecommitdiffstats
path: root/fs/xfs
Commit message (Collapse)AuthorAgeFilesLines
...
| | * | | xfs: add XBF_XBF_NO_IOACCT to buf trace outputEric Sandeen2016-11-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When XBF_NO_IOACCT got added, it missed the translation in XFS_BUF_FLAGS, so we see "0x8" in trace output rather than the flag name. Fix it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | Merge branch 'xfs-4.10-extent-lookup' into for-nextDave Chinner2016-11-249-267/+178
| |\| | |
| | * | | xfs: remove NULLEXTNUMChristoph Hellwig2016-11-242-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We only ever set a field to this constant for an impossible to reach error case in xfs_bmap_search_extents. That functions has been removed, so we can remove the constant as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: remove xfs_bmap_search_extentsChristoph Hellwig2016-11-242-95/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that all users are gone. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: use new extent lookup helpers in xfs_reflink_end_cowChristoph Hellwig2016-11-241-10/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: use new extent lookup helpers in xfs_reflink_cancel_cow_blocksChristoph Hellwig2016-11-241-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: use new extent lookup helpers in xfs_reflink_trim_irec_to_next_cowChristoph Hellwig2016-11-242-22/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And remove the unused return value. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: cleanup xfs_reflink_find_cow_mappingChristoph Hellwig2016-11-243-32/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use xfs_iext_lookup_extent to look up the extent, drop a useless check, drop a unneeded return value and clean up the general style a little bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: use new extent lookup helpers in __xfs_reflink_reserve_cowChristoph Hellwig2016-11-241-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: use new extent lookup helpers xfs_file_iomap_begin_delayChristoph Hellwig2016-11-241-11/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And only lookup the previous extent inside xfs_iomap_prealloc_size if we actually need it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: remove prev argument to xfs_bmapi_reserve_delallocChristoph Hellwig2016-11-244-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can easily lookup the previous extent for the cases where we need it, which saves the callers from looking it up for us later in the series. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: use new extent lookup helpers in __xfs_bunmapiChristoph Hellwig2016-11-241-27/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: use new extent lookup helpers in xfs_bmapi_writeChristoph Hellwig2016-11-241-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: use new extent lookup helpers in xfs_bmapi_readChristoph Hellwig2016-11-241-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: cleanup xfs_bmap_last_beforeChristoph Hellwig2016-11-241-32/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rewrite the function using xfs_iext_lookup_extent and xfs_iext_get_extent, and massage the flow into something easily understandable. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: new inode extent list lookup helpersChristoph Hellwig2016-11-242-0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfs_iext_lookup_extent looks up a single extent at the passed in offset, and returns the extent covering the area, or the one behind it in case of a hole, as well as the index of the returned extent in arguments, as well as a simple bool as return value that is set to false if no extent could be found because the offset is behind EOF. It is a simpler replacement for xfs_bmap_search_extent that leaves looking up the rarely needed previous extent to the caller and has a nicer calling convention. xfs_iext_get_extent is a helper for iterating over the extent list, it takes an extent index as input, and returns the extent at that index in it's expanded form in an argument if it exists. The actual return value is a bool whether the index is valid or not. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | Merge branch 'xfs-4.10-misc-fixes-1' into for-nextDave Chinner2016-11-1012-122/+111
| |\| | |
| | * | | xfs: fix unbalanced inode reclaim flush lockingBrian Foster2016-11-102-18/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Filesystem shutdown testing on an older distro kernel has uncovered an imbalanced locking pattern for the inode flush lock in xfs_reclaim_inode(). Specifically, there is a double unlock sequence between the call to xfs_iflush_abort() and xfs_reclaim_inode() at the "reclaim:" label. This actually does not cause obvious problems on current kernels due to the current flush lock implementation. Older kernels use a counting based flush lock mechanism, however, which effectively breaks the lock indefinitely when an already unlocked flush lock is repeatedly unlocked. Though this only currently occurs on filesystem shutdown, it has reproduced the effect of elevating an fs shutdown to a system-wide crash or hang. As it turns out, the flush lock is not actually required for the reclaim logic in xfs_reclaim_inode() because by that time we have already cycled the flush lock once while holding ILOCK_EXCL. Therefore, remove the additional flush lock/unlock cycle around the 'reclaim:' label and update branches into this label to release the flush lock where appropriate. Add an assert to xfs_ifunlock() to help prevent future occurences of the same problem. Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: provide helper for counting extents from if_bytesEric Sandeen2016-11-088-67/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The open-coded pattern: ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t) is all over the xfs code; provide a new helper xfs_iext_count(ifp) to count the number of inline extents in an inode fork. [dchinner: pick up several missed conversions] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: fix up xfs_swap_extent_forks inline extent handlingEric Sandeen2016-11-081-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There have been several reports over the years of NULL pointer dereferences in xfs_trans_log_inode during xfs_fsr processes, when the process is doing an fput and tearing down extents on the temporary inode, something like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 PID: 29439 TASK: ffff880550584fa0 CPU: 6 COMMAND: "xfs_fsr" [exception RIP: xfs_trans_log_inode+0x10] #9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs] #10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs] #11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs] #12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs] #13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs] #14 [ffff8800a57bbe00] evict at ffffffff811e1b67 #15 [ffff8800a57bbe28] iput at ffffffff811e23a5 #16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8 #17 [ffff8800a57bbe88] dput at ffffffff811dd06c #18 [ffff8800a57bbea8] __fput at ffffffff811c823b #19 [ffff8800a57bbef0] ____fput at ffffffff811c846e #20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27 #21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c #22 [ffff8800a57bbf50] int_signal at ffffffff8161405d As it turns out, this is because the i_itemp pointer, along with the d_ops pointer, has been overwritten with zeros when we tear down the extents during truncate. When the in-core inode fork on the temporary inode used by xfs_fsr was originally set up during the extent swap, we mistakenly looked at di_nextents to determine whether all extents fit inline, but this misses extents generated by speculative preallocation; we should be using if_bytes instead. This mistake corrupts the in-memory inode, and code in xfs_iext_remove_inline eventually gets bad inputs, causing it to memmove and memset incorrect ranges; this became apparent because the two values in ifp->if_u2.if_inline_ext[1] contained what should have been in d_ops and i_itemp; they were memmoved due to incorrect array indexing and then the original locations were zeroed with memset, again due to an array overrun. Fix this by properly using i_df.if_bytes to determine the number of extents, not di_nextents. Thanks to dchinner for looking at this with me and spotting the root cause. Cc: stable@vger.kernel.org Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: don't BUG() on mixed direct and mapped I/OBrian Foster2016-11-081-2/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've had reports of generic/095 causing XFS to BUG() in __xfs_get_blocks() due to the existence of delalloc blocks on a direct I/O read. generic/095 issues a mix of various types of I/O, including direct and memory mapped I/O to a single file. This is clearly not supported behavior and is known to lead to such problems. E.g., the lack of exclusion between the direct I/O and write fault paths means that a write fault can allocate delalloc blocks in a region of a file that was previously a hole after the direct read has attempted to flush/inval the file range, but before it actually reads the block mapping. In turn, the direct read discovers a delalloc extent and cannot proceed. While the appropriate solution here is to not mix direct and memory mapped I/O to the same regions of the same file, the current BUG_ON() behavior is probably overkill as it can crash the entire system. Instead, localize the failure to the I/O in question by returning an error for a direct I/O that cannot be handled safely due to delalloc blocks. Be careful to allow the case of a direct write to post-eof delalloc blocks. This can occur due to speculative preallocation and is safe as post-eof blocks are not accompanied by dirty pages in pagecache (conversely, preallocation within eof must have been zeroed, and thus dirtied, before the inode size could have been increased beyond said blocks). Finally, provide an additional warning if a direct I/O write occurs while the file is memory mapped. This may not catch all problematic scenarios, but provides a hint that some known-to-be-problematic I/O methods are in use. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: don't skip cow forks w/ delalloc blocks in cowblocks scanBrian Foster2016-11-083-37/+6
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cowblocks background scanner currently clears the cowblocks tag for inodes without any real allocations in the cow fork. This excludes inodes with only delalloc blocks in the cow fork. While we might never expect to clear delalloc blocks from the cow fork in the background scanner, it is not necessarily correct to clear the cowblocks tag from such inodes. For example, if the background scanner happens to process an inode between a buffered write and writeback, the scanner catches the inode in a state after delalloc blocks have been allocated to the cow fork but before the delalloc blocks have been converted to real blocks by writeback. The background scanner then incorrectly clears the cowblocks tag, even if part of the aforementioned delalloc reservation will not be remapped to the data fork (i.e., extra blocks due to the cowextsize hint). This means that any such additional blocks in the cow fork might never be reclaimed by the background scanner and could persist until the inode itself is reclaimed. To address this problem, only skip and clear inodes without any cow fork allocations whatsoever from the background scanner. While we generally do not want to cancel delalloc reservations from the background scanner, the pagecache dirty check following the cowblocks check should prevent that situation. If we do end up with delalloc cow fork blocks without a dirty address space mapping, this is probably an indication that something has gone wrong and the blocks should be reclaimed, as they may never be converted to a real allocation. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | Merge branch 'xfs-4.10-libxfs-cleanups' into for-nextDave Chinner2016-11-1014-25/+54
| |\ \ \
| | * | | xfs: check minimum block size for CRC filesystemsDarrick J. Wong2016-11-092-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check the minimum block size on v5 filesystems. [dchinner: cleaned up XFS_MIN_CRC_BLOCKSIZE check] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: check return value of _trans_reserve_quota_nblksDarrick J. Wong2016-11-081-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check the return value of xfs_trans_reserve_quota_nblks for errors. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: move dir_ino_validate declaration per xfsprogsDarrick J. Wong2016-11-082-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the declaration of _dir_ino_validate out of the private dir2 header file into the public one, since xfsprogs did that for the benefit of xfs_repair. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | xfs: don't call xfs_sb_quota_from_disk twiceEric Sandeen2016-11-081-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Source xfsprogs commit: ee3754254e8c186c99b6cdd4d59f741759d04acb Kernel commit 5ef828c4 ("xfs: avoid false quotacheck after unclean shutdown") made xfs_sb_from_disk() also call xfs_sb_quota_from_disk by default. However, when this was merged to libxfs, existing separate calls to libxfs_sb_quota_from_disk remained, and calling it twice in a row on a V4 superblock leads to issues, because: if (sbp->sb_qflags & XFS_PQUOTA_ACCT) { ... sbp->sb_pquotino = sbp->sb_gquotino; sbp->sb_gquotino = NULLFSINO; and after the second call, we have set both pquotino and gquotino to NULLFSINO. Fix this by making it safe to call twice, and also remove the extra calls to libxfs_sb_quota_from_disk. This is only spotted when running xfstests with "-m crc=0" because the sb_from_disk change came about after V5 became default, and the above behavior only exists on a V4 superblock. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | libxfs: clean up _dir2_data_freescanDarrick J. Wong2016-11-082-7/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the implementations of xfs_dir2_data_freescan into a routine that takes the raw directory block parameters and a second function that figures out the raw parameters from the directory inode. This enables us to use the exact same code for both userspace and the kernel, since repair knows exactly which directory block geometry parameters it needs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | libxfs: fix xfs_attr_shortform_bytesfit declarationDarrick J. Wong2016-11-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the xfs_attr_shortform_bytesfit declaration to have struct xfs_inode to avoid tripping up the libxfs-diff scanner. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | libxfs: fix whitespace problemsDarrick J. Wong2016-11-081-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix some whitespace problems that trip up my libxfs-diff script. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | libxfs: synchronize dinode_verify with userspaceDarrick J. Wong2016-11-081-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The userspace version of _dinode_verify takes a raw inode number instead of an inode itself. Since neither version actually needs the inode, port the changes to the kernel. This will also reduce the libxfs diff noise. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | | libxfs: convert ushort to unsigned shortDarrick J. Wong2016-11-085-9/+10
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since xfsprogs dropped ushort in favor of unsigned short, do that here too. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | Merge branch 'dax-4.10-iomap-pmd' into for-nextDave Chinner2016-11-103-29/+10
| |\ \ \ | | |/ / | |/| / | | |/
* | | Merge tag 'ext4_for_linus' of ↵Linus Torvalds2016-12-143-29/+10
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "This merge request includes the dax-4.0-iomap-pmd branch which is needed for both ext4 and xfs dax changes to use iomap for DAX. It also includes the fscrypt branch which is needed for ubifs encryption work as well as ext4 encryption and fscrypt cleanups. Lots of cleanups and bug fixes, especially making sure ext4 is robust against maliciously corrupted file systems --- especially maliciously corrupted xattr blocks and a maliciously corrupted superblock. Also fix ext4 support for 64k block sizes so it works well on ppcle. Fixed mbcache so we don't miss some common xattr blocks that can be merged" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits) dax: Fix sleep in atomic contex in grab_mapping_entry() fscrypt: Rename FS_WRITE_PATH_FL to FS_CTX_HAS_BOUNCE_BUFFER_FL fscrypt: Delay bounce page pool allocation until needed fscrypt: Cleanup page locking requirements for fscrypt_{decrypt,encrypt}_page() fscrypt: Cleanup fscrypt_{decrypt,encrypt}_page() fscrypt: Never allocate fscrypt_ctx on in-place encryption fscrypt: Use correct index in decrypt path. fscrypt: move the policy flags and encryption mode definitions to uapi header fscrypt: move non-public structures and constants to fscrypt_private.h fscrypt: unexport fscrypt_initialize() fscrypt: rename get_crypt_info() to fscrypt_get_crypt_info() fscrypto: move ioctl processing more fully into common code fscrypto: remove unneeded Kconfig dependencies MAINTAINERS: fscrypto: recommend linux-fsdevel for fscrypto patches ext4: do not perform data journaling when data is encrypted ext4: return -ENOMEM instead of success ext4: reject inodes with negative size ext4: remove another test in ext4_alloc_file_blocks() Documentation: fix description of ext4's block_validity mount option ext4: fix checks for data=ordered and journal_async_commit options ...
| * \ \ Merge branch 'dax-4.10-iomap-pmd' into originTheodore Ts'o2016-11-133-29/+10
| |\ \ \ | | | |/ | | |/|
| | * | xfs: use struct iomap based DAX PMD fault pathRoss Zwisler2016-11-083-25/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch xfs_filemap_pmd_fault() from using dax_pmd_fault() to the new and improved dax_iomap_pmd_fault(). Also, now that it has no more users, remove xfs_get_blocks_dax_fault(). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
| | * | dax: correct dax iomap code namespaceRoss Zwisler2016-11-081-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recently added DAX functions that use the new struct iomap data structure were named iomap_dax_rw(), iomap_dax_fault() and iomap_dax_actor(). These are actually defined in fs/dax.c, though, so should be part of the "dax" namespace and not the "iomap" namespace. Rename them to dax_iomap_rw(), dax_iomap_fault() and dax_iomap_actor() respectively. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | | | Merge branch 'for-4.10/block' of git://git.kernel.dk/linux-blockLinus Torvalds2016-12-132-5/+4
|\ \ \ \ | |_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block layer updates from Jens Axboe: "This is the main block pull request this series. Contrary to previous release, I've kept the core and driver changes in the same branch. We always ended up having dependencies between the two for obvious reasons, so makes more sense to keep them together. That said, I'll probably try and keep more topical branches going forward, especially for cycles that end up being as busy as this one. The major parts of this pull request is: - Improved support for O_DIRECT on block devices, with a small private implementation instead of using the pig that is fs/direct-io.c. From Christoph. - Request completion tracking in a scalable fashion. This is utilized by two components in this pull, the new hybrid polling and the writeback queue throttling code. - Improved support for polling with O_DIRECT, adding a hybrid mode that combines pure polling with an initial sleep. From me. - Support for automatic throttling of writeback queues on the block side. This uses feedback from the device completion latencies to scale the queue on the block side up or down. From me. - Support from SMR drives in the block layer and for SD. From Hannes and Shaun. - Multi-connection support for nbd. From Josef. - Cleanup of request and bio flags, so we have a clear split between which are bio (or rq) private, and which ones are shared. From Christoph. - A set of patches from Bart, that improve how we handle queue stopping and starting in blk-mq. - Support for WRITE_ZEROES from Chaitanya. - Lightnvm updates from Javier/Matias. - Supoort for FC for the nvme-over-fabrics code. From James Smart. - A bunch of fixes from a whole slew of people, too many to name here" * 'for-4.10/block' of git://git.kernel.dk/linux-block: (182 commits) blk-stat: fix a few cases of missing batch flushing blk-flush: run the queue when inserting blk-mq flush elevator: make the rqhash helpers exported blk-mq: abstract out blk_mq_dispatch_rq_list() helper blk-mq: add blk_mq_start_stopped_hw_queue() block: improve handling of the magic discard payload blk-wbt: don't throttle discard or write zeroes nbd: use dev_err_ratelimited in io path nbd: reset the setup task for NBD_CLEAR_SOCK nvme-fabrics: Add FC LLDD loopback driver to test FC-NVME nvme-fabrics: Add target support for FC transport nvme-fabrics: Add host support for FC transport nvme-fabrics: Add FC transport LLDD api definitions nvme-fabrics: Add FC transport FC-NVME definitions nvme-fabrics: Add FC transport error codes to nvme.h Add type 0x28 NVME type code to scsi fc headers nvme-fabrics: patch target code in prep for FC transport support nvme-fabrics: set sqe.command_id in core not transports parser: add u64 number parser nvme-rdma: align to generic ib_event logging helper ...
| * | | writeback: add wbc_to_write_flags()Jens Axboe2016-11-021-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add wbc_to_write_flags(), which returns the write modifier flags to use, based on a struct writeback_control. No functional changes in this patch, but it prepares us for factoring other wbc fields for write type. Signed-off-by: Jens Axboe <axboe@fb.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | | block,fs: use REQ_* flags directlyChristoph Hellwig2016-11-012-5/+8
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | Remove the WRITE_* and READ_SYNC wrappers, and just use the flags directly. Where applicable this also drops usage of the bio_set_op_attrs wrapper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* | | xfs: defer should abort intent items if the trans roll failsDarrick J. Wong2016-10-241-12/+5
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | If the deferred ops transaction roll fails, we need to abort the intent items if we haven't already logged a done item for it, regardless of whether or not the deferred ops has had a transaction committed. Dave found this while running generic/388. Move the tracepoint to make it easier to track object lifetimes. Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: clear cowblocks tag when cow fork is emptiedBrian Foster2016-10-241-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The background cowblocks scan job takes care of scanning for inodes with potentially lingering blocks in the cow fork and clearing them out. If the background scanner reclaims the cow fork blocks, however, it doesn't immediately clear the cowblocks tag from the inode. Instead, the inode remains tagged until the background scanner comes around again, discovers the inode cow fork has no blocks, clears the tag and fires the trace_xfs_inode_free_cowblocks_invalid() tracepoint to indicate that the inode may have been incorrectly tagged. This is not a major functional problem as the tag is ultimately cleared. Nonetheless, clear the tag when an inode cow fork is explicitly emptied to avoid the extra round trip through the background scanner and spurious "invalid" tracepoint. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: fix up inode cowblocks tracking tracepointsBrian Foster2016-10-241-4/+4
| | | | | | | | | | | | | | | | | | | | | | These calls are still using the eofblocks tracepoints. The cowblocks equivalents are already defined, we just aren't actually calling them. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: remove xfs_bunmapi_cowChristoph Hellwig2016-10-202-23/+0
| | | | | | | | | | | | | | | | | | | | | | Since no one uses it anymore. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: optimize xfs_reflink_end_cowChristoph Hellwig2016-10-202-64/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of doing a full extent list search for each extent that is to be deleted using xfs_bmapi_read and then doing another one inside of xfs_bunmapi_cow use the same scheme that xfs_bumapi uses: look up the last extent to be deleted and then use the extent index to walk downward until we are outside the range to be deleted. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: optimize xfs_reflink_cancel_cow_blocksChristoph Hellwig2016-10-201-28/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rewrite xfs_reflink_cancel_cow_blocks so that we only do a search for the first extent in the extent list and then iterate over the remaining extents using the extent index, passing the extent we operate on directly to xfs_bmap_del_extent_delay or xfs_bmap_del_extent_cow instead of going through xfs_bunmapi and doing yet another extent list lookup. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: refactor xfs_bunmapi_cowChristoph Hellwig2016-10-203-159/+225
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Split out two helpers for deleting delayed or real extents from the COW fork. This allows to call them directly from xfs_reflink_cow_end_io once that function is refactored to iterate the extent tree. It will also allow to reuse the delalloc deletion from xfs_bunmapi in the future. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: optimize writes to reflink filesChristoph Hellwig2016-10-204-105/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of reserving space as the first thing in write_begin move it past reading the extent in the data fork. That way we only have to read from the data fork once and can reuse that information for trimming the extent to the shared/unshared boundary. Additionally this allows to easily limit the actual write size to said boundary, and avoid a roundtrip on the ilock. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: don't bother looking at the refcount tree for readsChristoph Hellwig2016-10-201-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | There is no need to trim an extent into a shared or non-shared one, or report any flags for plain old reads. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: handle "raw" delayed extents xfs_reflink_trim_around_sharedChristoph Hellwig2016-10-201-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Delalloc extents in the extent list contain the number of reserved indirect blocks in their startblock value and don't use the magic DELAYSTARTBLOCK constant. Ensure that xfs_reflink_trim_around_shared handles them properly by checking for isnullstartblock(). Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
OpenPOWER on IntegriCloud