summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* ceph: include time stamp in every MDS requestSage Weil2014-06-062-1/+9
| | | | | | | | | | We recently modified the client/MDS protocol to include a timestamp in the client request. This allows ctime updates to follow the client's clock in most cases, which avoids subtle problems when clocks are out of sync and timestamps are updated sometimes by the MDS clock (for most requests) and sometimes by the client clock (for cap writeback). Signed-off-by: Sage Weil <sage@inktank.com>
* ceph: refactor readpage_nounlock() to make the logic clearerZhang Zhen2014-06-061-10/+7
| | | | | | | | | | If the return value of ceph_osdc_readpages() is not negative, it is certainly greater than or equal to zero. Remove the useless condition judgment and redundant braces. Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com> Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
* mds: check cap ID when handling cap export messageYan, Zheng2014-06-061-1/+1
| | | | | | | | | | | | | | handle following sequence of events: - mds0 exports an inode to mds1. client receives the cap import message from mds1. caps from mds0 are removed while handling the cap import message. - mds1 exports an inode to mds0. client receives the cap export message from mds1. handle_cap_export() adds placeholder caps for mds0 - client receives the first cap export message (for exporting inode from mds0 to mds1) Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* ceph: remember subtree root dirfrag's auth MDSYan, Zheng2014-06-061-1/+7
| | | | | | | remember dirfrag's auth MDS when it's different from its parent inode's auth MDS. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* ceph: introduce ceph_fill_fragtree()Yan, Zheng2014-06-061-45/+84
| | | | | | | | Move the code that update the i_fragtree into a separate function. Also add simple probabilistic test to decide whether the i_fragtree should be updated Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* ceph: handle cap import atomicallyYan, Zheng2014-06-061-45/+54
| | | | | | | | | | | | cap import messages are processed by both handle_cap_import() and handle_cap_grant(). These two functions are not executed in the same atomic context, so they can races with cap release. The fix is make handle_cap_import() not release the i_ceph_lock when it returns. Let handle_cap_grant() release the lock after it finishes its job. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* ceph: pre-allocate ceph_cap struct for ceph_add_cap()Yan, Zheng2014-06-063-79/+85
| | | | | | | So that ceph_add_cap() can be used while i_ceph_lock is locked. This simplifies the code that handle cap import/export. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* ceph: update inode fields according to issued capsYan, Zheng2014-06-062-57/+71
| | | | | | | | | Cap message and request reply from non-auth MDS may carry stale information (corresponding locks are in LOCK states) even they have the newest inode version. So client should update inode fields according to issued caps. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* ceph: queue vmtruncate if necessary when handing cap grant/revokeYan, Zheng2014-06-061-10/+16
| | | | | | | | cap grant/revoke message from non-auth MDS can update inode's size and truncate_seq/truncate_size. (the message arrives before auth MDS's cap trunc message) Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* ceph: remove useless ACL checkZhang Zhen2014-06-061-6/+0
| | | | | | | | | posix_acl_xattr_set() already does the check, and it's the only way to feed in an ACL from userspace. So the check here is useless, remove it. Signed-off-by: zhang zhen <zhenzhang.zhang@huawei.com> Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
* ceph: ceph_get_parent() can be staticFengguang Wu2014-06-061-1/+1
| | | | | Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2014-05-222-2/+6
|\ | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull two btrfs fixes from Chris Mason: "This has two fixes that we've been testing for 3.16, but since both are safe and fix real bugs, it makes sense to send for 3.15 instead" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: send, fix incorrect ref access when using extrefs Btrfs: fix EIO on reading file after ioctl clone works on it
| * Btrfs: send, fix incorrect ref access when using extrefsFilipe Manana2014-05-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running send, if an inode only has extended reference items associated to it and no regular references, send.c:get_first_ref() was incorrectly assuming the reference it found was of type BTRFS_INODE_REF_KEY due to use of the wrong key variable. This caused weird behaviour when using the found item has a regular reference, such as weird path string, and occasionally (when lucky) a crash: [ 190.600652] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC [ 190.600994] Modules linked in: btrfs xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc psmouse serio_raw evbug pcspkr i2c_piix4 e1000 floppy [ 190.602565] CPU: 2 PID: 14520 Comm: btrfs Not tainted 3.13.0-fdm-btrfs-next-26+ #1 [ 190.602728] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 190.602868] task: ffff8800d447c920 ti: ffff8801fa79e000 task.ti: ffff8801fa79e000 [ 190.603030] RIP: 0010:[<ffffffff813266b4>] [<ffffffff813266b4>] memcpy+0x54/0x110 [ 190.603262] RSP: 0018:ffff8801fa79f880 EFLAGS: 00010202 [ 190.603395] RAX: ffff8800d4326e3f RBX: 000000000000036a RCX: ffff880000000000 [ 190.603553] RDX: 000000000000032a RSI: ffe708844042936a RDI: ffff8800d43271a9 [ 190.603710] RBP: ffff8801fa79f8c8 R08: 00000000003a4ef0 R09: 0000000000000000 [ 190.603867] R10: 793a4ef09f000000 R11: 9f0000000053726f R12: ffff8800d43271a9 [ 190.604020] R13: 0000160000000000 R14: ffff8802110134f0 R15: 000000000000036a [ 190.604020] FS: 00007fb423d09b80(0000) GS:ffff880216200000(0000) knlGS:0000000000000000 [ 190.604020] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 190.604020] CR2: 00007fb4229d4b78 CR3: 00000001f5d76000 CR4: 00000000000006e0 [ 190.604020] Stack: [ 190.604020] ffffffffa01f4d49 ffff8801fa79f8f0 00000000000009f9 ffff8801fa79f8c8 [ 190.604020] 00000000000009f9 ffff880211013260 000000000000f971 ffff88021147dba8 [ 190.604020] 00000000000009f9 ffff8801fa79f918 ffffffffa02367f5 ffff8801fa79f928 [ 190.604020] Call Trace: [ 190.604020] [<ffffffffa01f4d49>] ? read_extent_buffer+0xb9/0x120 [btrfs] [ 190.604020] [<ffffffffa02367f5>] fs_path_add_from_extent_buffer+0x45/0x60 [btrfs] [ 190.604020] [<ffffffffa0238806>] get_first_ref+0x1f6/0x210 [btrfs] [ 190.604020] [<ffffffffa0238994>] __get_cur_name_and_parent+0x174/0x3a0 [btrfs] [ 190.604020] [<ffffffff8118df3d>] ? kmem_cache_alloc_trace+0x11d/0x1e0 [ 190.604020] [<ffffffffa0236674>] ? fs_path_alloc+0x24/0x60 [btrfs] [ 190.604020] [<ffffffffa0238c91>] get_cur_path+0xd1/0x240 [btrfs] (...) Steps to reproduce (either crash or some weirdness like an odd path string): mkfs.btrfs -f -O extref /dev/sdd mount /dev/sdd /mnt mkdir /mnt/testdir touch /mnt/testdir/foobar for i in `seq 1 2550`; do ln /mnt/testdir/foobar /mnt/testdir/foobar_link_`printf "%04d" $i` done ln /mnt/testdir/foobar /mnt/testdir/final_foobar_name rm -f /mnt/testdir/foobar for i in `seq 1 2550`; do rm -f /mnt/testdir/foobar_link_`printf "%04d" $i` done btrfs subvolume snapshot -r /mnt /mnt/mysnap btrfs send /mnt/mysnap -f /tmp/mysnap.send Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
| * Btrfs: fix EIO on reading file after ioctl clone works on itLiu Bo2014-05-201-1/+5
| | | | | | | | | | | | | | | | | | | | | | For inline data extent, we need to make its length aligned, otherwise, we can get a phantom extent map which confuses readpages() to return -EIO. This can be detected by xfstests/btrfs/035. Reported-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com>
* | Merge tag 'xfs-for-linus-3.15-rc6' of git://oss.sgi.com/xfs/xfsLinus Torvalds2014-05-225-25/+27
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull xfs fixes from Dave Chinner: "Code inspection of the XFS error number sign translations found a bunch of issues, including returning incorrectly signed errors for some data integrity operations. These leak to userspace and result in applications not getting the errors correctly reported. Hence they need fixing sooner rather than later. A couple of the bugs are in data integrity operations, a couple more are in the new COLLAPSE_RANGE code. One of these came in through a recent ext4 merge and so I had to update the base tree to 3.15-rc5 before fixing the issues" * tag 'xfs-for-linus-3.15-rc6' of git://oss.sgi.com/xfs/xfs: xfs: list_lru_init returns a negative error xfs: negate xfs_icsb_init_counters error value xfs: negate mount workqueue init error value xfs: fix wrong err sign on xfs_set_acl() xfs: fix wrong errno from xfs_initxattrs xfs: correct error sign on COLLAPSE_RANGE errors xfs: xfs_commit_metadata returns wrong errno xfs: fix incorrect error sign in xfs_file_aio_read xfs: xfs_dir_fsync() returns positive errno
| * | xfs: list_lru_init returns a negative errorDave Chinner2014-05-151-12/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And we don't invert it properly when initialising the dquot lru list. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | xfs: negate xfs_icsb_init_counters error value Dave Chinner2014-05-151-1/+1
| | | | | | | | | | | | | | | | | | | | | Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | xfs: negate mount workqueue init error valueDave Chinner2014-05-151-1/+1
| | | | | | | | | | | | | | | | | | | | | Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | xfs: fix wrong err sign on xfs_set_acl()Dave Chinner2014-05-151-2/+2
| | | | | | | | | | | | | | | | | | | | | Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | xfs: fix wrong errno from xfs_initxattrsDave Chinner2014-05-151-4/+4
| | | | | | | | | | | | | | | | | | | | | Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | xfs: correct error sign on COLLAPSE_RANGE errorsDave Chinner2014-05-151-2/+2
| | | | | | | | | | | | | | | | | | | | | Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | xfs: xfs_commit_metadata returns wrong errnoDave Chinner2014-05-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Invert it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | xfs: fix incorrect error sign in xfs_file_aio_readDave Chinner2014-05-151-1/+1
| | | | | | | | | | | | | | | | | | | | | Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | xfs: xfs_dir_fsync() returns positive errnoDave Chinner2014-05-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And it should be negative. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | | Merge tag 'driver-core-3.15-rc6' of ↵Linus Torvalds2014-05-213-9/+14
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver core (well, sysfs) fixes for 3.15-rc6 that resolve some reported issues and a regression from 3.13" * tag 'driver-core-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: sysfs: make sure read buffer is zeroed kernfs, sysfs, cgroup: restrict extra perm check on open to sysfs
| * | | sysfs: make sure read buffer is zeroedTejun Heo2014-05-201-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 13c589d5b0ac ("sysfs: use seq_file when reading regular files") switched sysfs from custom read implementation to seq_file to enable later transition to kernfs. After the change, the buffer passed to ->show() is acquired through seq_get_buf(); unfortunately, this introduces a subtle behavior change. Before the commit, the buffer passed to ->show() was always zero as it was allocated using get_zeroed_page(). Because seq_file doesn't clear buffers on allocation and neither does seq_get_buf(), after the commit, depending on the behavior of ->show(), we may end up exposing uninitialized data to userland thus possibly altering userland visible behavior and leaking information. Fix it by explicitly clearing the buffer. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ron <ron@debian.org> Fixes: 13c589d5b0ac ("sysfs: use seq_file when reading regular files") Cc: stable <stable@vger.kernel.org> # 3.13+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | kernfs, sysfs, cgroup: restrict extra perm check on open to sysfsTejun Heo2014-05-132-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernfs open method - kernfs_fop_open() - inherited extra permission checks from sysfs. While the vfs layer allows ignoring the read/write permissions checks if the issuer has CAP_DAC_OVERRIDE, sysfs explicitly denied open regardless of the cap if the file doesn't have any of the UGO perms of the requested access or doesn't implement the requested operation. It can be debated whether this was a good idea or not but the behavior is too subtle and dangerous to change at this point. After cgroup got converted to kernfs, this extra perm check also got applied to cgroup breaking libcgroup which opens write-only files with O_RDWR as root. This patch gates the extra open permission check with a new flag KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK and enables it for sysfs. For sysfs, nothing changes. For cgroup, root now can perform any operation regardless of the permissions as it was before kernfs conversion. Note that kernfs still fails unimplemented operations with -EINVAL. While at it, add comments explaining KERNFS_ROOT flags. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Andrey Wagin <avagin@gmail.com> Tested-by: Andrey Wagin <avagin@gmail.com> Cc: Li Zefan <lizefan@huawei.com> References: http://lkml.kernel.org/g/CANaxB-xUm3rJ-Cbp72q-rQJO5mZe1qK6qXsQM=vh0U8upJ44+A@mail.gmail.com Fixes: 2bd59d48ebfb ("cgroup: convert to kernfs") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | Merge tag 'metag-for-v3.15-2' of ↵Linus Torvalds2014-05-201-3/+3
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag Pull Metag architecture and related fixes from James Hogan: "Mostly fixes for metag and parisc relating to upgrowing stacks. - Fix missing compiler barriers in metag memory barriers. - Fix BUG_ON on metag when RLIMIT_STACK hard limit is increased beyond safe value. - Make maximum stack size configurable. This reduces the default user stack size back to 80MB (especially on parisc after their removal of _STK_LIM_MAX override). This only affects metag and parisc. - Remove metag _STK_LIM_MAX override to match other arches and follow parisc, now that it is safe to do so (due to the BUG_ON fix mentioned above). - Finally now that both metag and parisc _STK_LIM_MAX overrides have been removed, it makes sense to remove _STK_LIM_MAX altogether" * tag 'metag-for-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: asm-generic: remove _STK_LIM_MAX metag: Remove _STK_LIM_MAX override parisc,metag: Do not hardcode maximum userspace stack size metag: Reduce maximum stack size to 256MB metag: fix memory barriers
| * | | | metag: Reduce maximum stack size to 256MBJames Hogan2014-05-151-3/+3
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Specify the maximum stack size for arches where the stack grows upward (parisc and metag) in asm/processor.h rather than hard coding in fs/exec.c so that metag can specify a smaller value of 256MB rather than 1GB. This fixes a BUG on metag if the RLIMIT_STACK hard limit is increased beyond a safe value by root. E.g. when starting a process after running "ulimit -H -s unlimited" it will then attempt to use a stack size of the maximum 1GB which is far too big for metag's limited user virtual address space (stack_top is usually 0x3ffff000): BUG: failure at fs/exec.c:589/shift_arg_pages()! Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: linux-parisc@vger.kernel.org Cc: linux-metag@vger.kernel.org Cc: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # only needed for >= v3.9 (arch/metag)
* | | | Merge tag 'locks-v3.15-4' of git://git.samba.org/jlayton/linuxLinus Torvalds2014-05-131-12/+24
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull file locking fix from Jeff Layton: "Fix for regression in handling of F_GETLK commands" * tag 'locks-v3.15-4' of git://git.samba.org/jlayton/linux: locks: only validate the lock vs. f_mode in F_SETLK codepaths
| * | | | locks: only validate the lock vs. f_mode in F_SETLK codepathsJeff Layton2014-05-091-12/+24
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: replace missing break in switch statement (as pointed out by Dave Jones) commit bce7560d4946 (locks: consolidate checks for compatible filp->f_mode values in setlk handlers) introduced a regression in the F_GETLK handler. flock64_to_posix_lock is a shared codepath between F_GETLK and F_SETLK, but the f_mode checks should only be applicable to the F_SETLK codepaths according to POSIX. Instead of just reverting the patch, add a new function to do this checking and have the F_SETLK handlers call it. Cc: Dave Jones <davej@redhat.com> Reported-and-Tested-by: Reuben Farrelly <reuben@reub.net> Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
* | | | Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2014-05-131-0/+3
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs fix from Steve French: "Small cifs fix for metadata caching" * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix actimeo=0 corner case when cifs_i->time == jiffies
| * | | | cifs: fix actimeo=0 corner case when cifs_i->time == jiffiesJeff Layton2014-04-241-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | actimeo=0 is supposed to be a special case that ensures that inode attributes are always refetched from the server instead of trusting the cache. The cifs code however uses time_in_range() to determine whether the attributes have timed out. In the case where cifs_i->time equals jiffies, this leads to the cifs code not refetching the inode attributes when it should. Fix this by explicitly testing for actimeo=0, and handling it as a special case. Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
* | | | | Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2014-05-112-20/+22
|\ \ \ \ \ | |_|/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd fixes from Bruce Fields. * 'for-3.15' of git://linux-nfs.org/~bfields/linux: NFSD: Call ->set_acl with a NULL ACL structure if no entries NFSd: call rpc_destroy_wait_queue() from free_client() NFSd: Move default initialisers from create_client() to alloc_client()
| * | | | NFSD: Call ->set_acl with a NULL ACL structure if no entriesKinglong Mee2014-05-081-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After setting ACL for directory, I got two problems that caused by the cached zero-length default posix acl. This patch make sure nfsd4_set_nfs4_acl calls ->set_acl with a NULL ACL structure if there are no entries. Thanks for Christoph Hellwig's advice. First problem: ............ hang ........... Second problem: [ 1610.167668] ------------[ cut here ]------------ [ 1610.168320] kernel BUG at /root/nfs/linux/fs/nfsd/nfs4acl.c:239! [ 1610.168320] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 1610.168320] Modules linked in: nfsv4(OE) nfs(OE) nfsd(OE) rpcsec_gss_krb5 fscache ip6t_rpfilter ip6t_REJECT cfg80211 xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw auth_rpcgss nfs_acl snd_intel8x0 ppdev lockd snd_ac97_codec ac97_bus snd_pcm snd_timer e1000 pcspkr parport_pc snd parport serio_raw joydev i2c_piix4 sunrpc(OE) microcode soundcore i2c_core ata_generic pata_acpi [last unloaded: nfsd] [ 1610.168320] CPU: 0 PID: 27397 Comm: nfsd Tainted: G OE 3.15.0-rc1+ #15 [ 1610.168320] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 1610.168320] task: ffff88005ab653d0 ti: ffff88005a944000 task.ti: ffff88005a944000 [ 1610.168320] RIP: 0010:[<ffffffffa034d5ed>] [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd] [ 1610.168320] RSP: 0018:ffff88005a945b00 EFLAGS: 00010293 [ 1610.168320] RAX: 0000000000000001 RBX: ffff88006700bac0 RCX: 0000000000000000 [ 1610.168320] RDX: 0000000000000000 RSI: ffff880067c83f00 RDI: ffff880068233300 [ 1610.168320] RBP: ffff88005a945b48 R08: ffffffff81c64830 R09: 0000000000000000 [ 1610.168320] R10: ffff88004ea85be0 R11: 000000000000f475 R12: ffff880068233300 [ 1610.168320] R13: 0000000000000003 R14: 0000000000000002 R15: ffff880068233300 [ 1610.168320] FS: 0000000000000000(0000) GS:ffff880077800000(0000) knlGS:0000000000000000 [ 1610.168320] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1610.168320] CR2: 00007f5bcbd3b0b9 CR3: 0000000001c0f000 CR4: 00000000000006f0 [ 1610.168320] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1610.168320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1610.168320] Stack: [ 1610.168320] ffffffff00000000 0000000b67c83500 000000076700bac0 0000000000000000 [ 1610.168320] ffff88006700bac0 ffff880068233300 ffff88005a945c08 0000000000000002 [ 1610.168320] 0000000000000000 ffff88005a945b88 ffffffffa034e2d5 000000065a945b68 [ 1610.168320] Call Trace: [ 1610.168320] [<ffffffffa034e2d5>] nfsd4_get_nfs4_acl+0x95/0x150 [nfsd] [ 1610.168320] [<ffffffffa03400d6>] nfsd4_encode_fattr+0x646/0x1e70 [nfsd] [ 1610.168320] [<ffffffff816a6e6e>] ? kmemleak_alloc+0x4e/0xb0 [ 1610.168320] [<ffffffffa0327962>] ? nfsd_setuser_and_check_port+0x52/0x80 [nfsd] [ 1610.168320] [<ffffffff812cd4bb>] ? selinux_cred_prepare+0x1b/0x30 [ 1610.168320] [<ffffffffa0341caa>] nfsd4_encode_getattr+0x5a/0x60 [nfsd] [ 1610.168320] [<ffffffffa0341e07>] nfsd4_encode_operation+0x67/0x110 [nfsd] [ 1610.168320] [<ffffffffa033844d>] nfsd4_proc_compound+0x21d/0x810 [nfsd] [ 1610.168320] [<ffffffffa0324d9b>] nfsd_dispatch+0xbb/0x200 [nfsd] [ 1610.168320] [<ffffffffa00850cd>] svc_process_common+0x46d/0x6d0 [sunrpc] [ 1610.168320] [<ffffffffa0085433>] svc_process+0x103/0x170 [sunrpc] [ 1610.168320] [<ffffffffa032472f>] nfsd+0xbf/0x130 [nfsd] [ 1610.168320] [<ffffffffa0324670>] ? nfsd_destroy+0x80/0x80 [nfsd] [ 1610.168320] [<ffffffff810a5202>] kthread+0xd2/0xf0 [ 1610.168320] [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40 [ 1610.168320] [<ffffffff816c1ebc>] ret_from_fork+0x7c/0xb0 [ 1610.168320] [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40 [ 1610.168320] Code: 78 02 e9 e7 fc ff ff 31 c0 31 d2 31 c9 66 89 45 ce 41 8b 04 24 66 89 55 d0 66 89 4d d2 48 8d 04 80 49 8d 5c 84 04 e9 37 fd ff ff <0f> 0b 90 0f 1f 44 00 00 55 8b 56 08 c7 07 00 00 00 00 8b 46 0c [ 1610.168320] RIP [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd] [ 1610.168320] RSP <ffff88005a945b00> [ 1610.257313] ---[ end trace 838254e3e352285b ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | NFSd: call rpc_destroy_wait_queue() from free_client()Trond Myklebust2014-05-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mainly to ensure that we don't leave any hanging timers. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | NFSd: Move default initialisers from create_client() to alloc_client()Trond Myklebust2014-05-061-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aside from making it clearer what is non-trivial in create_client(), it also fixes a bug whereby we can call free_client() before idr_init() has been called. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | | | Merge tag 'xfs-for-linus-3.15-rc5' of git://oss.sgi.com/xfs/xfsLinus Torvalds2014-05-089-50/+77
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull xfs fixes from Dave Chinner: "The main fix is adding support for default ACLs on O_TMPFILE opened inodes to bring XFS into line with other filesystems. Metadata CRCs are now also considered well enough tested to be fully supported, so we're removing the shouty warnings issued at mount time for filesystems with that format. And there's transaction block reservation overrun fix. Summary: - fix a remote attribute size calculation bug that leads to a transaction overrun - add default ACLs to O_TMPFILE files - Remove the EXPERIMENTAL tag from filesystems with metadata CRC support" * tag 'xfs-for-linus-3.15-rc5' of git://oss.sgi.com/xfs/xfs: xfs: remote attribute overwrite causes transaction overrun xfs: initialize default acls for ->tmpfile() xfs: fully support v5 format filesystems
| * | | | | xfs: remote attribute overwrite causes transaction overrunDave Chinner2014-05-065-14/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e461fcb ("xfs: remote attribute lookups require the value length") passes the remote attribute length in the xfs_da_args structure on lookup so that CRC calculations and validity checking can be performed correctly by related code. This, unfortunately has the side effect of changing the args->valuelen parameter in cases where it shouldn't. That is, when we replace a remote attribute, the incoming replacement stores the value and length in args->value and args->valuelen, but then the lookup which finds the existing remote attribute overwrites args->valuelen with the length of the remote attribute being replaced. Hence when we go to create the new attribute, we create it of the size of the existing remote attribute, not the size it is supposed to be. When the new attribute is much smaller than the old attribute, this results in a transaction overrun and an ASSERT() failure on a debug kernel: XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, file: fs/xfs/xfs_trans.c, line: 331 Fix this by keeping the remote attribute value length separate to the attribute value length in the xfs_da_args structure. The enables us to pass the length of the remote attribute to be removed without overwriting the new attribute's length. Also, ensure that when we save remote block contexts for a later rename we zero the original state variables so that we don't confuse the state of the attribute to be removes with the state of the new attribute that we just added. [Spotted by Brain Foster.] Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | | xfs: initialize default acls for ->tmpfile()Brian Foster2014-05-061-26/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current tmpfile handler does not initialize default ACLs. Doing so within xfs_vn_tmpfile() makes it roughly equivalent to xfs_vn_mknod(), which is already used as a common create handler. xfs_vn_mknod() does not currently have a mechanism to determine whether to link the file into the namespace. Therefore, further abstract xfs_vn_mknod() into a new xfs_generic_create() handler with a tmpfile parameter. This new handler calls xfs_create_tmpfile() and d_tmpfile() on the dentry when called via ->tmpfile(). Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | | xfs: fully support v5 format filesystemsDave Chinner2014-05-053-10/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have had this code in the kernel for over a year now and have shaken all the known issues out of the code over the past few releases. It's now time to remove the experimental warnings during mount and fully support the new filesystem format in production systems. Remove the experimental warning, and add a version number to the initial "mounting filesystem" message to tell use what type of filesystem is being mounted. Also, remove the temporary inode cluster size output at mount time now we know that this code works fine. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | | | | | Merge branch 'akpm' (incoming from Andrew)Linus Torvalds2014-05-064-4/+9
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: agp: info leak in agpioc_info_wrap() fs/affs/super.c: bugfix / double free fanotify: fix -EOVERFLOW with large files on 64-bit slub: use sysfs'es release mechanism for kmem_cache revert "mm: vmscan: do not swap anon pages just because free+file is low" autofs: fix lockref lookup mm: filemap: update find_get_pages_tag() to deal with shadow entries mm/compaction: make isolate_freepages start at pageblock boundary MAINTAINERS: zswap/zbud: change maintainer email address mm/page-writeback.c: fix divide by zero in pos_ratio_polynom hugetlb: ensure hugepage access is denied if hugepages are not supported slub: fix memcg_propagate_slab_attrs drivers/rtc/rtc-pcf8523.c: fix month definition
| * | | | | | fs/affs/super.c: bugfix / double freeFabian Frederick2014-05-061-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 842a859db26b ("affs: use ->kill_sb() to simplify ->put_super() and failure exits of ->mount()") adds .kill_sb which frees sbi but doesn't remove sbi free in case of parse_options error causing double free+random crash. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> [3.14.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | fanotify: fix -EOVERFLOW with large files on 64-bitWill Woods2014-05-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 64-bit systems, O_LARGEFILE is automatically added to flags inside the open() syscall (also openat(), blkdev_open(), etc). Userspace therefore defines O_LARGEFILE to be 0 - you can use it, but it's a no-op. Everything should be O_LARGEFILE by default. But: when fanotify does create_fd() it uses dentry_open(), which skips all that. And userspace can't set O_LARGEFILE in fanotify_init() because it's defined to 0. So if fanotify gets an event regarding a large file, the read() will just fail with -EOVERFLOW. This patch adds O_LARGEFILE to fanotify_init()'s event_f_flags on 64-bit systems, using the same test as open()/openat()/etc. Addresses https://bugzilla.redhat.com/show_bug.cgi?id=696821 Signed-off-by: Will Woods <wwoods@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | autofs: fix lockref lookupIan Kent2014-05-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | autofs needs to be able to see private data dentry flags for its dentrys that are being created but not yet hashed and for its dentrys that have been rmdir()ed but not yet freed. It needs to do this so it can block processes in these states until a status has been returned to indicate the given operation is complete. It does this by keeping two lists, active and expring, of dentrys in this state and uses ->d_release() to keep them stable while it checks the reference count to determine if they should be used. But with the recent lockref changes dentrys being freed sometimes don't transition to a reference count of 0 before being freed so autofs can occassionally use a dentry that is invalid which can lead to a panic. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | hugetlb: ensure hugepage access is denied if hugepages are not supportedNishanth Aravamudan2014-05-061-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, I am seeing the following when I `mount -t hugetlbfs /none /dev/hugetlbfs`, and then simply do a `ls /dev/hugetlbfs`. I think it's related to the fact that hugetlbfs is properly not correctly setting itself up in this state?: Unable to handle kernel paging request for data at address 0x00000031 Faulting instruction address: 0xc000000000245710 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=2048 NUMA pSeries .... In KVM guests on Power, in a guest not backed by hugepages, we see the following: AnonHugePages: 0 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 64 kB HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages are not supported at boot-time, but this is only checked in hugetlb_init(). Extract the check to a helper function, and use it in a few relevant places. This does make hugetlbfs not supported (not registered at all) in this environment. I believe this is fine, as there are no valid hugepages and that won't change at runtime. [akpm@linux-foundation.org: use pr_info(), per Mel] [akpm@linux-foundation.org: fix build when HPAGE_SHIFT is undefined] Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2014-05-063-219/+111
|\ \ \ \ \ \ \ | |/ / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "dcache fixes + kvfree() (uninlined, exported by mm/util.c) + posix_acl bugfix from hch" The dcache fixes are for a subtle LRU list corruption bug reported by Miklos Szeredi, where people inside IBM saw list corruptions with the LTP/host01 test. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: nick kvfree() from apparmor posix_acl: handle NULL ACL in posix_acl_equiv_mode dcache: don't need rcu in shrink_dentry_list() more graceful recovery in umount_collect() don't remove from shrink list in select_collect() dentry_kill(): don't try to remove from shrink list expand the call of dentry_lru_del() in dentry_kill() new helper: dentry_free() fold try_prune_one_dentry() fold d_kill() and d_free() fix races between __d_instantiate() and checks of dentry flags
| * | | | | | posix_acl: handle NULL ACL in posix_acl_equiv_modeChristoph Hellwig2014-05-061-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various filesystems don't bother checking for a NULL ACL in posix_acl_equiv_mode, and thus can dereference a NULL pointer when it gets passed one. This usually happens from the NFS server, as the ACL tools never pass a NULL ACL, but instead of one representing the mode bits. Instead of adding boilerplat to all filesystems put this check into one place, which will allow us to remove the check from other filesystems as well later on. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Ben Greear <greearb@candelatech.com> Reported-by: Marco Munderloh <munderl@tnt.uni-hannover.de>, Cc: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | dcache: don't need rcu in shrink_dentry_list()Miklos Szeredi2014-05-031-23/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since now the shrink list is private and nobody can free the dentry while it is on the shrink list, we can remove RCU protection from this. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | more graceful recovery in umount_collect()Al Viro2014-05-031-76/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Start with shrink_dcache_parent(), then scan what remains. First of all, BUG() is very much an overkill here; we are holding ->s_umount, and hitting BUG() means that a lot of interesting stuff will be hanging after that point (sync(2), for example). Moreover, in cases when there had been more than one leak, we'll be better off reporting all of them. And more than just the last component of pathname - %pd is there for just such uses... That was the last user of dentry_lru_del(), so kill it off... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
OpenPOWER on IntegriCloud