summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* kill the 4th argument of __generic_file_aio_write()Al Viro2014-04-013-3/+3
| | | | | | | It's always equal to &iocb->ki_pos, where iocb is the value of the 1st argument. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ocfs2: don't open-code kernel_recvmsg()Al Viro2014-04-011-18/+3
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* constify blk_rq_map_user_iov() and friendsAl Viro2014-04-011-5/+5
| | | | | | sg_iovec array passed to it can be const Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ocfs2: don't open-code kernel_sendmsg()Al Viro2014-04-011-20/+8
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* read_code(): go through vfs_read() instead of calling the method directlyAl Viro2014-04-011-1/+1
| | | | | | | | ... and don't skip on sanity checks. It's *not* a hot path, TYVM (a couple of calls per a.out execve(), for pity sake) and headers of random a.out binary are not to be trusted. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fold cifs_iovec_read() into its (only) callerAl Viro2014-04-011-18/+9
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* cifs_iovec_read: keep iov_iter between the calls of cifs_readdata_to_iov()Al Viro2014-04-011-45/+17
| | | | | | | | | | ... we are doing them on adjacent parts of file, so what happens is that each subsequent call works to rebuild the iov_iter to exact state it had been abandoned in by previous one. Just keep it through the entire cifs_iovec_read(). And use copy_page_to_iter() instead of doing kmap/copy_to_user/kunmap manually... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch vmsplice_to_user() to copy_page_to_iter()Al Viro2014-04-011-89/+21
| | | | | | | | I've switched the sanity checks on iovec to rw_copy_check_uvector(); we might need to do a local analog, if any behaviour differences are not actually bugfixes here... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch pipe_read() to copy_page_to_iter()Al Viro2014-04-011-71/+8
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* cifs_iovec_read(): resubmit shouldn't restart the loopAl Viro2014-04-011-8/+8
| | | | | | | | ... by that point the request we'd just resent is in the head of the list anyway. Just return to the beginning of the loop body... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* callers of iov_copy_from_user_atomic() don't need pagecache_disable()Al Viro2014-04-012-7/+0
| | | | | | ... it does that itself (via kmap_atomic()) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch ->is_partially_uptodate() to saner argumentsAl Viro2014-04-011-3/+3
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* pipe: kill ->map() and ->unmap()Al Viro2014-04-013-73/+27
| | | | | | all pipe_buffer_operations have the same instances of those... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fuse/dev: use atomic mapsAl Viro2014-04-011-5/+5
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* VFS: Make delayed_free() call free_vfsmnt()David Howells2014-04-011-12/+8
| | | | | | | | | Make delayed_free() call free_vfsmnt() so that we don't have two functions doing the same job. This requires the calls to mnt_free_id() in free_vfsmnt() to be moved into the callers of that function. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* cifs: ->rename() without ->lookup() makes no senseAl Viro2014-04-011-1/+0
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* get rid of pointless checks for NULL ->i_opAl Viro2014-04-012-3/+1
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ntfs: don't put NULL into ->i_op/->i_fopAl Viro2014-04-011-2/+0
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* new helper: readlink_copy()Al Viro2014-04-014-46/+11
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* get rid of files_defer_init()Al Viro2014-04-012-8/+4
| | | | | | | | the only thing it's doing these days is calculation of upper limit for fs.nr_open sysctl and that can be done statically Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* namei.c: move EXPORT_SYMBOL to corresponding definitionsAl Viro2014-04-011-28/+27
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* get_write_access() is inlined, exporting it is pointlessAl Viro2014-04-011-1/+0
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* tidy do_dentry_open() up a bitAl Viro2014-04-011-12/+10
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* mark struct file that had write access grabbed by open()Al Viro2014-04-013-41/+9
| | | | | | | | | new flag in ->f_mode - FMODE_WRITER. Set by do_dentry_open() in case when it has grabbed write access, checked by __fput() to decide whether it wants to drop the sucker. Allows to stop bothering with mnt_clone_write() in alloc_file(), along with fewer special_file() checks. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fold __get_file_write_access() into its only callerAl Viro2014-04-011-19/+6
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* get rid of DEBUG_WRITECOUNTAl Viro2014-04-012-13/+0
| | | | | | it only makes control flow in __fput() and friends more convoluted. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* don't bother with {get,put}_write_access() on non-regular filesAl Viro2014-04-012-21/+9
| | | | | | | | | it's pointless and actually leads to wrong behaviour in at least one moderately convoluted case (pipe(), close one end, try to get to another via /proc/*/fd and run into ETXTBUSY). Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ncpfs: switch to sockfd_lookup()/sockfd_put()Al Viro2014-04-012-40/+12
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* reduce m_start() cost...Al Viro2014-04-013-4/+23
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* smarter propagate_mnt()Al Viro2014-04-013-82/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current mainline has copies propagated to *all* nodes, then tears down the copies we made for nodes that do not contain counterparts of the desired mountpoint. That sets the right propagation graph for the copies (at teardown time we move the slaves of removed node to a surviving peer or directly to master), but we end up paying a fairly steep price in useless allocations. It's fairly easy to create a situation where N calls of mount(2) create exactly N bindings, with O(N^2) vfsmounts allocated and freed in process. Fortunately, it is possible to avoid those allocations/freeings. The trick is to create copies in the right order and find which one would've eventually become a master with the current algorithm. It turns out to be possible in O(nodes getting propagation) time and with no extra allocations at all. One part is that we need to make sure that eventual master will be created before its slaves, so we need to walk the propagation tree in a different order - by peer groups. And iterate through the peers before dealing with the next group. Another thing is finding the (earlier) copy that will be a master of one we are about to create; to do that we are (temporary) marking the masters of mountpoints we are attaching the copies to. Either we are in a peer of the last mountpoint we'd dealt with, or we have the following situation: we are attaching to mountpoint M, the last copy S_0 had been attached to M_0 and there are sequences S_0...S_n, M_0...M_n such that S_{i+1} is a master of S_{i}, S_{i} mounted on M{i} and we need to create a slave of the first S_{k} such that M is getting propagation from M_{k}. It means that the master of M_{k} will be among the sequence of masters of M. On the other hand, the nearest marked node in that sequence will either be the master of M_{k} or the master of M_{k-1} (the latter - in the case if M_{k-1} is a slave of something M gets propagation from, but in a wrong peer group). So we go through the sequence of masters of M until we find a marked one (P). Let N be the one before it. Then we go through the sequence of masters of S_0 until we find one (say, S) mounted on a node D that has P as master and check if D is a peer of N. If it is, S will be the master of new copy, if not - the master of S will be. That's it for the hard part; the rest is fairly simple. Iterator is in next_group(), handling of one prospective mountpoint is propagate_one(). It seems to survive all tests and gives a noticably better performance than the current mainline for setups that are seriously using shared subtrees. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch mnt_hash to hlistAl Viro2014-03-304-50/+61
| | | | | | | | | | | | | fixes RCU bug - walking through hlist is safe in face of element moves, since it's self-terminating. Cyclic lists are not - if we end up jumping to another hash chain, we'll loop infinitely without ever hitting the original list head. [fix for dumb braino folded] Spotted by: Max Kellermann <mk@cm4all.com> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* don't bother with propagate_mnt() unless the target is sharedAl Viro2014-03-301-10/+7
| | | | | | | | | If the dest_mnt is not shared, propagate_mnt() does nothing - there's no mounts to propagate to and thus no copies to create. Might as well don't bother calling it in that case. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* keep shadowed vfsmounts togetherAl Viro2014-03-301-9/+23
| | | | | | | preparation to switching mnt_hash to hlist Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* resizable namespace.c hashesAl Viro2014-03-302-24/+59
| | | | | | | | | * switch allocation to alloc_large_system_hash() * make sizes overridable by boot parameters (mhash_entries=, mphash_entries=) * switch mountpoint_hashtable from list_head to hlist_head Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ocfs2: check if cluster name exists before derefSasha Levin2014-03-281-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit c74a3bdd9b52 ("ocfs2: add clustername to cluster connection") is trying to strlcpy a string which was explicitly passed as NULL in the very same patch, triggering a NULL ptr deref. BUG: unable to handle kernel NULL pointer dereference at (null) IP: strlcpy (lib/string.c:388 lib/string.c:151) CPU: 19 PID: 19426 Comm: trinity-c19 Tainted: G W 3.14.0-rc7-next-20140325-sasha-00014-g9476368-dirty #274 RIP: strlcpy (lib/string.c:388 lib/string.c:151) Call Trace: ocfs2_cluster_connect (fs/ocfs2/stackglue.c:350) ocfs2_cluster_connect_agnostic (fs/ocfs2/stackglue.c:396) user_dlm_register (fs/ocfs2/dlmfs/userdlm.c:679) dlmfs_mkdir (fs/ocfs2/dlmfs/dlmfs.c:503) vfs_mkdir (fs/namei.c:3467) SyS_mkdirat (fs/namei.c:3488 fs/namei.c:3472) tracesys (arch/x86/kernel/entry_64.S:749) akpm: this patch probably disables the feature. A temporary thing to avoid triviel oopses. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Goldwyn Rodrigues <rgoldwyn@suse.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* vfs: Allocate anon_inode_inode in anon_inode_init()Jan Kara2014-03-271-22/+8
| | | | | | | | | | | Currently we allocated anon_inode_inode in anon_inodefs_mount. This is somewhat fragile as if that function ever gets called again, it will overwrite anon_inode_inode pointer. So move the initialization of anon_inode_inode to anon_inode_init(). Signed-off-by: Jan Kara <jack@suse.cz> [ Further simplified on suggestion from Dave Jones ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: remove now stale label in anon_inode_init()Linus Torvalds2014-03-251-1/+0
| | | | | | | | The previous commit removed the register_filesystem() call and the associated error handling, but left the label for the error path that no longer exists. Remove that too. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: Avoid userspace mounting anon_inodefs filesystemJan Kara2014-03-251-3/+0
| | | | | | | | | | | | | | | | | anon_inodefs filesystem is a kernel internal filesystem userspace shouldn't mess with. Remove registration of it so userspace cannot even try to mount it (which would fail anyway because the filesystem is MS_NOUSER). This fixes an oops triggered by trinity when it tried mounting anon_inodefs which overwrote anon_inode_inode pointer while other CPU has been in anon_inode_getfile() between ihold() and d_instantiate(). Thus effectively creating dentry pointing to an inode without holding a reference to it. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'nfsd-next' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2014-03-251-0/+1
|\ | | | | | | | | | | | | | | | | Pull nfsd fix frm Bruce Fields: "J R Okajima sent this early and I was just slow to pass it along, apologies. Fortunately it's a simple fix" * 'nfsd-next' of git://linux-nfs.org/~bfields/linux: nfsd: fix lost nfserrno() call in nfsd_setattr()
| * nfsd: fix lost nfserrno() call in nfsd_setattr()J. R. Okajima2014-02-181-0/+1
| | | | | | | | | | | | | | | | | | There is a regression in 208d0ac 2014-01-07 nfsd4: break only delegations when appropriate which deletes an nfserrno() call in nfsd_setattr() (by accident, probably), and NFSD becomes ignoring an error from VFS. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | rcuwalk: recheck mount_lock after mountpoint crossing attemptsAl Viro2014-03-231-16/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can get false negative from __lookup_mnt() if an unrelated vfsmount gets moved. In that case legitimize_mnt() is guaranteed to fail, and we will fall back to non-RCU walk... unless we end up running into a hard error on a filesystem object we wouldn't have reached if not for that false negative. IOW, delaying that check until the end of pathname resolution is wrong - we should recheck right after we attempt to cross the mountpoint. We don't need to recheck unless we see d_mountpoint() being true - in that case even if we have just raced with mount/umount, we can simply go on as if we'd come at the moment when the sucker wasn't a mountpoint; if we run into a hard error as the result, it was a legitimate outcome. __lookup_mnt() returning NULL is different in that respect, since it might've happened due to operation on completely unrelated mountpoint. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | make prepend_name() work correctly when called with negative *buflenAl Viro2014-03-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | In all callchains leading to prepend_name(), the value left in *buflen is eventually discarded unused if prepend_name() has returned a negative. So we are free to do what prepend() does, and subtract from *buflen *before* checking for underflow (which turns into checking the sign of subtraction result, of course). Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | vfs: Don't let __fdget_pos() get FMODE_PATH filesEric Biggers2014-03-231-15/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit bd2a31d522344 ("get rid of fget_light()") introduced the __fdget_pos() function, which returns the resulting file pointer and fdput flags combined in an 'unsigned long'. However, it also changed the behavior to return files with FMODE_PATH set, which shouldn't happen because read(), write(), lseek(), etc. aren't allowed on such files. This commit restores the old behavior. This regression actually had no effect on read() and write() since FMODE_READ and FMODE_WRITE are not set on file descriptors opened with O_PATH, but it did cause lseek() on a file descriptor opened with O_PATH to fail with ESPIPE rather than EBADF. Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | vfs: atomic f_pos access in llseek()Eric Biggers2014-03-231-2/+2
| | | | | | | | | | | | | | | | | | Commit 9c225f2655e36a4 ("vfs: atomic f_pos accesses as per POSIX") changed several system calls to use fdget_pos() instead of fdget(), but missed sys_llseek(). Fix it. Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2014-03-113-19/+36
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull CIFS fixes from Steve French: "A fix for the problem which Al spotted in cifs_writev and a followup (noticed when fixing CVE-2014-0069) patch to ensure that cifs never sends more than the smb frame length over the socket (as we saw with that cifs_iovec_write problem that Jeff fixed last month)" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: cifs: mask off top byte in get_rfc1002_length() cifs: sanity check length of data to send before sending CIFS: Fix wrong pos argument of cifs_find_lock_conflict
| * | cifs: mask off top byte in get_rfc1002_length()Jeff Layton2014-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rfc1002 length actually includes a type byte, which we aren't masking off. In most cases, it's not a problem since the RFC1002_SESSION_MESSAGE type is 0, but when doing a RFC1002 session establishment, the type is non-zero and that throws off the returned length. Signed-off-by: Jeff Layton <jlayton@redhat.com> Tested-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * | cifs: sanity check length of data to send before sendingJeff Layton2014-02-231-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We had a bug discovered recently where an upper layer function (cifs_iovec_write) could pass down a smb_rqst with an invalid amount of data in it. The length of the SMB frame would be correct, but the rqst struct would cause smb_send_rqst to send nearly 4GB of data. This should never be the case. Add some sanity checking to the beginning of smb_send_rqst that ensures that the amount of data we're going to send agrees with the length in the RFC1002 header. If it doesn't, WARN() and return -EIO to the upper layers. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <smfrench@gmail.com>
| * | CIFS: Fix wrong pos argument of cifs_find_lock_conflictPavel Shilovsky2014-02-231-18/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | and use generic_file_aio_write rather than __generic_file_aio_write in cifs_writev. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Steve French <smfrench@gmail.com>
* | | Merge branch 'akpm' (patches from Andrew Morton)Linus Torvalds2014-03-105-2/+56
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge misc fixes from Andrew Morton: "Nine fixes" * emailed patches from Andrew Morton akpm@linux-foundation.org>: cris: convert ffs from an object-like macro to a function-like macro hfsplus: add HFSX subfolder count support tools/testing/selftests/ipc/msgque.c: handle msgget failure return correctly MAINTAINERS: blackfin: add git repository revert "kallsyms: fix absolute addresses for kASLR" mm/Kconfig: fix URL for zsmalloc benchmark fs/proc/base.c: fix GPF in /proc/$PID/map_files mm/compaction: break out of loop on !PageBuddy in isolate_freepages_block mm: fix GFP_THISNODE callers and clarify
| * | | hfsplus: add HFSX subfolder count supportSergei Antonov2014-03-104-2/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds support for HFSX 'HasFolderCount' flag and a corresponding 'folderCount' field in folder records. (For reference see HFS_FOLDERCOUNT and kHFSHasFolderCountBit/kHFSHasFolderCountMask in Apple's source code.) Ignoring subfolder count leads to fs errors found by Mac: ... Checking catalog hierarchy. HasFolderCount flag needs to be set (id = 105) (It should be 0x10 instead of 0) Incorrect folder count in a directory (id = 2) (It should be 7 instead of 6) ... Steps to reproduce: Format with "newfs_hfs -s /dev/diskXXX". Mount in Linux. Create a new directory in root. Unmount. Run "fsck_hfs /dev/diskXXX". The patch handles directory creation, deletion, and rename. Signed-off-by: Sergei Antonov <saproj@gmail.com> Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OpenPOWER on IntegriCloud