talos-op-linux - Talos™ II Linux sources for OpenPOWER

	Commit message (Collapse)	Author	Age	Files	Lines
...
\| *	NFS: Remove an extra if in _nfs4_recover_proc_open()	Anna Schumaker	2017-01-30	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \|	It's simpler just to return the status unconditionally Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Return errors directly in _nfs4_opendata_reclaim_to_nfs4_state()	Anna Schumaker	2017-01-30	1	-8/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no need for a goto just to return an error code without any cleanup. Returning the error directly helps to clean up the code. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
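The cleanup described in this commit message can be illustrated with a minimal user-space sketch. The function and parameter names below are hypothetical and are not taken from nfs4proc.c; the sketch only contrasts a `goto` that leads to a bare return with a direct return.

```c
#include <errno.h>

/* Hypothetical "before" shape: the error path jumps to a label that
 * performs no cleanup and simply returns. */
int sketch_reclaim_with_goto(int rpc_done, int rpc_status)
{
	int ret;

	if (!rpc_done) {
		ret = rpc_status;
		goto err;
	}
	ret = 0;
err:
	return ret;
}

/* Hypothetical "after" shape: with no cleanup to share, the error is
 * returned at the point where it is detected. */
int sketch_reclaim_direct(int rpc_done, int rpc_status)
{
	if (!rpc_done)
		return rpc_status;
	return 0;
}
```

Both functions return the same values for the same inputs; the second simply has less control flow to follow.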
\| *	NFS: Remove nfs4_wait_for_completion_rpc_task()	Anna Schumaker	2017-01-30	1	-15/+7
\| \| \| \| \| \| \| \|	Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Clean up _nfs4_is_integrity_protected()	Anna Schumaker	2017-01-30	1	-6/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can cut out the if statement and return the results of the comparison directly. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Fix inconsistent indentation in nfs4proc.c	Anna Schumaker	2017-01-30	1	-28/+28
\| \| \| \| \| \| \| \|	Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Make trace_nfs4_setup_sequence() available to NFS v4.0	Anna Schumaker	2017-01-30	3	-35/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This tracepoint displays information about the slot that was chosen for the RPC, in addition to session information. This could be useful information for debugging, and we can set the session id hash to 0 to indicate that there is no session. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Merge the remaining setup_sequence functions	Anna Schumaker	2017-01-30	1	-67/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This creates a single place for all the work to happen, using the presence of a session to determine if extra values need to be set. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Check if the slot table is draining from nfs4_setup_sequence()	Anna Schumaker	2017-01-30	1	-14/+9
\| \| \| \| \| \| \| \|	Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Handle setup sequence task rescheduling in a single place	Anna Schumaker	2017-01-30	1	-22/+15
\| \| \| \| \| \| \| \|	Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Lock the slot table from a single place during setup sequence	Anna Schumaker	2017-01-30	1	-13/+12
\| \| \| \| \| \| \| \| \| \| \| \|	Rather than implementing this twice for NFS v4.0 and v4.1 Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Move slot-already-allocated check into nfs_setup_sequence()	Anna Schumaker	2017-01-30	1	-10/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This puts the check in a single place, rather than needing to implement it twice for v4.0 and v4.1. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Create a single nfs4_setup_sequence() function	Anna Schumaker	2017-01-30	1	-47/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The inline ifdef lets us put everything in a single place, rather than having two (very similar) versions of this function. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Use nfs4_setup_sequence() everywhere	Anna Schumaker	2017-01-30	5	-52/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This does the right thing depending on if we have a session, rather than needing to handle this manually in multiple places. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Change nfs4_setup_sequence() to take an nfs_client structure	Anna Schumaker	2017-01-30	1	-16/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I want to have all callers use this function, rather than calling the NFS v4.0 and v4.1 versions directly. This includes pNFS, which only has access to the nfs_client structure in some places. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Change nfs4_get_session() to take an nfs_client structure	Anna Schumaker	2017-01-30	3	-10/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pNFS only has access to the nfs_client structure, and not the nfs_server, so we need to make this change so the function can be used by pNFS as well. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: Move nfs4_get_session() into nfs4_session.h	Anna Schumaker	2017-01-30	3	-10/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This puts session related functions together in the same space. I only keep one version of this function, since this variable will always be NULL when using NFS v4.0. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	NFS: tidy up nfs_show_mountd_netid	NeilBrown	2017-01-30	1	-14/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This function is a bit clumsy, incorrectly producing ",mountproto=" if mountd_protocol is 0 and !showdefaults, and duplicating the code for reporting "auto". Tidy it up so that it only makes a single seq_printf() call, and more obviously does the right thing. Fixes: ee671b016fbf ("NFS: convert proto= option to use netids rather than a protoname") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
\| *	sunrpc & nfs: Add and use dprintk_cont macros	Joe Perches	2017-01-30	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow line continuations to work properly with KERN_CONT. Signed-off-by: Joe Perches <joe@perches.com> [Anna: Add fallback dprintk_cont() for when CONFIG_SUNRPC_DEBUG=n] Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* \|	Merge tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linux	Linus Torvalds	2017-02-28	1	-2/+4
\|\ \	Pull nfsd updates from Bruce Fields: "The nfsd update this round is mainly a lot of miscellaneous cleanups and bugfixes. A couple changes could theoretically break working setups on upgrade. I don't expect complaints in practice, but they seem worth calling out just in case: - NFS security labels are now off by default; a new security_label export flag reenables it per export. But, having them on by default is a disaster, as it generally only makes sense if all your clients and servers have similar enough selinux policies. Thanks to Jason Tibbitts for pointing this out. - NFSv4/UDP support is off. It was never really supported, and the spec explicitly forbids it. We only ever left it on out of laziness; thanks to Jeff Layton for finally fixing that" * tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linux: (34 commits) nfsd: Fix display of the version string nfsd: fix configuration of supported minor versions sunrpc: don't register UDP port with rpcbind when version needs congestion control nfs/nfsd/sunrpc: enforce transport requirements for NFSv4 sunrpc: flag transports as having congestion control sunrpc: turn bitfield flags in svc_version into bools nfsd: remove superfluous KERN_INFO nfsd: special case truncates some more nfsd: minor nfsd_setattr cleanup NFSD: Reserve adequate space for LOCKT operation NFSD: Get response size before operation for all RPCs nfsd/callback: Drop a useless data copy when comparing sessionid nfsd/callback: skip the callback tag nfsd/callback: Cleanup callback cred on shutdown nfsd/idmap: return nfserr_inval for 0-length names SUNRPC/Cache: Always treat the invalid cache as unexpired SUNRPC: Drop all entries from cache_detail when cache_purge() svcrdma: Poll CQs in "workqueue" mode svcrdma: Combine list fields in struct svc_rdma_op_ctxt svcrdma: Remove unused sc_dto_q field ...
\| * \|	nfs/nfsd/sunrpc: enforce transport requirements for NFSv4	Jeff Layton	2017-02-24	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NFSv4 requires a transport "that is specified to avoid network congestion" (RFC 7530, section 3.1, paragraph 2). In practical terms, that means that you should not run NFSv4 over UDP. The server has never enforced that requirement, however. This patchset fixes this by adding a new flag to the svc_version that states that it has these transport requirements. With that, we can check that the transport has XPT_CONG_CTRL set before processing an RPC. If it doesn't we reject it with RPC_PROG_MISMATCH. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
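The check this commit message describes can be sketched in user-space C as follows. All type, field, and constant names here are illustrative stand-ins (only `XPT_CONG_CTRL`, `svc_version`, and `RPC_PROG_MISMATCH` are named in the message itself); this is not the sunrpc code, just the decision it describes.

```c
#include <stdbool.h>

/* Illustrative stand-in for the transport flag named XPT_CONG_CTRL. */
#define SKETCH_XPT_CONG_CTRL (1UL << 0)

/* Illustrative outcome of the check. */
enum sketch_verdict { SKETCH_ACCEPT, SKETCH_PROG_MISMATCH };

/* Stand-in for a per-version descriptor carrying the new flag. */
struct sketch_version {
	unsigned int number;
	bool need_cong_ctrl;	/* version requires a congestion-controlled transport */
};

/* Stand-in for a transport with a flags word. */
struct sketch_transport {
	unsigned long flags;
};

/* Reject the request when the version needs congestion control and
 * the transport does not provide it; accept otherwise. */
enum sketch_verdict sketch_check_transport(const struct sketch_version *vers,
					   const struct sketch_transport *xprt)
{
	if (vers->need_cong_ctrl && !(xprt->flags & SKETCH_XPT_CONG_CTRL))
		return SKETCH_PROG_MISMATCH;
	return SKETCH_ACCEPT;
}
```

Under these assumptions, an NFSv4 request arriving over a UDP-like transport (no flag set) is rejected, while versions without the requirement are unaffected.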
\| * \|	sunrpc: turn bitfield flags in svc_version into bools	Jeff Layton	2017-02-24	1	-2/+2
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's just simpler to read this way, IMO. Also, no need to explicitly set vs_hidden to false in the nfsacl ones. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* \|	lib/vsprintf.c: remove %Z support	Alexey Dobriyan	2017-02-27	4	-6/+6
\|	Now that %z is standardised in C99 there is no reason to support %Z. Unlike %L it doesn't even make format strings smaller. Use BUILD_BUG_ON in a couple ATM drivers. In case anyone didn't notice lib/vsprintf.o is about half of SLUB which in my opinion is quite an achievement. Hopefully this patch inspires someone else to trim vsprintf.c more. Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
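For readers unfamiliar with the C99 `z` length modifier the message refers to, here is a small user-space sketch. The helper name is hypothetical; it uses the standard `snprintf()` rather than the kernel's vsprintf.c.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a size_t using the C99 "%zu" conversion,
 * which is what the kernel-only "%Zu" spelling used to express. */
int sketch_format_size(char *buf, size_t buflen, size_t value)
{
	return snprintf(buf, buflen, "%zu", value);
}
```

The return value is the number of characters that the formatted value occupies, as with any `snprintf()` call.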
* \|	mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf	Dave Jiang	2017-02-24	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [arnd@arndb.de: fix ARM build] Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
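The signature change can be pictured with a simplified user-space sketch. The structures below are hypothetical reductions of `vm_area_struct` and `vm_fault` to the two or three fields needed for illustration, and the handlers are not real filesystem callbacks.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for a VMA. */
struct sketch_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Hypothetical stand-in for the fault descriptor: it already records
 * the VMA the fault belongs to. */
struct sketch_vm_fault {
	struct sketch_vma *vma;
	unsigned long address;
};

/* Old shape: the VMA is passed separately even though vmf->vma has it. */
int sketch_fault_old(struct sketch_vma *vma, struct sketch_vm_fault *vmf)
{
	return (vmf->address >= vma->vm_start && vmf->address < vma->vm_end) ? 0 : -1;
}

/* New shape: the handler takes only vmf and reads the VMA from it. */
int sketch_fault_new(struct sketch_vm_fault *vmf)
{
	struct sketch_vma *vma = vmf->vma;

	return (vmf->address >= vma->vm_start && vmf->address < vma->vm_end) ? 0 : -1;
}
```

Dropping the redundant parameter removes the possibility of a caller passing a VMA that disagrees with the one stored in the fault descriptor.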
* \|	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	Linus Torvalds	2017-02-23	2	-2/+2
\|\ \	Pull namespace updates from Eric Biederman: "There is a lot here. A lot of these changes result in subtle user visible differences in kernel behavior. I don't expect anything will care but I will revert/fix things immediately if any regressions show up. From Seth Forshee there is a continuation of the work to make the vfs ready for unprivileged mounts. We had thought the previous changes prevented the creation of files outside of s_user_ns of a filesystem, but it turns out we missed the O_CREAT path. Ooops. Pavel Tikhomirov and Oleg Nesterov worked together to fix a long standing bug in the implementation of PR_SET_CHILD_SUBREAPER where only children that are forked after the prctl are considered and not children forked before the prctl. The only known user of this prctl, systemd, forks all children after the prctl. So no userspace regressions will occur. Holding earlier forked children to the same rules as later forked children creates a semantic that is sane enough to allow checkpointing of processes that use this feature. There is a long delayed change by Nikolay Borisov to limit inotify instances inside a user namespace. Michael Kerrisk extends the API for files used to manipulate namespaces with two new trivial ioctls to allow discovery of the hierarchy and properties of namespaces. Konstantin Khlebnikov with the help of Al Viro adds code that when a network namespace exits purges its sysctl entries from the dcache. As in some circumstances this could use a lot of memory. Vivek Goyal fixed a bug with stacked filesystems where the permissions on the wrong inode were being checked. I continue previous work on ptracing across exec. Allowing a file to be setuid across exec while being ptraced if the tracer has enough credentials in the user namespace, and if the process has CAP_SETUID in its own namespace. Proc files for setuid or otherwise undumpable executables are now owned by the root in the user namespace of their mm. Allowing debugging of setuid applications in containers to work better. A bug I introduced with permission checking and automount is now fixed. The big change is to mark the mounts that the kernel initiates as a result of an automount. This allows the permission checks in sget to be safely suppressed for this kind of mount. As the permission check happened when the original filesystem was mounted. Finally a special case in the mount namespace is removed preventing unbounded chains in the mount hash table, and making the semantics simpler which benefits CRIU. The vfs fix along with related work in ima and evm I believe makes us ready to finish developing and merge fully unprivileged mounts of the fuse filesystem. The cleanups of the mount namespace makes discussing how to fix the worst case complexity of umount. The stacked filesystem fixes pave the way for adding multiple mappings for the filesystem uids so that efficient and safer containers can be implemented" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc/sysctl: Don't grab i_lock under sysctl_lock. vfs: Use upper filesystem inode in bprm_fill_uid() proc/sysctl: prune stale dentries during unregistering mnt: Tuck mounts under others instead of creating shadow/side mounts. prctl: propagate has_child_subreaper flag to every descendant introduce the walk_process_tree() helper nsfs: Add an ioctl() to return owner UID of a userns fs: Better permission checking for submounts exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction vfs: open() with O_CREAT should not create inodes with unknown ids nsfs: Add an ioctl() to return the namespace type proc: Better ownership of files for non-dumpable tasks in user namespaces exec: Remove LSM_UNSAFE_PTRACE_CAP exec: Test the ptracer's saved cred to see if the tracee can gain caps exec: Don't reset euid and egid when the tracee has CAP_SETUID inotify: Convert to using per-namespace limits
\| * \|	fs: Better permission checking for submounts	Eric W. Biederman	2017-02-02	2	-2/+2
\|	To support unprivileged users mounting filesystems, two permission checks have to be performed: a test to see if the user is allowed to create a mount in the mount namespace, and a test to see if the user is allowed to access the specified filesystem. The automount case is special in that mounting the original filesystem grants permission to mount the sub-filesystems, to any user who happens to stumble across their mountpoint and satisfies the ordinary filesystem permission checks. Attempting to handle the automount case by using override_creds almost works. It preserves the idea that permission to mount the original filesystem is permission to mount the sub-filesystem. Unfortunately using override_creds messes up the filesystem's ordinary permission checks. Solve this by being explicit that a mount is a submount by introducing vfs_submount, and using it where appropriate. vfs_submount uses a new internal mount flag, MS_SUBMOUNT, to let sget and friends know that a mount is a submount so they can take appropriate action. sget and sget_userns are modified to not perform any permission checks on submounts. follow_automount is modified to stop using override_creds as that has proven problematic. do_mount is modified to always remove the new MS_SUBMOUNT flag so that we know userspace will never be able to specify it. autofs4 is modified to stop using current_real_cred that was put in there to handle the previous version of submount permission checking. cifs is modified to pass the mountpoint all of the way down to vfs_submount. debugfs is modified to pass the mountpoint all of the way down to trace_automount by adding a new parameter. To make this change easier a new typedef debugfs_automount_t is introduced to capture the type of the debugfs automount function. Cc: stable@vger.kernel.org Fixes: 069d5ac9ae0d ("autofs: Fix automounts by using current_real_cred()->uid") Fixes: aeaa4a79ff6a ("fs: Call d_automount with the filesystems creds") Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* \| \|	nfs: no PG_private waiters remain, remove waker	Nicholas Piggin	2017-02-22	1	-2/+0
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 4f52b6bb8c57 ("NFS: Don't call COMMIT in ->releasepage()"), no tasks wait on PagePrivate. Thus the wake introduced in commit 9590544694be ("NFS: avoid deadlocks with loop-back mounted NFS filesystems.") can be removed. Link: http://lkml.kernel.org/r/20170103182234.30141-2-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	pNFS: Fix a reference leak in _pnfs_return_layout	Trond Myklebust	2017-01-26	1	-1/+1
\|	If NFS_LAYOUT_RETURN_REQUESTED is not set, then we currently exit without freeing the list of invalidated layout segments, leading to a reference leak. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 24408f5282 ("pNFS: Fix bugs in _pnfs_return_layout") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* \|	nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"	Chuck Lever	2017-01-26	1	-0/+1
\|	Lock sequence IDs are bumped in decode_lock by calling nfs_increment_seqid(). nfs_increment_seqid() does not use the seqid_mutating_err() function fixed in commit 059aa7348241 ("Don't increment lock sequence ID after NFS4ERR_MOVED"). Fixes: 059aa7348241 ("Don't increment lock sequence ID after ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Xuan Qi <xuan.qi@oracle.com> Cc: stable@vger.kernel.org # v3.7+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* \|	NFSv4.0: always send mode in SETATTR after EXCLUSIVE4	Benjamin Coddington	2017-01-24	1	-1/+2
\|	Some NFSv4.0 servers may return a mode for the verifier following an open with EXCLUSIVE4 createmode, but this does not mean the client should skip setting the mode in the following SETATTR. It should only do that for EXCLUSIVE4_1 or UNGUARDED createmode. Fixes: 5334c5bdac92 ("NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* \|	NFSv4.1: Fix a deadlock in layoutget	Trond Myklebust	2017-01-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We cannot call nfs4_handle_exception() without first ensuring that the slot has been freed. If not, we end up deadlocking with the process waiting for recovery to complete, and recovery waiting for the slot table to drain. Fixes: 2e80dbe7ac51 ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* \|	NFSv4: Fix client recovery when server reboots multiple times	Trond Myklebust	2017-01-13	1	-1/+0
\|	If the server reboots multiple times, the client should rely on the server to tell it that it cannot reclaim state as per section 9.6.3.4 in RFC7530 and section 8.4.2.1 in RFC5661. Currently, the client is being too conservative, and is assuming that if the server reboots while state recovery is in progress, then it must ignore state that was not recovered before the reboot. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* \|	NFSv4: update_changeattr should update the attribute timestamp	Trond Myklebust	2017-01-12	1	-8/+13
\| \| \| \| \| \| \| \| \| \| \| \|	Otherwise, the attribute cache remains marked as being expired. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* \|	NFSv4: Don't call update_changeattr() unless the unlink is successful	Trond Myklebust	2017-01-12	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the unlink wasn't successful, then the directory has presumably not changed. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* \|	NFSv4: Don't apply change_info4 twice on rename within a directory	Trond Myklebust	2017-01-12	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a file is renamed, but stays in the same directory, we will still receive 2 change_info4 structures describing the change to that directory, but we only want to apply it once. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
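The rule stated in this message (two change_info4 structures arrive, but a same-directory rename should update the directory once) can be sketched as below. Every name is hypothetical, and the choice of which of the two structures to apply in the same-directory case is an assumption of this sketch, not something the message specifies.

```c
#include <stdint.h>

/* Hypothetical cached directory state; 'updates' counts how often the
 * cache was touched so the effect is observable. */
struct sketch_dir {
	uint64_t change_attr;
	unsigned int updates;
};

/* Hypothetical reduction of a change_info4: change attribute before
 * and after the operation. */
struct sketch_change_info {
	uint64_t before;
	uint64_t after;
};

/* Record the post-operation change attribute for one directory. */
void sketch_update_changeattr(struct sketch_dir *dir,
			      const struct sketch_change_info *ci)
{
	dir->change_attr = ci->after;
	dir->updates++;
}

/* Apply the source directory's change info, and the target's only when
 * the target is a different directory (assumption: the first structure
 * is the one used for a same-directory rename). */
void sketch_rename_done(struct sketch_dir *old_dir, struct sketch_dir *new_dir,
			const struct sketch_change_info *old_ci,
			const struct sketch_change_info *new_ci)
{
	sketch_update_changeattr(old_dir, old_ci);
	if (new_dir != old_dir)
		sketch_update_changeattr(new_dir, new_ci);
}
```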
* \|	NFSv4: Call update_changeattr() from _nfs4_proc_open only if a file was created	Trond Myklebust	2017-01-12	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't want to invalidate the directory attribute and data cache unless we know that a file was created, or the change attribute differs from the one in our cache. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* \|	nfs: Don't take a reference on fl->fl_file for LOCK operation	Benjamin Coddington	2017-01-12	1	-3/+0
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I have reports of a crash that look like __fput() was called twice for a NFSv4.0 file. It seems possible that the state manager could try to reclaim a lock and take a reference on the fl->fl_file at the same time the file is being released if, during the close(), a signal interrupts the wait for outstanding IO while removing locks which then skips the removal of that lock. Since 83bfff23e9ed ("nfs4: have do_vfs_lock take an inode pointer") has removed the need to traverse fl->fl_file->f_inode in nfs4_lock_done(), taking that reference is no longer necessary. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	ktime: Get rid of ktime_equal()	Thomas Gleixner	2016-12-25	1	-1/+1
\| \| \| \| \| \| \| \|	No point in going through loops and hoops instead of just comparing the values. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
*	ktime: Get rid of the union	Thomas Gleixner	2016-12-25	1	-2/+1
\|	ktime is a union because the initial implementation stored the time in scalar nanoseconds on 64-bit machines and in an endianness-optimized timespec variant for 32-bit machines. The Y2038 cleanup removed the timespec variant and switched everything to scalar nanoseconds. The union remained, but became completely pointless. Get rid of the union and just keep ktime_t as a simple typedef of type s64. The conversion was done with coccinelle and some manual mopping up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
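A user-space sketch of what a scalar-nanosecond time type looks like, and why a dedicated equality helper becomes unnecessary once the value is a plain 64-bit integer. The type and function names are hypothetical and use `int64_t` in place of the kernel's `s64`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a ktime_t that is a plain signed 64-bit
 * count of nanoseconds rather than a union. */
typedef int64_t sketch_ktime_t;

/* Build a time value from seconds and nanoseconds. */
sketch_ktime_t sketch_ktime_set(int64_t secs, unsigned long nsecs)
{
	return secs * INT64_C(1000000000) + (int64_t)nsecs;
}

/* With a scalar type, equality is just the == operator; this wrapper
 * exists only to make that point explicit. */
bool sketch_ktime_same(sketch_ktime_t a, sketch_ktime_t b)
{
	return a == b;
}
```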
*	Replace <asm/uaccess.h> with <linux/uaccess.h> globally	Linus Torvalds	2016-12-24	6	-6/+6
\|	This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge tag 'nfs-for-4.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds	2016-12-21	11	-109/+166
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull more NFS client updates from Trond Myklebust: "Highlights include: - further attribute cache improvements to make revalidation more fine grained - NFSv4 locking improvements Bugfixes: - nfs4_fl_prepare_ds must be careful about reporting success in files layout - pNFS/flexfiles: Instead of marking a device inactive, remove it from the cache" * tag 'nfs-for-4.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Retry the DELEGRETURN if the embedded GETATTR is rejected with EACCES NFS: Retry the CLOSE if the embedded GETATTR is rejected with EACCES NFSv4: Place the GETATTR operation before the CLOSE NFSv4: Also ask for attributes when downgrading to a READ-only state NFS: Don't abuse NFS_INO_REVAL_FORCED in nfs_post_op_update_inode_locked() pNFS: Return RW layouts on OPEN_DOWNGRADE NFSv4: Add encode/decode of the layoutreturn op in OPEN_DOWNGRADE NFS: Don't disconnect open-owner on NFS4ERR_BAD_SEQID NFSv4: ensure __nfs4_find_lock_state returns consistent result. NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success. pNFS/flexfiles: delete deviceid, don't mark inactive NFS: Clean up nfs_attribute_timeout() NFS: Remove unused function nfs_revalidate_inode_rcu() NFS: Fix and clean up the access cache validity checking NFS: Only look at the change attribute cache state in nfs_weak_revalidate() NFS: Clean up cache validity checking NFS: Don't revalidate the file on close if we hold a delegation NFSv4: Don't discard the attributes returned by asynchronous DELEGRETURN NFSv4: Update the attribute cache info in update_changeattr
\| *	NFSv4: Retry the DELEGRETURN if the embedded GETATTR is rejected with EACCES	Trond Myklebust	2016-12-19	2	-4/+15
\|	If our DELEGRETURN RPC call is rejected with an EACCES error, then we should remove the GETATTR call from the compound RPC and retry. This could potentially happen when there is a conflict between an ACL denying attribute reads and our use of SP4_MACH_CRED. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
\| *	NFS: Retry the CLOSE if the embedded GETATTR is rejected with EACCES	Trond Myklebust	2016-12-19	1	-0/+10
\|	If our CLOSE RPC call is rejected with an EACCES error, then we should remove the GETATTR call from the compound RPC and retry. This could potentially happen when there is a conflict between an ACL denying attribute reads and our use of SP4_MACH_CRED. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
\| *	NFSv4: Place the GETATTR operation before the CLOSE	Trond Myklebust	2016-12-19	2	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to benefit from the DENY share lock protection, we should put the GETATTR operation before the CLOSE. Otherwise, we might race with a Windows machine that thinks it is now safe to modify the file. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
\| *	NFSv4: Also ask for attributes when downgrading to a READ-only state	Trond Myklebust	2016-12-19	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we're downgrading from a READ+WRITE mode to a READ-only mode, then ask for cache consistency attributes so that we avoid the revalidation in nfs_close_context() Fixes: 3947b74d0f9d ("NFSv4: Don't request a GETATTR on open_downgrade.") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
\| *	NFS: Don't abuse NFS_INO_REVAL_FORCED in nfs_post_op_update_inode_locked()	Trond Myklebust	2016-12-19	1	-7/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The NFS_INO_REVAL_FORCED flag now really only has meaning for the case when we've just been handed a delegation for a file that was already cached, and we're unsure about that cache. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
\| *	pNFS: Return RW layouts on OPEN_DOWNGRADE	Trond Myklebust	2016-12-19	1	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the client holds no more writeable open state, and does not hold a write delegation, then send a layoutreturn as part of the OPEN_DOWNGRADE. We do this only for writes, since some layout drivers may require you to also hold a read layout if you are doing a R/W workload. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
\| *	NFSv4: Add encode/decode of the layoutreturn op in OPEN_DOWNGRADE	Trond Myklebust	2016-12-19	1	-0/+10
\|	While we do not need to return the RW layout when downgrading from a read/write open state to read-only, we might want to do so in order to reduce the burden on the metadata server so that it does not need to check for changed data when responding to GETATTR requests. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
\| *	NFS: Don't disconnect open-owner on NFS4ERR_BAD_SEQID	NeilBrown	2016-12-19	1	-16/+13
\|	When an NFS4ERR_BAD_SEQID is received the open-owner is removed from the ->state_owners rbtree so that it will no longer be used. If any stateids attached to this open-owner are still in use, and if a request using one gets an NFS4ERR_BAD_STATEID reply, this can go bad. The state is marked as needing recovery and the nfs4_state_manager() is scheduled to clean up. nfs4_state_manager() finds states to be recovered by walking the state_owners rbtree. As the open-owner is not in the rbtree, the bad state is not found so nfs4_state_manager() completes having done nothing. The request is then retried, with a predictable result (indefinite retries). If the stateid is for a delegation, this open_owner will be used to open files when the delegation is returned. For that to work, a new open-owner needs to be presented to the server. This patch changes NFS4ERR_BAD_SEQID handling to leave the open-owner in the rbtree but updates the 'create_time' so it looks like a new open-owner. With this the indefinite retries no longer happen. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
\| *	NFSv4: ensure __nfs4_find_lock_state returns consistent result.	NeilBrown	2016-12-19	1	-8/+20
\|	If a file has both flock locks and OFD locks, then it is possible that two different nfs4 lock states could apply to file accesses from a single process. It is not possible to know, efficiently, which one is "correct". Presumably the state which represents a lock that covers the region undergoing IO would be the "correct" one to use, but finding that has a non-trivial cost and would provide minuscule value. Currently we just return whichever is first in the list, which could result in inconsistent behaviour if an application ever put itself in this position. As consistent behaviour is preferable (when perfectly correct behaviour is not available), change the search to return a consistent result in this circumstance. Specifically: if there is both a flock and OFD lock state, always return the flock one. Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
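The selection rule stated at the end of this message can be sketched as a list search whose result does not depend on list order. The types and names below are hypothetical and much simpler than the kernel's lock-state structures; the sketch only models "prefer the flock state, otherwise fall back to an OFD state".

```c
#include <stddef.h>

/* Hypothetical tag for the kind of lock a state represents. */
enum sketch_lock_kind { SKETCH_LOCK_FLOCK, SKETCH_LOCK_OFD };

/* Hypothetical singly linked list of lock states for one open file. */
struct sketch_lock_state {
	enum sketch_lock_kind kind;
	struct sketch_lock_state *next;
};

/* Return the flock state if one exists anywhere in the list; otherwise
 * return the first OFD state; otherwise NULL. The preference makes the
 * answer independent of the order in which states were added. */
struct sketch_lock_state *sketch_find_lock_state(struct sketch_lock_state *head)
{
	struct sketch_lock_state *fallback = NULL;
	struct sketch_lock_state *pos;

	for (pos = head; pos != NULL; pos = pos->next) {
		if (pos->kind == SKETCH_LOCK_FLOCK)
			return pos;
		if (fallback == NULL)
			fallback = pos;
	}
	return fallback;
}
```

Returning the first list entry unconditionally would give different answers depending on which lock was taken first, which is the inconsistency the message describes.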
\| *	NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.	NeilBrown	2016-12-19	1	-1/+2
\|	Various places assume that if nfs4_fl_prepare_ds() returns a non-NULL 'ds', then ds->ds_clp will also be non-NULL. This is not necessarily true in the case when the process received a fatal signal while nfs4_pnfs_ds_connect is waiting in nfs4_wait_ds_connect(). In that case ->ds_clp may not be set, and the devid may not recently have been marked unavailable. So add a test for ds_clp == NULL and return NULL in that case. Fixes: c23266d532b4 ("NFS4.1 Fix data server connection race") Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: Olga Kornievskaia <aglo@umich.edu> Acked-by: Adamson, Andy <William.Adamson@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>