blackbird-op-linux - Blackbird™ Linux sources for OpenPOWER

	Commit message	Author	Age	Files	Lines
*	NFS: Don't zap caches on fallocate()	Anna Schumaker	2015-04-23	4	-10/+35
\| \| \| \| \| \| \| \| \| \| \|	This patch adds a GETATTR to the end of ALLOCATE and DEALLOCATE operations so we can set the updated inode size and change attribute directly. DEALLOCATE will still need to release pagecache pages, so nfs42_proc_deallocate() now calls truncate_pagecache_range() before contacting the server. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFS: Block new writes while syncing data in nfs_getattr()	Trond Myklebust	2015-03-27	1	-0/+2
\| \| \| \|	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1/pnfs: Separate out metadata and data consistency for pNFS	Trond Myklebust	2015-03-27	9	-8/+47
\| \| \| \| \| \| \| \| \| \| \|	The LAYOUTCOMMIT operation means different things to different layout types. For blocks and objects, it is both a data and metadata consistency operation. For files and flexfiles, it is only a metadata consistency operation. This patch separates out the 2 cases, allowing the files/flexfiles layout drivers to optimise away the data consistency calls to layoutcommit. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1/pnfs: Ensure we send layoutcommit before return-on-close	Trond Myklebust	2015-03-27	1	-1/+4
\| \| \| \| \| \| \| \|	We must not send a close or delegreturn that would result in a return-on-close of the layout without ensuring that we've also sent the necessary layoutcommit. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1/pnfs: Ensure that writes respect the O_SYNC flag when doing O_DIRECT	Trond Myklebust	2015-03-27	3	-0/+3
\| \| \| \| \| \| \| \| \|	If the caller does not specify the O_SYNC flag, then it is legitimate to return from O_DIRECT without doing a pNFS layoutcommit operation. However if the file is opened O_DIRECT\|O_SYNC then we'd better get it right. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Truncating file opens should also sync O_DIRECT writes	Trond Myklebust	2015-03-27	2	-2/+3
\| \| \| \| \| \|	We don't just want to sync out buffered writes, but also O_DIRECT ones. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFS: File unlock needs to be a metadata synchronisation point	Trond Myklebust	2015-03-27	1	-1/+1
\| \| \| \| \| \| \|	File unlock needs to update both data and metadata on the NFS server in order to act as a synchronisation point for other clients. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFS: Add a helper to sync both O_DIRECT and buffered writes	Trond Myklebust	2015-03-27	1	-6/+9
\| \| \| \| \| \|	Then apply it to nfs_setattr() and nfs_getattr(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1/pnfs: Refactor pnfs_set_layoutcommit()	Trond Myklebust	2015-03-27	4	-42/+14
\| \| \| \| \| \| \|	pnfs_set_layoutcommit() and pnfs_commit_set_layoutcommit() are 100% identical except for the function arguments. Refactor to eliminate the difference. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1/pnfs: Fix setting of layoutcommit last write byte	Trond Myklebust	2015-03-27	1	-9/+8
\| \| \| \| \| \| \| \| \|	If the NFS_INO_LAYOUTCOMMIT flag was unset, then we _must_ ensure that we also reset the last write byte (lwb) for that layout. The current code depends on us clearing the lwb when we clear NFS_INO_LAYOUTCOMMIT, which is not the case when we call pnfs_clear_layoutcommit(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Return the delegation before returning the layout in evict_inode()	Trond Myklebust	2015-03-27	1	-2/+3
\| \| \| \| \| \| \|	Minor optimisation for the case where the layout has return-on-close enabled. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Allow tracing of NFSv4 fsync calls	Trond Myklebust	2015-03-27	2	-0/+8
\| \| \| \| \| \|	I appear to have missed this when adding the ftrace probes. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFS: Fix free_deveiceid -> free_deviceid	Trond Myklebust	2015-03-27	2	-4/+4
\| \| \| \| \| \|	Make it easier to grep for these functions by name. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1: Don't cache deviceids that have no notifications	Trond Myklebust	2015-03-27	3	-0/+13
\| \| \| \| \| \| \| \|	The spec says that once all layouts that reference a given deviceid have been returned, then we are only allowed to continue to cache the deviceid if the metadata server supports notifications. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1: Allow getdeviceinfo to return notification info back to caller	Trond Myklebust	2015-03-27	2	-9/+10
\| \| \| \| \| \| \| \|	We are only allowed to cache deviceinfo if the server supports notifications and actually promises to call us back when changes occur. Right now, we request those notifications, but then we don't check the server's reply. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1: Cleanup - don't opencode nfs4_put_deviceid_node()	Trond Myklebust	2015-03-27	1	-4/+2
\| \| \| \| \| \|	There really is no reason to do so. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1: Convert pNFS deviceid to use kfree_rcu()	Trond Myklebust	2015-03-27	6	-8/+7
\| \| \| \| \| \| \| \|	Use of synchronize_rcu() when unmounting and potentially freeing a lot of deviceids is problematic. There really is no reason why we can't just use kfree_rcu() here. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	nfs: clean up nfs_direct_IO	Peng Tao	2015-03-13	1	-7/+0
\| \| \| \| \| \| \| \|	This follows up "nfs: fix dio deadlock when O_DIRECT flag is flipped" and removes the unnecessary CONFIG_NFS_SWAP switch. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Append delegations to the per-client list instead of prepending	Trond Myklebust	2015-03-12	1	-1/+1
\| \| \| \| \| \| \| \| \|	Do so on the assumption that for most use cases, that list will turn into a more or less LRU-ordered list, and so the list traversals in nfs_client_return_marked_delegations() are likely to be shorter before hitting a candidate to return. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1: Clear the old state by our client id before establishing a new lease	Trond Myklebust	2015-03-03	3	-5/+17
\| \| \| \| \| \| \| \| \| \|	If the call to exchange-id returns with the EXCHGID4_FLAG_CONFIRMED_R flag set, then that means our lease was established by a previous mount instance. Ensure that we detect this situation, and that we clear the state held by that mount. Reported-by: Jorge Mora <Jorge.Mora@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Fix a race in NFSv4.1 server trunking discovery	Trond Myklebust	2015-03-03	3	-8/+17
\| \| \| \| \| \| \| \|	We do not want to allow a race with another NFS mount to cause nfs41_walk_client_list() to establish a lease on our nfs_client before we're done checking for trunking. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFS: Don't write enable new pages while an invalidation is proceeding	Trond Myklebust	2015-03-03	2	-0/+4
\| \| \| \| \| \| \| \| \|	nfs_vm_page_mkwrite() should wait until the page cache invalidation is finished. This is the second patch in a 2 patch series to deprecate the NFS client's reliance on nfs_release_page() in the context of nfs_invalidate_mapping(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFS: Fix a regression in the read() syscall	Trond Myklebust	2015-03-03	2	-5/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When invalidating the page cache for a regular file, we want to first sync all dirty data to disk and then call invalidate_inode_pages2(). The latter relies on nfs_launder_page() and nfs_release_page() to deal respectively with dirty pages, and unstable written pages. When commit 9590544694bec ("NFS: avoid deadlocks with loop-back mounted NFS filesystems.") changed the behaviour of nfs_release_page(), then it made it possible for invalidate_inode_pages2() to fail with an EBUSY. Unfortunately, that error is then propagated back to read(). Let's therefore work around the problem for now by protecting the call to sync the data and invalidate_inode_pages2() so that they are atomic w.r.t. the addition of new writes. Later on, we can revisit whether or not we still need nfs_launder_page() and nfs_release_page(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Ensure we skip delegations that are already being returned	Trond Myklebust	2015-03-02	1	-0/+6
\| \| \| \| \| \| \| \|	In nfs_client_return_marked_delegations() and nfs_delegation_reap_unclaimed() we want to optimise the loop traversal by skipping delegations that are already in the process of being returned. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Pin the superblock while we're returning the delegation	Trond Myklebust	2015-03-02	1	-4/+16
\| \| \| \| \| \| \|	This patch ensures that the superblock doesn't go ahead and disappear underneath us while the state manager thread is returning delegations. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Ensure we honour NFS_DELEGATION_RETURNING in nfs_inode_set_delegation()	Trond Myklebust	2015-03-02	1	-1/+4
\| \| \| \| \| \| \|	Ensure that nfs_inode_set_delegation() doesn't inadvertently detach a delegation that is already in the process of being returned. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Ensure that we don't reap a delegation that is being returned	Trond Myklebust	2015-03-02	1	-5/+7
\| \| \| \|	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFS: Fix stateid used for NFS v4 closes	Anna Schumaker	2015-03-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	After 566fcec60 the client uses the "current stateid" from the nfs4_state structure to close a file. This could potentially contain a delegation stateid, which is disallowed by the protocol and causes servers to return NFS4ERR_BAD_STATEID. This patch restores the (correct) behavior of sending the open stateid to close a file. Reported-by: Olga Kornievskaia <kolga@netapp.com> Fixes: 566fcec60 (NFSv4: Fix an atomicity problem in CLOSE) Signed-off-by: Anna Schumaker <Anna.Schumaker@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Don't call put_rpccred() under the rcu_read_lock()	Trond Myklebust	2015-03-01	1	-1/+1
\| \| \| \| \| \| \| \|	put_rpccred() can sleep. Fixes: 8f649c3762547 ("NFSv4: Fix the locking in nfs_inode_reclaim_delegation()") Cc: stable@vger.kernel.org # 2.6.35+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache()	Trond Myklebust	2015-03-01	1	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \|	If the server does not return a valid set of attributes that we can use to either create a file or refresh the inode, then there is no value in calling nfs_prime_dcache(). However if we're just refreshing the inode using the attributes that the server returned, then it shouldn't matter whether or not we have a filehandle, as long as we check the fsid+fileid combination. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv3: Use the readdir fileid as the mounted-on-fileid	Trond Myklebust	2015-03-01	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we call readdirplus, set the fileid normally returned by readdir as the mounted-on-fileid, since that is commonly the case if there is a mountpoint. To ensure that we get it right, we only set the flag if the readdir fileid differs from the one returned in the readdirplus attributes. This again means that we can avoid the issues described in commit 2ef47eb1aee17 ("NFS: Fix use of nfs_attr_use_mounted_on_fileid()"), which only fixed NFSv4. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()	Trond Myklebust	2015-03-01	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we're traversing a directory which contains a submounted filesystem, or one that has a referral, the NFS server that is processing the READDIR request will often return information for the underlying (mounted-on) directory. It may, or may not, also return filehandle information. If this happens, and the lookup in nfs_prime_dcache() returns the dentry for the submounted directory, the filehandle comparison will fail, and we call d_invalidate(). Post-commit 8ed936b5671bf ("vfs: Lazily remove mounts on unlinked files and directories."), this means the entire subtree is unmounted. The following minimal patch addresses this problem by punting on the invalidation if there is a submount. Kudos to Neil Brown <neilb@suse.de> for having tracked down this issue (see link). Reported-by: Nix <nix@esperi.org.uk> Link: http://lkml.kernel.org/r/87iofju9ht.fsf@spindle.srvr.nix Cc: stable@vger.kernel.org # 3.18+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4: Set a barrier in the update_changeattr() helper	Trond Myklebust	2015-03-01	2	-0/+2
\| \| \| \| \| \| \| \|	Ensure that we don't regress the changes that were made to the directory. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
*	NFS: Fix nfs_post_op_update_inode() to set an attribute barrier	Trond Myklebust	2015-03-01	1	-0/+1
\| \| \| \| \| \| \| \|	nfs_post_op_update_inode() is called after a self-induced attribute update. Ensure that it also sets the barrier. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
*	NFS: Remove size hack in nfs_inode_attrs_need_update()	Trond Myklebust	2015-03-01	1	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Prior to this patch, we used to always OK attribute updates that extended the file size on the assumption that we might be performing writeback. Now that we have attribute barriers to protect the writeback related updates, we should remove this hack, as it can cause truncate() operations to apparently be reverted if/when a readahead or getattr RPC call races with our on-the-wire SETATTR. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
*	NFSv4: Add attribute update barriers to delegreturn and pNFS layoutcommit	Trond Myklebust	2015-03-01	1	-0/+1
\| \| \| \| \| \| \| \|	Ensure that other operations that race with delegreturn and layoutcommit cannot revert the attribute updates that were made on the server. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
*	NFS: Add attribute update barriers to NFS writebacks	Trond Myklebust	2015-03-01	6	-8/+56
\| \| \| \| \| \| \| \|	Ensure that other operations that race with our write RPC calls cannot revert the file size updates that were made on the server. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
*	NFS: Set an attribute barrier on all updates	Trond Myklebust	2015-03-01	1	-0/+4
\| \| \| \| \| \| \| \|	Ensure that we update the attribute barrier even if there were no invalidations, provided that this value is newer than the old one. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
*	NFS: Add attribute update barriers to nfs_setattr_update_inode()	Trond Myklebust	2015-03-01	4	-10/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ensure that other operations which raced with our setattr RPC call cannot revert the file attribute changes that were made on the server. To do so, we artificially bump the attribute generation counter on the inode so that all calls to nfs_fattr_init() that precede ours will be dropped. The motivation for the patch came from Chuck Lever's reports of readaheads racing with truncate operations and causing the file size to be reverted. Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
*	NFS: Add a helper to set attribute barriers	Trond Myklebust	2015-03-01	1	-0/+16
\| \| \| \| \|	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
*	NFS: Ensure that buffered writes wait for O_DIRECT writes to complete	Trond Myklebust	2015-03-01	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \|	The O_DIRECT code will grab the inode->i_mutex and flush out buffered writes, before scheduling a read or a write. However there is no equivalent in the buffered write code to wait for O_DIRECT to complete. Fixes a reported issue in xfstests generic/133, when first performing an O_DIRECT write followed by a buffered write. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
*	NFSv4: nfs4_open_recover_helper() must set share access	Trond Myklebust	2015-02-27	1	-0/+3
\| \| \| \| \| \| \| \| \|	The share access mode is now specified as an argument in the nfs4_opendata, and so nfs4_open_recover_helper() needs to call nfs4_map_atomic_open_share() in order to set it. Fixes: 6ae373394c42 ("NFSv4.1: Ask for no delegation on OPEN if using O_DIRECT") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1: Clean up bind_conn_to_session	Trond Myklebust	2015-02-18	2	-22/+22
\| \| \| \| \| \|	We don't need to fake up an entire session in order to retrieve the arguments. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1: Always set up a forward channel when binding the session	Trond Myklebust	2015-02-18	1	-1/+1
\| \| \| \| \| \| \| \|	Currently, the client requests a back channel or a bidirectional connection when binding a new TCP channel to an existing session. Fix that to ask for a forward channel or bidirectional. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1: Don't set up a backchannel if the server didn't agree to do so	Trond Myklebust	2015-02-18	3	-2/+9
\| \| \| \| \| \| \|	If the server doesn't agree to our backchannel setup request, then don't set one up. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	NFSv4.1: Clean up create_session	Trond Myklebust	2015-02-18	3	-22/+42
\| \| \| \| \| \|	Don't decode directly into the shared struct session. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
*	Merge branch 'cleanups'	Trond Myklebust	2015-02-18	13	-168/+161
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge cleanups requested by Linus. * cleanups: (3 commits) pnfs: Refactor the *_layout_mark_request_commit to use pnfs_layout_mark_request_commit nfs: Can call nfs_clear_page_commit() instead nfs: Provide and use helper functions for marking a page as unstable
\| *	pnfs: Refactor the *_layout_mark_request_commit to use pnfs_layout_mark_request_commit	Tom Haynes	2015-02-18	4	-75/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The File Layout's filelayout_mark_request_commit() is almost the Flex File Layout's ff_layout_mark_request_commit(). And that can be reduced by calling into nfs_request_add_commit_list(). Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
\| *	nfs: Can call nfs_clear_page_commit() instead	Tom Haynes	2015-02-13	1	-5/+2
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
\| *	nfs: Provide and use helper functions for marking a page as unstable	Tom Haynes	2015-02-13	4	-21/+19
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
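
Note on the attribute-barrier commits of 2015-03-01: the message for "NFS: Add attribute update barriers to nfs_setattr_update_inode()" says the fix bumps the attribute generation counter on the inode so that attribute replies initialised by earlier nfs_fattr_init() calls are dropped. The following is a minimal user-space sketch of that idea only, not the kernel implementation. Every `model_*` name and struct is hypothetical, and the plain integer comparison stands in for whatever comparison the kernel actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an inode: cached size plus the barrier generation. */
struct model_inode {
    uint64_t size;
    uint64_t attr_gencount;   /* replies stamped at or before this are stale */
};

/* Hypothetical model of an attribute reply carried by an RPC. */
struct model_fattr {
    uint64_t size;
    uint64_t gencount;        /* stamped when the RPC was set up */
};

/* One global, monotonically increasing generation counter. */
static uint64_t global_gencount;

static uint64_t next_gencount(void)
{
    return ++global_gencount;
}

/* Stands in for nfs_fattr_init(): stamp the reply before the RPC is sent. */
void model_fattr_init(struct model_fattr *fattr)
{
    fattr->size = 0;
    fattr->gencount = next_gencount();
}

/* Set a barrier: anything stamped before this point will be rejected. */
void model_set_barrier(struct model_inode *inode)
{
    inode->attr_gencount = next_gencount();
}

/* Apply a reply only if it was stamped after the current barrier. */
bool model_refresh(struct model_inode *inode, const struct model_fattr *fattr)
{
    if (fattr->gencount <= inode->attr_gencount)
        return false;                     /* stale reply, dropped */
    inode->size = fattr->size;
    inode->attr_gencount = fattr->gencount;
    return true;
}

/* A self-induced update (e.g. truncate): apply it, then raise the barrier. */
void model_setattr_update(struct model_inode *inode, uint64_t new_size)
{
    inode->size = new_size;
    model_set_barrier(inode);
}

/* Replays the reported race: a readahead starts, then a truncate completes. */
uint64_t model_truncate_race(void)
{
    struct model_inode inode = { .size = 4096, .attr_gencount = 0 };
    struct model_fattr readahead;
    bool applied;

    model_fattr_init(&readahead);       /* readahead RPC is set up first */
    readahead.size = 4096;              /* reply carries the pre-truncate size */

    model_setattr_update(&inode, 0);    /* truncate to 0 finishes in between */

    applied = model_refresh(&inode, &readahead);
    assert(!applied);                   /* the late reply must not win */
    return inode.size;
}
```

In this model `model_truncate_race()` leaves the size at the truncated value because the readahead reply was stamped before the barrier. A reply stamped after the barrier is still applied, so ordinary attribute refreshes keep working.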