path: root/fs/nfs
Commit message    Author    Age    Files    Lines
...
| * | Merge branch 'nfs-rdma'    Trond Myklebust    2016-07-24    1    -1/+5
| |\ \
| | * | NFS: Don't drop CB requests with invalid principals    Chuck Lever    2016-07-11    1    -1/+5
| | |/      Before commit 778be232a207 ("NFS do not find client in NFSv4 pg_authenticate"), the Linux callback server replied with RPC_AUTH_ERROR / RPC_AUTH_BADCRED, instead of dropping the CB request. Let's restore that behavior so the server has a chance to do something useful about it, and provide a warning that helps admins correct the problem.
| | |       Fixes: 778be232a207 ("NFS do not find client in NFSv4 ...")
| | |       Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| | |       Tested-by: Steve Wise <swise@opengridcomputing.com>
| | |       Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | Merge branch 'pnfs'    Trond Myklebust    2016-07-24    6    -136/+218
| |\ \
| | * | pNFS: Remove redundant smp_mb() from pnfs_init_lseg()    Trond Myklebust    2016-07-24    1    -1/+0
| | | |     It's not visible yet, and won't be until after we grab the inode->i_lock.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Cleanup - do layout segment initialisation in one place    Trond Myklebust    2016-07-24    1    -4/+6
| | | |     ...instead of splitting the initialisation over init_lseg() and pnfs_layout_process().
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Remove redundant stateid invalidation    Trond Myklebust    2016-07-24    1    -1/+0
| | | |     The layout stateid will be invalidated once it holds no more layout segments anyway.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Remove redundant pnfs_mark_layout_returned_if_empty()    Trond Myklebust    2016-07-24    4    -16/+0
| | | |     That's already being taken care of in pnfs_layout_remove_lseg().
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Clear the layout metadata if the server changed the layout stateid    Trond Myklebust    2016-07-24    1    -1/+1
| | | |     If the server changed the layout stateid's "other" field, then we should treat the old layout as being completely gone. In that case, we want to clear the metadata such as scheduled layoutreturns. Do this by calling pnfs_mark_layout_stateid_invalid().
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Cleanup - don't open code pnfs_mark_layout_stateid_invalid()    Trond Myklebust    2016-07-24    4    -5/+5
| | | |     Ensure nfs42_layoutstat_done() and layoutget don't open code layout stateid invalidation.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | NFS: pnfs_mark_matching_lsegs_return() should match the layout sequence id    Trond Myklebust    2016-07-24    1    -14/+23
| | | |     When determining which layout segments to return, we do want pnfs_mark_matching_lsegs_return to check that they match the layout sequence id. This ensures that we don't waste time if the server is replaying a layout recall that has already been satisfied.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Do not set plh_return_seq for non-callback related layoutreturns    Trond Myklebust    2016-07-24    1    -7/+6
| | | |     In cases where we need to send a layoutreturn in order to propagate an error, we should not tie that to a specific layout stateid.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Ensure layoutreturn acts as a completion for layout callbacks    Trond Myklebust    2016-07-24    1    -15/+25
| | | |     When we return NFS_OK to the CB_LAYOUTRECALL, we are required to send a layoutreturn that "completes" that layout recall request, using the correct stateid.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Fix CB_LAYOUTRECALL stateid verification    Trond Myklebust    2016-07-24    1    -19/+44
| | | |     We want to evaluate in this order:
| | | |     - If the client holds no layout for this inode, then return NFS4ERR_NOMATCHING_LAYOUT; it probably forgot the layout.
| | | |     - If the client finds the inode among the list of layouts, but the corresponding stateid has not yet been initialised, then return NFS4ERR_DELAY to ask the server to retry once the outstanding LAYOUTGET is complete.
| | | |     - If the current layout stateid's "other" field does not match the recalled stateid, return NFS4ERR_BAD_STATEID.
| | | |     - If already processing a layout recall with a newer stateid, return NFS4ERR_OLD_STATEID. This can only happen for servers that are non-compliant with the NFSv4.1 protocol.
| | | |     - If already processing a layout recall with an older stateid, return NFS4ERR_DELAY to ask the server to retry once the outstanding LAYOUTRETURN is complete. Again, this is technically non-compliant with the NFSv4.1 protocol.
| | | |     - If the current layout sequence id is newer than the recalled stateid's sequence id, return NFS4ERR_OLD_STATEID. This too implies protocol non-compliance.
| | | |     - If the current layout sequence id is older than the recalled stateid's sequence id+1, return NFS4ERR_DELAY.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
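The evaluation order in the commit message above can be sketched as a small userspace C function. This is a minimal illustration, not the kernel's code: the types `sketch_stateid` and `sketch_layout`, the `recall_status` enum and the function `sketch_check_recall()` are all hypothetical names. The sequence-id comparisons ignore wraparound, and the exact boundary handling of the last two checks (strictly newer, and more than one ahead) is one reading of the message, not something it states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative result codes, named after the NFSv4.1 errors in the message. */
enum recall_status {
    RECALL_OK,
    RECALL_NOMATCHING_LAYOUT,
    RECALL_DELAY,
    RECALL_BAD_STATEID,
    RECALL_OLD_STATEID,
};

/* Hypothetical, simplified stateid: a sequence id plus an opaque "other" field. */
struct sketch_stateid {
    uint32_t seqid;
    unsigned char other[12];
};

/* Hypothetical per-inode layout state tracked by the client. */
struct sketch_layout {
    bool stateid_valid;           /* false while the first LAYOUTGET is outstanding */
    struct sketch_stateid stateid;
    bool recall_in_progress;      /* a layout recall is already being processed */
    uint32_t recall_seqid;        /* sequence id of that in-progress recall */
};

/* Apply the checks in the order listed in the commit message. */
static enum recall_status
sketch_check_recall(const struct sketch_layout *lo,
                    const struct sketch_stateid *recalled)
{
    /* No layout held for this inode. */
    if (lo == NULL)
        return RECALL_NOMATCHING_LAYOUT;

    /* Layout found, but its stateid is not initialised yet. */
    if (!lo->stateid_valid)
        return RECALL_DELAY;

    /* The "other" field must match the recalled stateid. */
    if (memcmp(lo->stateid.other, recalled->other, sizeof(recalled->other)) != 0)
        return RECALL_BAD_STATEID;

    /* Already processing a recall: compare against its sequence id. */
    if (lo->recall_in_progress) {
        if (lo->recall_seqid > recalled->seqid)
            return RECALL_OLD_STATEID;   /* in-progress recall is newer */
        if (lo->recall_seqid < recalled->seqid)
            return RECALL_DELAY;         /* in-progress recall is older */
        return RECALL_OK;
    }

    /* Assumption: "newer" is read as strictly greater. */
    if (lo->stateid.seqid > recalled->seqid)
        return RECALL_OLD_STATEID;

    /* Assumption: the recall is more than one step ahead of what we hold. */
    if (recalled->seqid > lo->stateid.seqid + 1)
        return RECALL_DELAY;

    return RECALL_OK;
}
```

Ordering the checks this way means the cheapest and most definitive failures (no layout, uninitialised stateid, mismatched "other" field) are reported before any sequence-id comparison is attempted.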
| | * | pNFS: Always update the layout barrier seqid on LAYOUTGET    Trond Myklebust    2016-07-24    1    -13/+14
| | | |     Currently, pnfs_set_layout_stateid() will update the layout sequence id barrier only if the stateid itself is newer than the current layout stateid. However, in a situation where multiple LAYOUTGET calls and a LAYOUTRETURN raced, it is entirely possible for one of the LAYOUTGET calls to set the current stateid to something newer than the LAYOUTRETURN that needs to set the barrier. The fix is to allow the "update_barrier" flag to force a check as to whether or not the barrier needs to be updated.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Always update the layout stateid if NFS_LAYOUT_INVALID_STID is set    Trond Myklebust    2016-07-24    1    -1/+1
| | | |     If the layout stateid is invalid, then pnfs_set_layout_stateid() must always initialise it.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Clear the layout return tracking on layout reinitialisation    Trond Myklebust    2016-07-24    1    -5/+14
| | | |     Ensure that we don't carry over layoutreturn info from a previous incarnation of this layout.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: LAYOUTRETURN should only update the stateid if the layout is valid    Trond Myklebust    2016-07-24    2    -1/+6
| | | |     If the layout was completely returned, then ignore the returned layout stateid.
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | Merge commit 'e7bdea7750eb'    Trond Myklebust    2016-07-24    8    -23/+45
| | |\|     Needed in order to work on top of pNFS changes in Linus' upstream kernel.
| | * | Fix NULL pointer dereference in bl_free_device().    Artem Savkov    2016-07-22    1    -9/+13
| | | |     When bl_parse_deviceid() fails in bl_alloc_deviceid_node() at the blkdev_get_by_*() step, we get a pnfs_block_dev struct that is uninitialized except for the bdev field, which is set to whatever error blkdev_get_by_*() returns. bl_free_device() then tries to call blkdev_put() if bdev is not 0, resulting in a wrong pointer dereference. Fix this by setting bdev in struct pnfs_block_dev only if we didn't get an error from blkdev_get_by_*().
| | | |     Signed-off-by: Artem Savkov <asavkov@redhat.com>
| | | |     Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
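The fix described above follows a common pattern: keep the result of the open call in a local variable and store it in the long-lived struct only on success, so that the cleanup path never sees an error value where it expects a pointer. The sketch below shows the pattern in plain userspace C. Every name in it is hypothetical: the `sketch_*` helpers are simplified stand-ins for the kernel's error-pointer convention, not the kernel's own definitions, and the patch itself is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Userspace stand-ins for an error-pointer convention (hypothetical helpers). */
static void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static int sketch_is_err(const void *p)
{
    /* Error values occupy the top SKETCH_MAX_ERRNO addresses. */
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Hypothetical block device handle and the struct that owns a reference. */
struct sketch_bdev {
    int open_count;
};

struct sketch_block_dev {
    struct sketch_bdev *bdev;   /* NULL unless a device was opened successfully */
};

/* Hypothetical opener: returns the device, or an encoded error on failure. */
static struct sketch_bdev *sketch_open(struct sketch_bdev *backing, int fail)
{
    if (fail)
        return sketch_err_ptr(-2);
    backing->open_count++;
    return backing;
}

/* Store the pointer in the struct only when the open succeeded. */
static long sketch_parse(struct sketch_block_dev *d,
                         struct sketch_bdev *backing, int fail)
{
    struct sketch_bdev *bdev = sketch_open(backing, fail);

    if (sketch_is_err(bdev))
        return sketch_ptr_err(bdev);   /* d->bdev is left untouched (NULL) */

    d->bdev = bdev;
    return 0;
}

/* Cleanup can now trust that a non-NULL bdev is a real device. */
static void sketch_free(struct sketch_block_dev *d)
{
    if (d->bdev) {
        d->bdev->open_count--;
        d->bdev = NULL;
    }
}
```

With this arrangement, calling `sketch_free()` after a failed `sketch_parse()` is a no-op, which is the property the commit message says the original code lacked.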
| | * | nfs/blocklayout: Check max uuids and devices before decoding    Kinglong Mee    2016-07-15    1    -2/+12
| | | |     Guard against the NFS server returning more uuids/devices than the maximum.
| | | |     Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | nfs/blocklayout: Make sure calculate signature length aligned    Kinglong Mee    2016-07-15    1    -1/+2
| | | |     Guard against a bad NFS server returning an unaligned signature length.
| | | |     Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | nfs/blocklayout: support RH/Fedora dm-mpath device nodes    Christoph Hellwig    2016-07-15    1    -1/+25
| | | |     Instead of reusing the wwn-* names for multipath device nodes, RHEL and Fedora introduce new dm-mpath-uuid-* nodes with a slightly different naming scheme. Try these names first to ensure we always get a multipath-capable device if it exists.
| | | |     Signed-off-by: Christoph Hellwig <hch@lst.de>
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | nfs/blocklayout: refactor open-by-wwn    Christoph Hellwig    2016-07-15    1    -26/+27
| | | |     The current code works with the standard udev/systemd names, but we'll have to add another method in the next patch. Refactor it into a separate helper to make room for the new variant.
| | | |     Signed-off-by: Christoph Hellwig <hch@lst.de>
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | nfs/blocklayout: use proper fmode for opening block devices    Christoph Hellwig    2016-07-15    1    -2/+2
| | | |     This was fixed for the original block layout code a while ago, but also needs to be fixed for the SCSI layout path.
| | | |     Signed-off-by: Christoph Hellwig <hch@lst.de>
| | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | Merge branch 'writeback'    Trond Myklebust    2016-07-24    17    -280/+437
| |\ \ \
| | * | | pNFS/files: filelayout_write_done_cb must call nfs_writeback_update_inode()    Trond Myklebust    2016-07-21    1    -0/+6
| | | | |     All write callbacks are required to call nfs_writeback_update_inode() upon success to ensure that file size changes are recorded, and the attribute cache is invalidated.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | pNFS: Don't mark the inode as revalidated if a LAYOUTCOMMIT is outstanding    Trond Myklebust    2016-07-18    2    -1/+11
| | | | |     We know that the attributes will need updating if there is still a LAYOUTCOMMIT outstanding.
| | | | |     Reported-by: Christoph Hellwig <hch@lst.de>
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFSv4: Revert "Truncating file opens should also sync O_DIRECT writes"    Trond Myklebust    2016-07-14    1    -1/+1
| | | | |     We're not holding any locks, so both nfs_wb_all() and inode_dio_wait() are unenforceable and have livelock potential. Just limit ourselves to flushing out the data.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS nfs_vm_page_mkwrite: Don't freeze me, Bro...    Trond Myklebust    2016-07-05    1    -0/+3
| | | | |     Prevent filesystem freezes while handling the write page fault.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFSv4.2: llseek(SEEK_HOLE) and llseek(SEEK_DATA) don't require data sync    Trond Myklebust    2016-07-05    1    -1/+5
| | | | |     We want to ensure that we write the cached data to the server, but don't require it be synced to disk. If the server reboots, we will get a stateid error, which will cause us to retry anyway.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFSv4.2: Fix writeback races in nfs4_copy_file_range    Trond Myklebust    2016-07-05    4    -13/+31
| | | | |     We need to ensure that any writes to the destination file are serialised with the copy, meaning that the writeback has to occur under the inode lock. Also relax the writeback requirement on the source, and rely on the stateid checking to tell us if the source rebooted. Add the helper nfs_filemap_write_and_wait_range() to call pnfs_sync_inode() as is appropriate for pNFS servers that may need a layoutcommit.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFSv4.2: Fix a race in nfs42_proc_deallocate()    Trond Myklebust    2016-07-05    1    -2/+4
| | | | |     When punching holes in a file, we want to ensure the operation is serialised w.r.t. other writes, meaning that we want to call nfs_sync_inode() while holding the inode lock.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Getattr doesn't require data sync semantics    Trond Myklebust    2016-07-05    1    -3/+1
| | | | |     When retrieving stat() information, NFS unfortunately does require us to sync writes to disk in order to ensure that mtime and ctime are up to date. However, we shouldn't have to ensure that those writes are persisted. Relaxing that requirement does mean that we may see an mtime/ctime change if the server reboots and forces us to replay all writes. The exceptions to this rule are pNFS clients that are required to send layoutcommit; however, that is dealt with by the call to pnfs_sync_inode() in _nfs_revalidate_inode().
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Do not aggressively cache file attributes in the case of O_DIRECT    Trond Myklebust    2016-07-05    2    -2/+12
| | | | |     A file that is open for O_DIRECT is by definition not obeying close-to-open cache consistency semantics, so let's not cache the attributes too aggressively either.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Remove unused function nfs_revalidate_mapping_protected()    Trond Myklebust    2016-07-05    1    -34/+4
| | | | |     Clean up...
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Remove redundant waits for O_DIRECT in fsync() and write_begin()    Trond Myklebust    2016-07-05    1    -6/+0
| | | | |     We're now waiting immediately after taking the locks, so waiting in fsync() and write_begin() is either redundant or potentially subject to livelock (if not holding the lock).
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Cleanup nfs_direct_complete()    Trond Myklebust    2016-07-05    1    -7/+5
| | | | |     There is only one caller that sets the "write" argument to true, so just move the call to nfs_zap_mapping() and get rid of the now redundant argument.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Do not serialise O_DIRECT reads and writes    Trond Myklebust    2016-07-05    5    -37/+173
| | | | |     Allow dio requests to be scheduled in parallel, while ensuring that they do not conflict with buffered I/O.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Move buffered I/O locking into nfs_file_write()    Trond Myklebust    2016-07-05    1    -12/+15
| | | | |     Preparation for the patch that de-serialises O_DIRECT reads and writes.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS Cleanup: move call to generic_write_checks() into fs/nfs/direct.c    Trond Myklebust    2016-07-05    2    -9/+9
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Remove racy size manipulations in O_DIRECT    Trond Myklebust    2016-07-05    1    -16/+0
| | | | |     On success, the RPC callbacks will ensure that we make the appropriate calls to nfs_writeback_update_inode().
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Ensure we reset the write verifier 'committed' value on resend.    Trond Myklebust    2016-07-05    2    -0/+19
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Fix O_DIRECT verifier problems    Trond Myklebust    2016-07-05    3    -3/+16
| | | | |     We should not be interested in looking at the value of the stable field, since that could take any value.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | pNFS: pnfs_layoutcommit_outstanding() is no longer used when !CONFIG_NFS_V4_1    Trond Myklebust    2016-07-05    1    -7/+0
| | | | |     Cleanup...
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | pNFS: Ensure we layoutcommit before revalidating attributes    Trond Myklebust    2016-07-05    1    -16/+7
| | | | |     If we need to update the cached attributes, then we'd better make sure that we also layoutcommit first. Otherwise, the server may have stale attributes. Prior to this patch, the revalidation code tried to "fix" this problem by simply disabling attributes that would be affected by the layoutcommit. That approach breaks nfs_writeback_check_extend(), leading to a file size corruption.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | pNFS: Files and flexfiles always need to commit before layoutcommit    Trond Myklebust    2016-07-05    5    -9/+30
| | | | |     So ensure that we mark the layout for commit once the write is done, and then ensure that the commit to ds is finished before sending layoutcommit. Note that by doing this, we're able to optimise away the commit for the case of servers that don't need layoutcommit in order to return updated attributes.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | pNFS/flexfiles: Clean up calls to pnfs_set_layoutcommit()    Trond Myklebust    2016-07-05    1    -9/+10
| | | | |     Let's just have one place where we check ff_layout_need_layoutcommit().
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | pNFS/flexfiles: Fix layoutcommit after a commit to DS    Trond Myklebust    2016-07-05    1    -2/+1
| | | | |     We should always do a layoutcommit after commit to DS, except if the layout segment we're using has set FF_FLAGS_NO_LAYOUTCOMMIT.
| | | | |     Fixes: d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout Driver")
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | pNFS/files: Fix layoutcommit after a commit to DS    Trond Myklebust    2016-07-05    1    -2/+1
| | | | |     According to the errata https://www.rfc-editor.org/errata_search.php?rfc=5661&eid=2751 we should always send layout commit after a commit to DS.
| | | | |     Fixes: bc7d4b8fd091 ("nfs/filelayout: set layoutcommit...")
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | NFS: Don't call COMMIT in ->releasepage()    Trond Myklebust    2016-06-22    1    -23/+0
| | | | |     While COMMIT has the potential to free up a lot of memory that is being taken by unstable writes, it isn't guaranteed to free up this particular page. Also, calling fsync() on the server is expensive and so we want to do it in a more controlled fashion, rather than have it triggered at random by the VM.
| | | | |     Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>