summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* ocfs2: Add ->check_downconvert callback in dlmglueMark Fasheh2006-09-241-1/+18
| | | | | | This will allow lock types to force a requeue of a lock downconvert. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Check for refreshing locks in generic unblock functionMark Fasheh2006-09-241-12/+19
| | | | | | Tidy up the exit path a bit too. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: don't unconditionally pass LVB flagsMark Fasheh2006-09-241-3/+15
| | | | | | | | | Allow a lock type to specifiy whether it makes use of the LVB. The only type which does this right now is the meta data lock. This should save us some space on network messages since they won't have to needlessly transmit value blocks. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: combine inode and generic blocking AST functionsMark Fasheh2006-09-241-112/+11
| | | | | | | There is extremely little difference between the two now. We can remove the callback from ocfs2_lock_res_ops as well. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Add ->get_osb() dlmglue locking operationMark Fasheh2006-09-241-0/+33
| | | | | | Will be used to find the ocfs2_super structure from a given lockres. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: remove ->unlock_ast() callback from ocfs2_lock_res_opsMark Fasheh2006-09-241-13/+3
| | | | | | | This was always defined to the same function in all locks, so clean things up by removing and passing ocfs2_unlock_ast() directly to the DLM. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: combine inode and generic AST functionsMark Fasheh2006-09-241-110/+10
| | | | | | | There is extremely little difference between the two now. We can remove the callback from ocfs2_lock_res_ops as well. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Clean up lock resource refresh flagsMark Fasheh2006-09-241-14/+35
| | | | | | | Use of the refresh mechanism is lock-type wide, so move knowledge of that to the ocfs2_lock_res_ops structure. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove i_generation from inode lock namesMark Fasheh2006-09-2410-53/+170
| | | | | | | | | | | | | | | | | | | | | | | OCFS2 puts inode meta data in the "lock value block" provided by the DLM. Typically, i_generation is encoded in the lock name so that a deleted inode on and a new one in the same block don't share the same lvb. Unfortunately, that scheme means that the read in ocfs2_read_locked_inode() is potentially thrown away as soon as the meta data lock is taken - we cannot encode the lock name without first knowing i_generation, which requires a disk read. This patch encodes i_generation in the inode meta data lvb, and removes the value from the inode meta data lock name. This way, the read can be covered by a lock, and at the same time we can distinguish between an up to date and a stale LVB. This will help cold-cache stat(2) performance in particular. Since this patch changes the protocol version, we take the opportunity to do a minor re-organization of two of the LVB fields. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Encode i_generation in the meta data lvbMark Fasheh2006-09-242-7/+12
| | | | | | | | When i_generation is removed from the lockname, this will help us determine whether a meta data lvb has information that is in sync with the local struct inode. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Free up some space in the lvbMark Fasheh2006-09-242-4/+6
| | | | | | | | lvb_version doesn't need to be a whole 32 bits. Make it an 8 bit field to free up some space. This should be backwards compatible until we use one of the fields, in which case we'd bump the lvb version anyway. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove special casing for inode creation in ocfs2_dentry_attach_lock()Mark Fasheh2006-09-243-37/+14
| | | | | | | | We can't use LKM_LOCAL for new dentry locks because an unlink and subsequent re-create of a name/inode pair may result in the lock still being mastered somewhere in the cluster. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: manually d_move() during ocfs2_rename()Mark Fasheh2006-09-242-2/+5
| | | | | | | | | | Make use of FS_RENAME_DOES_D_MOVE to avoid a race condition that can occur during ->rename() if we d_move() outside of the parent directory cluster locks, and another node discovers the new name (created during the rename) and unlinks it. d_move() will unconditionally rehash a dentry - which will leave stale data in the system. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] Allow file systems to manually d_move() inside of ->rename()Mark Fasheh2006-09-243-10/+9
| | | | | | | | | | | | | | | | Some file systems want to manually d_move() the dentries involved in a rename. We can do this by making use of the FS_ODD_RENAME flag if we just have nfs_rename() unconditionally do the d_move(). While there, we rename the flag to be more descriptive. OCFS2 uses this to protect that part of the rename operation with a cluster lock. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
* ocfs2: Remove the dentry voteMark Fasheh2006-09-242-183/+2
| | | | | | This is unused now. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Hook rest of the file system into dentry locking APIMark Fasheh2006-09-244-41/+94
| | | | | | | Actually replace the vote calls with the new dentry operations. Make any necessary adjustments to get the scheme to work. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Add dentry tracking APIMark Fasheh2006-09-243-32/+369
| | | | | | | | | | | | | | | | | Replace the dentry vote mechanism with a cluster lock which covers a set of dentries. This allows us to force d_delete() only on nodes which actually care about an unlink. Every node that does a ->lookup() gets a read only lock on the dentry, until an unlink during which the unlinking node, will request an exclusive lock, forcing the other nodes who care about that dentry to d_delete() it. The effect is that we retain a very lightweight ->d_revalidate(), and at the same time get to make large improvements to the average case performance of the ocfs2 unlink and rename operations. This patch adds the higher level API and the dentry manipulation code. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Add new cluster lock typeMark Fasheh2006-09-245-104/+436
| | | | | | | | | | | | | | | | | | | Replace the dentry vote mechanism with a cluster lock which covers a set of dentries. This allows us to force d_delete() only on nodes which actually care about an unlink. Every node that does a ->lookup() gets a read only lock on the dentry, until an unlink during which the unlinking node, will request an exclusive lock, forcing the other nodes who care about that dentry to d_delete() it. The effect is that we retain a very lightweight ->d_revalidate(), and at the same time get to make large improvements to the average case performance of the ocfs2 unlink and rename operations. This patch adds the cluster lock type which OCFS2 can attach to dentries. A small number of fs/ocfs2/dcache.c functions are stubbed out so that this change can compile. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Update dlmglue for new dlmlock() APIMark Fasheh2006-09-241-0/+3
| | | | | | | File system lock names are very regular right now, so we really only need to pass an extra parameter to dlmlock(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Update dlmfs for new dlmlock() APIMark Fasheh2006-09-242-51/+31
| | | | | | | | We just need to add a namelen field to the user_lock_res structure, and update a few debug prints. Instead of updating all debug prints, I took the opportunity to remove a few that are likely unnecessary these days. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Allow binary names in the DLMMark Fasheh2006-09-245-8/+11
| | | | | | | | The OCFS2 DLM uses strlen() to determine lock name length, which excludes the possibility of putting binary values in the name string. Fix this by requiring that string length be passed in as a parameter. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Silence dlm error printMark Fasheh2006-09-241-3/+3
| | | | | | | An AST can be delivered via the network after a lock has been removed, so no need to print an error when we see that. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* Move several *_SUPER_MAGIC symbols to include/linux/magic.h.Jeff Garzik2006-09-249-7/+5
| | | | Signed-off-by: Jeff Garzik <jeff@garzik.org>
* NFS: unmark NFS direct I/O as experimentalChuck Lever2006-09-221-2/+2
| | | | | | | | | | | Remove the EXPERIMENTAL flag from the NFS_DIRECTIO option. Test plan: Unset the EXPERIMENTAL kernel build option and check to see that the NFS direct I/O option is still available. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: add comments clarifying the use of nfs_post_op_update()Chuck Lever2006-09-222-1/+13
| | | | | | | | | | | Comments-only change to clarify a detail of the NFS protocol and how it is implemented in Linux. Test plan: None. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Use SEEK_END instead of hardcoded valueJosef 'Jeff' Sipek2006-09-221-1/+1
| | | | | Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: When mounting with a port=0 argument, substitute port=2049Trond Myklebust2006-09-221-0/+3
| | | | | | | | | RFC3530 states that the registered port 2049 for the NFS protocol should be the default configuration in order to allow clients not to use the RPC binding protocols. If the mount program sends us a port=0, we therefore substitute port=2049. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Poll more aggressively when handling NFS4ERR_DELAYTrond Myklebust2006-09-221-1/+1
| | | | | | | Change the initial retry delay from 1s to 0.1s (and then back off exponentially). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Handle the condition NFS4ERR_FILE_OPENTrond Myklebust2006-09-221-0/+1
| | | | | | | Retry a few times before we give up: the error is usually due to ordering issues with asynchronous RPC calls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Retry lease recovery if it failed during a synchronous operation.Trond Myklebust2006-09-221-2/+9
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Don't invalidate the symlink we just stuffed into the cacheTrond Myklebust2006-09-221-7/+5
| | | | | | | And slight optimisation of nfs_end_data_update(): directories never have delegations anyway. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Make read() return an ESTALE if the file has been deletedTrond Myklebust2006-09-221-3/+16
| | | | | | | | | | | Currently, a read() request will return EIO even if the file has been deleted on the server, simply because that is what the VM will return if the call to readpage() fails to update the page. Ensure that readpage() marks the inode as stale if it receives an ESTALE. Then return that error to userland. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: It's perfectly legal for clp to be NULL here....J. Bruce Fields2006-09-221-1/+1
| | | | | Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: nfs_lookup - don't hash dentry when optimising away the lookupTrond Myklebust2006-09-221-3/+11
| | | | | | | | | | | | | | | | If the open intents tell us that a given lookup is going to result in a, exclusive create, we currently optimize away the lookup call itself. The reason is that the lookup would not be atomic with the create RPC call, so why do it in the first place? A problem occurs, however, if the VFS aborts the exclusive create operation after the lookup, but before the call to create the file/directory: in this case we will end up with a hashed negative dentry in the dcache that has never been looked up. Fix this by only actually hashing the dentry once the create operation has been successfully completed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Fix a referral error Oopsandros@citi.umich.edu2006-09-221-0/+2
| | | | | | | | Fix an oops when the referral server is not responding. Check the error return from nfs4_set_client() in nfs4_create_referral_server. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: NFS_ROOT should use the new rpc_create APIChuck Lever2006-09-221-16/+13
| | | | | | | | | | | Teach NFS_ROOT to use the new rpc_create API instead of the old two-call API for creating an RPC transport. Test plan: Compile the kernel with the NFS client build-in, and set CONFIG_NFS_ROOT. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix up compiler warnings on 64-bit platforms in client.cDavid Howells2006-09-221-6/+14
| | | | | | | Fix up warnings from compiling on ppc64. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Make rpc_mkpipe() take the parent dentry as an argumentTrond Myklebust2006-09-221-5/+1
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Fix a use-after-free issue with the nfs server.Trond Myklebust2006-09-223-18/+27
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Add a real API for dealing with blk_congestion_wait()Trond Myklebust2006-09-221-0/+1
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Use cached page as buffer for NFS symlink requestsChuck Lever2006-09-227-35/+49
| | | | | | | | | | | Now that we have a copy of the symlink path in the page cache, we can pass a struct page down to the XDR routines instead of a string buffer. Test plan: Connectathon, all NFS versions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: copy symlinks into page cache before sending NFS SYMLINK requestChuck Lever2006-09-221-18/+68
| | | | | | | | | | | | | Currently the NFS client does not cache symlinks it creates. They get cached only when the NFS client reads them back from the server. Copy the symlink into the page cache before sending it. Test plan: Connectathon, all NFS versions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix double d_drop in nfs_instantiate() error pathChuck Lever2006-09-224-45/+57
| | | | | | | | | | | | | | | | | | | If the LOOKUP or GETATTR in nfs_instantiate fail, nfs_instantiate will do a d_drop before returning. But some callers already do a d_drop in the case of an error return. Make certain we do only one d_drop in all error paths. This issue was introduced because over time, the symlink proc API diverged slightly from the create/mkdir/mknod proc API. To prevent other coding mistakes of this type, change the symlink proc API to be more like create/mkdir/mknod and move the nfs_instantiate call into the symlink proc routines so it is used in exactly the same way for create, mkdir, mknod, and symlink. Test plan: Connectathon, all versions of NFS. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: remove a no-longer-needed error check in nfs_symlink()Chuck Lever2006-09-221-6/+2
| | | | | | | | | | | | | | | | | In the early days of NFS, there was no duplicate reply cache on the server. Thus retransmitted non-idempotent requests often found that the request had already completed on the server. To avoid passing an unanticipated return code to unsuspecting applications, NFS clients would often shunt error codes that implied the request had been retried but already completed. Thanks to NFS over TCP, duplicate reply caches on the server, and network performance and reliability improvements, it is safe to remove such checks. Test plan: None. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSD: Convert NFS server callback logic to use new rpc_create APIChuck Lever2006-09-221-39/+27
| | | | | | | | | | | Replace xprt_create_proto/rpc_create_client call in NFS server callback functions to use new rpc_create() API. Test plan: NFSv4 delegation functionality tests. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Convert NFS client to use new rpc_create() APIChuck Lever2006-09-221-16/+11
| | | | | | | | | | | Convert NFS client mount logic to use rpc_create() instead of the old xprt_create_proto/rpc_create_client API. Test plan: Mount stress tests. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* LOCKD: Convert to use new rpc_create() APIChuck Lever2006-09-222-47/+44
| | | | | | | | | | | | | | | Replace xprt_create_proto/rpc_create_client with new rpc_create() interface in the Network Lock Manager. Note that the semantics of NLM transports is now "hard" instead of "soft" to provide a better guarantee that lock requests will get to the server. Test plan: Repeated runs of Connectathon locking suite. Check network trace to ensure NLM requests are working correctly. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Clean-up after previous patches.Chuck Lever2006-09-221-1/+0
| | | | | | | | | | Remove some unused macros related to accessing an RPC peer address Test plan: Compile kernel with CONFIG_NFS option enabled. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: remove extraneous header inclusionsChuck Lever2006-09-221-1/+0
| | | | | | | | | | | | include/linux/sunrpc/clnt.h already includes include/linux/sunrpc/xprt.h. We can remove xprt.h from source files that already include clnt.h. Likewise include/linux/sunrpc/timer.h. Test plan: Compile kernel with CONFIG_NFS enabled. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* LOCKD: Teach lockd to use the new rpc_peeraddr() APIChuck Lever2006-09-221-5/+5
| | | | | | | | | | | | Hide the details of how the RPC client stores remote peer addresses from the Network Lock Manager. Test plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
OpenPOWER on IntegriCloud