summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* NFS: support large reads and writes on the wireChuck Lever2006-01-065-29/+40
| | | | | | | | | | | | | | | | | Most NFS server implementations allow up to 64KB reads and writes on the wire. The Solaris NFS server allows up to a megabyte, for instance. Now the Linux NFS client supports transfer sizes up to 1MB, too. This will help reduce protocol and context switch overhead on read/write intensive NFS workloads, and support larger atomic read and write operations on servers that support them. Test-plan: Connectathon and iozone on mount point with wsize=rsize>32768 over TCP. Tests with NFS over UDP to verify the maximum RPC payload size cap. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: make "inode number mismatch" message more usefulChuck Lever2006-01-061-8/+9
| | | | | | | | | | | To help NFS users and server developers, make the "inode number mismatch" message display more useful information. Test-plan: None. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: get rid of useless kernel log messageChuck Lever2006-01-061-2/+1
| | | | | | | | | | | nfs_statfs() generates a log message when GETATTR returns an error. This is usually a useless message. Make it a dprintk. Test plan: None Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix error recovery code in fs/nfs/inode.c:__init_nfs()Chuck Lever2006-01-061-2/+2
| | | | | | | Red Hat found a problem in the error recovery logic in __init_nfs. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: use generic_write_checks() to sanity check direct writesChuck Lever2006-01-061-24/+19
| | | | | | | | | | | | | Replace ad hoc write parameter sanity checking in nfs_file_direct_write() with a call to generic_write_checks(). This should make the proper checks modulo the O_LARGEFILE flag, and should catch NFSv2-specific limitations by virtue of i_sb->s_maxbytes. Test plan: Posix compliance testing with both NFSv2 and NFSv3. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Remove requirement for machine creds for the "setclientid" operationTrond Myklebust2006-01-064-49/+52
| | | | | | Use a cred from the nfs4_client->cl_state_owners list. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Remove requirement for machine creds for the "renew" operationTrond Myklebust2006-01-064-25/+41
| | | | | | | | | | | | | | | | | In RFC3530, the RENEW operation is allowed to use either the same principal, RPC security flavour and (if RPCSEC_GSS), the same mechanism and service that was used for SETCLIENTID_CONFIRM OR Any principal, RPC security flavour and service combination that currently has an OPEN file on the server. Choose the latter since that doesn't require us to keep credentials for the same principal for the entire duration of the mount. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Send RENEW requests to the server only when we're holding stateTrond Myklebust2006-01-065-2/+75
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Convert instances of kernel_thread() to kthread()Trond Myklebust2006-01-061-30/+16
| | | | | | | Convert private implementations in NFSv4 state recovery and delegation code to use kthreads. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: State recovery cleanupTrond Myklebust2006-01-063-25/+28
| | | | | | Use wait_on_bit() when waiting for state recovery to complete. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: OPEN/LOCK/LOCKU/CLOSE will automatically renew the NFSv4 leaseTrond Myklebust2006-01-061-3/+31
| | | | | | Cut down on the number of unnecessary RENEW requests on the wire. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Ensure that SIGKILL will always terminate a synchronous RPC call.Trond Myklebust2006-01-061-2/+2
| | | | | | | ...and make sure that the "intr" flag also enables SIGHUP and SIGTERM to interrupt RPC calls too (as per the Solaris implementation). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make DELEGRETURN an interruptible operation.Trond Myklebust2006-01-061-8/+60
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Convert LOCK rpc call into an asynchronous RPC callTrond Myklebust2006-01-062-75/+175
| | | | | | In order to allow users to interrupt/cancel it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: locking XDR cleanupTrond Myklebust2006-01-062-168/+155
| | | | | | Get rid of some unnecessary intermediate structures Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make open recovery track O_RDWR, O_RDONLY and O_WRONLY correctlyTrond Myklebust2006-01-062-131/+156
| | | | | | | When recovering from a delegation recall or a network partition, we need to replay open(O_RDWR), open(O_RDONLY) and open(O_WRONLY) separately. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make nfs4_state track O_RDWR, O_RDONLY and O_WRONLY separatelyTrond Myklebust2006-01-063-28/+42
| | | | | | | | | | | | | A closer reading of RFC3530 reveals that OPEN_DOWNGRADE must always specify a access modes that have been the argument of a previous OPEN operation. IOW: doing OPEN(O_RDWR) and then OPEN_DOWNGRADE(O_WRONLY) is forbidden unless the user called OPEN(O_WRONLY) In order to fix that, we really need to track the three possible open states separately. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make open_confirm() asynchronous tooTrond Myklebust2006-01-062-28/+80
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Convert open() into an asynchronous RPC callTrond Myklebust2006-01-062-66/+130
| | | | | | | | | OPEN is a stateful operation, so we must ensure that it always completes. In order to allow users to interrupt the operation, we need to make the RPC call asynchronous, and then wait on completion (or cancel). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Allocate OPEN call RPC arguments using kmalloc()Trond Myklebust2006-01-061-96/+117
| | | | | | Cleanup in preparation for making OPEN calls interruptible by the user. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make locku use the new RPC "wait on completion" interface.Trond Myklebust2006-01-061-29/+36
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: stateful NFSv4 RPC call interfaceTrond Myklebust2006-01-061-1/+0
| | | | | | | | | | | | | | The NFSv4 model requires us to complete all RPC calls that might establish state on the server whether or not the user wants to interrupt it. We may also need to schedule new work (including new RPC calls) in order to cancel the new state. The asynchronous RPC model will allow us to ensure that RPC calls always complete, but in order to allow for "synchronous" RPC, we want to add the ability to wait for completion. The waits are, of course, interruptible. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Further cleanupsTrond Myklebust2006-01-062-18/+16
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* RPC: Clean up RPC task structureTrond Myklebust2006-01-0612-140/+181
| | | | | | | | | | Shrink the RPC task structure. Instead of storing separate pointers for task->tk_exit and task->tk_release, put them in a structure. Also pass the user data pointer as a parameter instead of passing it via task->tk_calldata. This enables us to nest callbacks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Work correctly with single-page ->writepage() callsTrond Myklebust2006-01-061-11/+5
| | | | | | | Ensure that we always initiate flushing of data before we exit a single-page ->writepage() call. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge branch 'post-2.6.15' of git://brick.kernel.dk/data/git/linux-2.6-blockLinus Torvalds2006-01-061-2/+24
|\ | | | | | | | | | | | | Manual fixup for merge with Jens' "Suspend support for libata", commit ID 9b847548663ef1039dd49f0eb4463d001e596bc3. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * [BLOCK] bio: check for same page merge possibilities in __bio_add_page()Jens Axboe2006-01-061-2/+24
| | | | | | | | | | | | | | | | | | For filesystems with a blocksize < page size, we can merge same page calls into the bio_vec at the end of the bio. This saves segments on systems with a page size > the "normal" 4kb fs block size. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@suse.de>
* | [PATCH] knfsd: reduce stack consumptionNeil Brown2006-01-061-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A typical nfsd call trace is nfsd -> svc_process -> nfsd_dispatch -> nfsd3_proc_write -> nfsd_write ->nfsd_vfs_write -> vfs_writev These add up to over 300 bytes on the stack. Looking at each of these, I see that nfsd_write (which includes nfsd_vfs_write) contributes 0x8c to stack usage itself!! It turns out this is because it puts a 'struct iattr' on the stack so it can kill suid if needed. The following patch saves about 50 bytes off the stack in this call path. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] knfsd: check error status from vfs_getattr and i_op->fsyncDavid Shaw2006-01-064-55/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both vfs_getattr and i_op->fsync return error statuses which nfsd was largely ignoring. This as noticed when exporting directories using fuse. This patch cleans up most of the offences, which involves moving the call to vfs_getattr out of the xdr encoding routines (where it is too late to report an error) into the main NFS procedure handling routines. There is still a called to vfs_gettattr (related to the ACL code) where the status is ignored, and called to nfsd_sync_dir don't check return status either. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] jbd: split checkpoint listsJan Kara2006-01-061-177/+241
| | | | | | | | | | | | | | | | | | | | | | | | Split the checkpoint list of the transaction into two lists. In the first list we keep the buffers that need to be submitted for IO. In the second list are kept buffers that were already submitted and we just have to wait for the IO to complete. This should simplify a handling of checkpoint lists a bit and can eventually be also a performance gain. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: check file type in lookupMiklos Szeredi2006-01-062-13/+22
| | | | | | | | | | | | | | | | | | | | | | | | Previously invalid types were quietly changed to regular files, but at revalidation the inode was changed to bad. This was rather inconsistent behavior. Now check if the type is valid on initial lookup, and return -EIO if not. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: ensure progress in read and writeMiklos Szeredi2006-01-061-4/+3
| | | | | | | | | | | | | | | | | | | | | | In direct_io mode, send at least one page per reqest. Previously it was possible that reqests with zero data were sent, and hence the read/write didn't make any progress, resulting in an infinite (though interruptible) loop. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: make maximum write data configurableMiklos Szeredi2006-01-063-23/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the maximum size of write data configurable by the filesystem. The previous fixed 4096 limit only worked on architectures where the page size is less or equal to this. This change make writing work on other architectures too, and also lets the filesystem receive bigger write requests in direct_io mode. Normal writes which go through the page cache are still limited to a page sized chunk per request. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: clean up request size limit checkingMiklos Szeredi2006-01-064-27/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the way a too large request is handled. Until now in this case the device read returned -EINVAL and the operation returned -EIO. Make it more flexibible by not returning -EINVAL from the read, but restarting it instead. Also remove the fixed limit on setxattr data and let the filesystem provide as large a read buffer as it needs to handle the extended attribute data. The symbolic link length is already checked by VFS to be less than PATH_MAX, so the extra check against FUSE_SYMLINK_MAX is not needed. The check in fuse_create_open() against FUSE_NAME_MAX is not needed, since the dentry has already been looked up, and hence the name already checked. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: fail file operations on bad inodeMiklos Szeredi2006-01-062-5/+37
| | | | | | | | | | | | | | | | | | Make file operations on a bad inode fail. This just makes things a bit more consistent. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: add code documentationMiklos Szeredi2006-01-061-9/+90
| | | | | | | | | | | | | | | | Document some not-so-trivial functions. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: support caching negative dentriesMiklos Szeredi2006-01-061-21/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for caching negative dentries. Up till now, ->d_revalidate() always forced a new lookup on these. Now let the lookup method return a zero node ID (not used for anything else) meaning a negative entry, but with a positive cache timeout. The old way of signaling negative entry (replying ENOENT) still works. Userspace should check the ABI minor version to see whether sending a zero ID is allowed by the kernel or not. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: add frsize to statfs replyMiklos Szeredi2006-01-061-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | Add 'frsize' member to the statfs reply. I'm not sure if sending f_fsid will ever be needed, but just in case leave some space at the end of the structure, so less compatibility mess would be required. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: bump interface versionMiklos Szeredi2006-01-062-0/+5
| | | | | | | | | | | | | | | | | | | | | | Change interface version to 7.4. Following changes will need backward compatibility support, so store the minor version returned by userspace. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: clean up page offset calculationMiklos Szeredi2006-01-061-4/+3
| | | | | | | | | | | | | | | | Use page_offset() instead of doing page offset calculation by hand. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fuse: clean up fuse_lookup()Miklos Szeredi2006-01-061-52/+23
| | | | | | | | | | | | | | | | Simplify fuse_lookup() and related functions. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] s390: cleanup KconfigMartin Schwidefsky2006-01-062-2/+2
| | | | | | | | | | | | | | | | | | | | Sanitize some s390 Kconfig options. We have ARCH_S390, ARCH_S390X, ARCH_S390_31, 64BIT, S390_SUPPORT and COMPAT. Replace these 6 options by S390, 64BIT and COMPAT. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] s390: cms volume label definitionsPeter Oberparleiter2006-01-061-15/+15
| | | | | | | | | | | | | | | | | | | | Moved definition of CMS volume label to vtoc.h and modify partitions/ibm.c to use this volume label definition instead of anonymous array. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] NOMMU: Provide shared-writable mmap support on ramfsDavid Howells2006-01-065-22/+368
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The attached patch makes ramfs support shared-writable mmaps by: (1) Attempting to perform a contiguous block allocation to the requested size when truncate attempts to increase the file from zero size, such as happens when: fd = shm_open("/file/on/ramfs", ...): ftruncate(fd, size_requested); addr = mmap(NULL, subsize, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED, fd, offset); (2) Permitting any shared-writable mapping over any contiguous set of extant pages. get_unmapped_area() will return the address into the actual ramfs pages. The mapping may start anywhere and be of any size, but may not go over the end of file. Multiple mappings may overlap in any way. (3) Not permitting a file to be shrunk if it would truncate any shared mappings (private mappings are copied). Thus this patch provides support for POSIX shared memory on NOMMU kernels, with certain limitations such as there being a large enough block of pages available to support the allocation and it only working on directly mappable filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] mm: rmap optimisationNick Piggin2006-01-061-1/+1
| | | | | | | | | | | | | | | | | | | | Optimise rmap functions by minimising atomic operations when we know there will be no concurrent modifications. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] Hugetlb: Copy on Write supportDavid Gibson2006-01-061-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement copy-on-write support for hugetlb mappings so MAP_PRIVATE can be supported. This helps us to safely use hugetlb pages in many more applications. The patch makes the following changes. If needed, I also have it broken out according to the following paragraphs. 1. Add a pair of functions to set/clear write access on huge ptes. The writable check in make_huge_pte is moved out to the caller for use by COW later. 2. Hugetlb copy-on-write requires special case handling in the following situations: - copy_hugetlb_page_range() - Copied pages must be write protected so a COW fault will be triggered (if necessary) if those pages are written to. - find_or_alloc_huge_page() - Only MAP_SHARED pages are added to the page cache. MAP_PRIVATE pages still need to be locked however. 3. Provide hugetlb_cow() and calls from hugetlb_fault() and hugetlb_no_page() which handles the COW fault by making the actual copy. 4. Remove the check in hugetlbfs_file_map() so that MAP_PRIVATE mmaps will be allowed. Make MAP_HUGETLB exempt from the depricated VM_RESERVED mapping check. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "Seth, Rohit" <rohit.seth@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] hfsplus oops fixJoshua Kwan2006-01-061-1/+1
|/ | | | | | | | | | | nls_utf8 is available, and the check in hfsplus_fill_super checks the wrong pointer for NULLness (it checks the saved nls, not the new one that it needs to use.) Signed-off-by: Joshua Kwan <joshk@triplehelix.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge http://oss.oracle.com/git/ocfs2Linus Torvalds2006-01-05104-12/+45227
|\
| * [PATCH] o Update Kconfig documentation to reflect support for readonly mounts.Mark Fasheh2006-01-031-1/+0
| | | | | | | | Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
| * [PATCH] This patch contains the following cleanups:Adrian Bunk2006-01-033-8/+3
| | | | | | | | | | | | | | | | | | - cluster/sys.c: make needlessly global code static - dlm/: "extern" declarations for variables belong into header files (and in this case, they are already in dlmdomain.h) Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
OpenPOWER on IntegriCloud