summaryrefslogtreecommitdiffstats
path: root/include/linux/fs.h
Commit message (Collapse)AuthorAgeFilesLines
* dnotify: reimplement dnotify using fsnotifyEric Paris2009-06-111-5/+0
| | | | | | | | Reimplement dnotify using fsnotify. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de>
* fsnotify: add marks to inodes so groups can interpret how to handle those inodesEric Paris2009-06-111-0/+5
| | | | | | | | | | | | | | | | | This patch creates a way for fsnotify groups to attach marks to inodes. These marks have little meaning to the generic fsnotify infrastructure and thus their meaning should be interpreted by the group that attached them to the inode's list. dnotify and inotify will make use of these markings to indicate which inodes are of interest to their respective groups. But this implementation has the useful property that in the future other listeners could actually use the marks for the exact opposite reason, aka to indicate which inodes it had NO interest in. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de>
* Merge branch 'master' into for-2.6.31Jens Axboe2009-05-221-1/+3
|\ | | | | | | | | | | | | | | Conflicts: drivers/block/hd.c drivers/block/mg_disk.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * Fix races around the access to ->s_optionsAl Viro2009-05-091-0/+1
| | | | | | | | | | | | | | | | | | Put generic_show_options read access to s_options under rcu_read_lock, split save_mount_options() into "we are setting it the first time" (uses in foo_fill_super()) and "we are relacing and freeing the old one", synchronize_rcu() before kfree() in the latter. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * Switch open_exec() and sys_uselib() to do_open_filp()Al Viro2009-05-091-1/+1
| | | | | | | | | | | | ... and make path_lookup_open() static Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * New helper: deactivate_locked_super()Al Viro2009-05-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Does equivalent of up_write(&s->s_umount); deactivate_super(s); However, it does not does not unlock it until it's all over. As the result, it's safe to use to dispose of new superblock on ->get_sb() failure exits - nobody will see the sucker until it's all over. Equivalent using up_write/deactivate_super is safe for that purpose if superblock is either safe to use or has NULL ->s_root when we unlock. Normally filesystems take the required precautions, but a) we do have bugs in that area in some of them. b) up_write/deactivate_super sequence is extremely common, so the helper makes sense anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | splice: implement default splice_read methodMiklos Szeredi2009-05-111-0/+2
|/ | | | | | | | | | | | If f_op->splice_read() is not implemented, fall back to a plain read. Use vfs_readv() to read into previously allocated pages. This will allow splice and functions using splice, such as the loop device, to work on all filesystems. This includes "direct_io" files in fuse which bypass the page cache. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* fs: Mark get_filesystem_list() as __init function.Tetsuo Handa2009-04-201-1/+1
| | | | | | | | | "int get_filesystem_list(char * buf)" is called by only "static void __init get_fs_names(char *page)". We can mark get_filesystem_list() as "__init". Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kill vfs_stat_fd / vfs_lstat_fdChristoph Hellwig2009-04-201-2/+0
| | | | | | | | | | There's really no reason to keep vfs_stat_fd and vfs_lstat_fd with Oleg's vfs_fstatat. Use vfs_fstatat for the few cases having the directory fd, and switch all others to vfs_stat / vfs_lstat. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Separate out common fstatat code into vfs_fstatatOleg Drokin2009-04-201-0/+1
| | | | | | | | | | This is a version incorporating Christoph's suggestion. Separate out common *fstatat functionality into a single function instead of duplicating it all over the code. Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* splice: add helpers for locking pipe inodeMiklos Szeredi2009-04-151-3/+0
| | | | | | | | | | | | | | | | | | | | There are lots of sequences like this, especially in splice code: if (pipe->inode) mutex_lock(&pipe->inode->i_mutex); /* do something */ if (pipe->inode) mutex_unlock(&pipe->inode->i_mutex); so introduce helpers which do the conditional locking and unlocking. Also replace the inode_double_lock() call with a pipe_double_lock() helper to avoid spreading the use of this functionality beyond the pipe code. This patch is just a cleanup, and should cause no behavioral changes. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* splice: remove generic_file_splice_write_nolock()Miklos Szeredi2009-04-151-2/+0
| | | | | | | | | Remove the now unused generic_file_splice_write_nolock() function. It's conceptually broken anyway, because splice may need to wait for pipe events so holding locks across the whole operation is wrong. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* Document and move the various READ/WRITE typesJens Axboe2009-04-151-0/+59
| | | | | | | It's a somewhat twisty maze of hints and behavioural modifiers, try and clear it up a bit with some documentation. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* namespaces: move proc_net_get_sb to a generic fs/super.c helperSerge E. Hallyn2009-04-071-0/+3
| | | | | | | | | | | | The mqueuefs filesystem will use this helper as well. Proc's main get_sb could also be made to use it, but that will require a bit more rework. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'kmemtrace-for-linus' of ↵Linus Torvalds2009-04-061-34/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'kmemtrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: kmemtrace: trace kfree() calls with NULL or zero-length objects kmemtrace: small cleanups kmemtrace: restore original tracing data binary format, improve ABI kmemtrace: kmemtrace_alloc() must fill type_id kmemtrace: use tracepoints kmemtrace, rcu: don't include unnecessary headers, allow kmemtrace w/ tracepoints kmemtrace, rcu: fix rcupreempt.c data structure dependencies kmemtrace, rcu: fix rcu_tree_trace.c data structure dependencies kmemtrace, rcu: fix linux/rcutree.h and linux/rcuclassic.h dependencies kmemtrace, mm: fix slab.h dependency problem in mm/failslab.c kmemtrace, kbuild: fix slab.h dependency problem in lib/decompress_unlzma.c kmemtrace, kbuild: fix slab.h dependency problem in lib/decompress_bunzip2.c kmemtrace, kbuild: fix slab.h dependency problem in lib/decompress_inflate.c kmemtrace, squashfs: fix slab.h dependency problem in squasfs kmemtrace, befs: fix slab.h dependency problem kmemtrace, security: fix linux/key.h header file dependencies kmemtrace, fs: fix linux/fdtable.h header file dependencies kmemtrace, fs: uninline simple_transaction_set() kmemtrace, fs, security: move alloc_secdata() and free_secdata() to linux/security.h
| * kmemtrace, fs: uninline simple_transaction_set()Ingo Molnar2009-04-031-13/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: cleanup We want to remove percpu.h from rcupdate.h (for upcoming kmemtrace changes), but this is not possible currently without breaking the build because fs.h has an implicit include file depedency: it uses PAGE_SIZE but does not include asm/page.h which defines it. This problem gets masked in practice because most fs.h using sites use rcupreempt.h (and other headers) which includes percpu.h which brings in asm/page.h indirectly. We cannot add asm/page.h to asm/fs.h because page.h is not an exported header. Move simple_transaction_set() to the other simple-transaction file helpers in fs/libfs.c. This removes the include file hell and also reduces kernel size a bit. Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: paulmck@linux.vnet.ibm.com LKML-Reference: <1237898630.25315.83.camel@penberg-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * kmemtrace, fs, security: move alloc_secdata() and free_secdata() to ↵Pekka Enberg2009-04-031-21/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | linux/security.h Impact: cleanup We want to remove percpu.h from rcupdate.h (for upcoming kmemtrace changes), but this is not possible currently without breaking the build because fs.h has implicit include file depedencies: it uses GFP_* types in inlines but does not include gfp.h. In practice most fs.h using .c files get gfp.h included implicitly, via an indirect route: via rcupdate.h inclusion - so this underlying problem gets masked in practice. So we want to solve fs.h's dependency on gfp.h. gfp.h can not be included here directly because it is not exported and it would break the build the following way: /home/mingo/tip/usr/include/linux/bsg.h:11: found __[us]{8,16,32,64} type without #include <linux/types.h> /home/mingo/tip/usr/include/linux/fs.h:11: included file 'linux/gfp.h' is not exported make[3]: *** [/home/mingo/tip/usr/include/linux/.check] Error 1 make[2]: *** [linux] Error 2 As suggested by Alexey Dobriyan, move alloc_secdata() and free_secdata() to linux/security.h - they belong there. This also cleans fs.h of GFP_* usage. Suggested-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> LKML-Reference: <1237906803.25315.96.camel@penberg-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | block: Add flag for telling the IO schedulers NOT to anticipate more IOJens Axboe2009-04-061-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By default, CFQ will anticipate more IO from a given io context if the previously completed IO was sync. This used to be fine, since the only sync IO was reads and O_DIRECT writes. But with more "normal" sync writes being used now, we don't want to anticipate for those. Add a bio/request flag that informs the IO scheduler that this is a sync request that we should not idle for. Introduce WRITE_ODIRECT specifically for O_DIRECT writes, and make sure that the other sync writes set this flag. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Add WRITE_SYNC_PLUG and SWRITE_SYNC_PLUGJens Axboe2009-04-061-0/+3
| | | | | | | | | | | | | | | | | | | | (S)WRITE_SYNC always unplugs the device right after IO submission. Sometimes we want to build up a queue before doing so, so add variants that explicitly DON'T unplug the queue. The caller must then do that after submitting all the IO. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'for-linus' of ↵Linus Torvalds2009-04-021-0/+14
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: Remove two unneeded exports and make two symbols static in fs/mpage.c Cleanup after commit 585d3bc06f4ca57f975a5a1f698f65a45ea66225 Trim includes of fdtable.h Don't crap into descriptor table in binfmt_som Trim includes in binfmt_elf Don't mess with descriptor table in load_elf_binary() Get rid of indirect include of fs_struct.h New helper - current_umask() check_unsafe_exec() doesn't care about signal handlers sharing New locking/refcounting for fs_struct Take fs_struct handling to new file (fs/fs_struct.c) Get rid of bumping fs_struct refcount in pivot_root(2) Kill unsharing fs_struct in __set_personality()
| * Cleanup after commit 585d3bc06f4ca57f975a5a1f698f65a45ea66225Al Viro2009-04-011-0/+12
| | | | | | | | | | | | | | fsync_bdev() export and a bunch of stubs for !CONFIG_BLOCK case had been left behind Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * New helper - current_umask()Al Viro2009-03-311-0/+2
| | | | | | | | | | | | | | current->fs->umask is what most of fs_struct users are doing. Put that into a helper function. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | filesystem freeze: allow SysRq emergency thaw to thaw frozen filesystemsEric Sandeen2009-04-011-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | Now that the filesystem freeze operation has been elevated to the VFS, and is just an ioctl away, some sort of safety net for unintentionally frozen root filesystems may be in order. The timeout thaw originally proposed did not get merged, but perhaps something like this would be useful in emergencies. For example, freeze /path/to/mountpoint may freeze your root filesystem if you forgot that you had that unmounted. I chose 'j' as the last remaining character other than 'h' which is sort of reserved for help (because help is generated on any unknown character). I've tested this on a non-root fs with multiple (nested) freezers, as well as on a system rendered unresponsive due to a frozen root fs. [randy.dunlap@oracle.com: emergency thaw only if CONFIG_BLOCK enabled] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: Takashi Sato <t-sato@yk.jp.nec.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6Linus Torvalds2009-03-301-1/+1
|\ | | | | | | | | | | * 'bkl-removal' of git://git.lwn.net/linux-2.6: Fix a lockdep warning in fasync_helper() Add a missing unlock_kernel() in raw_open()
| * Fix a lockdep warning in fasync_helper()Jonathan Corbet2009-03-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | Lockdep gripes if file->f_lock is taken in a no-IRQ situation, since that is not always the case. We don't really want to disable IRQs for every acquisition of f_lock; instead, just move it outside of fasync_lock. Reported-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
* | Merge branch 'for-linus' of ↵Linus Torvalds2009-03-271-35/+185
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (37 commits) fs: avoid I_NEW inodes Merge code for single and multiple-instance mounts Remove get_init_pts_sb() Move common mknod_ptmx() calls into caller Parse mount options just once and copy them to super block Unroll essentials of do_remount_sb() into devpts vfs: simple_set_mnt() should return void fs: move bdev code out of buffer.c constify dentry_operations: rest constify dentry_operations: configfs constify dentry_operations: sysfs constify dentry_operations: JFS constify dentry_operations: OCFS2 constify dentry_operations: GFS2 constify dentry_operations: FAT constify dentry_operations: FUSE constify dentry_operations: procfs constify dentry_operations: ecryptfs constify dentry_operations: CIFS constify dentry_operations: AFS ...
| * | vfs: simple_set_mnt() should return voidSukadev Bhattiprolu2009-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | simple_set_mnt() is defined as returning 'int' but always returns 0. Callers assume simple_set_mnt() never fails and don't properly cleanup if it were to _ever_ fail. For instance, get_sb_single() and get_sb_nodev() should: up_write(sb->s_unmount); deactivate_super(sb); if simple_set_mnt() fails. Since simple_set_mnt() never fails, would be cleaner if it did not return anything. [akpm@linux-foundation.org: fix build] Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | fs: move bdev code out of buffer.cNick Piggin2009-03-271-0/+7
| | | | | | | | | | | | | | | | | | | | | Move some block device related code out from buffer.c and put it in block_dev.c. I'm trying to move non-buffer_head code out of buffer.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | vfs: Further changes from macro to inline function in fs.hSteven Whitehouse2009-03-271-7/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | There is a second set of macros for when CONFIG_FILE_LOCKING is not set. This patch updates those to become inline functions as well. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | vfs: Update fs.h to use inline functions when no file locking setSteven Whitehouse2009-03-271-26/+139
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This avoids various issues which might give rise to compiler warnings about missing functions and/or unused variable with the previous macros. This also fixes a bug where one of the macros was returning 0, but it should have been void. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Tested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | do_pipe cleanup: drop its last user in arch/alpha/Cheng Renquan2009-03-271-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | The last user of do_pipe is in arch/alpha/, after replacing it with do_pipe_flags, the do_pipe can be totally dropped. Signed-off-by: Cheng Renquan <crquan@gmail.com> Acked-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6Linus Torvalds2009-03-261-1/+1
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | * 'bkl-removal' of git://git.lwn.net/linux-2.6: Rationalize fasync return values Move FASYNC bit handling to f_op->fasync() Use f_lock to protect f_flags Rename struct file->f_ep_lock
| * | Use f_lock to protect f_flagsJonathan Corbet2009-03-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally, changes to struct file->f_flags have been done under BKL protection, or with no protection at all. This patch causes all f_flags changes after file open/creation time to be done under protection of f_lock. This allows the removal of some BKL usage and fixes a number of longstanding (if microscopic) races. Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
| * | Rename struct file->f_ep_lockJonathan Corbet2009-03-161-1/+1
| |/ | | | | | | | | | | | | | | | | | | This lock moves out of the CONFIG_EPOLL ifdef and becomes f_lock. For now, epoll remains the only user, but a future patch will use it to protect f_flags as well. Cc: Davide Libenzi <davidel@xmailserver.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
* | Add a strictatime mount optionMatthew Garrett2009-03-261-0/+1
|/ | | | | | | | | Add support for explicitly requesting full atime updates. This makes it possible for kernels to default to relatime but still allow userspace to override it. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2009-02-181-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list block: fix booting from partitioned md array block: revert part of 18ce3751ccd488c78d3827e9f6bf54e6322676fb cciss: PCI power management reset for kexec paride/pg.c: xs(): &&/|| confusion fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free block: fix bad definition of BIO_RW_SYNC bsg: Fix sense buffer bug in SG_IO
| * block: fix bad definition of BIO_RW_SYNCJens Axboe2009-02-181-3/+3
| | | | | | | | | | | | | | | | We can't OR shift values, so get rid of BIO_RW_SYNC and use BIO_RW_SYNCIO and BIO_RW_UNPLUG explicitly. This brings back the behaviour from before 213d9417fec62ef4c3675621b9364a667954d4dd. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | vfs: separate FMODE_PREAD/FMODE_PWRITE into separate flagsPaul Turner2009-02-181-6/+12
|/ | | | | | | | | | | | | | | | | | | | | Separate FMODE_PREAD and FMODE_PWRITE into separate flags to reflect the reality that the read and write paths may have independent restrictions. A git grep verifies that these flags are always cleared together so this new behavior will only apply to interfaces that change to clear flags individually. This is required for "seq_file: properly cope with pread", a post-2.6.25 regression fix. [akpm@linux-foundation.org: add comment] Signed-off-by: Paul Turner <pjt@google.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* filesystem freeze: implement generic freeze featureTakashi Sato2009-01-091-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ioctls for the generic freeze feature are below. o Freeze the filesystem int ioctl(int fd, int FIFREEZE, arg) fd: The file descriptor of the mountpoint FIFREEZE: request code for the freeze arg: Ignored Return value: 0 if the operation succeeds. Otherwise, -1 o Unfreeze the filesystem int ioctl(int fd, int FITHAW, arg) fd: The file descriptor of the mountpoint FITHAW: request code for unfreeze arg: Ignored Return value: 0 if the operation succeeds. Otherwise, -1 Error number: If the filesystem has already been unfrozen, errno is set to EINVAL. [akpm@linux-foundation.org: fix CONFIG_BLOCK=n] Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-ext4@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* filesystem freeze: add error handling of write_super_lockfs/unlockfsTakashi Sato2009-01-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, ext3 in mainline Linux doesn't have the freeze feature which suspends write requests. So, we cannot take a backup which keeps the filesystem's consistency with the storage device's features (snapshot and replication) while it is mounted. In many case, a commercial filesystem (e.g. VxFS) has the freeze feature and it would be used to get the consistent backup. If Linux's standard filesystem ext3 has the freeze feature, we can do it without a commercial filesystem. So I have implemented the ioctls of the freeze feature. I think we can take the consistent backup with the following steps. 1. Freeze the filesystem with the freeze ioctl. 2. Separate the replication volume or create the snapshot with the storage device's feature. 3. Unfreeze the filesystem with the unfreeze ioctl. 4. Take the backup from the separated replication volume or the snapshot. This patch: VFS: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that they can return an error. Rename write_super_lockfs and unlockfs of the super block operation freeze_fs and unfreeze_fs to avoid a confusion. ext3, ext4, xfs, gfs2, jfs: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that write_super_lockfs returns an error if needed, and unlockfs always returns 0. reiserfs: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that they always return 0 (success) to keep a current behavior. Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-ext4@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for_linus' of ↵Linus Torvalds2009-01-081-0/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (57 commits) jbd2: Fix oops in jbd2_journal_init_inode() on corrupted fs ext4: Remove "extents" mount option block: Add Kconfig help which notes that ext4 needs CONFIG_LBD ext4: Make printk's consistently prefixed with "EXT4-fs: " ext4: Add sanity checks for the superblock before mounting the filesystem ext4: Add mount option to set kjournald's I/O priority jbd2: Submit writes to the journal using WRITE_SYNC jbd2: Add pid and journal device name to the "kjournald2 starting" message ext4: Add markers for better debuggability ext4: Remove code to create the journal inode ext4: provide function to release metadata pages under memory pressure ext3: provide function to release metadata pages under memory pressure add releasepage hooks to block devices which can be used by file systems ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc ext4: Init the complete page while building buddy cache ext4: Don't allow new groups to be added during block allocation ext4: mark the blocks/inode bitmap beyond end of group as used ext4: Use new buffer_head flag to check uninit group bitmaps initialization ext4: Fix the race between read_inode_bitmap() and ext4_new_inode() ext4: code cleanup ...
| * add releasepage hooks to block devices which can be used by file systemsTheodore Ts'o2009-01-031-0/+2
| | | | | | | | | | | | | | | | | | Implement blkdev_releasepage() to release the buffer_heads and pages after we release private data belonging to a mounted filesystem. Cc: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* | async: make the final inode deletion an asynchronous eventArjan van de Ven2009-01-071-0/+5
| | | | | | | | | | | | | | this makes "rm -rf" on a (names cached) kernel tree go from 11.6 to 8.6 seconds on an ext3 filesystem Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
* | fs: sys_sync fixNick Piggin2009-01-061-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | s_syncing livelock avoidance was breaking data integrity guarantee of sys_sync, by allowing sys_sync to skip writing or waiting for superblocks if there is a concurrent sys_sync happening. This livelock avoidance is much less important now that we don't have the get_super_to_sync() call after every sb that we sync. This was replaced by __put_super_and_need_restart. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds2009-01-051-0/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (27 commits) GFS2: Use DEFINE_SPINLOCK GFS2: Fix use-after-free bug on umount (try #2) Revert "GFS2: Fix use-after-free bug on umount" GFS2: Streamline alloc calculations for writes GFS2: Send useful information with uevent messages GFS2: Fix use-after-free bug on umount GFS2: Remove ancient, unused code GFS2: Move four functions from super.c GFS2: Fix bug in gfs2_lock_fs_check_clean() GFS2: Send some sensible sysfs stuff GFS2: Kill two daemons with one patch GFS2: Move gfs2_recoverd into recovery.c GFS2: Fix "truncate in progress" hang GFS2: Clean up & move gfs2_quotad GFS2: Add more detail to debugfs glock dumps GFS2: Banish struct gfs2_rgrpd_host GFS2: Move rg_free from gfs2_rgrpd_host to gfs2_rgrpd GFS2: Move rg_igeneration into struct gfs2_rgrpd GFS2: Banish struct gfs2_dinode_host GFS2: Move i_size from gfs2_dinode_host and rename it to i_disksize ...
| * | GFS2: Support for FIEMAP ioctlSteven Whitehouse2009-01-051-0/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the FIEMAP ioctl for GFS2. We can use the generic code (aside from a lock order issue, solved as per Ted Tso's suggestion) for which I've introduced a new variant of the generic function. We also have one exception to deal with, namely stuffed files, so we do that "by hand", setting all the required flags. This has been tested with a modified (I could only find an old version) of Eric's test program, and appears to work correctly. This patch does not currently support FIEMAP of xattrs, but the plan is to add that feature at some future point. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Eric Sandeen <sandeen@redhat.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2009-01-051-1/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: inotify: fix type errors in interfaces fix breakage in reiserfs_new_inode() fix the treatment of jfs special inodes vfs: remove duplicate code in get_fs_type() add a vfs_fsync helper sys_execve and sys_uselib do not call into fsnotify zero i_uid/i_gid on inode allocation inode->i_op is never NULL ntfs: don't NULL i_op isofs check for NULL ->i_op in root directory is dead code affs: do not zero ->i_op kill suid bit only for regular files vfs: lseek(fd, 0, SEEK_CUR) race condition
| * add a vfs_fsync helperChristoph Hellwig2009-01-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fsync currently has a fdatawrite/fdatawait pair around the method call, and a mutex_lock/unlock of the inode mutex. All callers of fsync have to duplicate this, but we have a few and most of them don't quite get it right. This patch adds a new vfs_fsync that takes care of this. It's a little more complicated as usual as ->fsync might get a NULL file pointer and just a dentry from nfsd, but otherwise gets afile and we want to take the mapping and file operations from it when it is there. Notes on the fsync callers: - ecryptfs wasn't calling filemap_fdatawrite / filemap_fdatawait on the lower file - coda wasn't calling filemap_fdatawrite / filemap_fdatawait on the host file, and returning 0 when ->fsync was missing - shm wasn't calling either filemap_fdatawrite / filemap_fdatawait nor taking i_mutex. Now given that shared memory doesn't have disk backing not doing anything in fsync seems fine and I left it out of the vfs_fsync conversion for now, but in that case we might just not pass it through to the lower file at all but just call the no-op simple_sync_file directly. [and now actually export vfs_fsync] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | fs: symlink write_begin allocation context fixNick Piggin2009-01-041-1/+4
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the write_begin/write_end aops, page_symlink was broken because it could no longer pass a GFP_NOFS type mask into the point where the allocations happened. They are done in write_begin, which would always assume that the filesystem can be entered from reclaim. This bug could cause filesystem deadlocks. The funny thing with having a gfp_t mask there is that it doesn't really allow the caller to arbitrarily tinker with the context in which it can be called. It couldn't ever be GFP_ATOMIC, for example, because it needs to take the page lock. The only thing any callers care about is __GFP_FS anyway, so turn that into a single flag. Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on this flag in their write_begin function. Change __grab_cache_page to accept a nofs argument as well, to honour that flag (while we're there, change the name to grab_cache_page_write_begin which is more instructive and does away with random leading underscores). This is really a more flexible way to go in the end anyway -- if a filesystem happens to want any extra allocations aside from the pagecache ones in ints write_begin function, it may now use GFP_KERNEL (rather than GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a random example). [kosaki.motohiro@jp.fujitsu.com: fix ubifs] [kosaki.motohiro@jp.fujitsu.com: fix fuse] Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@kernel.org> [2.6.28.x] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Cleaned up the calling convention: just pass in the AOP flags untouched to the grab_cache_page_write_begin() function. That just simplifies everybody, and may even allow future expansion of the logic. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* nfsd/create race fixes, infrastructureAl Viro2008-12-311-0/+2
| | | | | | | | | | | | | new helpers - insert_inode_locked() and insert_inode_locked4(). Hash new inode, making sure that there's no such inode in icache already. If there is and it does not end up unhashed (as would happen if we have nfsd trying to resolve a bogus fhandle), fail. Otherwise insert our inode into hash and succeed. In either case have i_state set to new+locked; cleanup ends up being simpler with such calling conventions. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
OpenPOWER on IntegriCloud