summaryrefslogtreecommitdiffstats
path: root/fs/btrfs/disk-io.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of ↵Linus Torvalds2012-01-281-0/+7
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix reservations in btrfs_page_mkwrite Btrfs: advance window_start if we're using a bitmap btrfs: mask out gfp flags in releasepage Btrfs: fix enospc error caused by wrong checks of the chunk Btrfs: do not defrag a file partially Btrfs: fix warning for 32-bit build of fs/btrfs/check-integrity.c Btrfs: use cluster->window_start when allocating from a cluster bitmap Btrfs: Check for NULL page in extent_range_uptodate btrfs: Fix busyloops in transaction waiting code Btrfs: make sure a bitmap has enough bytes Btrfs: fix uninit warning in backref.c
| * btrfs: mask out gfp flags in releasepageDavid Sterba2012-01-261-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | btree_releasepage is a callback and can be passed unknown gfp flags and then they may end up in kmem_cache_alloc called from alloc_extent_state, slab allocator will BUG_ON when there is HIGHMEM or DMA32 flag set. This may happen when btrfs is mounted from a loop device, which masks out __GFP_IO flag. The check in try_release_extent_state 3399 if ((mask & GFP_NOFS) == GFP_NOFS) 3400 mask = GFP_NOFS; will not work and passes unfiltered flags further resulting in crash at mm/slab.c:2963 [<000000000024ae4c>] cache_alloc_refill+0x3b4/0x5c8 [<000000000024c810>] kmem_cache_alloc+0x204/0x294 [<00000000001fd3c2>] mempool_alloc+0x52/0x170 [<000003c000ced0b0>] alloc_extent_state+0x40/0xd4 [btrfs] [<000003c000cee5ae>] __clear_extent_bit+0x38a/0x4cc [btrfs] [<000003c000cee78c>] try_release_extent_state+0x9c/0xd4 [btrfs] [<000003c000cc4c66>] btree_releasepage+0x7e/0xd0 [btrfs] [<0000000000210d84>] shrink_page_list+0x6a0/0x724 [<0000000000211394>] shrink_inactive_list+0x230/0x578 [<0000000000211bb8>] shrink_list+0x6c/0x120 [<0000000000211e4e>] shrink_zone+0x1e2/0x228 [<0000000000211f24>] shrink_zones+0x90/0x254 [<0000000000213410>] do_try_to_free_pages+0xac/0x420 [<0000000000213ae0>] try_to_free_pages+0x13c/0x1b0 [<0000000000204e6c>] __alloc_pages_nodemask+0x5b4/0x9a8 [<00000000001fb04a>] grab_cache_page_write_begin+0x7e/0xe8 Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | Merge branch 'btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds2012-01-171-34/+36
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: btrfs: take allocation of ->tree_root into open_ctree() btrfs: let ->s_fs_info point to fs_info, not root... btrfs: consolidate failure exits in btrfs_mount() a bit btrfs: make free_fs_info() call ->kill_sb() unconditional btrfs: merge free_fs_info() calls on fill_super failures btrfs: kill pointless reassignment of ->s_fs_info in btrfs_fill_super() btrfs: make open_ctree() return int btrfs: sanitizing ->fs_info, part 5 btrfs: sanitizing ->fs_info, part 4 btrfs: sanitizing ->fs_info, part 3 btrfs: sanitizing ->fs_info, part 2 btrfs: sanitizing ->fs_info, part 1 btrfs: fix a deadlock in btrfs_scan_one_device() btrfs: fix mount/umount race btrfs: get ->kill_sb() of its own btrfs: preparation to fixing mount/umount race
| * | btrfs: take allocation of ->tree_root into open_ctree()Al Viro2012-01-081-3/+5
| | | | | | | | | | | | | | | | | | now that we don't need it for sget() anymore... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | btrfs: let ->s_fs_info point to fs_info, not root...Al Viro2012-01-081-2/+2
| | | | | | | | | | | | | | | | | | | | | the latter can be obtained from the former (by looking as ->tree_root) just as cheaply as we currently are doing the other way round. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | btrfs: merge free_fs_info() calls on fill_super failuresAl Viro2012-01-081-2/+0
| | | | | | | | | | | | | | | | | | ... all the way up into btrfs_mount(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | btrfs: make open_ctree() return intAl Viro2012-01-081-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | It returns either ERR_PTR(-ve) or sb->s_fs_info. The latter can be found by caller just as well, TYVM, no need to return it. Just return -ve or 0... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | btrfs: sanitizing ->fs_info, part 5Al Viro2012-01-081-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | close_ctree() uses a weird mix of accesses to root->fs_info and its value at the beginning of function stored in local variable. Since ->fs_info *never* changes, let's just use the local variable to avoid confusion. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | btrfs: sanitizing ->fs_info, part 4Al Viro2012-01-081-18/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A new helper: btrfs_alloc_root(fs_info); allocates btrfs_root and sets ->fs_info. All places allocating the suckers converted to it. At that point we *never* reassign ->fs_info of btrfs_root; it's set before anyone sees the address of newly allocated struct btrfs_root and never assigned anywhere else. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | btrfs: sanitizing ->fs_info, part 3Al Viro2012-01-081-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | move assignments to ->fs_info in open_ctree() up, to the place just after the original allocations. Assignment for tree_root becomes a no-op - we'd obtained fs_info from tree_root->fs_info in the first place. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | btrfs: sanitizing ->fs_info, part 2Al Viro2012-01-081-1/+4
| | | | | | | | | | | | | | | | | | lift assignment to callers of find_and_setup_root() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | btrfs: sanitizing ->fs_info, part 1Al Viro2012-01-081-1/+6
| | | | | | | | | | | | | | | | | | take assignment of ->fs_info to callers of __setup_root() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | btrfs: preparation to fixing mount/umount raceAl Viro2012-01-081-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need fs_info and root to live until the moment when the victim superblock leaves the list, so we need to postpone free_fs_info() until after ->put_super(). The call is buried in close_ctree(), though, so we need to lift it into the callers (including btrfs_put_super()) first. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2012-01-171-8/+41
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (62 commits) Btrfs: use larger system chunks Btrfs: add a delalloc mutex to inodes for delalloc reservations Btrfs: space leak tracepoints Btrfs: protect orphan block rsv with spin_lock Btrfs: add allocator tracepoints Btrfs: don't call btrfs_throttle in file write Btrfs: release space on error in page_mkwrite Btrfs: fix btrfsck error 400 when truncating a compressed Btrfs: do not use btrfs_end_transaction_throttle everywhere Btrfs: add balance progress reporting Btrfs: allow for resuming restriper after it was paused Btrfs: allow for canceling restriper Btrfs: allow for pausing restriper Btrfs: add skip_balance mount option Btrfs: recover balance on mount Btrfs: save balance parameters to disk Btrfs: soft profile changing mode (aka soft convert) Btrfs: implement online profile changing Btrfs: do not reduce profile in do_chunk_alloc() Btrfs: virtual address space subset filter ... Fix up trivial conflict in fs/btrfs/ioctl.c due to the use of the new mnt_drop_write_file() helper.
| * | Merge branch 'integrity-check-patch-v2' of ↵Chris Mason2012-01-161-2/+24
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://btrfs.giantdisaster.de/git/btrfs into integration Conflicts: fs/btrfs/ctree.h fs/btrfs/super.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
| | * | Btrfs: integrate integrity check module into btrfsStefan Behrens2011-12-211-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the last part of the patch series. It modifies the btrfs code to use the integrity check module if configured to do so with the define BTRFS_FS_CHECK_INTEGRITY. If this define is not set, the only effective change is that code is added that handles the mount option to activate the integrity check. If the mount option is set and the define BTRFS_FS_CHECK_INTEGRITY is not set, that code complains in the log and the mount fails with EINVAL. Add the mount option to activate the usage of the integrity check code. Add invocation of btrfs integrity check code init and cleanup function on mount and umount, respectively. Add hook to call btrfs integrity check code version of submit_bh/submit_bio. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
| * | | Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into ↵Chris Mason2012-01-161-1/+2
| |\ \ \ | | | | | | | | | | | | | | | integration
| | * | | Btrfs: mark delayed refs as for cowArne Jansen2011-12-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a for_cow parameter to add_delayed_*_ref and pass the appropriate value from every call site. The for_cow parameter will later on be used to determine if a ref will change anything with respect to qgroups. Delayed refs coming from relocation are always counted as for_cow, as they don't change subvol quota. Also pass in the fs_info for later use. btrfs_find_all_roots() will use this as an optimization, as changes that are for_cow will not change anything with respect to which root points to a certain leaf. Thus, we don't need to add the current sequence number to those delayed refs. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * | | | Merge branch 'for-chris' of git://repo.or.cz/linux-btrfs-devel into integrationChris Mason2012-01-161-2/+0
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: fs/btrfs/volumes.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
| | * | | | Btrfs: fix possible deadlock when opening a seed deviceLi Zefan2012-01-111-2/+0
| | | |/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The correct lock order is uuid_mutex -> volume_mutex -> chunk_mutex, but when we mount a filesystem which has backing seed devices, we have this lock chain: open_ctree() lock(chunk_mutex); read_chunk_tree(); read_one_dev(); open_seed_devices(); lock(uuid_mutex); and then we hit a lockdep splat. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
| * | | | Merge branch 'restriper' of git://github.com/idryomov/btrfs-unstable into ↵Chris Mason2012-01-161-3/+15
| |\ \ \ \ | | | | | | | | | | | | | | | | | | integration
| | * | | | Btrfs: allow for canceling restriperIlya Dryomov2012-01-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement an ioctl for canceling restriper. Currently we wait until relocation of the current block group is finished, in future this can be done by triggering a commit. Balance item is deleted and no memory about the interrupted balance is kept. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| | * | | | Btrfs: allow for pausing restriperIlya Dryomov2012-01-161-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement an ioctl for pausing restriper. This pauses the relocation, but balance is still considered to be "in progress": balance item is not deleted, other volume operations cannot be started, etc. If paused in the middle of profile changing operation we will continue making allocations with the target profile. Add a hook to close_ctree() to pause restriper and free its data structures on unmount. (It's safe to unmount when restriper is in "paused" state, we will resume with the same parameters on the next mount) Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| | * | | | Btrfs: recover balance on mountIlya Dryomov2012-01-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On mount, if balance item is found, resume balance in a separate kernel thread. Try to be smart to continue roughly where previous balance (or convert) was interrupted. For chunk types that were being converted to some profile we turn on soft convert, in case of a simple balance we turn on usage filter and relocate only less-than-90%-full chunks of that type. These are just heuristics but they help quite a bit, and can be improved in future. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| | * | | | Btrfs: add basic restriper infrastructureIlya Dryomov2012-01-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add basic restriper infrastructure: extended balancing ioctl and all related ioctl data structures, add data structure for tracking restriper's state to fs_info, etc. The semantics of the old balancing ioctl are fully preserved. Explicitly disallow any volume operations when balance is in progress. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| | * | | | Btrfs: get rid of *_alloc_profile fieldsIlya Dryomov2012-01-161-3/+0
| | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | {data,metadata,system}_alloc_profile fields have been unused for a long time now. Get rid of them. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* | | | | mm: compaction: introduce sync-light migration for use by compactionMel Gorman2012-01-121-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a lightweight sync migrate operation MIGRATE_SYNC_LIGHT mode that avoids writing back pages to backing storage. Async compaction maps to MIGRATE_ASYNC while sync compaction maps to MIGRATE_SYNC_LIGHT. For other migrate_pages users such as memory hotplug, MIGRATE_SYNC is used. This avoids sync compaction stalling for an excessive length of time, particularly when copying files to a USB stick where there might be a large number of dirty pages backed by a filesystem that does not support ->writepages. [aarcange@redhat.com: This patch is heavily based on Andrea's work] [akpm@linux-foundation.org: fix fs/nfs/write.c build] [akpm@linux-foundation.org: fix fs/btrfs/disk-io.c build] Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andy Isaacson <adi@hexapodia.org> Cc: Nai Xia <nai.xia@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | mm: compaction: determine if dirty pages can be migrated without blocking ↵Mel Gorman2012-01-121-2/+2
| |_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | within ->migratepage Asynchronous compaction is used when allocating transparent hugepages to avoid blocking for long periods of time. Due to reports of stalling, there was a debate on disabling synchronous compaction but this severely impacted allocation success rates. Part of the reason was that many dirty pages are skipped in asynchronous compaction by the following check; if (PageDirty(page) && !sync && mapping->a_ops->migratepage != migrate_page) rc = -EBUSY; This skips over all mapping aops using buffer_migrate_page() even though it is possible to migrate some of these pages without blocking. This patch updates the ->migratepage callback with a "sync" parameter. It is the responsibility of the callback to fail gracefully if migration would block. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andy Isaacson <adi@hexapodia.org> Cc: Nai Xia <nai.xia@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'master' into pm-sleepRafael J. Wysocki2011-12-211-13/+21
|\ \ \ \ | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * master: (848 commits) SELinux: Fix RCU deref check warning in sel_netport_insert() binary_sysctl(): fix memory leak mm/vmalloc.c: remove static declaration of va from __get_vm_area_node ipmi_watchdog: restore settings when BMC reset oom: fix integer overflow of points in oom_badness memcg: keep root group unchanged if creation fails nilfs2: potential integer overflow in nilfs_ioctl_clean_segments() nilfs2: unbreak compat ioctl cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask evm: prevent racing during tfm allocation evm: key must be set once during initialization mmc: vub300: fix type of firmware_rom_wait_states module parameter Revert "mmc: enable runtime PM by default" mmc: sdhci: remove "state" argument from sdhci_suspend_host x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT IB/qib: Correct sense on freectxts increment and decrement RDMA/cma: Verify private data length cgroups: fix a css_set not found bug in cgroup_attach_proc oprofile: Fix uninitialized memory access when writing to writing to oprofilefs Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel" ... Conflicts: kernel/cgroup_freezer.c
| * | | Merge branches 'for-linus' and 'for-linus-3.2' of ↵Linus Torvalds2011-12-161-13/+21
| |\ \ \ | | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: unplug every once and a while Btrfs: deal with NULL srv_rsv in the delalloc inode reservation code Btrfs: only set cache_generation if we setup the block group Btrfs: don't panic if orphan item already exists Btrfs: fix leaked space in truncate Btrfs: fix how we do delalloc reservations and how we free reservations on error Btrfs: deal with enospc from dirtying inodes properly Btrfs: fix num_workers_starting bug and other bugs in async thread BTRFS: Establish i_ops before calling d_instantiate Btrfs: add a cond_resched() into the worker loop Btrfs: fix ctime update of on-disk inode btrfs: keep orphans for subvolume deletion Btrfs: fix inaccurate available space on raid0 profile Btrfs: fix wrong disk space information of the files Btrfs: fix wrong i_size when truncating a file to a larger size Btrfs: fix btrfs_end_bio to deal with write errors to a single mirror * 'for-linus-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: lower the dirty balance poll interval
| | * | Btrfs: fix num_workers_starting bug and other bugs in async threadJosef Bacik2011-12-151-13/+21
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Al pointed out we have some random problems with the way we account for num_workers_starting in the async thread stuff. First of all we need to make sure to decrement num_workers_starting if we fail to start the worker, so make __btrfs_start_workers do this. Also fix __btrfs_start_workers so that it doesn't call btrfs_stop_workers(), there is no point in stopping everybody if we failed to create a worker. Also check_pending_worker_creates needs to call __btrfs_start_work in it's work function since it already increments num_workers_starting. People only start one worker at a time, so get rid of the num_workers argument everywhere, and make btrfs_queue_worker a void since it will always succeed. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
* | | Merge branch 'pm-freezer' of ↵Rafael J. Wysocki2011-11-231-6/+2
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into pm-freezer * 'pm-freezer' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc: (24 commits) freezer: fix wait_event_freezable/__thaw_task races freezer: kill unused set_freezable_with_signal() dmatest: don't use set_freezable_with_signal() usb_storage: don't use set_freezable_with_signal() freezer: remove unused @sig_only from freeze_task() freezer: use lock_task_sighand() in fake_signal_wake_up() freezer: restructure __refrigerator() freezer: fix set_freezable[_with_signal]() race freezer: remove should_send_signal() and update frozen() freezer: remove now unused TIF_FREEZE freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE cgroup_freezer: prepare for removal of TIF_FREEZE freezer: clean up freeze_processes() failure path freezer: kill PF_FREEZING freezer: test freezable conditions while holding freezer_lock freezer: make freezing indicate freeze condition in effect freezer: use dedicated lock instead of task_lock() + memory barrier freezer: don't distinguish nosig tasks on thaw freezer: remove racy clear_freeze_flag() and set PF_NOFREEZE on dead tasks freezer: rename thaw_process() to __thaw_task() and simplify the implementation ...
| * | freezer: unexport refrigerator() and update try_to_freeze() slightlyTejun Heo2011-11-211-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no reason to export two functions for entering the refrigerator. Calling refrigerator() instead of try_to_freeze() doesn't save anything noticeable or removes any race condition. * Rename refrigerator() to __refrigerator() and make it return bool indicating whether it scheduled out for freezing. * Update try_to_freeze() to return bool and relay the return value of __refrigerator() if freezing(). * Convert all refrigerator() users to try_to_freeze(). * Update documentation accordingly. * While at it, add might_sleep() to try_to_freeze(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Chris Mason <chris.mason@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Christoph Hellwig <hch@infradead.org>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2011-11-221-18/+129
|\ \ \ | |/ / |/| / | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: remove free-space-cache.c WARN during log replay Btrfs: sectorsize align offsets in fiemap Btrfs: clear pages dirty for io and set them extent mapped Btrfs: wait on caching if we're loading the free space cache Btrfs: prefix resize related printks with btrfs: btrfs: fix stat blocks accounting Btrfs: avoid unnecessary bitmap search for cluster setup Btrfs: fix to search one more bitmap for cluster setup btrfs: mirror_num should be int, not u64 btrfs: Fix up 32/64-bit compatibility for new ioctls Btrfs: fix barrier flushes Btrfs: fix tree corruption after multi-thread snapshots and inode_cache flush
| * btrfs: mirror_num should be int, not u64Jan Schmidt2011-11-201-1/+1
| | | | | | | | | | | | | | | | My previous patch introduced some u64 for failed_mirror variables, this one makes it consistent again. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: fix barrier flushesChris Mason2011-11-201-17/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When btrfs is writing the super blocks, it send barrier flushes to make sure writeback caching drives get all the metadata on disk in the right order. But, we have two bugs in the way these are sent down. When doing full commits (not via the tree log), we are sending the barrier down before the last super when it should be going down before the first. In multi-device setups, we should be waiting for the barriers to complete on all devices before writing any of the supers. Both of these bugs can cause corruptions on power failures. We fix it with some new code to send down empty barriers to all devices before writing the first super. Alexandre Oliva found the multi-device bug. Arne Jansen did the async barrier loop. Signed-off-by: Chris Mason <chris.mason@oracle.com> Reported-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
* | Merge branch 'for-linus' of ↵Linus Torvalds2011-11-111-24/+18
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: rename the option to nospace_cache Btrfs: handle bio_add_page failure gracefully in scrub Btrfs: fix deadlock caused by the race between relocation Btrfs: only map pages if we know we need them when reading the space cache Btrfs: fix orphan backref nodes Btrfs: Abstract similar code for btrfs_block_rsv_add{, _noflush} Btrfs: fix unreleased path in btrfs_orphan_cleanup() Btrfs: fix no reserved space for writing out inode cache Btrfs: fix nocow when deleting the item Btrfs: tweak the delayed inode reservations again Btrfs: rework error handling in btrfs_mount() Btrfs: close devices on all error paths in open_ctree() Btrfs: avoid null dereference and leaks when bailing from open_ctree() Btrfs: fix subvol_name leak on error in btrfs_mount() Btrfs: fix memory leak in btrfs_parse_early_options() Btrfs: fix our reservations for updating an inode when completing io Btrfs: fix oops on NULL trans handle in btrfs_truncate btrfs: fix double-free 'tree_root' in 'btrfs_mount()'
| * Btrfs: close devices on all error paths in open_ctree()Ilya Dryomov2011-11-091-4/+3
| | | | | | | | | | | | | | | | Fix a bug introduced by 7e662854 where we would leave devices busy on certain error paths in open_ctree(). fs_info is guaranteed to be non-NULL now so it's safe to dereference it on all error paths. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * Btrfs: avoid null dereference and leaks when bailing from open_ctree()Ilya Dryomov2011-11-091-20/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix bugs introduced by 6c41761f. Firstly, after failing to allocate any of the tree roots (first 'goto fail' in open_ctree()) we would dereference a NULL fs_info pointer in free_fs_info(). Secondly, after failures from init_srcu_struct(), setup_bdi() and new_inode() we would leak all earlier allocated roots: fs_info fields haven't been initialized yet so free_fs_info() is rendered useless. Fix this by initializing fs_info pointer and fs_info fields before any allocations happen. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2011-11-061-56/+385
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (114 commits) Btrfs: check for a null fs root when writing to the backup root log Btrfs: fix race during transaction joins Btrfs: fix a potential btrfs_bio leak on scrub fixups Btrfs: rename btrfs_bio multi -> bbio for consistency Btrfs: stop leaking btrfs_bios on readahead Btrfs: stop the readahead threads on failed mount Btrfs: fix extent_buffer leak in the metadata IO error handling Btrfs: fix the new inspection ioctls for 32 bit compat Btrfs: fix delayed insertion reservation Btrfs: ClearPageError during writepage and clean_tree_block Btrfs: be smarter about committing the transaction in reserve_metadata_bytes Btrfs: make a delayed_block_rsv for the delayed item insertion Btrfs: add a log of past tree roots btrfs: separate superblock items out of fs_info Btrfs: use the global reserve when truncating the free space cache inode Btrfs: release metadata from global reserve if we have to fallback for unlink Btrfs: make sure to flush queued bios if write_cache_pages waits Btrfs: fix extent pinning bugs in the tree log Btrfs: make sure btrfs_remove_free_space doesn't leak EAGAIN Btrfs: don't wait as long for more batches during SSD log commit ...
| * Btrfs: check for a null fs root when writing to the backup root logChris Mason2011-11-061-3/+10
| | | | | | | | | | | | | | | | During log replay, can commit the transaction before the fs_root pointers are setup, so we have to make sure they are not null before trying to use them. Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: stop the readahead threads on failed mountChris Mason2011-11-061-0/+1
| | | | | | | | | | | | | | If we don't stop them, they linger around corrupting memory by using pointers to freed things. Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: fix extent_buffer leak in the metadata IO error handlingChris Mason2011-11-061-0/+1
| | | | | | | | | | | | | | The scrub readahead branch brought in a new error handling hook, but it was leaking extent_buffer references. Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Merge git://git.jan-o-sch.net/btrfs-unstable into integrationChris Mason2011-11-061-2/+2
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: fs/btrfs/Makefile fs/btrfs/extent_io.c fs/btrfs/extent_io.h fs/btrfs/scrub.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
| | * btrfs: add mirror_num to extent_read_full_pageJan Schmidt2011-09-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Currently, extent_read_full_page always assumes we are trying to read mirror 0, which generally is the best we can do. To add flexibility, pass it as a parameter. This will be needed by scrub fixup code. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * | Merge branch 'for-chris' of git://github.com/sensille/linux into integrationChris Mason2011-11-061-2/+82
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: fs/btrfs/ctree.h Signed-off-by: Chris Mason <chris.mason@oracle.com>
| | * | btrfs: hooks for readaheadArne Jansen2011-10-021-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the hooks needed for readahead. In the readpage_end_io_hook, the extent state is checked for the EXTENT_READAHEAD flag. Only in this case the readahead hook is called, to keep the impact on non-ra as low as possible. Additionally, a hook for a failed IO is added, otherwise readahead would wait indefinitely for the extent to finish. Changes for v2: - eliminate race condition Signed-off-by: Arne Jansen <sensille@gmx.net>
| | * | btrfs: state information for readaheadArne Jansen2011-10-021-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add state information for readahead to btrfs_fs_info and btrfs_device Changes v2: - don't wait in radix_trees - add own set of workers for readahead Reviewed-by: Josef Bacik <josef@redhat.com> Signed-off-by: Arne Jansen <sensille@gmx.net>
| | * | btrfs: add READAHEAD extent buffer flagArne Jansen2011-10-021-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a READAHEAD extent buffer flag. Add a function to trigger a read with this flag set. Changes v2: - use extent buffer flags instead of extent state flags Changes v5: - adapt to changed read_extent_buffer_pages interface - don't return eb from reada_tree_block_flagged if it has CORRUPT flag set Signed-off-by: Arne Jansen <sensille@gmx.net>
| | * | btrfs: add an extra wait mode to read_extent_buffer_pagesArne Jansen2011-10-021-2/+3
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | read_extent_buffer_pages currently has two modes, either trigger a read without waiting for anything, or wait for the I/O to finish. The former also bails when it's unable to lock the page. This patch now adds an additional parameter to allow it to block on page lock, but don't wait for completion. Changes v5: - merge the 2 wait parameters into one and define WAIT_NONE, WAIT_COMPLETE and WAIT_PAGE_LOCK Change v6: - fix bug introduced in v5 Signed-off-by: Arne Jansen <sensille@gmx.net>
OpenPOWER on IntegriCloud