Commit message (author, age, files changed, lines removed/added)
* Btrfs: fix the mismatch of page->mapping (Liu Bo, 2012-03-29, 1 file, -16/+19)
    Commit 600a45e1d5e376f679ff9ecc4ce9452710a6d27c (Btrfs: fix deadlock on
    page lock when doing auto-defragment) fixes the deadlock on the page
    lock, but it also introduces another bug: a page may have been truncated
    between the unlock and the re-lock, so we need to look it up again to
    get the right one.

    And since we hold the i_mutex lock, the inode size remains unchanged and
    we can drop the isize overflow checks.

    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: fix race between direct io and autodefrag (Liu Bo, 2012-03-29, 1 file, -1/+5)
    The bug is from running xfstests 209 with autodefrag.

    The race is as follows:

        t1                            t2 (autodefrag)
        direct IO
          invalidate pagecache
          dio(old data)               add_inode_defrag
          invalidate pagecache
        endio

        direct IO
          invalidate pagecache
                                      run_defrag
                                        readpage(old data)
                                        set page dirty (old data)
          dio(new data, rewrite)
          invalidate pagecache (*)
        endio

    t2 (autodefrag) reads old data into the pagecache via readpage and marks
    the pages dirty. Meanwhile, invalidate pagecache (*) fails because the
    pages are dirty, so the old data may be flushed to disk by the flush
    thread, which leads to data loss.

    The same applies to user-run defragment programs.

    The patch fixes this race by holding i_mutex while we readpage and set
    the page dirty.

    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: fix deadlock during allocating chunks (Liu Bo, 2012-03-29, 1 file, -0/+50)
    This deadlock comes from xfstests 251.

    We hold the chunk_mutex throughout the whole of a chunk allocation. If
    we find that we've used up the system chunk space, we need to allocate a
    new system chunk, but that recurses into chunk allocation and ends up
    deadlocking on chunk_mutex. So instead we need to allocate the system
    chunk first if we find we're in ENOSPC.

    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>
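The ordering problem described in this commit can be sketched with a small user-space model. This is only an illustration under stated assumptions: `struct fs_model`, `model_lock()` and the two `alloc_data_chunk_*()` functions are hypothetical names, and the "mutex" is a flag that counts re-entry instead of hanging. It is not the kernel code.

```c
#include <stdbool.h>

/* Hypothetical model of the allocator state (not a kernel structure). */
struct fs_model {
    bool chunk_mutex_held;  /* stands in for chunk_mutex */
    int  system_space;      /* free slots left in the system chunk */
    int  deadlocks;         /* times a held mutex was taken again */
};

static void model_lock(struct fs_model *fs)
{
    if (fs->chunk_mutex_held)
        fs->deadlocks++;    /* a real non-recursive mutex would hang here */
    fs->chunk_mutex_held = true;
}

static void model_unlock(struct fs_model *fs)
{
    fs->chunk_mutex_held = false;
}

/* Allocating a system chunk takes and releases the mutex itself. */
static void alloc_system_chunk(struct fs_model *fs)
{
    model_lock(fs);
    fs->system_space += 8;
    model_unlock(fs);
}

/* Problematic ordering: the shortage is found while the mutex is held. */
static void alloc_data_chunk_old(struct fs_model *fs)
{
    model_lock(fs);
    if (fs->system_space == 0)
        alloc_system_chunk(fs);  /* re-enters the held mutex */
    fs->system_space--;
    model_unlock(fs);
}

/* Reordered: top up system space first, then take the mutex once. */
static void alloc_data_chunk_new(struct fs_model *fs)
{
    if (fs->system_space == 0)
        alloc_system_chunk(fs);
    model_lock(fs);
    fs->system_space--;
    model_unlock(fs);
}
```

In this model the old ordering records one re-entry on an empty system chunk, while the reordered version records none.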
* Btrfs: show useful info in space reservation tracepoint (Liu Bo, 2012-03-29, 3 files, -25/+13)
    o For space info, the type of space info is useful for debug.
    o For transaction handle, its transid is useful.

    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: don't use crc items bigger than 4KB (Chris Mason, 2012-03-28, 1 file, -1/+3)
    With the big metadata blocks, we can have crc items that are much bigger
    than a page. There are a few places that we try to kmalloc memory to
    hold the items during a split.

    Items bigger than 4KB don't really have a huge benefit in efficiency,
    but they do trigger larger order allocations. This commit changes the
    csums to make sure they stay under 4KB. This is not a format change,
    just a #define to limit huge items.

    Signed-off-by: Chris Mason <chris.mason@oracle.com>
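As a rough sketch of what "a #define to limit huge items" could look like, the helper below caps the number of checksums per item so the item stays under 4KB even when the leaf is much larger. The macro name, the helper name and the exact arithmetic are assumptions for illustration; the real macro in the kernel may be written differently.

```c
#include <stddef.h>

/* Assumed cap taken from the commit text: items stay under 4KB. */
#define MODEL_MAX_CSUM_ITEM_BYTES 4096u

/*
 * Number of checksums that fit in one item. The byte budget is the smaller
 * of the leaf's data area and the cap; one slot is given up so the item
 * stays strictly below the budget. Assumes csum_size > 0 and that at least
 * one checksum fits.
 */
static size_t model_max_csums_per_item(size_t leaf_data_bytes, size_t csum_size)
{
    size_t budget = leaf_data_bytes < MODEL_MAX_CSUM_ITEM_BYTES
                  ? leaf_data_bytes
                  : MODEL_MAX_CSUM_ITEM_BYTES;

    return budget / csum_size - 1;
}
```

With a 64KB leaf and 4-byte checksums the cap is the limiting factor; with a 2KB leaf the leaf size is.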
* Btrfs: flush out and clean up any block device pages during mount (Chris Mason, 2012-03-28, 2 files, -0/+4)
    Btrfs puts the filesystem metadata into its own address space, and
    somehow the block device address space isn't getting onto disk properly
    before a mount. The end result is that a loop of mkfs and mounting the
    filesystem will sometimes find stale or incorrect data.

    This commit should fix it by sprinkling fdatawrites and invalidate_bdev
    calls around. This is a short term measure to make sure it is fixed.
    The block devices really should be flushed and cleaned up higher in the
    stack.

    Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Merge git://git.jan-o-sch.net/btrfs-unstable into for-linus (Chris Mason, 2012-03-28, 5 files, -55/+75)
|\
    Conflicts:
        fs/btrfs/transaction.c

    Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: fix regression in scrub path resolving (Jan Schmidt, 2012-03-27, 4 files, -55/+73)
    In commit 4692cf58 we introduced new backref walking code for btrfs.
    This assumes we're searching live roots, which requires a transaction
    context. While scrubbing, however, we must not join a transaction
    because this could deadlock with the commit path. Additionally, what
    scrub really wants to do is resolve a logical address in the commit
    root it's currently checking.

    This patch adds support for logical to path resolving on commit roots
    and makes scrub use that.

    Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * Btrfs: check return value of btrfs_cow_block() (Jan Schmidt, 2012-03-27, 1 file, -2/+4)
    The two helper functions commit_cowonly_roots() and
    create_pending_snapshot() failed to check the return value from
    btrfs_cow_block(), which could at least in theory fail with -ENOSPC from
    btrfs_alloc_free_block(). This commit adds the missing checks.

    Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * Btrfs: actually call btrfs_init_lockdep (Jan Schmidt, 2012-03-27, 1 file, -0/+2)
    btrfs_init_lockdep only makes our lockdep class names look prettier, so
    it never hurt that we forgot to actually call it. This turns our lockdep
    identifier strings from lockdep auto-set #[id] into really pretty
    "btrfs-fs-01" or "btrfs-csum-03".

    Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
* | Merge branch 'for-chris' of git://github.com/idryomov/btrfs-unstable into for-linus (Chris Mason, 2012-03-28, 4 files, -139/+157)
|\ \
| * | Btrfs: fix infinite loop in btrfs_shrink_device() (Ilya Dryomov, 2012-03-27, 1 file, -3/+2)
    If relocate of block group 0 fails with ENOSPC we end up infinitely
    looping because key.offset -= 1 statement in that case brings us back to
    where we started.

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
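One plausible reading of this bug is unsigned wraparound: decrementing an offset of 0 yields the largest 64-bit value, which puts a downward walk back at the top. The sketch below shows that arithmetic and one way to bound such a walk. Both helper names are hypothetical, and the loop shape is an assumption for illustration, not a copy of the actual fix.

```c
#include <stdint.h>

/* What "key.offset -= 1" does to an unsigned 64-bit offset. */
static uint64_t model_step_down(uint64_t offset)
{
    return offset - 1;      /* at 0 this wraps to UINT64_MAX */
}

/*
 * Walk offsets downward from `start`, visiting each one once and stopping
 * after offset 0 instead of wrapping. Returns the number of visits.
 */
static unsigned model_walk_bounded(uint64_t start)
{
    uint64_t offset = start;
    unsigned visits = 0;

    do {
        visits++;           /* "relocate" the extent at `offset` */
    } while (offset-- > 0); /* test before the decrement can wrap */

    return visits;
}
```

The key point is that the termination test reads the offset before it is decremented, so reaching 0 ends the walk.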
| * | Btrfs: fix memory leak in resolver code (Ilya Dryomov, 2012-03-27, 1 file, -6/+1)
    init_ipath() allocates btrfs_data_container which is never freed. Free
    it in free_ipath() and nuke the comment for init_data_container() - we
    can safely free it with kfree().

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * | Btrfs: allow dup for data chunks in mixed mode (Ilya Dryomov, 2012-03-27, 1 file, -4/+9)
    Generally we don't allow dup for data, but mixed chunks are special and
    people seem to think this has its use cases.

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * | Btrfs: validate target profiles only if we are going to use them (Ilya Dryomov, 2012-03-27, 1 file, -16/+11)
    Do not run sanity checks on all target profiles unless they all will be
    used. This came up because alloc_profile_is_valid() is now more strict
    than it used to be.

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * | Btrfs: improve the logic in btrfs_can_relocate() (Ilya Dryomov, 2012-03-27, 1 file, -6/+18)
    Currently if we don't have enough space allocated we go ahead and loop
    through devices in the hopes of finding enough space for a chunk of the
    *same* type as the one we are trying to relocate. The problem with that
    is that if we are trying to restripe the chunk its target type can be
    more relaxed than the current one (e.g. require fewer devices or less
    space). So, when restriping, run checks against the target profile
    instead of the current one.

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * | Btrfs: add __get_block_group_index() helper (Ilya Dryomov, 2012-03-27, 1 file, -5/+12)
    Add __get_block_group_index() helper to be able to derive block group
    index from an arbitrary set of flags. Implement get_block_group_index()
    in terms of it.

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
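The refactoring pattern here, a flags-only helper with the struct-based function layered on top, might look like the following sketch. The `MODEL_*` bit values, the index numbering and the `model_*` names are all made up for illustration and should not be read as the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical profile bits, for illustration only. */
#define MODEL_RAID0   (1ULL << 0)
#define MODEL_RAID1   (1ULL << 1)
#define MODEL_DUP     (1ULL << 2)
#define MODEL_RAID10  (1ULL << 3)

struct model_block_group {
    uint64_t flags;
};

/* Derive an index from any set of flags, with no block group needed. */
static int model_index_from_flags(uint64_t flags)
{
    if (flags & MODEL_RAID10)
        return 0;
    if (flags & MODEL_RAID1)
        return 1;
    if (flags & MODEL_DUP)
        return 2;
    if (flags & MODEL_RAID0)
        return 3;
    return 4;               /* no profile bit set: single */
}

/* The struct-based variant is a thin wrapper over the flags helper. */
static int model_block_group_index(const struct model_block_group *bg)
{
    return model_index_from_flags(bg->flags);
}
```

Callers that only have a target profile, such as a restripe target, can then use the flags helper directly without constructing a block group.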
| * | Btrfs: add get_restripe_target() helper (Ilya Dryomov, 2012-03-27, 1 file, -44/+50)
    Add get_restripe_target() helper and switch everybody to use it.

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * | Btrfs: move alloc_profile_is_valid() to volumes.c (Ilya Dryomov, 2012-03-27, 3 files, -30/+25)
    A header file is not a good place to define functions. This also moves a
    call to alloc_profile_is_valid() down the stack and removes a redundant
    check from __btrfs_alloc_chunk() - alloc_profile_is_valid() takes it
    into account.

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * | Btrfs: make profile_is_valid() check more strict (Ilya Dryomov, 2012-03-27, 3 files, -12/+17)
    "0" is a valid value for an on-disk chunk profile, but it is not a valid
    extended profile. (We have a separate bit for single chunks in the
    extended case.)

    Also rename it to alloc_profile_is_valid() for clarity.

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
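A minimal sketch of the distinction drawn in this commit: zero is accepted for the on-disk format and rejected for the extended format, which has an explicit bit for single chunks. The masks and the function name are hypothetical, and the extra "exactly one bit set" rule is an added assumption rather than something the commit text states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define MODEL_PROFILE_MASK   0x0FULL   /* raid0, raid1, dup, raid10 */
#define MODEL_SINGLE_BIT     0x10ULL   /* explicit "single", extended only */
#define MODEL_EXTENDED_MASK  (MODEL_PROFILE_MASK | MODEL_SINGLE_BIT)

static bool model_profile_is_valid(uint64_t flags, bool extended)
{
    uint64_t mask = extended ? MODEL_EXTENDED_MASK : MODEL_PROFILE_MASK;

    if (flags & ~mask)
        return false;                   /* bits outside this format */
    if (flags == 0)
        return !extended;               /* "0" is only valid on disk */
    return (flags & (flags - 1)) == 0;  /* assumption: exactly one bit */
}
```

Passing the format as a parameter keeps one validator for both representations, so a caller cannot accidentally check an extended profile against on-disk rules.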
| * | Btrfs: add wrappers for working with alloc profiles (Ilya Dryomov, 2012-03-27, 3 files, -30/+30)
    Add functions to abstract the conversion between chunk and extended
    allocation profile formats and switch everybody to use them.

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * | Btrfs: stop silently switching single chunks to raid0 on balance (Ilya Dryomov, 2012-03-27, 1 file, -3/+2)
    This has been causing a lot of confusion for quite a while now and a lot
    of users were surprised by this (some of them were even stuck in an
    ENOSPC situation which they couldn't easily get out of). The addition of
    restriper gives users a clear choice between raid0 and drive concat
    setup so there's absolutely no excuse for us to keep doing this.

    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* | | Merge branch 'error-handling' into for-linus (Chris Mason, 2012-03-28, 38 files, -1018/+2017)
|\ \ \
    Conflicts:
        fs/btrfs/ctree.c
        fs/btrfs/disk-io.c
        fs/btrfs/extent-tree.c
        fs/btrfs/extent_io.c
        fs/btrfs/extent_io.h
        fs/btrfs/inode.c
        fs/btrfs/scrub.c

    Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | | btrfs: disallow unequal data/metadata blocksize for mixed block groups (David Sterba, 2012-03-28, 1 file, -0/+8)
    With support for bigger metadata blocks, we must avoid mounting a
    filesystem with different block sizes for mixed block groups; this
    causes corruption (found by xfstests/083).

    Signed-off-by: David Sterba <dsterba@suse.cz>
| * | | Btrfs: enhance superblock sanity checks (David Sterba, 2012-03-28, 1 file, -5/+18)
    Validate checksum algorithm during mount and prevent BUG_ON later in
    btrfs_super_csum_size.

    Signed-off-by: David Sterba <dsterba@suse.cz>
| * | | btrfs: Fix busyloop in transaction_kthread() (Jan Kara, 2012-03-22, 1 file, -2/+7)
    When a filesystem has been aborted due to an error,
    transaction_kthread() will busyloop. Fix it by going to sleep in that
    case as well. Maybe we should just stop transaction_kthread() when the
    filesystem is aborted, but that would be more complex.

    Signed-off-by: Jan Kara <jack@suse.cz>
| * | | btrfs: replace many BUG_ONs with proper error handling (Jeff Mahoney, 2012-03-22, 23 files, -385/+980)
    btrfs currently handles most errors with BUG_ON. This patch is a
    work-in-progress but aims to handle most errors other than internal
    logic errors and ENOMEM more gracefully.

    This iteration prevents most crashes but can run into lockups with the
    page lock on occasion when the timing "works out."

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: enhance transaction abort infrastructure (Jeff Mahoney, 2012-03-22, 8 files, -56/+300)
    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: add varargs to btrfs_error (Jeff Mahoney, 2012-03-22, 2 files, -9/+66)
    btrfs currently handles most errors with BUG_ON. This patch is a
    work-in-progress but aims to handle most errors other than internal
    logic errors and ENOMEM more gracefully.

    This iteration prevents most crashes but can run into lockups with the
    page lock on occasion when the timing "works out."

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: Remove BUG_ON from __finish_chunk_alloc() (Mark Fasheh, 2012-03-22, 1 file, -1/+3)
    btrfs_alloc_chunk() unconditionally BUGs on any error returned from
    __finish_chunk_alloc() so there's no need for two BUG_ON lines. Remove
    the one from __finish_chunk_alloc().

    Signed-off-by: Mark Fasheh <mfasheh@suse.de>
| * | | btrfs: Remove BUG_ON from __btrfs_alloc_chunk() (Mark Fasheh, 2012-03-22, 1 file, -1/+2)
    We BUG_ON() an error from add_extent_mapping(), but that error looks
    pretty easy to bubble back up - as far as I can tell there have not been
    any permanent modifications to fs state at that point.

    Signed-off-by: Mark Fasheh <mfasheh@suse.de>
| * | | btrfs: Don't BUG_ON insert errors in btrfs_alloc_dev_extent() (Mark Fasheh, 2012-03-22, 1 file, -1/+3)
    The only caller of btrfs_alloc_dev_extent() is __btrfs_alloc_chunk()
    which already bugs on any error returned. We can remove the BUG_ONs in
    btrfs_alloc_dev_extent() then since __btrfs_alloc_chunk() will "catch"
    them anyway.

    Signed-off-by: Mark Fasheh <mfasheh@suse.de>
| * | | btrfs: Go readonly on tree errors in balance_level (Mark Fasheh, 2012-03-22, 1 file, -2/+11)
    balance_level() seems to deal with missing tree nodes by BUG_ON().
    Instead, we can easily just set the file system readonly and bubble
    -EROFS back up the stack.

    Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | btrfs: Don't BUG_ON errors from update_ref_for_cow() (Mark Fasheh, 2012-03-22, 1 file, -1/+4)
    __btrfs_cow_block(), the only caller of update_ref_for_cow(), will
    BUG_ON() any error return. Instead, we can make the fs go read-only, as
    update_ref_for_cow() manipulates disk data in a way which doesn't look
    like it's easily rolled back.

    Signed-off-by: Mark Fasheh <mfasheh@suse.de>
| * | | btrfs: Go readonly on bad extent refs in update_ref_for_cow() (Mark Fasheh, 2012-03-22, 1 file, -1/+5)
    update_ref_for_cow() will BUG_ON() after its call to
    btrfs_lookup_extent_info() if no existing references are found. Since
    refs are computed directly from disk, this should be treated as a
    corruption instead of a logic error.

    Signed-off-by: Mark Fasheh <mfasheh@suse.de>
| * | | btrfs: Don't BUG_ON errors in __finish_chunk_alloc() (Mark Fasheh, 2012-03-22, 1 file, -4/+6)
    All callers of __finish_chunk_alloc() BUG_ON() its return value, so it's
    trivial for us to always bubble up any errors caught in
    __finish_chunk_alloc() to be caught there.

    Signed-off-by: Mark Fasheh <mfasheh@suse.de>
| * | | btrfs: Don't BUG_ON kzalloc error in btrfs_lookup_csums_range() (Mark Fasheh, 2012-03-22, 1 file, -2/+13)
    Unfortunately it isn't enough to just exit here - the kzalloc() happens
    in a loop and the allocated items are added to a linked list whose head
    is passed in from the caller.

    To fix the BUG_ON() and also provide the semantic that the list passed
    in is only modified on success, I create a function-local temporary list
    that we add items to. If no error is met, that list is spliced to the
    caller's at the end of the function. Otherwise the list will be walked
    and all items freed before the error value is returned.

    I did a simple test on this patch by forcing an error at the kzalloc()
    point and verifying that when this hits (git clone seemed to exercise
    this), the function throws the proper error. Unfortunately but
    predictably, we later hit a BUG_ON(ret) type line that still hasn't been
    fixed up ;)

    Signed-off-by: Mark Fasheh <mfasheh@suse.com>
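The "build on a temporary list, splice only on success" approach described in this commit can be sketched in user-space C as below. This is an illustration, not the kernel patch: it uses a plain singly linked list and malloc() in place of the kernel's list_head and kzalloc(), and `fail_at` is a hypothetical test hook that forces one allocation to fail.

```c
#include <errno.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/*
 * Add `count` nodes in front of *head, but only if every allocation
 * succeeds. On failure the caller's list is left untouched and the
 * partially built temporary list is freed.
 * fail_at < 0 never fails; otherwise allocation number fail_at fails.
 */
static int collect_all_or_nothing(struct node **head, int count, int fail_at)
{
    struct node *tmp = NULL;    /* function-local temporary list */
    struct node **tail = &tmp;  /* where the next node gets linked */

    for (int i = 0; i < count; i++) {
        struct node *n = (i == fail_at) ? NULL : malloc(sizeof(*n));

        if (!n) {
            while (tmp) {       /* walk and free what was built so far */
                struct node *next = tmp->next;
                free(tmp);
                tmp = next;
            }
            return -ENOMEM;
        }
        n->value = i;
        n->next = NULL;
        *tail = n;
        tail = &n->next;
    }

    *tail = *head;              /* splice: caller's list follows the new nodes */
    *head = tmp;
    return 0;
}
```

Because nothing is linked into the caller's list until the loop has finished, the error path never has to work out which entries were newly added.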
| * | | btrfs: Don't BUG_ON() errors in update_ref_for_cow() (Mark Fasheh, 2012-03-22, 1 file, -4/+7)
    The only caller of update_ref_for_cow() is __btrfs_cow_block() which was
    originally ignoring any return values. update_ref_for_cow() however
    doesn't look like a candidate to become a void function - there are a
    few places where errors can occur.

    So instead I changed update_ref_for_cow() to bubble all errors up
    (instead of BUG_ON). __btrfs_cow_block() was then updated to catch and
    BUG_ON() any errors from update_ref_for_cow(). The end effect is that we
    have no change in behavior, but about 8 different places where a
    BUG_ON(ret) was removed.

    Obviously a future patch will have to address the BUG_ON() in
    __btrfs_cow_block().

    Signed-off-by: Mark Fasheh <mfasheh@suse.de>
| * | | btrfs: Don't BUG_ON errors from btrfs_create_subvol_root() (Mark Fasheh, 2012-03-22, 2 files, -2/+6)
    This is called from only one place - create_subvol() which passes errors
    safely back out to its caller, btrfs_mksubvol, where they are handled.

    Additionally, btrfs_create_subvol_root() itself BUGs needlessly on an
    error return from btrfs_update_inode(). Since create_subvol() was fixed
    to catch errors we can bubble this one up too.

    Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | btrfs: btrfs_drop_snapshot should return int (Jeff Mahoney, 2012-03-22, 4 files, -8/+12)
    Commit cb1b69f4 (Btrfs: forced readonly when btrfs_drop_snapshot()
    fails) made btrfs_drop_snapshot return void because there were no
    callers checking the return value. That is the wrong order to handle
    error propagation since the caller will have no idea that an error has
    occurred and continue on as if nothing went wrong.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: split extent_state ops (Jeff Mahoney, 2012-03-22, 3 files, -15/+25)
    set_extent_bit can do exclusive locking but only when called by
    lock_extent*. Drop the exclusive bits argument except when called by
    lock_extent.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: drop gfp_t from lock_extent (Jeff Mahoney, 2012-03-22, 9 files, -76/+63)
    lock_extent and unlock_extent are always called with GFP_NOFS; drop the
    argument and use GFP_NOFS consistently.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: return void in functions without error conditions (Jeff Mahoney, 2012-03-22, 29 files, -410/+293)
    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: __add_reloc_root error push-up (Jeff Mahoney, 2012-03-22, 1 file, -6/+16)
    This patch pushes kmalloc errors up to the caller and BUGs in the
    caller.

    The BUG_ON for duplicate reloc tree root insertion is replaced with a
    panic explaining the issue.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: ->submit_bio_hook error push-up (Jeff Mahoney, 2012-03-22, 3 files, -15/+31)
    This pushes failures from the submit_bio_hook callbacks,
    btrfs_submit_bio_hook and btree_submit_bio_hook into the callers,
    including callers of submit_one_bio where it catches the failures with
    BUG_ON.

    It also pushes up through the ->readpage_io_failed_hook to
    end_bio_extent_writepage where the error is already caught with BUG_ON.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: Factor out tree->ops->merge_bio_hook call (Jeff Mahoney, 2012-03-22, 2 files, -5/+17)
    In submit_extent_page, there's a visually noisy if statement that, in
    the midst of other conditions, does the tree dependency for tree->ops
    and tree->ops->merge_bio_hook before calling it, and then another
    condition afterwards. If an error is returned from merge_bio_hook,
    there's no way to catch it. It's considered a routine "1" return value
    instead of a failure.

    This patch factors out the dependency check into a new local merge_bio
    routine and BUGs on an error. The if statement is less noisy as a side
    effect.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: Simplify btrfs_submit_bio_hook (Jeff Mahoney, 2012-03-22, 1 file, -3/+4)
    btrfs_submit_bio_hook currently calls btrfs_bio_wq_end_io in either case
    of an if statement that determines one of the arguments.

    This patch moves the function call outside of the if statement and uses
    it to only determine the different argument. This allows us to catch an
    error in one place in a more visually obvious way.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
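The shape of this simplification can be shown with a small before/after sketch. Everything here is a stand-in: `model_wq_end_io()` plays the role of btrfs_bio_wq_end_io, the argument values 0 and 2 are placeholders, and the boolean parameter replaces whatever condition the real if statement tests.

```c
#include <stdbool.h>

static int last_metadata = -1;  /* records the argument, for inspection */

/* Stand-in for the call made from both branches; fails on bad input. */
static int model_wq_end_io(int metadata)
{
    last_metadata = metadata;
    return metadata < 0 ? -1 : 0;
}

/* Before: the call is duplicated and only the argument differs. */
static int model_submit_before(bool special_inode)
{
    int ret;

    if (special_inode)
        ret = model_wq_end_io(2);
    else
        ret = model_wq_end_io(0);
    return ret;
}

/* After: the if statement only picks the argument; one call, one check. */
static int model_submit_after(bool special_inode)
{
    int metadata = 0;
    int ret;

    if (special_inode)
        metadata = 2;

    ret = model_wq_end_io(metadata);
    if (ret)
        return ret;             /* single place to handle the error */
    return 0;
}
```

Both versions behave the same; the second simply leaves one call site where an error check has to live.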
| * | | btrfs: btrfs_update_root error push-up (Jeff Mahoney, 2012-03-22, 2 files, -4/+7)
    btrfs_update_root BUGs when it can't alloc a path, yet it can recover
    from a search error. This patch returns -ENOMEM instead.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: find_and_setup_root error push-up (Jeff Mahoney, 2012-03-22, 1 file, -5/+6)
    find_and_setup_root BUGs when it encounters an error from
    btrfs_find_last_root, which can occur if a path can't be allocated.

    This patch pushes it up to its callers where it is already handled.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | | btrfs: Remove set bits return from clear_extent_bit (Jeff Mahoney, 2012-03-22, 1 file, -7/+5)
    There is only one caller of clear_extent_bit that checks the return
    value and it only checks if it's negative. Since there are no users of
    the returned bits functionality of clear_extent_bit, stop returning it
    and avoid complicating error handling.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>