summaryrefslogtreecommitdiffstats
path: root/fs/btrfs
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | | btrfs: scrub: embed scrub_wr_ctx into scrub contextDavid Sterba2017-06-191-54/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The structure scrub_wr_ctx is not used anywhere just the scrub context, we can move the members there. The tgtdev is renamed so it's more clear that it belongs to the "wr" part. Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: scrub: use fs_info::sectorsize and drop it from scrub contextDavid Sterba2017-06-191-14/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we now have the node/block sizes in fs_info, we can use them and can drop the local copies. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: add statx supportYonghong Song2017-06-191-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return enhanced file attributes from the btrfs, including: (1). inode creation time as stx_btime, and (2). Certain BTRFS_INODE_xxx flags are mapped to stx_attributes flags. Example output: [root@localhost ~]# cat t.sh touch t chattr +aic t ~/linux/samples/statx/test-statx t chattr -aic t touch t echo "========================================" ~/linux/samples/statx/test-statx t /bin/rm t [root@localhost ~]# ./t.sh statx(t) = 0 results=fff Size: 0 Blocks: 0 IO Block: 4096 regular file Device: 00:1c Inode: 63962 Links: 1 Access: (0644/-rw-r--r--) Uid: 0 Gid: 0 Access: 2017-05-11 16:03:13.999856591-0700 Modify: 2017-05-11 16:03:13.999856591-0700 Change: 2017-05-11 16:03:14.000856663-0700 Birth: 2017-05-11 16:03:13.999856591-0700 Attributes: 0000000000000034 (........ ........ ........ ........ ........ ........ ........ .-ai.c..) ======================================== statx(t) = 0 results=fff Size: 0 Blocks: 0 IO Block: 4096 regular file Device: 00:1c Inode: 63962 Links: 1 Access: (0644/-rw-r--r--) Uid: 0 Gid: 0 Access: 2017-05-11 16:03:14.006857097-0700 Modify: 2017-05-11 16:03:14.006857097-0700 Change: 2017-05-11 16:03:14.006857097-0700 Birth: 2017-05-11 16:03:13.999856591-0700 Attributes: 0000000000000000 (........ ........ ........ ........ ........ ........ ........ .---.-..) [root@localhost ~]# Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: lzo: fix typo in error message after failed deflateTimofey Titovets2017-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix copy paste typo in debug message for lzo.c, lzo is not deflate. Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: btrfs_wait_tree_block_writeback can be void returnJeff Layton2017-06-192-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nothing checks its return value. Is it safe to skip checking return value of btrfs_wait_tree_block_writeback? Liu Bo: I think yes, it's used in walk_log_tree which is called in two places, free_log_tree and log replay. For free_log_tree, it waits for any running writeback of the extent buffer under freeing to finish in case we need to access the eb pointer from page->private, and it's OK to not check the return value, while for log replay, it's doesn't wait because wc->wait is not set. So neither cares about the writeback error. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> [ added more explanation to changelog, from Liu Bo ] Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: remove __BTRFS_LEAF_DATA_SIZENikolay Borisov2017-06-191-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __BTRFS_LAF_DATA_SIZE is used only by BTRFS_LEAF_DATA_SIZE. Make the latter subsume the former. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: rename btrfs_leaf_data to BTRFS_LEAF_DATA_OFFSETNikolay Borisov2017-06-193-27/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 5f39d397dfbe ("Btrfs: Create extent_buffer interface for large blocksizes") refactored btrfs_leaf_data function to take extent_buffer rather than struct btrfs_leaf. However, as it turns out the parameter being passed is never used. Furthermore this function no longer returns the leaf data but rather the offset to it. So rename the function to BTRFS_LEAF_DATA_OFFSET to make it consistent with other BTRFS_LEAF_* helpers and turn it into a macro. Signed-off-by: Nikolay Borisov <nborisov@suse.com> [ removed () from the macro ] Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: reduce arguments for decompress_bio opsAnand Jain2017-06-194-57/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct compressed_bio pointer can be used instead. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: btrfs_decompress_bio() could accept compressed_bio insteadAnand Jain2017-06-191-14/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of sending each argument of struct compressed_bio, send the compressed_bio itself. Also by having struct compressed_bio in btrfs_decompress_bio() it would help tracing. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: Refactor update_space_infoNikolay Borisov2017-06-191-40/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following the factoring out of the creation code udpate_space_info can only be called for already-existing space_info structs. As such it cannot fail. Remove superfluous error handling and make the function return void. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: Separate space_info create/updateNikolay Borisov2017-06-191-66/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the struct space_info creation code is intermixed in the udpate_space_info function. There are well-defined points at which the we actually want to create brand-new space_info structs (e.g. during mount of the filesystem as well as sometimes when adding/initialising new chunks). In such cases update_space_info is called with 0 as the bytes parameter. All of this makes for spaghetti code. Fix it by factoring out the creation code in a separate create_space_info structure. This also allows to simplify the internals. Also remove BUG_ON from do_alloc_chunk since the callers handle errors. Furthermore it will make the update_space_info function not fail, allowing us to remove error handling in callers. This will come in a follow up patch. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: let btrfs_print_leaf print more about block groupLiu Bo2017-06-191-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds chunk_objectid and flags, with flags we can recognize whether the block group is about data or metadata. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: skip commit transaction if we don't have enough pinned bytesLiu Bo2017-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We commit transaction in order to reclaim space from pinned bytes because it could process delayed refs, and in may_commit_transaction(), we check first if pinned bytes are enough for the required space, we then check if that plus bytes reserved for delayed insert are enough for the required space. This changes the code to the above logic. Fixes: b150a4f10d87 ("Btrfs: use a percpu to keep track of possibly pinned bytes") Tested-by: Nikolay Borisov <nborisov@suse.com> Reported-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: scrub: simplify cleanup of wr_ctx in scrub_free_ctxDavid Sterba2017-06-191-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't need to take the mutex and zero out wr_cur_bio, as this is called after the scrub finished. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: scrub: inline helper scrub_free_wr_ctxDavid Sterba2017-06-191-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The helper scrub_free_wr_ctx is used only once and fits into scrub_free_ctx as it continues sctx shutdown, no need to keep it separate. Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: scrub: inline helper scrub_setup_wr_ctxDavid Sterba2017-06-191-27/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The helper scrub_setup_wr_ctx is used only once and fits into scrub_setup_ctx as it continues intialization, no need to keep it separate. Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: remove root usage from can_overcommitJeff Mahoney2017-06-191-36/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | can_overcommit using the root to determine the allocation profile is the only use of a root in the call graph below reserve_metadata_bytes. It turns out that we only need to know whether the allocation is for the chunk root or not -- and we can pass that around as a bool instead. This allows us to pull root usage out of the reservation path all the way up to reserve_metadata_bytes itself, which uses it only to compare against fs_info->chunk_root to set the bool. In turn, this eliminates a bunch of races where we use a particular root too early in the mount process. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: cleanup root usage by btrfs_get_alloc_profileJeff Mahoney2017-06-195-15/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two places where we don't already know what kind of alloc profile we need before calling btrfs_get_alloc_profile, but we need access to a root everywhere we call it. This patch adds helpers for btrfs_{data,metadata,system}_alloc_profile() and relegates btrfs_system_alloc_profile to a static for use in those two cases. The next patch will eliminate one of those. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: fix bool type in btrfs_page_exists_in_rangeDavid Sterba2017-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use only a simple bool indicator, int is not a problem here. Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: remove unused member list from btrfs_end_io_wqDavid Sterba2017-06-191-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The end io work queue items have been tracked by the work queues since "Btrfs: Add async worker threads for pre and post IO checksumming" (8b7128429235d9bd72cfd5e) (2008). Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: remove unused members dir_path from recorded_refDavid Sterba2017-06-191-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The two members do not seem to be used since the initial commit. Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: remove unused member list from async_submit_bioDavid Sterba2017-06-191-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The list used to track checksums in the early version (2.6.29), but I was able not pinpoint the commit that stopped using it. Everything apparently works without it for a long time. Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: remove unused member err from reada_extentDavid Sterba2017-06-191-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Seems to be unused since the initial commit, we ignore readahead errors anyway, the full read will handle that if necessary. Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: Remove unnecessary branching in free-space-tree.cSahil Kang2017-06-191-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both btrfs_create_free_space_tree and btrfs_clear_free_space_tree contain: if (ret) return ret; return 0; The if statement is only false when ret equals zero, and since we return zero in such cases, we can safely remove the branching. Signed-off-by: Sahil Kang <sahil.kang@asilaycomputing.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: hardcode GFP_NOFS for btrfs_bio_clone_partialLiu Bo2017-06-193-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We only pass GFP_NOFS to btrfs_bio_clone_partial, so lets hardcode it. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: work around maybe-uninitialized warningArnd Bergmann2017-06-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A rewrite of btrfs_submit_direct_hook appears to have introduced a warning: fs/btrfs/inode.c: In function 'btrfs_submit_direct_hook': fs/btrfs/inode.c:8467:14: error: 'bio' may be used uninitialized in this function [-Werror=maybe-uninitialized] Where the 'bio' variable was previously initialized unconditionally, it is now set in the "while (submit_len > 0)" loop that would never execute if submit_len is zero. Assuming this cannot happen in practice, we can avoid the warning by simply replacing the while{} loop with a do{}while() loop so the compiler knows that it will always be entered at least once. Fixes changes introduced in "Btrfs: use bio_clone_bioset_partial to simplify DIO submit". Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: unify naming of btrfs_io_bioLiu Bo2017-06-191-19/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All dio endio functions are using io_bio for struct btrfs_io_bio, this makes btrfs_submit_direct to follow this convention. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: check-integrity use bvec_iterLiu Bo2017-06-191-12/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some check-integrity code depends on bio->bi_vcnt, this changes it to use bio segments because some bios passing here may not have a reliable bi_vcnt. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: record error if one block has failed to retryLiu Bo2017-06-191-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the nocsum case of dio read endio, it returns immediately if an error gets returned when repairing, which leaves the rest blocks unrepaired. The behavior is different from how buffered read endio works in the same case. This changes it to record error only and go on repairing the rest blocks. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: change how we iterate bios in endioLiu Bo2017-06-194-32/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since dio submit has used bio_clone_fast, the submitted bio may not have a reliable bi_vcnt, for the bio vector iterations in checksum related functions, bio->bi_iter is not modified yet and it's safe to use bio_for_each_segment, while for those bio vector iterations in dio read's endio, we now save a copy of bvec_iter in struct btrfs_io_bio when cloning bios and use the helper __bio_for_each_segment with the saved bvec_iter to access each bvec. Also for dio reads which don't get split, we also need to save a copy of bio iterator in btrfs_bio_clone to let __bio_for_each_segments to access each bvec in dio read's endio. Note that it doesn't affect other calls of btrfs_bio_clone() because they don't need to use this iterator. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: use bio_clone_bioset_partial to simplify DIO submitLiu Bo2017-06-191-74/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently when mapping bio to limit bio to a single stripe length, we split bio by adding page to bio one by one, but later we don't modify the vector of bio at all, thus we can use bio_clone_fast to use the original bio vector directly. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: new helper btrfs_bio_clone_partialLiu Bo2017-06-192-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a new helper btrfs_bio_clone_partial, it'll allocate a cloned bio that only owns a part of the original bio's data. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: use bio_clone_fast to clone our bioLiu Bo2017-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For raid1 and raid10, we clone the original bio to the bios which are then sent to different disks. Right now we use bio_clone_bioset to create a clone bio with iterating bi_io_vec to initialize it. This changes it to use bio_clone_fast() which creates a clone bio but only copies the bi_io_vec pointer instead of iterating bi_io_vec. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: don't pass the inode through clean_io_failureJosef Bacik2017-06-193-37/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead pass around the failure tree and the io tree. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: remove inode argument from repair_io_failureJosef Bacik2017-06-193-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once we remove the btree_inode we won't have an inode to pass anymore, just pass the fs_info directly and the inum since we use that to print out the repair message. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: replace tree->mapping with tree->private_dataJosef Bacik2017-06-199-91/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For extent_io tree's we have carried the address_mapping of the inode around in the io tree in order to pull the inode back out for calling into various tree ops hooks. This works fine when everything that has an extent_io_tree has an inode. But we are going to remove the btree_inode, so we need to change this. Instead just have a generic void * for private data that we can initialize with, and have all the tree ops use that instead. This had a lot of cascading changes but should be relatively straightforward. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: David Sterba <dsterba@suse.com> [ minor reordering of the callback prototypes ] Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: Add quota_override knob into sysfsSargun Dhillon2017-06-191-0/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the read-write attribute quota_override into sysfs. Any process which has CAP_SYS_RESOURCE can set this flag to on, and once it is set to true, processes with CAP_SYS_RESOURCE can exceed the quota. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Reviewed-by: David Sterba <dsterba@suse.com> [ minor changelog edits ] Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: add quota override flag to enable quota override for CAP_SYS_RESOURCESargun Dhillon2017-06-192-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the quota override flag to btrfs_fs_info, and a change to quota limit checking code to temporarily allow for quota to be overridden for processes with CAP_SYS_RESOURCE. It's useful for administrative programs, such as log rotation, that may need to temporarily use more disk space in order to free up a greater amount of overall disk space without yielding more disk space to the rest of userland. Eventually, we may want to add the idea of an operator-specific quota, operator reserved space, or something else to allow for administrative override, but this is perhaps the simplest solution. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Reviewed-by: David Sterba <dsterba@suse.com> [ minor changelog edits ] Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: Convert fs_info->free_chunk_space to atomic64_tNikolay Borisov2017-06-194-26/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ->free_chunk_space variable is used to track the unallocated space and access to it is protected by a spinlock, which is not used for anything else. Make the code a bit self-explanatory by switching the variable to an atomic64_t type and kill the spinlock. Signed-off-by: Nikolay Borisov <nborisov@suse.com> [ not a performance critical code, use of atomic type is ok ] Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: add framework to handle device flush error as a volumeAnand Jain2017-06-192-4/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds comments to the flush error handling part of the code, and hopes to maintain the same logic with a framework which can be used to handle the errors at the volume level. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: remove obsolete FIXMEs in qgroup ioctlsDaichou2017-06-191-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These FIXMEs were already addressed in 2013. All functions check for qgroup existence: * btrfs_add_qgroup_relation * btrfs_ioctl_qgroup_create * btrfs_limit_qgroup * btrfs_del_qgroup_relation Signed-off-by: Daichou <tommy0705c@gmail.com> [ enhance and reformat changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | Btrfs: remove an unused variableDan Carpenter2017-06-191-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "item" is never used. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | btrfs: kmap() can't failFabian Frederick2017-06-191-7/+1
| | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove NULL test on kmap() as it will always return a valid pointer. Signed-off-by: Fabian Frederick <fabf@skynet.be> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* | | | | Merge branch 'for-4.13/block' of git://git.kernel.dk/linux-blockLinus Torvalds2017-07-0315-168/+197
|\ \ \ \ \ | |/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull core block/IO updates from Jens Axboe: "This is the main pull request for the block layer for 4.13. Not a huge round in terms of features, but there's a lot of churn related to some core cleanups. Note this depends on the UUID tree pull request, that Christoph already sent out. This pull request contains: - A series from Christoph, unifying the error/stats codes in the block layer. We now use blk_status_t everywhere, instead of using different schemes for different places. - Also from Christoph, some cleanups around request allocation and IO scheduler interactions in blk-mq. - And yet another series from Christoph, cleaning up how we handle and do bounce buffering in the block layer. - A blk-mq debugfs series from Bart, further improving on the support we have for exporting internal information to aid debugging IO hangs or stalls. - Also from Bart, a series that cleans up the request initialization differences across types of devices. - A series from Goldwyn Rodrigues, allowing the block layer to return failure if we will block and the user asked for non-blocking. - Patch from Hannes for supporting setting loop devices block size to that of the underlying device. - Two series of patches from Javier, fixing various issues with lightnvm, particular around pblk. - A series from me, adding support for write hints. This comes with NVMe support as well, so applications can help guide data placement on flash to improve performance, latencies, and write amplification. - A series from Ming, improving and hardening blk-mq support for stopping/starting and quiescing hardware queues. - Two pull requests for NVMe updates. Nothing major on the feature side, but lots of cleanups and bug fixes. From the usual crew. - A series from Neil Brown, greatly improving the bio rescue set support. Most notably, this kills the bio rescue work queues, if we don't really need them. - Lots of other little bug fixes that are all over the place" * 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits) lightnvm: pblk: set line bitmap check under debug lightnvm: pblk: verify that cache read is still valid lightnvm: pblk: add initialization check lightnvm: pblk: remove target using async. I/Os lightnvm: pblk: use vmalloc for GC data buffer lightnvm: pblk: use right metadata buffer for recovery lightnvm: pblk: schedule if data is not ready lightnvm: pblk: remove unused return variable lightnvm: pblk: fix double-free on pblk init lightnvm: pblk: fix bad le64 assignations nvme: Makefile: remove dead build rule blk-mq: map all HWQ also in hyperthreaded system nvmet-rdma: register ib_client to not deadlock in device removal nvme_fc: fix error recovery on link down. nvmet_fc: fix crashes on bad opcodes nvme_fc: Fix crash when nvme controller connection fails. nvme_fc: replace ioabort msleep loop with completion nvme_fc: fix double calls to nvme_cleanup_cmd() nvme-fabrics: verify that a controller returns the correct NQN nvme: simplify nvme_dev_attrs_are_visible ...
| * | | | btrfs: add support for passing in write hints for buffered writesJens Axboe2017-06-271-0/+1
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Chris Mason <clm@fb.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | btrfs: use new block error codeDan Carpenter2017-06-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function is supposed to return blk_status_t error codes now but there was a stray -ENOMEM left behind. Fixes: 4e4cbee93d56 ("block: switch bios to blk_status_t") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | btrfs: nowait aio supportGoldwyn Rodrigues2017-06-202-6/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return EAGAIN if any of the following checks fail + i_rwsem is not lockable + NODATACOW or PREALLOC is not set + Cannot nocow at the desired location + Writing beyond end of file which is not allocated Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | blk: replace bioset_create_nobvec() with a flags arg to bioset_create()NeilBrown2017-06-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "flags" arguments are often seen as good API design as they allow easy extensibility. bioset_create_nobvec() is implemented internally as a variation in flags passed to __bioset_create(). To support future extension, make the internal structure part of the API. i.e. add a 'flags' argument to bioset_create() and discard bioset_create_nobvec(). Note that the bio_split allocations in drivers/md/raid* do not need the bvec mempool - they should have used bioset_create_nobvec(). Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | Merge tag 'v4.12-rc5' into for-4.13/blockJens Axboe2017-06-126-16/+139
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've already got a few conflicts and upcoming work depends on some of the changes that have gone into mainline as regression fixes for this series. Pull in 4.12-rc5 to resolve these conflicts and make it easier on down stream trees to continue working on 4.13 changes. Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | block: switch bios to blk_status_tChristoph Hellwig2017-06-0914-157/+160
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep arround at least for now and thus propagate to a proper blk_status_t value. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
OpenPOWER on IntegriCloud