summaryrefslogtreecommitdiffstats
path: root/fs/f2fs
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-blockLinus Torvalds2015-09-021-6/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull core block updates from Jens Axboe: "This first core part of the block IO changes contains: - Cleanup of the bio IO error signaling from Christoph. We used to rely on the uptodate bit and passing around of an error, now we store the error in the bio itself. - Improvement of the above from myself, by shrinking the bio size down again to fit in two cachelines on x86-64. - Revert of the max_hw_sectors cap removal from a revision again, from Jeff Moyer. This caused performance regressions in various tests. Reinstate the limit, bump it to a more reasonable size instead. - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me. Most devices have huge trim limits, which can cause nasty latencies when deleting files. Enable the admin to configure the size down. We will look into having a more sane default instead of UINT_MAX sectors. - Improvement of the SGP gaps logic from Keith Busch. - Enable the block core to handle arbitrarily sized bios, which enables a nice simplification of bio_add_page() (which is an IO hot path). From Kent. - Improvements to the partition io stats accounting, making it faster. From Ming Lei. - Also from Ming Lei, a basic fixup for overflow of the sysfs pending file in blk-mq, as well as a fix for a blk-mq timeout race condition. - Ming Lin has been carrying Kents above mentioned patches forward for a while, and testing them. Ming also did a few fixes around that. - Sasha Levin found and fixed a use-after-free problem introduced by the bio->bi_error changes from Christoph. - Small blk cgroup cleanup from Viresh Kumar" * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits) blk: Fix bio_io_vec index when checking bvec gaps block: Replace SG_GAPS with new queue limits mask block: bump BLK_DEF_MAX_SECTORS to 2560 Revert "block: remove artifical max_hw_sectors cap" blk-mq: fix race between timeout and freeing request blk-mq: fix buffer overflow when reading sysfs file of 'pending' Documentation: update notes in biovecs about arbitrarily sized bios block: remove bio_get_nr_vecs() fs: use helper bio_add_page() instead of open coding on bi_io_vec block: kill merge_bvec_fn() completely md/raid5: get rid of bio_fits_rdev() md/raid5: split bio for chunk_aligned_read block: remove split code in blkdev_issue_{discard,write_same} btrfs: remove bio splitting and merge_bvec_fn() calls bcache: remove driver private bio splitting code block: simplify bio_add_page() block: make generic_make_request handle arbitrarily sized bios blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) block: don't access bio->bi_error after bio_put() block: shrink struct bio down to 2 cache lines again ...
| * block: remove bio_get_nr_vecs()Kent Overstreet2015-08-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | We can always fill up the bio now, no need to estimate the possible size based on queue parameters. Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> [hch: rebased and wrote a changelog] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * block: add a bi_error field to struct bioChristoph Hellwig2015-07-291-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we have two different ways to signal an I/O error on a BIO: (1) by clearing the BIO_UPTODATE flag (2) by returning a Linux errno value to the bi_end_io callback The first one has the drawback of only communicating a single possible error (-EIO), and the second one has the drawback of not beeing persistent when bios are queued up, and are not passed along from child to parent bio in the ever more popular chaining scenario. Having both mechanisms available has the additional drawback of utterly confusing driver authors and introducing bugs where various I/O submitters only deal with one of them, and the others have to add boilerplate code to deal with both kinds of error returns. So add a new bi_error field to store an errno value directly in struct bio and remove the existing mechanisms to clean all this up. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* | f2fs: call set_page_dirty to attach i_wb for cgroupJaegeuk Kim2015-07-255-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cgroup attaches inode->i_wb via mark_inode_dirty and when set_page_writeback is called, __inc_wb_stat() updates i_wb's stat. So, we need to explicitly call set_page_dirty->__mark_inode_dirty in prior to any writebacking pages. This patch should resolve the following kernel panic reported by Andreas Reis. https://bugzilla.kernel.org/show_bug.cgi?id=101801 --- Comment #2 from Andreas Reis <andreas.reis@gmail.com> --- BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 IP: [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90 PGD 2951ff067 PUD 2df43f067 PMD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 7 PID: 10356 Comm: gcc Tainted: G W 4.2.0-1-cu #1 Hardware name: Gigabyte Technology Co., Ltd. G1.Sniper M5/G1.Sniper M5, BIOS T01 02/03/2015 task: ffff880295044f80 ti: ffff880295140000 task.ti: ffff880295140000 RIP: 0010:[<ffffffff8149deea>] [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90 RSP: 0018:ffff880295143ac8 EFLAGS: 00010082 RAX: 0000000000000003 RBX: ffffea000a526d40 RCX: 0000000000000001 RDX: 0000000000000020 RSI: 0000000000000001 RDI: 0000000000000088 RBP: ffff880295143ae8 R08: 0000000000000000 R09: ffff88008f69bb30 R10: 00000000fffffffa R11: 0000000000000000 R12: 0000000000000088 R13: 0000000000000001 R14: ffff88041d099000 R15: ffff880084a205d0 FS: 00007f8549374700(0000) GS:ffff88042f3c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000a8 CR3: 000000033e1d5000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Stack: 0000000000000000 ffffea000a526d40 ffff880084a20738 ffff880084a20750 ffff880295143b48 ffffffff811cc91e ffff880000000000 0000000000000296 0000000000000000 ffff880417090198 0000000000000000 ffffea000a526d40 Call Trace: [<ffffffff811cc91e>] __test_set_page_writeback+0xde/0x1d0 [<ffffffff813fee87>] do_write_data_page+0xe7/0x3a0 [<ffffffff813faeea>] gc_data_segment+0x5aa/0x640 [<ffffffff813fb0b8>] do_garbage_collect+0x138/0x150 [<ffffffff813fb3fe>] f2fs_gc+0x1be/0x3e0 [<ffffffff81405541>] f2fs_balance_fs+0x81/0x90 [<ffffffff813ee357>] f2fs_unlink+0x47/0x1d0 [<ffffffff81239329>] vfs_unlink+0x109/0x1b0 [<ffffffff8123e3d7>] do_unlinkat+0x287/0x2c0 [<ffffffff8123ebc6>] SyS_unlink+0x16/0x20 [<ffffffff81942e2e>] entry_SYSCALL_64_fastpath+0x12/0x71 Code: 41 5e 5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 49 89 f5 41 54 49 89 fc 53 48 83 ec 08 65 ff 05 e6 d9 b6 7e <48> 8b 47 20 48 63 ca 65 8b 18 48 63 db 48 01 f3 48 39 cb 7d 0a RIP [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90 RSP <ffff880295143ac8> CR2: 00000000000000a8 ---[ end trace 5132449a58ed93a3 ]--- note: gcc[10356] exited with preempt_count 2 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* | f2fs: handle error cases in move_encrypted_blockJaegeuk Kim2015-07-251-8/+15
|/ | | | | | | This patch fixes some missing error handlers. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-blockLinus Torvalds2015-06-252-3/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cgroup writeback support from Jens Axboe: "This is the big pull request for adding cgroup writeback support. This code has been in development for a long time, and it has been simmering in for-next for a good chunk of this cycle too. This is one of those problems that has been talked about for at least half a decade, finally there's a solution and code to go with it. Also see last weeks writeup on LWN: http://lwn.net/Articles/648292/" * 'for-4.2/writeback' of git://git.kernel.dk/linux-block: (85 commits) writeback, blkio: add documentation for cgroup writeback support vfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWB writeback: do foreign inode detection iff cgroup writeback is enabled v9fs: fix error handling in v9fs_session_init() bdi: fix wrong error return value in cgwb_create() buffer: remove unusued 'ret' variable writeback: disassociate inodes from dying bdi_writebacks writeback: implement foreign cgroup inode bdi_writeback switching writeback: add lockdep annotation to inode_to_wb() writeback: use unlocked_inode_to_wb transaction in inode_congested() writeback: implement unlocked_inode_to_wb transaction and use it for stat updates writeback: implement [locked_]inode_to_wb_and_lock_list() writeback: implement foreign cgroup inode detection writeback: make writeback_control track the inode being written back writeback: relocate wb[_try]_get(), wb_put(), inode_{attach|detach}_wb() mm: vmscan: disable memcg direct reclaim stalling if cgroup writeback support is in use writeback: implement memcg writeback domain based throttling writeback: reset wb_domain->dirty_limit[_tstmp] when memcg domain size changes writeback: implement memcg wb_domain writeback: update wb_over_bg_thresh() to use wb_domain aware operations ...
| * writeback: separate out include/linux/backing-dev-defs.hTejun Heo2015-06-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the planned cgroup writeback support, backing-dev related declarations will be more widely used across block and cgroup; unfortunately, including backing-dev.h from include/linux/blkdev.h makes cyclic include dependency quite likely. This patch separates out backing-dev-defs.h which only has the essential definitions and updates blkdev.h to include it. c files which need access to more backing-dev details now include backing-dev.h directly. This takes backing-dev.h off the common include dependency chain making it a lot easier to use it across block and cgroup. v2: fs/fat build failure fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@fb.com>
| * writeback: move bandwidth related fields from backing_dev_info into ↵Tejun Heo2015-06-022-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bdi_writeback Currently, a bdi (backing_dev_info) embeds single wb (bdi_writeback) and the role of the separation is unclear. For cgroup support for writeback IOs, a bdi will be updated to host multiple wb's where each wb serves writeback IOs of a different cgroup on the bdi. To achieve that, a wb should carry all states necessary for servicing writeback IOs for a cgroup independently. This patch moves bandwidth related fields from backing_dev_info into bdi_writeback. * The moved fields are: bw_time_stamp, dirtied_stamp, written_stamp, write_bandwidth, avg_write_bandwidth, dirty_ratelimit, balanced_dirty_ratelimit, completions and dirty_exceeded. * writeback_chunk_size() and over_bground_thresh() now take @wb instead of @bdi. * bdi_writeout_fraction(bdi, ...) -> wb_writeout_fraction(wb, ...) bdi_dirty_limit(bdi, ...) -> wb_dirty_limit(wb, ...) bdi_position_ration(bdi, ...) -> wb_position_ratio(wb, ...) bdi_update_writebandwidth(bdi, ...) -> wb_update_write_bandwidth(wb, ...) [__]bdi_update_bandwidth(bdi, ...) -> [__]wb_update_bandwidth(wb, ...) bdi_{max|min}_pause(bdi, ...) -> wb_{max|min}_pause(wb, ...) bdi_dirty_limits(bdi, ...) -> wb_dirty_limits(wb, ...) * Init/exits of the relocated fields are moved to bdi_wb_init/exit() respectively. Note that explicit zeroing is dropped in the process as wb's are cleared in entirety anyway. * As there's still only one bdi_writeback per backing_dev_info, all uses of bdi->stat[] are mechanically replaced with bdi->wb.stat[] introducing no behavior changes. v2: Typo in description fixed as suggested by Jan. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* | Merge tag 'for-f2fs-4.2' of ↵Linus Torvalds2015-06-2429-616/+3775
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "New features: - per-file encryption (e.g., ext4) - FALLOC_FL_ZERO_RANGE - FALLOC_FL_COLLAPSE_RANGE - RENAME_WHITEOUT Major enhancement/fixes: - recovery broken superblocks - enhance f2fs_trim_fs with a discard_map - fix a race condition on dentry block allocation - fix a deadlock during summary operation - fix a missing fiemap result .. and many minor bug fixes and clean-ups were done" * tag 'for-f2fs-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (83 commits) f2fs: do not trim preallocated blocks when truncating after i_size f2fs crypto: add alloc_bounce_page f2fs crypto: fix to handle errors likewise ext4 f2fs: drop the volatile_write flag only f2fs: skip committing valid superblock f2fs: setting discard option in parse_options() f2fs: fix to return exact trimmed size f2fs: support FALLOC_FL_INSERT_RANGE f2fs: hide common code in f2fs_replace_block f2fs: disable the discard option when device doesn't support f2fs crypto: remove alloc_page for bounce_page f2fs: fix a deadlock for summary page lock vs. sentry_lock f2fs crypto: clean up error handling in f2fs_fname_setup_filename f2fs crypto: avoid f2fs_inherit_context for symlink f2fs crypto: do not set encryption policy for non-directory by ioctl f2fs crypto: allow setting encryption policy once f2fs crypto: check context consistent for rename2 f2fs: avoid duplicated code by reusing f2fs_read_end_io f2fs crypto: use per-inode tfm structure f2fs: recovering broken superblock during mount ...
| * | f2fs: do not trim preallocated blocks when truncating after i_sizeChao Yu2015-06-111-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we perform generic/092 in xfstests, output is like below: XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) 0: [0..10239]: data 0: [0..10239]: data -1: [10240..20479]: unwritten +1: [10240..14335]: unwritten This is because with this testcase, we redefine the regulation for truncate in perallocated space past i_size as below: "There was some confused about what the fs was supposed to do when you truncate at i_size with preallocated space past i_size. We decided on the following things. 1) truncate(i_size) will trim all blocks past i_size. 2) truncate(x) where x > i_size will not trim all blocks past i_size. " This method is used in xfs, and then ext4/btrfs will follow the rule. This patch fixes to follow the new rule for f2fs. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: add alloc_bounce_pageJaegeuk Kim2015-06-111-8/+15
| | | | | | | | | | | | | | | | | | This patch adds alloc_bounce_page likewise ext4. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: fix to handle errors likewise ext4Jaegeuk Kim2015-06-111-3/+3
| | | | | | | | | | | | | | | | | | This patch makes some error handling policies same with ext4. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: drop the volatile_write flag onlyJaegeuk Kim2015-06-091-4/+2
| | | | | | | | | | | | | | | | | | | | | When aborting volatile_writes, let's drop its flag and give up any further volatile_writes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: skip committing valid superblockChao Yu2015-06-083-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In recovery procedure for superblock, we try to write data of valid superblock into invalid one for recovery, work should be finished here, but then still we will write the valid one with its original data. This operation is not needed. Let's skip doing this unnecessary work. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: setting discard option in parse_options()Chao Yu2015-06-081-11/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the first mount of f2fs image with realtime discard option, we will disable discard option if device is not supported, but for remount operation, our discard option can still be set, this should be avoided. This patch moves configuring of discard option to parse_options() to fix this issue. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: fix to return exact trimmed sizeJaegeuk Kim2015-06-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Now, we add all the candidates for trim commands and then finally issue discard commands. So, we should count the trimmed size in back-end. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: support FALLOC_FL_INSERT_RANGEChao Yu2015-06-021-2/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | FALLOC_FL_INSERT_RANGE flag for ->fallocate was introduced in commit dd46c787788d ("fs: Add support FALLOC_FL_INSERT_RANGE for fallocate"). The effect of FALLOC_FL_INSERT_RANGE command is the opposite of FALLOC_FL_COLLAPSE_RANGE, if this command was performed, all data from offset to EOF in our file will be shifted to right as given length, and then range [offset, offset + length] becomes a hole. This command is useful for our user who wants to add some data in the middle of the file, for example: video/music editor will insert a keyframe in specified position of media file, with this command we can easily create a hole for inserting without removing original data. This patch introduces f2fs_insert_range() to support FALLOC_FL_INSERT_RANGE. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Yuan Zhong <yuan.mark.zhong@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: hide common code in f2fs_replace_blockChao Yu2015-06-024-20/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch clean up codes through: 1.rename f2fs_replace_block to __f2fs_replace_block(). 2.introduce new f2fs_replace_block() to include __f2fs_replace_block() and some common related codes around __f2fs_replace_block(). Then, newly introduced function f2fs_replace_block can be used by following patch. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: disable the discard option when device doesn't supportChenxi Mao2015-06-011-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current f2fs check the whether the blk device can support discard. However, the code will cause the discard option cannot be enabled. Because the clear_opt(sbi, DISCARD) will be invoked forever. This patch can fix this issue. Jaegeuk Kim: The original patch was intended to disable the discard option when device does not support trim command. Rather than remaining the buggy patch, let's replace with this patch as an integrated one. Signed-off-by: Chenxi Mao <chenxi.mao2013@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: remove alloc_page for bounce_pageJaegeuk Kim2015-06-012-23/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | We don't need to call alloc_page() prior to mempool_alloc(), since the mempool_alloc() calls alloc_page() internally. And, if __GFP_WAIT is set, it never fails on page allocation, so let's give GFP_NOWAIT and handle ENOMEM by writepage(). Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: fix a deadlock for summary page lock vs. sentry_lockJaegeuk Kim2015-06-011-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In f2fs_gc: In f2fs_replace_block: - lock_page(sum_page) - check_valid_map() - mutex_lock(sentry_lock) - mutex_lock(sentry_lock) - change_curseg() - lock_page(sum_page) This patch fixes the deadlock condition. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: clean up error handling in f2fs_fname_setup_filenameJaegeuk Kim2015-06-011-14/+10
| | | | | | | | | | | | | | | | | | | | | Sync with: ext4 crypto: clean up error handling in ext4_fname_setup_filename Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: avoid f2fs_inherit_context for symlinkJaegeuk Kim2015-06-011-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes to call f2fs_inherit_context twice for newly created symlink. The original one is called by f2fs_add_link(), which invokes f2fs_setxattr. If the second one is called again, f2fs_setxattr is triggered again with same encryption index. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: do not set encryption policy for non-directory by ioctlChao Yu2015-06-012-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Encryption policy should only be set to an empty directory through ioctl, This patch add a judgement condition to verify type of the target inode to avoid incorrectly configuring for non-directory. Additionally, remove unneeded inline data conversion since regular or symlink file should not be processed here. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: allow setting encryption policy onceChao Yu2015-06-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch add XATTR_CREATE flag in setxattr when setting encryption context for inode. Without this flag the context could be set more than once, this should never happen. So, fix it. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: check context consistent for rename2Chao Yu2015-06-011-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For exchange rename, we should check context consistent of encryption between new_dir and old_inode or old_dir and new_inode. Otherwise inheritance of parent's encryption context will be broken. Signed-off-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: sync with ext4 approach] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: avoid duplicated code by reusing f2fs_read_end_ioChao Yu2015-06-011-28/+4
| | | | | | | | | | | | | | | | | | | | | | | | This patch tries to clean up code because part code of f2fs_read_end_io and mpage_end_io are the same, so it's better to merge and reuse them. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: use per-inode tfm structureJaegeuk Kim2015-06-019-167/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch applies the following ext4 patch: ext4 crypto: use per-inode tfm structure As suggested by Herbert Xu, we shouldn't allocate a new tfm each time we read or write a page. Instead we can use a single tfm hanging off the inode's crypt_info structure for all of our encryption needs for that inode, since the tfm can be used by multiple crypto requests in parallel. Also use cmpxchg() to avoid races that could result in crypt_info structure getting doubly allocated or doubly freed. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: recovering broken superblock during mounthujianyang2015-06-011-13/+52
| | | | | | | | | | | | | | | | | | | | | | | | This patch recovers a broken superblock with the other valid one. Signed-off-by: hujianyang <hujianyang@huawei.com> [Jaegeuk Kim: reinitialize local variables in f2fs_fill_super for retrial] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: check encryption for tmpfileJaegeuk Kim2015-06-011-0/+6
| | | | | | | | | | | | | | | | | | This patch adds to check encryption for tmpfile in early stage. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: support RENAME_WHITEOUTChao Yu2015-06-011-53/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the description of rename in manual, RENAME_WHITEOUT is a special operation that only makes sense for overlay/union type filesystem. When performing rename with RENAME_WHITEOUT, dst will be replace with src, and meanwhile, a 'whiteout' will be create with name of src. A "whiteout" is designed to be a char device with 0,0 device number, it has specially meaning for stackable filesystem. In these filesystems, there are multiple layers exist, and only top of these can be modified. So a whiteout in top layer is used to hide a corresponding file in lower layer, as well removal of whiteout will make the file appear. Now in overlayfs, when we rename a file which is exist in lower layer, it will be copied up to upper if it is not on upper layer yet, and then rename it on upper layer, source file will be whiteouted to hide corresponding file in lower layer at the same time. So in upper layer filesystem, implementation of RENAME_WHITEOUT provide a atomic operation for stackable filesystem to support rename operation. There are multiple ways to implement RENAME_WHITEOUT in log of this commit: 7dcf5c3e4527 ("xfs: add RENAME_WHITEOUT support") which pointed out by Dave Chinner. For now, we just try to follow the way that xfs/ext4 use. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: introduce update_meta_pageChao Yu2015-06-013-30/+22
| | | | | | | | | | | | | | | | | | | | | | | | Add a help function update_meta_page() to update meta page with specified buffer. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: zero next free dnode blockChao Yu2015-06-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Now page cache of meta inode is used by garbage collection for encrypted page, it may contain random data, so we should zero it before issuing discard. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: split f2fs_crypto_init/exit with two partsJaegeuk Kim2015-06-014-47/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch splits f2fs_crypto_init/exit with two parts: base initialization and memory allocation. Firstly, f2fs module declares the base encryption memory pointers. Then, allocating internal memories is done at the first encrypted inode access. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: fix incorrect release for crypto ctxChao Yu2015-06-011-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When encryption feature is enable, if we rmmod f2fs module, we will encounter a stack backtrace reported in syslog: "BUG: Bad page state in process rmmod pfn:aaf8a page:f0f4f148 count:0 mapcount:129 mapping:ee2f4104 index:0x80 flags: 0xee2830a4(referenced|lru|slab|private_2|writeback|swapbacked|mlocked) page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set bad because of flags: flags: 0x2030a0(lru|slab|private_2|writeback|mlocked) Modules linked in: f2fs(O-) fuse bnep rfcomm bluetooth dm_crypt binfmt_misc snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device joydev ppdev mac_hid lp hid_generic i2c_piix4 parport_pc psmouse snd serio_raw parport soundcore ext4 jbd2 mbcache usbhid hid e1000 [last unloaded: f2fs] CPU: 1 PID: 3049 Comm: rmmod Tainted: G B O 4.1.0-rc3+ #10 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 00000000 00000000 c0021eb4 c15b7518 f0f4f148 c0021ed8 c112e0b7 c1779174 c9b75674 000aaf8a 01b13ce1 c17791a4 f0f4f148 ee2830a4 c0021ef8 c112e3c3 00000000 f0f4f148 c0021f34 f0f4f148 ee2830a4 ef9f0000 c0021f20 c112fdf8 Call Trace: [<c15b7518>] dump_stack+0x41/0x52 [<c112e0b7>] bad_page.part.72+0xa7/0x100 [<c112e3c3>] free_pages_prepare+0x213/0x220 [<c112fdf8>] free_hot_cold_page+0x28/0x120 [<c1073380>] ? try_to_wake_up+0x2b0/0x2b0 [<c112ff15>] __free_pages+0x25/0x30 [<c112c4fd>] mempool_free_pages+0xd/0x10 [<c112c5f1>] mempool_free+0x31/0x90 [<f0f441cf>] f2fs_exit_crypto+0x6f/0xf0 [f2fs] [<f0f456c4>] exit_f2fs_fs+0x23/0x95f [f2fs] [<c10c30e0>] SyS_delete_module+0x130/0x180 [<c11556d6>] ? vm_munmap+0x46/0x60 [<c15bd888>] sysenter_do_call+0x12/0x12" The reason is that: since commit 0827e645fd35 ("f2fs crypto: shrink size of the f2fs_crypto_ctx structure") is merged, some fields in f2fs_crypto_ctx structure are merged into a union as they will never be used simultaneously in write path, read path or on free list. In f2fs_exit_crypto, we traverse each crypto ctx from free list, in this moment, our free_list field in union is valid, but still we will try to release memory space which is pointed by other invalid field in union structure for each ctx. Then the error occurs, let's fix it with this patch. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: fix to release buffer for fname cryptoChao Yu2015-06-011-5/+5
| | | | | | | | | | | | | | | | | | | | | This patch fixes memory leak issue in error path of f2fs_fname_setup_filename(). Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: shrink size of the f2fs_crypto_ctx structureJaegeuk Kim2015-06-013-30/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch integrates the below patch into f2fs. "ext4 crypto: shrink size of the ext4_crypto_ctx structure Some fields are only used when the crypto_ctx is being used on the read path, some are only used on the write path, and some are only used when the structure is on free list. Optimize memory use by using a union." Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: get rid of ci_mode from struct f2fs_crypt_infoJaegeuk Kim2015-06-014-15/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch integrates the below patch into f2fs. "ext4 crypto: get rid of ci_mode from struct ext4_crypt_info The ci_mode field was superfluous, and getting rid of it gets rid of an unused hole in the structure." Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: use slab cachesJaegeuk Kim2015-06-013-32/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch integrates the below patch into f2fs. "ext4 crypto: use slab caches Use slab caches the ext4_crypto_ctx and ext4_crypt_info structures for slighly better memory efficiency and debuggability." Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: truncate data blocks for orphan inodeJaegeuk Kim2015-06-011-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As Hu reported, F2FS has a space leak problem, when conducting: 1) format a 4GB f2fs partition 2) dd a 3G file, 3) unlink it. So, when doing f2fs_drop_inode(), we need to truncate data blocks before skipping it. We can also drop unused caches assigned to each inode. Reported-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: cleanup a confusing indentDan Carpenter2015-06-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The return was not indented far enough so it looked like it was supposed to go with the other if statement. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: fix building on 32-bit architecturesArnd Bergmann2015-06-011-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A bug fix to the debug output extended the type of some local variables to 64-bit, which now causes the kernel to fail building because of missing 64-bit division functions: ERROR: "__aeabi_uldivmod" [fs/f2fs/f2fs.ko] undefined! In the kernel, we have to use div_u64 or do_div to do this, in order to annotate that this is an expensive operation. As the function is only called for debug out, we know this is not performance critical, so it is safe to use div_u64. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: d1f85bd38db19 ("f2fs: avoid value overflow in showing current status") Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: avoid buggy functionsJaegeuk Kim2015-06-011-0/+18
| | | | | | | | | | | | | | | | | | | | | This patch avoids to use a buggy function for now. It needs to fix them later. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: add compat_ioctl to provide backward compatabilityhujianyang2015-06-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | introduce compat_ioctl to regular files, but doesn't add this functionality to f2fs_dir_operations. While running a 32-bit busybox, I met an error like this: (A is a directory) chattr: reading flags on A: Inappropriate ioctl for device This patch copies compat_ioctl from f2fs_file_operations and fix this problem. Signed-off-by: hujianyang <hujianyang@huawei.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: do not issue next dnode discard redundantlyJaegeuk Kim2015-06-011-1/+14
| | | | | | | | | | | | | | | | | | We have a discard map, so that we can avoid redundant discard issues. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: add default mount options to remountYunlei He2015-05-281-13/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I use f2fs filesystem with /data partition on my Android phone by the default mount options. When I remount /data in order to adding discard option to run some benchmarks, I find the default options such as background_gc, user_xattr and acl turned off. So I introduce a function named default_options in super.c. It do some default setting, and both mount and remount operations will call this function to complete default setting. Signed-off-by: Yunlei He <heyunlei@huawei.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: remove checking key context during lookupJaegeuk Kim2015-05-281-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | No matter what the key is valid or not, readdir shows the dir entries correctly. So, lookup should not failed. But, we expect further accesses should be denied from open, rename, link, and so on. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: fix missing key when reading a pageJaegeuk Kim2015-05-281-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. mount $mnt 2. cp data $mnt/ 3. umount $mnt 4. log out 5. log in 6. cat $mnt/data -> panic, due to no i_crypt_info. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: add symlink encryptionJaegeuk Kim2015-05-283-6/+142
| | | | | | | | | | | | | | | | | | | | | | | | This patch implements encryption support for symlink. Signed-off-by: Uday Savagaonkar <savagaon@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs crypto: add filename encryption for roll-forward recoveryJaegeuk Kim2015-05-285-9/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a bit flag to indicate whether or not i_name in the inode is encrypted. If this name is encrypted, we can't do recover_dentry during roll-forward. So, f2fs_sync_file() needs to do checkpoint, if this will be needed in future. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
OpenPOWER on IntegriCloud