summaryrefslogtreecommitdiffstats
path: root/fs/f2fs/gc.c
Commit message (Collapse)AuthorAgeFilesLines
* f2fs: remove useless #include <linux/proc_fs.h> as we're now using sysfs as ↵Haicheng Li2013-04-301-1/+0
| | | | | | | debug entry. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: give a chance to merge IOs by IO schedulerJaegeuk Kim2013-04-261-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, background GC submits many 4KB read requests to load victim blocks and/or its (i)node blocks. ... f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0] f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0] f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0] ... However, by the fact that many IOs are sequential, we can give a chance to merge the IOs by IO scheduler. In order to do that, let's use blk_plug. ... f2fs_gc : f2fs_iget: ino = 143 f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee f2fs_gc : f2fs_iget: ino = 143 f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef <idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0] <idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0] <idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0] <idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0] <idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0] <idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0] <idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0] <idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0] <idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0] ... Note that this issue should be addressed in checkpoint, and some readahead flows too. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: avoid frequent background GCJaegeuk Kim2013-04-261-3/+0
| | | | | | | | | If there is no victim segments selected by background GC, let's wait a little bit longer time to collect dirty segments. By default, let's give 5 minutes. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add tracepoints for GC threadsNamjae Jeon2013-04-231-0/+5
| | | | | | | | | | | Add tracepoints for tracing the garbage collector threads in f2fs with status of collection & type. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [Jaegeuk: modify slightly to show information] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: write checkpoint before starting FG_GCJaegeuk Kim2013-04-091-1/+3
| | | | | | | In order to be aware of prefree and free sections during FG_GC, let's start with write_checkpoint(). Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: introduce a new global lock schemeJaegeuk Kim2013-04-091-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the previous version, f2fs uses global locks according to the usage types, such as directory operations, block allocation, block write, and so on. Reference the following lock types in f2fs.h. enum lock_type { RENAME, /* for renaming operations */ DENTRY_OPS, /* for directory operations */ DATA_WRITE, /* for data write */ DATA_NEW, /* for data allocation */ DATA_TRUNC, /* for data truncate */ NODE_NEW, /* for node allocation */ NODE_TRUNC, /* for node truncate */ NODE_WRITE, /* for node write */ NR_LOCK_TYPE, }; In that case, we lose the performance under the multi-threading environment, since every types of operations must be conducted one at a time. In order to address the problem, let's share the locks globally with a mutex array regardless of any types. So, let users grab a mutex and perform their jobs in parallel as much as possbile. For this, I propose a new global lock scheme as follows. 0. Data structure - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS] - f2fs_sb_info -> node_write 1. mutex_lock_op(sbi) - try to get an avaiable lock from the array. - returns the index of the gottern lock variable. 2. mutex_unlock_op(sbi, index of the lock) - unlock the given index of the lock. 3. mutex_lock_all(sbi) - grab all the locks in the array before the checkpoint. 4. mutex_unlock_all(sbi) - release all the locks in the array after checkpoint. 5. block_operations() - call mutex_lock_all() - sync_dirty_dir_inodes() - grab node_write - sync_node_pages() Note that, the pairs of mutex_lock_op()/mutex_unlock_op() and mutex_lock_all()/mutex_unlock_all() should be used together. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: avoid race for summary informationJaegeuk Kim2013-04-031-7/+1
| | | | | | | | In order to do GC more reliably, I'd like to lock the vicitm summary page until its GC is completed, and also prevent any checkpoint process. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: check completion of foreground GCJaegeuk Kim2013-04-031-12/+34
| | | | | | | | The foreground GCs are triggered under not enough free sections. So, we should not skip moving valid blocks in the victim segments. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: change GC bitmaps to apply the section granularityJaegeuk Kim2013-04-031-19/+24
| | | | | | | | | | | | | | | | | | | | | | This patch removes a bitmap for victim segments selected by foreground GC, and modifies the other bitmap for victim segments selected by background GC. 1) foreground GC bitmap : We don't need to manage this, since we just only one previous victim section number instead of the whole victim history. The f2fs uses the victim section number in order not to allocate currently GC'ed section to current active logs. 2) background GC bitmap : This bitmap is used to avoid selecting victims repeatedly by background GCs. In addition, the victims are able to be selected by foreground GCs, since there is no need to read victim blocks during foreground GCs. By the fact that the foreground GC reclaims segments in a section unit, it'd be better to manage this bitmap based on the section granularity. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix typo in commentsMasanari Iida2013-03-201-1/+1
| | | | | | | | Correct spelling typo in comments Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix calculation of max. gc cost in the SSR caseJaegeuk Kim2013-02-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | In the SSR case, the max gc cost should be the number of pages in a segment. Otherwise, f2fs is able to fail getting dirty segments frequently for SSR. In get_victim_by_default() previously, while(1) { ... cost = get_gc_cost(); <- cost is between 0 ~ 512. ... if (cost == get_max_cost(sbi, &p)) <- max cost is UINT_MAX due to GC_CB type continue; if (nsearched++ >= MAX_VICTIM_SEARCH) break; } So, if there are a number of fully valid segments in series, f2fs cannot skip those segments by comparing the cost and max cost of each segment. Note that, the cost is the number of valid blocks at the time of the last checkpoint. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: clarify and enhance the f2fs_gc flowJaegeuk Kim2013-02-121-66/+41
| | | | | | | | | | | | | | | | | | | | | | This patch makes clearer the ambiguous f2fs_gc flow as follows. 1. Remove intermediate checkpoint condition during f2fs_gc (i.e., should_do_checkpoint() and GC_BLOCKED) 2. Remove unnecessary return values of f2fs_gc because of #1. (i.e., GC_NODE, GC_OK, etc) 3. Simplify write_checkpoint() because of #2. 4. Clarify the main f2fs_gc flow. o monitor how many freed sections during one iteration of do_garbage_collect(). o do GC more without checkpoints if we can't get enough free sections. o do checkpoint once we've got enough free sections through forground GCs. 5. Adopt thread-logging (Slack-Space-Recycle) scheme more aggressively on data log types. See. get_ssr_segement() Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: mark gc_thread as NULL when thread creation is failedNamjae Jeon2013-02-121-0/+1
| | | | | | | | | | | | When gc thread creation is failed, mark gc_thread as NULL to avoid crash while trying to stop invalid thread in stop_gc_thread->kthread_stop. Instead make it return from: if (!gc_th) return; Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: name gc task as per the block deviceNamjae Jeon2013-02-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | Currently GC task is started for each f2fs formatted/mounted device. But, when we check the task list, using 'ps', there is no distinguishing factor between the tasks. So, name the task as per the block device just like the flusher threads. Also, remove the macro GC_THREAD_NAME and instead use the name: f2fs_gc to avoid name length truncation, as the command length is 16 -> TASK_COMM_LEN 16 and example name like: f2fs_gc_task:8:16 -> this exceeds name length Before Patch for 2 F2FS formatted partitions: root 28061 0.0 0.0 0 0 ? S 10:31 0:00 [f2fs_gc_task] root 28087 0.0 0.0 0 0 ? S 10:32 0:00 [f2fs_gc_task] After Patch: root 16756 0.0 0.0 0 0 ? S 14:57 0:00 [f2fs_gc-8:18] root 16765 0.0 0.0 0 0 ? S 14:57 0:00 [f2fs_gc-8:19] Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: remove unnecessary gc option check and balance_fsChangman Lee2013-02-121-5/+2
| | | | | | | | | | | 1. If f2fs is mounted with background_gc_off option, checking BG_GC is not redundant. 2. f2fs_balance_fs is checked in f2fs_gc, so this is also redundant. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: avoid balanc_fs during evict_inodeJaegeuk Kim2013-02-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Background Previously, if f2fs tries to move data blocks of an *evicting* inode during the cleaning process, it stops the process incompletely and then restarts the whole process, since it needs a locked inode to grab victim data pages in its address space. In order to get a locked inode, iget_locked() by f2fs_iget() is normally used, but, it waits if the inode is on freeing. So, here is a deadlock scenario. 1. f2fs_evict_inode() <- inode "A" 2. f2fs_balance_fs() 3. f2fs_gc() 4. gc_data_segment() 5. f2fs_iget() <- inode "A" too! If step #1 and #5 treat a same inode "A", step #5 would fall into deadlock since the inode "A" is on freeing. In order to resolve this, f2fs_iget_nowait() which skips __wait_on_freeing_inode() was introduced in step #5, and stops f2fs_gc() to complete f2fs_evict_inode(). 1. f2fs_evict_inode() <- inode "A" 2. f2fs_balance_fs() 3. f2fs_gc() 4. gc_data_segment() 5. f2fs_iget_nowait() <- inode "A", then stop f2fs_gc() w/ -ENOENT 2. Problem and Solution In the above scenario, however, f2fs cannot finish f2fs_evict_inode() only if: o there are not enough free sections, and o f2fs_gc() tries to move data blocks of the *evicting* inode repeatedly. So, the final solution is to use f2fs_iget() and remove f2fs_balance_fs() in f2fs_evict_inode(). The f2fs_evict_inode() actually truncates all the data and node blocks, which means that it doesn't produce any dirty node pages accordingly. So, we don't need to do f2fs_balance_fs() in practical. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: avoid redundant call to has_not_enough_free_secs in f2fs_gcNamjae Jeon2013-02-121-1/+1
| | | | | | | | | | | | | | After doing a write_checkpoint from garbage collection path if there is still need to do more garbage collection, gc_more label is used to jump and start the process again. And in that process, first step before getting victim is to check if there are not enough free sections, which is already done before doing a jump to gc_more. We can avoid the redundant call to check free sections, by checking the gc_type flag which will remain FG_GC(value 1) under this condition. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add un/freeze_fs into super_operationsChangman Lee2013-02-121-0/+5
| | | | | | | | | | | This patch supports ioctl FIFREEZE and FITHAW to snapshot filesystem. Before calling f2fs_freeze, all writers would be suspended and sync_fs would be completed. So no f2fs has to do something. Just background gc operation should be skipped due to generate dirty nodes and data until unfreeze. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add comments of start_bidx_of_nodeJaegeuk Kim2013-01-221-1/+5
| | | | | | | | | The caller of start_bidx_of_node() should give proper node offsets which point only direct node blocks. Otherwise, it is a caller's bug. This patch adds comments to make it clear. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add __init to functions in init_f2fs_fsNamjae Jeon2013-01-221-1/+1
| | | | | | | | Add __init to functions in init_f2fs_fs for code consistency. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: revisit the f2fs_gc flowJaegeuk Kim2013-01-101-39/+21
| | | | | | | | | | | | I'd like to revisit the f2fs_gc flow and rewrite as follows. 1. In practical, the nGC parameter of f2fs_gc is meaningless. So, let's remove it. 2. Background GC marks victim blocks as dirty one at a time. 3. Foreground GC should do cleaning job until acquiring enough free sections. Afterwards, it needs to do checkpoint. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: clean up unused variables and return valuesJaegeuk Kim2012-12-281-9/+3
| | | | | | | This patch cleans up a couple of unnecessary codes related to unused variables and return values. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: clean up the start_bidx_of_node functionJaegeuk Kim2012-12-281-14/+8
| | | | | | | | | This patch also resolves the following warning reported by kbuild test robot. fs/f2fs/gc.c: In function 'start_bidx_of_node': fs/f2fs/gc.c:453:21: warning: 'bidx' may be used uninitialized in this function Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: remove unneeded initializationNamjae Jeon2012-12-111-1/+1
| | | | | | | | | No need to initialize "struct f2fs_gc_kthread *gc_th = NULL", as gc_th = NULL, will be taken care by the return values of kmalloc(). And fix codes in other places. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
* f2fs: adjust kernel coding styleJaegeuk Kim2012-12-111-5/+5
| | | | | | | As pointed out by Randy Dunlap, this patch removes all usage of "/**" for comment blocks. Instead, just use "/*". Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add garbage collection functionsJaegeuk Kim2012-12-111-0/+742
This adds on-demand and background cleaning functions. - The basic background cleaning policy is trying to do cleaning jobs as much as possible whenever the system is idle. Once the background cleaning is done, the cleaner sleeps an amount of time not to interfere with VFS calls. The time is dynamically adjusted according to the status of whole segments, which is decreased when the following conditions are satisfied. . GC is not conducted currently, and . IO subsystem is idle by checking the number of requets in bdev's request list, and . There are enough dirty segments. Otherwise, the time is increased incrementally until to the maximum time. Note that, min and max times are 10 secs and 30 secs by default. - F2FS adopts a default victim selection policy where background cleaning uses a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm. - The method of moving data during the cleaning is slightly different between background and on-demand cleaning schemes. In the case of background cleaning, F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should move the data right away. - In order to identify valid blocks in a victim segment, F2FS scans the bitmap of the segment managed as an SIT entry. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
OpenPOWER on IntegriCloud