summaryrefslogtreecommitdiffstats
path: root/fs/ext3/balloc.c
Commit message (Collapse)AuthorAgeFilesLines
* ext3: Avoid loading bitmaps for full groups during block allocationFrans van de Wiel2010-05-211-0/+6
| | | | | | | | | | | | There is no point in loading bitmap for groups which are completely full. This causes noticeable performance problems (and memory pressure) on small systems with large full filesystem (http://marc.info/?l=linux-ext4&m=126843108314310&w=2). Jan Kara: Added a comment and changed check to use cpu-endian value. Signed-off-by: "Frans van de Wiel" <fvdw@fvdw.eu> Signed-off-by: Jan Kara <jack@suse.cz>
* include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* dquot: cleanup space allocation / freeing routinesChristoph Hellwig2010-03-051-5/+6
| | | | | | | | | | | | | | | | Get rid of the alloc_space, free_space, reserve_space, claim_space and release_rsv dquot operations - they are always called from the filesystem and if a filesystem really needs their own (which none currently does) it can just call into it's own routine directly. Move shared logic into the common __dquot_alloc_space, dquot_claim_space_nodirty and __dquot_free_space low-level methods, and rationalize the wrappers around it to move as much as possible code into the common block for CONFIG_QUOTA vs not. Also rename all these helpers to be named dquot_* instead of vfs_dq_*. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
* ext3: remove ->write_super and stop maintaining ->s_dirtChristoph Hellwig2009-06-111-2/+1
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ext3: Use lowercase names of quota functionsJan Kara2009-03-261-4/+4
| | | | | | | Use lowercase names of quota functions instead of old uppercase ones. Signed-off-by: Jan Kara <jack@suse.cz> CC: linux-ext4@vger.kernel.org
* CRED: Wrap task credential accesses in the Ext3 filesystemDavid Howells2008-11-141-1/+1
| | | | | | | | | | | | | | | | | | | | Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: adilger@sun.com Cc: linux-ext4@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
* ext3: fix ext3 block reservation early ENOSPC issueMingming Cao2008-10-201-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We could run into ENOSPC error on ext3, even when there is free blocks on the filesystem. The problem is triggered in the case the goal block group has 0 free blocks , and the rest block groups are skipped due to the check of "free_blocks < windowsz/2". Current code could fall back to non reservation allocation to prevent early ENOSPC after examing all the block groups with reservation on , but this code was bypassed if the reservation window is turned off already, which is true in this case. This patch fixed two issues: 1) We don't need to turn off block reservation if the goal block group has 0 free blocks left and continue search for the rest of block groups. Current code the intention is to turn off the block reservation if the goal allocation group has a few (some) free blocks left (not enough for make the desired reservation window),to try to allocation in the goal block group, to get better locality. But if the goal blocks have 0 free blocks, it should leave the block reservation on, and continues search for the next block groups,rather than turn off block reservation completely. 2) we don't need to check the window size if the block reservation is off. The problem was originally found and fixed in ext4. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ext3: replace remaining __FUNCTION__ occurrencesHarvey Harrison2008-04-281-6/+6
| | | | | | | | | __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ext3: retry block allocation if new blocks are allocated from system zoneAneesh Kumar K.V2008-04-281-5/+10
| | | | | | | | | | | | | | | | If the block allocator gets blocks out of system zone ext3 calls ext3_error. But if the file system is mounted with errors=continue retry block allocation. We need to mark the system zone blocks as in use to make sure retry don't pick them again System zone is the block range mapping block bitmap, inode bitmap and inode table. [akpm@linux-foundation.org: fix typo in comment] Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs/ext3: use BUG_ONJulia Lawall2008-04-281-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | if (...) BUG(); should be replaced with BUG_ON(...) when the test has no side-effects to allow a definition of BUG_ON that drops the code completely. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @ disable unlikely @ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (unlikely(E)) { BUG(); } + BUG_ON(E); ) @@ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (E) { BUG(); } + BUG_ON(E); ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ext3: replace all adds to little endians variables with le*_add_cpuMarcin Slusarz2008-02-081-5/+2
| | | | | | | | | | | | | | | | | replace all: little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) + expression_in_cpu_byteorder); with: leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder); sparse didn't generate any new warning with this patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ext[234]: cleanup ext[234]_bg_num_gdb()Akinobu Mita2008-02-061-5/+1
| | | | | | | | | Use ext[234]_bg_has_super() to remove duplicate code. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ext[234]: fix comment for nonexistent variableAkinobu Mita2008-02-061-1/+1
| | | | | | | | | | | The comment in ext[234]_new_blocks() describes about "i". But there is no local variable called "i" in that scope. I guess it has been renamed to group_no. Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ext3: return after ext3_error in case of failuresAneesh Kumar K.V2008-02-061-2/+6
| | | | | | | | | | | This fixes some instances where we were continuing after calling ext3_error. ext3_error calls panic only if errors=panic mount option is set. So we need to make sure we return correctly after ext3_error call Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ext3: add block bitmap validationAneesh Kumar K.V2008-02-061-8/+70
| | | | | | | | | | | | When a new block bitmap is read from disk in read_block_bitmap() there are a few bits that should ALWAYS be set. In particular, the blocks given corresponding to block bitmap, inode bitmap and inode tables. Validate the block bitmap against these blocks. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Revert "ext2/ext3/ext4: add block bitmap validation"Linus Torvalds2007-11-131-44/+4
| | | | | | | | | | | | | | | | | | | | | | This reverts commit 7c9e69faa28027913ee059c285a5ea8382e24b5d, fixing up conflicts in fs/ext4/balloc.c manually. The cost of doing the bitmap validation on each lookup - even when the bitmap is cached - is absolutely prohibitive. We could, and probably should, do it only when adding the bitmap to the buffer cache. However, right now we are better off just reverting it. Peter Zijlstra measured the cost of this extra validation as a 85% decrease in cached iozone, and while I had a patch that took it down to just 17% by not being _quite_ so stupid in the validation, it was still a big slowdown that could have been avoided by just doing it right. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: Andreas Dilger <adilger@clusterfs.com> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ext2/ext3/ext4: add block bitmap validationAneesh Kumar K.V2007-10-171-11/+43
| | | | | | | | | | | | | | When a new block bitmap is read from disk in read_block_bitmap() there are a few bits that should ALWAYS be set. In particular, the blocks given by ext4_blk_bitmap, ext4_inode_bitmap and ext4_inode_table. Validate the block bitmap against these blocks. [akpm@linux-foundation.org: cleanups] Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Acked-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib: percpu_counter_subPeter Zijlstra2007-10-171-1/+1
| | | | | | | | | | | | | | | | | | | Hugh spotted that some code does: percpu_counter_add(&counter, -unsignedlong) which, when the amount argument is of type s32, sort-of works thanks to two's-complement. However when we'd change the type to s64 this breaks on 32bit machines, because the promotion rules zero extend the unsigned number. Provide percpu_counter_sub() to hide the s64 cast. That is: percpu_counter_sub(&counter, foo) is equal to: percpu_counter_add(&counter, -(s64)foo); Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib: percpu_counter_addPeter Zijlstra2007-10-171-2/+2
| | | | | | | | | | s/percpu_counter_mod/percpu_counter_add/ Because its a better name, _mod implies modulo. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] ext[234]: update documentationAneesh Kumar K.V2007-02-201-1/+1
| | | | | | Signed-off-by: "Aneesh Kumar K.V" <aneesh.kumar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] ext3 balloc: fix _with_rsv freezeHugh Dickins2006-12-071-1/+1
| | | | | | | | | | | | Port fix to the off-by-one in find_next_usable_block's memscan from ext2 to ext3; but it didn't cause a serious problem for ext3 because the additional ext3_test_allocatable check rescued it from the error. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3 balloc: use io_error labelHugh Dickins2006-12-071-4/+2
| | | | | | | | | | | ext3_new_blocks has a nice io_error label for setting -EIO, so goto that in the one place that doesn't already use it. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3 balloc: say rb_entry not list_entryHugh Dickins2006-12-071-3/+3
| | | | | | | | | | | The reservations tree is an rb_tree not a list, so it's less confusing to use rb_entry() than list_entry() - though they're both just container_of(). Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3 balloc: fix off-by-one against rsv_endHugh Dickins2006-12-071-1/+1
| | | | | | | | | | | rsv_end is the last block within the reservation, so alloc_new_reservation should accept start_block == rsv_end as success. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3 balloc: fix off-by-one against grp_goalHugh Dickins2006-12-071-2/+2
| | | | | | | | | | | grp_goal 0 is a genuine goal (unlike -1), so ext3_try_to_allocate_with_rsv should treat it as such. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3 balloc: reset windowsz when fullHugh Dickins2006-12-071-0/+1
| | | | | | | | | | | | ext3_new_blocks should reset the reservation window size to 0 when squeezing the last blocks out of an almost full filesystem, so the retry doesn't skip any groups with less than half that free, reporting ENOSPC too soon. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3: fix reservation extensionMingming Cao2006-12-071-4/+8
| | | | | | | | | | | | | | | | | | | | | | Hugh Dickins wrote: > Not found anything relevant, but I keep noticing these lines > in ext2_try_to_allocate_with_rsv(), ext3 and ext4 similar: > > } else if (grp_goal > 0 && > (my_rsv->rsv_end - grp_goal + 1) < *count) > try_to_extend_reservation(my_rsv, sb, > *count-my_rsv->rsv_end + grp_goal - 1); > > They're wrong, a no-op in most groups, aren't they? rsv_end is an > absolute block number, whereas grp_goal is group-relative, so the > calculation ought to bring in group_first_block? Or I'm confused. > Signed-off-by: Mingming Cao <cmm@us.ibm.com> Cc: "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] more ext3 16T overflow fixesEric Sandeen2006-09-271-6/+6
| | | | | | | | | | | | | | Some of the changes in balloc.c are just cosmetic, as Andreas pointed out - if they overflow they'll then underflow and things are fine. 5th hunk actually fixes an overflow problem. Also check for potential overflows in inode & block counts when resizing. Signed-off-by: Eric Sandeen <esandeen@redhat.com> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3: More whitespace cleanupsDave Kleikamp2006-09-271-16/+16
| | | | | | | | | More white space cleanups in preparation of cloning ext4 from ext3. Removing spaces that precede a tab. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3: more comments about block allocation/reservation codeMingming Cao2006-09-271-45/+247
| | | | | | | Signed-off-by: Mingming Cao <cmm@us.ibm.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3: turn on reservation dump on block allocation errorsMingming Cao2006-09-271-4/+8
| | | | | | | | | | In the past there were a few kernel panics related to block reservation tree operations failure (insert/remove etc). It would be very useful to get the block allocation reservation map info when such error happens. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3 and jbd cleanup: remove whitespaceMingming Cao2006-09-271-8/+8
| | | | | | | | Remove whitespace from ext3 and jbd, before we clone ext4. Signed-off-by: Mingming Cao<cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3 filesystem bogus ENOSPC with reservation fixMingming Cao2006-08-271-3/+3
| | | | | | | | | | | | | | To handle the earlier bogus ENOSPC error caused by filesystem full of block reservation, current code falls back to non block reservation, starts to allocate block(s) from the goal allocation block group as if there is no block reservation. Current code needs to re-load the corresponding block group descriptor for the initial goal block group in this case. The patch fixes this. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Remove obsolete #include <linux/config.h>Jörn Engel2006-06-301-1/+0
| | | | | Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
* [PATCH] ext3_fsblk_t: the rest of in-kernel filesystem blocks conversionMingming Cao2006-06-251-16/+13
| | | | | | | | | | Convert the ext3 in-kernel filesystem blocks to ext3_fsblk_t. Convert the rest of all unsigned long type in-kernel filesystem blocks to ext3_fsblk_t, and replace the printk format string respondingly. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3_fsblk_t: filesystem, group blocks and bug fixesMingming Cao2006-06-251-103/+112
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some of the in-kernel ext3 block variable type are treated as signed 4 bytes int type, thus limited ext3 filesystem to 8TB (4kblock size based). While trying to fix them, it seems quite confusing in the ext3 code where some blocks are filesystem-wide blocks, some are group relative offsets that need to be signed value (as -1 has special meaning). So it seem saner to define two types of physical blocks: one is filesystem wide blocks, another is group-relative blocks. The following patches clarify these two types of blocks in the ext3 code, and fix the type bugs which limit current 32 bit ext3 filesystem limit to 8TB. With this series of patches and the percpu counter data type changes in the mm tree, we are able to extend exts filesystem limit to 16TB. This work is also a pre-request for the recent >32 bit ext3 work, and makes the kernel to able to address 48 bit ext3 block a lot easier: Simply redefine ext3_fsblk_t from unsigned long to sector_t and redefine the format string for ext3 filesystem block corresponding. Two RFC with a series patches have been posted to ext2-devel list and have been reviewed and discussed: http://marc.theaimsgroup.com/?l=ext2-devel&m=114722190816690&w=2 http://marc.theaimsgroup.com/?l=ext2-devel&m=114784919525942&w=2 Patches are tested on both 32 bit machine and 64 bit machine, <8TB ext3 and >8TB ext3 filesystem(with the latest to be released e2fsprogs-1.39). Tests includes overnight fsx, tiobench, dbench and fsstress. This patch: Defines ext3_fsblk_t and ext3_grpblk_t, and the printk format string for filesystem wide blocks. This patch classifies all block group relative blocks, and ext3_fsblk_t blocks occurs in the same function where used to be confusing before. Also include kernel bug fixes for filesystem wide in-kernel block variables. There are some fileystem wide blocks are treated as int/unsigned int type in the kernel currently, especially in ext3 block allocation and reservation code. This patch fixed those bugs by converting those variables to ext3_fsblk_t(unsigned long) type. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3_get_blocks: Adjust reservation window size for mblocksMingming Cao2006-03-261-1/+31
| | | | | | | | | | | | | | | | | Optimize the block reservation and the multiple block allocation: with the knowledge of the total number of blocks ahead, set or adjust the reservation window size properly (based on the number of blocks needed) before block allocation happens: if there isn't any reservation yet, make sure the reservation window equals to or greater than the number of blocks needed, before create an reservation window; if a reservation window is already exists, try to extends the window size to match the number of blocks to allocate. This could increase the possibility of completing multiple blocks allocation in a single request, as blocks are only allocated in the range of the inode's reservation window. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3_get_blocks: Adjust accounting info in ext3_new_blocks()Mingming Cao2006-03-261-12/+19
| | | | | | | | | Update accounting information (quota, boundary checks, free blocks number etc) in ext3_new_blocks(). Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3_get_blocks: support multiple blocks allocation in ext3_new_block()Mingming Cao2006-03-261-10/+36
| | | | | | | | | | | | Change ext3_try_to_allocate() (called via ext3_new_blocks()) to try to allocate the requested number of blocks on a best effort basis: After allocated the first block, it will always attempt to allocate the next few(up to the requested size and not beyond the reservation window) adjacent blocks at the same time. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3: Properly report backup block present in a groupGlauber de Oliveira Costa2006-03-241-7/+33
| | | | | | | | | | | | | | | In filesystems with the meta block group flag on, ext3_bg_num_gdb() fails to report the correct number of blocks used to store the group descriptor backups in a given group. It happens because meta_bg follows a different logic from the original ext3 backup placement in groups multiples of 3, 5 and 7. Signed-off-by: Glauber de Oliveira Costa <glommer@br.ibm.com> Cc: Andreas Dilger <adilger@clusterfs.com> Cc: "Stephen C. Tweedie" <sct@redhat.com> Cc: Alex Tomas <alex@clusterfs.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] capable/capability.h (fs/)Randy Dunlap2006-01-111-0/+1
| | | | | | | | | fs: Use <linux/capability.h> where capable() is used. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Tim Schmielau <tim@physik3.uni-rostock.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fs/ext3/: small cleanupsAdrian Bunk2006-01-101-2/+0
| | | | | | | | | | | This patch contains the following cleanups: - there's no need for ext3_count_free() #ifndef EXT3FS_DEBUG - having prototypes for ext3_count_free() in two different headers is nonsense Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] remove CONFIG_EXT{2,3}_CHECKAdrian Bunk2005-11-091-73/+0
| | | | | | | | | The CONFIG_EXT{2,3}_CHECK options where were never available, and all they did was to implement a subset of e2fsck in the kernel. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3: sparse fixesBen Dooks2005-10-301-0/+2
| | | | | | | | | | | | | | | | | | Fix warnings from sparse due to un-declared functions that should either have a header file or have been declared static fs/ext2/bitmap.c:14:15: warning: symbol 'ext2_count_free' was not declared. Should it be static? fs/ext2/namei.c:92:15: warning: symbol 'ext2_get_parent' was not declared. Should it be static? fs/ext3/bitmap.c:15:15: warning: symbol 'ext3_count_free' was not declared. Should it be static? fs/ext3/namei.c:1013:15: warning: symbol 'ext3_get_parent' was not declared. Should it be static? fs/ext3/xattr.c:214:1: warning: symbol 'ext3_xattr_block_get' was not declared. Should it be static? fs/ext3/xattr.c:358:1: warning: symbol 'ext3_xattr_block_list' was not declared. Should it be static? fs/ext3/xattr.c:630:1: warning: symbol 'ext3_xattr_block_find' was not declared. Should it be static? fs/ext3/xattr.c:863:1: warning: symbol 'ext3_xattr_ibody_find' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Locking problems while EXT3FS_DEBUG onGlauber de Oliveira Costa2005-10-301-3/+2
| | | | | | | | | | | | | | | | I noticed some problems while running ext3 with the debug flag set on. More precisely, I was unable to umount the filesystem. Some investigation took me to the patch that follows. At a first glance , the lock/unlock I've taken out seems really not necessary, as the main code (outside debug) does not lock the super. The only additional danger operations that debug code introduces seems to be related to bitmap, but bitmap operations tends to be all atomic anyway. I also took the opportunity to fix 2 spelling errors. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3: EXT3_DEBUG build fixesGlauber de Oliveira Costa2005-09-221-3/+3
| | | | | | | Fix some warnings and a build error when EXT3_DEBUG is enabled. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3: reduce allocate-with-reservation lock latenciesMingming Cao2005-06-281-72/+63
| | | | | | | | | | | | | | | | | | | | | | | | Currently in ext3 block reservation code, the global filesystem reservation tree lock (rsv_block) is hold during the process of searching for a space to make a new reservation window, including while scaning the block bitmap to verify if the avalible window has a free block. Holding the lock during bitmap scan is unnecessary and could possibly cause scalability issue and latency issues. This patch tries to address this by dropping the lock before scan the bitmap. Before that we need to reserve the open window in case someone else is targetting at the same window. Question was should we reserve the whole free reservable space or just the window size we need. Reserve the whole free reservable space will possibly force other threads which intended to do block allocation nearby move to another block group(cause bad layout). In this patch, we just reserve the desired size before drop the lock and scan the block bitmap. This patch fixed a ext3 reservation latency issue seen on a cvs check out test. Patch is tested with many fsx, tiobench, dbench and untar a kernel test. Signed-Off-By: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Linux-2.6.12-rc2v2.6.12-rc2Linus Torvalds2005-04-161-0/+1600
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
OpenPOWER on IntegriCloud