summaryrefslogtreecommitdiffstats
path: root/fs/gfs2/bmap.c
Commit message (Collapse)AuthorAgeFilesLines
* GFS2: Change maxlen variables to size_tBob Peterson2014-08-211-4/+5
| | | | | | | | | | This patch changes some variables (especially maxlen in function gfs2_block_map) from unsigned int to size_t. We need 64-bit arithmetic for very large files (e.g. 1PB) where the variables otherwise get shifted to all 0's. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: fs/gfs2/bmap.c: kernel-doc warning fixesFabian Frederick2014-05-161-4/+4
| | | | | | | | | Fix 2 typos and move one definition which was between function comments and function definition (yet another kernel-doc warning) Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Clean up journal extent mappingSteven Whitehouse2014-03-031-0/+115
| | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a long standing issue in mapping the journal extents. Most journals will consist of only a single extent, and although the cache took account of that by merging extents, it did not actually map large extents, but instead was doing a block by block mapping. Since the journal was only being mapped on mount, this was not normally noticeable. With the updated code, it is now possible to use the same extent mapping system during journal recovery (which will be added in a later patch). This will allow checking of the integrity of the journal before any reply of the journal content is attempted. For this reason the code is moving to bmap.c, since it will be used more widely in due course. An exercise left for the reader is to compare the new function gfs2_map_journal_extents() with gfs2_write_alloc_required() Additionally, should there be a failure, the error reporting is also updated to show more detail about what went wrong. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Add allocation parameters structureSteven Whitehouse2013-10-021-1/+2
| | | | | | | | | | | | This patch adds a structure to contain allocation parameters with the intention of future expansion of this structure. The idea is that we should be able to add more information about the allocation in the future in order to allow the allocator to make a better job of placing the requests on-disk. There is no functional difference from applying this patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Clean up reservation removalSteven Whitehouse2013-09-271-1/+3
| | | | | | | | | | | | | | | The reservation for an inode should be cleared when it is truncated so that we can start again at a different offset for future allocations. We could try and do better than that, by resetting the search based on where the truncation started from, but this is only a first step. In addition, there are three callers of gfs2_rs_delete() but only one of those should really be testing the value of i_writecount. While we get away with that in the other cases currently, I think it would be better if we made that test specific to the one case which requires it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* truncate: drop 'oldsize' truncate_pagecache() parameterKirill A. Shutemov2013-09-121-2/+2
| | | | | | | | | | truncate_pagecache() doesn't care about old size since commit cedabed49b39 ("vfs: Fix vmtruncate() regression"). Let's drop it. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* GFS2: Reserve journal space for quota change in do_growBob Peterson2013-06-271-1/+3
| | | | | | | | | | | | | If a GFS2 file system is mounted with quotas and a file is grown in such a way that its free blocks for the allocation are represented in a secondary bitmap, GFS2 ran out of blocks in the transaction. That resulted in "fatal: assertion "tr->tr_num_buf <= tr->tr_blocks". This patch reserves extra blocks for the quota change so the transaction has enough space. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Increase i_writecount during gfs2_setattr_sizeBob Peterson2013-06-031-4/+13
| | | | | | | | | | This patch calls get_write_access in a few functions. This merely increases inode->i_writecount for the duration of the function. That will ensure that any file closes won't delete the inode's multi-block reservation while the function is running. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Remove vestigial parameter ip from function rs_deltreeBob Peterson2013-04-081-1/+1
| | | | | | | | | The functions that delete block reservations from the rgrp block reservations rbtree no longer use the ip parameter. This patch eliminates the parameter. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2013-02-251-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace and namespace infrastructure changes from Eric W Biederman: "This set of changes starts with a few small enhnacements to the user namespace. reboot support, allowing more arbitrary mappings, and support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the user namespace root. I do my best to document that if you care about limiting your unprivileged users that when you have the user namespace support enabled you will need to enable memory control groups. There is a minor bug fix to prevent overflowing the stack if someone creates way too many user namespaces. The bulk of the changes are a continuation of the kuid/kgid push down work through the filesystems. These changes make using uids and gids typesafe which ensures that these filesystems are safe to use when multiple user namespaces are in use. The filesystems converted for 3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The changes for these filesystems were a little more involved so I split the changes into smaller hopefully obviously correct changes. XFS is the only filesystem that remains. I was hoping I could get that in this release so that user namespace support would be enabled with an allyesconfig or an allmodconfig but it looks like the xfs changes need another couple of days before it they are ready." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits) cifs: Enable building with user namespaces enabled. cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t cifs: Convert struct cifs_sb_info to use kuids and kgids cifs: Modify struct smb_vol to use kuids and kgids cifs: Convert struct cifsFileInfo to use a kuid cifs: Convert struct cifs_fattr to use kuid and kgids cifs: Convert struct tcon_link to use a kuid. cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t cifs: Convert from a kuid before printing current_fsuid cifs: Use kuids and kgids SID to uid/gid mapping cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc cifs: Use BUILD_BUG_ON to validate uids and gids are the same size cifs: Override unmappable incoming uids and gids nfsd: Enable building with user namespaces enabled. nfsd: Properly compare and initialize kuids and kgids nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids nfsd: Modify nfsd4_cb_sec to use kuids and kgids nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion nfsd: Convert nfsxdr to use kuids and kgids nfsd: Convert nfs3xdr to use kuids and kgids ...
| * gfs2: Split NO_QUOTA_CHANGE inot NO_UID_QUTOA_CHANGE and NO_GID_QUTOA_CHANGEEric W. Biederman2013-02-131-1/+1
| | | | | | | | | | | | | | | | Split NO_QUOTA_CHANGE into NO_UID_QUTOA_CHANGE and NO_GID_QUTOA_CHANGE so the constants may be well typed. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | GFS2: Get a block reservation before resizing a fileBob Peterson2013-02-011-0/+4
| | | | | | | | | | | | | | | | | | | | This patch allocates a block reservation structure before growing or shrinking a file. Without this structure, the grow or shink code can reference the bad pointer. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | GFS2: Use ->writepages for ordered writesSteven Whitehouse2013-01-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a list of buffers to write ahead of the journal flush, this now uses a list of inodes and calls ->writepages via filemap_fdatawrite() in order to achieve the same thing. For most use cases this results in a shorter ordered write list, as well as much larger i/os being issued. The ordered write list is sorted by inode number before writing in order to retain the disk block ordering between inodes as per the previous code. The previous ordered write code used to conflict in its assumptions about how to write out the disk blocks with mpage_writepages() so that with this updated version we can also use mpage_writepages() for GFS2's ordered write, writepages implementation. So we will also send larger i/os from writeback too. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | GFS2: Split gfs2_trans_add_bh() into twoSteven Whitehouse2013-01-291-12/+12
|/ | | | | | | | | | | | | | There is little common content in gfs2_trans_add_bh() between the data and meta classes by the time that the functions which it calls are taken into account. The intent here is to split this into two separate functions. Stage one is to introduce gfs2_trans_add_data() and gfs2_trans_add_meta() and update the callers accordingly. Later patches will then pull in the content of gfs2_trans_add_bh() and its dependent functions in order to clean up the code in this area. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Fix truncation of journaled data filesSteven Whitehouse2012-11-131-3/+49
| | | | | | | | | | | This patch fixes an issue relating to not having enough revokes available when truncating journaled data files. In order to ensure that we do no run out, the truncation is broken into separate pieces if it is large enough. Tested using fsx on a journaled data file. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Add Orlov allocatorSteven Whitehouse2012-11-071-1/+1
| | | | | | | | | | | | | | | Just like ext3, this works on the root directory and any directory with the +T flag set. Also, just like ext3, any subdirectory created in one of the just mentioned cases will be allocated to a random resource group (GFS2 equivalent of a block group). If you are creating a set of directories, each of which will contain a job running on a different node, then by setting +T on the parent directory before creating the subdirectories, each will land up in a different resource group, and thus resource group contention between nodes will be kept to a minimum. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Add structure to contain rgrp, bitmap, offset tupleSteven Whitehouse2012-09-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | This patch introduces a new structure, gfs2_rbm, which is a tuple of a resource group, a bitmap within the resource group and an offset within that bitmap. This is designed to make manipulating these sets of variables easier. There is also a new helper function which converts this representation back to a disk block address. In addition, the rbtree nodes which are used for the reservations were not being correctly initialised, which is now fixed. Also, the tracing was not passing through the inode where it should have been. That is mostly fixed aside from one corner case. This needs to be revisited since there can also be a NULL rgrp in some cases which results in the device being incorrect in the trace. This is intended to be the first step towards cleaning up some of the allocation code, and some further bug fixes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Reduce file fragmentationBob Peterson2012-07-191-0/+3
| | | | | | | | | | | | | | This patch reduces GFS2 file fragmentation by pre-reserving blocks. The resulting improved on disk layout greatly speeds up operations in cases which would have resulted in interlaced allocation of blocks previously. A typical example of this is 10 parallel dd processes, each writing to a file in a common dirctory. The implementation uses an rbtree of reservations attached to each resource group (and each inode). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Fold quota data into the reservations structBob Peterson2012-06-061-13/+5
| | | | | | | | | | This patch moves the ancillary quota data structures into the block reservations structure. This saves GFS2 some time and effort in allocating and deallocating the qadata structure. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Eliminate unused "new" parameter to gfs2_meta_indirect_bufferBob Peterson2012-05-111-2/+2
| | | | | | | | | | It turns out that the "new" parameter to function gfs2_meta_indirect_buffer was always being passed in as zero. Therefore, this patch eliminates it and simplifies the function. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Use variable rather than qa to determine if unstuff necessaryBob Peterson2012-04-241-2/+4
| | | | | | | | | | | In the future, the qadata structure will be eliminated and merged back in with the block reservation structure, after we extend the lifespan of that. This patch is a step forward in eliminating the qadata structure. It adds a variable to the do_grow function to determine when unstuffing is necessary, and has been done. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Make sure rindex is uptodate before starting transactionsBob Peterson2012-04-051-1/+5
| | | | | | | | | This patch removes the call from gfs2_blk2rgrd to function gfs2_rindex_update and replaces it with individual calls. The former way turned out to be too problematic. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Change truncate page allocation to be GFP_NOFSBob Peterson2012-03-201-2/+2
| | | | | | | | | This patch changes the page allocation in gfs2_block_truncate_page and two others to GFP_NOFS to avoid deadlock in low-memory conditions. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: decouple quota allocations from block allocationsBob Peterson2011-11-221-10/+9
| | | | | | | | | | | | This patch separates the code pertaining to allocations into two parts: quota-related information and block reservations. This patch also moves all the block reservation structure allocations to function gfs2_inplace_reserve to simplify the code, and moves the frees to function gfs2_inplace_release. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: move toward a generic multi-block allocatorBob Peterson2011-11-211-2/+2
| | | | | | | | | | | | | | | | | | | | | This patch is a revision of the one I previously posted. I tried to integrate all the suggestions Steve gave. The purpose of the patch is to change function gfs2_alloc_block (allocate either a dinode block or an extent of data blocks) to a more generic gfs2_alloc_blocks function that can allocate both a dinode _and_ an extent of data blocks in the same call. This will ultimately help us create a multi-block reservation scheme to reduce file fragmentation. This patch moves more toward a generic multi-block allocator that takes a pointer to the number of data blocks to allocate, plus whether or not to allocate a dinode. In theory, it could be called to allocate (1) a single dinode block, (2) a group of one or more data blocks, or (3) a dinode plus several data blocks. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: combine gfs2_alloc_block and gfs2_alloc_diBob Peterson2011-11-151-2/+2
| | | | | | | | | | | | | GFS2 functions gfs2_alloc_block and gfs2_alloc_di do basically the same things, with a few exceptions. This patch combines the two functions into a slightly more generic gfs2_alloc_block. Having one centralized block allocation function will reduce code redundancy and make it easier to implement multi-block reservations to reduce file fragmentation in the future. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: More automated code analysis fixesSteven Whitehouse2011-11-081-3/+0
| | | | | | | | A potentially uninitialised variable, some unreachable code, and the main part of this, fixing the error path in the unlink function. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Move readahead of metadata during deallocation into its own functionSteven Whitehouse2011-10-211-19/+26
| | | | | | | | | | Move the recently added readahead of the indirect pointer tree during deallocation into its own function in order that we can use it elsewhere in the future. Also this fixes the resetting of the "first" variable in the original patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: rewrite fallocate code to write blocks directlyBenjamin Marzinski2011-10-211-0/+12
| | | | | | | | | | | | GFS2's fallocate code currently goes through the page cache. Since it's only writing to the end of the file or to holes in it, it doesn't need to, and it was causing issues on low memory environments. This patch pulls in some of Steve's block allocation work, and uses it to simply allocate the blocks for the file, and zero them out at allocation time. It provides a slight performance increase, and it dramatically simplifies the code. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: speed up delete/unlink performance for large filesBob Peterson2011-10-211-3/+23
| | | | | | | | | | | This patch improves the performance of delete/unlink operations in a GFS2 file system where the files are large by adding a layer of metadata read-ahead for indirect blocks. Mileage will vary, but on my system, deleting an 8.6G file dropped from 22 seconds to about 4.5 seconds. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Use cached rgrp in gfs2_rlist_add()Steven Whitehouse2011-10-211-2/+2
| | | | | | | | | | | Each block which is deallocated, requires a call to gfs2_rlist_add() and each of those calls was calling gfs2_blk2rgrpd() in order to figure out which rgrp the block belonged in. This can be speeded up by making use of the rgrp cached in the inode. We also reset this cached rgrp in case the block has changed rgrp. This should provide a big reduction in gfs2_blk2rgrpd() calls during deallocation. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Call do_strip() directly from recursive_scan()Steven Whitehouse2011-10-211-78/+71
| | | | | | | | | | | | | | The recursive_scan() function only ever takes a single "bc" argument, so we might as well just call do_strip() directly from resource_scan() rather than pass it in as an argument. Also the "data" argument is always a struct strip_mine, so we can pass that in, rather than using a void pointer. This also moves do_strip() ahead of recursive_scan() so that we don't need to add a prototype. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Make resource groups "append only" during life of fsSteven Whitehouse2011-10-211-7/+0
| | | | | | | | | | | | | | | | | | | | | Since we have ruled out supporting online filesystem shrink, it is possible to make the resource group list append only during the life of a super block. This gives several benefits: Firstly, we only need to read new rindex elements as they are added rather than needing to reread the whole rindex file each time one element is added. Secondly, the rindex glock can be held for much shorter periods of time, and is completely removed from the fast path for allocations. The lock is taken in shared mode only when updating the resource groups when the first allocation occurs, and after a grow has taken place. Thirdly, this results in a reduction in code size, and everything gets a lot simpler to understand in this area. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2011-07-221-0/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits) vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp isofs: Remove global fs lock jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory fix IN_DELETE_SELF on overwriting rename() on ramfs et.al. mm/truncate.c: fix build for CONFIG_BLOCK not enabled fs:update the NOTE of the file_operations structure Remove dead code in dget_parent() AFS: Fix silly characters in a comment switch d_add_ci() to d_splice_alias() in "found negative" case as well simplify gfs2_lookup() jfs_lookup(): don't bother with . or .. get rid of useless dget_parent() in btrfs rename() and link() get rid of useless dget_parent() in fs/btrfs/ioctl.c fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers drivers: fix up various ->llseek() implementations fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek Ext4: handle SEEK_HOLE/SEEK_DATA generically Btrfs: implement our own ->llseek fs: add SEEK_HOLE and SEEK_DATA flags reiserfs: make reiserfs default to barrier=flush ... Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new shrinker callout for the inode cache, that clashed with the xfs code to start the periodic workers later.
| * fs: move inode_dio_wait calls into ->setattrChristoph Hellwig2011-07-201-0/+2
| | | | | | | | | | | | | | | | | | | | Let filesystems handle waiting for direct I/O requests themselves instead of doing it beforehand. This means filesystem-specific locks to prevent new dio referenes from appearing can be held. This is important to allow generalizing i_dio_count to non-DIO_LOCKING filesystems. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | GFS2: combine duplicated block freeing routinesEric Sandeen2011-07-151-10/+2
|/ | | | | | | | | | | | __gfs2_free_data and __gfs2_free_meta are almost identical, and can be trivially combined. [This is as per Eric's original patch minus gfs2_free_data() which had no callers left and plus the conversion of the bmap.c calls to these functions. All in all, a nice clean up] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Wipe directory hash table metadata when deallocating a directorySteven Whitehouse2011-05-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | The deallocation code for directories in GFS2 is largely divided into two parts. The first part deallocates any directory leaf blocks and marks the directory as being a regular file when that is complete. The second stage was identical to deallocating regular files. Regular files have their data blocks in a different address space to directories, and thus what would have been normal data blocks in a regular file (the hash table in a GFS2 directory) were deallocated correctly. However, a reference to these blocks was left in the journal (assuming of course that some previous activity had resulted in those blocks being in the journal or ail list). This patch uses the i_depth as a test of whether the inode is an exhash directory (we cannot test the inode type as that has already been changed to a regular file at this stage in deallocation) The original issue was reported by Chris Hertel as an issue he encountered running bonnie++ Reported-by: Christopher R. Hertel <crh@samba.org> Cc: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* Fix common misspellingsLucas De Marchi2011-03-311-1/+1
| | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* GFS2: deallocation performance patchBob Peterson2011-02-241-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a performance improvement to GFS2's dealloc code. Rather than update the quota file and statfs file for every single block that's stripped off in unlink function do_strip, this patch keeps track and updates them once for every layer that's stripped. This is done entirely inside the existing transaction, so there should be no risk of corruption. The other functions that deallocate blocks will be unaffected because they are using wrapper functions that do the same thing that they do today. I tested this code on my roth cluster by creating 200 files in a directory, each of which is 100MB, then on four nodes, I simultaneously deleted the files, thus competing for GFS2 resources (but different files). The commands I used were: [root@roth-01]# time for i in `seq 1 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done [root@roth-02]# time for i in `seq 2 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done [root@roth-03]# time for i in `seq 3 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done [root@roth-05]# time for i in `seq 4 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done The performance increase was significant: roth-01 roth-02 roth-03 roth-05 --------- --------- --------- --------- old: real 0m34.027 0m25.021s 0m23.906s 0m35.646s new: real 0m22.379s 0m24.362s 0m24.133s 0m18.562s Total time spent deleting: old: 118.6s new: 89.4 For this particular case, this showed a 25% performance increase for GFS2 unlinks. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Fix uninitialised error value in previous patchSteven Whitehouse2010-11-301-1/+1
| | | | Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: fix recursive locking during rindex truncatesBenjamin Marzinski2010-11-301-2/+7
| | | | | | | | | | When you truncate the rindex file, you need to avoid calling gfs2_rindex_hold, since you already hold it. However, if you haven't already read in the resource groups, you need to do that. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: reserve more blocks for transactionsBenjamin Marzinski2010-09-281-1/+1
| | | | | | | | | | | | | Some of the functions in GFS2 were not reserving space in the transaction for the resource group header and the resource groups bitblocks that get added when you do allocation. GFS2 now makes sure to reserve space for the resource group header and either all the bitblocks in the resource group, or one for each block that it may allocate, whichever is smaller using the new gfs2_rg_blocks() inline function. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Remove i_disksizeSteven Whitehouse2010-09-201-7/+5
| | | | | | | | | With the update of the truncate code, ip->i_disksize and inode->i_size are merely copies of each other. This means we can remove ip->i_disksize and use inode->i_size exclusively reducing the size of a GFS2 inode by 8 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: New truncate sequenceSteven Whitehouse2010-09-201-128/+119
| | | | | | | | | | | This updates GFS2's truncate code to use the new truncate sequence correctly. This is a stepping stone to being able to remove ip->i_disksize in favour of using i_size everywhere now that the two sizes are always identical. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Christoph Hellwig <hch@lst.de>
* GFS2: Fix typo in stuffed file data copy handlingAbhijith Das2010-07-301-1/+1
| | | | | | | | | trunc_start() in bmap.c incorrectly uses sizeof(struct gfs2_inode) instead of sizeof(struct gfs2_dinode). Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Simplify gfs2_write_alloc_requiredBob Peterson2010-07-291-10/+5
| | | | | | | | | | | Function gfs2_write_alloc_required always returned zero as its return code. Therefore, it doesn't need to return a return code at all. Given that, we can use the return value to return whether or not the dinode needs block allocations rather than passing that value in, which in turn simplifies a bunch of error checking. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: O_TRUNC not working on stuffed files across clusterBob Peterson2010-07-151-0/+1
| | | | | | | | | This patch replaces a statement that got dropped out by accident. Without the patch, truncates on stuffed (very small) files cause those files to have an unpredictable size. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds2010-05-211-7/+10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: GFS2: Fix typo GFS2: stuck in inode wait, no glocks stuck GFS2: Eliminate useless err variable GFS2: Fix writing to non-page aligned gfs2_quota structures GFS2: Add some useful messages GFS2: fix quota state reporting GFS2: Various gfs2_logd improvements GFS2: glock livelock GFS2: Clean up stuffed file copying GFS2: docs update GFS2: Remove space from slab cache name
| * GFS2: Clean up stuffed file copyingSteven Whitehouse2010-03-291-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | If the inode size was corrupt for stuffed files, it was possible for the copying of data to overrun the block and/or page. This patch checks for that condition so that this is no longer possible. This is also preparation for the new truncate sequence patch which requires the ability to have stuffed files with larger sizes than (disk block size - sizeof(on disk inode)) with the restriction that only the initial part of the file may be non-zero. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-1/+0
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
OpenPOWER on IntegriCloud