summaryrefslogtreecommitdiffstats
path: root/fs/gfs2
Commit message (Collapse)AuthorAgeFilesLines
...
| * | GFS2: Update master statfs buffer with sd_statfs_spin lockedBob Peterson2015-12-141-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, function update_statfs called gfs2_statfs_change_out to update the master statfs buffer without the sd_statfs_spin held. In theory, another process could call gfs2_statfs_sync, which takes the sd_statfs_spin lock and re-reads m_sc from the buffer. So there's a theoretical timing window in which one process could write the master statfs buffer, then another comes along and re-reads it, wiping out the changes. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * | GFS2: Reduce size of incore inodeBob Peterson2015-12-145-26/+26
| | | | | | | | | | | | | | | | | | | | | | | | This patch makes no functional changes. Its goal is to reduce the size of the gfs2 inode in memory by rearranging structures and changing the size of some variables within the structure. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * | GFS2: Make rgrp reservations part of the gfs2_inode structureBob Peterson2015-12-1412-82/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, multi-block reservation structures were allocated from a special slab. This patch folds the structure into the gfs2_inode structure. The disadvantage is that the gfs2_inode needs more memory, even when a file is opened read-only. The advantages are: (a) we don't need the special slab and the extra time it takes to allocate and deallocate from it. (b) we no longer need to worry that the structure exists for things like quota management. (c) This also allows us to remove the calls to get_write_access and put_write_access since we know the structure will exist. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * | GFS2: Extract quota data from reservations structure (revert 5407e24)Bob Peterson2015-11-2413-63/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch basically reverts the majority of patch 5407e24. That patch eliminated the gfs2_qadata structure in favor of just using the reservations structure. The problem with doing that is that it increases the size of the reservations structure. That is not an issue until it comes time to fold the reservations structure into the inode in memory so we know it's always there. By separating out the quota structure again, we aren't punishing the non-quota users by making all the inodes bigger, requiring more slab space. This patch creates a new slab area to allocate the quota stuff so it's managed a little more sanely. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * | gfs2: Extended attribute readahead optimizationAndreas Gruenbacher2015-11-181-18/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of submitting a READ_SYNC bio for the inode and a READA bio for the inode's extended attributes through submit_bh, submit a single READ_SYNC bio for both through submit_bio when possible. This can be more efficient on some kinds of block devices. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * | gfs2: Extended attribute readaheadAndreas Gruenbacher2015-11-168-14/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When gfs2 allocates an inode and its extended attribute block next to each other at inode create time, the inode's directory entry indicates that in de_rahead. In that case, we can readahead the extended attribute block when we read in the inode. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * | GFS2: Use rht_for_each_entry_rcu in glock_hash_walkAndrew Price2015-11-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This lockdep splat was being triggered on umount: [55715.973122] =============================== [55715.980169] [ INFO: suspicious RCU usage. ] [55715.981021] 4.3.0-11553-g8d3de01-dirty #15 Tainted: G W [55715.982353] ------------------------------- [55715.983301] fs/gfs2/glock.c:1427 suspicious rcu_dereference_protected() usage! The code it refers to is the rht_for_each_entry_safe usage in glock_hash_walk. The condition that triggers the warning is lockdep_rht_bucket_is_held(tbl, hash) which is checked in the __rcu_dereference_protected macro. The rhashtable buckets are not changed in glock_hash_walk so it's safe to rely on the rcu protection. Replace the rht_for_each_entry_safe() usage with rht_for_each_entry_rcu(), which doesn't care whether the bucket lock is held if the rcu read lock is held. Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Delete an unnecessary check before the function call "iput"Markus Elfring2015-11-161-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The iput() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * | gfs2: Automatically set GFS2_DIF_SYSTEM flag on system filesAbhi Das2015-11-102-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When new files and directories are created inside a parent directory we automatically inherit the GFS2_DIF_SYSTEM flag (if set) and assign it to the new file/dirs. All new system files/dirs created in the metafs by, say gfs2_jadd, will have this flag set because they will have parent directories in the metafs whose GFS2_DIF_SYSTEM flag has already been set (most likely by a previous mkfs.gfs2) Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
* | | Merge branch 'work.misc' of ↵Linus Torvalds2016-01-121-3/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "All kinds of stuff. That probably should've been 5 or 6 separate branches, but by the time I'd realized how large and mixed that bag had become it had been too close to -final to play with rebasing. Some fs/namei.c cleanups there, memdup_user_nul() introduction and switching open-coded instances, burying long-dead code, whack-a-mole of various kinds, several new helpers for ->llseek(), assorted cleanups and fixes from various people, etc. One piece probably deserves special mention - Neil's lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets called without ->i_mutex and tries to avoid ever taking it. That, of course, means that it's not useful for any directory modifications, but things like getting inode attributes in nfds readdirplus are fine with that. I really should've asked for moratorium on lookup-related changes this cycle, but since I hadn't done that early enough... I *am* asking for that for the coming cycle, though - I'm going to try and get conversion of i_mutex to rwsem with ->lookup() done under lock taken shared. There will be a patch closer to the end of the window, along the lines of the one Linus had posted last May - mechanical conversion of ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/ inode_is_locked()/inode_lock_nested(). To quote Linus back then: ----- | This is an automated patch using | | sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/' | sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/' | sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/' | sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/' | sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/' | | with a very few manual fixups ----- I'm going to send that once the ->i_mutex-affecting stuff in -next gets mostly merged (or when Linus says he's about to stop taking merges)" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) nfsd: don't hold i_mutex over userspace upcalls fs:affs:Replace time_t with time64_t fs/9p: use fscache mutex rather than spinlock proc: add a reschedule point in proc_readfd_common() logfs: constify logfs_block_ops structures fcntl: allow to set O_DIRECT flag on pipe fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE fs: xattr: Use kvfree() [s390] page_to_phys() always returns a multiple of PAGE_SIZE nbd: use ->compat_ioctl() fs: use block_device name vsprintf helper lib/vsprintf: add %*pg format specifier fs: use gendisk->disk_name where possible poll: plug an unused argument to do_poll amdkfd: don't open-code memdup_user() cdrom: don't open-code memdup_user() rsxx: don't open-code memdup_user() mtip32xx: don't open-code memdup_user() [um] mconsole: don't open-code memdup_user_nul() [um] hostaudio: don't open-code memdup_user() ...
| * | | fs: use block_device name vsprintf helperDmitry Monakhov2016-01-061-3/+1
| | |/ | |/| | | | | | | | | | Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | Merge branch 'work.xattr' of ↵Linus Torvalds2016-01-114-55/+2
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs xattr updates from Al Viro: "Andreas' xattr cleanup series. It's a followup to his xattr work that went in last cycle; -0.5KLoC" * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: xattr handlers: Simplify list operation ocfs2: Replace list xattr handler operations nfs: Move call to security_inode_listsecurity into nfs_listxattr xfs: Change how listxattr generates synthetic attributes tmpfs: listxattr should include POSIX ACL xattrs tmpfs: Use xattr handler infrastructure btrfs: Use xattr handler infrastructure vfs: Distinguish between full xattr names and proper prefixes posix acls: Remove duplicate xattr name definitions gfs2: Remove gfs2_xattr_acl_chmod vfs: Remove vfs_xattr_cmp
| * | | posix acls: Remove duplicate xattr name definitionsAndreas Gruenbacher2015-12-062-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove POSIX_ACL_XATTR_{ACCESS,DEFAULT} and GFS2_POSIX_ACL_{ACCESS,DEFAULT} and replace them with the definitions in <include/uapi/linux/xattr.h>. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | gfs2: Remove gfs2_xattr_acl_chmodAndreas Gruenbacher2015-12-062-51/+0
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | Function gfs2_xattr_acl_chmod is unused since commit e01580bf. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | switch ->get_link() to delayed_call, kill ->put_link()Al Viro2015-12-301-4/+4
| | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | replace ->follow_link() with new method that could stay in RCU modeAl Viro2015-12-081-5/+10
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | new method: ->get_link(); replacement of ->follow_link(). The differences are: * inode and dentry are passed separately * might be called both in RCU and non-RCU mode; the former is indicated by passing it a NULL dentry. * when called that way it isn't allowed to block and should return ERR_PTR(-ECHILD) if it needs to be called in non-RCU mode. It's a flagday change - the old method is gone, all in-tree instances converted. Conversion isn't hard; said that, so far very few instances do not immediately bail out when called in RCU mode. That'll change in the next commits. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'for-linus-3' of ↵Linus Torvalds2015-11-131-5/+8
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs xattr cleanups from Al Viro. * 'for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: f2fs: xattr simplifications squashfs: xattr simplifications 9p: xattr simplifications xattr handlers: Pass handler to operations instead of flags jffs2: Add missing capability check for listing trusted xattrs hfsplus: Remove unused xattr handler list operations ubifs: Remove unused security xattr handler vfs: Fix the posix_acl_xattr_list return value vfs: Check attribute names in posix acl xattr handers
| * xattr handlers: Pass handler to operations instead of flagsAndreas Gruenbacher2015-11-131-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | The xattr_handler operations are currently all passed a file system specific flags value which the operations can use to disambiguate between different handlers; some file systems use that to distinguish the xattr namespace, for example. In some oprations, it would be useful to also have access to the handler prefix. To allow that, pass a pointer to the handler to operations instead of the flags value alone. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2015-11-091-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge third patch-bomb from Andrew Morton: "We're pretty much done over here - I'm still waiting for a nouveau merge so I can cleanly finish up Christoph's dma-mapping rework. - bunch of small misc stuff - fold abs64() into abs(), remove abs64() - new_valid_dev() cleanups - binfmt_elf_fdpic feature work" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits) fs/binfmt_elf_fdpic.c: provide NOMMU loader for regular ELF binaries fs/stat.c: remove unnecessary new_valid_dev() check fs/reiserfs/namei.c: remove unnecessary new_valid_dev() check fs/nilfs2/namei.c: remove unnecessary new_valid_dev() check fs/ncpfs/dir.c: remove unnecessary new_valid_dev() check fs/jfs: remove unnecessary new_valid_dev() checks fs/hpfs/namei.c: remove unnecessary new_valid_dev() check fs/f2fs/namei.c: remove unnecessary new_valid_dev() check fs/ext2/namei.c: remove unnecessary new_valid_dev() check fs/exofs/namei.c: remove unnecessary new_valid_dev() check fs/btrfs/inode.c: remove unnecessary new_valid_dev() check fs/9p: remove unnecessary new_valid_dev() checks include/linux/kdev_t.h: old/new_valid_dev() can return bool include/linux/kdev_t.h: remove unused huge_valid_dev() kmap_atomic_to_page() has no users, remove it drivers/scsi/cxgbi: fix build with EXTRA_CFLAGS dma: remove external references to dma_supported Documentation/sysctl/vm.txt: fix misleading code reference of overcommit_memory remove abs64() kernel.h: make abs() work with 64-bit types ...
| * | remove abs64()Andrew Morton2015-11-091-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch everything to the new and more capable implementation of abs(). Mainly to give the new abs() a bit of a workout. Cc: Michal Nazarewicz <mina86@mina86.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'gfs2-merge-window' of ↵Linus Torvalds2015-11-0910-59/+70
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Bob Peterson: "Here is a list of patches we've accumulated for GFS2 for the current upstream merge window. There are only six patches this time: 1. A cleanup patch from Andreas to remove the gl_spin #define in favor of its value for the sake of clarity. 2. A fix from Andy Price to mark the inode dirty during fallocate. 3. A fix from Andy Price to set s_mode on mount failures to prevent a stack trace. 4 A patch from me to prevent a kernel BUG() in trans_add_meta/trans_add_data due to uninitialized storage. 5. A patch from me to protecting our freeing of the in-core directory hash table to prevent double-free. 6. A fix for a page/block rounding problem that resulted in a metadata coherency problem when the block size != page size" I've got a lot more patches in various stages of review and testing, but I'm afraid they'll have to wait until the next merge window. So next time we're likely to have a lot more" * tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: GFS2: Fix rgrp end rounding problem for bsize < page size GFS2: Protect freeing directory hash table with i_lock spin_lock gfs2: Remove gl_spin define gfs2: Add missing else in trans_add_meta/data GFS2: Set s_mode before parsing mount options GFS2: fallocate: do not rely on file_update_time to mark the inode dirty
| * GFS2: Fix rgrp end rounding problem for bsize < page sizeBob Peterson2015-11-091-2/+3
| | | | | | | | | | | | | | | | This patch fixes a bug introduced by commit 7005c3e. That patch tries to map a vm range for resource groups, but the calculation breaks down when the block size is less than the page size. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Protect freeing directory hash table with i_lock spin_lockBob Peterson2015-11-041-1/+6
| | | | | | | | | | | | | | | | | | This patch changes function gfs2_dir_hash_inval so it uses the i_lock spin_lock to protect the in-core hash table, i_hash_cache. This will prevent double-frees due to a race between gfs2_evict_inode and inode invalidation. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Remove gl_spin defineAndreas Gruenbacher2015-10-296-54/+53
| | | | | | | | | | | | | | | | | | | | Commit e66cf161 replaced the gl_spin spinlock in struct gfs2_glock with a gl_lockref lockref and defined gl_spin as gl_lockref.lock (the spinlock in gl_lockref). Remove that define to make the references to gl_lockref.lock more obvious. Signed-off-by: Andreas Gruenbacher <andreas.gruenbacher@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Add missing else in trans_add_meta/dataBob Peterson2015-10-011-0/+4
| | | | | | | | | | | | | | | | | | | | This patch fixes a timing window that causes a segfault. The problem is that bd can remain NULL throughout the function and then reference that NULL pointer if the bh->b_private starts out NULL, then someone sets it to non-NULL inside the locking. In that case, bd still needs to be set. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Set s_mode before parsing mount optionsAndrew Price2015-09-231-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the generic mount_bdev() function, deactivate_locked_super() is called after the fill_super() call fails, at which point s_mode has been set. kill_block_super() expects this and dumps a warning when FMODE_EXCL is not set in s_mode. In gfs2_mount() we call deactivate_locked_super() on failure of gfs2_mount_args(), at which point s_mode has not yet been set. This causes kill_block_super() to dump a stack trace when gfs2 fails to mount with invalid options. Set s_mode earlier in gfs2_mount() to avoid that. Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: fallocate: do not rely on file_update_time to mark the inode dirtyAndrew Price2015-09-221-1/+1
| | | | | | | | | | | | | | | | | | Previously __gfs2_fallocate() relied on file_update_time() marking the inode dirty, but that's not a safe assumption as that function doesn't dirty the inode in some cases. Mark the inode dirty explicitly. Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
* | Move locks API users to locks_lock_inode_wait()Benjamin Coddington2015-10-221-4/+4
|/ | | | | | | | | Instead of having users check for FL_POSIX or FL_FLOCK to call the correct locks API function, use the check within locks_lock_inode_wait(). This allows for some later cleanup. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
* Merge tag 'gfs2-merge-window' of ↵Linus Torvalds2015-09-1111-285/+212
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull GFS2 updates from Bob Peterson: "Here is a list of patches we've accumulated for GFS2 for the current upstream merge window. This time we've only got six patches, many of which are very small: - three cleanups from Andreas Gruenbacher, including a nice cleanup of the sequence file code for the sbstats debugfs file. - a patch from Ben Hutchings that changes statistics variables from signed to unsigned. - two patches from me that increase GFS2's glock scalability by switching from a conventional hash table to rhashtable" * tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: A minor "sbstats" cleanup gfs2: Fix a typo in a comment gfs2: Make statistics unsigned, suitable for use with do_div() GFS2: Use resizable hash table for glocks GFS2: Move glock superblock pointer to field gl_name gfs2: Simplify the seq file code for "sbstats"
| * gfs2: A minor "sbstats" cleanupAndreas Gruenbacher2015-09-031-7/+6
| | | | | | | | | | | | | | It seems cleaner to avoid the temporary value here. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Fix a typo in a commentAndreas Gruenbacher2015-09-031-1/+1
| | | | | | | | | | Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Make statistics unsigned, suitable for use with do_div()Ben Hutchings2015-09-034-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | None of these statistics can meaningfully be negative, and the numerator for do_div() must have the type u64. The generic implementation of do_div() used on some 32-bit architectures asserts that, resulting in a compiler error in gfs2_rgrp_congested(). Fixes: 0166b197c2ed ("GFS2: Average in only non-zero round-trip times ...") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Andreas Gruenbacher <agruenba@redhat.com>
| * GFS2: Use resizable hash table for glocksBob Peterson2015-09-032-166/+102
| | | | | | | | | | | | | | | | | | | | This patch changes the glock hash table from a normal hash table to a resizable hash table, which scales better. This also simplifies a lot of code. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
| * GFS2: Move glock superblock pointer to field gl_nameBob Peterson2015-09-0311-74/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | What uniquely identifies a glock in the glock hash table is not gl_name, but gl_name and its superblock pointer. This patch makes the gl_name field correspond to a unique glock identifier. That will allow us to simplify hashing with a future patch, since the hash algorithm can then take the gl_name and hash its components in one operation. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
| * gfs2: Simplify the seq file code for "sbstats"Andreas Gruenbacher2015-09-031-20/+11
| | | | | | | | | | | | | | | | Don't use struct gfs2_glock_iter as the helper data structure for iterating through "sbstats"; we are not iterating through glocks here. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
* | fs: create and use seq_show_option for escapingKees Cook2015-09-041-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many file systems that implement the show_options hook fail to correctly escape their output which could lead to unescaped characters (e.g. new lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This could lead to confusion, spoofed entries (resulting in things like systemd issuing false d-bus "mount" notifications), and who knows what else. This looks like it would only be the root user stepping on themselves, but it's possible weird things could happen in containers or in other situations with delegated mount privileges. Here's an example using overlay with setuid fusermount trusting the contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use of "sudo" is something more sneaky: $ BASE="ovl" $ MNT="$BASE/mnt" $ LOW="$BASE/lower" $ UP="$BASE/upper" $ WORK="$BASE/work/ 0 0 none /proc fuse.pwn user_id=1000" $ mkdir -p "$LOW" "$UP" "$WORK" $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt $ cat /proc/mounts none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0 none /proc fuse.pwn user_id=1000 0 0 $ fusermount -u /proc $ cat /proc/mounts cat: /proc/mounts: No such file or directory This fixes the problem by adding new seq_show_option and seq_show_option_n helpers, and updating the vulnerable show_option handlers to use them as needed. Some, like SELinux, need to be open coded due to unusual existing escape mechanisms. [akpm@linux-foundation.org: add lost chunk, per Kees] [keescook@chromium.org: seq_show_option should be using const parameters] Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Jan Kara <jack@suse.com> Acked-by: Paul Moore <paul@paul-moore.com> Cc: J. R. Okajima <hooanon05g@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | block: remove bio_get_nr_vecs()Kent Overstreet2015-08-131-8/+1
| | | | | | | | | | | | | | | | | | | | | | | | We can always fill up the bio now, no need to estimate the possible size based on queue parameters. Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> [hch: rebased and wrote a changelog] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* | block: add a bi_error field to struct bioChristoph Hellwig2015-07-292-8/+8
|/ | | | | | | | | | | | | | | | | | | | | | | | Currently we have two different ways to signal an I/O error on a BIO: (1) by clearing the BIO_UPTODATE flag (2) by returning a Linux errno value to the bi_end_io callback The first one has the drawback of only communicating a single possible error (-EIO), and the second one has the drawback of not beeing persistent when bios are queued up, and are not passed along from child to parent bio in the ever more popular chaining scenario. Having both mechanisms available has the additional drawback of utterly confusing driver authors and introducing bugs where various I/O submitters only deal with one of them, and the others have to add boilerplate code to deal with both kinds of error returns. So add a new bi_error field to store an errno value directly in struct bio and remove the existing mechanisms to clean all this up. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* Merge tag 'gfs2-merge-window' of ↵Linus Torvalds2015-06-2711-145/+435
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org:/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull GFS2 updates from Bob Peterson: "Here are the patches we've accumulated for GFS2 for the current upstream merge window. We have a good mixture this time. Here are some of the features: - Fix a problem with RO mounts writing to the journal. - Further improvements to quotas on GFS2. - Added support for rename2 and RENAME_EXCHANGE on GFS2. - Increase performance by making glock lru_list less of a bottleneck. - Increase performance by avoiding unnecessary buffer_head releases. - Increase performance by using average glock round trip time from all CPUs. - Fixes for some compiler warnings and minor white space issues. - Other misc bug fixes" * tag 'gfs2-merge-window' of git://git.kernel.org:/pub/scm/linux/kernel/git/gfs2/linux-gfs2: GFS2: Don't brelse rgrp buffer_heads every allocation GFS2: Don't add all glocks to the lru gfs2: Don't support fallocate on jdata files gfs2: s64 cast for negative quota value gfs2: limit quota log messages gfs2: fix quota updates on block boundaries gfs2: fix shadow warning in gfs2_rbm_find() gfs2: kerneldoc warning fixes gfs2: convert simple_str to kstr GFS2: make sure S_NOSEC flag isn't overwritten GFS2: add support for rename2 and RENAME_EXCHANGE gfs2: handle NULL rgd in set_rgrp_preferences GFS2: inode.c: indent with TABs, not spaces GFS2: mark the journal idle to fix ro mounts GFS2: Average in only non-zero round-trip times for congestion stats GFS2: Use average srttb value in congestion calculations
| * GFS2: Don't brelse rgrp buffer_heads every allocationBob Peterson2015-06-193-7/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows the block allocation code to retain the buffers for the resource groups so they don't need to be re-read from buffer cache with every request. This is a performance improvement that's especially noticeable when resource groups are very large. For example, with 2GB resource groups and 4K blocks, there can be 33 blocks for every resource group. This patch allows those 33 buffers to be kept around and not read in and thrown away with every operation. The buffers are released when the resource group is either synced or invalidated. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Steven Whitehouse <swhiteho@redhat.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
| * GFS2: Don't add all glocks to the lruBob Peterson2015-06-183-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The glocks used for resource groups often come and go hundreds of thousands of times per second. Adding them to the lru list just adds unnecessary contention for the lru_lock spin_lock, especially considering we're almost certainly going to re-use the glock and take it back off the lru microseconds later. We never want the glock shrinker to cull them anyway. This patch adds a new bit in the glops that determines which glock types get put onto the lru list and which ones don't. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
| * gfs2: Don't support fallocate on jdata filesAbhi Das2015-06-091-1/+1
| | | | | | | | | | | | | | | | We cannot provide an efficient implementation due to the headers on the data blocks, so there doesn't seem much point in having it. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: s64 cast for negative quota valueAbhi Das2015-06-081-1/+1
| | | | | | | | | | | | | | | | | | One-line fix to cast quota value to s64 before comparison. By default the quantity is treated as u64. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
| * gfs2: limit quota log messagesAbhi Das2015-06-022-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes the quota subsystem only report once that a particular user/group has exceeded their allotted quota. Previously, it was possible for a program to continuously try exceeding quota (despite receiving EDQUOT) and in turn trigger gfs2 to issue a kernel log message about quota exceed. In theory, this could get out of hand and flood the log and the filesystem hosting the log files. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: fix quota updates on block boundariesAbhi Das2015-06-022-80/+119
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For smaller block sizes (512B, 1K, 2K), some quotas straddle block boundaries such that the usage value is on one block and the rest of the quota is on the previous block. In such cases, the value does not get updated correctly. This patch fixes that by addressing the boundary conditions correctly. This patch also adds a (s64) cast that was missing in a call to gfs2_quota_change() in inode.c Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: fix shadow warning in gfs2_rbm_find()Fabian Frederick2015-05-181-3/+1
| | | | | | | | | | | | | | bi was already declared and initialized globally in gfs2_rbm_find() Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: kerneldoc warning fixesFabian Frederick2015-05-051-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following kernel-doc warnings: Warning(fs/gfs2/aops.c:180): No description found for parameter 'wbc' Warning(fs/gfs2/aops.c:236): No description found for parameter 'end' Warning(fs/gfs2/aops.c:236): No description found for parameter 'done_index' Warning(fs/gfs2/aops.c:236): Excess function parameter 'writepage' description in 'gfs2_write_jdata_pagevec' Warning(fs/gfs2/aops.c:346): Excess function parameter 'writepage' description in 'gfs2_write_cache_jdata' Warning(fs/gfs2/aops.c:346): Excess function parameter 'data' description in 'gfs2_write_cache_jdata' Warning(fs/gfs2/aops.c:605): No description found for parameter 'file' Warning(fs/gfs2/aops.c:605): No description found for parameter 'mapping' Warning(fs/gfs2/aops.c:605): No description found for parameter 'pages' Warning(fs/gfs2/aops.c:605): No description found for parameter 'nr_pages' Warning(fs/gfs2/aops.c:870): No description found for parameter 'copied' Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: convert simple_str to kstrFabian Frederick2015-05-051-18/+48
| | | | | | | | | | | | | | | | | | | | | | | | -Remove obsolete simple_str functions. -Return error code when kstr failed. -This patch also calls functions corresponding to destination type. Thanks to Alexey Dobriyan for suggesting improvements in block_store() and wdack_store() Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: make sure S_NOSEC flag isn't overwrittenBenjamin Marzinski2015-05-051-1/+1
| | | | | | | | | | | | | | | | | | At the end of gfs2_set_inode_flags inode->i_flags is set to flags, so we should be modifying flags instead of inode->i_flags, so it isn't overwritten. Signed-off-by: Benjamin Marzinski <bmarzins redhat com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: add support for rename2 and RENAME_EXCHANGEBenjamin Marzinski2015-05-051-16/+189
| | | | | | | | | | | | | | | | | | gfs2 now uses the rename2 directory iop, and supports the RENAME_EXCHANGE flag (as well as RENAME_NOREPLACE, which the vfs takes care of). Signed-off-by: Benjamin Marzinski <bmarzins redhat com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
OpenPOWER on IntegriCloud