summaryrefslogtreecommitdiffstats
path: root/fs/ocfs2/aops.c
Commit message (Collapse)AuthorAgeFilesLines
* ocfs2: Fix comparison in ocfs2_size_fits_inline_data()Mark Fasheh2007-11-271-1/+1
| | | | | | This was causing us to prematurely push out inline data by one byte. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: fix write() performance regressionMark Fasheh2007-11-061-0/+22
| | | | | | | | | On file systems which don't support sparse files, Ocfs2_map_page_blocks() was reading blocks on appending writes. This caused write performance to suffer dramatically. Fix this by detecting an appending write on a nonsparse fs and skipping the read. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: convert to new aopsNick Piggin2007-10-161-6/+8
| | | | | | | | | | | Plug ocfs2 into the ->write_begin and ->write_end aops. A bunch of custom code is now gone - the iovec iteration stuff during write and the ocfs2 splice write actor. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: Write support for inline dataMark Fasheh2007-10-121-2/+171
| | | | | | | | | | | | | | | | | | This fixes up write, truncate, mmap, and RESVSP/UNRESVP to understand inline inode data. For the most part, the changes to the core write code can be relied on to do the heavy lifting. Any code calling ocfs2_write_begin (including shared writeable mmap) can count on it doing the right thing with respect to growing inline data to an extent tree. Size reducing truncates, including UNRESVP can simply zero that portion of the inode block being removed. Size increasing truncatesm, including RESVP have to be a little bit smarter and grow the inode to an extent tree if necessary. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
* ocfs2: Read support for inline dataMark Fasheh2007-10-121-4/+76
| | | | | | | | | | This hooks up ocfs2_readpage() to populate a page with data from an inode block. Direct IO reads from inline data are modified to fall back to buffered I/O. Appropriate checks are also placed in the extent map code to avoid reading an extent list when inline data might be stored. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
* ocfs2: Small refactor of truncate zeroing codeMark Fasheh2007-10-121-8/+12
| | | | | | | | | | | | | | | | | | | We'll want to reuse most of this when pushing inline data back out to an extent. Keeping this part as a seperate patch helps to keep the upcoming changes for write support uncluttered. The core portion of ocfs2_zero_cluster_pages() responsible for making sure a page is mapped and properly dirtied is abstracted out into it's own function, ocfs2_map_and_dirty_page(). Actual functionality doesn't change, though zeroing becomes optional. We also turn part of ocfs2_free_write_ctxt() into a common function for unlocking and freeing a page array. This operation is very common (and uniform) for Ocfs2 cluster sizes greater than page size, so it makes sense to keep the code in one place. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
* ocfs2: move nonsparse hole-filling into ocfs2_write_begin()Mark Fasheh2007-10-121-4/+36
| | | | | | | | | By doing this, we can remove any higher level logic which has to have knowledge of btree functionality - any callers of ocfs2_write_begin() can now expect it to do anything necessary to prepare the inode for new data. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
* ocfs2: Don't double set write parametersMark Fasheh2007-09-201-13/+3
| | | | | | | | | | The target page offsets were being incorrectly set a second time in ocfs2_prepare_page_for_write(), which was causing problems on a 16k page size kernel. Additionally, ocfs2_write_failure() was incorrectly using those parameters instead of the parameters for the individual page being cleaned up. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Fix pos/len passed to ocfs2_write_clusterMark Fasheh2007-09-201-1/+16
| | | | | | | | This was broken for file systems whose cluster size is greater than page size. Pos needs to be incremented as we loop through the descriptors, and len needs to be capped to the size of a single cluster. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: Fix a wrong cluster calculation.tao.ma@oracle.com2007-09-111-1/+3
| | | | | | | | | | | | In ocfs2_alloc_write_write_ctxt, the written clusters length is calculated by the byte length only. This may cause some problems if we start to write at some position in the end of one cluster and last to a second cluster while the "len" is smaller than a cluster size. In that case, we have to write 2 clusters actually. So we have to take the start position into consideration also. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* mm: merge populate and nopage into fault (fixes nonlinear)Nick Piggin2007-07-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes the virtual address -> file offset differently from linear mappings. ->populate is a layering violation because the filesystem/pagecache code should need to know anything about the virtual memory mapping. The hitch here is that the ->nopage handler didn't pass down enough information (ie. pgoff). But it is more logical to pass pgoff rather than have the ->nopage function calculate it itself anyway (because that's a similar layering violation). Having the populate handler install the pte itself is likewise a nasty thing to be doing. This patch introduces a new fault handler that replaces ->nopage and ->populate and (later) ->nopfn. Most of the old mechanism is still in place so there is a lot of duplication and nice cleanups that can be removed if everyone switches over. The rationale for doing this in the first place is that nonlinear mappings are subject to the pagefault vs invalidate/truncate race too, and it seemed stupid to duplicate the synchronisation logic rather than just consolidate the two. After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in pagecache. Seems like a fringe functionality anyway. NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no users have hit mainline yet. [akpm@linux-foundation.org: cleanup] [randy.dunlap@oracle.com: doc. fixes for readahead] [akpm@linux-foundation.org: build fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] ocfs2: zero_user_page conversionEric Sandeen2007-07-101-11/+2
| | | | | Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Support creation of unwritten extentsMark Fasheh2007-07-101-1/+1
| | | | | | | | | | This can now be trivially supported with re-use of our existing extend code. ocfs2_allocate_unwritten_extents() takes a start offset and a byte length and iterates over the inode, adding extents (marked as unwritten) until len is reached. Existing extents are skipped over. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: support writing of unwritten extentsMark Fasheh2007-07-101-20/+74
| | | | | | | | | | Update the write code to detect when the user is asking to write to an unwritten extent. Like writing to a hole, we must zero the region between the write and the cluster boundaries. Most of the existing cluster zeroing logic can be re-used with some additional checks for the unwritten flag on extent records. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: small cleanup of ocfs2_write_begin_nolock()Mark Fasheh2007-07-101-32/+76
| | | | | | | We can easily seperate out the write descriptor setup and manipulation into helper functions. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: plug truncate into cached dealloc routinesMark Fasheh2007-07-101-0/+1
| | | | Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: harden buffer check during mapping of page blocksMark Fasheh2007-07-101-1/+2
| | | | | | | | We don't want to submit buffer_new blocks for read i/o. This actually won't happen right now because those requests during an allocating write are all nicely aligned. It's probably a good idea to provide an explicit check though. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: shared writeable mmapMark Fasheh2007-07-101-15/+41
| | | | | | | Implement cluster consistent shared writeable mappings using the ->page_mkwrite() callback. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: factor out write aops into nolock variantsMark Fasheh2007-07-101-40/+80
| | | | | | | ocfs2_mkwrite() will want this so that it can add some mmap specific checks before asking for a write. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: rework ocfs2_buffered_write_cluster()Mark Fasheh2007-07-101-334/+478
| | | | | | | | | | | Use some ideas from the new-aops patch series and turn ocfs2_buffered_write_cluster() into a 2 stage operation with the caller copying data in between. The code now understands multiple cluster writes as a result of having to deal with a full page write for greater than 4k pages. This sets us up to easily call into the write path during ->page_mkwrite(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Fix invalid assertion during write on 64k pagesMark Fasheh2007-06-061-10/+10
| | | | | | | | | | | | The write path code intends to bug if a math error (or unhandled case) results in a write outside of the current cluster boundaries. The actual BUG_ON() statements however are incorrect, leading to a crash on kernels with 64k page size. Fix those by checking against the right variables. Also, move the assertions higher up within the functions so that they trip *before* the code starts to mark buffers. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: use zero_user_pageNate Diller2007-05-251-4/+1
| | | | | | | | Use zero_user_page() instead of open-coding it. Signed-off-by: Nate Diller <nate.diller@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: trylock in ocfs2_readpage()Mark Fasheh2007-05-251-1/+5
| | | | | | | | Similarly to the page lock / cluster lock inversion in ocfs2_readpage, we can deadlock on ip_alloc_sem. We can down_read_trylock() instead and just return AOP_TRUNCATED_PAGE if the operation fails. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Force use of GFP_NOFS in ocfs2_write()Mark Fasheh2007-05-021-1/+1
| | | | | | We can otherwise recurse into the file system. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: fix sparse warnings in fs/ocfs2Mark Fasheh2007-05-021-1/+2
| | | | | | | None of these are actually harmful, but the noise makes looking for real problems difficult. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] fs/ocfs2/: make 3 functions staticAdrian Bunk2007-05-021-3/+3
| | | | | | | | | | | This patch makes the following needlessly global functions static: - aops.c: ocfs2_write_data_page() - dlmglue.c: ocfs2_dump_meta_lvb_info() - file.c: ocfs2_set_inode_size() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remember rw lock level during direct ioMark Fasheh2007-04-261-2/+7
| | | | | | | Cluster locking might have been redone because a direct write won't complete, so this needs to be reflected in the iocb. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Fix up i_blocks calculation to know about holesMark Fasheh2007-04-261-1/+1
| | | | | | | | Older file systems which didn't support holes did a dumb calculation of i_blocks based on i_size. This is no longer accurate, so fix things up to take actual allocation into account. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Fix extent lookup to return true size of holesMark Fasheh2007-04-261-2/+1
| | | | | | | | Initially, we had wired things to return a size '1' of holes. Cook up a small amount of code to find the next extent and calculate the number of clusters between the virtual offset and the next allocated extent. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Read from an unwritten extent returns zerosMark Fasheh2007-04-261-7/+14
| | | | | | | | Return an optional extent flags field from our lookup functions and wire up callers to treat unwritten regions as holes for the purpose of returning zeros to the user. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Use own splice write actorMark Fasheh2007-04-261-0/+69
| | | | | | | | We need to fill holes during a splice write. Provide our own splice write actor which can call ocfs2_file_buffered_write() with a splice-specific callback. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: zero tail of sparse files on truncateMark Fasheh2007-04-261-19/+15
| | | | | | | | | | Since we don't zero on extend anymore, truncate needs to be fixed up to zero the part of a file between i_size and and end of it's cluster. Otherwise a subsequent extend could expose bad data. This introduced a new helper, which can be used in ocfs2_write(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Teach ocfs2_get_block() about holesMark Fasheh2007-04-261-38/+61
| | | | | | | | ocfs2_get_block() didn't understand sparse files, fix that. Also remove some code that isn't really useful anymore. We can fix up ocfs2_direct_IO_get_blocks() at the same time. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: remove ocfs2_prepare_write() and ocfs2_commit_write()Mark Fasheh2007-04-261-120/+5
| | | | | | | These are no longer used, and can't handle file systems with sparse file allocation. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: teach ocfs2_file_aio_write() about sparse filesMark Fasheh2007-04-261-16/+663
| | | | | | | | | | | | Unfortunately, ocfs2 can no longer make use of generic_file_aio_write_nlock() because allocating writes will require zeroing of pages adjacent to the I/O for cluster sizes greater than page size. Implement a custom file write here, which can order page locks for zeroing. This also has the advantage that cluster locks can easily be ordered outside of the page locks. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: temporarily remove extent map cachingMark Fasheh2007-04-261-5/+3
| | | | | | | | | | | The code in extent_map.c is not prepared to deal with a subtree being rotated between lookups. This can happen when filling holes in sparse files. Instead of a lengthy patch to update the code (which would likely lose the benefit of caching subtree roots), we remove most of the algorithms and implement a simple path based lookup. A less ambitious extent caching scheme will be added in a later patch. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: add some missing address space callbacksJoel Becker2007-03-141-1/+25
| | | | | | | | | | | | | Under load, OCFS2 would crash in invalidate_inode_pages2_range() because invalidate_complete_page2() was unable to invalidate a page. It would appear that JBD is holding on to the page. ext3 has a specific ->releasepage() handler to cover this case. Steal ext3's ->releasepage(), ->invalidatepage(), and ->migratepage(), as they appear completely appropriate for OCFS2. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Allow direct I/O read past end of fileMark Fasheh2006-12-281-7/+17
| | | | | | | | | | | ocfs2_direct_IO_get_blocks() was incorrectly returning -EIO for a direct I/O read whose start block was past the end of the file allocation tree. Fix things so that we return a hole instead. do_direct_IO() will then notice that the range start is past eof and return a short read. While there, remove the unused vbo_max variable. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] struct path: convert ocfs2Josef Sipek2006-12-081-2/+2
| | | | | | Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* ocfs2: Remove struct ocfs2_journal_handle in favor of handle_tMark Fasheh2006-12-011-4/+4
| | | | | | | | | | This is mostly a search and replace as ocfs2_journal_handle is now no more than a container for a handle_t pointer. ocfs2_commit_trans() becomes very straight forward, and we remove some out of date comments / code. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: remove handle argument to ocfs2_start_trans()Mark Fasheh2006-12-011-1/+1
| | | | | | | | | All callers either pass in NULL directly, or a local variable that is already set to NULL. The internals of ocfs2_start_trans() get a nice cleanup as a result. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: pass ocfs2_super * into ocfs2_commit_trans()Mark Fasheh2006-12-011-2/+2
| | | | | | This sets us up to remove handle->journal. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: remove unused handle argument from ocfs2_meta_lock_full()Mark Fasheh2006-12-011-4/+4
| | | | | | Now that this is unused and all callers pass NULL, we can safely remove it. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: properly update i_mtime on buffered writeMark Fasheh2006-09-201-49/+34
| | | | | | | | We weren't always updating i_mtime on writes, so fix ocfs2_commit_write() to handle this. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Acked-by: Zach Brown <zach.brown@oracle.com>
* ocfs2: remove redundant NULL checks in ocfs2_direct_IO_get_blocks()Florin Malita2006-06-291-8/+1
| | | | | Signed-off-by: Florin Malita <fmalita@gmail.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] mark address_space_operations constChristoph Hellwig2006-06-281-1/+1
| | | | | | | | | | Same as with already do with the file operations: keep them in .rodata and prevents people from doing runtime patching. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* ocfs2: take data locks around extendMark Fasheh2006-05-171-7/+39
| | | | | | | | | We need to take a data lock around extends to protect the pages that ocfs2_zero_extend is going to be pulling into the page cache. Otherwise an extend on one node might populate the page cache with data pages that have no lock coverage. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] remove ->get_blocks() supportBadari Pulavarty2006-03-261-1/+1
| | | | | | | | | | Now that get_block() can handle mapping multiple disk blocks, no need to have ->get_blocks(). This patch removes fs specific ->get_blocks() added for DIO and makes it users use get_block() instead. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* ocfs2: don't use MLF* in the file systemMark Fasheh2006-03-241-8/+10
| | | | Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] OCFS2: The Second Oracle Cluster FilesystemMark Fasheh2006-01-031-0/+643
The OCFS2 file system module. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
OpenPOWER on IntegriCloud