When remapping a block to a cache fast device that is larger than 2TB,
we must not truncate the destination sector to 32 bits. The 32-bit
intermediate result of from_cblock() was overflowing in
remap_to_cache() due to the logical left shift.
Use an intermediate 64-bit type to store the from_cblock() result and
avoid the overflow.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
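For illustration, a minimal userspace sketch of the failure mode, assuming
512-byte sectors (the kernel code uses a logical left shift by the
block-size shift; an unsigned multiply is used here so the 32-bit
wraparound is well defined):

    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t sector_t;

    /* buggy: the product is computed in 32-bit arithmetic, so the high
     * bits of the destination sector are lost past 2TB */
    static sector_t remap_buggy(uint32_t block, uint32_t sectors_per_block)
    {
            return block * sectors_per_block;   /* wraps at 32 bits */
    }

    /* fixed: widen to 64 bits first, as the patch does */
    static sector_t remap_fixed(uint32_t block, uint32_t sectors_per_block)
    {
            sector_t b = block;                 /* intermediate 64-bit value */
            return b * sectors_per_block;
    }

    int main(void)
    {
            /* 64M cache blocks of 128 sectors each == 4TB in 512-byte sectors */
            uint32_t block = 64U * 1024 * 1024, spb = 128;

            printf("buggy: %llu, fixed: %llu\n",
                   (unsigned long long)remap_buggy(block, spb),
                   (unsigned long long)remap_fixed(block, spb));
            return 0;
    }

The buggy variant prints 0 for this input; the fixed one prints 8589934592.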
It was always intended that a user could provide a thin metadata device
that is larger than the max supported by the on-disk format. The extra
space would just go unused.
Unfortunately that never worked. If the user attempted to use a larger
metadata device on creation, they would get an error like the following:

    device-mapper: space map common: space map too large
    device-mapper: transaction manager: couldn't create metadata space map
    device-mapper: thin metadata: tm_create_with_sm failed
    device-mapper: table: 252:17: thin-pool: Error creating metadata object
    device-mapper: ioctl: error adding target to table

Fix this by allowing the initial metadata space map creation to cap its
size at the maximum number of blocks supported (DM_SM_METADATA_MAX_BLOCKS).
get_metadata_dev_size() must also impose DM_SM_METADATA_MAX_BLOCKS (via
THIN_METADATA_MAX_SECTORS); otherwise extending metadata would cap at
THIN_METADATA_MAX_SECTORS_WARNING (which is larger than supported).
Also, the calculation for THIN_METADATA_MAX_SECTORS didn't account for
the size of the disk_bitmap_header, so the supported maximum metadata
size is a bit smaller (reduced from 33423360 to 33292800 sectors).
Lastly, remove the "excess space will not be used" warning message from
get_metadata_dev_size(); it resulted in the warning being printed
multiple times. Instead, factor out warn_if_metadata_device_too_big()
and call it from pool_ctr() and maybe_resize_metadata_dev().
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
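A small standalone sketch of the cap-and-warn behaviour described above;
the sector constant comes from this commit message, while the function
shapes are illustrative rather than the kernel signatures:

    #include <stdint.h>
    #include <stdio.h>

    /* maximum supported metadata size, in 512-byte sectors (per the text) */
    #define THIN_METADATA_MAX_SECTORS 33292800ULL

    /* cap the usable size at the on-disk format's maximum; any excess
     * device space simply goes unused */
    static uint64_t get_metadata_dev_size(uint64_t dev_sectors)
    {
            return dev_sectors > THIN_METADATA_MAX_SECTORS
                    ? THIN_METADATA_MAX_SECTORS : dev_sectors;
    }

    /* factored-out warning, emitted once per call site rather than on
     * every size query */
    static void warn_if_metadata_device_too_big(uint64_t dev_sectors)
    {
            if (dev_sectors > THIN_METADATA_MAX_SECTORS)
                    fprintf(stderr, "metadata device is larger than %llu "
                            "sectors: excess space will not be used\n",
                            (unsigned long long)THIN_METADATA_MAX_SECTORS);
    }

    int main(void)
    {
            uint64_t dev = 40000000ULL;         /* larger than the max */

            warn_if_metadata_device_too_big(dev);
            printf("usable sectors: %llu\n",
                   (unsigned long long)get_metadata_dev_size(dev));
            return 0;
    }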
An invalid ioctl will never become valid, irrespective of whether
multipath has active paths or not. So for invalid ioctls we do not have
to wait for multipath to activate any paths and can return an error
code immediately. This fix resolves numerous instances of:

    udevd[]: worker [] unexpectedly returned with status 0x0100

that were seen during testing.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
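A minimal sketch of the fail-fast ordering; ioctl_cmd_is_valid() and
issue_on_active_path() are hypothetical stand-ins for the real dm-mpath
internals:

    #include <errno.h>
    #include <stdbool.h>

    /* hypothetical stubs standing in for the real dm-mpath code */
    static bool ioctl_cmd_is_valid(unsigned int cmd) { return cmd != 0; }
    static int issue_on_active_path(unsigned int cmd) { (void)cmd; return 0; }

    static int mpath_ioctl_sketch(unsigned int cmd, bool have_active_path)
    {
            if (!ioctl_cmd_is_valid(cmd))
                    return -ENOTTY; /* never becomes valid: fail at once */

            if (!have_active_path)
                    return -EAGAIN; /* valid command: worth retrying later */

            return issue_on_active_path(cmd);
    }

The point is only the ordering: the validity check happens before any
waiting for a usable path.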
dm_pool_close_thin_device() must be called if dm_set_target_max_io_len()
fails in thin_ctr(). Otherwise __pool_destroy() will fail because the
pool will still have an open thin device:
    device-mapper: thin metadata: attempt to close pmd when 1 device(s) are still open
    device-mapper: thin: __pool_destroy: dm_pool_metadata_close() failed.

Also, an error code must be established when thin_ctr() fails because
the pool is in fail_io mode.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org
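A sketch of the corrected unwind order, with stubs standing in for the
dm-thin helpers named above:

    #include <errno.h>

    /* stub stand-ins for the dm-thin helpers named in the message */
    static int  dm_pool_open_thin_device(void)  { return 0; }
    static void dm_pool_close_thin_device(void) { }
    static int  dm_set_target_max_io_len(void)  { return -EINVAL; }

    static int thin_ctr_sketch(int pool_in_fail_io_mode)
    {
            int r;

            if (pool_in_fail_io_mode)
                    return -EINVAL; /* error code must be set explicitly */

            r = dm_pool_open_thin_device();
            if (r)
                    return r;

            r = dm_set_target_max_io_len();
            if (r)
                    goto bad_close; /* this close was the missing step */

            return 0;

    bad_close:
            dm_pool_close_thin_device();
            return r;
    }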
When restoring bi_end_io, increase bi_remaining before retrying the bio
to avoid BUG_ON(atomic_read(&bio->bi_remaining) <= 0) in bio_endio().
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
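A userspace model of the rule, assuming an atomic completion counter
like 3.14's bio->bi_remaining (the struct and names here are
illustrative):

    #include <stdatomic.h>

    struct bio_model {
            atomic_int bi_remaining;    /* outstanding completions */
            void (*bi_end_io)(struct bio_model *);
    };

    /* restore the saved completion hook, then rearm the counter so the
     * eventual bio_endio() doesn't drive it below zero */
    static void retry_bio(struct bio_model *bio,
                          void (*saved_end_io)(struct bio_model *))
    {
            bio->bi_end_io = saved_end_io;
            atomic_fetch_add(&bio->bi_remaining, 1);
            /* ...resubmit the bio... */
    }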
Commit 003b5c5719f159f4f4bf97511c4702a0638313dd ("block: Convert drivers
to immutable biovecs") broke dm-mirror due to dm-io breakage.
dm-io had three possible iterators (DM_IO_PAGE_LIST, DM_IO_BVEC,
DM_IO_VMA) that iterate over pages where the I/O should be performed.
The switch to immutable biovecs changed the DM_IO_BVEC iterator to
DM_IO_BIO. Before this change the iterator stored the pointer to a bio
vector in the dpages structure. The iterator incremented the pointer in
the dpages structure as it advanced over the pages. After the immutable
biovecs change, the DM_IO_BIO iterator stores a pointer to the bio in
the dpages structure and uses bio_advance to change the bio as it
advances.
The problem is that the function dispatch_io stores the content of the
dpages structure into the variable old_pages and restores it before
issuing I/O to each of the devices. Before the change, the statement
"*dp = old_pages;" restored the iterator to its starting position.
After the change, struct dpages holds a pointer to the bio, thus the
statement "*dp = old_pages;" doesn't restore the iterator.
Consequently, in the context of dm-mirror, only the first mirror leg is
written correctly; the kernel locks up when trying to write the other
mirror legs because the number of sectors to write in the where->count
variable doesn't match the number of sectors returned by the iterator.
This patch fixes the bug by partially reverting the original patch:
struct dpages again holds a pointer to the bio vector, so that the
statement "*dp = old_pages;" restores the iterator correctly.
The field "context_u" holds the offset from the beginning of the current
bio vector entry, just like the "bio->bi_iter.bi_bvec_done" field.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
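A small model of why the revert works: with the position held by value
(a bio-vector pointer plus an intra-entry offset), a plain struct copy
really does snapshot and restore the iterator. This is a sketch, not
the kernel code:

    struct bio_vec_model { unsigned int bv_len; };

    struct dpages_model {
            struct bio_vec_model *bvec; /* current vector entry */
            unsigned int done;          /* bytes consumed in *bvec (context_u) */
    };

    static void dp_advance(struct dpages_model *dp, unsigned int bytes)
    {
            while (bytes) {
                    unsigned int left = dp->bvec->bv_len - dp->done;
                    unsigned int step = bytes < left ? bytes : left;

                    dp->done += step;
                    bytes -= step;
                    if (dp->done == dp->bvec->bv_len) {
                            dp->bvec++; /* move to the next entry */
                            dp->done = 0;
                    }
            }
    }

Saving "struct dpages_model old = *dp;" before advancing and assigning
"*dp = old;" afterwards rewinds correctly; had the struct held a
pointer to a bio mutated by bio_advance(), the copy would restore
nothing.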
Commit 905e51b ("dm thin: commit outstanding data every second")
introduced a periodic commit. This periodic commit occurred regardless
of whether any thin devices had made changes.
Fix the periodic commit to check whether any of a pool's thin devices
have changed, using dm_pool_changed_this_transaction().
Reported-by: Alexander Larsson <alexl@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org
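A sketch of the resulting gate, assuming jiffies-style timestamps;
dm_pool_changed_this_transaction() is the real predicate named above,
while the scaffolding around it is illustrative:

    #include <stdbool.h>

    struct dm_pool_metadata;    /* opaque stand-in */

    static bool dm_pool_changed_this_transaction(struct dm_pool_metadata *pmd)
    {
            (void)pmd;
            return false;       /* stub: nothing changed */
    }

    /* commit at most once per interval, and only if something changed */
    static bool should_commit(struct dm_pool_metadata *pmd, unsigned long now,
                              unsigned long last_commit, unsigned long interval)
    {
            if (now - last_commit < interval)
                    return false;
            return dm_pool_changed_this_transaction(pmd);
    }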
When completing an overwrite bio in overwrite_endio(), the associated
migration should not be added to the 'completed_migrations' list until
the bio's fields are restored with dm_unhook_bio(). Otherwise
do_worker() can race to process 'completed_migrations' before
dm_unhook_bio() runs, leaving the bio's bi_end_io incorrect. This is
unlikely to cause any problems given the current code, but it should be
fixed on the basis of correctness.
Also, the cache's spinlock only needs to be held when manipulating the
'completed_migrations' list -- other changes don't need protection.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
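A userspace model of the two rules (finish restoring the bio before
publishing the migration; hold the lock only around the shared list),
using a pthread mutex in place of the cache's spinlock:

    #include <pthread.h>
    #include <stddef.h>

    struct migration_model {
            struct migration_model *next;
            void (*saved_end_io)(void); /* field restored by the "unhook" */
    };

    static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
    static struct migration_model *completed_migrations; /* worker's list */

    static void overwrite_endio_model(struct migration_model *mg,
                                      void (*orig)(void))
    {
            mg->saved_end_io = orig;    /* the dm_unhook_bio() step, first */

            pthread_mutex_lock(&cache_lock);
            mg->next = completed_migrations;    /* publish only when ready */
            completed_migrations = mg;
            pthread_mutex_unlock(&cache_lock);
    }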
Commit c9d28d5d ("dm cache: promotion optimisation for writes")
incorrectly placed the 'hook_info' member in the writethrough-only
portion of the per_bio_data structure.
Given that the overwrite optimization may also be used for writeback,
the 'hook_info' member must be placed above the 'cache' member of the
per_bio_data structure: any members above 'cache' are available from
both the writeback and writethrough modes' per_bio_data.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org # 3.13+
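A sketch of the layout rule; the types here are illustrative stand-ins
and only the ordering relative to 'cache' matters:

    #include <stdbool.h>

    struct dm_deferred_entry;   /* opaque stand-ins */
    struct cache;
    struct dm_hook_info { void *saved_bi_end_io; };

    struct per_bio_data_sketch {
            /* members usable in both writeback and writethrough modes */
            bool tick;
            unsigned int req_nr;
            struct dm_deferred_entry *all_io_entry;
            struct dm_hook_info hook_info;  /* moved above 'cache' */

            /* writethrough-only members start at 'cache' */
            struct cache *cache;
            unsigned long cblock;
    };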
Pull md fixes from Neil Brown:
 "Two bugfixes for md, both tagged for -stable"

* tag 'md/3.14-fixes' of git://neil.brown.name/md:
  md/raid5: Fix CPU hotplug callback registration
  md/raid1: restore ability for check and repair to fix read errors.
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
    get_online_cpus();
    for_each_online_cpu(cpu)
            init_cpu(cpu);
    register_cpu_notifier(&foobar_cpu_notifier);
    put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Interestingly, the raid5 code can actually prevent double initialization and
hence can use the following simplified form of callback registration:
    register_cpu_notifier(&foobar_cpu_notifier);
    get_online_cpus();
    for_each_online_cpu(cpu)
            init_cpu(cpu);
    put_online_cpus();

A hotplug operation that occurs between registering the notifier and
calling get_online_cpus() won't disrupt anything, because the code
takes care to perform the memory allocations only once.
So reorganize the code in raid5 this way to fix the deadlock with callback
registration.
Cc: linux-raid@vger.kernel.org
Cc: stable@vger.kernel.org (v2.6.32+)
Fixes: 36d1c6476be51101778882897b315bd928c8c7b5
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
[Srivatsa: Fixed the unregister_cpu_notifier() deadlock, added the
free_scratch_buffer() helper to condense code further and wrote the changelog.]
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NeilBrown <neilb@suse.de>
commit 30bc9b53878a9921b02e3b5bc4283ac1c6de102a
    md/raid1: fix bio handling problems in process_checks()

moved the bio_reset() to a point before where BIO_UPTODATE is checked,
so that the check now always reports that the bio is uptodate, even if
it is not. This causes process_checks() to sometimes treat read errors
as successful matches, so the good data isn't written out.
This patch preserves the flag until it is needed.
Bug was introduced in 3.11, but backported to 3.10-stable (as it fixed
an even worse bug). So suitable for any -stable since 3.10.
Reported-and-tested-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: stable@vger.kernel.org (3.10+)
Fixes: 30bc9b53878a9921b02e3b5bc4283ac1c6de102a
Signed-off-by: NeilBrown <neilb@suse.de>
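A minimal model of the ordering fix: capture the flag before the reset
clears it, then decide from the saved copy:

    #include <stdbool.h>

    struct bio_flags_model { unsigned int bi_flags; };
    #define BIO_UPTODATE 0x1U

    static bool read_succeeded(struct bio_flags_model *bio)
    {
            bool uptodate = bio->bi_flags & BIO_UPTODATE;   /* save first */

            bio->bi_flags = 0;  /* models bio_reset(): flags are cleared */

            return uptodate;    /* decide from the preserved value */
    }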
Signed-off-by: Nicholas Swenson <nks@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
The BUG_ON at the end of __bch_btree_mark_key can be triggered due to
an integer overflow error:

    BITMASK(GC_SECTORS_USED, struct bucket, gc_mark, 2, 13);
    ...
    SET_GC_SECTORS_USED(g, min_t(unsigned,
                 GC_SECTORS_USED(g) + KEY_SIZE(k),
                 (1 << 14) - 1));
    BUG_ON(!GC_SECTORS_USED(g));

In bcache.h, the SECTORS_USED bitfield is defined to be 13 bits wide.
While the SET_ code tries to ensure that the field doesn't overflow by
clamping it to (1 << 14) - 1 == 16383, this is incorrect because 16383
requires 14 bits. Therefore, if GC_SECTORS_USED() + KEY_SIZE() >= 8192,
the SET_ statement tries to store a 14-bit value in the 13-bit field;
8192 in particular becomes zero, thus triggering the BUG_ON.
To fix this, create a field-width constant and a max-value constant,
and use those to create the bitfield and to check the inputs to
SET_GC_SECTORS_USED. Arguably the BITMASK() template ought to have
BUG_ON checks for too-large values, but that's a separate patch.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
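The arithmetic is easy to reproduce in a standalone sketch (constants
follow the commit text):

    #include <stdio.h>

    #define GC_SECTORS_USED_SIZE 13     /* field width */
    #define MAX_GC_SECTORS_USED ((1U << GC_SECTORS_USED_SIZE) - 1)

    int main(void)
    {
            unsigned int sum = 8000 + 192;      /* == 8192 */

            /* buggy clamp: (1 << 14) - 1 == 16383 needs 14 bits, so 8192
             * passes the clamp and truncates to 0 in the 13-bit field */
            unsigned int bad = (sum < 16383 ? sum : 16383) & MAX_GC_SECTORS_USED;

            /* fixed clamp: derive the max from the field width */
            unsigned int good = sum < MAX_GC_SECTORS_USED
                    ? sum : MAX_GC_SECTORS_USED;

            printf("buggy stored value: %u (triggers the BUG_ON)\n", bad);
            printf("fixed stored value: %u\n", good);
            return 0;
    }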
Pull block IO driver changes from Jens Axboe:
- bcache update from Kent Overstreet.
- two bcache fixes from Nicholas Swenson.
- cciss pci init error fix from Andrew.
- underflow fix in the parallel IDE pg_write code from Dan Carpenter.
I'm sure the 1 (or 0) users of that are now happy.
- two PCI related fixes for sx8 from Jingoo Han.
- floppy init fix for first block read from Jiri Kosina.
- pktcdvd error return miss fix from Julia Lawall.
- removal of IRQF_SHARED from the SEGA Dreamcast CD-ROM code from
Michael Opdenacker.
- comment typo fix for the loop driver from Olaf Hering.
- potential oops fix for null_blk from Raghavendra K T.
- two fixes from Sam Bradshaw (Micron) for the mtip32xx driver, fixing
an OOM problem and a problem with handling security locked conditions.

* 'for-3.14/drivers' of git://git.kernel.dk/linux-block: (47 commits)
  mg_disk: Spelling s/finised/finished/
  null_blk: Null pointer deference problem in alloc_page_buffers
  mtip32xx: Correctly handle security locked condition
  mtip32xx: Make SGL container per-command to eliminate high order dma allocation
  drivers/block/loop.c: fix comment typo in loop_config_discard
  drivers/block/cciss.c:cciss_init_one(): use proper errnos
  drivers/block/paride/pg.c: underflow bug in pg_write()
  drivers/block/sx8.c: remove unnecessary pci_set_drvdata()
  drivers/block/sx8.c: use module_pci_driver()
  floppy: bail out in open() if drive is not responding to block0 read
  bcache: Fix auxiliary search trees for key size > cacheline size
  bcache: Don't return -EINTR when insert finished
  bcache: Improve bucket_prio() calculation
  bcache: Add bch_bkey_equal_header()
  bcache: update bch_bkey_try_merge
  bcache: Move insert_fixup() to btree_keys_ops
  bcache: Convert sorting to btree_keys
  bcache: Convert debug code to btree_keys
  bcache: Convert btree_iter to struct btree_keys
  bcache: Refactor bset_tree sysfs stats
  ...
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
We need to return -EINTR after a split because we invalidated iterators
(and freed the btree node) - but if we were finished inserting, we don't
want to redo the traversal.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
When deciding what order to reuse buckets in, we take into account both
the bucket's priority (which indicates LRU order) and the amount of
live data in that bucket. The way the two were scaled together wasn't
quite right; this patch improves and documents the calculation.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Add a helper that checks whether two keys have equivalent header fields
(good enough for replacement or merging). Used in bch_bkey_try_merge()
and when replacing a key in the btree.
Signed-off-by: Nicholas Swenson <nks@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
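A hedged sketch of such a helper; exactly which header bits are
compared is an assumption here, the point being that the key's offset
and size are deliberately excluded:

    #include <stdbool.h>
    #include <stdint.h>

    struct bkey_sketch {
            unsigned int dirty:1;   /* illustrative header bits */
            unsigned int ptrs:3;
            unsigned int csum:2;
            uint64_t offset;        /* not part of the header comparison */
            uint64_t size;
    };

    static bool bkey_equal_header(const struct bkey_sketch *l,
                                  const struct bkey_sketch *r)
    {
            return l->dirty == r->dirty &&
                   l->ptrs  == r->ptrs  &&
                   l->csum  == r->csum;
    }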
Added generic header checks to bch_bkey_try_merge(), which then calls
the bkey-specific merge function. Removed the now-extraneous checks
from bch_extent_merge().
Signed-off-by: Nicholas Swenson <nks@daterainc.com>
Handling overlapping extents/keys is now a method specific to what the
btree node contains.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
More work to disentangle various code from struct btree
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
More work to disentangle various code from struct btree
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
More work to disentangle bset.c from struct btree
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
We're in the process of turning bset.c into library code, so none of
the code in that file should know about struct cache_set or struct
btree; move the btree-traversal part of the stats code to sysfs.c.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Helper function to explicitly check how much space is free in a btree node
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Soon, bset.c won't need to depend on struct btree.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
More work to disentangle bset.c from the rest of the code.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
More disentangling bset.c from the rest of the bcache code - soon, the
sorting routines won't have any dependencies on any outside structs.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Only use extent comparison for comparing extents, so we're not using
START_KEY() on other key types (i.e. btree pointers)
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
More refactoring:

    node() -> bset_bkey_idx()
    end()  -> bset_bkey_last()
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Getting away from KEY_PTRS and moving toward KEY_U64s, and getting rid
of magic 2s.
Also split out the part that checks against the journal entry size, to
avoid a dependency on struct cache_set in bset.c.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
In the process of disentangling/libraryizing bset.c from the rest of
the bcache code.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
It was a single element mempool before, it's slightly cleaner to just use a real
mempool.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Used this fixed code to find and fix the bug addressed by
a4d885097b0ac0cd1337f171f2d4b83e946094d4.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
That was a terrible name for a macro; add some better helpers to
replace it.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Now that we've got code for raid5/6 stripe awareness, bcache just needs
to know about the stripes and when writing partial stripes is expensive
- we probably don't want to enable this optimization for raid1 or 10,
even though they have stripes. So add a flag to queue_limits.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
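A sketch of what such a hint could look like; the field name here is
illustrative and not checked against the tree:

    #include <stdbool.h>

    struct queue_limits_sketch {
            unsigned int io_opt;                    /* optimal I/O size */
            bool raid_partial_stripes_expensive;    /* raid5/6 sets this */
    };

    /* bcache batches writeback into full stripes only when the backing
     * device says partial-stripe writes are expensive */
    static bool should_batch_full_stripes(const struct queue_limits_sketch *l)
    {
            return l->raid_partial_stripes_expensive;
    }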
This error path shouldn't have been hit in practice... and we've got
reworked reserve code coming soon so that it shouldn't _ever_ be hit...
but if we've got code for this error path it should be correct.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
We need a reserve for allocating buckets for new btree nodes - and now that
we've got multiple btrees, it really needs to be per btree.
This reworks the reserves so we've got separate freelists for each reserve
instead of watermarks, which seems to make things a bit cleaner, and it adds
some code so that btree_split() can make sure the reserve is available before it
starts.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Also flesh out the documentation a bit
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Another minor performance optimization
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Unnecessary since a bucket that has dirty pointers pointing to it can
never be invalidated - and skipping it is a measurable performance
boost, since the bucket gen will usually be a cache miss.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
We were unnecessarily waiting on a journal write to complete when we just needed
to start a journal write and start setting up the next one.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
The real fix is where we check the bytes we need against how much is
remaining. We also need to check for a journal entry bigger than our
buffer: we'll never write one, and it would be bad if we tried to read
one.
Also improve the diagnostic messages.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>