blackbird-op-linux - Blackbird™ Linux sources for OpenPOWER

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge branch 'for-linus' of ↵	Linus Torvalds	2012-07-31	25	-898/+1351
\|\	git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
\|	Pull Ceph changes from Sage Weil:
\|	"Lots of stuff this time around:
\|	 - lots of cleanup and refactoring in the libceph messenger code, and many hard to hit races and bugs closed as a result.
\|	 - lots of cleanup and refactoring in the rbd code from Alex Elder, mostly in preparation for the layering functionality that will be coming in 3.7.
\|	 - some misc rbd cleanups from Josh Durgin that are finally going upstream
\|	 - support for CRUSH tunables (used by newer clusters to improve the data placement)
\|	 - some cleanup in our use of d_parent that Al brought up a while back
\|	 - a random collection of fixes across the tree
\|	There is another patch coming that fixes up our ->atomic_open() behavior, but I'm going to hammer on it a bit more before sending it."
\|	Fix up conflicts due to commits that were already committed earlier in drivers/block/rbd.c, net/ceph/{messenger.c, osd_client.c}
\|	* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (132 commits)
\|	  rbd: create rbd_refresh_helper()
\|	  rbd: return obj version in __rbd_refresh_header()
\|	  rbd: fixes in rbd_header_from_disk()
\|	  rbd: always pass ops array to rbd_req_sync_op()
\|	  rbd: pass null version pointer in add_snap()
\|	  rbd: make rbd_create_rw_ops() return a pointer
\|	  rbd: have __rbd_add_snap_dev() return a pointer
\|	  libceph: recheck con state after allocating incoming message
\|	  libceph: change ceph_con_in_msg_alloc convention to be less weird
\|	  libceph: avoid dropping con mutex before fault
\|	  libceph: verify state after retaking con lock after dispatch
\|	  libceph: revoke mon_client messages on session restart
\|	  libceph: fix handling of immediate socket connect failure
\|	  ceph: update MAINTAINERS file
\|	  libceph: be less chatty about stray replies
\|	  libceph: clear all flags on con_close
\|	  libceph: clean up con flags
\|	  libceph: replace connection state bits with states
\|	  libceph: drop unnecessary CLOSED check in socket state change callback
\|	  libceph: close socket directly from ceph_con_close()
\|	  ...
\| *	rbd: create rbd_refresh_helper()	Alex Elder	2012-07-30	1	-10/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create a simple helper that handles the common case of calling __rbd_refresh_header() while holding the ctl_mutex. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: return obj version in __rbd_refresh_header()	Alex Elder	2012-07-30	1	-14/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a new parameter to __rbd_refresh_header() through which the version of the header object is passed back to the caller. In most cases this isn't needed. The main motivation is to normalize (almost) all calls to __rbd_refresh_header() so they are all wrapped immediately by mutex_lock()/mutex_unlock(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: fixes in rbd_header_from_disk()	Alex Elder	2012-07-30	1	-5/+13
\|	This fixes a few issues in rbd_header_from_disk():
\|	 - There is a check intended to catch overflow, but it's wrong in two ways.
\|	   - First, the type we don't want to overflow is size_t, not unsigned int, and there is now a SIZE_MAX we can use with that type.
\|	   - Second, we're allocating the snapshot ids and snapshot image sizes separately (each has type u64; on disk they are grouped together as a rbd_image_header_ondisk structure). So we can use the size of u64 in this overflow check.
\|	 - If there are no snapshots, then there should be no snapshot names. Enforce this, and issue a warning if we encounter a header with no snapshots but a non-zero snap_names_len.
\|	 - When saving the snapshot names into the header, be more direct in defining the offset in the on-disk structure from which they're being copied by using "snap_count" rather than "i" in the array index.
\|	 - If an error occurs, the "snapc" and "snap_names" fields are freed at the end of the function. Make those fields be null pointers after they're freed, to be explicit that they are no longer valid.
\|	 - Finally, move the definition of the local variable "i" to the innermost scope in which it's needed.
\|	Signed-off-by: Alex Elder <elder@inktank.com>
\|	Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
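The overflow rule and the "no snapshots means no names" rule described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the code in drivers/block/rbd.c: the function names `array_size_fits` and `snap_header_sane` are hypothetical, and `uint64_t` stands in for the kernel's u64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if `count` elements of `elem_size` bytes
 * can be described by a size_t without the multiplication wrapping. */
bool array_size_fits(uint64_t count, size_t elem_size)
{
	assert(elem_size > 0);
	/* Divide instead of multiply so the check itself cannot overflow. */
	return count <= SIZE_MAX / elem_size;
}

/* Hypothetical validity check combining the two rules from the commit
 * message: the snapshot id array must fit in a size_t, and a header
 * with zero snapshots must not carry any snapshot name bytes. */
bool snap_header_sane(uint64_t snap_count, uint64_t snap_names_len)
{
	if (!array_size_fits(snap_count, sizeof(uint64_t)))
		return false;
	if (snap_count == 0 && snap_names_len != 0)
		return false;
	return true;
}
```

The division form of the check is the usual way to test a product against SIZE_MAX, since computing the product first would already have wrapped.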
\| *	rbd: always pass ops array to rbd_req_sync_op()	Alex Elder	2012-07-30	1	-30/+16
\|	All of the callers of rbd_req_sync_op() except one pass a non-null "ops" pointer. The only one that does not is rbd_req_sync_read(), which passes CEPH_OSD_OP_READ as its "opcode" and CEPH_OSD_FLAG_READ for "flags". By allocating the ops array in rbd_req_sync_read() and moving the special case code for the null ops pointer into it, it becomes clear that much of that code is not even necessary. In addition, the "opcode" argument to rbd_req_sync_op() is never actually used, so get rid of that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: pass null version pointer in add_snap()	Alex Elder	2012-07-30	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rbd_header_add_snap() passes the address of a version variable to rbd_req_sync_exec(), but it ignores the result. Just pass a null pointer instead. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: make rbd_create_rw_ops() return a pointer	Alex Elder	2012-07-30	1	-31/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Either rbd_create_rw_ops() will succeed, or it will fail because a memory allocation failed. Have it just return a valid pointer or null rather than stuffing a pointer into a provided address and returning an errno. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
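The calling-convention change described above (return the pointer itself, with null meaning allocation failure, instead of an errno plus an output parameter) might look like the following user-space sketch. Everything here is an assumption for illustration: `struct demo_op` and its fields, and the `demo_create_rw_ops()` / `demo_destroy_rw_ops()` names, are hypothetical stand-ins rather than the rbd driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative stand-in for a request op; fields are hypothetical. */
struct demo_op {
	uint16_t opcode;
	uint32_t payload_len;
};

/* Returns a zero-filled array of num_ops ops with the first entry
 * initialized, or NULL if the allocation fails. The only failure
 * mode is "out of memory", so no separate errno is needed. */
struct demo_op *demo_create_rw_ops(size_t num_ops, uint16_t opcode,
				   uint32_t payload_len)
{
	struct demo_op *ops;

	assert(num_ops > 0);
	ops = calloc(num_ops, sizeof(*ops));
	if (!ops)
		return NULL;
	ops[0].opcode = opcode;
	ops[0].payload_len = payload_len;
	return ops;
}

/* Counterpart that releases the array; accepts NULL like free(). */
void demo_destroy_rw_ops(struct demo_op *ops)
{
	free(ops);
}
```

A caller then writes `ops = demo_create_rw_ops(...); if (!ops) return -ENOMEM;`, which makes it obvious at the call site whether `ops` was assigned.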
\| *	rbd: have __rbd_add_snap_dev() return a pointer	Alex Elder	2012-07-30	1	-15/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's not obvious whether the snapshot pointer whose address is provided to __rbd_add_snap_dev() will be assigned by that function. Change it to return the snapshot, or a pointer-coded errno in the event of a failure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
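A "pointer-coded errno" packs a small negative error number into the pointer value itself, which is what the kernel's ERR_PTR(), IS_ERR() and PTR_ERR() helpers do. The sketch below re-creates that idea in user space under stated assumptions: the `demo_*` helpers, `struct demo_snap`, and `demo_add_snap()` are hypothetical names, and the 4095 limit mirrors the kernel's MAX_ERRNO.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
void *demo_err_ptr(long err)
{
	assert(err < 0 && err >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Recover the errno from an encoded pointer. */
long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True if the pointer lies in the top DEMO_MAX_ERRNO addresses,
 * a range that no valid allocation is assumed to occupy. */
int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical snapshot record used only for this example. */
struct demo_snap {
	char name[32];
};

/* Returns the new snapshot, or an encoded errno on failure. */
struct demo_snap *demo_add_snap(const char *name)
{
	struct demo_snap *snap;

	if (!name || strlen(name) >= sizeof(snap->name))
		return demo_err_ptr(-EINVAL);
	snap = malloc(sizeof(*snap));
	if (!snap)
		return demo_err_ptr(-ENOMEM);
	strcpy(snap->name, name);	/* length was checked above */
	return snap;
}
```

The caller tests the result with `demo_is_err()` before dereferencing it, so there is never any doubt about whether an output pointer was assigned.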
\| *	libceph: recheck con state after allocating incoming message	Sage Weil	2012-07-30	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We drop the lock when calling the ->alloc_msg() con op, which means we need to (a) not clobber con->in_msg without the mutex held, and (b) we need to verify that we are still in the OPEN state when we retake it to avoid causing any mayhem. If the state does change, -EAGAIN will get us back to con_work() and loop. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
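The pattern described above (drop the lock around a call that may block, retake it, then re-verify the state before touching shared fields) can be sketched with a pthread mutex. This is a user-space illustration only: `struct demo_con`, the two-value state enum, and the callback signature are assumptions, not the libceph messenger's types.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

enum demo_state { DEMO_CLOSED, DEMO_OPEN };

/* Hypothetical connection: the mutex protects state and in_msg. */
struct demo_con {
	pthread_mutex_t mutex;
	enum demo_state state;
	void *in_msg;
};

/* Caller holds con->mutex. The allocator may block, so the lock is
 * dropped around it; in_msg is only written once the lock is held
 * again and the connection is confirmed to still be open. */
int demo_in_msg_alloc(struct demo_con *con,
		      void *(*alloc)(struct demo_con *))
{
	void *msg;

	assert(con && alloc);
	pthread_mutex_unlock(&con->mutex);
	msg = alloc(con);		/* state may change while unlocked */
	pthread_mutex_lock(&con->mutex);

	if (con->state != DEMO_OPEN) {
		free(msg);		/* discard; caller should loop */
		return -EAGAIN;
	}
	if (!msg)
		return -ENOMEM;
	con->in_msg = msg;
	return 0;
}

/* Allocator that simply succeeds. */
void *demo_alloc_ok(struct demo_con *con)
{
	(void)con;
	return malloc(16);
}

/* Allocator that simulates another thread closing the connection
 * while the caller's lock is dropped. */
void *demo_alloc_then_close(struct demo_con *con)
{
	void *msg = malloc(16);

	pthread_mutex_lock(&con->mutex);
	con->state = DEMO_CLOSED;
	pthread_mutex_unlock(&con->mutex);
	return msg;
}
```

Returning -EAGAIN rather than treating the state change as a hard error lets the caller go back to its work loop and re-evaluate the connection from scratch.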
\| *	libceph: change ceph_con_in_msg_alloc convention to be less weird	Sage Weil	2012-07-30	1	-25/+31
\|	This function's calling convention is very limiting. In particular, we can't return any error other than ENOMEM (and only implicitly), which is a problem (see next patch). Instead, return a normal 0 or error code, and make the skip a pointer output parameter. Drop the useless in_hdr argument (we have the con pointer). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	libceph: avoid dropping con mutex before fault	Sage Weil	2012-07-30	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ceph_fault() function takes the con mutex, so we should avoid dropping it before calling it. This fixes a potential race with another thread calling ceph_con_close(), or _open(), or similar (we don't reverify con->state after retaking the lock). Add annotation so that lockdep realizes we will drop the mutex before returning. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	libceph: verify state after retaking con lock after dispatch	Sage Weil	2012-07-30	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We drop the con mutex when delivering a message. When we retake the lock, we need to verify we are still in the OPEN state before preparing to read the next tag, or else we risk stepping on a connection that has been closed. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	libceph: revoke mon_client messages on session restart	Sage Weil	2012-07-30	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Revoke all mon_client messages when we shut down the old connection. This is mostly moot since we are re-using the same ceph_connection, but it is cleaner. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	libceph: fix handling of immediate socket connect failure	Sage Weil	2012-07-30	1	-7/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the connect() call immediately fails such that sock == NULL, we still need con_close_socket() to reset our socket state to CLOSED. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	ceph: update MAINTAINERS file	Sage Weil	2012-07-30	1	-5/+8
\|	* shiny new inktank.com email addresses
\|	* add include/linux/crush directory (previous oversight)
\|	Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	libceph: be less chatty about stray replies	Sage Weil	2012-07-30	1	-2/+2
\|	There are many (normal) conditions that can lead to us getting unexpected replies, including cluster topology changes, osd failures, and timeouts. There's no need to spam the console about it. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	libceph: clear all flags on con_close	Sage Weil	2012-07-30	1	-0/+2
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
\| *	libceph: clean up con flags	Sage Weil	2012-07-30	2	-36/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rename flags with CON_FLAG prefix, move the definitions into the c file, and (better) document their meaning. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	libceph: replace connection state bits with states	Sage Weil	2012-07-30	2	-74/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use a simple set of 6 enumerated values for the socket states (CON_STATE_*) and use those instead of the state bits. All of the con->state checks are now under the protection of the con mutex, so this is safe. It also simplifies many of the state checks because we can check for anything other than the expected state instead of various bits for races we can think of. This appears to hold up well to stress testing both with and without socket failure injection on the server side. Signed-off-by: Sage Weil <sage@inktank.com>
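To illustrate why a single enumerated state simplifies the checks, here is a small sketch. The state names below are illustrative assumptions (the commit message only says there are six CON_STATE_* values), and `struct demo_conn` and `demo_con_transition()` are hypothetical. The caller is assumed to hold whatever lock protects `state`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Six mutually exclusive states; names are assumed for illustration. */
enum demo_con_state {
	DEMO_CON_CLOSED = 1,
	DEMO_CON_PREOPEN,
	DEMO_CON_CONNECTING,
	DEMO_CON_NEGOTIATING,
	DEMO_CON_OPEN,
	DEMO_CON_STANDBY,
};

struct demo_conn {
	enum demo_con_state state;	/* protected by the caller's lock */
};

/* Move from `from` to `to`. Any state other than the expected one is
 * rejected with -EAGAIN, so there is no need to enumerate every bit
 * combination that a race might have produced. */
int demo_con_transition(struct demo_conn *con,
			enum demo_con_state from,
			enum demo_con_state to)
{
	assert(con != NULL);
	if (con->state != from)
		return -EAGAIN;
	con->state = to;
	return 0;
}
```

With independent state bits, two bits could be set at once and each check had to name the bits it feared; with an enum the connection is in exactly one state, and "not the expected state" is a single comparison.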
\| *	libceph: drop unnecessary CLOSED check in socket state change callback	Sage Weil	2012-07-30	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \|	If we are CLOSED, the socket is closed and we won't get these. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	libceph: close socket directly from ceph_con_close()	Sage Weil	2012-07-30	1	-7/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is simpler to do this immediately, since we already hold the con mutex. It also avoids the need to deal with a not-quite-CLOSED socket in con_work. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	libceph: drop gratuitous socket close calls in con_work	Sage Weil	2012-07-30	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \|	If the state is CLOSED or OPENING, we shouldn't have a socket. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	libceph: move ceph_con_send() closed check under the con mutex	Sage Weil	2012-07-30	1	-9/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Take the con mutex before checking whether the connection is closed to avoid racing with someone else closing it. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	libceph: move msgr clear_standby under con mutex protection	Sage Weil	2012-07-30	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid dropping and retaking con->mutex in the ceph_con_send() case by leaving locking up to the caller. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	libceph: fix fault locking; close socket on lossy fault	Sage Weil	2012-07-30	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we fault on a lossy connection, we should still close the socket immediately, and do so under the con mutex. We should also take the con mutex before printing out the state bits in the debug output. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	rbd: drop "object_name" from rbd_req_sync_unwatch()	Alex Elder	2012-07-30	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rbd_req_sync_unwatch() only ever uses rbd_dev->header_name as the value of its "object_name" parameter, and that value is available within the function already. So get rid of the parameter. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: drop "object_name" from rbd_req_sync_notify_ack()	Alex Elder	2012-07-30	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rbd_req_sync_notify_ack() only ever uses rbd_dev->header_name as the value of its "object_name" parameter, and that value is available within the function already. So get rid of the parameter. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: drop "object_name" from rbd_req_sync_notify()	Alex Elder	2012-07-30	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rbd_req_sync_notify() only ever uses rbd_dev->header_name as the value of its "object_name" parameter, and that value is available within the function already. So get rid of the parameter. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: drop "object_name" from rbd_req_sync_watch()	Alex Elder	2012-07-30	1	-7/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rbd_req_sync_watch() is only called in one place, and in that place it passes rbd_dev->header_name as the value of the "object_name" parameter. This value is available within the function already. Having the extra parameter leaves the impression the object name could take on different values, but it does not. So get rid of the parameter. We can always add it back again if we find we want to watch some other object in the future. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: drop rbd_dev parameter in snap functions	Alex Elder	2012-07-30	1	-12/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both rbd_register_snap_dev() and __rbd_remove_snap_dev() have rbd_dev parameters that are unused. Remove them. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: drop rbd_header_from_disk() gfp_flags parameter	Alex Elder	2012-07-30	1	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function rbd_header_from_disk() is only called in one spot, and it passes GFP_KERNEL as its value for the gfp_flags parameter. Just drop that parameter and substitute GFP_KERNEL everywhere within that function it had been used. (If we find we need the parameter again in the future it's easy enough to add back again.) Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: snapc is unused in rbd_req_sync_read()	Alex Elder	2012-07-30	1	-2/+1
\|	The "snapc" parameter to rbd_req_sync_read() is not used, so get rid of it. Reported-by: Josh Durgin <josh.durgin@inktank.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: rename rbd_device->id	Alex Elder	2012-07-30	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "id" field of an rbd device structure represents the unique client-local device id mapped to the underlying rbd image. Each rbd image will have another id--the image id--and each snapshot has its own id as well. The simple name "id" no longer conveys the information one might like to have. Rename the device "id" field in struct rbd_dev to be "dev_id" to make it a little more obvious what we're dealing with without having to think more about context. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: encapsulate header validity test	Alex Elder	2012-07-30	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an rbd image header is read and it doesn't begin with the expected magic information, a warning is displayed. This is a fairly simple test, but it could be extended at some point. Fix the comparison so it actually looks at the "text" field rather than the front of the structure. In any case, encapsulate the validity test in its own function. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	ceph: define snap counts as u32 everywhere	Alex Elder	2012-07-30	3	-11/+13
\|	There are two structures in which a count of snapshots is maintained:
\|	    struct ceph_snap_context {
\|	        ...
\|	        u32 num_snaps;
\|	        ...
\|	    }
\|	and
\|	    struct ceph_snap_realm {
\|	        ...
\|	        u32 num_prior_parent_snaps; /* had prior to parent_since */
\|	        ...
\|	        u32 num_snaps;
\|	        ...
\|	    }
\|	These fields never take on negative values (e.g., to hold special meaning), and so are really inherently unsigned. Furthermore they take their value from over-the-wire or on-disk formatted 32-bit values. So change their definition to have type u32, and change some spots elsewhere in the code to account for this change.
\|	Signed-off-by: Alex Elder <elder@inktank.com>
\|	Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: clean up a few dout() calls	Alex Elder	2012-07-30	1	-19/+22
\|	There was a dout() call in rbd_do_request() that was reporting the offset as the length and vice versa. While fixing that I did a quick scan of other dout() calls and fixed a couple of other minor things. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: simplify __rbd_remove_all_snaps()	Alex Elder	2012-07-30	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This just replaces a while loop with list_for_each_entry_safe() in __rbd_remove_all_snaps(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: drop extra header_rwsem init	Alex Elder	2012-07-30	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit c666601a there was inadvertently added an extra initialization of rbd_dev->header_rwsem. This gets rid of the duplicate. Reported-by: Guangliang Zhao <gzhao@suse.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: kill rbd_image_header->snap_seq	Alex Elder	2012-07-30	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The snap_seq field in an rbd_image_header structure held the value from the rbd image header when it was last refreshed. We now maintain this value in the snapc->seq field. So get rid of the other one. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: set snapc->seq only when refreshing header	Alex Elder	2012-07-30	1	-8/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In rbd_header_add_snap() there is code to set snapc->seq to the just-added snapshot id. This is the only remnant left of the use of that field for recording which snapshot an rbd_dev was associated with. That functionality is no longer supported, so get rid of that final bit of code. Doing so means we never actually set snapc->seq any more. On the server, the snapshot context's sequence value represents the highest snapshot id ever issued for a particular rbd image. So we'll make it have that meaning here as well. To do so, set this value whenever the rbd header is (re-)read. That way it will always be consistent with the rest of the snapshot context we maintain. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: preserve snapc->seq in rbd_header_set_snap()	Alex Elder	2012-07-30	1	-11/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In rbd_header_set_snap(), there is logic to make the snap context's seq field get set to a particular snapshot id, or 0 if there is no snapshot for the rbd image. This seems to be an artifact of how the current snapshot id for an rbd_dev was recorded before the rbd_dev->snap_id field began to be used for that purpose. There's no need to update the value of snapc->seq here any more, so stop doing it. Tidy up a few local variables in that function while we're at it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: don't use snapc->seq that way	Alex Elder	2012-07-30	1	-14/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In what appears to be an artifact of a different way of encoding whether an rbd image maps a snapshot, __rbd_refresh_header() has code that arranges to update the seq value in an rbd image's snapshot context to point to the first entry in its snapshot array if that's where it was pointing initially. We now use rbd_dev->snap_id to record the snapshot id--using the special value CEPH_NOSNAP to indicate the rbd_dev is not mapping a snapshot at all. There is therefore no need to check for this case, nor to update the seq value, in __rbd_refresh_header(). Just preserve the seq value that rbd_read_header() provides (which, at the moment, is nothing). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| *	rbd: send header version when notifying	Josh Durgin	2012-07-30	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously the original header version was sent. Now, we update it when the header changes. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	rbd: use reference counting for the snap context	Josh Durgin	2012-07-30	1	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This prevents a race between requests with a given snap context and header updates that free it. The osd client was already expecting the snap context to be reference counted, since it get()s it in ceph_osdc_build_request and put()s it when the request completes. Also remove the second down_read()/up_read() on header_rwsem in rbd_do_request, which wasn't actually preventing this race or protecting any other data. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Reviewed-by: Alex Elder <elder@inktank.com>
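The race described above goes away once every in-flight request holds its own reference to the snap context, so a header update can drop the device's reference without freeing memory a request still uses. A minimal user-space sketch with C11 atomics follows; `struct demo_snapc` and the `demo_snapc_*` functions are hypothetical and are not the ceph_snap_context API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference-counted snapshot context. */
struct demo_snapc {
	atomic_int refs;
	uint64_t seq;
};

/* Create a context holding one reference for the creator. */
struct demo_snapc *demo_snapc_create(uint64_t seq)
{
	struct demo_snapc *sc = malloc(sizeof(*sc));

	if (!sc)
		return NULL;
	atomic_init(&sc->refs, 1);
	sc->seq = seq;
	return sc;
}

/* Take an extra reference, e.g. when a request is built. */
struct demo_snapc *demo_snapc_get(struct demo_snapc *sc)
{
	atomic_fetch_add(&sc->refs, 1);
	return sc;
}

/* Drop a reference; frees the context when the last one goes away.
 * Returns 1 if the context was freed, 0 otherwise. */
int demo_snapc_put(struct demo_snapc *sc)
{
	int old = atomic_fetch_sub(&sc->refs, 1);

	assert(old > 0);
	if (old == 1) {
		free(sc);
		return 1;
	}
	return 0;
}
```

In this sketch a header refresh would call `demo_snapc_put()` on the old context and install a new one, while a request that took its reference earlier keeps reading the old context safely until it completes and drops it.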
\| *	rbd: set image size when header is updated	Josh Durgin	2012-07-30	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The image may have been resized. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	rbd: expose the correct size of the device in sysfs	Josh Durgin	2012-07-30	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an image was mapped to a snapshot, the size of the head version would be shown. Protect capacity with header_rwsem, since it may change. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	rbd: only reset capacity when pointing to head	Josh Durgin	2012-07-30	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Snapshots cannot be resized, and the new capacity of head should not be reflected by the snapshot. Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	rbd: return errors for mapped but deleted snapshot	Josh Durgin	2012-07-30	1	-2/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a snapshot is deleted, the OSD will return ENOENT when reading from it. This is normally interpreted as a hole by rbd, which will return zeroes. To minimize the time in which this can happen, stop requests early when we are notified that our snapshot no longer exists. [elder@inktank.com: updated __rbd_init_snaps_header() logic] Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	libceph: trivial fix for the incorrect debug output	Jiaju Zhang	2012-07-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a trivial fix for the debug output, as it is inconsistent with the function name so may confuse people when debugging. [elder@inktank.com: switched to use __func__] Signed-off-by: Jiaju Zhang <jjzhang@suse.de> Reviewed-by: Alex Elder <elder@inktank.com>
\| *	ceph: fix potential double free	Alan Cox	2012-07-30	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We re-run the loop but we don't re-set the attrs pointer back to NULL. Signed-off-by: Alan Cox <alan@linux.intel.com> Reviewed-by: Alex Elder <elder@inktank.com>
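The bug class described above (a retry loop frees a buffer, then an exit path frees the same stale pointer again) can be shown with a small sketch. This is not the ceph code: `demo_fetch_attrs()`, `demo_free()`, and the `demo_free_calls` counter are hypothetical and exist only to make the behavior observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts frees of non-NULL pointers, purely for illustration. */
int demo_free_calls;

void demo_free(void *p)
{
	if (p)
		demo_free_calls++;
	free(p);
}

/* Grow a buffer by doubling until it holds `needed` bytes, giving up
 * once the capacity would exceed `max_cap`. Returns 0 on success and
 * -1 on failure. Every exit goes through the common cleanup at `out`. */
int demo_fetch_attrs(size_t needed, size_t max_cap)
{
	size_t cap = 4;
	char *attrs = NULL;
	int ret = -1;

	assert(needed > 0);
	for (;;) {
		if (cap > max_cap)
			goto out;	/* may run right after a free below */
		attrs = malloc(cap);
		if (!attrs)
			goto out;
		if (cap >= needed) {
			ret = 0;
			goto out;
		}
		demo_free(attrs);
		attrs = NULL;		/* the fix: forget the freed pointer */
		cap *= 2;
	}
out:
	demo_free(attrs);		/* harmless when attrs is NULL */
	return ret;
}
```

Without the `attrs = NULL` line, the `cap > max_cap` exit taken on a later iteration would pass the already-freed pointer to the cleanup code a second time.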