blackbird-op-linux - Blackbird™ Linux sources for OpenPOWER

	Commit message (Collapse)	Author	Age	Files	Lines
*	libceph: for chooseleaf rules, retry CRUSH map descent from root if leaf is ↵	Jim Schutt	2013-01-17	4	-5/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	failed Add libceph support for a new CRUSH tunable recently added to Ceph servers. Consider the CRUSH rule step chooseleaf firstn 0 type <node_type> This rule means that <n> replicas will be chosen in a manner such that each chosen leaf's branch will contain a unique instance of <node_type>. When an object is re-replicated after a leaf failure, if the CRUSH map uses a chooseleaf rule the remapped replica ends up under the <node_type> bucket that held the failed leaf. This causes uneven data distribution across the storage cluster, to the point that when all the leaves but one fail under a particular <node_type> bucket, that remaining leaf holds all the data from its failed peers. This behavior also limits the number of peers that can participate in the re-replication of the data held by the failed leaf, which increases the time required to re-replicate after a failure. For a chooseleaf CRUSH rule, the tree descent has two steps: call them the inner and outer descents. If the tree descent down to <node_type> is the outer descent, and the descent from <node_type> down to a leaf is the inner descent, the issue is that a down leaf is detected on the inner descent, so only the inner descent is retried. In order to disperse re-replicated data as widely as possible across a storage cluster after a failure, we want to retry the outer descent. So, fix up crush_choose() to allow the inner descent to return immediately on choosing a failed leaf. Wire this up as a new CRUSH tunable. Note that after this change, for a chooseleaf rule, if the primary OSD in a placement group has failed, choosing a replacement may result in one of the other OSDs in the PG colliding with the new primary. This requires that OSD's data for that PG to need moving as well. This seems unavoidable but should be relatively rare. This corresponds to ceph.git commit 88f218181a9e6d2292e2697fc93797d0f6d6e5dc. Signed-off-by: Jim Schutt <jaschut@sandia.gov> Reviewed-by: Sage Weil <sage@inktank.com>
*	ceph: check mds_wanted for imported cap	Yan, Zheng	2013-01-17	1	-1/+3
\| \| \| \| \| \| \| \|	The MDS may have incorrect wanted caps after importing caps. So the client should check the value mds has and send cap update if necessary. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	ceph: allocate cap_release message when receiving cap import	Yan, Zheng	2013-01-17	1	-0/+3
\| \| \| \| \| \| \| \| \|	When client wants to release an imported cap, it's possible there is no reserved cap_release message in corresponding mds session. so __queue_cap_release causes kernel panic. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	ceph: allow revoking duplicated caps issued by non-auth MDS	Yan, Zheng	2013-01-17	1	-5/+10
\| \| \| \| \| \| \| \|	Allow revoking duplicated caps issued by non-auth MDS if these caps are also issued by auth MDS. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	ceph: move dirty inode to migrating list when clearing auth caps	Yan, Zheng	2013-01-17	1	-1/+9
\| \| \| \| \|	Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	ceph: re-calculate truncate_size for strip object	Yan, Zheng	2013-01-17	1	-0/+8
\| \| \| \| \| \| \|	Otherwise osd may truncate the object to larger size. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	ceph: Check for created flag in response from mds	Sam Lang	2013-01-17	4	-3/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The mds now sends back a created inode if the create request performed the create. If the file already existed, no inode is returned in the reply. This allows ceph to set the created flag in atomic_open so that permissions are properly checked in the case that the file wasn't created by the create call to the mds. To ensure compability with previous kernels, a feature for sending back the inode in the create reply was added, so that the mds will only send back the inode if the client indicates it supports the feature. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	ceph: Check for err on mds request in atomic_open	Sam Lang	2013-01-17	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	The error returned by ceph_mdsc_do_request includes errors sending the request, errors on timeout, or any errors coming from the mds. If ceph_mdsc_do_request returns an error, the reply struct will most likely be bogus. We need to bail out and propogate the error instead of overwriting it. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	libceph: fix protocol feature mismatch failure path	Sage Weil	2012-12-27	1	-10/+4
\| \| \| \| \| \| \| \| \| \| \| \|	We should not set con->state to CLOSED here; that happens in ceph_fault() in the caller, where it first asserts that the state is not yet CLOSED. Avoids a BUG when the features don't match. Since the fail_protocol() has become a trivial wrapper, replace calls to it with direct calls to reset_connection(). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
*	libceph: WARN, don't BUG on unexpected connection states	Alex Elder	2012-12-27	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A number of assertions in the ceph messenger are implemented with BUG_ON(), killing the system if connection's state doesn't match what's expected. At this point our state model is (evidently) not well understood enough for these assertions to trigger a BUG(). Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...) so we learn about these issues without killing the machine. We now recognize that a connection fault can occur due to a socket closure at any time, regardless of the state of the connection. So there is really nothing we can assert about the state of the connection at that point so eliminate that assertion. Reported-by: Ugis <ugis22@gmail.com> Tested-by: Ugis <ugis22@gmail.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	libceph: always reset osds when kicking	Alex Elder	2012-12-27	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When ceph_osdc_handle_map() is called to process a new osd map, kick_requests() is called to ensure all affected requests are updated if necessary to reflect changes in the osd map. This happens in two cases: whenever an incremental map update is processed; and when a full map update (or the last one if there is more than one) gets processed. In the former case, the kick_requests() call is followed immediately by a call to reset_changed_osds() to ensure any connections to osds affected by the map change are reset. But for full map updates this isn't done. Both cases should be doing this osd reset. Rather than duplicating the reset_changed_osds() call, move it into the end of kick_requests(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	libceph: move linger requests sooner in kick_requests()	Alex Elder	2012-12-27	1	-11/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The kick_requests() function is called by ceph_osdc_handle_map() when an osd map change has been indicated. Its purpose is to re-queue any request whose target osd is different from what it was when it was originally sent. It is structured as two loops, one for incomplete but registered requests, and a second for handling completed linger requests. As a special case, in the first loop if a request marked to linger has not yet completed, it is moved from the request list to the linger list. This is as a quick and dirty way to have the second loop handle sending the request along with all the other linger requests. Because of the way it's done now, however, this quick and dirty solution can result in these incomplete linger requests never getting re-sent as desired. The problem lies in the fact that the second loop only arranges for a linger request to be sent if it appears its target osd has changed. This is the proper handling for completed linger requests (it avoids issuing the same linger request twice to the same osd). But although the linger requests added to the list in the first loop may have been sent, they have not yet completed, so they need to be re-sent regardless of whether their target osd has changed. The first required fix is we need to avoid calling __map_request() on any incomplete linger request. Otherwise the subsequent __map_request() call in the second loop will find the target osd has not changed and will therefore not re-send the request. Second, we need to be sure that a sent but incomplete linger request gets re-sent. If the target osd is the same with the new osd map as it was when the request was originally sent, this won't happen. This can be fixed through careful handling when we move these requests from the request list to the linger list, by unregistering the request before it is registered as a linger request. This works because a side-effect of unregistering the request is to make the request's r_osd pointer be NULL, and that will ensure the second loop actually re-sends the linger request. Processing of such a request is done at that point, so continue with the next one once it's been moved. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	rbd: get rid of rbd_{get,put}_dev()	Alex Elder	2012-12-20	1	-12/+2
\| \| \| \| \| \| \| \| \| \| \|	The functions rbd_get_dev() and rbd_put_dev() are trivial wrappers that add no value, and their existence suggests they may do more than what they do. Get rid of them. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>
*	libceph: register request before unregister linger	Alex Elder	2012-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \|	In kick_requests(), we need to register the request before we unregister the linger request. Otherwise the unregister will reset the request's osd pointer to NULL. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	libceph: don't use rb_init_node() in ceph_osdc_alloc_request()	Alex Elder	2012-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The red-black node in the ceph osd request structure is initialized in ceph_osdc_alloc_request() using rbd_init_node(). We do need to initialize this, because in __unregister_request() we call RB_EMPTY_NODE(), which expects the node it's checking to have been initialized. But rb_init_node() is apparently overkill, and may in fact be on its way out. So use RB_CLEAR_NODE() instead. For a little more background, see this commit: 4c199a93 rbtree: empty nodes have no color" Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	libceph: init event->node in ceph_osdc_create_event()	Alex Elder	2012-12-20	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	The red-black node node in the ceph osd event structure is not initialized in create_osdc_create_event(). Because this node can be the subject of a RB_EMPTY_NODE() call later on, we should ensure the node is initialized properly for that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	libceph: init osd->o_node in create_osd()	Alex Elder	2012-12-20	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	The red-black node node in the ceph osd structure is not initialized in create_osd(). Because this node can be the subject of a RB_EMPTY_NODE() call later on, we should ensure the node is initialized properly for that. Add a call to RB_CLEAR_NODE() initialize it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	libceph: report connection fault with warning	Alex Elder	2012-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a connection's socket disconnects, or if there's a protocol error of some kind on the connection, a fault is signaled and the connection is reset (closed and reopened, basically). We currently get an error message on the log whenever this occurs. A ceph connection will attempt to reestablish a socket connection repeatedly if a fault occurs. This means that these error messages will get repeatedly added to the log, which is undesirable. Change the error message to be a warning, so they don't get logged by default. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	libceph: socket can close in any connection state	Alex Elder	2012-12-17	1	-17/+30
\| \| \| \| \| \| \| \| \| \| \| \| \|	A connection's socket can close for any reason, independent of the state of the connection (and without irrespective of the connection mutex). As a result, the connectino can be in pretty much any state at the time its socket is closed. Handle those other cases at the top of con_work(). Pull this whole block of code into a separate function to reduce the clutter. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	rbd: don't use ENOTSUPP	Alex Elder	2012-12-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	ENOTSUPP is not a standard errno (it shows up as "Unknown error 524" in an error message). This is what was getting produced when the the local rbd code does not implement features required by a discovered rbd image. Change the error code returned in this case to ENXIO. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	rbd: remove linger unconditionally	Alex Elder	2012-12-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	In __unregister_linger_request(), the request is being removed from the osd client's req_linger list only when the request has a non-null osd pointer. It should be done whether or not the request currently has an osd. This is most likely a non-issue because I believe the request will always have an osd when this function is called. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	rbd: get rid of RBD_MAX_SEG_NAME_LEN	Alex Elder	2012-12-17	2	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RBD_MAX_SEG_NAME_LEN represents the maximum length of an rbd object name (i.e., one of the objects providing storage backing an rbd image). Another symbol, MAX_OBJ_NAME_SIZE, is used in the osd client code to define the maximum length of any object name in an osd request. Right now they disagree, with RBD_MAX_SEG_NAME_LEN being too big. There's no real benefit at this point to defining the rbd object name length limit separate from any other object name, so just get rid of RBD_MAX_SEG_NAME_LEN and use MAX_OBJ_NAME_SIZE in its place. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	libceph: avoid using freed osd in __kick_osd_requests()	Alex Elder	2012-12-17	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an osd has no requests and no linger requests, __reset_osd() will just remove it with a call to __remove_osd(). That drops a reference to the osd, and therefore the osd may have been free by the time __reset_osd() returns. That function offers no indication this may have occurred, and as a result the osd will continue to be used even when it's no longer valid. Change__reset_osd() so it returns an error (ENODEV) when it deletes the osd being reset. And change __kick_osd_requests() so it returns immediately (before referencing osd again) if __reset_osd() returns any error. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	ceph: don't reference req after put	Alex Elder	2012-12-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	In __unregister_request(), there is a call to list_del_init() referencing a request that was the subject of a call to ceph_osdc_put_request() on the previous line. This is not safe, because the request structure could have been freed by the time we reach the list_del_init(). Fix this by reversing the order of these lines. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-off-by: Sage Weil <sage@inktank.com>
*	rbd: do not allow remove of mounted-on image	Alex Elder	2012-12-17	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no check in rbd_remove() to see if anybody holds open the image being removed. That's not cool. Add a simple open count that goes up and down with opens and closes (releases) of the device, and don't allow an rbd image to be removed if the count is non-zero. Protect the updates of the open count value with ctl_mutex to ensure the underlying rbd device doesn't get removed while concurrently being opened. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	libceph: Unlock unprocessed pages in start_read() error path	David Zafman	2012-12-13	1	-0/+9
\| \| \| \| \| \| \| \| \| \|	Function start_read() can get an error before processing all pages. It must not only release the remaining pages, but unlock them too. This fixes http://tracker.newdream.net/issues/3370 Signed-off-by: David Zafman <david.zafman@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
*	ceph: call handle_cap_grant() for cap import message	Yan, Zheng	2012-12-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	If client sends cap message that requests new max size during exporting caps, the exporting MDS will drop the message quietly. So the client may wait for the reply that updates the max size forever. call handle_cap_grant() for cap import message can avoid this issue. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
*	ceph: Fix __ceph_do_pending_vmtruncate	Yan, Zheng	2012-12-13	1	-6/+9
\| \| \| \| \| \| \| \|	we should set i_truncate_pending to 0 after page cache is truncated to i_truncate_size Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
*	ceph: Don't add dirty inode to dirty list if caps is in migration	Yan, Zheng	2012-12-13	1	-3/+7
\| \| \| \| \| \| \| \|	Add dirty inode to cap_dirty_migrating list instead, this can avoid ceph_flush_dirty_caps() entering infinite loop. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
*	ceph: Fix infinite loop in __wake_requests	Yan, Zheng	2012-12-13	1	-2/+7
\| \| \| \| \| \| \| \| \| \|	__wake_requests() will enter infinite loop if we use it to wake requests in the session->s_waiting list. __wake_requests() deletes requests from the list and __do_request() adds requests back to the list. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
*	ceph: Don't update i_max_size when handling non-auth cap	Yan, Zheng	2012-12-13	1	-1/+1
\| \| \| \| \| \| \|	The cap from non-auth mds doesn't have a meaningful max_size value. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
*	bdi_register: add __printf verification, fix arg mismatch	Joe Perches	2012-12-13	2	-1/+2
\| \| \| \| \| \| \|	__printf is useful to verify format and arguments. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Alex Elder <elder@inktank.com>
*	libceph: remove 'osdtimeout' option	Sage Weil	2012-12-13	4	-49/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This would reset a connection with any OSD that had an outstanding request that was taking more than N seconds. The idea was that if the OSD was buggy, the client could compensate by resending the request. In reality, this only served to hide server bugs, and we haven't actually seen such a bug in quite a while. Moreover, the userspace client code never did this. More importantly, often the request is taking a long time because the OSD is trying to recover, or overloaded, and killing the connection and retrying would only make the situation worse by giving the OSD more work to do. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
*	ceph: fix dentry reference leak in ceph_encode_fh().	Cyril Roelandt	2012-12-13	1	-1/+3
\| \| \| \| \| \| \|	dput() was not called in the error path. Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Reviewed-by: Alex Elder <elder@inktank.com>
*	ceph: Fix i_size update race	Sage Weil	2012-11-05	2	-47/+77
\| \| \| \| \| \| \| \| \| \| \| \|	ceph_aio_write() has an optimization that marks cap EPH_CAP_FILE_WR dirty before data is copied to page cache and inode size is updated. If ceph_check_caps() flushes the dirty cap before the inode size is updated, MDS can miss the new inode size. The fix is move ceph_{get,put}_cap_refs() into ceph_write_{begin,end}() and call __ceph_mark_dirty_caps() after inode size is updated. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
*	ceph: Hold caps_list_lock when adjusting caps_{use, total}_count	Yan, Zheng	2012-11-04	1	-0/+2
\| \| \| \| \|	Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
*	rbd: get additional info in parent spec	Alex Elder	2012-11-01	1	-0/+133
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a layered rbd image has a parent, that parent is identified only by its pool id, image id, and snapshot id. Images that have been mapped also record names for those three id's. Add code to look up these names for parent images so they match mapped images more closely. Skip doing this for an image if it already has its pool name defined (this will be the case for images mapped by the user). It is possible that an the name of a parent image can't be determined, even if the image id is valid. If this occurs it does not preclude correct operation, so don't treat this as an error. On the other hand, defined pools will always have both an id and a name. And any snapshot of an image identified as a parent for a clone image will exist, and will have a name (if not it indicates some other internal error). So treat failure to get these bits of information as errors. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	libceph: define ceph_pg_pool_name_by_id()	Alex Elder	2012-11-01	2	-0/+17
\| \| \| \| \| \| \| \| \|	Define and export function ceph_pg_pool_name_by_id() to supply the name of a pg pool whose id is given. This will be used by the next patch. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: get parent spec for version 2 images	Alex Elder	2012-11-01	3	-0/+137
\| \| \| \| \| \| \| \| \| \| \|	Add support for getting the the information identifying the parent image for rbd images that have them. The child image holds a reference to its parent image specification structure. Create a new entry "parent" in /sys/bus/rbd/image/N/ to report the identifying information for the parent image, if any. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: allow null image name	Alex Elder	2012-11-01	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \|	Format 2 parent images are partially identified by their image id, but it may not be possible to determine their image name. The name is not strictly needed for correct operation, so we won't be treating it as an error if we don't know it. Handle this case gracefully in rbd_name_show(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: allow null image name	Alex Elder	2012-11-01	1	-0/+8
\| \| \| \| \| \| \| \| \|	We will know the image id for format 2 parent images, but won't initially know its image name. Avoid making the query for an image id in rbd_dev_image_id() if it's already known. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: encapsulate last part of probe	Alex Elder	2012-11-01	1	-75/+86
\| \| \| \| \| \| \| \| \|	Group the activities that now take place after an rbd_dev_probe() call into a single function, and move the call to that function into rbd_dev_probe() itself. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: define rbd_dev_{create,destroy}() helpers	Alex Elder	2012-11-01	1	-21/+41
\| \| \| \| \| \| \| \| \| \|	Encapsulate the creation/initialization and destruction of rbd device structures. The rbd_client and the rbd_spec structures provided on creation hold references whose ownership is transferred to the new rbd_device structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: consolidate rbd_dev init in rbd_add()	Alex Elder	2012-11-01	1	-19/+18
\| \| \| \| \| \| \| \| \| \|	Group the allocation and initialization of fields of the rbd device structure created in rbd_add(). Move the grouped code down later in the function, just prior to the call to rbd_dev_probe(). This is for the most part simple code movement. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: don't pass rbd_dev to rbd_get_client()	Alex Elder	2012-11-01	1	-18/+15
\| \| \| \| \| \| \| \| \| \| \| \| \|	The only reason rbd_dev is passed to rbd_get_client() is so its rbd_client field can get assigned. Instead, just return the rbd_client pointer as a result and have the caller do the assignment. Change rbd_put_client() so it takes an rbd_client structure, so follows the more typical symmetry with rbd_get_client(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: fill rbd_spec in rbd_add_parse_args()	Alex Elder	2012-11-01	1	-38/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pass the address of an rbd_spec structure to rbd_add_parse_args(). Use it to hold the information defining the rbd image to be mapped in an rbd_add() call. Use the result in the caller to initialize the rbd_dev->id field. This means rbd_dev is no longer needed in rbd_add_parse_args(), so get rid of it. Now that this transformation of rbd_add_parse_args() is complete, correct and expand on the its header documentation to reflect the new reality. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: add reference counting to rbd_spec	Alex Elder	2012-11-01	1	-10/+42
\| \| \| \| \| \| \| \| \| \| \|	With layered images we'll share rbd_spec structures, so add a reference count to it. It neatens up some code also. A silly get/put pair is added to the alloc routine just to avoid "defined but not used" warnings. It will go away soon. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: define image specification structure	Alex Elder	2012-10-30	1	-68/+90
\| \| \| \| \| \| \| \| \| \| \| \|	Group the fields that uniquely specify an rbd image into a new reference-counted rbd_spec structure. This structure will be used to describe the desired image when mapping an image, and when probing parent images in layered rbd devices. Replace the set of fields in the rbd device structure with a pointer to a dynamically allocated rbd_spec. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: have rbd_add_parse_args() return error	Alex Elder	2012-10-30	1	-17/+20
\| \| \| \| \| \| \| \| \|	Change the interface to rbd_add_parse_args() so it returns an error code rather than a pointer. Return the ceph_options result via a pointer whose address is passed as an argument. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
*	rbd: pass and populate rbd_options structure	Alex Elder	2012-10-30	1	-11/+19
\| \| \| \| \| \| \| \| \| \| \|	Have the caller pass the address of an rbd_options structure to rbd_add_parse_args(), to be initialized with the information gleaned as a result of the parse. I know, this is another near-reversal of a recent change... Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>