| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
|
|
|
|
|
|
|
|
| |
It seems that the real cause of all the issues where that
we did not noticed in drbd_try_connect() when the other
guy closes one socket if the round trip time gets higher
than 100ms. There were that 100ms hard coded!
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
|
|
|
|
|
|
|
| |
It is no longer sufficient to trigger on local WRITE,
we need to check on (rq_state & RQ_IN_ACT_LOG)
before calling drbd_al_complete_io also in the error path.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
|
|
|
| |
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If there is no replication traffic within the idle timeout
(ping-int seconds), DRBD will send a P_PING,
and adjust the timeout to ping-timeout.
If there is no P_PING_ACK received within this ping-timeout,
DRBD finally drops the connection, and tries to re-establish it.
To decide which timeout was active, we compared the current timeout
with the ping-timeout, and dropped the connection, if that was the case.
By default, ping-int is 10 seconds, ping-timeout is 500 ms.
Unfortunately, if you configure ping-timeout to be the same as ping-int,
expiry of the idle-timeout had been mistaken for a missing ping ack,
and caused an immediate reconnection attempt.
Fix:
Allow both timeouts to be equal, use a local variable
to store which timeout is active.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We limit ourselves to a configurable maximum number of pages used as
temporary bio pages.
If the configured "max_buffers" is not big enough to match the bandwidth
of the respective deployment, a distributed deadlock could be triggered
by e.g. fast online verify and heavy application IO.
TCP connections would block on congestion, because both receivers
would wait on pages to become available.
Fortunately the respective senders in this case would be able to give
back some pages already. So do that.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
|
|
|
|
|
|
|
|
| |
For some time we contemplated calling the "struct lru_cache"
a "struct tracked_set", and some comments kept the ts_ prefix.
Fix those to match the member field names.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In case a write failes on the local disk, go into D_INCONSISTENT
disk state. That causes future reads of that block to be shipped
to the peer.
Read retry remote was already in place.
Actually the documentation needs to get fixed now. Since the
application is still shielded from the error. (as long as we have
only a single disk failing) The difference to detach is that
we keep the disk. And therefore might keep all the other, still
working sectors up to date.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|\
| |
| |
| | |
of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-2.6.40/drivers
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The sector number on empty barrier requests may (will?) be -1, which,
given that it's being treated as unsigned 64-bit quantity, will almost
always exceed the actual (virtual) disk's size.
Inspired by Konrad's "When writting barriers set the sector number to
zero...".
While at it also add overflow checking to the math in vbd_translate().
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
xenbus_transaction_end()
vbd_resize() up_read()'s xs_state.suspend_mutex twice in a row via double
xenbus_transaction_end() calls. The next down_read() in
xenbus_transaction_start() (at eg. the next resize attempt) hangs.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=618317
Acked-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
The recent changes caused this field of the structure to be offset a bit.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| | |
And not depend on the driver being built with -DDEBUG flag.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| | |
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| | |
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
No need for that '_st' and xen_blkif is more apt.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| | |
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| | |
Not point of the blkif.h file. It is not used by the frontend.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| | |
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
CHECK: multiple assignments should be avoided
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
Break up the macro usage.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
with more details.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| | |
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
dir.
From the blkif.h header, which was exposed to the frontend.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
It is not really used for anything.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| | |
To make it easier to read.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| | |
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
And also make them uniform and prefix the message with 'xen-blkback'.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| | |
Suggested-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
They had the wrong data or were in the wrong spot.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
We do a check for the operations right before calling dispatch_rw_block_io.
And then we do the same check in dispatch_rw_block_io. This patch
squashes those checks into the 'dispatch_rw_block_io' function.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BLKIF_OP_WRITE_BARRIER.
We drop the support for 'feature-barrier' and add in the support
for the 'feature-flush-cache' if the real backend storage supports
flushing.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BLKIF_OP_WRITE_FLUSH_CACHE operation.
The operation BLKIF_OP_WRITE_FLUSH_CACHE has existed in the Xen
tree header file for years but it was never present in the Linux tree
because the frontend (nor the backend) supported this interface.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
This reverts commit 97961ef46b9b5a6a7c918a38b898a7b3e49869f4 b/c
we lose about 15% performance if we do the unplugging and the
end of the reading the ring buffer.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If one runs a simple fio request with random read/write with a
20%/80% ratio, the numbers are incredibly bad when using the CFQ scheduler.
IOmeter | | | |
64K, randrw | NOOP | CFQ | deadline |
randrwmix=80 | | | |
--------------+-------+------+----------+
blkback |103/27 |32/10 | 102/27 |
--------------+-------+------+----------+
QEMU qdisk |103/27 |102/27| 102/27 |
The problem as explained by Vivek Goyal was:
".. that difference is that sync vs async requests. In the case of
a kernel thread submitting IO, [..] all the WRITES might be being
considered as async and will go in a different queue. If you mix those
with some READS, they are always sync and will go in differnet queue.
In presence of sync queue, CFQ will idle and choke up WRITES in
an attempt to improve latencies of READs.
In case of AIO [note: this is what QEMU qdisk is doing] , [..]
it is direct IO and both READS and WRITES will be considered SYNC
and will go in a single queue and no choking of WRITES will take place."
The solution is quite simple, tack on REQ_SYNC (which is
what the WRITE_ODIRECT macro points to) and the numbers go
back up.
Suggested-by: Vivek Goyal <vgoyal@redhat.com
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We used to the plug/unplug on the submit_bio. But that means
if within a stream of WRITE, WRITE, WRITE,...,WRITE we have
one READ, it could stall the pipeline (as the 'submio_bio'
could trigger the unplug_fnc to be called and stall/sync
when doing the READ). Instead we want to move the unplugging
when the whole (or as a much as possible) ring buffer has been
processed. This also eliminates us doing plug/unplug for
each request.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
And also shorten the name if it has blkback to blkbk.
This results in the symbol table (if compiled in the kernel)
to be much shorter, prettier, and also easier to search for.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
Shuffling code around.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| | |
There is no need for it, as the address is updated constatly
in the root of the Linux kernel.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Daniel Stodden suggested to eliminate vbd.c and interface.c, inlining the
critical bits where they belong, respectively.
Leaving only blkback.c for the data- and xenbus.c for the control path.
Suggested-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| | |
.. and modify the Makefile and Kconfig files appropriately.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
They were used to check if the queue does not have QUEUE_FLAG_DEAD
set. That is not necessary anymore as the 'submit_io' call
ends up doing that for us.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
After the commit 0faa8cca883bbc6a0919e3c89128672659b75820
(" xen/blkback: remove per-queue plugging") we forgot
to retrieve the 'struct request_queue' from the block device.
This puts the functionality back in and fixes a NULL pointer
bug.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The commit 976222e05ea5a9959ccf880d7a24efbf79b3c6cf
xen/blkback: Move the check for misaligned I/O higher.
moved it a bit to high. The preq->vbdev was not set, so the
check for misaligned I/O would cause a NULL pointer derefence.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
xen_blk_map_seg.
The previous name ('fast_flush_area') had nothing to do with what it
does right now. Changing the names so that the code dealing with
mapping pages in and out of the guest is called xen_blkbk_[map|unmap].
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We move it up higher to be in same loop that actually computes
the sector number.
This way, all of the code that deals with verifying that the
request is correct is all done before we do any of the
page mapping, I/O submission, etc.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
We take out the chunk of code dealing with mapping to the guest
of pages into the xen_blk_map_buf code. And we also move the
vbd_translate to be done much earlier.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Moving it so that the code that 'fast_flush_area' code is
close to the code that deals with it so that the reader
won't lose focus.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We seperate the bio allocation (bio_alloc) from the bio submission so
that the error paths are much easier, and also so that the bio
submission can be done in one tight loop. It also makes the
plug/unplug calls much much easier.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
commit 7eaceaccab5f40bbfda044629a6298616aeaed50 ("block: remove per-queue plugging")
added two new interfaces to plug and unplug: blk_start_plug
and blk_finish_plug. Lets use those.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|