<feed xmlns='http://www.w3.org/2005/Atom'>
<title>talos-op-linux/block/blk-core.c, branch v5.2</title>
<subtitle>Talos™ II Linux sources for OpenPOWER</subtitle>
<id>https://git.raptorcs.com/git/talos-op-linux/atom?h=v5.2</id>
<link rel='self' href='https://git.raptorcs.com/git/talos-op-linux/atom?h=v5.2'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/'/>
<updated>2019-06-07T04:39:39+00:00</updated>
<entry>
<title>block: free sched's request pool in blk_cleanup_queue</title>
<updated>2019-06-07T04:39:39+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-06-04T13:08:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/commit/?id=c3e2219216c92919a6bd1711f340f5faa98695e6'/>
<id>urn:sha1:c3e2219216c92919a6bd1711f340f5faa98695e6</id>
<content type='text'>
In theory, the IO scheduler belongs to the request queue, and the request
pool of sched tags belongs to the request queue too.

However, the current tags allocation interfaces are re-used for both
driver tags and sched tags, and driver tags are definitely host-wide
and don't belong to any request queue, same with their request pool.
So we need the tagset instance for freeing requests of sched tags.

Meanwhile, blk_mq_free_tag_set() often follows blk_cleanup_queue() in the
non-BLK_MQ_F_TAG_SHARED case, which requires the request pool of sched
tags to be freed before blk_mq_free_tag_set() is called.

Commit 47cdee29ef9d94e ("block: move blk_exit_queue into __blk_release_queue")
moves blk_exit_queue into __blk_release_queue to simplify the fast
path in generic_make_request(), which causes an oops when freeing requests
of sched tags in __blk_release_queue().

Fix the above issue by moving the freeing of the request pool of sched tags into
blk_cleanup_queue(). This is safe because the queue has been frozen and there are
no in-queue requests at that time. Freeing sched tags has to be kept in the queue's
release handler because there might be uncompleted dispatch activity
which might refer to sched tags.

Cc: Bart Van Assche &lt;bvanassche@acm.org&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Fixes: 47cdee29ef9d94e485eb08f962c74943023a5271 ("block: move blk_exit_queue into __blk_release_queue")
Tested-by: Yi Zhang &lt;yi.zhang@redhat.com&gt;
Reported-by: kernel test robot &lt;rong.a.chen@intel.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: print offending values when cloned rq limits are exceeded</title>
<updated>2019-05-31T21:12:34+00:00</updated>
<author>
<name>John Pittman</name>
<email>jpittman@redhat.com</email>
</author>
<published>2019-05-23T21:49:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/commit/?id=61939b12dc24d0ac958020f261046c35a16e0c48'/>
<id>urn:sha1:61939b12dc24d0ac958020f261046c35a16e0c48</id>
<content type='text'>
While troubleshooting issues where cloned request limits have been
exceeded, it is often beneficial to know the actual values that
have been breached.  Print these values, assisting in ease of
identification of root cause of the breach.

Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: John Pittman &lt;jpittman@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: don't protect generic_make_request_checks with blk_queue_enter</title>
<updated>2019-05-29T12:09:11+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-05-15T03:03:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/commit/?id=fe2008640ae36e3920cf41507a84fb5d3227435a'/>
<id>urn:sha1:fe2008640ae36e3920cf41507a84fb5d3227435a</id>
<content type='text'>
Now a063057d7c73 ("block: Fix a race between request queue removal and
the block cgroup controller") has been reverted, and blkcg_exit_queue()
won't be called in blk_cleanup_queue() any more.

So there is no need to protect generic_make_request_checks() with
blk_queue_enter(), and the total mess can be cleaned up.

37f9579f4c31 ("blk-mq: Avoid that submitting a bio concurrently with device
removal triggers a crash") is reverted.

Cc: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: move blk_exit_queue into __blk_release_queue</title>
<updated>2019-05-29T12:09:09+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-05-15T03:03:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/commit/?id=47cdee29ef9d94e485eb08f962c74943023a5271'/>
<id>urn:sha1:47cdee29ef9d94e485eb08f962c74943023a5271</id>
<content type='text'>
Commit 498f6650aec8 ("block: Fix a race between the cgroup code and
request queue initialization") moves what blk_exit_queue does into
blk_cleanup_queue() to fix an issue caused by changing back the
queue lock.

However, after the legacy request IO path was killed, the driver queue lock
isn't used at all, and there is no longer any case of changing back the
queue lock. So the issue addressed by commit 498f6650aec8 doesn't
exist any more.

So move blk_exit_queue into __blk_release_queue.

This patch basically reverts the following two commits:

	498f6650aec8 block: Fix a race between the cgroup code and request queue initialization
	24ecc3585348 block: Ensure that a request queue is dissociated from the cgroup controller

Cc: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix hang caused by freeze/unfreeze sequence</title>
<updated>2019-05-23T16:25:26+00:00</updated>
<author>
<name>Bob Liu</name>
<email>bob.liu@oracle.com</email>
</author>
<published>2019-05-21T03:25:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/commit/?id=7996a8b5511a72465b0b286763c2d8f412b8874a'/>
<id>urn:sha1:7996a8b5511a72465b0b286763c2d8f412b8874a</id>
<content type='text'>
The following is a description of a hang in blk_mq_freeze_queue_wait().
The hang happens on an attempt to freeze a queue while another task
unfreezes it.

The root cause is that the order of percpu_ref_resurrect() and
percpu_ref_kill() is not guaranteed, so those two can be swapped:

 CPU#0                         CPU#1
 ----------------              -----------------
 q1 = blk_mq_init_queue(shared_tags)

                                q2 = blk_mq_init_queue(shared_tags):
                                  blk_mq_add_queue_tag_set(shared_tags):
                                    blk_mq_update_tag_set_depth(shared_tags):
                                     list_for_each_entry()
                                      blk_mq_freeze_queue(q1)
                                       &gt; percpu_ref_kill()
                                       &gt; blk_mq_freeze_queue_wait()

 blk_cleanup_queue(q1)
  blk_mq_freeze_queue(q1)
   &gt; percpu_ref_kill()
                 ^^^^^^ freeze_depth can't guarantee the order

                                      blk_mq_unfreeze_queue()
                                        &gt; percpu_ref_resurrect()

   &gt; blk_mq_freeze_queue_wait()
                 ^^^^^^ Hang here!!!!

This wrong sequence raises a kernel warning:
percpu_ref_kill_and_confirm called more than once on blk_queue_usage_counter_release!
WARNING: CPU: 0 PID: 11854 at lib/percpu-refcount.c:336 percpu_ref_kill_and_confirm+0x99/0xb0

But the most unpleasant effect is a hang in blk_mq_freeze_queue_wait(),
which waits for q_usage_counter to reach zero. That never happens
because the percpu-ref was reinitialized (instead of being killed) and
stays in PERCPU state forever.
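
The interleaving in the diagram can be replayed step by step in a toy model.
This is only a sketch under stated assumptions: ToyQueue and
replay_bad_interleaving() are hypothetical names, the ref is reduced to a
single "killed" flag, and the counter update and the ref operation are assumed
to be two separate steps, which is what lets the order get swapped.

```python
class ToyQueue:
    """Hypothetical model: a freeze depth counter plus a killable ref."""

    def __init__(self):
        self.freeze_depth = 0
        self.ref_killed = False
        self.double_kills = 0

    def kill_ref(self):
        # Killing an already killed ref is what the kernel warning reports.
        if self.ref_killed:
            self.double_kills += 1
        self.ref_killed = True

    def resurrect_ref(self):
        # The ref goes back to live (per-cpu) mode.
        self.ref_killed = False

    def wait_would_finish(self):
        # A freeze waiter can only finish if the ref is in killed state.
        return self.ref_killed


def replay_bad_interleaving():
    q = ToyQueue()
    # CPU#1: freeze q1, depth goes 0 to 1, so the ref is killed.
    q.freeze_depth += 1
    q.kill_ref()
    # CPU#1: unfreeze starts, depth goes 1 to 0, resurrect has not run yet.
    q.freeze_depth -= 1
    # CPU#0: freeze q1, depth goes 0 to 1 again, so the ref is killed again.
    q.freeze_depth += 1
    q.kill_ref()
    # CPU#1: the delayed resurrect finally runs.
    q.resurrect_ref()
    # CPU#0 now waits on a ref that is live again.
    return q
```

After the replay the queue still counts as frozen (freeze_depth is 1), one
double kill was recorded, and wait_would_finish() is False, which corresponds
to the waiter on CPU#0 never being woken.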

How to reproduce:
 - "insmod null_blk.ko shared_tags=1 nr_devices=0 queue_mode=2"
 - cpu0: python Script.py 0; use taskset to pin the corresponding process to cpu0
 - cpu1: python Script.py 1; use taskset to pin the corresponding process to cpu1

 Script.py:
 ------
 #!/usr/bin/python3

import os
import sys

while True:
    on = "echo 1 &gt; /sys/kernel/config/nullb/%s/power" % sys.argv[1]
    off = "echo 0 &gt; /sys/kernel/config/nullb/%s/power" % sys.argv[1]
    os.system(on)
    os.system(off)
------

This bug was first reported and fixed by Roman. Previous discussion:
[1] Message id: 1443287365-4244-7-git-send-email-akinobu.mita@gmail.com
[2] Message id: 1443563240-29306-6-git-send-email-tj@kernel.org
[3] https://patchwork.kernel.org/patch/9268199/

Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Roman Pen &lt;roman.penyaev@profitbricks.com&gt;
Signed-off-by: Bob Liu &lt;bob.liu@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: don't drain in-progress dispatch in blk_cleanup_queue()</title>
<updated>2019-05-04T13:24:11+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-04-30T01:52:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/commit/?id=662156641bc409a28fa313fca1a755105425d278'/>
<id>urn:sha1:662156641bc409a28fa313fca1a755105425d278</id>
<content type='text'>
Now that freeing hw queue resources is moved to hctx's release handler,
we don't need to worry about the race between blk_cleanup_queue and
run queue any more.

So don't drain in-progress dispatch in blk_cleanup_queue().

This is basically a revert of c2856ae2f315 ("blk-mq: quiesce queue before
freeing queue").

Cc: Dongli Zhang &lt;dongli.zhang@oracle.com&gt;
Cc: James Smart &lt;james.smart@broadcom.com&gt;
Cc: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: linux-scsi@vger.kernel.org,
Cc: Martin K . Petersen &lt;martin.petersen@oracle.com&gt;,
Cc: Christoph Hellwig &lt;hch@lst.de&gt;,
Cc: James E . J . Bottomley &lt;jejb@linux.vnet.ibm.com&gt;,
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Tested-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: move cancel of hctx-&gt;run_work into blk_mq_hw_sysfs_release</title>
<updated>2019-05-04T13:24:09+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-04-30T01:52:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/commit/?id=1b97871b501f1bac0fd39a073c4c8473ee457a55'/>
<id>urn:sha1:1b97871b501f1bac0fd39a073c4c8473ee457a55</id>
<content type='text'>
hctx is always released after the request queue is freed.

While holding the queue's kobject refcount, it is safe for a driver to run the
queue, so a queue run might be scheduled after blk_sync_queue() is done.

So move the cancel of hctx-&gt;run_work into blk_mq_hw_sysfs_release()
to avoid running a released queue.

Cc: Dongli Zhang &lt;dongli.zhang@oracle.com&gt;
Cc: James Smart &lt;james.smart@broadcom.com&gt;
Cc: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: linux-scsi@vger.kernel.org,
Cc: Martin K . Petersen &lt;martin.petersen@oracle.com&gt;,
Cc: Christoph Hellwig &lt;hch@lst.de&gt;,
Cc: James E . J . Bottomley &lt;jejb@linux.vnet.ibm.com&gt;,
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Tested-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: free hw queue's resource in hctx's release handler</title>
<updated>2019-05-04T13:24:05+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-04-30T01:52:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/commit/?id=c7e2d94b3d1634988a95ac4d77a72dc7487ece06'/>
<id>urn:sha1:c7e2d94b3d1634988a95ac4d77a72dc7487ece06</id>
<content type='text'>
Once blk_cleanup_queue() returns, tags shouldn't be used any more,
because blk_mq_free_tag_set() may be called. Commit 45a9c9d909b2
("blk-mq: Fix a use-after-free") fixes this issue exactly.

However, that commit introduces another issue. Before 45a9c9d909b2,
we were allowed to run the queue during queue cleanup if the queue's
kobj refcount was held. After that commit, the queue can't be run during
queue cleanup, otherwise an oops can be triggered easily because
some fields of hctx are freed by blk_mq_free_queue() in blk_cleanup_queue().

We have invented ways for addressing this kind of issue before, such as:

	8dc765d438f1 ("SCSI: fix queue cleanup race before queue initialization is done")
	c2856ae2f315 ("blk-mq: quiesce queue before freeing queue")

But these still can't cover all cases; recently James reported another
issue of this kind:

	https://marc.info/?l=linux-scsi&amp;m=155389088124782&amp;w=2

This issue is quite hard to address in the previous way, given that
scsi_run_queue() may run requeues for other LUNs.

Fix the above issue by freeing hctx's resources in its release handler. This
is safe because tags aren't needed for freeing such hctx resources.

This approach follows the typical design pattern for a kobject's release handler.
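
As a rough illustration of that pattern, the sketch below frees resources only
when the last reference is dropped. ToyKobject and the event strings are
hypothetical and made up for illustration; the model folds the queue and hctx
into a single refcounted object, which is a simplification.

```python
class ToyKobject:
    """Hypothetical refcounted object; release runs only on the last put."""

    def __init__(self, release):
        self.refcount = 1  # the creator holds the initial reference
        self.release = release

    def get(self):
        self.refcount += 1

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.release()


events = []
hctx = ToyKobject(release=lambda: events.append("hctx resources freed"))

hctx.get()  # a driver takes a reference because it may still run the queue
hctx.put()  # the cleanup path drops the initial reference
events.append("cleanup returned")
events.append("late queue run sees valid hctx")  # resources are still there
hctx.put()  # the driver drops the last reference, release frees resources
```

In this model the release callback is the only place where resources are
freed, so a holder of a reference never observes a half-freed object.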

Cc: Dongli Zhang &lt;dongli.zhang@oracle.com&gt;
Cc: James Smart &lt;james.smart@broadcom.com&gt;
Cc: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: linux-scsi@vger.kernel.org,
Cc: Martin K . Petersen &lt;martin.petersen@oracle.com&gt;,
Cc: Christoph Hellwig &lt;hch@lst.de&gt;,
Cc: James E . J . Bottomley &lt;jejb@linux.vnet.ibm.com&gt;,
Reported-by: James Smart &lt;james.smart@broadcom.com&gt;
Fixes: 45a9c9d909b2 ("blk-mq: Fix a use-after-free")
Cc: stable@vger.kernel.org
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-mq: move cancel of requeue_work into blk_mq_release</title>
<updated>2019-05-04T13:24:04+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-04-30T01:52:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/commit/?id=fbc2a15e3433058582e5635aabe48a3011a644a8'/>
<id>urn:sha1:fbc2a15e3433058582e5635aabe48a3011a644a8</id>
<content type='text'>
While holding the queue's kobject refcount, it is safe for a driver
to schedule a requeue. However, blk_mq_kick_requeue_list() may
be called after blk_sync_queue() is done because of concurrent
requeue activities, so the requeue work may not be completed when
the queue is freed, and a kernel oops is triggered.

So move the cancel of requeue_work into blk_mq_release() to
avoid a race between requeue and freeing the queue.

Cc: Dongli Zhang &lt;dongli.zhang@oracle.com&gt;
Cc: James Smart &lt;james.smart@broadcom.com&gt;
Cc: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: linux-scsi@vger.kernel.org,
Cc: Martin K . Petersen &lt;martin.petersen@oracle.com&gt;,
Cc: Christoph Hellwig &lt;hch@lst.de&gt;,
Cc: James E . J . Bottomley &lt;jejb@linux.vnet.ibm.com&gt;,
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reviewed-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: add SPDX tags to block layer files missing licensing information</title>
<updated>2019-04-30T22:12:03+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2019-04-30T18:42:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-op-linux/commit/?id=3dcf60bcb603f56361abb364a4cd2f69677453f0'/>
<id>urn:sha1:3dcf60bcb603f56361abb364a4cd2f69677453f0</id>
<content type='text'>
Various block layer files do not have any licensing information at all.
Add SPDX tags for the default kernel GPLv2 license to those.

Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
