<feed xmlns='http://www.w3.org/2005/Atom'>
<title>talos-obmc-linux/fs/fscache, branch dev-5.0</title>
<subtitle>Talos™ II Linux sources for OpenBMC</subtitle>
<id>https://git.raptorcs.com/git/talos-obmc-linux/atom?h=dev-5.0</id>
<link rel='self' href='https://git.raptorcs.com/git/talos-obmc-linux/atom?h=dev-5.0'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/'/>
<updated>2018-11-30T15:57:31+00:00</updated>
<entry>
<title>fscache: fix race between enablement and dropping of object</title>
<updated>2018-11-30T15:57:31+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2018-10-26T06:16:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=c5a94f434c82529afda290df3235e4d85873c5b4'/>
<id>urn:sha1:c5a94f434c82529afda290df3235e4d85873c5b4</id>
<content type='text'>
It was observed that a process blocked indefintely in
__fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP
to be cleared via fscache_wait_for_deferred_lookup().

At this time, -&gt;backing_objects was empty, which would normaly prevent
__fscache_read_or_alloc_page() from getting to the point of waiting.
This implies that -&gt;backing_objects was cleared *after*
__fscache_read_or_alloc_page was was entered.

When an object is "killed" and then "dropped",
FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then
KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is
-&gt;backing_objects cleared.  This leaves a window where
something else can set FSCACHE_COOKIE_LOOKING_UP and
__fscache_read_or_alloc_page() can start waiting, before
-&gt;backing_objects is cleared

There is some uncertainty in this analysis, but it seems to be fit the
observations.  Adding the wake in this patch will be handled correctly
by __fscache_read_or_alloc_page(), as it checks if -&gt;backing_objects
is empty again, after waiting.

Customer which reported the hang, also report that the hang cannot be
reproduced with this fix.

The backtrace for the blocked process looked like:

PID: 29360  TASK: ffff881ff2ac0f80  CPU: 3   COMMAND: "zsh"
 #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1
 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed
 #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8
 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e
 #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache]
 #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache]
 #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs]
 #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs]
 #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73
 #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs]
#10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756
#11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa
#12 [ffff881ff43eff18] sys_read at ffffffff811fda62
#13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e

Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>fscache: Fix out of bound read in long cookie keys</title>
<updated>2018-10-18T09:32:21+00:00</updated>
<author>
<name>Eric Sandeen</name>
<email>sandeen@redhat.com</email>
</author>
<published>2018-10-17T14:23:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=fa520c47eaa15b9baa8ad66ac18da4a31679693b'/>
<id>urn:sha1:fa520c47eaa15b9baa8ad66ac18da4a31679693b</id>
<content type='text'>
fscache_set_key() can incur an out-of-bounds read, reported by KASAN:

 BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x5b3/0x680 [fscache]
 Read of size 4 at addr ffff88084ff056d4 by task mount.nfs/32615

and also reported by syzbot at https://lkml.org/lkml/2018/7/8/236

  BUG: KASAN: slab-out-of-bounds in fscache_set_key fs/fscache/cookie.c:120 [inline]
  BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x7a9/0x880 fs/fscache/cookie.c:171
  Read of size 4 at addr ffff8801d3cc8bb4 by task syz-executor907/4466

This happens for any index_key_len which is not divisible by 4 and is
larger than the size of the inline key, because the code allocates exactly
index_key_len for the key buffer, but the hashing loop is stepping through
it 4 bytes (u32) at a time in the buf[] array.

Fix this by calculating how many u32 buffers we'll need by using
DIV_ROUND_UP, and then using kcalloc() to allocate a precleared allocation
buffer to hold the index_key, then using that same count as the hashing
index limit.

Fixes: ec0328e46d6e ("fscache: Maintain a catalogue of allocated cookies")
Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Cc: stable &lt;stable@vger.kernel.org&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>fscache: Fix incomplete initialisation of inline key space</title>
<updated>2018-10-18T09:32:21+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2018-10-17T14:23:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=1ff22883b0b2f7a73eb2609ffe879c9fd96f6328'/>
<id>urn:sha1:1ff22883b0b2f7a73eb2609ffe879c9fd96f6328</id>
<content type='text'>
The inline key in struct rxrpc_cookie is insufficiently initialized,
zeroing only 3 of the 4 slots, therefore an index_key_len between 13 and 15
bytes will end up hashing uninitialized memory because the memcpy only
partially fills the last buf[] element.

Fix this by clearing fscache_cookie objects on allocation rather than using
the slab constructor to initialise them.  We're going to pretty much fill
in the entire struct anyway, so bringing it into our dcache writably
shouldn't incur much overhead.

This removes the need to do clearance in fscache_set_key() (where we aren't
doing it correctly anyway).

Also, we don't need to set cookie-&gt;key_len in fscache_set_key() as we
already did it in the only caller, so remove that.

Fixes: ec0328e46d6e ("fscache: Maintain a catalogue of allocated cookies")
Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
Reported-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Cc: stable &lt;stable@vger.kernel.org&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>fscache: Fix reference overput in fscache_attach_object() error handling</title>
<updated>2018-07-25T13:49:00+00:00</updated>
<author>
<name>Kiran Kumar Modukuri</name>
<email>kiran.modukuri@gmail.com</email>
</author>
<published>2018-06-21T20:31:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=f29507ce66701084c39aeb1b0ae71690cbff3554'/>
<id>urn:sha1:f29507ce66701084c39aeb1b0ae71690cbff3554</id>
<content type='text'>
When a cookie is allocated that causes fscache_object structs to be
allocated, those objects are initialised with the cookie pointer, but
aren't blessed with a ref on that cookie unless the attachment is
successfully completed in fscache_attach_object().

If attachment fails because the parent object was dying or there was a
collision, fscache_attach_object() returns without incrementing the cookie
counter - but upon failure of this function, the object is released which
then puts the cookie, whether or not a ref was taken on the cookie.

Fix this by taking a ref on the cookie when it is assigned in
fscache_object_init(), even when we're creating a root object.


Analysis from Kiran Kumar:

This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel

BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277

fscache cookie ref count updated incorrectly during fscache object
allocation resulting in following Oops.

kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!

[Cause]
Two threads are trying to do operate on a cookie and two objects.

(1) One thread tries to unmount the filesystem and in process goes over a
    huge list of objects marking them dead and deleting the objects.
    cookie-&gt;usage is also decremented in following path:

      nfs_fscache_release_super_cookie
       -&gt; __fscache_relinquish_cookie
        -&gt;__fscache_cookie_put
        -&gt;BUG_ON(atomic_read(&amp;cookie-&gt;usage) &lt;= 0);

(2) A second thread tries to lookup an object for reading data in following
    path:

    fscache_alloc_object
    1) cachefiles_alloc_object
        -&gt; fscache_object_init
           -&gt; assign cookie, but usage not bumped.
    2) fscache_attach_object -&gt; fails in cant_attach_object because the
         cookie's backing object or cookie's-&gt;parent object are going away
    3) fscache_put_object
        -&gt; cachefiles_put_object
          -&gt;fscache_object_destroy
            -&gt;fscache_cookie_put
               -&gt;BUG_ON(atomic_read(&amp;cookie-&gt;usage) &lt;= 0);

[NOTE from dhowells] It's unclear as to the circumstances in which (2) can
take place, given that thread (1) is in nfs_kill_super(), however a
conflicting NFS mount with slightly different parameters that creates a
different superblock would do it.  A backtrace from Kiran seems to show
that this is a possibility:

    kernel BUG at/build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
    ...
    RIP: __fscache_cookie_put+0x3a/0x40 [fscache]
    Call Trace:
     __fscache_relinquish_cookie+0x87/0x120 [fscache]
     nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs]
     nfs_kill_super+0x29/0x40 [nfs]
     deactivate_locked_super+0x48/0x80
     deactivate_super+0x5c/0x60
     cleanup_mnt+0x3f/0x90
     __cleanup_mnt+0x12/0x20
     task_work_run+0x86/0xb0
     exit_to_usermode_loop+0xc2/0xd0
     syscall_return_slowpath+0x4e/0x60
     int_ret_from_sys_call+0x25/0x9f

[Fix] Bump up the cookie usage in fscache_object_init, when it is first
being assigned a cookie atomically such that the cookie is added and bumped
up if its refcount is not zero.  Remove the assignment in
fscache_attach_object().

[Testcase]
I have run ~100 hours of NFS stress tests and not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.

Fixes: ccc4fc3d11e9 ("FS-Cache: Implement the cookie management part of the netfs API")
Signed-off-by: Kiran Kumar Modukuri &lt;kiran.modukuri@gmail.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>fscache: Allow cancelled operations to be enqueued</title>
<updated>2018-07-25T13:31:20+00:00</updated>
<author>
<name>Kiran Kumar Modukuri</name>
<email>kiran.modukuri@gmail.com</email>
</author>
<published>2018-07-25T13:31:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=d0eb06afe712b7b103b6361f40a9a0c638524669'/>
<id>urn:sha1:d0eb06afe712b7b103b6361f40a9a0c638524669</id>
<content type='text'>
Alter the state-check assertion in fscache_enqueue_operation() to allow
cancelled operations to be given processing time so they can be cleaned up.

Also fix a debugging statement that was requiring such operations to have
an object assigned.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Reported-by: Kiran Kumar Modukuri &lt;kiran.modukuri@gmail.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>proc: introduce proc_create_single{,_data}</title>
<updated>2018-05-16T05:23:35+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2018-05-15T13:57:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=3f3942aca6da351a12543aa776467791b63b3a78'/>
<id>urn:sha1:3f3942aca6da351a12543aa776467791b63b3a78</id>
<content type='text'>
Variants of proc_create{,_data} that directly take a seq_file show
callback and drastically reduces the boilerplate code in the callers.

All trivial callers converted over.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>proc: introduce proc_create_seq{,_data}</title>
<updated>2018-05-16T05:23:35+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2018-04-13T17:44:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=fddda2b7b521185f3aa018f9559eb33b0aee53a9'/>
<id>urn:sha1:fddda2b7b521185f3aa018f9559eb33b0aee53a9</id>
<content type='text'>
Variants of proc_create{,_data} that directly take a struct seq_operations
argument and drastically reduces the boilerplate code in the callers.

All trivial callers converted over.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>fscache: use appropriate radix tree accessors</title>
<updated>2018-04-11T17:28:39+00:00</updated>
<author>
<name>Matthew Wilcox</name>
<email>mawilcox@microsoft.com</email>
</author>
<published>2018-04-10T23:36:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=e5a955419642e0842fd26e1ada6ab3328018ca16'/>
<id>urn:sha1:e5a955419642e0842fd26e1ada6ab3328018ca16</id>
<content type='text'>
Don't open-code accesses to data structure internals.

Link: http://lkml.kernel.org/r/20180313132639.17387-7-willy@infradead.org
Signed-off-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Cc: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Ryusuke Konishi &lt;konishi.ryusuke@lab.ntt.co.jp&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fscache: Maintain a catalogue of allocated cookies</title>
<updated>2018-04-06T13:05:14+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2018-04-04T12:41:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=ec0328e46d6e5d0f17372eb90ab8e333c2ac7ca9'/>
<id>urn:sha1:ec0328e46d6e5d0f17372eb90ab8e333c2ac7ca9</id>
<content type='text'>
Maintain a catalogue of allocated cookies so that cookie collisions can be
handled properly.  For the moment, this just involves printing a warning
and returning a NULL cookie to the caller of fscache_acquire_cookie(), but
in future it might make sense to wait for the old cookie to finish being
cleaned up.

This requires the cookie key to be stored attached to the cookie so that we
still have the key available if the netfs relinquishes the cookie.  This is
done by an earlier patch.

The catalogue also renders redundant fscache_netfs_list (used for checking
for duplicates), so that can be removed.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Anna Schumaker &lt;anna.schumaker@netapp.com&gt;
Tested-by: Steve Dickson &lt;steved@redhat.com&gt;
</content>
</entry>
<entry>
<title>fscache: Pass object size in rather than calling back for it</title>
<updated>2018-04-06T13:05:14+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2018-04-04T12:41:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=ee1235a9a06813429c201bf186397a6feeea07bf'/>
<id>urn:sha1:ee1235a9a06813429c201bf186397a6feeea07bf</id>
<content type='text'>
Pass the object size in to fscache_acquire_cookie() and
fscache_write_page() rather than the netfs providing a callback by which it
can be received.  This makes it easier to update the size of the object
when a new page is written that extends the object.

The current object size is also passed by fscache to the check_aux
function, obviating the need to store it in the aux data.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Anna Schumaker &lt;anna.schumaker@netapp.com&gt;
Tested-by: Steve Dickson &lt;steved@redhat.com&gt;
</content>
</entry>
</feed>
