<feed xmlns='http://www.w3.org/2005/Atom'>
<title>talos-obmc-linux/fs/inode.c, branch v5.0</title>
<subtitle>Talos™ II Linux sources for OpenBMC</subtitle>
<id>https://git.raptorcs.com/git/talos-obmc-linux/atom?h=v5.0</id>
<link rel='self' href='https://git.raptorcs.com/git/talos-obmc-linux/atom?h=v5.0'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/'/>
<updated>2019-02-13T00:33:18+00:00</updated>
<entry>
<title>Revert "mm: don't reclaim inodes with many attached pages"</title>
<updated>2019-02-13T00:33:18+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2019-02-12T23:35:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=69056ee6a8a3d576ed31e38b3b14c70d6c74edcc'/>
<id>urn:sha1:69056ee6a8a3d576ed31e38b3b14c70d6c74edcc</id>
<content type='text'>
This reverts commit a76cf1a474d7d ("mm: don't reclaim inodes with many
attached pages").

This change causes serious changes to page cache and inode cache
behaviour and balance, resulting in major performance regressions when
combining worklaods such as large file copies and kernel compiles.

  https://bugzilla.kernel.org/show_bug.cgi?id=202441

This change is a hack to work around the problems introduced by changing
how agressive shrinkers are on small caches in commit 172b06c32b94 ("mm:
slowly shrink slabs with a relatively small number of objects").  It
creates more problems than it solves, wasn't adequately reviewed or
tested, so it needs to be reverted.

Link: http://lkml.kernel.org/r/20190130041707.27750-2-david@fromorbit.com
Fixes: a76cf1a474d7d ("mm: don't reclaim inodes with many attached pages")
Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Cc: Wolfgang Walter &lt;linux@stwm.de&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Spock &lt;dairinin@gmail.com&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'y2038-for-4.21' of ssh://gitolite.kernel.org:/pub/scm/linux/kernel/git/arnd/playground</title>
<updated>2018-12-28T20:45:04+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-12-28T20:45:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=b12a9124eeb71d766a3e3eb594ebbb3fefc66902'/>
<id>urn:sha1:b12a9124eeb71d766a3e3eb594ebbb3fefc66902</id>
<content type='text'>
Pull y2038 updates from Arnd Bergmann:
 "More syscalls and cleanups

  This concludes the main part of the system call rework for 64-bit
  time_t, which has spread over most of year 2018, the last six system
  calls being

    - ppoll
    - pselect6
    - io_pgetevents
    - recvmmsg
    - futex
    - rt_sigtimedwait

  As before, nothing changes for 64-bit architectures, while 32-bit
  architectures gain another entry point that differs only in the layout
  of the timespec structure. Hopefully in the next release we can wire
  up all 22 of those system calls on all 32-bit architectures, which
  gives us a baseline version for glibc to start using them.

  This does not include the clock_adjtime, getrusage/waitid, and
  getitimer/setitimer system calls. I still plan to have new versions of
  those as well, but they are not required for correct operation of the
  C library since they can be emulated using the old 32-bit time_t based
  system calls.

  Aside from the system calls, there are also a few cleanups here,
  removing old kernel internal interfaces that have become unused after
  all references got removed. The arch/sh cleanups are part of this,
  there were posted several times over the past year without a reaction
  from the maintainers, while the corresponding changes made it into all
  other architectures"

* tag 'y2038-for-4.21' of ssh://gitolite.kernel.org:/pub/scm/linux/kernel/git/arnd/playground:
  timekeeping: remove obsolete time accessors
  vfs: replace current_kernel_time64 with ktime equivalent
  timekeeping: remove timespec_add/timespec_del
  timekeeping: remove unused {read,update}_persistent_clock
  sh: remove board_time_init() callback
  sh: remove unused rtc_sh_get/set_time infrastructure
  sh: sh03: rtc: push down rtc class ops into driver
  sh: dreamcast: rtc: push down rtc class ops into driver
  y2038: signal: Add compat_sys_rt_sigtimedwait_time64
  y2038: signal: Add sys_rt_sigtimedwait_time32
  y2038: socket: Add compat_sys_recvmmsg_time64
  y2038: futex: Add support for __kernel_timespec
  y2038: futex: Move compat implementation into futex.c
  io_pgetevents: use __kernel_timespec
  pselect6: use __kernel_timespec
  ppoll: use __kernel_timespec
  signal: Add restore_user_sigmask()
  signal: Add set_user_sigmask()
</content>
</entry>
<entry>
<title>vfs: replace current_kernel_time64 with ktime equivalent</title>
<updated>2018-12-18T15:13:05+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2018-12-05T00:14:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=d651d1607f22fd0cd249cb045627569f8028092b'/>
<id>urn:sha1:d651d1607f22fd0cd249cb045627569f8028092b</id>
<content type='text'>
current_time is the last remaining caller of current_kernel_time64(),
which is a wrapper around ktime_get_coarse_real_ts64().  This calls the
latter directly for consistency with the rest of the kernel that is moving
to the ktime_get_ family of time accessors, as now documented in
Documentation/core-api/timekeeping.rst.

An open questions is whether we may want to actually call the more
accurate ktime_get_real_ts64() for file systems that save high-resolution
timestamps in their on-disk format.  This would add a small overhead to
each update of the inode stamps but lead to inode timestamps to actually
have a usable resolution better than one jiffy (1 to 10 milliseconds
normally).  Experiments on a variety of hardware platforms show a typical
time of around 100 CPU cycles to read the cycle counter and calculate the
accurate time from that.  On old platforms without a cycle counter, this
can be signiciantly higher, up to several microseconds to access a
hardware clock, but those have become very rare by now.

I traced the original addition of the current_kernel_time() call to set
the nanosecond fields back to linux-2.5.48, where Andi Kleen added a patch
with subject "nanosecond stat timefields".  Andi explains that the
motivation was to introduce as little overhead as possible back then.  At
this time, reading the clock hardware was also more expensive when most
architectures did not have a cycle counter.

One side effect of having more accurate inode timestamp would be having to
write out the inode every time that mtime/ctime/atime get touched on most
systems, whereas many file systems today only write it when the timestamps
have changed, i.e.  at most once per jiffy unless something else changes
as well.  That change would certainly be noticed in some workloads, which
is enough reason to not do it without a good reason, regardless of the
cost of reading the time.

One thing we could still consider however would be to round the timestamps
from current_time() to multiples of NSEC_PER_JIFFY, e.g.  full
milliseconds rather than having six or seven meaningless but confusing
digits at the end of the timestamp.

Link: http://lkml.kernel.org/r/20180726130820.4174359-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
</content>
</entry>
<entry>
<title>mm: don't reclaim inodes with many attached pages</title>
<updated>2018-11-18T18:15:09+00:00</updated>
<author>
<name>Roman Gushchin</name>
<email>guro@fb.com</email>
</author>
<published>2018-11-16T23:08:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=a76cf1a474d7dbcd9336b5f5afb0162baa142cf0'/>
<id>urn:sha1:a76cf1a474d7dbcd9336b5f5afb0162baa142cf0</id>
<content type='text'>
Spock reported that commit 172b06c32b94 ("mm: slowly shrink slabs with a
relatively small number of objects") leads to a regression on his setup:
periodically the majority of the pagecache is evicted without an obvious
reason, while before the change the amount of free memory was balancing
around the watermark.

The reason behind is that the mentioned above change created some
minimal background pressure on the inode cache.  The problem is that if
an inode is considered to be reclaimed, all belonging pagecache page are
stripped, no matter how many of them are there.  So, if a huge
multi-gigabyte file is cached in the memory, and the goal is to reclaim
only few slab objects (unused inodes), we still can eventually evict all
gigabytes of the pagecache at once.

The workload described by Spock has few large non-mapped files in the
pagecache, so it's especially noticeable.

To solve the problem let's postpone the reclaim of inodes, which have
more than 1 attached page.  Let's wait until the pagecache pages will be
evicted naturally by scanning the corresponding LRU lists, and only then
reclaim the inode structure.

Link: http://lkml.kernel.org/r/20181023164302.20436-1-guro@fb.com
Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Reported-by: Spock &lt;dairinin@gmail.com&gt;
Tested-by: Spock &lt;dairinin@gmail.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[4.19.x]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: remove include/linux/bootmem.h</title>
<updated>2018-10-31T15:54:16+00:00</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.vnet.ibm.com</email>
</author>
<published>2018-10-30T22:09:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=57c8a661d95dff48dd9c2f2496139082bbaf241a'/>
<id>urn:sha1:57c8a661d95dff48dd9c2f2496139082bbaf241a</id>
<content type='text'>
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include &lt;linux/memblock.h&gt;

@@
@@
- #include &lt;linux/bootmem.h&gt;
+ #include &lt;linux/memblock.h&gt;

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
  Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport &lt;rppt@linux.vnet.ibm.com&gt;
Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Greentime Hu &lt;green.hu@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Guan Xuetao &lt;gxt@pku.edu.cn&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "James E.J. Bottomley" &lt;jejb@parisc-linux.org&gt;
Cc: Jonas Bonn &lt;jonas@southpole.se&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Ley Foon Tan &lt;lftan@altera.com&gt;
Cc: Mark Salter &lt;msalter@redhat.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Palmer Dabbelt &lt;palmer@sifive.com&gt;
Cc: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Serge Semin &lt;fancer.lancer@gmail.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Yoshinori Sato &lt;ysato@users.sourceforge.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>page cache: Finish XArray conversion</title>
<updated>2018-10-21T14:46:44+00:00</updated>
<author>
<name>Matthew Wilcox</name>
<email>willy@infradead.org</email>
</author>
<published>2017-12-06T00:04:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=a28334862993b5c6a8766f6963ee69048403817c'/>
<id>urn:sha1:a28334862993b5c6a8766f6963ee69048403817c</id>
<content type='text'>
With no more radix tree API users left, we can drop the GFP flags
and use xa_init() instead of INIT_RADIX_TREE().

Signed-off-by: Matthew Wilcox &lt;willy@infradead.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'ovl-update-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs</title>
<updated>2018-08-22T01:19:09+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-08-22T01:19:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=d9a185f8b49678775ef56ecbdbc7b76970302897'/>
<id>urn:sha1:d9a185f8b49678775ef56ecbdbc7b76970302897</id>
<content type='text'>
Pull overlayfs updates from Miklos Szeredi:
 "This contains two new features:

   - Stack file operations: this allows removal of several hacks from
     the VFS, proper interaction of read-only open files with copy-up,
     possibility to implement fs modifying ioctls properly, and others.

   - Metadata only copy-up: when file is on lower layer and only
     metadata is modified (except size) then only copy up the metadata
     and continue to use the data from the lower file"

* tag 'ovl-update-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (66 commits)
  ovl: Enable metadata only feature
  ovl: Do not do metacopy only for ioctl modifying file attr
  ovl: Do not do metadata only copy-up for truncate operation
  ovl: add helper to force data copy-up
  ovl: Check redirect on index as well
  ovl: Set redirect on upper inode when it is linked
  ovl: Set redirect on metacopy files upon rename
  ovl: Do not set dentry type ORIGIN for broken hardlinks
  ovl: Add an inode flag OVL_CONST_INO
  ovl: Treat metacopy dentries as type OVL_PATH_MERGE
  ovl: Check redirects for metacopy files
  ovl: Move some dir related ovl_lookup_single() code in else block
  ovl: Do not expose metacopy only dentry from d_real()
  ovl: Open file with data except for the case of fsync
  ovl: Add helper ovl_inode_realdata()
  ovl: Store lower data inode in ovl_inode
  ovl: Fix ovl_getattr() to get number of blocks from lower
  ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
  ovl: Copy up meta inode data from lowest data inode
  ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  ...
</content>
</entry>
<entry>
<title>Merge branch 'work.mkdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2018-08-14T03:25:58+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-08-14T03:25:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=0ea97a2d61df729ccce75b00a2fa37d39a508ab6'/>
<id>urn:sha1:0ea97a2d61df729ccce75b00a2fa37d39a508ab6</id>
<content type='text'>
Pull vfs icache updates from Al Viro:

 - NFS mkdir/open_by_handle race fix

 - analogous solution for FUSE, replacing the one currently in mainline

 - new primitive to be used when discarding halfway set up inodes on
   failed object creation; gives sane warranties re icache lookups not
   returning such doomed by still not freed inodes. A bunch of
   filesystems switched to that animal.

 - Miklos' fix for last cycle regression in iget5_locked(); -stable will
   need a slightly different variant, unfortunately.

 - misc bits and pieces around things icache-related (in adfs and jfs).

* 'work.mkdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  jfs: don't bother with make_bad_inode() in ialloc()
  adfs: don't put inodes into icache
  new helper: inode_fake_hash()
  vfs: don't evict uninitialized inode
  jfs: switch to discard_new_inode()
  ext2: make sure that partially set up inodes won't be returned by ext2_iget()
  udf: switch to discard_new_inode()
  ufs: switch to discard_new_inode()
  btrfs: switch to discard_new_inode()
  new primitive: discard_new_inode()
  kill d_instantiate_no_diralias()
  nfs_instantiate(): prevent multiple aliases for directory inode
</content>
</entry>
<entry>
<title>vfs: don't evict uninitialized inode</title>
<updated>2018-08-03T20:03:32+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2018-07-24T13:01:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=e950564b97fd0f541b02eb207685d0746f5ecf29'/>
<id>urn:sha1:e950564b97fd0f541b02eb207685d0746f5ecf29</id>
<content type='text'>
iput() ends up calling -&gt;evict() on new inode, which is not yet initialized
by owning fs.  So use destroy_inode() instead.

Add to sb-&gt;s_inodes list only if inode is not in I_CREATING state (meaning
that it wasn't allocated with new_inode(), which already does the
insertion).

Reported-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
Fixes: 80ea09a002bf ("vfs: factor out inode_insert5()")
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>new primitive: discard_new_inode()</title>
<updated>2018-08-03T19:55:30+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2018-06-28T19:53:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=c2b6d621c4ffe9936adf7a55c8b1c769672c306f'/>
<id>urn:sha1:c2b6d621c4ffe9936adf7a55c8b1c769672c306f</id>
<content type='text'>
	We don't want open-by-handle picking half-set-up in-core
struct inode from e.g. mkdir() having failed halfway through.
In other words, we don't want such inodes returned by iget_locked()
on their way to extinction.  However, we can't just have them
unhashed - otherwise open-by-handle immediately *after* that would've
ended up creating a new in-core inode over the on-disk one that
is in process of being freed right under us.

	Solution: new flag (I_CREATING) set by insert_inode_locked() and
removed by unlock_new_inode() and a new primitive (discard_new_inode())
to be used by such halfway-through-setup failure exits instead of
unlock_new_inode() / iput() combinations.  That primitive unlocks new
inode, but leaves I_CREATING in place.

	iget_locked() treats finding an I_CREATING inode as failure
(-ESTALE, once we sort out the error propagation).
	insert_inode_locked() treats the same as instant -EBUSY.
	ilookup() treats those as icache miss.

[Fix by Dan Carpenter &lt;dan.carpenter@oracle.com&gt; folded in]

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
</feed>
