summaryrefslogtreecommitdiffstats
path: root/Documentation/filesystems
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/Locking3
-rw-r--r--Documentation/filesystems/f2fs.txt18
-rw-r--r--Documentation/filesystems/overlayfs.txt81
-rw-r--r--Documentation/filesystems/proc.txt3
-rw-r--r--Documentation/filesystems/ramfs-rootfs-initramfs.txt2
-rw-r--r--Documentation/filesystems/vfs.txt16
6 files changed, 86 insertions, 37 deletions
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 9e6f19eaef89..efea228ccd8a 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -21,8 +21,7 @@ prototypes:
char *(*d_dname)((struct dentry *dentry, char *buffer, int buflen);
struct vfsmount *(*d_automount)(struct path *path);
int (*d_manage)(const struct path *, bool);
- struct dentry *(*d_real)(struct dentry *, const struct inode *,
- unsigned int, unsigned int);
+ struct dentry *(*d_real)(struct dentry *, const struct inode *);
locking rules:
rename_lock ->d_lock may block rcu-walk
diff --git a/Documentation/filesystems/f2fs.txt b/Documentation/filesystems/f2fs.txt
index 69f8de995739..e5edd29687b5 100644
--- a/Documentation/filesystems/f2fs.txt
+++ b/Documentation/filesystems/f2fs.txt
@@ -157,6 +157,24 @@ data_flush Enable data flushing before checkpoint in order to
persist data of regular and symlink.
fault_injection=%d Enable fault injection in all supported types with
specified injection rate.
+fault_type=%d Support configuring fault injection type, should be
+ enabled with fault_injection option, fault type value
+ is shown below, it supports single or combined type.
+ Type_Name Type_Value
+ FAULT_KMALLOC 0x000000001
+ FAULT_KVMALLOC 0x000000002
+ FAULT_PAGE_ALLOC 0x000000004
+ FAULT_PAGE_GET 0x000000008
+ FAULT_ALLOC_BIO 0x000000010
+ FAULT_ALLOC_NID 0x000000020
+ FAULT_ORPHAN 0x000000040
+ FAULT_BLOCK 0x000000080
+ FAULT_DIR_DEPTH 0x000000100
+ FAULT_EVICT_INODE 0x000000200
+ FAULT_TRUNCATE 0x000000400
+ FAULT_IO 0x000000800
+ FAULT_CHECKPOINT 0x000001000
+ FAULT_DISCARD 0x000002000
mode=%s Control block allocation mode which supports "adaptive"
and "lfs". In "lfs" mode, there should be no random
writes towards main area.
diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index 72615a2c0752..51c136c821bf 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -10,10 +10,6 @@ union-filesystems). An overlay-filesystem tries to present a
filesystem which is the result over overlaying one filesystem on top
of the other.
-The result will inevitably fail to look exactly like a normal
-filesystem for various technical reasons. The expectation is that
-many use cases will be able to ignore these differences.
-
Overlay objects
---------------
@@ -266,6 +262,30 @@ rightmost one and going left. In the above example lower1 will be the
top, lower2 the middle and lower3 the bottom layer.
+Metadata only copy up
+--------------------
+
+When metadata only copy up feature is enabled, overlayfs will only copy
+up metadata (as opposed to whole file), when a metadata specific operation
+like chown/chmod is performed. Full file will be copied up later when
+file is opened for WRITE operation.
+
+In other words, this is delayed data copy up operation and data is copied
+up when there is a need to actually modify data.
+
+There are multiple ways to enable/disable this feature. A config option
+CONFIG_OVERLAY_FS_METACOPY can be set/unset to enable/disable this feature
+by default. Or one can enable/disable it at module load time with module
+parameter metacopy=on/off. Lastly, there is also a per mount option
+metacopy=on/off to enable/disable this feature per mount.
+
+Do not use metacopy=on with untrusted upper/lower directories. Otherwise
+it is possible that an attacker can create a handcrafted file with
+appropriate REDIRECT and METACOPY xattrs, and gain access to file on lower
+pointed by REDIRECT. This should not be possible on local system as setting
+"trusted." xattrs will require CAP_SYS_ADMIN. But it should be possible
+for untrusted layers like from a pen drive.
+
Sharing and copying layers
--------------------------
@@ -284,7 +304,7 @@ though it will not result in a crash or deadlock.
Mounting an overlay using an upper layer path, where the upper layer path
was previously used by another mounted overlay in combination with a
different lower layer path, is allowed, unless the "inodes index" feature
-is enabled.
+or "metadata only copy up" feature is enabled.
With the "inodes index" feature, on the first time mount, an NFS file
handle of the lower layer root directory, along with the UUID of the lower
@@ -297,6 +317,10 @@ lower root origin, mount will fail with ESTALE. An overlayfs mount with
does not support NFS export, lower filesystem does not have a valid UUID or
if the upper filesystem does not support extended attributes.
+For "metadata only copy up" feature there is no verification mechanism at
+mount time. So if same upper is mounted with different set of lower, mount
+probably will succeed but expect the unexpected later on. So don't do it.
+
It is quite a common practice to copy overlay layers to a different
directory tree on the same or different underlying filesystem, and even
to a different machine. With the "inodes index" feature, trying to mount
@@ -306,27 +330,40 @@ the copied layers will fail the verification of the lower root file handle.
Non-standard behavior
---------------------
-The copy_up operation essentially creates a new, identical file and
-moves it over to the old name. Any open files referring to this inode
-will access the old data.
+Overlayfs can now act as a POSIX compliant filesystem with the following
+features turned on:
+
+1) "redirect_dir"
+
+Enabled with the mount option or module option: "redirect_dir=on" or with
+the kernel config option CONFIG_OVERLAY_FS_REDIRECT_DIR=y.
+
+If this feature is disabled, then rename(2) on a lower or merged directory
+will fail with EXDEV ("Invalid cross-device link").
+
+2) "inode index"
+
+Enabled with the mount option or module option "index=on" or with the
+kernel config option CONFIG_OVERLAY_FS_INDEX=y.
-The new file may be on a different filesystem, so both st_dev and st_ino
-of the real file may change. The values of st_dev and st_ino returned by
-stat(2) on an overlay object are often not the same as the real file
-stat(2) values to prevent the values from changing on copy_up.
+If this feature is disabled and a file with multiple hard links is copied
+up, then this will "break" the link. Changes will not be propagated to
+other names referring to the same inode.
-Unless "xino" feature is enabled, when overlay layers are not all on the
-same underlying filesystem, the value of st_dev may be different for two
-non-directory objects in the same overlay filesystem and the value of
-st_ino for directory objects may be non persistent and could change even
-while the overlay filesystem is still mounted.
+3) "xino"
-Unless "inode index" feature is enabled, if a file with multiple hard
-links is copied up, then this will "break" the link. Changes will not be
-propagated to other names referring to the same inode.
+Enabled with the mount option "xino=auto" or "xino=on", with the module
+option "xino_auto=on" or with the kernel config option
+CONFIG_OVERLAY_FS_XINO_AUTO=y. Also implicitly enabled by using the same
+underlying filesystem for all layers making up the overlay.
-Unless "redirect_dir" feature is enabled, rename(2) on a lower or merged
-directory will fail with EXDEV.
+If this feature is disabled or the underlying filesystem doesn't have
+enough free bits in the inode number, then overlayfs will not be able to
+guarantee that the values of st_ino and st_dev returned by stat(2) and the
+value of d_ino returned by readdir(3) will act like on a normal filesystem.
+E.g. the value of st_dev may be different for two objects in the same
+overlay filesystem and the value of st_ino for directory objects may not be
+persistent and could change even while the overlay filesystem is mounted.
Changes to underlying filesystems
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 1605acbb23b6..22b4b00dee31 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -870,6 +870,7 @@ Committed_AS: 100056 kB
VmallocTotal: 112216 kB
VmallocUsed: 428 kB
VmallocChunk: 111088 kB
+Percpu: 62080 kB
HardwareCorrupted: 0 kB
AnonHugePages: 49152 kB
ShmemHugePages: 0 kB
@@ -962,6 +963,8 @@ Committed_AS: The amount of memory presently allocated on the system.
VmallocTotal: total size of vmalloc memory area
VmallocUsed: amount of vmalloc area which is used
VmallocChunk: largest contiguous block of vmalloc area which is free
+ Percpu: Memory allocated to the percpu allocator used to back percpu
+ allocations. This stat excludes the cost of metadata.
..............................................................................
diff --git a/Documentation/filesystems/ramfs-rootfs-initramfs.txt b/Documentation/filesystems/ramfs-rootfs-initramfs.txt
index b176928e6963..79637d227e85 100644
--- a/Documentation/filesystems/ramfs-rootfs-initramfs.txt
+++ b/Documentation/filesystems/ramfs-rootfs-initramfs.txt
@@ -164,7 +164,7 @@ Documentation/early-userspace/README for more details.)
The kernel does not depend on external cpio tools. If you specify a
directory instead of a configuration file, the kernel's build infrastructure
creates a configuration file from that directory (usr/Makefile calls
-scripts/gen_initramfs_list.sh), and proceeds to package up that directory
+usr/gen_initramfs_list.sh), and proceeds to package up that directory
using the config file (by feeding it to usr/gen_init_cpio, which is created
from usr/gen_init_cpio.c). The kernel's build-time cpio creation code is
entirely self-contained, and the kernel's boot-time extractor is also
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 85907d5b9c2c..4b2084d0f1fb 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -989,8 +989,7 @@ struct dentry_operations {
char *(*d_dname)(struct dentry *, char *, int);
struct vfsmount *(*d_automount)(struct path *);
int (*d_manage)(const struct path *, bool);
- struct dentry *(*d_real)(struct dentry *, const struct inode *,
- unsigned int, unsigned int);
+ struct dentry *(*d_real)(struct dentry *, const struct inode *);
};
d_revalidate: called when the VFS needs to revalidate a dentry. This
@@ -1124,22 +1123,15 @@ struct dentry_operations {
dentry being transited from.
d_real: overlay/union type filesystems implement this method to return one of
- the underlying dentries hidden by the overlay. It is used in three
+ the underlying dentries hidden by the overlay. It is used in two
different modes:
- Called from open it may need to copy-up the file depending on the
- supplied open flags. This mode is selected with a non-zero flags
- argument. In this mode the d_real method can return an error.
-
Called from file_dentry() it returns the real dentry matching the inode
argument. The real dentry may be from a lower layer already copied up,
but still referenced from the file. This mode is selected with a
- non-NULL inode argument. This will always succeed.
-
- With NULL inode and zero flags the topmost real underlying dentry is
- returned. This will always succeed.
+ non-NULL inode argument.
- This method is never called with both non-NULL inode and non-zero flags.
+ With NULL inode the topmost real underlying dentry is returned.
Each dentry has a pointer to its parent dentry, as well as a hash list
of child dentries. Child dentries are basically like files in a
OpenPOWER on IntegriCloud