5 files changed, 150 insertions, 253 deletions
diff --git a/Documentation/filesystems/btrfs.txt b/Documentation/filesystems/btrfs.txt
index c772b47e7ef0..f9dad22d95ce 100644
--- a/Documentation/filesystems/btrfs.txt
+++ b/Documentation/filesystems/btrfs.txt
@@ -1,20 +1,10 @@
-
 BTRFS
 =====
 
-Btrfs is a copy on write filesystem for Linux aimed at
-implementing advanced features while focusing on fault tolerance,
-repair and easy administration. Initially developed by Oracle, Btrfs
-is licensed under the GPL and open for contribution from anyone.
-
-Linux has a wealth of filesystems to choose from, but we are facing a
-number of challenges with scaling to the large storage subsystems that
-are becoming common in today's data centers. Filesystems need to scale
-in their ability to address and manage large storage, and also in
-their ability to detect, repair and tolerate errors in the data stored
-on disk.  Btrfs is under heavy development, and is not suitable for
-any uses other than benchmarking and review. The Btrfs disk format is
-not yet finalized.
+Btrfs is a copy on write filesystem for Linux aimed at implementing advanced
+features while focusing on fault tolerance, repair and easy administration.
+Btrfs is jointly developed by several companies, licensed under the GPL and
+open for contribution from anyone.
 
 The main Btrfs features include:
 
@@ -28,243 +18,14 @@ The main Btrfs features include:
     * Checksums on data and metadata (multiple algorithms available)
     * Compression
     * Integrated multiple device support, with several raid algorithms
-    * Online filesystem check (not yet implemented)
-    * Very fast offline filesystem check
-    * Efficient incremental backup and FS mirroring (not yet implemented)
+    * Offline filesystem check
+    * Efficient incremental backup and FS mirroring
     * Online filesystem defragmentation
 
+For more information please refer to the wiki
 
-Mount Options
-=============
-
-When mounting a btrfs filesystem, the following option are accepted.
-Options with (*) are default options and will not show in the mount options.
-
-  alloc_start=<bytes>
-	Debugging option to force all block allocations above a certain
-	byte threshold on each block device.  The value is specified in
-	bytes, optionally with a K, M, or G suffix, case insensitive.
-	Default is 1MB.
-
-  noautodefrag(*)
-  autodefrag
-	Disable/enable auto defragmentation.
-	Auto defragmentation detects small random writes into files and queue
-	them up for the defrag process.  Works best for small files;
-	Not well suited for large database workloads.
-
-  check_int
-  check_int_data
-  check_int_print_mask=<value>
-	These debugging options control the behavior of the integrity checking
-	module (the BTRFS_FS_CHECK_INTEGRITY config option required).
-
-	check_int enables the integrity checker module, which examines all
-	block write requests to ensure on-disk consistency, at a large
-	memory and CPU cost.
-
-	check_int_data includes extent data in the integrity checks, and
-	implies the check_int option.
-
-	check_int_print_mask takes a bitmask of BTRFSIC_PRINT_MASK_* values
-	as defined in fs/btrfs/check-integrity.c, to control the integrity
-	checker module behavior.
-
-	See comments at the top of fs/btrfs/check-integrity.c for more info.
-
-  commit=<seconds>
-	Set the interval of periodic commit, 30 seconds by default. Higher
-	values defer data being synced to permanent storage with obvious
-	consequences when the system crashes. The upper bound is not forced,
-	but a warning is printed if it's more than 300 seconds (5 minutes).
-
-  compress
-  compress=<type>
-  compress-force
-  compress-force=<type>
-	Control BTRFS file data compression.  Type may be specified as "zlib"
-	"lzo" or "no" (for no compression, used for remounting).  If no type
-	is specified, zlib is used.  If compress-force is specified,
-	all files will be compressed, whether or not they compress well.
-	If compression is enabled, nodatacow and nodatasum are disabled.
-
-  degraded
-	Allow mounts to continue with missing devices.  A read-write mount may
-	fail with too many devices missing, for example if a stripe member
-	is completely missing.
-
-  device=<devicepath>
-	Specify a device during mount so that ioctls on the control device
-	can be avoided.  Especially useful when trying to mount a multi-device
-	setup as root.  May be specified multiple times for multiple devices.
-
-  nodiscard(*)
-  discard
-	Disable/enable discard mount option.
-	Discard issues frequent commands to let the block device reclaim space
-	freed by the filesystem.
-	This is useful for SSD devices, thinly provisioned
-	LUNs and virtual machine images, but may have a significant
-	performance impact.  (The fstrim command is also available to
-	initiate batch trims from userspace).
-
-  noenospc_debug(*)
-  enospc_debug
-	Disable/enable debugging option to be more verbose in some ENOSPC conditions.
-
-  fatal_errors=<action>
-	Action to take when encountering a fatal error:
-	  "bug" - BUG() on a fatal error.  This is the default.
-	  "panic" - panic() on a fatal error.
-
-  noflushoncommit(*)
-  flushoncommit
-	The 'flushoncommit' mount option forces any data dirtied by a write in a
-	prior transaction to commit as part of the current commit.  This makes
-	the committed state a fully consistent view of the file system from the
-	application's perspective (i.e., it includes all completed file system
-	operations).  This was previously the behavior only when a snapshot is
-	created.
-
-  inode_cache
-	Enable free inode number caching.   Defaults to off due to an overflow
-	problem when the free space crcs don't fit inside a single page.
-
-  max_inline=<bytes>
-	Specify the maximum amount of space, in bytes, that can be inlined in
-	a metadata B-tree leaf.  The value is specified in bytes, optionally
-	with a K, M, or G suffix, case insensitive.  In practice, this value
-	is limited by the root sector size, with some space unavailable due
-	to leaf headers.  For a 4k sector size, max inline data is ~3900 bytes.
-
-  metadata_ratio=<value>
-	Specify that 1 metadata chunk should be allocated after every <value>
-	data chunks.  Off by default.
-
-  acl(*)
-  noacl
-	Enable/disable support for Posix Access Control Lists (ACLs).  See the
-	acl(5) manual page for more information about ACLs.
-
-  barrier(*)
-  nobarrier
-        Enable/disable the use of block layer write barriers.  Write barriers
-	ensure that certain IOs make it through the device cache and are on
-	persistent storage. If disabled on a device with a volatile
-	(non-battery-backed) write-back cache, nobarrier option will lead to
-	filesystem corruption on a system crash or power loss.
-
-  datacow(*)
-  nodatacow
-	Enable/disable data copy-on-write for newly created files.
-	Nodatacow implies nodatasum, and disables all compression.
-
-  datasum(*)
-  nodatasum
-	Enable/disable data checksumming for newly created files.
-	Datasum implies datacow.
-
-  treelog(*)
-  notreelog
-	Enable/disable the tree logging used for fsync and O_SYNC writes.
-
-  recovery
-	Enable autorecovery attempts if a bad tree root is found at mount time.
-	Currently this scans a list of several previous tree roots and tries to
-	use the first readable.
-
-  rescan_uuid_tree
-	Force check and rebuild procedure of the UUID tree. This should not
-	normally be needed.
-
-  skip_balance
-	Skip automatic resume of interrupted balance operation after mount.
-	May be resumed with "btrfs balance resume."
-
-  space_cache (*)
-	Enable the on-disk freespace cache.
-  nospace_cache
-	Disable freespace cache loading without clearing the cache.
-  clear_cache
-	Force clearing and rebuilding of the disk space cache if something
-	has gone wrong.
-
-  ssd
-  nossd
-  ssd_spread
-	Options to control ssd allocation schemes.  By default, BTRFS will
-	enable or disable ssd allocation heuristics depending on whether a
-	rotational or non-rotational disk is in use.  The ssd and nossd options
-	can override this autodetection.
-
-	The ssd_spread mount option attempts to allocate into big chunks
-	of unused space, and may perform better on low-end ssds.  ssd_spread
-	implies ssd, enabling all other ssd heuristics as well.
-
-  subvol=<path>
-	Mount subvolume at <path> rather than the root subvolume.  <path> is
-	relative to the top level subvolume.
-
-  subvolid=<ID>
-	Mount subvolume specified by an ID number rather than the root subvolume.
-	This allows mounting of subvolumes which are not in the root of the mounted
-	filesystem.
-	You can use "btrfs subvolume list" to see subvolume ID numbers.
-
-  subvolrootid=<objectid> (deprecated)
-	Mount subvolume specified by <objectid> rather than the root subvolume.
-	This allows mounting of subvolumes which are not in the root of the mounted
-	filesystem.
-	You can use "btrfs subvolume show " to see the object ID for a subvolume.
-
-  thread_pool=<number>
-	The number of worker threads to allocate.  The default number is equal
-	to the number of CPUs + 2, or 8, whichever is smaller.
-
-  user_subvol_rm_allowed
-	Allow subvolumes to be deleted by a non-root user. Use with caution.
-
-MAILING LIST
-============
-
-There is a Btrfs mailing list hosted on vger.kernel.org. You can
-find details on how to subscribe here:
-
-http://vger.kernel.org/vger-lists.html#linux-btrfs
-
-Mailing list archives are available from gmane:
-
-http://dir.gmane.org/gmane.comp.file-systems.btrfs
-
-
-
-IRC
-===
-
-Discussion of Btrfs also occurs on the #btrfs channel of the Freenode
-IRC network.
-
-
-
-	UTILITIES
-	=========
-
-Userspace tools for creating and manipulating Btrfs file systems are
-available from the git repository at the following location:
-
- http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git
- git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
-
-These include the following tools:
-
-* mkfs.btrfs: create a filesystem
-
-* btrfs: a single tool to manage the filesystems, refer to the manpage for more details
-
-* 'btrfsck' or 'btrfs check': do a consistency check of the filesystem
-
-Other tools for specific tasks:
-
-* btrfs-convert: in-place conversion from ext2/3/4 filesystems
+  https://btrfs.wiki.kernel.org
 
-* btrfs-image: dump filesystem metadata for debugging
+that maintains information about administration tasks, frequently asked
+questions, use cases, mount options, comprehensible changelogs, features,
+manual pages, source code repositories, contacts etc.
diff --git a/Documentation/filesystems/nfs/pnfs-scsi-server.txt b/Documentation/filesystems/nfs/pnfs-scsi-server.txt
new file mode 100644
index 000000000000..5bef7268bd9f
--- /dev/null
+++ b/Documentation/filesystems/nfs/pnfs-scsi-server.txt
@@ -0,0 +1,23 @@
+
+pNFS SCSI layout server user guide
+==================================
+
+This document describes support for pNFS SCSI layouts in the Linux NFS server.
+With pNFS SCSI layouts, the NFS server acts as Metadata Server (MDS) for pNFS,
+which in addition to handling all the metadata access to the NFS export,
+also hands out layouts to the clients so that they can directly access the
+underlying SCSI LUNs that are shared with the client.
+
+To use pNFS SCSI layouts with the Linux NFS server, the exported file
+system needs to support the pNFS SCSI layouts (currently just XFS), and the
+file system must sit on a SCSI LUN that is accessible to the clients in
+addition to the MDS.  As of now the file system needs to sit directly on the
+exported LUN; striping or concatenation of LUNs on the MDS and clients
+is not supported yet.
+
+On a server built with CONFIG_NFSD_SCSI, the pNFS SCSI volume support is
+automatically enabled if the file system is exported using the "pnfs"
+option and the underlying SCSI device supports persistent reservations.
+On the client, make sure the kernel has the CONFIG_PNFS_BLOCK option
+enabled, and the file system is mounted using the NFSv4.1 protocol
+version (mount -o vers=4.1).
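
The setup described in pnfs-scsi-server.txt can be sketched roughly as
follows. This is a minimal sketch, not a tested recipe: the helper name
has_kconfig, the kernel config file location, the export path /export/pnfs
and the host name mds.example.com are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: config file path, export path and host name are hypothetical.

# has_kconfig FILE OPTION
# Succeeds if OPTION is built in (=y) or modular (=m) in kernel config FILE.
has_kconfig() {
    grep -Eq "^$2=(y|m)\$" "$1"
}

# Example use (not run here; needs root and a SCSI LUN shared with clients):
#   On the MDS:
#     has_kconfig "/boot/config-$(uname -r)" CONFIG_NFSD_SCSI
#     echo '/export/pnfs *(rw,pnfs)' >> /etc/exports && exportfs -ra
#   On the client:
#     has_kconfig "/boot/config-$(uname -r)" CONFIG_PNFS_BLOCK
#     mount -t nfs -o vers=4.1 mds.example.com:/export/pnfs /mnt/pnfs
```

Where the kernel configuration is stored varies between distributions, so
the /boot/config-* path in the comments may need adjusting.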
diff --git a/Documentation/filesystems/ocfs2-online-filecheck.txt b/Documentation/filesystems/ocfs2-online-filecheck.txt
new file mode 100644
index 000000000000..1ab07860430d
--- /dev/null
+++ b/Documentation/filesystems/ocfs2-online-filecheck.txt
@@ -0,0 +1,94 @@
+		    OCFS2 online file check
+		    -----------------------
+
+This document describes the OCFS2 online file check feature.
+
+Introduction
+============
+OCFS2 is often used in high-availability systems. However, OCFS2 usually
+converts the filesystem to read-only when it encounters an error. This may
+not be necessary, since turning the filesystem read-only would affect other
+running processes as well, decreasing availability.
+A mount option (errors=continue) was therefore introduced, which returns the
+-EIO errno to the calling process and terminates further processing so that the
+filesystem is not corrupted further. The filesystem is not converted to
+read-only, and the problematic file's inode number is reported in the kernel
+log. The user can try to check/fix this file via the online filecheck feature.
+
+Scope
+=====
+This effort is to check/fix small issues which may hinder day-to-day operations
+of a cluster filesystem by turning the filesystem read-only. The scope of
+checking/fixing is at the file level, initially for regular files and eventually
+to all files (including system files) of the filesystem.
+
+If a directory-to-file link is incorrect, the directory inode is
+reported as erroneous.
+
+This feature is not suited for extensive checks which depend on other
+components of the filesystem, such as, but not limited to, checking whether the
+bits for file blocks in the allocation have been set. In case of such an error,
+the offline fsck is recommended.
+
+Finally, such an operation/feature should not be automated, lest the filesystem
+end up with more damage than before the repair attempt. So, this has to
+be performed using user interaction and consent.
+
+User interface
+==============
+When there are errors in the OCFS2 filesystem, they are usually accompanied
+by the inode number which caused the error. This inode number would be the
+input to check/fix the file.
+
+There is a sysfs directory for each mounted OCFS2 file system:
+
+  /sys/fs/ocfs2/<devname>/filecheck
+
+Here, <devname> indicates the name of the OCFS2 volume device which has
+already been mounted. The directory above is used to communicate with kernel
+space and tell it which file (inode number) will be checked or fixed.
+Currently, three operations are supported: checking an inode, fixing an inode
+and setting the size of the result record history.
+
+1. If you want to know exactly what error happened to <inode> before fixing, do
+
+  # echo "<inode>" > /sys/fs/ocfs2/<devname>/filecheck/check
+  # cat /sys/fs/ocfs2/<devname>/filecheck/check
+
+The output is like this:
+  INO		DONE	ERROR
+39502		1	GENERATION
+
+<INO> lists the inode numbers.
+<DONE> indicates whether the operation has been finished.
+<ERROR> says what kind of error was found. For the detailed error numbers,
+please refer to the file linux/fs/ocfs2/filecheck.h.
+
+2. If you decide to fix this inode, do
+
+  # echo "<inode>" > /sys/fs/ocfs2/<devname>/filecheck/fix
+  # cat /sys/fs/ocfs2/<devname>/filecheck/fix
+
+The output is like this:
+  INO		DONE	ERROR
+39502		1	SUCCESS
+
+This time, the <ERROR> column indicates whether this fix is successful or not.
+
+3. The record cache is used to store the history of check/fix results. Its
+default size is 10, and it can be adjusted within the range of 10 to 100. You
+can adjust the size like this:
+
+  # echo "<size>" > /sys/fs/ocfs2/<devname>/filecheck/set
+
+Fixing stuff
+============
+On receiving the inode, the filesystem would read the inode and the
+file metadata. In case of errors, the filesystem would fix the errors
+and report the problems it fixed in the kernel log. As a precautionary measure,
+the inode must first be checked for errors before performing a final fix.
+
+The inode and the result history will be maintained temporarily in a
+small linked list buffer which would contain the last (N) inodes
+fixed/checked; the detailed errors which were fixed/checked are printed in the
+kernel log.
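
The check/fix result tables shown in ocfs2-online-filecheck.txt can be read
from a script. The sketch below assumes the whitespace-separated
INO/DONE/ERROR layout shown in that document; the function name
filecheck_result is hypothetical and not part of OCFS2 or its tools.

```shell
#!/bin/sh
# Sketch only: assumes the "INO DONE ERROR" table layout from the document.

# filecheck_result INODE
# Reads a result table on stdin and prints the ERROR column of the last row
# for INODE whose DONE column is 1. Fails if there is no such row.
filecheck_result() {
    awk -v ino="$1" '
        $1 == ino && $2 == 1 { result = $3; found = 1 }
        END { if (found) print result; else exit 1 }
    '
}

# Example use (not run here; needs a mounted OCFS2 volume and root):
#   echo 39502 > /sys/fs/ocfs2/<devname>/filecheck/check
#   filecheck_result 39502 < /sys/fs/ocfs2/<devname>/filecheck/check
```

Checking the DONE column before reading ERROR keeps the script from acting
on a row whose operation has not finished yet.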
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 843b045b4069..7f5607a089b4 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -43,6 +43,7 @@ Table of Contents
   3.7   /proc/<pid>/task/<tid>/children - Information about task children
   3.8   /proc/<pid>/fdinfo/<fd> - Information about opened file
   3.9   /proc/<pid>/map_files - Information about memory mapped files
+  3.10  /proc/<pid>/timerslack_ns - Task timerslack value
 
   4	Configuring procfs
   4.1	Mount options
@@ -1862,6 +1863,23 @@ time one can open(2) mappings from the listings of two processes and
 comparing their inode numbers to figure out which anonymous memory areas
 are actually shared.
 
+3.10	/proc/<pid>/timerslack_ns - Task timerslack value
+---------------------------------------------------------
+This file provides the task's timerslack value in nanoseconds.
+This value specifies an amount of time that normal timers may be deferred
+in order to coalesce timers and avoid unnecessary wakeups.
+
+This allows a task's interactivity vs power consumption trade-off to be
+adjusted.
+
+Writing 0 to the file will set the task's timerslack to the default value.
+
+Valid values are from 0 to ULLONG_MAX.
+
+An application setting the value must have PTRACE_MODE_ATTACH_FSCREDS level
+permissions on the task specified to change its timerslack_ns value.
+
+
 ------------------------------------------------------------------------------
 Configuring procfs
 ------------------------------------------------------------------------------
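
Reading and writing the /proc/<pid>/timerslack_ns file added in section 3.10
of proc.txt can be sketched as below. The function names and the PROC_ROOT
override are hypothetical conveniences for illustration; only the file path
and the meaning of writing 0 come from the documentation itself.

```shell
#!/bin/sh
# Sketch only: PROC_ROOT is a hypothetical override, defaulting to /proc.
PROC_ROOT=${PROC_ROOT:-/proc}

# get_timerslack PID: print the task's timer slack in nanoseconds.
get_timerslack() {
    cat "$PROC_ROOT/$1/timerslack_ns"
}

# set_timerslack PID NS: set the task's timer slack to NS nanoseconds.
# Per the documentation, NS=0 resets the slack to the task's default value.
set_timerslack() {
    printf '%s\n' "$2" > "$PROC_ROOT/$1/timerslack_ns"
}

# Example use (not run here):
#   get_timerslack $$          # slack of the current shell
#   set_timerslack $$ 100000   # allow timers to be deferred by up to 100 us
#   set_timerslack $$ 0        # back to the default
```

Writing to another task's file requires the PTRACE_MODE_ATTACH_FSCREDS
permission described in that section.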
diff --git a/Documentation/filesystems/vfat.txt b/Documentation/filesystems/vfat.txt
index 223c32171dcc..cf51360e3a9f 100644
--- a/Documentation/filesystems/vfat.txt
+++ b/Documentation/filesystems/vfat.txt
@@ -56,9 +56,10 @@ iocharset=<name> -- Character set to use for converting between the
 		 you should consider the following option instead.
 
 utf8=<bool>   -- UTF-8 is the filesystem safe version of Unicode that
-		 is used by the console.  It can be enabled for the
-		 filesystem with this option. If 'uni_xlate' gets set,
-		 UTF-8 gets disabled.
+		 is used by the console. It can be enabled or disabled
+		 for the filesystem with this option.
+		 If 'uni_xlate' gets set, UTF-8 gets disabled.
+		 By default, the FAT_DEFAULT_UTF8 setting is used.
 
 uni_xlate=<bool> -- Translate unhandled Unicode characters to special
 		 escaped sequences.  This would let you backup and