summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/Changes1
-rw-r--r--Documentation/SubmittingPatches5
-rw-r--r--Documentation/acpi-hotkey.txt38
-rw-r--r--Documentation/arm/Samsung-S3C24XX/USB-Host.txt93
-rw-r--r--Documentation/crypto/api-intro.txt1
-rw-r--r--Documentation/dontdiff3
-rw-r--r--Documentation/fb/vesafb.txt16
-rw-r--r--Documentation/feature-removal-schedule.txt22
-rw-r--r--Documentation/filesystems/inotify.txt151
-rw-r--r--Documentation/filesystems/ntfs.txt29
-rw-r--r--Documentation/filesystems/proc.txt1
-rw-r--r--Documentation/filesystems/sysfs.txt28
-rw-r--r--Documentation/hwmon/adm1021 (renamed from Documentation/i2c/chips/adm1021)0
-rw-r--r--Documentation/hwmon/adm1025 (renamed from Documentation/i2c/chips/adm1025)0
-rw-r--r--Documentation/hwmon/adm1026 (renamed from Documentation/i2c/chips/adm1026)0
-rw-r--r--Documentation/hwmon/adm1031 (renamed from Documentation/i2c/chips/adm1031)0
-rw-r--r--Documentation/hwmon/adm9240 (renamed from Documentation/i2c/chips/adm9240)0
-rw-r--r--Documentation/hwmon/asb100 (renamed from Documentation/i2c/chips/asb100)0
-rw-r--r--Documentation/hwmon/ds1621 (renamed from Documentation/i2c/chips/ds1621)0
-rw-r--r--Documentation/hwmon/fscher (renamed from Documentation/i2c/chips/fscher)0
-rw-r--r--Documentation/hwmon/gl518sm (renamed from Documentation/i2c/chips/gl518sm)0
-rw-r--r--Documentation/hwmon/it87 (renamed from Documentation/i2c/chips/it87)0
-rw-r--r--Documentation/hwmon/lm63 (renamed from Documentation/i2c/chips/lm63)0
-rw-r--r--Documentation/hwmon/lm75 (renamed from Documentation/i2c/chips/lm75)0
-rw-r--r--Documentation/hwmon/lm77 (renamed from Documentation/i2c/chips/lm77)0
-rw-r--r--Documentation/hwmon/lm78 (renamed from Documentation/i2c/chips/lm78)7
-rw-r--r--Documentation/hwmon/lm80 (renamed from Documentation/i2c/chips/lm80)0
-rw-r--r--Documentation/hwmon/lm83 (renamed from Documentation/i2c/chips/lm83)0
-rw-r--r--Documentation/hwmon/lm85 (renamed from Documentation/i2c/chips/lm85)0
-rw-r--r--Documentation/hwmon/lm87 (renamed from Documentation/i2c/chips/lm87)0
-rw-r--r--Documentation/hwmon/lm90 (renamed from Documentation/i2c/chips/lm90)0
-rw-r--r--Documentation/hwmon/lm92 (renamed from Documentation/i2c/chips/lm92)0
-rw-r--r--Documentation/hwmon/max1619 (renamed from Documentation/i2c/chips/max1619)0
-rw-r--r--Documentation/hwmon/pc87360 (renamed from Documentation/i2c/chips/pc87360)0
-rw-r--r--Documentation/hwmon/sis5595 (renamed from Documentation/i2c/chips/sis5595)0
-rw-r--r--Documentation/hwmon/smsc47b397 (renamed from Documentation/i2c/chips/smsc47b397)0
-rw-r--r--Documentation/hwmon/smsc47m1 (renamed from Documentation/i2c/chips/smsc47m1)0
-rw-r--r--Documentation/hwmon/sysfs-interface (renamed from Documentation/i2c/sysfs-interface)0
-rw-r--r--Documentation/hwmon/userspace-tools (renamed from Documentation/i2c/userspace-tools)0
-rw-r--r--Documentation/hwmon/via686a (renamed from Documentation/i2c/chips/via686a)0
-rw-r--r--Documentation/hwmon/w83627hf (renamed from Documentation/i2c/chips/w83627hf)0
-rw-r--r--Documentation/hwmon/w83781d (renamed from Documentation/i2c/chips/w83781d)0
-rw-r--r--Documentation/hwmon/w83792d174
-rw-r--r--Documentation/hwmon/w83l785ts (renamed from Documentation/i2c/chips/w83l785ts)0
-rw-r--r--Documentation/i2c/chips/max6875104
-rw-r--r--Documentation/i2c/dev-interface15
-rw-r--r--Documentation/i2c/functionality2
-rw-r--r--Documentation/i2c/porting-clients25
-rw-r--r--Documentation/i2c/writing-clients121
-rw-r--r--Documentation/infiniband/core_locking.txt114
-rw-r--r--Documentation/infiniband/user_mad.txt53
-rw-r--r--Documentation/kernel-parameters.txt7
-rw-r--r--Documentation/kprobes.txt588
-rw-r--r--Documentation/networking/README.ipw2100246
-rw-r--r--Documentation/networking/README.ipw2200300
-rw-r--r--Documentation/networking/bonding.txt978
-rw-r--r--Documentation/networking/cxgb.txt352
-rw-r--r--Documentation/networking/phy.txt288
-rw-r--r--Documentation/pci.txt14
-rw-r--r--Documentation/pcmcia/driver-changes.txt18
-rw-r--r--Documentation/power/swsusp-dmcrypt.txt138
-rw-r--r--Documentation/power/swsusp.txt7
-rw-r--r--Documentation/power/video.txt9
-rw-r--r--Documentation/scsi/scsi_mid_low_api.txt15
-rw-r--r--Documentation/serial/driver15
-rw-r--r--Documentation/sound/alsa/ALSA-Configuration.txt45
-rw-r--r--Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl15
-rw-r--r--Documentation/stable_api_nonsense.txt2
-rw-r--r--Documentation/stable_kernel_rules.txt58
-rw-r--r--Documentation/usb/sn9c102.txt4
-rw-r--r--Documentation/usb/usbmon.txt29
-rw-r--r--Documentation/video4linux/CARDLIST.bttv2
-rw-r--r--Documentation/video4linux/CARDLIST.cx883
-rw-r--r--Documentation/video4linux/CARDLIST.saa713414
-rw-r--r--Documentation/video4linux/CARDLIST.tuner6
-rw-r--r--Documentation/video4linux/bttv/Cards74
-rw-r--r--Documentation/video4linux/bttv/Insmod-options3
-rw-r--r--Documentation/video4linux/not-in-cx2388x-datasheet.txt4
-rw-r--r--Documentation/vm/locking15
-rw-r--r--Documentation/watchdog/watchdog-api.txt20
-rw-r--r--Documentation/x86_64/boot-options.txt15
81 files changed, 3670 insertions, 617 deletions
diff --git a/Documentation/Changes b/Documentation/Changes
index dfec7569d450..5eaab0441d76 100644
--- a/Documentation/Changes
+++ b/Documentation/Changes
@@ -65,6 +65,7 @@ o isdn4k-utils 3.1pre1 # isdnctrl 2>&1|grep version
o nfs-utils 1.0.5 # showmount --version
o procps 3.2.0 # ps --version
o oprofile 0.9 # oprofiled --version
+o udev 058 # udevinfo -V
Kernel compilation
==================
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 6761a7b241a5..7f43b040311e 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -149,6 +149,11 @@ USB, framebuffer devices, the VFS, the SCSI subsystem, etc. See the
MAINTAINERS file for a mailing list that relates specifically to
your change.
+If changes affect userland-kernel interfaces, please send
+the MAN-PAGES maintainer (as listed in the MAINTAINERS file)
+a man-pages patch, or at least a notification of the change,
+so that some information makes its way into the manual pages.
+
Even if the maintainer did not respond in step #4, make sure to ALWAYS
copy the maintainer when you change their code.
diff --git a/Documentation/acpi-hotkey.txt b/Documentation/acpi-hotkey.txt
new file mode 100644
index 000000000000..0acdc80c30c2
--- /dev/null
+++ b/Documentation/acpi-hotkey.txt
@@ -0,0 +1,38 @@
+driver/acpi/hotkey.c implement:
+1. /proc/acpi/hotkey/event_config
+(event based hotkey or event config interface):
+a. add a event based hotkey(event) :
+echo "0:bus::action:method:num:num" > event_config
+
+b. delete a event based hotkey(event):
+echo "1:::::num:num" > event_config
+
+c. modify a event based hotkey(event):
+echo "2:bus::action:method:num:num" > event_config
+
+2. /proc/acpi/hotkey/poll_config
+(polling based hotkey or event config interface):
+a.add a polling based hotkey(event) :
+echo "0:bus:method:action:method:num" > poll_config
+this adding command will create a proc file
+/proc/acpi/hotkey/method, which is used to get
+result of polling.
+
+b.delete a polling based hotkey(event):
+echo "1:::::num" > event_config
+
+c.modify a polling based hotkey(event):
+echo "2:bus:method:action:method:num" > poll_config
+
+3./proc/acpi/hotkey/action
+(interface to call aml method associated with a
+specific hotkey(event))
+echo "event_num:event_type:event_argument" >
+ /proc/acpi/hotkey/action.
+The result of the execution of this aml method is
+attached to /proc/acpi/hotkey/poll_method, which is dnyamically
+created. Please use command "cat /proc/acpi/hotkey/polling_method"
+to retrieve it.
+
+Note: Use cmdline "acpi_generic_hotkey" to over-ride
+loading any platform specific drivers.
diff --git a/Documentation/arm/Samsung-S3C24XX/USB-Host.txt b/Documentation/arm/Samsung-S3C24XX/USB-Host.txt
new file mode 100644
index 000000000000..b93b68e2b143
--- /dev/null
+++ b/Documentation/arm/Samsung-S3C24XX/USB-Host.txt
@@ -0,0 +1,93 @@
+ S3C24XX USB Host support
+ ========================
+
+
+
+Introduction
+------------
+
+ This document details the S3C2410/S3C2440 in-built OHCI USB host support.
+
+Configuration
+-------------
+
+ Enable at least the following kernel options:
+
+ menuconfig:
+
+ Device Drivers --->
+ USB support --->
+ <*> Support for Host-side USB
+ <*> OHCI HCD support
+
+
+ .config:
+ CONFIG_USB
+ CONFIG_USB_OHCI_HCD
+
+
+ Once these options are configured, the standard set of USB device
+ drivers can be configured and used.
+
+
+Board Support
+-------------
+
+ The driver attaches to a platform device, which will need to be
+ added by the board specific support file in linux/arch/arm/mach-s3c2410,
+ such as mach-bast.c or mach-smdk2410.c
+
+ The platform device's platform_data field is only needed if the
+ board implements extra power control or over-current monitoring.
+
+ The OHCI driver does not ensure the state of the S3C2410's MISCCTRL
+ register, so if both ports are to be used for the host, then it is
+ the board support file's responsibility to ensure that the second
+ port is configured to be connected to the OHCI core.
+
+
+Platform Data
+-------------
+
+ See linux/include/asm-arm/arch-s3c2410/usb-control.h for the
+ descriptions of the platform device data. An implementation
+ can be found in linux/arch/arm/mach-s3c2410/usb-simtec.c .
+
+ The `struct s3c2410_hcd_info` contains a pair of functions
+ that get called to enable over-current detection, and to
+ control the port power status.
+
+ The ports are numbered 0 and 1.
+
+ power_control:
+
+ Called to enable or disable the power on the port.
+
+ enable_oc:
+
+ Called to enable or disable the over-current monitoring.
+ This should claim or release the resources being used to
+ check the power condition on the port, such as an IRQ.
+
+ report_oc:
+
+ The OHCI driver fills this field in for the over-current code
+ to call when there is a change to the over-current state on
+ an port. The ports argument is a bitmask of 1 bit per port,
+ with bit X being 1 for an over-current on port X.
+
+ The function s3c2410_usb_report_oc() has been provided to
+ ensure this is called correctly.
+
+ port[x]:
+
+ This is struct describes each port, 0 or 1. The platform driver
+ should set the flags field of each port to S3C_HCDFLG_USED if
+ the port is enabled.
+
+
+
+Document Author
+---------------
+
+Ben Dooks, (c) 2005 Simtec Electronics
diff --git a/Documentation/crypto/api-intro.txt b/Documentation/crypto/api-intro.txt
index a2d5b4900772..74dffc68ff9f 100644
--- a/Documentation/crypto/api-intro.txt
+++ b/Documentation/crypto/api-intro.txt
@@ -223,6 +223,7 @@ CAST5 algorithm contributors:
TEA/XTEA algorithm contributors:
Aaron Grothe
+ Michael Ringe
Khazad algorithm contributors:
Aaron Grothe
diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index d4fda25db868..96bea278bbf6 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -41,6 +41,7 @@ COPYING
CREDITS
CVS
ChangeSet
+Image
Kerntypes
MODS.txt
Module.symvers
@@ -103,6 +104,8 @@ logo_*.c
logo_*_clut224.c
logo_*_mono.c
lxdialog
+mach-types
+mach-types.h
make_times_h
map
maui_boot.h
diff --git a/Documentation/fb/vesafb.txt b/Documentation/fb/vesafb.txt
index 814e2f56a6ad..62db6758d1c1 100644
--- a/Documentation/fb/vesafb.txt
+++ b/Documentation/fb/vesafb.txt
@@ -144,7 +144,21 @@ vgapal Use the standard vga registers for palette changes.
This is the default.
pmipal Use the protected mode interface for palette changes.
-mtrr setup memory type range registers for the vesafb framebuffer.
+mtrr:n setup memory type range registers for the vesafb framebuffer
+ where n:
+ 0 - disabled (equivalent to nomtrr)
+ 1 - uncachable
+ 2 - write-back
+ 3 - write-combining (default)
+ 4 - write-through
+
+ If you see the following in dmesg, choose the type that matches the
+ old one. In this example, use "mtrr:2".
+...
+mtrr: type mismatch for e0000000,8000000 old: write-back new: write-combining
+...
+
+nomtrr disable mtrr
vremap:n
remap 'n' MiB of video RAM. If 0 or not specified, remap memory
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 12dde43fe657..363909056e46 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -102,16 +102,6 @@ Who: Jody McIntyre <scjody@steamballoon.com>
---------------------------
-What: register_serial/unregister_serial
-When: December 2005
-Why: This interface does not allow serial ports to be registered against
- a struct device, and as such does not allow correct power management
- of such ports. 8250-based ports should use serial8250_register_port
- and serial8250_unregister_port instead.
-Who: Russell King <rmk@arm.linux.org.uk>
-
----------------------------
-
What: i2c sysfs name change: in1_ref, vid deprecated in favour of cpu0_vid
When: November 2005
Files: drivers/i2c/chips/adm1025.c, drivers/i2c/chips/adm1026.c
@@ -135,3 +125,15 @@ Why: With the 16-bit PCMCIA subsystem now behaving (almost) like a
pcmciautils package available at
http://kernel.org/pub/linux/utils/kernel/pcmcia/
Who: Dominik Brodowski <linux@brodo.de>
+
+---------------------------
+
+What: ip_queue and ip6_queue (old ipv4-only and ipv6-only netfilter queue)
+When: December 2005
+Why: This interface has been obsoleted by the new layer3-independent
+ "nfnetlink_queue". The Kernel interface is compatible, so the old
+ ip[6]tables "QUEUE" targets still work and will transparently handle
+ all packets into nfnetlink queue number 0. Userspace users will have
+ to link against API-compatible library on top of libnfnetlink_queue
+ instead of the current 'libipq'.
+Who: Harald Welte <laforge@netfilter.org>
diff --git a/Documentation/filesystems/inotify.txt b/Documentation/filesystems/inotify.txt
new file mode 100644
index 000000000000..6d501903f68e
--- /dev/null
+++ b/Documentation/filesystems/inotify.txt
@@ -0,0 +1,151 @@
+ inotify
+ a powerful yet simple file change notification system
+
+
+
+Document started 15 Mar 2005 by Robert Love <rml@novell.com>
+
+
+(i) User Interface
+
+Inotify is controlled by a set of three system calls and normal file I/O on a
+returned file descriptor.
+
+First step in using inotify is to initialise an inotify instance:
+
+ int fd = inotify_init ();
+
+Each instance is associated with a unique, ordered queue.
+
+Change events are managed by "watches". A watch is an (object,mask) pair where
+the object is a file or directory and the mask is a bit mask of one or more
+inotify events that the application wishes to receive. See <linux/inotify.h>
+for valid events. A watch is referenced by a watch descriptor, or wd.
+
+Watches are added via a path to the file.
+
+Watches on a directory will return events on any files inside of the directory.
+
+Adding a watch is simple:
+
+ int wd = inotify_add_watch (fd, path, mask);
+
+Where "fd" is the return value from inotify_init(), path is the path to the
+object to watch, and mask is the watch mask (see <linux/inotify.h>).
+
+You can update an existing watch in the same manner, by passing in a new mask.
+
+An existing watch is removed via
+
+ int ret = inotify_rm_watch (fd, wd);
+
+Events are provided in the form of an inotify_event structure that is read(2)
+from a given inotify instance. The filename is of dynamic length and follows
+the struct. It is of size len. The filename is padded with null bytes to
+ensure proper alignment. This padding is reflected in len.
+
+You can slurp multiple events by passing a large buffer, for example
+
+ size_t len = read (fd, buf, BUF_LEN);
+
+Where "buf" is a pointer to an array of "inotify_event" structures at least
+BUF_LEN bytes in size. The above example will return as many events as are
+available and fit in BUF_LEN.
+
+Each inotify instance fd is also select()- and poll()-able.
+
+You can find the size of the current event queue via the standard FIONREAD
+ioctl on the fd returned by inotify_init().
+
+All watches are destroyed and cleaned up on close.
+
+
+(ii)
+
+Prototypes:
+
+ int inotify_init (void);
+ int inotify_add_watch (int fd, const char *path, __u32 mask);
+ int inotify_rm_watch (int fd, __u32 mask);
+
+
+(iii) Internal Kernel Implementation
+
+Each inotify instance is associated with an inotify_device structure.
+
+Each watch is associated with an inotify_watch structure. Watches are chained
+off of each associated device and each associated inode.
+
+See fs/inotify.c for the locking and lifetime rules.
+
+
+(iv) Rationale
+
+Q: What is the design decision behind not tying the watch to the open fd of
+ the watched object?
+
+A: Watches are associated with an open inotify device, not an open file.
+ This solves the primary problem with dnotify: keeping the file open pins
+ the file and thus, worse, pins the mount. Dnotify is therefore infeasible
+ for use on a desktop system with removable media as the media cannot be
+ unmounted. Watching a file should not require that it be open.
+
+Q: What is the design decision behind using an-fd-per-instance as opposed to
+ an fd-per-watch?
+
+A: An fd-per-watch quickly consumes more file descriptors than are allowed,
+ more fd's than are feasible to manage, and more fd's than are optimally
+ select()-able. Yes, root can bump the per-process fd limit and yes, users
+ can use epoll, but requiring both is a silly and extraneous requirement.
+ A watch consumes less memory than an open file, separating the number
+ spaces is thus sensible. The current design is what user-space developers
+ want: Users initialize inotify, once, and add n watches, requiring but one
+ fd and no twiddling with fd limits. Initializing an inotify instance two
+ thousand times is silly. If we can implement user-space's preferences
+ cleanly--and we can, the idr layer makes stuff like this trivial--then we
+ should.
+
+ There are other good arguments. With a single fd, there is a single
+ item to block on, which is mapped to a single queue of events. The single
+ fd returns all watch events and also any potential out-of-band data. If
+ every fd was a separate watch,
+
+ - There would be no way to get event ordering. Events on file foo and
+ file bar would pop poll() on both fd's, but there would be no way to tell
+ which happened first. A single queue trivially gives you ordering. Such
+ ordering is crucial to existing applications such as Beagle. Imagine
+ "mv a b ; mv b a" events without ordering.
+
+ - We'd have to maintain n fd's and n internal queues with state,
+ versus just one. It is a lot messier in the kernel. A single, linear
+ queue is the data structure that makes sense.
+
+ - User-space developers prefer the current API. The Beagle guys, for
+ example, love it. Trust me, I asked. It is not a surprise: Who'd want
+ to manage and block on 1000 fd's via select?
+
+ - No way to get out of band data.
+
+ - 1024 is still too low. ;-)
+
+ When you talk about designing a file change notification system that
+ scales to 1000s of directories, juggling 1000s of fd's just does not seem
+ the right interface. It is too heavy.
+
+ Additionally, it _is_ possible to more than one instance and
+ juggle more than one queue and thus more than one associated fd. There
+ need not be a one-fd-per-process mapping; it is one-fd-per-queue and a
+ process can easily want more than one queue.
+
+Q: Why the system call approach?
+
+A: The poor user-space interface is the second biggest problem with dnotify.
+ Signals are a terrible, terrible interface for file notification. Or for
+ anything, for that matter. The ideal solution, from all perspectives, is a
+ file descriptor-based one that allows basic file I/O and poll/select.
+ Obtaining the fd and managing the watches could have been done either via a
+ device file or a family of new system calls. We decided to implement a
+ family of system calls because that is the preffered approach for new kernel
+ interfaces. The only real difference was whether we wanted to use open(2)
+ and ioctl(2) or a couple of new system calls. System calls beat ioctls.
+
diff --git a/Documentation/filesystems/ntfs.txt b/Documentation/filesystems/ntfs.txt
index f89b440fad1d..eef4aca0c753 100644
--- a/Documentation/filesystems/ntfs.txt
+++ b/Documentation/filesystems/ntfs.txt
@@ -21,7 +21,7 @@ Overview
========
Linux-NTFS comes with a number of user-space programs known as ntfsprogs.
-These include mkntfs, a full-featured ntfs file system format utility,
+These include mkntfs, a full-featured ntfs filesystem format utility,
ntfsundelete used for recovering files that were unintentionally deleted
from an NTFS volume and ntfsresize which is used to resize an NTFS partition.
See the web site for more information.
@@ -149,7 +149,14 @@ case_sensitive=<BOOL> If case_sensitive is specified, treat all file names as
name, if it exists. If case_sensitive, you will need
to provide the correct case of the short file name.
-errors=opt What to do when critical file system errors are found.
+disable_sparse=<BOOL> If disable_sparse is specified, creation of sparse
+ regions, i.e. holes, inside files is disabled for the
+ volume (for the duration of this mount only). By
+ default, creation of sparse regions is enabled, which
+ is consistent with the behaviour of traditional Unix
+ filesystems.
+
+errors=opt What to do when critical filesystem errors are found.
Following values can be used for "opt":
continue: DEFAULT, try to clean-up as much as
possible, e.g. marking a corrupt inode as
@@ -432,6 +439,24 @@ ChangeLog
Note, a technical ChangeLog aimed at kernel hackers is in fs/ntfs/ChangeLog.
+2.1.23:
+ - Stamp the user space journal, aka transaction log, aka $UsnJrnl, if
+ it is present and active thus telling Windows and applications using
+ the transaction log that changes can have happened on the volume
+ which are not recorded in $UsnJrnl.
+ - Detect the case when Windows has been hibernated (suspended to disk)
+ and if this is the case do not allow (re)mounting read-write to
+ prevent data corruption when you boot back into the suspended
+ Windows session.
+ - Implement extension of resident files using the normal file write
+ code paths, i.e. most very small files can be extended to be a little
+ bit bigger but not by much.
+ - Add new mount option "disable_sparse". (See list of mount options
+ above for details.)
+ - Improve handling of ntfs volumes with errors and strange boot sectors
+ in particular.
+ - Fix various bugs including a nasty deadlock that appeared in recent
+ kernels (around 2.6.11-2.6.12 timeframe).
2.1.22:
- Improve handling of ntfs volumes with errors.
- Fix various bugs and race conditions.
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 6c98f2bd421e..5024ba7a592c 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -133,6 +133,7 @@ Table 1-1: Process specific entries in /proc
statm Process memory status information
status Process status in human readable form
wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan
+ smaps Extension based on maps, presenting the rss size for each mapped file
..............................................................................
For example, to get the status information of a process, all you have to do is
diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt
index dc276598a65a..c8bce82ddcac 100644
--- a/Documentation/filesystems/sysfs.txt
+++ b/Documentation/filesystems/sysfs.txt
@@ -90,7 +90,7 @@ void device_remove_file(struct device *, struct device_attribute *);
It also defines this helper for defining device attributes:
-#define DEVICE_ATTR(_name,_mode,_show,_store) \
+#define DEVICE_ATTR(_name, _mode, _show, _store) \
struct device_attribute dev_attr_##_name = { \
.attr = {.name = __stringify(_name) , .mode = _mode }, \
.show = _show, \
@@ -99,14 +99,14 @@ struct device_attribute dev_attr_##_name = { \
For example, declaring
-static DEVICE_ATTR(foo,0644,show_foo,store_foo);
+static DEVICE_ATTR(foo, S_IWUSR | S_IRUGO, show_foo, store_foo);
is equivalent to doing:
static struct device_attribute dev_attr_foo = {
.attr = {
.name = "foo",
- .mode = 0644,
+ .mode = S_IWUSR | S_IRUGO,
},
.show = show_foo,
.store = store_foo,
@@ -121,8 +121,8 @@ set of sysfs operations for forwarding read and write calls to the
show and store methods of the attribute owners.
struct sysfs_ops {
- ssize_t (*show)(struct kobject *, struct attribute *,char *);
- ssize_t (*store)(struct kobject *,struct attribute *,const char *);
+ ssize_t (*show)(struct kobject *, struct attribute *, char *);
+ ssize_t (*store)(struct kobject *, struct attribute *, const char *);
};
[ Subsystems should have already defined a struct kobj_type as a
@@ -137,7 +137,7 @@ calls the associated methods.
To illustrate:
-#define to_dev_attr(_attr) container_of(_attr,struct device_attribute,attr)
+#define to_dev_attr(_attr) container_of(_attr, struct device_attribute, attr)
#define to_dev(d) container_of(d, struct device, kobj)
static ssize_t
@@ -148,7 +148,7 @@ dev_attr_show(struct kobject * kobj, struct attribute * attr, char * buf)
ssize_t ret = 0;
if (dev_attr->show)
- ret = dev_attr->show(dev,buf);
+ ret = dev_attr->show(dev, buf);
return ret;
}
@@ -216,16 +216,16 @@ A very simple (and naive) implementation of a device attribute is:
static ssize_t show_name(struct device *dev, struct device_attribute *attr, char *buf)
{
- return sprintf(buf,"%s\n",dev->name);
+ return snprintf(buf, PAGE_SIZE, "%s\n", dev->name);
}
static ssize_t store_name(struct device * dev, const char * buf)
{
- sscanf(buf,"%20s",dev->name);
- return strlen(buf);
+ sscanf(buf, "%20s", dev->name);
+ return strnlen(buf, PAGE_SIZE);
}
-static DEVICE_ATTR(name,S_IRUGO,show_name,store_name);
+static DEVICE_ATTR(name, S_IRUGO, show_name, store_name);
(Note that the real implementation doesn't allow userspace to set the
@@ -290,7 +290,7 @@ struct device_attribute {
Declaring:
-DEVICE_ATTR(_name,_str,_mode,_show,_store);
+DEVICE_ATTR(_name, _str, _mode, _show, _store);
Creation/Removal:
@@ -310,7 +310,7 @@ struct bus_attribute {
Declaring:
-BUS_ATTR(_name,_mode,_show,_store)
+BUS_ATTR(_name, _mode, _show, _store)
Creation/Removal:
@@ -331,7 +331,7 @@ struct driver_attribute {
Declaring:
-DRIVER_ATTR(_name,_mode,_show,_store)
+DRIVER_ATTR(_name, _mode, _show, _store)
Creation/Removal:
diff --git a/Documentation/i2c/chips/adm1021 b/Documentation/hwmon/adm1021
index 03d02bfb3df1..03d02bfb3df1 100644
--- a/Documentation/i2c/chips/adm1021
+++ b/Documentation/hwmon/adm1021
diff --git a/Documentation/i2c/chips/adm1025 b/Documentation/hwmon/adm1025
index 39d2b781b5d6..39d2b781b5d6 100644
--- a/Documentation/i2c/chips/adm1025
+++ b/Documentation/hwmon/adm1025
diff --git a/Documentation/i2c/chips/adm1026 b/Documentation/hwmon/adm1026
index 473c689d7924..473c689d7924 100644
--- a/Documentation/i2c/chips/adm1026
+++ b/Documentation/hwmon/adm1026
diff --git a/Documentation/i2c/chips/adm1031 b/Documentation/hwmon/adm1031
index 130a38382b98..130a38382b98 100644
--- a/Documentation/i2c/chips/adm1031
+++ b/Documentation/hwmon/adm1031
diff --git a/Documentation/i2c/chips/adm9240 b/Documentation/hwmon/adm9240
index 35f618f32896..35f618f32896 100644
--- a/Documentation/i2c/chips/adm9240
+++ b/Documentation/hwmon/adm9240
diff --git a/Documentation/i2c/chips/asb100 b/Documentation/hwmon/asb100
index ab7365e139be..ab7365e139be 100644
--- a/Documentation/i2c/chips/asb100
+++ b/Documentation/hwmon/asb100
diff --git a/Documentation/i2c/chips/ds1621 b/Documentation/hwmon/ds1621
index 1fee6f1e6bc5..1fee6f1e6bc5 100644
--- a/Documentation/i2c/chips/ds1621
+++ b/Documentation/hwmon/ds1621
diff --git a/Documentation/i2c/chips/fscher b/Documentation/hwmon/fscher
index 64031659aff3..64031659aff3 100644
--- a/Documentation/i2c/chips/fscher
+++ b/Documentation/hwmon/fscher
diff --git a/Documentation/i2c/chips/gl518sm b/Documentation/hwmon/gl518sm
index ce0881883bca..ce0881883bca 100644
--- a/Documentation/i2c/chips/gl518sm
+++ b/Documentation/hwmon/gl518sm
diff --git a/Documentation/i2c/chips/it87 b/Documentation/hwmon/it87
index 0d0195040d88..0d0195040d88 100644
--- a/Documentation/i2c/chips/it87
+++ b/Documentation/hwmon/it87
diff --git a/Documentation/i2c/chips/lm63 b/Documentation/hwmon/lm63
index 31660bf97979..31660bf97979 100644
--- a/Documentation/i2c/chips/lm63
+++ b/Documentation/hwmon/lm63
diff --git a/Documentation/i2c/chips/lm75 b/Documentation/hwmon/lm75
index 8e6356fe05d7..8e6356fe05d7 100644
--- a/Documentation/i2c/chips/lm75
+++ b/Documentation/hwmon/lm75
diff --git a/Documentation/i2c/chips/lm77 b/Documentation/hwmon/lm77
index 57c3a46d6370..57c3a46d6370 100644
--- a/Documentation/i2c/chips/lm77
+++ b/Documentation/hwmon/lm77
diff --git a/Documentation/i2c/chips/lm78 b/Documentation/hwmon/lm78
index 357086ed7f64..fd5dc7a19f0e 100644
--- a/Documentation/i2c/chips/lm78
+++ b/Documentation/hwmon/lm78
@@ -2,16 +2,11 @@ Kernel driver lm78
==================
Supported chips:
- * National Semiconductor LM78
+ * National Semiconductor LM78 / LM78-J
Prefix: 'lm78'
Addresses scanned: I2C 0x20 - 0x2f, ISA 0x290 (8 I/O ports)
Datasheet: Publicly available at the National Semiconductor website
http://www.national.com/
- * National Semiconductor LM78-J
- Prefix: 'lm78-j'
- Addresses scanned: I2C 0x20 - 0x2f, ISA 0x290 (8 I/O ports)
- Datasheet: Publicly available at the National Semiconductor website
- http://www.national.com/
* National Semiconductor LM79
Prefix: 'lm79'
Addresses scanned: I2C 0x20 - 0x2f, ISA 0x290 (8 I/O ports)
diff --git a/Documentation/i2c/chips/lm80 b/Documentation/hwmon/lm80
index cb5b407ba3e6..cb5b407ba3e6 100644
--- a/Documentation/i2c/chips/lm80
+++ b/Documentation/hwmon/lm80
diff --git a/Documentation/i2c/chips/lm83 b/Documentation/hwmon/lm83
index 061d9ed8ff43..061d9ed8ff43 100644
--- a/Documentation/i2c/chips/lm83
+++ b/Documentation/hwmon/lm83
diff --git a/Documentation/i2c/chips/lm85 b/Documentation/hwmon/lm85
index 9549237530cf..9549237530cf 100644
--- a/Documentation/i2c/chips/lm85
+++ b/Documentation/hwmon/lm85
diff --git a/Documentation/i2c/chips/lm87 b/Documentation/hwmon/lm87
index c952c57f0e11..c952c57f0e11 100644
--- a/Documentation/i2c/chips/lm87
+++ b/Documentation/hwmon/lm87
diff --git a/Documentation/i2c/chips/lm90 b/Documentation/hwmon/lm90
index 2c4cf39471f4..2c4cf39471f4 100644
--- a/Documentation/i2c/chips/lm90
+++ b/Documentation/hwmon/lm90
diff --git a/Documentation/i2c/chips/lm92 b/Documentation/hwmon/lm92
index 7705bfaa0708..7705bfaa0708 100644
--- a/Documentation/i2c/chips/lm92
+++ b/Documentation/hwmon/lm92
diff --git a/Documentation/i2c/chips/max1619 b/Documentation/hwmon/max1619
index d6f8d9cd7d7f..d6f8d9cd7d7f 100644
--- a/Documentation/i2c/chips/max1619
+++ b/Documentation/hwmon/max1619
diff --git a/Documentation/i2c/chips/pc87360 b/Documentation/hwmon/pc87360
index 89a8fcfa78df..89a8fcfa78df 100644
--- a/Documentation/i2c/chips/pc87360
+++ b/Documentation/hwmon/pc87360
diff --git a/Documentation/i2c/chips/sis5595 b/Documentation/hwmon/sis5595
index b7ae36b8cdf5..b7ae36b8cdf5 100644
--- a/Documentation/i2c/chips/sis5595
+++ b/Documentation/hwmon/sis5595
diff --git a/Documentation/i2c/chips/smsc47b397 b/Documentation/hwmon/smsc47b397
index da9d80c96432..da9d80c96432 100644
--- a/Documentation/i2c/chips/smsc47b397
+++ b/Documentation/hwmon/smsc47b397
diff --git a/Documentation/i2c/chips/smsc47m1 b/Documentation/hwmon/smsc47m1
index 34e6478c1425..34e6478c1425 100644
--- a/Documentation/i2c/chips/smsc47m1
+++ b/Documentation/hwmon/smsc47m1
diff --git a/Documentation/i2c/sysfs-interface b/Documentation/hwmon/sysfs-interface
index 346400519d0d..346400519d0d 100644
--- a/Documentation/i2c/sysfs-interface
+++ b/Documentation/hwmon/sysfs-interface
diff --git a/Documentation/i2c/userspace-tools b/Documentation/hwmon/userspace-tools
index 2622aac65422..2622aac65422 100644
--- a/Documentation/i2c/userspace-tools
+++ b/Documentation/hwmon/userspace-tools
diff --git a/Documentation/i2c/chips/via686a b/Documentation/hwmon/via686a
index b82014cb7c53..b82014cb7c53 100644
--- a/Documentation/i2c/chips/via686a
+++ b/Documentation/hwmon/via686a
diff --git a/Documentation/i2c/chips/w83627hf b/Documentation/hwmon/w83627hf
index 78f37c2d602e..78f37c2d602e 100644
--- a/Documentation/i2c/chips/w83627hf
+++ b/Documentation/hwmon/w83627hf
diff --git a/Documentation/i2c/chips/w83781d b/Documentation/hwmon/w83781d
index e5459333ba68..e5459333ba68 100644
--- a/Documentation/i2c/chips/w83781d
+++ b/Documentation/hwmon/w83781d
diff --git a/Documentation/hwmon/w83792d b/Documentation/hwmon/w83792d
new file mode 100644
index 000000000000..8171c285bb55
--- /dev/null
+++ b/Documentation/hwmon/w83792d
@@ -0,0 +1,174 @@
+Kernel driver w83792d
+=====================
+
+Supported chips:
+ * Winbond W83792D
+ Prefix: 'w83792d'
+ Addresses scanned: I2C 0x2c - 0x2f
+ Datasheet: http://www.winbond.com.tw/E-WINBONDHTM/partner/PDFresult.asp?Pname=1035
+
+Author: Chunhao Huang
+Contact: DZShen <DZShen@Winbond.com.tw>
+
+
+Module Parameters
+-----------------
+
+* init int
+ (default 1)
+ Use 'init=0' to bypass initializing the chip.
+ Try this if your computer crashes when you load the module.
+
+* force_subclients=bus,caddr,saddr,saddr
+ This is used to force the i2c addresses for subclients of
+ a certain chip. Example usage is `force_subclients=0,0x2f,0x4a,0x4b'
+ to force the subclients of chip 0x2f on bus 0 to i2c addresses
+ 0x4a and 0x4b.
+
+
+Description
+-----------
+
+This driver implements support for the Winbond W83792AD/D.
+
+Detection of the chip can sometimes be foiled because it can be in an
+internal state that allows no clean access (Bank with ID register is not
+currently selected). If you know the address of the chip, use a 'force'
+parameter; this will put it into a more well-behaved state first.
+
+The driver implements three temperature sensors, seven fan rotation speed
+sensors, nine voltage sensors, and two automatic fan regulation
+strategies called: Smart Fan I (Thermal Cruise mode) and Smart Fan II.
+Automatic fan control mode is possible only for fan1-fan3. Fan4-fan7 can run
+synchronized with selected fan (fan1-fan3). This functionality and manual PWM
+control for fan4-fan7 is not yet implemented.
+
+Temperatures are measured in degrees Celsius and measurement resolution is 1
+degC for temp1 and 0.5 degC for temp2 and temp3. An alarm is triggered when
+the temperature gets higher than the Overtemperature Shutdown value; it stays
+on until the temperature falls below the Hysteresis value.
+
+Fan rotation speeds are reported in RPM (rotations per minute). An alarm is
+triggered if the rotation speed has dropped below a programmable limit. Fan
+readings can be divided by a programmable divider (1, 2, 4, 8, 16, 32, 64 or
+128) to give the readings more range or accuracy.
+
+Voltage sensors (also known as IN sensors) report their values in millivolts.
+An alarm is triggered if the voltage has crossed a programmable minimum
+or maximum limit.
+
+Alarms are provided as output from "realtime status register". Following bits
+are defined:
+
+bit - alarm on:
+0 - in0
+1 - in1
+2 - temp1
+3 - temp2
+4 - temp3
+5 - fan1
+6 - fan2
+7 - fan3
+8 - in2
+9 - in3
+10 - in4
+11 - in5
+12 - in6
+13 - VID change
+14 - chassis
+15 - fan7
+16 - tart1
+17 - tart2
+18 - tart3
+19 - in7
+20 - in8
+21 - fan4
+22 - fan5
+23 - fan6
+
+Tart will be asserted while target temperature cannot be achieved after 3 minutes
+of full speed rotation of corresponding fan.
+
+In addition to the alarms described above, there is a CHAS alarm on the chips
+which triggers if your computer case is open (This one is latched, contrary
+to realtime alarms).
+
+The chips only update values each 3 seconds; reading them more often will
+do no harm, but will return 'old' values.
+
+
+W83792D PROBLEMS
+----------------
+Known problems:
+ - This driver is only for Winbond W83792D C version device, there
+ are also some motherboards with B version W83792D device. The
+ calculation method to in6-in7(measured value, limits) is a little
+ different between C and B version. C or B version can be identified
+ by CR[0x49h].
+ - The function of vid and vrm has not been finished, because I'm NOT
+ very familiar with them. Adding support is welcome.
+  - The function of chassis open detection needs more tests.
+ - If you have ASUS server board and chip was not found: Then you will
+ need to upgrade to latest (or beta) BIOS. If it does not help please
+ contact us.
+
+Fan control
+-----------
+
+Manual mode
+-----------
+
+Works as expected. You just need to specify desired PWM/DC value (fan speed)
+in appropriate pwm# file.
+
+Thermal cruise
+--------------
+
+In this mode, W83792D provides the Smart Fan system to automatically control
+fan speed to keep the temperatures of CPU and the system within specific
+range. At first a wanted temperature and interval must be set. This is done
+via thermal_cruise# file. The tolerance# file serves to create T +- tolerance
+interval. The fan speed will be lowered as long as the current temperature
+remains below the thermal_cruise# +- tolerance# value. Once the temperature
+exceeds the high limit (T+tolerance), the fan will be turned on with a
+specific speed set by pwm# and automatically controlled its PWM duty cycle
+with the temperature varying. Three conditions may occur:
+
+(1) If the temperature still exceeds the high limit, PWM duty
+cycle will increase slowly.
+
+(2) If the temperature goes below the high limit, but still above the low
+limit (T-tolerance), the fan speed will be fixed at the current speed because
+the temperature is in the target range.
+
+(3) If the temperature goes below the low limit, PWM duty cycle will decrease
+slowly to 0 or a preset stop value until the temperature exceeds the low
+limit. (The preset stop value handling is not yet implemented in driver)
+
+Smart Fan II
+------------
+
+W83792D also provides a special mode for fan. Four temperature points are
+available. When related temperature sensors detects the temperature in preset
+temperature region (sf2_point@_fan# +- tolerance#) it will cause fans to run
+on programmed value from sf2_level@_fan#. You need to set four temperatures
+for each fan.
+
+
+/sys files
+----------
+
+pwm[1-3] - this file stores PWM duty cycle or DC value (fan speed) in range:
+ 0 (stop) to 255 (full)
+pwm[1-3]_enable - this file controls mode of fan/temperature control:
+ * 0 Disabled
+ * 1 Manual mode
+ * 2 Smart Fan II
+ * 3 Thermal Cruise
+pwm[1-3]_mode - Select PWM of DC mode
+ * 0 DC
+ * 1 PWM
+thermal_cruise[1-3] - Selects the desired temperature for cruise (degC)
+tolerance[1-3] - Value in degrees of Celsius (degC) for +- T
+sf2_point[1-4]_fan[1-3] - four temperature points for each fan for Smart Fan II
+sf2_level[1-3]_fan[1-3] - three PWM/DC levels for each fan for Smart Fan II
diff --git a/Documentation/i2c/chips/w83l785ts b/Documentation/hwmon/w83l785ts
index 1841cedc25b2..1841cedc25b2 100644
--- a/Documentation/i2c/chips/w83l785ts
+++ b/Documentation/hwmon/w83l785ts
diff --git a/Documentation/i2c/chips/max6875 b/Documentation/i2c/chips/max6875
index b4fb49b41813..96fec562a8e9 100644
--- a/Documentation/i2c/chips/max6875
+++ b/Documentation/i2c/chips/max6875
@@ -2,53 +2,107 @@ Kernel driver max6875
=====================
Supported chips:
- * Maxim max6874, max6875
- Prefixes: 'max6875'
- Addresses scanned: 0x50, 0x52
- Datasheets:
+ * Maxim MAX6874, MAX6875
+ Prefix: 'max6875'
+ Addresses scanned: None (see below)
+ Datasheet:
http://pdfserv.maxim-ic.com/en/ds/MAX6874-MAX6875.pdf
Author: Ben Gardner <bgardner@wabtec.com>
-Module Parameters
------------------
-
-* allow_write int
- Set to non-zero to enable write permission:
- *0: Read only
- 1: Read and write
-
-
Description
-----------
-The MAXIM max6875 is a EEPROM-programmable power-supply sequencer/supervisor.
+The Maxim MAX6875 is an EEPROM-programmable power-supply sequencer/supervisor.
It provides timed outputs that can be used as a watchdog, if properly wired.
It also provides 512 bytes of user EEPROM.
-At reset, the max6875 reads the configuration eeprom into its configuration
+At reset, the MAX6875 reads the configuration EEPROM into its configuration
registers. The chip then begins to operate according to the values in the
registers.
-See the datasheet for details on how to program the EEPROM.
+The Maxim MAX6874 is a similar, mostly compatible device, with more intputs
+and outputs:
+ vin gpi vout
+MAX6874 6 4 8
+MAX6875 4 3 5
+
+See the datasheet for more information.
Sysfs entries
-------------
-eeprom_user - 512 bytes of user-defined EEPROM space. Only writable if
- allow_write was set and register 0x43 is 0.
-
-eeprom_config - 70 bytes of config EEPROM. Note that changes will not get
- loaded into register space until a power cycle or device reset.
-
-reg_config - 70 bytes of register space. Any changes take affect immediately.
+eeprom - 512 bytes of user-defined EEPROM space.
General Remarks
---------------
-A typical application will require that the EEPROMs be programmed once and
-never altered afterwards.
+Valid addresses for the MAX6875 are 0x50 and 0x52.
+Valid addresses for the MAX6874 are 0x50, 0x52, 0x54 and 0x56.
+The driver does not probe any address, so you must force the address.
+
+Example:
+$ modprobe max6875 force=0,0x50
+
+The MAX6874/MAX6875 ignores address bit 0, so this driver attaches to multiple
+addresses. For example, for address 0x50, it also reserves 0x51.
+The even-address instance is called 'max6875', the odd one is 'max6875 subclient'.
+
+
+Programming the chip using i2c-dev
+----------------------------------
+
+Use the i2c-dev interface to access and program the chips.
+Reads and writes are performed differently depending on the address range.
+
+The configuration registers are at addresses 0x00 - 0x45.
+Use i2c_smbus_write_byte_data() to write a register and
+i2c_smbus_read_byte_data() to read a register.
+The command is the register number.
+
+Examples:
+To write a 1 to register 0x45:
+ i2c_smbus_write_byte_data(fd, 0x45, 1);
+
+To read register 0x45:
+ value = i2c_smbus_read_byte_data(fd, 0x45);
+
+
+The configuration EEPROM is at addresses 0x8000 - 0x8045.
+The user EEPROM is at addresses 0x8100 - 0x82ff.
+
+Use i2c_smbus_write_word_data() to write a byte to EEPROM.
+
+The command is the upper byte of the address: 0x80, 0x81, or 0x82.
+The data word is the lower part of the address or'd with data << 8.
+ cmd = address >> 8;
+ val = (address & 0xff) | (data << 8);
+
+Example:
+To write 0x5a to address 0x8003:
+ i2c_smbus_write_word_data(fd, 0x80, 0x5a03);
+
+
+Reading data from the EEPROM is a little more complicated.
+Use i2c_smbus_write_byte_data() to set the read address and then
+i2c_smbus_read_byte() or i2c_smbus_read_i2c_block_data() to read the data.
+
+Example:
+To read data starting at offset 0x8100, first set the address:
+ i2c_smbus_write_byte_data(fd, 0x81, 0x00);
+
+And then read the data
+ value = i2c_smbus_read_byte(fd);
+
+ or
+
+ count = i2c_smbus_read_i2c_block_data(fd, 0x84, buffer);
+
+The block read should read 16 bytes.
+0x84 is the block read command.
+
+See the datasheet for more details.
diff --git a/Documentation/i2c/dev-interface b/Documentation/i2c/dev-interface
index 09d6cda2a1fb..b849ad636583 100644
--- a/Documentation/i2c/dev-interface
+++ b/Documentation/i2c/dev-interface
@@ -14,9 +14,12 @@ C example
=========
So let's say you want to access an i2c adapter from a C program. The
-first thing to do is `#include <linux/i2c.h>" and "#include <linux/i2c-dev.h>.
-Yes, I know, you should never include kernel header files, but until glibc
-knows about i2c, there is not much choice.
+first thing to do is "#include <linux/i2c-dev.h>". Please note that
+there are two files named "i2c-dev.h" out there, one is distributed
+with the Linux kernel and is meant to be included from kernel
+driver code, the other one is distributed with lm_sensors and is
+meant to be included from user-space programs. You obviously want
+the second one here.
Now, you have to decide which adapter you want to access. You should
inspect /sys/class/i2c-dev/ to decide this. Adapter numbers are assigned
@@ -78,7 +81,7 @@ Full interface description
==========================
The following IOCTLs are defined and fully supported
-(see also i2c-dev.h and i2c.h):
+(see also i2c-dev.h):
ioctl(file,I2C_SLAVE,long addr)
Change slave address. The address is passed in the 7 lower bits of the
@@ -97,10 +100,10 @@ ioctl(file,I2C_PEC,long select)
ioctl(file,I2C_FUNCS,unsigned long *funcs)
Gets the adapter functionality and puts it in *funcs.
-ioctl(file,I2C_RDWR,struct i2c_ioctl_rdwr_data *msgset)
+ioctl(file,I2C_RDWR,struct i2c_rdwr_ioctl_data *msgset)
Do combined read/write transaction without stop in between.
- The argument is a pointer to a struct i2c_ioctl_rdwr_data {
+ The argument is a pointer to a struct i2c_rdwr_ioctl_data {
struct i2c_msg *msgs; /* ptr to array of simple messages */
int nmsgs; /* number of messages to exchange */
diff --git a/Documentation/i2c/functionality b/Documentation/i2c/functionality
index 8a78a95ae04e..41ffefbdc60c 100644
--- a/Documentation/i2c/functionality
+++ b/Documentation/i2c/functionality
@@ -115,7 +115,7 @@ CHECKING THROUGH /DEV
If you try to access an adapter from a userspace program, you will have
to use the /dev interface. You will still have to check whether the
functionality you need is supported, of course. This is done using
-the I2C_FUNCS ioctl. An example, adapted from the lm_sensors i2c_detect
+the I2C_FUNCS ioctl. An example, adapted from the lm_sensors i2cdetect
program, is below:
int file;
diff --git a/Documentation/i2c/porting-clients b/Documentation/i2c/porting-clients
index a7adbdd9ea8a..4849dfd6961c 100644
--- a/Documentation/i2c/porting-clients
+++ b/Documentation/i2c/porting-clients
@@ -1,4 +1,4 @@
-Revision 4, 2004-03-30
+Revision 5, 2005-07-29
Jean Delvare <khali@linux-fr.org>
Greg KH <greg@kroah.com>
@@ -17,20 +17,22 @@ yours for best results.
Technical changes:
-* [Includes] Get rid of "version.h". Replace <linux/i2c-proc.h> with
- <linux/i2c-sensor.h>. Includes typically look like that:
+* [Includes] Get rid of "version.h" and <linux/i2c-proc.h>.
+ Includes typically look like that:
#include <linux/module.h>
#include <linux/init.h>
#include <linux/slab.h>
#include <linux/i2c.h>
- #include <linux/i2c-sensor.h>
- #include <linux/i2c-vid.h> /* if you need VRM support */
+ #include <linux/hwmon.h> /* for hardware monitoring drivers */
+ #include <linux/hwmon-sysfs.h>
+ #include <linux/hwmon-vid.h> /* if you need VRM support */
#include <asm/io.h> /* if you have I/O operations */
Please respect this inclusion order. Some extra headers may be
required for a given driver (e.g. "lm75.h").
-* [Addresses] SENSORS_I2C_END becomes I2C_CLIENT_END, SENSORS_ISA_END
- becomes I2C_CLIENT_ISA_END.
+* [Addresses] SENSORS_I2C_END becomes I2C_CLIENT_END, ISA addresses
+ are no more handled by the i2c core.
+ SENSORS_INSMOD_<n> becomes I2C_CLIENT_INSMOD_<n>.
* [Client data] Get rid of sysctl_id. Try using standard names for
register values (for example, temp_os becomes temp_max). You're
@@ -66,13 +68,15 @@ Technical changes:
if (!(adapter->class & I2C_CLASS_HWMON))
return 0;
ISA-only drivers of course don't need this.
+ Call i2c_probe() instead of i2c_detect().
* [Detect] As mentioned earlier, the flags parameter is gone.
The type_name and client_name strings are replaced by a single
name string, which will be filled with a lowercase, short string
(typically the driver name, e.g. "lm75").
In i2c-only drivers, drop the i2c_is_isa_adapter check, it's
- useless.
+ useless. Same for isa-only drivers, as the test would always be
+ true. Only hybrid drivers (which are quite rare) still need it.
The errorN labels are reduced to the number needed. If that number
is 2 (i2c-only drivers), it is advised that the labels are named
exit and exit_free. For i2c+isa drivers, labels should be named
@@ -86,6 +90,8 @@ Technical changes:
device_create_file. Move the driver initialization before any
sysfs file creation.
Drop client->id.
+ Drop any 24RF08 corruption prevention you find, as this is now done
+ at the i2c-core level, and doing it twice voids it.
* [Init] Limits must not be set by the driver (can be done later in
user-space). Chip should not be reset default (although a module
@@ -93,7 +99,8 @@ Technical changes:
limited to the strictly necessary steps.
* [Detach] Get rid of data, remove the call to
- i2c_deregister_entry.
+ i2c_deregister_entry. Do not log an error message if
+ i2c_detach_client fails, as i2c-core will now do it for you.
* [Update] Don't access client->data directly, use
i2c_get_clientdata(client) instead.
diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients
index f482dae81de3..077275722a7c 100644
--- a/Documentation/i2c/writing-clients
+++ b/Documentation/i2c/writing-clients
@@ -27,7 +27,6 @@ address.
static struct i2c_driver foo_driver = {
.owner = THIS_MODULE,
.name = "Foo version 2.3 driver",
- .id = I2C_DRIVERID_FOO, /* from i2c-id.h, optional */
.flags = I2C_DF_NOTIFY,
.attach_adapter = &foo_attach_adapter,
.detach_client = &foo_detach_client,
@@ -37,12 +36,6 @@ static struct i2c_driver foo_driver = {
The name can be chosen freely, and may be upto 40 characters long. Please
use something descriptive here.
-If used, the id should be a unique ID. The range 0xf000 to 0xffff is
-reserved for local use, and you can use one of those until you start
-distributing the driver, at which time you should contact the i2c authors
-to get your own ID(s). Note that most of the time you don't need an ID
-at all so you can just omit it.
-
Don't worry about the flags field; just put I2C_DF_NOTIFY into it. This
means that your driver will be notified when new adapters are found.
This is almost always what you want.
@@ -155,15 +148,15 @@ are defined in i2c.h to help you support them, as well as a generic
detection algorithm.
You do not have to use this parameter interface; but don't try to use
-function i2c_probe() (or i2c_detect()) if you don't.
+function i2c_probe() if you don't.
NOTE: If you want to write a `sensors' driver, the interface is slightly
different! See below.
-Probing classes (i2c)
----------------------
+Probing classes
+---------------
All parameters are given as lists of unsigned 16-bit integers. Lists are
terminated by I2C_CLIENT_END.
@@ -178,12 +171,18 @@ The following lists are used internally:
ignore: insmod parameter.
A list of pairs. The first value is a bus number (-1 for any I2C bus),
the second is the I2C address. These addresses are never probed.
- This parameter overrules 'normal' and 'probe', but not the 'force' lists.
+ This parameter overrules the 'normal_i2c' list only.
force: insmod parameter.
A list of pairs. The first value is a bus number (-1 for any I2C bus),
the second is the I2C address. A device is blindly assumed to be on
the given address, no probing is done.
+Additionally, kind-specific force lists may optionally be defined if
+the driver supports several chip kinds. They are grouped in a
+NULL-terminated list of pointers named forces, those first element if the
+generic force list mentioned above. Each additional list correspond to an
+insmod parameter of the form force_<kind>.
+
Fortunately, as a module writer, you just have to define the `normal_i2c'
parameter. The complete declaration could look like this:
@@ -193,66 +192,17 @@ parameter. The complete declaration could look like this:
/* Magic definition of all other variables and things */
I2C_CLIENT_INSMOD;
+ /* Or, if your driver supports, say, 2 kind of devices: */
+ I2C_CLIENT_INSMOD_2(foo, bar);
+
+If you use the multi-kind form, an enum will be defined for you:
+ enum chips { any_chip, foo, bar, ... }
+You can then (and certainly should) use it in the driver code.
Note that you *have* to call the defined variable `normal_i2c',
without any prefix!
-Probing classes (sensors)
--------------------------
-
-If you write a `sensors' driver, you use a slightly different interface.
-As well as I2C addresses, we have to cope with ISA addresses. Also, we
-use a enum of chip types. Don't forget to include `sensors.h'.
-
-The following lists are used internally. They are all lists of integers.
-
- normal_i2c: filled in by the module writer. Terminated by SENSORS_I2C_END.
- A list of I2C addresses which should normally be examined.
- normal_isa: filled in by the module writer. Terminated by SENSORS_ISA_END.
- A list of ISA addresses which should normally be examined.
- probe: insmod parameter. Initialize this list with SENSORS_I2C_END values.
- A list of pairs. The first value is a bus number (SENSORS_ISA_BUS for
- the ISA bus, -1 for any I2C bus), the second is the address. These
- addresses are also probed, as if they were in the 'normal' list.
- ignore: insmod parameter. Initialize this list with SENSORS_I2C_END values.
- A list of pairs. The first value is a bus number (SENSORS_ISA_BUS for
- the ISA bus, -1 for any I2C bus), the second is the I2C address. These
- addresses are never probed. This parameter overrules 'normal' and
- 'probe', but not the 'force' lists.
-
-Also used is a list of pointers to sensors_force_data structures:
- force_data: insmod parameters. A list, ending with an element of which
- the force field is NULL.
- Each element contains the type of chip and a list of pairs.
- The first value is a bus number (SENSORS_ISA_BUS for the ISA bus,
- -1 for any I2C bus), the second is the address.
- These are automatically translated to insmod variables of the form
- force_foo.
-
-So we have a generic insmod variabled `force', and chip-specific variables
-`force_CHIPNAME'.
-
-Fortunately, as a module writer, you just have to define the `normal_i2c'
-and `normal_isa' parameters, and define what chip names are used.
-The complete declaration could look like this:
- /* Scan i2c addresses 0x37, and 0x48 to 0x4f */
- static unsigned short normal_i2c[] = { 0x37, 0x48, 0x49, 0x4a, 0x4b, 0x4c,
- 0x4d, 0x4e, 0x4f, I2C_CLIENT_END };
- /* Scan ISA address 0x290 */
- static unsigned int normal_isa[] = {0x0290,SENSORS_ISA_END};
-
- /* Define chips foo and bar, as well as all module parameters and things */
- SENSORS_INSMOD_2(foo,bar);
-
-If you have one chip, you use macro SENSORS_INSMOD_1(chip), if you have 2
-you use macro SENSORS_INSMOD_2(chip1,chip2), etc. If you do not want to
-bother with chip types, you can use SENSORS_INSMOD_0.
-
-A enum is automatically defined as follows:
- enum chips { any_chip, chip1, chip2, ... }
-
-
Attaching to an adapter
-----------------------
@@ -271,17 +221,10 @@ detected at a specific address, another callback is called.
return i2c_probe(adapter,&addr_data,&foo_detect_client);
}
-For `sensors' drivers, use the i2c_detect function instead:
-
- int foo_attach_adapter(struct i2c_adapter *adapter)
- {
- return i2c_detect(adapter,&addr_data,&foo_detect_client);
- }
-
Remember, structure `addr_data' is defined by the macros explained above,
so you do not have to define it yourself.
-The i2c_probe or i2c_detect function will call the foo_detect_client
+The i2c_probe function will call the foo_detect_client
function only for those i2c addresses that actually have a device on
them (unless a `force' parameter was used). In addition, addresses that
are already in use (by some other registered client) are skipped.
@@ -290,19 +233,18 @@ are already in use (by some other registered client) are skipped.
The detect client function
--------------------------
-The detect client function is called by i2c_probe or i2c_detect.
-The `kind' parameter contains 0 if this call is due to a `force'
-parameter, and -1 otherwise (for i2c_detect, it contains 0 if
-this call is due to the generic `force' parameter, and the chip type
-number if it is due to a specific `force' parameter).
+The detect client function is called by i2c_probe. The `kind' parameter
+contains -1 for a probed detection, 0 for a forced detection, or a positive
+number for a forced detection with a chip type forced.
Below, some things are only needed if this is a `sensors' driver. Those
parts are between /* SENSORS ONLY START */ and /* SENSORS ONLY END */
markers.
-This function should only return an error (any value != 0) if there is
-some reason why no more detection should be done anymore. If the
-detection just fails for this address, return 0.
+Returning an error different from -ENODEV in a detect function will cause
+the detection to stop: other addresses and adapters won't be scanned.
+This should only be done on fatal or internal errors, such as a memory
+shortage or i2c_attach_client failing.
For now, you can ignore the `flags' parameter. It is there for future use.
@@ -327,11 +269,10 @@ For now, you can ignore the `flags' parameter. It is there for future use.
const char *type_name = "";
int is_isa = i2c_is_isa_adapter(adapter);
- if (is_isa) {
+ /* Do this only if the chip can additionally be found on the ISA bus
+ (hybrid chip). */
- /* If this client can't be on the ISA bus at all, we can stop now
- (call `goto ERROR0'). But for kicks, we will assume it is all
- right. */
+ if (is_isa) {
/* Discard immediately if this ISA range is already used */
if (check_region(address,FOO_EXTENT))
@@ -502,15 +443,13 @@ much simpler than the attachment code, fortunately!
/* SENSORS ONLY END */
/* Try to detach the client from i2c space */
- if ((err = i2c_detach_client(client))) {
- printk("foo.o: Client deregistration failed, client not detached.\n");
+ if ((err = i2c_detach_client(client)))
return err;
- }
- /* SENSORS ONLY START */
+ /* HYBRID SENSORS CHIP ONLY START */
if i2c_is_isa_client(client)
release_region(client->addr,LM78_EXTENT);
- /* SENSORS ONLY END */
+ /* HYBRID SENSORS CHIP ONLY END */
kfree(client); /* Frees client data too, if allocated at the same time */
return 0;
diff --git a/Documentation/infiniband/core_locking.txt b/Documentation/infiniband/core_locking.txt
new file mode 100644
index 000000000000..e1678542279a
--- /dev/null
+++ b/Documentation/infiniband/core_locking.txt
@@ -0,0 +1,114 @@
+INFINIBAND MIDLAYER LOCKING
+
+ This guide is an attempt to make explicit the locking assumptions
+ made by the InfiniBand midlayer. It describes the requirements on
+ both low-level drivers that sit below the midlayer and upper level
+ protocols that use the midlayer.
+
+Sleeping and interrupt context
+
+ With the following exceptions, a low-level driver implementation of
+ all of the methods in struct ib_device may sleep. The exceptions
+ are any methods from the list:
+
+ create_ah
+ modify_ah
+ query_ah
+ destroy_ah
+ bind_mw
+ post_send
+ post_recv
+ poll_cq
+ req_notify_cq
+ map_phys_fmr
+
+ which may not sleep and must be callable from any context.
+
+ The corresponding functions exported to upper level protocol
+ consumers:
+
+ ib_create_ah
+ ib_modify_ah
+ ib_query_ah
+ ib_destroy_ah
+ ib_bind_mw
+ ib_post_send
+ ib_post_recv
+ ib_req_notify_cq
+ ib_map_phys_fmr
+
+ are therefore safe to call from any context.
+
+ In addition, the function
+
+ ib_dispatch_event
+
+ used by low-level drivers to dispatch asynchronous events through
+ the midlayer is also safe to call from any context.
+
+Reentrancy
+
+ All of the methods in struct ib_device exported by a low-level
+ driver must be fully reentrant. The low-level driver is required to
+ perform all synchronization necessary to maintain consistency, even
+ if multiple function calls using the same object are run
+ simultaneously.
+
+ The IB midlayer does not perform any serialization of function calls.
+
+ Because low-level drivers are reentrant, upper level protocol
+ consumers are not required to perform any serialization. However,
+ some serialization may be required to get sensible results. For
+ example, a consumer may safely call ib_poll_cq() on multiple CPUs
+ simultaneously. However, the ordering of the work completion
+ information between different calls of ib_poll_cq() is not defined.
+
+Callbacks
+
+ A low-level driver must not perform a callback directly from the
+ same callchain as an ib_device method call. For example, it is not
+ allowed for a low-level driver to call a consumer's completion event
+ handler directly from its post_send method. Instead, the low-level
+ driver should defer this callback by, for example, scheduling a
+ tasklet to perform the callback.
+
+ The low-level driver is responsible for ensuring that multiple
+ completion event handlers for the same CQ are not called
+ simultaneously. The driver must guarantee that only one CQ event
+ handler for a given CQ is running at a time. In other words, the
+ following situation is not allowed:
+
+ CPU1 CPU2
+
+ low-level driver ->
+ consumer CQ event callback:
+ /* ... */
+ ib_req_notify_cq(cq, ...);
+ low-level driver ->
+ /* ... */ consumer CQ event callback:
+ /* ... */
+ return from CQ event handler
+
+ The context in which completion event and asynchronous event
+ callbacks run is not defined. Depending on the low-level driver, it
+ may be process context, softirq context, or interrupt context.
+ Upper level protocol consumers may not sleep in a callback.
+
+Hot-plug
+
+ A low-level driver announces that a device is ready for use by
+ consumers when it calls ib_register_device(), all initialization
+ must be complete before this call. The device must remain usable
+ until the driver's call to ib_unregister_device() has returned.
+
+ A low-level driver must call ib_register_device() and
+ ib_unregister_device() from process context. It must not hold any
+ semaphores that could cause deadlock if a consumer calls back into
+ the driver across these calls.
+
+ An upper level protocol consumer may begin using an IB device as
+ soon as the add method of its struct ib_client is called for that
+ device. A consumer must finish all cleanup and free all resources
+ relating to a device before returning from the remove method.
+
+ A consumer is permitted to sleep in its add and remove methods.
diff --git a/Documentation/infiniband/user_mad.txt b/Documentation/infiniband/user_mad.txt
index cae0c83f1ee9..750fe5e80ebc 100644
--- a/Documentation/infiniband/user_mad.txt
+++ b/Documentation/infiniband/user_mad.txt
@@ -28,13 +28,37 @@ Creating MAD agents
Receiving MADs
- MADs are received using read(). The buffer passed to read() must be
- large enough to hold at least one struct ib_user_mad. For example:
-
- struct ib_user_mad mad;
- ret = read(fd, &mad, sizeof mad);
- if (ret != sizeof mad)
+ MADs are received using read(). The receive side now supports
+ RMPP. The buffer passed to read() must be at least one
+ struct ib_user_mad + 256 bytes. For example:
+
+ If the buffer passed is not large enough to hold the received
+ MAD (RMPP), the errno is set to ENOSPC and the length of the
+ buffer needed is set in mad.length.
+
+ Example for normal MAD (non RMPP) reads:
+ struct ib_user_mad *mad;
+ mad = malloc(sizeof *mad + 256);
+ ret = read(fd, mad, sizeof *mad + 256);
+ if (ret != sizeof mad + 256) {
+ perror("read");
+ free(mad);
+ }
+
+ Example for RMPP reads:
+ struct ib_user_mad *mad;
+ mad = malloc(sizeof *mad + 256);
+ ret = read(fd, mad, sizeof *mad + 256);
+ if (ret == -ENOSPC)) {
+ length = mad.length;
+ free(mad);
+ mad = malloc(sizeof *mad + length);
+ ret = read(fd, mad, sizeof *mad + length);
+ }
+ if (ret < 0) {
perror("read");
+ free(mad);
+ }
In addition to the actual MAD contents, the other struct ib_user_mad
fields will be filled in with information on the received MAD. For
@@ -50,18 +74,21 @@ Sending MADs
MADs are sent using write(). The agent ID for sending should be
filled into the id field of the MAD, the destination LID should be
- filled into the lid field, and so on. For example:
+ filled into the lid field, and so on. The send side does support
+ RMPP so arbitrary length MAD can be sent. For example:
+
+ struct ib_user_mad *mad;
- struct ib_user_mad mad;
+ mad = malloc(sizeof *mad + mad_length);
- /* fill in mad.data */
+ /* fill in mad->data */
- mad.id = my_agent; /* req.id from agent registration */
- mad.lid = my_dest; /* in network byte order... */
+ mad->hdr.id = my_agent; /* req.id from agent registration */
+ mad->hdr.lid = my_dest; /* in network byte order... */
/* etc. */
- ret = write(fd, &mad, sizeof mad);
- if (ret != sizeof mad)
+ ret = write(fd, &mad, sizeof *mad + mad_length);
+ if (ret != sizeof *mad + mad_length)
perror("write");
Setting IsSM Capability Bit
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 753db6d8b745..3d5cd7a09b2f 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -37,7 +37,7 @@ restrictions referred to are that the relevant option is valid if:
IA-32 IA-32 aka i386 architecture is enabled.
IA-64 IA-64 architecture is enabled.
IOSCHED More than one I/O scheduler is enabled.
- IP_PNP IP DCHP, BOOTP, or RARP is enabled.
+ IP_PNP IP DHCP, BOOTP, or RARP is enabled.
ISAPNP ISA PnP code is enabled.
ISDN Appropriate ISDN support is enabled.
JOY Appropriate joystick support is enabled.
@@ -159,6 +159,11 @@ running once the system is up.
acpi_fake_ecdt [HW,ACPI] Workaround failure due to BIOS lacking ECDT
+ acpi_generic_hotkey [HW,ACPI]
+ Allow consolidated generic hotkey driver to
+ over-ride platform specific driver.
+ See also Documentation/acpi-hotkey.txt.
+
ad1816= [HW,OSS]
Format: <io>,<irq>,<dma>,<dma2>
See also Documentation/sound/oss/AD1816.
diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
new file mode 100644
index 000000000000..0541fe1de704
--- /dev/null
+++ b/Documentation/kprobes.txt
@@ -0,0 +1,588 @@
+Title : Kernel Probes (Kprobes)
+Authors : Jim Keniston <jkenisto@us.ibm.com>
+ : Prasanna S Panchamukhi <prasanna@in.ibm.com>
+
+CONTENTS
+
+1. Concepts: Kprobes, Jprobes, Return Probes
+2. Architectures Supported
+3. Configuring Kprobes
+4. API Reference
+5. Kprobes Features and Limitations
+6. Probe Overhead
+7. TODO
+8. Kprobes Example
+9. Jprobes Example
+10. Kretprobes Example
+
+1. Concepts: Kprobes, Jprobes, Return Probes
+
+Kprobes enables you to dynamically break into any kernel routine and
+collect debugging and performance information non-disruptively. You
+can trap at almost any kernel code address, specifying a handler
+routine to be invoked when the breakpoint is hit.
+
+There are currently three types of probes: kprobes, jprobes, and
+kretprobes (also called return probes). A kprobe can be inserted
+on virtually any instruction in the kernel. A jprobe is inserted at
+the entry to a kernel function, and provides convenient access to the
+function's arguments. A return probe fires when a specified function
+returns.
+
+In the typical case, Kprobes-based instrumentation is packaged as
+a kernel module. The module's init function installs ("registers")
+one or more probes, and the exit function unregisters them. A
+registration function such as register_kprobe() specifies where
+the probe is to be inserted and what handler is to be called when
+the probe is hit.
+
+The next three subsections explain how the different types of
+probes work. They explain certain things that you'll need to
+know in order to make the best use of Kprobes -- e.g., the
+difference between a pre_handler and a post_handler, and how
+to use the maxactive and nmissed fields of a kretprobe. But
+if you're in a hurry to start using Kprobes, you can skip ahead
+to section 2.
+
+1.1 How Does a Kprobe Work?
+
+When a kprobe is registered, Kprobes makes a copy of the probed
+instruction and replaces the first byte(s) of the probed instruction
+with a breakpoint instruction (e.g., int3 on i386 and x86_64).
+
+When a CPU hits the breakpoint instruction, a trap occurs, the CPU's
+registers are saved, and control passes to Kprobes via the
+notifier_call_chain mechanism. Kprobes executes the "pre_handler"
+associated with the kprobe, passing the handler the addresses of the
+kprobe struct and the saved registers.
+
+Next, Kprobes single-steps its copy of the probed instruction.
+(It would be simpler to single-step the actual instruction in place,
+but then Kprobes would have to temporarily remove the breakpoint
+instruction. This would open a small time window when another CPU
+could sail right past the probepoint.)
+
+After the instruction is single-stepped, Kprobes executes the
+"post_handler," if any, that is associated with the kprobe.
+Execution then continues with the instruction following the probepoint.
+
+1.2 How Does a Jprobe Work?
+
+A jprobe is implemented using a kprobe that is placed on a function's
+entry point. It employs a simple mirroring principle to allow
+seamless access to the probed function's arguments. The jprobe
+handler routine should have the same signature (arg list and return
+type) as the function being probed, and must always end by calling
+the Kprobes function jprobe_return().
+
+Here's how it works. When the probe is hit, Kprobes makes a copy of
+the saved registers and a generous portion of the stack (see below).
+Kprobes then points the saved instruction pointer at the jprobe's
+handler routine, and returns from the trap. As a result, control
+passes to the handler, which is presented with the same register and
+stack contents as the probed function. When it is done, the handler
+calls jprobe_return(), which traps again to restore the original stack
+contents and processor state and switch to the probed function.
+
+By convention, the callee owns its arguments, so gcc may produce code
+that unexpectedly modifies that portion of the stack. This is why
+Kprobes saves a copy of the stack and restores it after the jprobe
+handler has run. Up to MAX_STACK_SIZE bytes are copied -- e.g.,
+64 bytes on i386.
+
+Note that the probed function's args may be passed on the stack
+or in registers (e.g., for x86_64 or for an i386 fastcall function).
+The jprobe will work in either case, so long as the handler's
+prototype matches that of the probed function.
+
+1.3 How Does a Return Probe Work?
+
+When you call register_kretprobe(), Kprobes establishes a kprobe at
+the entry to the function. When the probed function is called and this
+probe is hit, Kprobes saves a copy of the return address, and replaces
+the return address with the address of a "trampoline." The trampoline
+is an arbitrary piece of code -- typically just a nop instruction.
+At boot time, Kprobes registers a kprobe at the trampoline.
+
+When the probed function executes its return instruction, control
+passes to the trampoline and that probe is hit. Kprobes' trampoline
+handler calls the user-specified handler associated with the kretprobe,
+then sets the saved instruction pointer to the saved return address,
+and that's where execution resumes upon return from the trap.
+
+While the probed function is executing, its return address is
+stored in an object of type kretprobe_instance. Before calling
+register_kretprobe(), the user sets the maxactive field of the
+kretprobe struct to specify how many instances of the specified
+function can be probed simultaneously. register_kretprobe()
+pre-allocates the indicated number of kretprobe_instance objects.
+
+For example, if the function is non-recursive and is called with a
+spinlock held, maxactive = 1 should be enough. If the function is
+non-recursive and can never relinquish the CPU (e.g., via a semaphore
+or preemption), NR_CPUS should be enough. If maxactive <= 0, it is
+set to a default value. If CONFIG_PREEMPT is enabled, the default
+is max(10, 2*NR_CPUS). Otherwise, the default is NR_CPUS.
+
+It's not a disaster if you set maxactive too low; you'll just miss
+some probes. In the kretprobe struct, the nmissed field is set to
+zero when the return probe is registered, and is incremented every
+time the probed function is entered but there is no kretprobe_instance
+object available for establishing the return probe.
+
+2. Architectures Supported
+
+Kprobes, jprobes, and return probes are implemented on the following
+architectures:
+
+- i386
+- x86_64 (AMD-64, E64MT)
+- ppc64
+- ia64 (Support for probes on certain instruction types is still in progress.)
+- sparc64 (Return probes not yet implemented.)
+
+3. Configuring Kprobes
+
+When configuring the kernel using make menuconfig/xconfig/oldconfig,
+ensure that CONFIG_KPROBES is set to "y". Under "Kernel hacking",
+look for "Kprobes". You may have to enable "Kernel debugging"
+(CONFIG_DEBUG_KERNEL) before you can enable Kprobes.
+
+You may also want to ensure that CONFIG_KALLSYMS and perhaps even
+CONFIG_KALLSYMS_ALL are set to "y", since kallsyms_lookup_name()
+is a handy, version-independent way to find a function's address.
+
+If you need to insert a probe in the middle of a function, you may find
+it useful to "Compile the kernel with debug info" (CONFIG_DEBUG_INFO),
+so you can use "objdump -d -l vmlinux" to see the source-to-object
+code mapping.
+
+4. API Reference
+
+The Kprobes API includes a "register" function and an "unregister"
+function for each type of probe. Here are terse, mini-man-page
+specifications for these functions and the associated probe handlers
+that you'll write. See the latter half of this document for examples.
+
+4.1 register_kprobe
+
+#include <linux/kprobes.h>
+int register_kprobe(struct kprobe *kp);
+
+Sets a breakpoint at the address kp->addr. When the breakpoint is
+hit, Kprobes calls kp->pre_handler. After the probed instruction
+is single-stepped, Kprobe calls kp->post_handler. If a fault
+occurs during execution of kp->pre_handler or kp->post_handler,
+or during single-stepping of the probed instruction, Kprobes calls
+kp->fault_handler. Any or all handlers can be NULL.
+
+register_kprobe() returns 0 on success, or a negative errno otherwise.
+
+User's pre-handler (kp->pre_handler):
+#include <linux/kprobes.h>
+#include <linux/ptrace.h>
+int pre_handler(struct kprobe *p, struct pt_regs *regs);
+
+Called with p pointing to the kprobe associated with the breakpoint,
+and regs pointing to the struct containing the registers saved when
+the breakpoint was hit. Return 0 here unless you're a Kprobes geek.
+
+User's post-handler (kp->post_handler):
+#include <linux/kprobes.h>
+#include <linux/ptrace.h>
+void post_handler(struct kprobe *p, struct pt_regs *regs,
+ unsigned long flags);
+
+p and regs are as described for the pre_handler. flags always seems
+to be zero.
+
+User's fault-handler (kp->fault_handler):
+#include <linux/kprobes.h>
+#include <linux/ptrace.h>
+int fault_handler(struct kprobe *p, struct pt_regs *regs, int trapnr);
+
+p and regs are as described for the pre_handler. trapnr is the
+architecture-specific trap number associated with the fault (e.g.,
+on i386, 13 for a general protection fault or 14 for a page fault).
+Returns 1 if it successfully handled the exception.
+
+4.2 register_jprobe
+
+#include <linux/kprobes.h>
+int register_jprobe(struct jprobe *jp)
+
+Sets a breakpoint at the address jp->kp.addr, which must be the address
+of the first instruction of a function. When the breakpoint is hit,
+Kprobes runs the handler whose address is jp->entry.
+
+The handler should have the same arg list and return type as the probed
+function; and just before it returns, it must call jprobe_return().
+(The handler never actually returns, since jprobe_return() returns
+control to Kprobes.) If the probed function is declared asmlinkage,
+fastcall, or anything else that affects how args are passed, the
+handler's declaration must match.
+
+register_jprobe() returns 0 on success, or a negative errno otherwise.
+
+4.3 register_kretprobe
+
+#include <linux/kprobes.h>
+int register_kretprobe(struct kretprobe *rp);
+
+Establishes a return probe for the function whose address is
+rp->kp.addr. When that function returns, Kprobes calls rp->handler.
+You must set rp->maxactive appropriately before you call
+register_kretprobe(); see "How Does a Return Probe Work?" for details.
+
+register_kretprobe() returns 0 on success, or a negative errno
+otherwise.
+
+User's return-probe handler (rp->handler):
+#include <linux/kprobes.h>
+#include <linux/ptrace.h>
+int kretprobe_handler(struct kretprobe_instance *ri, struct pt_regs *regs);
+
+regs is as described for kprobe.pre_handler. ri points to the
+kretprobe_instance object, of which the following fields may be
+of interest:
+- ret_addr: the return address
+- rp: points to the corresponding kretprobe object
+- task: points to the corresponding task struct
+The handler's return value is currently ignored.
+
+4.4 unregister_*probe
+
+#include <linux/kprobes.h>
+void unregister_kprobe(struct kprobe *kp);
+void unregister_jprobe(struct jprobe *jp);
+void unregister_kretprobe(struct kretprobe *rp);
+
+Removes the specified probe. The unregister function can be called
+at any time after the probe has been registered.
+
+5. Kprobes Features and Limitations
+
+As of Linux v2.6.12, Kprobes allows multiple probes at the same
+address. Currently, however, there cannot be multiple jprobes on
+the same function at the same time.
+
+In general, you can install a probe anywhere in the kernel.
+In particular, you can probe interrupt handlers. Known exceptions
+are discussed in this section.
+
+For obvious reasons, it's a bad idea to install a probe in
+the code that implements Kprobes (mostly kernel/kprobes.c and
+arch/*/kernel/kprobes.c). A patch in the v2.6.13 timeframe instructs
+Kprobes to reject such requests.
+
+If you install a probe in an inline-able function, Kprobes makes
+no attempt to chase down all inline instances of the function and
+install probes there. gcc may inline a function without being asked,
+so keep this in mind if you're not seeing the probe hits you expect.
+
+A probe handler can modify the environment of the probed function
+-- e.g., by modifying kernel data structures, or by modifying the
+contents of the pt_regs struct (which are restored to the registers
+upon return from the breakpoint). So Kprobes can be used, for example,
+to install a bug fix or to inject faults for testing. Kprobes, of
+course, has no way to distinguish the deliberately injected faults
+from the accidental ones. Don't drink and probe.
+
+Kprobes makes no attempt to prevent probe handlers from stepping on
+each other -- e.g., probing printk() and then calling printk() from a
+probe handler. As of Linux v2.6.12, if a probe handler hits a probe,
+that second probe's handlers won't be run in that instance.
+
+In Linux v2.6.12 and previous versions, Kprobes' data structures are
+protected by a single lock that is held during probe registration and
+unregistration and while handlers are run. Thus, no two handlers
+can run simultaneously. To improve scalability on SMP systems,
+this restriction will probably be removed soon, in which case
+multiple handlers (or multiple instances of the same handler) may
+run concurrently on different CPUs. Code your handlers accordingly.
+
+Kprobes does not use semaphores or allocate memory except during
+registration and unregistration.
+
+Probe handlers are run with preemption disabled. Depending on the
+architecture, handlers may also run with interrupts disabled. In any
+case, your handler should not yield the CPU (e.g., by attempting to
+acquire a semaphore).
+
+Since a return probe is implemented by replacing the return
+address with the trampoline's address, stack backtraces and calls
+to __builtin_return_address() will typically yield the trampoline's
+address instead of the real return address for kretprobed functions.
+(As far as we can tell, __builtin_return_address() is used only
+for instrumentation and error reporting.)
+
+If the number of times a function is called does not match the
+number of times it returns, registering a return probe on that
+function may produce undesirable results. We have the do_exit()
+and do_execve() cases covered. do_fork() is not an issue. We're
+unaware of other specific cases where this could be a problem.
+
+6. Probe Overhead
+
+On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0
+microseconds to process. Specifically, a benchmark that hits the same
+probepoint repeatedly, firing a simple handler each time, reports 1-2
+million hits per second, depending on the architecture. A jprobe or
+return-probe hit typically takes 50-75% longer than a kprobe hit.
+When you have a return probe set on a function, adding a kprobe at
+the entry to that function adds essentially no overhead.
+
+Here are sample overhead figures (in usec) for different architectures.
+k = kprobe; j = jprobe; r = return probe; kr = kprobe + return probe
+on same function; jr = jprobe + return probe on same function
+
+i386: Intel Pentium M, 1495 MHz, 2957.31 bogomips
+k = 0.57 usec; j = 1.00; r = 0.92; kr = 0.99; jr = 1.40
+
+x86_64: AMD Opteron 246, 1994 MHz, 3971.48 bogomips
+k = 0.49 usec; j = 0.76; r = 0.80; kr = 0.82; jr = 1.07
+
+ppc64: POWER5 (gr), 1656 MHz (SMT disabled, 1 virtual CPU per physical CPU)
+k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99
+
+7. TODO
+
+a. SystemTap (http://sourceware.org/systemtap): Work in progress
+to provide a simplified programming interface for probe-based
+instrumentation.
+b. Improved SMP scalability: Currently, work is in progress to handle
+multiple kprobes in parallel.
+c. Kernel return probes for sparc64.
+d. Support for other architectures.
+e. User-space probes.
+
+8. Kprobes Example
+
+Here's a sample kernel module showing the use of kprobes to dump a
+stack trace and selected i386 registers when do_fork() is called.
+----- cut here -----
+/*kprobe_example.c*/
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/kprobes.h>
+#include <linux/kallsyms.h>
+#include <linux/sched.h>
+
+/*For each probe you need to allocate a kprobe structure*/
+static struct kprobe kp;
+
+/*kprobe pre_handler: called just before the probed instruction is executed*/
+int handler_pre(struct kprobe *p, struct pt_regs *regs)
+{
+ printk("pre_handler: p->addr=0x%p, eip=%lx, eflags=0x%lx\n",
+ p->addr, regs->eip, regs->eflags);
+ dump_stack();
+ return 0;
+}
+
+/*kprobe post_handler: called after the probed instruction is executed*/
+void handler_post(struct kprobe *p, struct pt_regs *regs, unsigned long flags)
+{
+ printk("post_handler: p->addr=0x%p, eflags=0x%lx\n",
+ p->addr, regs->eflags);
+}
+
+/* fault_handler: this is called if an exception is generated for any
+ * instruction within the pre- or post-handler, or when Kprobes
+ * single-steps the probed instruction.
+ */
+int handler_fault(struct kprobe *p, struct pt_regs *regs, int trapnr)
+{
+ printk("fault_handler: p->addr=0x%p, trap #%dn",
+ p->addr, trapnr);
+ /* Return 0 because we don't handle the fault. */
+ return 0;
+}
+
+int init_module(void)
+{
+ int ret;
+ kp.pre_handler = handler_pre;
+ kp.post_handler = handler_post;
+ kp.fault_handler = handler_fault;
+ kp.addr = (kprobe_opcode_t*) kallsyms_lookup_name("do_fork");
+ /* register the kprobe now */
+ if (!kp.addr) {
+ printk("Couldn't find %s to plant kprobe\n", "do_fork");
+ return -1;
+ }
+ if ((ret = register_kprobe(&kp) < 0)) {
+ printk("register_kprobe failed, returned %d\n", ret);
+ return -1;
+ }
+ printk("kprobe registered\n");
+ return 0;
+}
+
+void cleanup_module(void)
+{
+ unregister_kprobe(&kp);
+ printk("kprobe unregistered\n");
+}
+
+MODULE_LICENSE("GPL");
+----- cut here -----
+
+You can build the kernel module, kprobe-example.ko, using the following
+Makefile:
+----- cut here -----
+obj-m := kprobe-example.o
+KDIR := /lib/modules/$(shell uname -r)/build
+PWD := $(shell pwd)
+default:
+ $(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules
+clean:
+ rm -f *.mod.c *.ko *.o
+----- cut here -----
+
+$ make
+$ su -
+...
+# insmod kprobe-example.ko
+
+You will see the trace data in /var/log/messages and on the console
+whenever do_fork() is invoked to create a new process.
+
+9. Jprobes Example
+
+Here's a sample kernel module showing the use of jprobes to dump
+the arguments of do_fork().
+----- cut here -----
+/*jprobe-example.c */
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/uio.h>
+#include <linux/kprobes.h>
+#include <linux/kallsyms.h>
+
+/*
+ * Jumper probe for do_fork.
+ * Mirror principle enables access to arguments of the probed routine
+ * from the probe handler.
+ */
+
+/* Proxy routine having the same arguments as actual do_fork() routine */
+long jdo_fork(unsigned long clone_flags, unsigned long stack_start,
+ struct pt_regs *regs, unsigned long stack_size,
+ int __user * parent_tidptr, int __user * child_tidptr)
+{
+ printk("jprobe: clone_flags=0x%lx, stack_size=0x%lx, regs=0x%p\n",
+ clone_flags, stack_size, regs);
+ /* Always end with a call to jprobe_return(). */
+ jprobe_return();
+ /*NOTREACHED*/
+ return 0;
+}
+
+static struct jprobe my_jprobe = {
+ .entry = (kprobe_opcode_t *) jdo_fork
+};
+
+int init_module(void)
+{
+ int ret;
+ my_jprobe.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("do_fork");
+ if (!my_jprobe.kp.addr) {
+ printk("Couldn't find %s to plant jprobe\n", "do_fork");
+ return -1;
+ }
+
+ if ((ret = register_jprobe(&my_jprobe)) <0) {
+ printk("register_jprobe failed, returned %d\n", ret);
+ return -1;
+ }
+ printk("Planted jprobe at %p, handler addr %p\n",
+ my_jprobe.kp.addr, my_jprobe.entry);
+ return 0;
+}
+
+void cleanup_module(void)
+{
+ unregister_jprobe(&my_jprobe);
+ printk("jprobe unregistered\n");
+}
+
+MODULE_LICENSE("GPL");
+----- cut here -----
+
+Build and insert the kernel module as shown in the above kprobe
+example. You will see the trace data in /var/log/messages and on
+the console whenever do_fork() is invoked to create a new process.
+(Some messages may be suppressed if syslogd is configured to
+eliminate duplicate messages.)
+
+10. Kretprobes Example
+
+Here's a sample kernel module showing the use of return probes to
+report failed calls to sys_open().
+----- cut here -----
+/*kretprobe-example.c*/
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/kprobes.h>
+#include <linux/kallsyms.h>
+
+static const char *probed_func = "sys_open";
+
+/* Return-probe handler: If the probed function fails, log the return value. */
+static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
+{
+ // Substitute the appropriate register name for your architecture --
+ // e.g., regs->rax for x86_64, regs->gpr[3] for ppc64.
+ int retval = (int) regs->eax;
+ if (retval < 0) {
+ printk("%s returns %d\n", probed_func, retval);
+ }
+ return 0;
+}
+
+static struct kretprobe my_kretprobe = {
+ .handler = ret_handler,
+ /* Probe up to 20 instances concurrently. */
+ .maxactive = 20
+};
+
+int init_module(void)
+{
+ int ret;
+ my_kretprobe.kp.addr =
+ (kprobe_opcode_t *) kallsyms_lookup_name(probed_func);
+ if (!my_kretprobe.kp.addr) {
+ printk("Couldn't find %s to plant return probe\n", probed_func);
+ return -1;
+ }
+ if ((ret = register_kretprobe(&my_kretprobe)) < 0) {
+ printk("register_kretprobe failed, returned %d\n", ret);
+ return -1;
+ }
+ printk("Planted return probe at %p\n", my_kretprobe.kp.addr);
+ return 0;
+}
+
+void cleanup_module(void)
+{
+ unregister_kretprobe(&my_kretprobe);
+ printk("kretprobe unregistered\n");
+ /* nmissed > 0 suggests that maxactive was set too low. */
+ printk("Missed probing %d instances of %s\n",
+ my_kretprobe.nmissed, probed_func);
+}
+
+MODULE_LICENSE("GPL");
+----- cut here -----
+
+Build and insert the kernel module as shown in the above kprobe
+example. You will see the trace data in /var/log/messages and on the
+console whenever sys_open() returns a negative value. (Some messages
+may be suppressed if syslogd is configured to eliminate duplicate
+messages.)
+
+For additional information on Kprobes, refer to the following URLs:
+http://www-106.ibm.com/developerworks/library/l-kprobes.html?ca=dgr-lnxw42Kprobe
+http://www.redhat.com/magazine/005mar05/features/kprobes/
diff --git a/Documentation/networking/README.ipw2100 b/Documentation/networking/README.ipw2100
new file mode 100644
index 000000000000..2046948b020d
--- /dev/null
+++ b/Documentation/networking/README.ipw2100
@@ -0,0 +1,246 @@
+
+===========================
+Intel(R) PRO/Wireless 2100 Network Connection Driver for Linux
+README.ipw2100
+
+March 14, 2005
+
+===========================
+Index
+---------------------------
+0. Introduction
+1. Release 1.1.0 Current Features
+2. Command Line Parameters
+3. Sysfs Helper Files
+4. Radio Kill Switch
+5. Dynamic Firmware
+6. Power Management
+7. Support
+8. License
+
+
+===========================
+0. Introduction
+------------ ----- ----- ---- --- -- -
+
+This document provides a brief overview of the features supported by the
+IPW2100 driver project. The main project website, where the latest
+development version of the driver can be found, is:
+
+ http://ipw2100.sourceforge.net
+
+There you can find the not only the latest releases, but also information about
+potential fixes and patches, as well as links to the development mailing list
+for the driver project.
+
+
+===========================
+1. Release 1.1.0 Current Supported Features
+---------------------------
+- Managed (BSS) and Ad-Hoc (IBSS)
+- WEP (shared key and open)
+- Wireless Tools support
+- 802.1x (tested with XSupplicant 1.0.1)
+
+Enabled (but not supported) features:
+- Monitor/RFMon mode
+- WPA/WPA2
+
+The distinction between officially supported and enabled is a reflection
+on the amount of validation and interoperability testing that has been
+performed on a given feature.
+
+
+===========================
+2. Command Line Parameters
+---------------------------
+
+If the driver is built as a module, the following optional parameters are used
+by entering them on the command line with the modprobe command using this
+syntax:
+
+ modprobe ipw2100 [<option>=<VAL1><,VAL2>...]
+
+For example, to disable the radio on driver loading, enter:
+
+ modprobe ipw2100 disable=1
+
+The ipw2100 driver supports the following module parameters:
+
+Name Value Example:
+debug 0x0-0xffffffff debug=1024
+mode 0,1,2 mode=1 /* AdHoc */
+channel int channel=3 /* Only valid in AdHoc or Monitor */
+associate boolean associate=0 /* Do NOT auto associate */
+disable boolean disable=1 /* Do not power the HW */
+
+
+===========================
+3. Sysfs Helper Files
+---------------------------
+
+There are several ways to control the behavior of the driver. Many of the
+general capabilities are exposed through the Wireless Tools (iwconfig). There
+are a few capabilities that are exposed through entries in the Linux Sysfs.
+
+
+----- Driver Level ------
+For the driver level files, look in /sys/bus/pci/drivers/ipw2100/
+
+ debug_level
+
+ This controls the same global as the 'debug' module parameter. For
+ information on the various debugging levels available, run the 'dvals'
+ script found in the driver source directory.
+
+ NOTE: 'debug_level' is only enabled if CONFIG_IPW2100_DEBUG is turn
+ on.
+
+----- Device Level ------
+For the device level files look in
+
+ /sys/bus/pci/drivers/ipw2100/{PCI-ID}/
+
+For example:
+ /sys/bus/pci/drivers/ipw2100/0000:02:01.0
+
+For the device level files, see /sys/bus/pci/drivers/ipw2100:
+
+ rf_kill
+ read -
+ 0 = RF kill not enabled (radio on)
+ 1 = SW based RF kill active (radio off)
+ 2 = HW based RF kill active (radio off)
+ 3 = Both HW and SW RF kill active (radio off)
+ write -
+ 0 = If SW based RF kill active, turn the radio back on
+ 1 = If radio is on, activate SW based RF kill
+
+ NOTE: If you enable the SW based RF kill and then toggle the HW
+ based RF kill from ON -> OFF -> ON, the radio will NOT come back on
+
+
+===========================
+4. Radio Kill Switch
+---------------------------
+Most laptops provide the ability for the user to physically disable the radio.
+Some vendors have implemented this as a physical switch that requires no
+software to turn the radio off and on. On other laptops, however, the switch
+is controlled through a button being pressed and a software driver then making
+calls to turn the radio off and on. This is referred to as a "software based
+RF kill switch"
+
+See the Sysfs helper file 'rf_kill' for determining the state of the RF switch
+on your system.
+
+
+===========================
+5. Dynamic Firmware
+---------------------------
+As the firmware is licensed under a restricted use license, it can not be
+included within the kernel sources. To enable the IPW2100 you will need a
+firmware image to load into the wireless NIC's processors.
+
+You can obtain these images from <http://ipw2100.sf.net/firmware.php>.
+
+See INSTALL for instructions on installing the firmware.
+
+
+===========================
+6. Power Management
+---------------------------
+The IPW2100 supports the configuration of the Power Save Protocol
+through a private wireless extension interface. The IPW2100 supports
+the following different modes:
+
+ off No power management. Radio is always on.
+ on Automatic power management
+ 1-5 Different levels of power management. The higher the
+ number the greater the power savings, but with an impact to
+ packet latencies.
+
+Power management works by powering down the radio after a certain
+interval of time has passed where no packets are passed through the
+radio. Once powered down, the radio remains in that state for a given
+period of time. For higher power savings, the interval between last
+packet processed to sleep is shorter and the sleep period is longer.
+
+When the radio is asleep, the access point sending data to the station
+must buffer packets at the AP until the station wakes up and requests
+any buffered packets. If you have an AP that does not correctly support
+the PSP protocol you may experience packet loss or very poor performance
+while power management is enabled. If this is the case, you will need
+to try and find a firmware update for your AP, or disable power
+management (via `iwconfig eth1 power off`)
+
+To configure the power level on the IPW2100 you use a combination of
+iwconfig and iwpriv. iwconfig is used to turn power management on, off,
+and set it to auto.
+
+ iwconfig eth1 power off Disables radio power down
+ iwconfig eth1 power on Enables radio power management to
+ last set level (defaults to AUTO)
+ iwpriv eth1 set_power 0 Sets power level to AUTO and enables
+ power management if not previously
+ enabled.
+ iwpriv eth1 set_power 1-5 Set the power level as specified,
+ enabling power management if not
+ previously enabled.
+
+You can view the current power level setting via:
+
+ iwpriv eth1 get_power
+
+It will return the current period or timeout that is configured as a string
+in the form of xxxx/yyyy (z) where xxxx is the timeout interval (amount of
+time after packet processing), yyyy is the period to sleep (amount of time to
+wait before powering the radio and querying the access point for buffered
+packets), and z is the 'power level'. If power management is turned off the
+xxxx/yyyy will be replaced with 'off' -- the level reported will be the active
+level if `iwconfig eth1 power on` is invoked.
+
+
+===========================
+7. Support
+---------------------------
+
+For general development information and support,
+go to:
+
+ http://ipw2100.sf.net/
+
+The ipw2100 1.1.0 driver and firmware can be downloaded from:
+
+ http://support.intel.com
+
+For installation support on the ipw2100 1.1.0 driver on Linux kernels
+2.6.8 or greater, email support is available from:
+
+ http://supportmail.intel.com
+
+===========================
+8. License
+---------------------------
+
+ Copyright(c) 2003 - 2005 Intel Corporation. All rights reserved.
+
+ This program is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License (version 2) as
+ published by the Free Software Foundation.
+
+ This program is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ more details.
+
+ You should have received a copy of the GNU General Public License along with
+ this program; if not, write to the Free Software Foundation, Inc., 59
+ Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+ The full GNU General Public License is included in this distribution in the
+ file called LICENSE.
+
+ License Contact Information:
+ James P. Ketrenos <ipw2100-admin@linux.intel.com>
+ Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
diff --git a/Documentation/networking/README.ipw2200 b/Documentation/networking/README.ipw2200
new file mode 100644
index 000000000000..6916080c5f03
--- /dev/null
+++ b/Documentation/networking/README.ipw2200
@@ -0,0 +1,300 @@
+
+Intel(R) PRO/Wireless 2915ABG Driver for Linux in support of:
+
+Intel(R) PRO/Wireless 2200BG Network Connection
+Intel(R) PRO/Wireless 2915ABG Network Connection
+
+Note: The Intel(R) PRO/Wireless 2915ABG Driver for Linux and Intel(R)
+PRO/Wireless 2200BG Driver for Linux is a unified driver that works on
+both hardware adapters listed above. In this document the Intel(R)
+PRO/Wireless 2915ABG Driver for Linux will be used to reference the
+unified driver.
+
+Copyright (C) 2004-2005, Intel Corporation
+
+README.ipw2200
+
+Version: 1.0.0
+Date : January 31, 2005
+
+
+Index
+-----------------------------------------------
+1. Introduction
+1.1. Overview of features
+1.2. Module parameters
+1.3. Wireless Extension Private Methods
+1.4. Sysfs Helper Files
+2. About the Version Numbers
+3. Support
+4. License
+
+
+1. Introduction
+-----------------------------------------------
+The following sections attempt to provide a brief introduction to using
+the Intel(R) PRO/Wireless 2915ABG Driver for Linux.
+
+This document is not meant to be a comprehensive manual on
+understanding or using wireless technologies, but should be sufficient
+to get you moving without wires on Linux.
+
+For information on building and installing the driver, see the INSTALL
+file.
+
+
+1.1. Overview of Features
+-----------------------------------------------
+The current release (1.0.0) supports the following features:
+
++ BSS mode (Infrastructure, Managed)
++ IBSS mode (Ad-Hoc)
++ WEP (OPEN and SHARED KEY mode)
++ 802.1x EAP via wpa_supplicant and xsupplicant
++ Wireless Extension support
++ Full B and G rate support (2200 and 2915)
++ Full A rate support (2915 only)
++ Transmit power control
++ S state support (ACPI suspend/resume)
++ long/short preamble support
+
+
+
+1.2. Command Line Parameters
+-----------------------------------------------
+
+Like many modules used in the Linux kernel, the Intel(R) PRO/Wireless
+2915ABG Driver for Linux allows certain configuration options to be
+provided as module parameters. The most common way to specify a module
+parameter is via the command line.
+
+The general form is:
+
+% modprobe ipw2200 parameter=value
+
+Where the supported parameter are:
+
+ associate
+ Set to 0 to disable the auto scan-and-associate functionality of the
+ driver. If disabled, the driver will not attempt to scan
+ for and associate to a network until it has been configured with
+ one or more properties for the target network, for example configuring
+ the network SSID. Default is 1 (auto-associate)
+
+ Example: % modprobe ipw2200 associate=0
+
+ auto_create
+ Set to 0 to disable the auto creation of an Ad-Hoc network
+ matching the channel and network name parameters provided.
+ Default is 1.
+
+ channel
+ channel number for association. The normal method for setting
+ the channel would be to use the standard wireless tools
+ (i.e. `iwconfig eth1 channel 10`), but it is useful sometimes
+ to set this while debugging. Channel 0 means 'ANY'
+
+ debug
+ If using a debug build, this is used to control the amount of debug
+ info is logged. See the 'dval' and 'load' script for more info on
+ how to use this (the dval and load scripts are provided as part
+ of the ipw2200 development snapshot releases available from the
+ SourceForge project at http://ipw2200.sf.net)
+
+ mode
+ Can be used to set the default mode of the adapter.
+ 0 = Managed, 1 = Ad-Hoc
+
+
+1.3. Wireless Extension Private Methods
+-----------------------------------------------
+
+As an interface designed to handle generic hardware, there are certain
+capabilities not exposed through the normal Wireless Tool interface. As
+such, a provision is provided for a driver to declare custom, or
+private, methods. The Intel(R) PRO/Wireless 2915ABG Driver for Linux
+defines several of these to configure various settings.
+
+The general form of using the private wireless methods is:
+
+ % iwpriv $IFNAME method parameters
+
+Where $IFNAME is the interface name the device is registered with
+(typically eth1, customized via one of the various network interface
+name managers, such as ifrename)
+
+The supported private methods are:
+
+ get_mode
+ Can be used to report out which IEEE mode the driver is
+ configured to support. Example:
+
+ % iwpriv eth1 get_mode
+ eth1 get_mode:802.11bg (6)
+
+ set_mode
+ Can be used to configure which IEEE mode the driver will
+ support.
+
+ Usage:
+ % iwpriv eth1 set_mode {mode}
+ Where {mode} is a number in the range 1-7:
+ 1 802.11a (2915 only)
+ 2 802.11b
+ 3 802.11ab (2915 only)
+ 4 802.11g
+ 5 802.11ag (2915 only)
+ 6 802.11bg
+ 7 802.11abg (2915 only)
+
+ get_preamble
+ Can be used to report configuration of preamble length.
+
+ set_preamble
+ Can be used to set the configuration of preamble length:
+
+ Usage:
+ % iwpriv eth1 set_preamble {mode}
+ Where {mode} is one of:
+ 1 Long preamble only
+ 0 Auto (long or short based on connection)
+
+
+1.4. Sysfs Helper Files:
+-----------------------------------------------
+
+The Linux kernel provides a pseudo file system that can be used to
+access various components of the operating system. The Intel(R)
+PRO/Wireless 2915ABG Driver for Linux exposes several configuration
+parameters through this mechanism.
+
+An entry in the sysfs can support reading and/or writing. You can
+typically query the contents of a sysfs entry through the use of cat,
+and can set the contents via echo. For example:
+
+% cat /sys/bus/pci/drivers/ipw2200/debug_level
+
+Will report the current debug level of the driver's logging subsystem
+(only available if CONFIG_IPW_DEBUG was configured when the driver was
+built).
+
+You can set the debug level via:
+
+% echo $VALUE > /sys/bus/pci/drivers/ipw2200/debug_level
+
+Where $VALUE would be a number in the case of this sysfs entry. The
+input to sysfs files does not have to be a number. For example, the
+firmware loader used by hotplug utilizes sysfs entries for transferring
+the firmware image from user space into the driver.
+
+The Intel(R) PRO/Wireless 2915ABG Driver for Linux exposes sysfs entries
+at two levels -- driver level, which apply to all instances of the
+driver (in the event that there are more than one device installed) and
+device level, which applies only to the single specific instance.
+
+
+1.4.1 Driver Level Sysfs Helper Files
+-----------------------------------------------
+
+For the driver level files, look in /sys/bus/pci/drivers/ipw2200/
+
+ debug_level
+
+ This controls the same global as the 'debug' module parameter
+
+
+1.4.2 Device Level Sysfs Helper Files
+-----------------------------------------------
+
+For the device level files, look in
+
+ /sys/bus/pci/drivers/ipw2200/{PCI-ID}/
+
+For example:
+ /sys/bus/pci/drivers/ipw2200/0000:02:01.0
+
+For the device level files, see /sys/bus/pci/[drivers/ipw2200:
+
+ rf_kill
+ read -
+ 0 = RF kill not enabled (radio on)
+ 1 = SW based RF kill active (radio off)
+ 2 = HW based RF kill active (radio off)
+ 3 = Both HW and SW RF kill active (radio off)
+ write -
+ 0 = If SW based RF kill active, turn the radio back on
+ 1 = If radio is on, activate SW based RF kill
+
+ NOTE: If you enable the SW based RF kill and then toggle the HW
+ based RF kill from ON -> OFF -> ON, the radio will NOT come back on
+
+ ucode
+ read-only access to the ucode version number
+
+
+2. About the Version Numbers
+-----------------------------------------------
+
+Due to the nature of open source development projects, there are
+frequently changes being incorporated that have not gone through
+a complete validation process. These changes are incorporated into
+development snapshot releases.
+
+Releases are numbered with a three level scheme:
+
+ major.minor.development
+
+Any version where the 'development' portion is 0 (for example
+1.0.0, 1.1.0, etc.) indicates a stable version that will be made
+available for kernel inclusion.
+
+Any version where the 'development' portion is not a 0 (for
+example 1.0.1, 1.1.5, etc.) indicates a development version that is
+being made available for testing and cutting edge users. The stability
+and functionality of the development releases are not know. We make
+efforts to try and keep all snapshots reasonably stable, but due to the
+frequency of their release, and the desire to get those releases
+available as quickly as possible, unknown anomalies should be expected.
+
+The major version number will be incremented when significant changes
+are made to the driver. Currently, there are no major changes planned.
+
+
+3. Support
+-----------------------------------------------
+
+For installation support of the 1.0.0 version, you can contact
+http://supportmail.intel.com, or you can use the open source project
+support.
+
+For general information and support, go to:
+
+ http://ipw2200.sf.net/
+
+
+4. License
+-----------------------------------------------
+
+ Copyright(c) 2003 - 2005 Intel Corporation. All rights reserved.
+
+ This program is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License version 2 as
+ published by the Free Software Foundation.
+
+ This program is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ more details.
+
+ You should have received a copy of the GNU General Public License along with
+ this program; if not, write to the Free Software Foundation, Inc., 59
+ Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+ The full GNU General Public License is included in this distribution in the
+ file called LICENSE.
+
+ Contact Information:
+ James P. Ketrenos <ipw2100-admin@linux.intel.com>
+ Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 0bc2ed136a38..24d029455baa 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -1,5 +1,7 @@
- Linux Ethernet Bonding Driver HOWTO
+ Linux Ethernet Bonding Driver HOWTO
+
+ Latest update: 21 June 2005
Initial release : Thomas Davis <tadavis at lbl.gov>
Corrections, HA extensions : 2000/10/03-15 :
@@ -11,15 +13,22 @@ Corrections, HA extensions : 2000/10/03-15 :
Reorganized and updated Feb 2005 by Jay Vosburgh
-Note :
-------
+Introduction
+============
+
+ The Linux bonding driver provides a method for aggregating
+multiple network interfaces into a single logical "bonded" interface.
+The behavior of the bonded interfaces depends upon the mode; generally
+speaking, modes provide either hot standby or load balancing services.
+Additionally, link integrity monitoring may be performed.
-The bonding driver originally came from Donald Becker's beowulf patches for
-kernel 2.0. It has changed quite a bit since, and the original tools from
-extreme-linux and beowulf sites will not work with this version of the driver.
+ The bonding driver originally came from Donald Becker's
+beowulf patches for kernel 2.0. It has changed quite a bit since, and
+the original tools from extreme-linux and beowulf sites will not work
+with this version of the driver.
-For new versions of the driver, patches for older kernels and the updated
-userspace tools, please follow the links at the end of this file.
+ For new versions of the driver, updated userspace tools, and
+who to ask for help, please follow the links at the end of this file.
Table of Contents
=================
@@ -30,9 +39,13 @@ Table of Contents
3. Configuring Bonding Devices
3.1 Configuration with sysconfig support
+3.1.1 Using DHCP with sysconfig
+3.1.2 Configuring Multiple Bonds with sysconfig
3.2 Configuration with initscripts support
+3.2.1 Using DHCP with initscripts
+3.2.2 Configuring Multiple Bonds with initscripts
3.3 Configuring Bonding Manually
-3.4 Configuring Multiple Bonds
+3.3.1 Configuring Multiple Bonds Manually
5. Querying Bonding Configuration
5.1 Bonding Configuration
@@ -56,21 +69,30 @@ Table of Contents
11. Promiscuous mode
-12. High Availability Information
+12. Configuring Bonding for High Availability
12.1 High Availability in a Single Switch Topology
-12.1.1 Bonding Mode Selection for Single Switch Topology
-12.1.2 Link Monitoring for Single Switch Topology
12.2 High Availability in a Multiple Switch Topology
-12.2.1 Bonding Mode Selection for Multiple Switch Topology
-12.2.2 Link Monitoring for Multiple Switch Topology
-12.3 Switch Behavior Issues for High Availability
+12.2.1 HA Bonding Mode Selection for Multiple Switch Topology
+12.2.2 HA Link Monitoring for Multiple Switch Topology
+
+13. Configuring Bonding for Maximum Throughput
+13.1 Maximum Throughput in a Single Switch Topology
+13.1.1 MT Bonding Mode Selection for Single Switch Topology
+13.1.2 MT Link Monitoring for Single Switch Topology
+13.2 Maximum Throughput in a Multiple Switch Topology
+13.2.1 MT Bonding Mode Selection for Multiple Switch Topology
+13.2.2 MT Link Monitoring for Multiple Switch Topology
-13. Hardware Specific Considerations
-13.1 IBM BladeCenter
+14. Switch Behavior Issues
+14.1 Link Establishment and Failover Delays
+14.2 Duplicated Incoming Packets
-14. Frequently Asked Questions
+15. Hardware Specific Considerations
+15.1 IBM BladeCenter
-15. Resources and Links
+16. Frequently Asked Questions
+
+17. Resources and Links
1. Bonding Driver Installation
@@ -86,16 +108,10 @@ the following steps:
1.1 Configure and build the kernel with bonding
-----------------------------------------------
- The latest version of the bonding driver is available in the
+ The current version of the bonding driver is available in the
drivers/net/bonding subdirectory of the most recent kernel source
-(which is available on http://kernel.org).
-
- Prior to the 2.4.11 kernel, the bonding driver was maintained
-largely outside the kernel tree; patches for some earlier kernels are
-available on the bonding sourceforge site, although those patches are
-still several years out of date. Most users will want to use either
-the most recent kernel from kernel.org or whatever kernel came with
-their distro.
+(which is available on http://kernel.org). Most users "rolling their
+own" will want to use the most recent kernel from kernel.org.
Configure kernel with "make menuconfig" (or "make xconfig" or
"make config"), then select "Bonding driver support" in the "Network
@@ -103,8 +119,8 @@ device support" section. It is recommended that you configure the
driver as module since it is currently the only way to pass parameters
to the driver or configure more than one bonding device.
- Build and install the new kernel and modules, then proceed to
-step 2.
+ Build and install the new kernel and modules, then continue
+below to install ifenslave.
1.2 Install ifenslave Control Utility
-------------------------------------
@@ -147,9 +163,9 @@ default kernel source include directory.
Options for the bonding driver are supplied as parameters to
the bonding module at load time. They may be given as command line
arguments to the insmod or modprobe command, but are usually specified
-in either the /etc/modprobe.conf configuration file, or in a
-distro-specific configuration file (some of which are detailed in the
-next section).
+in either the /etc/modules.conf or /etc/modprobe.conf configuration
+file, or in a distro-specific configuration file (some of which are
+detailed in the next section).
The available bonding driver parameters are listed below. If a
parameter is not specified the default value is used. When initially
@@ -162,34 +178,34 @@ degradation will occur during link failures. Very few devices do not
support at least miimon, so there is really no reason not to use it.
Options with textual values will accept either the text name
- or, for backwards compatibility, the option value. E.g.,
- "mode=802.3ad" and "mode=4" set the same mode.
+or, for backwards compatibility, the option value. E.g.,
+"mode=802.3ad" and "mode=4" set the same mode.
The parameters are as follows:
arp_interval
- Specifies the ARP monitoring frequency in milli-seconds. If
- ARP monitoring is used in a load-balancing mode (mode 0 or 2),
- the switch should be configured in a mode that evenly
- distributes packets across all links - such as round-robin. If
- the switch is configured to distribute the packets in an XOR
+ Specifies the ARP link monitoring frequency in milliseconds.
+ If ARP monitoring is used in an etherchannel compatible mode
+ (modes 0 and 2), the switch should be configured in a mode
+ that evenly distributes packets across all links. If the
+ switch is configured to distribute the packets in an XOR
fashion, all replies from the ARP targets will be received on
the same link which could cause the other team members to
- fail. ARP monitoring should not be used in conjunction with
- miimon. A value of 0 disables ARP monitoring. The default
+ fail. ARP monitoring should not be used in conjunction with
+ miimon. A value of 0 disables ARP monitoring. The default
value is 0.
arp_ip_target
- Specifies the ip addresses to use when arp_interval is > 0.
- These are the targets of the ARP request sent to determine the
- health of the link to the targets. Specify these values in
- ddd.ddd.ddd.ddd format. Multiple ip adresses must be
- seperated by a comma. At least one IP address must be given
- for ARP monitoring to function. The maximum number of targets
- that can be specified is 16. The default value is no IP
- addresses.
+ Specifies the IP addresses to use as ARP monitoring peers when
+ arp_interval is > 0. These are the targets of the ARP request
+ sent to determine the health of the link to the targets.
+ Specify these values in ddd.ddd.ddd.ddd format. Multiple IP
+ addresses must be separated by a comma. At least one IP
+ address must be given for ARP monitoring to function. The
+ maximum number of targets that can be specified is 16. The
+ default value is no IP addresses.
downdelay
@@ -207,11 +223,13 @@ lacp_rate
are:
slow or 0
- Request partner to transmit LACPDUs every 30 seconds (default)
+ Request partner to transmit LACPDUs every 30 seconds
fast or 1
Request partner to transmit LACPDUs every 1 second
+ The default is slow.
+
max_bonds
Specifies the number of bonding devices to create for this
@@ -221,10 +239,11 @@ max_bonds
miimon
- Specifies the frequency in milli-seconds that MII link
- monitoring will occur. A value of zero disables MII link
- monitoring. A value of 100 is a good starting point. The
- use_carrier option, below, affects how the link state is
+ Specifies the MII link monitoring frequency in milliseconds.
+ This determines how often the link state of each slave is
+ inspected for link failures. A value of zero disables MII
+ link monitoring. A value of 100 is a good starting point.
+ The use_carrier option, below, affects how the link state is
determined. See the High Availability section for additional
information. The default value is 0.
@@ -246,17 +265,31 @@ mode
active. A different slave becomes active if, and only
if, the active slave fails. The bond's MAC address is
externally visible on only one port (network adapter)
- to avoid confusing the switch. This mode provides
- fault tolerance. The primary option affects the
- behavior of this mode.
+ to avoid confusing the switch.
+
+ In bonding version 2.6.2 or later, when a failover
+ occurs in active-backup mode, bonding will issue one
+ or more gratuitous ARPs on the newly active slave.
+ One gratutious ARP is issued for the bonding master
+ interface and each VLAN interfaces configured above
+ it, provided that the interface has at least one IP
+ address configured. Gratuitous ARPs issued for VLAN
+ interfaces are tagged with the appropriate VLAN id.
+
+ This mode provides fault tolerance. The primary
+ option, documented below, affects the behavior of this
+ mode.
balance-xor or 2
- XOR policy: Transmit based on [(source MAC address
- XOR'd with destination MAC address) modulo slave
- count]. This selects the same slave for each
- destination MAC address. This mode provides load
- balancing and fault tolerance.
+ XOR policy: Transmit based on the selected transmit
+ hash policy. The default policy is a simple [(source
+ MAC address XOR'd with destination MAC address) modulo
+ slave count]. Alternate transmit policies may be
+ selected via the xmit_hash_policy option, described
+ below.
+
+ This mode provides load balancing and fault tolerance.
broadcast or 3
@@ -270,7 +303,17 @@ mode
duplex settings. Utilizes all slaves in the active
aggregator according to the 802.3ad specification.
- Pre-requisites:
+ Slave selection for outgoing traffic is done according
+ to the transmit hash policy, which may be changed from
+ the default simple XOR policy via the xmit_hash_policy
+ option, documented below. Note that not all transmit
+ policies may be 802.3ad compliant, particularly in
+ regards to the packet mis-ordering requirements of
+ section 43.2.4 of the 802.3ad standard. Differing
+ peer implementations will have varying tolerances for
+ noncompliance.
+
+ Prerequisites:
1. Ethtool support in the base drivers for retrieving
the speed and duplex of each slave.
@@ -333,7 +376,7 @@ mode
When a link is reconnected or a new slave joins the
bond the receive traffic is redistributed among all
- active slaves in the bond by intiating ARP Replies
+ active slaves in the bond by initiating ARP Replies
with the selected mac address to each of the
clients. The updelay parameter (detailed below) must
be set to a value equal or greater than the switch's
@@ -396,6 +439,60 @@ use_carrier
0 will use the deprecated MII / ETHTOOL ioctls. The default
value is 1.
+xmit_hash_policy
+
+ Selects the transmit hash policy to use for slave selection in
+ balance-xor and 802.3ad modes. Possible values are:
+
+ layer2
+
+ Uses XOR of hardware MAC addresses to generate the
+ hash. The formula is
+
+ (source MAC XOR destination MAC) modulo slave count
+
+ This algorithm will place all traffic to a particular
+ network peer on the same slave.
+
+ This algorithm is 802.3ad compliant.
+
+ layer3+4
+
+ This policy uses upper layer protocol information,
+ when available, to generate the hash. This allows for
+ traffic to a particular network peer to span multiple
+ slaves, although a single connection will not span
+ multiple slaves.
+
+ The formula for unfragmented TCP and UDP packets is
+
+ ((source port XOR dest port) XOR
+ ((source IP XOR dest IP) AND 0xffff)
+ modulo slave count
+
+ For fragmented TCP or UDP packets and all other IP
+ protocol traffic, the source and destination port
+ information is omitted. For non-IP traffic, the
+ formula is the same as for the layer2 transmit hash
+ policy.
+
+ This policy is intended to mimic the behavior of
+ certain switches, notably Cisco switches with PFC2 as
+ well as some Foundry and IBM products.
+
+ This algorithm is not fully 802.3ad compliant. A
+ single TCP or UDP conversation containing both
+ fragmented and unfragmented packets will see packets
+ striped across two interfaces. This may result in out
+ of order delivery. Most traffic types will not meet
+ this criteria, as TCP rarely fragments traffic, and
+ most UDP traffic is not involved in extended
+ conversations. Other implementations of 802.3ad may
+ or may not tolerate this noncompliance.
+
+ The default value is layer2. This option was added in bonding
+version 2.6.3. In earlier versions of bonding, this parameter does
+not exist, and the layer2 policy is the only policy.
3. Configuring Bonding Devices
@@ -448,8 +545,9 @@ Bonding devices can be managed by hand, however, as follows.
slave devices. On SLES 9, this is most easily done by running the
yast2 sysconfig configuration utility. The goal is for to create an
ifcfg-id file for each slave device. The simplest way to accomplish
-this is to configure the devices for DHCP. The name of the
-configuration file for each device will be of the form:
+this is to configure the devices for DHCP (this is only to get the
+file ifcfg-id file created; see below for some issues with DHCP). The
+name of the configuration file for each device will be of the form:
ifcfg-id-xx:xx:xx:xx:xx:xx
@@ -459,7 +557,7 @@ the device's permanent MAC address.
Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been
created, it is necessary to edit the configuration files for the slave
devices (the MAC addresses correspond to those of the slave devices).
-Before editing, the file will contain muliple lines, and will look
+Before editing, the file will contain multiple lines, and will look
something like this:
BOOTPROTO='dhcp'
@@ -496,16 +594,11 @@ STARTMODE="onboot"
BONDING_MASTER="yes"
BONDING_MODULE_OPTS="mode=active-backup miimon=100"
BONDING_SLAVE0="eth0"
-BONDING_SLAVE1="eth1"
+BONDING_SLAVE1="bus-pci-0000:06:08.1"
Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK
values with the appropriate values for your network.
- Note that configuring the bonding device with BOOTPROTO='dhcp'
-does not work; the scripts attempt to obtain the device address from
-DHCP prior to adding any of the slave devices. Without active slaves,
-the DHCP requests are not sent to the network.
-
The STARTMODE specifies when the device is brought online.
The possible values are:
@@ -531,9 +624,17 @@ for the bonding mode, link monitoring, and so on here. Do not include
the max_bonds bonding parameter; this will confuse the configuration
system if you have multiple bonding devices.
- Finally, supply one BONDING_SLAVEn="ethX" for each slave,
-where "n" is an increasing value, one for each slave, and "ethX" is
-the name of the slave device (eth0, eth1, etc).
+ Finally, supply one BONDING_SLAVEn="slave device" for each
+slave. where "n" is an increasing value, one for each slave. The
+"slave device" is either an interface name, e.g., "eth0", or a device
+specifier for the network device. The interface name is easier to
+find, but the ethN names are subject to change at boot time if, e.g.,
+a device early in the sequence has failed. The device specifiers
+(bus-pci-0000:06:08.1 in the example above) specify the physical
+network device, and will not change unless the device's bus location
+changes (for example, it is moved from one PCI slot to another). The
+example above uses one of each type for demonstration purposes; most
+configurations will choose one or the other for all slave devices.
When all configuration files have been modified or created,
networking must be restarted for the configuration changes to take
@@ -544,7 +645,7 @@ effect. This can be accomplished via the following:
Note that the network control script (/sbin/ifdown) will
remove the bonding module as part of the network shutdown processing,
so it is not necessary to remove the module by hand if, e.g., the
-module paramters have changed.
+module parameters have changed.
Also, at this writing, YaST/YaST2 will not manage bonding
devices (they do not show bonding interfaces on its list of network
@@ -559,12 +660,37 @@ format can be found in an example ifcfg template file:
Note that the template does not document the various BONDING_
settings described above, but does describe many of the other options.
+3.1.1 Using DHCP with sysconfig
+-------------------------------
+
+ Under sysconfig, configuring a device with BOOTPROTO='dhcp'
+will cause it to query DHCP for its IP address information. At this
+writing, this does not function for bonding devices; the scripts
+attempt to obtain the device address from DHCP prior to adding any of
+the slave devices. Without active slaves, the DHCP requests are not
+sent to the network.
+
+3.1.2 Configuring Multiple Bonds with sysconfig
+-----------------------------------------------
+
+ The sysconfig network initialization system is capable of
+handling multiple bonding devices. All that is necessary is for each
+bonding instance to have an appropriately configured ifcfg-bondX file
+(as described above). Do not specify the "max_bonds" parameter to any
+instance of bonding, as this will confuse sysconfig. If you require
+multiple bonding devices with identical parameters, create multiple
+ifcfg-bondX files.
+
+ Because the sysconfig scripts supply the bonding module
+options in the ifcfg-bondX file, it is not necessary to add them to
+the system /etc/modules.conf or /etc/modprobe.conf configuration file.
+
3.2 Configuration with initscripts support
------------------------------------------
This section applies to distros using a version of initscripts
with bonding support, for example, Red Hat Linux 9 or Red Hat
-Enterprise Linux version 3. On these systems, the network
+Enterprise Linux version 3 or 4. On these systems, the network
initialization scripts have some knowledge of bonding, and can be
configured to control bonding devices.
@@ -614,10 +740,11 @@ USERCTL=no
Be sure to change the networking specific lines (IPADDR,
NETMASK, NETWORK and BROADCAST) to match your network configuration.
- Finally, it is necessary to edit /etc/modules.conf to load the
-bonding module when the bond0 interface is brought up. The following
-sample lines in /etc/modules.conf will load the bonding module, and
-select its options:
+ Finally, it is necessary to edit /etc/modules.conf (or
+/etc/modprobe.conf, depending upon your distro) to load the bonding
+module with your desired options when the bond0 interface is brought
+up. The following lines in /etc/modules.conf (or modprobe.conf) will
+load the bonding module, and select its options:
alias bond0 bonding
options bond0 mode=balance-alb miimon=100
@@ -629,6 +756,33 @@ options for your configuration.
will restart the networking subsystem and your bond link should be now
up and running.
+3.2.1 Using DHCP with initscripts
+---------------------------------
+
+ Recent versions of initscripts (the version supplied with
+Fedora Core 3 and Red Hat Enterprise Linux 4 is reported to work) do
+have support for assigning IP information to bonding devices via DHCP.
+
+ To configure bonding for DHCP, configure it as described
+above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp"
+and add a line consisting of "TYPE=Bonding". Note that the TYPE value
+is case sensitive.
+
+3.2.2 Configuring Multiple Bonds with initscripts
+-------------------------------------------------
+
+ At this writing, the initscripts package does not directly
+support loading the bonding driver multiple times, so the process for
+doing so is the same as described in the "Configuring Multiple Bonds
+Manually" section, below.
+
+ NOTE: It has been observed that some Red Hat supplied kernels
+are apparently unable to rename modules at load time (the "-obonding1"
+part). Attempts to pass that option to modprobe will produce an
+"Operation not permitted" error. This has been reported on some
+Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels
+exhibiting this problem, it will be impossible to configure multiple
+bonds with differing parameters.
3.3 Configuring Bonding Manually
--------------------------------
@@ -638,10 +792,11 @@ scripts (the sysconfig or initscripts package) do not have specific
knowledge of bonding. One such distro is SuSE Linux Enterprise Server
version 8.
- The general methodology for these systems is to place the
-bonding module parameters into /etc/modprobe.conf, then add modprobe
-and/or ifenslave commands to the system's global init script. The
-name of the global init script differs; for sysconfig, it is
+ The general method for these systems is to place the bonding
+module parameters into /etc/modules.conf or /etc/modprobe.conf (as
+appropriate for the installed distro), then add modprobe and/or
+ifenslave commands to the system's global init script. The name of
+the global init script differs; for sysconfig, it is
/etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local.
For example, if you wanted to make a simple bond of two e100
@@ -649,7 +804,7 @@ devices (presumed to be eth0 and eth1), and have it persist across
reboots, edit the appropriate file (/etc/init.d/boot.local or
/etc/rc.d/rc.local), and add the following:
-modprobe bonding -obond0 mode=balance-alb miimon=100
+modprobe bonding mode=balance-alb miimon=100
modprobe e100
ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
ifenslave bond0 eth0
@@ -657,11 +812,7 @@ ifenslave bond0 eth1
Replace the example bonding module parameters and bond0
network configuration (IP address, netmask, etc) with the appropriate
-values for your configuration. The above example loads the bonding
-module with the name "bond0," this simplifies the naming if multiple
-bonding modules are loaded (each successive instance of the module is
-given a different name, and the module instance names match the
-bonding interface names).
+values for your configuration.
Unfortunately, this method will not provide support for the
ifup and ifdown scripts on the bond devices. To reload the bonding
@@ -684,20 +835,23 @@ appropriate device driver modules. For our example above, you can do
the following:
# ifconfig bond0 down
-# rmmod bond0
+# rmmod bonding
# rmmod e100
Again, for convenience, it may be desirable to create a script
with these commands.
-3.4 Configuring Multiple Bonds
-------------------------------
+3.3.1 Configuring Multiple Bonds Manually
+-----------------------------------------
This section contains information on configuring multiple
-bonding devices with differing options. If you require multiple
-bonding devices, but all with the same options, see the "max_bonds"
-module paramter, documented above.
+bonding devices with differing options for those systems whose network
+initialization scripts lack support for configuring multiple bonds.
+
+ If you require multiple bonding devices, but all with the same
+options, you may wish to use the "max_bonds" module parameter,
+documented above.
To create multiple bonding devices with differing options, it
is necessary to load the bonding driver multiple times. Note that
@@ -724,11 +878,16 @@ named "bond0" and creates the bond0 device in balance-rr mode with an
miimon of 100. The second instance is named "bond1" and creates the
bond1 device in balance-alb mode with an miimon of 50.
+ In some circumstances (typically with older distributions),
+the above does not work, and the second bonding instance never sees
+its options. In that case, the second options line can be substituted
+as follows:
+
+install bonding1 /sbin/modprobe bonding -obond1 mode=balance-alb miimon=50
+
This may be repeated any number of times, specifying a new and
-unique name in place of bond0 or bond1 for each instance.
+unique name in place of bond1 for each subsequent instance.
- When the appropriate module paramters are in place, then
-configure bonding according to the instructions for your distro.
5. Querying Bonding Configuration
=================================
@@ -846,8 +1005,8 @@ tagged internally by bonding itself. As a result, bonding must
self generated packets.
For reasons of simplicity, and to support the use of adapters
-that can do VLAN hardware acceleration offloding, the bonding
-interface declares itself as fully hardware offloaing capable, it gets
+that can do VLAN hardware acceleration offloading, the bonding
+interface declares itself as fully hardware offloading capable, it gets
the add_vid/kill_vid notifications to gather the necessary
information, and it propagates those actions to the slaves. In case
of mixed adapter types, hardware accelerated tagged packets that
@@ -880,7 +1039,7 @@ bond interface:
matches the hardware address of the VLAN interfaces.
Note that changing a VLAN interface's HW address would set the
-underlying device -- i.e. the bonding interface -- to promiscouos
+underlying device -- i.e. the bonding interface -- to promiscuous
mode, which might not be what you want.
@@ -923,7 +1082,7 @@ down or have a problem making it unresponsive to ARP requests. Having
an additional target (or several) increases the reliability of the ARP
monitoring.
- Multiple ARP targets must be seperated by commas as follows:
+ Multiple ARP targets must be separated by commas as follows:
# example options for ARP monitoring with three targets
alias bond0 bonding
@@ -1045,7 +1204,7 @@ install bonding /sbin/modprobe tg3; /sbin/modprobe e1000;
This will, when loading the bonding module, rather than
performing the normal action, instead execute the provided command.
This command loads the device drivers in the order needed, then calls
-modprobe with --ingore-install to cause the normal action to then take
+modprobe with --ignore-install to cause the normal action to then take
place. Full documentation on this can be found in the modprobe.conf
and modprobe manual pages.
@@ -1130,14 +1289,14 @@ association.
common to enable promiscuous mode on the device, so that all traffic
is seen (instead of seeing only traffic destined for the local host).
The bonding driver handles promiscuous mode changes to the bonding
-master device (e.g., bond0), and propogates the setting to the slave
+master device (e.g., bond0), and propagates the setting to the slave
devices.
For the balance-rr, balance-xor, broadcast, and 802.3ad modes,
-the promiscuous mode setting is propogated to all slaves.
+the promiscuous mode setting is propagated to all slaves.
For the active-backup, balance-tlb and balance-alb modes, the
-promiscuous mode setting is propogated only to the active slave.
+promiscuous mode setting is propagated only to the active slave.
For balance-tlb mode, the active slave is the slave currently
receiving inbound traffic.
@@ -1148,46 +1307,182 @@ sending to peers that are unassigned or if the load is unbalanced.
For the active-backup, balance-tlb and balance-alb modes, when
the active slave changes (e.g., due to a link failure), the
-promiscuous setting will be propogated to the new active slave.
+promiscuous setting will be propagated to the new active slave.
-12. High Availability Information
-=================================
+12. Configuring Bonding for High Availability
+=============================================
High Availability refers to configurations that provide
maximum network availability by having redundant or backup devices,
-links and switches between the host and the rest of the world.
-
- There are currently two basic methods for configuring to
-maximize availability. They are dependent on the network topology and
-the primary goal of the configuration, but in general, a configuration
-can be optimized for maximum available bandwidth, or for maximum
-network availability.
+links or switches between the host and the rest of the world. The
+goal is to provide the maximum availability of network connectivity
+(i.e., the network always works), even though other configurations
+could provide higher throughput.
12.1 High Availability in a Single Switch Topology
--------------------------------------------------
- If two hosts (or a host and a switch) are directly connected
-via multiple physical links, then there is no network availability
-penalty for optimizing for maximum bandwidth: there is only one switch
-(or peer), so if it fails, you have no alternative access to fail over
-to.
+ If two hosts (or a host and a single switch) are directly
+connected via multiple physical links, then there is no availability
+penalty to optimizing for maximum bandwidth. In this case, there is
+only one switch (or peer), so if it fails, there is no alternative
+access to fail over to. Additionally, the bonding load balance modes
+support link monitoring of their members, so if individual links fail,
+the load will be rebalanced across the remaining devices.
+
+ See Section 13, "Configuring Bonding for Maximum Throughput"
+for information on configuring bonding with one peer device.
+
+12.2 High Availability in a Multiple Switch Topology
+----------------------------------------------------
+
+ With multiple switches, the configuration of bonding and the
+network changes dramatically. In multiple switch topologies, there is
+a trade off between network availability and usable bandwidth.
+
+ Below is a sample network, configured to maximize the
+availability of the network:
-Example 1 : host to switch (or other host)
+ | |
+ |port3 port3|
+ +-----+----+ +-----+----+
+ | |port2 ISL port2| |
+ | switch A +--------------------------+ switch B |
+ | | | |
+ +-----+----+ +-----++---+
+ |port1 port1|
+ | +-------+ |
+ +-------------+ host1 +---------------+
+ eth0 +-------+ eth1
- +----------+ +----------+
- | |eth0 eth0| switch |
- | Host A +--------------------------+ or |
- | +--------------------------+ other |
- | |eth1 eth1| host |
- +----------+ +----------+
+ In this configuration, there is a link between the two
+switches (ISL, or inter switch link), and multiple ports connecting to
+the outside world ("port3" on each switch). There is no technical
+reason that this could not be extended to a third switch.
+12.2.1 HA Bonding Mode Selection for Multiple Switch Topology
+-------------------------------------------------------------
-12.1.1 Bonding Mode Selection for single switch topology
---------------------------------------------------------
+ In a topology such as the example above, the active-backup and
+broadcast modes are the only useful bonding modes when optimizing for
+availability; the other modes require all links to terminate on the
+same peer for them to behave rationally.
+
+active-backup: This is generally the preferred mode, particularly if
+ the switches have an ISL and play together well. If the
+ network configuration is such that one switch is specifically
+ a backup switch (e.g., has lower capacity, higher cost, etc),
+ then the primary option can be used to insure that the
+ preferred link is always used when it is available.
+
+broadcast: This mode is really a special purpose mode, and is suitable
+ only for very specific needs. For example, if the two
+ switches are not connected (no ISL), and the networks beyond
+ them are totally independent. In this case, if it is
+ necessary for some specific one-way traffic to reach both
+ independent networks, then the broadcast mode may be suitable.
+
+12.2.2 HA Link Monitoring Selection for Multiple Switch Topology
+----------------------------------------------------------------
+
+ The choice of link monitoring ultimately depends upon your
+switch. If the switch can reliably fail ports in response to other
+failures, then either the MII or ARP monitors should work. For
+example, in the above example, if the "port3" link fails at the remote
+end, the MII monitor has no direct means to detect this. The ARP
+monitor could be configured with a target at the remote end of port3,
+thus detecting that failure without switch support.
+
+ In general, however, in a multiple switch topology, the ARP
+monitor can provide a higher level of reliability in detecting end to
+end connectivity failures (which may be caused by the failure of any
+individual component to pass traffic for any reason). Additionally,
+the ARP monitor should be configured with multiple targets (at least
+one for each switch in the network). This will insure that,
+regardless of which switch is active, the ARP monitor has a suitable
+target to query.
+
+
+13. Configuring Bonding for Maximum Throughput
+==============================================
+
+13.1 Maximizing Throughput in a Single Switch Topology
+------------------------------------------------------
+
+ In a single switch configuration, the best method to maximize
+throughput depends upon the application and network environment. The
+various load balancing modes each have strengths and weaknesses in
+different environments, as detailed below.
+
+ For this discussion, we will break down the topologies into
+two categories. Depending upon the destination of most traffic, we
+categorize them into either "gatewayed" or "local" configurations.
+
+ In a gatewayed configuration, the "switch" is acting primarily
+as a router, and the majority of traffic passes through this router to
+other networks. An example would be the following:
+
+
+ +----------+ +----------+
+ | |eth0 port1| | to other networks
+ | Host A +---------------------+ router +------------------->
+ | +---------------------+ | Hosts B and C are out
+ | |eth1 port2| | here somewhere
+ +----------+ +----------+
+
+ The router may be a dedicated router device, or another host
+acting as a gateway. For our discussion, the important point is that
+the majority of traffic from Host A will pass through the router to
+some other network before reaching its final destination.
+
+ In a gatewayed network configuration, although Host A may
+communicate with many other systems, all of its traffic will be sent
+and received via one other peer on the local network, the router.
+
+ Note that the case of two systems connected directly via
+multiple physical links is, for purposes of configuring bonding, the
+same as a gatewayed configuration. In that case, it happens that all
+traffic is destined for the "gateway" itself, not some other network
+beyond the gateway.
+
+ In a local configuration, the "switch" is acting primarily as
+a switch, and the majority of traffic passes through this switch to
+reach other stations on the same network. An example would be the
+following:
+
+ +----------+ +----------+ +--------+
+ | |eth0 port1| +-------+ Host B |
+ | Host A +------------+ switch |port3 +--------+
+ | +------------+ | +--------+
+ | |eth1 port2| +------------------+ Host C |
+ +----------+ +----------+port4 +--------+
+
+
+ Again, the switch may be a dedicated switch device, or another
+host acting as a gateway. For our discussion, the important point is
+that the majority of traffic from Host A is destined for other hosts
+on the same local network (Hosts B and C in the above example).
+
+ In summary, in a gatewayed configuration, traffic to and from
+the bonded device will be to the same MAC level peer on the network
+(the gateway itself, i.e., the router), regardless of its final
+destination. In a local configuration, traffic flows directly to and
+from the final destinations, thus, each destination (Host B, Host C)
+will be addressed directly by their individual MAC addresses.
+
+ This distinction between a gatewayed and a local network
+configuration is important because many of the load balancing modes
+available use the MAC addresses of the local network source and
+destination to make load balancing decisions. The behavior of each
+mode is described below.
+
+
+13.1.1 MT Bonding Mode Selection for Single Switch Topology
+-----------------------------------------------------------
This configuration is the easiest to set up and to understand,
although you will have to decide which bonding mode best suits your
-needs. The tradeoffs for each mode are detailed below:
+needs. The trade offs for each mode are detailed below:
balance-rr: This mode is the only mode that will permit a single
TCP/IP connection to stripe traffic across multiple
@@ -1206,6 +1501,23 @@ balance-rr: This mode is the only mode that will permit a single
interface's worth of throughput, even after adjusting
tcp_reordering.
+ Note that this out of order delivery occurs when both the
+ sending and receiving systems are utilizing a multiple
+ interface bond. Consider a configuration in which a
+ balance-rr bond feeds into a single higher capacity network
+ channel (e.g., multiple 100Mb/sec ethernets feeding a single
+ gigabit ethernet via an etherchannel capable switch). In this
+ configuration, traffic sent from the multiple 100Mb devices to
+ a destination connected to the gigabit device will not see
+ packets out of order. However, traffic sent from the gigabit
+ device to the multiple 100Mb devices may or may not see
+ traffic out of order, depending upon the balance policy of the
+ switch. Many switches do not support any modes that stripe
+ traffic (instead choosing a port based upon IP or MAC level
+ addresses); for those devices, traffic flowing from the
+ gigabit device to the many 100Mb devices will only utilize one
+ interface.
+
If you are utilizing protocols other than TCP/IP, UDP for
example, and your application can tolerate out of order
delivery, then this mode can allow for single stream datagram
@@ -1220,16 +1532,21 @@ active-backup: There is not much advantage in this network topology to
connected to the same peer as the primary. In this case, a
load balancing mode (with link monitoring) will provide the
same level of network availability, but with increased
- available bandwidth. On the plus side, it does not require
- any configuration of the switch.
+ available bandwidth. On the plus side, active-backup mode
+ does not require any configuration of the switch, so it may
+ have value if the hardware available does not support any of
+ the load balance modes.
balance-xor: This mode will limit traffic such that packets destined
for specific peers will always be sent over the same
interface. Since the destination is determined by the MAC
- addresses involved, this may be desirable if you have a large
- network with many hosts. It is likely to be suboptimal if all
- your traffic is passed through a single router, however. As
- with balance-rr, the switch ports need to be configured for
+ addresses involved, this mode works best in a "local" network
+ configuration (as described above), with destinations all on
+ the same local network. This mode is likely to be suboptimal
+ if all your traffic is passed through a single router (i.e., a
+ "gatewayed" network configuration, as described above).
+
+ As with balance-rr, the switch ports need to be configured for
"etherchannel" or "trunking."
broadcast: Like active-backup, there is not much advantage to this
@@ -1241,122 +1558,131 @@ broadcast: Like active-backup, there is not much advantage to this
protocol includes automatic configuration of the aggregates,
so minimal manual configuration of the switch is needed
(typically only to designate that some set of devices is
- usable for 802.3ad). The 802.3ad standard also mandates that
- frames be delivered in order (within certain limits), so in
- general single connections will not see misordering of
+ available for 802.3ad). The 802.3ad standard also mandates
+ that frames be delivered in order (within certain limits), so
+ in general single connections will not see misordering of
packets. The 802.3ad mode does have some drawbacks: the
standard mandates that all devices in the aggregate operate at
the same speed and duplex. Also, as with all bonding load
balance modes other than balance-rr, no single connection will
be able to utilize more than a single interface's worth of
- bandwidth. Additionally, the linux bonding 802.3ad
- implementation distributes traffic by peer (using an XOR of
- MAC addresses), so in general all traffic to a particular
- destination will use the same interface. Finally, the 802.3ad
- mode mandates the use of the MII monitor, therefore, the ARP
- monitor is not available in this mode.
-
-balance-tlb: This mode is also a good choice for this type of
- topology. It has no special switch configuration
- requirements, and balances outgoing traffic by peer, in a
- vaguely intelligent manner (not a simple XOR as in balance-xor
- or 802.3ad mode), so that unlucky MAC addresses will not all
- "bunch up" on a single interface. Interfaces may be of
- differing speeds. On the down side, in this mode all incoming
- traffic arrives over a single interface, this mode requires
- certain ethtool support in the network device driver of the
- slave interfaces, and the ARP monitor is not available.
-
-balance-alb: This mode is everything that balance-tlb is, and more. It
- has all of the features (and restrictions) of balance-tlb, and
- will also balance incoming traffic from peers (as described in
- the Bonding Module Options section, above). The only extra
- down side to this mode is that the network device driver must
- support changing the hardware address while the device is
- open.
-
-12.1.2 Link Monitoring for Single Switch Topology
--------------------------------------------------
+ bandwidth.
+
+ Additionally, the linux bonding 802.3ad implementation
+ distributes traffic by peer (using an XOR of MAC addresses),
+ so in a "gatewayed" configuration, all outgoing traffic will
+ generally use the same device. Incoming traffic may also end
+ up on a single device, but that is dependent upon the
+ balancing policy of the peer's 8023.ad implementation. In a
+ "local" configuration, traffic will be distributed across the
+ devices in the bond.
+
+ Finally, the 802.3ad mode mandates the use of the MII monitor,
+ therefore, the ARP monitor is not available in this mode.
+
+balance-tlb: The balance-tlb mode balances outgoing traffic by peer.
+ Since the balancing is done according to MAC address, in a
+ "gatewayed" configuration (as described above), this mode will
+ send all traffic across a single device. However, in a
+ "local" network configuration, this mode balances multiple
+ local network peers across devices in a vaguely intelligent
+ manner (not a simple XOR as in balance-xor or 802.3ad mode),
+ so that mathematically unlucky MAC addresses (i.e., ones that
+ XOR to the same value) will not all "bunch up" on a single
+ interface.
+
+ Unlike 802.3ad, interfaces may be of differing speeds, and no
+ special switch configuration is required. On the down side,
+ in this mode all incoming traffic arrives over a single
+ interface, this mode requires certain ethtool support in the
+ network device driver of the slave interfaces, and the ARP
+ monitor is not available.
+
+balance-alb: This mode is everything that balance-tlb is, and more.
+ It has all of the features (and restrictions) of balance-tlb,
+ and will also balance incoming traffic from local network
+ peers (as described in the Bonding Module Options section,
+ above).
+
+ The only additional down side to this mode is that the network
+ device driver must support changing the hardware address while
+ the device is open.
+
+13.1.2 MT Link Monitoring for Single Switch Topology
+----------------------------------------------------
The choice of link monitoring may largely depend upon which
mode you choose to use. The more advanced load balancing modes do not
support the use of the ARP monitor, and are thus restricted to using
-the MII monitor (which does not provide as high a level of assurance
-as the ARP monitor).
-
-
-12.2 High Availability in a Multiple Switch Topology
-----------------------------------------------------
-
- With multiple switches, the configuration of bonding and the
-network changes dramatically. In multiple switch topologies, there is
-a tradeoff between network availability and usable bandwidth.
-
- Below is a sample network, configured to maximize the
-availability of the network:
-
- | |
- |port3 port3|
- +-----+----+ +-----+----+
- | |port2 ISL port2| |
- | switch A +--------------------------+ switch B |
- | | | |
- +-----+----+ +-----++---+
- |port1 port1|
- | +-------+ |
- +-------------+ host1 +---------------+
- eth0 +-------+ eth1
-
- In this configuration, there is a link between the two
-switches (ISL, or inter switch link), and multiple ports connecting to
-the outside world ("port3" on each switch). There is no technical
-reason that this could not be extended to a third switch.
-
-12.2.1 Bonding Mode Selection for Multiple Switch Topology
-----------------------------------------------------------
-
- In a topology such as this, the active-backup and broadcast
-modes are the only useful bonding modes; the other modes require all
-links to terminate on the same peer for them to behave rationally.
-
-active-backup: This is generally the preferred mode, particularly if
- the switches have an ISL and play together well. If the
- network configuration is such that one switch is specifically
- a backup switch (e.g., has lower capacity, higher cost, etc),
- then the primary option can be used to insure that the
- preferred link is always used when it is available.
-
-broadcast: This mode is really a special purpose mode, and is suitable
- only for very specific needs. For example, if the two
- switches are not connected (no ISL), and the networks beyond
- them are totally independant. In this case, if it is
- necessary for some specific one-way traffic to reach both
- independent networks, then the broadcast mode may be suitable.
-
-12.2.2 Link Monitoring Selection for Multiple Switch Topology
+the MII monitor (which does not provide as high a level of end to end
+assurance as the ARP monitor).
+
+13.2 Maximum Throughput in a Multiple Switch Topology
+-----------------------------------------------------
+
+ Multiple switches may be utilized to optimize for throughput
+when they are configured in parallel as part of an isolated network
+between two or more systems, for example:
+
+ +-----------+
+ | Host A |
+ +-+---+---+-+
+ | | |
+ +--------+ | +---------+
+ | | |
+ +------+---+ +-----+----+ +-----+----+
+ | Switch A | | Switch B | | Switch C |
+ +------+---+ +-----+----+ +-----+----+
+ | | |
+ +--------+ | +---------+
+ | | |
+ +-+---+---+-+
+ | Host B |
+ +-----------+
+
+ In this configuration, the switches are isolated from one
+another. One reason to employ a topology such as this is for an
+isolated network with many hosts (a cluster configured for high
+performance, for example), using multiple smaller switches can be more
+cost effective than a single larger switch, e.g., on a network with 24
+hosts, three 24 port switches can be significantly less expensive than
+a single 72 port switch.
+
+ If access beyond the network is required, an individual host
+can be equipped with an additional network device connected to an
+external network; this host then additionally acts as a gateway.
+
+13.2.1 MT Bonding Mode Selection for Multiple Switch Topology
-------------------------------------------------------------
- The choice of link monitoring ultimately depends upon your
-switch. If the switch can reliably fail ports in response to other
-failures, then either the MII or ARP monitors should work. For
-example, in the above example, if the "port3" link fails at the remote
-end, the MII monitor has no direct means to detect this. The ARP
-monitor could be configured with a target at the remote end of port3,
-thus detecting that failure without switch support.
+ In actual practice, the bonding mode typically employed in
+configurations of this type is balance-rr. Historically, in this
+network configuration, the usual caveats about out of order packet
+delivery are mitigated by the use of network adapters that do not do
+any kind of packet coalescing (via the use of NAPI, or because the
+device itself does not generate interrupts until some number of
+packets has arrived). When employed in this fashion, the balance-rr
+mode allows individual connections between two hosts to effectively
+utilize greater than one interface's bandwidth.
- In general, however, in a multiple switch topology, the ARP
-monitor can provide a higher level of reliability in detecting link
-failures. Additionally, it should be configured with multiple targets
-(at least one for each switch in the network). This will insure that,
-regardless of which switch is active, the ARP monitor has a suitable
-target to query.
+13.2.2 MT Link Monitoring for Multiple Switch Topology
+------------------------------------------------------
+ Again, in actual practice, the MII monitor is most often used
+in this configuration, as performance is given preference over
+availability. The ARP monitor will function in this topology, but its
+advantages over the MII monitor are mitigated by the volume of probes
+needed as the number of systems involved grows (remember that each
+host in the network is configured with bonding).
-12.3 Switch Behavior Issues for High Availability
--------------------------------------------------
+14. Switch Behavior Issues
+==========================
- You may encounter issues with the timing of link up and down
-reporting by the switch.
+14.1 Link Establishment and Failover Delays
+-------------------------------------------
+
+ Some switches exhibit undesirable behavior with regard to the
+timing of link up and down reporting by the switch.
First, when a link comes up, some switches may indicate that
the link is up (carrier available), but not pass traffic over the
@@ -1370,30 +1696,70 @@ relevant interface(s).
Second, some switches may "bounce" the link state one or more
times while a link is changing state. This occurs most commonly while
the switch is initializing. Again, an appropriate updelay value may
-help, but note that if all links are down, then updelay is ignored
-when any link becomes active (the slave closest to completing its
-updelay is chosen).
+help.
Note that when a bonding interface has no active links, the
-driver will immediately reuse the first link that goes up, even if
-updelay parameter was specified. If there are slave interfaces
-waiting for the updelay timeout to expire, the interface that first
-went into that state will be immediately reused. This reduces down
-time of the network if the value of updelay has been overestimated.
+driver will immediately reuse the first link that goes up, even if the
+updelay parameter has been specified (the updelay is ignored in this
+case). If there are slave interfaces waiting for the updelay timeout
+to expire, the interface that first went into that state will be
+immediately reused. This reduces down time of the network if the
+value of updelay has been overestimated, and since this occurs only in
+cases with no connectivity, there is no additional penalty for
+ignoring the updelay.
In addition to the concerns about switch timings, if your
switches take a long time to go into backup mode, it may be desirable
to not activate a backup interface immediately after a link goes down.
Failover may be delayed via the downdelay bonding module option.
-13. Hardware Specific Considerations
+14.2 Duplicated Incoming Packets
+--------------------------------
+
+ It is not uncommon to observe a short burst of duplicated
+traffic when the bonding device is first used, or after it has been
+idle for some period of time. This is most easily observed by issuing
+a "ping" to some other host on the network, and noticing that the
+output from ping flags duplicates (typically one per slave).
+
+ For example, on a bond in active-backup mode with five slaves
+all connected to one switch, the output may appear as follows:
+
+# ping -n 10.0.4.2
+PING 10.0.4.2 (10.0.4.2) from 10.0.3.10 : 56(84) bytes of data.
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.7 ms
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=2 ttl=64 time=0.216 ms
+64 bytes from 10.0.4.2: icmp_seq=3 ttl=64 time=0.267 ms
+64 bytes from 10.0.4.2: icmp_seq=4 ttl=64 time=0.222 ms
+
+ This is not due to an error in the bonding driver, rather, it
+is a side effect of how many switches update their MAC forwarding
+tables. Initially, the switch does not associate the MAC address in
+the packet with a particular switch port, and so it may send the
+traffic to all ports until its MAC forwarding table is updated. Since
+the interfaces attached to the bond may occupy multiple ports on a
+single switch, when the switch (temporarily) floods the traffic to all
+ports, the bond device receives multiple copies of the same packet
+(one per slave device).
+
+ The duplicated packet behavior is switch dependent, some
+switches exhibit this, and some do not. On switches that display this
+behavior, it can be induced by clearing the MAC forwarding table (on
+most Cisco switches, the privileged command "clear mac address-table
+dynamic" will accomplish this).
+
+15. Hardware Specific Considerations
====================================
This section contains additional information for configuring
bonding on specific hardware platforms, or for interfacing bonding
with particular switches or other devices.
-13.1 IBM BladeCenter
+15.1 IBM BladeCenter
--------------------
This applies to the JS20 and similar systems.
@@ -1407,12 +1773,12 @@ JS20 network adapter information
--------------------------------
All JS20s come with two Broadcom Gigabit Ethernet ports
-integrated on the planar. In the BladeCenter chassis, the eth0 port
-of all JS20 blades is hard wired to I/O Module #1; similarly, all eth1
-ports are wired to I/O Module #2. An add-on Broadcom daughter card
-can be installed on a JS20 to provide two more Gigabit Ethernet ports.
-These ports, eth2 and eth3, are wired to I/O Modules 3 and 4,
-respectively.
+integrated on the planar (that's "motherboard" in IBM-speak). In the
+BladeCenter chassis, the eth0 port of all JS20 blades is hard wired to
+I/O Module #1; similarly, all eth1 ports are wired to I/O Module #2.
+An add-on Broadcom daughter card can be installed on a JS20 to provide
+two more Gigabit Ethernet ports. These ports, eth2 and eth3, are
+wired to I/O Modules 3 and 4, respectively.
Each I/O Module may contain either a switch or a passthrough
module (which allows ports to be directly connected to an external
@@ -1432,29 +1798,30 @@ BladeCenter networking configuration
of ways, this discussion will be confined to describing basic
configurations.
- Normally, Ethernet Switch Modules (ESM) are used in I/O
+ Normally, Ethernet Switch Modules (ESMs) are used in I/O
modules 1 and 2. In this configuration, the eth0 and eth1 ports of a
JS20 will be connected to different internal switches (in the
respective I/O modules).
- An optical passthru module (OPM) connects the I/O module
-directly to an external switch. By using OPMs in I/O module #1 and
-#2, the eth0 and eth1 interfaces of a JS20 can be redirected to the
-outside world and connected to a common external switch.
-
- Depending upon the mix of ESM and OPM modules, the network
-will appear to bonding as either a single switch topology (all OPM
-modules) or as a multiple switch topology (one or more ESM modules,
-zero or more OPM modules). It is also possible to connect ESM modules
-together, resulting in a configuration much like the example in "High
-Availability in a multiple switch topology."
-
-Requirements for specifc modes
-------------------------------
-
- The balance-rr mode requires the use of OPM modules for
-devices in the bond, all connected to an common external switch. That
-switch must be configured for "etherchannel" or "trunking" on the
+ A passthrough module (OPM or CPM, optical or copper,
+passthrough module) connects the I/O module directly to an external
+switch. By using PMs in I/O module #1 and #2, the eth0 and eth1
+interfaces of a JS20 can be redirected to the outside world and
+connected to a common external switch.
+
+ Depending upon the mix of ESMs and PMs, the network will
+appear to bonding as either a single switch topology (all PMs) or as a
+multiple switch topology (one or more ESMs, zero or more PMs). It is
+also possible to connect ESMs together, resulting in a configuration
+much like the example in "High Availability in a Multiple Switch
+Topology," above.
+
+Requirements for specific modes
+-------------------------------
+
+ The balance-rr mode requires the use of passthrough modules
+for devices in the bond, all connected to an common external switch.
+That switch must be configured for "etherchannel" or "trunking" on the
appropriate ports, as is usual for balance-rr.
The balance-alb and balance-tlb modes will function with
@@ -1484,17 +1851,18 @@ connected to the JS20 system.
Other concerns
--------------
- The Serial Over LAN link is established over the primary
+ The Serial Over LAN (SoL) link is established over the primary
ethernet (eth0) only, therefore, any loss of link to eth0 will result
in losing your SoL connection. It will not fail over with other
-network traffic.
+network traffic, as the SoL system is beyond the control of the
+bonding driver.
It may be desirable to disable spanning tree on the switch
(either the internal Ethernet Switch Module, or an external switch) to
-avoid fail-over delays issues when using bonding.
+avoid fail-over delay issues when using bonding.
-14. Frequently Asked Questions
+16. Frequently Asked Questions
==============================
1. Is it SMP safe?
@@ -1505,8 +1873,8 @@ The new driver was designed to be SMP safe from the start.
2. What type of cards will work with it?
Any Ethernet type cards (you can even mix cards - a Intel
-EtherExpress PRO/100 and a 3com 3c905b, for example). They need not
-be of the same speed.
+EtherExpress PRO/100 and a 3com 3c905b, for example). For most modes,
+devices need not be of the same speed.
3. How many bonding devices can I have?
@@ -1524,11 +1892,12 @@ system.
disabled. The active-backup mode will fail over to a backup link, and
other modes will ignore the failed link. The link will continue to be
monitored, and should it recover, it will rejoin the bond (in whatever
-manner is appropriate for the mode). See the section on High
-Availability for additional information.
+manner is appropriate for the mode). See the sections on High
+Availability and the documentation for each mode for additional
+information.
Link monitoring can be enabled via either the miimon or
-arp_interval paramters (described in the module paramters section,
+arp_interval parameters (described in the module parameters section,
above). In general, miimon monitors the carrier state as sensed by
the underlying network device, and the arp monitor (arp_interval)
monitors connectivity to another host on the local network.
@@ -1536,7 +1905,7 @@ monitors connectivity to another host on the local network.
If no link monitoring is configured, the bonding driver will
be unable to detect link failures, and will assume that all links are
always available. This will likely result in lost packets, and a
-resulting degredation of performance. The precise performance loss
+resulting degradation of performance. The precise performance loss
depends upon the bonding mode and network configuration.
6. Can bonding be used for High Availability?
@@ -1550,12 +1919,12 @@ depends upon the bonding mode and network configuration.
In the basic balance modes (balance-rr and balance-xor), it
works with any system that supports etherchannel (also called
trunking). Most managed switches currently available have such
-support, and many unmananged switches as well.
+support, and many unmanaged switches as well.
The advanced balance modes (balance-tlb and balance-alb) do
not have special switch requirements, but do need device drivers that
support specific features (described in the appropriate section under
-module paramters, above).
+module parameters, above).
In 802.3ad mode, it works with with systems that support IEEE
802.3ad Dynamic Link Aggregation. Most managed and many unmanaged
@@ -1565,17 +1934,19 @@ switches currently available support 802.3ad.
8. Where does a bonding device get its MAC address from?
- If not explicitly configured with ifconfig, the MAC address of
-the bonding device is taken from its first slave device. This MAC
-address is then passed to all following slaves and remains persistent
-(even if the the first slave is removed) until the bonding device is
-brought down or reconfigured.
+ If not explicitly configured (with ifconfig or ip link), the
+MAC address of the bonding device is taken from its first slave
+device. This MAC address is then passed to all following slaves and
+remains persistent (even if the the first slave is removed) until the
+bonding device is brought down or reconfigured.
If you wish to change the MAC address, you can set it with
-ifconfig:
+ifconfig or ip link:
# ifconfig bond0 hw ether 00:11:22:33:44:55
+# ip link set bond0 address 66:77:88:99:aa:bb
+
The MAC address can be also changed by bringing down/up the
device and then changing its slaves (or their order):
@@ -1591,23 +1962,28 @@ from the bond (`ifenslave -d bond0 eth0'). The bonding driver will
then restore the MAC addresses that the slaves had before they were
enslaved.
-15. Resources and Links
+16. Resources and Links
=======================
The latest version of the bonding driver can be found in the latest
version of the linux kernel, found on http://kernel.org
+The latest version of this document can be found in either the latest
+kernel source (named Documentation/networking/bonding.txt), or on the
+bonding sourceforge site:
+
+http://www.sourceforge.net/projects/bonding
+
Discussions regarding the bonding driver take place primarily on the
bonding-devel mailing list, hosted at sourceforge.net. If you have
-questions or problems, post them to the list.
+questions or problems, post them to the list. The list address is:
bonding-devel@lists.sourceforge.net
-https://lists.sourceforge.net/lists/listinfo/bonding-devel
-
-There is also a project site on sourceforge.
+ The administrative interface (to subscribe or unsubscribe) can
+be found at:
-http://www.sourceforge.net/projects/bonding
+https://lists.sourceforge.net/lists/listinfo/bonding-devel
Donald Becker's Ethernet Drivers and diag programs may be found at :
- http://www.scyld.com/network/
diff --git a/Documentation/networking/cxgb.txt b/Documentation/networking/cxgb.txt
new file mode 100644
index 000000000000..76324638626b
--- /dev/null
+++ b/Documentation/networking/cxgb.txt
@@ -0,0 +1,352 @@
+ Chelsio N210 10Gb Ethernet Network Controller
+
+ Driver Release Notes for Linux
+
+ Version 2.1.1
+
+ June 20, 2005
+
+CONTENTS
+========
+ INTRODUCTION
+ FEATURES
+ PERFORMANCE
+ DRIVER MESSAGES
+ KNOWN ISSUES
+ SUPPORT
+
+
+INTRODUCTION
+============
+
+ This document describes the Linux driver for Chelsio 10Gb Ethernet Network
+ Controller. This driver supports the Chelsio N210 NIC and is backward
+ compatible with the Chelsio N110 model 10Gb NICs.
+
+
+FEATURES
+========
+
+ Adaptive Interrupts (adaptive-rx)
+ ---------------------------------
+
+ This feature provides an adaptive algorithm that adjusts the interrupt
+ coalescing parameters, allowing the driver to dynamically adapt the latency
+ settings to achieve the highest performance during various types of network
+ load.
+
+ The interface used to control this feature is ethtool. Please see the
+ ethtool manpage for additional usage information.
+
+ By default, adaptive-rx is disabled.
+ To enable adaptive-rx:
+
+ ethtool -C <interface> adaptive-rx on
+
+ To disable adaptive-rx, use ethtool:
+
+ ethtool -C <interface> adaptive-rx off
+
+ After disabling adaptive-rx, the timer latency value will be set to 50us.
+ You may set the timer latency after disabling adaptive-rx:
+
+ ethtool -C <interface> rx-usecs <microseconds>
+
+ An example to set the timer latency value to 100us on eth0:
+
+ ethtool -C eth0 rx-usecs 100
+
+ You may also provide a timer latency value while disabling adpative-rx:
+
+ ethtool -C <interface> adaptive-rx off rx-usecs <microseconds>
+
+ If adaptive-rx is disabled and a timer latency value is specified, the timer
+ will be set to the specified value until changed by the user or until
+ adaptive-rx is enabled.
+
+ To view the status of the adaptive-rx and timer latency values:
+
+ ethtool -c <interface>
+
+
+ TCP Segmentation Offloading (TSO) Support
+ -----------------------------------------
+
+ This feature, also known as "large send", enables a system's protocol stack
+ to offload portions of outbound TCP processing to a network interface card
+ thereby reducing system CPU utilization and enhancing performance.
+
+ The interface used to control this feature is ethtool version 1.8 or higher.
+ Please see the ethtool manpage for additional usage information.
+
+ By default, TSO is enabled.
+ To disable TSO:
+
+ ethtool -K <interface> tso off
+
+ To enable TSO:
+
+ ethtool -K <interface> tso on
+
+ To view the status of TSO:
+
+ ethtool -k <interface>
+
+
+PERFORMANCE
+===========
+
+ The following information is provided as an example of how to change system
+ parameters for "performance tuning" an what value to use. You may or may not
+ want to change these system parameters, depending on your server/workstation
+ application. Doing so is not warranted in any way by Chelsio Communications,
+ and is done at "YOUR OWN RISK". Chelsio will not be held responsible for loss
+ of data or damage to equipment.
+
+ Your distribution may have a different way of doing things, or you may prefer
+ a different method. These commands are shown only to provide an example of
+ what to do and are by no means definitive.
+
+ Making any of the following system changes will only last until you reboot
+ your system. You may want to write a script that runs at boot-up which
+ includes the optimal settings for your system.
+
+ Setting PCI Latency Timer:
+ setpci -d 1425:* 0x0c.l=0x0000F800
+
+ Disabling TCP timestamp:
+ sysctl -w net.ipv4.tcp_timestamps=0
+
+ Disabling SACK:
+ sysctl -w net.ipv4.tcp_sack=0
+
+ Setting large number of incoming connection requests:
+ sysctl -w net.ipv4.tcp_max_syn_backlog=3000
+
+ Setting maximum receive socket buffer size:
+ sysctl -w net.core.rmem_max=1024000
+
+ Setting maximum send socket buffer size:
+ sysctl -w net.core.wmem_max=1024000
+
+ Set smp_affinity (on a multiprocessor system) to a single CPU:
+ echo 1 > /proc/irq/<interrupt_number>/smp_affinity
+
+ Setting default receive socket buffer size:
+ sysctl -w net.core.rmem_default=524287
+
+ Setting default send socket buffer size:
+ sysctl -w net.core.wmem_default=524287
+
+ Setting maximum option memory buffers:
+ sysctl -w net.core.optmem_max=524287
+
+ Setting maximum backlog (# of unprocessed packets before kernel drops):
+ sysctl -w net.core.netdev_max_backlog=300000
+
+ Setting TCP read buffers (min/default/max):
+ sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
+
+ Setting TCP write buffers (min/pressure/max):
+ sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"
+
+ Setting TCP buffer space (min/pressure/max):
+ sysctl -w net.ipv4.tcp_mem="10000000 10000000 10000000"
+
+ TCP window size for single connections:
+ The receive buffer (RX_WINDOW) size must be at least as large as the
+ Bandwidth-Delay Product of the communication link between the sender and
+ receiver. Due to the variations of RTT, you may want to increase the buffer
+ size up to 2 times the Bandwidth-Delay Product. Reference page 289 of
+ "TCP/IP Illustrated, Volume 1, The Protocols" by W. Richard Stevens.
+ At 10Gb speeds, use the following formula:
+ RX_WINDOW >= 1.25MBytes * RTT(in milliseconds)
+ Example for RTT with 100us: RX_WINDOW = (1,250,000 * 0.1) = 125,000
+ RX_WINDOW sizes of 256KB - 512KB should be sufficient.
+ Setting the min, max, and default receive buffer (RX_WINDOW) size:
+ sysctl -w net.ipv4.tcp_rmem="<min> <default> <max>"
+
+ TCP window size for multiple connections:
+ The receive buffer (RX_WINDOW) size may be calculated the same as single
+ connections, but should be divided by the number of connections. The
+ smaller window prevents congestion and facilitates better pacing,
+ especially if/when MAC level flow control does not work well or when it is
+ not supported on the machine. Experimentation may be necessary to attain
+ the correct value. This method is provided as a starting point fot the
+ correct receive buffer size.
+ Setting the min, max, and default receive buffer (RX_WINDOW) size is
+ performed in the same manner as single connection.
+
+
+DRIVER MESSAGES
+===============
+
+ The following messages are the most common messages logged by syslog. These
+ may be found in /var/log/messages.
+
+ Driver up:
+ Chelsio Network Driver - version 2.1.1
+
+ NIC detected:
+ eth#: Chelsio N210 1x10GBaseX NIC (rev #), PCIX 133MHz/64-bit
+
+ Link up:
+ eth#: link is up at 10 Gbps, full duplex
+
+ Link down:
+ eth#: link is down
+
+
+KNOWN ISSUES
+============
+
+ These issues have been identified during testing. The following information
+ is provided as a workaround to the problem. In some cases, this problem is
+ inherent to Linux or to a particular Linux Distribution and/or hardware
+ platform.
+
+ 1. Large number of TCP retransmits on a multiprocessor (SMP) system.
+
+ On a system with multiple CPUs, the interrupt (IRQ) for the network
+ controller may be bound to more than one CPU. This will cause TCP
+ retransmits if the packet data were to be split across different CPUs
+ and re-assembled in a different order than expected.
+
+ To eliminate the TCP retransmits, set smp_affinity on the particular
+ interrupt to a single CPU. You can locate the interrupt (IRQ) used on
+ the N110/N210 by using ifconfig:
+ ifconfig <dev_name> | grep Interrupt
+ Set the smp_affinity to a single CPU:
+ echo 1 > /proc/irq/<interrupt_number>/smp_affinity
+
+ It is highly suggested that you do not run the irqbalance daemon on your
+ system, as this will change any smp_affinity setting you have applied.
+ The irqbalance daemon runs on a 10 second interval and binds interrupts
+ to the least loaded CPU determined by the daemon. To disable this daemon:
+ chkconfig --level 2345 irqbalance off
+
+ By default, some Linux distributions enable the kernel feature,
+ irqbalance, which performs the same function as the daemon. To disable
+ this feature, add the following line to your bootloader:
+ noirqbalance
+
+ Example using the Grub bootloader:
+ title Red Hat Enterprise Linux AS (2.4.21-27.ELsmp)
+ root (hd0,0)
+ kernel /vmlinuz-2.4.21-27.ELsmp ro root=/dev/hda3 noirqbalance
+ initrd /initrd-2.4.21-27.ELsmp.img
+
+ 2. After running insmod, the driver is loaded and the incorrect network
+ interface is brought up without running ifup.
+
+ When using 2.4.x kernels, including RHEL kernels, the Linux kernel
+ invokes a script named "hotplug". This script is primarily used to
+ automatically bring up USB devices when they are plugged in, however,
+ the script also attempts to automatically bring up a network interface
+ after loading the kernel module. The hotplug script does this by scanning
+ the ifcfg-eth# config files in /etc/sysconfig/network-scripts, looking
+ for HWADDR=<mac_address>.
+
+ If the hotplug script does not find the HWADDRR within any of the
+ ifcfg-eth# files, it will bring up the device with the next available
+ interface name. If this interface is already configured for a different
+ network card, your new interface will have incorrect IP address and
+ network settings.
+
+ To solve this issue, you can add the HWADDR=<mac_address> key to the
+ interface config file of your network controller.
+
+ To disable this "hotplug" feature, you may add the driver (module name)
+ to the "blacklist" file located in /etc/hotplug. It has been noted that
+ this does not work for network devices because the net.agent script
+ does not use the blacklist file. Simply remove, or rename, the net.agent
+ script located in /etc/hotplug to disable this feature.
+
+ 3. Transport Protocol (TP) hangs when running heavy multi-connection traffic
+ on an AMD Opteron system with HyperTransport PCI-X Tunnel chipset.
+
+ If your AMD Opteron system uses the AMD-8131 HyperTransport PCI-X Tunnel
+ chipset, you may experience the "133-Mhz Mode Split Completion Data
+ Corruption" bug identified by AMD while using a 133Mhz PCI-X card on the
+ bus PCI-X bus.
+
+ AMD states, "Under highly specific conditions, the AMD-8131 PCI-X Tunnel
+ can provide stale data via split completion cycles to a PCI-X card that
+ is operating at 133 Mhz", causing data corruption.
+
+ AMD's provides three workarounds for this problem, however, Chelsio
+ recommends the first option for best performance with this bug:
+
+ For 133Mhz secondary bus operation, limit the transaction length and
+ the number of outstanding transactions, via BIOS configuration
+ programming of the PCI-X card, to the following:
+
+ Data Length (bytes): 1k
+ Total allowed outstanding transactions: 2
+
+ Please refer to AMD 8131-HT/PCI-X Errata 26310 Rev 3.08 August 2004,
+ section 56, "133-MHz Mode Split Completion Data Corruption" for more
+ details with this bug and workarounds suggested by AMD.
+
+ It may be possible to work outside AMD's recommended PCI-X settings, try
+ increasing the Data Length to 2k bytes for increased performance. If you
+ have issues with these settings, please revert to the "safe" settings
+ and duplicate the problem before submitting a bug or asking for support.
+
+ NOTE: The default setting on most systems is 8 outstanding transactions
+ and 2k bytes data length.
+
+ 4. On multiprocessor systems, it has been noted that an application which
+ is handling 10Gb networking can switch between CPUs causing degraded
+ and/or unstable performance.
+
+ If running on an SMP system and taking performance measurements, it
+ is suggested you either run the latest netperf-2.4.0+ or use a binding
+ tool such as Tim Hockin's procstate utilities (runon)
+ <http://www.hockin.org/~thockin/procstate/>.
+
+ Binding netserver and netperf (or other applications) to particular
+ CPUs will have a significant difference in performance measurements.
+ You may need to experiment which CPU to bind the application to in
+ order to achieve the best performance for your system.
+
+ If you are developing an application designed for 10Gb networking,
+ please keep in mind you may want to look at kernel functions
+ sched_setaffinity & sched_getaffinity to bind your application.
+
+ If you are just running user-space applications such as ftp, telnet,
+ etc., you may want to try the runon tool provided by Tim Hockin's
+ procstate utility. You could also try binding the interface to a
+ particular CPU: runon 0 ifup eth0
+
+
+SUPPORT
+=======
+
+ If you have problems with the software or hardware, please contact our
+ customer support team via email at support@chelsio.com or check our website
+ at http://www.chelsio.com
+
+===============================================================================
+
+ Chelsio Communications
+ 370 San Aleso Ave.
+ Suite 100
+ Sunnyvale, CA 94085
+ http://www.chelsio.com
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License, version 2, as
+published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License along
+with this program; if not, write to the Free Software Foundation, Inc.,
+59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
+WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
+
+ Copyright (c) 2003-2005 Chelsio Communications. All rights reserved.
+
+===============================================================================
diff --git a/Documentation/networking/phy.txt b/Documentation/networking/phy.txt
new file mode 100644
index 000000000000..29ccae409031
--- /dev/null
+++ b/Documentation/networking/phy.txt
@@ -0,0 +1,288 @@
+
+-------
+PHY Abstraction Layer
+(Updated 2005-07-21)
+
+Purpose
+
+ Most network devices consist of set of registers which provide an interface
+ to a MAC layer, which communicates with the physical connection through a
+ PHY. The PHY concerns itself with negotiating link parameters with the link
+ partner on the other side of the network connection (typically, an ethernet
+ cable), and provides a register interface to allow drivers to determine what
+ settings were chosen, and to configure what settings are allowed.
+
+ While these devices are distinct from the network devices, and conform to a
+ standard layout for the registers, it has been common practice to integrate
+ the PHY management code with the network driver. This has resulted in large
+ amounts of redundant code. Also, on embedded systems with multiple (and
+ sometimes quite different) ethernet controllers connected to the same
+ management bus, it is difficult to ensure safe use of the bus.
+
+ Since the PHYs are devices, and the management busses through which they are
+ accessed are, in fact, busses, the PHY Abstraction Layer treats them as such.
+ In doing so, it has these goals:
+
+ 1) Increase code-reuse
+ 2) Increase overall code-maintainability
+ 3) Speed development time for new network drivers, and for new systems
+
+ Basically, this layer is meant to provide an interface to PHY devices which
+ allows network driver writers to write as little code as possible, while
+ still providing a full feature set.
+
+The MDIO bus
+
+ Most network devices are connected to a PHY by means of a management bus.
+ Different devices use different busses (though some share common interfaces).
+ In order to take advantage of the PAL, each bus interface needs to be
+ registered as a distinct device.
+
+ 1) read and write functions must be implemented. Their prototypes are:
+
+ int write(struct mii_bus *bus, int mii_id, int regnum, u16 value);
+ int read(struct mii_bus *bus, int mii_id, int regnum);
+
+ mii_id is the address on the bus for the PHY, and regnum is the register
+ number. These functions are guaranteed not to be called from interrupt
+ time, so it is safe for them to block, waiting for an interrupt to signal
+ the operation is complete
+
+ 2) A reset function is necessary. This is used to return the bus to an
+ initialized state.
+
+ 3) A probe function is needed. This function should set up anything the bus
+ driver needs, setup the mii_bus structure, and register with the PAL using
+ mdiobus_register. Similarly, there's a remove function to undo all of
+ that (use mdiobus_unregister).
+
+ 4) Like any driver, the device_driver structure must be configured, and init
+ exit functions are used to register the driver.
+
+ 5) The bus must also be declared somewhere as a device, and registered.
+
+ As an example for how one driver implemented an mdio bus driver, see
+ drivers/net/gianfar_mii.c and arch/ppc/syslib/mpc85xx_devices.c
+
+Connecting to a PHY
+
+ Sometime during startup, the network driver needs to establish a connection
+ between the PHY device, and the network device. At this time, the PHY's bus
+ and drivers need to all have been loaded, so it is ready for the connection.
+ At this point, there are several ways to connect to the PHY:
+
+ 1) The PAL handles everything, and only calls the network driver when
+ the link state changes, so it can react.
+
+ 2) The PAL handles everything except interrupts (usually because the
+ controller has the interrupt registers).
+
+ 3) The PAL handles everything, but checks in with the driver every second,
+ allowing the network driver to react first to any changes before the PAL
+ does.
+
+ 4) The PAL serves only as a library of functions, with the network device
+ manually calling functions to update status, and configure the PHY
+
+
+Letting the PHY Abstraction Layer do Everything
+
+ If you choose option 1 (The hope is that every driver can, but to still be
+ useful to drivers that can't), connecting to the PHY is simple:
+
+ First, you need a function to react to changes in the link state. This
+ function follows this protocol:
+
+ static void adjust_link(struct net_device *dev);
+
+ Next, you need to know the device name of the PHY connected to this device.
+ The name will look something like, "phy0:0", where the first number is the
+ bus id, and the second is the PHY's address on that bus.
+
+ Now, to connect, just call this function:
+
+ phydev = phy_connect(dev, phy_name, &adjust_link, flags);
+
+ phydev is a pointer to the phy_device structure which represents the PHY. If
+ phy_connect is successful, it will return the pointer. dev, here, is the
+ pointer to your net_device. Once done, this function will have started the
+ PHY's software state machine, and registered for the PHY's interrupt, if it
+ has one. The phydev structure will be populated with information about the
+ current state, though the PHY will not yet be truly operational at this
+ point.
+
+ flags is a u32 which can optionally contain phy-specific flags.
+ This is useful if the system has put hardware restrictions on
+ the PHY/controller, of which the PHY needs to be aware.
+
+ Now just make sure that phydev->supported and phydev->advertising have any
+ values pruned from them which don't make sense for your controller (a 10/100
+ controller may be connected to a gigabit capable PHY, so you would need to
+ mask off SUPPORTED_1000baseT*). See include/linux/ethtool.h for definitions
+ for these bitfields. Note that you should not SET any bits, or the PHY may
+ get put into an unsupported state.
+
+ Lastly, once the controller is ready to handle network traffic, you call
+ phy_start(phydev). This tells the PAL that you are ready, and configures the
+ PHY to connect to the network. If you want to handle your own interrupts,
+ just set phydev->irq to PHY_IGNORE_INTERRUPT before you call phy_start.
+ Similarly, if you don't want to use interrupts, set phydev->irq to PHY_POLL.
+
+ When you want to disconnect from the network (even if just briefly), you call
+ phy_stop(phydev).
+
+Keeping Close Tabs on the PAL
+
+ It is possible that the PAL's built-in state machine needs a little help to
+ keep your network device and the PHY properly in sync. If so, you can
+ register a helper function when connecting to the PHY, which will be called
+ every second before the state machine reacts to any changes. To do this, you
+ need to manually call phy_attach() and phy_prepare_link(), and then call
+ phy_start_machine() with the second argument set to point to your special
+ handler.
+
+ Currently there are no examples of how to use this functionality, and testing
+ on it has been limited because the author does not have any drivers which use
+ it (they all use option 1). So Caveat Emptor.
+
+Doing it all yourself
+
+ There's a remote chance that the PAL's built-in state machine cannot track
+ the complex interactions between the PHY and your network device. If this is
+ so, you can simply call phy_attach(), and not call phy_start_machine or
+ phy_prepare_link(). This will mean that phydev->state is entirely yours to
+ handle (phy_start and phy_stop toggle between some of the states, so you
+ might need to avoid them).
+
+ An effort has been made to make sure that useful functionality can be
+ accessed without the state-machine running, and most of these functions are
+ descended from functions which did not interact with a complex state-machine.
+ However, again, no effort has been made so far to test running without the
+ state machine, so tryer beware.
+
+ Here is a brief rundown of the functions:
+
+ int phy_read(struct phy_device *phydev, u16 regnum);
+ int phy_write(struct phy_device *phydev, u16 regnum, u16 val);
+
+ Simple read/write primitives. They invoke the bus's read/write function
+ pointers.
+
+ void phy_print_status(struct phy_device *phydev);
+
+ A convenience function to print out the PHY status neatly.
+
+ int phy_clear_interrupt(struct phy_device *phydev);
+ int phy_config_interrupt(struct phy_device *phydev, u32 interrupts);
+
+ Clear the PHY's interrupt, and configure which ones are allowed,
+ respectively. Currently only supports all on, or all off.
+
+ int phy_enable_interrupts(struct phy_device *phydev);
+ int phy_disable_interrupts(struct phy_device *phydev);
+
+ Functions which enable/disable PHY interrupts, clearing them
+ before and after, respectively.
+
+ int phy_start_interrupts(struct phy_device *phydev);
+ int phy_stop_interrupts(struct phy_device *phydev);
+
+ Requests the IRQ for the PHY interrupts, then enables them for
+ start, or disables then frees them for stop.
+
+ struct phy_device * phy_attach(struct net_device *dev, const char *phy_id,
+ u32 flags);
+
+ Attaches a network device to a particular PHY, binding the PHY to a generic
+ driver if none was found during bus initialization. Passes in
+ any phy-specific flags as needed.
+
+ int phy_start_aneg(struct phy_device *phydev);
+
+ Using variables inside the phydev structure, either configures advertising
+ and resets autonegotiation, or disables autonegotiation, and configures
+ forced settings.
+
+ static inline int phy_read_status(struct phy_device *phydev);
+
+ Fills the phydev structure with up-to-date information about the current
+ settings in the PHY.
+
+ void phy_sanitize_settings(struct phy_device *phydev)
+
+ Resolves differences between currently desired settings, and
+ supported settings for the given PHY device. Does not make
+ the changes in the hardware, though.
+
+ int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd);
+ int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd);
+
+ Ethtool convenience functions.
+
+ int phy_mii_ioctl(struct phy_device *phydev,
+ struct mii_ioctl_data *mii_data, int cmd);
+
+ The MII ioctl. Note that this function will completely screw up the state
+ machine if you write registers like BMCR, BMSR, ADVERTISE, etc. Best to
+ use this only to write registers which are not standard, and don't set off
+ a renegotiation.
+
+
+PHY Device Drivers
+
+ With the PHY Abstraction Layer, adding support for new PHYs is
+ quite easy. In some cases, no work is required at all! However,
+ many PHYs require a little hand-holding to get up-and-running.
+
+Generic PHY driver
+
+ If the desired PHY doesn't have any errata, quirks, or special
+ features you want to support, then it may be best to not add
+ support, and let the PHY Abstraction Layer's Generic PHY Driver
+ do all of the work.
+
+Writing a PHY driver
+
+ If you do need to write a PHY driver, the first thing to do is
+ make sure it can be matched with an appropriate PHY device.
+ This is done during bus initialization by reading the device's
+ UID (stored in registers 2 and 3), then comparing it to each
+ driver's phy_id field by ANDing it with each driver's
+ phy_id_mask field. Also, it needs a name. Here's an example:
+
+ static struct phy_driver dm9161_driver = {
+ .phy_id = 0x0181b880,
+ .name = "Davicom DM9161E",
+ .phy_id_mask = 0x0ffffff0,
+ ...
+ }
+
+ Next, you need to specify what features (speed, duplex, autoneg,
+ etc) your PHY device and driver support. Most PHYs support
+ PHY_BASIC_FEATURES, but you can look in include/mii.h for other
+ features.
+
+ Each driver consists of a number of function pointers:
+
+ config_init: configures PHY into a sane state after a reset.
+ For instance, a Davicom PHY requires descrambling disabled.
+ probe: Does any setup needed by the driver
+ suspend/resume: power management
+ config_aneg: Changes the speed/duplex/negotiation settings
+ read_status: Reads the current speed/duplex/negotiation settings
+ ack_interrupt: Clear a pending interrupt
+ config_intr: Enable or disable interrupts
+ remove: Does any driver take-down
+
+ Of these, only config_aneg and read_status are required to be
+ assigned by the driver code. The rest are optional. Also, it is
+ preferred to use the generic phy driver's versions of these two
+ functions if at all possible: genphy_read_status and
+ genphy_config_aneg. If this is not possible, it is likely that
+ you only need to perform some actions before and after invoking
+ these functions, and so your functions will wrap the generic
+ ones.
+
+ Feel free to look at the Marvell, Cicada, and Davicom drivers in
+ drivers/net/phy/ for examples (the lxt and qsemi drivers have
+ not been tested as of this writing)
diff --git a/Documentation/pci.txt b/Documentation/pci.txt
index 62b1dc5d97e2..76d28d033657 100644
--- a/Documentation/pci.txt
+++ b/Documentation/pci.txt
@@ -266,20 +266,6 @@ port an old driver to the new PCI interface. They are no longer present
in the kernel as they aren't compatible with hotplug or PCI domains or
having sane locking.
-pcibios_present() and Since ages, you don't need to test presence
-pci_present() of PCI subsystem when trying to talk to it.
- If it's not there, the list of PCI devices
- is empty and all functions for searching for
- devices just return NULL.
-pcibios_(read|write)_* Superseded by their pci_(read|write)_*
- counterparts.
-pcibios_find_* Superseded by their pci_get_* counterparts.
-pci_for_each_dev() Superseded by pci_get_device()
-pci_for_each_dev_reverse() Superseded by pci_find_device_reverse()
-pci_for_each_bus() Superseded by pci_find_next_bus()
pci_find_device() Superseded by pci_get_device()
pci_find_subsys() Superseded by pci_get_subsys()
pci_find_slot() Superseded by pci_get_slot()
-pcibios_find_class() Superseded by pci_get_class()
-pci_find_class() Superseded by pci_get_class()
-pci_(read|write)_*_nodev() Superseded by pci_bus_(read|write)_*()
diff --git a/Documentation/pcmcia/driver-changes.txt b/Documentation/pcmcia/driver-changes.txt
index 9c315ab48a02..403e7b4dcdd4 100644
--- a/Documentation/pcmcia/driver-changes.txt
+++ b/Documentation/pcmcia/driver-changes.txt
@@ -1,6 +1,13 @@
This file details changes in 2.6 which affect PCMCIA card driver authors:
-* in-kernel device<->driver matching
+* event handler initialization in struct pcmcia_driver (as of 2.6.13)
+ The event handler is notified of all events, and must be initialized
+ as the event() callback in the driver's struct pcmcia_driver.
+
+* pcmcia/version.h should not be used (as of 2.6.13)
+ This file will be removed eventually.
+
+* in-kernel device<->driver matching (as of 2.6.13)
PCMCIA devices and their correct drivers can now be matched in
kernelspace. See 'devicetable.txt' for details.
@@ -49,3 +56,12 @@ This file details changes in 2.6 which affect PCMCIA card driver authors:
memory regions in-use. The name argument should be a pointer to
your driver name. Eg, for pcnet_cs, name should point to the
string "pcnet_cs".
+
+* CardServices is gone
+ CardServices() in 2.4 is just a big switch statement to call various
+ services. In 2.6, all of those entry points are exported and called
+ directly (except for pcmcia_report_error(), just use cs_error() instead).
+
+* struct pcmcia_driver
+ You need to use struct pcmcia_driver and pcmcia_{un,}register_driver
+ instead of {un,}register_pccard_driver
diff --git a/Documentation/power/swsusp-dmcrypt.txt b/Documentation/power/swsusp-dmcrypt.txt
new file mode 100644
index 000000000000..59931b46ff7e
--- /dev/null
+++ b/Documentation/power/swsusp-dmcrypt.txt
@@ -0,0 +1,138 @@
+Author: Andreas Steinmetz <ast@domdv.de>
+
+
+How to use dm-crypt and swsusp together:
+========================================
+
+Some prerequisites:
+You know how dm-crypt works. If not, visit the following web page:
+http://www.saout.de/misc/dm-crypt/
+You have read Documentation/power/swsusp.txt and understand it.
+You did read Documentation/initrd.txt and know how an initrd works.
+You know how to create or how to modify an initrd.
+
+Now your system is properly set up, your disk is encrypted except for
+the swap device(s) and the boot partition which may contain a mini
+system for crypto setup and/or rescue purposes. You may even have
+an initrd that does your current crypto setup already.
+
+At this point you want to encrypt your swap, too. Still you want to
+be able to suspend using swsusp. This, however, means that you
+have to be able to either enter a passphrase or that you read
+the key(s) from an external device like a pcmcia flash disk
+or an usb stick prior to resume. So you need an initrd, that sets
+up dm-crypt and then asks swsusp to resume from the encrypted
+swap device.
+
+The most important thing is that you set up dm-crypt in such
+a way that the swap device you suspend to/resume from has
+always the same major/minor within the initrd as well as
+within your running system. The easiest way to achieve this is
+to always set up this swap device first with dmsetup, so that
+it will always look like the following:
+
+brw------- 1 root root 254, 0 Jul 28 13:37 /dev/mapper/swap0
+
+Now set up your kernel to use /dev/mapper/swap0 as the default
+resume partition, so your kernel .config contains:
+
+CONFIG_PM_STD_PARTITION="/dev/mapper/swap0"
+
+Prepare your boot loader to use the initrd you will create or
+modify. For lilo the simplest setup looks like the following
+lines:
+
+image=/boot/vmlinuz
+initrd=/boot/initrd.gz
+label=linux
+append="root=/dev/ram0 init=/linuxrc rw"
+
+Finally you need to create or modify your initrd. Lets assume
+you create an initrd that reads the required dm-crypt setup
+from a pcmcia flash disk card. The card is formatted with an ext2
+fs which resides on /dev/hde1 when the card is inserted. The
+card contains at least the encrypted swap setup in a file
+named "swapkey". /etc/fstab of your initrd contains something
+like the following:
+
+/dev/hda1 /mnt ext3 ro 0 0
+none /proc proc defaults,noatime,nodiratime 0 0
+none /sys sysfs defaults,noatime,nodiratime 0 0
+
+/dev/hda1 contains an unencrypted mini system that sets up all
+of your crypto devices, again by reading the setup from the
+pcmcia flash disk. What follows now is a /linuxrc for your
+initrd that allows you to resume from encrypted swap and that
+continues boot with your mini system on /dev/hda1 if resume
+does not happen:
+
+#!/bin/sh
+PATH=/sbin:/bin:/usr/sbin:/usr/bin
+mount /proc
+mount /sys
+mapped=0
+noresume=`grep -c noresume /proc/cmdline`
+if [ "$*" != "" ]
+then
+ noresume=1
+fi
+dmesg -n 1
+/sbin/cardmgr -q
+for i in 1 2 3 4 5 6 7 8 9 0
+do
+ if [ -f /proc/ide/hde/media ]
+ then
+ usleep 500000
+ mount -t ext2 -o ro /dev/hde1 /mnt
+ if [ -f /mnt/swapkey ]
+ then
+ dmsetup create swap0 /mnt/swapkey > /dev/null 2>&1 && mapped=1
+ fi
+ umount /mnt
+ break
+ fi
+ usleep 500000
+done
+killproc /sbin/cardmgr
+dmesg -n 6
+if [ $mapped = 1 ]
+then
+ if [ $noresume != 0 ]
+ then
+ mkswap /dev/mapper/swap0 > /dev/null 2>&1
+ fi
+ echo 254:0 > /sys/power/resume
+ dmsetup remove swap0
+fi
+umount /sys
+mount /mnt
+umount /proc
+cd /mnt
+pivot_root . mnt
+mount /proc
+umount -l /mnt
+umount /proc
+exec chroot . /sbin/init $* < dev/console > dev/console 2>&1
+
+Please don't mind the weird loop above, busybox's msh doesn't know
+the let statement. Now, what is happening in the script?
+First we have to decide if we want to try to resume, or not.
+We will not resume if booting with "noresume" or any parameters
+for init like "single" or "emergency" as boot parameters.
+
+Then we need to set up dmcrypt with the setup data from the
+pcmcia flash disk. If this succeeds we need to reset the swap
+device if we don't want to resume. The line "echo 254:0 > /sys/power/resume"
+then attempts to resume from the first device mapper device.
+Note that it is important to set the device in /sys/power/resume,
+regardless if resuming or not, otherwise later suspend will fail.
+If resume starts, script execution terminates here.
+
+Otherwise we just remove the encrypted swap device and leave it to the
+mini system on /dev/hda1 to set the whole crypto up (it is up to
+you to modify this to your taste).
+
+What then follows is the well known process to change the root
+file system and continue booting from there. I prefer to unmount
+the initrd prior to continue booting but it is up to you to modify
+this.
diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt
index 7a6b78966459..ddf907fbcc05 100644
--- a/Documentation/power/swsusp.txt
+++ b/Documentation/power/swsusp.txt
@@ -311,3 +311,10 @@ As a rule of thumb use encrypted swap to protect your data while your
system is shut down or suspended. Additionally use the encrypted
suspend image to prevent sensitive data from being stolen after
resume.
+
+Q: Why we cannot suspend to a swap file?
+
+A: Because accessing swap file needs the filesystem mounted, and
+filesystem might do something wrong (like replaying the journal)
+during mount. [Probably could be solved by modifying every filesystem
+to support some kind of "really read-only!" option. Patches welcome.]
diff --git a/Documentation/power/video.txt b/Documentation/power/video.txt
index 7a4a5036d123..1a44e8acb54c 100644
--- a/Documentation/power/video.txt
+++ b/Documentation/power/video.txt
@@ -46,6 +46,12 @@ There are a few types of systems where video works after S3 resume:
POSTing bios works. Ole Rohne has patch to do just that at
http://dev.gentoo.org/~marineam/patch-radeonfb-2.6.11-rc2-mm2.
+(8) on some systems, you can use the video_post utility mentioned here:
+ http://bugzilla.kernel.org/show_bug.cgi?id=3670. Do echo 3 > /sys/power/state
+ && /usr/sbin/video_post - which will initialize the display in console mode.
+ If you are in X, you can switch to a virtual terminal and back to X using
+ CTRL+ALT+F1 - CTRL+ALT+F7 to get the display working in graphical mode again.
+
Now, if you pass acpi_sleep=something, and it does not work with your
bios, you'll get a hard crash during resume. Be careful. Also it is
safest to do your experiments with plain old VGA console. The vesafb
@@ -64,7 +70,8 @@ Model hack (or "how to do it")
------------------------------------------------------------------------------
Acer Aspire 1406LC ole's late BIOS init (7), turn off DRI
Acer TM 242FX vbetool (6)
-Acer TM C300 vga=normal (only suspend on console, not in X), vbetool (6)
+Acer TM C110 video_post (8)
+Acer TM C300 vga=normal (only suspend on console, not in X), vbetool (6) or video_post (8)
Acer TM 4052LCi s3_bios (2)
Acer TM 636Lci s3_bios vga=normal (2)
Acer TM 650 (Radeon M7) vga=normal plus boot-radeon (5) gets text console back
diff --git a/Documentation/scsi/scsi_mid_low_api.txt b/Documentation/scsi/scsi_mid_low_api.txt
index da176c95d0fb..7536823c0cb1 100644
--- a/Documentation/scsi/scsi_mid_low_api.txt
+++ b/Documentation/scsi/scsi_mid_low_api.txt
@@ -388,7 +388,6 @@ Summary:
scsi_remove_device - detach and remove a SCSI device
scsi_remove_host - detach and remove all SCSI devices owned by host
scsi_report_bus_reset - report scsi _bus_ reset observed
- scsi_set_device - place device reference in host structure
scsi_track_queue_full - track successive QUEUE_FULL events
scsi_unblock_requests - allow further commands to be queued to given host
scsi_unregister - [calls scsi_host_put()]
@@ -741,20 +740,6 @@ void scsi_report_bus_reset(struct Scsi_Host * shost, int channel)
/**
- * scsi_set_device - place device reference in host structure
- * @shost: a pointer to a scsi host instance
- * @pdev: pointer to device instance to assign
- *
- * Returns nothing
- *
- * Might block: no
- *
- * Defined in: include/scsi/scsi_host.h .
- **/
-void scsi_set_device(struct Scsi_Host * shost, struct device * dev)
-
-
-/**
* scsi_track_queue_full - track successive QUEUE_FULL events on given
* device to determine if and when there is a need
* to adjust the queue depth on the device.
diff --git a/Documentation/serial/driver b/Documentation/serial/driver
index ac7eabbf662a..87856d3cfb67 100644
--- a/Documentation/serial/driver
+++ b/Documentation/serial/driver
@@ -111,24 +111,17 @@ hardware.
Interrupts: locally disabled.
This call must not sleep
- stop_tx(port,tty_stop)
+ stop_tx(port)
Stop transmitting characters. This might be due to the CTS
line becoming inactive or the tty layer indicating we want
- to stop transmission.
-
- tty_stop: 1 if this call is due to the TTY layer issuing a
- TTY stop to the driver (equiv to rs_stop).
+ to stop transmission due to an XOFF character.
Locking: port->lock taken.
Interrupts: locally disabled.
This call must not sleep
- start_tx(port,tty_start)
- start transmitting characters. (incidentally, nonempty will
- always be nonzero, and shouldn't be used - it will be dropped).
-
- tty_start: 1 if this call was due to the TTY layer issuing
- a TTY start to the driver (equiv to rs_start)
+ start_tx(port)
+ start transmitting characters.
Locking: port->lock taken.
Interrupts: locally disabled.
diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt
index 104a994b8289..5c49ba07e709 100644
--- a/Documentation/sound/alsa/ALSA-Configuration.txt
+++ b/Documentation/sound/alsa/ALSA-Configuration.txt
@@ -132,6 +132,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
mpu_irq - IRQ # for MPU-401 UART (PnP setup)
dma1 - first DMA # for AD1816A chip (PnP setup)
dma2 - second DMA # for AD1816A chip (PnP setup)
+ clockfreq - Clock frequency for AD1816A chip (default = 0, 33000Hz)
Module supports up to 8 cards, autoprobe and PnP.
@@ -636,11 +637,16 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
3stack-digout 3-jack in back, a HP out and a SPDIF out
5stack 5-jack in back, 2-jack in front
5stack-digout 5-jack in back, 2-jack in front, a SPDIF out
+ 6stack 6-jack in back, 2-jack in front
+ 6stack-digout 6-jack with a SPDIF out
w810 3-jack
z71v 3-jack (HP shared SPDIF)
asus 3-jack
uniwill 3-jack
F1734 2-jack
+ test for testing/debugging purpose, almost all controls can be
+ adjusted. Appearing only when compiled with
+ $CONFIG_SND_DEBUG=y
CMI9880
minimal 3-jack in back
@@ -1054,6 +1060,13 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
The power-management is supported.
+ Module snd-pxa2xx-ac97 (on arm only)
+ ------------------------------------
+
+ Module for AC97 driver for the Intel PXA2xx chip
+
+ For ARM architecture only.
+
Module snd-rme32
----------------
@@ -1173,6 +1186,13 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
Module supports up to 8 cards.
+ Module snd-sun-dbri (on sparc only)
+ -----------------------------------
+
+ Module for DBRI sound chips found on Sparcs.
+
+ Module supports up to 8 cards.
+
Module snd-wavefront
--------------------
@@ -1371,7 +1391,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
Module snd-vxpocket
-------------------
- Module for Digigram VX-Pocket VX2 PCMCIA card.
+ Module for Digigram VX-Pocket VX2 and 440 PCMCIA cards.
ibl - Capture IBL size. (default = 0, minimum size)
@@ -1391,29 +1411,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
Note: the driver is build only when CONFIG_ISA is set.
- Module snd-vxp440
- -----------------
-
- Module for Digigram VX-Pocket 440 PCMCIA card.
-
- ibl - Capture IBL size. (default = 0, minimum size)
-
- Module supports up to 8 cards. The module is compiled only when
- PCMCIA is supported on kernel.
-
- To activate the driver via the card manager, you'll need to set
- up /etc/pcmcia/vxp440.conf. See the sound/pcmcia/vx/vxp440.c.
-
- When the driver is compiled as a module and the hotplug firmware
- is supported, the firmware data is loaded via hotplug automatically.
- Install the necessary firmware files in alsa-firmware package.
- When no hotplug fw loader is available, you need to load the
- firmware via vxloader utility in alsa-tools package.
-
- About capture IBL, see the description of snd-vx222 module.
-
- Note: the driver is build only when CONFIG_ISA is set.
-
Module snd-ymfpci
-----------------
diff --git a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl
index db0b7d2dc477..0475478c2484 100644
--- a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl
+++ b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl
@@ -3422,10 +3422,17 @@ struct _snd_pcm_runtime {
<para>
The <structfield>iface</structfield> field specifies the type of
- the control,
- <constant>SNDRV_CTL_ELEM_IFACE_XXX</constant>. There are
- <constant>MIXER</constant>, <constant>PCM</constant>,
- <constant>CARD</constant>, etc.
+ the control, <constant>SNDRV_CTL_ELEM_IFACE_XXX</constant>, which
+ is usually <constant>MIXER</constant>.
+ Use <constant>CARD</constant> for global controls that are not
+ logically part of the mixer.
+ If the control is closely associated with some specific device on
+ the sound card, use <constant>HWDEP</constant>,
+ <constant>PCM</constant>, <constant>RAWMIDI</constant>,
+ <constant>TIMER</constant>, or <constant>SEQUENCER</constant>, and
+ specify the device number with the
+ <structfield>device</structfield> and
+ <structfield>subdevice</structfield> fields.
</para>
<para>
diff --git a/Documentation/stable_api_nonsense.txt b/Documentation/stable_api_nonsense.txt
index 3cea13875277..f39c9d714db3 100644
--- a/Documentation/stable_api_nonsense.txt
+++ b/Documentation/stable_api_nonsense.txt
@@ -132,7 +132,7 @@ to extra work for the USB developers. Since all Linux USB developers do
their work on their own time, asking programmers to do extra work for no
gain, for free, is not a possibility.
-Security issues are also a very important for Linux. When a
+Security issues are also very important for Linux. When a
security issue is found, it is fixed in a very short amount of time. A
number of times this has caused internal kernel interfaces to be
reworked to prevent the security problem from occurring. When this
diff --git a/Documentation/stable_kernel_rules.txt b/Documentation/stable_kernel_rules.txt
new file mode 100644
index 000000000000..2c81305090df
--- /dev/null
+++ b/Documentation/stable_kernel_rules.txt
@@ -0,0 +1,58 @@
+Everything you ever wanted to know about Linux 2.6 -stable releases.
+
+Rules on what kind of patches are accepted, and what ones are not, into
+the "-stable" tree:
+
+ - It must be obviously correct and tested.
+ - It can not bigger than 100 lines, with context.
+ - It must fix only one thing.
+ - It must fix a real bug that bothers people (not a, "This could be a
+ problem..." type thing.)
+ - It must fix a problem that causes a build error (but not for things
+ marked CONFIG_BROKEN), an oops, a hang, data corruption, a real
+ security issue, or some "oh, that's not good" issue. In short,
+ something critical.
+ - No "theoretical race condition" issues, unless an explanation of how
+ the race can be exploited.
+ - It can not contain any "trivial" fixes in it (spelling changes,
+ whitespace cleanups, etc.)
+ - It must be accepted by the relevant subsystem maintainer.
+ - It must follow Documentation/SubmittingPatches rules.
+
+
+Procedure for submitting patches to the -stable tree:
+
+ - Send the patch, after verifying that it follows the above rules, to
+ stable@kernel.org.
+ - The sender will receive an ack when the patch has been accepted into
+ the queue, or a nak if the patch is rejected. This response might
+ take a few days, according to the developer's schedules.
+ - If accepted, the patch will be added to the -stable queue, for review
+ by other developers.
+ - Security patches should not be sent to this alias, but instead to the
+ documented security@kernel.org.
+
+
+Review cycle:
+
+ - When the -stable maintainers decide for a review cycle, the patches
+ will be sent to the review committee, and the maintainer of the
+ affected area of the patch (unless the submitter is the maintainer of
+ the area) and CC: to the linux-kernel mailing list.
+ - The review committee has 48 hours in which to ack or nak the patch.
+ - If the patch is rejected by a member of the committee, or linux-kernel
+ members object to the patch, bringing up issues that the maintainers
+ and members did not realize, the patch will be dropped from the
+ queue.
+ - At the end of the review cycle, the acked patches will be added to
+ the latest -stable release, and a new -stable release will happen.
+ - Security patches will be accepted into the -stable tree directly from
+ the security kernel team, and not go through the normal review cycle.
+ Contact the kernel security team for more details on this procedure.
+
+
+Review committe:
+
+ - This will be made up of a number of kernel developers who have
+ volunteered for this task, and a few that haven't.
+
diff --git a/Documentation/usb/sn9c102.txt b/Documentation/usb/sn9c102.txt
index cf9a1187edce..3f8a119db31b 100644
--- a/Documentation/usb/sn9c102.txt
+++ b/Documentation/usb/sn9c102.txt
@@ -297,6 +297,7 @@ Vendor ID Product ID
0x0c45 0x602a
0x0c45 0x602b
0x0c45 0x602c
+0x0c45 0x602d
0x0c45 0x6030
0x0c45 0x6080
0x0c45 0x6082
@@ -333,6 +334,7 @@ Model Manufacturer
----- ------------
HV7131D Hynix Semiconductor, Inc.
MI-0343 Micron Technology, Inc.
+OV7630 OmniVision Technologies, Inc.
PAS106B PixArt Imaging, Inc.
PAS202BCB PixArt Imaging, Inc.
TAS5110C1B Taiwan Advanced Sensor Corporation
@@ -470,9 +472,11 @@ order):
- Luca Capello for the donation of a webcam;
- Joao Rodrigo Fuzaro, Joao Limirio, Claudio Filho and Caio Begotti for the
donation of a webcam;
+- Jon Hollstrom for the donation of a webcam;
- Carlos Eduardo Medaglia Dyonisio, who added the support for the PAS202BCB
image sensor;
- Stefano Mozzi, who donated 45 EU;
+- Andrew Pearce for the donation of a webcam;
- Bertrik Sikken, who reverse-engineered and documented the Huffman compression
algorithm used in the SN9C10x controllers and implemented the first decoder;
- Mizuno Takafumi for the donation of a webcam;
diff --git a/Documentation/usb/usbmon.txt b/Documentation/usb/usbmon.txt
index 2f8431f92b77..63cb7edd177e 100644
--- a/Documentation/usb/usbmon.txt
+++ b/Documentation/usb/usbmon.txt
@@ -101,6 +101,13 @@ Here is the list of words, from left to right:
or 3 and 2 positions, correspondingly.
- URB Status. This field makes no sense for submissions, but is present
to help scripts with parsing. In error case, it contains the error code.
+ In case of a setup packet, it contains a Setup Tag. If scripts read a number
+ in this field, they proceed to read Data Length. Otherwise, they read
+ the setup packet before reading the Data Length.
+- Setup packet, if present, consists of 5 words: one of each for bmRequestType,
+ bRequest, wValue, wIndex, wLength, as specified by the USB Specification 2.0.
+ These words are safe to decode if Setup Tag was 's'. Otherwise, the setup
+ packet was present, but not captured, and the fields contain filler.
- Data Length. This is the actual length in the URB.
- Data tag. The usbmon may not always capture data, even if length is nonzero.
Only if tag is '=', the data words are present.
@@ -125,25 +132,31 @@ class ParsedLine {
String data_str = st.nextToken();
int len = data_str.length() / 2;
int i;
+ int b; // byte is signed, apparently?! XXX
for (i = 0; i < len; i++) {
- data[data_len] = Byte.parseByte(
- data_str.substring(i*2, i*2 + 2),
- 16);
+ // data[data_len] = Byte.parseByte(
+ // data_str.substring(i*2, i*2 + 2),
+ // 16);
+ b = Integer.parseInt(
+ data_str.substring(i*2, i*2 + 2),
+ 16);
+ if (b >= 128)
+ b *= -1;
+ data[data_len] = (byte) b;
data_len++;
}
}
}
}
-This format is obviously deficient. For example, the setup packet for control
-transfers is not delivered. This will change in the future.
+This format may be changed in the future.
Examples:
-An input control transfer to get a port status:
+An input control transfer to get a port status.
-d74ff9a0 2640288196 S Ci:001:00 -115 4 <
-d74ff9a0 2640288202 C Ci:001:00 0 4 = 01010100
+d5ea89a0 3575914555 S Ci:001:00 s a3 00 0000 0003 0004 4 <
+d5ea89a0 3575914560 C Ci:001:00 0 4 = 01050000
An output bulk transfer to send a SCSI command 0x5E in a 31-byte Bulk wrapper
to a storage device at address 5:
diff --git a/Documentation/video4linux/CARDLIST.bttv b/Documentation/video4linux/CARDLIST.bttv
index aeeafec0594c..62a12a08e2ac 100644
--- a/Documentation/video4linux/CARDLIST.bttv
+++ b/Documentation/video4linux/CARDLIST.bttv
@@ -1,4 +1,4 @@
-card=0 - *** UNKNOWN/GENERIC ***
+card=0 - *** UNKNOWN/GENERIC ***
card=1 - MIRO PCTV
card=2 - Hauppauge (bt848)
card=3 - STB, Gateway P/N 6000699 (bt848)
diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88
index 4377aa11f567..03deb0726aa4 100644
--- a/Documentation/video4linux/CARDLIST.cx88
+++ b/Documentation/video4linux/CARDLIST.cx88
@@ -27,3 +27,6 @@ card=25 - Digital-Logic MICROSPACE Entertainment Center (MEC)
card=26 - IODATA GV/BCTV7E
card=27 - PixelView PlayTV Ultra Pro (Stereo)
card=28 - DViCO FusionHDTV 3 Gold-T
+card=29 - ADS Tech Instant TV DVB-T PCI
+card=30 - TerraTec Cinergy 1400 DVB-T
+card=31 - DViCO FusionHDTV 5 Gold
diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134
index 735e8ba02d9f..1b5a3a9ffbe2 100644
--- a/Documentation/video4linux/CARDLIST.saa7134
+++ b/Documentation/video4linux/CARDLIST.saa7134
@@ -1,10 +1,10 @@
- 0 -> UNKNOWN/GENERIC
+ 0 -> UNKNOWN/GENERIC
1 -> Proteus Pro [philips reference design] [1131:2001,1131:2001]
2 -> LifeView FlyVIDEO3000 [5168:0138,4e42:0138]
3 -> LifeView FlyVIDEO2000 [5168:0138]
4 -> EMPRESS [1131:6752]
5 -> SKNet Monster TV [1131:4e85]
- 6 -> Tevion MD 9717
+ 6 -> Tevion MD 9717
7 -> KNC One TV-Station RDS / Typhoon TV Tuner RDS [1131:fe01,1894:fe01]
8 -> Terratec Cinergy 400 TV [153B:1142]
9 -> Medion 5044
@@ -34,6 +34,7 @@
33 -> AVerMedia DVD EZMaker [1461:10ff]
34 -> Noval Prime TV 7133
35 -> AverMedia AverTV Studio 305 [1461:2115]
+ 36 -> UPMOST PURPLE TV [12ab:0800]
37 -> Items MuchTV Plus / IT-005
38 -> Terratec Cinergy 200 TV [153B:1152]
39 -> LifeView FlyTV Platinum Mini [5168:0212]
@@ -43,20 +44,21 @@
43 -> :Zolid Xpert TV7134
44 -> Empire PCI TV-Radio LE
45 -> Avermedia AVerTV Studio 307 [1461:9715]
- 46 -> AVerMedia Cardbus TV/Radio [1461:d6ee]
+ 46 -> AVerMedia Cardbus TV/Radio (E500) [1461:d6ee]
47 -> Terratec Cinergy 400 mobile [153b:1162]
48 -> Terratec Cinergy 600 TV MK3 [153B:1158]
49 -> Compro VideoMate Gold+ Pal [185b:c200]
50 -> Pinnacle PCTV 300i DVB-T + PAL [11bd:002d]
51 -> ProVideo PV952 [1540:9524]
52 -> AverMedia AverTV/305 [1461:2108]
+ 53 -> ASUS TV-FM 7135 [1043:4845]
54 -> LifeView FlyTV Platinum FM [5168:0214,1489:0214]
- 55 -> LifeView FlyDVB-T DUO [5168:0306]
+ 55 -> LifeView FlyDVB-T DUO [5168:0502,5168:0306]
56 -> Avermedia AVerTV 307 [1461:a70a]
57 -> Avermedia AVerTV GO 007 FM [1461:f31f]
58 -> ADS Tech Instant TV (saa7135) [1421:0350,1421:0370]
59 -> Kworld/Tevion V-Stream Xpert TV PVR7134
- 60 -> Typhoon DVB-T Duo Digital/Analog Cardbus
- 61 -> Philips TOUGH DVB-T reference design
+ 60 -> Typhoon DVB-T Duo Digital/Analog Cardbus [4e42:0502]
+ 61 -> Philips TOUGH DVB-T reference design [1131:2004]
62 -> Compro VideoMate TV Gold+II
63 -> Kworld Xpert TV PVR7134
diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner
index e78020f68b2e..f3302e1b1b9c 100644
--- a/Documentation/video4linux/CARDLIST.tuner
+++ b/Documentation/video4linux/CARDLIST.tuner
@@ -56,9 +56,11 @@ tuner=54 - tda8290+75
tuner=55 - LG PAL (TAPE series)
tuner=56 - Philips PAL/SECAM multi (FQ1216AME MK4)
tuner=57 - Philips FQ1236A MK4
-tuner=58 - Ymec TVision TVF-8531MF
+tuner=58 - Ymec TVision TVF-8531MF/8831MF/8731MF
tuner=59 - Ymec TVision TVF-5533MF
tuner=60 - Thomson DDT 7611 (ATSC/NTSC)
-tuner=61 - Tena TNF9533-D/IF
+tuner=61 - Tena TNF9533-D/IF/TNF9533-B/DF
tuner=62 - Philips TEA5767HN FM Radio
tuner=63 - Philips FMD1216ME MK3 Hybrid Tuner
+tuner=64 - LG TDVS-H062F/TUA6034
+tuner=65 - Ymec TVF66T5-B/DFF
diff --git a/Documentation/video4linux/bttv/Cards b/Documentation/video4linux/bttv/Cards
index 7f8c7eb70ab2..8f1941ede4da 100644
--- a/Documentation/video4linux/bttv/Cards
+++ b/Documentation/video4linux/bttv/Cards
@@ -20,7 +20,7 @@ All other cards only differ by additional components as tuners, sound
decoders, EEPROMs, teletext decoders ...
-Unsupported Cards:
+Unsupported Cards:
------------------
Cards with Zoran (ZR) or Philips (SAA) or ISA are not supported by
@@ -50,11 +50,11 @@ Bt848a/Bt849 single crytal operation support possible!!!
Miro/Pinnacle PCTV
------------------
-- Bt848
- some (all??) come with 2 crystals for PAL/SECAM and NTSC
+- Bt848
+ some (all??) come with 2 crystals for PAL/SECAM and NTSC
- PAL, SECAM or NTSC TV tuner (Philips or TEMIC)
- MSP34xx sound decoder on add on board
- decoder is supported but AFAIK does not yet work
+ decoder is supported but AFAIK does not yet work
(other sound MUX setting in GPIO port needed??? somebody who fixed this???)
- 1 tuner, 1 composite and 1 S-VHS input
- tuner type is autodetected
@@ -70,7 +70,7 @@ in 1997!
Hauppauge Win/TV pci
--------------------
-There are many different versions of the Hauppauge cards with different
+There are many different versions of the Hauppauge cards with different
tuners (TV+Radio ...), teletext decoders.
Note that even cards with same model numbers have (depending on the revision)
different chips on it.
@@ -80,22 +80,22 @@ different chips on it.
- PAL, SECAM, NTSC or tuner with or without Radio support
e.g.:
- PAL:
+ PAL:
TDA5737: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners
TSA5522: 1.4 GHz I2C-bus controlled synthesizer, I2C 0xc2-0xc3
-
+
NTSC:
TDA5731: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners
TSA5518: no datasheet available on Philips site
-- Philips SAA5246 or SAA5284 ( or no) Teletext decoder chip
+- Philips SAA5246 or SAA5284 ( or no) Teletext decoder chip
with buffer RAM (e.g. Winbond W24257AS-35: 32Kx8 CMOS static RAM)
SAA5246 (I2C 0x22) is supported
-- 256 bytes EEPROM: Microchip 24LC02B or Philips 8582E2Y
+- 256 bytes EEPROM: Microchip 24LC02B or Philips 8582E2Y
with configuration information
I2C address 0xa0 (24LC02B also responds to 0xa2-0xaf)
- 1 tuner, 1 composite and (depending on model) 1 S-VHS input
- 14052B: mux for selection of sound source
-- sound decoder: TDA9800, MSP34xx (stereo cards)
+- sound decoder: TDA9800, MSP34xx (stereo cards)
Askey CPH-Series
@@ -108,17 +108,17 @@ Developed by TelSignal(?), OEMed by many vendors (Typhoon, Anubis, Dynalink)
CPH05x: BT878 with FM
CPH06x: BT878 (w/o FM)
CPH07x: BT878 capture only
-
+
TV standards:
CPH0x0: NTSC-M/M
CPH0x1: PAL-B/G
CPH0x2: PAL-I/I
CPH0x3: PAL-D/K
- CPH0x4: SECAM-L/L
- CPH0x5: SECAM-B/G
- CPH0x6: SECAM-D/K
- CPH0x7: PAL-N/N
- CPH0x8: PAL-B/H
+ CPH0x4: SECAM-L/L
+ CPH0x5: SECAM-B/G
+ CPH0x6: SECAM-D/K
+ CPH0x7: PAL-N/N
+ CPH0x8: PAL-B/H
CPH0x9: PAL-M/M
CPH03x was often sold as "TV capturer".
@@ -174,7 +174,7 @@ Lifeview Flyvideo Series:
"The FlyVideo2000 and FlyVideo2000s product name have renamed to FlyVideo98."
Their Bt8x8 cards are listed as discontinued.
Flyvideo 2000S was probably sold as Flyvideo 3000 in some contries(Europe?).
- The new Flyvideo 2000/3000 are SAA7130/SAA7134 based.
+ The new Flyvideo 2000/3000 are SAA7130/SAA7134 based.
"Flyvideo II" had been the name for the 848 cards, nowadays (in Germany)
this name is re-used for LR50 Rev.W.
@@ -235,12 +235,12 @@ Prolink
Multimedia TV packages (card + software pack):
PixelView Play TV Theater - (Model: PV-M4200) = PixelView Play TV pro + Software
PixelView Play TV PAK - (Model: PV-BT878P+ REV 4E)
- PixelView Play TV/VCR - (Model: PV-M3200 REV 4C / 8D / 10A )
+ PixelView Play TV/VCR - (Model: PV-M3200 REV 4C / 8D / 10A )
PixelView Studio PAK - (Model: M2200 REV 4C / 8D / 10A )
PixelView PowerStudio PAK - (Model: PV-M3600 REV 4E)
PixelView DigitalVCR PAK - (Model: PV-M2400 REV 4C / 8D / 10A )
- PixelView PlayTV PAK II (TV/FM card + usb camera) PV-M3800
+ PixelView PlayTV PAK II (TV/FM card + usb camera) PV-M3800
PixelView PlayTV XP PV-M4700,PV-M4700(w/FM)
PixelView PlayTV DVR PV-M4600 package contents:PixelView PlayTV pro, windvr & videoMail s/w
@@ -254,7 +254,7 @@ Prolink
DTV3000 PV-DTV3000P+ DVB-S CI = Twinhan VP-1030
DTV2000 DVB-S = Twinhan VP-1020
-
+
Video Conferencing:
PixelView Meeting PAK - (Model: PV-BT878P)
PixelView Meeting PAK Lite - (Model: PV-BT878P)
@@ -308,7 +308,7 @@ KNC One
newer Cards have saa7134, but model name stayed the same?
-Provideo
+Provideo
--------
PV951 or PV-951 (also are sold as:
Boeder TV-FM Video Capture Card
@@ -353,7 +353,7 @@ AVerMedia
AVerTV
AVerTV Stereo
AVerTV Studio (w/FM)
- AVerMedia TV98 with Remote
+ AVerMedia TV98 with Remote
AVerMedia TV/FM98 Stereo
AVerMedia TVCAM98
TVCapture (Bt848)
@@ -373,7 +373,7 @@ AVerMedia
(1) Daughterboard MB68-A with TDA9820T and TDA9840T
(2) Sony NE41S soldered (stereo sound?)
(3) Daughterboard M118-A w/ pic 16c54 and 4 MHz quartz
-
+
US site has different drivers for (as of 09/2002):
EZ Capture/InterCam PCI (BT-848 chip)
EZ Capture/InterCam PCI (BT-878 chip)
@@ -437,7 +437,7 @@ Terratec
Terra TValueRadio, "LR102 Rev.C" printed on the PCB
Terra TV/Radio+ Version 1.0, "80-CP2830100-0" TTTV3 printed on the PCB,
"CPH010-E83" on the back, SAA6588T, TDA9873H
- Terra TValue Version BT878, "80-CP2830110-0 TTTV4" printed on the PCB,
+ Terra TValue Version BT878, "80-CP2830110-0 TTTV4" printed on the PCB,
"CPH011-D83" on back
Terra TValue Version 1.0 "ceb105.PCB" (really identical to Terra TV+ Version 1.0)
Terra TValue New Revision "LR102 Rec.C"
@@ -528,7 +528,7 @@ Koutech
KW-606RSF
KW-607A (capture only)
KW-608 (Zoran capture only)
-
+
IODATA (jp)
------
GV-BCTV/PCI
@@ -542,15 +542,15 @@ Canopus (jp)
-------
WinDVR = Kworld "KW-TVL878RF"
-www.sigmacom.co.kr
+www.sigmacom.co.kr
------------------
- Sigma Cyber TV II
+ Sigma Cyber TV II
www.sasem.co.kr
---------------
Litte OnAir TV
-hama
+hama
----
TV/Radio-Tuner Card, PCI (Model 44677) = CPH051
@@ -638,7 +638,7 @@ Media-Surfer (esc-kathrein.de)
Jetway (www.jetway.com.tw)
--------------------------
- JW-TV 878M
+ JW-TV 878M
JW-TV 878 = KWorld KW-TV878RF
Galaxis
@@ -715,7 +715,7 @@ Hauppauge
809 MyVideo
872 MyTV2Go FM
-
+
546 WinTV Nova-S CI
543 WinTV Nova
907 Nova-S USB
@@ -739,7 +739,7 @@ Hauppauge
832 MyTV2Go
869 MyTV2Go-FM
805 MyVideo (USB)
-
+
Matrix-Vision
-------------
@@ -764,7 +764,7 @@ Gallant (www.gallantcom.com) www.minton.com.tw
Intervision IV-550 (bt8x8)
Intervision IV-100 (zoran)
Intervision IV-1000 (bt8x8)
-
+
Asonic (www.asonic.com.cn) (website down)
-----------------------------------------
SkyEye tv 878
@@ -804,11 +804,11 @@ Kworld (www.kworld.com.tw)
JTT/ Justy Corp.http://www.justy.co.jp/ (www.jtt.com.jp website down)
---------------------------------------------------------------------
- JTT-02 (JTT TV) "TV watchmate pro" (bt848)
+ JTT-02 (JTT TV) "TV watchmate pro" (bt848)
ADS www.adstech.com
-------------------
- Channel Surfer TV ( CHX-950 )
+ Channel Surfer TV ( CHX-950 )
Channel Surfer TV+FM ( CHX-960FM )
AVEC www.prochips.com
@@ -874,7 +874,7 @@ www.ids-imaging.de
------------------
Falcon Series (capture only)
In USA: http://www.theimagingsource.com/
- DFG/LC1
+ DFG/LC1
www.sknet-web.co.jp
-------------------
@@ -890,7 +890,7 @@ Cybertainment
CyberMail Xtreme
These are Flyvideo
-VCR (http://www.vcrinc.com/)
+VCR (http://www.vcrinc.com/)
---
Video Catcher 16
@@ -920,7 +920,7 @@ Sdisilk www.sdisilk.com/
SDI Silk 200 SDI Input Card
www.euresys.com
- PICOLO series
+ PICOLO series
PMC/Pace
www.pacecom.co.uk website closed
diff --git a/Documentation/video4linux/bttv/Insmod-options b/Documentation/video4linux/bttv/Insmod-options
index 7bb5a50b0779..fc94ff235ffa 100644
--- a/Documentation/video4linux/bttv/Insmod-options
+++ b/Documentation/video4linux/bttv/Insmod-options
@@ -44,6 +44,9 @@ bttv.o
push used by bttv. bttv will disable overlay
by default on this hardware to avoid crashes.
With this insmod option you can override this.
+ no_overlay=1 Disable overlay. It should be used by broken
+ hardware that doesn't support PCI2PCI direct
+ transfers.
automute=0/1 Automatically mutes the sound if there is
no TV signal, on by default. You might try
to disable this if you have bad input signal
diff --git a/Documentation/video4linux/not-in-cx2388x-datasheet.txt b/Documentation/video4linux/not-in-cx2388x-datasheet.txt
index 96b638b5ba1d..edbfe744d21d 100644
--- a/Documentation/video4linux/not-in-cx2388x-datasheet.txt
+++ b/Documentation/video4linux/not-in-cx2388x-datasheet.txt
@@ -34,4 +34,8 @@ MO_OUTPUT_FORMAT (0x310164)
2: HACTEXT
1: HSFMT
+0x47 is the sync byte for MPEG-2 transport stream packets.
+Datasheet incorrectly states to use 47 decimal. 188 is the length.
+All DVB compliant frontends output packets with this start code.
+
=================================================================================
diff --git a/Documentation/vm/locking b/Documentation/vm/locking
index c3ef09ae3bb1..f366fa956179 100644
--- a/Documentation/vm/locking
+++ b/Documentation/vm/locking
@@ -83,19 +83,18 @@ single address space optimization, so that the zap_page_range (from
vmtruncate) does not lose sending ipi's to cloned threads that might
be spawned underneath it and go to user mode to drag in pte's into tlbs.
-swap_list_lock/swap_device_lock
--------------------------------
+swap_lock
+--------------
The swap devices are chained in priority order from the "swap_list" header.
The "swap_list" is used for the round-robin swaphandle allocation strategy.
The #free swaphandles is maintained in "nr_swap_pages". These two together
-are protected by the swap_list_lock.
+are protected by the swap_lock.
-The swap_device_lock, which is per swap device, protects the reference
-counts on the corresponding swaphandles, maintained in the "swap_map"
-array, and the "highest_bit" and "lowest_bit" fields.
+The swap_lock also protects all the device reference counts on the
+corresponding swaphandles, maintained in the "swap_map" array, and the
+"highest_bit" and "lowest_bit" fields.
-Both of these are spinlocks, and are never acquired from intr level. The
-locking hierarchy is swap_list_lock -> swap_device_lock.
+The swap_lock is a spinlock, and is never acquired from intr level.
To prevent races between swap space deletion or async readahead swapins
deciding whether a swap handle is being used, ie worthy of being read in
diff --git a/Documentation/watchdog/watchdog-api.txt b/Documentation/watchdog/watchdog-api.txt
index 28388aa700c6..c5beb548cfc4 100644
--- a/Documentation/watchdog/watchdog-api.txt
+++ b/Documentation/watchdog/watchdog-api.txt
@@ -228,6 +228,26 @@ advantechwdt.c -- Advantech Single Board Computer
The GETSTATUS call returns if the device is open or not.
[FIXME -- silliness again?]
+booke_wdt.c -- PowerPC BookE Watchdog Timer
+
+ Timeout default varies according to frequency, supports
+ SETTIMEOUT
+
+ Watchdog can not be turned off, CONFIG_WATCHDOG_NOWAYOUT
+ does not make sense
+
+ GETSUPPORT returns the watchdog_info struct, and
+ GETSTATUS returns the supported options. GETBOOTSTATUS
+ returns a 1 if the last reset was caused by the
+ watchdog and a 0 otherwise. This watchdog can not be
+ disabled once it has been started. The wdt_period kernel
+ parameter selects which bit of the time base changing
+ from 0->1 will trigger the watchdog exception. Changing
+ the timeout from the ioctl calls will change the
+ wdt_period as defined above. Finally if you would like to
+ replace the default Watchdog Handler you can implement the
+ WatchdogHandler() function in your own code.
+
eurotechwdt.c -- Eurotech CPU-1220/1410
The timeout can be set using the SETTIMEOUT ioctl and defaults
diff --git a/Documentation/x86_64/boot-options.txt b/Documentation/x86_64/boot-options.txt
index b9e6be00cadf..678e8f192db2 100644
--- a/Documentation/x86_64/boot-options.txt
+++ b/Documentation/x86_64/boot-options.txt
@@ -6,6 +6,11 @@ only the AMD64 specific ones are listed here.
Machine check
mce=off disable machine check
+ mce=bootlog Enable logging of machine checks left over from booting.
+ Disabled by default because some BIOS leave bogus ones.
+ If your BIOS doesn't do that it's a good idea to enable though
+ to make sure you log even machine check events that result
+ in a reboot.
nomce (for compatibility with i386): same as mce=off
@@ -47,7 +52,7 @@ Timing
notsc
Don't use the CPU time stamp counter to read the wall time.
This can be used to work around timing problems on multiprocessor systems
- with not properly synchronized CPUs. Only useful with a SMP kernel
+ with not properly synchronized CPUs.
report_lost_ticks
Report when timer interrupts are lost because some code turned off
@@ -74,6 +79,9 @@ Idle loop
event. This will make the CPUs eat a lot more power, but may be useful
to get slightly better performance in multiprocessor benchmarks. It also
makes some profiling using performance counters more accurate.
+ Please note that on systems with MONITOR/MWAIT support (like Intel EM64T
+ CPUs) this option has no performance advantage over the normal idle loop.
+ It may also interact badly with hyperthreading.
Rebooting
@@ -178,6 +186,5 @@ Debugging
Misc
noreplacement Don't replace instructions with more appropiate ones
- for the CPU. This may be useful on asymmetric MP systems
- where some CPU have less capabilities than the others.
-
+ for the CPU. This may be useful on asymmetric MP systems
+ where some CPU have less capabilities than the others.
OpenPOWER on IntegriCloud