diff options
Diffstat (limited to 'Documentation')
134 files changed, 8668 insertions, 2538 deletions
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX index 8de8a01a2474..f28a24e0279b 100644 --- a/Documentation/00-INDEX +++ b/Documentation/00-INDEX @@ -138,6 +138,8 @@ java.txt - info on the in-kernel binary support for Java(tm). kbuild/ - directory with info about the kernel build process. +kdumpt.txt + - mini HowTo on getting the crash dump code to work. kernel-doc-nano-HOWTO.txt - mini HowTo on generation and location of kernel documentation files. kernel-docs.txt diff --git a/Documentation/Changes b/Documentation/Changes index 57542bc25edd..5eaab0441d76 100644 --- a/Documentation/Changes +++ b/Documentation/Changes @@ -44,9 +44,9 @@ running, the suggested command should tell you. Again, keep in mind that this list assumes you are already functionally running a Linux 2.4 kernel. Also, not all tools are -necessary on all systems; obviously, if you don't have any PCMCIA (PC -Card) hardware, for example, you probably needn't concern yourself -with pcmcia-cs. +necessary on all systems; obviously, if you don't have any ISDN +hardware, for example, you probably needn't concern yourself with +isdn4k-utils. o Gnu C 2.95.3 # gcc --version o Gnu make 3.79.1 # make --version @@ -57,13 +57,15 @@ o e2fsprogs 1.29 # tune2fs o jfsutils 1.1.3 # fsck.jfs -V o reiserfsprogs 3.6.3 # reiserfsck -V 2>&1|grep reiserfsprogs o xfsprogs 2.6.0 # xfs_db -V +o pcmciautils 004 o pcmcia-cs 3.1.21 # cardmgr -V o quota-tools 3.09 # quota -V o PPP 2.4.0 # pppd --version o isdn4k-utils 3.1pre1 # isdnctrl 2>&1|grep version o nfs-utils 1.0.5 # showmount --version o procps 3.2.0 # ps --version -o oprofile 0.5.3 # oprofiled --version +o oprofile 0.9 # oprofiled --version +o udev 058 # udevinfo -V Kernel compilation ================== @@ -186,13 +188,20 @@ architecture independent and any version from 2.0.0 onward should work correctly with this version of the XFS kernel code (2.6.0 or later is recommended, due to some significant improvements). +PCMCIAutils +----------- + +PCMCIAutils replaces pcmcia-cs (see below). It properly sets up +PCMCIA sockets at system startup and loads the appropriate modules +for 16-bit PCMCIA devices if the kernel is modularized and the hotplug +subsystem is used. Pcmcia-cs --------- PCMCIA (PC Card) support is now partially implemented in the main -kernel source. Pay attention when you recompile your kernel ;-). -Also, be sure to upgrade to the latest pcmcia-cs release. +kernel source. The "pcmciautils" package (see above) replaces pcmcia-cs +for newest kernels. Quota-tools ----------- @@ -349,9 +358,13 @@ Xfsprogs -------- o <ftp://oss.sgi.com/projects/xfs/download/> +Pcmciautils +----------- +o <ftp://ftp.kernel.org/pub/linux/utils/kernel/pcmcia/> + Pcmcia-cs --------- -o <ftp://pcmcia-cs.sourceforge.net/pub/pcmcia-cs/pcmcia-cs-3.1.21.tar.gz> +o <http://pcmcia-cs.sourceforge.net/> Quota-tools ---------- diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile index e69b3d2e7884..fa3e29ad8a46 100644 --- a/Documentation/DocBook/Makefile +++ b/Documentation/DocBook/Makefile @@ -8,7 +8,7 @@ DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \ kernel-hacking.xml kernel-locking.xml deviceiobook.xml \ - procfs-guide.xml writing_usb_driver.xml scsidrivers.xml \ + procfs-guide.xml writing_usb_driver.xml \ sis900.xml kernel-api.xml journal-api.xml lsm.xml usb.xml \ gadget.xml libata.xml mtdnand.xml librs.xml @@ -49,7 +49,7 @@ installmandocs: mandocs KERNELDOC = scripts/kernel-doc DOCPROC = scripts/basic/docproc -XMLTOFLAGS = -m Documentation/DocBook/stylesheet.xsl +XMLTOFLAGS = -m $(srctree)/Documentation/DocBook/stylesheet.xsl #XMLTOFLAGS += --skip-validation ### diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl index 757cef8f8491..d650ce36485f 100644 --- a/Documentation/DocBook/kernel-api.tmpl +++ b/Documentation/DocBook/kernel-api.tmpl @@ -266,7 +266,7 @@ X!Ekernel/module.c <chapter id="hardware"> <title>Hardware Interfaces</title> <sect1><title>Interrupt Handling</title> -!Iarch/i386/kernel/irq.c +!Ikernel/irq/manage.c </sect1> <sect1><title>Resources Management</title> @@ -338,7 +338,6 @@ X!Earch/i386/kernel/mca.c X!Iinclude/linux/device.h --> !Edrivers/base/driver.c -!Edrivers/base/class_simple.c !Edrivers/base/core.c !Edrivers/base/firmware_class.c !Edrivers/base/transport_class.c diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl index 6df1dfd18b65..375ae760dc1e 100644 --- a/Documentation/DocBook/libata.tmpl +++ b/Documentation/DocBook/libata.tmpl @@ -84,6 +84,14 @@ void (*port_disable) (struct ata_port *); Called from ata_bus_probe() and ata_bus_reset() error paths, as well as when unregistering from the SCSI module (rmmod, hot unplug). + This function should do whatever needs to be done to take the + port out of use. In most cases, ata_port_disable() can be used + as this hook. + </para> + <para> + Called from ata_bus_probe() on a failed probe. + Called from ata_bus_reset() on a failed bus reset. + Called from ata_scsi_release(). </para> </sect2> @@ -98,6 +106,13 @@ void (*dev_config) (struct ata_port *, struct ata_device *); found. Typically used to apply device-specific fixups prior to issue of SET FEATURES - XFER MODE, and prior to operation. </para> + <para> + Called by ata_device_add() after ata_dev_identify() determines + a device is present. + </para> + <para> + This entry may be specified as NULL in ata_port_operations. + </para> </sect2> @@ -135,6 +150,8 @@ void (*tf_read) (struct ata_port *ap, struct ata_taskfile *tf); registers / DMA buffers. ->tf_read() is called to read the hardware registers / DMA buffers, to obtain the current set of taskfile register values. + Most drivers for taskfile-based hardware (PIO or MMIO) use + ata_tf_load() and ata_tf_read() for these hooks. </para> </sect2> @@ -147,6 +164,8 @@ void (*exec_command)(struct ata_port *ap, struct ata_taskfile *tf); <para> causes an ATA command, previously loaded with ->tf_load(), to be initiated in hardware. + Most drivers for taskfile-based hardware use ata_exec_command() + for this hook. </para> </sect2> @@ -161,6 +180,10 @@ Allow low-level driver to filter ATA PACKET commands, returning a status indicating whether or not it is OK to use DMA for the supplied PACKET command. </para> + <para> + This hook may be specified as NULL, in which case libata will + assume that atapi dma can be supported. + </para> </sect2> @@ -175,6 +198,14 @@ u8 (*check_err)(struct ata_port *ap); Reads the Status/AltStatus/Error ATA shadow register from hardware. On some hardware, reading the Status register has the side effect of clearing the interrupt condition. + Most drivers for taskfile-based hardware use + ata_check_status() for this hook. + </para> + <para> + Note that because this is called from ata_device_add(), at + least a dummy function that clears device interrupts must be + provided for all drivers, even if the controller doesn't + actually have a taskfile status register. </para> </sect2> @@ -188,7 +219,13 @@ void (*dev_select)(struct ata_port *ap, unsigned int device); Issues the low-level hardware command(s) that causes one of N hardware devices to be considered 'selected' (active and available for use) on the ATA bus. This generally has no -meaning on FIS-based devices. + meaning on FIS-based devices. + </para> + <para> + Most drivers for taskfile-based hardware use + ata_std_dev_select() for this hook. Controllers which do not + support second drives on a port (such as SATA contollers) will + use ata_noop_dev_select(). </para> </sect2> @@ -204,6 +241,8 @@ void (*phy_reset) (struct ata_port *ap); for device presence (PATA and SATA), typically a soft reset (SRST) will be performed. Drivers typically use the helper functions ata_bus_reset() or sata_phy_reset() for this hook. + Many SATA drivers use sata_phy_reset() or call it from within + their own phy_reset() functions. </para> </sect2> @@ -227,6 +266,25 @@ PCI IDE DMA Status register. These hooks are typically either no-ops, or simply not implemented, in FIS-based drivers. </para> + <para> +Most legacy IDE drivers use ata_bmdma_setup() for the bmdma_setup() +hook. ata_bmdma_setup() will write the pointer to the PRD table to +the IDE PRD Table Address register, enable DMA in the DMA Command +register, and call exec_command() to begin the transfer. + </para> + <para> +Most legacy IDE drivers use ata_bmdma_start() for the bmdma_start() +hook. ata_bmdma_start() will write the ATA_DMA_START flag to the DMA +Command register. + </para> + <para> +Many legacy IDE drivers use ata_bmdma_stop() for the bmdma_stop() +hook. ata_bmdma_stop() clears the ATA_DMA_START flag in the DMA +command register. + </para> + <para> +Many legacy IDE drivers use ata_bmdma_status() as the bmdma_status() hook. + </para> </sect2> @@ -250,6 +308,10 @@ int (*qc_issue) (struct ata_queued_cmd *qc); helper function ata_qc_issue_prot() for taskfile protocol-based dispatch. More advanced drivers implement their own ->qc_issue. </para> + <para> + ata_qc_issue_prot() calls ->tf_load(), ->bmdma_setup(), and + ->bmdma_start() as necessary to initiate a transfer. + </para> </sect2> @@ -279,6 +341,21 @@ void (*irq_clear) (struct ata_port *); before the interrupt handler is registered, to be sure hardware is quiet. </para> + <para> + The second argument, dev_instance, should be cast to a pointer + to struct ata_host_set. + </para> + <para> + Most legacy IDE drivers use ata_interrupt() for the + irq_handler hook, which scans all ports in the host_set, + determines which queued command was active (if any), and calls + ata_host_intr(ap,qc). + </para> + <para> + Most legacy IDE drivers use ata_bmdma_irq_clear() for the + irq_clear() hook, which simply clears the interrupt and error + flags in the DMA status register. + </para> </sect2> @@ -292,6 +369,7 @@ void (*scr_write) (struct ata_port *ap, unsigned int sc_reg, <para> Read and write standard SATA phy registers. Currently only used if ->phy_reset hook called the sata_phy_reset() helper function. + sc_reg is one of SCR_STATUS, SCR_CONTROL, SCR_ERROR, or SCR_ACTIVE. </para> </sect2> @@ -307,17 +385,29 @@ void (*host_stop) (struct ata_host_set *host_set); ->port_start() is called just after the data structures for each port are initialized. Typically this is used to alloc per-port DMA buffers / tables / rings, enable DMA engines, and similar - tasks. + tasks. Some drivers also use this entry point as a chance to + allocate driver-private memory for ap->private_data. + </para> + <para> + Many drivers use ata_port_start() as this hook or call + it from their own port_start() hooks. ata_port_start() + allocates space for a legacy IDE PRD table and returns. </para> <para> ->port_stop() is called after ->host_stop(). It's sole function is to release DMA/memory resources, now that they are no longer - actively being used. + actively being used. Many drivers also free driver-private + data from port at this time. + </para> + <para> + Many drivers use ata_port_stop() as this hook, which frees the + PRD table. </para> <para> ->host_stop() is called after all ->port_stop() calls have completed. The hook must finalize hardware shutdown, release DMA and other resources, etc. + This hook may be specified as NULL, in which case it is not called. </para> </sect2> diff --git a/Documentation/DocBook/scsidrivers.tmpl b/Documentation/DocBook/scsidrivers.tmpl deleted file mode 100644 index d058e65daf19..000000000000 --- a/Documentation/DocBook/scsidrivers.tmpl +++ /dev/null @@ -1,193 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" - "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> - -<book id="scsidrivers"> - <bookinfo> - <title>SCSI Subsystem Interfaces</title> - - <authorgroup> - <author> - <firstname>Douglas</firstname> - <surname>Gilbert</surname> - <affiliation> - <address> - <email>dgilbert@interlog.com</email> - </address> - </affiliation> - </author> - </authorgroup> - <pubdate>2003-08-11</pubdate> - - <copyright> - <year>2002</year> - <year>2003</year> - <holder>Douglas Gilbert</holder> - </copyright> - - <legalnotice> - <para> - This documentation is free software; you can redistribute - it and/or modify it under the terms of the GNU General Public - License as published by the Free Software Foundation; either - version 2 of the License, or (at your option) any later - version. - </para> - - <para> - This program is distributed in the hope that it will be - useful, but WITHOUT ANY WARRANTY; without even the implied - warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. - See the GNU General Public License for more details. - </para> - - <para> - You should have received a copy of the GNU General Public - License along with this program; if not, write to the Free - Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, - MA 02111-1307 USA - </para> - - <para> - For more details see the file COPYING in the source - distribution of Linux. - </para> - </legalnotice> - - </bookinfo> - -<toc></toc> - - <chapter id="intro"> - <title>Introduction</title> - <para> -This document outlines the interface between the Linux scsi mid level -and lower level drivers. Lower level drivers are variously called HBA -(host bus adapter) drivers, host drivers (HD) or pseudo adapter drivers. -The latter alludes to the fact that a lower level driver may be a -bridge to another IO subsystem (and the "ide-scsi" driver is an example -of this). There can be many lower level drivers active in a running -system, but only one per hardware type. For example, the aic7xxx driver -controls adaptec controllers based on the 7xxx chip series. Most lower -level drivers can control one or more scsi hosts (a.k.a. scsi initiators). - </para> -<para> -This document can been found in an ASCII text file in the linux kernel -source: <filename>Documentation/scsi/scsi_mid_low_api.txt</filename> . -It currently hold a little more information than this document. The -<filename>drivers/scsi/hosts.h</filename> and <filename> -drivers/scsi/scsi.h</filename> headers contain descriptions of members -of important structures for the scsi subsystem. -</para> - </chapter> - - <chapter id="driver-struct"> - <title>Driver structure</title> - <para> -Traditionally a lower level driver for the scsi subsystem has been -at least two files in the drivers/scsi directory. For example, a -driver called "xyz" has a header file "xyz.h" and a source file -"xyz.c". [Actually there is no good reason why this couldn't all -be in one file.] Some drivers that have been ported to several operating -systems (e.g. aic7xxx which has separate files for generic and -OS-specific code) have more than two files. Such drivers tend to have -their own directory under the drivers/scsi directory. - </para> - <para> -scsi_module.c is normally included at the end of a lower -level driver. For it to work a declaration like this is needed before -it is included: -<programlisting> - static Scsi_Host_Template driver_template = DRIVER_TEMPLATE; - /* DRIVER_TEMPLATE should contain pointers to supported interface - functions. Scsi_Host_Template is defined hosts.h */ - #include "scsi_module.c" -</programlisting> - </para> - <para> -The scsi_module.c assumes the name "driver_template" is appropriately -defined. It contains 2 functions: -<orderedlist> -<listitem><para> - init_this_scsi_driver() called during builtin and module driver - initialization: invokes mid level's scsi_register_host() -</para></listitem> -<listitem><para> - exit_this_scsi_driver() called during closedown: invokes - mid level's scsi_unregister_host() -</para></listitem> -</orderedlist> - </para> -<para> -When a new, lower level driver is being added to Linux, the following -files (all found in the drivers/scsi directory) will need some attention: -Makefile, Config.help and Config.in . It is probably best to look at what -an existing lower level driver does in this regard. -</para> - </chapter> - - <chapter id="intfunctions"> - <title>Interface Functions</title> -!EDocumentation/scsi/scsi_mid_low_api.txt - </chapter> - - <chapter id="locks"> - <title>Locks</title> -<para> -Each Scsi_Host instance has a spin_lock called Scsi_Host::default_lock -which is initialized in scsi_register() [found in hosts.c]. Within the -same function the Scsi_Host::host_lock pointer is initialized to point -at default_lock with the scsi_assign_lock() function. Thereafter -lock and unlock operations performed by the mid level use the -Scsi_Host::host_lock pointer. -</para> -<para> -Lower level drivers can override the use of Scsi_Host::default_lock by -using scsi_assign_lock(). The earliest opportunity to do this would -be in the detect() function after it has invoked scsi_register(). It -could be replaced by a coarser grain lock (e.g. per driver) or a -lock of equal granularity (i.e. per host). Using finer grain locks -(e.g. per scsi device) may be possible by juggling locks in -queuecommand(). -</para> - </chapter> - - <chapter id="changes"> - <title>Changes since lk 2.4 series</title> -<para> -io_request_lock has been replaced by several finer grained locks. The lock -relevant to lower level drivers is Scsi_Host::host_lock and there is one -per scsi host. -</para> -<para> -The older error handling mechanism has been removed. This means the -lower level interface functions abort() and reset() have been removed. -</para> -<para> -In the 2.4 series the scsi subsystem configuration descriptions were -aggregated with the configuration descriptions from all other Linux -subsystems in the Documentation/Configure.help file. In the 2.5 series, -the scsi subsystem now has its own (much smaller) drivers/scsi/Config.help -file. -</para> - </chapter> - - <chapter id="credits"> - <title>Credits</title> -<para> -The following people have contributed to this document: -<orderedlist> -<listitem><para> -Mike Anderson <email>andmike@us.ibm.com</email> -</para></listitem> -<listitem><para> -James Bottomley <email>James.Bottomley@steeleye.com</email> -</para></listitem> -<listitem><para> -Patrick Mansfield <email>patmans@us.ibm.com</email> -</para></listitem> -</orderedlist> -</para> - </chapter> - -</book> diff --git a/Documentation/DocBook/stylesheet.xsl b/Documentation/DocBook/stylesheet.xsl index e14c21dda403..64be9f7ee3bb 100644 --- a/Documentation/DocBook/stylesheet.xsl +++ b/Documentation/DocBook/stylesheet.xsl @@ -2,4 +2,5 @@ <stylesheet xmlns="http://www.w3.org/1999/XSL/Transform" version="1.0"> <param name="chunk.quietly">1</param> <param name="funcsynopsis.style">ansi</param> +<param name="funcsynopsis.tabular.threshold">80</param> </stylesheet> diff --git a/Documentation/IPMI.txt b/Documentation/IPMI.txt index 90d10e708ca3..84d3d4d10c17 100644 --- a/Documentation/IPMI.txt +++ b/Documentation/IPMI.txt @@ -25,9 +25,10 @@ subject and I can't cover it all here! Configuration ------------- -The LinuxIPMI driver is modular, which means you have to pick several +The Linux IPMI driver is modular, which means you have to pick several things to have it work right depending on your hardware. Most of -these are available in the 'Character Devices' menu. +these are available in the 'Character Devices' menu then the IPMI +menu. No matter what, you must pick 'IPMI top-level message handler' to use IPMI. What you do beyond that depends on your needs and hardware. @@ -35,33 +36,30 @@ IPMI. What you do beyond that depends on your needs and hardware. The message handler does not provide any user-level interfaces. Kernel code (like the watchdog) can still use it. If you need access from userland, you need to select 'Device interface for IPMI' if you -want access through a device driver. Another interface is also -available, you may select 'IPMI sockets' in the 'Networking Support' -main menu. This provides a socket interface to IPMI. You may select -both of these at the same time, they will both work together. - -The driver interface depends on your hardware. If you have a board -with a standard interface (These will generally be either "KCS", -"SMIC", or "BT", consult your hardware manual), choose the 'IPMI SI -handler' option. A driver also exists for direct I2C access to the -IPMI management controller. Some boards support this, but it is -unknown if it will work on every board. For this, choose 'IPMI SMBus -handler', but be ready to try to do some figuring to see if it will -work. - -There is also a KCS-only driver interface supplied, but it is -depracated in favor of the SI interface. +want access through a device driver. + +The driver interface depends on your hardware. If your system +properly provides the SMBIOS info for IPMI, the driver will detect it +and just work. If you have a board with a standard interface (These +will generally be either "KCS", "SMIC", or "BT", consult your hardware +manual), choose the 'IPMI SI handler' option. A driver also exists +for direct I2C access to the IPMI management controller. Some boards +support this, but it is unknown if it will work on every board. For +this, choose 'IPMI SMBus handler', but be ready to try to do some +figuring to see if it will work on your system if the SMBIOS/APCI +information is wrong or not present. It is fairly safe to have both +these enabled and let the drivers auto-detect what is present. You should generally enable ACPI on your system, as systems with IPMI -should have ACPI tables describing them. +can have ACPI tables describing them. If you have a standard interface and the board manufacturer has done their job correctly, the IPMI controller should be automatically -detect (via ACPI or SMBIOS tables) and should just work. Sadly, many -boards do not have this information. The driver attempts standard -defaults, but they may not work. If you fall into this situation, you -need to read the section below named 'The SI Driver' on how to -hand-configure your system. +detected (via ACPI or SMBIOS tables) and should just work. Sadly, +many boards do not have this information. The driver attempts +standard defaults, but they may not work. If you fall into this +situation, you need to read the section below named 'The SI Driver' or +"The SMBus Driver" on how to hand-configure your system. IPMI defines a standard watchdog timer. You can enable this with the 'IPMI Watchdog Timer' config option. If you compile the driver into @@ -73,6 +71,18 @@ closed (by default it is disabled on close). Go into the 'Watchdog Cards' menu, enable 'Watchdog Timer Support', and enable the option 'Disable watchdog shutdown on close'. +IPMI systems can often be powered off using IPMI commands. Select +'IPMI Poweroff' to do this. The driver will auto-detect if the system +can be powered off by IPMI. It is safe to enable this even if your +system doesn't support this option. This works on ATCA systems, the +Radisys CPI1 card, and any IPMI system that supports standard chassis +management commands. + +If you want the driver to put an event into the event log on a panic, +enable the 'Generate a panic event to all BMCs on a panic' option. If +you want the whole panic string put into the event log using OEM +events, enable the 'Generate OEM events containing the panic string' +option. Basic Design ------------ @@ -80,7 +90,7 @@ Basic Design The Linux IPMI driver is designed to be very modular and flexible, you only need to take the pieces you need and you can use it in many different ways. Because of that, it's broken into many chunks of -code. These chunks are: +code. These chunks (by module name) are: ipmi_msghandler - This is the central piece of software for the IPMI system. It handles all messages, message timing, and responses. The @@ -93,18 +103,26 @@ ipmi_devintf - This provides a userland IOCTL interface for the IPMI driver, each open file for this device ties in to the message handler as an IPMI user. -ipmi_si - A driver for various system interfaces. This supports -KCS, SMIC, and may support BT in the future. Unless you have your own -custom interface, you probably need to use this. +ipmi_si - A driver for various system interfaces. This supports KCS, +SMIC, and BT interfaces. Unless you have an SMBus interface or your +own custom interface, you probably need to use this. ipmi_smb - A driver for accessing BMCs on the SMBus. It uses the I2C kernel driver's SMBus interfaces to send and receive IPMI messages over the SMBus. -af_ipmi - A network socket interface to IPMI. This doesn't take up -a character device in your system. +ipmi_watchdog - IPMI requires systems to have a very capable watchdog +timer. This driver implements the standard Linux watchdog timer +interface on top of the IPMI message handler. + +ipmi_poweroff - Some systems support the ability to be turned off via +IPMI commands. -Note that the KCS-only interface ahs been removed. +These are all individually selectable via configuration options. + +Note that the KCS-only interface has been removed. The af_ipmi driver +is no longer supported and has been removed because it was impossible +to do 32 bit emulation on 64-bit kernels with it. Much documentation for the interface is in the include files. The IPMI include files are: @@ -424,7 +442,7 @@ at module load time (for a module) with: modprobe ipmi_smb.o addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]] dbg=<flags1>,<flags2>... - [defaultprobe=0] [dbg_probe=1] + [defaultprobe=1] [dbg_probe=1] The addresses are specified in pairs, the first is the adapter ID and the second is the I2C address on that adapter. @@ -532,3 +550,67 @@ Once you open the watchdog timer, you must write a 'V' character to the device to close it, or the timer will not stop. This is a new semantic for the driver, but makes it consistent with the rest of the watchdog drivers in Linux. + + +Panic Timeouts +-------------- + +The OpenIPMI driver supports the ability to put semi-custom and custom +events in the system event log if a panic occurs. if you enable the +'Generate a panic event to all BMCs on a panic' option, you will get +one event on a panic in a standard IPMI event format. If you enable +the 'Generate OEM events containing the panic string' option, you will +also get a bunch of OEM events holding the panic string. + + +The field settings of the events are: +* Generator ID: 0x21 (kernel) +* EvM Rev: 0x03 (this event is formatting in IPMI 1.0 format) +* Sensor Type: 0x20 (OS critical stop sensor) +* Sensor #: The first byte of the panic string (0 if no panic string) +* Event Dir | Event Type: 0x6f (Assertion, sensor-specific event info) +* Event Data 1: 0xa1 (Runtime stop in OEM bytes 2 and 3) +* Event data 2: second byte of panic string +* Event data 3: third byte of panic string +See the IPMI spec for the details of the event layout. This event is +always sent to the local management controller. It will handle routing +the message to the right place + +Other OEM events have the following format: +Record ID (bytes 0-1): Set by the SEL. +Record type (byte 2): 0xf0 (OEM non-timestamped) +byte 3: The slave address of the card saving the panic +byte 4: A sequence number (starting at zero) +The rest of the bytes (11 bytes) are the panic string. If the panic string +is longer than 11 bytes, multiple messages will be sent with increasing +sequence numbers. + +Because you cannot send OEM events using the standard interface, this +function will attempt to find an SEL and add the events there. It +will first query the capabilities of the local management controller. +If it has an SEL, then they will be stored in the SEL of the local +management controller. If not, and the local management controller is +an event generator, the event receiver from the local management +controller will be queried and the events sent to the SEL on that +device. Otherwise, the events go nowhere since there is nowhere to +send them. + + +Poweroff +-------- + +If the poweroff capability is selected, the IPMI driver will install +a shutdown function into the standard poweroff function pointer. This +is in the ipmi_poweroff module. When the system requests a powerdown, +it will send the proper IPMI commands to do this. This is supported on +several platforms. + +There is a module parameter named "poweroff_control" that may either be zero +(do a power down) or 2 (do a power cycle, power the system off, then power +it on in a few seconds). Setting ipmi_poweroff.poweroff_control=x will do +the same thing on the kernel command line. The parameter is also available +via the proc filesystem in /proc/ipmi/poweroff_control. Note that if the +system does not support power cycling, it will always to the power off. + +Note that if you have ACPI enabled, the system will prefer using ACPI to +power off. diff --git a/Documentation/SubmittingDrivers b/Documentation/SubmittingDrivers index de3b252e717d..c3cca924e94b 100644 --- a/Documentation/SubmittingDrivers +++ b/Documentation/SubmittingDrivers @@ -13,13 +13,14 @@ Allocating Device Numbers ------------------------- Major and minor numbers for block and character devices are allocated -by the Linux assigned name and number authority (currently better -known as H Peter Anvin). The site is http://www.lanana.org/. This +by the Linux assigned name and number authority (currently this is +Torben Mathiasen). The site is http://www.lanana.org/. This also deals with allocating numbers for devices that are not going to be submitted to the mainstream kernel. +See Documentation/devices.txt for more information on this. -If you don't use assigned numbers then when you device is submitted it will -get given an assigned number even if that is different from values you may +If you don't use assigned numbers then when your device is submitted it will +be given an assigned number even if that is different from values you may have shipped to customers before. Who To Submit Drivers To @@ -32,7 +33,8 @@ Linux 2.2: If the code area has a general maintainer then please submit it to the maintainer listed in MAINTAINERS in the kernel file. If the maintainer does not respond or you cannot find the appropriate - maintainer then please contact Alan Cox <alan@lxorguk.ukuu.org.uk> + maintainer then please contact the 2.2 kernel maintainer: + Marc-Christian Petersen <m.c.p@wolk-project.de>. Linux 2.4: The same rules apply as 2.2. The final contact point for Linux 2.4 @@ -48,7 +50,7 @@ What Criteria Determine Acceptance Licensing: The code must be released to us under the GNU General Public License. We don't insist on any kind - of exclusively GPL licensing, and if you wish the driver + of exclusive GPL licensing, and if you wish the driver to be useful to other communities such as BSD you may well wish to release under multiple licenses. diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 9838d32b2fe7..7f43b040311e 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -35,7 +35,7 @@ not in any lower subdirectory. To create a patch for a single file, it is often sufficient to do: - SRCTREE= linux-2.4 + SRCTREE= linux-2.6 MYFILE= drivers/net/mydriver.c cd $SRCTREE @@ -48,17 +48,18 @@ To create a patch for multiple files, you should unpack a "vanilla", or unmodified kernel source tree, and generate a diff against your own source tree. For example: - MYSRC= /devel/linux-2.4 + MYSRC= /devel/linux-2.6 - tar xvfz linux-2.4.0-test11.tar.gz - mv linux linux-vanilla - wget http://www.moses.uklinux.net/patches/dontdiff - diff -uprN -X dontdiff linux-vanilla $MYSRC > /tmp/patch - rm -f dontdiff + tar xvfz linux-2.6.12.tar.gz + mv linux-2.6.12 linux-2.6.12-vanilla + diff -uprN -X linux-2.6.12-vanilla/Documentation/dontdiff \ + linux-2.6.12-vanilla $MYSRC > /tmp/patch "dontdiff" is a list of files which are generated by the kernel during the build process, and should be ignored in any diff(1)-generated -patch. dontdiff is maintained by Tigran Aivazian <tigran@veritas.com> +patch. The "dontdiff" file is included in the kernel tree in +2.6.12 and later. For earlier kernel versions, you can get it +from <http://www.xenotime.net/linux/doc/dontdiff>. Make sure your patch does not include any extra files which do not belong in a patch submission. Make sure to review your patch -after- @@ -66,18 +67,20 @@ generated it with diff(1), to ensure accuracy. If your changes produce a lot of deltas, you may want to look into splitting them into individual patches which modify things in -logical stages, this will facilitate easier reviewing by other +logical stages. This will facilitate easier reviewing by other kernel developers, very important if you want your patch accepted. -There are a number of scripts which can aid in this; +There are a number of scripts which can aid in this: Quilt: http://savannah.nongnu.org/projects/quilt Randy Dunlap's patch scripts: -http://developer.osdl.org/rddunlap/scripts/patching-scripts.tgz +http://www.xenotime.net/linux/scripts/patching-scripts-002.tar.gz Andrew Morton's patch scripts: -http://www.zip.com.au/~akpm/linux/patches/patch-scripts-0.16 +http://www.zip.com.au/~akpm/linux/patches/patch-scripts-0.20 + + 2) Describe your changes. @@ -132,21 +135,6 @@ which require discussion or do not have a clear advantage should usually be sent first to linux-kernel. Only after the patch is discussed should the patch then be submitted to Linus. -For small patches you may want to CC the Trivial Patch Monkey -trivial@rustcorp.com.au set up by Rusty Russell; which collects "trivial" -patches. Trivial patches must qualify for one of the following rules: - Spelling fixes in documentation - Spelling fixes which could break grep(1). - Warning fixes (cluttering with useless warnings is bad) - Compilation fixes (only if they are actually correct) - Runtime fixes (only if they actually fix things) - Removing use of deprecated functions/macros (eg. check_region). - Contact detail and documentation fixes - Non-portable code replaced by portable code (even in arch-specific, - since people copy, as long as it's trivial) - Any fix by the author/maintainer of the file. (ie. patch monkey - in re-transmission mode) - 5) Select your CC (e-mail carbon copy) list. @@ -161,6 +149,11 @@ USB, framebuffer devices, the VFS, the SCSI subsystem, etc. See the MAINTAINERS file for a mailing list that relates specifically to your change. +If changes affect userland-kernel interfaces, please send +the MAN-PAGES maintainer (as listed in the MAINTAINERS file) +a man-pages patch, or at least a notification of the change, +so that some information makes its way into the manual pages. + Even if the maintainer did not respond in step #4, make sure to ALWAYS copy the maintainer when you change their code. @@ -178,6 +171,8 @@ patches. Trivial patches must qualify for one of the following rules: since people copy, as long as it's trivial) Any fix by the author/maintainer of the file. (ie. patch monkey in re-transmission mode) +URL: <http://www.kernel.org/pub/linux/kernel/people/rusty/trivial/> + @@ -271,7 +266,7 @@ patch, which certifies that you wrote it or otherwise have the right to pass it on as a open-source patch. The rules are pretty simple: if you can certify the below: - Developer's Certificate of Origin 1.0 + Developer's Certificate of Origin 1.1 By making a contribution to this project, I certify that: @@ -291,15 +286,32 @@ can certify the below: person who certified (a), (b) or (c) and I have not modified it. + (d) I understand and agree that this project and the contribution + are public and that a record of the contribution (including all + personal information I submit with it, including my sign-off) is + maintained indefinitely and may be redistributed consistent with + this project or the open source license(s) involved. + then you just add a line saying - Signed-off-by: Random J Developer <random@developer.org> + Signed-off-by: Random J Developer <random@developer.example.org> Some people also put extra tags at the end. They'll just be ignored for now, but you can do this to mark internal company procedures or just point out some special detail about the sign-off. + +12) More references for submitting patches + +Andrew Morton, "The perfect patch" (tpp). + <http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt> + +Jeff Garzik, "Linux kernel patch submission format." + <http://linux.yyz.us/patch-format.html> + + + ----------------------------------- SECTION 2 - HINTS, TIPS, AND TRICKS ----------------------------------- @@ -368,7 +380,5 @@ and 'extern __inline__'. 4) Don't over-design. Don't try to anticipate nebulous future cases which may or may not -be useful: "Make it as simple as you can, and no simpler" - - +be useful: "Make it as simple as you can, and no simpler." diff --git a/Documentation/acpi-hotkey.txt b/Documentation/acpi-hotkey.txt new file mode 100644 index 000000000000..0acdc80c30c2 --- /dev/null +++ b/Documentation/acpi-hotkey.txt @@ -0,0 +1,38 @@ +driver/acpi/hotkey.c implement: +1. /proc/acpi/hotkey/event_config +(event based hotkey or event config interface): +a. add a event based hotkey(event) : +echo "0:bus::action:method:num:num" > event_config + +b. delete a event based hotkey(event): +echo "1:::::num:num" > event_config + +c. modify a event based hotkey(event): +echo "2:bus::action:method:num:num" > event_config + +2. /proc/acpi/hotkey/poll_config +(polling based hotkey or event config interface): +a.add a polling based hotkey(event) : +echo "0:bus:method:action:method:num" > poll_config +this adding command will create a proc file +/proc/acpi/hotkey/method, which is used to get +result of polling. + +b.delete a polling based hotkey(event): +echo "1:::::num" > event_config + +c.modify a polling based hotkey(event): +echo "2:bus:method:action:method:num" > poll_config + +3./proc/acpi/hotkey/action +(interface to call aml method associated with a +specific hotkey(event)) +echo "event_num:event_type:event_argument" > + /proc/acpi/hotkey/action. +The result of the execution of this aml method is +attached to /proc/acpi/hotkey/poll_method, which is dnyamically +created. Please use command "cat /proc/acpi/hotkey/polling_method" +to retrieve it. + +Note: Use cmdline "acpi_generic_hotkey" to over-ride +loading any platform specific drivers. diff --git a/Documentation/arm/Samsung-S3C24XX/USB-Host.txt b/Documentation/arm/Samsung-S3C24XX/USB-Host.txt new file mode 100644 index 000000000000..b93b68e2b143 --- /dev/null +++ b/Documentation/arm/Samsung-S3C24XX/USB-Host.txt @@ -0,0 +1,93 @@ + S3C24XX USB Host support + ======================== + + + +Introduction +------------ + + This document details the S3C2410/S3C2440 in-built OHCI USB host support. + +Configuration +------------- + + Enable at least the following kernel options: + + menuconfig: + + Device Drivers ---> + USB support ---> + <*> Support for Host-side USB + <*> OHCI HCD support + + + .config: + CONFIG_USB + CONFIG_USB_OHCI_HCD + + + Once these options are configured, the standard set of USB device + drivers can be configured and used. + + +Board Support +------------- + + The driver attaches to a platform device, which will need to be + added by the board specific support file in linux/arch/arm/mach-s3c2410, + such as mach-bast.c or mach-smdk2410.c + + The platform device's platform_data field is only needed if the + board implements extra power control or over-current monitoring. + + The OHCI driver does not ensure the state of the S3C2410's MISCCTRL + register, so if both ports are to be used for the host, then it is + the board support file's responsibility to ensure that the second + port is configured to be connected to the OHCI core. + + +Platform Data +------------- + + See linux/include/asm-arm/arch-s3c2410/usb-control.h for the + descriptions of the platform device data. An implementation + can be found in linux/arch/arm/mach-s3c2410/usb-simtec.c . + + The `struct s3c2410_hcd_info` contains a pair of functions + that get called to enable over-current detection, and to + control the port power status. + + The ports are numbered 0 and 1. + + power_control: + + Called to enable or disable the power on the port. + + enable_oc: + + Called to enable or disable the over-current monitoring. + This should claim or release the resources being used to + check the power condition on the port, such as an IRQ. + + report_oc: + + The OHCI driver fills this field in for the over-current code + to call when there is a change to the over-current state on + an port. The ports argument is a bitmask of 1 bit per port, + with bit X being 1 for an over-current on port X. + + The function s3c2410_usb_report_oc() has been provided to + ensure this is called correctly. + + port[x]: + + This is struct describes each port, 0 or 1. The platform driver + should set the flags field of each port to S3C_HCDFLG_USED if + the port is enabled. + + + +Document Author +--------------- + +Ben Dooks, (c) 2005 Simtec Electronics diff --git a/Documentation/basic_profiling.txt b/Documentation/basic_profiling.txt index 65e3dc2d4437..8764e9f70821 100644 --- a/Documentation/basic_profiling.txt +++ b/Documentation/basic_profiling.txt @@ -27,9 +27,13 @@ dump output readprofile -m /boot/System.map > captured_profile Oprofile -------- -Get the source (I use 0.8) from http://oprofile.sourceforge.net/ -and add "idle=poll" to the kernel command line + +Get the source (see Changes for required version) from +http://oprofile.sourceforge.net/ and add "idle=poll" to the kernel command +line. + Configure with CONFIG_PROFILING=y and CONFIG_OPROFILE=y & reboot on new kernel + ./configure --with-kernel-support make install @@ -46,7 +50,7 @@ start opcontrol --start stop opcontrol --stop dump output opreport > output_file -To only report on the kernel, run opreport /boot/vmlinux > output_file +To only report on the kernel, run opreport -l /boot/vmlinux > output_file A reset is needed to clear old statistics, which survive a reboot. diff --git a/Documentation/block/ioprio.txt b/Documentation/block/ioprio.txt new file mode 100644 index 000000000000..96ccf681075e --- /dev/null +++ b/Documentation/block/ioprio.txt @@ -0,0 +1,176 @@ +Block io priorities +=================== + + +Intro +----- + +With the introduction of cfq v3 (aka cfq-ts or time sliced cfq), basic io +priorities is supported for reads on files. This enables users to io nice +processes or process groups, similar to what has been possible to cpu +scheduling for ages. This document mainly details the current possibilites +with cfq, other io schedulers do not support io priorities so far. + +Scheduling classes +------------------ + +CFQ implements three generic scheduling classes that determine how io is +served for a process. + +IOPRIO_CLASS_RT: This is the realtime io class. This scheduling class is given +higher priority than any other in the system, processes from this class are +given first access to the disk every time. Thus it needs to be used with some +care, one io RT process can starve the entire system. Within the RT class, +there are 8 levels of class data that determine exactly how much time this +process needs the disk for on each service. In the future this might change +to be more directly mappable to performance, by passing in a wanted data +rate instead. + +IOPRIO_CLASS_BE: This is the best-effort scheduling class, which is the default +for any process that hasn't set a specific io priority. The class data +determines how much io bandwidth the process will get, it's directly mappable +to the cpu nice levels just more coarsely implemented. 0 is the highest +BE prio level, 7 is the lowest. The mapping between cpu nice level and io +nice level is determined as: io_nice = (cpu_nice + 20) / 5. + +IOPRIO_CLASS_IDLE: This is the idle scheduling class, processes running at this +level only get io time when no one else needs the disk. The idle class has no +class data, since it doesn't really apply here. + +Tools +----- + +See below for a sample ionice tool. Usage: + +# ionice -c<class> -n<level> -p<pid> + +If pid isn't given, the current process is assumed. IO priority settings +are inherited on fork, so you can use ionice to start the process at a given +level: + +# ionice -c2 -n0 /bin/ls + +will run ls at the best-effort scheduling class at the highest priority. +For a running process, you can give the pid instead: + +# ionice -c1 -n2 -p100 + +will change pid 100 to run at the realtime scheduling class, at priority 2. + +---> snip ionice.c tool <--- + +#include <stdio.h> +#include <stdlib.h> +#include <errno.h> +#include <getopt.h> +#include <unistd.h> +#include <sys/ptrace.h> +#include <asm/unistd.h> + +extern int sys_ioprio_set(int, int, int); +extern int sys_ioprio_get(int, int); + +#if defined(__i386__) +#define __NR_ioprio_set 289 +#define __NR_ioprio_get 290 +#elif defined(__ppc__) +#define __NR_ioprio_set 273 +#define __NR_ioprio_get 274 +#elif defined(__x86_64__) +#define __NR_ioprio_set 251 +#define __NR_ioprio_get 252 +#elif defined(__ia64__) +#define __NR_ioprio_set 1274 +#define __NR_ioprio_get 1275 +#else +#error "Unsupported arch" +#endif + +_syscall3(int, ioprio_set, int, which, int, who, int, ioprio); +_syscall2(int, ioprio_get, int, which, int, who); + +enum { + IOPRIO_CLASS_NONE, + IOPRIO_CLASS_RT, + IOPRIO_CLASS_BE, + IOPRIO_CLASS_IDLE, +}; + +enum { + IOPRIO_WHO_PROCESS = 1, + IOPRIO_WHO_PGRP, + IOPRIO_WHO_USER, +}; + +#define IOPRIO_CLASS_SHIFT 13 + +const char *to_prio[] = { "none", "realtime", "best-effort", "idle", }; + +int main(int argc, char *argv[]) +{ + int ioprio = 4, set = 0, ioprio_class = IOPRIO_CLASS_BE; + int c, pid = 0; + + while ((c = getopt(argc, argv, "+n:c:p:")) != EOF) { + switch (c) { + case 'n': + ioprio = strtol(optarg, NULL, 10); + set = 1; + break; + case 'c': + ioprio_class = strtol(optarg, NULL, 10); + set = 1; + break; + case 'p': + pid = strtol(optarg, NULL, 10); + break; + } + } + + switch (ioprio_class) { + case IOPRIO_CLASS_NONE: + ioprio_class = IOPRIO_CLASS_BE; + break; + case IOPRIO_CLASS_RT: + case IOPRIO_CLASS_BE: + break; + case IOPRIO_CLASS_IDLE: + ioprio = 7; + break; + default: + printf("bad prio class %d\n", ioprio_class); + return 1; + } + + if (!set) { + if (!pid && argv[optind]) + pid = strtol(argv[optind], NULL, 10); + + ioprio = ioprio_get(IOPRIO_WHO_PROCESS, pid); + + printf("pid=%d, %d\n", pid, ioprio); + + if (ioprio == -1) + perror("ioprio_get"); + else { + ioprio_class = ioprio >> IOPRIO_CLASS_SHIFT; + ioprio = ioprio & 0xff; + printf("%s: prio %d\n", to_prio[ioprio_class], ioprio); + } + } else { + if (ioprio_set(IOPRIO_WHO_PROCESS, pid, ioprio | ioprio_class << IOPRIO_CLASS_SHIFT) == -1) { + perror("ioprio_set"); + return 1; + } + + if (argv[optind]) + execvp(argv[optind], &argv[optind]); + } + + return 0; +} + +---> snip ionice.c tool <--- + + +March 11 2005, Jens Axboe <axboe@suse.de> diff --git a/Documentation/cciss.txt b/Documentation/cciss.txt index d599beb9df8a..c8f9a73111da 100644 --- a/Documentation/cciss.txt +++ b/Documentation/cciss.txt @@ -17,6 +17,7 @@ This driver is known to work with the following cards: * SA P600 * SA P800 * SA E400 + * SA E300 If nodes are not already created in the /dev/cciss directory, run as root: diff --git a/Documentation/cdrom/sbpcd b/Documentation/cdrom/sbpcd index d1825dffca34..b3ba63f4ce3e 100644 --- a/Documentation/cdrom/sbpcd +++ b/Documentation/cdrom/sbpcd @@ -419,6 +419,7 @@ into the file "track01": */ #include <stdio.h> #include <sys/ioctl.h> +#include <sys/types.h> #include <linux/cdrom.h> static struct cdrom_tochdr hdr; @@ -429,7 +430,7 @@ static int datafile, drive; static int i, j, limit, track, err; static char filename[32]; -main(int argc, char *argv[]) +int main(int argc, char *argv[]) { /* * open /dev/cdrom @@ -516,6 +517,7 @@ entry[track+1].cdte_addr.lba=entry[track].cdte_addr.lba+300; } arg.addr.lba++; } + return 0; } /*===================== end program ========================================*/ @@ -564,15 +566,16 @@ Appendix -- the "cdtester" utility: #include <stdio.h> #include <malloc.h> #include <sys/ioctl.h> +#include <sys/types.h> #include <linux/cdrom.h> #ifdef AZT_PRIVATE_IOCTLS #include <linux/../../drivers/cdrom/aztcd.h> -#endif AZT_PRIVATE_IOCTLS +#endif /* AZT_PRIVATE_IOCTLS */ #ifdef SBP_PRIVATE_IOCTLS #include <linux/../../drivers/cdrom/sbpcd.h> #include <linux/fs.h> -#endif SBP_PRIVATE_IOCTLS +#endif /* SBP_PRIVATE_IOCTLS */ struct cdrom_tochdr hdr; struct cdrom_tochdr tocHdr; @@ -590,7 +593,7 @@ union struct cdrom_msf msf; unsigned char buf[CD_FRAMESIZE_RAW]; } azt; -#endif AZT_PRIVATE_IOCTLS +#endif /* AZT_PRIVATE_IOCTLS */ int i, i1, i2, i3, j, k; unsigned char sequence=0; unsigned char command[80]; @@ -738,7 +741,7 @@ void display(int size,unsigned char *buffer) } } -main(int argc, char *argv[]) +int main(int argc, char *argv[]) { printf("\nTesting tool for a CDROM driver's audio functions V0.1\n"); printf("(C) 1995 Eberhard Moenkeberg <emoenke@gwdg.de>\n"); @@ -1046,12 +1049,13 @@ main(int argc, char *argv[]) rc=ioctl(drive,CDROMAUDIOBUFSIZ,j); printf("%d frames granted.\n",rc); break; -#endif SBP_PRIVATE_IOCTLS +#endif /* SBP_PRIVATE_IOCTLS */ default: printf("unknown command: \"%s\".\n",command); break; } } + return 0; } /*==========================================================================*/ diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt index b85481acd0ca..933fae74c337 100644 --- a/Documentation/cpu-freq/governors.txt +++ b/Documentation/cpu-freq/governors.txt @@ -9,6 +9,7 @@ Dominik Brodowski <linux@brodo.de> + some additions and corrections by Nico Golde <nico@ngolde.de> @@ -25,6 +26,7 @@ Contents: 2.1 Performance 2.2 Powersave 2.3 Userspace +2.4 Ondemand 3. The Governor Interface in the CPUfreq Core @@ -86,7 +88,7 @@ highest frequency within the borders of scaling_min_freq and scaling_max_freq. -2.1 Powersave +2.2 Powersave ------------- The CPUfreq governor "powersave" sets the CPU statically to the @@ -94,7 +96,7 @@ lowest frequency within the borders of scaling_min_freq and scaling_max_freq. -2.2 Userspace +2.3 Userspace ------------- The CPUfreq governor "userspace" allows the user, or any userspace @@ -103,6 +105,14 @@ by making a sysfs file "scaling_setspeed" available in the CPU-device directory. +2.4 Ondemand +------------ + +The CPUfreq govenor "ondemand" sets the CPU depending on the +current usage. To do this the CPU must have the capability to +switch the frequency very fast. + + 3. The Governor Interface in the CPUfreq Core ============================================= diff --git a/Documentation/cpusets.txt b/Documentation/cpusets.txt index 2f8f24eaefd9..ad944c060312 100644 --- a/Documentation/cpusets.txt +++ b/Documentation/cpusets.txt @@ -51,6 +51,14 @@ mems_allowed vector. If a cpuset is cpu or mem exclusive, no other cpuset, other than a direct ancestor or descendent, may share any of the same CPUs or Memory Nodes. +A cpuset that is cpu exclusive has a sched domain associated with it. +The sched domain consists of all cpus in the current cpuset that are not +part of any exclusive child cpusets. +This ensures that the scheduler load balacing code only balances +against the cpus that are in the sched domain as defined above and not +all of the cpus in the system. This removes any overhead due to +load balancing code trying to pull tasks outside of the cpu exclusive +cpuset only to be prevented by the tasks' cpus_allowed mask. User level code may create and destroy cpusets by name in the cpuset virtual file system, manage the attributes and permissions of these @@ -84,6 +92,9 @@ This can be especially valuable on: and a database), or * NUMA systems running large HPC applications with demanding performance characteristics. + * Also cpu_exclusive cpusets are useful for servers running orthogonal + workloads such as RT applications requiring low latency and HPC + applications that are throughput sensitive These subsets, or "soft partitions" must be able to be dynamically adjusted, as the job mix changes, without impacting other concurrently @@ -125,6 +136,8 @@ Cpusets extends these two mechanisms as follows: - A cpuset may be marked exclusive, which ensures that no other cpuset (except direct ancestors and descendents) may contain any overlapping CPUs or Memory Nodes. + Also a cpu_exclusive cpuset would be associated with a sched + domain. - You can list all the tasks (by pid) attached to any cpuset. The implementation of cpusets requires a few, simple hooks @@ -136,6 +149,9 @@ into the rest of the kernel, none in performance critical paths: allowed in that tasks cpuset. - in sched.c migrate_all_tasks(), to keep migrating tasks within the CPUs allowed by their cpuset, if possible. + - in sched.c, a new API partition_sched_domains for handling + sched domain changes associated with cpu_exclusive cpusets + and related changes in both sched.c and arch/ia64/kernel/domain.c - in the mbind and set_mempolicy system calls, to mask the requested Memory Nodes by what's allowed in that tasks cpuset. - in page_alloc, to restrict memory to allowed nodes. diff --git a/Documentation/devices.txt b/Documentation/devices.txt index bb67cf25010e..0f515175c72a 100644 --- a/Documentation/devices.txt +++ b/Documentation/devices.txt @@ -94,6 +94,7 @@ Your cooperation is appreciated. 9 = /dev/urandom Faster, less secure random number gen. 10 = /dev/aio Asyncronous I/O notification interface 11 = /dev/kmsg Writes to this come out as printk's + 12 = /dev/oldmem Access to crash dump from kexec kernel 1 block RAM disk 0 = /dev/ram0 First RAM disk 1 = /dev/ram1 Second RAM disk diff --git a/Documentation/dontdiff b/Documentation/dontdiff index 9a33bb94f74f..96bea278bbf6 100644 --- a/Documentation/dontdiff +++ b/Documentation/dontdiff @@ -41,6 +41,7 @@ COPYING CREDITS CVS ChangeSet +Image Kerntypes MODS.txt Module.symvers @@ -103,6 +104,8 @@ logo_*.c logo_*_clut224.c logo_*_mono.c lxdialog +mach-types +mach-types.h make_times_h map maui_boot.h @@ -111,6 +114,7 @@ mkdep mktables modpost modversions.h* +offset.h offsets.h oui.c* parse.c* diff --git a/Documentation/driver-model/device.txt b/Documentation/driver-model/device.txt index 58cc5dc8fd3e..a05ec50f8004 100644 --- a/Documentation/driver-model/device.txt +++ b/Documentation/driver-model/device.txt @@ -76,6 +76,14 @@ driver_data: Driver-specific data. platform_data: Platform data specific to the device. + Example: for devices on custom boards, as typical of embedded + and SOC based hardware, Linux often uses platform_data to point + to board-specific structures describing devices and how they + are wired. That can include what ports are available, chip + variants, which GPIO pins act in what additional roles, and so + on. This shrinks the "Board Support Packages" (BSPs) and + minimizes board-specific #ifdefs in drivers. + current_state: Current power state of the device. saved_state: Pointer to saved state of the device. This is usable by diff --git a/Documentation/driver-model/driver.txt b/Documentation/driver-model/driver.txt index 6031a68dd3f5..fabaca1ab1b0 100644 --- a/Documentation/driver-model/driver.txt +++ b/Documentation/driver-model/driver.txt @@ -5,21 +5,17 @@ struct device_driver { char * name; struct bus_type * bus; - rwlock_t lock; - atomic_t refcount; - - list_t bus_list; + struct completion unloaded; + struct kobject kobj; list_t devices; - struct driver_dir_entry dir; + struct module *owner; int (*probe) (struct device * dev); int (*remove) (struct device * dev); int (*suspend) (struct device * dev, pm_message_t state, u32 level); int (*resume) (struct device * dev, u32 level); - - void (*release) (struct device_driver * drv); }; @@ -51,7 +47,6 @@ being converted completely to the new model. static struct device_driver eepro100_driver = { .name = "eepro100", .bus = &pci_bus_type, - .devclass = ðernet_devclass, /* when it's implemented */ .probe = eepro100_probe, .remove = eepro100_remove, @@ -85,7 +80,6 @@ static struct pci_driver eepro100_driver = { .driver = { .name = "eepro100", .bus = &pci_bus_type, - .devclass = ðernet_devclass, /* when it's implemented */ .probe = eepro100_probe, .remove = eepro100_remove, .suspend = eepro100_suspend, @@ -166,27 +160,32 @@ Callbacks int (*probe) (struct device * dev); -probe is called to verify the existence of a certain type of -hardware. This is called during the driver binding process, after the -bus has verified that the device ID of a device matches one of the -device IDs supported by the driver. - -This callback only verifies that there actually is supported hardware -present. It may allocate a driver-specific structure, but it should -not do any initialization of the hardware itself. The device-specific -structure may be stored in the device's driver_data field. - - int (*init) (struct device * dev); - -init is called during the binding stage. It is called after probe has -successfully returned and the device has been registered with its -class. It is responsible for initializing the hardware. +The probe() entry is called in task context, with the bus's rwsem locked +and the driver partially bound to the device. Drivers commonly use +container_of() to convert "dev" to a bus-specific type, both in probe() +and other routines. That type often provides device resource data, such +as pci_dev.resource[] or platform_device.resources, which is used in +addition to dev->platform_data to initialize the driver. + +This callback holds the driver-specific logic to bind the driver to a +given device. That includes verifying that the device is present, that +it's a version the driver can handle, that driver data structures can +be allocated and initialized, and that any hardware can be initialized. +Drivers often store a pointer to their state with dev_set_drvdata(). +When the driver has successfully bound itself to that device, then probe() +returns zero and the driver model code will finish its part of binding +the driver to that device. + +A driver's probe() may return a negative errno value to indicate that +the driver did not bind to this device, in which case it should have +released all reasources it allocated. int (*remove) (struct device * dev); -remove is called to dissociate a driver with a device. This may be +remove is called to unbind a driver from a device. This may be called if a device is physically removed from the system, if the -driver module is being unloaded, or during a reboot sequence. +driver module is being unloaded, during a reboot sequence, or +in other cases. It is up to the driver to determine if the device is present or not. It should free any resources allocated specifically for the diff --git a/Documentation/dvb/README.dibusb b/Documentation/dvb/README.dibusb deleted file mode 100644 index 7a9e958513f3..000000000000 --- a/Documentation/dvb/README.dibusb +++ /dev/null @@ -1,285 +0,0 @@ -Documentation for dib3000* frontend drivers and dibusb device driver -==================================================================== - -Copyright (C) 2004-5 Patrick Boettcher (patrick.boettcher@desy.de), - -dibusb and dib3000mb/mc drivers based on GPL code, which has - -Copyright (C) 2004 Amaury Demol for DiBcom (ademol@dibcom.fr) - -This program is free software; you can redistribute it and/or -modify it under the terms of the GNU General Public License as -published by the Free Software Foundation, version 2. - - -Supported devices USB1.1 -======================== - -Produced and reselled by Twinhan: ---------------------------------- -- TwinhanDTV USB-Ter DVB-T Device (VP7041) - http://www.twinhan.com/product_terrestrial_3.asp - -- TwinhanDTV Magic Box (VP7041e) - http://www.twinhan.com/product_terrestrial_4.asp - -- HAMA DVB-T USB device - http://www.hama.de/portal/articleId*110620/action*2598 - -- CTS Portable (Chinese Television System) (2) - http://www.2cts.tv/ctsportable/ - -- Unknown USB DVB-T device with vendor ID Hyper-Paltek - - -Produced and reselled by KWorld: --------------------------------- -- KWorld V-Stream XPERT DTV DVB-T USB - http://www.kworld.com.tw/en/product/DVBT-USB/DVBT-USB.html - -- JetWay DTV DVB-T USB - http://www.jetway.com.tw/evisn/product/lcd-tv/DVT-USB/dtv-usb.htm - -- ADSTech Instant TV DVB-T USB - http://www.adstech.com/products/PTV-333/intro/PTV-333_intro.asp?pid=PTV-333 - - -Others: -------- -- Ultima Electronic/Artec T1 USB TVBOX (AN2135, AN2235, AN2235 with Panasonic Tuner) - http://82.161.246.249/products-tvbox.html - -- Compro Videomate DVB-U2000 - DVB-T USB (2) - http://www.comprousa.com/products/vmu2000.htm - -- Grandtec USB DVB-T - http://www.grand.com.tw/ - -- Avermedia AverTV DVBT USB (2) - http://www.avermedia.com/ - -- DiBcom USB DVB-T reference device (non-public) - - -Supported devices USB2.0 -======================== -- Twinhan MagicBox II (2) - http://www.twinhan.com/product_terrestrial_7.asp - -- Hanftek UMT-010 (1) - http://www.globalsources.com/si/6008819757082/ProductDetail/Digital-TV/product_id-100046529 - -- Typhoon/Yakumo/HAMA DVB-T mobile USB2.0 (1) - http://www.yakumo.de/produkte/index.php?pid=1&ag=DVB-T - -- Artec T1 USB TVBOX (FX2) (2) - -- Hauppauge WinTV NOVA-T USB2 - http://www.hauppauge.com/ - -- KWorld/ADSTech Instant DVB-T USB2.0 (DiB3000M-B) - -- DiBcom USB2.0 DVB-T reference device (non-public) - -1) It is working almost. -2) No test reports received yet. - - -0. NEWS: - 2005-02-11 - added support for the KWorld/ADSTech Instant DVB-T USB2.0. Thanks a lot to Joachim von Caron - 2005-02-02 - added support for the Hauppauge Win-TV Nova-T USB2 - 2005-01-31 - distorted streaming is finally gone for USB1.1 devices - 2005-01-13 - moved the mirrored pid_filter_table back to dvb-dibusb - - first almost working version for HanfTek UMT-010 - - found out, that Yakumo/HAMA/Typhoon are predessors of the HanfTek UMT-010 - 2005-01-10 - refactoring completed, now everything is very delightful - - tuner quirks for some weird devices (Artec T1 AN2235 device has sometimes a - Panasonic Tuner assembled). Tunerprobing implemented. Thanks a lot to Gunnar Wittich. - 2004-12-29 - after several days of struggling around bug of no returning URBs fixed. - 2004-12-26 - refactored the dibusb-driver, splitted into separate files - - i2c-probing enabled - 2004-12-06 - possibility for demod i2c-address probing - - new usb IDs (Compro,Artec) - 2004-11-23 - merged changes from DiB3000MC_ver2.1 - - revised the debugging - - possibility to deliver the complete TS for USB2.0 - 2004-11-21 - first working version of the dib3000mc/p frontend driver. - 2004-11-12 - added additional remote control keys. Thanks to Uwe Hanke. - 2004-11-07 - added remote control support. Thanks to David Matthews. - 2004-11-05 - added support for a new devices (Grandtec/Avermedia/Artec) - - merged my changes (for dib3000mb/dibusb) to the FE_REFACTORING, because it became HEAD - - moved transfer control (pid filter, fifo control) from usb driver to frontend, it seems - better settled there (added xfer_ops-struct) - - created a common files for frontends (mc/p/mb) - 2004-09-28 - added support for a new device (Unkown, vendor ID is Hyper-Paltek) - 2004-09-20 - added support for a new device (Compro DVB-U2000), thanks - to Amaury Demol for reporting - - changed usb TS transfer method (several urbs, stopping transfer - before setting a new pid) - 2004-09-13 - added support for a new device (Artec T1 USB TVBOX), thanks - to Christian Motschke for reporting - 2004-09-05 - released the dibusb device and dib3000mb-frontend driver - - (old news for vp7041.c) - 2004-07-15 - found out, by accident, that the device has a TUA6010XS for - PLL - 2004-07-12 - figured out, that the driver should also work with the - CTS Portable (Chinese Television System) - 2004-07-08 - firmware-extraction-2.422-problem solved, driver is now working - properly with firmware extracted from 2.422 - - #if for 2.6.4 (dvb), compile issue - - changed firmware handling, see vp7041.txt sec 1.1 - 2004-07-02 - some tuner modifications, v0.1, cleanups, first public - 2004-06-28 - now using the dvb_dmx_swfilter_packets, everything - runs fine now - 2004-06-27 - able to watch and switching channels (pre-alpha) - - no section filtering yet - 2004-06-06 - first TS received, but kernel oops :/ - 2004-05-14 - firmware loader is working - 2004-05-11 - start writing the driver - -1. How to use? -NOTE: This driver was developed using Linux 2.6.6., -it is working with 2.6.7 and above. - -Linux 2.4.x support is not planned, but patches are very welcome. - -NOTE: I'm using Debian testing, so the following explaination (especially -the hotplug-path) needn't match your system, but probably it will :). - -The driver is included in the kernel since Linux 2.6.10. - -1.1. Firmware - -The USB driver needs to download a firmware to start working. - -You can either use "get_dvb_firmware dibusb" to download the firmware or you -can get it directly via - -for USB1.1 (AN2135) -http://www.linuxtv.org/downloads/firmware/dvb-dibusb-5.0.0.11.fw - -for USB1.1 (AN2235) (a few Artec T1 devices) -http://www.linuxtv.org/downloads/firmware/dvb-dibusb-an2235-1.fw - -for USB2.0 (FX2) Hauppauge, DiBcom -http://www.linuxtv.org/downloads/firmware/dvb-dibusb-6.0.0.5.fw - -for USB2.0 ADSTech/Kworld USB2.0 -http://www.linuxtv.org/downloads/firmware/dvb-dibusb-adstech-usb2-1.fw - -for USB2.0 HanfTek -http://www.linuxtv.org/downloads/firmware/dvb-dibusb-an2235-1.fw - - -1.2. Compiling - -Since the driver is in the linux kernel, activating the driver in -your favorite config-environment should sufficient. I recommend -to compile the driver as module. Hotplug does the rest. - -1.3. Loading the drivers - -Hotplug is able to load the driver, when it is needed (because you plugged -in the device). - -If you want to enable debug output, you have to load the driver manually and -from withing the dvb-kernel cvs repository. - -first have a look, which debug level are available: - -modinfo dib3000mb -modinfo dib3000-common -modinfo dib3000mc -modinfo dvb-dibusb - -modprobe dib3000-common debug=<level> -modprobe dib3000mb debug=<level> -modprobe dib3000mc debug=<level> -modprobe dvb-dibusb debug=<level> - -should do the trick. - -When the driver is loaded successfully, the firmware file was in -the right place and the device is connected, the "Power"-LED should be -turned on. - -At this point you should be able to start a dvb-capable application. For myself -I used mplayer, dvbscan, tzap and kaxtv, they are working. Using the device -in vdr is working now also. - -2. Known problems and bugs - -- Don't remove the USB device while running an DVB application, your system will die. - -2.1. Adding support for devices - -It is not possible to determine the range of devices based on the DiBcom -reference designs. This is because the reference design of DiBcom can be sold -to thirds, without telling DiBcom (so done with the Twinhan VP7041 and -the HAMA device). - -When you think you have a device like this and the driver does not recognizes it, -please send the ****load*.inf and the ****cap*.inf of the Windows driver to me. - -Sometimes the Vendor or Product ID is identical to the ones of Twinhan, even -though it is not a Twinhan device (e.g. HAMA), then please send me the name -of the device. I will add it to this list in order to make this clear to -others. - -If you are familar with C you can also add the VID and PID of the device to -the dvb-dibusb-core.c-file and create a patch and send it over to me or to -the linux-dvb mailing list, _after_ you have tried compiling and modprobing -it. - -2.2. USB1.1 Bandwidth limitation - -Most of the currently supported devices are USB1.1 and thus they have a -maximum bandwidth of about 5-6 MBit/s when connected to a USB2.0 hub. -This is not enough for receiving the complete transport stream of a -DVB-T channel (which can be about 16 MBit/s). Normally this is not a -problem, if you only want to watch TV (this does not apply for HDTV), -but watching a channel while recording another channel on the same -frequency simply does not work very well. This applies to all USB1.1 -DVB-T devices, not just dibusb) - -Update: For the USB1.1 and VDR some work has been done (patches and comments -are still very welcome). Maybe the problem is solved in the meantime because I -now use the dmx_sw_filter function instead of dmx_sw_filter_packet. I hope the -linux-dvb software filter is able to get the best of the garbled TS. - -The bug, where the TS is distorted by a heavy usage of the device is gone -definitely. All dibusb-devices I was using (Twinhan, Kworld, DiBcom) are -working like charm now with VDR. Sometimes I even was able to record a channel -and watch another one. - -2.3. Comments - -Patches, comments and suggestions are very very welcome. - -3. Acknowledgements - Amaury Demol (ademol@dibcom.fr) and Francois Kanounnikoff from DiBcom for - providing specs, code and help, on which the dvb-dibusb, dib3000mb and - dib3000mc are based. - - David Matthews for identifying a new device type (Artec T1 with AN2235) - and for extending dibusb with remote control event handling. Thank you. - - Alex Woods for frequently answering question about usb and dvb - stuff, a big thank you. - - Bernd Wagner for helping with huge bug reports and discussions. - - Gunnar Wittich and Joachim von Caron for their trust for giving me - root-shells on their machines to implement support for new devices. - - Some guys on the linux-dvb mailing list for encouraging me - - Peter Schildmann >peter.schildmann-nospam-at-web.de< for his - user-level firmware loader, which saves a lot of time - (when writing the vp7041 driver) - - Ulf Hermenau for helping me out with traditional chinese. - - André Smoktun and Christian Frömmel for supporting me with - hardware and listening to my problems very patient diff --git a/Documentation/dvb/README.dvb-usb b/Documentation/dvb/README.dvb-usb new file mode 100644 index 000000000000..ac0797ea646c --- /dev/null +++ b/Documentation/dvb/README.dvb-usb @@ -0,0 +1,232 @@ +Documentation for dvb-usb-framework module and its devices + +Idea behind the dvb-usb-framework +================================= + +In March 2005 I got the new Twinhan USB2.0 DVB-T device. They provided specs and a firmware. + +Quite keen I wanted to put the driver (with some quirks of course) into dibusb. +After reading some specs and doing some USB snooping, it realized, that the +dibusb-driver would be a complete mess afterwards. So I decided to do it in a +different way: With the help of a dvb-usb-framework. + +The framework provides generic functions (mostly kernel API calls), such as: + +- Transport Stream URB handling in conjunction with dvb-demux-feed-control + (bulk and isoc are supported) +- registering the device for the DVB-API +- registering an I2C-adapter if applicable +- remote-control/input-device handling +- firmware requesting and loading (currently just for the Cypress USB + controllers) +- other functions/methods which can be shared by several drivers (such as + functions for bulk-control-commands) +- TODO: a I2C-chunker. It creates device-specific chunks of register-accesses + depending on length of a register and the number of values that can be + multi-written and multi-read. + +The source code of the particular DVB USB devices does just the communication +with the device via the bus. The connection between the DVB-API-functionality +is done via callbacks, assigned in a static device-description (struct +dvb_usb_device) each device-driver has to have. + +For an example have a look in drivers/media/dvb/dvb-usb/vp7045*. + +Objective is to migrate all the usb-devices (dibusb, cinergyT2, maybe the +ttusb; flexcop-usb already benefits from the generic flexcop-device) to use +the dvb-usb-lib. + +TODO: dynamic enabling and disabling of the pid-filter in regard to number of +feeds requested. + +Supported devices +======================== + +See the LinuxTV DVB Wiki at www.linuxtv.org for a complete list of +cards/drivers/firmwares: + +http://www.linuxtv.org/wiki/index.php/DVB_USB + +0. History & News: + 2005-06-30 - added support for WideView WT-220U (Thanks to Steve Chang) + 2005-05-30 - added basic isochronous support to the dvb-usb-framework + added support for Conexant Hybrid reference design and Nebula DigiTV USB + 2005-04-17 - all dibusb devices ported to make use of the dvb-usb-framework + 2005-04-02 - re-enabled and improved remote control code. + 2005-03-31 - ported the Yakumo/Hama/Typhoon DVB-T USB2.0 device to dvb-usb. + 2005-03-30 - first commit of the dvb-usb-module based on the dibusb-source. First device is a new driver for the + TwinhanDTV Alpha / MagicBox II USB2.0-only DVB-T device. + + (change from dvb-dibusb to dvb-usb) + 2005-03-28 - added support for the AVerMedia AverTV DVB-T USB2.0 device (Thanks to Glen Harris and Jiun-Kuei Jung, AVerMedia) + 2005-03-14 - added support for the Typhoon/Yakumo/HAMA DVB-T mobile USB2.0 + 2005-02-11 - added support for the KWorld/ADSTech Instant DVB-T USB2.0. Thanks a lot to Joachim von Caron + 2005-02-02 - added support for the Hauppauge Win-TV Nova-T USB2 + 2005-01-31 - distorted streaming is gone for USB1.1 devices + 2005-01-13 - moved the mirrored pid_filter_table back to dvb-dibusb + - first almost working version for HanfTek UMT-010 + - found out, that Yakumo/HAMA/Typhoon are predecessors of the HanfTek UMT-010 + 2005-01-10 - refactoring completed, now everything is very delightful + - tuner quirks for some weird devices (Artec T1 AN2235 device has sometimes a + Panasonic Tuner assembled). Tunerprobing implemented. Thanks a lot to Gunnar Wittich. + 2004-12-29 - after several days of struggling around bug of no returning URBs fixed. + 2004-12-26 - refactored the dibusb-driver, splitted into separate files + - i2c-probing enabled + 2004-12-06 - possibility for demod i2c-address probing + - new usb IDs (Compro, Artec) + 2004-11-23 - merged changes from DiB3000MC_ver2.1 + - revised the debugging + - possibility to deliver the complete TS for USB2.0 + 2004-11-21 - first working version of the dib3000mc/p frontend driver. + 2004-11-12 - added additional remote control keys. Thanks to Uwe Hanke. + 2004-11-07 - added remote control support. Thanks to David Matthews. + 2004-11-05 - added support for a new devices (Grandtec/Avermedia/Artec) + - merged my changes (for dib3000mb/dibusb) to the FE_REFACTORING, because it became HEAD + - moved transfer control (pid filter, fifo control) from usb driver to frontend, it seems + better settled there (added xfer_ops-struct) + - created a common files for frontends (mc/p/mb) + 2004-09-28 - added support for a new device (Unkown, vendor ID is Hyper-Paltek) + 2004-09-20 - added support for a new device (Compro DVB-U2000), thanks + to Amaury Demol for reporting + - changed usb TS transfer method (several urbs, stopping transfer + before setting a new pid) + 2004-09-13 - added support for a new device (Artec T1 USB TVBOX), thanks + to Christian Motschke for reporting + 2004-09-05 - released the dibusb device and dib3000mb-frontend driver + + (old news for vp7041.c) + 2004-07-15 - found out, by accident, that the device has a TUA6010XS for + PLL + 2004-07-12 - figured out, that the driver should also work with the + CTS Portable (Chinese Television System) + 2004-07-08 - firmware-extraction-2.422-problem solved, driver is now working + properly with firmware extracted from 2.422 + - #if for 2.6.4 (dvb), compile issue + - changed firmware handling, see vp7041.txt sec 1.1 + 2004-07-02 - some tuner modifications, v0.1, cleanups, first public + 2004-06-28 - now using the dvb_dmx_swfilter_packets, everything + runs fine now + 2004-06-27 - able to watch and switching channels (pre-alpha) + - no section filtering yet + 2004-06-06 - first TS received, but kernel oops :/ + 2004-05-14 - firmware loader is working + 2004-05-11 - start writing the driver + +1. How to use? +1.1. Firmware + +Most of the USB drivers need to download a firmware to the device before start +working. + +Have a look at the Wikipage for the DVB-USB-drivers to find out, which firmware +you need for your device: + +http://www.linuxtv.org/wiki/index.php/DVB_USB + +1.2. Compiling + +Since the driver is in the linux kernel, activating the driver in +your favorite config-environment should sufficient. I recommend +to compile the driver as module. Hotplug does the rest. + +If you use dvb-kernel enter the build-2.6 directory run 'make' and 'insmod.sh +load' afterwards. + +1.3. Loading the drivers + +Hotplug is able to load the driver, when it is needed (because you plugged +in the device). + +If you want to enable debug output, you have to load the driver manually and +from withing the dvb-kernel cvs repository. + +first have a look, which debug level are available: + +modinfo dvb-usb +modinfo dvb-usb-vp7045 +etc. + +modprobe dvb-usb debug=<level> +modprobe dvb-usb-vp7045 debug=<level> +etc. + +should do the trick. + +When the driver is loaded successfully, the firmware file was in +the right place and the device is connected, the "Power"-LED should be +turned on. + +At this point you should be able to start a dvb-capable application. I'm use +(t|s)zap, mplayer and dvbscan to test the basics. VDR-xine provides the +long-term test scenario. + +2. Known problems and bugs + +- Don't remove the USB device while running an DVB application, your system + will go crazy or die most likely. + +2.1. Adding support for devices + +TODO + +2.2. USB1.1 Bandwidth limitation + +A lot of the currently supported devices are USB1.1 and thus they have a +maximum bandwidth of about 5-6 MBit/s when connected to a USB2.0 hub. +This is not enough for receiving the complete transport stream of a +DVB-T channel (which is about 16 MBit/s). Normally this is not a +problem, if you only want to watch TV (this does not apply for HDTV), +but watching a channel while recording another channel on the same +frequency simply does not work very well. This applies to all USB1.1 +DVB-T devices, not just the dvb-usb-devices) + +The bug, where the TS is distorted by a heavy usage of the device is gone +definitely. All dvb-usb-devices I was using (Twinhan, Kworld, DiBcom) are +working like charm now with VDR. Sometimes I even was able to record a channel +and watch another one. + +2.3. Comments + +Patches, comments and suggestions are very very welcome. + +3. Acknowledgements + Amaury Demol (ademol@dibcom.fr) and Francois Kanounnikoff from DiBcom for + providing specs, code and help, on which the dvb-dibusb, dib3000mb and + dib3000mc are based. + + David Matthews for identifying a new device type (Artec T1 with AN2235) + and for extending dibusb with remote control event handling. Thank you. + + Alex Woods for frequently answering question about usb and dvb + stuff, a big thank you. + + Bernd Wagner for helping with huge bug reports and discussions. + + Gunnar Wittich and Joachim von Caron for their trust for providing + root-shells on their machines to implement support for new devices. + + Allan Third and Michael Hutchinson for their help to write the Nebula + digitv-driver. + + Glen Harris for bringing up, that there is a new dibusb-device and Jiun-Kuei + Jung from AVerMedia who kindly provided a special firmware to get the device + up and running in Linux. + + Jennifer Chen, Jeff and Jack from Twinhan for kindly supporting by + writing the vp7045-driver. + + Steve Chang from WideView for providing information for new devices and + firmware files. + + Michael Paxton for submitting remote control keymaps. + + Some guys on the linux-dvb mailing list for encouraging me. + + Peter Schildmann >peter.schildmann-nospam-at-web.de< for his + user-level firmware loader, which saves a lot of time + (when writing the vp7041 driver) + + Ulf Hermenau for helping me out with traditional chinese. + + André Smoktun and Christian Frömmel for supporting me with + hardware and listening to my problems very patiently. diff --git a/Documentation/dvb/bt8xx.txt b/Documentation/dvb/bt8xx.txt index d64430bf4bb6..e6b8d05bc08d 100644 --- a/Documentation/dvb/bt8xx.txt +++ b/Documentation/dvb/bt8xx.txt @@ -1,69 +1,55 @@ -How to get the Nebula, PCTV and Twinhan DST cards working -========================================================= +How to get the Nebula Electronics DigiTV, Pinnacle PCTV Sat, Twinhan DST + clones working +========================================================================================= -This class of cards has a bt878a as the PCI interface, and -require the bttv driver. +1) General information +====================== -Please pay close attention to the warning about the bttv module -options below for the DST card. +This class of cards has a bt878a chip as the PCI interface. +The different card drivers require the bttv driver to provide the means +to access the i2c bus and the gpio pins of the bt8xx chipset. -1) General informations -======================= +2) Compilation rules for Kernel >= 2.6.12 +========================================= -These drivers require the bttv driver to provide the means to access -the i2c bus and the gpio pins of the bt8xx chipset. +Enable the following options: -Because of this, you need to enable "Device drivers" => "Multimedia devices" - => "Video For Linux" => "BT848 Video For Linux" - -Furthermore you need to enable + => "Video For Linux" => "BT848 Video For Linux" "Device drivers" => "Multimedia devices" => "Digital Video Broadcasting Devices" - => "DVB for Linux" "DVB Core Support" "Nebula/Pinnacle PCTV/TwinHan PCI Cards" + => "DVB for Linux" "DVB Core Support" "Nebula/Pinnacle PCTV/TwinHan PCI Cards" -2) Loading Modules -================== +3) Loading Modules, described by two approaches +=============================================== In general you need to load the bttv driver, which will handle the gpio and -i2c communication for us, plus the common dvb-bt8xx device driver. -The frontends for Nebula (nxt6000), Pinnacle PCTV (cx24110) and -TwinHan (dst) are loaded automatically by the dvb-bt8xx device driver. +i2c communication for us, plus the common dvb-bt8xx device driver, +which is called the backend. +The frontends for Nebula DigiTV (nxt6000), Pinnacle PCTV Sat (cx24110), +TwinHan DST + clones (dst and dst-ca) are loaded automatically by the backend. +For further details about TwinHan DST + clones see /Documentation/dvb/ci.txt. -3a) Nebula / Pinnacle PCTV --------------------------- +3a) The manual approach +----------------------- - $ modprobe bttv (normally bttv is being loaded automatically by kmod) - $ modprobe dvb-bt8xx (or just place dvb-bt8xx in /etc/modules for automatic loading) +Loading modules: +modprobe bttv +modprobe dvb-bt8xx +Unloading modules: +modprobe -r dvb-bt8xx +modprobe -r bttv -3b) TwinHan and Clones +3b) The automatic approach -------------------------- - $ modprobe bttv i2c_hw=1 card=0x71 - $ modprobe dvb-bt8xx - $ modprobe dst - -The value 0x71 will override the PCI type detection for dvb-bt8xx, -which is necessary for TwinHan cards. - -If you're having an older card (blue color circuit) and card=0x71 locks -your machine, try using 0x68, too. If that does not work, ask on the -mailing list. - -The DST module takes a couple of useful parameters. - -verbose takes values 0 to 5. These values control the verbosity level. - -debug takes values 0 and 1. You can either disable or enable debugging. - -dst_addons takes values 0 and 0x20. A value of 0 means it is a FTA card. -0x20 means it has a Conditional Access slot. - -The autodected values are determined bythe cards 'response -string' which you can see in your logs e.g. +If not already done by installation, place a line either in +/etc/modules.conf or in /etc/modprobe.conf containing this text: +alias char-major-81 bttv -dst_get_device_id: Recognise [DSTMCI] +Then place a line in /etc/modules containing this text: +dvb-bt8xx +Reboot your system and have fun! -- -Authors: Richard Walker, Jamie Honan, Michael Hunold, Manu Abraham +Authors: Richard Walker, Jamie Honan, Michael Hunold, Manu Abraham, Uwe Bugla diff --git a/Documentation/fb/intelfb.txt b/Documentation/fb/intelfb.txt new file mode 100644 index 000000000000..c12d39a23c3d --- /dev/null +++ b/Documentation/fb/intelfb.txt @@ -0,0 +1,135 @@ +Intel 830M/845G/852GM/855GM/865G/915G Framebuffer driver +================================================================ + +A. Introduction + This is a framebuffer driver for various Intel 810/815 compatible +graphics devices. These would include: + + Intel 830M + Intel 810E845G + Intel 852GM + Intel 855GM + Intel 865G + Intel 915G + +B. List of available options + + a. "video=intelfb" + enables the intelfb driver + + Recommendation: required + + b. "mode=<xres>x<yres>[-<bpp>][@<refresh>]" + select mode + + Recommendation: user preference + (default = 1024x768-32@70) + + c. "vram=<value>" + select amount of system RAM in MB to allocate for the video memory + if not enough RAM was already allocated by the BIOS. + + Recommendation: 1 - 4 MB. + (default = 4 MB) + + d. "voffset=<value>" + select at what offset in MB of the logical memory to allocate the + framebuffer memory. The intent is to avoid the memory blocks + used by standard graphics applications (XFree86). Depending on your + usage, adjust the value up or down, (0 for maximum usage, 63/127 MB + for the least amount). Note, an arbitrary setting may conflict + with XFree86. + + Recommendation: do not set + (default = 48 MB) + + e. "accel" + enable text acceleration. This can be enabled/reenabled anytime + by using 'fbset -accel true/false'. + + Recommendation: enable + (default = set) + + f. "hwcursor" + enable cursor acceleration. + + Recommendation: enable + (default = set) + + g. "mtrr" + enable MTRR. This allows data transfers to the framebuffer memory + to occur in bursts which can significantly increase performance. + Not very helpful with the intel chips because of 'shared memory'. + + Recommendation: set + (default = set) + + h. "fixed" + disable mode switching. + + Recommendation: do not set + (default = not set) + + The binary parameters can be unset with a "no" prefix, example "noaccel". + The default parameter (not named) is the mode. + +C. Kernel booting + +Separate each option/option-pair by commas (,) and the option from its value +with an equals sign (=) as in the following: + +video=i810fb:option1,option2=value2 + +Sample Usage +------------ + +In /etc/lilo.conf, add the line: + +append="video=intelfb:800x600-32@75,accel,hwcursor,vram=8" + +This will initialize the framebuffer to 800x600 at 32bpp and 75Hz. The +framebuffer will use 8 MB of System RAM. hw acceleration of text and cursor +will be enabled. + +D. Module options + + The module parameters are essentially similar to the kernel +parameters. The main difference is that you need to include a Boolean value +(1 for TRUE, and 0 for FALSE) for those options which don't need a value. + +Example, to enable MTRR, include "mtrr=1". + +Sample Usage +------------ + +Using the same setup as described above, load the module like this: + + modprobe intelfb mode=800x600-32@75 vram=8 accel=1 hwcursor=1 + +Or just add the following to /etc/modprobe.conf + + options intelfb mode=800x600-32@75 vram=8 accel=1 hwcursor=1 + +and just do a + + modprobe intelfb + + +E. Acknowledgment: + + 1. Geert Uytterhoeven - his excellent howto and the virtual + framebuffer driver code made this possible. + + 2. Jeff Hartmann for his agpgart code. + + 3. David Dawes for his original kernel 2.4 code. + + 4. The X developers. Insights were provided just by reading the + XFree86 source code. + + 5. Antonino A. Daplas for his inspiring i810fb driver. + + 6. Andrew Morton for his kernel patches maintenance. + +########################### +Sylvain diff --git a/Documentation/fb/vesafb.txt b/Documentation/fb/vesafb.txt index 814e2f56a6ad..62db6758d1c1 100644 --- a/Documentation/fb/vesafb.txt +++ b/Documentation/fb/vesafb.txt @@ -144,7 +144,21 @@ vgapal Use the standard vga registers for palette changes. This is the default. pmipal Use the protected mode interface for palette changes. -mtrr setup memory type range registers for the vesafb framebuffer. +mtrr:n setup memory type range registers for the vesafb framebuffer + where n: + 0 - disabled (equivalent to nomtrr) + 1 - uncachable + 2 - write-back + 3 - write-combining (default) + 4 - write-through + + If you see the following in dmesg, choose the type that matches the + old one. In this example, use "mtrr:2". +... +mtrr: type mismatch for e0000000,8000000 old: write-back new: write-combining +... + +nomtrr disable mtrr vremap:n remap 'n' MiB of video RAM. If 0 or not specified, remap memory diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index b9eb209318ab..8b1430b46655 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -43,6 +43,14 @@ Who: Randy Dunlap <rddunlap@osdl.org> --------------------------- +What: RAW driver (CONFIG_RAW_DRIVER) +When: December 2005 +Why: declared obsolete since kernel 2.6.3 + O_DIRECT can be used instead +Who: Adrian Bunk <bunk@stusta.de> + +--------------------------- + What: register_ioctl32_conversion() / unregister_ioctl32_conversion() When: April 2005 Why: Replaced by ->compat_ioctl in file_operations and other method @@ -66,6 +74,14 @@ Who: Paul E. McKenney <paulmck@us.ibm.com> --------------------------- +What: remove verify_area() +When: July 2006 +Files: Various uaccess.h headers. +Why: Deprecated and redundant. access_ok() should be used instead. +Who: Jesper Juhl <juhl-lkml@dif.dk> + +--------------------------- + What: IEEE1394 Audio and Music Data Transmission Protocol driver, Connection Management Procedures driver When: November 2005 @@ -83,3 +99,39 @@ Why: Deprecated in favour of the new ioctl-based rawiso interface, which is more efficient. You should really be using libraw1394 for raw1394 access anyway. Who: Jody McIntyre <scjody@steamballoon.com> + +--------------------------- + +What: register_serial/unregister_serial +When: September 2005 +Why: This interface does not allow serial ports to be registered against + a struct device, and as such does not allow correct power management + of such ports. 8250-based ports should use serial8250_register_port + and serial8250_unregister_port, or platform devices instead. +Who: Russell King <rmk@arm.linux.org.uk> + +--------------------------- + +What: i2c sysfs name change: in1_ref, vid deprecated in favour of cpu0_vid +When: November 2005 +Files: drivers/i2c/chips/adm1025.c, drivers/i2c/chips/adm1026.c +Why: Match the other drivers' name for the same function, duplicate names + will be available until removal of old names. +Who: Grant Coady <gcoady@gmail.com> + +--------------------------- + +What: PCMCIA control ioctl (needed for pcmcia-cs [cardmgr, cardctl]) +When: November 2005 +Files: drivers/pcmcia/: pcmcia_ioctl.c +Why: With the 16-bit PCMCIA subsystem now behaving (almost) like a + normal hotpluggable bus, and with it using the default kernel + infrastructure (hotplug, driver core, sysfs) keeping the PCMCIA + control ioctl needed by cardmgr and cardctl from pcmcia-cs is + unnecessary, and makes further cleanups and integration of the + PCMCIA subsystem into the Linux kernel device driver model more + difficult. The features provided by cardmgr and cardctl are either + handled by the kernel itself now or are available in the new + pcmciautils package available at + http://kernel.org/pub/linux/utils/kernel/pcmcia/ +Who: Dominik Brodowski <linux@brodo.de> diff --git a/Documentation/filesystems/ext2.txt b/Documentation/filesystems/ext2.txt index b5cb9110cc6b..d16334ec48ba 100644 --- a/Documentation/filesystems/ext2.txt +++ b/Documentation/filesystems/ext2.txt @@ -58,6 +58,8 @@ noacl Don't support POSIX ACLs. nobh Do not attach buffer_heads to file pagecache. +xip Use execute in place (no caching) if possible + grpquota,noquota,quota,usrquota Quota options are silently ignored by ext2. diff --git a/Documentation/filesystems/inotify.txt b/Documentation/filesystems/inotify.txt new file mode 100644 index 000000000000..6d501903f68e --- /dev/null +++ b/Documentation/filesystems/inotify.txt @@ -0,0 +1,151 @@ + inotify + a powerful yet simple file change notification system + + + +Document started 15 Mar 2005 by Robert Love <rml@novell.com> + + +(i) User Interface + +Inotify is controlled by a set of three system calls and normal file I/O on a +returned file descriptor. + +First step in using inotify is to initialise an inotify instance: + + int fd = inotify_init (); + +Each instance is associated with a unique, ordered queue. + +Change events are managed by "watches". A watch is an (object,mask) pair where +the object is a file or directory and the mask is a bit mask of one or more +inotify events that the application wishes to receive. See <linux/inotify.h> +for valid events. A watch is referenced by a watch descriptor, or wd. + +Watches are added via a path to the file. + +Watches on a directory will return events on any files inside of the directory. + +Adding a watch is simple: + + int wd = inotify_add_watch (fd, path, mask); + +Where "fd" is the return value from inotify_init(), path is the path to the +object to watch, and mask is the watch mask (see <linux/inotify.h>). + +You can update an existing watch in the same manner, by passing in a new mask. + +An existing watch is removed via + + int ret = inotify_rm_watch (fd, wd); + +Events are provided in the form of an inotify_event structure that is read(2) +from a given inotify instance. The filename is of dynamic length and follows +the struct. It is of size len. The filename is padded with null bytes to +ensure proper alignment. This padding is reflected in len. + +You can slurp multiple events by passing a large buffer, for example + + size_t len = read (fd, buf, BUF_LEN); + +Where "buf" is a pointer to an array of "inotify_event" structures at least +BUF_LEN bytes in size. The above example will return as many events as are +available and fit in BUF_LEN. + +Each inotify instance fd is also select()- and poll()-able. + +You can find the size of the current event queue via the standard FIONREAD +ioctl on the fd returned by inotify_init(). + +All watches are destroyed and cleaned up on close. + + +(ii) + +Prototypes: + + int inotify_init (void); + int inotify_add_watch (int fd, const char *path, __u32 mask); + int inotify_rm_watch (int fd, __u32 mask); + + +(iii) Internal Kernel Implementation + +Each inotify instance is associated with an inotify_device structure. + +Each watch is associated with an inotify_watch structure. Watches are chained +off of each associated device and each associated inode. + +See fs/inotify.c for the locking and lifetime rules. + + +(iv) Rationale + +Q: What is the design decision behind not tying the watch to the open fd of + the watched object? + +A: Watches are associated with an open inotify device, not an open file. + This solves the primary problem with dnotify: keeping the file open pins + the file and thus, worse, pins the mount. Dnotify is therefore infeasible + for use on a desktop system with removable media as the media cannot be + unmounted. Watching a file should not require that it be open. + +Q: What is the design decision behind using an-fd-per-instance as opposed to + an fd-per-watch? + +A: An fd-per-watch quickly consumes more file descriptors than are allowed, + more fd's than are feasible to manage, and more fd's than are optimally + select()-able. Yes, root can bump the per-process fd limit and yes, users + can use epoll, but requiring both is a silly and extraneous requirement. + A watch consumes less memory than an open file, separating the number + spaces is thus sensible. The current design is what user-space developers + want: Users initialize inotify, once, and add n watches, requiring but one + fd and no twiddling with fd limits. Initializing an inotify instance two + thousand times is silly. If we can implement user-space's preferences + cleanly--and we can, the idr layer makes stuff like this trivial--then we + should. + + There are other good arguments. With a single fd, there is a single + item to block on, which is mapped to a single queue of events. The single + fd returns all watch events and also any potential out-of-band data. If + every fd was a separate watch, + + - There would be no way to get event ordering. Events on file foo and + file bar would pop poll() on both fd's, but there would be no way to tell + which happened first. A single queue trivially gives you ordering. Such + ordering is crucial to existing applications such as Beagle. Imagine + "mv a b ; mv b a" events without ordering. + + - We'd have to maintain n fd's and n internal queues with state, + versus just one. It is a lot messier in the kernel. A single, linear + queue is the data structure that makes sense. + + - User-space developers prefer the current API. The Beagle guys, for + example, love it. Trust me, I asked. It is not a surprise: Who'd want + to manage and block on 1000 fd's via select? + + - No way to get out of band data. + + - 1024 is still too low. ;-) + + When you talk about designing a file change notification system that + scales to 1000s of directories, juggling 1000s of fd's just does not seem + the right interface. It is too heavy. + + Additionally, it _is_ possible to more than one instance and + juggle more than one queue and thus more than one associated fd. There + need not be a one-fd-per-process mapping; it is one-fd-per-queue and a + process can easily want more than one queue. + +Q: Why the system call approach? + +A: The poor user-space interface is the second biggest problem with dnotify. + Signals are a terrible, terrible interface for file notification. Or for + anything, for that matter. The ideal solution, from all perspectives, is a + file descriptor-based one that allows basic file I/O and poll/select. + Obtaining the fd and managing the watches could have been done either via a + device file or a family of new system calls. We decided to implement a + family of system calls because that is the preffered approach for new kernel + interfaces. The only real difference was whether we wanted to use open(2) + and ioctl(2) or a couple of new system calls. System calls beat ioctls. + diff --git a/Documentation/filesystems/isofs.txt b/Documentation/filesystems/isofs.txt index f64a10506689..424585ff6ea1 100644 --- a/Documentation/filesystems/isofs.txt +++ b/Documentation/filesystems/isofs.txt @@ -26,7 +26,11 @@ Mount options unique to the isofs filesystem. mode=xxx Sets the permissions on files to xxx nojoliet Ignore Joliet extensions if they are present. norock Ignore Rock Ridge extensions if they are present. - unhide Show hidden files. + hide Completely strip hidden files from the file system. + showassoc Show files marked with the 'associated' bit + unhide Deprecated; showing hidden files is now default; + If given, it is a synonym for 'showassoc' which will + recreate previous unhide behavior session=x Select number of session on multisession CD sbsector=xxx Session begins from sector xxx diff --git a/Documentation/filesystems/ntfs.txt b/Documentation/filesystems/ntfs.txt index f89b440fad1d..eef4aca0c753 100644 --- a/Documentation/filesystems/ntfs.txt +++ b/Documentation/filesystems/ntfs.txt @@ -21,7 +21,7 @@ Overview ======== Linux-NTFS comes with a number of user-space programs known as ntfsprogs. -These include mkntfs, a full-featured ntfs file system format utility, +These include mkntfs, a full-featured ntfs filesystem format utility, ntfsundelete used for recovering files that were unintentionally deleted from an NTFS volume and ntfsresize which is used to resize an NTFS partition. See the web site for more information. @@ -149,7 +149,14 @@ case_sensitive=<BOOL> If case_sensitive is specified, treat all file names as name, if it exists. If case_sensitive, you will need to provide the correct case of the short file name. -errors=opt What to do when critical file system errors are found. +disable_sparse=<BOOL> If disable_sparse is specified, creation of sparse + regions, i.e. holes, inside files is disabled for the + volume (for the duration of this mount only). By + default, creation of sparse regions is enabled, which + is consistent with the behaviour of traditional Unix + filesystems. + +errors=opt What to do when critical filesystem errors are found. Following values can be used for "opt": continue: DEFAULT, try to clean-up as much as possible, e.g. marking a corrupt inode as @@ -432,6 +439,24 @@ ChangeLog Note, a technical ChangeLog aimed at kernel hackers is in fs/ntfs/ChangeLog. +2.1.23: + - Stamp the user space journal, aka transaction log, aka $UsnJrnl, if + it is present and active thus telling Windows and applications using + the transaction log that changes can have happened on the volume + which are not recorded in $UsnJrnl. + - Detect the case when Windows has been hibernated (suspended to disk) + and if this is the case do not allow (re)mounting read-write to + prevent data corruption when you boot back into the suspended + Windows session. + - Implement extension of resident files using the normal file write + code paths, i.e. most very small files can be extended to be a little + bit bigger but not by much. + - Add new mount option "disable_sparse". (See list of mount options + above for details.) + - Improve handling of ntfs volumes with errors and strange boot sectors + in particular. + - Fix various bugs including a nasty deadlock that appeared in recent + kernels (around 2.6.11-2.6.12 timeframe). 2.1.22: - Improve handling of ntfs volumes with errors. - Fix various bugs and race conditions. diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt index 60f6c2c4d477..dc276598a65a 100644 --- a/Documentation/filesystems/sysfs.txt +++ b/Documentation/filesystems/sysfs.txt @@ -214,7 +214,7 @@ Other notes: A very simple (and naive) implementation of a device attribute is: -static ssize_t show_name(struct device * dev, char * buf) +static ssize_t show_name(struct device *dev, struct device_attribute *attr, char *buf) { return sprintf(buf,"%s\n",dev->name); } diff --git a/Documentation/filesystems/tmpfs.txt b/Documentation/filesystems/tmpfs.txt index 417e3095fe39..0d783c504ead 100644 --- a/Documentation/filesystems/tmpfs.txt +++ b/Documentation/filesystems/tmpfs.txt @@ -71,8 +71,8 @@ can be changed on remount. The size parameter also accepts a suffix % to limit this tmpfs instance to that percentage of your physical RAM: the default, when neither size nor nr_blocks is specified, is size=50% -If both nr_blocks (or size) and nr_inodes are set to 0, neither blocks -nor inodes will be limited in that instance. It is generally unwise to +If nr_blocks=0 (or size=0), blocks will not be limited in that instance; +if nr_inodes=0, inodes will not be limited. It is generally unwise to mount with such options, since it allows any user with write access to use up all the memory on the machine; but enhances the scalability of that instance in a system with many cpus making intensive use of it. @@ -97,4 +97,4 @@ RAM/SWAP in 10240 inodes and it is only accessible by root. Author: Christoph Rohland <cr@sap.com>, 1.12.01 Updated: - Hugh Dickins <hugh@veritas.com>, 01 September 2004 + Hugh Dickins <hugh@veritas.com>, 13 March 2005 diff --git a/Documentation/filesystems/xip.txt b/Documentation/filesystems/xip.txt new file mode 100644 index 000000000000..6c0cef10eb4d --- /dev/null +++ b/Documentation/filesystems/xip.txt @@ -0,0 +1,67 @@ +Execute-in-place for file mappings +---------------------------------- + +Motivation +---------- +File mappings are performed by mapping page cache pages to userspace. In +addition, read&write type file operations also transfer data from/to the page +cache. + +For memory backed storage devices that use the block device interface, the page +cache pages are in fact copies of the original storage. Various approaches +exist to work around the need for an extra copy. The ramdisk driver for example +does read the data into the page cache, keeps a reference, and discards the +original data behind later on. + +Execute-in-place solves this issue the other way around: instead of keeping +data in the page cache, the need to have a page cache copy is eliminated +completely. With execute-in-place, read&write type operations are performed +directly from/to the memory backed storage device. For file mappings, the +storage device itself is mapped directly into userspace. + +This implementation was initialy written for shared memory segments between +different virtual machines on s390 hardware to allow multiple machines to +share the same binaries and libraries. + +Implementation +-------------- +Execute-in-place is implemented in three steps: block device operation, +address space operation, and file operations. + +A block device operation named direct_access is used to retrieve a +reference (pointer) to a block on-disk. The reference is supposed to be +cpu-addressable, physical address and remain valid until the release operation +is performed. A struct block_device reference is used to address the device, +and a sector_t argument is used to identify the individual block. As an +alternative, memory technology devices can be used for this. + +The block device operation is optional, these block devices support it as of +today: +- dcssblk: s390 dcss block device driver + +An address space operation named get_xip_page is used to retrieve reference +to a struct page. To address the target page, a reference to an address_space, +and a sector number is provided. A 3rd argument indicates whether the +function should allocate blocks if needed. + +This address space operation is mutually exclusive with readpage&writepage that +do page cache read/write operations. +The following filesystems support it as of today: +- ext2: the second extended filesystem, see Documentation/filesystems/ext2.txt + +A set of file operations that do utilize get_xip_page can be found in +mm/filemap_xip.c . The following file operation implementations are provided: +- aio_read/aio_write +- readv/writev +- sendfile + +The generic file operations do_sync_read/do_sync_write can be used to implement +classic synchronous IO calls. + +Shortcomings +------------ +This implementation is limited to storage devices that are cpu addressable at +all times (no highmem or such). It works well on rom/ram, but enhancements are +needed to make it work with flash in read+write mode. +Putting the Linux kernel and/or its modules on a xip filesystem does not mean +they are not copied. diff --git a/Documentation/hwmon/adm1021 b/Documentation/hwmon/adm1021 new file mode 100644 index 000000000000..03d02bfb3df1 --- /dev/null +++ b/Documentation/hwmon/adm1021 @@ -0,0 +1,111 @@ +Kernel driver adm1021 +===================== + +Supported chips: + * Analog Devices ADM1021 + Prefix: 'adm1021' + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: Publicly available at the Analog Devices website + * Analog Devices ADM1021A/ADM1023 + Prefix: 'adm1023' + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: Publicly available at the Analog Devices website + * Genesys Logic GL523SM + Prefix: 'gl523sm' + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: + * Intel Xeon Processor + Prefix: - any other - may require 'force_adm1021' parameter + Addresses scanned: none + Datasheet: Publicly available at Intel website + * Maxim MAX1617 + Prefix: 'max1617' + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: Publicly available at the Maxim website + * Maxim MAX1617A + Prefix: 'max1617a' + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: Publicly available at the Maxim website + * National Semiconductor LM84 + Prefix: 'lm84' + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: Publicly available at the National Semiconductor website + * Philips NE1617 + Prefix: 'max1617' (probably detected as a max1617) + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: Publicly available at the Philips website + * Philips NE1617A + Prefix: 'max1617' (probably detected as a max1617) + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: Publicly available at the Philips website + * TI THMC10 + Prefix: 'thmc10' + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: Publicly available at the TI website + * Onsemi MC1066 + Prefix: 'mc1066' + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: Publicly available at the Onsemi website + + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com> + +Module Parameters +----------------- + +* read_only: int + Don't set any values, read only mode + + +Description +----------- + +The chips supported by this driver are very similar. The Maxim MAX1617 is +the oldest; it has the problem that it is not very well detectable. The +MAX1617A solves that. The ADM1021 is a straight clone of the MAX1617A. +Ditto for the THMC10. From here on, we will refer to all these chips as +ADM1021-clones. + +The ADM1021 and MAX1617A reports a die code, which is a sort of revision +code. This can help us pinpoint problems; it is not very useful +otherwise. + +ADM1021-clones implement two temperature sensors. One of them is internal, +and measures the temperature of the chip itself; the other is external and +is realised in the form of a transistor-like device. A special alarm +indicates whether the remote sensor is connected. + +Each sensor has its own low and high limits. When they are crossed, the +corresponding alarm is set and remains on as long as the temperature stays +out of range. Temperatures are measured in degrees Celsius. Measurements +are possible between -65 and +127 degrees, with a resolution of one degree. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may already +have disappeared! + +This driver only updates its values each 1.5 seconds; reading it more often +will do no harm, but will return 'old' values. It is possible to make +ADM1021-clones do faster measurements, but there is really no good reason +for that. + +Xeon support +------------ + +Some Xeon processors have real max1617, adm1021, or compatible chips +within them, with two temperature sensors. + +Other Xeons have chips with only one sensor. + +If you have a Xeon, and the adm1021 module loads, and both temperatures +appear valid, then things are good. + +If the adm1021 module doesn't load, you should try this: + modprobe adm1021 force_adm1021=BUS,ADDRESS + ADDRESS can only be 0x18, 0x1a, 0x29, 0x2b, 0x4c, or 0x4e. + +If you have dual Xeons you may have appear to have two separate +adm1021-compatible chips, or two single-temperature sensors, at distinct +addresses. diff --git a/Documentation/hwmon/adm1025 b/Documentation/hwmon/adm1025 new file mode 100644 index 000000000000..39d2b781b5d6 --- /dev/null +++ b/Documentation/hwmon/adm1025 @@ -0,0 +1,51 @@ +Kernel driver adm1025 +===================== + +Supported chips: + * Analog Devices ADM1025, ADM1025A + Prefix: 'adm1025' + Addresses scanned: I2C 0x2c - 0x2e + Datasheet: Publicly available at the Analog Devices website + * Philips NE1619 + Prefix: 'ne1619' + Addresses scanned: I2C 0x2c - 0x2d + Datasheet: Publicly available at the Philips website + +The NE1619 presents some differences with the original ADM1025: + * Only two possible addresses (0x2c - 0x2d). + * No temperature offset register, but we don't use it anyway. + * No INT mode for pin 16. We don't play with it anyway. + +Authors: + Chen-Yuan Wu <gwu@esoft.com>, + Jean Delvare <khali@linux-fr.org> + +Description +----------- + +(This is from Analog Devices.) The ADM1025 is a complete system hardware +monitor for microprocessor-based systems, providing measurement and limit +comparison of various system parameters. Five voltage measurement inputs +are provided, for monitoring +2.5V, +3.3V, +5V and +12V power supplies and +the processor core voltage. The ADM1025 can monitor a sixth power-supply +voltage by measuring its own VCC. One input (two pins) is dedicated to a +remote temperature-sensing diode and an on-chip temperature sensor allows +ambient temperature to be monitored. + +One specificity of this chip is that the pin 11 can be hardwired in two +different manners. It can act as the +12V power-supply voltage analog +input, or as the a fifth digital entry for the VID reading (bit 4). It's +kind of strange since both are useful, and the reason for designing the +chip that way is obscure at least to me. The bit 5 of the configuration +register can be used to define how the chip is hardwired. Please note that +it is not a choice you have to make as the user. The choice was already +made by your motherboard's maker. If the configuration bit isn't set +properly, you'll have a wrong +12V reading or a wrong VID reading. The way +the driver handles that is to preserve this bit through the initialization +process, assuming that the BIOS set it up properly beforehand. If it turns +out not to be true in some cases, we'll provide a module parameter to force +modes. + +This driver also supports the ADM1025A, which differs from the ADM1025 +only in that it has "open-drain VID inputs while the ADM1025 has on-chip +100k pull-ups on the VID inputs". It doesn't make any difference for us. diff --git a/Documentation/hwmon/adm1026 b/Documentation/hwmon/adm1026 new file mode 100644 index 000000000000..473c689d7924 --- /dev/null +++ b/Documentation/hwmon/adm1026 @@ -0,0 +1,93 @@ +Kernel driver adm1026 +===================== + +Supported chips: + * Analog Devices ADM1026 + Prefix: 'adm1026' + Addresses scanned: I2C 0x2c, 0x2d, 0x2e + Datasheet: Publicly available at the Analog Devices website + http://www.analog.com/en/prod/0,,766_825_ADM1026,00.html + +Authors: + Philip Pokorny <ppokorny@penguincomputing.com> for Penguin Computing + Justin Thiessen <jthiessen@penguincomputing.com> + +Module Parameters +----------------- + +* gpio_input: int array (min = 1, max = 17) + List of GPIO pins (0-16) to program as inputs +* gpio_output: int array (min = 1, max = 17) + List of GPIO pins (0-16) to program as outputs +* gpio_inverted: int array (min = 1, max = 17) + List of GPIO pins (0-16) to program as inverted +* gpio_normal: int array (min = 1, max = 17) + List of GPIO pins (0-16) to program as normal/non-inverted +* gpio_fan: int array (min = 1, max = 8) + List of GPIO pins (0-7) to program as fan tachs + + +Description +----------- + +This driver implements support for the Analog Devices ADM1026. Analog +Devices calls it a "complete thermal system management controller." + +The ADM1026 implements three (3) temperature sensors, 17 voltage sensors, +16 general purpose digital I/O lines, eight (8) fan speed sensors (8-bit), +an analog output and a PWM output along with limit, alarm and mask bits for +all of the above. There is even 8k bytes of EEPROM memory on chip. + +Temperatures are measured in degrees Celsius. There are two external +sensor inputs and one internal sensor. Each sensor has a high and low +limit. If the limit is exceeded, an interrupt (#SMBALERT) can be +generated. The interrupts can be masked. In addition, there are over-temp +limits for each sensor. If this limit is exceeded, the #THERM output will +be asserted. The current temperature and limits have a resolution of 1 +degree. + +Fan rotation speeds are reported in RPM (rotations per minute) but measured +in counts of a 22.5kHz internal clock. Each fan has a high limit which +corresponds to a minimum fan speed. If the limit is exceeded, an interrupt +can be generated. Each fan can be programmed to divide the reference clock +by 1, 2, 4 or 8. Not all RPM values can accurately be represented, so some +rounding is done. With a divider of 8, the slowest measurable speed of a +two pulse per revolution fan is 661 RPM. + +There are 17 voltage sensors. An alarm is triggered if the voltage has +crossed a programmable minimum or maximum limit. Note that minimum in this +case always means 'closest to zero'; this is important for negative voltage +measurements. Several inputs have integrated attenuators so they can measure +higher voltages directly. 3.3V, 5V, 12V, -12V and battery voltage all have +dedicated inputs. There are several inputs scaled to 0-3V full-scale range +for SCSI terminator power. The remaining inputs are not scaled and have +a 0-2.5V full-scale range. A 2.5V or 1.82V reference voltage is provided +for negative voltage measurements. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may already +have disappeared! Note that in the current implementation, all hardware +registers are read whenever any data is read (unless it is less than 2.0 +seconds since the last update). This means that you can easily miss +once-only alarms. + +The ADM1026 measures continuously. Analog inputs are measured about 4 +times a second. Fan speed measurement time depends on fan speed and +divisor. It can take as long as 1.5 seconds to measure all fan speeds. + +The ADM1026 has the ability to automatically control fan speed based on the +temperature sensor inputs. Both the PWM output and the DAC output can be +used to control fan speed. Usually only one of these two outputs will be +used. Write the minimum PWM or DAC value to the appropriate control +register. Then set the low temperature limit in the tmin values for each +temperature sensor. The range of control is fixed at 20 °C, and the +largest difference between current and tmin of the temperature sensors sets +the control output. See the datasheet for several example circuits for +controlling fan speed with the PWM and DAC outputs. The fan speed sensors +do not have PWM compensation, so it is probably best to control the fan +voltage from the power lead rather than on the ground lead. + +The datasheet shows an example application with VID signals attached to +GPIO lines. Unfortunately, the chip may not be connected to the VID lines +in this way. The driver assumes that the chips *is* connected this way to +get a VID voltage. diff --git a/Documentation/hwmon/adm1031 b/Documentation/hwmon/adm1031 new file mode 100644 index 000000000000..130a38382b98 --- /dev/null +++ b/Documentation/hwmon/adm1031 @@ -0,0 +1,35 @@ +Kernel driver adm1031 +===================== + +Supported chips: + * Analog Devices ADM1030 + Prefix: 'adm1030' + Addresses scanned: I2C 0x2c to 0x2e + Datasheet: Publicly available at the Analog Devices website + http://products.analog.com/products/info.asp?product=ADM1030 + + * Analog Devices ADM1031 + Prefix: 'adm1031' + Addresses scanned: I2C 0x2c to 0x2e + Datasheet: Publicly available at the Analog Devices website + http://products.analog.com/products/info.asp?product=ADM1031 + +Authors: + Alexandre d'Alton <alex@alexdalton.org> + Jean Delvare <khali@linux-fr.org> + +Description +----------- + +The ADM1030 and ADM1031 are digital temperature sensors and fan controllers. +They sense their own temperature as well as the temperature of up to one +(ADM1030) or two (ADM1031) external diodes. + +All temperature values are given in degrees Celsius. Resolution is 0.5 +degree for the local temperature, 0.125 degree for the remote temperatures. + +Each temperature channel has its own high and low limits, plus a critical +limit. + +The ADM1030 monitors a single fan speed, while the ADM1031 monitors up to +two. Each fan channel has its own low speed limit. diff --git a/Documentation/hwmon/adm9240 b/Documentation/hwmon/adm9240 new file mode 100644 index 000000000000..35f618f32896 --- /dev/null +++ b/Documentation/hwmon/adm9240 @@ -0,0 +1,177 @@ +Kernel driver adm9240 +===================== + +Supported chips: + * Analog Devices ADM9240 + Prefix: 'adm9240' + Addresses scanned: I2C 0x2c - 0x2f + Datasheet: Publicly available at the Analog Devices website + http://www.analog.com/UploadedFiles/Data_Sheets/79857778ADM9240_0.pdf + + * Dallas Semiconductor DS1780 + Prefix: 'ds1780' + Addresses scanned: I2C 0x2c - 0x2f + Datasheet: Publicly available at the Dallas Semiconductor (Maxim) website + http://pdfserv.maxim-ic.com/en/ds/DS1780.pdf + + * National Semiconductor LM81 + Prefix: 'lm81' + Addresses scanned: I2C 0x2c - 0x2f + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/ds.cgi/LM/LM81.pdf + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Michiel Rook <michiel@grendelproject.nl>, + Grant Coady <gcoady@gmail.com> with guidance + from Jean Delvare <khali@linux-fr.org> + +Interface +--------- +The I2C addresses listed above assume BIOS has not changed the +chip MSB 5-bit address. Each chip reports a unique manufacturer +identification code as well as the chip revision/stepping level. + +Description +----------- +[From ADM9240] The ADM9240 is a complete system hardware monitor for +microprocessor-based systems, providing measurement and limit comparison +of up to four power supplies and two processor core voltages, plus +temperature, two fan speeds and chassis intrusion. Measured values can +be read out via an I2C-compatible serial System Management Bus, and values +for limit comparisons can be programmed in over the same serial bus. The +high speed successive approximation ADC allows frequent sampling of all +analog channels to ensure a fast interrupt response to any out-of-limit +measurement. + +The ADM9240, DS1780 and LM81 are register compatible, the following +details are common to the three chips. Chip differences are described +after this section. + + +Measurements +------------ +The measurement cycle + +The adm9240 driver will take a measurement reading no faster than once +each two seconds. User-space may read sysfs interface faster than the +measurement update rate and will receive cached data from the most +recent measurement. + +ADM9240 has a very fast 320us temperature and voltage measurement cycle +with independent fan speed measurement cycles counting alternating rising +edges of the fan tacho inputs. + +DS1780 measurement cycle is about once per second including fan speed. + +LM81 measurement cycle is about once per 400ms including fan speed. +The LM81 12-bit extended temperature measurement mode is not supported. + +Temperature +----------- +On chip temperature is reported as degrees Celsius as 9-bit signed data +with resolution of 0.5 degrees Celsius. High and low temperature limits +are 8-bit signed data with resolution of one degree Celsius. + +Temperature alarm is asserted once the temperature exceeds the high limit, +and is cleared when the temperature falls below the temp1_max_hyst value. + +Fan Speed +--------- +Two fan tacho inputs are provided, the ADM9240 gates an internal 22.5kHz +clock via a divider to an 8-bit counter. Fan speed (rpm) is calculated by: + +rpm = (22500 * 60) / (count * divider) + +Automatic fan clock divider + + * User sets 0 to fan_min limit + - low speed alarm is disabled + - fan clock divider not changed + - auto fan clock adjuster enabled for valid fan speed reading + + * User sets fan_min limit too low + - low speed alarm is enabled + - fan clock divider set to max + - fan_min set to register value 254 which corresponds + to 664 rpm on adm9240 + - low speed alarm will be asserted if fan speed is + less than minimum measurable speed + - auto fan clock adjuster disabled + + * User sets reasonable fan speed + - low speed alarm is enabled + - fan clock divider set to suit fan_min + - auto fan clock adjuster enabled: adjusts fan_min + + * User sets unreasonably high low fan speed limit + - resolution of the low speed limit may be reduced + - alarm will be asserted + - auto fan clock adjuster enabled: adjusts fan_min + + * fan speed may be displayed as zero until the auto fan clock divider + adjuster brings fan speed clock divider back into chip measurement + range, this will occur within a few measurement cycles. + +Analog Output +------------- +An analog output provides a 0 to 1.25 volt signal intended for an external +fan speed amplifier circuit. The analog output is set to maximum value on +power up or reset. This doesn't do much on the test Intel SE440BX-2. + +Voltage Monitor + +Voltage (IN) measurement is internally scaled: + + nr label nominal maximum resolution + mV mV mV + 0 +2.5V 2500 3320 13.0 + 1 Vccp1 2700 3600 14.1 + 2 +3.3V 3300 4380 17.2 + 3 +5V 5000 6640 26.0 + 4 +12V 12000 15940 62.5 + 5 Vccp2 2700 3600 14.1 + +The reading is an unsigned 8-bit value, nominal voltage measurement is +represented by a reading of 192, being 3/4 of the measurement range. + +An alarm is asserted for any voltage going below or above the set limits. + +The driver reports and accepts voltage limits scaled to the above table. + +VID Monitor +----------- +The chip has five inputs to read the 5-bit VID and reports the mV value +based on detected CPU type. + +Chassis Intrusion +----------------- +An alarm is asserted when the CI pin goes active high. The ADM9240 +Datasheet has an example of an external temperature sensor driving +this pin. On an Intel SE440BX-2 the Chassis Intrusion header is +connected to a normally open switch. + +The ADM9240 provides an internal open drain on this line, and may output +a 20 ms active low pulse to reset an external Chassis Intrusion latch. + +Clear the CI latch by writing value 1 to the sysfs chassis_clear file. + +Alarm flags reported as 16-bit word + + bit label comment + --- ------------- -------------------------- + 0 +2.5 V_Error high or low limit exceeded + 1 VCCP_Error high or low limit exceeded + 2 +3.3 V_Error high or low limit exceeded + 3 +5 V_Error high or low limit exceeded + 4 Temp_Error temperature error + 6 FAN1_Error fan low limit exceeded + 7 FAN2_Error fan low limit exceeded + 8 +12 V_Error high or low limit exceeded + 9 VCCP2_Error high or low limit exceeded + 12 Chassis_Error CI pin went high + +Remaining bits are reserved and thus undefined. It is important to note +that alarm bits may be cleared on read, user-space may latch alarms and +provide the end-user with a method to clear alarm memory. diff --git a/Documentation/hwmon/asb100 b/Documentation/hwmon/asb100 new file mode 100644 index 000000000000..ab7365e139be --- /dev/null +++ b/Documentation/hwmon/asb100 @@ -0,0 +1,72 @@ +Kernel driver asb100 +==================== + +Supported Chips: + * Asus ASB100 and ASB100-A "Bach" + Prefix: 'asb100' + Addresses scanned: I2C 0x2d + Datasheet: none released + +Author: Mark M. Hoffman <mhoffman@lightlink.com> + +Description +----------- + +This driver implements support for the Asus ASB100 and ASB100-A "Bach". +These are custom ASICs available only on Asus mainboards. Asus refuses to +supply a datasheet for these chips. Thanks go to many people who helped +investigate their hardware, including: + +Vitaly V. Bursov +Alexander van Kaam (author of MBM for Windows) +Bertrik Sikken + +The ASB100 implements seven voltage sensors, three fan rotation speed +sensors, four temperature sensors, VID lines and alarms. In addition to +these, the ASB100-A also implements a single PWM controller for fans 2 and +3 (i.e. one setting controls both.) If you have a plain ASB100, the PWM +controller will simply not work (or maybe it will for you... it doesn't for +me). + +Temperatures are measured and reported in degrees Celsius. + +Fan speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. + +Voltage sensors (also known as IN sensors) report values in volts. + +The VID lines encode the core voltage value: the voltage level your +processor should work with. This is hardcoded by the mainboard and/or +processor itself. It is a value in volts. + +Alarms: (TODO question marks indicate may or may not work) + +0x0001 => in0 (?) +0x0002 => in1 (?) +0x0004 => in2 +0x0008 => in3 +0x0010 => temp1 (1) +0x0020 => temp2 +0x0040 => fan1 +0x0080 => fan2 +0x0100 => in4 +0x0200 => in5 (?) (2) +0x0400 => in6 (?) (2) +0x0800 => fan3 +0x1000 => chassis switch +0x2000 => temp3 + +Alarm Notes: + +(1) This alarm will only trigger if the hysteresis value is 127C. +I.e. it behaves the same as w83781d. + +(2) The min and max registers for these values appear to +be read-only or otherwise stuck at 0x00. + +TODO: +* Experiment with fan divisors > 8. +* Experiment with temp. sensor types. +* Are there really 13 voltage inputs? Probably not... +* Cleanups, no doubt... + diff --git a/Documentation/hwmon/ds1621 b/Documentation/hwmon/ds1621 new file mode 100644 index 000000000000..1fee6f1e6bc5 --- /dev/null +++ b/Documentation/hwmon/ds1621 @@ -0,0 +1,108 @@ +Kernel driver ds1621 +==================== + +Supported chips: + * Dallas Semiconductor DS1621 + Prefix: 'ds1621' + Addresses scanned: I2C 0x48 - 0x4f + Datasheet: Publicly available at the Dallas Semiconductor website + http://www.dalsemi.com/ + * Dallas Semiconductor DS1625 + Prefix: 'ds1621' + Addresses scanned: I2C 0x48 - 0x4f + Datasheet: Publicly available at the Dallas Semiconductor website + http://www.dalsemi.com/ + +Authors: + Christian W. Zuckschwerdt <zany@triq.net> + valuable contributions by Jan M. Sendler <sendler@sendler.de> + ported to 2.6 by Aurelien Jarno <aurelien@aurel32.net> + with the help of Jean Delvare <khali@linux-fr.org> + +Module Parameters +------------------ + +* polarity int + Output's polarity: 0 = active high, 1 = active low + +Description +----------- + +The DS1621 is a (one instance) digital thermometer and thermostat. It has +both high and low temperature limits which can be user defined (i.e. +programmed into non-volatile on-chip registers). Temperature range is -55 +degree Celsius to +125 in 0.5 increments. You may convert this into a +Fahrenheit range of -67 to +257 degrees with 0.9 steps. If polarity +parameter is not provided, original value is used. + +As for the thermostat, behavior can also be programmed using the polarity +toggle. On the one hand ("heater"), the thermostat output of the chip, +Tout, will trigger when the low limit temperature is met or underrun and +stays high until the high limit is met or exceeded. On the other hand +("cooler"), vice versa. That way "heater" equals "active low", whereas +"conditioner" equals "active high". Please note that the DS1621 data sheet +is somewhat misleading in this point since setting the polarity bit does +not simply invert Tout. + +A second thing is that, during extensive testing, Tout showed a tolerance +of up to +/- 0.5 degrees even when compared against precise temperature +readings. Be sure to have a high vs. low temperature limit gap of al least +1.0 degree Celsius to avoid Tout "bouncing", though! + +As for alarms, you can read the alarm status of the DS1621 via the 'alarms' +/sys file interface. The result consists mainly of bit 6 and 5 of the +configuration register of the chip; bit 6 (0x40 or 64) is the high alarm +bit and bit 5 (0x20 or 32) the low one. These bits are set when the high or +low limits are met or exceeded and are reset by the module as soon as the +respective temperature ranges are left. + +The alarm registers are in no way suitable to find out about the actual +status of Tout. They will only tell you about its history, whether or not +any of the limits have ever been met or exceeded since last power-up or +reset. Be aware: When testing, it showed that the status of Tout can change +with neither of the alarms set. + +Temperature conversion of the DS1621 takes up to 1000ms; internal access to +non-volatile registers may last for 10ms or below. + +High Accuracy Temperature Reading +--------------------------------- + +As said before, the temperature issued via the 9-bit i2c-bus data is +somewhat arbitrary. Internally, the temperature conversion is of a +different kind that is explained (not so...) well in the DS1621 data sheet. +To cut the long story short: Inside the DS1621 there are two oscillators, +both of them biassed by a temperature coefficient. + +Higher resolution of the temperature reading can be achieved using the +internal projection, which means taking account of REG_COUNT and REG_SLOPE +(the driver manages them): + +Taken from Dallas Semiconductors App Note 068: 'Increasing Temperature +Resolution on the DS1620' and App Note 105: 'High Resolution Temperature +Measurement with Dallas Direct-to-Digital Temperature Sensors' + +- Read the 9-bit temperature and strip the LSB (Truncate the .5 degs) +- The resulting value is TEMP_READ. +- Then, read REG_COUNT. +- And then, REG_SLOPE. + + TEMP = TEMP_READ - 0.25 + ((REG_SLOPE - REG_COUNT) / REG_SLOPE) + +Note that this is what the DONE bit in the DS1621 configuration register is +good for: Internally, one temperature conversion takes up to 1000ms. Before +that conversion is complete you will not be able to read valid things out +of REG_COUNT and REG_SLOPE. The DONE bit, as you may have guessed by now, +tells you whether the conversion is complete ("done", in plain English) and +thus, whether the values you read are good or not. + +The DS1621 has two modes of operation: "Continuous" conversion, which can +be understood as the default stand-alone mode where the chip gets the +temperature and controls external devices via its Tout pin or tells other +i2c's about it if they care. The other mode is called "1SHOT", that means +that it only figures out about the temperature when it is explicitly told +to do so; this can be seen as power saving mode. + +Now if you want to read REG_COUNT and REG_SLOPE, you have to either stop +the continuous conversions until the contents of these registers are valid, +or, in 1SHOT mode, you have to have one conversion made. diff --git a/Documentation/hwmon/fscher b/Documentation/hwmon/fscher new file mode 100644 index 000000000000..64031659aff3 --- /dev/null +++ b/Documentation/hwmon/fscher @@ -0,0 +1,169 @@ +Kernel driver fscher +==================== + +Supported chips: + * Fujitsu-Siemens Hermes chip + Prefix: 'fscher' + Addresses scanned: I2C 0x73 + +Authors: + Reinhard Nissl <rnissl@gmx.de> based on work + from Hermann Jung <hej@odn.de>, + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com> + +Description +----------- + +This driver implements support for the Fujitsu-Siemens Hermes chip. It is +described in the 'Register Set Specification BMC Hermes based Systemboard' +from Fujitsu-Siemens. + +The Hermes chip implements a hardware-based system management, e.g. for +controlling fan speed and core voltage. There is also a watchdog counter on +the chip which can trigger an alarm and even shut the system down. + +The chip provides three temperature values (CPU, motherboard and +auxiliary), three voltage values (+12V, +5V and battery) and three fans +(power supply, CPU and auxiliary). + +Temperatures are measured in degrees Celsius. The resolution is 1 degree. + +Fan rotation speeds are reported in RPM (rotations per minute). The value +can be divided by a programmable divider (1, 2 or 4) which is stored on +the chip. + +Voltage sensors (also known as "in" sensors) report their values in volts. + +All values are reported as final values from the driver. There is no need +for further calculations. + + +Detailed description +-------------------- + +Below you'll find a single line description of all the bit values. With +this information, you're able to decode e. g. alarms, wdog, etc. To make +use of the watchdog, you'll need to set the watchdog time and enable the +watchdog. After that it is necessary to restart the watchdog time within +the specified period of time, or a system reset will occur. + +* revision + READING & 0xff = 0x??: HERMES revision identification + +* alarms + READING & 0x80 = 0x80: CPU throttling active + READING & 0x80 = 0x00: CPU running at full speed + + READING & 0x10 = 0x10: software event (see control:1) + READING & 0x10 = 0x00: no software event + + READING & 0x08 = 0x08: watchdog event (see wdog:2) + READING & 0x08 = 0x00: no watchdog event + + READING & 0x02 = 0x02: thermal event (see temp*:1) + READING & 0x02 = 0x00: no thermal event + + READING & 0x01 = 0x01: fan event (see fan*:1) + READING & 0x01 = 0x00: no fan event + + READING & 0x13 ! 0x00: ALERT LED is flashing + +* control + READING & 0x01 = 0x01: software event + READING & 0x01 = 0x00: no software event + + WRITING & 0x01 = 0x01: set software event + WRITING & 0x01 = 0x00: clear software event + +* watchdog_control + READING & 0x80 = 0x80: power off on watchdog event while thermal event + READING & 0x80 = 0x00: watchdog power off disabled (just system reset enabled) + + READING & 0x40 = 0x40: watchdog timebase 60 seconds (see also wdog:1) + READING & 0x40 = 0x00: watchdog timebase 2 seconds + + READING & 0x10 = 0x10: watchdog enabled + READING & 0x10 = 0x00: watchdog disabled + + WRITING & 0x80 = 0x80: enable "power off on watchdog event while thermal event" + WRITING & 0x80 = 0x00: disable "power off on watchdog event while thermal event" + + WRITING & 0x40 = 0x40: set watchdog timebase to 60 seconds + WRITING & 0x40 = 0x00: set watchdog timebase to 2 seconds + + WRITING & 0x20 = 0x20: disable watchdog + + WRITING & 0x10 = 0x10: enable watchdog / restart watchdog time + +* watchdog_state + READING & 0x02 = 0x02: watchdog system reset occurred + READING & 0x02 = 0x00: no watchdog system reset occurred + + WRITING & 0x02 = 0x02: clear watchdog event + +* watchdog_preset + READING & 0xff = 0x??: configured watch dog time in units (see wdog:3 0x40) + + WRITING & 0xff = 0x??: configure watch dog time in units + +* in* (0: +5V, 1: +12V, 2: onboard 3V battery) + READING: actual voltage value + +* temp*_status (1: CPU sensor, 2: onboard sensor, 3: auxiliary sensor) + READING & 0x02 = 0x02: thermal event (overtemperature) + READING & 0x02 = 0x00: no thermal event + + READING & 0x01 = 0x01: sensor is working + READING & 0x01 = 0x00: sensor is faulty + + WRITING & 0x02 = 0x02: clear thermal event + +* temp*_input (1: CPU sensor, 2: onboard sensor, 3: auxiliary sensor) + READING: actual temperature value + +* fan*_status (1: power supply fan, 2: CPU fan, 3: auxiliary fan) + READING & 0x04 = 0x04: fan event (fan fault) + READING & 0x04 = 0x00: no fan event + + WRITING & 0x04 = 0x04: clear fan event + +* fan*_div (1: power supply fan, 2: CPU fan, 3: auxiliary fan) + Divisors 2,4 and 8 are supported, both for reading and writing + +* fan*_pwm (1: power supply fan, 2: CPU fan, 3: auxiliary fan) + READING & 0xff = 0x00: fan may be switched off + READING & 0xff = 0x01: fan must run at least at minimum speed (supply: 6V) + READING & 0xff = 0xff: fan must run at maximum speed (supply: 12V) + READING & 0xff = 0x??: fan must run at least at given speed (supply: 6V..12V) + + WRITING & 0xff = 0x00: fan may be switched off + WRITING & 0xff = 0x01: fan must run at least at minimum speed (supply: 6V) + WRITING & 0xff = 0xff: fan must run at maximum speed (supply: 12V) + WRITING & 0xff = 0x??: fan must run at least at given speed (supply: 6V..12V) + +* fan*_input (1: power supply fan, 2: CPU fan, 3: auxiliary fan) + READING: actual RPM value + + +Limitations +----------- + +* Measuring fan speed +It seems that the chip counts "ripples" (typical fans produce 2 ripples per +rotation while VERAX fans produce 18) in a 9-bit register. This register is +read out every second, then the ripple prescaler (2, 4 or 8) is applied and +the result is stored in the 8 bit output register. Due to the limitation of +the counting register to 9 bits, it is impossible to measure a VERAX fan +properly (even with a prescaler of 8). At its maximum speed of 3500 RPM the +fan produces 1080 ripples per second which causes the counting register to +overflow twice, leading to only 186 RPM. + +* Measuring input voltages +in2 ("battery") reports the voltage of the onboard lithium battery and not ++3.3V from the power supply. + +* Undocumented features +Fujitsu-Siemens Computers has not documented all features of the chip so +far. Their software, System Guard, shows that there are a still some +features which cannot be controlled by this implementation. diff --git a/Documentation/hwmon/gl518sm b/Documentation/hwmon/gl518sm new file mode 100644 index 000000000000..ce0881883bca --- /dev/null +++ b/Documentation/hwmon/gl518sm @@ -0,0 +1,74 @@ +Kernel driver gl518sm +===================== + +Supported chips: + * Genesys Logic GL518SM release 0x00 + Prefix: 'gl518sm' + Addresses scanned: I2C 0x2c and 0x2d + Datasheet: http://www.genesyslogic.com/pdf + * Genesys Logic GL518SM release 0x80 + Prefix: 'gl518sm' + Addresses scanned: I2C 0x2c and 0x2d + Datasheet: http://www.genesyslogic.com/pdf + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Kyösti Mälkki <kmalkki@cc.hut.fi> + Hong-Gunn Chew <hglinux@gunnet.org> + Jean Delvare <khali@linux-fr.org> + +Description +----------- + +IMPORTANT: + +For the revision 0x00 chip, the in0, in1, and in2 values (+5V, +3V, +and +12V) CANNOT be read. This is a limitation of the chip, not the driver. + +This driver supports the Genesys Logic GL518SM chip. There are at least +two revision of this chip, which we call revision 0x00 and 0x80. Revision +0x80 chips support the reading of all voltages and revision 0x00 only +for VIN3. + +The GL518SM implements one temperature sensor, two fan rotation speed +sensors, and four voltage sensors. It can report alarms through the +computer speakers. + +Temperatures are measured in degrees Celsius. An alarm goes off while the +temperature is above the over temperature limit, and has not yet dropped +below the hysteresis limit. The alarm always reflects the current +situation. Measurements are guaranteed between -10 degrees and +110 +degrees, with a accuracy of +/-3 degrees. + +Rotation speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. In +case when you have selected to turn fan1 off, no fan1 alarm is triggered. + +Fan readings can be divided by a programmable divider (1, 2, 4 or 8) to +give the readings more range or accuracy. Not all RPM values can +accurately be represented, so some rounding is done. With a divider +of 2, the lowest representable value is around 1900 RPM. + +Voltage sensors (also known as VIN sensors) report their values in volts. +An alarm is triggered if the voltage has crossed a programmable minimum or +maximum limit. Note that minimum in this case always means 'closest to +zero'; this is important for negative voltage measurements. The VDD input +measures voltages between 0.000 and 5.865 volt, with a resolution of 0.023 +volt. The other inputs measure voltages between 0.000 and 4.845 volt, with +a resolution of 0.019 volt. Note that revision 0x00 chips do not support +reading the current voltage of any input except for VIN3; limit setting and +alarms work fine, though. + +When an alarm is triggered, you can be warned by a beeping signal through your +computer speaker. It is possible to enable all beeping globally, or only the +beeping for some alarms. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once (except for temperature alarms). This means that the +cause for the alarm may already have disappeared! Note that in the current +implementation, all hardware registers are read whenever any data is read +(unless it is less than 1.5 seconds since the last update). This means that +you can easily miss once-only alarms. + +The GL518SM only updates its values each 1.5 seconds; reading it more often +will do no harm, but will return 'old' values. diff --git a/Documentation/hwmon/it87 b/Documentation/hwmon/it87 new file mode 100644 index 000000000000..0d0195040d88 --- /dev/null +++ b/Documentation/hwmon/it87 @@ -0,0 +1,96 @@ +Kernel driver it87 +================== + +Supported chips: + * IT8705F + Prefix: 'it87' + Addresses scanned: from Super I/O config space, or default ISA 0x290 (8 I/O ports) + Datasheet: Publicly available at the ITE website + http://www.ite.com.tw/ + * IT8712F + Prefix: 'it8712' + Addresses scanned: I2C 0x28 - 0x2f + from Super I/O config space, or default ISA 0x290 (8 I/O ports) + Datasheet: Publicly available at the ITE website + http://www.ite.com.tw/ + * SiS950 [clone of IT8705F] + Prefix: 'sis950' + Addresses scanned: from Super I/O config space, or default ISA 0x290 (8 I/O ports) + Datasheet: No longer be available + +Author: Christophe Gauthron <chrisg@0-in.com> + + +Module Parameters +----------------- + +* update_vbat: int + + 0 if vbat should report power on value, 1 if vbat should be updated after + each read. Default is 0. On some boards the battery voltage is provided + by either the battery or the onboard power supply. Only the first reading + at power on will be the actual battery voltage (which the chip does + automatically). On other boards the battery voltage is always fed to + the chip so can be read at any time. Excessive reading may decrease + battery life but no information is given in the datasheet. + +* fix_pwm_polarity int + + Force PWM polarity to active high (DANGEROUS). Some chips are + misconfigured by BIOS - PWM values would be inverted. This option tries + to fix this. Please contact your BIOS manufacturer and ask him for fix. + +Description +----------- + +This driver implements support for the IT8705F, IT8712F and SiS950 chips. + +This driver also supports IT8712F, which adds SMBus access, and a VID +input, used to report the Vcore voltage of the Pentium processor. +The IT8712F additionally features VID inputs. + +These chips are 'Super I/O chips', supporting floppy disks, infrared ports, +joysticks and other miscellaneous stuff. For hardware monitoring, they +include an 'environment controller' with 3 temperature sensors, 3 fan +rotation speed sensors, 8 voltage sensors, and associated alarms. + +Temperatures are measured in degrees Celsius. An alarm is triggered once +when the Overtemperature Shutdown limit is crossed. + +Fan rotation speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. Fan +readings can be divided by a programmable divider (1, 2, 4 or 8) to give the +readings more range or accuracy. Not all RPM values can accurately be +represented, so some rounding is done. With a divider of 2, the lowest +representable value is around 2600 RPM. + +Voltage sensors (also known as IN sensors) report their values in volts. An +alarm is triggered if the voltage has crossed a programmable minimum or +maximum limit. Note that minimum in this case always means 'closest to +zero'; this is important for negative voltage measurements. All voltage +inputs can measure voltages between 0 and 4.08 volts, with a resolution of +0.016 volt. The battery voltage in8 does not have limit registers. + +The VID lines (IT8712F only) encode the core voltage value: the voltage +level your processor should work with. This is hardcoded by the mainboard +and/or processor itself. It is a value in volts. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may already +have disappeared! Note that in the current implementation, all hardware +registers are read whenever any data is read (unless it is less than 1.5 +seconds since the last update). This means that you can easily miss +once-only alarms. + +The IT87xx only updates its values each 1.5 seconds; reading it more often +will do no harm, but will return 'old' values. + +To change sensor N to a thermistor, 'echo 2 > tempN_type' where N is 1, 2, +or 3. To change sensor N to a thermal diode, 'echo 3 > tempN_type'. +Give 0 for unused sensor. Any other value is invalid. To configure this at +startup, consult lm_sensors's /etc/sensors.conf. (2 = thermistor; +3 = thermal diode) + +The fan speed control features are limited to manual PWM mode. Automatic +"Smart Guardian" mode control handling is not implemented. However +if you want to go for "manual mode" just write 1 to pwmN_enable. diff --git a/Documentation/hwmon/lm63 b/Documentation/hwmon/lm63 new file mode 100644 index 000000000000..31660bf97979 --- /dev/null +++ b/Documentation/hwmon/lm63 @@ -0,0 +1,57 @@ +Kernel driver lm63 +================== + +Supported chips: + * National Semiconductor LM63 + Prefix: 'lm63' + Addresses scanned: I2C 0x4c + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/pf/LM/LM63.html + +Author: Jean Delvare <khali@linux-fr.org> + +Thanks go to Tyan and especially Alex Buckingham for setting up a remote +access to their S4882 test platform for this driver. + http://www.tyan.com/ + +Description +----------- + +The LM63 is a digital temperature sensor with integrated fan monitoring +and control. + +The LM63 is basically an LM86 with fan speed monitoring and control +capabilities added. It misses some of the LM86 features though: + - No low limit for local temperature. + - No critical limit for local temperature. + - Critical limit for remote temperature can be changed only once. We + will consider that the critical limit is read-only. + +The datasheet isn't very clear about what the tachometer reading is. + +An explanation from National Semiconductor: The two lower bits of the read +value have to be masked out. The value is still 16 bit in width. + +All temperature values are given in degrees Celsius. Resolution is 1.0 +degree for the local temperature, 0.125 degree for the remote temperature. + +The fan speed is measured using a tachometer. Contrary to most chips which +store the value in an 8-bit register and have a selectable clock divider +to make sure that the result will fit in the register, the LM63 uses 16-bit +value for measuring the speed of the fan. It can measure fan speeds down to +83 RPM, at least in theory. + +Note that the pin used for fan monitoring is shared with an alert out +function. Depending on how the board designer wanted to use the chip, fan +speed monitoring will or will not be possible. The proper chip configuration +is left to the BIOS, and the driver will blindly trust it. + +A PWM output can be used to control the speed of the fan. The LM63 has two +PWM modes: manual and automatic. Automatic mode is not fully implemented yet +(you cannot define your custom PWM/temperature curve), and mode change isn't +supported either. + +The lm63 driver will not update its values more frequently than every +second; reading them more often will do no harm, but will return 'old' +values. + diff --git a/Documentation/hwmon/lm75 b/Documentation/hwmon/lm75 new file mode 100644 index 000000000000..8e6356fe05d7 --- /dev/null +++ b/Documentation/hwmon/lm75 @@ -0,0 +1,65 @@ +Kernel driver lm75 +================== + +Supported chips: + * National Semiconductor LM75 + Prefix: 'lm75' + Addresses scanned: I2C 0x48 - 0x4f + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/ + * Dallas Semiconductor DS75 + Prefix: 'lm75' + Addresses scanned: I2C 0x48 - 0x4f + Datasheet: Publicly available at the Dallas Semiconductor website + http://www.maxim-ic.com/ + * Dallas Semiconductor DS1775 + Prefix: 'lm75' + Addresses scanned: I2C 0x48 - 0x4f + Datasheet: Publicly available at the Dallas Semiconductor website + http://www.maxim-ic.com/ + * Maxim MAX6625, MAX6626 + Prefix: 'lm75' + Addresses scanned: I2C 0x48 - 0x4b + Datasheet: Publicly available at the Maxim website + http://www.maxim-ic.com/ + * Microchip (TelCom) TCN75 + Prefix: 'lm75' + Addresses scanned: I2C 0x48 - 0x4f + Datasheet: Publicly available at the Microchip website + http://www.microchip.com/ + +Author: Frodo Looijaard <frodol@dds.nl> + +Description +----------- + +The LM75 implements one temperature sensor. Limits can be set through the +Overtemperature Shutdown register and Hysteresis register. Each value can be +set and read to half-degree accuracy. +An alarm is issued (usually to a connected LM78) when the temperature +gets higher then the Overtemperature Shutdown value; it stays on until +the temperature falls below the Hysteresis value. +All temperatures are in degrees Celsius, and are guaranteed within a +range of -55 to +125 degrees. + +The LM75 only updates its values each 1.5 seconds; reading it more often +will do no harm, but will return 'old' values. + +The LM75 is usually used in combination with LM78-like chips, to measure +the temperature of the processor(s). + +The DS75, DS1775, MAX6625, and MAX6626 are supported as well. +They are not distinguished from an LM75. While most of these chips +have three additional bits of accuracy (12 vs. 9 for the LM75), +the additional bits are not supported. Not only that, but these chips will +not be detected if not in 9-bit precision mode (use the force parameter if +needed). + +The TCN75 is supported as well, and is not distinguished from an LM75. + +The LM75 is essentially an industry standard; there may be other +LM75 clones not listed here, with or without various enhancements, +that are supported. + +The LM77 is not supported, contrary to what we pretended for a long time. +Both chips are simply not compatible, value encoding differs. diff --git a/Documentation/hwmon/lm77 b/Documentation/hwmon/lm77 new file mode 100644 index 000000000000..57c3a46d6370 --- /dev/null +++ b/Documentation/hwmon/lm77 @@ -0,0 +1,22 @@ +Kernel driver lm77 +================== + +Supported chips: + * National Semiconductor LM77 + Prefix: 'lm77' + Addresses scanned: I2C 0x48 - 0x4b + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/ + +Author: Andras BALI <drewie@freemail.hu> + +Description +----------- + +The LM77 implements one temperature sensor. The temperature +sensor incorporates a band-gap type temperature sensor, +10-bit ADC, and a digital comparator with user-programmable upper +and lower limit values. + +Limits can be set through the Overtemperature Shutdown register and +Hysteresis register. diff --git a/Documentation/hwmon/lm78 b/Documentation/hwmon/lm78 new file mode 100644 index 000000000000..357086ed7f64 --- /dev/null +++ b/Documentation/hwmon/lm78 @@ -0,0 +1,82 @@ +Kernel driver lm78 +================== + +Supported chips: + * National Semiconductor LM78 + Prefix: 'lm78' + Addresses scanned: I2C 0x20 - 0x2f, ISA 0x290 (8 I/O ports) + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/ + * National Semiconductor LM78-J + Prefix: 'lm78-j' + Addresses scanned: I2C 0x20 - 0x2f, ISA 0x290 (8 I/O ports) + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/ + * National Semiconductor LM79 + Prefix: 'lm79' + Addresses scanned: I2C 0x20 - 0x2f, ISA 0x290 (8 I/O ports) + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/ + +Author: Frodo Looijaard <frodol@dds.nl> + +Description +----------- + +This driver implements support for the National Semiconductor LM78, LM78-J +and LM79. They are described as 'Microprocessor System Hardware Monitors'. + +There is almost no difference between the three supported chips. Functionally, +the LM78 and LM78-J are exactly identical. The LM79 has one more VID line, +which is used to report the lower voltages newer Pentium processors use. +From here on, LM7* means either of these three types. + +The LM7* implements one temperature sensor, three fan rotation speed sensors, +seven voltage sensors, VID lines, alarms, and some miscellaneous stuff. + +Temperatures are measured in degrees Celsius. An alarm is triggered once +when the Overtemperature Shutdown limit is crossed; it is triggered again +as soon as it drops below the Hysteresis value. A more useful behavior +can be found by setting the Hysteresis value to +127 degrees Celsius; in +this case, alarms are issued during all the time when the actual temperature +is above the Overtemperature Shutdown value. Measurements are guaranteed +between -55 and +125 degrees, with a resolution of 1 degree. + +Fan rotation speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. Fan +readings can be divided by a programmable divider (1, 2, 4 or 8) to give +the readings more range or accuracy. Not all RPM values can accurately be +represented, so some rounding is done. With a divider of 2, the lowest +representable value is around 2600 RPM. + +Voltage sensors (also known as IN sensors) report their values in volts. +An alarm is triggered if the voltage has crossed a programmable minimum +or maximum limit. Note that minimum in this case always means 'closest to +zero'; this is important for negative voltage measurements. All voltage +inputs can measure voltages between 0 and 4.08 volts, with a resolution +of 0.016 volt. + +The VID lines encode the core voltage value: the voltage level your processor +should work with. This is hardcoded by the mainboard and/or processor itself. +It is a value in volts. When it is unconnected, you will often find the +value 3.50 V here. + +In addition to the alarms described above, there are a couple of additional +ones. There is a BTI alarm, which gets triggered when an external chip has +crossed its limits. Usually, this is connected to all LM75 chips; if at +least one crosses its limits, this bit gets set. The CHAS alarm triggers +if your computer case is open. The FIFO alarms should never trigger; it +indicates an internal error. The SMI_IN alarm indicates some other chip +has triggered an SMI interrupt. As we do not use SMI interrupts at all, +this condition usually indicates there is a problem with some other +device. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may +already have disappeared! Note that in the current implementation, all +hardware registers are read whenever any data is read (unless it is less +than 1.5 seconds since the last update). This means that you can easily +miss once-only alarms. + +The LM7* only updates its values each 1.5 seconds; reading it more often +will do no harm, but will return 'old' values. diff --git a/Documentation/hwmon/lm80 b/Documentation/hwmon/lm80 new file mode 100644 index 000000000000..cb5b407ba3e6 --- /dev/null +++ b/Documentation/hwmon/lm80 @@ -0,0 +1,56 @@ +Kernel driver lm80 +================== + +Supported chips: + * National Semiconductor LM80 + Prefix: 'lm80' + Addresses scanned: I2C 0x28 - 0x2f + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/ + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com> + +Description +----------- + +This driver implements support for the National Semiconductor LM80. +It is described as a 'Serial Interface ACPI-Compatible Microprocessor +System Hardware Monitor'. + +The LM80 implements one temperature sensor, two fan rotation speed sensors, +seven voltage sensors, alarms, and some miscellaneous stuff. + +Temperatures are measured in degrees Celsius. There are two sets of limits +which operate independently. When the HOT Temperature Limit is crossed, +this will cause an alarm that will be reasserted until the temperature +drops below the HOT Hysteresis. The Overtemperature Shutdown (OS) limits +should work in the same way (but this must be checked; the datasheet +is unclear about this). Measurements are guaranteed between -55 and ++125 degrees. The current temperature measurement has a resolution of +0.0625 degrees; the limits have a resolution of 1 degree. + +Fan rotation speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. Fan +readings can be divided by a programmable divider (1, 2, 4 or 8) to give +the readings more range or accuracy. Not all RPM values can accurately be +represented, so some rounding is done. With a divider of 2, the lowest +representable value is around 2600 RPM. + +Voltage sensors (also known as IN sensors) report their values in volts. +An alarm is triggered if the voltage has crossed a programmable minimum +or maximum limit. Note that minimum in this case always means 'closest to +zero'; this is important for negative voltage measurements. All voltage +inputs can measure voltages between 0 and 2.55 volts, with a resolution +of 0.01 volt. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may +already have disappeared! Note that in the current implementation, all +hardware registers are read whenever any data is read (unless it is less +than 2.0 seconds since the last update). This means that you can easily +miss once-only alarms. + +The LM80 only updates its values each 1.5 seconds; reading it more often +will do no harm, but will return 'old' values. diff --git a/Documentation/hwmon/lm83 b/Documentation/hwmon/lm83 new file mode 100644 index 000000000000..061d9ed8ff43 --- /dev/null +++ b/Documentation/hwmon/lm83 @@ -0,0 +1,76 @@ +Kernel driver lm83 +================== + +Supported chips: + * National Semiconductor LM83 + Prefix: 'lm83' + Addresses scanned: I2C 0x18 - 0x1a, 0x29 - 0x2b, 0x4c - 0x4e + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/pf/LM/LM83.html + + +Author: Jean Delvare <khali@linux-fr.org> + +Description +----------- + +The LM83 is a digital temperature sensor. It senses its own temperature as +well as the temperature of up to three external diodes. It is compatible +with many other devices such as the LM84 and all other ADM1021 clones. +The main difference between the LM83 and the LM84 in that the later can +only sense the temperature of one external diode. + +Using the adm1021 driver for a LM83 should work, but only two temperatures +will be reported instead of four. + +The LM83 is only found on a handful of motherboards. Both a confirmed +list and an unconfirmed list follow. If you can confirm or infirm the +fact that any of these motherboards do actually have an LM83, please +contact us. Note that the LM90 can easily be misdetected as a LM83. + +Confirmed motherboards: + SBS P014 + +Unconfirmed motherboards: + Gigabyte GA-8IK1100 + Iwill MPX2 + Soltek SL-75DRV5 + +The driver has been successfully tested by Magnus Forsström, who I'd +like to thank here. More testers will be of course welcome. + +The fact that the LM83 is only scarcely used can be easily explained. +Most motherboards come with more than just temperature sensors for +health monitoring. They also have voltage and fan rotation speed +sensors. This means that temperature-only chips are usually used as +secondary chips coupled with another chip such as an IT8705F or similar +chip, which provides more features. Since systems usually need three +temperature sensors (motherboard, processor, power supply) and primary +chips provide some temperature sensors, the secondary chip, if needed, +won't have to handle more than two temperatures. Thus, ADM1021 clones +are sufficient, and there is no need for a four temperatures sensor +chip such as the LM83. The only case where using an LM83 would make +sense is on SMP systems, such as the above-mentioned Iwill MPX2, +because you want an additional temperature sensor for each additional +CPU. + +On the SBS P014, this is different, since the LM83 is the only hardware +monitoring chipset. One temperature sensor is used for the motherboard +(actually measuring the LM83's own temperature), one is used for the +CPU. The two other sensors must be used to measure the temperature of +two other points of the motherboard. We suspect these points to be the +north and south bridges, but this couldn't be confirmed. + +All temperature values are given in degrees Celsius. Local temperature +is given within a range of 0 to +85 degrees. Remote temperatures are +given within a range of 0 to +125 degrees. Resolution is 1.0 degree, +accuracy is guaranteed to 3.0 degrees (see the datasheet for more +details). + +Each sensor has its own high limit, but the critical limit is common to +all four sensors. There is no hysteresis mechanism as found on most +recent temperature sensors. + +The lm83 driver will not update its values more frequently than every +other second; reading them more often will do no harm, but will return +'old' values. diff --git a/Documentation/hwmon/lm85 b/Documentation/hwmon/lm85 new file mode 100644 index 000000000000..9549237530cf --- /dev/null +++ b/Documentation/hwmon/lm85 @@ -0,0 +1,221 @@ +Kernel driver lm85 +================== + +Supported chips: + * National Semiconductor LM85 (B and C versions) + Prefix: 'lm85' + Addresses scanned: I2C 0x2c, 0x2d, 0x2e + Datasheet: http://www.national.com/pf/LM/LM85.html + * Analog Devices ADM1027 + Prefix: 'adm1027' + Addresses scanned: I2C 0x2c, 0x2d, 0x2e + Datasheet: http://www.analog.com/en/prod/0,,766_825_ADM1027,00.html + * Analog Devices ADT7463 + Prefix: 'adt7463' + Addresses scanned: I2C 0x2c, 0x2d, 0x2e + Datasheet: http://www.analog.com/en/prod/0,,766_825_ADT7463,00.html + * SMSC EMC6D100, SMSC EMC6D101 + Prefix: 'emc6d100' + Addresses scanned: I2C 0x2c, 0x2d, 0x2e + Datasheet: http://www.smsc.com/main/tools/discontinued/6d100.pdf + * SMSC EMC6D102 + Prefix: 'emc6d102' + Addresses scanned: I2C 0x2c, 0x2d, 0x2e + Datasheet: http://www.smsc.com/main/catalog/emc6d102.html + +Authors: + Philip Pokorny <ppokorny@penguincomputing.com>, + Frodo Looijaard <frodol@dds.nl>, + Richard Barrington <rich_b_nz@clear.net.nz>, + Margit Schubert-While <margitsw@t-online.de>, + Justin Thiessen <jthiessen@penguincomputing.com> + +Description +----------- + +This driver implements support for the National Semiconductor LM85 and +compatible chips including the Analog Devices ADM1027, ADT7463 and +SMSC EMC6D10x chips family. + +The LM85 uses the 2-wire interface compatible with the SMBUS 2.0 +specification. Using an analog to digital converter it measures three (3) +temperatures and five (5) voltages. It has four (4) 16-bit counters for +measuring fan speed. Five (5) digital inputs are provided for sampling the +VID signals from the processor to the VRM. Lastly, there are three (3) PWM +outputs that can be used to control fan speed. + +The voltage inputs have internal scaling resistors so that the following +voltage can be measured without external resistors: + + 2.5V, 3.3V, 5V, 12V, and CPU core voltage (2.25V) + +The temperatures measured are one internal diode, and two remote diodes. +Remote 1 is generally the CPU temperature. These inputs are designed to +measure a thermal diode like the one in a Pentium 4 processor in a socket +423 or socket 478 package. They can also measure temperature using a +transistor like the 2N3904. + +A sophisticated control system for the PWM outputs is designed into the +LM85 that allows fan speed to be adjusted automatically based on any of the +three temperature sensors. Each PWM output is individually adjustable and +programmable. Once configured, the LM85 will adjust the PWM outputs in +response to the measured temperatures without further host intervention. +This feature can also be disabled for manual control of the PWM's. + +Each of the measured inputs (voltage, temperature, fan speed) has +corresponding high/low limit values. The LM85 will signal an ALARM if any +measured value exceeds either limit. + +The LM85 samples all inputs continuously. The lm85 driver will not read +the registers more often than once a second. Further, configuration data is +only read once each 5 minutes. There is twice as much config data as +measurements, so this would seem to be a worthwhile optimization. + +Special Features +---------------- + +The LM85 has four fan speed monitoring modes. The ADM1027 has only two. +Both have special circuitry to compensate for PWM interactions with the +TACH signal from the fans. The ADM1027 can be configured to measure the +speed of a two wire fan, but the input conditioning circuitry is different +for 3-wire and 2-wire mode. For this reason, the 2-wire fan modes are not +exposed to user control. The BIOS should initialize them to the correct +mode. If you've designed your own ADM1027, you'll have to modify the +init_client function and add an insmod parameter to set this up. + +To smooth the response of fans to changes in temperature, the LM85 has an +optional filter for smoothing temperatures. The ADM1027 has the same +config option but uses it to rate limit the changes to fan speed instead. + +The ADM1027 and ADT7463 have a 10-bit ADC and can therefore measure +temperatures with 0.25 degC resolution. They also provide an offset to the +temperature readings that is automatically applied during measurement. +This offset can be used to zero out any errors due to traces and placement. +The documentation says that the offset is in 0.25 degC steps, but in +initial testing of the ADM1027 it was 1.00 degC steps. Analog Devices has +confirmed this "bug". The ADT7463 is reported to work as described in the +documentation. The current lm85 driver does not show the offset register. + +The ADT7463 has a THERM asserted counter. This counter has a 22.76ms +resolution and a range of 5.8 seconds. The driver implements a 32-bit +accumulator of the counter value to extend the range to over a year. The +counter will stay at it's max value until read. + +See the vendor datasheets for more information. There is application note +from National (AN-1260) with some additional information about the LM85. +The Analog Devices datasheet is very detailed and describes a procedure for +determining an optimal configuration for the automatic PWM control. + +The SMSC EMC6D100 & EMC6D101 monitor external voltages, temperatures, and +fan speeds. They use this monitoring capability to alert the system to out +of limit conditions and can automatically control the speeds of multiple +fans in a PC or embedded system. The EMC6D101, available in a 24-pin SSOP +package, and the EMC6D100, available in a 28-pin SSOP package, are designed +to be register compatible. The EMC6D100 offers all the features of the +EMC6D101 plus additional voltage monitoring and system control features. +Unfortunately it is not possible to distinguish between the package +versions on register level so these additional voltage inputs may read +zero. The EMC6D102 features addtional ADC bits thus extending precision +of voltage and temperature channels. + + +Hardware Configurations +----------------------- + +The LM85 can be jumpered for 3 different SMBus addresses. There are +no other hardware configuration options for the LM85. + +The lm85 driver detects both LM85B and LM85C revisions of the chip. See the +datasheet for a complete description of the differences. Other than +identifying the chip, the driver behaves no differently with regard to +these two chips. The LM85B is recommended for new designs. + +The ADM1027 and ADT7463 chips have an optional SMBALERT output that can be +used to signal the chipset in case a limit is exceeded or the temperature +sensors fail. Individual sensor interrupts can be masked so they won't +trigger SMBALERT. The SMBALERT output if configured replaces one of the other +functions (PWM2 or IN0). This functionality is not implemented in current +driver. + +The ADT7463 also has an optional THERM output/input which can be connected +to the processor PROC_HOT output. If available, the autofan control +dynamic Tmin feature can be enabled to keep the system temperature within +spec (just?!) with the least possible fan noise. + +Configuration Notes +------------------- + +Besides standard interfaces driver adds following: + +* Temperatures and Zones + +Each temperature sensor is associated with a Zone. There are three +sensors and therefore three zones (# 1, 2 and 3). Each zone has the following +temperature configuration points: + +* temp#_auto_temp_off - temperature below which fans should be off or spinning very low. +* temp#_auto_temp_min - temperature over which fans start to spin. +* temp#_auto_temp_max - temperature when fans spin at full speed. +* temp#_auto_temp_crit - temperature when all fans will run full speed. + +* PWM Control + +There are three PWM outputs. The LM85 datasheet suggests that the +pwm3 output control both fan3 and fan4. Each PWM can be individually +configured and assigned to a zone for it's control value. Each PWM can be +configured individually according to the following options. + +* pwm#_auto_pwm_min - this specifies the PWM value for temp#_auto_temp_off + temperature. (PWM value from 0 to 255) + +* pwm#_auto_pwm_freq - select base frequency of PWM output. You can select + in range of 10.0 to 94.0 Hz in .1 Hz units. + (Values 100 to 940). + +The pwm#_auto_pwm_freq can be set to one of the following 8 values. Setting the +frequency to a value not on this list, will result in the next higher frequency +being selected. The actual device frequency may vary slightly from this +specification as designed by the manufacturer. Consult the datasheet for more +details. (PWM Frequency values: 100, 150, 230, 300, 380, 470, 620, 940) + +* pwm#_auto_pwm_minctl - this flags selects for temp#_auto_temp_off temperature + the bahaviour of fans. Write 1 to let fans spinning at + pwm#_auto_pwm_min or write 0 to let them off. + +NOTE: It has been reported that there is a bug in the LM85 that causes the flag +to be associated with the zones not the PWMs. This contradicts all the +published documentation. Setting pwm#_min_ctl in this case actually affects all +PWMs controlled by zone '#'. + +* PWM Controlling Zone selection + +* pwm#_auto_channels - controls zone that is associated with PWM + +Configuration choices: + + Value Meaning + ------ ------------------------------------------------ + 1 Controlled by Zone 1 + 2 Controlled by Zone 2 + 3 Controlled by Zone 3 + 23 Controlled by higher temp of Zone 2 or 3 + 123 Controlled by highest temp of Zone 1, 2 or 3 + 0 PWM always 0% (off) + -1 PWM always 100% (full on) + -2 Manual control (write to 'pwm#' to set) + +The National LM85's have two vendor specific configuration +features. Tach. mode and Spinup Control. For more details on these, +see the LM85 datasheet or Application Note AN-1260. + +The Analog Devices ADM1027 has several vendor specific enhancements. +The number of pulses-per-rev of the fans can be set, Tach monitoring +can be optimized for PWM operation, and an offset can be applied to +the temperatures to compensate for systemic errors in the +measurements. + +In addition to the ADM1027 features, the ADT7463 also has Tmin control +and THERM asserted counts. Automatic Tmin control acts to adjust the +Tmin value to maintain the measured temperature sensor at a specified +temperature. There isn't much documentation on this feature in the +ADT7463 data sheet. This is not supported by current driver. diff --git a/Documentation/hwmon/lm87 b/Documentation/hwmon/lm87 new file mode 100644 index 000000000000..c952c57f0e11 --- /dev/null +++ b/Documentation/hwmon/lm87 @@ -0,0 +1,73 @@ +Kernel driver lm87 +================== + +Supported chips: + * National Semiconductor LM87 + Prefix: 'lm87' + Addresses scanned: I2C 0x2c - 0x2f + Datasheet: http://www.national.com/pf/LM/LM87.html + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Mark Studebaker <mdsxyz123@yahoo.com>, + Stephen Rousset <stephen.rousset@rocketlogix.com>, + Dan Eaton <dan.eaton@rocketlogix.com>, + Jean Delvare <khali@linux-fr.org>, + Original 2.6 port Jeff Oliver + +Description +----------- + +This driver implements support for the National Semiconductor LM87. + +The LM87 implements up to three temperature sensors, up to two fan +rotation speed sensors, up to seven voltage sensors, alarms, and some +miscellaneous stuff. + +Temperatures are measured in degrees Celsius. Each input has a high +and low alarm settings. A high limit produces an alarm when the value +goes above it, and an alarm is also produced when the value goes below +the low limit. + +Fan rotation speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. Fan +readings can be divided by a programmable divider (1, 2, 4 or 8) to give +the readings more range or accuracy. Not all RPM values can accurately be +represented, so some rounding is done. With a divider of 2, the lowest +representable value is around 2600 RPM. + +Voltage sensors (also known as IN sensors) report their values in +volts. An alarm is triggered if the voltage has crossed a programmable +minimum or maximum limit. Note that minimum in this case always means +'closest to zero'; this is important for negative voltage measurements. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may +already have disappeared! Note that in the current implementation, all +hardware registers are read whenever any data is read (unless it is less +than 1.0 seconds since the last update). This means that you can easily +miss once-only alarms. + +The lm87 driver only updates its values each 1.0 seconds; reading it more +often will do no harm, but will return 'old' values. + + +Hardware Configurations +----------------------- + +The LM87 has four pins which can serve one of two possible functions, +depending on the hardware configuration. + +Some functions share pins, so not all functions are available at the same +time. Which are depends on the hardware setup. This driver assumes that +the BIOS configured the chip correctly. In that respect, it differs from +the original driver (from lm_sensors for Linux 2.4), which would force the +LM87 to an arbitrary, compile-time chosen mode, regardless of the actual +chipset wiring. + +For reference, here is the list of exclusive functions: + - in0+in5 (default) or temp3 + - fan1 (default) or in6 + - fan2 (default) or in7 + - VID lines (default) or IRQ lines (not handled by this driver) diff --git a/Documentation/hwmon/lm90 b/Documentation/hwmon/lm90 new file mode 100644 index 000000000000..2c4cf39471f4 --- /dev/null +++ b/Documentation/hwmon/lm90 @@ -0,0 +1,121 @@ +Kernel driver lm90 +================== + +Supported chips: + * National Semiconductor LM90 + Prefix: 'lm90' + Addresses scanned: I2C 0x4c + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/pf/LM/LM90.html + * National Semiconductor LM89 + Prefix: 'lm99' + Addresses scanned: I2C 0x4c and 0x4d + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/pf/LM/LM89.html + * National Semiconductor LM99 + Prefix: 'lm99' + Addresses scanned: I2C 0x4c and 0x4d + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/pf/LM/LM99.html + * National Semiconductor LM86 + Prefix: 'lm86' + Addresses scanned: I2C 0x4c + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/pf/LM/LM86.html + * Analog Devices ADM1032 + Prefix: 'adm1032' + Addresses scanned: I2C 0x4c + Datasheet: Publicly available at the Analog Devices website + http://products.analog.com/products/info.asp?product=ADM1032 + * Analog Devices ADT7461 + Prefix: 'adt7461' + Addresses scanned: I2C 0x4c + Datasheet: Publicly available at the Analog Devices website + http://products.analog.com/products/info.asp?product=ADT7461 + Note: Only if in ADM1032 compatibility mode + * Maxim MAX6657 + Prefix: 'max6657' + Addresses scanned: I2C 0x4c + Datasheet: Publicly available at the Maxim website + http://www.maxim-ic.com/quick_view2.cfm/qv_pk/2578 + * Maxim MAX6658 + Prefix: 'max6657' + Addresses scanned: I2C 0x4c + Datasheet: Publicly available at the Maxim website + http://www.maxim-ic.com/quick_view2.cfm/qv_pk/2578 + * Maxim MAX6659 + Prefix: 'max6657' + Addresses scanned: I2C 0x4c, 0x4d (unsupported 0x4e) + Datasheet: Publicly available at the Maxim website + http://www.maxim-ic.com/quick_view2.cfm/qv_pk/2578 + + +Author: Jean Delvare <khali@linux-fr.org> + + +Description +----------- + +The LM90 is a digital temperature sensor. It senses its own temperature as +well as the temperature of up to one external diode. It is compatible +with many other devices such as the LM86, the LM89, the LM99, the ADM1032, +the MAX6657, MAX6658 and the MAX6659 all of which are supported by this driver. +Note that there is no easy way to differentiate between the last three +variants. The extra address and features of the MAX6659 are not supported by +this driver. Additionally, the ADT7461 is supported if found in ADM1032 +compatibility mode. + +The specificity of this family of chipsets over the ADM1021/LM84 +family is that it features critical limits with hysteresis, and an +increased resolution of the remote temperature measurement. + +The different chipsets of the family are not strictly identical, although +very similar. This driver doesn't handle any specific feature for now, +but could if there ever was a need for it. For reference, here comes a +non-exhaustive list of specific features: + +LM90: + * Filter and alert configuration register at 0xBF. + * ALERT is triggered by temperatures over critical limits. + +LM86 and LM89: + * Same as LM90 + * Better external channel accuracy + +LM99: + * Same as LM89 + * External temperature shifted by 16 degrees down + +ADM1032: + * Consecutive alert register at 0x22. + * Conversion averaging. + * Up to 64 conversions/s. + * ALERT is triggered by open remote sensor. + +ADT7461 + * Extended temperature range (breaks compatibility) + * Lower resolution for remote temperature + +MAX6657 and MAX6658: + * Remote sensor type selection + +MAX6659 + * Selectable address + * Second critical temperature limit + * Remote sensor type selection + +All temperature values are given in degrees Celsius. Resolution +is 1.0 degree for the local temperature, 0.125 degree for the remote +temperature. + +Each sensor has its own high and low limits, plus a critical limit. +Additionally, there is a relative hysteresis value common to both critical +values. To make life easier to user-space applications, two absolute values +are exported, one for each channel, but these values are of course linked. +Only the local hysteresis can be set from user-space, and the same delta +applies to the remote hysteresis. + +The lm90 driver will not update its values more frequently than every +other second; reading them more often will do no harm, but will return +'old' values. + diff --git a/Documentation/hwmon/lm92 b/Documentation/hwmon/lm92 new file mode 100644 index 000000000000..7705bfaa0708 --- /dev/null +++ b/Documentation/hwmon/lm92 @@ -0,0 +1,37 @@ +Kernel driver lm92 +================== + +Supported chips: + * National Semiconductor LM92 + Prefix: 'lm92' + Addresses scanned: I2C 0x48 - 0x4b + Datasheet: http://www.national.com/pf/LM/LM92.html + * National Semiconductor LM76 + Prefix: 'lm92' + Addresses scanned: none, force parameter needed + Datasheet: http://www.national.com/pf/LM/LM76.html + * Maxim MAX6633/MAX6634/MAX6635 + Prefix: 'lm92' + Addresses scanned: I2C 0x48 - 0x4b + MAX6633 with address in 0x40 - 0x47, 0x4c - 0x4f needs force parameter + and MAX6634 with address in 0x4c - 0x4f needs force parameter + Datasheet: http://www.maxim-ic.com/quick_view2.cfm/qv_pk/3074 + +Authors: + Abraham van der Merwe <abraham@2d3d.co.za> + Jean Delvare <khali@linux-fr.org> + + +Description +----------- + +This driver implements support for the National Semiconductor LM92 +temperature sensor. + +Each LM92 temperature sensor supports a single temperature sensor. There are +alarms for high, low, and critical thresholds. There's also an hysteresis to +control the thresholds for resetting alarms. + +Support was added later for the LM76 and Maxim MAX6633/MAX6634/MAX6635, +which are mostly compatible. They have not all been tested, so you +may need to use the force parameter. diff --git a/Documentation/hwmon/max1619 b/Documentation/hwmon/max1619 new file mode 100644 index 000000000000..d6f8d9cd7d7f --- /dev/null +++ b/Documentation/hwmon/max1619 @@ -0,0 +1,29 @@ +Kernel driver max1619 +===================== + +Supported chips: + * Maxim MAX1619 + Prefix: 'max1619' + Addresses scanned: I2C 0x18-0x1a, 0x29-0x2b, 0x4c-0x4e + Datasheet: Publicly available at the Maxim website + http://pdfserv.maxim-ic.com/en/ds/MAX1619.pdf + +Authors: + Alexey Fisher <fishor@mail.ru>, + Jean Delvare <khali@linux-fr.org> + +Description +----------- + +The MAX1619 is a digital temperature sensor. It senses its own temperature as +well as the temperature of up to one external diode. + +All temperature values are given in degrees Celsius. Resolution +is 1.0 degree for the local temperature and for the remote temperature. + +Only the external sensor has high and low limits. + +The max1619 driver will not update its values more frequently than every +other second; reading them more often will do no harm, but will return +'old' values. + diff --git a/Documentation/hwmon/pc87360 b/Documentation/hwmon/pc87360 new file mode 100644 index 000000000000..89a8fcfa78df --- /dev/null +++ b/Documentation/hwmon/pc87360 @@ -0,0 +1,189 @@ +Kernel driver pc87360 +===================== + +Supported chips: + * National Semiconductor PC87360, PC87363, PC87364, PC87365 and PC87366 + Prefixes: 'pc87360', 'pc87363', 'pc87364', 'pc87365', 'pc87366' + Addresses scanned: none, address read from Super I/O config space + Datasheets: + http://www.national.com/pf/PC/PC87360.html + http://www.national.com/pf/PC/PC87363.html + http://www.national.com/pf/PC/PC87364.html + http://www.national.com/pf/PC/PC87365.html + http://www.national.com/pf/PC/PC87366.html + +Authors: Jean Delvare <khali@linux-fr.org> + +Thanks to Sandeep Mehta, Tonko de Rooy and Daniel Ceregatti for testing. +Thanks to Rudolf Marek for helping me investigate conversion issues. + + +Module Parameters +----------------- + +* init int + Chip initialization level: + 0: None + *1: Forcibly enable internal voltage and temperature channels, except in9 + 2: Forcibly enable all voltage and temperature channels, except in9 + 3: Forcibly enable all voltage and temperature channels, including in9 + +Note that this parameter has no effect for the PC87360, PC87363 and PC87364 +chips. + +Also note that for the PC87366, initialization levels 2 and 3 don't enable +all temperature channels, because some of them share pins with each other, +so they can't be used at the same time. + + +Description +----------- + +The National Semiconductor PC87360 Super I/O chip contains monitoring and +PWM control circuitry for two fans. The PC87363 chip is similar, and the +PC87364 chip has monitoring and PWM control for a third fan. + +The National Semiconductor PC87365 and PC87366 Super I/O chips are complete +hardware monitoring chipsets, not only controlling and monitoring three fans, +but also monitoring eleven voltage inputs and two (PC87365) or up to four +(PC87366) temperatures. + + Chip #vin #fan #pwm #temp devid + + PC87360 - 2 2 - 0xE1 + PC87363 - 2 2 - 0xE8 + PC87364 - 3 3 - 0xE4 + PC87365 11 3 3 2 0xE5 + PC87366 11 3 3 3-4 0xE9 + +The driver assumes that no more than one chip is present, and one of the +standard Super I/O addresses is used (0x2E/0x2F or 0x4E/0x4F) + +Fan Monitoring +-------------- + +Fan rotation speeds are reported in RPM (revolutions per minute). An alarm +is triggered if the rotation speed has dropped below a programmable limit. +A different alarm is triggered if the fan speed is too low to be measured. + +Fan readings are affected by a programmable clock divider, giving the +readings more range or accuracy. Usually, users have to learn how it works, +but this driver implements dynamic clock divider selection, so you don't +have to care no more. + +For reference, here are a few values about clock dividers: + + slowest accuracy highest + measurable around 3000 accurate + divider speed (RPM) RPM (RPM) speed (RPM) + 1 1882 18 6928 + 2 941 37 4898 + 4 470 74 3464 + 8 235 150 2449 + +For the curious, here is how the values above were computed: + * slowest measurable speed: clock/(255*divider) + * accuracy around 3000 RPM: 3000^2/clock + * highest accurate speed: sqrt(clock*100) +The clock speed for the PC87360 family is 480 kHz. I arbitrarily chose 100 +RPM as the lowest acceptable accuracy. + +As mentioned above, you don't have to care about this no more. + +Note that not all RPM values can be represented, even when the best clock +divider is selected. This is not only true for the measured speeds, but +also for the programmable low limits, so don't be surprised if you try to +set, say, fan1_min to 2900 and it finally reads 2909. + + +Fan Control +----------- + +PWM (pulse width modulation) values range from 0 to 255, with 0 meaning +that the fan is stopped, and 255 meaning that the fan goes at full speed. + +Be extremely careful when changing PWM values. Low PWM values, even +non-zero, can stop the fan, which may cause irreversible damage to your +hardware if temperature increases too much. When changing PWM values, go +step by step and keep an eye on temperatures. + +One user reported problems with PWM. Changing PWM values would break fan +speed readings. No explanation nor fix could be found. + + +Temperature Monitoring +---------------------- + +Temperatures are reported in degrees Celsius. Each temperature measured has +associated low, high and overtemperature limits, each of which triggers an +alarm when crossed. + +The first two temperature channels are external. The third one (PC87366 +only) is internal. + +The PC87366 has three additional temperature channels, based on +thermistors (as opposed to thermal diodes for the first three temperature +channels). For technical reasons, these channels are held by the VLM +(voltage level monitor) logical device, not the TMS (temperature +measurement) one. As a consequence, these temperatures are exported as +voltages, and converted into temperatures in user-space. + +Note that these three additional channels share their pins with the +external thermal diode channels, so you (physically) can't use them all at +the same time. Although it should be possible to mix the two sensor types, +the documents from National Semiconductor suggest that motherboard +manufacturers should choose one type and stick to it. So you will more +likely have either channels 1 to 3 (thermal diodes) or 3 to 6 (internal +thermal diode, and thermistors). + + +Voltage Monitoring +------------------ + +Voltages are reported relatively to a reference voltage, either internal or +external. Some of them (in7:Vsb, in8:Vdd and in10:AVdd) are divided by two +internally, you will have to compensate in sensors.conf. Others (in0 to in6) +are likely to be divided externally. The meaning of each of these inputs as +well as the values of the resistors used for division is left to the +motherboard manufacturers, so you will have to document yourself and edit +sensors.conf accordingly. National Semiconductor has a document with +recommended resistor values for some voltages, but this still leaves much +room for per motherboard specificities, unfortunately. Even worse, +motherboard manufacturers don't seem to care about National Semiconductor's +recommendations. + +Each voltage measured has associated low and high limits, each of which +triggers an alarm when crossed. + +When available, VID inputs are used to provide the nominal CPU Core voltage. +The driver will default to VRM 9.0, but this can be changed from user-space. +The chipsets can handle two sets of VID inputs (on dual-CPU systems), but +the driver will only export one for now. This may change later if there is +a need. + + +General Remarks +--------------- + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may already +have disappeared! Note that all hardware registers are read whenever any +data is read (unless it is less than 2 seconds since the last update, in +which case cached values are returned instead). As a consequence, when +a once-only alarm triggers, it may take 2 seconds for it to show, and 2 +more seconds for it to disappear. + +Monitoring of in9 isn't enabled at lower init levels (<3) because that +channel measures the battery voltage (Vbat). It is a known fact that +repeatedly sampling the battery voltage reduces its lifetime. National +Semiconductor smartly designed their chipset so that in9 is sampled only +once every 1024 sampling cycles (that is every 34 minutes at the default +sampling rate), so the effect is attenuated, but still present. + + +Limitations +----------- + +The datasheets suggests that some values (fan mins, fan dividers) +shouldn't be changed once the monitoring has started, but we ignore that +recommendation. We'll reconsider if it actually causes trouble. diff --git a/Documentation/hwmon/sis5595 b/Documentation/hwmon/sis5595 new file mode 100644 index 000000000000..b7ae36b8cdf5 --- /dev/null +++ b/Documentation/hwmon/sis5595 @@ -0,0 +1,106 @@ +Kernel driver sis5595 +===================== + +Supported chips: + * Silicon Integrated Systems Corp. SiS5595 Southbridge Hardware Monitor + Prefix: 'sis5595' + Addresses scanned: ISA in PCI-space encoded address + Datasheet: Publicly available at the Silicon Integrated Systems Corp. site. + +Authors: + Kyösti Mälkki <kmalkki@cc.hut.fi>, + Mark D. Studebaker <mdsxyz123@yahoo.com>, + Aurelien Jarno <aurelien@aurel32.net> 2.6 port + + SiS southbridge has a LM78-like chip integrated on the same IC. + This driver is a customized copy of lm78.c + + Supports following revisions: + Version PCI ID PCI Revision + 1 1039/0008 AF or less + 2 1039/0008 B0 or greater + + Note: these chips contain a 0008 device which is incompatible with the + 5595. We recognize these by the presence of the listed + "blacklist" PCI ID and refuse to load. + + NOT SUPPORTED PCI ID BLACKLIST PCI ID + 540 0008 0540 + 550 0008 0550 + 5513 0008 5511 + 5581 0008 5597 + 5582 0008 5597 + 5597 0008 5597 + 630 0008 0630 + 645 0008 0645 + 730 0008 0730 + 735 0008 0735 + + +Module Parameters +----------------- +force_addr=0xaddr Set the I/O base address. Useful for boards + that don't set the address in the BIOS. Does not do a + PCI force; the device must still be present in lspci. + Don't use this unless the driver complains that the + base address is not set. + Example: 'modprobe sis5595 force_addr=0x290' + + +Description +----------- + +The SiS5595 southbridge has integrated hardware monitor functions. It also +has an I2C bus, but this driver only supports the hardware monitor. For the +I2C bus driver see i2c-sis5595. + +The SiS5595 implements zero or one temperature sensor, two fan speed +sensors, four or five voltage sensors, and alarms. + +On the first version of the chip, there are four voltage sensors and one +temperature sensor. + +On the second version of the chip, the temperature sensor (temp) and the +fifth voltage sensor (in4) share a pin which is configurable, but not +through the driver. Sorry. The driver senses the configuration of the pin, +which was hopefully set by the BIOS. + +Temperatures are measured in degrees Celsius. An alarm is triggered once +when the max is crossed; it is also triggered when it drops below the min +value. Measurements are guaranteed between -55 and +125 degrees, with a +resolution of 1 degree. + +Fan rotation speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. Fan +readings can be divided by a programmable divider (1, 2, 4 or 8) to give +the readings more range or accuracy. Not all RPM values can accurately be +represented, so some rounding is done. With a divider of 2, the lowest +representable value is around 2600 RPM. + +Voltage sensors (also known as IN sensors) report their values in volts. An +alarm is triggered if the voltage has crossed a programmable minimum or +maximum limit. Note that minimum in this case always means 'closest to +zero'; this is important for negative voltage measurements. All voltage +inputs can measure voltages between 0 and 4.08 volts, with a resolution of +0.016 volt. + +In addition to the alarms described above, there is a BTI alarm, which gets +triggered when an external chip has crossed its limits. Usually, this is +connected to some LM75-like chip; if at least one crosses its limits, this +bit gets set. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may already +have disappeared! Note that in the current implementation, all hardware +registers are read whenever any data is read (unless it is less than 1.5 +seconds since the last update). This means that you can easily miss +once-only alarms. + +The SiS5595 only updates its values each 1.5 seconds; reading it more often +will do no harm, but will return 'old' values. + +Problems +-------- +Some chips refuse to be enabled. We don't know why. +The driver will recognize this and print a message in dmesg. + diff --git a/Documentation/i2c/chips/smsc47b397.txt b/Documentation/hwmon/smsc47b397 index 389edae7f8df..da9d80c96432 100644 --- a/Documentation/i2c/chips/smsc47b397.txt +++ b/Documentation/hwmon/smsc47b397 @@ -1,7 +1,19 @@ +Kernel driver smsc47b397 +======================== + +Supported chips: + * SMSC LPC47B397-NC + Prefix: 'smsc47b397' + Addresses scanned: none, address read from Super I/O config space + Datasheet: In this file + +Authors: Mark M. Hoffman <mhoffman@lightlink.com> + Utilitek Systems, Inc. + November 23, 2004 The following specification describes the SMSC LPC47B397-NC sensor chip -(for which there is no public datasheet available). This document was +(for which there is no public datasheet available). This document was provided by Craig Kelly (In-Store Broadcast Network) and edited/corrected by Mark M. Hoffman <mhoffman@lightlink.com>. @@ -10,10 +22,10 @@ by Mark M. Hoffman <mhoffman@lightlink.com>. Methods for detecting the HP SIO and reading the thermal data on a dc7100. The thermal information on the dc7100 is contained in the SIO Hardware Monitor -(HWM). The information is accessed through an index/data pair. The index/data -pair is located at the HWM Base Address + 0 and the HWM Base Address + 1. The +(HWM). The information is accessed through an index/data pair. The index/data +pair is located at the HWM Base Address + 0 and the HWM Base Address + 1. The HWM Base address can be obtained from Logical Device 8, registers 0x60 (MSB) -and 0x61 (LSB). Currently we are using 0x480 for the HWM Base Address and +and 0x61 (LSB). Currently we are using 0x480 for the HWM Base Address and 0x480 and 0x481 for the index/data pair. Reading temperature information. @@ -50,7 +62,7 @@ Reading the tach LSB locks the tach MSB. The LSB Must be read first. How to convert the tach reading to RPM. -The tach reading (TCount) is given by: (Tach MSB * 256) + (Tach LSB) +The tach reading (TCount) is given by: (Tach MSB * 256) + (Tach LSB) The SIO counts the number of 90kHz (11.111us) pulses per revolution. RPM = 60/(TCount * 11.111us) @@ -72,20 +84,20 @@ To program the configuration registers, the following sequence must be followed: Enter Configuration Mode To place the chip into the Configuration State The config key (0x55) is written -to the CONFIG PORT (0x2E). +to the CONFIG PORT (0x2E). Configuration Mode In configuration mode, the INDEX PORT is located at the CONFIG PORT address and the DATA PORT is at INDEX PORT address + 1. -The desired configuration registers are accessed in two steps: +The desired configuration registers are accessed in two steps: a. Write the index of the Logical Device Number Configuration Register (i.e., 0x07) to the INDEX PORT and then write the number of the desired logical device to the DATA PORT. b. Write the address of the desired configuration register within the logical device to the INDEX PORT and then write or read the config- - uration register through the DATA PORT. + uration register through the DATA PORT. Note: If accessing the Global Configuration Registers, step (a) is not required. @@ -96,18 +108,18 @@ The chip returns to the RUN State. (This is important). Programming Example The following is an example of how to read the SIO Device ID located at 0x20 -; ENTER CONFIGURATION MODE +; ENTER CONFIGURATION MODE MOV DX,02EH MOV AX,055H OUT DX,AL -; GLOBAL CONFIGURATION REGISTER +; GLOBAL CONFIGURATION REGISTER MOV DX,02EH MOV AL,20H -OUT DX,AL +OUT DX,AL ; READ THE DATA MOV DX,02FH IN AL,DX -; EXIT CONFIGURATION MODE +; EXIT CONFIGURATION MODE MOV DX,02EH MOV AX,0AAH OUT DX,AL @@ -122,12 +134,12 @@ Obtaining the HWM Base Address. The following is an example of how to read the HWM Base Address located in Logical Device 8. -; ENTER CONFIGURATION MODE +; ENTER CONFIGURATION MODE MOV DX,02EH MOV AX,055H OUT DX,AL -; CONFIGURE REGISTER CRE0, -; LOGICAL DEVICE 8 +; CONFIGURE REGISTER CRE0, +; LOGICAL DEVICE 8 MOV DX,02EH MOV AL,07H OUT DX,AL ;Point to LD# Config Reg @@ -135,12 +147,12 @@ MOV DX,02FH MOV AL, 08H OUT DX,AL;Point to Logical Device 8 ; -MOV DX,02EH +MOV DX,02EH MOV AL,60H OUT DX,AL ; Point to HWM Base Addr MSB MOV DX,02FH IN AL,DX ; Get MSB of HWM Base Addr -; EXIT CONFIGURATION MODE +; EXIT CONFIGURATION MODE MOV DX,02EH MOV AX,0AAH OUT DX,AL diff --git a/Documentation/hwmon/smsc47m1 b/Documentation/hwmon/smsc47m1 new file mode 100644 index 000000000000..34e6478c1425 --- /dev/null +++ b/Documentation/hwmon/smsc47m1 @@ -0,0 +1,52 @@ +Kernel driver smsc47m1 +====================== + +Supported chips: + * SMSC LPC47B27x, LPC47M10x, LPC47M13x, LPC47M14x, LPC47M15x and LPC47M192 + Addresses scanned: none, address read from Super I/O config space + Prefix: 'smsc47m1' + Datasheets: + http://www.smsc.com/main/datasheets/47b27x.pdf + http://www.smsc.com/main/datasheets/47m10x.pdf + http://www.smsc.com/main/tools/discontinued/47m13x.pdf + http://www.smsc.com/main/datasheets/47m14x.pdf + http://www.smsc.com/main/tools/discontinued/47m15x.pdf + http://www.smsc.com/main/datasheets/47m192.pdf + +Authors: + Mark D. Studebaker <mdsxyz123@yahoo.com>, + With assistance from Bruce Allen <ballen@uwm.edu>, and his + fan.c program: http://www.lsc-group.phys.uwm.edu/%7Eballen/driver/ + Gabriele Gorla <gorlik@yahoo.com>, + Jean Delvare <khali@linux-fr.org> + +Description +----------- + +The Standard Microsystems Corporation (SMSC) 47M1xx Super I/O chips +contain monitoring and PWM control circuitry for two fans. + +The 47M15x and 47M192 chips contain a full 'hardware monitoring block' +in addition to the fan monitoring and control. The hardware monitoring +block is not supported by the driver. + +Fan rotation speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. Fan +readings can be divided by a programmable divider (1, 2, 4 or 8) to give +the readings more range or accuracy. Not all RPM values can accurately be +represented, so some rounding is done. With a divider of 2, the lowest +representable value is around 2600 RPM. + +PWM values are from 0 to 255. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may +already have disappeared! Note that in the current implementation, all +hardware registers are read whenever any data is read (unless it is less +than 1.5 seconds since the last update). This means that you can easily +miss once-only alarms. + + +********************** +The lm_sensors project gratefully acknowledges the support of +Intel in the development of this driver. diff --git a/Documentation/i2c/sysfs-interface b/Documentation/hwmon/sysfs-interface index 346400519d0d..346400519d0d 100644 --- a/Documentation/i2c/sysfs-interface +++ b/Documentation/hwmon/sysfs-interface diff --git a/Documentation/hwmon/userspace-tools b/Documentation/hwmon/userspace-tools new file mode 100644 index 000000000000..2622aac65422 --- /dev/null +++ b/Documentation/hwmon/userspace-tools @@ -0,0 +1,39 @@ +Introduction +------------ + +Most mainboards have sensor chips to monitor system health (like temperatures, +voltages, fans speed). They are often connected through an I2C bus, but some +are also connected directly through the ISA bus. + +The kernel drivers make the data from the sensor chips available in the /sys +virtual filesystem. Userspace tools are then used to display or set or the +data in a more friendly manner. + +Lm-sensors +---------- + +Core set of utilites that will allow you to obtain health information, +setup monitoring limits etc. You can get them on their homepage +http://www.lm-sensors.nu/ or as a package from your Linux distribution. + +If from website: +Get lmsensors from project web site. Please note, you need only userspace +part, so compile with "make user_install" target. + +General hints to get things working: + +0) get lm-sensors userspace utils +1) compile all drivers in I2C section as modules in your kernel +2) run sensors-detect script, it will tell you what modules you need to load. +3) load them and run "sensors" command, you should see some results. +4) fix sensors.conf, labels, limits, fan divisors +5) if any more problems consult FAQ, or documentation + +Other utilites +-------------- + +If you want some graphical indicators of system health look for applications +like: gkrellm, ksensors, xsensors, wmtemp, wmsensors, wmgtemp, ksysguardd, +hardware-monitor + +If you are server administrator you can try snmpd or mrtgutils. diff --git a/Documentation/hwmon/via686a b/Documentation/hwmon/via686a new file mode 100644 index 000000000000..b82014cb7c53 --- /dev/null +++ b/Documentation/hwmon/via686a @@ -0,0 +1,65 @@ +Kernel driver via686a +===================== + +Supported chips: + * Via VT82C686A, VT82C686B Southbridge Integrated Hardware Monitor + Prefix: 'via686a' + Addresses scanned: ISA in PCI-space encoded address + Datasheet: On request through web form (http://www.via.com.tw/en/support/datasheets/) + +Authors: + Kyösti Mälkki <kmalkki@cc.hut.fi>, + Mark D. Studebaker <mdsxyz123@yahoo.com> + Bob Dougherty <bobd@stanford.edu> + (Some conversion-factor data were contributed by + Jonathan Teh Soon Yew <j.teh@iname.com> + and Alex van Kaam <darkside@chello.nl>.) + +Module Parameters +----------------- + +force_addr=0xaddr Set the I/O base address. Useful for Asus A7V boards + that don't set the address in the BIOS. Does not do a + PCI force; the via686a must still be present in lspci. + Don't use this unless the driver complains that the + base address is not set. + Example: 'modprobe via686a force_addr=0x6000' + +Description +----------- + +The driver does not distinguish between the chips and reports +all as a 686A. + +The Via 686a southbridge has integrated hardware monitor functionality. +It also has an I2C bus, but this driver only supports the hardware monitor. +For the I2C bus driver, see <file:Documentation/i2c/busses/i2c-viapro> + +The Via 686a implements three temperature sensors, two fan rotation speed +sensors, five voltage sensors and alarms. + +Temperatures are measured in degrees Celsius. An alarm is triggered once +when the Overtemperature Shutdown limit is crossed; it is triggered again +as soon as it drops below the hysteresis value. + +Fan rotation speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. Fan +readings can be divided by a programmable divider (1, 2, 4 or 8) to give +the readings more range or accuracy. Not all RPM values can accurately be +represented, so some rounding is done. With a divider of 2, the lowest +representable value is around 2600 RPM. + +Voltage sensors (also known as IN sensors) report their values in volts. +An alarm is triggered if the voltage has crossed a programmable minimum +or maximum limit. Voltages are internally scalled, so each voltage channel +has a different resolution and range. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may +already have disappeared! Note that in the current implementation, all +hardware registers are read whenever any data is read (unless it is less +than 1.5 seconds since the last update). This means that you can easily +miss once-only alarms. + +The driver only updates its values each 1.5 seconds; reading it more often +will do no harm, but will return 'old' values. diff --git a/Documentation/hwmon/w83627hf b/Documentation/hwmon/w83627hf new file mode 100644 index 000000000000..78f37c2d602e --- /dev/null +++ b/Documentation/hwmon/w83627hf @@ -0,0 +1,66 @@ +Kernel driver w83627hf +====================== + +Supported chips: + * Winbond W83627HF (ISA accesses ONLY) + Prefix: 'w83627hf' + Addresses scanned: ISA address retrieved from Super I/O registers + Datasheet: http://www.winbond.com/PDF/sheet/w83627hf.pdf + * Winbond W83627THF + Prefix: 'w83627thf' + Addresses scanned: ISA address retrieved from Super I/O registers + Datasheet: http://www.winbond.com/PDF/sheet/w83627thf.pdf + * Winbond W83697HF + Prefix: 'w83697hf' + Addresses scanned: ISA address retrieved from Super I/O registers + Datasheet: http://www.winbond.com/PDF/sheet/697hf.pdf + * Winbond W83637HF + Prefix: 'w83637hf' + Addresses scanned: ISA address retrieved from Super I/O registers + Datasheet: http://www.winbond.com/PDF/sheet/w83637hf.pdf + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Mark Studebaker <mdsxyz123@yahoo.com>, + Bernhard C. Schrenk <clemy@clemy.org> + +Module Parameters +----------------- + +* force_addr: int + Initialize the ISA address of the sensors +* force_i2c: int + Initialize the I2C address of the sensors +* init: int + (default is 1) + Use 'init=0' to bypass initializing the chip. + Try this if your computer crashes when you load the module. + +Description +----------- + +This driver implements support for ISA accesses *only* for +the Winbond W83627HF, W83627THF, W83697HF and W83637HF Super I/O chips. +We will refer to them collectively as Winbond chips. + +This driver supports ISA accesses, which should be more reliable +than i2c accesses. Also, for Tyan boards which contain both a +Super I/O chip and a second i2c-only Winbond chip (often a W83782D), +using this driver will avoid i2c address conflicts and complex +initialization that were required in the w83781d driver. + +If you really want i2c accesses for these Super I/O chips, +use the w83781d driver. However this is not the preferred method +now that this ISA driver has been developed. + +Technically, the w83627thf does not support a VID reading. However, it's +possible or even likely that your mainboard maker has routed these signals +to a specific set of general purpose IO pins (the Asus P4C800-E is one such +board). The w83627thf driver now interprets these as VID. If the VID on +your board doesn't work, first see doc/vid in the lm_sensors package. If +that still doesn't help, email us at lm-sensors@lm-sensors.org. + +For further information on this driver see the w83781d driver +documentation. + diff --git a/Documentation/hwmon/w83781d b/Documentation/hwmon/w83781d new file mode 100644 index 000000000000..e5459333ba68 --- /dev/null +++ b/Documentation/hwmon/w83781d @@ -0,0 +1,402 @@ +Kernel driver w83781d +===================== + +Supported chips: + * Winbond W83781D + Prefix: 'w83781d' + Addresses scanned: I2C 0x20 - 0x2f, ISA 0x290 (8 I/O ports) + Datasheet: http://www.winbond-usa.com/products/winbond_products/pdfs/PCIC/w83781d.pdf + * Winbond W83782D + Prefix: 'w83782d' + Addresses scanned: I2C 0x20 - 0x2f, ISA 0x290 (8 I/O ports) + Datasheet: http://www.winbond.com/PDF/sheet/w83782d.pdf + * Winbond W83783S + Prefix: 'w83783s' + Addresses scanned: I2C 0x2d + Datasheet: http://www.winbond-usa.com/products/winbond_products/pdfs/PCIC/w83783s.pdf + * Winbond W83627HF + Prefix: 'w83627hf' + Addresses scanned: I2C 0x20 - 0x2f, ISA 0x290 (8 I/O ports) + Datasheet: http://www.winbond.com/PDF/sheet/w83627hf.pdf + * Asus AS99127F + Prefix: 'as99127f' + Addresses scanned: I2C 0x28 - 0x2f + Datasheet: Unavailable from Asus + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Mark Studebaker <mdsxyz123@yahoo.com> + +Module parameters +----------------- + +* init int + (default 1) + Use 'init=0' to bypass initializing the chip. + Try this if your computer crashes when you load the module. + +force_subclients=bus,caddr,saddr,saddr + This is used to force the i2c addresses for subclients of + a certain chip. Typical usage is `force_subclients=0,0x2d,0x4a,0x4b' + to force the subclients of chip 0x2d on bus 0 to i2c addresses + 0x4a and 0x4b. This parameter is useful for certain Tyan boards. + +Description +----------- + +This driver implements support for the Winbond W83781D, W83782D, W83783S, +W83627HF chips, and the Asus AS99127F chips. We will refer to them +collectively as W8378* chips. + +There is quite some difference between these chips, but they are similar +enough that it was sensible to put them together in one driver. +The W83627HF chip is assumed to be identical to the ISA W83782D. +The Asus chips are similar to an I2C-only W83782D. + +Chip #vin #fanin #pwm #temp wchipid vendid i2c ISA +as99127f 7 3 0 3 0x31 0x12c3 yes no +as99127f rev.2 (type_name = as99127f) 0x31 0x5ca3 yes no +w83781d 7 3 0 3 0x10-1 0x5ca3 yes yes +w83627hf 9 3 2 3 0x21 0x5ca3 yes yes(LPC) +w83782d 9 3 2-4 3 0x30 0x5ca3 yes yes +w83783s 5-6 3 2 1-2 0x40 0x5ca3 yes no + +Detection of these chips can sometimes be foiled because they can be in +an internal state that allows no clean access. If you know the address +of the chip, use a 'force' parameter; this will put them into a more +well-behaved state first. + +The W8378* implements temperature sensors (three on the W83781D and W83782D, +two on the W83783S), three fan rotation speed sensors, voltage sensors +(seven on the W83781D, nine on the W83782D and six on the W83783S), VID +lines, alarms with beep warnings, and some miscellaneous stuff. + +Temperatures are measured in degrees Celsius. There is always one main +temperature sensor, and one (W83783S) or two (W83781D and W83782D) other +sensors. An alarm is triggered for the main sensor once when the +Overtemperature Shutdown limit is crossed; it is triggered again as soon as +it drops below the Hysteresis value. A more useful behavior +can be found by setting the Hysteresis value to +127 degrees Celsius; in +this case, alarms are issued during all the time when the actual temperature +is above the Overtemperature Shutdown value. The driver sets the +hysteresis value for temp1 to 127 at initialization. + +For the other temperature sensor(s), an alarm is triggered when the +temperature gets higher then the Overtemperature Shutdown value; it stays +on until the temperature falls below the Hysteresis value. But on the +W83781D, there is only one alarm that functions for both other sensors! +Temperatures are guaranteed within a range of -55 to +125 degrees. The +main temperature sensors has a resolution of 1 degree; the other sensor(s) +of 0.5 degree. + +Fan rotation speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. Fan +readings can be divided by a programmable divider (1, 2, 4 or 8 for the +W83781D; 1, 2, 4, 8, 16, 32, 64 or 128 for the others) to give +the readings more range or accuracy. Not all RPM values can accurately +be represented, so some rounding is done. With a divider of 2, the lowest +representable value is around 2600 RPM. + +Voltage sensors (also known as IN sensors) report their values in volts. +An alarm is triggered if the voltage has crossed a programmable minimum +or maximum limit. Note that minimum in this case always means 'closest to +zero'; this is important for negative voltage measurements. All voltage +inputs can measure voltages between 0 and 4.08 volts, with a resolution +of 0.016 volt. + +The VID lines encode the core voltage value: the voltage level your processor +should work with. This is hardcoded by the mainboard and/or processor itself. +It is a value in volts. When it is unconnected, you will often find the +value 3.50 V here. + +The W83782D and W83783S temperature conversion machine understands about +several kinds of temperature probes. You can program the so-called +beta value in the sensor files. '1' is the PII/Celeron diode, '2' is the +TN3904 transistor, and 3435 the default thermistor value. Other values +are (not yet) supported. + +In addition to the alarms described above, there is a CHAS alarm on the +chips which triggers if your computer case is open. + +When an alarm goes off, you can be warned by a beeping signal through +your computer speaker. It is possible to enable all beeping globally, +or only the beeping for some alarms. + +If an alarm triggers, it will remain triggered until the hardware register +is read at least once. This means that the cause for the alarm may +already have disappeared! Note that in the current implementation, all +hardware registers are read whenever any data is read (unless it is less +than 1.5 seconds since the last update). This means that you can easily +miss once-only alarms. + +The chips only update values each 1.5 seconds; reading them more often +will do no harm, but will return 'old' values. + +AS99127F PROBLEMS +----------------- +The as99127f support was developed without the benefit of a datasheet. +In most cases it is treated as a w83781d (although revision 2 of the +AS99127F looks more like a w83782d). +This support will be BETA until a datasheet is released. +One user has reported problems with fans stopping +occasionally. + +Note that the individual beep bits are inverted from the other chips. +The driver now takes care of this so that user-space applications +don't have to know about it. + +Known problems: + - Problems with diode/thermistor settings (supported?) + - One user reports fans stopping under high server load. + - Revision 2 seems to have 2 PWM registers but we don't know + how to handle them. More details below. + +These will not be fixed unless we get a datasheet. +If you have problems, please lobby Asus to release a datasheet. +Unfortunately several others have without success. +Please do not send mail to us asking for better as99127f support. +We have done the best we can without a datasheet. +Please do not send mail to the author or the sensors group asking for +a datasheet or ideas on how to convince Asus. We can't help. + + +NOTES: +----- + 783s has no in1 so that in[2-6] are compatible with the 781d/782d. + + 783s pin is programmable for -5V or temp1; defaults to -5V, + no control in driver so temp1 doesn't work. + + 782d and 783s datasheets differ on which is pwm1 and which is pwm2. + We chose to follow 782d. + + 782d and 783s pin is programmable for fan3 input or pwm2 output; + defaults to fan3 input. + If pwm2 is enabled (with echo 255 1 > pwm2), then + fan3 will report 0. + + 782d has pwm1-2 for ISA, pwm1-4 for i2c. (pwm3-4 share pins with + the ISA pins) + +Data sheet updates: +------------------ + - PWM clock registers: + + 000: master / 512 + 001: master / 1024 + 010: master / 2048 + 011: master / 4096 + 100: master / 8192 + + +Answers from Winbond tech support +--------------------------------- +> +> 1) In the W83781D data sheet section 7.2 last paragraph, it talks about +> reprogramming the R-T table if the Beta of the thermistor is not +> 3435K. The R-T table is described briefly in section 8.20. +> What formulas do I use to program a new R-T table for a given Beta? +> + We are sorry that the calculation for R-T table value is +confidential. If you have another Beta value of thermistor, we can help +to calculate the R-T table for you. But you should give us real R-T +Table which can be gotten by thermistor vendor. Therefore we will calculate +them and obtain 32-byte data, and you can fill the 32-byte data to the +register in Bank0.CR51 of W83781D. + + +> 2) In the W83782D data sheet, it mentions that pins 38, 39, and 40 are +> programmable to be either thermistor or Pentium II diode inputs. +> How do I program them for diode inputs? I can't find any register +> to program these to be diode inputs. + --> You may program Bank0 CR[5Dh] and CR[59h] registers. + + CR[5Dh] bit 1(VTIN1) bit 2(VTIN2) bit 3(VTIN3) + + thermistor 0 0 0 + diode 1 1 1 + + +(error) CR[59h] bit 4(VTIN1) bit 2(VTIN2) bit 3(VTIN3) +(right) CR[59h] bit 4(VTIN1) bit 5(VTIN2) bit 6(VTIN3) + + PII thermal diode 1 1 1 + 2N3904 diode 0 0 0 + + +Asus Clones +----------- + +We have no datasheets for the Asus clones (AS99127F and ASB100 Bach). +Here are some very useful information that were given to us by Alex Van +Kaam about how to detect these chips, and how to read their values. He +also gives advice for another Asus chipset, the Mozart-2 (which we +don't support yet). Thanks Alex! +I reworded some parts and added personal comments. + +# Detection: + +AS99127F rev.1, AS99127F rev.2 and ASB100: +- I2C address range: 0x29 - 0x2F +- If register 0x58 holds 0x31 then we have an Asus (either ASB100 or + AS99127F) +- Which one depends on register 0x4F (manufacturer ID): + 0x06 or 0x94: ASB100 + 0x12 or 0xC3: AS99127F rev.1 + 0x5C or 0xA3: AS99127F rev.2 + Note that 0x5CA3 is Winbond's ID (WEC), which let us think Asus get their + AS99127F rev.2 direct from Winbond. The other codes mean ATT and DVC, + respectively. ATT could stand for Asustek something (although it would be + very badly chosen IMHO), I don't know what DVC could stand for. Maybe + these codes simply aren't meant to be decoded that way. + +Mozart-2: +- I2C address: 0x77 +- If register 0x58 holds 0x56 or 0x10 then we have a Mozart-2 +- Of the Mozart there are 3 types: + 0x58=0x56, 0x4E=0x94, 0x4F=0x36: Asus ASM58 Mozart-2 + 0x58=0x56, 0x4E=0x94, 0x4F=0x06: Asus AS2K129R Mozart-2 + 0x58=0x10, 0x4E=0x5C, 0x4F=0xA3: Asus ??? Mozart-2 + You can handle all 3 the exact same way :) + +# Temperature sensors: + +ASB100: +- sensor 1: register 0x27 +- sensor 2 & 3 are the 2 LM75's on the SMBus +- sensor 4: register 0x17 +Remark: I noticed that on Intel boards sensor 2 is used for the CPU + and 4 is ignored/stuck, on AMD boards sensor 4 is the CPU and sensor 2 is + either ignored or a socket temperature. + +AS99127F (rev.1 and 2 alike): +- sensor 1: register 0x27 +- sensor 2 & 3 are the 2 LM75's on the SMBus +Remark: Register 0x5b is suspected to be temperature type selector. Bit 1 + would control temp1, bit 3 temp2 and bit 5 temp3. + +Mozart-2: +- sensor 1: register 0x27 +- sensor 2: register 0x13 + +# Fan sensors: + +ASB100, AS99127F (rev.1 and 2 alike): +- 3 fans, identical to the W83781D + +Mozart-2: +- 2 fans only, 1350000/RPM/div +- fan 1: register 0x28, divisor on register 0xA1 (bits 4-5) +- fan 2: register 0x29, divisor on register 0xA1 (bits 6-7) + +# Voltages: + +This is where there is a difference between AS99127F rev.1 and 2. +Remark: The difference is similar to the difference between + W83781D and W83782D. + +ASB100: +in0=r(0x20)*0.016 +in1=r(0x21)*0.016 +in2=r(0x22)*0.016 +in3=r(0x23)*0.016*1.68 +in4=r(0x24)*0.016*3.8 +in5=r(0x25)*(-0.016)*3.97 +in6=r(0x26)*(-0.016)*1.666 + +AS99127F rev.1: +in0=r(0x20)*0.016 +in1=r(0x21)*0.016 +in2=r(0x22)*0.016 +in3=r(0x23)*0.016*1.68 +in4=r(0x24)*0.016*3.8 +in5=r(0x25)*(-0.016)*3.97 +in6=r(0x26)*(-0.016)*1.503 + +AS99127F rev.2: +in0=r(0x20)*0.016 +in1=r(0x21)*0.016 +in2=r(0x22)*0.016 +in3=r(0x23)*0.016*1.68 +in4=r(0x24)*0.016*3.8 +in5=(r(0x25)*0.016-3.6)*5.14+3.6 +in6=(r(0x26)*0.016-3.6)*3.14+3.6 + +Mozart-2: +in0=r(0x20)*0.016 +in1=255 +in2=r(0x22)*0.016 +in3=r(0x23)*0.016*1.68 +in4=r(0x24)*0.016*4 +in5=255 +in6=255 + + +# PWM + +Additional info about PWM on the AS99127F (may apply to other Asus +chips as well) by Jean Delvare as of 2004-04-09: + +AS99127F revision 2 seems to have two PWM registers at 0x59 and 0x5A, +and a temperature sensor type selector at 0x5B (which basically means +that they swapped registers 0x59 and 0x5B when you compare with Winbond +chips). +Revision 1 of the chip also has the temperature sensor type selector at +0x5B, but PWM registers have no effect. + +We don't know exactly how the temperature sensor type selection works. +Looks like bits 1-0 are for temp1, bits 3-2 for temp2 and bits 5-4 for +temp3, although it is possible that only the most significant bit matters +each time. So far, values other than 0 always broke the readings. + +PWM registers seem to be split in two parts: bit 7 is a mode selector, +while the other bits seem to define a value or threshold. + +When bit 7 is clear, bits 6-0 seem to hold a threshold value. If the value +is below a given limit, the fan runs at low speed. If the value is above +the limit, the fan runs at full speed. We have no clue as to what the limit +represents. Note that there seem to be some inertia in this mode, speed +changes may need some time to trigger. Also, an hysteresis mechanism is +suspected since walking through all the values increasingly and then +decreasingly led to slightly different limits. + +When bit 7 is set, bits 3-0 seem to hold a threshold value, while bits 6-4 +would not be significant. If the value is below a given limit, the fan runs +at full speed, while if it is above the limit it runs at low speed (so this +is the contrary of the other mode, in a way). Here again, we don't know +what the limit is supposed to represent. + +One remarkable thing is that the fans would only have two or three +different speeds (transitional states left apart), not a whole range as +you usually get with PWM. + +As a conclusion, you can write 0x00 or 0x8F to the PWM registers to make +fans run at low speed, and 0x7F or 0x80 to make them run at full speed. + +Please contact us if you can figure out how it is supposed to work. As +long as we don't know more, the w83781d driver doesn't handle PWM on +AS99127F chips at all. + +Additional info about PWM on the AS99127F rev.1 by Hector Martin: + +I've been fiddling around with the (in)famous 0x59 register and +found out the following values do work as a form of coarse pwm: + +0x80 - seems to turn fans off after some time(1-2 minutes)... might be +some form of auto-fan-control based on temp? hmm (Qfan? this mobo is an +old ASUS, it isn't marketed as Qfan. Maybe some beta pre-attemp at Qfan +that was dropped at the BIOS) +0x81 - off +0x82 - slightly "on-ner" than off, but my fans do not get to move. I can +hear the high-pitched PWM sound that motors give off at too-low-pwm. +0x83 - now they do move. Estimate about 70% speed or so. +0x84-0x8f - full on + +Changing the high nibble doesn't seem to do much except the high bit +(0x80) must be set for PWM to work, else the current pwm doesn't seem to +change. + +My mobo is an ASUS A7V266-E. This behavior is similar to what I got +with speedfan under Windows, where 0-15% would be off, 15-2x% (can't +remember the exact value) would be 70% and higher would be full on. diff --git a/Documentation/hwmon/w83l785ts b/Documentation/hwmon/w83l785ts new file mode 100644 index 000000000000..1841cedc25b2 --- /dev/null +++ b/Documentation/hwmon/w83l785ts @@ -0,0 +1,39 @@ +Kernel driver w83l785ts +======================= + +Supported chips: + * Winbond W83L785TS-S + Prefix: 'w83l785ts' + Addresses scanned: I2C 0x2e + Datasheet: Publicly available at the Winbond USA website + http://www.winbond-usa.com/products/winbond_products/pdfs/PCIC/W83L785TS-S.pdf + +Authors: + Jean Delvare <khali@linux-fr.org> + +Description +----------- + +The W83L785TS-S is a digital temperature sensor. It senses the +temperature of a single external diode. The high limit is +theoretically defined as 85 or 100 degrees C through a combination +of external resistors, so the user cannot change it. Values seen so +far suggest that the two possible limits are actually 95 and 110 +degrees C. The datasheet is rather poor and obviously inaccurate +on several points including this one. + +All temperature values are given in degrees Celsius. Resolution +is 1.0 degree. See the datasheet for details. + +The w83l785ts driver will not update its values more frequently than +every other second; reading them more often will do no harm, but will +return 'old' values. + +Known Issues +------------ + +On some systems (Asus), the BIOS is known to interfere with the driver +and cause read errors. The driver will retry a given number of times +(5 by default) and then give up, returning the old value (or 0 if +there is no old value). It seems to work well enough so that you should +not notice anything. Thanks to James Bolt for helping test this feature. diff --git a/Documentation/i2c/busses/i2c-sis69x b/Documentation/i2c/busses/i2c-sis69x index 5be48769f65b..b88953dfd580 100644 --- a/Documentation/i2c/busses/i2c-sis69x +++ b/Documentation/i2c/busses/i2c-sis69x @@ -42,7 +42,7 @@ I suspect that this driver could be made to work for the following SiS chipsets as well: 635, and 635T. If anyone owns a board with those chips AND is willing to risk crashing & burning an otherwise well-behaved kernel in the name of progress... please contact me at <mhoffman@lightlink.com> or -via the project's mailing list: <sensors@stimpy.netroedge.com>. Please +via the project's mailing list: <lm-sensors@lm-sensors.org>. Please send bug reports and/or success stories as well. diff --git a/Documentation/i2c/chips/eeprom b/Documentation/i2c/chips/eeprom new file mode 100644 index 000000000000..f7e8104b5764 --- /dev/null +++ b/Documentation/i2c/chips/eeprom @@ -0,0 +1,96 @@ +Kernel driver eeprom +==================== + +Supported chips: + * Any EEPROM chip in the designated address range + Prefix: 'eeprom' + Addresses scanned: I2C 0x50 - 0x57 + Datasheets: Publicly available from: + Atmel (www.atmel.com), + Catalyst (www.catsemi.com), + Fairchild (www.fairchildsemi.com), + Microchip (www.microchip.com), + Philips (www.semiconductor.philips.com), + Rohm (www.rohm.com), + ST (www.st.com), + Xicor (www.xicor.com), + and others. + + Chip Size (bits) Address + 24C01 1K 0x50 (shadows at 0x51 - 0x57) + 24C01A 1K 0x50 - 0x57 (Typical device on DIMMs) + 24C02 2K 0x50 - 0x57 + 24C04 4K 0x50, 0x52, 0x54, 0x56 + (additional data at 0x51, 0x53, 0x55, 0x57) + 24C08 8K 0x50, 0x54 (additional data at 0x51, 0x52, + 0x53, 0x55, 0x56, 0x57) + 24C16 16K 0x50 (additional data at 0x51 - 0x57) + Sony 2K 0x57 + + Atmel 34C02B 2K 0x50 - 0x57, SW write protect at 0x30-37 + Catalyst 34FC02 2K 0x50 - 0x57, SW write protect at 0x30-37 + Catalyst 34RC02 2K 0x50 - 0x57, SW write protect at 0x30-37 + Fairchild 34W02 2K 0x50 - 0x57, SW write protect at 0x30-37 + Microchip 24AA52 2K 0x50 - 0x57, SW write protect at 0x30-37 + ST M34C02 2K 0x50 - 0x57, SW write protect at 0x30-37 + + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Jean Delvare <khali@linux-fr.org>, + Greg Kroah-Hartman <greg@kroah.com>, + IBM Corp. + +Description +----------- + +This is a simple EEPROM module meant to enable reading the first 256 bytes +of an EEPROM (on a SDRAM DIMM for example). However, it will access serial +EEPROMs on any I2C adapter. The supported devices are generically called +24Cxx, and are listed above; however the numbering for these +industry-standard devices may vary by manufacturer. + +This module was a programming exercise to get used to the new project +organization laid out by Frodo, but it should be at least completely +effective for decoding the contents of EEPROMs on DIMMs. + +DIMMS will typically contain a 24C01A or 24C02, or the 34C02 variants. +The other devices will not be found on a DIMM because they respond to more +than one address. + +DDC Monitors may contain any device. Often a 24C01, which responds to all 8 +addresses, is found. + +Recent Sony Vaio laptops have an EEPROM at 0x57. We couldn't get the +specification, so it is guess work and far from being complete. + +The Microchip 24AA52/24LCS52, ST M34C02, and others support an additional +software write protect register at 0x30 - 0x37 (0x20 less than the memory +location). The chip responds to "write quick" detection at this address but +does not respond to byte reads. If this register is present, the lower 128 +bytes of the memory array are not write protected. Any byte data write to +this address will write protect the memory array permanently, and the +device will no longer respond at the 0x30-37 address. The eeprom driver +does not support this register. + +Lacking functionality: + +* Full support for larger devices (24C04, 24C08, 24C16). These are not +typically found on a PC. These devices will appear as separate devices at +multiple addresses. + +* Support for really large devices (24C32, 24C64, 24C128, 24C256, 24C512). +These devices require two-byte address fields and are not supported. + +* Enable Writing. Again, no technical reason why not, but making it easy +to change the contents of the EEPROMs (on DIMMs anyway) also makes it easy +to disable the DIMMs (potentially preventing the computer from booting) +until the values are restored somehow. + +Use: + +After inserting the module (and any other required SMBus/i2c modules), you +should have some EEPROM directories in /sys/bus/i2c/devices/* of names such +as "0-0050". Inside each of these is a series of files, the eeprom file +contains the binary data from EEPROM. diff --git a/Documentation/i2c/chips/max6875 b/Documentation/i2c/chips/max6875 new file mode 100644 index 000000000000..b02002898a09 --- /dev/null +++ b/Documentation/i2c/chips/max6875 @@ -0,0 +1,66 @@ +Kernel driver max6875 +===================== + +Supported chips: + * Maxim MAX6874, MAX6875 + Prefix: 'max6875' + Addresses scanned: 0x50, 0x52 + Datasheet: + http://pdfserv.maxim-ic.com/en/ds/MAX6874-MAX6875.pdf + +Author: Ben Gardner <bgardner@wabtec.com> + + +Module Parameters +----------------- + +* allow_write int + Set to non-zero to enable write permission: + *0: Read only + 1: Read and write + + +Description +----------- + +The Maxim MAX6875 is an EEPROM-programmable power-supply sequencer/supervisor. +It provides timed outputs that can be used as a watchdog, if properly wired. +It also provides 512 bytes of user EEPROM. + +At reset, the MAX6875 reads the configuration EEPROM into its configuration +registers. The chip then begins to operate according to the values in the +registers. + +The Maxim MAX6874 is a similar, mostly compatible device, with more intputs +and outputs: + + vin gpi vout +MAX6874 6 4 8 +MAX6875 4 3 5 + +MAX6874 chips can have four different addresses (as opposed to only two for +the MAX6875). The additional addresses (0x54 and 0x56) are not probed by +this driver by default, but the probe module parameter can be used if +needed. + +See the datasheet for details on how to program the EEPROM. + + +Sysfs entries +------------- + +eeprom_user - 512 bytes of user-defined EEPROM space. Only writable if + allow_write was set and register 0x43 is 0. + +eeprom_config - 70 bytes of config EEPROM. Note that changes will not get + loaded into register space until a power cycle or device reset. + +reg_config - 70 bytes of register space. Any changes take affect immediately. + + +General Remarks +--------------- + +A typical application will require that the EEPROMs be programmed once and +never altered afterwards. + diff --git a/Documentation/i2c/chips/pca9539 b/Documentation/i2c/chips/pca9539 new file mode 100644 index 000000000000..c4fce6a13537 --- /dev/null +++ b/Documentation/i2c/chips/pca9539 @@ -0,0 +1,47 @@ +Kernel driver pca9539 +===================== + +Supported chips: + * Philips PCA9539 + Prefix: 'pca9539' + Addresses scanned: 0x74 - 0x77 + Datasheet: + http://www.semiconductors.philips.com/acrobat/datasheets/PCA9539_2.pdf + +Author: Ben Gardner <bgardner@wabtec.com> + + +Description +----------- + +The Philips PCA9539 is a 16 bit low power I/O device. +All 16 lines can be individually configured as an input or output. +The input sense can also be inverted. +The 16 lines are split between two bytes. + + +Sysfs entries +------------- + +Each is a byte that maps to the 8 I/O bits. +A '0' suffix is for bits 0-7, while '1' is for bits 8-15. + +input[01] - read the current value +output[01] - sets the output value +direction[01] - direction of each bit: 1=input, 0=output +invert[01] - toggle the input bit sense + +input reads the actual state of the line and is always available. +The direction defaults to input for all channels. + + +General Remarks +--------------- + +Note that each output, direction, and invert entry controls 8 lines. +You should use the read, modify, write sequence. +For example. to set output bit 0 of 1. + val=$(cat output0) + val=$(( $val | 1 )) + echo $val > output0 + diff --git a/Documentation/i2c/chips/pcf8574 b/Documentation/i2c/chips/pcf8574 new file mode 100644 index 000000000000..2752c8ce3167 --- /dev/null +++ b/Documentation/i2c/chips/pcf8574 @@ -0,0 +1,69 @@ +Kernel driver pcf8574 +===================== + +Supported chips: + * Philips PCF8574 + Prefix: 'pcf8574' + Addresses scanned: I2C 0x20 - 0x27 + Datasheet: Publicly available at the Philips Semiconductors website + http://www.semiconductors.philips.com/pip/PCF8574P.html + + * Philips PCF8574A + Prefix: 'pcf8574a' + Addresses scanned: I2C 0x38 - 0x3f + Datasheet: Publicly available at the Philips Semiconductors website + http://www.semiconductors.philips.com/pip/PCF8574P.html + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Dan Eaton <dan.eaton@rocketlogix.com>, + Aurelien Jarno <aurelien@aurel32.net>, + Jean Delvare <khali@linux-fr.org>, + + +Description +----------- +The PCF8574(A) is an 8-bit I/O expander for the I2C bus produced by Philips +Semiconductors. It is designed to provide a byte I2C interface to up to 16 +separate devices (8 x PCF8574 and 8 x PCF8574A). + +This device consists of a quasi-bidirectional port. Each of the eight I/Os +can be independently used as an input or output. To setup an I/O as an +input, you have to write a 1 to the corresponding output. + +For more informations see the datasheet. + + +Accessing PCF8574(A) via /sys interface +------------------------------------- + +! Be careful ! +The PCF8574(A) is plainly impossible to detect ! Stupid chip. +So every chip with address in the interval [20..27] and [38..3f] are +detected as PCF8574(A). If you have other chips in this address +range, the workaround is to load this module after the one +for your others chips. + +On detection (i.e. insmod, modprobe et al.), directories are being +created for each detected PCF8574(A): + +/sys/bus/i2c/devices/<0>-<1>/ +where <0> is the bus the chip was detected on (e. g. i2c-0) +and <1> the chip address ([20..27] or [38..3f]): + +(example: /sys/bus/i2c/devices/1-0020/) + +Inside these directories, there are two files each: +read and write (and one file with chip name). + +The read file is read-only. Reading gives you the current I/O input +if the corresponding output is set as 1, otherwise the current output +value, that is to say 0. + +The write file is read/write. Writing a value outputs it on the I/O +port. Reading returns the last written value. + +On module initialization the chip is configured as eight inputs (all +outputs to 1), so you can connect any circuit to the PCF8574(A) without +being afraid of short-circuit. diff --git a/Documentation/i2c/chips/pcf8591 b/Documentation/i2c/chips/pcf8591 new file mode 100644 index 000000000000..5628fcf4207f --- /dev/null +++ b/Documentation/i2c/chips/pcf8591 @@ -0,0 +1,90 @@ +Kernel driver pcf8591 +===================== + +Supported chips: + * Philips PCF8591 + Prefix: 'pcf8591' + Addresses scanned: I2C 0x48 - 0x4f + Datasheet: Publicly available at the Philips Semiconductor website + http://www.semiconductors.philips.com/pip/PCF8591P.html + +Authors: + Aurelien Jarno <aurelien@aurel32.net> + valuable contributions by Jan M. Sendler <sendler@sendler.de>, + Jean Delvare <khali@linux-fr.org> + + +Description +----------- +The PCF8591 is an 8-bit A/D and D/A converter (4 analog inputs and one +analog output) for the I2C bus produced by Philips Semiconductors. It +is designed to provide a byte I2C interface to up to 4 separate devices. + +The PCF8591 has 4 analog inputs programmable as single-ended or +differential inputs : +- mode 0 : four single ended inputs + Pins AIN0 to AIN3 are single ended inputs for channels 0 to 3 + +- mode 1 : three differential inputs + Pins AIN3 is the common negative differential input + Pins AIN0 to AIN2 are positive differential inputs for channels 0 to 2 + +- mode 2 : single ended and differential mixed + Pins AIN0 and AIN1 are single ended inputs for channels 0 and 1 + Pins AIN2 is the positive differential input for channel 3 + Pins AIN3 is the negative differential input for channel 3 + +- mode 3 : two differential inputs + Pins AIN0 is the positive differential input for channel 0 + Pins AIN1 is the negative differential input for channel 0 + Pins AIN2 is the positive differential input for channel 1 + Pins AIN3 is the negative differential input for channel 1 + +See the datasheet for details. + +Module parameters +----------------- + +* input_mode int + + Analog input mode: + 0 = four single ended inputs + 1 = three differential inputs + 2 = single ended and differential mixed + 3 = two differential inputs + + +Accessing PCF8591 via /sys interface +------------------------------------- + +! Be careful ! +The PCF8591 is plainly impossible to detect ! Stupid chip. +So every chip with address in the interval [48..4f] is +detected as PCF8591. If you have other chips in this address +range, the workaround is to load this module after the one +for your others chips. + +On detection (i.e. insmod, modprobe et al.), directories are being +created for each detected PCF8591: + +/sys/bus/devices/<0>-<1>/ +where <0> is the bus the chip was detected on (e. g. i2c-0) +and <1> the chip address ([48..4f]) + +Inside these directories, there are such files: +in0, in1, in2, in3, out0_enable, out0_output, name + +Name contains chip name. + +The in0, in1, in2 and in3 files are RO. Reading gives the value of the +corresponding channel. Depending on the current analog inputs configuration, +files in2 and/or in3 do not exist. Values range are from 0 to 255 for single +ended inputs and -128 to +127 for differential inputs (8-bit ADC). + +The out0_enable file is RW. Reading gives "1" for analog output enabled and +"0" for analog output disabled. Writing accepts "0" and "1" accordingly. + +The out0_output file is RW. Writing a number between 0 and 255 (8-bit DAC), send +the value to the digital-to-analog converter. Note that a voltage will +only appears on AOUT pin if aout0_enable equals 1. Reading returns the last +value written. diff --git a/Documentation/i2c/dev-interface b/Documentation/i2c/dev-interface index 09d6cda2a1fb..b849ad636583 100644 --- a/Documentation/i2c/dev-interface +++ b/Documentation/i2c/dev-interface @@ -14,9 +14,12 @@ C example ========= So let's say you want to access an i2c adapter from a C program. The -first thing to do is `#include <linux/i2c.h>" and "#include <linux/i2c-dev.h>. -Yes, I know, you should never include kernel header files, but until glibc -knows about i2c, there is not much choice. +first thing to do is "#include <linux/i2c-dev.h>". Please note that +there are two files named "i2c-dev.h" out there, one is distributed +with the Linux kernel and is meant to be included from kernel +driver code, the other one is distributed with lm_sensors and is +meant to be included from user-space programs. You obviously want +the second one here. Now, you have to decide which adapter you want to access. You should inspect /sys/class/i2c-dev/ to decide this. Adapter numbers are assigned @@ -78,7 +81,7 @@ Full interface description ========================== The following IOCTLs are defined and fully supported -(see also i2c-dev.h and i2c.h): +(see also i2c-dev.h): ioctl(file,I2C_SLAVE,long addr) Change slave address. The address is passed in the 7 lower bits of the @@ -97,10 +100,10 @@ ioctl(file,I2C_PEC,long select) ioctl(file,I2C_FUNCS,unsigned long *funcs) Gets the adapter functionality and puts it in *funcs. -ioctl(file,I2C_RDWR,struct i2c_ioctl_rdwr_data *msgset) +ioctl(file,I2C_RDWR,struct i2c_rdwr_ioctl_data *msgset) Do combined read/write transaction without stop in between. - The argument is a pointer to a struct i2c_ioctl_rdwr_data { + The argument is a pointer to a struct i2c_rdwr_ioctl_data { struct i2c_msg *msgs; /* ptr to array of simple messages */ int nmsgs; /* number of messages to exchange */ diff --git a/Documentation/i2c/porting-clients b/Documentation/i2c/porting-clients index 56404918eabc..a7adbdd9ea8a 100644 --- a/Documentation/i2c/porting-clients +++ b/Documentation/i2c/porting-clients @@ -57,7 +57,7 @@ Technical changes: Documentation/i2c/sysfs-interface for the individual files. Also convert the units these files read and write to the specified ones. If you need to add a new type of file, please discuss it on the - sensors mailing list <sensors@stimpy.netroedge.com> by providing a + sensors mailing list <lm-sensors@lm-sensors.org> by providing a patch to the Documentation/i2c/sysfs-interface file. * [Attach] For I2C drivers, the attach function should make sure diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients index ad27511e3c7d..91664be91ffc 100644 --- a/Documentation/i2c/writing-clients +++ b/Documentation/i2c/writing-clients @@ -27,7 +27,6 @@ address. static struct i2c_driver foo_driver = { .owner = THIS_MODULE, .name = "Foo version 2.3 driver", - .id = I2C_DRIVERID_FOO, /* from i2c-id.h, optional */ .flags = I2C_DF_NOTIFY, .attach_adapter = &foo_attach_adapter, .detach_client = &foo_detach_client, @@ -37,12 +36,6 @@ static struct i2c_driver foo_driver = { The name can be chosen freely, and may be upto 40 characters long. Please use something descriptive here. -If used, the id should be a unique ID. The range 0xf000 to 0xffff is -reserved for local use, and you can use one of those until you start -distributing the driver, at which time you should contact the i2c authors -to get your own ID(s). Note that most of the time you don't need an ID -at all so you can just omit it. - Don't worry about the flags field; just put I2C_DF_NOTIFY into it. This means that your driver will be notified when new adapters are found. This is almost always what you want. @@ -171,45 +164,31 @@ The following lists are used internally: normal_i2c: filled in by the module writer. A list of I2C addresses which should normally be examined. - normal_i2c_range: filled in by the module writer. - A list of pairs of I2C addresses, each pair being an inclusive range of - addresses which should normally be examined. probe: insmod parameter. A list of pairs. The first value is a bus number (-1 for any I2C bus), the second is the address. These addresses are also probed, as if they were in the 'normal' list. - probe_range: insmod parameter. - A list of triples. The first value is a bus number (-1 for any I2C bus), - the second and third are addresses. These form an inclusive range of - addresses that are also probed, as if they were in the 'normal' list. ignore: insmod parameter. A list of pairs. The first value is a bus number (-1 for any I2C bus), the second is the I2C address. These addresses are never probed. This parameter overrules 'normal' and 'probe', but not the 'force' lists. - ignore_range: insmod parameter. - A list of triples. The first value is a bus number (-1 for any I2C bus), - the second and third are addresses. These form an inclusive range of - I2C addresses that are never probed. - This parameter overrules 'normal' and 'probe', but not the 'force' lists. force: insmod parameter. A list of pairs. The first value is a bus number (-1 for any I2C bus), the second is the I2C address. A device is blindly assumed to be on the given address, no probing is done. -Fortunately, as a module writer, you just have to define the `normal' -and/or `normal_range' parameters. The complete declaration could look -like this: +Fortunately, as a module writer, you just have to define the `normal_i2c' +parameter. The complete declaration could look like this: - /* Scan 0x20 to 0x2f, 0x37, and 0x40 to 0x4f */ - static unsigned short normal_i2c[] = { 0x37,I2C_CLIENT_END }; - static unsigned short normal_i2c_range[] = { 0x20, 0x2f, 0x40, 0x4f, - I2C_CLIENT_END }; + /* Scan 0x37, and 0x48 to 0x4f */ + static unsigned short normal_i2c[] = { 0x37, 0x48, 0x49, 0x4a, 0x4b, 0x4c, + 0x4d, 0x4e, 0x4f, I2C_CLIENT_END }; /* Magic definition of all other variables and things */ I2C_CLIENT_INSMOD; -Note that you *have* to call the two defined variables `normal_i2c' and -`normal_i2c_range', without any prefix! +Note that you *have* to call the defined variable `normal_i2c', +without any prefix! Probing classes (sensors) @@ -223,39 +202,17 @@ The following lists are used internally. They are all lists of integers. normal_i2c: filled in by the module writer. Terminated by SENSORS_I2C_END. A list of I2C addresses which should normally be examined. - normal_i2c_range: filled in by the module writer. Terminated by - SENSORS_I2C_END - A list of pairs of I2C addresses, each pair being an inclusive range of - addresses which should normally be examined. normal_isa: filled in by the module writer. Terminated by SENSORS_ISA_END. A list of ISA addresses which should normally be examined. - normal_isa_range: filled in by the module writer. Terminated by - SENSORS_ISA_END - A list of triples. The first two elements are ISA addresses, being an - range of addresses which should normally be examined. The third is the - modulo parameter: only addresses which are 0 module this value relative - to the first address of the range are actually considered. probe: insmod parameter. Initialize this list with SENSORS_I2C_END values. A list of pairs. The first value is a bus number (SENSORS_ISA_BUS for the ISA bus, -1 for any I2C bus), the second is the address. These addresses are also probed, as if they were in the 'normal' list. - probe_range: insmod parameter. Initialize this list with SENSORS_I2C_END - values. - A list of triples. The first value is a bus number (SENSORS_ISA_BUS for - the ISA bus, -1 for any I2C bus), the second and third are addresses. - These form an inclusive range of addresses that are also probed, as - if they were in the 'normal' list. ignore: insmod parameter. Initialize this list with SENSORS_I2C_END values. A list of pairs. The first value is a bus number (SENSORS_ISA_BUS for the ISA bus, -1 for any I2C bus), the second is the I2C address. These addresses are never probed. This parameter overrules 'normal' and 'probe', but not the 'force' lists. - ignore_range: insmod parameter. Initialize this list with SENSORS_I2C_END - values. - A list of triples. The first value is a bus number (SENSORS_ISA_BUS for - the ISA bus, -1 for any I2C bus), the second and third are addresses. - These form an inclusive range of I2C addresses that are never probed. - This parameter overrules 'normal' and 'probe', but not the 'force' lists. Also used is a list of pointers to sensors_force_data structures: force_data: insmod parameters. A list, ending with an element of which @@ -269,16 +226,14 @@ Also used is a list of pointers to sensors_force_data structures: So we have a generic insmod variabled `force', and chip-specific variables `force_CHIPNAME'. -Fortunately, as a module writer, you just have to define the `normal' -and/or `normal_range' parameters, and define what chip names are used. +Fortunately, as a module writer, you just have to define the `normal_i2c' +and `normal_isa' parameters, and define what chip names are used. The complete declaration could look like this: - /* Scan i2c addresses 0x20 to 0x2f, 0x37, and 0x40 to 0x4f - static unsigned short normal_i2c[] = {0x37,SENSORS_I2C_END}; - static unsigned short normal_i2c_range[] = {0x20,0x2f,0x40,0x4f, - SENSORS_I2C_END}; + /* Scan i2c addresses 0x37, and 0x48 to 0x4f */ + static unsigned short normal_i2c[] = { 0x37, 0x48, 0x49, 0x4a, 0x4b, 0x4c, + 0x4d, 0x4e, 0x4f, I2C_CLIENT_END }; /* Scan ISA address 0x290 */ static unsigned int normal_isa[] = {0x0290,SENSORS_ISA_END}; - static unsigned int normal_isa_range[] = {SENSORS_ISA_END}; /* Define chips foo and bar, as well as all module parameters and things */ SENSORS_INSMOD_2(foo,bar); diff --git a/Documentation/infiniband/core_locking.txt b/Documentation/infiniband/core_locking.txt new file mode 100644 index 000000000000..e1678542279a --- /dev/null +++ b/Documentation/infiniband/core_locking.txt @@ -0,0 +1,114 @@ +INFINIBAND MIDLAYER LOCKING + + This guide is an attempt to make explicit the locking assumptions + made by the InfiniBand midlayer. It describes the requirements on + both low-level drivers that sit below the midlayer and upper level + protocols that use the midlayer. + +Sleeping and interrupt context + + With the following exceptions, a low-level driver implementation of + all of the methods in struct ib_device may sleep. The exceptions + are any methods from the list: + + create_ah + modify_ah + query_ah + destroy_ah + bind_mw + post_send + post_recv + poll_cq + req_notify_cq + map_phys_fmr + + which may not sleep and must be callable from any context. + + The corresponding functions exported to upper level protocol + consumers: + + ib_create_ah + ib_modify_ah + ib_query_ah + ib_destroy_ah + ib_bind_mw + ib_post_send + ib_post_recv + ib_req_notify_cq + ib_map_phys_fmr + + are therefore safe to call from any context. + + In addition, the function + + ib_dispatch_event + + used by low-level drivers to dispatch asynchronous events through + the midlayer is also safe to call from any context. + +Reentrancy + + All of the methods in struct ib_device exported by a low-level + driver must be fully reentrant. The low-level driver is required to + perform all synchronization necessary to maintain consistency, even + if multiple function calls using the same object are run + simultaneously. + + The IB midlayer does not perform any serialization of function calls. + + Because low-level drivers are reentrant, upper level protocol + consumers are not required to perform any serialization. However, + some serialization may be required to get sensible results. For + example, a consumer may safely call ib_poll_cq() on multiple CPUs + simultaneously. However, the ordering of the work completion + information between different calls of ib_poll_cq() is not defined. + +Callbacks + + A low-level driver must not perform a callback directly from the + same callchain as an ib_device method call. For example, it is not + allowed for a low-level driver to call a consumer's completion event + handler directly from its post_send method. Instead, the low-level + driver should defer this callback by, for example, scheduling a + tasklet to perform the callback. + + The low-level driver is responsible for ensuring that multiple + completion event handlers for the same CQ are not called + simultaneously. The driver must guarantee that only one CQ event + handler for a given CQ is running at a time. In other words, the + following situation is not allowed: + + CPU1 CPU2 + + low-level driver -> + consumer CQ event callback: + /* ... */ + ib_req_notify_cq(cq, ...); + low-level driver -> + /* ... */ consumer CQ event callback: + /* ... */ + return from CQ event handler + + The context in which completion event and asynchronous event + callbacks run is not defined. Depending on the low-level driver, it + may be process context, softirq context, or interrupt context. + Upper level protocol consumers may not sleep in a callback. + +Hot-plug + + A low-level driver announces that a device is ready for use by + consumers when it calls ib_register_device(), all initialization + must be complete before this call. The device must remain usable + until the driver's call to ib_unregister_device() has returned. + + A low-level driver must call ib_register_device() and + ib_unregister_device() from process context. It must not hold any + semaphores that could cause deadlock if a consumer calls back into + the driver across these calls. + + An upper level protocol consumer may begin using an IB device as + soon as the add method of its struct ib_client is called for that + device. A consumer must finish all cleanup and free all resources + relating to a device before returning from the remove method. + + A consumer is permitted to sleep in its add and remove methods. diff --git a/Documentation/infiniband/user_mad.txt b/Documentation/infiniband/user_mad.txt index cae0c83f1ee9..750fe5e80ebc 100644 --- a/Documentation/infiniband/user_mad.txt +++ b/Documentation/infiniband/user_mad.txt @@ -28,13 +28,37 @@ Creating MAD agents Receiving MADs - MADs are received using read(). The buffer passed to read() must be - large enough to hold at least one struct ib_user_mad. For example: - - struct ib_user_mad mad; - ret = read(fd, &mad, sizeof mad); - if (ret != sizeof mad) + MADs are received using read(). The receive side now supports + RMPP. The buffer passed to read() must be at least one + struct ib_user_mad + 256 bytes. For example: + + If the buffer passed is not large enough to hold the received + MAD (RMPP), the errno is set to ENOSPC and the length of the + buffer needed is set in mad.length. + + Example for normal MAD (non RMPP) reads: + struct ib_user_mad *mad; + mad = malloc(sizeof *mad + 256); + ret = read(fd, mad, sizeof *mad + 256); + if (ret != sizeof mad + 256) { + perror("read"); + free(mad); + } + + Example for RMPP reads: + struct ib_user_mad *mad; + mad = malloc(sizeof *mad + 256); + ret = read(fd, mad, sizeof *mad + 256); + if (ret == -ENOSPC)) { + length = mad.length; + free(mad); + mad = malloc(sizeof *mad + length); + ret = read(fd, mad, sizeof *mad + length); + } + if (ret < 0) { perror("read"); + free(mad); + } In addition to the actual MAD contents, the other struct ib_user_mad fields will be filled in with information on the received MAD. For @@ -50,18 +74,21 @@ Sending MADs MADs are sent using write(). The agent ID for sending should be filled into the id field of the MAD, the destination LID should be - filled into the lid field, and so on. For example: + filled into the lid field, and so on. The send side does support + RMPP so arbitrary length MAD can be sent. For example: + + struct ib_user_mad *mad; - struct ib_user_mad mad; + mad = malloc(sizeof *mad + mad_length); - /* fill in mad.data */ + /* fill in mad->data */ - mad.id = my_agent; /* req.id from agent registration */ - mad.lid = my_dest; /* in network byte order... */ + mad->hdr.id = my_agent; /* req.id from agent registration */ + mad->hdr.lid = my_dest; /* in network byte order... */ /* etc. */ - ret = write(fd, &mad, sizeof mad); - if (ret != sizeof mad) + ret = write(fd, &mad, sizeof *mad + mad_length); + if (ret != sizeof *mad + mad_length) perror("write"); Setting IsSM Capability Bit diff --git a/Documentation/infiniband/user_verbs.txt b/Documentation/infiniband/user_verbs.txt new file mode 100644 index 000000000000..f847501e50b5 --- /dev/null +++ b/Documentation/infiniband/user_verbs.txt @@ -0,0 +1,69 @@ +USERSPACE VERBS ACCESS + + The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS, + enables direct userspace access to IB hardware via "verbs," as + described in chapter 11 of the InfiniBand Architecture Specification. + + To use the verbs, the libibverbs library, available from + <http://openib.org/>, is required. libibverbs contains a + device-independent API for using the ib_uverbs interface. + libibverbs also requires appropriate device-dependent kernel and + userspace driver for your InfiniBand hardware. For example, to use + a Mellanox HCA, you will need the ib_mthca kernel module and the + libmthca userspace driver be installed. + +User-kernel communication + + Userspace communicates with the kernel for slow path, resource + management operations via the /dev/infiniband/uverbsN character + devices. Fast path operations are typically performed by writing + directly to hardware registers mmap()ed into userspace, with no + system call or context switch into the kernel. + + Commands are sent to the kernel via write()s on these device files. + The ABI is defined in drivers/infiniband/include/ib_user_verbs.h. + The structs for commands that require a response from the kernel + contain a 64-bit field used to pass a pointer to an output buffer. + Status is returned to userspace as the return value of the write() + system call. + +Resource management + + Since creation and destruction of all IB resources is done by + commands passed through a file descriptor, the kernel can keep track + of which resources are attached to a given userspace context. The + ib_uverbs module maintains idr tables that are used to translate + between kernel pointers and opaque userspace handles, so that kernel + pointers are never exposed to userspace and userspace cannot trick + the kernel into following a bogus pointer. + + This also allows the kernel to clean up when a process exits and + prevent one process from touching another process's resources. + +Memory pinning + + Direct userspace I/O requires that memory regions that are potential + I/O targets be kept resident at the same physical address. The + ib_uverbs module manages pinning and unpinning memory regions via + get_user_pages() and put_page() calls. It also accounts for the + amount of memory pinned in the process's locked_vm, and checks that + unprivileged processes do not exceed their RLIMIT_MEMLOCK limit. + + Pages that are pinned multiple times are counted each time they are + pinned, so the value of locked_vm may be an overestimate of the + number of pages pinned by a process. + +/dev files + + To create the appropriate character device files automatically with + udev, a rule like + + KERNEL="uverbs*", NAME="infiniband/%k" + + can be used. This will create device nodes named + + /dev/infiniband/uverbs0 + + and so on. Since the InfiniBand userspace verbs should be safe for + use by non-privileged processes, it may be useful to add an + appropriate MODE or GROUP to the udev rule. diff --git a/Documentation/kdump/gdbmacros.txt b/Documentation/kdump/gdbmacros.txt new file mode 100644 index 000000000000..bc1b9eb92ae1 --- /dev/null +++ b/Documentation/kdump/gdbmacros.txt @@ -0,0 +1,179 @@ +# +# This file contains a few gdb macros (user defined commands) to extract +# useful information from kernel crashdump (kdump) like stack traces of +# all the processes or a particular process and trapinfo. +# +# These macros can be used by copying this file in .gdbinit (put in home +# directory or current directory) or by invoking gdb command with +# --command=<command-file-name> option +# +# Credits: +# Alexander Nyberg <alexn@telia.com> +# V Srivatsa <vatsa@in.ibm.com> +# Maneesh Soni <maneesh@in.ibm.com> +# + +define bttnobp + set $tasks_off=((size_t)&((struct task_struct *)0)->tasks) + set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next) + set $init_t=&init_task + set $next_t=(((char *)($init_t->tasks).next) - $tasks_off) + while ($next_t != $init_t) + set $next_t=(struct task_struct *)$next_t + printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm + printf "===================\n" + set var $stackp = $next_t.thread.esp + set var $stack_top = ($stackp & ~4095) + 4096 + + while ($stackp < $stack_top) + if (*($stackp) > _stext && *($stackp) < _sinittext) + info symbol *($stackp) + end + set $stackp += 4 + end + set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off) + while ($next_th != $next_t) + set $next_th=(struct task_struct *)$next_th + printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm + printf "===================\n" + set var $stackp = $next_t.thread.esp + set var $stack_top = ($stackp & ~4095) + 4096 + + while ($stackp < $stack_top) + if (*($stackp) > _stext && *($stackp) < _sinittext) + info symbol *($stackp) + end + set $stackp += 4 + end + set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off) + end + set $next_t=(char *)($next_t->tasks.next) - $tasks_off + end +end +document bttnobp + dump all thread stack traces on a kernel compiled with !CONFIG_FRAME_POINTER +end + +define btt + set $tasks_off=((size_t)&((struct task_struct *)0)->tasks) + set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next) + set $init_t=&init_task + set $next_t=(((char *)($init_t->tasks).next) - $tasks_off) + while ($next_t != $init_t) + set $next_t=(struct task_struct *)$next_t + printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm + printf "===================\n" + set var $stackp = $next_t.thread.esp + set var $stack_top = ($stackp & ~4095) + 4096 + set var $stack_bot = ($stackp & ~4095) + + set $stackp = *($stackp) + while (($stackp < $stack_top) && ($stackp > $stack_bot)) + set var $addr = *($stackp + 4) + info symbol $addr + set $stackp = *($stackp) + end + + set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off) + while ($next_th != $next_t) + set $next_th=(struct task_struct *)$next_th + printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm + printf "===================\n" + set var $stackp = $next_t.thread.esp + set var $stack_top = ($stackp & ~4095) + 4096 + set var $stack_bot = ($stackp & ~4095) + + set $stackp = *($stackp) + while (($stackp < $stack_top) && ($stackp > $stack_bot)) + set var $addr = *($stackp + 4) + info symbol $addr + set $stackp = *($stackp) + end + set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off) + end + set $next_t=(char *)($next_t->tasks.next) - $tasks_off + end +end +document btt + dump all thread stack traces on a kernel compiled with CONFIG_FRAME_POINTER +end + +define btpid + set var $pid = $arg0 + set $tasks_off=((size_t)&((struct task_struct *)0)->tasks) + set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next) + set $init_t=&init_task + set $next_t=(((char *)($init_t->tasks).next) - $tasks_off) + set var $pid_task = 0 + + while ($next_t != $init_t) + set $next_t=(struct task_struct *)$next_t + + if ($next_t.pid == $pid) + set $pid_task = $next_t + end + + set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off) + while ($next_th != $next_t) + set $next_th=(struct task_struct *)$next_th + if ($next_th.pid == $pid) + set $pid_task = $next_th + end + set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off) + end + set $next_t=(char *)($next_t->tasks.next) - $tasks_off + end + + printf "\npid %d; comm %s:\n", $pid_task.pid, $pid_task.comm + printf "===================\n" + set var $stackp = $pid_task.thread.esp + set var $stack_top = ($stackp & ~4095) + 4096 + set var $stack_bot = ($stackp & ~4095) + + set $stackp = *($stackp) + while (($stackp < $stack_top) && ($stackp > $stack_bot)) + set var $addr = *($stackp + 4) + info symbol $addr + set $stackp = *($stackp) + end +end +document btpid + backtrace of pid +end + + +define trapinfo + set var $pid = $arg0 + set $tasks_off=((size_t)&((struct task_struct *)0)->tasks) + set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next) + set $init_t=&init_task + set $next_t=(((char *)($init_t->tasks).next) - $tasks_off) + set var $pid_task = 0 + + while ($next_t != $init_t) + set $next_t=(struct task_struct *)$next_t + + if ($next_t.pid == $pid) + set $pid_task = $next_t + end + + set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off) + while ($next_th != $next_t) + set $next_th=(struct task_struct *)$next_th + if ($next_th.pid == $pid) + set $pid_task = $next_th + end + set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off) + end + set $next_t=(char *)($next_t->tasks.next) - $tasks_off + end + + printf "Trapno %ld, cr2 0x%lx, error_code %ld\n", $pid_task.thread.trap_no, \ + $pid_task.thread.cr2, $pid_task.thread.error_code + +end +document trapinfo + Run info threads and lookup pid of thread #1 + 'trapinfo <pid>' will tell you by which trap & possibly + addresthe kernel paniced. +end diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt new file mode 100644 index 000000000000..7ff213f4becd --- /dev/null +++ b/Documentation/kdump/kdump.txt @@ -0,0 +1,141 @@ +Documentation for kdump - the kexec-based crash dumping solution +================================================================ + +DESIGN +====== + +Kdump uses kexec to reboot to a second kernel whenever a dump needs to be taken. +This second kernel is booted with very little memory. The first kernel reserves +the section of memory that the second kernel uses. This ensures that on-going +DMA from the first kernel does not corrupt the second kernel. + +All the necessary information about Core image is encoded in ELF format and +stored in reserved area of memory before crash. Physical address of start of +ELF header is passed to new kernel through command line parameter elfcorehdr=. + +On i386, the first 640 KB of physical memory is needed to boot, irrespective +of where the kernel loads. Hence, this region is backed up by kexec just before +rebooting into the new kernel. + +In the second kernel, "old memory" can be accessed in two ways. + +- The first one is through a /dev/oldmem device interface. A capture utility + can read the device file and write out the memory in raw format. This is raw + dump of memory and analysis/capture tool should be intelligent enough to + determine where to look for the right information. ELF headers (elfcorehdr=) + can become handy here. + +- The second interface is through /proc/vmcore. This exports the dump as an ELF + format file which can be written out using any file copy command + (cp, scp, etc). Further, gdb can be used to perform limited debugging on + the dump file. This method ensures methods ensure that there is correct + ordering of the dump pages (corresponding to the first 640 KB that has been + relocated). + +SETUP +===== + +1) Download http://www.xmission.com/~ebiederm/files/kexec/kexec-tools-1.101.tar.gz + and apply http://lse.sourceforge.net/kdump/patches/kexec-tools-1.101-kdump.patch + and after that build the source. + +2) Download and build the appropriate (latest) kexec/kdump (-mm) kernel + patchset and apply it to the vanilla kernel tree. + + Two kernels need to be built in order to get this feature working. + + A) First kernel: + a) Enable "kexec system call" feature (in Processor type and features). + CONFIG_KEXEC=y + b) This kernel's physical load address should be the default value of + 0x100000 (0x100000, 1 MB) (in Processor type and features). + CONFIG_PHYSICAL_START=0x100000 + c) Enable "sysfs file system support" (in Pseudo filesystems). + CONFIG_SYSFS=y + d) Boot into first kernel with the command line parameter "crashkernel=Y@X". + Use appropriate values for X and Y. Y denotes how much memory to reserve + for the second kernel, and X denotes at what physical address the reserved + memory section starts. For example: "crashkernel=64M@16M". + + B) Second kernel: + a) Enable "kernel crash dumps" feature (in Processor type and features). + CONFIG_CRASH_DUMP=y + b) Specify a suitable value for "Physical address where the kernel is + loaded" (in Processor type and features). Typically this value + should be same as X (See option d) above, e.g., 16 MB or 0x1000000. + CONFIG_PHYSICAL_START=0x1000000 + c) Enable "/proc/vmcore support" (Optional, in Pseudo filesystems). + CONFIG_PROC_VMCORE=y + d) Disable SMP support and build a UP kernel (Until it is fixed). + CONFIG_SMP=n + e) Enable "Local APIC support on uniprocessors". + CONFIG_X86_UP_APIC=y + f) Enable "IO-APIC support on uniprocessors" + CONFIG_X86_UP_IOAPIC=y + + Note: i) Options a) and b) depend upon "Configure standard kernel features + (for small systems)" (under General setup). + ii) Option a) also depends on CONFIG_HIGHMEM (under Processor + type and features). + iii) Both option a) and b) are under "Processor type and features". + +3) Boot into the first kernel. You are now ready to try out kexec-based crash + dumps. + +4) Load the second kernel to be booted using: + + kexec -p <second-kernel> --crash-dump --args-linux --append="root=<root-dev> + init 1 irqpoll" + + Note: i) <second-kernel> has to be a vmlinux image. bzImage will not work, + as of now. + ii) By default ELF headers are stored in ELF32 format (for i386). This + is sufficient to represent the physical memory up to 4GB. To store + headers in ELF64 format, specifiy "--elf64-core-headers" on the + kexec command line additionally. + iii) Specify "irqpoll" as command line parameter. This reduces driver + initialization failures in second kernel due to shared interrupts. + +5) System reboots into the second kernel when a panic occurs. A module can be + written to force the panic or "ALT-SysRq-c" can be used initiate a crash + dump for testing purposes. + +6) Write out the dump file using + + cp /proc/vmcore <dump-file> + + Dump memory can also be accessed as a /dev/oldmem device for a linear/raw + view. To create the device, type: + + mknod /dev/oldmem c 1 12 + + Use "dd" with suitable options for count, bs and skip to access specific + portions of the dump. + + Entire memory: dd if=/dev/oldmem of=oldmem.001 + +ANALYSIS +======== + +Limited analysis can be done using gdb on the dump file copied out of +/proc/vmcore. Use vmlinux built with -g and run + + gdb vmlinux <dump-file> + +Stack trace for the task on processor 0, register display, memory display +work fine. + +Note: gdb cannot analyse core files generated in ELF64 format for i386. + +TODO +==== + +1) Provide a kernel pages filtering mechanism so that core file size is not + insane on systems having huge memory banks. +2) Modify "crash" tool to make it recognize this dump. + +CONTACT +======= + +Vivek Goyal (vgoyal@in.ibm.com) +Maneesh Soni (maneesh@in.ibm.com) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 4924d387a657..3d5cd7a09b2f 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -37,7 +37,7 @@ restrictions referred to are that the relevant option is valid if: IA-32 IA-32 aka i386 architecture is enabled. IA-64 IA-64 architecture is enabled. IOSCHED More than one I/O scheduler is enabled. - IP_PNP IP DCHP, BOOTP, or RARP is enabled. + IP_PNP IP DHCP, BOOTP, or RARP is enabled. ISAPNP ISA PnP code is enabled. ISDN Appropriate ISDN support is enabled. JOY Appropriate joystick support is enabled. @@ -159,6 +159,11 @@ running once the system is up. acpi_fake_ecdt [HW,ACPI] Workaround failure due to BIOS lacking ECDT + acpi_generic_hotkey [HW,ACPI] + Allow consolidated generic hotkey driver to + over-ride platform specific driver. + See also Documentation/acpi-hotkey.txt. + ad1816= [HW,OSS] Format: <io>,<irq>,<dma>,<dma2> See also Documentation/sound/oss/AD1816. @@ -358,6 +363,10 @@ running once the system is up. cpia_pp= [HW,PPT] Format: { parport<nr> | auto | none } + crashkernel=nn[KMG]@ss[KMG] + [KNL] Reserve a chunk of physical memory to + hold a kernel to switch to with kexec on panic. + cs4232= [HW,OSS] Format: <io>,<irq>,<dma>,<dma2>,<mpuio>,<mpuirq> @@ -447,6 +456,10 @@ running once the system is up. Format: {"as"|"cfq"|"deadline"|"noop"} See Documentation/block/as-iosched.txt and Documentation/block/deadline-iosched.txt for details. + elfcorehdr= [IA-32] + Specifies physical address of start of kernel core image + elf header. + See Documentation/kdump.txt for details. enforcing [SELINUX] Set initial enforcing status. Format: {"0" | "1"} @@ -548,6 +561,9 @@ running once the system is up. i810= [HW,DRM] + i8k.ignore_dmi [HW] Continue probing hardware even if DMI data + indicates that the driver is running on unsupported + hardware. i8k.force [HW] Activate i8k driver even if SMM BIOS signature does not match list of supported models. i8k.power_status @@ -611,6 +627,17 @@ running once the system is up. ips= [HW,SCSI] Adaptec / IBM ServeRAID controller See header of drivers/scsi/ips.c. + irqfixup [HW] + When an interrupt is not handled search all handlers + for it. Intended to get systems with badly broken + firmware running. + + irqpoll [HW] + When an interrupt is not handled search all handlers + for it. Also check all handlers each timer + interrupt. Intended to get systems with badly broken + firmware running. + isapnp= [ISAPNP] Format: <RDP>, <reset>, <pci_scan>, <verbosity> @@ -736,6 +763,9 @@ running once the system is up. maxcpus= [SMP] Maximum number of processors that an SMP kernel should make use of + max_addr=[KMG] [KNL,BOOT,ia64] All physical memory greater than or + equal to this physical address is ignored. + max_luns= [SCSI] Maximum number of LUNs to probe Should be between 1 and 2^32-1. @@ -1019,6 +1049,10 @@ running once the system is up. irqmask=0xMMMM [IA-32] Set a bit mask of IRQs allowed to be assigned automatically to PCI devices. You can make the kernel exclude IRQs of your ISA cards this way. + pirqaddr=0xAAAAA [IA-32] Specify the physical address + of the PIRQ table (normally generated + by the BIOS) if it is outside the + F0000h-100000h range. lastbus=N [IA-32] Scan all buses till bus #N. Can be useful if the kernel is unable to find your secondary buses and you want to tell it explicitly which ones they are. @@ -1104,7 +1138,7 @@ running once the system is up. See Documentation/ramdisk.txt. psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to - probe for (bare|imps|exps). + probe for (bare|imps|exps|lifebook|any). psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports per second. psmouse.resetafter= diff --git a/Documentation/keys.txt b/Documentation/keys.txt index 36d80aeeaf28..0321ded4b9ae 100644 --- a/Documentation/keys.txt +++ b/Documentation/keys.txt @@ -22,6 +22,7 @@ This document has the following sections: - New procfs files - Userspace system call interface - Kernel services + - Notes on accessing payload contents - Defining a key type - Request-key callback service - Key access filesystem @@ -45,27 +46,26 @@ Each key has a number of attributes: - State. - (*) Each key is issued a serial number of type key_serial_t that is unique - for the lifetime of that key. All serial numbers are positive non-zero - 32-bit integers. + (*) Each key is issued a serial number of type key_serial_t that is unique for + the lifetime of that key. All serial numbers are positive non-zero 32-bit + integers. Userspace programs can use a key's serial numbers as a way to gain access to it, subject to permission checking. (*) Each key is of a defined "type". Types must be registered inside the - kernel by a kernel service (such as a filesystem) before keys of that - type can be added or used. Userspace programs cannot define new types - directly. + kernel by a kernel service (such as a filesystem) before keys of that type + can be added or used. Userspace programs cannot define new types directly. - Key types are represented in the kernel by struct key_type. This defines - a number of operations that can be performed on a key of that type. + Key types are represented in the kernel by struct key_type. This defines a + number of operations that can be performed on a key of that type. Should a type be removed from the system, all the keys of that type will be invalidated. (*) Each key has a description. This should be a printable string. The key - type provides an operation to perform a match between the description on - a key and a criterion string. + type provides an operation to perform a match between the description on a + key and a criterion string. (*) Each key has an owner user ID, a group ID and a permissions mask. These are used to control what a process may do to a key from userspace, and @@ -74,10 +74,10 @@ Each key has a number of attributes: (*) Each key can be set to expire at a specific time by the key type's instantiation function. Keys can also be immortal. - (*) Each key can have a payload. This is a quantity of data that represent - the actual "key". In the case of a keyring, this is a list of keys to - which the keyring links; in the case of a user-defined key, it's an - arbitrary blob of data. + (*) Each key can have a payload. This is a quantity of data that represent the + actual "key". In the case of a keyring, this is a list of keys to which + the keyring links; in the case of a user-defined key, it's an arbitrary + blob of data. Having a payload is not required; and the payload can, in fact, just be a value stored in the struct key itself. @@ -92,8 +92,8 @@ Each key has a number of attributes: (*) Each key can be in one of a number of basic states: - (*) Uninstantiated. The key exists, but does not have any data - attached. Keys being requested from userspace will be in this state. + (*) Uninstantiated. The key exists, but does not have any data attached. + Keys being requested from userspace will be in this state. (*) Instantiated. This is the normal state. The key is fully formed, and has data attached. @@ -140,10 +140,10 @@ The key service provides a number of features besides keys: clone, fork, vfork or execve occurs. A new keyring is created only when required. - The process-specific keyring is replaced with an empty one in the child - on clone, fork, vfork unless CLONE_THREAD is supplied, in which case it - is shared. execve also discards the process's process keyring and creates - a new one. + The process-specific keyring is replaced with an empty one in the child on + clone, fork, vfork unless CLONE_THREAD is supplied, in which case it is + shared. execve also discards the process's process keyring and creates a + new one. The session-specific keyring is persistent across clone, fork, vfork and execve, even when the latter executes a set-UID or set-GID binary. A @@ -177,11 +177,11 @@ The key service provides a number of features besides keys: If a system call that modifies a key or keyring in some way would put the user over quota, the operation is refused and error EDQUOT is returned. - (*) There's a system call interface by which userspace programs can create - and manipulate keys and keyrings. + (*) There's a system call interface by which userspace programs can create and + manipulate keys and keyrings. - (*) There's a kernel interface by which services can register types and - search for keys. + (*) There's a kernel interface by which services can register types and search + for keys. (*) There's a way for the a search done from the kernel to call back to userspace to request a key that can't be found in a process's keyrings. @@ -194,9 +194,9 @@ The key service provides a number of features besides keys: KEY ACCESS PERMISSIONS ====================== -Keys have an owner user ID, a group access ID, and a permissions mask. The -mask has up to eight bits each for user, group and other access. Only five of -each set of eight bits are defined. These permissions granted are: +Keys have an owner user ID, a group access ID, and a permissions mask. The mask +has up to eight bits each for user, group and other access. Only five of each +set of eight bits are defined. These permissions granted are: (*) View @@ -210,8 +210,8 @@ each set of eight bits are defined. These permissions granted are: (*) Write - This permits a key's payload to be instantiated or updated, or it allows - a link to be added to or removed from a keyring. + This permits a key's payload to be instantiated or updated, or it allows a + link to be added to or removed from a keyring. (*) Search @@ -238,8 +238,8 @@ about the status of the key service: (*) /proc/keys This lists all the keys on the system, giving information about their - type, description and permissions. The payload of the key is not - available this way: + type, description and permissions. The payload of the key is not available + this way: SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY 00000001 I----- 39 perm 1f0000 0 0 keyring _uid_ses.0: 1/4 @@ -318,21 +318,21 @@ The main syscalls are: If a key of the same type and description as that proposed already exists in the keyring, this will try to update it with the given payload, or it will return error EEXIST if that function is not supported by the key - type. The process must also have permission to write to the key to be - able to update it. The new key will have all user permissions granted and - no group or third party permissions. + type. The process must also have permission to write to the key to be able + to update it. The new key will have all user permissions granted and no + group or third party permissions. - Otherwise, this will attempt to create a new key of the specified type - and description, and to instantiate it with the supplied payload and - attach it to the keyring. In this case, an error will be generated if the - process does not have permission to write to the keyring. + Otherwise, this will attempt to create a new key of the specified type and + description, and to instantiate it with the supplied payload and attach it + to the keyring. In this case, an error will be generated if the process + does not have permission to write to the keyring. The payload is optional, and the pointer can be NULL if not required by the type. The payload is plen in size, and plen can be zero for an empty payload. - A new keyring can be generated by setting type "keyring", the keyring - name as the description (or NULL) and setting the payload to NULL. + A new keyring can be generated by setting type "keyring", the keyring name + as the description (or NULL) and setting the payload to NULL. User defined keys can be created by specifying type "user". It is recommended that a user defined key's description by prefixed with a type @@ -369,9 +369,9 @@ The keyctl syscall functions are: key_serial_t keyctl(KEYCTL_GET_KEYRING_ID, key_serial_t id, int create); - The special key specified by "id" is looked up (with the key being - created if necessary) and the ID of the key or keyring thus found is - returned if it exists. + The special key specified by "id" is looked up (with the key being created + if necessary) and the ID of the key or keyring thus found is returned if + it exists. If the key does not yet exist, the key will be created if "create" is non-zero; and the error ENOKEY will be returned if "create" is zero. @@ -402,8 +402,8 @@ The keyctl syscall functions are: This will try to update the specified key with the given payload, or it will return error EOPNOTSUPP if that function is not supported by the key - type. The process must also have permission to write to the key to be - able to update it. + type. The process must also have permission to write to the key to be able + to update it. The payload is of length plen, and may be absent or empty as for add_key(). @@ -422,8 +422,8 @@ The keyctl syscall functions are: long keyctl(KEYCTL_CHOWN, key_serial_t key, uid_t uid, gid_t gid); - This function permits a key's owner and group ID to be changed. Either - one of uid or gid can be set to -1 to suppress that change. + This function permits a key's owner and group ID to be changed. Either one + of uid or gid can be set to -1 to suppress that change. Only the superuser can change a key's owner to something other than the key's current owner. Similarly, only the superuser can change a key's @@ -484,12 +484,12 @@ The keyctl syscall functions are: long keyctl(KEYCTL_LINK, key_serial_t keyring, key_serial_t key); - This function creates a link from the keyring to the key. The process - must have write permission on the keyring and must have link permission - on the key. + This function creates a link from the keyring to the key. The process must + have write permission on the keyring and must have link permission on the + key. - Should the keyring not be a keyring, error ENOTDIR will result; and if - the keyring is full, error ENFILE will result. + Should the keyring not be a keyring, error ENOTDIR will result; and if the + keyring is full, error ENFILE will result. The link procedure checks the nesting of the keyrings, returning ELOOP if it appears to deep or EDEADLK if the link would introduce a cycle. @@ -503,8 +503,8 @@ The keyctl syscall functions are: specified key, and removes it if found. Subsequent links to that key are ignored. The process must have write permission on the keyring. - If the keyring is not a keyring, error ENOTDIR will result; and if the - key is not present, error ENOENT will be the result. + If the keyring is not a keyring, error ENOTDIR will result; and if the key + is not present, error ENOENT will be the result. (*) Search a keyring tree for a key: @@ -513,9 +513,9 @@ The keyctl syscall functions are: const char *type, const char *description, key_serial_t dest_keyring); - This searches the keyring tree headed by the specified keyring until a - key is found that matches the type and description criteria. Each keyring - is checked for keys before recursion into its children occurs. + This searches the keyring tree headed by the specified keyring until a key + is found that matches the type and description criteria. Each keyring is + checked for keys before recursion into its children occurs. The process must have search permission on the top level keyring, or else error EACCES will result. Only keyrings that the process has search @@ -549,8 +549,8 @@ The keyctl syscall functions are: As much of the data as can be fitted into the buffer will be copied to userspace if the buffer pointer is not NULL. - On a successful return, the function will always return the amount of - data available rather than the amount copied. + On a successful return, the function will always return the amount of data + available rather than the amount copied. (*) Instantiate a partially constructed key. @@ -568,8 +568,8 @@ The keyctl syscall functions are: it, and the key must be uninstantiated. If a keyring is specified (non-zero), the key will also be linked into - that keyring, however all the constraints applying in KEYCTL_LINK apply - in this case too. + that keyring, however all the constraints applying in KEYCTL_LINK apply in + this case too. The payload and plen arguments describe the payload data as for add_key(). @@ -587,8 +587,39 @@ The keyctl syscall functions are: it, and the key must be uninstantiated. If a keyring is specified (non-zero), the key will also be linked into - that keyring, however all the constraints applying in KEYCTL_LINK apply - in this case too. + that keyring, however all the constraints applying in KEYCTL_LINK apply in + this case too. + + + (*) Set the default request-key destination keyring. + + long keyctl(KEYCTL_SET_REQKEY_KEYRING, int reqkey_defl); + + This sets the default keyring to which implicitly requested keys will be + attached for this thread. reqkey_defl should be one of these constants: + + CONSTANT VALUE NEW DEFAULT KEYRING + ====================================== ====== ======================= + KEY_REQKEY_DEFL_NO_CHANGE -1 No change + KEY_REQKEY_DEFL_DEFAULT 0 Default[1] + KEY_REQKEY_DEFL_THREAD_KEYRING 1 Thread keyring + KEY_REQKEY_DEFL_PROCESS_KEYRING 2 Process keyring + KEY_REQKEY_DEFL_SESSION_KEYRING 3 Session keyring + KEY_REQKEY_DEFL_USER_KEYRING 4 User keyring + KEY_REQKEY_DEFL_USER_SESSION_KEYRING 5 User session keyring + KEY_REQKEY_DEFL_GROUP_KEYRING 6 Group keyring + + The old default will be returned if successful and error EINVAL will be + returned if reqkey_defl is not one of the above values. + + The default keyring can be overridden by the keyring indicated to the + request_key() system call. + + Note that this setting is inherited across fork/exec. + + [1] The default default is: the thread keyring if there is one, otherwise + the process keyring if there is one, otherwise the session keyring if + there is one, otherwise the user default session keyring. =============== @@ -601,17 +632,14 @@ be broken down into two areas: keys and key types. Dealing with keys is fairly straightforward. Firstly, the kernel service registers its type, then it searches for a key of that type. It should retain the key as long as it has need of it, and then it should release it. For a -filesystem or device file, a search would probably be performed during the -open call, and the key released upon close. How to deal with conflicting keys -due to two different users opening the same file is left to the filesystem -author to solve. - -When accessing a key's payload data, key->lock should be at least read locked, -or else the data may be changed by an update being performed from userspace -whilst the driver or filesystem is trying to access it. If no update method is -supplied, then the key's payload may be accessed without holding a lock as -there is no way to change it, provided it can be guaranteed that the key's -type definition won't go away. +filesystem or device file, a search would probably be performed during the open +call, and the key released upon close. How to deal with conflicting keys due to +two different users opening the same file is left to the filesystem author to +solve. + +When accessing a key's payload contents, certain precautions must be taken to +prevent access vs modification races. See the section "Notes on accessing +payload contents" for more information. (*) To search for a key, call: @@ -629,6 +657,9 @@ type definition won't go away. Should the function fail error ENOKEY, EKEYEXPIRED or EKEYREVOKED will be returned. + If successful, the key will have been attached to the default keyring for + implicitly obtained request-key keys, as set by KEYCTL_SET_REQKEY_KEYRING. + (*) When it is no longer required, the key should be released using: @@ -690,6 +721,54 @@ type definition won't go away. void unregister_key_type(struct key_type *type); +=================================== +NOTES ON ACCESSING PAYLOAD CONTENTS +=================================== + +The simplest payload is just a number in key->payload.value. In this case, +there's no need to indulge in RCU or locking when accessing the payload. + +More complex payload contents must be allocated and a pointer to them set in +key->payload.data. One of the following ways must be selected to access the +data: + + (1) Unmodifyable key type. + + If the key type does not have a modify method, then the key's payload can + be accessed without any form of locking, provided that it's known to be + instantiated (uninstantiated keys cannot be "found"). + + (2) The key's semaphore. + + The semaphore could be used to govern access to the payload and to control + the payload pointer. It must be write-locked for modifications and would + have to be read-locked for general access. The disadvantage of doing this + is that the accessor may be required to sleep. + + (3) RCU. + + RCU must be used when the semaphore isn't already held; if the semaphore + is held then the contents can't change under you unexpectedly as the + semaphore must still be used to serialise modifications to the key. The + key management code takes care of this for the key type. + + However, this means using: + + rcu_read_lock() ... rcu_dereference() ... rcu_read_unlock() + + to read the pointer, and: + + rcu_dereference() ... rcu_assign_pointer() ... call_rcu() + + to set the pointer and dispose of the old contents after a grace period. + Note that only the key type should ever modify a key's payload. + + Furthermore, an RCU controlled payload must hold a struct rcu_head for the + use of call_rcu() and, if the payload is of variable size, the length of + the payload. key->datalen cannot be relied upon to be consistent with the + payload just dereferenced if the key's semaphore is not held. + + =================== DEFINING A KEY TYPE =================== @@ -717,15 +796,15 @@ The structure has a number of fields, some of which are mandatory: int key_payload_reserve(struct key *key, size_t datalen); - With the revised data length. Error EDQUOT will be returned if this is - not viable. + With the revised data length. Error EDQUOT will be returned if this is not + viable. (*) int (*instantiate)(struct key *key, const void *data, size_t datalen); This method is called to attach a payload to a key during construction. - The payload attached need not bear any relation to the data passed to - this function. + The payload attached need not bear any relation to the data passed to this + function. If the amount of data attached to the key differs from the size in keytype->def_datalen, then key_payload_reserve() should be called. @@ -734,38 +813,47 @@ The structure has a number of fields, some of which are mandatory: The fact that KEY_FLAG_INSTANTIATED is not set in key->flags prevents anything else from gaining access to the key. - This method may sleep if it wishes. + It is safe to sleep in this method. (*) int (*duplicate)(struct key *key, const struct key *source); If this type of key can be duplicated, then this method should be - provided. It is called to copy the payload attached to the source into - the new key. The data length on the new key will have been updated and - the quota adjusted already. + provided. It is called to copy the payload attached to the source into the + new key. The data length on the new key will have been updated and the + quota adjusted already. This method will be called with the source key's semaphore read-locked to - prevent its payload from being changed. It is safe to sleep here. + prevent its payload from being changed, thus RCU constraints need not be + applied to the source key. + + This method does not have to lock the destination key in order to attach a + payload. The fact that KEY_FLAG_INSTANTIATED is not set in key->flags + prevents anything else from gaining access to the key. + + It is safe to sleep in this method. (*) int (*update)(struct key *key, const void *data, size_t datalen); - If this type of key can be updated, then this method should be - provided. It is called to update a key's payload from the blob of data - provided. + If this type of key can be updated, then this method should be provided. + It is called to update a key's payload from the blob of data provided. key_payload_reserve() should be called if the data length might change - before any changes are actually made. Note that if this succeeds, the - type is committed to changing the key because it's already been altered, - so all memory allocation must be done first. + before any changes are actually made. Note that if this succeeds, the type + is committed to changing the key because it's already been altered, so all + memory allocation must be done first. + + The key will have its semaphore write-locked before this method is called, + but this only deters other writers; any changes to the key's payload must + be made under RCU conditions, and call_rcu() must be used to dispose of + the old payload. - key_payload_reserve() should be called with the key->lock write locked, - and the changes to the key's attached payload should be made before the - key is locked. + key_payload_reserve() should be called before the changes are made, but + after all allocations and other potentially failing function calls are + made. - The key will have its semaphore write-locked before this method is - called. Any changes to the key should be made with the key's rwlock - write-locked also. It is safe to sleep here. + It is safe to sleep in this method. (*) int (*match)(const struct key *key, const void *desc); @@ -782,12 +870,12 @@ The structure has a number of fields, some of which are mandatory: (*) void (*destroy)(struct key *key); - This method is optional. It is called to discard the payload data on a - key when it is being destroyed. + This method is optional. It is called to discard the payload data on a key + when it is being destroyed. - This method does not need to lock the key; it can consider the key as - being inaccessible. Note that the key's type may have changed before this - function is called. + This method does not need to lock the key to access the payload; it can + consider the key as being inaccessible at this time. Note that the key's + type may have been changed before this function is called. It is not safe to sleep in this method; the caller may hold spinlocks. @@ -797,26 +885,31 @@ The structure has a number of fields, some of which are mandatory: This method is optional. It is called during /proc/keys reading to summarise a key's description and payload in text form. - This method will be called with the key's rwlock read-locked. This will - prevent the key's payload and state changing; also the description should - not change. This also means it is not safe to sleep in this method. + This method will be called with the RCU read lock held. rcu_dereference() + should be used to read the payload pointer if the payload is to be + accessed. key->datalen cannot be trusted to stay consistent with the + contents of the payload. + + The description will not change, though the key's state may. + + It is not safe to sleep in this method; the RCU read lock is held by the + caller. (*) long (*read)(const struct key *key, char __user *buffer, size_t buflen); This method is optional. It is called by KEYCTL_READ to translate the - key's payload into something a blob of data for userspace to deal - with. Ideally, the blob should be in the same format as that passed in to - the instantiate and update methods. + key's payload into something a blob of data for userspace to deal with. + Ideally, the blob should be in the same format as that passed in to the + instantiate and update methods. If successful, the blob size that could be produced should be returned rather than the size copied. - This method will be called with the key's semaphore read-locked. This - will prevent the key's payload changing. It is not necessary to also - read-lock key->lock when accessing the key's payload. It is safe to sleep - in this method, such as might happen when the userspace buffer is - accessed. + This method will be called with the key's semaphore read-locked. This will + prevent the key's payload changing. It is not necessary to use RCU locking + when accessing the key's payload. It is safe to sleep in this method, such + as might happen when the userspace buffer is accessed. ============================ @@ -853,8 +946,8 @@ If it returns with the key remaining in the unconstructed state, the key will be marked as being negative, it will be added to the session keyring, and an error will be returned to the key requestor. -Supplementary information may be provided from whoever or whatever invoked -this service. This will be passed as the <callout_info> parameter. If no such +Supplementary information may be provided from whoever or whatever invoked this +service. This will be passed as the <callout_info> parameter. If no such information was made available, then "-" will be passed as this parameter instead. diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt new file mode 100644 index 000000000000..0541fe1de704 --- /dev/null +++ b/Documentation/kprobes.txt @@ -0,0 +1,588 @@ +Title : Kernel Probes (Kprobes) +Authors : Jim Keniston <jkenisto@us.ibm.com> + : Prasanna S Panchamukhi <prasanna@in.ibm.com> + +CONTENTS + +1. Concepts: Kprobes, Jprobes, Return Probes +2. Architectures Supported +3. Configuring Kprobes +4. API Reference +5. Kprobes Features and Limitations +6. Probe Overhead +7. TODO +8. Kprobes Example +9. Jprobes Example +10. Kretprobes Example + +1. Concepts: Kprobes, Jprobes, Return Probes + +Kprobes enables you to dynamically break into any kernel routine and +collect debugging and performance information non-disruptively. You +can trap at almost any kernel code address, specifying a handler +routine to be invoked when the breakpoint is hit. + +There are currently three types of probes: kprobes, jprobes, and +kretprobes (also called return probes). A kprobe can be inserted +on virtually any instruction in the kernel. A jprobe is inserted at +the entry to a kernel function, and provides convenient access to the +function's arguments. A return probe fires when a specified function +returns. + +In the typical case, Kprobes-based instrumentation is packaged as +a kernel module. The module's init function installs ("registers") +one or more probes, and the exit function unregisters them. A +registration function such as register_kprobe() specifies where +the probe is to be inserted and what handler is to be called when +the probe is hit. + +The next three subsections explain how the different types of +probes work. They explain certain things that you'll need to +know in order to make the best use of Kprobes -- e.g., the +difference between a pre_handler and a post_handler, and how +to use the maxactive and nmissed fields of a kretprobe. But +if you're in a hurry to start using Kprobes, you can skip ahead +to section 2. + +1.1 How Does a Kprobe Work? + +When a kprobe is registered, Kprobes makes a copy of the probed +instruction and replaces the first byte(s) of the probed instruction +with a breakpoint instruction (e.g., int3 on i386 and x86_64). + +When a CPU hits the breakpoint instruction, a trap occurs, the CPU's +registers are saved, and control passes to Kprobes via the +notifier_call_chain mechanism. Kprobes executes the "pre_handler" +associated with the kprobe, passing the handler the addresses of the +kprobe struct and the saved registers. + +Next, Kprobes single-steps its copy of the probed instruction. +(It would be simpler to single-step the actual instruction in place, +but then Kprobes would have to temporarily remove the breakpoint +instruction. This would open a small time window when another CPU +could sail right past the probepoint.) + +After the instruction is single-stepped, Kprobes executes the +"post_handler," if any, that is associated with the kprobe. +Execution then continues with the instruction following the probepoint. + +1.2 How Does a Jprobe Work? + +A jprobe is implemented using a kprobe that is placed on a function's +entry point. It employs a simple mirroring principle to allow +seamless access to the probed function's arguments. The jprobe +handler routine should have the same signature (arg list and return +type) as the function being probed, and must always end by calling +the Kprobes function jprobe_return(). + +Here's how it works. When the probe is hit, Kprobes makes a copy of +the saved registers and a generous portion of the stack (see below). +Kprobes then points the saved instruction pointer at the jprobe's +handler routine, and returns from the trap. As a result, control +passes to the handler, which is presented with the same register and +stack contents as the probed function. When it is done, the handler +calls jprobe_return(), which traps again to restore the original stack +contents and processor state and switch to the probed function. + +By convention, the callee owns its arguments, so gcc may produce code +that unexpectedly modifies that portion of the stack. This is why +Kprobes saves a copy of the stack and restores it after the jprobe +handler has run. Up to MAX_STACK_SIZE bytes are copied -- e.g., +64 bytes on i386. + +Note that the probed function's args may be passed on the stack +or in registers (e.g., for x86_64 or for an i386 fastcall function). +The jprobe will work in either case, so long as the handler's +prototype matches that of the probed function. + +1.3 How Does a Return Probe Work? + +When you call register_kretprobe(), Kprobes establishes a kprobe at +the entry to the function. When the probed function is called and this +probe is hit, Kprobes saves a copy of the return address, and replaces +the return address with the address of a "trampoline." The trampoline +is an arbitrary piece of code -- typically just a nop instruction. +At boot time, Kprobes registers a kprobe at the trampoline. + +When the probed function executes its return instruction, control +passes to the trampoline and that probe is hit. Kprobes' trampoline +handler calls the user-specified handler associated with the kretprobe, +then sets the saved instruction pointer to the saved return address, +and that's where execution resumes upon return from the trap. + +While the probed function is executing, its return address is +stored in an object of type kretprobe_instance. Before calling +register_kretprobe(), the user sets the maxactive field of the +kretprobe struct to specify how many instances of the specified +function can be probed simultaneously. register_kretprobe() +pre-allocates the indicated number of kretprobe_instance objects. + +For example, if the function is non-recursive and is called with a +spinlock held, maxactive = 1 should be enough. If the function is +non-recursive and can never relinquish the CPU (e.g., via a semaphore +or preemption), NR_CPUS should be enough. If maxactive <= 0, it is +set to a default value. If CONFIG_PREEMPT is enabled, the default +is max(10, 2*NR_CPUS). Otherwise, the default is NR_CPUS. + +It's not a disaster if you set maxactive too low; you'll just miss +some probes. In the kretprobe struct, the nmissed field is set to +zero when the return probe is registered, and is incremented every +time the probed function is entered but there is no kretprobe_instance +object available for establishing the return probe. + +2. Architectures Supported + +Kprobes, jprobes, and return probes are implemented on the following +architectures: + +- i386 +- x86_64 (AMD-64, E64MT) +- ppc64 +- ia64 (Support for probes on certain instruction types is still in progress.) +- sparc64 (Return probes not yet implemented.) + +3. Configuring Kprobes + +When configuring the kernel using make menuconfig/xconfig/oldconfig, +ensure that CONFIG_KPROBES is set to "y". Under "Kernel hacking", +look for "Kprobes". You may have to enable "Kernel debugging" +(CONFIG_DEBUG_KERNEL) before you can enable Kprobes. + +You may also want to ensure that CONFIG_KALLSYMS and perhaps even +CONFIG_KALLSYMS_ALL are set to "y", since kallsyms_lookup_name() +is a handy, version-independent way to find a function's address. + +If you need to insert a probe in the middle of a function, you may find +it useful to "Compile the kernel with debug info" (CONFIG_DEBUG_INFO), +so you can use "objdump -d -l vmlinux" to see the source-to-object +code mapping. + +4. API Reference + +The Kprobes API includes a "register" function and an "unregister" +function for each type of probe. Here are terse, mini-man-page +specifications for these functions and the associated probe handlers +that you'll write. See the latter half of this document for examples. + +4.1 register_kprobe + +#include <linux/kprobes.h> +int register_kprobe(struct kprobe *kp); + +Sets a breakpoint at the address kp->addr. When the breakpoint is +hit, Kprobes calls kp->pre_handler. After the probed instruction +is single-stepped, Kprobe calls kp->post_handler. If a fault +occurs during execution of kp->pre_handler or kp->post_handler, +or during single-stepping of the probed instruction, Kprobes calls +kp->fault_handler. Any or all handlers can be NULL. + +register_kprobe() returns 0 on success, or a negative errno otherwise. + +User's pre-handler (kp->pre_handler): +#include <linux/kprobes.h> +#include <linux/ptrace.h> +int pre_handler(struct kprobe *p, struct pt_regs *regs); + +Called with p pointing to the kprobe associated with the breakpoint, +and regs pointing to the struct containing the registers saved when +the breakpoint was hit. Return 0 here unless you're a Kprobes geek. + +User's post-handler (kp->post_handler): +#include <linux/kprobes.h> +#include <linux/ptrace.h> +void post_handler(struct kprobe *p, struct pt_regs *regs, + unsigned long flags); + +p and regs are as described for the pre_handler. flags always seems +to be zero. + +User's fault-handler (kp->fault_handler): +#include <linux/kprobes.h> +#include <linux/ptrace.h> +int fault_handler(struct kprobe *p, struct pt_regs *regs, int trapnr); + +p and regs are as described for the pre_handler. trapnr is the +architecture-specific trap number associated with the fault (e.g., +on i386, 13 for a general protection fault or 14 for a page fault). +Returns 1 if it successfully handled the exception. + +4.2 register_jprobe + +#include <linux/kprobes.h> +int register_jprobe(struct jprobe *jp) + +Sets a breakpoint at the address jp->kp.addr, which must be the address +of the first instruction of a function. When the breakpoint is hit, +Kprobes runs the handler whose address is jp->entry. + +The handler should have the same arg list and return type as the probed +function; and just before it returns, it must call jprobe_return(). +(The handler never actually returns, since jprobe_return() returns +control to Kprobes.) If the probed function is declared asmlinkage, +fastcall, or anything else that affects how args are passed, the +handler's declaration must match. + +register_jprobe() returns 0 on success, or a negative errno otherwise. + +4.3 register_kretprobe + +#include <linux/kprobes.h> +int register_kretprobe(struct kretprobe *rp); + +Establishes a return probe for the function whose address is +rp->kp.addr. When that function returns, Kprobes calls rp->handler. +You must set rp->maxactive appropriately before you call +register_kretprobe(); see "How Does a Return Probe Work?" for details. + +register_kretprobe() returns 0 on success, or a negative errno +otherwise. + +User's return-probe handler (rp->handler): +#include <linux/kprobes.h> +#include <linux/ptrace.h> +int kretprobe_handler(struct kretprobe_instance *ri, struct pt_regs *regs); + +regs is as described for kprobe.pre_handler. ri points to the +kretprobe_instance object, of which the following fields may be +of interest: +- ret_addr: the return address +- rp: points to the corresponding kretprobe object +- task: points to the corresponding task struct +The handler's return value is currently ignored. + +4.4 unregister_*probe + +#include <linux/kprobes.h> +void unregister_kprobe(struct kprobe *kp); +void unregister_jprobe(struct jprobe *jp); +void unregister_kretprobe(struct kretprobe *rp); + +Removes the specified probe. The unregister function can be called +at any time after the probe has been registered. + +5. Kprobes Features and Limitations + +As of Linux v2.6.12, Kprobes allows multiple probes at the same +address. Currently, however, there cannot be multiple jprobes on +the same function at the same time. + +In general, you can install a probe anywhere in the kernel. +In particular, you can probe interrupt handlers. Known exceptions +are discussed in this section. + +For obvious reasons, it's a bad idea to install a probe in +the code that implements Kprobes (mostly kernel/kprobes.c and +arch/*/kernel/kprobes.c). A patch in the v2.6.13 timeframe instructs +Kprobes to reject such requests. + +If you install a probe in an inline-able function, Kprobes makes +no attempt to chase down all inline instances of the function and +install probes there. gcc may inline a function without being asked, +so keep this in mind if you're not seeing the probe hits you expect. + +A probe handler can modify the environment of the probed function +-- e.g., by modifying kernel data structures, or by modifying the +contents of the pt_regs struct (which are restored to the registers +upon return from the breakpoint). So Kprobes can be used, for example, +to install a bug fix or to inject faults for testing. Kprobes, of +course, has no way to distinguish the deliberately injected faults +from the accidental ones. Don't drink and probe. + +Kprobes makes no attempt to prevent probe handlers from stepping on +each other -- e.g., probing printk() and then calling printk() from a +probe handler. As of Linux v2.6.12, if a probe handler hits a probe, +that second probe's handlers won't be run in that instance. + +In Linux v2.6.12 and previous versions, Kprobes' data structures are +protected by a single lock that is held during probe registration and +unregistration and while handlers are run. Thus, no two handlers +can run simultaneously. To improve scalability on SMP systems, +this restriction will probably be removed soon, in which case +multiple handlers (or multiple instances of the same handler) may +run concurrently on different CPUs. Code your handlers accordingly. + +Kprobes does not use semaphores or allocate memory except during +registration and unregistration. + +Probe handlers are run with preemption disabled. Depending on the +architecture, handlers may also run with interrupts disabled. In any +case, your handler should not yield the CPU (e.g., by attempting to +acquire a semaphore). + +Since a return probe is implemented by replacing the return +address with the trampoline's address, stack backtraces and calls +to __builtin_return_address() will typically yield the trampoline's +address instead of the real return address for kretprobed functions. +(As far as we can tell, __builtin_return_address() is used only +for instrumentation and error reporting.) + +If the number of times a function is called does not match the +number of times it returns, registering a return probe on that +function may produce undesirable results. We have the do_exit() +and do_execve() cases covered. do_fork() is not an issue. We're +unaware of other specific cases where this could be a problem. + +6. Probe Overhead + +On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0 +microseconds to process. Specifically, a benchmark that hits the same +probepoint repeatedly, firing a simple handler each time, reports 1-2 +million hits per second, depending on the architecture. A jprobe or +return-probe hit typically takes 50-75% longer than a kprobe hit. +When you have a return probe set on a function, adding a kprobe at +the entry to that function adds essentially no overhead. + +Here are sample overhead figures (in usec) for different architectures. +k = kprobe; j = jprobe; r = return probe; kr = kprobe + return probe +on same function; jr = jprobe + return probe on same function + +i386: Intel Pentium M, 1495 MHz, 2957.31 bogomips +k = 0.57 usec; j = 1.00; r = 0.92; kr = 0.99; jr = 1.40 + +x86_64: AMD Opteron 246, 1994 MHz, 3971.48 bogomips +k = 0.49 usec; j = 0.76; r = 0.80; kr = 0.82; jr = 1.07 + +ppc64: POWER5 (gr), 1656 MHz (SMT disabled, 1 virtual CPU per physical CPU) +k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99 + +7. TODO + +a. SystemTap (http://sourceware.org/systemtap): Work in progress +to provide a simplified programming interface for probe-based +instrumentation. +b. Improved SMP scalability: Currently, work is in progress to handle +multiple kprobes in parallel. +c. Kernel return probes for sparc64. +d. Support for other architectures. +e. User-space probes. + +8. Kprobes Example + +Here's a sample kernel module showing the use of kprobes to dump a +stack trace and selected i386 registers when do_fork() is called. +----- cut here ----- +/*kprobe_example.c*/ +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/kprobes.h> +#include <linux/kallsyms.h> +#include <linux/sched.h> + +/*For each probe you need to allocate a kprobe structure*/ +static struct kprobe kp; + +/*kprobe pre_handler: called just before the probed instruction is executed*/ +int handler_pre(struct kprobe *p, struct pt_regs *regs) +{ + printk("pre_handler: p->addr=0x%p, eip=%lx, eflags=0x%lx\n", + p->addr, regs->eip, regs->eflags); + dump_stack(); + return 0; +} + +/*kprobe post_handler: called after the probed instruction is executed*/ +void handler_post(struct kprobe *p, struct pt_regs *regs, unsigned long flags) +{ + printk("post_handler: p->addr=0x%p, eflags=0x%lx\n", + p->addr, regs->eflags); +} + +/* fault_handler: this is called if an exception is generated for any + * instruction within the pre- or post-handler, or when Kprobes + * single-steps the probed instruction. + */ +int handler_fault(struct kprobe *p, struct pt_regs *regs, int trapnr) +{ + printk("fault_handler: p->addr=0x%p, trap #%dn", + p->addr, trapnr); + /* Return 0 because we don't handle the fault. */ + return 0; +} + +int init_module(void) +{ + int ret; + kp.pre_handler = handler_pre; + kp.post_handler = handler_post; + kp.fault_handler = handler_fault; + kp.addr = (kprobe_opcode_t*) kallsyms_lookup_name("do_fork"); + /* register the kprobe now */ + if (!kp.addr) { + printk("Couldn't find %s to plant kprobe\n", "do_fork"); + return -1; + } + if ((ret = register_kprobe(&kp) < 0)) { + printk("register_kprobe failed, returned %d\n", ret); + return -1; + } + printk("kprobe registered\n"); + return 0; +} + +void cleanup_module(void) +{ + unregister_kprobe(&kp); + printk("kprobe unregistered\n"); +} + +MODULE_LICENSE("GPL"); +----- cut here ----- + +You can build the kernel module, kprobe-example.ko, using the following +Makefile: +----- cut here ----- +obj-m := kprobe-example.o +KDIR := /lib/modules/$(shell uname -r)/build +PWD := $(shell pwd) +default: + $(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules +clean: + rm -f *.mod.c *.ko *.o +----- cut here ----- + +$ make +$ su - +... +# insmod kprobe-example.ko + +You will see the trace data in /var/log/messages and on the console +whenever do_fork() is invoked to create a new process. + +9. Jprobes Example + +Here's a sample kernel module showing the use of jprobes to dump +the arguments of do_fork(). +----- cut here ----- +/*jprobe-example.c */ +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/fs.h> +#include <linux/uio.h> +#include <linux/kprobes.h> +#include <linux/kallsyms.h> + +/* + * Jumper probe for do_fork. + * Mirror principle enables access to arguments of the probed routine + * from the probe handler. + */ + +/* Proxy routine having the same arguments as actual do_fork() routine */ +long jdo_fork(unsigned long clone_flags, unsigned long stack_start, + struct pt_regs *regs, unsigned long stack_size, + int __user * parent_tidptr, int __user * child_tidptr) +{ + printk("jprobe: clone_flags=0x%lx, stack_size=0x%lx, regs=0x%p\n", + clone_flags, stack_size, regs); + /* Always end with a call to jprobe_return(). */ + jprobe_return(); + /*NOTREACHED*/ + return 0; +} + +static struct jprobe my_jprobe = { + .entry = (kprobe_opcode_t *) jdo_fork +}; + +int init_module(void) +{ + int ret; + my_jprobe.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("do_fork"); + if (!my_jprobe.kp.addr) { + printk("Couldn't find %s to plant jprobe\n", "do_fork"); + return -1; + } + + if ((ret = register_jprobe(&my_jprobe)) <0) { + printk("register_jprobe failed, returned %d\n", ret); + return -1; + } + printk("Planted jprobe at %p, handler addr %p\n", + my_jprobe.kp.addr, my_jprobe.entry); + return 0; +} + +void cleanup_module(void) +{ + unregister_jprobe(&my_jprobe); + printk("jprobe unregistered\n"); +} + +MODULE_LICENSE("GPL"); +----- cut here ----- + +Build and insert the kernel module as shown in the above kprobe +example. You will see the trace data in /var/log/messages and on +the console whenever do_fork() is invoked to create a new process. +(Some messages may be suppressed if syslogd is configured to +eliminate duplicate messages.) + +10. Kretprobes Example + +Here's a sample kernel module showing the use of return probes to +report failed calls to sys_open(). +----- cut here ----- +/*kretprobe-example.c*/ +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/kprobes.h> +#include <linux/kallsyms.h> + +static const char *probed_func = "sys_open"; + +/* Return-probe handler: If the probed function fails, log the return value. */ +static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs) +{ + // Substitute the appropriate register name for your architecture -- + // e.g., regs->rax for x86_64, regs->gpr[3] for ppc64. + int retval = (int) regs->eax; + if (retval < 0) { + printk("%s returns %d\n", probed_func, retval); + } + return 0; +} + +static struct kretprobe my_kretprobe = { + .handler = ret_handler, + /* Probe up to 20 instances concurrently. */ + .maxactive = 20 +}; + +int init_module(void) +{ + int ret; + my_kretprobe.kp.addr = + (kprobe_opcode_t *) kallsyms_lookup_name(probed_func); + if (!my_kretprobe.kp.addr) { + printk("Couldn't find %s to plant return probe\n", probed_func); + return -1; + } + if ((ret = register_kretprobe(&my_kretprobe)) < 0) { + printk("register_kretprobe failed, returned %d\n", ret); + return -1; + } + printk("Planted return probe at %p\n", my_kretprobe.kp.addr); + return 0; +} + +void cleanup_module(void) +{ + unregister_kretprobe(&my_kretprobe); + printk("kretprobe unregistered\n"); + /* nmissed > 0 suggests that maxactive was set too low. */ + printk("Missed probing %d instances of %s\n", + my_kretprobe.nmissed, probed_func); +} + +MODULE_LICENSE("GPL"); +----- cut here ----- + +Build and insert the kernel module as shown in the above kprobe +example. You will see the trace data in /var/log/messages and on the +console whenever sys_open() returns a negative value. (Some messages +may be suppressed if syslogd is configured to eliminate duplicate +messages.) + +For additional information on Kprobes, refer to the following URLs: +http://www-106.ibm.com/developerworks/library/l-kprobes.html?ca=dgr-lnxw42Kprobe +http://www.redhat.com/magazine/005mar05/features/kprobes/ diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX index 834993d26730..5b01d5cc4e95 100644 --- a/Documentation/networking/00-INDEX +++ b/Documentation/networking/00-INDEX @@ -114,9 +114,7 @@ tuntap.txt vortex.txt - info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards. wan-router.txt - - Wan router documentation -wanpipe.txt - - WANPIPE(tm) Multiprotocol WAN Driver for Linux WAN Router + - WAN router documentation wavelan.txt - AT&T GIS (nee NCR) WaveLAN card: An Ethernet-like radio transceiver x25.txt diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 0bc2ed136a38..24d029455baa 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -1,5 +1,7 @@ - Linux Ethernet Bonding Driver HOWTO + Linux Ethernet Bonding Driver HOWTO + + Latest update: 21 June 2005 Initial release : Thomas Davis <tadavis at lbl.gov> Corrections, HA extensions : 2000/10/03-15 : @@ -11,15 +13,22 @@ Corrections, HA extensions : 2000/10/03-15 : Reorganized and updated Feb 2005 by Jay Vosburgh -Note : ------- +Introduction +============ + + The Linux bonding driver provides a method for aggregating +multiple network interfaces into a single logical "bonded" interface. +The behavior of the bonded interfaces depends upon the mode; generally +speaking, modes provide either hot standby or load balancing services. +Additionally, link integrity monitoring may be performed. -The bonding driver originally came from Donald Becker's beowulf patches for -kernel 2.0. It has changed quite a bit since, and the original tools from -extreme-linux and beowulf sites will not work with this version of the driver. + The bonding driver originally came from Donald Becker's +beowulf patches for kernel 2.0. It has changed quite a bit since, and +the original tools from extreme-linux and beowulf sites will not work +with this version of the driver. -For new versions of the driver, patches for older kernels and the updated -userspace tools, please follow the links at the end of this file. + For new versions of the driver, updated userspace tools, and +who to ask for help, please follow the links at the end of this file. Table of Contents ================= @@ -30,9 +39,13 @@ Table of Contents 3. Configuring Bonding Devices 3.1 Configuration with sysconfig support +3.1.1 Using DHCP with sysconfig +3.1.2 Configuring Multiple Bonds with sysconfig 3.2 Configuration with initscripts support +3.2.1 Using DHCP with initscripts +3.2.2 Configuring Multiple Bonds with initscripts 3.3 Configuring Bonding Manually -3.4 Configuring Multiple Bonds +3.3.1 Configuring Multiple Bonds Manually 5. Querying Bonding Configuration 5.1 Bonding Configuration @@ -56,21 +69,30 @@ Table of Contents 11. Promiscuous mode -12. High Availability Information +12. Configuring Bonding for High Availability 12.1 High Availability in a Single Switch Topology -12.1.1 Bonding Mode Selection for Single Switch Topology -12.1.2 Link Monitoring for Single Switch Topology 12.2 High Availability in a Multiple Switch Topology -12.2.1 Bonding Mode Selection for Multiple Switch Topology -12.2.2 Link Monitoring for Multiple Switch Topology -12.3 Switch Behavior Issues for High Availability +12.2.1 HA Bonding Mode Selection for Multiple Switch Topology +12.2.2 HA Link Monitoring for Multiple Switch Topology + +13. Configuring Bonding for Maximum Throughput +13.1 Maximum Throughput in a Single Switch Topology +13.1.1 MT Bonding Mode Selection for Single Switch Topology +13.1.2 MT Link Monitoring for Single Switch Topology +13.2 Maximum Throughput in a Multiple Switch Topology +13.2.1 MT Bonding Mode Selection for Multiple Switch Topology +13.2.2 MT Link Monitoring for Multiple Switch Topology -13. Hardware Specific Considerations -13.1 IBM BladeCenter +14. Switch Behavior Issues +14.1 Link Establishment and Failover Delays +14.2 Duplicated Incoming Packets -14. Frequently Asked Questions +15. Hardware Specific Considerations +15.1 IBM BladeCenter -15. Resources and Links +16. Frequently Asked Questions + +17. Resources and Links 1. Bonding Driver Installation @@ -86,16 +108,10 @@ the following steps: 1.1 Configure and build the kernel with bonding ----------------------------------------------- - The latest version of the bonding driver is available in the + The current version of the bonding driver is available in the drivers/net/bonding subdirectory of the most recent kernel source -(which is available on http://kernel.org). - - Prior to the 2.4.11 kernel, the bonding driver was maintained -largely outside the kernel tree; patches for some earlier kernels are -available on the bonding sourceforge site, although those patches are -still several years out of date. Most users will want to use either -the most recent kernel from kernel.org or whatever kernel came with -their distro. +(which is available on http://kernel.org). Most users "rolling their +own" will want to use the most recent kernel from kernel.org. Configure kernel with "make menuconfig" (or "make xconfig" or "make config"), then select "Bonding driver support" in the "Network @@ -103,8 +119,8 @@ device support" section. It is recommended that you configure the driver as module since it is currently the only way to pass parameters to the driver or configure more than one bonding device. - Build and install the new kernel and modules, then proceed to -step 2. + Build and install the new kernel and modules, then continue +below to install ifenslave. 1.2 Install ifenslave Control Utility ------------------------------------- @@ -147,9 +163,9 @@ default kernel source include directory. Options for the bonding driver are supplied as parameters to the bonding module at load time. They may be given as command line arguments to the insmod or modprobe command, but are usually specified -in either the /etc/modprobe.conf configuration file, or in a -distro-specific configuration file (some of which are detailed in the -next section). +in either the /etc/modules.conf or /etc/modprobe.conf configuration +file, or in a distro-specific configuration file (some of which are +detailed in the next section). The available bonding driver parameters are listed below. If a parameter is not specified the default value is used. When initially @@ -162,34 +178,34 @@ degradation will occur during link failures. Very few devices do not support at least miimon, so there is really no reason not to use it. Options with textual values will accept either the text name - or, for backwards compatibility, the option value. E.g., - "mode=802.3ad" and "mode=4" set the same mode. +or, for backwards compatibility, the option value. E.g., +"mode=802.3ad" and "mode=4" set the same mode. The parameters are as follows: arp_interval - Specifies the ARP monitoring frequency in milli-seconds. If - ARP monitoring is used in a load-balancing mode (mode 0 or 2), - the switch should be configured in a mode that evenly - distributes packets across all links - such as round-robin. If - the switch is configured to distribute the packets in an XOR + Specifies the ARP link monitoring frequency in milliseconds. + If ARP monitoring is used in an etherchannel compatible mode + (modes 0 and 2), the switch should be configured in a mode + that evenly distributes packets across all links. If the + switch is configured to distribute the packets in an XOR fashion, all replies from the ARP targets will be received on the same link which could cause the other team members to - fail. ARP monitoring should not be used in conjunction with - miimon. A value of 0 disables ARP monitoring. The default + fail. ARP monitoring should not be used in conjunction with + miimon. A value of 0 disables ARP monitoring. The default value is 0. arp_ip_target - Specifies the ip addresses to use when arp_interval is > 0. - These are the targets of the ARP request sent to determine the - health of the link to the targets. Specify these values in - ddd.ddd.ddd.ddd format. Multiple ip adresses must be - seperated by a comma. At least one IP address must be given - for ARP monitoring to function. The maximum number of targets - that can be specified is 16. The default value is no IP - addresses. + Specifies the IP addresses to use as ARP monitoring peers when + arp_interval is > 0. These are the targets of the ARP request + sent to determine the health of the link to the targets. + Specify these values in ddd.ddd.ddd.ddd format. Multiple IP + addresses must be separated by a comma. At least one IP + address must be given for ARP monitoring to function. The + maximum number of targets that can be specified is 16. The + default value is no IP addresses. downdelay @@ -207,11 +223,13 @@ lacp_rate are: slow or 0 - Request partner to transmit LACPDUs every 30 seconds (default) + Request partner to transmit LACPDUs every 30 seconds fast or 1 Request partner to transmit LACPDUs every 1 second + The default is slow. + max_bonds Specifies the number of bonding devices to create for this @@ -221,10 +239,11 @@ max_bonds miimon - Specifies the frequency in milli-seconds that MII link - monitoring will occur. A value of zero disables MII link - monitoring. A value of 100 is a good starting point. The - use_carrier option, below, affects how the link state is + Specifies the MII link monitoring frequency in milliseconds. + This determines how often the link state of each slave is + inspected for link failures. A value of zero disables MII + link monitoring. A value of 100 is a good starting point. + The use_carrier option, below, affects how the link state is determined. See the High Availability section for additional information. The default value is 0. @@ -246,17 +265,31 @@ mode active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) - to avoid confusing the switch. This mode provides - fault tolerance. The primary option affects the - behavior of this mode. + to avoid confusing the switch. + + In bonding version 2.6.2 or later, when a failover + occurs in active-backup mode, bonding will issue one + or more gratuitous ARPs on the newly active slave. + One gratutious ARP is issued for the bonding master + interface and each VLAN interfaces configured above + it, provided that the interface has at least one IP + address configured. Gratuitous ARPs issued for VLAN + interfaces are tagged with the appropriate VLAN id. + + This mode provides fault tolerance. The primary + option, documented below, affects the behavior of this + mode. balance-xor or 2 - XOR policy: Transmit based on [(source MAC address - XOR'd with destination MAC address) modulo slave - count]. This selects the same slave for each - destination MAC address. This mode provides load - balancing and fault tolerance. + XOR policy: Transmit based on the selected transmit + hash policy. The default policy is a simple [(source + MAC address XOR'd with destination MAC address) modulo + slave count]. Alternate transmit policies may be + selected via the xmit_hash_policy option, described + below. + + This mode provides load balancing and fault tolerance. broadcast or 3 @@ -270,7 +303,17 @@ mode duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification. - Pre-requisites: + Slave selection for outgoing traffic is done according + to the transmit hash policy, which may be changed from + the default simple XOR policy via the xmit_hash_policy + option, documented below. Note that not all transmit + policies may be 802.3ad compliant, particularly in + regards to the packet mis-ordering requirements of + section 43.2.4 of the 802.3ad standard. Differing + peer implementations will have varying tolerances for + noncompliance. + + Prerequisites: 1. Ethtool support in the base drivers for retrieving the speed and duplex of each slave. @@ -333,7 +376,7 @@ mode When a link is reconnected or a new slave joins the bond the receive traffic is redistributed among all - active slaves in the bond by intiating ARP Replies + active slaves in the bond by initiating ARP Replies with the selected mac address to each of the clients. The updelay parameter (detailed below) must be set to a value equal or greater than the switch's @@ -396,6 +439,60 @@ use_carrier 0 will use the deprecated MII / ETHTOOL ioctls. The default value is 1. +xmit_hash_policy + + Selects the transmit hash policy to use for slave selection in + balance-xor and 802.3ad modes. Possible values are: + + layer2 + + Uses XOR of hardware MAC addresses to generate the + hash. The formula is + + (source MAC XOR destination MAC) modulo slave count + + This algorithm will place all traffic to a particular + network peer on the same slave. + + This algorithm is 802.3ad compliant. + + layer3+4 + + This policy uses upper layer protocol information, + when available, to generate the hash. This allows for + traffic to a particular network peer to span multiple + slaves, although a single connection will not span + multiple slaves. + + The formula for unfragmented TCP and UDP packets is + + ((source port XOR dest port) XOR + ((source IP XOR dest IP) AND 0xffff) + modulo slave count + + For fragmented TCP or UDP packets and all other IP + protocol traffic, the source and destination port + information is omitted. For non-IP traffic, the + formula is the same as for the layer2 transmit hash + policy. + + This policy is intended to mimic the behavior of + certain switches, notably Cisco switches with PFC2 as + well as some Foundry and IBM products. + + This algorithm is not fully 802.3ad compliant. A + single TCP or UDP conversation containing both + fragmented and unfragmented packets will see packets + striped across two interfaces. This may result in out + of order delivery. Most traffic types will not meet + this criteria, as TCP rarely fragments traffic, and + most UDP traffic is not involved in extended + conversations. Other implementations of 802.3ad may + or may not tolerate this noncompliance. + + The default value is layer2. This option was added in bonding +version 2.6.3. In earlier versions of bonding, this parameter does +not exist, and the layer2 policy is the only policy. 3. Configuring Bonding Devices @@ -448,8 +545,9 @@ Bonding devices can be managed by hand, however, as follows. slave devices. On SLES 9, this is most easily done by running the yast2 sysconfig configuration utility. The goal is for to create an ifcfg-id file for each slave device. The simplest way to accomplish -this is to configure the devices for DHCP. The name of the -configuration file for each device will be of the form: +this is to configure the devices for DHCP (this is only to get the +file ifcfg-id file created; see below for some issues with DHCP). The +name of the configuration file for each device will be of the form: ifcfg-id-xx:xx:xx:xx:xx:xx @@ -459,7 +557,7 @@ the device's permanent MAC address. Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been created, it is necessary to edit the configuration files for the slave devices (the MAC addresses correspond to those of the slave devices). -Before editing, the file will contain muliple lines, and will look +Before editing, the file will contain multiple lines, and will look something like this: BOOTPROTO='dhcp' @@ -496,16 +594,11 @@ STARTMODE="onboot" BONDING_MASTER="yes" BONDING_MODULE_OPTS="mode=active-backup miimon=100" BONDING_SLAVE0="eth0" -BONDING_SLAVE1="eth1" +BONDING_SLAVE1="bus-pci-0000:06:08.1" Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK values with the appropriate values for your network. - Note that configuring the bonding device with BOOTPROTO='dhcp' -does not work; the scripts attempt to obtain the device address from -DHCP prior to adding any of the slave devices. Without active slaves, -the DHCP requests are not sent to the network. - The STARTMODE specifies when the device is brought online. The possible values are: @@ -531,9 +624,17 @@ for the bonding mode, link monitoring, and so on here. Do not include the max_bonds bonding parameter; this will confuse the configuration system if you have multiple bonding devices. - Finally, supply one BONDING_SLAVEn="ethX" for each slave, -where "n" is an increasing value, one for each slave, and "ethX" is -the name of the slave device (eth0, eth1, etc). + Finally, supply one BONDING_SLAVEn="slave device" for each +slave. where "n" is an increasing value, one for each slave. The +"slave device" is either an interface name, e.g., "eth0", or a device +specifier for the network device. The interface name is easier to +find, but the ethN names are subject to change at boot time if, e.g., +a device early in the sequence has failed. The device specifiers +(bus-pci-0000:06:08.1 in the example above) specify the physical +network device, and will not change unless the device's bus location +changes (for example, it is moved from one PCI slot to another). The +example above uses one of each type for demonstration purposes; most +configurations will choose one or the other for all slave devices. When all configuration files have been modified or created, networking must be restarted for the configuration changes to take @@ -544,7 +645,7 @@ effect. This can be accomplished via the following: Note that the network control script (/sbin/ifdown) will remove the bonding module as part of the network shutdown processing, so it is not necessary to remove the module by hand if, e.g., the -module paramters have changed. +module parameters have changed. Also, at this writing, YaST/YaST2 will not manage bonding devices (they do not show bonding interfaces on its list of network @@ -559,12 +660,37 @@ format can be found in an example ifcfg template file: Note that the template does not document the various BONDING_ settings described above, but does describe many of the other options. +3.1.1 Using DHCP with sysconfig +------------------------------- + + Under sysconfig, configuring a device with BOOTPROTO='dhcp' +will cause it to query DHCP for its IP address information. At this +writing, this does not function for bonding devices; the scripts +attempt to obtain the device address from DHCP prior to adding any of +the slave devices. Without active slaves, the DHCP requests are not +sent to the network. + +3.1.2 Configuring Multiple Bonds with sysconfig +----------------------------------------------- + + The sysconfig network initialization system is capable of +handling multiple bonding devices. All that is necessary is for each +bonding instance to have an appropriately configured ifcfg-bondX file +(as described above). Do not specify the "max_bonds" parameter to any +instance of bonding, as this will confuse sysconfig. If you require +multiple bonding devices with identical parameters, create multiple +ifcfg-bondX files. + + Because the sysconfig scripts supply the bonding module +options in the ifcfg-bondX file, it is not necessary to add them to +the system /etc/modules.conf or /etc/modprobe.conf configuration file. + 3.2 Configuration with initscripts support ------------------------------------------ This section applies to distros using a version of initscripts with bonding support, for example, Red Hat Linux 9 or Red Hat -Enterprise Linux version 3. On these systems, the network +Enterprise Linux version 3 or 4. On these systems, the network initialization scripts have some knowledge of bonding, and can be configured to control bonding devices. @@ -614,10 +740,11 @@ USERCTL=no Be sure to change the networking specific lines (IPADDR, NETMASK, NETWORK and BROADCAST) to match your network configuration. - Finally, it is necessary to edit /etc/modules.conf to load the -bonding module when the bond0 interface is brought up. The following -sample lines in /etc/modules.conf will load the bonding module, and -select its options: + Finally, it is necessary to edit /etc/modules.conf (or +/etc/modprobe.conf, depending upon your distro) to load the bonding +module with your desired options when the bond0 interface is brought +up. The following lines in /etc/modules.conf (or modprobe.conf) will +load the bonding module, and select its options: alias bond0 bonding options bond0 mode=balance-alb miimon=100 @@ -629,6 +756,33 @@ options for your configuration. will restart the networking subsystem and your bond link should be now up and running. +3.2.1 Using DHCP with initscripts +--------------------------------- + + Recent versions of initscripts (the version supplied with +Fedora Core 3 and Red Hat Enterprise Linux 4 is reported to work) do +have support for assigning IP information to bonding devices via DHCP. + + To configure bonding for DHCP, configure it as described +above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp" +and add a line consisting of "TYPE=Bonding". Note that the TYPE value +is case sensitive. + +3.2.2 Configuring Multiple Bonds with initscripts +------------------------------------------------- + + At this writing, the initscripts package does not directly +support loading the bonding driver multiple times, so the process for +doing so is the same as described in the "Configuring Multiple Bonds +Manually" section, below. + + NOTE: It has been observed that some Red Hat supplied kernels +are apparently unable to rename modules at load time (the "-obonding1" +part). Attempts to pass that option to modprobe will produce an +"Operation not permitted" error. This has been reported on some +Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels +exhibiting this problem, it will be impossible to configure multiple +bonds with differing parameters. 3.3 Configuring Bonding Manually -------------------------------- @@ -638,10 +792,11 @@ scripts (the sysconfig or initscripts package) do not have specific knowledge of bonding. One such distro is SuSE Linux Enterprise Server version 8. - The general methodology for these systems is to place the -bonding module parameters into /etc/modprobe.conf, then add modprobe -and/or ifenslave commands to the system's global init script. The -name of the global init script differs; for sysconfig, it is + The general method for these systems is to place the bonding +module parameters into /etc/modules.conf or /etc/modprobe.conf (as +appropriate for the installed distro), then add modprobe and/or +ifenslave commands to the system's global init script. The name of +the global init script differs; for sysconfig, it is /etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local. For example, if you wanted to make a simple bond of two e100 @@ -649,7 +804,7 @@ devices (presumed to be eth0 and eth1), and have it persist across reboots, edit the appropriate file (/etc/init.d/boot.local or /etc/rc.d/rc.local), and add the following: -modprobe bonding -obond0 mode=balance-alb miimon=100 +modprobe bonding mode=balance-alb miimon=100 modprobe e100 ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up ifenslave bond0 eth0 @@ -657,11 +812,7 @@ ifenslave bond0 eth1 Replace the example bonding module parameters and bond0 network configuration (IP address, netmask, etc) with the appropriate -values for your configuration. The above example loads the bonding -module with the name "bond0," this simplifies the naming if multiple -bonding modules are loaded (each successive instance of the module is -given a different name, and the module instance names match the -bonding interface names). +values for your configuration. Unfortunately, this method will not provide support for the ifup and ifdown scripts on the bond devices. To reload the bonding @@ -684,20 +835,23 @@ appropriate device driver modules. For our example above, you can do the following: # ifconfig bond0 down -# rmmod bond0 +# rmmod bonding # rmmod e100 Again, for convenience, it may be desirable to create a script with these commands. -3.4 Configuring Multiple Bonds ------------------------------- +3.3.1 Configuring Multiple Bonds Manually +----------------------------------------- This section contains information on configuring multiple -bonding devices with differing options. If you require multiple -bonding devices, but all with the same options, see the "max_bonds" -module paramter, documented above. +bonding devices with differing options for those systems whose network +initialization scripts lack support for configuring multiple bonds. + + If you require multiple bonding devices, but all with the same +options, you may wish to use the "max_bonds" module parameter, +documented above. To create multiple bonding devices with differing options, it is necessary to load the bonding driver multiple times. Note that @@ -724,11 +878,16 @@ named "bond0" and creates the bond0 device in balance-rr mode with an miimon of 100. The second instance is named "bond1" and creates the bond1 device in balance-alb mode with an miimon of 50. + In some circumstances (typically with older distributions), +the above does not work, and the second bonding instance never sees +its options. In that case, the second options line can be substituted +as follows: + +install bonding1 /sbin/modprobe bonding -obond1 mode=balance-alb miimon=50 + This may be repeated any number of times, specifying a new and -unique name in place of bond0 or bond1 for each instance. +unique name in place of bond1 for each subsequent instance. - When the appropriate module paramters are in place, then -configure bonding according to the instructions for your distro. 5. Querying Bonding Configuration ================================= @@ -846,8 +1005,8 @@ tagged internally by bonding itself. As a result, bonding must self generated packets. For reasons of simplicity, and to support the use of adapters -that can do VLAN hardware acceleration offloding, the bonding -interface declares itself as fully hardware offloaing capable, it gets +that can do VLAN hardware acceleration offloading, the bonding +interface declares itself as fully hardware offloading capable, it gets the add_vid/kill_vid notifications to gather the necessary information, and it propagates those actions to the slaves. In case of mixed adapter types, hardware accelerated tagged packets that @@ -880,7 +1039,7 @@ bond interface: matches the hardware address of the VLAN interfaces. Note that changing a VLAN interface's HW address would set the -underlying device -- i.e. the bonding interface -- to promiscouos +underlying device -- i.e. the bonding interface -- to promiscuous mode, which might not be what you want. @@ -923,7 +1082,7 @@ down or have a problem making it unresponsive to ARP requests. Having an additional target (or several) increases the reliability of the ARP monitoring. - Multiple ARP targets must be seperated by commas as follows: + Multiple ARP targets must be separated by commas as follows: # example options for ARP monitoring with three targets alias bond0 bonding @@ -1045,7 +1204,7 @@ install bonding /sbin/modprobe tg3; /sbin/modprobe e1000; This will, when loading the bonding module, rather than performing the normal action, instead execute the provided command. This command loads the device drivers in the order needed, then calls -modprobe with --ingore-install to cause the normal action to then take +modprobe with --ignore-install to cause the normal action to then take place. Full documentation on this can be found in the modprobe.conf and modprobe manual pages. @@ -1130,14 +1289,14 @@ association. common to enable promiscuous mode on the device, so that all traffic is seen (instead of seeing only traffic destined for the local host). The bonding driver handles promiscuous mode changes to the bonding -master device (e.g., bond0), and propogates the setting to the slave +master device (e.g., bond0), and propagates the setting to the slave devices. For the balance-rr, balance-xor, broadcast, and 802.3ad modes, -the promiscuous mode setting is propogated to all slaves. +the promiscuous mode setting is propagated to all slaves. For the active-backup, balance-tlb and balance-alb modes, the -promiscuous mode setting is propogated only to the active slave. +promiscuous mode setting is propagated only to the active slave. For balance-tlb mode, the active slave is the slave currently receiving inbound traffic. @@ -1148,46 +1307,182 @@ sending to peers that are unassigned or if the load is unbalanced. For the active-backup, balance-tlb and balance-alb modes, when the active slave changes (e.g., due to a link failure), the -promiscuous setting will be propogated to the new active slave. +promiscuous setting will be propagated to the new active slave. -12. High Availability Information -================================= +12. Configuring Bonding for High Availability +============================================= High Availability refers to configurations that provide maximum network availability by having redundant or backup devices, -links and switches between the host and the rest of the world. - - There are currently two basic methods for configuring to -maximize availability. They are dependent on the network topology and -the primary goal of the configuration, but in general, a configuration -can be optimized for maximum available bandwidth, or for maximum -network availability. +links or switches between the host and the rest of the world. The +goal is to provide the maximum availability of network connectivity +(i.e., the network always works), even though other configurations +could provide higher throughput. 12.1 High Availability in a Single Switch Topology -------------------------------------------------- - If two hosts (or a host and a switch) are directly connected -via multiple physical links, then there is no network availability -penalty for optimizing for maximum bandwidth: there is only one switch -(or peer), so if it fails, you have no alternative access to fail over -to. + If two hosts (or a host and a single switch) are directly +connected via multiple physical links, then there is no availability +penalty to optimizing for maximum bandwidth. In this case, there is +only one switch (or peer), so if it fails, there is no alternative +access to fail over to. Additionally, the bonding load balance modes +support link monitoring of their members, so if individual links fail, +the load will be rebalanced across the remaining devices. + + See Section 13, "Configuring Bonding for Maximum Throughput" +for information on configuring bonding with one peer device. + +12.2 High Availability in a Multiple Switch Topology +---------------------------------------------------- + + With multiple switches, the configuration of bonding and the +network changes dramatically. In multiple switch topologies, there is +a trade off between network availability and usable bandwidth. + + Below is a sample network, configured to maximize the +availability of the network: -Example 1 : host to switch (or other host) + | | + |port3 port3| + +-----+----+ +-----+----+ + | |port2 ISL port2| | + | switch A +--------------------------+ switch B | + | | | | + +-----+----+ +-----++---+ + |port1 port1| + | +-------+ | + +-------------+ host1 +---------------+ + eth0 +-------+ eth1 - +----------+ +----------+ - | |eth0 eth0| switch | - | Host A +--------------------------+ or | - | +--------------------------+ other | - | |eth1 eth1| host | - +----------+ +----------+ + In this configuration, there is a link between the two +switches (ISL, or inter switch link), and multiple ports connecting to +the outside world ("port3" on each switch). There is no technical +reason that this could not be extended to a third switch. +12.2.1 HA Bonding Mode Selection for Multiple Switch Topology +------------------------------------------------------------- -12.1.1 Bonding Mode Selection for single switch topology --------------------------------------------------------- + In a topology such as the example above, the active-backup and +broadcast modes are the only useful bonding modes when optimizing for +availability; the other modes require all links to terminate on the +same peer for them to behave rationally. + +active-backup: This is generally the preferred mode, particularly if + the switches have an ISL and play together well. If the + network configuration is such that one switch is specifically + a backup switch (e.g., has lower capacity, higher cost, etc), + then the primary option can be used to insure that the + preferred link is always used when it is available. + +broadcast: This mode is really a special purpose mode, and is suitable + only for very specific needs. For example, if the two + switches are not connected (no ISL), and the networks beyond + them are totally independent. In this case, if it is + necessary for some specific one-way traffic to reach both + independent networks, then the broadcast mode may be suitable. + +12.2.2 HA Link Monitoring Selection for Multiple Switch Topology +---------------------------------------------------------------- + + The choice of link monitoring ultimately depends upon your +switch. If the switch can reliably fail ports in response to other +failures, then either the MII or ARP monitors should work. For +example, in the above example, if the "port3" link fails at the remote +end, the MII monitor has no direct means to detect this. The ARP +monitor could be configured with a target at the remote end of port3, +thus detecting that failure without switch support. + + In general, however, in a multiple switch topology, the ARP +monitor can provide a higher level of reliability in detecting end to +end connectivity failures (which may be caused by the failure of any +individual component to pass traffic for any reason). Additionally, +the ARP monitor should be configured with multiple targets (at least +one for each switch in the network). This will insure that, +regardless of which switch is active, the ARP monitor has a suitable +target to query. + + +13. Configuring Bonding for Maximum Throughput +============================================== + +13.1 Maximizing Throughput in a Single Switch Topology +------------------------------------------------------ + + In a single switch configuration, the best method to maximize +throughput depends upon the application and network environment. The +various load balancing modes each have strengths and weaknesses in +different environments, as detailed below. + + For this discussion, we will break down the topologies into +two categories. Depending upon the destination of most traffic, we +categorize them into either "gatewayed" or "local" configurations. + + In a gatewayed configuration, the "switch" is acting primarily +as a router, and the majority of traffic passes through this router to +other networks. An example would be the following: + + + +----------+ +----------+ + | |eth0 port1| | to other networks + | Host A +---------------------+ router +-------------------> + | +---------------------+ | Hosts B and C are out + | |eth1 port2| | here somewhere + +----------+ +----------+ + + The router may be a dedicated router device, or another host +acting as a gateway. For our discussion, the important point is that +the majority of traffic from Host A will pass through the router to +some other network before reaching its final destination. + + In a gatewayed network configuration, although Host A may +communicate with many other systems, all of its traffic will be sent +and received via one other peer on the local network, the router. + + Note that the case of two systems connected directly via +multiple physical links is, for purposes of configuring bonding, the +same as a gatewayed configuration. In that case, it happens that all +traffic is destined for the "gateway" itself, not some other network +beyond the gateway. + + In a local configuration, the "switch" is acting primarily as +a switch, and the majority of traffic passes through this switch to +reach other stations on the same network. An example would be the +following: + + +----------+ +----------+ +--------+ + | |eth0 port1| +-------+ Host B | + | Host A +------------+ switch |port3 +--------+ + | +------------+ | +--------+ + | |eth1 port2| +------------------+ Host C | + +----------+ +----------+port4 +--------+ + + + Again, the switch may be a dedicated switch device, or another +host acting as a gateway. For our discussion, the important point is +that the majority of traffic from Host A is destined for other hosts +on the same local network (Hosts B and C in the above example). + + In summary, in a gatewayed configuration, traffic to and from +the bonded device will be to the same MAC level peer on the network +(the gateway itself, i.e., the router), regardless of its final +destination. In a local configuration, traffic flows directly to and +from the final destinations, thus, each destination (Host B, Host C) +will be addressed directly by their individual MAC addresses. + + This distinction between a gatewayed and a local network +configuration is important because many of the load balancing modes +available use the MAC addresses of the local network source and +destination to make load balancing decisions. The behavior of each +mode is described below. + + +13.1.1 MT Bonding Mode Selection for Single Switch Topology +----------------------------------------------------------- This configuration is the easiest to set up and to understand, although you will have to decide which bonding mode best suits your -needs. The tradeoffs for each mode are detailed below: +needs. The trade offs for each mode are detailed below: balance-rr: This mode is the only mode that will permit a single TCP/IP connection to stripe traffic across multiple @@ -1206,6 +1501,23 @@ balance-rr: This mode is the only mode that will permit a single interface's worth of throughput, even after adjusting tcp_reordering. + Note that this out of order delivery occurs when both the + sending and receiving systems are utilizing a multiple + interface bond. Consider a configuration in which a + balance-rr bond feeds into a single higher capacity network + channel (e.g., multiple 100Mb/sec ethernets feeding a single + gigabit ethernet via an etherchannel capable switch). In this + configuration, traffic sent from the multiple 100Mb devices to + a destination connected to the gigabit device will not see + packets out of order. However, traffic sent from the gigabit + device to the multiple 100Mb devices may or may not see + traffic out of order, depending upon the balance policy of the + switch. Many switches do not support any modes that stripe + traffic (instead choosing a port based upon IP or MAC level + addresses); for those devices, traffic flowing from the + gigabit device to the many 100Mb devices will only utilize one + interface. + If you are utilizing protocols other than TCP/IP, UDP for example, and your application can tolerate out of order delivery, then this mode can allow for single stream datagram @@ -1220,16 +1532,21 @@ active-backup: There is not much advantage in this network topology to connected to the same peer as the primary. In this case, a load balancing mode (with link monitoring) will provide the same level of network availability, but with increased - available bandwidth. On the plus side, it does not require - any configuration of the switch. + available bandwidth. On the plus side, active-backup mode + does not require any configuration of the switch, so it may + have value if the hardware available does not support any of + the load balance modes. balance-xor: This mode will limit traffic such that packets destined for specific peers will always be sent over the same interface. Since the destination is determined by the MAC - addresses involved, this may be desirable if you have a large - network with many hosts. It is likely to be suboptimal if all - your traffic is passed through a single router, however. As - with balance-rr, the switch ports need to be configured for + addresses involved, this mode works best in a "local" network + configuration (as described above), with destinations all on + the same local network. This mode is likely to be suboptimal + if all your traffic is passed through a single router (i.e., a + "gatewayed" network configuration, as described above). + + As with balance-rr, the switch ports need to be configured for "etherchannel" or "trunking." broadcast: Like active-backup, there is not much advantage to this @@ -1241,122 +1558,131 @@ broadcast: Like active-backup, there is not much advantage to this protocol includes automatic configuration of the aggregates, so minimal manual configuration of the switch is needed (typically only to designate that some set of devices is - usable for 802.3ad). The 802.3ad standard also mandates that - frames be delivered in order (within certain limits), so in - general single connections will not see misordering of + available for 802.3ad). The 802.3ad standard also mandates + that frames be delivered in order (within certain limits), so + in general single connections will not see misordering of packets. The 802.3ad mode does have some drawbacks: the standard mandates that all devices in the aggregate operate at the same speed and duplex. Also, as with all bonding load balance modes other than balance-rr, no single connection will be able to utilize more than a single interface's worth of - bandwidth. Additionally, the linux bonding 802.3ad - implementation distributes traffic by peer (using an XOR of - MAC addresses), so in general all traffic to a particular - destination will use the same interface. Finally, the 802.3ad - mode mandates the use of the MII monitor, therefore, the ARP - monitor is not available in this mode. - -balance-tlb: This mode is also a good choice for this type of - topology. It has no special switch configuration - requirements, and balances outgoing traffic by peer, in a - vaguely intelligent manner (not a simple XOR as in balance-xor - or 802.3ad mode), so that unlucky MAC addresses will not all - "bunch up" on a single interface. Interfaces may be of - differing speeds. On the down side, in this mode all incoming - traffic arrives over a single interface, this mode requires - certain ethtool support in the network device driver of the - slave interfaces, and the ARP monitor is not available. - -balance-alb: This mode is everything that balance-tlb is, and more. It - has all of the features (and restrictions) of balance-tlb, and - will also balance incoming traffic from peers (as described in - the Bonding Module Options section, above). The only extra - down side to this mode is that the network device driver must - support changing the hardware address while the device is - open. - -12.1.2 Link Monitoring for Single Switch Topology -------------------------------------------------- + bandwidth. + + Additionally, the linux bonding 802.3ad implementation + distributes traffic by peer (using an XOR of MAC addresses), + so in a "gatewayed" configuration, all outgoing traffic will + generally use the same device. Incoming traffic may also end + up on a single device, but that is dependent upon the + balancing policy of the peer's 8023.ad implementation. In a + "local" configuration, traffic will be distributed across the + devices in the bond. + + Finally, the 802.3ad mode mandates the use of the MII monitor, + therefore, the ARP monitor is not available in this mode. + +balance-tlb: The balance-tlb mode balances outgoing traffic by peer. + Since the balancing is done according to MAC address, in a + "gatewayed" configuration (as described above), this mode will + send all traffic across a single device. However, in a + "local" network configuration, this mode balances multiple + local network peers across devices in a vaguely intelligent + manner (not a simple XOR as in balance-xor or 802.3ad mode), + so that mathematically unlucky MAC addresses (i.e., ones that + XOR to the same value) will not all "bunch up" on a single + interface. + + Unlike 802.3ad, interfaces may be of differing speeds, and no + special switch configuration is required. On the down side, + in this mode all incoming traffic arrives over a single + interface, this mode requires certain ethtool support in the + network device driver of the slave interfaces, and the ARP + monitor is not available. + +balance-alb: This mode is everything that balance-tlb is, and more. + It has all of the features (and restrictions) of balance-tlb, + and will also balance incoming traffic from local network + peers (as described in the Bonding Module Options section, + above). + + The only additional down side to this mode is that the network + device driver must support changing the hardware address while + the device is open. + +13.1.2 MT Link Monitoring for Single Switch Topology +---------------------------------------------------- The choice of link monitoring may largely depend upon which mode you choose to use. The more advanced load balancing modes do not support the use of the ARP monitor, and are thus restricted to using -the MII monitor (which does not provide as high a level of assurance -as the ARP monitor). - - -12.2 High Availability in a Multiple Switch Topology ----------------------------------------------------- - - With multiple switches, the configuration of bonding and the -network changes dramatically. In multiple switch topologies, there is -a tradeoff between network availability and usable bandwidth. - - Below is a sample network, configured to maximize the -availability of the network: - - | | - |port3 port3| - +-----+----+ +-----+----+ - | |port2 ISL port2| | - | switch A +--------------------------+ switch B | - | | | | - +-----+----+ +-----++---+ - |port1 port1| - | +-------+ | - +-------------+ host1 +---------------+ - eth0 +-------+ eth1 - - In this configuration, there is a link between the two -switches (ISL, or inter switch link), and multiple ports connecting to -the outside world ("port3" on each switch). There is no technical -reason that this could not be extended to a third switch. - -12.2.1 Bonding Mode Selection for Multiple Switch Topology ----------------------------------------------------------- - - In a topology such as this, the active-backup and broadcast -modes are the only useful bonding modes; the other modes require all -links to terminate on the same peer for them to behave rationally. - -active-backup: This is generally the preferred mode, particularly if - the switches have an ISL and play together well. If the - network configuration is such that one switch is specifically - a backup switch (e.g., has lower capacity, higher cost, etc), - then the primary option can be used to insure that the - preferred link is always used when it is available. - -broadcast: This mode is really a special purpose mode, and is suitable - only for very specific needs. For example, if the two - switches are not connected (no ISL), and the networks beyond - them are totally independant. In this case, if it is - necessary for some specific one-way traffic to reach both - independent networks, then the broadcast mode may be suitable. - -12.2.2 Link Monitoring Selection for Multiple Switch Topology +the MII monitor (which does not provide as high a level of end to end +assurance as the ARP monitor). + +13.2 Maximum Throughput in a Multiple Switch Topology +----------------------------------------------------- + + Multiple switches may be utilized to optimize for throughput +when they are configured in parallel as part of an isolated network +between two or more systems, for example: + + +-----------+ + | Host A | + +-+---+---+-+ + | | | + +--------+ | +---------+ + | | | + +------+---+ +-----+----+ +-----+----+ + | Switch A | | Switch B | | Switch C | + +------+---+ +-----+----+ +-----+----+ + | | | + +--------+ | +---------+ + | | | + +-+---+---+-+ + | Host B | + +-----------+ + + In this configuration, the switches are isolated from one +another. One reason to employ a topology such as this is for an +isolated network with many hosts (a cluster configured for high +performance, for example), using multiple smaller switches can be more +cost effective than a single larger switch, e.g., on a network with 24 +hosts, three 24 port switches can be significantly less expensive than +a single 72 port switch. + + If access beyond the network is required, an individual host +can be equipped with an additional network device connected to an +external network; this host then additionally acts as a gateway. + +13.2.1 MT Bonding Mode Selection for Multiple Switch Topology ------------------------------------------------------------- - The choice of link monitoring ultimately depends upon your -switch. If the switch can reliably fail ports in response to other -failures, then either the MII or ARP monitors should work. For -example, in the above example, if the "port3" link fails at the remote -end, the MII monitor has no direct means to detect this. The ARP -monitor could be configured with a target at the remote end of port3, -thus detecting that failure without switch support. + In actual practice, the bonding mode typically employed in +configurations of this type is balance-rr. Historically, in this +network configuration, the usual caveats about out of order packet +delivery are mitigated by the use of network adapters that do not do +any kind of packet coalescing (via the use of NAPI, or because the +device itself does not generate interrupts until some number of +packets has arrived). When employed in this fashion, the balance-rr +mode allows individual connections between two hosts to effectively +utilize greater than one interface's bandwidth. - In general, however, in a multiple switch topology, the ARP -monitor can provide a higher level of reliability in detecting link -failures. Additionally, it should be configured with multiple targets -(at least one for each switch in the network). This will insure that, -regardless of which switch is active, the ARP monitor has a suitable -target to query. +13.2.2 MT Link Monitoring for Multiple Switch Topology +------------------------------------------------------ + Again, in actual practice, the MII monitor is most often used +in this configuration, as performance is given preference over +availability. The ARP monitor will function in this topology, but its +advantages over the MII monitor are mitigated by the volume of probes +needed as the number of systems involved grows (remember that each +host in the network is configured with bonding). -12.3 Switch Behavior Issues for High Availability -------------------------------------------------- +14. Switch Behavior Issues +========================== - You may encounter issues with the timing of link up and down -reporting by the switch. +14.1 Link Establishment and Failover Delays +------------------------------------------- + + Some switches exhibit undesirable behavior with regard to the +timing of link up and down reporting by the switch. First, when a link comes up, some switches may indicate that the link is up (carrier available), but not pass traffic over the @@ -1370,30 +1696,70 @@ relevant interface(s). Second, some switches may "bounce" the link state one or more times while a link is changing state. This occurs most commonly while the switch is initializing. Again, an appropriate updelay value may -help, but note that if all links are down, then updelay is ignored -when any link becomes active (the slave closest to completing its -updelay is chosen). +help. Note that when a bonding interface has no active links, the -driver will immediately reuse the first link that goes up, even if -updelay parameter was specified. If there are slave interfaces -waiting for the updelay timeout to expire, the interface that first -went into that state will be immediately reused. This reduces down -time of the network if the value of updelay has been overestimated. +driver will immediately reuse the first link that goes up, even if the +updelay parameter has been specified (the updelay is ignored in this +case). If there are slave interfaces waiting for the updelay timeout +to expire, the interface that first went into that state will be +immediately reused. This reduces down time of the network if the +value of updelay has been overestimated, and since this occurs only in +cases with no connectivity, there is no additional penalty for +ignoring the updelay. In addition to the concerns about switch timings, if your switches take a long time to go into backup mode, it may be desirable to not activate a backup interface immediately after a link goes down. Failover may be delayed via the downdelay bonding module option. -13. Hardware Specific Considerations +14.2 Duplicated Incoming Packets +-------------------------------- + + It is not uncommon to observe a short burst of duplicated +traffic when the bonding device is first used, or after it has been +idle for some period of time. This is most easily observed by issuing +a "ping" to some other host on the network, and noticing that the +output from ping flags duplicates (typically one per slave). + + For example, on a bond in active-backup mode with five slaves +all connected to one switch, the output may appear as follows: + +# ping -n 10.0.4.2 +PING 10.0.4.2 (10.0.4.2) from 10.0.3.10 : 56(84) bytes of data. +64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.7 ms +64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) +64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) +64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) +64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) +64 bytes from 10.0.4.2: icmp_seq=2 ttl=64 time=0.216 ms +64 bytes from 10.0.4.2: icmp_seq=3 ttl=64 time=0.267 ms +64 bytes from 10.0.4.2: icmp_seq=4 ttl=64 time=0.222 ms + + This is not due to an error in the bonding driver, rather, it +is a side effect of how many switches update their MAC forwarding +tables. Initially, the switch does not associate the MAC address in +the packet with a particular switch port, and so it may send the +traffic to all ports until its MAC forwarding table is updated. Since +the interfaces attached to the bond may occupy multiple ports on a +single switch, when the switch (temporarily) floods the traffic to all +ports, the bond device receives multiple copies of the same packet +(one per slave device). + + The duplicated packet behavior is switch dependent, some +switches exhibit this, and some do not. On switches that display this +behavior, it can be induced by clearing the MAC forwarding table (on +most Cisco switches, the privileged command "clear mac address-table +dynamic" will accomplish this). + +15. Hardware Specific Considerations ==================================== This section contains additional information for configuring bonding on specific hardware platforms, or for interfacing bonding with particular switches or other devices. -13.1 IBM BladeCenter +15.1 IBM BladeCenter -------------------- This applies to the JS20 and similar systems. @@ -1407,12 +1773,12 @@ JS20 network adapter information -------------------------------- All JS20s come with two Broadcom Gigabit Ethernet ports -integrated on the planar. In the BladeCenter chassis, the eth0 port -of all JS20 blades is hard wired to I/O Module #1; similarly, all eth1 -ports are wired to I/O Module #2. An add-on Broadcom daughter card -can be installed on a JS20 to provide two more Gigabit Ethernet ports. -These ports, eth2 and eth3, are wired to I/O Modules 3 and 4, -respectively. +integrated on the planar (that's "motherboard" in IBM-speak). In the +BladeCenter chassis, the eth0 port of all JS20 blades is hard wired to +I/O Module #1; similarly, all eth1 ports are wired to I/O Module #2. +An add-on Broadcom daughter card can be installed on a JS20 to provide +two more Gigabit Ethernet ports. These ports, eth2 and eth3, are +wired to I/O Modules 3 and 4, respectively. Each I/O Module may contain either a switch or a passthrough module (which allows ports to be directly connected to an external @@ -1432,29 +1798,30 @@ BladeCenter networking configuration of ways, this discussion will be confined to describing basic configurations. - Normally, Ethernet Switch Modules (ESM) are used in I/O + Normally, Ethernet Switch Modules (ESMs) are used in I/O modules 1 and 2. In this configuration, the eth0 and eth1 ports of a JS20 will be connected to different internal switches (in the respective I/O modules). - An optical passthru module (OPM) connects the I/O module -directly to an external switch. By using OPMs in I/O module #1 and -#2, the eth0 and eth1 interfaces of a JS20 can be redirected to the -outside world and connected to a common external switch. - - Depending upon the mix of ESM and OPM modules, the network -will appear to bonding as either a single switch topology (all OPM -modules) or as a multiple switch topology (one or more ESM modules, -zero or more OPM modules). It is also possible to connect ESM modules -together, resulting in a configuration much like the example in "High -Availability in a multiple switch topology." - -Requirements for specifc modes ------------------------------- - - The balance-rr mode requires the use of OPM modules for -devices in the bond, all connected to an common external switch. That -switch must be configured for "etherchannel" or "trunking" on the + A passthrough module (OPM or CPM, optical or copper, +passthrough module) connects the I/O module directly to an external +switch. By using PMs in I/O module #1 and #2, the eth0 and eth1 +interfaces of a JS20 can be redirected to the outside world and +connected to a common external switch. + + Depending upon the mix of ESMs and PMs, the network will +appear to bonding as either a single switch topology (all PMs) or as a +multiple switch topology (one or more ESMs, zero or more PMs). It is +also possible to connect ESMs together, resulting in a configuration +much like the example in "High Availability in a Multiple Switch +Topology," above. + +Requirements for specific modes +------------------------------- + + The balance-rr mode requires the use of passthrough modules +for devices in the bond, all connected to an common external switch. +That switch must be configured for "etherchannel" or "trunking" on the appropriate ports, as is usual for balance-rr. The balance-alb and balance-tlb modes will function with @@ -1484,17 +1851,18 @@ connected to the JS20 system. Other concerns -------------- - The Serial Over LAN link is established over the primary + The Serial Over LAN (SoL) link is established over the primary ethernet (eth0) only, therefore, any loss of link to eth0 will result in losing your SoL connection. It will not fail over with other -network traffic. +network traffic, as the SoL system is beyond the control of the +bonding driver. It may be desirable to disable spanning tree on the switch (either the internal Ethernet Switch Module, or an external switch) to -avoid fail-over delays issues when using bonding. +avoid fail-over delay issues when using bonding. -14. Frequently Asked Questions +16. Frequently Asked Questions ============================== 1. Is it SMP safe? @@ -1505,8 +1873,8 @@ The new driver was designed to be SMP safe from the start. 2. What type of cards will work with it? Any Ethernet type cards (you can even mix cards - a Intel -EtherExpress PRO/100 and a 3com 3c905b, for example). They need not -be of the same speed. +EtherExpress PRO/100 and a 3com 3c905b, for example). For most modes, +devices need not be of the same speed. 3. How many bonding devices can I have? @@ -1524,11 +1892,12 @@ system. disabled. The active-backup mode will fail over to a backup link, and other modes will ignore the failed link. The link will continue to be monitored, and should it recover, it will rejoin the bond (in whatever -manner is appropriate for the mode). See the section on High -Availability for additional information. +manner is appropriate for the mode). See the sections on High +Availability and the documentation for each mode for additional +information. Link monitoring can be enabled via either the miimon or -arp_interval paramters (described in the module paramters section, +arp_interval parameters (described in the module parameters section, above). In general, miimon monitors the carrier state as sensed by the underlying network device, and the arp monitor (arp_interval) monitors connectivity to another host on the local network. @@ -1536,7 +1905,7 @@ monitors connectivity to another host on the local network. If no link monitoring is configured, the bonding driver will be unable to detect link failures, and will assume that all links are always available. This will likely result in lost packets, and a -resulting degredation of performance. The precise performance loss +resulting degradation of performance. The precise performance loss depends upon the bonding mode and network configuration. 6. Can bonding be used for High Availability? @@ -1550,12 +1919,12 @@ depends upon the bonding mode and network configuration. In the basic balance modes (balance-rr and balance-xor), it works with any system that supports etherchannel (also called trunking). Most managed switches currently available have such -support, and many unmananged switches as well. +support, and many unmanaged switches as well. The advanced balance modes (balance-tlb and balance-alb) do not have special switch requirements, but do need device drivers that support specific features (described in the appropriate section under -module paramters, above). +module parameters, above). In 802.3ad mode, it works with with systems that support IEEE 802.3ad Dynamic Link Aggregation. Most managed and many unmanaged @@ -1565,17 +1934,19 @@ switches currently available support 802.3ad. 8. Where does a bonding device get its MAC address from? - If not explicitly configured with ifconfig, the MAC address of -the bonding device is taken from its first slave device. This MAC -address is then passed to all following slaves and remains persistent -(even if the the first slave is removed) until the bonding device is -brought down or reconfigured. + If not explicitly configured (with ifconfig or ip link), the +MAC address of the bonding device is taken from its first slave +device. This MAC address is then passed to all following slaves and +remains persistent (even if the the first slave is removed) until the +bonding device is brought down or reconfigured. If you wish to change the MAC address, you can set it with -ifconfig: +ifconfig or ip link: # ifconfig bond0 hw ether 00:11:22:33:44:55 +# ip link set bond0 address 66:77:88:99:aa:bb + The MAC address can be also changed by bringing down/up the device and then changing its slaves (or their order): @@ -1591,23 +1962,28 @@ from the bond (`ifenslave -d bond0 eth0'). The bonding driver will then restore the MAC addresses that the slaves had before they were enslaved. -15. Resources and Links +16. Resources and Links ======================= The latest version of the bonding driver can be found in the latest version of the linux kernel, found on http://kernel.org +The latest version of this document can be found in either the latest +kernel source (named Documentation/networking/bonding.txt), or on the +bonding sourceforge site: + +http://www.sourceforge.net/projects/bonding + Discussions regarding the bonding driver take place primarily on the bonding-devel mailing list, hosted at sourceforge.net. If you have -questions or problems, post them to the list. +questions or problems, post them to the list. The list address is: bonding-devel@lists.sourceforge.net -https://lists.sourceforge.net/lists/listinfo/bonding-devel - -There is also a project site on sourceforge. + The administrative interface (to subscribe or unsubscribe) can +be found at: -http://www.sourceforge.net/projects/bonding +https://lists.sourceforge.net/lists/listinfo/bonding-devel Donald Becker's Ethernet Drivers and diag programs may be found at : - http://www.scyld.com/network/ diff --git a/Documentation/networking/dmfe.txt b/Documentation/networking/dmfe.txt index c0e8398674ef..046363552d09 100644 --- a/Documentation/networking/dmfe.txt +++ b/Documentation/networking/dmfe.txt @@ -1,59 +1,65 @@ - dmfe.c: Version 1.28 01/18/2000 +Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux. - A Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux. - Copyright (C) 1997 Sten Wang +This program is free software; you can redistribute it and/or +modify it under the terms of the GNU General Public License +as published by the Free Software Foundation; either version 2 +of the License, or (at your option) any later version. - This program is free software; you can redistribute it and/or - modify it under the terms of the GNU General Public License - as published by the Free Software Foundation; either version 2 - of the License, or (at your option) any later version. +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. +This driver provides kernel support for Davicom DM9102(A)/DM9132/DM9801 ethernet cards ( CNET +10/100 ethernet cards uses Davicom chipset too, so this driver supports CNET cards too ).If you +didn't compile this driver as a module, it will automatically load itself on boot and print a +line similar to : - A. Compiler command: + dmfe: Davicom DM9xxx net driver, version 1.36.4 (2002-01-17) - A-1: For normal single or multiple processor kernel - "gcc -DMODULE -D__KERNEL__ -I/usr/src/linux/net/inet -Wall - -Wstrict-prototypes -O6 -c dmfe.c" +If you compiled this driver as a module, you have to load it on boot.You can load it with command : - A-2: For single or multiple processor with kernel module version function - "gcc -DMODULE -DMODVERSIONS -D__KERNEL__ -I/usr/src/linux/net/inet - -Wall -Wstrict-prototypes -O6 -c dmfe.c" + insmod dmfe +This way it will autodetect the device mode.This is the suggested way to load the module.Or you can pass +a mode= setting to module while loading, like : - B. The following steps teach you how to activate a DM9102 board: + insmod dmfe mode=0 # Force 10M Half Duplex + insmod dmfe mode=1 # Force 100M Half Duplex + insmod dmfe mode=4 # Force 10M Full Duplex + insmod dmfe mode=5 # Force 100M Full Duplex - 1. Used the upper compiler command to compile dmfe.c +Next you should configure your network interface with a command similar to : - 2. Insert dmfe module into kernel - "insmod dmfe" ;;Auto Detection Mode (Suggest) - "insmod dmfe mode=0" ;;Force 10M Half Duplex - "insmod dmfe mode=1" ;;Force 100M Half Duplex - "insmod dmfe mode=4" ;;Force 10M Full Duplex - "insmod dmfe mode=5" ;;Force 100M Full Duplex + ifconfig eth0 172.22.3.18 + ^^^^^^^^^^^ + Your IP Adress - 3. Config a dm9102 network interface - "ifconfig eth0 172.22.3.18" - ^^^^^^^^^^^ Your IP address +Then you may have to modify the default routing table with command : - 4. Activate the IP routing table. For some distributions, it is not - necessary. You can type "route" to check. + route add default eth0 - "route add default eth0" +Now your ethernet card should be up and running. - 5. Well done. Your DM9102 adapter is now activated. +TODO: - C. Object files description: - 1. dmfe_rh61.o: For Redhat 6.1 +Implement pci_driver::suspend() and pci_driver::resume() power management methods. +Check on 64 bit boxes. +Check and fix on big endian boxes. +Test and make sure PCI latency is now correct for all cases. - If you can make sure your kernel version, you can rename - to dmfe.o and directly use it without re-compiling. +Authors: - Author: Sten Wang, 886-3-5798797-8517, E-mail: sten_wang@davicom.com.tw +Sten Wang <sten_wang@davicom.com.tw > : Original Author +Tobias Ringstrom <tori@unhappy.mine.nu> : Current Maintainer + +Contributors: + +Marcelo Tosatti <marcelo@conectiva.com.br> +Alan Cox <alan@redhat.com> +Jeff Garzik <jgarzik@pobox.com> +Vojtech Pavlik <vojtech@suse.cz> diff --git a/Documentation/networking/fib_trie.txt b/Documentation/networking/fib_trie.txt new file mode 100644 index 000000000000..f50d0c673c57 --- /dev/null +++ b/Documentation/networking/fib_trie.txt @@ -0,0 +1,145 @@ + LC-trie implementation notes. + +Node types +---------- +leaf + An end node with data. This has a copy of the relevant key, along + with 'hlist' with routing table entries sorted by prefix length. + See struct leaf and struct leaf_info. + +trie node or tnode + An internal node, holding an array of child (leaf or tnode) pointers, + indexed through a subset of the key. See Level Compression. + +A few concepts explained +------------------------ +Bits (tnode) + The number of bits in the key segment used for indexing into the + child array - the "child index". See Level Compression. + +Pos (tnode) + The position (in the key) of the key segment used for indexing into + the child array. See Path Compression. + +Path Compression / skipped bits + Any given tnode is linked to from the child array of its parent, using + a segment of the key specified by the parent's "pos" and "bits" + In certain cases, this tnode's own "pos" will not be immediately + adjacent to the parent (pos+bits), but there will be some bits + in the key skipped over because they represent a single path with no + deviations. These "skipped bits" constitute Path Compression. + Note that the search algorithm will simply skip over these bits when + searching, making it necessary to save the keys in the leaves to + verify that they actually do match the key we are searching for. + +Level Compression / child arrays + the trie is kept level balanced moving, under certain conditions, the + children of a full child (see "full_children") up one level, so that + instead of a pure binary tree, each internal node ("tnode") may + contain an arbitrarily large array of links to several children. + Conversely, a tnode with a mostly empty child array (see empty_children) + may be "halved", having some of its children moved downwards one level, + in order to avoid ever-increasing child arrays. + +empty_children + the number of positions in the child array of a given tnode that are + NULL. + +full_children + the number of children of a given tnode that aren't path compressed. + (in other words, they aren't NULL or leaves and their "pos" is equal + to this tnode's "pos"+"bits"). + + (The word "full" here is used more in the sense of "complete" than + as the opposite of "empty", which might be a tad confusing.) + +Comments +--------- + +We have tried to keep the structure of the code as close to fib_hash as +possible to allow verification and help up reviewing. + +fib_find_node() + A good start for understanding this code. This function implements a + straightforward trie lookup. + +fib_insert_node() + Inserts a new leaf node in the trie. This is bit more complicated than + fib_find_node(). Inserting a new node means we might have to run the + level compression algorithm on part of the trie. + +trie_leaf_remove() + Looks up a key, deletes it and runs the level compression algorithm. + +trie_rebalance() + The key function for the dynamic trie after any change in the trie + it is run to optimize and reorganize. Tt will walk the trie upwards + towards the root from a given tnode, doing a resize() at each step + to implement level compression. + +resize() + Analyzes a tnode and optimizes the child array size by either inflating + or shrinking it repeatedly until it fullfills the criteria for optimal + level compression. This part follows the original paper pretty closely + and there may be some room for experimentation here. + +inflate() + Doubles the size of the child array within a tnode. Used by resize(). + +halve() + Halves the size of the child array within a tnode - the inverse of + inflate(). Used by resize(); + +fn_trie_insert(), fn_trie_delete(), fn_trie_select_default() + The route manipulation functions. Should conform pretty closely to the + corresponding functions in fib_hash. + +fn_trie_flush() + This walks the full trie (using nextleaf()) and searches for empty + leaves which have to be removed. + +fn_trie_dump() + Dumps the routing table ordered by prefix length. This is somewhat + slower than the corresponding fib_hash function, as we have to walk the + entire trie for each prefix length. In comparison, fib_hash is organized + as one "zone"/hash per prefix length. + +Locking +------- + +fib_lock is used for an RW-lock in the same way that this is done in fib_hash. +However, the functions are somewhat separated for other possible locking +scenarios. It might conceivably be possible to run trie_rebalance via RCU +to avoid read_lock in the fn_trie_lookup() function. + +Main lookup mechanism +--------------------- +fn_trie_lookup() is the main lookup function. + +The lookup is in its simplest form just like fib_find_node(). We descend the +trie, key segment by key segment, until we find a leaf. check_leaf() does +the fib_semantic_match in the leaf's sorted prefix hlist. + +If we find a match, we are done. + +If we don't find a match, we enter prefix matching mode. The prefix length, +starting out at the same as the key length, is reduced one step at a time, +and we backtrack upwards through the trie trying to find a longest matching +prefix. The goal is always to reach a leaf and get a positive result from the +fib_semantic_match mechanism. + +Inside each tnode, the search for longest matching prefix consists of searching +through the child array, chopping off (zeroing) the least significant "1" of +the child index until we find a match or the child index consists of nothing but +zeros. + +At this point we backtrack (t->stats.backtrack++) up the trie, continuing to +chop off part of the key in order to find the longest matching prefix. + +At this point we will repeatedly descend subtries to look for a match, and there +are some optimizations available that can provide us with "shortcuts" to avoid +descending into dead ends. Look for "HL_OPTIMIZE" sections in the code. + +To alleviate any doubts about the correctness of the route selection process, +a new netlink operation has been added. Look for NETLINK_FIB_LOOKUP, which +gives userland access to fib_lookup(). diff --git a/Documentation/networking/generic-hdlc.txt b/Documentation/networking/generic-hdlc.txt index 7d1dc6b884f3..31bc8b759b75 100644 --- a/Documentation/networking/generic-hdlc.txt +++ b/Documentation/networking/generic-hdlc.txt @@ -1,21 +1,21 @@ Generic HDLC layer Krzysztof Halasa <khc@pm.waw.pl> -January, 2003 Generic HDLC layer currently supports: -- Frame Relay (ANSI, CCITT and no LMI), with ARP support (no InARP). - Normal (routed) and Ethernet-bridged (Ethernet device emulation) - interfaces can share a single PVC. -- raw HDLC - either IP (IPv4) interface or Ethernet device emulation. -- Cisco HDLC, -- PPP (uses syncppp.c), -- X.25 (uses X.25 routines). - -There are hardware drivers for the following cards: -- C101 by Moxa Technologies Co., Ltd. -- RISCom/N2 by SDL Communications Inc. -- and others, some not in the official kernel. +1. Frame Relay (ANSI, CCITT, Cisco and no LMI). + - Normal (routed) and Ethernet-bridged (Ethernet device emulation) + interfaces can share a single PVC. + - ARP support (no InARP support in the kernel - there is an + experimental InARP user-space daemon available on: + http://www.kernel.org/pub/linux/utils/net/hdlc/). +2. raw HDLC - either IP (IPv4) interface or Ethernet device emulation. +3. Cisco HDLC. +4. PPP (uses syncppp.c). +5. X.25 (uses X.25 routines). + +Generic HDLC is a protocol driver only - it needs a low-level driver +for your particular hardware. Ethernet device emulation (using HDLC or Frame-Relay PVC) is compatible with IEEE 802.1Q (VLANs) and 802.1D (Ethernet bridging). @@ -24,7 +24,7 @@ with IEEE 802.1Q (VLANs) and 802.1D (Ethernet bridging). Make sure the hdlc.o and the hardware driver are loaded. It should create a number of "hdlc" (hdlc0 etc) network devices, one for each WAN port. You'll need the "sethdlc" utility, get it from: - http://hq.pm.waw.pl/hdlc/ + http://www.kernel.org/pub/linux/utils/net/hdlc/ Compile sethdlc.c utility: gcc -O2 -Wall -o sethdlc sethdlc.c @@ -52,12 +52,12 @@ Setting interface: * v35 | rs232 | x21 | t1 | e1 - sets physical interface for a given port if the card has software-selectable interfaces loopback - activate hardware loopback (for testing only) -* clock ext - external clock (uses DTE RX and TX clock) -* clock int - internal clock (provides clock signal on DCE clock output) -* clock txint - TX internal, RX external (provides TX clock on DCE output) -* clock txfromrx - TX clock derived from RX clock (TX clock on DCE output) -* rate - sets clock rate in bps (not required for external clock or - for txfromrx) +* clock ext - both RX clock and TX clock external +* clock int - both RX clock and TX clock internal +* clock txint - RX clock external, TX clock internal +* clock txfromrx - RX clock external, TX clock derived from RX clock +* rate - sets clock rate in bps (for "int" or "txint" clock only) + Setting protocol: @@ -79,7 +79,7 @@ Setting protocol: * x25 - sets X.25 mode * fr - Frame Relay mode - lmi ansi / ccitt / none - LMI (link management) type + lmi ansi / ccitt / cisco / none - LMI (link management) type dce - Frame Relay DCE (network) side LMI instead of default DTE (user). It has nothing to do with clocks! t391 - link integrity verification polling timer (in seconds) - user @@ -119,13 +119,14 @@ or -If you have a problem with N2 or C101 card, you can issue the "private" -command to see port's packet descriptor rings (in kernel logs): +If you have a problem with N2, C101 or PLX200SYN card, you can issue the +"private" command to see port's packet descriptor rings (in kernel logs): sethdlc hdlc0 private -The hardware driver has to be build with CONFIG_HDLC_DEBUG_RINGS. +The hardware driver has to be build with #define DEBUG_RINGS. Attaching this info to bug reports would be helpful. Anyway, let me know if you have problems using this. -For patches and other info look at http://hq.pm.waw.pl/hdlc/ +For patches and other info look at: +<http://www.kernel.org/pub/linux/utils/net/hdlc/>. diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index a2c893a7475d..ab65714d95fc 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -304,57 +304,6 @@ tcp_low_latency - BOOLEAN changed would be a Beowulf compute cluster. Default: 0 -tcp_westwood - BOOLEAN - Enable TCP Westwood+ congestion control algorithm. - TCP Westwood+ is a sender-side only modification of the TCP Reno - protocol stack that optimizes the performance of TCP congestion - control. It is based on end-to-end bandwidth estimation to set - congestion window and slow start threshold after a congestion - episode. Using this estimation, TCP Westwood+ adaptively sets a - slow start threshold and a congestion window which takes into - account the bandwidth used at the time congestion is experienced. - TCP Westwood+ significantly increases fairness wrt TCP Reno in - wired networks and throughput over wireless links. - Default: 0 - -tcp_vegas_cong_avoid - BOOLEAN - Enable TCP Vegas congestion avoidance algorithm. - TCP Vegas is a sender-side only change to TCP that anticipates - the onset of congestion by estimating the bandwidth. TCP Vegas - adjusts the sending rate by modifying the congestion - window. TCP Vegas should provide less packet loss, but it is - not as aggressive as TCP Reno. - Default:0 - -tcp_bic - BOOLEAN - Enable BIC TCP congestion control algorithm. - BIC-TCP is a sender-side only change that ensures a linear RTT - fairness under large windows while offering both scalability and - bounded TCP-friendliness. The protocol combines two schemes - called additive increase and binary search increase. When the - congestion window is large, additive increase with a large - increment ensures linear RTT fairness as well as good - scalability. Under small congestion windows, binary search - increase provides TCP friendliness. - Default: 0 - -tcp_bic_low_window - INTEGER - Sets the threshold window (in packets) where BIC TCP starts to - adjust the congestion window. Below this threshold BIC TCP behaves - the same as the default TCP Reno. - Default: 14 - -tcp_bic_fast_convergence - BOOLEAN - Forces BIC TCP to more quickly respond to changes in congestion - window. Allows two flows sharing the same connection to converge - more rapidly. - Default: 1 - -tcp_default_win_scale - INTEGER - Sets the minimum window scale TCP will negotiate for on all - conections. - Default: 7 - tcp_tso_win_divisor - INTEGER This allows control over what percentage of the congestion window can be consumed by a single TSO frame. @@ -368,6 +317,11 @@ tcp_frto - BOOLEAN where packet loss is typically due to random radio interference rather than intermediate router congestion. +tcp_congestion_control - STRING + Set the congestion control algorithm to be used for new + connections. The algorithm "reno" is always available, but + additional choices may be available based on kernel configuration. + somaxconn - INTEGER Limit of socket listen() backlog, known in userspace as SOMAXCONN. Defaults to 128. See also tcp_max_syn_backlog for additional tuning diff --git a/Documentation/networking/multicast.txt b/Documentation/networking/multicast.txt index 5049a64313d1..b06c8c69266f 100644 --- a/Documentation/networking/multicast.txt +++ b/Documentation/networking/multicast.txt @@ -47,7 +47,6 @@ ni52 <------------------ Buggy ------------------> ni65 YES YES YES Software(#) seeq NO NO NO N/A sgiseek <------------------ Buggy ------------------> -sk_g16 NO NO YES N/A smc-ultra YES YES YES Hardware sunlance YES YES YES Hardware tulip YES YES YES Hardware diff --git a/Documentation/networking/net-modules.txt b/Documentation/networking/net-modules.txt index 3830a83513d2..0b27863f155c 100644 --- a/Documentation/networking/net-modules.txt +++ b/Documentation/networking/net-modules.txt @@ -284,9 +284,6 @@ ppp.c: seeq8005.c: *Not modularized* (Probes ports: 0x300, 0x320, 0x340, 0x360) -sk_g16.c: *Not modularized* - (Probes ports: 0x100, 0x180, 0x208, 0x220m 0x288, 0x320, 0x328, 0x390) - skeleton.c: *Skeleton* slhc.c: diff --git a/Documentation/networking/tcp.txt b/Documentation/networking/tcp.txt index 71749007091e..0fa300425575 100644 --- a/Documentation/networking/tcp.txt +++ b/Documentation/networking/tcp.txt @@ -1,5 +1,72 @@ -How the new TCP output machine [nyi] works. +TCP protocol +============ + +Last updated: 21 June 2005 + +Contents +======== + +- Congestion control +- How the new TCP output machine [nyi] works + +Congestion control +================== + +The following variables are used in the tcp_sock for congestion control: +snd_cwnd The size of the congestion window +snd_ssthresh Slow start threshold. We are in slow start if + snd_cwnd is less than this. +snd_cwnd_cnt A counter used to slow down the rate of increase + once we exceed slow start threshold. +snd_cwnd_clamp This is the maximum size that snd_cwnd can grow to. +snd_cwnd_stamp Timestamp for when congestion window last validated. +snd_cwnd_used Used as a highwater mark for how much of the + congestion window is in use. It is used to adjust + snd_cwnd down when the link is limited by the + application rather than the network. + +As of 2.6.13, Linux supports pluggable congestion control algorithms. +A congestion control mechanism can be registered through functions in +tcp_cong.c. The functions used by the congestion control mechanism are +registered via passing a tcp_congestion_ops struct to +tcp_register_congestion_control. As a minimum name, ssthresh, +cong_avoid, min_cwnd must be valid. +Private data for a congestion control mechanism is stored in tp->ca_priv. +tcp_ca(tp) returns a pointer to this space. This is preallocated space - it +is important to check the size of your private data will fit this space, or +alternatively space could be allocated elsewhere and a pointer to it could +be stored here. + +There are three kinds of congestion control algorithms currently: The +simplest ones are derived from TCP reno (highspeed, scalable) and just +provide an alternative the congestion window calculation. More complex +ones like BIC try to look at other events to provide better +heuristics. There are also round trip time based algorithms like +Vegas and Westwood+. + +Good TCP congestion control is a complex problem because the algorithm +needs to maintain fairness and performance. Please review current +research and RFC's before developing new modules. + +The method that is used to determine which congestion control mechanism is +determined by the setting of the sysctl net.ipv4.tcp_congestion_control. +The default congestion control will be the last one registered (LIFO); +so if you built everything as modules. the default will be reno. If you +build with the default's from Kconfig, then BIC will be builtin (not a module) +and it will end up the default. + +If you really want a particular default value then you will need +to set it with the sysctl. If you use a sysctl, the module will be autoloaded +if needed and you will get the expected protocol. If you ask for an +unknown congestion method, then the sysctl attempt will fail. + +If you remove a tcp congestion control module, then you will get the next +available one. Since reno can not be built as a module, and can not be +deleted, it will always be available. + +How the new TCP output machine [nyi] works. +=========================================== Data is kept on a single queue. The skb->users flag tells us if the frame is one that has been queued already. To add a frame we throw it on the end. Ack diff --git a/Documentation/networking/vortex.txt b/Documentation/networking/vortex.txt index fa12a9e4abdd..80e1cb19609f 100644 --- a/Documentation/networking/vortex.txt +++ b/Documentation/networking/vortex.txt @@ -12,7 +12,7 @@ Don is no longer the prime maintainer of this version of the driver. Please report problems to one or more of: Andrew Morton <andrewm@uow.edu.au> - Netdev mailing list <netdev@oss.sgi.com> + Netdev mailing list <netdev@vger.kernel.org> Linux kernel mailing list <linux-kernel@vger.kernel.org> Please note the 'Reporting and Diagnosing Problems' section at the end diff --git a/Documentation/networking/wanpipe.txt b/Documentation/networking/wanpipe.txt deleted file mode 100644 index aea20cd2a56e..000000000000 --- a/Documentation/networking/wanpipe.txt +++ /dev/null @@ -1,622 +0,0 @@ ------------------------------------------------------------------------------- -Linux WAN Router Utilities Package ------------------------------------------------------------------------------- -Version 2.2.1 -Mar 28, 2001 -Author: Nenad Corbic <ncorbic@sangoma.com> -Copyright (c) 1995-2001 Sangoma Technologies Inc. ------------------------------------------------------------------------------- - -INTRODUCTION - -Wide Area Networks (WANs) are used to interconnect Local Area Networks (LANs) -and/or stand-alone hosts over vast distances with data transfer rates -significantly higher than those achievable with commonly used dial-up -connections. - -Usually an external device called `WAN router' sitting on your local network -or connected to your machine's serial port provides physical connection to -WAN. Although router's job may be as simple as taking your local network -traffic, converting it to WAN format and piping it through the WAN link, these -devices are notoriously expensive, with prices as much as 2 - 5 times higher -then the price of a typical PC box. - -Alternatively, considering robustness and multitasking capabilities of Linux, -an internal router can be built (most routers use some sort of stripped down -Unix-like operating system anyway). With a number of relatively inexpensive WAN -interface cards available on the market, a perfectly usable router can be -built for less than half a price of an external router. Yet a Linux box -acting as a router can still be used for other purposes, such as fire-walling, -running FTP, WWW or DNS server, etc. - -This kernel module introduces the notion of a WAN Link Driver (WLD) to Linux -operating system and provides generic hardware-independent services for such -drivers. Why can existing Linux network device interface not be used for -this purpose? Well, it can. However, there are a few key differences between -a typical network interface (e.g. Ethernet) and a WAN link. - -Many WAN protocols, such as X.25 and frame relay, allow for multiple logical -connections (known as `virtual circuits' in X.25 terminology) over a single -physical link. Each such virtual circuit may (and almost always does) lead -to a different geographical location and, therefore, different network. As a -result, it is the virtual circuit, not the physical link, that represents a -route and, therefore, a network interface in Linux terms. - -To further complicate things, virtual circuits are usually volatile in nature -(excluding so called `permanent' virtual circuits or PVCs). With almost no -time required to set up and tear down a virtual circuit, it is highly desirable -to implement on-demand connections in order to minimize network charges. So -unlike a typical network driver, the WAN driver must be able to handle multiple -network interfaces and cope as multiple virtual circuits come into existence -and go away dynamically. - -Last, but not least, WAN configuration is much more complex than that of say -Ethernet and may well amount to several dozens of parameters. Some of them -are "link-wide" while others are virtual circuit-specific. The same holds -true for WAN statistics which is by far more extensive and extremely useful -when troubleshooting WAN connections. Extending the ifconfig utility to suit -these needs may be possible, but does not seem quite reasonable. Therefore, a -WAN configuration utility and corresponding application programmer's interface -is needed for this purpose. - -Most of these problems are taken care of by this module. Its goal is to -provide a user with more-or-less standard look and feel for all WAN devices and -assist a WAN device driver writer by providing common services, such as: - - o User-level interface via /proc file system - o Centralized configuration - o Device management (setup, shutdown, etc.) - o Network interface management (dynamic creation/destruction) - o Protocol encapsulation/decapsulation - -To ba able to use the Linux WAN Router you will also need a WAN Tools package -available from - - ftp.sangoma.com/pub/linux/current_wanpipe/wanpipe-X.Y.Z.tgz - -where vX.Y.Z represent the wanpipe version number. - -For technical questions and/or comments please e-mail to ncorbic@sangoma.com. -For general inquiries please contact Sangoma Technologies Inc. by - - Hotline: 1-800-388-2475 (USA and Canada, toll free) - Phone: (905) 474-1990 ext: 106 - Fax: (905) 474-9223 - E-mail: dm@sangoma.com (David Mandelstam) - WWW: http://www.sangoma.com - - -INSTALLATION - -Please read the WanpipeForLinux.pdf manual on how to -install the WANPIPE tools and drivers properly. - - -After installing wanpipe package: /usr/local/wanrouter/doc. -On the ftp.sangoma.com : /linux/current_wanpipe/doc - - -COPYRIGHT AND LICENSING INFORMATION - -This program is free software; you can redistribute it and/or modify it under -the terms of the GNU General Public License as published by the Free Software -Foundation; either version 2, or (at your option) any later version. - -This program is distributed in the hope that it will be useful, but WITHOUT -ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS -FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. - -You should have received a copy of the GNU General Public License along with -this program; if not, write to the Free Software Foundation, Inc., 675 Mass -Ave, Cambridge, MA 02139, USA. - - - -ACKNOWLEDGEMENTS - -This product is based on the WANPIPE(tm) Multiprotocol WAN Router developed -by Sangoma Technologies Inc. for Linux 2.0.x and 2.2.x. Success of the WANPIPE -together with the next major release of Linux kernel in summer 1996 commanded -adequate changes to the WANPIPE code to take full advantage of new Linux -features. - -Instead of continuing developing proprietary interface tied to Sangoma WAN -cards, we decided to separate all hardware-independent code into a separate -module and defined two levels of interfaces - one for user-level applications -and another for kernel-level WAN drivers. WANPIPE is now implemented as a -WAN driver compliant with the WAN Link Driver interface. Also a general -purpose WAN configuration utility and a set of shell scripts was developed to -support WAN router at the user level. - -Many useful ideas concerning hardware-independent interface implementation -were given by Mike McLagan <mike.mclagan@linux.org> and his implementation -of the Frame Relay router and drivers for Sangoma cards (dlci/sdla). - -With the new implementation of the APIs being incorporated into the WANPIPE, -a special thank goes to Alan Cox in providing insight into BSD sockets. - -Special thanks to all the WANPIPE users who performed field-testing, reported -bugs and made valuable comments and suggestions that help us to improve this -product. - - - -NEW IN THIS RELEASE - - o Updated the WANCFG utility - Calls the pppconfig to configure the PPPD - for async connections. - - o Added the PPPCONFIG utility - Used to configure the PPPD dameon for the - WANPIPE Async PPP and standard serial port. - The wancfg calls the pppconfig to configure - the pppd. - - o Fixed the PCI autodetect feature. - The SLOT 0 was used as an autodetect option - however, some high end PC's slot numbers start - from 0. - - o This release has been tested with the new backupd - daemon release. - - -PRODUCT COMPONENTS AND RELATED FILES - -/etc: (or user defined) - wanpipe1.conf default router configuration file - -/lib/modules/X.Y.Z/misc: - wanrouter.o router kernel loadable module - af_wanpipe.o wanpipe api socket module - -/lib/modules/X.Y.Z/net: - sdladrv.o Sangoma SDLA support module - wanpipe.o Sangoma WANPIPE(tm) driver module - -/proc/net/wanrouter - Config reads current router configuration - Status reads current router status - {name} reads WAN driver statistics - -/usr/sbin: - wanrouter wanrouter start-up script - wanconfig wanrouter configuration utility - sdladump WANPIPE adapter memory dump utility - fpipemon Monitor for Frame Relay - cpipemon Monitor for Cisco HDLC - ppipemon Monitor for PPP - xpipemon Monitor for X25 - wpkbdmon WANPIPE keyboard led monitor/debugger - -/usr/local/wanrouter: - README this file - COPYING GNU General Public License - Setup installation script - Filelist distribution definition file - wanrouter.rc meta-configuration file - (used by the Setup and wanrouter script) - -/usr/local/wanrouter/doc: - wanpipeForLinux.pdf WAN Router User's Manual - -/usr/local/wanrouter/patches: - wanrouter-v2213.gz patch for Linux kernels 2.2.11 up to 2.2.13. - wanrouter-v2214.gz patch for Linux kernel 2.2.14. - wanrouter-v2215.gz patch for Linux kernels 2.2.15 to 2.2.17. - wanrouter-v2218.gz patch for Linux kernels 2.2.18 and up. - wanrouter-v240.gz patch for Linux kernel 2.4.0. - wanrouter-v242.gz patch for Linux kernel 2.4.2 and up. - wanrouter-v2034.gz patch for Linux kernel 2.0.34 - wanrouter-v2036.gz patch for Linux kernel 2.0.36 and up. - -/usr/local/wanrouter/patches/kdrivers: - Sources of the latest WANPIPE device drivers. - These are used to UPGRADE the linux kernel to the newest - version if the kernel source has already been pathced with - WANPIPE drivers. - -/usr/local/wanrouter/samples: - interface sample interface configuration file - wanpipe1.cpri CHDLC primary port - wanpipe2.csec CHDLC secondary port - wanpipe1.fr Frame Relay protocol - wanpipe1.ppp PPP protocol ) - wanpipe1.asy CHDLC ASYNC protocol - wanpipe1.x25 X25 protocol - wanpipe1.stty Sync TTY driver (Used by Kernel PPPD daemon) - wanpipe1.atty Async TTY driver (Used by Kernel PPPD daemon) - wanrouter.rc sample meta-configuration file - -/usr/local/wanrouter/util: - * wan-tools utilities source code - -/usr/local/wanrouter/api/x25: - * x25 api sample programs. -/usr/local/wanrouter/api/chdlc: - * chdlc api sample programs. -/usr/local/wanrouter/api/fr: - * fr api sample programs. -/usr/local/wanrouter/config/wancfg: - wancfg WANPIPE GUI configuration program. - Creates wanpipe#.conf files. -/usr/local/wanrouter/config/cfgft1: - cfgft1 GUI CSU/DSU configuration program. - -/usr/include/linux: - wanrouter.h router API definitions - wanpipe.h WANPIPE API definitions - sdladrv.h SDLA support module API definitions - sdlasfm.h SDLA firmware module definitions - if_wanpipe.h WANPIPE Socket definitions - if_wanpipe_common.h WANPIPE Socket/Driver common definitions. - sdlapci.h WANPIPE PCI definitions - - -/usr/src/linux/net/wanrouter: - * wanrouter source code - -/var/log: - wanrouter wanrouter start-up log (created by the Setup script) - -/var/lock: (or /var/lock/subsys for RedHat) - wanrouter wanrouter lock file (created by the Setup script) - -/usr/local/wanrouter/firmware: - fr514.sfm Frame relay firmware for Sangoma S508/S514 card - cdual514.sfm Dual Port Cisco HDLC firmware for Sangoma S508/S514 card - ppp514.sfm PPP Firmware for Sangoma S508 and S514 cards - x25_508.sfm X25 Firmware for Sangoma S508 card. - - -REVISION HISTORY - -1.0.0 December 31, 1996 Initial version - -1.0.1 January 30, 1997 Status and statistics can be read via /proc - filesystem entries. - -1.0.2 April 30, 1997 Added UDP management via monitors. - -1.0.3 June 3, 1997 UDP management for multiple boards using Frame - Relay and PPP - Enabled continuous transmission of Configure - Request Packet for PPP (for 508 only) - Connection Timeout for PPP changed from 900 to 0 - Flow Control Problem fixed for Frame Relay - -1.0.4 July 10, 1997 S508/FT1 monitoring capability in fpipemon and - ppipemon utilities. - Configurable TTL for UDP packets. - Multicast and Broadcast IP source addresses are - silently discarded. - -1.0.5 July 28, 1997 Configurable T391,T392,N391,N392,N393 for Frame - Relay in router.conf. - Configurable Memory Address through router.conf - for Frame Relay, PPP and X.25. (commenting this - out enables auto-detection). - Fixed freeing up received buffers using kfree() - for Frame Relay and X.25. - Protect sdla_peek() by calling save_flags(), - cli() and restore_flags(). - Changed number of Trace elements from 32 to 20 - Added DLCI specific data monitoring in FPIPEMON. -2.0.0 Nov 07, 1997 Implemented protection of RACE conditions by - critical flags for FRAME RELAY and PPP. - DLCI List interrupt mode implemented. - IPX support in FRAME RELAY and PPP. - IPX Server Support (MARS) - More driver specific stats included in FPIPEMON - and PIPEMON. - -2.0.1 Nov 28, 1997 Bug Fixes for version 2.0.0. - Protection of "enable_irq()" while - "disable_irq()" has been enabled from any other - routine (for Frame Relay, PPP and X25). - Added additional Stats for Fpipemon and Ppipemon - Improved Load Sharing for multiple boards - -2.0.2 Dec 09, 1997 Support for PAP and CHAP for ppp has been - implemented. - -2.0.3 Aug 15, 1998 New release supporting Cisco HDLC, CIR for Frame - relay, Dynamic IP assignment for PPP and Inverse - Arp support for Frame-relay. Man Pages are - included for better support and a new utility - for configuring FT1 cards. - -2.0.4 Dec 09, 1998 Dual Port support for Cisco HDLC. - Support for HDLC (LAPB) API. - Supports BiSync Streaming code for S502E - and S503 cards. - Support for Streaming HDLC API. - Provides a BSD socket interface for - creating applications using BiSync - streaming. - -2.0.5 Aug 04, 1999 CHDLC initializatin bug fix. - PPP interrupt driven driver: - Fix to the PPP line hangup problem. - New PPP firmware - Added comments to the startup SYSTEM ERROR messages - Xpipemon debugging application for the X25 protocol - New USER_MANUAL.txt - Fixed the odd boundary 4byte writes to the board. - BiSync Streaming code has been taken out. - Available as a patch. - Streaming HDLC API has been taken out. - Available as a patch. - -2.0.6 Aug 17, 1999 Increased debugging in statup scripts - Fixed insallation bugs from 2.0.5 - Kernel patch works for both 2.2.10 and 2.2.11 kernels. - There is no functional difference between the two packages - -2.0.7 Aug 26, 1999 o Merged X25API code into WANPIPE. - o Fixed a memeory leak for X25API - o Updated the X25API code for 2.2.X kernels. - o Improved NEM handling. - -2.1.0 Oct 25, 1999 o New code for S514 PCI Card - o New CHDLC and Frame Relay drivers - o PPP and X25 are not supported in this release - -2.1.1 Nov 30, 1999 o PPP support for S514 PCI Cards - -2.1.3 Apr 06, 2000 o Socket based x25api - o Socket based chdlc api - o Socket based fr api - o Dual Port Receive only CHDLC support. - o Asynchronous CHDLC support (Secondary Port) - o cfgft1 GUI csu/dsu configurator - o wancfg GUI configuration file - configurator. - o Architectual directory changes. - -beta-2.1.4 Jul 2000 o Dynamic interface configuration: - Network interfaces reflect the state - of protocol layer. If the protocol becomes - disconnected, driver will bring down - the interface. Once the protocol reconnects - the interface will be brought up. - - Note: This option is turned off by default. - - o Dynamic wanrouter setup using 'wanconfig': - wanconfig utility can be used to - shutdown,restart,start or reconfigure - a virtual circuit dynamically. - - Frame Relay: Each DLCI can be: - created,stopped,restarted and reconfigured - dynamically using wanconfig. - - ex: wanconfig card wanpipe1 dev wp1_fr16 up - - o Wanrouter startup via command line arguments: - wanconfig also supports wanrouter startup via command line - arguments. Thus, there is no need to create a wanpipe#.conf - configuration file. - - o Socket based x25api update/bug fixes. - Added support for LCN numbers greater than 255. - Option to pass up modem messages. - Provided a PCI IRQ check, so a single S514 - card is guaranteed to have a non-sharing interrupt. - - o Fixes to the wancfg utility. - o New FT1 debugging support via *pipemon utilities. - o Frame Relay ARP support Enabled. - -beta3-2.1.4 Jul 2000 o X25 M_BIT Problem fix. - o Added the Multi-Port PPP - Updated utilites for the Multi-Port PPP. - -2.1.4 Aut 2000 - o In X25API: - Maximum packet an application can send - to the driver has been extended to 4096 bytes. - - Fixed the x25 startup bug. Enable - communications only after all interfaces - come up. HIGH SVC/PVC is used to calculate - the number of channels. - Enable protocol only after all interfaces - are enabled. - - o Added an extra state to the FT1 config, kernel module. - o Updated the pipemon debuggers. - - o Blocked the Multi-Port PPP from running on kernels - 2.2.16 or greater, due to syncppp kernel module - change. - -beta1-2.1.5 Nov 15 2000 - o Fixed the MulitPort PPP Support for kernels 2.2.16 and above. - 2.2.X kernels only - - o Secured the driver UDP debugging calls - - All illegal netowrk debugging calls are reported to - the log. - - Defined a set of allowed commands, all other denied. - - o Cpipemon - - Added set FT1 commands to the cpipemon. Thus CSU/DSU - configuraiton can be performed using cpipemon. - All systems that cannot run cfgft1 GUI utility should - use cpipemon to configure the on board CSU/DSU. - - - o Keyboard Led Monitor/Debugger - - A new utilty /usr/sbin/wpkbdmon uses keyboard leds - to convey operatinal statistic information of the - Sangoma WANPIPE cards. - NUM_LOCK = Line State (On=connected, Off=disconnected) - CAPS_LOCK = Tx data (On=transmitting, Off=no tx data) - SCROLL_LOCK = Rx data (On=receiving, Off=no rx data - - o Hardware probe on module load and dynamic device allocation - - During WANPIPE module load, all Sangoma cards are probed - and found information is printed in the /var/log/messages. - - If no cards are found, the module load fails. - - Appropriate number of devices are dynamically loaded - based on the number of Sangoma cards found. - - Note: The kernel configuraiton option - CONFIG_WANPIPE_CARDS has been taken out. - - o Fixed the Frame Relay and Chdlc network interfaces so they are - compatible with libpcap libraries. Meaning, tcpdump, snort, - ethereal, and all other packet sniffers and debuggers work on - all WANPIPE netowrk interfaces. - - Set the network interface encoding type to ARPHRD_PPP. - This tell the sniffers that data obtained from the - network interface is in pure IP format. - Fix for 2.2.X kernels only. - - o True interface encoding option for Frame Relay and CHDLC - - The above fix sets the network interface encoding - type to ARPHRD_PPP, however some customers use - the encoding interface type to determine the - protocol running. Therefore, the TURE ENCODING - option will set the interface type back to the - original value. - - NOTE: If this option is used with Frame Relay and CHDLC - libpcap library support will be broken. - i.e. tcpdump will not work. - Fix for 2.2.x Kernels only. - - o Ethernet Bridgind over Frame Relay - - The Frame Relay bridging has been developed by - Kristian Hoffmann and Mark Wells. - - The Linux kernel bridge is used to send ethernet - data over the frame relay links. - For 2.2.X Kernels only. - - o Added extensive 2.0.X support. Most new features of - 2.1.5 for protocols Frame Relay, PPP and CHDLC are - supported under 2.0.X kernels. - -beta1-2.2.0 Dec 30 2000 - o Updated drivers for 2.4.X kernels. - o Updated drivers for SMP support. - o X25API is now able to share PCI interrupts. - o Took out a general polling routine that was used - only by X25API. - o Added appropriate locks to the dynamic reconfiguration - code. - o Fixed a bug in the keyboard debug monitor. - -beta2-2.2.0 Jan 8 2001 - o Patches for 2.4.0 kernel - o Patches for 2.2.18 kernel - o Minor updates to PPP and CHLDC drivers. - Note: No functinal difference. - -beta3-2.2.9 Jan 10 2001 - o I missed the 2.2.18 kernel patches in beta2-2.2.0 - release. They are included in this release. - -Stable Release -2.2.0 Feb 01 2001 - o Bug fix in wancfg GUI configurator. - The edit function didn't work properly. - - -bata1-2.2.1 Feb 09 2001 - o WANPIPE TTY Driver emulation. - Two modes of operation Sync and Async. - Sync: Using the PPPD daemon, kernel SyncPPP layer - and the Wanpipe sync TTY driver: a PPP protocol - connection can be established via Sangoma adapter, over - a T1 leased line. - - The 2.4.0 kernel PPP layer supports MULTILINK - protocol, that can be used to bundle any number of Sangoma - adapters (T1 lines) into one, under a single IP address. - Thus, efficiently obtaining multiple T1 throughput. - - NOTE: The remote side must also implement MULTILINK PPP - protocol. - - Async:Using the PPPD daemon, kernel AsyncPPP layer - and the WANPIPE async TTY driver: a PPP protocol - connection can be established via Sangoma adapter and - a modem, over a telephone line. - - Thus, the WANPIPE async TTY driver simulates a serial - TTY driver that would normally be used to interface the - MODEM to the linux kernel. - - o WANPIPE PPP Backup Utility - This utility will monitor the state of the PPP T1 line. - In case of failure, a dial up connection will be established - via pppd daemon, ether via a serial tty driver (serial port), - or a WANPIPE async TTY driver (in case serial port is unavailable). - - Furthermore, while in dial up mode, the primary PPP T1 link - will be monitored for signs of life. - - If the PPP T1 link comes back to life, the dial up connection - will be shutdown and T1 line re-established. - - - o New Setup installation script. - Option to UPGRADE device drivers if the kernel source has - already been patched with WANPIPE. - - Option to COMPILE WANPIPE modules against the currently - running kernel, thus no need for manual kernel and module - re-compilatin. - - o Updates and Bug Fixes to wancfg utility. - -bata2-2.2.1 Feb 20 2001 - - o Bug fixes to the CHDLC device drivers. - The driver had compilation problems under kernels - 2.2.14 or lower. - - o Bug fixes to the Setup installation script. - The device drivers compilation options didn't work - properly. - - o Update to the wpbackupd daemon. - Optimized the cross-over times, between the primary - link and the backup dialup. - -beta3-2.2.1 Mar 02 2001 - o Patches for 2.4.2 kernel. - - o Bug fixes to util/ make files. - o Bug fixes to the Setup installation script. - - o Took out the backupd support and made it into - as separate package. - -beta4-2.2.1 Mar 12 2001 - - o Fix to the Frame Relay Device driver. - IPSAC sends a packet of zero length - header to the frame relay driver. The - driver tries to push its own 2 byte header - into the packet, which causes the driver to - crash. - - o Fix the WANPIPE re-configuration code. - Bug was found by trying to run the cfgft1 while the - interface was already running. - - o Updates to cfgft1. - Writes a wanpipe#.cfgft1 configuration file - once the CSU/DSU is configured. This file can - holds the current CSU/DSU configuration. - - - ->>>>>> END OF README <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< - - diff --git a/Documentation/pci.txt b/Documentation/pci.txt index 62b1dc5d97e2..76d28d033657 100644 --- a/Documentation/pci.txt +++ b/Documentation/pci.txt @@ -266,20 +266,6 @@ port an old driver to the new PCI interface. They are no longer present in the kernel as they aren't compatible with hotplug or PCI domains or having sane locking. -pcibios_present() and Since ages, you don't need to test presence -pci_present() of PCI subsystem when trying to talk to it. - If it's not there, the list of PCI devices - is empty and all functions for searching for - devices just return NULL. -pcibios_(read|write)_* Superseded by their pci_(read|write)_* - counterparts. -pcibios_find_* Superseded by their pci_get_* counterparts. -pci_for_each_dev() Superseded by pci_get_device() -pci_for_each_dev_reverse() Superseded by pci_find_device_reverse() -pci_for_each_bus() Superseded by pci_find_next_bus() pci_find_device() Superseded by pci_get_device() pci_find_subsys() Superseded by pci_get_subsys() pci_find_slot() Superseded by pci_get_slot() -pcibios_find_class() Superseded by pci_get_class() -pci_find_class() Superseded by pci_get_class() -pci_(read|write)_*_nodev() Superseded by pci_bus_(read|write)_*() diff --git a/Documentation/pcmcia/devicetable.txt b/Documentation/pcmcia/devicetable.txt new file mode 100644 index 000000000000..3351c0355143 --- /dev/null +++ b/Documentation/pcmcia/devicetable.txt @@ -0,0 +1,63 @@ +Matching of PCMCIA devices to drivers is done using one or more of the +following criteria: + +- manufactor ID +- card ID +- product ID strings _and_ hashes of these strings +- function ID +- device function (actual and pseudo) + +You should use the helpers in include/pcmcia/device_id.h for generating the +struct pcmcia_device_id[] entries which match devices to drivers. + +If you want to match product ID strings, you also need to pass the crc32 +hashes of the string to the macro, e.g. if you want to match the product ID +string 1, you need to use + +PCMCIA_DEVICE_PROD_ID1("some_string", 0x(hash_of_some_string)), + +If the hash is incorrect, the kernel will inform you about this in "dmesg" +upon module initialization, and tell you of the correct hash. + +You can determine the hash of the product ID strings by catting the file +"modalias" in the sysfs directory of the PCMCIA device. It generates a string +in the following form: +pcmcia:m0149cC1ABf06pfn00fn00pa725B842DpbF1EFEE84pc0877B627pd00000000 + +The hex value after "pa" is the hash of product ID string 1, after "pb" for +string 2 and so on. + +Alternatively, you can use this small tool to determine the crc32 hash. +simply pass the string you want to evaluate as argument to this program, +e.g. +$ ./crc32hash "Dual Speed" + +------------------------------------------------------------------------- +/* crc32hash.c - derived from linux/lib/crc32.c, GNU GPL v2 */ +#include <string.h> +#include <stdio.h> +#include <ctype.h> +#include <stdlib.h> + +unsigned int crc32(unsigned char const *p, unsigned int len) +{ + int i; + unsigned int crc = 0; + while (len--) { + crc ^= *p++; + for (i = 0; i < 8; i++) + crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0); + } + return crc; +} + +int main(int argc, char **argv) { + unsigned int result; + if (argc != 2) { + printf("no string passed as argument\n"); + return -1; + } + result = crc32(argv[1], strlen(argv[1])); + printf("0x%x\n", result); + return 0; +} diff --git a/Documentation/pcmcia/driver-changes.txt b/Documentation/pcmcia/driver-changes.txt new file mode 100644 index 000000000000..403e7b4dcdd4 --- /dev/null +++ b/Documentation/pcmcia/driver-changes.txt @@ -0,0 +1,67 @@ +This file details changes in 2.6 which affect PCMCIA card driver authors: + +* event handler initialization in struct pcmcia_driver (as of 2.6.13) + The event handler is notified of all events, and must be initialized + as the event() callback in the driver's struct pcmcia_driver. + +* pcmcia/version.h should not be used (as of 2.6.13) + This file will be removed eventually. + +* in-kernel device<->driver matching (as of 2.6.13) + PCMCIA devices and their correct drivers can now be matched in + kernelspace. See 'devicetable.txt' for details. + +* Device model integration (as of 2.6.11) + A struct pcmcia_device is registered with the device model core, + and can be used (e.g. for SET_NETDEV_DEV) by using + handle_to_dev(client_handle_t * handle). + +* Convert internal I/O port addresses to unsigned long (as of 2.6.11) + ioaddr_t should be replaced by kio_addr_t in PCMCIA card drivers. + +* irq_mask and irq_list parameters (as of 2.6.11) + The irq_mask and irq_list parameters should no longer be used in + PCMCIA card drivers. Instead, it is the job of the PCMCIA core to + determine which IRQ should be used. Therefore, link->irq.IRQInfo2 + is ignored. + +* client->PendingEvents is gone (as of 2.6.11) + client->PendingEvents is no longer available. + +* client->Attributes are gone (as of 2.6.11) + client->Attributes is unused, therefore it is removed from all + PCMCIA card drivers + +* core functions no longer available (as of 2.6.11) + The following functions have been removed from the kernel source + because they are unused by all in-kernel drivers, and no external + driver was reported to rely on them: + pcmcia_get_first_region() + pcmcia_get_next_region() + pcmcia_modify_window() + pcmcia_set_event_mask() + pcmcia_get_first_window() + pcmcia_get_next_window() + +* device list iteration upon module removal (as of 2.6.10) + It is no longer necessary to iterate on the driver's internal + client list and call the ->detach() function upon module removal. + +* Resource management. (as of 2.6.8) + Although the PCMCIA subsystem will allocate resources for cards, + it no longer marks these resources busy. This means that driver + authors are now responsible for claiming your resources as per + other drivers in Linux. You should use request_region() to mark + your IO regions in-use, and request_mem_region() to mark your + memory regions in-use. The name argument should be a pointer to + your driver name. Eg, for pcnet_cs, name should point to the + string "pcnet_cs". + +* CardServices is gone + CardServices() in 2.4 is just a big switch statement to call various + services. In 2.6, all of those entry points are exported and called + directly (except for pcmcia_report_error(), just use cs_error() instead). + +* struct pcmcia_driver + You need to use struct pcmcia_driver and pcmcia_{un,}register_driver + instead of {un,}register_pccard_driver diff --git a/Documentation/power/kernel_threads.txt b/Documentation/power/kernel_threads.txt index 60b548105edf..fb57784986b1 100644 --- a/Documentation/power/kernel_threads.txt +++ b/Documentation/power/kernel_threads.txt @@ -12,8 +12,7 @@ refrigerator. Code to do this looks like this: do { hub_events(); wait_event_interruptible(khubd_wait, !list_empty(&hub_event_list)); - if (current->flags & PF_FREEZE) - refrigerator(PF_FREEZE); + try_to_freeze(); } while (!signal_pending(current)); from drivers/usb/core/hub.c::hub_thread() diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt index 35b1a7dae342..6fc9d511fc39 100644 --- a/Documentation/power/pci.txt +++ b/Documentation/power/pci.txt @@ -291,6 +291,44 @@ a request to enable wake events from D3, two calls should be made to pci_enable_wake (one for both D3hot and D3cold). +A reference implementation +------------------------- +.suspend() +{ + /* driver specific operations */ + + /* Disable IRQ */ + free_irq(); + /* If using MSI */ + pci_disable_msi(); + + pci_save_state(); + pci_enable_wake(); + /* Disable IO/bus master/irq router */ + pci_disable_device(); + pci_set_power_state(pci_choose_state()); +} + +.resume() +{ + pci_set_power_state(PCI_D0); + pci_restore_state(); + /* device's irq possibly is changed, driver should take care */ + pci_enable_device(); + pci_set_master(); + + /* if using MSI, device's vector possibly is changed */ + pci_enable_msi(); + + request_irq(); + /* driver specific operations; */ +} + +This is a typical implementation. Drivers can slightly change the order +of the operations in the implementation, ignore some operations or add +more deriver specific operations in it, but drivers should do something like +this on the whole. + 5. Resources ~~~~~~~~~~~~ diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt index c7c3459fde43..7a6b78966459 100644 --- a/Documentation/power/swsusp.txt +++ b/Documentation/power/swsusp.txt @@ -164,11 +164,11 @@ place where the thread is safe to be frozen (no kernel semaphores should be held at that point and it must be safe to sleep there), and add: - if (current->flags & PF_FREEZE) - refrigerator(PF_FREEZE); + try_to_freeze(); If the thread is needed for writing the image to storage, you should -instead set the PF_NOFREEZE process flag when creating the thread. +instead set the PF_NOFREEZE process flag when creating the thread (and +be very carefull). Q: What is the difference between between "platform", "shutdown" and @@ -233,3 +233,81 @@ A: Try running cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null after resume. swapoff -a; swapon -a may also be usefull. + +Q: What happens to devices during swsusp? They seem to be resumed +during system suspend? + +A: That's correct. We need to resume them if we want to write image to +disk. Whole sequence goes like + + Suspend part + ~~~~~~~~~~~~ + running system, user asks for suspend-to-disk + + user processes are stopped + + suspend(PMSG_FREEZE): devices are frozen so that they don't interfere + with state snapshot + + state snapshot: copy of whole used memory is taken with interrupts disabled + + resume(): devices are woken up so that we can write image to swap + + write image to swap + + suspend(PMSG_SUSPEND): suspend devices so that we can power off + + turn the power off + + Resume part + ~~~~~~~~~~~ + (is actually pretty similar) + + running system, user asks for suspend-to-disk + + user processes are stopped (in common case there are none, but with resume-from-initrd, noone knows) + + read image from disk + + suspend(PMSG_FREEZE): devices are frozen so that they don't interfere + with image restoration + + image restoration: rewrite memory with image + + resume(): devices are woken up so that system can continue + + thaw all user processes + +Q: What is this 'Encrypt suspend image' for? + +A: First of all: it is not a replacement for dm-crypt encrypted swap. +It cannot protect your computer while it is suspended. Instead it does +protect from leaking sensitive data after resume from suspend. + +Think of the following: you suspend while an application is running +that keeps sensitive data in memory. The application itself prevents +the data from being swapped out. Suspend, however, must write these +data to swap to be able to resume later on. Without suspend encryption +your sensitive data are then stored in plaintext on disk. This means +that after resume your sensitive data are accessible to all +applications having direct access to the swap device which was used +for suspend. If you don't need swap after resume these data can remain +on disk virtually forever. Thus it can happen that your system gets +broken in weeks later and sensitive data which you thought were +encrypted and protected are retrieved and stolen from the swap device. +To prevent this situation you should use 'Encrypt suspend image'. + +During suspend a temporary key is created and this key is used to +encrypt the data written to disk. When, during resume, the data was +read back into memory the temporary key is destroyed which simply +means that all data written to disk during suspend are then +inaccessible so they can't be stolen later on. The only thing that +you must then take care of is that you call 'mkswap' for the swap +partition used for suspend as early as possible during regular +boot. This asserts that any temporary key from an oopsed suspend or +from a failed or aborted resume is erased from the swap device. + +As a rule of thumb use encrypted swap to protect your data while your +system is shut down or suspended. Additionally use the encrypted +suspend image to prevent sensitive data from being stolen after +resume. diff --git a/Documentation/power/video.txt b/Documentation/power/video.txt index 68734355d7cf..7a4a5036d123 100644 --- a/Documentation/power/video.txt +++ b/Documentation/power/video.txt @@ -83,8 +83,10 @@ Compaq Armada E500 - P3-700 none (1) (S1 also works OK) Compaq Evo N620c vga=normal, s3_bios (2) Dell 600m, ATI R250 Lf none (1), but needs xorg-x11-6.8.1.902-1 Dell D600, ATI RV250 vga=normal and X, or try vbestate (6) +Dell D610 vga=normal and X (possibly vbestate (6) too, but not tested) Dell Inspiron 4000 ??? (*) Dell Inspiron 500m ??? (*) +Dell Inspiron 510m ??? Dell Inspiron 600m ??? (*) Dell Inspiron 8200 ??? (*) Dell Inspiron 8500 ??? (*) @@ -115,6 +117,7 @@ IBM Thinkpad X40 Type 2371-7JG s3_bios,s3_mode (4) Medion MD4220 ??? (*) Samsung P35 vbetool needed (6) Sharp PC-AR10 (ATI rage) none (1) +Sony Vaio PCG-C1VRX/K s3_bios (2) Sony Vaio PCG-F403 ??? (*) Sony Vaio PCG-N505SN ??? (*) Sony Vaio vgn-s260 X or boot-radeon can init it (5) @@ -123,6 +126,7 @@ Toshiba Satellite 4030CDT s3_mode (3) Toshiba Satellite 4080XCDT s3_mode (3) Toshiba Satellite 4090XCDT ??? (*) Toshiba Satellite P10-554 s3_bios,s3_mode (4)(****) +Toshiba M30 (2) xor X with nvidia driver using internal AGP Uniwill 244IIO ??? (*) diff --git a/Documentation/power/video_extension.txt b/Documentation/power/video_extension.txt index 8e33d7c82c49..b2f9b1598ac2 100644 --- a/Documentation/power/video_extension.txt +++ b/Documentation/power/video_extension.txt @@ -1,13 +1,16 @@ -This driver implement the ACPI Extensions For Display Adapters -for integrated graphics devices on motherboard, as specified in -ACPI 2.0 Specification, Appendix B, allowing to perform some basic -control like defining the video POST device, retrieving EDID information -or to setup a video output, etc. Note that this is an ref. implementation only. -It may or may not work for your integrated video device. +ACPI video extensions +~~~~~~~~~~~~~~~~~~~~~ + +This driver implement the ACPI Extensions For Display Adapters for +integrated graphics devices on motherboard, as specified in ACPI 2.0 +Specification, Appendix B, allowing to perform some basic control like +defining the video POST device, retrieving EDID information or to +setup a video output, etc. Note that this is an ref. implementation +only. It may or may not work for your integrated video device. Interfaces exposed to userland through /proc/acpi/video: -VGA/info : display the supported video bus device capability like ,Video ROM, CRT/LCD/TV. +VGA/info : display the supported video bus device capability like Video ROM, CRT/LCD/TV. VGA/ROM : Used to get a copy of the display devices' ROM data (up to 4k). VGA/POST_info : Used to determine what options are implemented. VGA/POST : Used to get/set POST device. @@ -15,7 +18,7 @@ VGA/DOS : Used to get/set ownership of output switching: Please refer ACPI spec B.4.1 _DOS VGA/CRT : CRT output VGA/LCD : LCD output -VGA/TV : TV output +VGA/TVO : TV output VGA/*/brightness : Used to get/set brightness of output device Notify event through /proc/acpi/event: diff --git a/Documentation/s390/CommonIO b/Documentation/s390/CommonIO index a831d9ae5a5e..59d1166d41ee 100644 --- a/Documentation/s390/CommonIO +++ b/Documentation/s390/CommonIO @@ -30,7 +30,7 @@ Command line parameters device numbers (0xabcd or abcd, for 2.4 backward compatibility). You can use the 'all' keyword to ignore all devices. The '!' operator will cause the I/O-layer to _not_ ignore a device. - The order on the command line is not important. + The command line is parsed from left to right. For example, cio_ignore=0.0.0023-0.0.0042,0.0.4711 @@ -72,13 +72,14 @@ Command line parameters /proc/cio_ignore; "add <device range>, <device range>, ..." will ignore the specified devices. - Note: Already known devices cannot be ignored. + Note: While already known devices can be added to the list of devices to be + ignored, there will be no effect on then. However, if such a device + disappears and then reappeares, it will then be ignored. - For example, if device 0.0.abcd is already known and all other devices - 0.0.a000-0.0.afff are not known, + For example, "echo add 0.0.a000-0.0.accc, 0.0.af00-0.0.afff > /proc/cio_ignore" - will add 0.0.a000-0.0.abcc, 0.0.abce-0.0.accc and 0.0.af00-0.0.afff to the - list of ignored devices and skip 0.0.abcd. + will add 0.0.a000-0.0.accc and 0.0.af00-0.0.afff to the list of ignored + devices. The devices can be specified either by bus id (0.0.abcd) or, for 2.4 backward compatibilty, by the device number in hexadecimal (0xabcd or abcd). @@ -98,7 +99,8 @@ Command line parameters - /proc/s390dbf/cio_trace/hex_ascii Logs the calling of functions in the common I/O-layer and, if applicable, - which subchannel they were called for. + which subchannel they were called for, as well as dumps of some data + structures (like irb in an error case). The level of logging can be changed to be more or less verbose by piping to /proc/s390dbf/cio_*/level a number between 0 and 6; see the documentation on diff --git a/Documentation/s390/s390dbf.txt b/Documentation/s390/s390dbf.txt index 2d1cd939b4df..e24fdeada970 100644 --- a/Documentation/s390/s390dbf.txt +++ b/Documentation/s390/s390dbf.txt @@ -12,8 +12,8 @@ where log records can be stored efficiently in memory, where each component One purpose of this is to inspect the debug logs after a production system crash in order to analyze the reason for the crash. If the system still runs but only a subcomponent which uses dbf failes, -it is possible to look at the debug logs on a live system via the Linux proc -filesystem. +it is possible to look at the debug logs on a live system via the Linux +debugfs filesystem. The debug feature may also very useful for kernel and driver development. Design: @@ -52,16 +52,18 @@ Each debug entry contains the following data: - Flag, if entry is an exception or not The debug logs can be inspected in a live system through entries in -the proc-filesystem. Under the path /proc/s390dbf there is +the debugfs-filesystem. Under the toplevel directory "s390dbf" there is a directory for each registered component, which is named like the -corresponding component. +corresponding component. The debugfs normally should be mounted to +/sys/kernel/debug therefore the debug feature can be accessed unter +/sys/kernel/debug/s390dbf. The content of the directories are files which represent different views to the debug log. Each component can decide which views should be used through registering them with the function debug_register_view(). Predefined views for hex/ascii, sprintf and raw binary data are provided. It is also possible to define other views. The content of -a view can be inspected simply by reading the corresponding proc file. +a view can be inspected simply by reading the corresponding debugfs file. All debug logs have an an actual debug level (range from 0 to 6). The default level is 3. Event and Exception functions have a 'level' @@ -69,14 +71,14 @@ parameter. Only debug entries with a level that is lower or equal than the actual level are written to the log. This means, when writing events, high priority log entries should have a low level value whereas low priority entries should have a high one. -The actual debug level can be changed with the help of the proc-filesystem -through writing a number string "x" to the 'level' proc file which is +The actual debug level can be changed with the help of the debugfs-filesystem +through writing a number string "x" to the 'level' debugfs file which is provided for every debug log. Debugging can be switched off completely -by using "-" on the 'level' proc file. +by using "-" on the 'level' debugfs file. Example: -> echo "-" > /proc/s390dbf/dasd/level +> echo "-" > /sys/kernel/debug/s390dbf/dasd/level It is also possible to deactivate the debug feature globally for every debug log. You can change the behavior using 2 sysctl parameters in @@ -99,11 +101,11 @@ Kernel Interfaces: ------------------ ---------------------------------------------------------------------------- -debug_info_t *debug_register(char *name, int pages_index, int nr_areas, +debug_info_t *debug_register(char *name, int pages, int nr_areas, int buf_size); -Parameter: name: Name of debug log (e.g. used for proc entry) - pages_index: 2^pages_index pages will be allocated per area +Parameter: name: Name of debug log (e.g. used for debugfs entry) + pages: number of pages, which will be allocated per area nr_areas: number of debug areas buf_size: size of data area in each debug entry @@ -134,7 +136,7 @@ Return Value: none Description: Sets new actual debug level if new_level is valid. --------------------------------------------------------------------------- -+void debug_stop_all(void); +void debug_stop_all(void); Parameter: none @@ -270,7 +272,7 @@ Parameter: id: handle for debug log Return Value: 0 : ok < 0: Error -Description: registers new debug view and creates proc dir entry +Description: registers new debug view and creates debugfs dir entry --------------------------------------------------------------------------- int debug_unregister_view (debug_info_t * id, struct debug_view *view); @@ -281,7 +283,7 @@ Parameter: id: handle for debug log Return Value: 0 : ok < 0: Error -Description: unregisters debug view and removes proc dir entry +Description: unregisters debug view and removes debugfs dir entry @@ -308,7 +310,7 @@ static int init(void) { /* register 4 debug areas with one page each and 4 byte data field */ - debug_info = debug_register ("test", 0, 4, 4 ); + debug_info = debug_register ("test", 1, 4, 4 ); debug_register_view(debug_info,&debug_hex_ascii_view); debug_register_view(debug_info,&debug_raw_view); @@ -343,7 +345,7 @@ static int init(void) /* register 4 debug areas with one page each and data field for */ /* format string pointer + 2 varargs (= 3 * sizeof(long)) */ - debug_info = debug_register ("test", 0, 4, sizeof(long) * 3); + debug_info = debug_register ("test", 1, 4, sizeof(long) * 3); debug_register_view(debug_info,&debug_sprintf_view); debug_sprintf_event(debug_info, 2 , "first event in %s:%i\n",__FILE__,__LINE__); @@ -362,16 +364,16 @@ module_exit(cleanup); -ProcFS Interface +Debugfs Interface ---------------- Views to the debug logs can be investigated through reading the corresponding -proc-files: +debugfs-files: Example: -> ls /proc/s390dbf/dasd -flush hex_ascii level raw -> cat /proc/s390dbf/dasd/hex_ascii | sort +1 +> ls /sys/kernel/debug/s390dbf/dasd +flush hex_ascii level pages raw +> cat /sys/kernel/debug/s390dbf/dasd/hex_ascii | sort +1 00 00974733272:680099 2 - 02 0006ad7e 07 ea 4a 90 | .... 00 00974733272:682210 2 - 02 0006ade6 46 52 45 45 | FREE 00 00974733272:682213 2 - 02 0006adf6 07 ea 4a 90 | .... @@ -391,25 +393,36 @@ Changing the debug level Example: -> cat /proc/s390dbf/dasd/level +> cat /sys/kernel/debug/s390dbf/dasd/level 3 -> echo "5" > /proc/s390dbf/dasd/level -> cat /proc/s390dbf/dasd/level +> echo "5" > /sys/kernel/debug/s390dbf/dasd/level +> cat /sys/kernel/debug/s390dbf/dasd/level 5 Flushing debug areas -------------------- Debug areas can be flushed with piping the number of the desired -area (0...n) to the proc file "flush". When using "-" all debug areas +area (0...n) to the debugfs file "flush". When using "-" all debug areas are flushed. Examples: 1. Flush debug area 0: -> echo "0" > /proc/s390dbf/dasd/flush +> echo "0" > /sys/kernel/debug/s390dbf/dasd/flush 2. Flush all debug areas: -> echo "-" > /proc/s390dbf/dasd/flush +> echo "-" > /sys/kernel/debug/s390dbf/dasd/flush + +Changing the size of debug areas +------------------------------------ +It is possible the change the size of debug areas through piping +the number of pages to the debugfs file "pages". The resize request will +also flush the debug areas. + +Example: + +Define 4 pages for the debug areas of debug feature "dasd": +> echo "4" > /sys/kernel/debug/s390dbf/dasd/pages Stooping the debug feature -------------------------- @@ -491,7 +504,7 @@ Defining views -------------- Views are specified with the 'debug_view' structure. There are defined -callback functions which are used for reading and writing the proc files: +callback functions which are used for reading and writing the debugfs files: struct debug_view { char name[DEBUG_MAX_PROCF_LEN]; @@ -525,7 +538,7 @@ typedef int (debug_input_proc_t) (debug_info_t* id, The "private_data" member can be used as pointer to view specific data. It is not used by the debug feature itself. -The output when reading a debug-proc file is structured like this: +The output when reading a debugfs file is structured like this: "prolog_proc output" @@ -534,13 +547,13 @@ The output when reading a debug-proc file is structured like this: "header_proc output 3" "format_proc output 3" ... -When a view is read from the proc fs, the Debug Feature calls the +When a view is read from the debugfs, the Debug Feature calls the 'prolog_proc' once for writing the prolog. Then 'header_proc' and 'format_proc' are called for each existing debug entry. The input_proc can be used to implement functionality when it is written to -the view (e.g. like with 'echo "0" > /proc/s390dbf/dasd/level). +the view (e.g. like with 'echo "0" > /sys/kernel/debug/s390dbf/dasd/level). For header_proc there can be used the default function debug_dflt_header_fn() which is defined in in debug.h. @@ -602,7 +615,7 @@ debug_info = debug_register ("test", 0, 4, 4 )); debug_register_view(debug_info, &debug_test_view); for(i = 0; i < 10; i ++) debug_int_event(debug_info, 1, i); -> cat /proc/s390dbf/test/myview +> cat /sys/kernel/debug/s390dbf/test/myview 00 00964419734:611402 1 - 00 88042ca This error........... 00 00964419734:611405 1 - 00 88042ca That error........... 00 00964419734:611408 1 - 00 88042ca Problem.............. diff --git a/Documentation/scsi/ChangeLog.megaraid b/Documentation/scsi/ChangeLog.megaraid index a9356c63b800..5331d91432c7 100644 --- a/Documentation/scsi/ChangeLog.megaraid +++ b/Documentation/scsi/ChangeLog.megaraid @@ -1,3 +1,69 @@ +Release Date : Mon Mar 07 12:27:22 EST 2005 - Seokmann Ju <sju@lsil.com> +Current Version : 2.20.4.6 (scsi module), 2.20.2.6 (cmm module) +Older Version : 2.20.4.5 (scsi module), 2.20.2.5 (cmm module) + +1. Added IOCTL backward compatibility. + Convert megaraid_mm driver to new compat_ioctl entry points. + I don't have easy access to hardware, so only compile tested. + - Signed-off-by:Andi Kleen <ak@muc.de> + +2. megaraid_mbox fix: wrong order of arguments in memset() + That, BTW, shows why cross-builds are useful-the only indication of + problem had been a new warning showing up in sparse output on alpha + build (number of exceeding 256 got truncated). + - Signed-off-by: Al Viro + <viro@parcelfarce.linux.theplanet.co.uk> + +3. Convert pci_module_init to pci_register_driver + Convert from pci_module_init to pci_register_driver + (from:http://kerneljanitors.org/TODO) + - Signed-off-by: Domen Puncer <domen@coderock.org> + +4. Use the pre defined DMA mask constants from dma-mapping.h + Use the DMA_{64,32}BIT_MASK constants from dma-mapping.h when calling + pci_set_dma_mask() or pci_set_consistend_dma_mask(). See + http://marc.theaimsgroup.com/?t=108001993000001&r=1&w=2 for more + details. + Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch> + Signed-off-by: Domen Puncer <domen@coderock.org> + +5. Remove SSID checking for Dobson, Lindsay, and Verde based products. + Checking the SSVID/SSID for controllers which have Dobson, Lindsay, + and Verde is unnecessary because device ID has been assigned by LSI + and it is unique value. So, all controllers with these IOPs have to be + supported by the driver regardless SSVID/SSID. + +6. Date Thu, 27 Jan 2005 04:31:09 +0100 + From Herbert Poetzl <> + Subject RFC: assert_spin_locked() for 2.6 + + Greetings! + + overcautious programming will kill your kernel ;) + ever thought about checking a spin_lock or even + asserting that it must be held (maybe just for + spinlock debugging?) ... + + there are several checks present in the kernel + where somebody does a variation on the following: + + BUG_ON(!spin_is_locked(&some_lock)); + + so what's wrong about that? nothing, unless you + compile the code with CONFIG_DEBUG_SPINLOCK but + without CONFIG_SMP ... in which case the BUG() + will kill your kernel ... + + maybe it's not advised to make such assertions, + but here is a solution which works for me ... + (compile tested for sh, x86_64 and x86, boot/run + tested for x86 only) + + best, + Herbert + + - Herbert Poetzl <herbert@13thfloor.at>, Thu, 27 Jan 2005 + Release Date : Thu Feb 03 12:27:22 EST 2005 - Seokmann Ju <sju@lsil.com> Current Version : 2.20.4.5 (scsi module), 2.20.2.5 (cmm module) Older Version : 2.20.4.4 (scsi module), 2.20.2.4 (cmm module) diff --git a/Documentation/scsi/scsi-changer.txt b/Documentation/scsi/scsi-changer.txt new file mode 100644 index 000000000000..c132687b017a --- /dev/null +++ b/Documentation/scsi/scsi-changer.txt @@ -0,0 +1,180 @@ + +README for the SCSI media changer driver +======================================== + +This is a driver for SCSI Medium Changer devices, which are listed +with "Type: Medium Changer" in /proc/scsi/scsi. + +This is for *real* Jukeboxes. It is *not* supported to work with +common small CD-ROM changers, neither one-lun-per-slot SCSI changers +nor IDE drives. + +Userland tools available from here: + http://linux.bytesex.org/misc/changer.html + + +General Information +------------------- + +First some words about how changers work: A changer has 2 (possibly +more) SCSI ID's. One for the changer device which controls the robot, +and one for the device which actually reads and writes the data. The +later may be anything, a MOD, a CD-ROM, a tape or whatever. For the +changer device this is a "don't care", he *only* shuffles around the +media, nothing else. + + +The SCSI changer model is complex, compared to - for example - IDE-CD +changers. But it allows to handle nearly all possible cases. It knows +4 different types of changer elements: + + media transport - this one shuffles around the media, i.e. the + transport arm. Also known as "picker". + storage - a slot which can hold a media. + import/export - the same as above, but is accessable from outside, + i.e. there the operator (you !) can use this to + fill in and remove media from the changer. + Sometimes named "mailslot". + data transfer - this is the device which reads/writes, i.e. the + CD-ROM / Tape / whatever drive. + +None of these is limited to one: A huge Jukebox could have slots for +123 CD-ROM's, 5 CD-ROM readers (and therefore 6 SCSI ID's: the changer +and each CD-ROM) and 2 transport arms. No problem to handle. + + +How it is implemented +--------------------- + +I implemented the driver as character device driver with a NetBSD-like +ioctl interface. Just grabbed NetBSD's header file and one of the +other linux SCSI device drivers as starting point. The interface +should be source code compatible with NetBSD. So if there is any +software (anybody knows ???) which supports a BSDish changer driver, +it should work with this driver too. + +Over time a few more ioctls where added, volume tag support for example +wasn't covered by the NetBSD ioctl API. + + +Current State +------------- + +Support for more than one transport arm is not implemented yet (and +nobody asked for it so far...). + +I test and use the driver myself with a 35 slot cdrom jukebox from +Grundig. I got some reports telling it works ok with tape autoloaders +(Exabyte, HP and DEC). Some People use this driver with amanda. It +works fine with small (11 slots) and a huge (4 MOs, 88 slots) +magneto-optical Jukebox. Probably with lots of other changers too, most +(but not all :-) people mail me only if it does *not* work... + +I don't have any device lists, neither black-list nor white-list. Thus +it is quite useless to ask me whenever a specific device is supported or +not. In theory every changer device which supports the SCSI-2 media +changer command set should work out-of-the-box with this driver. If it +doesn't, it is a bug. Either within the driver or within the firmware +of the changer device. + + +Using it +-------- + +This is a character device with major number is 86, so use +"mknod /dev/sch0 c 86 0" to create the special file for the driver. + +If the module finds the changer, it prints some messages about the +device [ try "dmesg" if you don't see anything ] and should show up in +/proc/devices. If not.... some changers use ID ? / LUN 0 for the +device and ID ? / LUN 1 for the robot mechanism. But Linux does *not* +look for LUN's other than 0 as default, becauce there are to many +broken devices. So you can try: + + 1) echo "scsi add-single-device 0 0 ID 1" > /proc/scsi/scsi + (replace ID with the SCSI-ID of the device) + 2) boot the kernel with "max_scsi_luns=1" on the command line + (append="max_scsi_luns=1" in lilo.conf should do the trick) + + +Trouble? +-------- + +If you insmod the driver with "insmod debug=1", it will be verbose and +prints a lot of stuff to the syslog. Compiling the kernel with +CONFIG_SCSI_CONSTANTS=y improves the quality of the error messages alot +because the kernel will translate the error codes into human-readable +strings then. + +You can display these messages with the dmesg command (or check the +logfiles). If you email me some question becauce of a problem with the +driver, please include these messages. + + +Insmod options +-------------- + +debug=0/1 + Enable debug messages (see above, default: 0). + +verbose=0/1 + Be verbose (default: 1). + +init=0/1 + Send INITIALIZE ELEMENT STATUS command to the changer + at insmod time (default: 1). + +timeout_init=<seconds> + timeout for the INITIALIZE ELEMENT STATUS command + (default: 3600). + +timeout_move=<seconds> + timeout for all other commands (default: 120). + +dt_id=<id1>,<id2>,... +dt_lun=<lun1>,<lun2>,... + These two allow to specify the SCSI ID and LUN for the data + transfer elements. You likely don't need this as the jukebox + should provide this information. But some devices don't ... + +vendor_firsts= +vendor_counts= +vendor_labels= + These insmod options can be used to tell the driver that there + are some vendor-specific element types. Grundig for example + does this. Some jukeboxes have a printer to label fresh burned + CDs, which is addressed as element 0xc000 (type 5). To tell the + driver about this vendor-specific element, use this: + $ insmod ch \ + vendor_firsts=0xc000 \ + vendor_counts=1 \ + vendor_labels=printer + All three insmod options accept up to four comma-separated + values, this way you can configure the element types 5-8. + You likely need the SCSI specs for the device in question to + find the correct values as they are not covered by the SCSI-2 + standard. + + +Credits +------- + +I wrote this driver using the famous mailing-patches-around-the-world +method. With (more or less) help from: + + Daniel Moehwald <moehwald@hdg.de> + Dane Jasper <dane@sonic.net> + R. Scott Bailey <sbailey@dsddi.eds.com> + Jonathan Corbet <corbet@lwn.net> + +Special thanks go to + Martin Kuehne <martin.kuehne@bnbt.de> +for a old, second-hand (but full functional) cdrom jukebox which I use +to develop/test driver and tools now. + +Have fun, + + Gerd + +-- +Gerd Knorr <kraxel@bytesex.org> diff --git a/Documentation/scsi/scsi_mid_low_api.txt b/Documentation/scsi/scsi_mid_low_api.txt index e41703d7d24d..7536823c0cb1 100644 --- a/Documentation/scsi/scsi_mid_low_api.txt +++ b/Documentation/scsi/scsi_mid_low_api.txt @@ -388,7 +388,6 @@ Summary: scsi_remove_device - detach and remove a SCSI device scsi_remove_host - detach and remove all SCSI devices owned by host scsi_report_bus_reset - report scsi _bus_ reset observed - scsi_set_device - place device reference in host structure scsi_track_queue_full - track successive QUEUE_FULL events scsi_unblock_requests - allow further commands to be queued to given host scsi_unregister - [calls scsi_host_put()] @@ -741,20 +740,6 @@ void scsi_report_bus_reset(struct Scsi_Host * shost, int channel) /** - * scsi_set_device - place device reference in host structure - * @shost: a pointer to a scsi host instance - * @pdev: pointer to device instance to assign - * - * Returns nothing - * - * Might block: no - * - * Defined in: include/scsi/scsi_host.h . - **/ -void scsi_set_device(struct Scsi_Host * shost, struct device * dev) - - -/** * scsi_track_queue_full - track successive QUEUE_FULL events on given * device to determine if and when there is a need * to adjust the queue depth on the device. @@ -936,8 +921,7 @@ Details: * * Returns SUCCESS if command aborted else FAILED * - * Locks: struct Scsi_Host::host_lock held (with irqsave) on entry - * and assumed to be held on return. + * Locks: None held * * Calling context: kernel thread * @@ -955,8 +939,7 @@ Details: * * Returns SUCCESS if command aborted else FAILED * - * Locks: struct Scsi_Host::host_lock held (with irqsave) on entry - * and assumed to be held on return. + * Locks: None held * * Calling context: kernel thread * @@ -974,8 +957,7 @@ Details: * * Returns SUCCESS if command aborted else FAILED * - * Locks: struct Scsi_Host::host_lock held (with irqsave) on entry - * and assumed to be held on return. + * Locks: None held * * Calling context: kernel thread * @@ -993,8 +975,7 @@ Details: * * Returns SUCCESS if command aborted else FAILED * - * Locks: struct Scsi_Host::host_lock held (with irqsave) on entry - * and assumed to be held on return. + * Locks: None held * * Calling context: kernel thread * diff --git a/Documentation/serial/driver b/Documentation/serial/driver index e9c0178cd202..ac7eabbf662a 100644 --- a/Documentation/serial/driver +++ b/Documentation/serial/driver @@ -107,8 +107,8 @@ hardware. indicate that the signal is permanently active. If RI is not available, the signal should not be indicated as active. - Locking: none. - Interrupts: caller dependent. + Locking: port->lock taken. + Interrupts: locally disabled. This call must not sleep stop_tx(port,tty_stop) diff --git a/Documentation/sgi-ioc4.txt b/Documentation/sgi-ioc4.txt new file mode 100644 index 000000000000..876c96ae38db --- /dev/null +++ b/Documentation/sgi-ioc4.txt @@ -0,0 +1,45 @@ +The SGI IOC4 PCI device is a bit of a strange beast, so some notes on +it are in order. + +First, even though the IOC4 performs multiple functions, such as an +IDE controller, a serial controller, a PS/2 keyboard/mouse controller, +and an external interrupt mechanism, it's not implemented as a +multifunction device. The consequence of this from a software +standpoint is that all these functions share a single IRQ, and +they can't all register to own the same PCI device ID. To make +matters a bit worse, some of the register blocks (and even registers +themselves) present in IOC4 are mixed-purpose between these several +functions, meaning that there's no clear "owning" device driver. + +The solution is to organize the IOC4 driver into several independent +drivers, "ioc4", "sgiioc4", and "ioc4_serial". Note that there is no +PS/2 controller driver as this functionality has never been wired up +on a shipping IO card. + +ioc4 +==== +This is the core (or shim) driver for IOC4. It is responsible for +initializing the basic functionality of the chip, and allocating +the PCI resources that are shared between the IOC4 functions. + +This driver also provides registration functions that the other +IOC4 drivers can call to make their presence known. Each driver +needs to provide a probe and remove function, which are invoked +by the core driver at appropriate times. The interface of these +IOC4 function probe and remove operations isn't precisely the same +as PCI device probe and remove operations, but is logically the +same operation. + +sgiioc4 +======= +This is the IDE driver for IOC4. Its name isn't very descriptive +simply for historical reasons (it used to be the only IOC4 driver +component). There's not much to say about it other than it hooks +up to the ioc4 driver via the appropriate registration, probe, and +remove functions. + +ioc4_serial +=========== +This is the serial driver for IOC4. There's not much to say about it +other than it hooks up to the ioc4 driver via the appropriate registration, +probe, and remove functions. diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 71ef0498d5e0..a18ecb92b356 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -615,9 +615,11 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module snd-hda-intel -------------------- - Module for Intel HD Audio (ICH6, ICH6M, ICH7) + Module for Intel HD Audio (ICH6, ICH6M, ICH7), ATI SB450, + VIA VT8251/VT8237A model - force the model name + position_fix - Fix DMA pointer (0 = FIFO size, 1 = none, 2 = POSBUF) Module supports up to 8 cards. @@ -634,7 +636,16 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 3stack-digout 3-jack in back, a HP out and a SPDIF out 5stack 5-jack in back, 2-jack in front 5stack-digout 5-jack in back, 2-jack in front, a SPDIF out + 6stack 6-jack in back, 2-jack in front + 6stack-digout 6-jack with a SPDIF out w810 3-jack + z71v 3-jack (HP shared SPDIF) + asus 3-jack + uniwill 3-jack + F1734 2-jack + test for testing/debugging purpose, almost all controls can be + adjusted. Appearing only when compiled with + $CONFIG_SND_DEBUG=y CMI9880 minimal 3-jack in back @@ -642,6 +653,15 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. full 6-jack in back, 2-jack in front full_dig 6-jack in back, 2-jack in front, SPDIF I/O allout 5-jack in back, 2-jack in front, SPDIF out + auto auto-config reading BIOS (default) + + Note 2: If you get click noises on output, try the module option + position_fix=1 or 2. position_fix=1 will use the SD_LPIB + register value without FIFO size correction as the current + DMA pointer. position_fix=2 will make the driver to use + the position buffer instead of reading SD_LPIB register. + (Usually SD_LPLIB register is more accurate than the + position buffer.) Module snd-hdsp --------------- @@ -660,7 +680,19 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. module did formerly. It will allocate the buffers in advance when any HDSP cards are found. To make the buffer allocation sure, load snd-page-alloc module in the early - stage of boot sequence. + stage of boot sequence. See "Early Buffer Allocation" + section. + + Module snd-hdspm + ---------------- + + Module for RME HDSP MADI board. + + precise_ptr - Enable precise pointer, or disable. + line_outs_monitor - Send playback streams to analog outs by default. + enable_monitor - Enable Analog Out on Channel 63/64 by default. + + See hdspm.txt for details. Module snd-ice1712 ------------------ @@ -677,15 +709,19 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. * TerraTec EWS 88D * TerraTec EWX 24/96 * TerraTec DMX 6Fire + * TerraTec Phase 88 * Hoontech SoundTrack DSP 24 * Hoontech SoundTrack DSP 24 Value * Hoontech SoundTrack DSP 24 Media 7.1 + * Event Electronics, EZ8 * Digigram VX442 + * Lionstracs, Mediastaton model - Use the given board model, one of the following: delta1010, dio2496, delta66, delta44, audiophile, delta410, delta1010lt, vx442, ewx2496, ews88mt, ews88mt_new, ews88d, - dmx6fire, dsp24, dsp24_value, dsp24_71, ez8 + dmx6fire, dsp24, dsp24_value, dsp24_71, ez8, + phase88, mediastation omni - Omni I/O support for MidiMan M-Audio Delta44/66 cs8427_timeout - reset timeout for the CS8427 chip (S/PDIF transciever) in msec resolution, default value is 500 (0.5 sec) @@ -694,20 +730,46 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. is not used with all Envy24 based cards (for example in the MidiMan Delta serie). + Note: The supported board is detected by reading EEPROM or PCI + SSID (if EEPROM isn't available). You can override the + model by passing "model" module option in case that the + driver isn't configured properly or you want to try another + type for testing. + Module snd-ice1724 ------------------ - Module for Envy24HT (VT/ICE1724) based PCI sound cards. + Module for Envy24HT (VT/ICE1724), Envy24PT (VT1720) based PCI sound cards. * MidiMan M Audio Revolution 7.1 * AMP Ltd AUDIO2000 - * TerraTec Aureon Sky-5.1, Space-7.1 + * TerraTec Aureon 5.1 Sky + * TerraTec Aureon 7.1 Space + * TerraTec Aureon 7.1 Universe + * TerraTec Phase 22 + * TerraTec Phase 28 + * AudioTrak Prodigy 7.1 + * AudioTrak Prodigy 192 + * Pontis MS300 + * Albatron K8X800 Pro II + * Chaintech ZNF3-150 + * Chaintech ZNF3-250 + * Chaintech 9CJS + * Chaintech AV-710 + * Shuttle SN25P model - Use the given board model, one of the following: - revo71, amp2000, prodigy71, aureon51, aureon71, - k8x800 + revo71, amp2000, prodigy71, prodigy192, aureon51, + aureon71, universe, k8x800, phase22, phase28, ms300, + av710 Module supports up to 8 cards and autoprobe. + Note: The supported board is detected by reading EEPROM or PCI + SSID (if EEPROM isn't available). You can override the + model by passing "model" module option in case that the + driver isn't configured properly or you want to try another + type for testing. + Module snd-intel8x0 ------------------- @@ -997,6 +1059,13 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. The power-management is supported. + Module snd-pxa2xx-ac97 (on arm only) + ------------------------------------ + + Module for AC97 driver for the Intel PXA2xx chip + + For ARM architecture only. + Module snd-rme32 ---------------- @@ -1026,7 +1095,8 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. module did formerly. It will allocate the buffers in advance when any RME9652 cards are found. To make the buffer allocation sure, load snd-page-alloc module in the early - stage of boot sequence. + stage of boot sequence. See "Early Buffer Allocation" + section. Module snd-sa11xx-uda1341 (on arm only) --------------------------------------- @@ -1115,6 +1185,13 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module supports up to 8 cards. + Module snd-sun-dbri (on sparc only) + ----------------------------------- + + Module for DBRI sound chips found on Sparcs. + + Module supports up to 8 cards. + Module snd-wavefront -------------------- @@ -1211,16 +1288,18 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ------------------ Module for AC'97 motherboards based on VIA 82C686A/686B, 8233, - 8233A, 8233C, 8235 (south) bridge. + 8233A, 8233C, 8235, 8237 (south) bridge. mpu_port - 0x300,0x310,0x320,0x330, otherwise obtain BIOS setup [VIA686A/686B only] joystick - Enable joystick (default off) [VIA686A/686B only] ac97_clock - AC'97 codec clock base (default 48000Hz) dxs_support - support DXS channels, - 0 = auto (defalut), 1 = enable, 2 = disable, - 3 = 48k only, 4 = no VRA - [VIA8233/C,8235 only] + 0 = auto (default), 1 = enable, 2 = disable, + 3 = 48k only, 4 = no VRA, 5 = enable any sample + rate and different sample rates on different + channels + [VIA8233/C, 8235, 8237 only] ac97_quirk - AC'97 workaround for strange hardware See the description of intel8x0 module for details. @@ -1232,18 +1311,23 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. default value 1.4. Then the interrupt number will be assigned under 15. You might also upgrade your BIOS. - Note: VIA8233/5 (not VIA8233A) can support DXS (direct sound) + Note: VIA8233/5/7 (not VIA8233A) can support DXS (direct sound) channels as the first PCM. On these channels, up to 4 - streams can be played at the same time. + streams can be played at the same time, and the controller + can perform sample rate conversion with separate rates for + each channel. As default (dxs_support = 0), 48k fixed rate is chosen except for the known devices since the output is often noisy except for 48k on some mother boards due to the bug of BIOS. - Please try once dxs_support=1 and if it works on other + Please try once dxs_support=5 and if it works on other sample rates (e.g. 44.1kHz of mp3 playback), please let us know the PCI subsystem vendor/device id's (output of "lspci -nv"). - If it doesn't work, try dxs_support=4. If it still doesn't + If dxs_support=5 does not work, try dxs_support=4; if it + doesn't work too, try dxs_support=1. (dxs_support=1 is + usually for old motherboards. The correct implementated + board should work with 4 or 5.) If it still doesn't work and the default setting is ok, dxs_support=3 is the right choice. If the default setting doesn't work at all, try dxs_support=2 to disable the DXS channels. @@ -1306,7 +1390,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module snd-vxpocket ------------------- - Module for Digigram VX-Pocket VX2 PCMCIA card. + Module for Digigram VX-Pocket VX2 and 440 PCMCIA cards. ibl - Capture IBL size. (default = 0, minimum size) @@ -1326,29 +1410,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Note: the driver is build only when CONFIG_ISA is set. - Module snd-vxp440 - ----------------- - - Module for Digigram VX-Pocket 440 PCMCIA card. - - ibl - Capture IBL size. (default = 0, minimum size) - - Module supports up to 8 cards. The module is compiled only when - PCMCIA is supported on kernel. - - To activate the driver via the card manager, you'll need to set - up /etc/pcmcia/vxp440.conf. See the sound/pcmcia/vx/vxp440.c. - - When the driver is compiled as a module and the hotplug firmware - is supported, the firmware data is loaded via hotplug automatically. - Install the necessary firmware files in alsa-firmware package. - When no hotplug fw loader is available, you need to load the - firmware via vxloader utility in alsa-tools package. - - About capture IBL, see the description of snd-vx222 module. - - Note: the driver is build only when CONFIG_ISA is set. - Module snd-ymfpci ----------------- @@ -1497,6 +1558,36 @@ Proc interfaces (/proc/asound) echo "rvplayer 0 0 block" > /proc/asound/card0/pcm0p/oss +Early Buffer Allocation +======================= + +Some drivers (e.g. hdsp) require the large contiguous buffers, and +sometimes it's too late to find such spaces when the driver module is +actually loaded due to memory fragmentation. You can pre-allocate the +PCM buffers by loading snd-page-alloc module and write commands to its +proc file in prior, for example, in the early boot stage like +/etc/init.d/*.local scripts. + +Reading the proc file /proc/drivers/snd-page-alloc shows the current +usage of page allocation. In writing, you can send the following +commands to the snd-page-alloc driver: + + - add VENDOR DEVICE MASK SIZE BUFFERS + + VENDOR and DEVICE are PCI vendor and device IDs. They take + integer numbers (0x prefix is needed for the hex). + MASK is the PCI DMA mask. Pass 0 if not restricted. + SIZE is the size of each buffer to allocate. You can pass + k and m suffix for KB and MB. The max number is 16MB. + BUFFERS is the number of buffers to allocate. It must be greater + than 0. The max number is 4. + + - erase + + This will erase the all pre-allocated buffers which are not in + use. + + Links ===== diff --git a/Documentation/sound/alsa/CMIPCI.txt b/Documentation/sound/alsa/CMIPCI.txt index 4a7df771b806..1872e24442a4 100644 --- a/Documentation/sound/alsa/CMIPCI.txt +++ b/Documentation/sound/alsa/CMIPCI.txt @@ -89,19 +89,22 @@ and use the interleaved 4 channel data. There are some control switchs affecting to the speaker connections: -"Line-In As Rear" - As mentioned above, the line-in jack is used - for the rear (3th and 4th channels) output. -"Line-In As Bass" - The line-in jack is used for the bass (5th - and 6th channels) output. -"Mic As Center/LFE" - The mic jack is used for the bass output. - If this switch is on, you cannot use a microphone as a capture - source, of course. - +"Line-In Mode" - an enum control to change the behavior of line-in + jack. Either "Line-In", "Rear Output" or "Bass Output" can + be selected. The last item is available only with model 039 + or newer. + When "Rear Output" is chosen, the surround channels 3 and 4 + are output to line-in jack. +"Mic-In Mode" - an enum control to change the behavior of mic-in + jack. Either "Mic-In" or "Center/LFE Output" can be + selected. + When "Center/LFE Output" is chosen, the center and bass + channels (channels 5 and 6) are output to mic-in jack. Digital I/O ----------- -The CM8x38 provides the excellent SPDIF capability with very chip +The CM8x38 provides the excellent SPDIF capability with very cheap price (yes, that's the reason I bought the card :) The SPDIF playback and capture are done via the third PCM device @@ -122,8 +125,9 @@ respectively, so you cannot playback both analog and digital streams simultaneously. To enable SPDIF output, you need to turn on "IEC958 Output Switch" -control via mixer or alsactl. Then you'll see the red light on from -the card so you know that's working obviously :) +control via mixer or alsactl ("IEC958" is the official name of +so-called S/PDIF). Then you'll see the red light on from the card so +you know that's working obviously :) The SPDIF input is always enabled, so you can hear SPDIF input data from line-out with "IEC958 In Monitor" switch at any time (see below). @@ -205,9 +209,10 @@ In addition to the standard SB mixer, CM8x38 provides more functions. MIDI CONTROLLER --------------- -The MPU401-UART interface is enabled as default only for the first -(CMIPCI) card. You need to set module option "midi_port" properly -for the 2nd (CMIPCI) card. +The MPU401-UART interface is disabled as default. You need to set +module option "mpu_port" with a valid I/O port address to enable the +MIDI support. The valid I/O ports are 0x300, 0x310, 0x320 and 0x330. +Choose the value which doesn't conflict with other cards. There is _no_ hardware wavetable function on this chip (except for OPL3 synth below). @@ -229,9 +234,11 @@ I don't know why.. Joystick and Modem ------------------ -The joystick and modem should be available by enabling the control -switch "Joystick" and "Modem" respectively. But I myself have never -tested them yet. +The legacy joystick is supported. To enable the joystick support, pass +joystick_port=1 module option. The value 1 means the auto-detection. +If the auto-detection fails, try to pass the exact I/O address. + +The modem is enabled dynamically via a card control switch "Modem". Debugging Information diff --git a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl index e789475304b6..db0b7d2dc477 100644 --- a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl +++ b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl @@ -371,7 +371,7 @@ <listitem><para>create <function>probe()</function> callback.</para></listitem> <listitem><para>create <function>remove()</function> callback.</para></listitem> <listitem><para>create pci_driver table which contains the three pointers above.</para></listitem> - <listitem><para>create <function>init()</function> function just calling <function>pci_module_init()</function> to register the pci_driver table defined above.</para></listitem> + <listitem><para>create <function>init()</function> function just calling <function>pci_register_driver()</function> to register the pci_driver table defined above.</para></listitem> <listitem><para>create <function>exit()</function> function to call <function>pci_unregister_driver()</function> function.</para></listitem> </itemizedlist> </para> @@ -1198,7 +1198,7 @@ /* initialization of the module */ static int __init alsa_card_mychip_init(void) { - return pci_module_init(&driver); + return pci_register_driver(&driver); } /* clean up the module */ @@ -1654,7 +1654,7 @@ <![CDATA[ static int __init alsa_card_mychip_init(void) { - return pci_module_init(&driver); + return pci_register_driver(&driver); } static void __exit alsa_card_mychip_exit(void) diff --git a/Documentation/sound/alsa/emu10k1-jack.txt b/Documentation/sound/alsa/emu10k1-jack.txt new file mode 100644 index 000000000000..751d45036a05 --- /dev/null +++ b/Documentation/sound/alsa/emu10k1-jack.txt @@ -0,0 +1,74 @@ +This document is a guide to using the emu10k1 based devices with JACK for low +latency, multichannel recording functionality. All of my recent work to allow +Linux users to use the full capabilities of their hardware has been inspired +by the kX Project. Without their work I never would have discovered the true +power of this hardware. + + http://www.kxproject.com + - Lee Revell, 2005.03.30 + +Low latency, multichannel audio with JACK and the emu10k1/emu10k2 +----------------------------------------------------------------- + +Until recently, emu10k1 users on Linux did not have access to the same low +latency, multichannel features offered by the "kX ASIO" feature of their +Windows driver. As of ALSA 1.0.9 this is no more! + +For those unfamiliar with kX ASIO, this consists of 16 capture and 16 playback +channels. With a post 2.6.9 Linux kernel, latencies down to 64 (1.33 ms) or +even 32 (0.66ms) frames should work well. + +The configuration is slightly more involved than on Windows, as you have to +select the correct device for JACK to use. Actually, for qjackctl users it's +fairly self explanatory - select Duplex, then for capture and playback select +the multichannel devices, set the in and out channels to 16, and the sample +rate to 48000Hz. The command line looks like this: + +/usr/local/bin/jackd -R -dalsa -r48000 -p64 -n2 -D -Chw:0,2 -Phw:0,3 -S + +This will give you 16 input ports and 16 output ports. + +The 16 output ports map onto the 16 FX buses (or the first 16 of 64, for the +Audigy). The mapping from FX bus to physical output is described in +SB-Live-mixer.txt (or Audigy-mixer.txt). + +The 16 input ports are connected to the 16 physical inputs. Contrary to +popular belief, all emu10k1 cards are multichannel cards. Which of these +input channels have physical inputs connected to them depends on the card +model. Trial and error is highly recommended; the pinout diagrams +for the card have been reverse engineered by some enterprising kX users and are +available on the internet. Meterbridge is helpful here, and the kX forums are +packed with useful information. + +Each input port will either correspond to a digital (SPDIF) input, an analog +input, or nothing. The one exception is the SBLive! 5.1. On these devices, +the second and third input ports are wired to the center/LFE output. You will +still see 16 capture channels, but only 14 are available for recording inputs. + +This chart, borrowed from kxfxlib/da_asio51.cpp, describes the mapping of JACK +ports to FXBUS2 (multitrack recording input) and EXTOUT (physical output) +channels. + +/*JACK (& ASIO) mappings on 10k1 5.1 SBLive cards: +-------------------------------------------- +JACK Epilog FXBUS2(nr) +-------------------------------------------- +capture_1 asio14 FXBUS2(0xe) +capture_2 asio15 FXBUS2(0xf) +capture_3 asio0 FXBUS2(0x0) +~capture_4 Center EXTOUT(0x11) // mapped to by Center +~capture_5 LFE EXTOUT(0x12) // mapped to by LFE +capture_6 asio3 FXBUS2(0x3) +capture_7 asio4 FXBUS2(0x4) +capture_8 asio5 FXBUS2(0x5) +capture_9 asio6 FXBUS2(0x6) +capture_10 asio7 FXBUS2(0x7) +capture_11 asio8 FXBUS2(0x8) +capture_12 asio9 FXBUS2(0x9) +capture_13 asio10 FXBUS2(0xa) +capture_14 asio11 FXBUS2(0xb) +capture_15 asio12 FXBUS2(0xc) +capture_16 asio13 FXBUS2(0xd) +*/ + +TODO: describe use of ld10k1/qlo10k1 in conjunction with JACK diff --git a/Documentation/sound/alsa/hdspm.txt b/Documentation/sound/alsa/hdspm.txt new file mode 100644 index 000000000000..7a67ff71a9f8 --- /dev/null +++ b/Documentation/sound/alsa/hdspm.txt @@ -0,0 +1,362 @@ +Software Interface ALSA-DSP MADI Driver + +(translated from German, so no good English ;-), +2004 - winfried ritsch + + + + Full functionality has been added to the driver. Since some of + the Controls and startup-options are ALSA-Standard and only the + special Controls are described and discussed below. + + + hardware functionality: + + + Audio transmission: + + number of channels -- depends on transmission mode + + The number of channels chosen is from 1..Nmax. The reason to + use for a lower number of channels is only resource allocation, + since unused DMA channels are disabled and less memory is + allocated. So also the throughput of the PCI system can be + scaled. (Only important for low performance boards). + + Single Speed -- 1..64 channels + + (Note: Choosing the 56channel mode for transmission or as + receiver, only 56 are transmitted/received over the MADI, but + all 64 channels are available for the mixer, so channel count + for the driver) + + Double Speed -- 1..32 channels + + Note: Choosing the 56-channel mode for + transmission/receive-mode , only 28 are transmitted/received + over the MADI, but all 32 channels are available for the mixer, + so channel count for the driver + + + Quad Speed -- 1..16 channels + + Note: Choosing the 56-channel mode for + transmission/receive-mode , only 14 are transmitted/received + over the MADI, but all 16 channels are available for the mixer, + so channel count for the driver + + Format -- signed 32 Bit Little Endian (SNDRV_PCM_FMTBIT_S32_LE) + + Sample Rates -- + + Single Speed -- 32000, 44100, 48000 + + Double Speed -- 64000, 88200, 96000 (untested) + + Quad Speed -- 128000, 176400, 192000 (untested) + + access-mode -- MMAP (memory mapped), Not interleaved + (PCM_NON-INTERLEAVED) + + buffer-sizes -- 64,128,256,512,1024,2048,8192 Samples + + fragments -- 2 + + Hardware-pointer -- 2 Modi + + + The Card supports the readout of the actual Buffer-pointer, + where DMA reads/writes. Since of the bulk mode of PCI it is only + 64 Byte accurate. SO it is not really usable for the + ALSA-mid-level functions (here the buffer-ID gives a better + result), but if MMAP is used by the application. Therefore it + can be configured at load-time with the parameter + precise-pointer. + + + (Hint: Experimenting I found that the pointer is maximum 64 to + large never to small. So if you subtract 64 you always have a + safe pointer for writing, which is used on this mode inside + ALSA. In theory now you can get now a latency as low as 16 + Samples, which is a quarter of the interrupt possibilities.) + + Precise Pointer -- off + interrupt used for pointer-calculation + + Precise Pointer -- on + hardware pointer used. + + Controller: + + + Since DSP-MADI-Mixer has 8152 Fader, it does not make sense to + use the standard mixer-controls, since this would break most of + (especially graphic) ALSA-Mixer GUIs. So Mixer control has be + provided by a 2-dimensional controller using the + hwdep-interface. + + Also all 128+256 Peak and RMS-Meter can be accessed via the + hwdep-interface. Since it could be a performance problem always + copying and converting Peak and RMS-Levels even if you just need + one, I decided to export the hardware structure, so that of + needed some driver-guru can implement a memory-mapping of mixer + or peak-meters over ioctl, or also to do only copying and no + conversion. A test-application shows the usage of the controller. + + Latency Controls --- not implemented !!! + + + Note: Within the windows-driver the latency is accessible of a + control-panel, but buffer-sizes are controlled with ALSA from + hwparams-calls and should not be changed in run-state, I did not + implement it here. + + + System Clock -- suspended !!!! + + Name -- "System Clock Mode" + + Access -- Read Write + + Values -- "Master" "Slave" + + + !!!! This is a hardware-function but is in conflict with the + Clock-source controller, which is a kind of ALSA-standard. I + makes sense to set the card to a special mode (master at some + frequency or slave), since even not using an Audio-application + a studio should have working synchronisations setup. So use + Clock-source-controller instead !!!! + + Clock Source + + Name -- "Sample Clock Source" + + Access -- Read Write + + Values -- "AutoSync", "Internal 32.0 kHz", "Internal 44.1 kHz", + "Internal 48.0 kHz", "Internal 64.0 kHz", "Internal 88.2 kHz", + "Internal 96.0 kHz" + + Choose between Master at a specific Frequency and so also the + Speed-mode or Slave (Autosync). Also see "Preferred Sync Ref" + + + !!!! This is no pure hardware function but was implemented by + ALSA by some ALSA-drivers before, so I use it also. !!! + + + Preferred Sync Ref + + Name -- "Preferred Sync Reference" + + Access -- Read Write + + Values -- "Word" "MADI" + + + Within the Auto-sync-Mode the preferred Sync Source can be + chosen. If it is not available another is used if possible. + + Note: Since MADI has a much higher bit-rate than word-clock, the + card should synchronise better in MADI Mode. But since the + RME-PLL is very good, there are almost no problems with + word-clock too. I never found a difference. + + + TX 64 channel --- + + Name -- "TX 64 channels mode" + + Access -- Read Write + + Values -- 0 1 + + Using 64-channel-modus (1) or 56-channel-modus for + MADI-transmission (0). + + + Note: This control is for output only. Input-mode is detected + automatically from hardware sending MADI. + + + Clear TMS --- + + Name -- "Clear Track Marker" + + Access -- Read Write + + Values -- 0 1 + + + Don't use to lower 5 Audio-bits on AES as additional Bits. + + + Safe Mode oder Auto Input --- + + Name -- "Safe Mode" + + Access -- Read Write + + Values -- 0 1 + + (default on) + + If on (1), then if either the optical or coaxial connection + has a failure, there is a takeover to the working one, with no + sample failure. Its only useful if you use the second as a + backup connection. + + Input --- + + Name -- "Input Select" + + Access -- Read Write + + Values -- optical coaxial + + + Choosing the Input, optical or coaxial. If Safe-mode is active, + this is the preferred Input. + +-------------- Mixer ---------------------- + + Mixer + + Name -- "Mixer" + + Access -- Read Write + + Values - <channel-number 0-127> <Value 0-65535> + + + Here as a first value the channel-index is taken to get/set the + corresponding mixer channel, where 0-63 are the input to output + fader and 64-127 the playback to outputs fader. Value 0 + is channel muted 0 and 32768 an amplification of 1. + + Chn 1-64 + + fast mixer for the ALSA-mixer utils. The diagonal of the + mixer-matrix is implemented from playback to output. + + + Line Out + + Name -- "Line Out" + + Access -- Read Write + + Values -- 0 1 + + Switching on and off the analog out, which has nothing to do + with mixing or routing. the analog outs reflects channel 63,64. + + +--- information (only read access): + + Sample Rate + + Name -- "System Sample Rate" + + Access -- Read-only + + getting the sample rate. + + + External Rate measured + + Name -- "External Rate" + + Access -- Read only + + + Should be "Autosync Rate", but Name used is + ALSA-Scheme. External Sample frequency liked used on Autosync is + reported. + + + MADI Sync Status + + Name -- "MADI Sync Lock Status" + + Access -- Read + + Values -- 0,1,2 + + MADI-Input is 0=Unlocked, 1=Locked, or 2=Synced. + + + Word Clock Sync Status + + Name -- "Word Clock Lock Status" + + Access -- Read + + Values -- 0,1,2 + + Word Clock Input is 0=Unlocked, 1=Locked, or 2=Synced. + + AutoSync + + Name -- "AutoSync Reference" + + Access -- Read + + Values -- "WordClock", "MADI", "None" + + Sync-Reference is either "WordClock", "MADI" or none. + + RX 64ch --- noch nicht implementiert + + MADI-Receiver is in 64 channel mode oder 56 channel mode. + + + AB_inp --- not tested + + Used input for Auto-Input. + + + actual Buffer Position --- not implemented + + !!! this is a ALSA internal function, so no control is used !!! + + + +Calling Parameter: + + index int array (min = 1, max = 8), + "Index value for RME HDSPM interface." card-index within ALSA + + note: ALSA-standard + + id string array (min = 1, max = 8), + "ID string for RME HDSPM interface." + + note: ALSA-standard + + enable int array (min = 1, max = 8), + "Enable/disable specific HDSPM sound-cards." + + note: ALSA-standard + + precise_ptr int array (min = 1, max = 8), + "Enable precise pointer, or disable." + + note: Use only when the application supports this (which is a special case). + + line_outs_monitor int array (min = 1, max = 8), + "Send playback streams to analog outs by default." + + + note: each playback channel is mixed to the same numbered output + channel (routed). This is against the ALSA-convention, where all + channels have to be muted on after loading the driver, but was + used before on other cards, so i historically use it again) + + + + enable_monitor int array (min = 1, max = 8), + "Enable Analog Out on Channel 63/64 by default." + + note: here the analog output is enabled (but not routed).
\ No newline at end of file diff --git a/Documentation/stable_api_nonsense.txt b/Documentation/stable_api_nonsense.txt index 3cea13875277..f39c9d714db3 100644 --- a/Documentation/stable_api_nonsense.txt +++ b/Documentation/stable_api_nonsense.txt @@ -132,7 +132,7 @@ to extra work for the USB developers. Since all Linux USB developers do their work on their own time, asking programmers to do extra work for no gain, for free, is not a possibility. -Security issues are also a very important for Linux. When a +Security issues are also very important for Linux. When a security issue is found, it is fixed in a very short amount of time. A number of times this has caused internal kernel interfaces to be reworked to prevent the security problem from occurring. When this diff --git a/Documentation/stable_kernel_rules.txt b/Documentation/stable_kernel_rules.txt new file mode 100644 index 000000000000..2c81305090df --- /dev/null +++ b/Documentation/stable_kernel_rules.txt @@ -0,0 +1,58 @@ +Everything you ever wanted to know about Linux 2.6 -stable releases. + +Rules on what kind of patches are accepted, and what ones are not, into +the "-stable" tree: + + - It must be obviously correct and tested. + - It can not bigger than 100 lines, with context. + - It must fix only one thing. + - It must fix a real bug that bothers people (not a, "This could be a + problem..." type thing.) + - It must fix a problem that causes a build error (but not for things + marked CONFIG_BROKEN), an oops, a hang, data corruption, a real + security issue, or some "oh, that's not good" issue. In short, + something critical. + - No "theoretical race condition" issues, unless an explanation of how + the race can be exploited. + - It can not contain any "trivial" fixes in it (spelling changes, + whitespace cleanups, etc.) + - It must be accepted by the relevant subsystem maintainer. + - It must follow Documentation/SubmittingPatches rules. + + +Procedure for submitting patches to the -stable tree: + + - Send the patch, after verifying that it follows the above rules, to + stable@kernel.org. + - The sender will receive an ack when the patch has been accepted into + the queue, or a nak if the patch is rejected. This response might + take a few days, according to the developer's schedules. + - If accepted, the patch will be added to the -stable queue, for review + by other developers. + - Security patches should not be sent to this alias, but instead to the + documented security@kernel.org. + + +Review cycle: + + - When the -stable maintainers decide for a review cycle, the patches + will be sent to the review committee, and the maintainer of the + affected area of the patch (unless the submitter is the maintainer of + the area) and CC: to the linux-kernel mailing list. + - The review committee has 48 hours in which to ack or nak the patch. + - If the patch is rejected by a member of the committee, or linux-kernel + members object to the patch, bringing up issues that the maintainers + and members did not realize, the patch will be dropped from the + queue. + - At the end of the review cycle, the acked patches will be added to + the latest -stable release, and a new -stable release will happen. + - Security patches will be accepted into the -stable tree directly from + the security kernel team, and not go through the normal review cycle. + Contact the kernel security team for more details on this procedure. + + +Review committe: + + - This will be made up of a number of kernel developers who have + volunteered for this task, and a few that haven't. + diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 35159176997b..9f11d36a8c10 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -49,6 +49,7 @@ show up in /proc/sys/kernel: - shmmax [ sysv ipc ] - shmmni - stop-a [ SPARC only ] +- suid_dumpable - sysrq ==> Documentation/sysrq.txt - tainted - threads-max @@ -300,6 +301,25 @@ kernel. This value defaults to SHMMAX. ============================================================== +suid_dumpable: + +This value can be used to query and set the core dump mode for setuid +or otherwise protected/tainted binaries. The modes are + +0 - (default) - traditional behaviour. Any process which has changed + privilege levels or is execute only will not be dumped +1 - (debug) - all processes dump core when possible. The core dump is + owned by the current user and no security is applied. This is + intended for system debugging situations only. Ptrace is unchecked. +2 - (suidsafe) - any binary which normally would not be dumped is dumped + readable by root only. This allows the end user to remove + such a dump but not access it directly. For security reasons + core dumps in this mode will not overwrite one another or + other files. This mode is appropriate when adminstrators are + attempting to debug problems in a normal environment. + +============================================================== + tainted: Non-zero if the kernel has been tainted. Numeric values, which diff --git a/Documentation/sysrq.txt b/Documentation/sysrq.txt index f98c2e31c143..136d817c01ba 100644 --- a/Documentation/sysrq.txt +++ b/Documentation/sysrq.txt @@ -72,6 +72,8 @@ On all - write a character to /proc/sysrq-trigger. eg: 'b' - Will immediately reboot the system without syncing or unmounting your disks. +'c' - Will perform a kexec reboot in order to take a crashdump. + 'o' - Will shut your system off (if configured and supported). 's' - Will attempt to sync all mounted filesystems. @@ -122,6 +124,9 @@ useful when you want to exit a program that will not let you switch consoles. re'B'oot is good when you're unable to shut down. But you should also 'S'ync and 'U'mount first. +'C'rashdump can be used to manually trigger a crashdump when the system is hung. +The kernel needs to have been built with CONFIG_KEXEC enabled. + 'S'ync is great when your system is locked up, it allows you to sync your disks and will certainly lessen the chance of data loss and fscking. Note that the sync hasn't taken place until you see the "OK" and "Done" appear diff --git a/Documentation/tty.txt b/Documentation/tty.txt index 3958cf746dde..8ff7bc2a0811 100644 --- a/Documentation/tty.txt +++ b/Documentation/tty.txt @@ -22,7 +22,7 @@ copy of the structure. You must not re-register over the top of the line discipline even with the same data or your computer again will be eaten by demons. -In order to remove a line discipline call tty_register_ldisc passing NULL. +In order to remove a line discipline call tty_unregister_ldisc(). In ancient times this always worked. In modern times the function will return -EBUSY if the ldisc is currently in use. Since the ldisc referencing code manages the module counts this should not usually be a concern. diff --git a/Documentation/usb/sn9c102.txt b/Documentation/usb/sn9c102.txt index cf9a1187edce..3f8a119db31b 100644 --- a/Documentation/usb/sn9c102.txt +++ b/Documentation/usb/sn9c102.txt @@ -297,6 +297,7 @@ Vendor ID Product ID 0x0c45 0x602a 0x0c45 0x602b 0x0c45 0x602c +0x0c45 0x602d 0x0c45 0x6030 0x0c45 0x6080 0x0c45 0x6082 @@ -333,6 +334,7 @@ Model Manufacturer ----- ------------ HV7131D Hynix Semiconductor, Inc. MI-0343 Micron Technology, Inc. +OV7630 OmniVision Technologies, Inc. PAS106B PixArt Imaging, Inc. PAS202BCB PixArt Imaging, Inc. TAS5110C1B Taiwan Advanced Sensor Corporation @@ -470,9 +472,11 @@ order): - Luca Capello for the donation of a webcam; - Joao Rodrigo Fuzaro, Joao Limirio, Claudio Filho and Caio Begotti for the donation of a webcam; +- Jon Hollstrom for the donation of a webcam; - Carlos Eduardo Medaglia Dyonisio, who added the support for the PAS202BCB image sensor; - Stefano Mozzi, who donated 45 EU; +- Andrew Pearce for the donation of a webcam; - Bertrik Sikken, who reverse-engineered and documented the Huffman compression algorithm used in the SN9C10x controllers and implemented the first decoder; - Mizuno Takafumi for the donation of a webcam; diff --git a/Documentation/usb/usbmon.txt b/Documentation/usb/usbmon.txt index 2f8431f92b77..63cb7edd177e 100644 --- a/Documentation/usb/usbmon.txt +++ b/Documentation/usb/usbmon.txt @@ -101,6 +101,13 @@ Here is the list of words, from left to right: or 3 and 2 positions, correspondingly. - URB Status. This field makes no sense for submissions, but is present to help scripts with parsing. In error case, it contains the error code. + In case of a setup packet, it contains a Setup Tag. If scripts read a number + in this field, they proceed to read Data Length. Otherwise, they read + the setup packet before reading the Data Length. +- Setup packet, if present, consists of 5 words: one of each for bmRequestType, + bRequest, wValue, wIndex, wLength, as specified by the USB Specification 2.0. + These words are safe to decode if Setup Tag was 's'. Otherwise, the setup + packet was present, but not captured, and the fields contain filler. - Data Length. This is the actual length in the URB. - Data tag. The usbmon may not always capture data, even if length is nonzero. Only if tag is '=', the data words are present. @@ -125,25 +132,31 @@ class ParsedLine { String data_str = st.nextToken(); int len = data_str.length() / 2; int i; + int b; // byte is signed, apparently?! XXX for (i = 0; i < len; i++) { - data[data_len] = Byte.parseByte( - data_str.substring(i*2, i*2 + 2), - 16); + // data[data_len] = Byte.parseByte( + // data_str.substring(i*2, i*2 + 2), + // 16); + b = Integer.parseInt( + data_str.substring(i*2, i*2 + 2), + 16); + if (b >= 128) + b *= -1; + data[data_len] = (byte) b; data_len++; } } } } -This format is obviously deficient. For example, the setup packet for control -transfers is not delivered. This will change in the future. +This format may be changed in the future. Examples: -An input control transfer to get a port status: +An input control transfer to get a port status. -d74ff9a0 2640288196 S Ci:001:00 -115 4 < -d74ff9a0 2640288202 C Ci:001:00 0 4 = 01010100 +d5ea89a0 3575914555 S Ci:001:00 s a3 00 0000 0003 0004 4 < +d5ea89a0 3575914560 C Ci:001:00 0 4 = 01050000 An output bulk transfer to send a SCSI command 0x5E in a 31-byte Bulk wrapper to a storage device at address 5: diff --git a/Documentation/video4linux/API.html b/Documentation/video4linux/API.html index 4b3d8f640a4a..441407b12a9f 100644 --- a/Documentation/video4linux/API.html +++ b/Documentation/video4linux/API.html @@ -1,399 +1,16 @@ -<HTML><HEAD> -<TITLE>Video4Linux Kernel API Reference v0.1:19990430</TITLE> -</HEAD> -<! Revision History: > -<! 4/30/1999 - Fred Gleason (fredg@wava.com)> -<! Documented extensions for the Radio Data System (RDS) extensions > -<BODY bgcolor="#ffffff"> -<H3>Devices</H3> -Video4Linux provides the following sets of device files. These live on the -character device formerly known as "/dev/bttv". /dev/bttv should be a -symlink to /dev/video0 for most people. -<P> -<TABLE> -<TR><TH>Device Name</TH><TH>Minor Range</TH><TH>Function</TH> -<TR><TD>/dev/video</TD><TD>0-63</TD><TD>Video Capture Interface</TD> -<TR><TD>/dev/radio</TD><TD>64-127</TD><TD>AM/FM Radio Devices</TD> -<TR><TD>/dev/vtx</TD><TD>192-223</TD><TD>Teletext Interface Chips</TD> -<TR><TD>/dev/vbi</TD><TD>224-239</TD><TD>Raw VBI Data (Intercast/teletext)</TD> -</TABLE> -<P> -Video4Linux programs open and scan the devices to find what they are looking -for. Capability queries define what each interface supports. The -described API is only defined for video capture cards. The relevant subset -applies to radio cards. Teletext interfaces talk the existing VTX API. -<P> -<H3>Capability Query Ioctl</H3> -The <B>VIDIOCGCAP</B> ioctl call is used to obtain the capability -information for a video device. The <b>struct video_capability</b> object -passed to the ioctl is completed and returned. It contains the following -information -<P> -<TABLE> -<TR><TD><b>name[32]</b><TD>Canonical name for this interface</TD> -<TR><TD><b>type</b><TD>Type of interface</TD> -<TR><TD><b>channels</b><TD>Number of radio/tv channels if appropriate</TD> -<TR><TD><b>audios</b><TD>Number of audio devices if appropriate</TD> -<TR><TD><b>maxwidth</b><TD>Maximum capture width in pixels</TD> -<TR><TD><b>maxheight</b><TD>Maximum capture height in pixels</TD> -<TR><TD><b>minwidth</b><TD>Minimum capture width in pixels</TD> -<TR><TD><b>minheight</b><TD>Minimum capture height in pixels</TD> -</TABLE> -<P> -The type field lists the capability flags for the device. These are -as follows -<P> -<TABLE> -<TR><TH>Name</TH><TH>Description</TH> -<TR><TD><b>VID_TYPE_CAPTURE</b><TD>Can capture to memory</TD> -<TR><TD><b>VID_TYPE_TUNER</b><TD>Has a tuner of some form</TD> -<TR><TD><b>VID_TYPE_TELETEXT</b><TD>Has teletext capability</TD> -<TR><TD><b>VID_TYPE_OVERLAY</b><TD>Can overlay its image onto the frame buffer</TD> -<TR><TD><b>VID_TYPE_CHROMAKEY</b><TD>Overlay is Chromakeyed</TD> -<TR><TD><b>VID_TYPE_CLIPPING</b><TD>Overlay clipping is supported</TD> -<TR><TD><b>VID_TYPE_FRAMERAM</b><TD>Overlay overwrites frame buffer memory</TD> -<TR><TD><b>VID_TYPE_SCALES</b><TD>The hardware supports image scaling</TD> -<TR><TD><b>VID_TYPE_MONOCHROME</b><TD>Image capture is grey scale only</TD> -<TR><TD><b>VID_TYPE_SUBCAPTURE</b><TD>Capture can be of only part of the image</TD> -</TABLE> -<P> -The minimum and maximum sizes listed for a capture device do not imply all -that all height/width ratios or sizes within the range are possible. A -request to set a size will be honoured by the largest available capture -size whose capture is no large than the requested rectangle in either -direction. For example the quickcam has 3 fixed settings. -<P> -<H3>Frame Buffer</H3> -Capture cards that drop data directly onto the frame buffer must be told the -base address of the frame buffer, its size and organisation. This is a -privileged ioctl and one that eventually X itself should set. -<P> -The <b>VIDIOCSFBUF</b> ioctl sets the frame buffer parameters for a capture -card. If the card does not do direct writes to the frame buffer then this -ioctl will be unsupported. The <b>VIDIOCGFBUF</b> ioctl returns the -currently used parameters. The structure used in both cases is a -<b>struct video_buffer</b>. -<P> -<TABLE> -<TR><TD><b>void *base</b></TD><TD>Base physical address of the buffer</TD> -<TR><TD><b>int height</b></TD><TD>Height of the frame buffer</TD> -<TR><TD><b>int width</b></TD><TD>Width of the frame buffer</TD> -<TR><TD><b>int depth</b></TD><TD>Depth of the frame buffer</TD> -<TR><TD><b>int bytesperline</b></TD><TD>Number of bytes of memory between the start of two adjacent lines</TD> -</TABLE> -<P> -Note that these values reflect the physical layout of the frame buffer. -The visible area may be smaller. In fact under XFree86 this is commonly the -case. XFree86 DGA can provide the parameters required to set up this ioctl. -Setting the base address to NULL indicates there is no physical frame buffer -access. -<P> -<H3>Capture Windows</H3> -The capture area is described by a <b>struct video_window</b>. This defines -a capture area and the clipping information if relevant. The -<b>VIDIOCGWIN</b> ioctl recovers the current settings and the -<b>VIDIOCSWIN</b> sets new values. A successful call to <b>VIDIOCSWIN</b> -indicates that a suitable set of parameters have been chosen. They do not -indicate that exactly what was requested was granted. The program should -call <b>VIDIOCGWIN</b> to check if the nearest match was suitable. The -<b>struct video_window</b> contains the following fields. -<P> -<TABLE> -<TR><TD><b>x</b><TD>The X co-ordinate specified in X windows format.</TD> -<TR><TD><b>y</b><TD>The Y co-ordinate specified in X windows format.</TD> -<TR><TD><b>width</b><TD>The width of the image capture.</TD> -<TR><TD><b>height</b><TD>The height of the image capture.</TD> -<TR><TD><b>chromakey</b><TD>A host order RGB32 value for the chroma key.</TD> -<TR><TD><b>flags</b><TD>Additional capture flags.</TD> -<TR><TD><b>clips</b><TD>A list of clipping rectangles. <em>(Set only)</em></TD> -<TR><TD><b>clipcount</b><TD>The number of clipping rectangles. <em>(Set only)</em></TD> -</TABLE> -<P> -Clipping rectangles are passed as an array. Each clip consists of the following -fields available to the user. -<P> -<TABLE> -<TR><TD><b>x</b></TD><TD>X co-ordinate of rectangle to skip</TD> -<TR><TD><b>y</b></TD><TD>Y co-ordinate of rectangle to skip</TD> -<TR><TD><b>width</b></TD><TD>Width of rectangle to skip</TD> -<TR><TD><b>height</b></TD><TD>Height of rectangle to skip</TD> -</TABLE> -<P> -Merely setting the window does not enable capturing. Overlay capturing -(i.e. PCI-PCI transfer to the frame buffer of the video card) -is activated by passing the <b>VIDIOCCAPTURE</b> ioctl a value of 1, and -disabled by passing it a value of 0. -<P> -Some capture devices can capture a subfield of the image they actually see. -This is indicated when VIDEO_TYPE_SUBCAPTURE is defined. -The video_capture describes the time and special subfields to capture. -The video_capture structure contains the following fields. -<P> -<TABLE> -<TR><TD><b>x</b></TD><TD>X co-ordinate of source rectangle to grab</TD> -<TR><TD><b>y</b></TD><TD>Y co-ordinate of source rectangle to grab</TD> -<TR><TD><b>width</b></TD><TD>Width of source rectangle to grab</TD> -<TR><TD><b>height</b></TD><TD>Height of source rectangle to grab</TD> -<TR><TD><b>decimation</b></TD><TD>Decimation to apply</TD> -<TR><TD><b>flags</b></TD><TD>Flag settings for grabbing</TD> -</TABLE> -The available flags are -<P> -<TABLE> -<TR><TH>Name</TH><TH>Description</TH> -<TR><TD><b>VIDEO_CAPTURE_ODD</b><TD>Capture only odd frames</TD> -<TR><TD><b>VIDEO_CAPTURE_EVEN</b><TD>Capture only even frames</TD> -</TABLE> -<P> -<H3>Video Sources</H3> -Each video4linux video or audio device captures from one or more -source <b>channels</b>. Each channel can be queries with the -<b>VDIOCGCHAN</b> ioctl call. Before invoking this function the caller -must set the channel field to the channel that is being queried. On return -the <b>struct video_channel</b> is filled in with information about the -nature of the channel itself. -<P> -The <b>VIDIOCSCHAN</b> ioctl takes an integer argument and switches the -capture to this input. It is not defined whether parameters such as colour -settings or tuning are maintained across a channel switch. The caller should -maintain settings as desired for each channel. (This is reasonable as -different video inputs may have different properties). -<P> -The <b>struct video_channel</b> consists of the following -<P> -<TABLE> -<TR><TD><b>channel</b></TD><TD>The channel number</TD> -<TR><TD><b>name</b></TD><TD>The input name - preferably reflecting the label -on the card input itself</TD> -<TR><TD><b>tuners</b></TD><TD>Number of tuners for this input</TD> -<TR><TD><b>flags</b></TD><TD>Properties the tuner has</TD> -<TR><TD><b>type</b></TD><TD>Input type (if known)</TD> -<TR><TD><b>norm</b><TD>The norm for this channel</TD> -</TABLE> -<P> -The flags defined are -<P> -<TABLE> -<TR><TD><b>VIDEO_VC_TUNER</b><TD>Channel has tuners.</TD> -<TR><TD><b>VIDEO_VC_AUDIO</b><TD>Channel has audio.</TD> -<TR><TD><b>VIDEO_VC_NORM</b><TD>Channel has norm setting.</TD> -</TABLE> -<P> -The types defined are -<P> -<TABLE> -<TR><TD><b>VIDEO_TYPE_TV</b><TD>The input is a TV input.</TD> -<TR><TD><b>VIDEO_TYPE_CAMERA</b><TD>The input is a camera.</TD> -</TABLE> -<P> -<H3>Image Properties</H3> -The image properties of the picture can be queried with the <b>VIDIOCGPICT</b> -ioctl which fills in a <b>struct video_picture</b>. The <b>VIDIOCSPICT</b> -ioctl allows values to be changed. All values except for the palette type -are scaled between 0-65535. -<P> -The <b>struct video_picture</b> consists of the following fields -<P> -<TABLE> -<TR><TD><b>brightness</b><TD>Picture brightness</TD> -<TR><TD><b>hue</b><TD>Picture hue (colour only)</TD> -<TR><TD><b>colour</b><TD>Picture colour (colour only)</TD> -<TR><TD><b>contrast</b><TD>Picture contrast</TD> -<TR><TD><b>whiteness</b><TD>The whiteness (greyscale only)</TD> -<TR><TD><b>depth</b><TD>The capture depth (may need to match the frame buffer depth)</TD> -<TR><TD><b>palette</b><TD>Reports the palette that should be used for this image</TD> -</TABLE> -<P> -The following palettes are defined -<P> -<TABLE> -<TR><TD><b>VIDEO_PALETTE_GREY</b><TD>Linear intensity grey scale (255 is brightest).</TD> -<TR><TD><b>VIDEO_PALETTE_HI240</b><TD>The BT848 8bit colour cube.</TD> -<TR><TD><b>VIDEO_PALETTE_RGB565</b><TD>RGB565 packed into 16 bit words.</TD> -<TR><TD><b>VIDEO_PALETTE_RGB555</b><TD>RGV555 packed into 16 bit words, top bit undefined.</TD> -<TR><TD><b>VIDEO_PALETTE_RGB24</b><TD>RGB888 packed into 24bit words.</TD> -<TR><TD><b>VIDEO_PALETTE_RGB32</b><TD>RGB888 packed into the low 3 bytes of 32bit words. The top 8bits are undefined.</TD> -<TR><TD><b>VIDEO_PALETTE_YUV422</b><TD>Video style YUV422 - 8bits packed 4bits Y 2bits U 2bits V</TD> -<TR><TD><b>VIDEO_PALETTE_YUYV</b><TD>Describe me</TD> -<TR><TD><b>VIDEO_PALETTE_UYVY</b><TD>Describe me</TD> -<TR><TD><b>VIDEO_PALETTE_YUV420</b><TD>YUV420 capture</TD> -<TR><TD><b>VIDEO_PALETTE_YUV411</b><TD>YUV411 capture</TD> -<TR><TD><b>VIDEO_PALETTE_RAW</b><TD>RAW capture (BT848)</TD> -<TR><TD><b>VIDEO_PALETTE_YUV422P</b><TD>YUV 4:2:2 Planar</TD> -<TR><TD><b>VIDEO_PALETTE_YUV411P</b><TD>YUV 4:1:1 Planar</TD> -</TABLE> -<P> -<H3>Tuning</H3> -Each video input channel can have one or more tuners associated with it. Many -devices will not have tuners. TV cards and radio cards will have one or more -tuners attached. -<P> -Tuners are described by a <b>struct video_tuner</b> which can be obtained by -the <b>VIDIOCGTUNER</b> ioctl. Fill in the tuner number in the structure -then pass the structure to the ioctl to have the data filled in. The -tuner can be switched using <b>VIDIOCSTUNER</b> which takes an integer argument -giving the tuner to use. A struct tuner has the following fields -<P> -<TABLE> -<TR><TD><b>tuner</b><TD>Number of the tuner</TD> -<TR><TD><b>name</b><TD>Canonical name for this tuner (eg FM/AM/TV)</TD> -<TR><TD><b>rangelow</b><TD>Lowest tunable frequency</TD> -<TR><TD><b>rangehigh</b><TD>Highest tunable frequency</TD> -<TR><TD><b>flags</b><TD>Flags describing the tuner</TD> -<TR><TD><b>mode</b><TD>The video signal mode if relevant</TD> -<TR><TD><b>signal</b><TD>Signal strength if known - between 0-65535</TD> -</TABLE> -<P> -The following flags exist -<P> -<TABLE> -<TR><TD><b>VIDEO_TUNER_PAL</b><TD>PAL tuning is supported</TD> -<TR><TD><b>VIDEO_TUNER_NTSC</b><TD>NTSC tuning is supported</TD> -<TR><TD><b>VIDEO_TUNER_SECAM</b><TD>SECAM tuning is supported</TD> -<TR><TD><b>VIDEO_TUNER_LOW</b><TD>Frequency is in a lower range</TD> -<TR><TD><b>VIDEO_TUNER_NORM</b><TD>The norm for this tuner is settable</TD> -<TR><TD><b>VIDEO_TUNER_STEREO_ON</b><TD>The tuner is seeing stereo audio</TD> -<TR><TD><b>VIDEO_TUNER_RDS_ON</b><TD>The tuner is seeing a RDS datastream</TD> -<TR><TD><b>VIDEO_TUNER_MBS_ON</b><TD>The tuner is seeing a MBS datastream</TD> -</TABLE> -<P> -The following modes are defined -<P> -<TABLE> -<TR><TD><b>VIDEO_MODE_PAL</b><TD>The tuner is in PAL mode</TD> -<TR><TD><b>VIDEO_MODE_NTSC</b><TD>The tuner is in NTSC mode</TD> -<TR><TD><b>VIDEO_MODE_SECAM</b><TD>The tuner is in SECAM mode</TD> -<TR><TD><b>VIDEO_MODE_AUTO</b><TD>The tuner auto switches, or mode does not apply</TD> -</TABLE> -<P> -Tuning frequencies are an unsigned 32bit value in 1/16th MHz or if the -<b>VIDEO_TUNER_LOW</b> flag is set they are in 1/16th KHz. The current -frequency is obtained as an unsigned long via the <b>VIDIOCGFREQ</b> ioctl and -set by the <b>VIDIOCSFREQ</b> ioctl. -<P> -<H3>Audio</H3> -TV and Radio devices have one or more audio inputs that may be selected. -The audio properties are queried by passing a <b>struct video_audio</b> to <b>VIDIOCGAUDIO</b> ioctl. The -<b>VIDIOCSAUDIO</b> ioctl sets audio properties. -<P> -The structure contains the following fields -<P> -<TABLE> -<TR><TD><b>audio</b><TD>The channel number</TD> -<TR><TD><b>volume</b><TD>The volume level</TD> -<TR><TD><b>bass</b><TD>The bass level</TD> -<TR><TD><b>treble</b><TD>The treble level</TD> -<TR><TD><b>flags</b><TD>Flags describing the audio channel</TD> -<TR><TD><b>name</b><TD>Canonical name for the audio input</TD> -<TR><TD><b>mode</b><TD>The mode the audio input is in</TD> -<TR><TD><b>balance</b><TD>The left/right balance</TD> -<TR><TD><b>step</b><TD>Actual step used by the hardware</TD> -</TABLE> -<P> -The following flags are defined -<P> -<TABLE> -<TR><TD><b>VIDEO_AUDIO_MUTE</b><TD>The audio is muted</TD> -<TR><TD><b>VIDEO_AUDIO_MUTABLE</b><TD>Audio muting is supported</TD> -<TR><TD><b>VIDEO_AUDIO_VOLUME</b><TD>The volume is controllable</TD> -<TR><TD><b>VIDEO_AUDIO_BASS</b><TD>The bass is controllable</TD> -<TR><TD><b>VIDEO_AUDIO_TREBLE</b><TD>The treble is controllable</TD> -<TR><TD><b>VIDEO_AUDIO_BALANCE</b><TD>The balance is controllable</TD> -</TABLE> -<P> -The following decoding modes are defined -<P> -<TABLE> -<TR><TD><b>VIDEO_SOUND_MONO</b><TD>Mono signal</TD> -<TR><TD><b>VIDEO_SOUND_STEREO</b><TD>Stereo signal (NICAM for TV)</TD> -<TR><TD><b>VIDEO_SOUND_LANG1</b><TD>European TV alternate language 1</TD> -<TR><TD><b>VIDEO_SOUND_LANG2</b><TD>European TV alternate language 2</TD> -</TABLE> -<P> -<H3>Reading Images</H3> -Each call to the <b>read</b> syscall returns the next available image -from the device. It is up to the caller to set format and size (using -the VIDIOCSPICT and VIDIOCSWIN ioctls) and then to pass a suitable -size buffer and length to the function. Not all devices will support -read operations. -<P> -A second way to handle image capture is via the mmap interface if supported. -To use the mmap interface a user first sets the desired image size and depth -properties. Next the VIDIOCGMBUF ioctl is issued. This reports the size -of buffer to mmap and the offset within the buffer for each frame. The -number of frames supported is device dependent and may only be one. -<P> -The video_mbuf structure contains the following fields -<P> -<TABLE> -<TR><TD><b>size</b><TD>The number of bytes to map</TD> -<TR><TD><b>frames</b><TD>The number of frames</TD> -<TR><TD><b>offsets</b><TD>The offset of each frame</TD> -</TABLE> -<P> -Once the mmap has been made the VIDIOCMCAPTURE ioctl starts the -capture to a frame using the format and image size specified in the -video_mmap (which should match or be below the initial query size). -When the VIDIOCMCAPTURE ioctl returns the frame is <em>not</em> -captured yet, the driver just instructed the hardware to start the -capture. The application has to use the VIDIOCSYNC ioctl to wait -until the capture of a frame is finished. VIDIOCSYNC takes the frame -number you want to wait for as argument. -<p> -It is allowed to call VIDIOCMCAPTURE multiple times (with different -frame numbers in video_mmap->frame of course) and thus have multiple -outstanding capture requests. A simple way do to double-buffering -using this feature looks like this: -<pre> -/* setup everything */ -VIDIOCMCAPTURE(0) -while (whatever) { - VIDIOCMCAPTURE(1) - VIDIOCSYNC(0) - /* process frame 0 while the hardware captures frame 1 */ - VIDIOCMCAPTURE(0) - VIDIOCSYNC(1) - /* process frame 1 while the hardware captures frame 0 */ -} -</pre> -Note that you are <em>not</em> limited to only two frames. The API -allows up to 32 frames, the VIDIOCGMBUF ioctl returns the number of -frames the driver granted. Thus it is possible to build deeper queues -to avoid loosing frames on load peaks. -<p> -While capturing to memory the driver will make a "best effort" attempt -to capture to screen as well if requested. This normally means all -frames that "miss" memory mapped capture will go to the display. -<P> -A final ioctl exists to allow a device to obtain related devices if a -driver has multiple components (for example video0 may not be associated -with vbi0 which would cause an intercast display program to make a bad -mistake). The VIDIOCGUNIT ioctl reports the unit numbers of the associated -devices if any exist. The video_unit structure has the following fields. -<P> -<TABLE> -<TR><TD><b>video</b><TD>Video capture device</TD> -<TR><TD><b>vbi</b><TD>VBI capture device</TD> -<TR><TD><b>radio</b><TD>Radio device</TD> -<TR><TD><b>audio</b><TD>Audio mixer</TD> -<TR><TD><b>teletext</b><TD>Teletext device</TD> -</TABLE> -<P> -<H3>RDS Datastreams</H3> -For radio devices that support it, it is possible to receive Radio Data -System (RDS) data by means of a read() on the device. The data is packed in -groups of three, as follows: -<TABLE> -<TR><TD>First Octet</TD><TD>Least Significant Byte of RDS Block</TD></TR> -<TR><TD>Second Octet</TD><TD>Most Significant Byte of RDS Block -<TR><TD>Third Octet</TD><TD>Bit 7:</TD><TD>Error bit. Indicates that -an uncorrectable error occurred during reception of this block.</TD></TR> -<TR><TD> </TD><TD>Bit 6:</TD><TD>Corrected bit. Indicates that -an error was corrected for this data block.</TD></TR> -<TR><TD> </TD><TD>Bits 5-3:</TD><TD>Received Offset. Indicates the -offset received by the sync system.</TD></TR> -<TR><TD> </TD><TD>Bits 2-0:</TD><TD>Offset Name. Indicates the -offset applied to this data.</TD></TR> -</TABLE> -</BODY> -</HTML> +<TITLE>V4L API</TITLE> +<H1>Video For Linux APIs</H1> +<table border=0> +<tr> +<td> +<A HREF=http://www.linuxtv.org/downloads/video4linux/API/V4L1_API.html> +V4L original API</a> +</td><td> +Obsoleted by V4L2 API +</td></tr><tr><td> +<A HREF=http://www.linuxtv.org/downloads/video4linux/API/V4L2_API.html> +V4L2 API</a> +</td><td> +Should be used for new projects +</td></tr> +</table> diff --git a/Documentation/video4linux/CARDLIST.bttv b/Documentation/video4linux/CARDLIST.bttv index e46761c39e3f..62a12a08e2ac 100644 --- a/Documentation/video4linux/CARDLIST.bttv +++ b/Documentation/video4linux/CARDLIST.bttv @@ -1,4 +1,4 @@ -card=0 - *** UNKNOWN/GENERIC *** +card=0 - *** UNKNOWN/GENERIC *** card=1 - MIRO PCTV card=2 - Hauppauge (bt848) card=3 - STB, Gateway P/N 6000699 (bt848) @@ -119,3 +119,17 @@ card=117 - NGS NGSTV+ card=118 - LMLBT4 card=119 - Tekram M205 PRO card=120 - Conceptronic CONTVFMi +card=121 - Euresys Picolo Tetra +card=122 - Spirit TV Tuner +card=123 - AVerMedia AVerTV DVB-T 771 +card=124 - AverMedia AverTV DVB-T 761 +card=125 - MATRIX Vision Sigma-SQ +card=126 - MATRIX Vision Sigma-SLC +card=127 - APAC Viewcomp 878(AMAX) +card=128 - DVICO FusionHDTV DVB-T Lite +card=129 - V-Gear MyVCD +card=130 - Super TV Tuner +card=131 - Tibet Systems 'Progress DVR' CS16 +card=132 - Kodicom 4400R (master) +card=133 - Kodicom 4400R (slave) +card=134 - Adlink RTV24 diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88 new file mode 100644 index 000000000000..03deb0726aa4 --- /dev/null +++ b/Documentation/video4linux/CARDLIST.cx88 @@ -0,0 +1,32 @@ +card=0 - UNKNOWN/GENERIC +card=1 - Hauppauge WinTV 34xxx models +card=2 - GDI Black Gold +card=3 - PixelView +card=4 - ATI TV Wonder Pro +card=5 - Leadtek Winfast 2000XP Expert +card=6 - AverTV Studio 303 (M126) +card=7 - MSI TV-@nywhere Master +card=8 - Leadtek Winfast DV2000 +card=9 - Leadtek PVR 2000 +card=10 - IODATA GV-VCP3/PCI +card=11 - Prolink PlayTV PVR +card=12 - ASUS PVR-416 +card=13 - MSI TV-@nywhere +card=14 - KWorld/VStream XPert DVB-T +card=15 - DViCO FusionHDTV DVB-T1 +card=16 - KWorld LTV883RF +card=17 - DViCO FusionHDTV 3 Gold-Q +card=18 - Hauppauge Nova-T DVB-T +card=19 - Conexant DVB-T reference design +card=20 - Provideo PV259 +card=21 - DViCO FusionHDTV DVB-T Plus +card=22 - digitalnow DNTV Live! DVB-T +card=23 - pcHDTV HD3000 HDTV +card=24 - Hauppauge WinTV 28xxx (Roslyn) models +card=25 - Digital-Logic MICROSPACE Entertainment Center (MEC) +card=26 - IODATA GV/BCTV7E +card=27 - PixelView PlayTV Ultra Pro (Stereo) +card=28 - DViCO FusionHDTV 3 Gold-T +card=29 - ADS Tech Instant TV DVB-T PCI +card=30 - TerraTec Cinergy 1400 DVB-T +card=31 - DViCO FusionHDTV 5 Gold diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index a6c82fa4de02..1b5a3a9ffbe2 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -1,10 +1,10 @@ - 0 -> UNKNOWN/GENERIC + 0 -> UNKNOWN/GENERIC 1 -> Proteus Pro [philips reference design] [1131:2001,1131:2001] 2 -> LifeView FlyVIDEO3000 [5168:0138,4e42:0138] 3 -> LifeView FlyVIDEO2000 [5168:0138] 4 -> EMPRESS [1131:6752] 5 -> SKNet Monster TV [1131:4e85] - 6 -> Tevion MD 9717 + 6 -> Tevion MD 9717 7 -> KNC One TV-Station RDS / Typhoon TV Tuner RDS [1131:fe01,1894:fe01] 8 -> Terratec Cinergy 400 TV [153B:1142] 9 -> Medion 5044 @@ -20,16 +20,45 @@ 19 -> Compro VideoMate TV [185b:c100] 20 -> Matrox CronosPlus [102B:48d0] 21 -> 10MOONS PCI TV CAPTURE CARD [1131:2001] - 22 -> Medion 2819/ AverMedia M156 [1461:a70b,1461:2115] + 22 -> AverMedia M156 / Medion 2819 [1461:a70b] 23 -> BMK MPEX Tuner 24 -> KNC One TV-Station DVR [1894:a006] 25 -> ASUS TV-FM 7133 [1043:4843] 26 -> Pinnacle PCTV Stereo (saa7134) [11bd:002b] - 27 -> Manli MuchTV M-TV002 - 28 -> Manli MuchTV M-TV001 + 27 -> Manli MuchTV M-TV002/Behold TV 403 FM + 28 -> Manli MuchTV M-TV001/Behold TV 401 29 -> Nagase Sangyo TransGear 3000TV [1461:050c] 30 -> Elitegroup ECS TVP3XP FM1216 Tuner Card(PAL-BG,FM) [1019:4cb4] 31 -> Elitegroup ECS TVP3XP FM1236 Tuner Card (NTSC,FM) [1019:4cb5] 32 -> AVACS SmartTV 33 -> AVerMedia DVD EZMaker [1461:10ff] - 34 -> LifeView FlyTV Platinum33 mini [5168:0212] + 34 -> Noval Prime TV 7133 + 35 -> AverMedia AverTV Studio 305 [1461:2115] + 36 -> UPMOST PURPLE TV [12ab:0800] + 37 -> Items MuchTV Plus / IT-005 + 38 -> Terratec Cinergy 200 TV [153B:1152] + 39 -> LifeView FlyTV Platinum Mini [5168:0212] + 40 -> Compro VideoMate TV PVR/FM [185b:c100] + 41 -> Compro VideoMate TV Gold+ [185b:c100] + 42 -> Sabrent SBT-TVFM (saa7130) + 43 -> :Zolid Xpert TV7134 + 44 -> Empire PCI TV-Radio LE + 45 -> Avermedia AVerTV Studio 307 [1461:9715] + 46 -> AVerMedia Cardbus TV/Radio (E500) [1461:d6ee] + 47 -> Terratec Cinergy 400 mobile [153b:1162] + 48 -> Terratec Cinergy 600 TV MK3 [153B:1158] + 49 -> Compro VideoMate Gold+ Pal [185b:c200] + 50 -> Pinnacle PCTV 300i DVB-T + PAL [11bd:002d] + 51 -> ProVideo PV952 [1540:9524] + 52 -> AverMedia AverTV/305 [1461:2108] + 53 -> ASUS TV-FM 7135 [1043:4845] + 54 -> LifeView FlyTV Platinum FM [5168:0214,1489:0214] + 55 -> LifeView FlyDVB-T DUO [5168:0502,5168:0306] + 56 -> Avermedia AVerTV 307 [1461:a70a] + 57 -> Avermedia AVerTV GO 007 FM [1461:f31f] + 58 -> ADS Tech Instant TV (saa7135) [1421:0350,1421:0370] + 59 -> Kworld/Tevion V-Stream Xpert TV PVR7134 + 60 -> Typhoon DVB-T Duo Digital/Analog Cardbus [4e42:0502] + 61 -> Philips TOUGH DVB-T reference design [1131:2004] + 62 -> Compro VideoMate TV Gold+II + 63 -> Kworld Xpert TV PVR7134 diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index f7bafe862ba0..f3302e1b1b9c 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner @@ -44,3 +44,23 @@ tuner=42 - Philips 1236D ATSC/NTSC daul in tuner=43 - Philips NTSC MK3 (FM1236MK3 or FM1236/F) tuner=44 - Philips 4 in 1 (ATI TV Wonder Pro/Conexant) tuner=45 - Microtune 4049 FM5 +tuner=46 - Panasonic VP27s/ENGE4324D +tuner=47 - LG NTSC (TAPE series) +tuner=48 - Tenna TNF 8831 BGFF) +tuner=49 - Microtune 4042 FI5 ATSC/NTSC dual in +tuner=50 - TCL 2002N +tuner=51 - Philips PAL/SECAM_D (FM 1256 I-H3) +tuner=52 - Thomson DDT 7610 (ATSC/NTSC) +tuner=53 - Philips FQ1286 +tuner=54 - tda8290+75 +tuner=55 - LG PAL (TAPE series) +tuner=56 - Philips PAL/SECAM multi (FQ1216AME MK4) +tuner=57 - Philips FQ1236A MK4 +tuner=58 - Ymec TVision TVF-8531MF/8831MF/8731MF +tuner=59 - Ymec TVision TVF-5533MF +tuner=60 - Thomson DDT 7611 (ATSC/NTSC) +tuner=61 - Tena TNF9533-D/IF/TNF9533-B/DF +tuner=62 - Philips TEA5767HN FM Radio +tuner=63 - Philips FMD1216ME MK3 Hybrid Tuner +tuner=64 - LG TDVS-H062F/TUA6034 +tuner=65 - Ymec TVF66T5-B/DFF diff --git a/Documentation/video4linux/README.saa7134 b/Documentation/video4linux/README.saa7134 index 1a446c65365e..1f788e498eff 100644 --- a/Documentation/video4linux/README.saa7134 +++ b/Documentation/video4linux/README.saa7134 @@ -57,6 +57,15 @@ Cards can use either of these two crystals (xtal): - 24.576MHz -> .audio_clock=0x200000 (xtal * .audio_clock = 51539600) +Some details about 30/34/35: + + - saa7130 - low-price chip, doesn't have mute, that is why all those + cards should have .mute field defined in their tuner structure. + + - saa7134 - usual chip + + - saa7133/35 - saa7135 is probably a marketing decision, since all those + chips identifies itself as 33 on pci. Credits ======= diff --git a/Documentation/video4linux/bttv/Cards b/Documentation/video4linux/bttv/Cards index 7f8c7eb70ab2..8f1941ede4da 100644 --- a/Documentation/video4linux/bttv/Cards +++ b/Documentation/video4linux/bttv/Cards @@ -20,7 +20,7 @@ All other cards only differ by additional components as tuners, sound decoders, EEPROMs, teletext decoders ... -Unsupported Cards: +Unsupported Cards: ------------------ Cards with Zoran (ZR) or Philips (SAA) or ISA are not supported by @@ -50,11 +50,11 @@ Bt848a/Bt849 single crytal operation support possible!!! Miro/Pinnacle PCTV ------------------ -- Bt848 - some (all??) come with 2 crystals for PAL/SECAM and NTSC +- Bt848 + some (all??) come with 2 crystals for PAL/SECAM and NTSC - PAL, SECAM or NTSC TV tuner (Philips or TEMIC) - MSP34xx sound decoder on add on board - decoder is supported but AFAIK does not yet work + decoder is supported but AFAIK does not yet work (other sound MUX setting in GPIO port needed??? somebody who fixed this???) - 1 tuner, 1 composite and 1 S-VHS input - tuner type is autodetected @@ -70,7 +70,7 @@ in 1997! Hauppauge Win/TV pci -------------------- -There are many different versions of the Hauppauge cards with different +There are many different versions of the Hauppauge cards with different tuners (TV+Radio ...), teletext decoders. Note that even cards with same model numbers have (depending on the revision) different chips on it. @@ -80,22 +80,22 @@ different chips on it. - PAL, SECAM, NTSC or tuner with or without Radio support e.g.: - PAL: + PAL: TDA5737: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners TSA5522: 1.4 GHz I2C-bus controlled synthesizer, I2C 0xc2-0xc3 - + NTSC: TDA5731: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners TSA5518: no datasheet available on Philips site -- Philips SAA5246 or SAA5284 ( or no) Teletext decoder chip +- Philips SAA5246 or SAA5284 ( or no) Teletext decoder chip with buffer RAM (e.g. Winbond W24257AS-35: 32Kx8 CMOS static RAM) SAA5246 (I2C 0x22) is supported -- 256 bytes EEPROM: Microchip 24LC02B or Philips 8582E2Y +- 256 bytes EEPROM: Microchip 24LC02B or Philips 8582E2Y with configuration information I2C address 0xa0 (24LC02B also responds to 0xa2-0xaf) - 1 tuner, 1 composite and (depending on model) 1 S-VHS input - 14052B: mux for selection of sound source -- sound decoder: TDA9800, MSP34xx (stereo cards) +- sound decoder: TDA9800, MSP34xx (stereo cards) Askey CPH-Series @@ -108,17 +108,17 @@ Developed by TelSignal(?), OEMed by many vendors (Typhoon, Anubis, Dynalink) CPH05x: BT878 with FM CPH06x: BT878 (w/o FM) CPH07x: BT878 capture only - + TV standards: CPH0x0: NTSC-M/M CPH0x1: PAL-B/G CPH0x2: PAL-I/I CPH0x3: PAL-D/K - CPH0x4: SECAM-L/L - CPH0x5: SECAM-B/G - CPH0x6: SECAM-D/K - CPH0x7: PAL-N/N - CPH0x8: PAL-B/H + CPH0x4: SECAM-L/L + CPH0x5: SECAM-B/G + CPH0x6: SECAM-D/K + CPH0x7: PAL-N/N + CPH0x8: PAL-B/H CPH0x9: PAL-M/M CPH03x was often sold as "TV capturer". @@ -174,7 +174,7 @@ Lifeview Flyvideo Series: "The FlyVideo2000 and FlyVideo2000s product name have renamed to FlyVideo98." Their Bt8x8 cards are listed as discontinued. Flyvideo 2000S was probably sold as Flyvideo 3000 in some contries(Europe?). - The new Flyvideo 2000/3000 are SAA7130/SAA7134 based. + The new Flyvideo 2000/3000 are SAA7130/SAA7134 based. "Flyvideo II" had been the name for the 848 cards, nowadays (in Germany) this name is re-used for LR50 Rev.W. @@ -235,12 +235,12 @@ Prolink Multimedia TV packages (card + software pack): PixelView Play TV Theater - (Model: PV-M4200) = PixelView Play TV pro + Software PixelView Play TV PAK - (Model: PV-BT878P+ REV 4E) - PixelView Play TV/VCR - (Model: PV-M3200 REV 4C / 8D / 10A ) + PixelView Play TV/VCR - (Model: PV-M3200 REV 4C / 8D / 10A ) PixelView Studio PAK - (Model: M2200 REV 4C / 8D / 10A ) PixelView PowerStudio PAK - (Model: PV-M3600 REV 4E) PixelView DigitalVCR PAK - (Model: PV-M2400 REV 4C / 8D / 10A ) - PixelView PlayTV PAK II (TV/FM card + usb camera) PV-M3800 + PixelView PlayTV PAK II (TV/FM card + usb camera) PV-M3800 PixelView PlayTV XP PV-M4700,PV-M4700(w/FM) PixelView PlayTV DVR PV-M4600 package contents:PixelView PlayTV pro, windvr & videoMail s/w @@ -254,7 +254,7 @@ Prolink DTV3000 PV-DTV3000P+ DVB-S CI = Twinhan VP-1030 DTV2000 DVB-S = Twinhan VP-1020 - + Video Conferencing: PixelView Meeting PAK - (Model: PV-BT878P) PixelView Meeting PAK Lite - (Model: PV-BT878P) @@ -308,7 +308,7 @@ KNC One newer Cards have saa7134, but model name stayed the same? -Provideo +Provideo -------- PV951 or PV-951 (also are sold as: Boeder TV-FM Video Capture Card @@ -353,7 +353,7 @@ AVerMedia AVerTV AVerTV Stereo AVerTV Studio (w/FM) - AVerMedia TV98 with Remote + AVerMedia TV98 with Remote AVerMedia TV/FM98 Stereo AVerMedia TVCAM98 TVCapture (Bt848) @@ -373,7 +373,7 @@ AVerMedia (1) Daughterboard MB68-A with TDA9820T and TDA9840T (2) Sony NE41S soldered (stereo sound?) (3) Daughterboard M118-A w/ pic 16c54 and 4 MHz quartz - + US site has different drivers for (as of 09/2002): EZ Capture/InterCam PCI (BT-848 chip) EZ Capture/InterCam PCI (BT-878 chip) @@ -437,7 +437,7 @@ Terratec Terra TValueRadio, "LR102 Rev.C" printed on the PCB Terra TV/Radio+ Version 1.0, "80-CP2830100-0" TTTV3 printed on the PCB, "CPH010-E83" on the back, SAA6588T, TDA9873H - Terra TValue Version BT878, "80-CP2830110-0 TTTV4" printed on the PCB, + Terra TValue Version BT878, "80-CP2830110-0 TTTV4" printed on the PCB, "CPH011-D83" on back Terra TValue Version 1.0 "ceb105.PCB" (really identical to Terra TV+ Version 1.0) Terra TValue New Revision "LR102 Rec.C" @@ -528,7 +528,7 @@ Koutech KW-606RSF KW-607A (capture only) KW-608 (Zoran capture only) - + IODATA (jp) ------ GV-BCTV/PCI @@ -542,15 +542,15 @@ Canopus (jp) ------- WinDVR = Kworld "KW-TVL878RF" -www.sigmacom.co.kr +www.sigmacom.co.kr ------------------ - Sigma Cyber TV II + Sigma Cyber TV II www.sasem.co.kr --------------- Litte OnAir TV -hama +hama ---- TV/Radio-Tuner Card, PCI (Model 44677) = CPH051 @@ -638,7 +638,7 @@ Media-Surfer (esc-kathrein.de) Jetway (www.jetway.com.tw) -------------------------- - JW-TV 878M + JW-TV 878M JW-TV 878 = KWorld KW-TV878RF Galaxis @@ -715,7 +715,7 @@ Hauppauge 809 MyVideo 872 MyTV2Go FM - + 546 WinTV Nova-S CI 543 WinTV Nova 907 Nova-S USB @@ -739,7 +739,7 @@ Hauppauge 832 MyTV2Go 869 MyTV2Go-FM 805 MyVideo (USB) - + Matrix-Vision ------------- @@ -764,7 +764,7 @@ Gallant (www.gallantcom.com) www.minton.com.tw Intervision IV-550 (bt8x8) Intervision IV-100 (zoran) Intervision IV-1000 (bt8x8) - + Asonic (www.asonic.com.cn) (website down) ----------------------------------------- SkyEye tv 878 @@ -804,11 +804,11 @@ Kworld (www.kworld.com.tw) JTT/ Justy Corp.http://www.justy.co.jp/ (www.jtt.com.jp website down) --------------------------------------------------------------------- - JTT-02 (JTT TV) "TV watchmate pro" (bt848) + JTT-02 (JTT TV) "TV watchmate pro" (bt848) ADS www.adstech.com ------------------- - Channel Surfer TV ( CHX-950 ) + Channel Surfer TV ( CHX-950 ) Channel Surfer TV+FM ( CHX-960FM ) AVEC www.prochips.com @@ -874,7 +874,7 @@ www.ids-imaging.de ------------------ Falcon Series (capture only) In USA: http://www.theimagingsource.com/ - DFG/LC1 + DFG/LC1 www.sknet-web.co.jp ------------------- @@ -890,7 +890,7 @@ Cybertainment CyberMail Xtreme These are Flyvideo -VCR (http://www.vcrinc.com/) +VCR (http://www.vcrinc.com/) --- Video Catcher 16 @@ -920,7 +920,7 @@ Sdisilk www.sdisilk.com/ SDI Silk 200 SDI Input Card www.euresys.com - PICOLO series + PICOLO series PMC/Pace www.pacecom.co.uk website closed diff --git a/Documentation/video4linux/bttv/Insmod-options b/Documentation/video4linux/bttv/Insmod-options index 7bb5a50b0779..fc94ff235ffa 100644 --- a/Documentation/video4linux/bttv/Insmod-options +++ b/Documentation/video4linux/bttv/Insmod-options @@ -44,6 +44,9 @@ bttv.o push used by bttv. bttv will disable overlay by default on this hardware to avoid crashes. With this insmod option you can override this. + no_overlay=1 Disable overlay. It should be used by broken + hardware that doesn't support PCI2PCI direct + transfers. automute=0/1 Automatically mutes the sound if there is no TV signal, on by default. You might try to disable this if you have bad input signal diff --git a/Documentation/video4linux/hauppauge-wintv-cx88-ir.txt b/Documentation/video4linux/hauppauge-wintv-cx88-ir.txt new file mode 100644 index 000000000000..93fec32a1188 --- /dev/null +++ b/Documentation/video4linux/hauppauge-wintv-cx88-ir.txt @@ -0,0 +1,54 @@ +The controls for the mux are GPIO [0,1] for source, and GPIO 2 for muting. + +GPIO0 GPIO1 + 0 0 TV Audio + 1 0 FM radio + 0 1 Line-In + 1 1 Mono tuner bypass or CD passthru (tuner specific) + +GPIO 16(i believe) is tied to the IR port (if present). + +------------------------------------------------------------------------------------ + +>From the data sheet: + Register 24'h20004 PCI Interrupt Status + bit [18] IR_SMP_INT Set when 32 input samples have been collected over + gpio[16] pin into GP_SAMPLE register. + +What's missing from the data sheet: + +Setup 4KHz sampling rate (roughly 2x oversampled; good enough for our RC5 +compat remote) +set register 0x35C050 to 0xa80a80 + +enable sampling +set register 0x35C054 to 0x5 + +Of course, enable the IRQ bit 18 in the interrupt mask register .(and +provide for a handler) + +GP_SAMPLE register is at 0x35C058 + +Bits are then right shifted into the GP_SAMPLE register at the specified +rate; you get an interrupt when a full DWORD is recieved. +You need to recover the actual RC5 bits out of the (oversampled) IR sensor +bits. (Hint: look for the 0/1and 1/0 crossings of the RC5 bi-phase data) An +actual raw RC5 code will span 2-3 DWORDS, depending on the actual alignment. + +I'm pretty sure when no IR signal is present the receiver is always in a +marking state(1); but stray light, etc can cause intermittent noise values +as well. Remember, this is a free running sample of the IR receiver state +over time, so don't assume any sample starts at any particular place. + +http://www.atmel.com/dyn/resources/prod_documents/doc2817.pdf +This data sheet (google search) seems to have a lovely description of the +RC5 basics + +http://users.pandora.be/nenya/electronics/rc5/ and more data + +http://www.ee.washington.edu/circuit_archive/text/ir_decode.txt +and even a reference to how to decode a bi-phase data stream. + +http://www.xs4all.nl/~sbp/knowledge/ir/rc5.htm +still more info + diff --git a/Documentation/video4linux/lifeview.txt b/Documentation/video4linux/lifeview.txt new file mode 100644 index 000000000000..b07ea79c2b7e --- /dev/null +++ b/Documentation/video4linux/lifeview.txt @@ -0,0 +1,42 @@ +collecting data about the lifeview models and the config coding on +gpio pins 0-9 ... +================================================================== + +bt878: + LR50 rev. Q ("PARTS: 7031505116), Tuner wurde als Nr. 5 erkannt, Eingänge + SVideo, TV, Composite, Audio, Remote. CP9..1=100001001 (1: 0-Ohm-Widerstand + gegen GND unbestückt; 0: bestückt) + +------------------------------------------------------------------------------ + +saa7134: + /* LifeView FlyTV Platinum FM (LR214WF) */ + /* "Peter Missel <peter.missel@onlinehome.de> */ + .name = "LifeView FlyTV Platinum FM", + /* GP27 MDT2005 PB4 pin 10 */ + /* GP26 MDT2005 PB3 pin 9 */ + /* GP25 MDT2005 PB2 pin 8 */ + /* GP23 MDT2005 PB1 pin 7 */ + /* GP22 MDT2005 PB0 pin 6 */ + /* GP21 MDT2005 PB5 pin 11 */ + /* GP20 MDT2005 PB6 pin 12 */ + /* GP19 MDT2005 PB7 pin 13 */ + /* nc MDT2005 PA3 pin 2 */ + /* Remote MDT2005 PA2 pin 1 */ + /* GP18 MDT2005 PA1 pin 18 */ + /* nc MDT2005 PA0 pin 17 strap low */ + + /* GP17 Strap "GP7"=High */ + /* GP16 Strap "GP6"=High + 0=Radio 1=TV + Drives SA630D ENCH1 and HEF4052 A1 pins + to do FM radio through SIF input */ + /* GP15 nc */ + /* GP14 nc */ + /* GP13 nc */ + /* GP12 Strap "GP5" = High */ + /* GP11 Strap "GP4" = High */ + /* GP10 Strap "GP3" = High */ + /* GP09 Strap "GP2" = Low */ + /* GP08 Strap "GP1" = Low */ + /* GP07.00 nc */ diff --git a/Documentation/video4linux/not-in-cx2388x-datasheet.txt b/Documentation/video4linux/not-in-cx2388x-datasheet.txt new file mode 100644 index 000000000000..edbfe744d21d --- /dev/null +++ b/Documentation/video4linux/not-in-cx2388x-datasheet.txt @@ -0,0 +1,41 @@ +================================================================================= +MO_OUTPUT_FORMAT (0x310164) + + Previous default from DScaler: 0x1c1f0008 + Digit 8: 31-28 + 28: PREVREMOD = 1 + + Digit 7: 27-24 (0xc = 12 = b1100 ) + 27: COMBALT = 1 + 26: PAL_INV_PHASE + (DScaler apparently set this to 1, resulted in sucky picture) + + Digits 6,5: 23-16 + 25-16: COMB_RANGE = 0x1f [default] (9 bits -> max 512) + + Digit 4: 15-12 + 15: DISIFX = 0 + 14: INVCBF = 0 + 13: DISADAPT = 0 + 12: NARROWADAPT = 0 + + Digit 3: 11-8 + 11: FORCE2H + 10: FORCEREMD + 9: NCHROMAEN + 8: NREMODEN + + Digit 2: 7-4 + 7-6: YCORE + 5-4: CCORE + + Digit 1: 3-0 + 3: RANGE = 1 + 2: HACTEXT + 1: HSFMT + +0x47 is the sync byte for MPEG-2 transport stream packets. +Datasheet incorrectly states to use 47 decimal. 188 is the length. +All DVB compliant frontends output packets with this start code. + +================================================================================= diff --git a/Documentation/w1/w1.generic b/Documentation/w1/w1.generic index eace3046a858..f937fbe1cacb 100644 --- a/Documentation/w1/w1.generic +++ b/Documentation/w1/w1.generic @@ -1,19 +1,92 @@ -Any w1 device must be connected to w1 bus master device - for example -ds9490 usb device or w1-over-GPIO or RS232 converter. -Driver for w1 bus master must provide several functions(you can find -them in struct w1_bus_master definition in w1.h) which then will be -called by w1 core to send various commands over w1 bus(by default it is -reset and search commands). When some device is found on the bus, w1 core -checks if driver for it's family is loaded. -If driver is loaded w1 core creates new w1_slave object and registers it -in the system(creates some generic sysfs files(struct w1_family_ops in -w1_family.h), notifies any registered listener and so on...). -It is device driver's business to provide any communication method -upstream. -For example w1_therm driver(ds18?20 thermal sensor family driver) -provides temperature reading function which is bound to ->rbin() method -of the above w1_family_ops structure. -w1_smem - driver for simple 64bit memory cell provides ID reading -method. +The 1-wire (w1) subsystem +------------------------------------------------------------------ +The 1-wire bus is a simple master-slave bus that communicates via a single +signal wire (plus ground, so two wires). + +Devices communicate on the bus by pulling the signal to ground via an open +drain output and by sampling the logic level of the signal line. + +The w1 subsystem provides the framework for managing w1 masters and +communication with slaves. + +All w1 slave devices must be connected to a w1 bus master device. + +Example w1 master devices: + DS9490 usb device + W1-over-GPIO + DS2482 (i2c to w1 bridge) + Emulated devices, such as a RS232 converter, parallel port adapter, etc + + +What does the w1 subsystem do? +------------------------------------------------------------------ +When a w1 master driver registers with the w1 subsystem, the following occurs: + + - sysfs entries for that w1 master are created + - the w1 bus is periodically searched for new slave devices + +When a device is found on the bus, w1 core checks if driver for it's family is +loaded. If so, the family driver is attached to the slave. +If there is no driver for the family, a simple sysfs entry is created +for the slave device. + + +W1 device families +------------------------------------------------------------------ +Slave devices are handled by a driver written for a family of w1 devices. + +A family driver populates a struct w1_family_ops (see w1_family.h) and +registers with the w1 subsystem. + +Current family drivers: +w1_therm - (ds18?20 thermal sensor family driver) + provides temperature reading function which is bound to ->rbin() method + of the above w1_family_ops structure. + +w1_smem - driver for simple 64bit memory cell provides ID reading method. You can call above methods by reading appropriate sysfs files. + + +What does a w1 master driver need to implement? +------------------------------------------------------------------ + +The driver for w1 bus master must provide at minimum two functions. + +Emulated devices must provide the ability to set the output signal level +(write_bit) and sample the signal level (read_bit). + +Devices that support the 1-wire natively must provide the ability to write and +sample a bit (touch_bit) and reset the bus (reset_bus). + +Most hardware provides higher-level functions that offload w1 handling. +See struct w1_bus_master definition in w1.h for details. + + +w1 master sysfs interface +------------------------------------------------------------------ +<xx-xxxxxxxxxxxxx> - a directory for a found device. The format is family-serial +bus - (standard) symlink to the w1 bus +driver - (standard) symlink to the w1 driver +w1_master_attempts - the number of times a search was attempted +w1_master_max_slave_count + - the maximum slaves that may be attached to a master +w1_master_name - the name of the device (w1_bus_masterX) +w1_master_search - the number of searches left to do, -1=continual (default) +w1_master_slave_count + - the number of slaves found +w1_master_slaves - the names of the slaves, one per line +w1_master_timeout - the delay in seconds between searches + +If you have a w1 bus that never changes (you don't add or remove devices), +you can set w1_master_search to a positive value to disable searches. + + +w1 slave sysfs interface +------------------------------------------------------------------ +bus - (standard) symlink to the w1 bus +driver - (standard) symlink to the w1 driver +name - the device name, usually the same as the directory name +w1_slave - (optional) a binary file whose meaning depends on the + family driver + diff --git a/Documentation/x86_64/boot-options.txt b/Documentation/x86_64/boot-options.txt index b9e6be00cadf..678e8f192db2 100644 --- a/Documentation/x86_64/boot-options.txt +++ b/Documentation/x86_64/boot-options.txt @@ -6,6 +6,11 @@ only the AMD64 specific ones are listed here. Machine check mce=off disable machine check + mce=bootlog Enable logging of machine checks left over from booting. + Disabled by default because some BIOS leave bogus ones. + If your BIOS doesn't do that it's a good idea to enable though + to make sure you log even machine check events that result + in a reboot. nomce (for compatibility with i386): same as mce=off @@ -47,7 +52,7 @@ Timing notsc Don't use the CPU time stamp counter to read the wall time. This can be used to work around timing problems on multiprocessor systems - with not properly synchronized CPUs. Only useful with a SMP kernel + with not properly synchronized CPUs. report_lost_ticks Report when timer interrupts are lost because some code turned off @@ -74,6 +79,9 @@ Idle loop event. This will make the CPUs eat a lot more power, but may be useful to get slightly better performance in multiprocessor benchmarks. It also makes some profiling using performance counters more accurate. + Please note that on systems with MONITOR/MWAIT support (like Intel EM64T + CPUs) this option has no performance advantage over the normal idle loop. + It may also interact badly with hyperthreading. Rebooting @@ -178,6 +186,5 @@ Debugging Misc noreplacement Don't replace instructions with more appropiate ones - for the CPU. This may be useful on asymmetric MP systems - where some CPU have less capabilities than the others. - + for the CPU. This may be useful on asymmetric MP systems + where some CPU have less capabilities than the others. |